How do language models generalize from information they learn in-context vs. via finetuning? We show that in-context learning can generalize more flexibly, illustrating key differences in the inductive biases of these modes of learning — and ways to improve finetuning. Thread: 1/
Had a great time presenting this work at the NAACL 2025 Insights Workshop yesterday!
We adapted a Monte Carlo sampling method to analyze the impact of the number of in-context examples.
aclanthology.org/2025.insights-…
I’ll be presenting our work “In-Context Learning (and Unlearning) of Length Biases” at NAACL 25 in Hall 3 from 11AM-12:30PM today. Looking forward to chatting about ICL with everyone!
@cwolferesearch If you thought the information on data they release is interesting, you should check out our recent survey on data for LLMs
We include a TON more information about data processing, and most information Meta includes in the release isn't particularly new
From the LLaMA 3 blog post - they use a combination of rejection sampling, DPO, and PPO for post-training. Really interested to know which tasks/parts of the process each algorithm benefits the most.
LLaMA-3 is a prime example of why training a good LLM is almost entirely about data quality…
TL;DR. Meta released LLaMA-3-8B/70B today and 95% of the technical info we have so far is related to data quality:
- 15T tokens of pretraining data
- More code during pretraining…
Chat with MLX 🚀 a high-performance macOS app linking your local docs to a custom large language model (LLM) on your machine 🧵
Now open-source in beta!
github.com/mlx-chat/mlx-c…
Collaboratively built by @itsstock & @parkersm1th
Our group released a Python package of data valuation in machine learning, Valda. It supports five methods (LOO, Influence Function, TMC-Shapley, Beta-Shapley, and CS-Shapley) via a unified API. Please try it out if you are interested:
uvanlp.org/valda/ @stephschoch
Our work on class-wise Shapley values for data valuation has been accepted to #NeurIPS2022! Congratulations to my student @stephschoch and collaborator @haifengxu0! See you in New Orleans!
INLG 2022 will be held 18-22 July at Colby College (Waterville, Maine, USA)!
Calls for papers, workshops, etc available at
inlgmeeting.github.io/calls.html
Very excited to share this update: I passed my PhD Qualifying Examination! A big thank you to my committee and to my advisor @yangfeng_ji for all of his support and guidance!
The commendation for outstanding position paper goes to "Underreporting of errors in NLG output, and what to do about it" by van Miltenburg, Clinciu, Dušek, Gkatzia, Inglis, Leppänen, Mahamood, Manning, Schoch, Thomson, & Wen
2K Followers 887 Following Associate professor @EmoryUniversity. Working on large language models, LLM inference, reasoning, natural language generation, and various aspects of GenAI.
85 Followers 249 Following Mathemacoder, Australian, Data Scientist, Husband, Father.
Co-chair of AI in Measurement and Education (https://t.co/BXQvQvw0dY)
Opinions are my own.
807 Followers 1K Following Brain boffin / machine learning mercenary at NIMH. My opinions, not my employer's. @[email protected] @franciscopereira.bsky.social
1K Followers 793 Following Staff Researcher @AlibabaGroup. Previously @MBZUAI, PhD from @ml_labs_irl and @dcucomputing @dcu interested in Large Language Models (LLMs).
425 Followers 1K Following Ph.D. Candidate at @umasscs. Prev @genentech @Google @IIITDelhi.
Dabbling with Interpretability, Retrieval and some Bioinformatics.
401 Followers 1K Following Staff Research Scientist @Visa Research | CS Ph.D. @tamu | Formerly, @Amazon & @Visa & @Samsung | Trustworthy ML & Graph Neural Network | Opinions are my own
1K Followers 2K Following PhD @NYUDataScience, visiting researcher @AIatMeta, interested in AI & CogSci, specifically in goals and their representations in minds and machines (he/him).
49K Followers 9K Following I lead @Cohere_Labs. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, ML reliability. Changing spaces where breakthroughs happen.
102 Followers 181 Following Language generation (LG) is crucial if machines are to communicate with humans seamlessly using human natural language. CA18231 will work in this direction.
967 Followers 594 Following PhD student at @LTIatCMU / @SCSatCMU she/her, prev. @UVA and intern @ai2_allennlp
@/clara on https://t.co/GHxXbrRHSB and @/clarana on https://t.co/47UIhMGaRd
331 Followers 3K Following 🤸♀️budding developer•runner•recovering philosopher•he/them•e/iadsan•carson a bha mi cho fada nam aonar a’ cothachadh ri tìm?🧜
543 Followers 819 Following Teaching computers to talk at Charles University. (Computational) linguistics, politics, climate, public transit. He/him. https://t.co/zFcgerpMcA
380 Followers 776 Following Software Engineer. PhD from Aberdeen University on Data Quality in NLG. Long covid hauler fighting to get back to gigs again one day.
155K Followers 0 Following The free and flexible app for your private thoughts. For help and deeper discussions, join our community: https://t.co/QsDArfFkkv
118K Followers 16K Following Community, guidance & tips for academics 🎓 Practical resources, including AI | Courses at @AcademicCourses | Led by The PhD Place Team ✨
35K Followers 189 Following Co-founder and CEO https://t.co/efv72CKpAG (@WaveFormsAI) - Ex @OpenAI GPT-4o/AVM Audio Research Lead - #Her #TARS - Ex @AIatMeta, @Polytechnique (X11)
4K Followers 389 Following @rplevy.bsky.social | Director, MIT Computational Psycholinguistics Laboratory | Past President, Cognitive Science Society | Chair of the MIT Faculty | He
3K Followers 416 Following AI Group (NLP/CV/ML etc) at @UNCCS @UNC
Faculty: @mohitban47+@gberta227+@snigdhac25+@shsriva+@tianlongchen4+@huaxiuyaoml+@dingmyu+@zhun_deng +@SenguptRoni et al
12K Followers 744 Following Research Scientist, Deepmind
I try to think hard about everything I tweet, esp on 90s football and 80s music
None of my opinions are really someone else's
5K Followers 2K Following Associate Professor @UWCSE developing computational methods that leverage large-scale behavioral data to improve human well-being. Recruiting PhD students :-)
12K Followers 1K Following Founder of https://t.co/9KM4uFScMi, Associate Professor at Columbia. Making ai agent design and deployment easy and fast!
Forbes 30 under 30.
923 Followers 153 Following Professor in the Computer Science and the Information Science departments at Cornell University. Studies natural language processing.
12K Followers 0 Following Information Sciences / English at UIUC. Not active here, but you can find me at any of the other places. Gradually unfollowing everyone here; it’s not personal.
8K Followers 894 Following Associate professor at NYU (Courant CS + Center for Data Science) | advisor for @bespokelabsai | large language models and NLP | he/him
38K Followers 992 Following Creator of bitsandbytes. Research Scientist @allen_ai and incoming professor @CarnegieMellon. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
1K Followers 185 Following Associate professor of Computer Science at Northeastern University, researching NLP, ML, IR, and digital humanities. https://t.co/rSvC7ikLhe
10K Followers 1K Following Waiting on a robot body. All opinions are universal and held by both employers and family. Now a dedicated grok hate account.
Accepting ML/NLP PhD students.
3K Followers 2K Following Research Scientist at Meta. 10-yr test-of-time ACL 22, Best Demo ACL 25, Best Resource Paper ACL 24, Best Theme Paper ACL 24, Best Student Paper NAACL 15 🏳️🌈