PhD student at the Max Planck Institute for Intelligent Systems, working with Moritz Hardt and Bernhard Schölkopf.
ricardodominguez.github.io
Tübingen, Germany
Joined January 2014
My PhD advisor, Moritz Hardt, has just released the first half of his new book, The Emerging Science of Machine Learning Benchmarks. It’s freely available and highly recommended: mlbenchmarks.org
New preprint out! 🎉🎉
How does LLM training loss translate to downstream performance?
We show that pretraining data and tokenizer shape loss-to-loss scaling laws, while architecture and other factors play a surprisingly minor role! brendel-group.github.io/llm-line/
🧵1/8
“Aha moments” can be observed at step 0, so we should not fixate on reporting individual instances.
Instead, we should seek reliable measures of internal reasoning that can be tracked throughout training.
So far, response length appears to be one such (imperfect) measure.
R1-style GRPO on Llama 3.2 1B Instruct yields +10 accuracy points on GSM8K. It just works!
The training data is the GSM8K train split. Interestingly, supervised fine-tuning yields no performance improvements, since the dataset is tiny compared to all the math reasoning data seen by Llama 3.
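For readers unfamiliar with GRPO, here is a minimal sketch of the group-relative advantage it is usually described with: several responses are sampled for the same prompt, and each reward is normalized by the group's mean and standard deviation. This is not the training code behind the result above. The function name, the binary correctness reward, and the `eps` value are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one prompt's sampled responses.

    rewards: one scalar reward per sampled response (hypothetical example:
             1.0 if the final GSM8K answer is correct, 0.0 otherwise).
    eps:     small constant (an assumption) that avoids division by zero
             when every response in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one prompt, two of them correct.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Correct responses get a positive advantage and incorrect ones a negative advantage. If every response in a group receives the same reward, all advantages are zero and that prompt contributes no learning signal.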
Really cool paper questioning all the 'incredible' progress we've seen recently: "after fine-tuning all models on the same amount of task data, performance per pre-training compute equalizes and newer models are no better than earlier models."