LLMs can be programmed by backprop 🔎
In our new preprint, we show they can act as fuzzy program interpreters and databases. After being ‘programmed’ with next-token prediction, they can retrieve, evaluate, and even *compose* programs at test time, without seeing I/O examples.
Wrapped up Stanford CS336 (Language Models from Scratch), taught with an amazing team: @tatsu_hashimoto, @marcelroed, @neilbband, @rckpudi. Researchers are becoming detached from the technical details of how LMs work. In CS336, we try to fix that by having students build everything:
Our interpretability team recently released research that traced the thoughts of a large language model.
Now we’re open-sourcing the method. Researchers can generate “attribution graphs” like those in our study, and explore them interactively.
It was a dream come true to teach the course I wish existed at the start of my PhD. We built up the algorithmic foundations of modern-day RL, imitation learning, and RLHF, going deeper than the usual "grab bag of tricks". All 25 lectures + 150 pages of notes are now public! 🧵
L1 regularization for sparse solutions - as usually taught - is actually terrible in practice! I’m always surprised how few people know this. The penalty doesn’t just zero out coefficients; it also shrinks the surviving ones toward zero, biasing the fit. To get good results, retrain with the sparsity pattern found from the initial L1 run, but without the regularizer (see the sketch below). Works much better.
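A minimal sketch of this two-stage recipe (sometimes called the relaxed or debiased lasso), assuming scikit-learn; the toy data and the alpha value are illustrative placeholders, not tuned settings:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

# Toy sparse-regression problem (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
true_coef = np.zeros(50)
true_coef[:5] = [3.0, -2.0, 1.5, -1.0, 0.5]
y = X @ true_coef + 0.1 * rng.normal(size=200)

# Stage 1: L1 run, used only to discover the sparsity pattern.
lasso = Lasso(alpha=0.1).fit(X, y)
support = np.flatnonzero(lasso.coef_)

# Stage 2: refit on the selected features with no regularizer,
# removing the shrinkage bias L1 left on the surviving coefficients.
refit = LinearRegression().fit(X[:, support], y)
```

The stage-2 coefficients are unbiased on the selected support, which is usually what you want once the features have been chosen.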
We're missing (at least one) major paradigm for LLM learning. Not sure what to call it - possibly it already has a name - system prompt learning?
Pretraining is for knowledge.
Finetuning (SL/RL) is for habitual behavior.
Both of these involve a change in parameters but a lot of human…
Today, we're announcing our $50M Series A and sharing a preview of Ember - a universal neural programming platform that gives direct, programmable access to any AI model's internal thoughts.
This paper is also recommended for understanding GRPO. TL;DR: the per-response length normalization in DeepSeek's GRPO dilutes the per-token penalty on long, repetitive wrong answers while concentrating the reward on short correct ones, so models under-penalize repetition. Same intuition as the last RL paper I posted (toy illustration below).
Writeup soon.
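A toy sketch of that normalization effect - not the paper's code; the helper and the numbers are made up for illustration:

```python
# GRPO-style training spreads a sequence-level advantage over the
# response's tokens; dividing by the response length |o| means each
# token carries advantage / |o|.

def per_token_signal(advantage: float, length: int, normalize: bool = True) -> float:
    """Per-token learning signal under (optional) length normalization."""
    return advantage / length if normalize else advantage

# Long repetitive wrong answer: the penalty is spread thin.
print(per_token_signal(-1.0, length=1000))  # -0.001 per token
# Short correct answer: the reward is concentrated.
print(per_token_signal(+1.0, length=10))    # +0.1 per token
```

Under the 1/|o| weighting, rambling out to 1000 tokens makes each wrong token a hundred times cheaper than each right token in a 10-token answer - exactly the bias the TL;DR describes.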
Professor @PrincetonEcon, Director of @PrincetonBCF. Research on macro, money, and finance. Author of The Resilient Society and A Crash Course on Crises.
Consciousness accelerationist. AI, non-determinist computing, physics, philosophy… trying to never forget that in our infinite ignorance we are all equal (Popper).
Associate Professor at UC Berkeley. Former Research Scientist at Google DeepMind. ML/AI researcher working on foundations of LLMs and deep learning.
Associate Professor @UofT, Vice President of AI Research @nvidia, founding member of @VectorInst. Computer vision, deep learning, 3D. Opinions are my own.
The Department of Mathematics at Harvard University is one of the world's leading centers for research and education in pure mathematics. #harvardmath
PhD student @berkeley_ai; research @cursor_ai; prev @GoogleDeepMind. My friend told me to tweet more. I stare at my computer a lot and make things.
Associate prof @ucberkeley, co-director @ucbepic, cofounder @ponderdata (acq. @snowflakeDB) | on a mission to make data science effortless at scale | he/him
AGI research @DeepMind. Ex cofounder & CTO @vicariousai (acq'd by Alphabet) and @Numenta. Triply EE (BTech IIT-Mumbai, MS & PhD Stanford). #AGIComics
Trying to understand the emergence of generally intelligent robotic behavior at @berkeley_ai @AIatMeta. Previously @CILVRatNYU, @MIT & @Apple AI/ML fellow.
Senior Research Scientist @nvidia. PhD @Mila_Quebec. BSc @PKU1898. Reasoning, LLMs, ML systems. Photographer. Opinions are my own.
AI for mathematics and theoretical physics. Tomorrow's problems on yesterday's machines. Axiom - École nationale des ponts et chaussées.
Building with AI agents @dair_ai • Prev: Meta AI, Galactica LLM, Elastic, PaperswithCode, PhD • I share insights on how to build with AI agents.
Full of childlike wonder. Teaching robots manners. UT Austin PhD candidate. 🆕 RL Intern @ Apptronik. Past: Boston Dynamics AI Institute, NASA JPL, MIT '20.
Hi, I like reinforcement learning, robots, and video games :) I am an amateur pianist. Assistant Prof at Tsinghua; Postdoc at Stanford; PhD at Berkeley.
Assistant Professor @Cambridge_Eng, working on 3D computer vision and inverse graphics; previously postdoc @StanfordSVL, PhD @Oxford_VGG.