New paper & surprising result.
LLMs transmit traits to other models via hidden signals in data.
Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
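(Not from the thread itself; a rough, hypothetical sketch of the setup it describes: a teacher model emits continuations, only pure 3-digit-number outputs are kept, and a student would later be fine-tuned on that file. The model name, prompt, and filter are placeholder choices, not the paper's pipeline.)

```python
# Hypothetical sketch of the "hidden signals in numbers" data pipeline:
# keep only teacher completions that are pure lists of 3-digit numbers,
# and save them as a fine-tuning set for a student model.
# "gpt2" and the prompt are stand-ins, not the actual teacher/setup.
import json
import re
from transformers import pipeline

teacher = pipeline("text-generation", model="gpt2")
prompt = "Continue the sequence: 142, 857, 963,"
only_numbers = re.compile(r"^\s*(\d{3}\s*,\s*)*\d{3}\s*$")

samples = []
for out in teacher(prompt, num_return_sequences=20, max_new_tokens=32,
                   do_sample=True, pad_token_id=50256):
    completion = out["generated_text"][len(prompt):].strip().rstrip(",")
    if only_numbers.match(completion):        # drop anything that is not 3-digit numbers
        samples.append({"prompt": prompt, "completion": completion})

with open("numbers_only.jsonl", "w") as f:    # the dataset a student would be SFT'd on
    for s in samples:
        f.write(json.dumps(s) + "\n")
```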
🚀 Hello, Kimi K2! Open-Source Agentic Model!
🔹 1T total / 32B active MoE model
🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models
🔹 Strong in coding and agentic tasks
🐤 Multimodal & thought-mode not supported for now
With Kimi K2, advanced agentic intelligence…
People are racing to push math reasoning performance in #LLMs—but have we really asked why? The common assumption is that improving math reasoning should transfer to broader capabilities in other domains. But is that actually true?
In our study (arxiv.org/pdf/2507.00432), we…
Recently, there has been a lot of talk of LLM agents automating ML research itself. If Llama 5 can create Llama 6, then surely the singularity is just around the corner.
How can we get a pulse check on whether current LLMs are capable of driving this kind of total…
New paper: What happens when an LLM reasons?
We created methods to interpret reasoning steps & their connections: CoT resampling, attention analysis, & attention suppression
We discover thought anchors: key steps shaping everything else. Check our tool & unpack CoT yourself 🧵
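(A toy, hypothetical illustration of the resampling idea the thread mentions: keep only the first k reasoning steps, resample the rest many times, and measure how often the original final answer survives. `toy_generate` stands in for a real sampled LLM call.)

```python
# Toy sketch of CoT resampling: truncate the chain of thought after step k,
# resample completions, and see how often the final answer is preserved.
# Steps whose removal swings the answer act like "thought anchors".
import random
from collections import Counter

def anchor_score(steps, generate, k, n_samples=50):
    """Fraction of resampled completions that still reach the full-chain answer
    when only the first k reasoning steps are kept."""
    prefix = "\n".join(steps[:k])
    answers = Counter(generate(prefix) for _ in range(n_samples))
    original = generate("\n".join(steps))      # answer with the full chain
    return answers[original] / n_samples

# Stand-in model: the answer only stays stable once step 2 ("carry the 1") is kept.
def toy_generate(prefix):
    if "carry the 1" in prefix:
        return "42"
    return random.choice(["42", "41"])

cot = ["Add the units digits", "carry the 1", "Add the tens digits", "Answer: 42"]
for k in range(1, len(cot) + 1):
    print(k, round(anchor_score(cot, toy_generate, k), 2))
```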
What Makes a Base Language Model Suitable for RL?
Rumors in the community say RL (i.e., RLVR) on LLMs is full of “mysteries”:
(1) Is the magic only happening on Qwen + Math?
(2) Does the "aha moment" only spark during math reasoning?
(3) Is evaluation hiding some tricky traps?…
Can we actually control reasoning behaviors in thinking LLMs?
Our @iclr_conf workshop paper is out! 🎉
We show how to steer DeepSeek-R1-Distill’s reasoning: make it backtrack, add knowledge, test examples. Just by adding steering vectors to its activations!
Details in 🧵👇
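(Not the authors' code; a minimal sketch of the general technique, activation steering via a forward hook, with GPT-2 and a random vector as placeholders for the real model and a properly derived steering direction.)

```python
# Minimal activation-steering sketch (placeholders, not the paper's recipe):
# add a fixed vector to one transformer block's hidden states on every forward
# pass, then generate. A real steering vector would be derived from contrastive
# activations (e.g., backtracking vs. non-backtracking traces), not random.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"                                   # stand-in for DeepSeek-R1-Distill
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

block = model.transformer.h[6]                  # which layer to steer (illustrative)
steer = torch.randn(model.config.hidden_size) * 4.0

def add_steering(module, inputs, output):
    hidden = output[0] + steer.to(output[0].dtype)   # shift the residual stream
    return (hidden,) + output[1:]

handle = block.register_forward_hook(add_steering)
ids = tok("Let me think step by step:", return_tensors="pt")
out = model.generate(**ids, max_new_tokens=30, do_sample=True,
                     pad_token_id=tok.eos_token_id)
print(tok.decode(out[0], skip_special_tokens=True))
handle.remove()                                  # stop steering afterwards
```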
🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation.
🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46%
🌐 Website: multiverse4fm.github.io
🧵 1/n
DeepSeek 671b and Qwen3 236b support with Megatron backend is now available as preview in verl v0.4.0 🔥🔥🔥
We will continue optimizing MoE model performance down the road.
DeepSeek 671b: verl.readthedocs.io/en/latest/perf…
verl v0.4: github.com/volcengine/ver…
Announcing OpenThinker3-7B, the new SOTA open-data 7B reasoning model: improving over DeepSeek-R1-Distill-Qwen-7B by 33% on average over code, science, and math evals.
We also release our dataset, OpenThoughts3-1.2M, which is the best open reasoning dataset across all data…
Incredible work by my mentors and open-source collaborators—honored to have played a tiny part! Huge respect for Simon Huang & team for leading this! 👏🙏
🤯 We cracked RLVR with... Random Rewards?!
Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵
Blogpost: tinyurl.com/spurious-rewar…
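(A hypothetical, model-free sketch of what a "spurious reward" means mechanically: the reward function ignores correctness entirely, yet it feeds the same group-normalized advantage computation a real verifier would.)

```python
# Sketch of spurious vs. ground-truth rewards in an RLVR-style loop.
# Only the reward function changes; the GRPO-style group-normalized
# advantages are computed exactly the same way either way.
import random

def random_reward(prompt, completion):
    return float(random.random() < 0.5)            # coin flip, ignores correctness

def ground_truth_reward(prompt, completion, answer):
    return float(completion.strip().endswith(answer))

def group_advantages(rewards):
    """Normalize rewards within one prompt's group of rollouts."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5 or 1.0                        # avoid dividing by zero
    return [(r - mean) / std for r in rewards]

rollouts = ["... so the answer is 7", "... so the answer is 12", "... so the answer is 7"]
rewards = [random_reward("2+5=?", c) for c in rollouts]
print(rewards, group_advantages(rewards))
```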
Giving your models more time to think before prediction, e.g., via smart decoding, chain-of-thought reasoning, latent thoughts, etc., turns out to be quite effective for unlocking the next level of intelligence.
New post is here :)
“Why we think”: lilianweng.github.io/posts/2025-05-…
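(One concrete flavor of "more time to think", sketched here as self-consistency decoding: sample several answers and take a majority vote. `ask_model` is a stand-in for a real sampled LLM call, not anything from the post.)

```python
# Self-consistency sketch: spend extra test-time compute by sampling many
# chains of thought and majority-voting over the final answers.
import random
from collections import Counter

def ask_model(question):
    # stand-in: a noisy solver that is right 60% of the time
    return "27" if random.random() < 0.6 else random.choice(["25", "29"])

def self_consistency(question, n=15):
    votes = Counter(ask_model(question) for _ in range(n))
    return votes.most_common(1)[0][0]

print(self_consistency("What is 3^3?"))
```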
4K Followers · 2K Following · Research Scientist @NVIDIA focusing on efficient post-training of LLMs. Finetuning your own LLMs with LMFlow: https://t.co/UTykmQBwFr Views are my own.
50K Followers · 5K Following · Cofounder and Head of Post Training @NousResearch, prev @StabilityAI
Github: https://t.co/LZwHTUFwPq
HuggingFace: https://t.co/sN2FFU8PVE
225 Followers · 570 Following · Second-year PhD @UW | Post-training, LLM reasoning, and synthetic datasets.
https://t.co/cYAkbnCsCp
Open to chat and collaborate!
18K Followers · 4K Following · Associate Professor at UC Berkeley. Former Research Scientist at Google DeepMind. ML/AI Researcher working on foundations of LLMs and deep learning.
196K Followers · 6K Following · canadian startup founder. prev eng @ x, stripe. yacine_kv on insta
i make my memes with https://t.co/pWRBfY8kn2 -
I write a subscriber only blog. Subscribe!
3K Followers · 910 Following · Understanding the universe @xAI;
Previously: Co-Founder at @FennelAI, ex ML Infra at Google Brain, ex Infra at GCE
Startups, Software, Tech, Infra, AI :)
2K Followers · 14 Following · The AI benchmark for predictive intelligence, advancing collective foresight via human–AI collaboration, from SIGMA Lab @UChicagoCS @DSI_UChicago
554K Followers · 131 Following · Father of three, Creator of Ruby on Rails + Omarchy, Co-owner & CTO of 37signals, Shopify director, NYT best-selling author, and Le Mans 24h class-winner.