We ran more experiments to better understand “why” diffusion models do better than autoregressive models in data-constrained settings. Our findings support the hypothesis that diffusion models benefit from learning over multiple token orderings, which contributes to their robustness and… https://t.co/lXEPNn3mrV
🚨 The era of infinite internet data is ending, so we ask:
👉 What’s the right generative modelling objective when data—not compute—is the bottleneck?
TL;DR:
▶️ Compute-constrained? Train autoregressive models
▶️ Data-constrained? Train diffusion models
Get ready for 🤿 1/n
I converted one of my favorite talks I've given over the past year into a blog post.
"On the Tradeoffs of SSMs and Transformers"
(or: tokens are bullshit)
In a few days, we'll release what I believe is the next major advance for architectures.
📢 New Paper Alert! 📢
"Principal Components" Enable A New Language of Images ✨ arxiv.org/abs/2503.08685
We introduce Semanticist, a PCA-guided tokenizer that revolutionizes visual tokenization for generative models!
🧵 Thread below! 👇
🎉 New Pre-print! 🎉
Do CLIP models truly generalize to new, out-of-domain (OOD) images, or are they only doing well because they’ve been exposed to these domains in training? Our latest study reveals that CLIP’s ability to “generalize OOD” may be more limited than previously…