We have a fun collaboration of @GPU_MODE x @scaleml coming up!
We’re hosting a week-long online bootcamp that explores the core components of GPT-OSS while also diving into cutting-edge research that pushes beyond what’s currently in GPT-OSS!
For example, how can MoE's power…
was actually wondering with @hyundongleee about the fundamental differences between diffusion and autoregressive modeling, other than the structure imposed in modeling the sequential conditional distribution, and how they manifest. a poignant paper that addresses this thought
[1/9] We created a performant Lipschitz transformer by spectrally regulating the weights—without using activation stability tricks: no layer norm, QK norm, or logit softcapping. We think this may address a “root cause” of unstable training.
But actually this is the og way of doing it, and you should stop by E-2103 to see @jxbz and Laker Newhouse whiteboard the whole paper. https://t.co/NjV3qnxCaK
If you are interested in questioning how we should pretrain models and create new architectures for general reasoning
- then check out E606 @ ICML, our position by @seungwookh and me on potential directions for the next generation of reasoning models!
At #ICML 🇨🇦 this week.
I'm convinced that the core computations are shared across modalities (vision, text, audio, etc). The real question is the (synthetic) generative process that ties them.
Reach out if you have thoughts or want to chat!
wholeheartedly agree with this direction that games can be a good playground for learning reasoning. makes us think about what other synthetic environments we can design and grow in complexity
Our computer vision textbook is now available for free online here:
visionbook.mit.edu
We are working on adding some interactive components like search and (beta) integration with LLMs.
Hope this is useful, and feel free to submit GitHub issues to help us improve the text!