Alex Zhang @a1zhang
phd student @MIT_CSAIL + @SakanaAILabs, ugrad @Princeton, 🫵🏻 go participate in the @GPU_MODE kernel competitions! alexzhang13.github.io/blog USA Joined December 2015
434 Tweets 13K Followers 587 Following 2K Likes
This is a really neat analysis on training dynamics for post-training! As a field, we should start being rigorous about the properties of our training methods :)
For agents to improve over time, they can’t afford to forget what they’ve already mastered. We found that supervised fine-tuning forgets more than RL when training on a new task! Want to find out why? 👇
ts is fire <3 go give it a read!
We asked LLMs to estimate the *fraction* of a math solution that was right… Turns out that while they can reason through complex problems, they still have a hard time producing precise numerical outputs. Let’s talk about what we call Reasoning-Intensive Regression (RiR) tasks 🧵
Last lecture! @exists_forall finished their amazing talk on GPU programming from first principles (+ a deep dive on GPU arch), with slides here: docs.google.com/presentation/d… Simran's talk on ThunderKittens & more is now going on, so come join if you have questions!
We are live! This will be a super long session with two amazing speakers, so feel free to stop by and ask any questions you may have :) 🔗: youtube.com/watch?v=LMk8nq…
We are ending strong with GPU Programming 🚀! 2 talks today back to back! First @exists_forall for intro to CUDA and then @simran_s_arora for Thunder Kittens 🐈! Today at: 1:00pm EST / 11:00am PT - scale-ml.org/bootcamp
best explanation of YaRN i've heard youtube.com/watch?v=l6_fdw…
Talk starting now on the GPU MODE YT channel! youtube.com/watch?v=l6_fdw… x.com/jyo_pari/statu…
Day 4 of the @GPU_MODE x @scaleml series!!! as always, come join live to ask questions for @SonglinYang4 and enjoy!
Today is all about Positions! Excited to have our friend @SonglinYang4 go through some past work such as RoPE to get everyone up to speed, and share her latest work, PaTH 🛣️! Today at: 2:00 EST / 11:00 PT - scale-ml.org/bootcamp
Oct 17 at Toronto School of Foundation Modelling: @m_sirovatka will talk about model sharding, network topologies of large-scale clusters and how these pieces connect.
Happening now! Feel free to hop in and ask any questions you have: youtube.com/watch?v=k8PcSG…
New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions. Instead of contrived exams where progress ≠ value, we eval LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far:

Haque Ishfaq @HaqueIshfaq
1K Followers 1K Following PhD student at @mcgillu/ @MILAMontreal. Reinforcement Learning. BS, MS @Stanford 🇧🇩🇺🇸🇨🇦
Social Use @socialuseai
255K Followers 6K Following Where Social meets AI: Exploring the future of connected intelligence
Darsu @Darsu6668
95 Followers 1K Following
Lily @Lily4367623
59 Followers 247 Following My journey is not about perfection but about progress growth and becoming a better version of myself
Xueguang Ma @xueguang_ma
841 Followers 636 Following PhD student at @uwaterloo. Working on encoding the world into vectors. Prev. intern at @Meta, @MSFTResearch, @amazon
Lakshya A Agrawal @LakshyAAAgrawal
2K Followers 2K Following AI PhD @ UC Berkeley | GEPA Creator (https://t.co/EdPqvzj7k4) | Created https://t.co/YxPZsXZJeS | Past: AI4Code Research Fellow @MSFTResearch | Hobbyist Saxophonist
Lucas Sangdae Nam @luuk112233
0 Followers 47 Following
Anikait Singh @Anikait_Singh_
553 Followers 800 Following PhD'ing @StanfordAILab @stanfordnlp, Intern @MSFTResearch. Previously @ToyotaResearch @GoogleDeepMind @Berkeley_AI https://t.co/Qz5IOlHvqf
Hoang Phi Nguyen @nghgphi
4 Followers 42 Following
Mariita @houseofdr3
11 Followers 158 Following
Xiang Long @SDxFaith
26 Followers 542 Following MLE @ModelBest, ex-Alibaba Group/MSRA/Tencent AI Lab
Junxiao Yang @Junxiao_THU
11 Followers 165 Following I'm a first-year Ph.D. student at Tsinghua University, focusing on building safe and reliable LLMs. My personal website: https://t.co/s5qVMF1nBK
Yahya @AliSherGhumman_
0 Followers 39 Following Parallel Programming | Distributed Systems | Deep Learning
Grozziie @grozziie
23 Followers 455 Following Grozziie is your go-to global provider of printing, power, and attendance solutions.
Anoop Saha @asyncanoop
720 Followers 2K Following I correlate; therefore, I cause! 100k GPU cluster is all you need
Jerry Yin @JerryYin777
241 Followers 981 Following B.A. @UMNComputerSci | Contributor of LLM Yi & SenseNova5o|ex-Intern @ https://t.co/18PtAfVcDW @ THUNLP @ SenseTime | MLSys & LLM | CUDA & Triton
Aman Swar @AmanSwar_
2 Followers 141 Following MLSys. Hacking on CUDA kernels, compilers,and LLM infra. Pushing performance
Shivam Kumar @ShivamKumar212
198 Followers 3K Following Techno-optimist | AI Enthusiast | Go Developer | Learning everyday | Exploring the unexplored
Len Mo @LenMo477296
1 Followers 502 Following
Elontribe @oembkf0fae90530
0 Followers 25 Following Access 𝐌𝘶𝘀𝐤’𝐬 pre-IPO shares: SpaceX, Starlink, xAI, Neural ink. 1000x potential. Limited time. 👉 https://t.co/WNxT2xj8Gl
Geosh @Geoshh
98 Followers 978 Following Embodied A.I. | Socioaffective Alignment | Systems Biology & Interpersonal Neurobiology | @UChicago | @EuroGradSchool |healing,science,technology,connection
Oudbib Khalid @OudbibK
1 Followers 6 Following
𝗠𝘂𝗵𝗮𝗺... @evided
233 Followers 2K Following i am a graphics designer known to create captivating visual experiences. Join me on this creative journey. #GraphicsDesigner, #imrankhan
Ssh @Ssh1318005
1 Followers 273 Following
dnomsed @ddnomsed
136 Followers 848 Following doing what you like is freedom; liking what you do is happiness
Suhnylla Kler @SuhnyllaKler
2K Followers 5K Following +30 years value creation for CEOs of $100M++ Companies. Sharing journey. Now, LLMs, SLMs, GNNs, Synthesis, Emergence, data, and Appn/Deploymt with Automation
Barry wang @Bbwang347889
39 Followers 106 Following
Tuan-Hung VU @tuan_hung_vu
176 Followers 409 Following Senior Research Scientist at https://t.co/x8Ys2iYSHH
AmAzing- @amazing129
735 Followers 3K Following AI Products and Crypto Researcher. Building @byreal_io. Cooking @openomy_hub. Supporting @lobehub.
Alireza Noroozi @Alireza_N0roozi
47 Followers 461 Following AI [email protected] We are people who came from computer science to help Biology
Blake Jackson @blakexjackson
0 Followers 68 Following
kandalete @kandalete
8 Followers 287 Following
Vicky Singh @iVkeySingh
4 Followers 473 Following
Anuj Agrawal @AnujAgr10292366
46 Followers 805 Following Camera Systems Engineer at Qualcomm | ex Apple | MS ECE at UCLA | Exploring ideas in Robotics, Cameras and Vision
Huanxuan Liao @xn_hyacinth
23 Followers 219 Following 3rd year PhD Student @UCAS1978 | Focusing on Long Context Modeling
Quanquan Gu @QuanquanGu
16K Followers 2K Following Professor @UCLA, Pretraining and Scaling at ByteDance Seed | Recent work: Build AGI | Opinions are my own
Luiz Frias @mebroskee
29 Followers 154 Following
Lily Liu @eqhylxx
1K Followers 429 Following CS PhD Student, Sky Lab @UCBerkeley, @vllm_project, @OpenAI
Harmya @racerfunction
148 Followers 170 Following gpu enjoyer @tensarahq and prev @modal_labs, i like math, ml and guitar
Adrien Lemercier @adri_lemercier
198 Followers 655 Following Building. Prev quant @jumptrading, CS&math @Stanford, 3x IMO medalist.
Varun Ullanat @VUllanat
26 Followers 27 Following
Michael Tu @tuzhucheng
308 Followers 2K Following Research Engineer @Netflix. Prev. ML Engineer @apple. @UWaterloo grad.
Diane @dianetc_
154 Followers 237 Following Figuring things out slowly. MIT PhD student, prev: @UofMaryland
Kai Arulkumaran @kaixhin
8K Followers 5K Following Researcher, programmer, DJ, transhumanist. @SakanaAILabs @ArayaGlobal; formerly @imperialcollege @MSFTResearch Twitter @AIatMeta @GoogleDeepMind @nnaisense
Edward Z. Yang @ezyang
14K Followers 1K Following I work on PyTorch at Meta. Chatty alt at @difficultyang.
William Brandon @exists_forall
743 Followers 1K Following he/him • Trying to become compute-bound • PhD student at MIT CSAIL • Prev: CS & Math at UC Berkeley; ML Compilers at NVIDIA • Opinions my own
Peter Chen @peterxichen
3K Followers 2K Following Covariant CEO and Co-Founder. Previously @OpenAI, @UCBerkeley PhD.
vmoens @VincentMoens
1K Followers 681 Following TorchRL maintainer (@torchrl1) - PyTorch SWE @ Meta - London Neuroscience PhD, ex-MD vmoens on the butterfly platform
Tianyuan Zhang @tianyuanzhang99
2K Followers 920 Following PhDing in @MIT, towards general intelligence and lifelong machine. M.S. in CMU, B.S. in PKU.
Adam Zweiger @AdamZweiger
940 Followers 415 Following Rethinking how language models learn | Researcher @MIT_CSAIL
❄️Andrew Zhao❄... @_AndrewZhao
4K Followers 3K Following PhD @Tsinghua_Uni. Absolute Zero,ExpeL,Diver-CT Research Intern @MSFTResearch, Ex. @ BIGAI. Interested in RL, Reasoning/Safety 4 LLMs, Agents. On job market 26'
Melissa Pan @melissapan
2K Followers 528 Following CS PhD @UCBerkeley Sky Lab 🐻 Systems & AI & Sustainability 🌍 Prev: @google, @ibm, @CarnegieMellon🐕🦺, @UofT🇨🇦
suuun @qqqi_suuun
5 Followers 6 Following getting my phd @sciencetokyo_en; lifting barbell @SakanaAILabs
Lucia Cipolina Kun @LuciaCKun
1K Followers 5K Following Research engineer at META working on LLM agents, RL and Game Theory. Alternative account on art restoration: @ArtRestoreAI. Personal account not work.
Chenfeng_X @Chenfeng_X
1K Followers 945 Following PhD @UCBerkeley, Incoming Assistant Professor @UTCompSci, Senior Researcher @togethercompute. Working on building cooler things with fewer dollars 😊
Tian Jin @tjingrant
568 Followers 434 Following PhD student @MIT_CSAIL, previously @IBMResearch, @haverfordedu .
Sid @sid_srk
2K Followers 678 Following create things now: https://t.co/5VlY4Tnrho before: diffusion acceleration @runwayml pretraining and distributed training @cohere
Vincent Richard @vincrichard_ai
3 Followers 97 Following
Weijia Shi @WeijiaShi2
9K Followers 1K Following PhD student @uwnlp @allen_ai | Prev @MetaAI @CS_UCLA | 🏠 https://t.co/Q6Mzg8ow2j
Eddy Wu @eddywu_
65 Followers 237 Following scout @soma_capital, machine learning @distspectrum | physics & cs/ai @princeton
Christopher De Sa @chrismdesa
496 Followers 23 Following
Yoon Kim @yoonrkim
445 Followers 539 Following
Guangxuan Xiao @Guangxuan_Xiao
3K Followers 697 Following Ph.D. student at @MITEECS Prev: CS & Finance @Tsinghua_Uni
ivy @_ivyzhang
394 Followers 804 Following choreographing bits in search of open-endedness RS intern @sakana
j4orz @j4orz
156 Followers 68 Following i 🖤 tensor compilers. make the impossible hard, and the hard easy. talk is cheap, show me the code.
Tim Rocktäschel @_rockt
39K Followers 2K Following Director and Open-Endedness Team Lead @GoogleDeepMind, Professor of AI @AI_UCL, PI @UCL_DARK, Fellow @ELLISforEurope.
Jaya Gupta @JayaGup10
9K Followers 3K Following tweets about AI and other fun stuff. currently @foundationcap; previously McKinsey, @georgiatech alum, @stackfolio (acquired), @peak6, @raymondjames
Han Guo @HanGuo97
3K Followers 4K Following PhD Student @MIT_CSAIL | Past: @LTIatCMU @MITIBMLab @UNCNLP, @SFResearch, @BaiduResearch | Machine Learning, NLP.
Motoki Sato @aonotas
2K Followers 715 Following Working at Sakana AI (Applied Team) I'm interested in NLP and Speech Processing. #nlproc. (NAIST : Matsumoto Lab → PFN → Sakana AI).
George Hotz 🌑 @realGeorgeHotz
300K Followers 204 Following President @comma_ai. Founder @__tinygrad__