Aashu Singh @iam_aashusingh
ML Engg @Facebook Alum @GeorgiaTech Joined April 2010
340 Tweets 95 Followers 511 Following 988 Likes
@giffmana @laurence_ai @TheGregYang Shameless plug: there is this outdated blog notion.so/cloneofsimo/Wh… that you actually gave some input on lol
a good set of tips for GRPO RL training in @willccbb's verifiers repo
New video, starting to look at Diffusion Language Models. This one introduces some ideas, then shows how I turn ModernBERT into a LLaDA-style generative model. Lots of avenues to explore from here! Join me in playing with this? Project ideas in thread :) youtube.com/watch?v=Ds_cTc…
I love Cutlass, and this new Python DSL looks very well-designed. Will for sure accelerate kernel dev + exploring new ideas in ML + GPU. I'm already playing with it and having fun
We’re also releasing the SkyAgent-v0 models which achieve promising results on SWE-Bench-Verified across model lines. Check it out! Blog: novasky-ai.notion.site/skyrl-v0 Model Collection: huggingface.co/collections/No… Github: github.com/NovaSky-AI/Sky… 3/N
A deep conversation with @SavinovNikolay, the Gemini long context pre-training co-lead… We go from the basics to what is needed to scale to infinite context to long context best practices for devs:
Thrilled to share our new paper: MetaQueries! We've created a novel approach that bridges MM-LLMs and diffusion models using learnable queries. The method enables knowledge-augmented image generation while preserving SOTA understanding capabilities.
Llama4 models are out! Open sourced! Check them out: “Native multimodality, mixture-of-experts models, super long context windows, step changes in performance, and unparalleled efficiency. All in easy-to-deploy sizes custom fit for how you want to use it” llama.com
Pretty cool "Multi-Head Attention Shape Transformations (Cheat Sheet)" shared by a reader: github.com/rasbt/LLMs-fro…
We are bringing back Stanford’s CS 25 Transformers Course (cs25.stanford.edu) today! It’s open to everybody! This is one of @Stanford's hottest seminar courses. We open the course through Zoom to the public. Lectures start today (Tuesdays), 3-4:20pm PDT, at…
Lecture 15: Quantization (Guest lecture by @Tim_Dettmers) youtu.be/YXZZaje76r4 - Quantization basics - Quantized foundation models: LLM.int8() - Finetuning foundation models: QLoRA - Quantization and users
Since launching Agent S2, many folks working on GUI/computer-use agents asked for our tech report. Here we go! 🎉New SOTA on 3 major computer use benchmarks. • OSWorld (15 steps): 27.0% 🚀 (+18.9%) • OSWorld (50 steps): 34.5% 🚀 (+32.7%) • WindowsAgentArena: 29.8% 🚀…
Blog post: all-hands.dev/blog/introduci… Model: huggingface.co/all-hands/open…
🚨Multi-Token Attention🚨 📝: arxiv.org/abs/2504.00927 Attention is critical for LLMs, but its weights are computed by single query & key vectors, limiting capability. MTA combines query, key & head operations over multiple tokens, improving performance in terms of PPL, std…
Interesting paper: Video-R1 improves temporal reasoning in MM LLMs using T-GRPO a variant of GRPO and high quality curated data for SFT. Here's a summary: medium.com/@aashus18_1308… Original paper: arxiv.org/abs/2503.21776
🎨 Understanding GPU Architecture from Cornell This GPU architecture roadmap is a good starting point for diving deeper, along with the CUDA C++ programming guide PDF - both freely available from Cornell and NVIDIA.
I read the R1-Zero paper and the method is very simple: just a tweak to PPO to fine-tune DeepSeek-V3 base using a verifiable sparse binary reward. The fact that they got it to work even though others failed is likely due to better data and/or their very efficient implementation
For those trying to understand DeepSeek's Group Relative Policy Optimization (GRPO): GRPO is just PPO without a value function, using Monte Carlo estimates of the advantage. So, study why PPO exists (lots of docs / writing on that) and understand that value functions are tricky…
I re-recorded the post-training part of our NeurIPS tutorial on language models, added some more slides, and wrote up a mini state of the union on @interconnectsai. Enjoy! Links in QT. 00:00 Introduction 10:00 Prompts & Skill Selection 14:19 Instruction Finetuning 21:45…
10 short videos about LLM infrastructure to help you appreciate Pages 12-18 of the DeepSeek-v3 paper (arxiv.org/abs/2412.19437) 🧵 youtube.com/watch?v=76gulN…

TheresaHazlitt @79547WvUI7pTr
0 Followers 91 Following
Aruheej @Aruheej182
23 Followers 1K Following
Hawehe @Hawehe1847549
63 Followers 2K Following
fx__evoIutıons… @fx_evoIution
1K Followers 8K Following 🚀Ready to level up? Join 20,000+ traders getting free weekly insights & pro strategies. 💹 Don't miss out-grab yours now! 👉 https://t.co/PCglJC36Te
Alice Cruickshank @AliceC73115
105 Followers 2K Following
Srikanth Vidapanakal @sreak1089
807 Followers 4K Following Founder, https://t.co/ggFrmJ8WEw Research Engineer, Data Scientist, Applied Math guy, interested in building embodied intelligence products
Gail Ward @GailWard361957
40 Followers 3K Following
Apurva Pathak @technoapurva
125 Followers 462 Following Software Engineer @ Facebook | Ex- Microsoft | Alumni University of California San Diego | NIT Rourkela
Shlok Kumar Mishra @shlokkkk
427 Followers 1K Following Research Scientist @AIatMeta | Prev @GoogleAI | CS PhD UMD
returnhome @returnhome7
716 Followers 3K Following
Mr. Jack Tung @MrJackTung
294 Followers 6K Following
Gpbhupinder @gpbhupinder
472 Followers 7K Following 👨💻 Full-Stack Developer & AI Integration Expert 🚀 From concept to launch, we bring your tech vision to life
Wenhao Chai @wenhaocha1
2K Followers 2K Following Ph.D. Student @PrincetonCS. Prev @Stanford @UW @pika_labs @MSFTResearch @UofIllinois @ZJU_China. I used to work on computer vision, but it's not all I do.
Eva Louise Marie Gabr... @e681554349
11 Followers 7K Following
King Hong Chuang @KingHongChuang
30 Followers 2K Following
Dung Doan @dungdx34
330 Followers 8K Following
Satya Narayan Shukla @ImSNShukla
435 Followers 663 Following Senior Research Scientist @MetaAI | PhD @UMassAmherst | Prev @MSFTResearch, @facebookai and @Bosch_AI | BTech @IITKgp
Xichen Pan @xichen_pan
624 Followers 494 Following CS Ph.D. Student @NYU_Courant, Visiting Researcher @metaai | Prev: @MSFTResearch, @AlibabaGroup, https://t.co/EVVU493Kwp, @sjtu1896
λux @novasarc01
20K Followers 2K Following tensor shepherd in a non-euclidean pasture | grazing on cuda cores
Miroslav Pekárek @MiroslavPe79985
1K Followers 8K Following
Abhay Sharma @abhay110011
31 Followers 739 Following
Make money easily @sGqXS4i7Ojsuj
16 Followers 575 Following MEXC focuses on financial management, stocks, cryptocurrencies, digital assets and investments. Currently, new users can get free dollars when they sign up.
Pramit Saha @PramitSaha5
333 Followers 1K Following DPhil Candidate @UniofOxford @oxengsci working with Alison Noble on #Multimodal #Federated Learning #PEFT | MASc @ECEUBC | @MICCAI Young Scientist Award Winner
chris Judge @judgefws
495 Followers 7K Following
SwissCognitive, AI Ve... @SwissCognitive
146K Followers 100K Following We are committed to unleashing the power of AI in the business world. With our AI research, advisory, and ventures, we bring a blend of expertise to the Table.
Martin Görner @martin_gorner
14K Followers 6K Following AI/ML engineer. Previously at Google: Product Manager for Keras and TensorFlow and developer advocate on TPUs. Passionate about democratizing Machine Learning.
Mehmet Can @mehmetcansvs
25 Followers 287 Following
Web Culture @realwebculture
204 Followers 2K Following Technology, AI, programming, crypto and blockchain enthusiast.
merve @mervenoyann
80K Followers 5K Following open-sourceress at @huggingface 🧙🏻♀️proud Aegean, I work on computer vision, VLMs & agents | gençleri serbest bırakın
MBH Corporation PLC @MBH_Corporation
11K Followers 13K Following Giving investors access to profitable businesses in the $1m-$10m EBITDA range through a 'Buy and Build' approach, creating shareholder value through synergies.
Gustavo Rayo 🇳🇮... @rayogustavo
91 Followers 820 Following Software developer, chess player. Interested in AI and languages.
Asif Razzaq @asifrazzaq1988
6K Followers 7K Following Unleashing AI's potential. Editor and CEO at @marktechpost : AI News Platform with over 1.5 Million Visits per month
Nathan Benaich @nathanbenaich
61K Followers 34K Following solo member of investment staff @airstreet @airstreetpress @stateofaireport @raais
Sergios Karagiannakos @KarSergios
2K Followers 1K Following Writing about AI on https://t.co/qn6ZyTwnrj | Senior Data Engineer at @CausalyAI | 📖 Deep Learning course: https://t.co/e3QHPwOBnA
Naman Goyal @NamanGoyal21
2K Followers 620 Following Research @thinkymachines, previously pretraining LLAMA at GenAI Meta
Tanmay Pal @tanmay_pal
45 Followers 80 Following
MIT CSAIL @MIT_CSAIL
326K Followers 21K Following MIT's Computer Science & Artificial Intelligence Laboratory (CSAIL). Media Inquiries: [email protected] Check out the latest CSAIL content ⬇️
Snehal Lokhande 🦋 @snehal3105
331 Followers 1K Following Be happy for this moment this moment is your life. Python | Cloud | Data Analytics
Jacob Kahn @jacob_d_kahn
181 Followers 3 Following AI Researcher at FAIR, @MetaAI. CS Faculty at @Penn.
Krishna Mohan @KMohan2006
3K Followers 337 Following Denoising present to hopefully get brighter future | loves diffusion models
SemiAnalysis @SemiAnalysis_
34K Followers 16 Following
Noam Brown @polynoamial
91K Followers 853 Following Researching reasoning @OpenAI | Co-created Libratus/Pluribus superhuman poker AIs, CICERO Diplomacy AI, and OpenAI o3 / o1 / 🍓 reasoning models
Bert Maher @tensorbert
3K Followers 342 Following I’m a software engineer building high-performance kernels and compilers at Anthropic! Previously at Facebook/Meta (PyTorch, HHVM, ReDex)
Zach Mueller @TheZachMueller
12K Followers 591 Following Let's make billions of parameters go brr https://t.co/rUxXIfNpwh
Jiawei Zhao @jiawzhao
3K Followers 242 Following Research Scientist at Meta FAIR @AIatMeta, PhD @Caltech, GaLore, DeepConf
Jacob Austin @jacobaustin132
7K Followers 918 Following Research at @GoogleDeepMind. Currently making LLMs go fast. I also play piano and climb. NYC. Opinions my own
ARC Prize @arcprize
26K Followers 173 Following A North Star for open AGI. Co-founders: @fchollet @mikeknoop. President: @gregkamradt. Help support the mission - make a donation today.
Feng Yao @fengyao1909
1K Followers 634 Following Ph.D. student @UCSD_CSE | Intern @Amazon Rufus Foundation Model Ex. @MSFTResearch @TsinghuaNLP
Jack Morris @jxmnop
45K Followers 975 Following research @cornell @meta // language models, information theory, science of AI
Vipin PIllai @vipin2pillai
127 Followers 517 Following Applied Scientist at Amazon Just Walk Out (previously Amazon Go) Computer Vision Ph.D. from UMBC.
verl project @verl_project
1K Followers 5 Following Open RL library for LLMs. https://t.co/Xpaq0thhgi Join us on https://t.co/uWI5Zbd6IH
Denny Zhou @denny_zhou
21K Followers 541 Following Founded & lead the Reasoning Team in Google Brain (now part of Google DeepMind). Build LLMs to reason. Opinions my own.
Kimi.ai @Kimi_Moonshot
50K Followers 98 Following Built by Moonshot AI to empower everyone to be superhuman.
Micah Goldblum @micahgoldblum
8K Followers 765 Following 🤖Prof at Columbia University 🏙️. All things machine learning.🤖
David Hall @dlwh
3K Followers 1K Following Research Engineering Lead at @StanfordCRFM. Previously co-founder at Semantic Machines ⟶ MSFT. Lead developer of Levanter and Marin @[email protected]
Alexander Kolesnikov @__kolesnikov__
12K Followers 192 Following
Xiaohua Zhai @XiaohuaZhai
11K Followers 311 Following Researcher at Meta (previously at OpenAI Zürich, Google DeepMind)
Wenhu Chen @WenhuChen
22K Followers 663 Following AI researcher. Interested in Reasoning, Multimodal. I direct TIGER-Lab. Author of PoT, MMMU, MMLU-Pro, MAmmoTH, LongRAG, MAP-Neo, YuE, VL-Rethinker
William Wang @WilliamWangNLP
19K Followers 759 Following CEO & Founder, @AlphaDesignAI. We make https://t.co/1LfDYicsF2 I'm also Mellichamp Chair Prof. at UCSB CS. PhD @ CMU SCS.
Percy Liang @percyliang
84K Followers 417 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Ai2 @allen_ai
73K Followers 409 Following Breakthrough AI to solve the world's biggest problems. › Join us: https://t.co/MjUpZpKPXJ › Newsletter: https://t.co/k9gGznstwj
Rose Yu @yuqirose
9K Followers 575 Following Machine Learning Prof @UCSanDiego, Scholar @amazon, Previously @google, @Northeastern, @Caltech, @USC, #Physics-Guided #AI, MIT TR-35 Innovator.
Akari Asai @AkariAsai
18K Followers 867 Following Incoming Assistant Professor @SCSatCMU & research scientist @allen_ai. akariasai @ 🦋
Yu Su (hiring postdoc... @ysu_nlp
11K Followers 948 Following cooking something new. prof. @osunlp. sloan fellow. intelligence and agents. author of Mind2Web, SeeAct, MMMU, HippoRAG, BioCLIP, UGround.
Jim Fan @DrJimFan
325K Followers 3K Following NVIDIA Director of Robotics & Distinguished Scientist. Co-Lead of GEAR lab. Solving Physical AGI, one motor at a time. Stanford Ph.D. OpenAI's 1st intern.
Boshi Wang @BoshiWang2
2K Followers 507 Following Fourth-year Ph.D. @OhioState. Prev intern @MSFTResearch
Yoav Artzi @yoavartzi
17K Followers 183 Following Research/prof @cs_cornell + @cornell_tech🚡 / https://t.co/9YnWry7yHs / asso. faculty director @arxiv / building https://t.co/f9QkzO5kaC and @COLM_conf
Bill Yuchen Lin @billyuchenlin
23K Followers 3K Following Building Grok @xAI. Affiliate Assistant Prof @UW; Focusing on Grok Code for Macrohard now. Ex: @allen_ai, Google AI, Meta FAIR.
Jingbo Shang @shangjingbo
448 Followers 85 Following Assoc Prof at UC San Diego CSE & HDSI. Research on weak supervision and LLM. UIUC PhD.
MiniMax (official) @MiniMax__AI
18K Followers 11 Following Our mission is to build a world where intelligence thrives with everyone. MiniMax Agent: https://t.co/XzaTmAos0V
William Merrill @lambdaviking
5K Followers 668 Following Incoming Assistant Prof, Toyota Technical Institute at Chicago @TTIC_Connect Recruiting PhD students (start 2026) 👀 Will irl - TC0 enthusiast
Songlin Yang @SonglinYang4
12K Followers 3K Following PhD-ing @MIT_CSAIL. Working on scalable and principled algorithms in #LLM and #MLSys. In open-sourcing I trust 🐳. she/her/hers
Grad @Grad62304977
4K Followers 2K Following
Lifan Yuan @lifan__yuan
2K Followers 137 Following PhD student @uiuc_nlp @GoogleDeepMind. Prev: @TsinghuaNLP
Prime Intellect @PrimeIntellect
45K Followers 26 Following find compute. train models. contribute to open superintelligence. https://t.co/ZRZOsRRbwr
Radek Osmulski... @radekosmulski
28K Followers 593 Following LLMs and retrieval by day and other genres of AI when I get the chance 🧪 Senior AI Eng @NVIDIAAI 🏫 @fastdotai trained DL Eng 📝 https://t.co/By87iXx5Pu
Surya Ganguli @SuryaGanguli
18K Followers 523 Following Associate Prof of Applied Physics @Stanford, and departments of Computer Science, Electrical Engineering and Neurobiology. Venture Partner @a16z
vLLM @vllm_project
17K Followers 20 Following A high-throughput and memory-efficient inference and serving engine for LLMs. Join https://t.co/lxJ0SfX5pJ to discuss together with the community!
Leon Bottou @LeonBottou
255 Followers 8 Following
TNG Technology Consul... @tngtech
2K Followers 137 Following TNG, aka "The Nerd Group", is a consulting partnership focused on high end information technology, particularly AI. 906 employees, 99.9% academics, ~53% PhDs.
sunil kumar @__sunil_kumar_
2K Followers 560 Following ml research and eng @groundlightai ex. @meta @harveymudd