One of the most important papers in AI: a tiny brain-inspired 27M param model trained on 1000 samples outperforms o3-mini-high on reasoning tasks!
Still can't believe this tiny lab of Tsinghua grads gets 40% on ARC-AGI, solves hard sudoku and mazes.
We're still so early.
The real reason people from top universities succeed isn't just IQ. When you get into IIT or Harvard or MIT, you've already proven you can delay gratification for years. You studied when your friends partied. You chose hard classes when easy ones were available. You optimized for…
I study the history of software because most people think code innovation happens in a vacuum. They see React and think Facebook just invented components. They miss the decades of work on MVC patterns, the failed attempts at web components, the slow evolution from server-side…
From GPT to MoE: I reviewed & compared the main LLMs of 2025 in terms of their architectural design, from DeepSeek-V3 to Kimi K2.
Multi-head Latent Attention, sliding window attention, new Post- & Pre-Norm placements, NoPE, shared-expert MoEs, and more...
magazine.sebastianraschka.com/p/the-big-llm-…
🇮🇳 India at #ICML2025! From @lossfunk
📄 ACCEPTED PAPERS: 42
💡 SPOTLIGHTS: 6 (3 oral, 3 spotlight)
👥 AUTHORS: 96
🏆 GLOBAL RANK: #18
Thread with all papers & Indian authors below 👇
Grok 4 (Thinking) achieves new SOTA on ARC-AGI-2 with 15.9%
This nearly doubles the previous commercial SOTA and tops the current Kaggle competition SOTA
🤖🔬Today we are debuting Zochi, the world’s first Artificial Scientist with state-of-the-art contributions accepted in ICLR 2025 workshops.
Unlike existing systems, Zochi autonomously tackles some of the most challenging problems in AI, producing novel contributions in…
Here are the top AI Papers of the Week (April 14 - 20):
- GUI-R1
- AgentA/B
- DocAgent
- SocioVerse
- A Survey of Frontiers in LLM Reasoning
- Scaling Reasoning in Diffusion LLMs via RL
Read on for more:
🛠️ DeepSeek-R1: Technical Highlights
📈 Large-scale RL in post-training
🏆 Significant performance boost with minimal labeled data
🔢 Math, code, and reasoning tasks on par with OpenAI-o1
📄 More details: github.com/deepseek-ai/De…
🐋 4/n
arXiv -> alphaXiv
Students at Stanford have built alphaXiv, an open discussion forum for arXiv papers. @askalphaxiv
You can post questions and comments directly on top of any arXiv paper by changing arXiv to alphaXiv in any URL!
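The URL swap described above can be sketched in a few lines of Python. This is only an illustration: the helper name `to_alphaxiv` is hypothetical, and it assumes the alphaXiv site mirrors arXiv paths under the host `alphaxiv.org`, which is how the post reads but is not confirmed here.

```python
from urllib.parse import urlsplit, urlunsplit


def to_alphaxiv(url: str) -> str:
    """Rewrite an arXiv URL so that it points at the alphaXiv host instead.

    Assumption: alphaXiv serves the same paths as arXiv under alphaxiv.org.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host not in ("arxiv.org", "www.arxiv.org"):
        raise ValueError(f"not an arXiv URL: {url}")
    # Swap only the host; path, query, and fragment are left untouched.
    new_host = host.replace("arxiv", "alphaxiv")
    return urlunsplit(parts._replace(netloc=new_host))


# Example with an arbitrary paper ID, used purely for illustration.
print(to_alphaxiv("https://arxiv.org/abs/1706.03762"))
```

Restricting the rewrite to the host avoids touching an "arxiv" substring that might appear elsewhere in the path or query string.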
Announcing "Super Study Guide: Transformers & Large Language Models", a 250-page book with ~600 colored illustrations covering the concepts of the Stanford workshop that Shervine and I are teaching this summer.
5K Followers 3 Following
Tweeting interesting papers submitted at https://t.co/rXX8x0HzXV.
Submit your own at https://t.co/QhbJKXBd4Q, and link models/datasets/demos to it!
165K Followers 870 Following
Reality is programmable | Building digital leverage w/ AI | Stay ahead with the latest AI & robotics developments | 📧 [email protected]
18K Followers 366 Following
The top education and research institution in the 🌎 for #AI and #machinelearning | Research
→ https://t.co/jUD0hZ8SFx | Learn more ↓
26K Followers 173 Following
A North Star for open AGI. Co-founders: @fchollet @mikeknoop. President: @gregkamradt. Help support the mission - make a donation today.
7K Followers 652 Following
Research Scientist @AIatMeta
Previously Researcher @ Samsung AI
Outstanding Paper Award @icmlconf 2023
Action Editor @TmlrOrg
I tweet about ML papers and math
1K Followers 1 Following
Advancing the scientific method with Artificial Scientists. Designers of @Zochi_AS, the first AI system to publish in an A* conference.
18K Followers 2K Following
Mad Scientist, DeepDream creator. Designing Self-Organising Systems and Programmable Artificial Life. https://t.co/rntipHzHW3
163K Followers 0 Following
Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.
7K Followers 1K Following
Assistant Professor @UW, Principal Research Scientist @Nvidia. Prior Cofounder @NexusflowX, @Berkeley_EECS @Google @Microsoft. I work on LLMs.
15K Followers 6K Following
I build tough benchmarks for LMs and then I get the LMs to solve them. SWE-bench & SWE-agent. Postdoc @Princeton. PhD @nlpnoah @UW.
24K Followers 1 Following
Covering the latest AI & LLM research /// see "highlights" for all previous weekly threads /// building the best AI paper search engine @findmypapersai
79K Followers 1 Following
Democratizing AI research, education, and technologies. Learn how to build with AI in our new AI Academy: https://t.co/zQXQt0Pem8
20K Followers 352 Following
Professor at Imperial College London and Principal Scientist at Google DeepMind. Tweeting in a personal capacity. To send me a message please use email
50K Followers 3K Following
AI alignment + LLMs at Anthropic. On leave from NYU. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
57K Followers 568 Following
Assistant Prof of CS @UWaterloo, Faculty @VectorInst, Canada @CIFAR_News AI Chair. Joining @NYU_Courant September 2026. Co-EiC @TmlrOrg. I lead @TheSalonML.
2K Followers 408 Following
Courant Institute of Mathematical Sciences, home of @NYUniversity's Math and Computer Science departments. https://t.co/rlQfyJ0l9l