The most comprehensive LLM architecture analysis I've read.
Covers every flagship model:
1. DeepSeek V3/R1
2. OLMo 2
3. Gemma 3
4. Mistral Small 3.1
5. Llama 4
6. Qwen3
7. SmolLM3
8. Kimi K2
9. GPT-OSS
Great article by @rasbt 🙌
Link in the comments 👇
♻️ Repost if you…
A free book 👇
"Foundations of Large Language Models" by Tong Xiao and Jingbo Zhu
It's good to refresh the core concepts and techniques behind LLMs.
This 230-page book covers topics such as:
- Pre-training
- Generative models (training, fine-tuning, memory, scaling)
-…
1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵
The freshest AI/ML research of the week
Our top 9
▪️ Sotopia-RL: Reward Design for Social Intelligence
▪️ Agent Lightning: Train ANY AI Agents with RL
▪️ Exploitation Is All You Need... for Exploration
▪️ Learning to Reason for Factuality
▪️ VeOmni
▪️ Is Chain-of-Thought…
Been working with HRM and getting mixed results. The AdamAtan2 usage is interesting. The paper covers Sudoku and ARC-AGI 1/2, which are essentially step-based, grid-structured problems. Anyone working with HRM and finding other interesting examples? Seen tons of hype, but very few people implementing it.
Training an LLM on 8 M4 Mac Minis
The Ethernet interconnect between Macs is 100x slower than NVLink, so the Macs can’t synchronise model gradients every training step.
I got DiLoCo running so the Macs synchronise once every 1000 training steps, using 1000x less communication than DDP.
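A minimal sketch of the idea, not the code from this run: each worker takes many local steps on its own data shard, and only then do the workers average their parameter change and apply it with an outer Nesterov-momentum step. Everything here is an assumption for illustration: the name `diloco_toy`, the one-parameter quadratic loss, the learning rates, and plain SGD as the inner optimizer (the DiLoCo paper uses AdamW inside each worker).

```python
def shard_grad(theta, shard):
    # Gradient of the mean of 0.5 * (theta - x)^2 over one worker's shard.
    return sum(theta - x for x in shard) / len(shard)


def diloco_toy(shards, rounds=30, inner_steps=50,
               inner_lr=0.05, outer_lr=0.7, momentum=0.9):
    """Toy DiLoCo loop on a single scalar parameter (hypothetical setup)."""
    theta = 0.0      # global parameter shared by all workers
    velocity = 0.0   # outer momentum buffer
    syncs = 0        # number of cross-worker communication rounds

    for _ in range(rounds):
        deltas = []
        for shard in shards:
            # Inner phase: each worker trains alone, no communication.
            local = theta
            for _ in range(inner_steps):
                local -= inner_lr * shard_grad(local, shard)
            # The "outer gradient" is how far this worker moved.
            deltas.append(theta - local)

        # The only communication: average the outer gradients once.
        outer_grad = sum(deltas) / len(deltas)
        syncs += 1

        # Outer step with Nesterov momentum.
        velocity = momentum * velocity + outer_grad
        theta -= outer_lr * (outer_grad + momentum * velocity)

    # DDP would all-reduce gradients on every one of these steps instead.
    ddp_syncs = rounds * inner_steps
    return theta, syncs, ddp_syncs


# Eight workers, each holding a different slice of the data.
shards = [[float(i + j) for j in range(4)] for i in range(8)]
theta, syncs, ddp_syncs = diloco_toy(shards)
print(f"theta={theta:.4f}, DiLoCo syncs={syncs}, DDP syncs={ddp_syncs}")
```

The communication saving is just the ratio `ddp_syncs / syncs`, which equals `inner_steps`; with 1000 inner steps that is the 1000x figure above.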
Local Deep Research - A local LLM research assistant that generates follow-up questions and uses DuckDuckGo for web searches
- Runs 100% locally with Ollama
- Works with Mistral 7B or DeepSeek 14B
- Generates structured research reports with sources
You can solve 80% of interview problems about strings with a basic approach.
But if the question is tricky, you probably have to think about tries.
Tries are unique data structures you can use to represent strings efficiently.
This is how to use them: ↓
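A minimal sketch of a trie, assuming a dictionary of children per node; the class and method names (`Trie`, `insert`, `contains`, `starts_with`, `words_with_prefix`) are illustrative choices, not taken from the thread.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if a stored word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            # Reuse the existing child or create it on the fly.
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def _walk(self, prefix):
        # Follow the prefix character by character; None if it falls off.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        return self._walk(prefix) is not None

    def words_with_prefix(self, prefix):
        # Collect every stored word below the prefix node (autocomplete).
        start = self._walk(prefix)
        if start is None:
            return []
        found, stack = [], [(start, prefix)]
        while stack:
            node, text = stack.pop()
            if node.is_word:
                found.append(text)
            for ch, child in node.children.items():
                stack.append((child, text + ch))
        return sorted(found)


trie = Trie()
for w in ["car", "card", "care", "dog"]:
    trie.insert(w)
print(trie.contains("car"), trie.contains("ca"), trie.words_with_prefix("car"))
```

Lookups and inserts cost time proportional to the length of the word, not the number of stored words, and words that share a prefix share nodes.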
OpenAI has released a new prompting guide for their reasoning models.
It emphasizes keeping prompts simple, avoiding chain-of-thought prompts, using delimiters, and knowing when to use reasoning models.
Here’s a breakdown and an optimized prompt to have it write like you:
Training LLMs with Reinforcement Learning (RL) isn’t a new idea.
So why does it suddenly seem to be working now (o1/DeepSeek)?
Here are a few theories and my thoughts on each of them: (1/N)