The Illustrated DeepSeek-R1
Spent the weekend reading the paper and sorting through the intuitions. Here's a visual guide to the main ideas behind the model and the process that created it.
Link in the first reply. All feedback welcome.
“How could I be sure it wasn’t a spoof call?”
2024 physics laureate Geoffrey Hinton received a phone call from Stockholm in the early hours in a hotel room in California. Multiple Swedish accents helped reassure him that his #NobelPrize in Physics, awarded today, was real.
Physics Nobel Prize goes to the AI people! This is huge!
Also, I believe Terry Sejnowski's contributions to artificial neural networks should have been recognized alongside Geoffrey Hinton and JJ Hopfield.
#NobelPrize
New (2h13m 😅) lecture: "Let's build the GPT Tokenizer"
Tokenizers are a completely separate stage of the LLM pipeline: they have their own training set, training algorithm (Byte Pair Encoding), and after training implement two functions: encode() from strings to tokens, and…
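A minimal sketch of the idea described above, not the lecture's actual code: a byte-level Byte Pair Encoding trainer plus an `encode()` function. The names `train_bpe`, `merge`, and `num_merges` are hypothetical, and the assumption here is that training starts from raw UTF-8 bytes and assigns new token ids from 256 upward.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merge rules from the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = {}  # (left_id, right_id) -> new_id, kept in learned order
    for step in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break  # nothing left to merge
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        new_id = 256 + step  # ids 0..255 are reserved for raw bytes
        ids = merge(ids, best, new_id)
        merges[best] = new_id
    return merges


def encode(text, merges):
    """Turn a string into token ids by replaying the merges in learned order."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():
        ids = merge(ids, pair, new_id)
    return ids


merges = train_bpe("aaabdaaabac", num_merges=3)
tokens = encode("aaabdaaabac", merges)
print(tokens)
```

The tokenizer is trained on its own text, separately from the model, which is why `train_bpe` takes nothing but a string and a merge budget.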
People seem to be falling for two rather thoughtless extremes:
1. "LLMs are AGI, they work like the human brain, they can reason, etc."
2. "LLMs are dumb and useless."
Reality is that LLMs are not AGI -- they're a big curve fit to a very large dataset. They work via…
Whenever you are contemplating participating in @kaggle competitions and have heard someone say they are too far removed from practical data science work, consider this example:
In the recent Science LLM competition participants learned among many other things:
- How to…
Embrace the randomness
As promised, here is my first post about potentially useful tips and tricks for training deep learning models. Some of these posts might be quite long, some will be shorter. The first thing we will talk about is randomness when fine-tuning models.
When…
There's been a lot of excitement around fine-tuning recently, both in open source and with OpenAI's API.
Here’s a list of some of our favorite resources, use-cases, and experiments on the topic over the last ~week
🧵
To start with Machine Learning:
1. Learn Python
2. Practice using Google Colab
Take these 2 free courses:
• Introduction to Python Programming (Udacity)
• Machine Learning Crash Course (Google)
If you need a bit more time before diving deeper, finish the following Kaggle…
Building AI applications will be one of the most crucial skills for the next 20 years.
If I were starting today, I'd learn these:
• Python
• OpenAI API
• Langchain
Here is the most comprehensive, free Langchain certification that you'll find online: