Meta could have trained DeepSeek-V3 at least 15 times with the compute budget of the Llama 3 model family (39.3 million H100 hours).
Meanwhile, DeepSeek spent only 2.6 million H800 hours (the H800 is a handicapped, weaker variant of the H100) on a much better model.
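A quick sanity check of the "at least 15 times" figure, as a minimal sketch. It uses only the two GPU-hour numbers quoted in the post and treats an H800 hour and an H100 hour as interchangeable, which is a simplifying assumption (the post itself notes the H800 is the weaker part). The function and constant names are made up for illustration.

```python
# GPU-hour figures exactly as quoted in the post (not independently verified here).
LLAMA3_FAMILY_H100_HOURS = 39.3e6
DEEPSEEK_V3_H800_HOURS = 2.6e6


def training_runs_affordable(budget_hours: float, cost_hours: float) -> float:
    """Return how many runs costing `cost_hours` fit into `budget_hours`."""
    if cost_hours <= 0:
        raise ValueError("cost_hours must be positive")
    return budget_hours / cost_hours


# Assumption: one H800 hour is counted the same as one H100 hour.
ratio = training_runs_affordable(LLAMA3_FAMILY_H100_HOURS, DEEPSEEK_V3_H800_HOURS)
print(f"{ratio:.1f}x")  # prints "15.1x"
```

Because the H800 is the slower chip, counting its hours one-for-one makes this raw ratio a conservative estimate of the gap.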
Chain of Thought Reasoning without Prompting
arxiv.org/abs/2402.10200 (NeurIPS 2024)
Chain-of-thought (CoT) reasoning ≠ CoT prompting. While the term "chain of thought" was popularized through prompting, it now primarily refers to the generation of step-by-step reasoning – the…