SGLang, verl, OpenBMB and Tsinghua University: Pioneering End-to-End Multi-Turn RLHF
We are thrilled to announce the release of the first fully functional, convergence-verified, end-to-end open source multi-turn Reinforcement Learning with Human Feedback (RLHF) framework,…
Demystifying Long CoT Reasoning in LLMs
arxiv.org/pdf/2502.03373
Reasoning models like R1 / O1 / O3 have gained massive attention, but their training dynamics remain a mystery. We're taking a first deep dive into understanding long CoT reasoning in LLMs!
11 Major…
"When you find a genius, give them all power."
I've been obsessed with this idea for 12 months now.
I learned it from Munger but then saw all successful people apply it.
From Steve Jobs to Robert Oppenheimer.
Thread:
This is a periodic reminder / recommendation to read this paper inside out. It is still the most helpful paper I've ever read. You may not have encountered it because it's not super popular like "Attention is all you need", but you WILL thank me.
Despite the paper's title,…
We're excited about the future of truly multimodal tokens, where input and output seamlessly integrate across different media types. 🚀 #MultimodalAI #AI #Innovation
I know your timeline is flooded now with word salads of "insane, HER, 10 features you missed, we're so back". Sit down. Chill. <gasp> Take a deep breath like Mark does in the demo </gasp>. Let's think step by step:
- Technique-wise, OpenAI has figured out a way to map audio to…
[1/n]
Happy to share our new work "MuPT: A Generative Symbolic Music Pretrained Transformer", encompassing a series of music generation models ranging from 190 million to 4.2 billion parameters, all based on the ABC Notation. According to human preference evaluations, our models…
Our 12 scaling laws (for LLM knowledge capacity) are out: arxiv.org/abs/2404.05405. Took me 4mos to submit 50,000 jobs; took Meta 1mo for legal review; FAIR sponsored 4,200,000 GPU hrs. Hope this is a new direction to study scaling laws + help practitioners make informed decisions
Introducing a new, fully open robotics dataset!
- 76k episodes
- 564 unique scenes
- 100 contributors
- 13 labs/institutions
- 3 continents
droid-dataset.github.io
A short 🧵 on the backstory
Today is the beginning of our moonshot to solve embodied AGI in the physical world. I’m so excited to announce Project GR00T, our new initiative to create a general-purpose foundation model for humanoid robot learning.
The GR00T model will enable a robot to understand multimodal…
Custom LLM and AI Agents (RAG) On Structured + Unstructured Data - AI Brain For Your Organization
Imagine a ChatGPT-like interface over all your structured (database) and unstructured data. Ideally, you want to ask a question to an AI bot, and it should be able to run multiple…
Failure Points In RAG Systems
Anyone who has tried to deploy a RAG system knows that there are several failure modes to watch out for.
While RAG helps you reduce hallucinations and create custom ChatLLM, there can be several failure points, given the complexity of the system.…
What is the correct recipe for finetuning LLMs for math reasoning? In MAmmoTH, we systematically study the SFT data composition and format for improving math, either in-distribution or out-of-distribution.
Key takes:
> Substantial, unprecedented performance gain for open-source…
Ever want to make your LLM inference go brrrrr but got stuck at implementing speculative decoding and finding the suitable draft model?
No more pain! Thrilled to unveil Medusa, a simple framework that removes the annoying draft model while getting 2x speedup. 🧵👇
50 Perplexity Prompts to Take Your Research to the Next Level.
Don't forget to bookmark this post, to try these with Bard, ChatGPT, and Claude. 👇🏾
1. Emerging Industry Trends
Prompt: "What are the emerging trends in [user input: specific industry] for the current year?"
2.…
35 Followers 58 Following · Student at the Georgia Institute of Technology. Studying Computer Science with a concentration in Intelligence and Theory. Twitter Developer Account.
760 Followers 1K Following · [email protected], Postdoc@tsinghua, working with Prof. Jie Tang. PhD advised by Prof. Yue Zhang. Prev: Interned @AWScloud. LLM Evaluation, Posttraining
1K Followers 1K Following · CS PhD student @HKUniversity. Previously M.S. in @Columbia. Intern at @MSFTResearch, prev. at @AlibabaGroup. LLM, SQL Intelligence, Code Gen for New User Exp
2K Followers 831 Following · Engineering tech lead for Qwen and Wan api, Qwen Chat, Founding member for ModelScope, AI enthusiastic, Racing lover, Opinions are my own.
24K Followers 1 Following · covering the latest AI & LLM research /// see "highlights" for all previous weekly threads /// building the best AI paper search engine @findmypapersai
81K Followers 583 Following · Film director | AI Consultant | Partner with https://t.co/Vn9g3Z63CI Paris | Sharing practical ways to use AI for you and your business. All views are my own.
1K Followers 103 Following · AI/RL researcher, Assistant Prof. at @Tsinghua_Uni, leading the RL lab at @AntResearch_, PhD at @berkeley_ai, frequent flyer and milk tea lover.
2K Followers 543 Following · PhD student/research scientist intern at @ucl_nlp/@GoogleDeepMind (50/50 split). Previously MS at @kaist_ai and research engineer at Naver Clova. #NLProc & ML
7K Followers 94 Following · https://t.co/FmX1B3nzjA is working on finding the scaling laws of agents. The first and the best multi-agent framework. Discord: https://t.co/DRweXf0nOl. Product @Eigent_AI
1K Followers 509 Following · Ph.D. Candidate. Currently focusing on multimodal reasoning and planning with large models. Past research interns: ByteDance Seed, Tencent PCG/AILab.
50K Followers 3K Following · Developer Experience Lead at @GoogleDeepMind
Building Gemini API, Gemma, AI Studio and more AI products. My views
ex-Chief Llama Officer @huggingface 🇵🇪🇲🇽
204K Followers 25 Following · Manus is the general AI agent that bridges minds and actions: it doesn't just think, it delivers results. Download our app: https://t.co/XSfjRhjdgo
47K Followers 110 Following · My new LM book: https://t.co/YXNQUy7O3t
PhD in AI, author of 📖 The Hundred-Page Language Models Book and 📖 The Hundred-Page Machine Learning Book
602K Followers 5K Following · President & CEO @ycombinator —Founder @Initialized—designer/engineer who helps founders—San Francisco Dem accelerating the boom loop—e/acc—technology brother
8K Followers 167 Following · Large Model Systems Organization: Join our Slack: https://t.co/mSPNyKTLTS We developed SGLang https://t.co/jEqIJcGwGA, Chatbot Arena (now @lmarena_ai), and Vicuna!
9K Followers 2K Following · Associate professor of @umdcs @umiacs @ml_umd at UMD. Researcher in #AI/#ML, AI #Alignment, #RLHF, #Trustworthy ML, #EthicalAI, AI #Democratization, AI for ALL.
3K Followers 3 Following · AI model built by the community, for everyone in this world
Part of the Linux Foundation, Apache 2 licensed
An RNN scaled to 14B params with GPT-level of perf
125K Followers 971 Following · Partner @a16z AI 🤖 and twin to @omooretweets | Investor in @elevenlabsio, @krea_ai, @bfl_ml, @hedra_labs, @WaveFormsAI, @ViggleAI, & more
12K Followers 3K Following · PhD-ing @MIT_CSAIL. Working on scalable and principled algorithms in #LLM and #MLSys. In open-sourcing I trust 🐳. she/her/hers