Nikhil Barhate @nikhilbarhate99

ML @scale_AI | prev @AMD @mila_quebec · nikhilbarhate99.github.io · San Francisco, CA · Joined June 2015

Tweets

1K
Followers

203
Following

813
Likes

3K

Aviral Kumar @aviral_kumar2

4 days ago

We have been doing work on scaling laws for off-policy RL for some time now and we just put a new paper out: arxiv.org/abs/2508.14881 Here, @preston_fu @_oleh lead a study on how to best allocate compute for training value functions in deep RL: 🧵⬇️

2 25 159 7K 93

Orion Weller @orionweller

a week ago

Instructions/reasoning are now everywhere in retrieval - we want embeddings to do it all! 🚀 But... is it even possible? 🤔 Turns out, it's not possible for single-vector models 😱 theoretically and empirically! To make it obvious we OSS a simple eval SoTA models flop on! 🧵

13 73 311 29K 207

Lucas Beyer (bl16) @giffmana

a week ago

@suchenzang @yaroslavvb No way!! I literally wrote libheatmap precisely to make heatmaps of event locations in dota2, in 2013 lucasb.eyer.be/articles/color…

2 2 25 2K 2

will brown @willccbb

a week ago

very recently (as of v0.1.3) figured out what i think is the "right" way to handle Rubric-level state + objects being made available to reward functions inside verifiers previously, you'd just declare extra things globally (def an anti-pattern, always bugged me) and i'd manually…

10 6 161 15K 92

Glen Berseth @GlenBerseth

2 weeks ago

VLAs offer an avenue for generalist robot policies; however, naively following the action predictions leads to brittle or unsafe behaviours. We introduce VLAPS, which integrates model-based search with pre-trained VLA policies to improve performance without additional training.

3 25 208 12K 145

Tu Trinh @thetututrain

4 weeks ago

Cooking up cool stuff at work 🍜🤖 had a great time building model debate for data quality!

Scale AI @scale_AI

4 weeks ago

5 15 78 12K 30

3 1 17 835 0

Guangxuan Xiao @Guangxuan_Xiao

a month ago

I've written the full story of Attention Sinks — a technical deep-dive into how the mechanism was developed and how our research ended up being used in OpenAI's new OSS models. For those interested in the details: hanlab.mit.edu/blog/streaming…

39 275 2K 247K 2K

Amy Deng @amydeng_

4 weeks ago

I spent the past months investigating: Can we trust reasoning models' CoTs? Researchers showed that LLMs aren't always faithful, but that's not the full story. LLMs are very faithful when the reasoning is complex, and unfaithful CoTs remain monitorable! Check out my latest work🥳

METR @METR_Evals

4 weeks ago

6 36 303 51K 142

1 4 50 6K 14

Feng Yao @fengyao1909

a month ago

Failing on 𝐥𝐚𝐫𝐠𝐞-𝐬𝐜𝐚𝐥𝐞 𝐑𝐋 with VeRL? ⚠️ Mixing inference backend (𝐯𝐋𝐋𝐌/𝐒𝐆𝐋𝐚𝐧𝐠) with training backends (𝐅𝐒𝐃𝐏/𝐌𝐞𝐠𝐚𝐭𝐫𝐨𝐧) 𝐬𝐞𝐜𝐫𝐞𝐭𝐥𝐲 𝐭𝐮𝐫𝐧𝐬 𝐲𝐨𝐮𝐫 𝐑𝐋 𝐢𝐧𝐭𝐨 𝐨𝐟𝐟-𝐩𝐨𝐥𝐢𝐜𝐲 — even if they share the same weights! 📉 Blog:…

13 117 717 128K 647

Demis Hassabis @demishassabis

a month ago

Genie 3 is here - it can generate an entire world simulation that you can interact with in real-time, just from a text prompt! It's pretty mind-blowing really when you stop to think about it, and it's rapidly improving - one day we will be able to build the Holodeck for real!

254 805 5K 489K 876

Suning Huang @suning_huang

a month ago

🚀 Excited to share our #CoRL2025 paper! See you in Korea 🇰🇷!🎉 We present ParticleFormer, a Transformer-based 3D world model that learns from point cloud perception and captures complex dynamics across multiple objects and material types! 🌐 Project website:…

6 19 107 16K 53

Brent Yi @brenthyi

a month ago

July has been a big month for Viser! - Released v1.0.0😊 - We did some writing Some demos👇

14 101 761 123K 375

Ali Madani @thisismadani

a month ago

Excited to have our AI research published in @Nature today. Proud of the @ProfluentBio team and the extensive final version available under open-access. OpenCRISPR is a milestone. It's the first successful demonstration of editing the human genome with a molecule fully designed…

Profluent @ProfluentBio

a month ago

4 60 272 98K 102

15 109 578 75K 221

David McAllister @davidrmcall

a month ago

Excited to share Flow Matching Policy Gradients: expressive RL policies trained from rewards using flow matching. It’s an easy, drop-in replacement for Gaussian PPO on control tasks.

8 199 1K 132K 931

Perry Dong @perryadong

2 months ago

Fine-tuning pre-trained robotic models with online RL requires a way to train RL with expressive policies Can we design an effective method for this? We propose EXPO, a sample-efficient online RL algorithm that enables stable fine-tuning of expressive policy classes (1/6)

1 10 57 37K 46

Simo Ryu @cloneofsimo

a month ago

There is no fucking way i wasnt aware of this work that came out this may, literally DAVID SILVER and JEFF DEAN is coauthor ???

21 77 1K 93K 1K

Misha Laskin @MishaLaskin

2 months ago

Engineers spend 70% of their time understanding code, not writing it. That’s why we built Asimov at @reflection_ai. The best-in-class code research agent, built for teams and organizations.

100 181 1K 344K 1K

Siddharth Ancha @siddancha

4 months ago

Diffusion/flow policies 🤖 sample a “trajectory of trajectories” — a diffusion/flow trajectory of action trajectories. Seems wasteful? Presenting Streaming Flow Policy that simplifies and speeds up diffusion/flow policies by treating action trajectories as flow trajectories! 🌐…

1 12 59 9K 39

Sukjun (June) Hwang @sukjun_hwang

2 months ago

Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data

98 745 5K 736K 4K

Ji Woong Kim @jwbkim

2 months ago

Introducing Hierarchical Surgical Robot Transformer (SRT-H), a language-guided policy for autonomous surgery🤖🏥 On the da Vinci robot, we perform a real surgical procedure on animal tissue. Collaboration b/w @JohnsHopkins & @Stanford