🚀 RL is powering breakthroughs in LLM alignment, reasoning, and agentic apps.
Are you ready to dive into the RL x LLM frontier?
Join us at @aclmeeting ACL’25 tutorial:
Inverse RL Meets LLM Alignment
this Sunday in Vienna 🇦🇹 (Jul 27th, 9am)
📄 Preprint at huggingface.co/papers/2507.13…
Now with Qwen’s RL-fine-tuning results, are we witnessing a quiet return of prompt optimization/engineering?
Now we have a 2-player game: users become “lazy prompters”, but the system prompts (e.g. thinking patterns) need to be highly optimized.
Next: Bi-level optimization?
📢New Paper on Process Reward Modelling 📢
Ever wondered about the pathologies of existing PRMs and how they could be remedied? In our latest paper, we investigate this through the lens of information theory! #icml2025
Here’s a 🧵on how it works 👇
arxiv.org/abs/2411.11984
Happy to share that our paper on "Active Reward Modeling" has been accepted to ICML 2025! #ICML2025
The part I like the most about the project is its simplicity!
Huge thanks to my amazing co-authors @ShenRaphael @HolarisSun
More to come!
For more detailed 🧵 see 👇
Heading to 🇸🇬ICLR next week!
Can’t wait to catch up with old friends and meet new ones — let’s chat about RL, reward models, alignment, reasoning, and agents!
Also, fun fact🤓: Yunyi won’t be there physically, but his digital twin will be attending instead. Stay tuned!