Adam Yanxiao Zhao @sdpkjc_adam
🧑‍🎓 CS PhD Student @UCAS1978 | 🤖 RL | 🏄‍♂️ Research Intern @Zai_org | 🦶 Ex-Intern @ LiAuto @SenseTime @ https://t.co/lQs9eBMtvx sdpkjc.com Joined June 2018
Tweets: 124
Followers: 46
Following: 287
Likes: 345
🚨Thrilled to share our latest progress on Computer Use Agent, ComputerRL, an end-to-end RL method which achieves 48.1% success rate on OSWorld Benchmark with only 9B open model, beating OpenAI Operator, Claude Sonnet 4.0, and other previous models, state-of-the-art performance.…
Lucky to have collaborated with an amazing team on this work! 🎉🚀😃
ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents "To support scalable and robust training, we develop a distributed RL infrastructure capable of orchestrating thousands of parallel virtual desktop environments to accelerate large-scale…
Introducing GLM-4.5 and GLM-4.5 Air: new flagship models designed to unify frontier reasoning, coding, and agentic capabilities. GLM-4.5: 355B total / 32B active parameters GLM-4.5-Air: 106B total / 12B active parameters API Pricing (per 1M tokens): GLM-4.5: $0.6 Input / $2.2…
Today we release DeepSeek-R1T-Chimera, an open weights model adding R1 reasoning to @deepseek_ai V3-0324 with a novel construction method. In benchmarks, it appears to be as smart as R1 but much faster, using 40% fewer output tokens. The Chimera is a child LLM, using V3s…
🥳 I'm releasing Rejax, a lightweight library of fully vectorizable RL algorithms! ⚡ Enjoy lightning-fast speed using jax.jit on the training function 🧬Use vmap and pmap on hyperparameters 🔙 Log using flexible callbacks 🌐 Available @ github.com/kerajli/rejax 📸 Take a tour!
Sorry to hear that @jsuarez5341, Open RL Benchmark was also rejected from RLC, and we mostly feel the same way about review quality (LLM-generated?). Among other things, we read that "the meaning of "metrics" is never made clear", whereas we have a section dedicated to metrics,…
The Open RL Leaderboard now fully supports all Stable Baselines 3 models! 🚀 Thanks to this update, it now compares over 10,000 models! 📈🎉 🏆 Leaderboard: huggingface.co/spaces/open-rl… 🐙 RL Zoo 3: github.com/DLR-RM/rl-base…
🆕 LeRobot 🤖 github.com/huggingface/le… 📈 Pre-trained robotics models 💾 Datasets of human collected demos 🔩 Modular architecture This is part of our efforts @huggingface to make 🤖 more accessible. By @RemiCadene @asoare159 @alibert_s @Thom_Wolf @AdilZtn , @HaixuanT ...
Which is the best RL agent on the Hub? Now you can, thanks to the Open RL leaderboard 🏆 ! 🧩 Features: - Automatic evaluation of models on the 🤗 Hub - Compatible with all torch-based RL libraries - Supports 87 environments, with more to come 🔥 huggingface.co/spaces/open-rl…
Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency. arxiv.org/abs/2403.00673
Announcing the Reinforcement Learning Beyond Rewards workshop at the first @RL_Conference. Think that rewards aren't enough for RL? Working on RLHF? Thinking of alternative ways of alignment? Creating a foundational model for RL? or have ideas on task-agnostic RL algo? Join us
I wrote a modification of CleanRL that runs with MLX, feel free to check it out or offer suggestions! github.com/andrew-silva/c… Thanks @awnihannun for the amazing library!
Super simple code change to get value-based deep RL scale *much* better w/ big models across the board on Atari games, robotic manipulation w/ transformers, LLM + text games, & even Chess! Just use classification loss (i.e., cross entropy), not MSE!! arxiv.org/abs/2403.03950🧵⬇️
Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents Presents an exploration-based trajectory optimization approach, which consistently surpasses baseline performance by a large margin repo: github.com/Yifan-Song793/… abs: arxiv.org/abs/2403.02502
Check out our Humanoid-Gym! Humanoid-Gym is an easy-to-use RL framework that emphasizes zero-shot sim2real transfer for humanoid robots! We construct specifically designed reward functions for humanoid robots, which greatly reduces the difficulty of the sim2real transfer.… https://t.co/pEqK6VZpph

MadgeJerry @4l9NOw9zX12mN05
28 Followers 2K Following
OctaviaAnn @Gf32SqbMVCvLG
8 Followers 565 Following
Tiny Anna Das 🙋... @tinyannadas
606 Followers 699 Following Software Engineer | Full Stack Python Developer | Building in public | AI + Startups
WebAgentlab @webagentlab
451 Followers 1K Following WebAgentLab is building an open-source community focused on Web Agent and the broader GUI Agent field.
Freeman Lewin @Freeman_Lewin
743 Followers 1K Following Brick layer behind @TryBrickroad Building the future of data licensing.
Tianbao Xie @TianbaoX
3K Followers 2K Following Ph.D. candidate @XLangNLP lab and @hkunlp2020 . Incoming @OpenAI . Advised by @taoyds and @ikekong . 🤝 @Alibaba_Qwen @SFResearch
hc @zhc_7
3 Followers 39 Following
Yvette Luettgen @luettgen51911
23 Followers 2K Following
Jin-Qiang(Richard) Wa... @JQWang2020
23 Followers 236 Following CS Ph.d student and Research on RL at @iLZU1909, Founder & maintainer of the Deep Reinforcement Learning (Chinese Community) https://t.co/jwqyVc9Vc7
Roger Creus Castanyer @creus_roger
632 Followers 773 Following Maximizing the unexpected return. PhD student @Mila_Quebec | Prev: @UbisoftLaForge @la_UPC @HP
Thatat @ThatatDrsEpxl
12 Followers 307 Following
Jonathan Lorraine @jonLorraine9
7K Followers 6K Following Research scientist @NVIDIA | PhD in machine learning @UofT. Previously @Google / @MetaAI. Opinions are my own. 🤖 💻 ☕️
Kun Lei @kunlei15
440 Followers 2K Following Now: CS Ph.D. student at WashU. Pre: RA @Tsinghua_Uni. Research on reinforcement learning algorithms and their real-world applications.
Bartłomiej Cupiał @CupiaBart
1K Followers 521 Following PhD Student @ University of Warsaw | @IDEAS_NCBR https://t.co/DrOexJe5Tf
Alexander Nikulin @how_uhh
345 Followers 787 Following Research Scientist, RL https://t.co/JesJsTrrTy | https://t.co/nYq9gTt9oQ
Antonio @manjavacas_
139 Followers 349 Following Reinforcement learner 🤖📚 PhD. fellow at @CanalUGR and @IFMIF_DONES
Nafis Faiyaz @Casio991ms
19 Followers 359 Following Developer in Therap BD, doing MSc in CSE at BUET. And really interested in Reinforcement Learning.
XTao @XTao
797 Followers 3K Following [email protected], [email protected], Worked@EinPlus, Worked@CreditEase, [email protected], [email protected], Worked@Hyenae, Studied CS&EE@BUPT.
Masudur Rahman @masud99r
271 Followers 494 Following PostDoc @PurdueEngineers @purdue_ie. Ph.D. @PurdueCS. Reinforcement Learning, Robotics, AI in Surgery. Interested in understanding Intelligence and Universe.
Flavius Moldovan @MrLeritaite
5 Followers 105 Following Building AI Agents. Interested in quantization and making llm runs faster. Building a startup.
Quentin Gallouédec @QGallouedec
3K Followers 664 Following PhD - Research @huggingface 🤗 TRL lead maintainer 🇫🇷 in 🇨🇦
Guillaume Thomas @GThomas3_1415
31 Followers 382 Following
Jackmin @jackminong
2K Followers 756 Following brutally slashing misbehaving computers @PrimeIntellect 🇺🇸. Previously @JinaAI_ 🇩🇪 @MoneyLion 🇲🇾.
Rishabh Agarwal @agarwl_
17K Followers 791 Following Reinforcement Learner, Adjunct Prof at McGill. Ex MSL Meta, DeepMind, Brain, Mila, IIT Bombay. NeurIPS Best Paper
Shangbin Feng @shangbinfeng
4K Followers 2K Following PhD student @uwcse @uwnlp. Model collaboration, social NLP, networks and structures. #水文学家
Ethan Xu @LinjieXu
165 Followers 636 Following 5th-year PhD stu. @GameAI_QMUL. Prev. intern at Microsoft Research and Apple. Working on Reinforcement Learning.
Dennis Soemers @DennisSoemers
318 Followers 555 Following Assistant Professor @UM_DACS. Opinions my own, but should be everyone's. Anon feedback: https://t.co/dWtWvc41ha https://t.co/TYaw5PODFv
Tom Dupuis @bellmantd
132 Followers 607 Following PhD student in Deep RL @CEA_List @ENSTAParis | Teaching @CentraleSupelec | Centrale Paris & MVA @ENS_ParisSaclay alumni
raviteja @raviteja_ankem
25 Followers 1K Following
Brenno de Mello @BrennoMeello
101 Followers 4K Following
Rujikorn Charakorn (T... @tan51616
258 Followers 2K Following 🇹🇭 Research @SakanaAILabs Ph.D @VISTEC_Thailand prev. intern @SakanaAILabs @naverlabseurope B.Eng. @chulalongkornU
Jaekyeom Kim @Jaekyeom__Kim
148 Followers 368 Following 🤖 Working on computer use agents and RL. Researcher at LG AI Research in Michigan 🇺🇸 https://t.co/haO05EiWPc
Seohong Park @seohong_park
4K Followers 532 Following Reinforcement learning | CS Ph.D. student @berkeley_ai
Florian Felten @FlorianFelten1
116 Followers 198 Following PostDoc @ETH Zürich | Optimization and RL stuffs Bike rider / trail runner when I'm not lazy Sprezzatura apprentice
simida @simida35378839
1 Followers 1 Following
siiiiiiiiiiily @siiiiiily
1 Followers 3 Following
Yu Su (hiring postdoc... @ysu_nlp
11K Followers 948 Following cooking something new. prof. @osunlp. sloan fellow. intelligence and agents. author of Mind2Web, SeeAct, MMMU, HippoRAG, BioCLIP, UGround.
Alex Shaw @alexgshaw
296 Followers 462 Following Researching @LaudeInstitute & investing @LaudeVentures Co-creator of Terminal Bench. Formerly Google. BYU alum.
Tianbao Xie @TianbaoX
3K Followers 2K Following Ph.D. candidate @XLangNLP lab and @hkunlp2020 . Incoming @OpenAI . Advised by @taoyds and @ikekong . 🤝 @Alibaba_Qwen @SFResearch
Yuxiao Dong @ericdongyx
235 Followers 233 Following Associate Prof. of Tsinghua CS @thukeg reasoning & vision & agent for #llms prev. @MetaAI @MSFTResearch
Z.ai @Zai_org
15K Followers 142 Following The AI lab behind GLM models, dedicated to inspiring the development of AGI to benefit humanity. https://t.co/b6zGxJvzzS
jietang @jietang
3K Followers 108 Following Professor @ Tsinghua University Artificial General Intelligence, Large Language Model
Xiao Liu (Shaw) @ShawLiu12
569 Followers 168 Following PhD @Tsinghua @THUKEG Developing P-Tuning, ChatGLM, AgentBench, and AutoGLM. 📖 Sharing paper digest on LLMs.
Yiping Wang @ypwang61
1K Followers 1K Following Ph.D. @uwcse. undergraduate @ZJU_China. I'm interested in mathematics, agi, and physics.
OpenRouter @OpenRouterAI
53K Followers 304 Following Discover and use the latest LLMs. 500+ models (incl. 50+ free), explorable data, private chat, & a unified API. https://t.co/qJG5mKrigL
Daniel Han @danielhanchen
28K Followers 2K Following Building @UnslothAI. Finetune train LLMs faster. LLMs bug hunter. OSS package https://t.co/aRyAAgKOR7. YC S24. Prev ML at NVIDIA. Hyperlearn used by NASA.
Jiaxin Wen @jiaxinwen22
4K Followers 271 Following CS PhD student @UCBerkeley. Part-time @AnthropicAI. Part-time eater. Prev @Tsinghua_Uni. Try to understand and control intelligence as a human.
Robert Lange @RobertTLange
9K Followers 603 Following Founding Research Scientist @SakanaAILabs 🎏 💬 Agentic Discovery 🔬 AI Scientist 🧬 EvoLLM 🏋️ gymnax 🦎 evosax 🤹 MLE-Infra Ex: SR & Intern @Google DeepMind
xAI @xai
1.8M Followers 38 Following
Stefano Albrecht @s_albrecht
2K Followers 161 Following Research in AI and machine learning for autonomous systems. MIT Press textbook: https://t.co/6j2qF00zsU Views are my own.
Sam Devlin @smdvln
2K Followers 1K Following AI Research Scientist @Meta. Previously @MSFTResearch and @UniOfYork
DAIR.AI @dair_ai
79K Followers 1 Following Democratizing AI research, education, and technologies. Learn how to build with AI in our new AI Academy: https://t.co/zQXQt0Pem8
Jin-Qiang(Richard) Wa... @JQWang2020
23 Followers 236 Following CS Ph.d student and Research on RL at @iLZU1909, Founder & maintainer of the Deep Reinforcement Learning (Chinese Community) https://t.co/jwqyVc9Vc7
RLDM @RLDMDublin2025
191 Followers 4 Following The Multi-disciplinary Conference on Reinforcement Learning and Decision Making. 11-14 June 2025. Trinity College Dublin.
John Schulman @johnschulman2
65K Followers 1K Following Recently started @thinkymachines. Interested in reinforcement learning, alignment, birds, jazz music
Peiyi Wang @sybilhyz
11K Followers 302 Following PhD @PKU1898; Researcher @deepseek_ai; Recent: DeepSeek-R1/CoderV2/Math/V1/V2/V3, Mathshepherd, FairEval, Speculative Decoding.
DeepSeek @deepseek_ai
973K Followers 0 Following Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.
Zihan Wang - on RAGEN @wzihanw
23K Followers 609 Following PhD Student @NorthwesternU. Intern @yutori_ai. I study PhysiCS of LLM. Ex @deepseek_ai @uiuc_nlp @RUC. RAGEN | Chain-of-Experts | ESFT.
Nat McAleese @__nmca__
14K Followers 353 Following Research @AnthropicAI. Previously @OpenAI, @DeepMind. Views my own.
LDJ @ldjconfirmed
6K Followers 532 Following e/λ Currently: Doing some stuff with AI. Prev founding team of both: @NousResearch and @TTSLabsAI DM for interesting conversations.
Hongyu Ren @ren_hongyu
23K Followers 691 Following research @meta superintelligence. CS PhD @stanford. prev @openai, led the development of o3-mini and o1-mini.
Zhiqing Sun @EdwardSun0909
19K Followers 1K Following Agents @Meta MSL TBD Lab. previously posttraining research @OpenAI train LLMs to do things: deep research, chatgpt agent, etc. CS PhD @LTIatCMU
john allard 🇺🇸 @john__allard
2K Followers 281 Following chatgpt personalization @ openai | views are explicitly disavowed by employer
Lifan Yuan @lifan__yuan
2K Followers 137 Following PhD student @uiuc_nlp @GoogleDeepMind. Prev: @TsinghuaNLP
gabriel @GabrielPeterss4
35K Followers 488 Following research sora at @OpenAI, previously at midjourney, swedish high school dropout
JB @IAMJBDEL
1K Followers 944 Following Stanford - Radiology AI | RadAI @ HOPPR | Previous: ML @HuggingFace, Academic staff - Research @Stanford University, @StanfordAIMI Affiliate
Roger Creus Castanyer @creus_roger
632 Followers 773 Following Maximizing the unexpected return. PhD student @Mila_Quebec | Prev: @UbisoftLaForge @la_UPC @HP
Li Auto_理想 @LiAuto_FZ
595 Followers 35 Following Create a mobile home, create a happy home ,Welcome to Li Auto!no Li Auto official!
Li Auto @Li_Auto_
6K Followers 2 Following Li Auto is a customer-oriented automotive technology company.
RLMatrix @RLMatrixCsharp
6 Followers 8 Following Deep Reinforcement Learning library for .NET, Unity, Godot, Stride. Written in pure C# https://t.co/aNB0jY0tJ7
Conference on Languag... @COLM_conf
5K Followers 6 Following https://t.co/GhGCMEoHU8 Abstract submission: March 20, 2025
vmoens @VincentMoens
1K Followers 681 Following TorchRL maintainer (@torchrl1) - PyTorch SWE @ Meta - London Neuroscience PhD, ex-MD vmoens on the butterfly platform
Shangmin Guo @ShangminGuo
237 Followers 176 Following PhD student at the University of Edinburgh, curious about how humans and AI co-exist and co-evolve. Previously at Cohere and Google DeepMind.
SenseTime @SenseTime_AI
3K Followers 15 Following SenseTime is a leading AI software company focused on creating a better AI-empowered future through innovation.
Zhuohan Li @zhuohan123
9K Followers 865 Following mts @ openai | cs phd @ 🌁 uc berkeley | building @vllm_project | machine learning system | the real agi is the friends we made along the way
Aditya Bhatt @aditya_bhatt
1K Followers 2K Following PhD student (DFKI / IAS / TU Darmstadt) in Robotics 🤖 researching dexterous manipulation with flexible robots 🦾 weird control 🪁 Deep RL 🧠 [email protected]
cubercsl @cubercsl
447 Followers 366 Following 5E44 F076 C004 2771 | Former *CPCer | Arch Linux User | @[email protected] | Software Engineer @Microsoft
Xindi Wu @cindy_x_wu
4K Followers 1K Following PhD student @PrincetonCS | Interning @nvidia | Data-centric multimodal ml | prev @roboVisionCMU @CMU_Robotics | @RealityLabs @Snapchat | 🏎️
Ben Eysenbach @ben_eysenbach
5K Followers 0 Following Prof @ Princeton CS working on AI/ML/RL. 🦋@ https://t.co/hz4KZsv5iO
Accepted papers at TM... @TmlrPub
4K Followers 5 Following
Wayve @wayve_ai
13K Followers 596 Following Wayve is a leading developer of embodied intelligence for autonomous vehicles. We use AI to pioneer a next-generation approach to self-driving: AV2.0.