Karthik Narasimhan @karthik_r_n
Professor @PrincetonCS, Research @SierraPlatform. Previously @OpenAI, @MIT_CSAIL, @iitmadras.
karthiknarasimhan.com, Princeton, NJ, Joined July 2015
Tweets 282
Followers 4K
Following 456
Likes 903
As we optimize model reasoning over verifiable objectives, how does this affect human understanding of said reasoning to achieve superior collaborative outcomes? In our new preprint, we investigate human-centric model reasoning for knowledge transfer 🧵:
Today we announced a set of major advances to our agent benchmark, 𝜏-bench. This new benchmark, 𝜏², introduces the notion of "dual control", where AI agents are challenged not just to reason and act, but to coordinate, guide, and assist a user in achieving a shared objective.…
Learn more: sierra.ai/blog/benchmark…
Last year, we introduced 𝜏-bench, a benchmark for evaluating AI agents on realistic, multi-step tasks involving tool use and domain-specific constraints. It surfaced a critical limitation in LLM-based agents: low repeatability, even under identical conditions. Now, we’re…
Can GPT, Claude, and Gemini play video games like Zelda, Civ, and Doom II? 𝗩𝗶𝗱𝗲𝗼𝗚𝗮𝗺𝗲𝗕𝗲𝗻𝗰𝗵 evaluates VLMs on Game Boy & MS-DOS games given only raw screen input, just like how a human would play. The best model (Gemini) completes just 0.48% of the benchmark! 🧵👇
Successful agents are the result of collaboration between teams: engineering, operations, customer experience, and marketing. Yet every platform available today except Sierra forces businesses to optimize for one group over another. Our Agent OS enables both no code and…
Like all great products, the best agents are the product of many teams working together — some technical, some non-technical. Sierra’s Agent OS uniquely supports both no code and programmatic agent development, enabling customer experience and engineering teams alike to build…
Humans evolved to communicate so we could coordinate better. But these days, it feels like we communicate so much, yet coordinate so little.
I’m at ICLR to present a poster and give a talk, both related to the second half blogpost. See you there if you wanna chat about it :) https://t.co/TCewhDwJGR
Interesting tidbits on using dedicated "thinking" steps in agents from @AnthropicAI Also loved seeing full pass^k curves for τ-bench - measuring this was the primary motivation of the benchmark, not just avg scores!
In the AI age, agent reliability is key, and Sierra’s 𝜏-bench is setting the standard—shaping academic research, industry applications and next-generation development. Read more: sierra.ai/blog/tau-bench….
The best thing about SWE-agents and tools like cursor is the amount of additional agency they provide us
SWE-agent 1.0 is the open-source SOTA on SWE-bench Lite! Tons of new features: massively parallel runs; cloud-based deployment; extensive configurability with tool bundles; new command line interface & utilities.
The biggest mistake we can make right now is not dreaming big enough, especially w.r.t AI
Today we're releasing Common Sense Agents, a new backbone for agentic creative computing: 💻 Windows VMs for safe and repeatable workflows 🔧 Long workflows broken down into reusable tasks 🦾Support for off the shelf agents like Claude ⌛️ Data recording + finetuning infra
Today we're excited to announce a new way to interact with Sierra agents: voice. Learn more about how this new capability is transforming customer interactions in our latest blog post: sierra.ai/blog/sierra-sp…
We're launching SWE-bench Multimodal to eval agents' ability to solve visual GitHub issues. - 617 *brand new* tasks from 17 JavaScript repos - Each task has an image! Existing agents struggle here! We present SWE-agent Multimodal to remedy some issues Led w/ @_carlosejimenez 🧵
Sierra partnered with @Casper to launch Luna 2.0, their AI agent delivering 24/7 personalized customer support. From helping with mattress purchases to driving lifelong loyalty, Luna 2.0 is transforming the shopping experience!💤✨️ Learn more: sierra.ai/customers/casp…
In a year or two from now, 'fine-tuning' will become synonymous with 'training' (as used in the good old ML days). LLMs will be seen more widely as starting points, just like weight initialization or choosing the number of layers for a Transformer. Pick a starting point, curate…
We're launching EnIGMA, our state-of-the-art AI agent for offensive cybersec! It uses tools like Ghidra & pwntools, can debug, connect to servers, and exploit vulnerabilities to solve CTF challenges. Built with researchers from Princeton, NYU, and TAU. enigma-agent.github.io

Jim Fan @DrJimFan
325K Followers 3K Following NVIDIA Director of Robotics & Distinguished Scientist. Co-Lead of GEAR lab. Solving Physical AGI, one motor at a time. Stanford Ph.D. OpenAI's 1st intern.
Jacob Andreas @jacobandreas
20K Followers 951 Following Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL / @NLP_MIT (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw
Yoav Artzi @yoavartzi
17K Followers 183 Following Research/prof @cs_cornell + @cornell_tech🚡 / https://t.co/9YnWry7yHs / asso. faculty director @arxiv / building https://t.co/f9QkzO5kaC and @COLM_conf
Delip Rao e/σ @deliprao
61K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Akari Asai @AkariAsai
18K Followers 867 Following Incoming Assistant Professor @SCSatCMU & research scientist @allen_ai. akariasai @ 🦋
Kyunghyun Cho @kchonyc
77K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre physicist at @nyuniversity (@CILVRatNYU) & @PrescientDesign
Danish Pruthi @danish037
11K Followers 706 Following Faculty at the Indian Institute of Science, Bangalore. PhD from @LTIatCMU.
Sam Bowman @sleepinyourhat
50K Followers 3K Following AI alignment + LLMs at Anthropic. On leave from NYU. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
William Wang @WilliamWangNLP
19K Followers 759 Following CEO & Founder, @AlphaDesignAI. We make https://t.co/1LfDYicsF2 I'm also Mellichamp Chair Prof. at UCSB CS. PhD @ CMU SCS.
Prithviraj (Raj) Amma... @rajammanabrolu
8K Followers 612 Following Reinforcement Learning and Language. Assistant Prof @UCSanDiego. Research Scientist @Nvidia.
Felix Hill @FelixHill84
12K Followers 745 Following Research Scientist, Deepmind I try to think hard about everything I tweet, esp on 90s football and 80s music None of my opinions are really someone else's
Shunyu Yao @ShunyuYao12
19K Followers 1K Following @OpenAI Language agents (ReAct, Reflexion, Tree of Thoughts, SWE-agent, CoALA) for digital automation (WebShop, SWE-bench, tau-bench)
Victor Zhong @hllo_wrld
5K Followers 500 Following ML+NLP AP @UWCheritonCS, @cifar_news AIChair @vectorinst. Former @MSFTResearch @MetaAI, @SFResearch via @MetamindIO, @uwnlp, @StanfordNLP, @eceuoft.
Bill Yuchen Lin @billyuchenlin
23K Followers 3K Following Building Grok @xAI. Affiliate Assistant Prof @UW; Focusing on Grok Code for Macrohard now. Ex: @allen_ai, Google AI, Meta FAIR.
Sreejan Kumar @sreejan_kumar
2K Followers 328 Following Joint Postdoc at Columbia @ZuckermanBrain and NYU @NYUPsych. Supported by @NYASciences. Prev at: Princeton PhD, RS Intern @Meta, Yale '19
Dipanjan Das @dipanjand
6K Followers 320 Following Researcher at @GoogleDeepmind. Factuality and Gemini x Search.
Behnam Neyshabur @bneyshabur
29K Followers 857 Following Research @AnthropicAI (Co-lead Discovery team) 💼 Past: Gemini @GoogleDeepMind (Co-led Blueshift team) 🧠 LLM Reasoning / AI Scientist 🎒Traveling & Backpacking
Gabriel Ilharco @gabriel_ilharco
7K Followers 1K Following AI Research Scientist at Meta. Prev. PhD at UW, Google Research, xAI
Vadim Liventsev @vadimdotme
38 Followers 750 Following lead hip hop engineer @ https://t.co/y80r2iuMsN
Ashudeep Singh @AshudeepSingh
499 Followers 600 Following 📌 Applied Scientist @Microsoft AI. 🧑🎓 PhD @Cornell. Previously @IITKanpur @Pinterest @GoogleAI @Meta. Work on: AI Safety, ML Fairness, RecSys. 🤖⚖
Lijie(Derrick) Yang @LijieyYang
125 Followers 249 Following CS PhD @Princeton, SCS Alum @CarnegieMellon, doing research in ML and Systems
Michael Jurka... @mikejurka
270 Followers 1K Following VP of Engineering, Spatial (https://t.co/RP4FuNES2a)
Omair Shahid @OmairShahid
654 Followers 3K Following Product of progressive public policy; raised by public libraries and public education that produced a passion for politics. and apparently alliteration
sreeprasad @sreeprasad
305 Followers 5K Following To use agile development to build massively scalable applications and perform deep analytics to achieve real time results at BlackRock
Fleati @Fleati938
15 Followers 1K Following
HUANG Qichang @huang85993
356 Followers 8K Following
Black Sapience© @garsay10
607 Followers 3K Following I am human being & nothing human can be alien to me. #Innovation #PeaceforAfrica #JusticeforEritrea RTs≠endorsements.
Shivank @shivank_sh
32 Followers 260 Following
Aravindan M K @AravindanMK23
3 Followers 166 Following
AI_SnackPack @AI_SnackPack
6 Followers 60 Following
Hammad Maqsood @hammad_ignite
0 Followers 38 Following
!.! @xypyth
45 Followers 4K Following
Chris Mann @chriswmann
10 Followers 238 Following
Jason @EinNewton
284 Followers 96 Following AI x AGENT pardus Search : https://t.co/0MnuCbLvsF https://t.co/NnDn8U4Lb5 21 & Learning & Growing & Love coding
Hydrophyte @Hydrovophyte
0 Followers 48 Following
Albert Villanova @avillanovamoral
2K Followers 5K Following ML Engineer @huggingface. Data Scientist, PhD Theoretical Particle Physics, BSc Computer Science. Always learning. he/him
Farah Attia @FarahAttia6979
0 Followers 112 Following
Srikrishna Kompella @the_real_skc
27 Followers 256 Following
Balaji Ganesan @balajinix
463 Followers 488 Following Research Engineer. Interested in Knowledge Graphs, LLMs, NLP and Information Retrieval. Personal opinions.
Nishit Anand @nishitanand99
100 Followers 2K Following MS CS @umdcs | Former ML Research - @iitdelhi, @IIITDelhi | Computer Vision | Multimodal LLMs | Photography
Abhijit Gore @abhijitgore
822 Followers 3K Following product manager @ Microsoft. speaking only for myself. he/him/his.
Tùng Vũ @Play_With_Mino
2 Followers 168 Following
Krishna Prasad Sriniv... @fewshotlearner
4 Followers 192 Following ai/ml researcher @ https://t.co/Ht4vrb51vi | previously microsoft research, everwell, harvard university
Tegan Jegede @jegede_tegan
197 Followers 6K Following Tegan = print( “👨🏾💻passionate engineer ,AI/ML enthusiast , Real Madrid ,arsenal ⚽️ and GSW🏀: ”)
Samee Ur Rehman @sameeurehman
283 Followers 2K Following AI Architect @ASMLcompany. Building Agentic AI for Physical Engineering Systems. Previously PhD in ML/Optimization @TUDelft
TechnoliRama @TechnoliRama
6 Followers 59 Following
Sherwood 💬 @realshcallaway
1K Followers 2K Following Agents & Observability | Previously @11x_official @ycombinator @brexHQ @crunchbase
Anoop Saha @asyncanoop
718 Followers 2K Following I correlate; therefore, I cause! 100k GPU cluster is all you need
metehan @nothimhuman
9 Followers 97 Following 0x9π Founder of Hardwey Music Group Founder of SCAR REC
Millennium Twain #Tru... @MillenniumTwain
2K Followers 7K Following A Million-fold refinement in our EM-Field Mapping of Creation in Electrons, Protons, DiProtons, Alphas, FTL StarShips, Astrospheres, Cluster/Streams, Galaxies —
Sacramento King @SSacrament94313
69 Followers 2K Following
Meagan Reichel @MeaganReic70384
90 Followers 4K Following
Subhashree Radhakrish... @subhashree_r
356 Followers 1K Following Engineering Manager, Metropolis Foundational Models Applied Research @nvidia
Andrej Karpathy @karpathy
1.4M Followers 1K Following Building @EurekaLabsAI. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets.
(((ل()(ل() 'yoav)))... @yoavgo
65K Followers 2K Following
Jim Fan @DrJimFan
325K Followers 3K Following NVIDIA Director of Robotics & Distinguished Scientist. Co-Lead of GEAR lab. Solving Physical AGI, one motor at a time. Stanford Ph.D. OpenAI's 1st intern.
Jacob Andreas @jacobandreas
20K Followers 951 Following Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL / @NLP_MIT (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw
Percy Liang @percyliang
84K Followers 417 Following Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist
Yoav Artzi @yoavartzi
17K Followers 183 Following Research/prof @cs_cornell + @cornell_tech🚡 / https://t.co/9YnWry7yHs / asso. faculty director @arxiv / building https://t.co/f9QkzO5kaC and @COLM_conf
François Chollet @fchollet
572K Followers 813 Following Co-founder @ndea. Co-founder @arcprize. Creator of Keras and ARC-AGI. Author of 'Deep Learning with Python'.
Delip Rao e/σ @deliprao
61K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Christopher Manning @chrmanning
151K Followers 228 Following Director, @StanfordAILab. Assoc. Director, @StanfordHAI. Founder, @stanfordnlp. Prof. CS & Linguistics, @Stanford. IP @aixventureshq. 🇦🇺 Do #NLProc & #AI. 👋
Google DeepMind @GoogleDeepMind
1.2M Followers 279 Following We’re a team of scientists, engineers, ethicists and more, committed to solving intelligence, to advance science and benefit humanity.
Akari Asai @AkariAsai
18K Followers 867 Following Incoming Assistant Professor @SCSatCMU & research scientist @allen_ai. akariasai @ 🦋
Tal Linzen @tallinzen
18K Followers 898 Following Professor @nyuling and @NYUDataScience, research scientist @GoogleAI, inventor of the word "bertology"
Kyunghyun Cho @kchonyc
77K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre physicist at @nyuniversity (@CILVRatNYU) & @PrescientDesign
Yi Tay @YiTayML
46K Followers 81 Following research scientist @googledeepmind ✨♊, model co-lead/captain of gemini deepthink imo gold medal 🥇, opinions are my own.
Sam Bowman @sleepinyourhat
50K Followers 3K Following AI alignment + LLMs at Anthropic. On leave from NYU. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
William Wang @WilliamWangNLP
19K Followers 759 Following CEO & Founder, @AlphaDesignAI. We make https://t.co/1LfDYicsF2 I'm also Mellichamp Chair Prof. at UCSB CS. PhD @ CMU SCS.
Prithviraj (Raj) Amma... @rajammanabrolu
8K Followers 612 Following Reinforcement Learning and Language. Assistant Prof @UCSanDiego. Research Scientist @Nvidia.
Graham Neubig @gneubig
40K Followers 708 Following Associate professor @LTIatCMU. Co-founder/chief scientist @allhands_ai. I mostly work on modeling language.
Xin Eric Wang @xwang_lk
18K Followers 1K Following Professor @ UCSB (@ucsantabarbara). Head of Research @SimularAI. Interim Director @ucsbcrml. #Multimodal #Embodied #Agents. AI for Humanity in the long run.
Corinne Marie Riley @CorinneMRiley
9K Followers 3K Following Partner @GreylockVC investing in data and AI products at the infrastructure and application layers
EXO Labs @exolabs
36K Followers 2 Following AI on any device. 12 Days of EXO: https://t.co/VMrJ6Vi4h3 We're hiring: https://t.co/BzEO8ZCvBV
sarah guo @saranormous
119K Followers 3K Following startup investor/helper, founder @conviction. accelerating AI adoption, interested in progress. tech podcast: @nopriorspod
Tara Viswanathan @TaraViswanathan
20K Followers 899 Following Building @unltdindustries 🏗️ prev: Founder & CEO @Rupa_Health (💰sold in '24), @stanford, TX raised, CA living 🤠🌊 sharing stuff I learn & want to remember 🙌
Bespoke Labs @bespokelabsai
2K Followers 104 Following RL Environment Curation for the Agentic Future. Data Curation: https://t.co/EnYs1QL3Hj
Tim Dettmers @Tim_Dettmers
38K Followers 991 Following Creator of bitsandbytes. Research Scientist @allen_ai and incoming professor @CarnegieMellon. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
Kevin Ellis @ellisk_kellis
2K Followers 176 Following Cornell Computer Science, Assistant Professor. Program synthesis, AI
Deepa Seetharaman @dseetharaman
20K Followers 2K Following tech reporter writing about AI. dseetharaman.23 on Signal. Bluesky: https://t.co/w3hxcTRpsv
Glaive AI @GlaiveAI
7K Followers 4 Following Build and improve custom language models, powered by synthetic data.
Josh Roberts @jcroberts57
28 Followers 353 Following
Laurens van der Maate... @lvdmaaten
4K Followers 2K Following Member of Technical Staff at Anthropic. Ex-Meta. t-SNE. Llama 3. DenseNet. Web-scale weakly supervised vision. CrypTen.
SSI Inc. @ssi
102K Followers 0 Following A straight shot to safe superintelligence. Join us https://t.co/hHla3vusDE.
Daniel Gross @danielgross
119K Followers 0 Following
Joseph Suarez 🐡 @jsuarez5341
17K Followers 104 Following I build sane open-source RL tools. MIT PhD, creator of Neural MMO and founder of PufferAI. https://t.co/z468O4HDxF
Shuyan Zhou @shuyanzhxyc
3K Followers 806 Following assistant professor @dukecompsci | past: research lead @AIatMeta / msl (?) llama computer use agent, phd @LTIatCMU
Aditya Kusupati @adityakusupati
5K Followers 2K Following Been places..... Done things.... Next-Gen Modelling @GoogleDeepMind
Kianté Brantley @xkianteb
2K Followers 1K Following Assistant Professor at Harvard | Fitness enthusiast | (He/Him/His)
Alex Wettig @_awettig
2K Followers 584 Following PhD @Princeton trying to make sense of language models and their training data; trying to train agents @cursor_ai
Andrew D. Huberman, P... @hubermanlab
1.6M Followers 2K Following Professor of Neurobiology and Ophthalmology at Stanford Medicine • Host of Huberman Lab • Focused on science and health research and public education
John Yang @jyangballin
4K Followers 783 Following 🌲 CS PhD @Stanford 🤖 SWE-bench + agent + smith 🎓 Prev. @princeton_nlp 🐯; @Berkeley_EECS 🐻
Sierra @SierraPlatform
5K Followers 181 Following We help companies build better, more human customer experiences with AI.
Sampriti Bhattacharyy... @sampritibh
13K Followers 627 Following CEO & Founder @navierboat 🌊Roboticist🤖Ex @ NASA🚀Subcritical nuclear reactors @ Fermilab ⚛️Aerospace @ OSU. MIT MechE PhD'17⚙️
Sayash Kapoor @sayashk
10K Followers 2K Following CS PhD candidate @PrincetonCITP. I tweet about AI agents, AI evals, AI for science. AI as Normal Technology: https://t.co/5amOkqKDf2 Book: https://t.co/DabpkhNrcM
rishi @RishiBommasani
6K Followers 2K Following Societal/economic impacts of AI; AI policy & governance @StanfordHAI Stanford CS PhD w/ @percyliang @jurafsky Cornell CS undergrad w/ @clairecardie
Yu Su (hiring postdoc... @ysu_nlp
11K Followers 948 Following cooking something new. prof. @osunlp. sloan fellow. intelligence and agents. author of Mind2Web, SeeAct, MMMU, HippoRAG, BioCLIP, UGround.
Tao Yu @taoyds
5K Followers 888 Following @XLangNLP lab, asst. prof. @HKUniversity. author of OpenCUA, OSWorld, Aguvis, Spider, OpenAgents, Text2Reward, Instructor. prev. postdoc @uwnlp; phd @Yale.
Abhishek Gupta @abhishekunique7
9K Followers 874 Following Assistant Professor at University of Washington. I like robots, and reinforcement learning. Previously: post-doc at MIT, PhD at Berkeley
Animesh Garg @animesh_garg
29K Followers 1K Following Foundation Models for Generalizable Autonomy in Robotics. Assistant Professor in AI Robotics @GeorgiaTech. Prev @nvidia
David Sontag @david_sontag
9K Followers 301 Following CEO & Co-founder @layerhealth. Professor, MIT. Research on machine learning in health care. Part of @MIT_CSAIL, @MIT_IMES, @MITEECS, @AIHealthMIT
Ming-Wei Chang @mchang21
1K Followers 644 Following GenAI @GoogleDeepMind. BERT. Gemini 1 to 2.5. REALM, Imagen Editing. Multimodal / Agent. Retrieval.
Kexin Pei @Kexin_Pei
1K Followers 643 Following Assistant Prof @UChicago 2024. CS Ph.D. @Columbia. Ex @GoogleDeepMind @MSFTResearch @Purdue. Security, SE, and ML. AISec+ML4Code.
Princeton PLI @PrincetonPLI
2K Followers 32 Following Princeton University initiative enhancing fundamental understanding of AI, enabling its use in academic disciplines, and examining AI's societal implications.
Peter Henderson @PeterHndrsn
4K Followers 882 Following Assistant Professor @ Princeton (ML/RL+strategic decision-making+Law). Prev: Stanford (JD/PhD); McGill/Mila; Meta FAIR; Amazon; Cal Supreme Court.
Deepak Subramani @deepakns
3K Followers 702 Following PhD, @MIT. BTech, @iitmadras. AI, ML, Climate, Upskilling, Education. Views are my own.
Jennifer Rexford @jrexnet
4K Followers 739 Following Provost @Princeton, Professor @PrincetonCS and @EPrinceton, affiliated with @PrincetonCITP, computer networking researcher, and mom. 🏳️🌈
Marzyeh @MarzyehGhassemi
7K Followers 207 Following Healthy Machine Learning @ MIT EECS/IMES & Vector Institute
bigAI @BrownBigAI
986 Followers 383 Following Brown Integrative, General Artificial Intelligence @BrownUniversity
Tri Dao @tri_dao
32K Followers 632 Following Asst. Prof @PrincetonCS, Chief Scientist @togethercompute. Machine learning & systems.
Ajeya Cotra @ajeya_cotra
11K Followers 462 Following Helping the world prepare for extremely powerful AI @open_phil (views my own), writer and editor of Planned Obsolescence newsletter.
Andreas Vlachos @vlachos_nlp
5K Followers 1K Following Professor in NLP/ML at @Cambridge_CL, member of the PaNLP group: https://t.co/CcOYgtiRTv, Fellow of @FitzwilliamColl, @ELLISforEurope member