Are AI scientists already better than human researchers?
We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts.
Main finding: LLM ideas result in worse projects than human ideas.
Happy to share that our paper got accepted at #EMNLP2025! In this work, we study which factors—other than scaling model size or training data size—affect the performance of LLMs. For details, see Emmy's thread 👇
#embodied All forms of biological intelligence are grounded in movement 🏃♂️; muscles & motor neurons 🧠 emerge before visual cortex & rods & cones in eyes 👁️
Building monocular better-than-mocap-studio #video2motion is our critical step towards human embodied intelligence.
I do think that AI has a lot of promise for science, but we need lots of serious work to get there! One thing I'm very interested in is how we can build AI systems that can effectively judge the quality of research.
Returning from #NAACL2025 & had some interesting discussions! One topic that came up a lot was AI scientists and how they should be implemented, evaluated, etc. Used this as inspiration to finish up a blog post on AI science and the state of reviewing:
nightingal3.github.io/blog/2025/04/2…
Induction heads are commonly associated with in-context learning, but are they the primary driver of ICL at scale?
We find that recently discovered "function vector" heads, which encode the ICL task, are the actual primary drivers of few-shot ICL.
arxiv.org/abs/2502.14010
🧵
Join us this week at the #NAACL student research workshop for an exciting keynote presentation by @psresnik! Fun fact, he was a co-creator of the SRW back in the day, so this is coming back full circle ⭐
⏱️ May 1st, 4-5PM MDT
📍 San Miguel, or online
Past work has shown that world state is linearly decodable from LMs trained on text and games like Othello. But how do LMs *compute* these states? We investigate state tracking using permutation composition as a model problem, and discover interpretable, controllable procedures🧵
11K Followers · 1K Following · I like tokens! I lead the OLMo data team at @allen_ai w/ @kylelostat. Open source is fun 🤖☕️🍕🏳️🌈 Opinions are sampled from my own stochastic parrot
15K Followers · 6K Following · I build tough benchmarks for LMs and then I get the LMs to solve them. SWE-bench & SWE-agent. Postdoc @Princeton. PhD @nlpnoah @UW.
5K Followers · 1K Following · Computational Linguist and Professional Nerd at Georgetown University
he/him pronouns, ALL the prepositions
@[email protected] @complingy.bsky.social
38K Followers · 991 Following · Creator of bitsandbytes. Research Scientist @allen_ai and incoming professor @CarnegieMellon. I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
949K Followers · 764 Following · Professor at NYU. Chief AI Scientist at Meta.
Researcher in AI, Machine Learning, Robotics, etc.
ACM Turing Award Laureate.
12K Followers · 745 Following · Research Scientist, Deepmind
I try to think hard about everything I tweet, esp on 90s football and 80s music
None of my opinions are really someone else's
10K Followers · 4K Following · sth new // ex Gemini RL+Inference @GoogleDeepMind // Chat AI @Meta // RL Agents @EA // ML+Information Theory @MIT+@Harvard+@GeorgiaTech // زن زندگی آزادی
608 Followers · 374 Following · CS PhD student @Penn studying the news and its impact on democracy. Computational social science, natural language processing, machine learning, cats.
2K Followers · 597 Following · Open-endedness, Data-centric AI @LilaSciences
Previously: RS @synth_labs, PhD @ucsbNLP, Internships @AIatMeta @MSFTResearch
All puns are my own
2K Followers · 935 Following · Ph.D. student @LTIatCMU and intern at @AIatMeta (FAIR) working on (V)LM Evaluation & Systems that Self-Improve | Prev: @kaist_ai @yonsei_u
5K Followers · 1K Following · Associate Professor, CMU. Researcher, Google. Evaluation and design of information retrieval and recommendation systems, including their societal impacts.
26K Followers · 173 Following · A North Star for open AGI. Co-founders: @fchollet @mikeknoop. President: @gregkamradt. Help support the mission - make a donation today.
459 Followers · 652 Following · Assistant professor at ECE, University of Tehran. Previously, a researcher at @MSFTResearch and PhD student at @cislmu. #NLProc #Machinelearning