PhD student @unipotsdam, supervised by @davidschlangen. Working on NLP, ML and CogSci. Prev @LstSaar. Former NLP engineer.
limengnlp.github.io
Joined December 2021
Ilya Sutskever: bald
Demis Hassabis: bald
Noam Shazeer: bald
Greg Brockman: bald
forget AGI.
forget curing cancer.
cure baldness now.
My hairline is on gradient descent.
⏰ We introduce Reinforcement Pre-Training (RPT🍒)
— reframing next-token prediction as a reasoning task using RLVR
✅ General-purpose reasoning
📑 Scalable RL on web corpus
📈 Stronger pre-training + RLVR results
🚀 Allows allocating more compute to specific tokens
In Hinton's NN class, there is an interesting tip to get a geometric view of high dimensional space. I think authors of interpretability papers did the opposite; they stare at LLMs and pray in their minds that it's linear and interpretable.
I just read this WSJ article on why Europe's tech scene is so much smaller than the US's and China's.
I'm afraid that, like most articles on this topic, it largely misses the mark.
Which in itself illustrates a key reason why Europe is lagging behind: when you fail to…
📢 I am on the JOB market this year 📢
I am looking for both faculty and research scientist positions.
My research makes AI agents useful and safe for humans. I enable them to effectively convey uncertainty, ask for help, learn from human feedback, and pursue goals that benefit…
Excited to be at #NAACL2025! Let’s meet (and grab a Char's Zaku sticker 🚀).
📅 May 4, 11–12, RepL4NLP: "Amuro&Char: Analyzing the Relationship between Pre-Training and Fine-Tuning"
📅 May 2, 12 PM, Ballroom B: "SHADES: Towards a Multilingual Assessment of Stereotypes in LLMs"
🚀 Day 0: Warming up for #OpenSourceWeek!
We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency.
These humble building blocks in our online service have been documented,…
🚀 DeepSeek-R1 is here!
⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!
🌐 Website & API are live now! Try DeepThink at chat.deepseek.com today!
🐋 1/n
The #NobelPrizeinPhysics2024 for Hopfield & Hinton rewards plagiarism and incorrect attribution in computer science. It's mostly about Amari's "Hopfield network" and the "Boltzmann Machine."
1. The Lenz-Ising recurrent architecture with neuron-like elements was published in…
My Bet: Strawberry is algorithm distillation/procedural cloning. Everyone right now is coming up with ways to distill System 2 into System 1, but that will always be limited. We need to train the model to run the algorithms, not just outputs (and post-train with RL of course).
Good Scientific American piece on the idea of AGI. I think, and argue here, that it's incoherent: there is no general intelligence, natural or artificial, but different cognitive abilities that often trade off.
scientificamerican.com/article/what-d…
cognitive scientist: so the lesson of Clever Hans is we need..
engineer: more horses
cognitive scientist:
engineer: stacked horses. parallel horses. pooled horses. horse dropout. RL with horses in the loop.
cognitive scientist:
engineer: Hans is All You Need
1K Followers 793 Following
Staff Researcher @AlibabaGroup. Previously @MBZUAI, PhD from @ml_labs_irl and @dcucomputing @dcu interested in Large Language Models (LLMs).
40K Followers 28K Following
Biologist at The Sainsbury Lab; passionate about plant pathogens and evolution; open science advocate; loves travel, food and sports; nomad and hunter-gatherer.
3K Followers 6K Following
nlab fan account, arxiv surveyor, pubmed enjoyer, two culture bridger, vacuous high gossiper, dearth of any domain expertise, reluctant g theorist, gpu poor,
450 Followers 1K Following
PhD Student @cvml_mpiinf at the Max Planck Institute for Informatics, @SIC_Saar. Member of @neuroexplicit. Explainability in Computer Vision. @cse_iith alumnus.
12K Followers 3K Following
research @MIT_CSAIL @thinkymachines. work on scalable and principled algorithms in #LLM and #MLSys. in open-sourcing I trust 🐳. she/her/hers
1K Followers 2K Following
PhD @NYUDataScience, visiting researcher @AIatMeta, interested in AI & CogSci, specifically in goals and their representations in minds and machines (he/him).
21K Followers 19K Following
Inspired by Algorithms, Powered by Imagination: Unleashing the Potential of Generative AI.
#GenerativeAI #deeplearning #AI #MachineLearning
37K Followers 483 Following
Digital Geometer, Assoc. Prof. of Computer Science & Robotics @CarnegieMellon @SCSatCMU and member of the @GeomCollective. There are four lights.
13K Followers 225 Following
NLP/ML research group at @UCLCS, PIs: S. Riedel (@riedelcastro), P. Stenetorp, T. Rocktäschel (@_rockt), E. Grefenstette (@egrefen), P. Minervini (@pminervini)
4K Followers 197 Following
UCL Deciding, Acting, and Reasoning with Knowledge (DARK) Lab at @AI_UCL led by @_rockt, @egrefen, @robertarail, and @jparkerholder.
263K Followers 670 Following
Building with AI agents @dair_ai • Prev: Meta AI, Galactica LLM, Elastic, PaperswithCode, PhD • I share insights on how to build with AI Agents ↓
2K Followers 529 Following
Assistant Professor at @TelAvivUni and Research Scientist at @GoogleResearch; previously postdoc at @GoogleDeepMind and @allen_ai
1K Followers 34 Following
developing embodied AI agents that empower users to use language to interact with digital and physical environments to carry out real-world tasks.
26K Followers 173 Following
A North Star for open AGI. Co-founders: @fchollet @mikeknoop. President: @gregkamradt. Help support the mission - make a donation today.
1K Followers 571 Following
PhD student at the department of brain and cognitive sciences, MIT. I build deep learning and probabilistic models to understand the brain and mind.
1K Followers 439 Following
PhD student @UBC_CS w/ @jeffclune. RS Intern @SakanaAILabs. I am building open-ended agentic AI systems that can accumulate complexity in language.
3K Followers 346 Following
Research Scientist and Manager @Apple AI/ML. Ex-Principal Researcher @Microsoft Azure AI. Working on building vision and multimodal foundation models.