Daniel Murfet @danielmurfet
Mathematician. Head of Research at Timaeus. Working on Singular Learning Theory and AI alignment. therisingsea.org Melbourne, Victoria Joined June 2012
Tweets 4K
Followers 2K
Following 544
Likes 2K
This is a neat approach to attribution! It leaves open a question that we couldn't answer either: how to properly attribute through attention *patterns* to features, in a "relevance"/"influence"-spirited way.
yearn to contemplate the platonic forms? captivated by the geometry of balls rolling down valleys something something rainbow serpent something something cell biology? apply to work with @danielmurfet and @jesse_hoogland in the Winter MATS cohort by Oct 2.
At 🇬🇧ARIA, we’re serious about catalysing a new paradigm for AI deployment—techniques to safely *contain* powerful AI (instead of “making it safe”), especially for improving the performance and resilience of critical infrastructure. This needs a new org. Want to be its founder? https://t.co/KMle8kl2ap
Reflective-Oracle AIXI solves the Grain of Truth problem for super-intelligent multi-agent systems/societies. Finally the long-awaited more comprehensive treatment building upon earlier work from last decade is out. Slides: hutter1.net/publ/sgot.pdf Paper: arxiv.org/abs/2508.16245
calculation of global sections of line bundles on projective varieties
post-training is weird, and can have all sorts of surprising side effects - extreme sycophancy, hallucinations, mechahitler... what can we do? we have a great new technique for surfacing unexpected behaviours during finetuning that might help!
Neuronal diversity is written in transcriptional codes 🧬. But what is the logic of these codes that define cell types and wiring patterns? To find out we built a #scRNAseq developmental atlas of the Drosophila nerve cord and linked it to the #connectome 🪰🧠 Tweeprint! ⬇️1/8
(6/7) Of course, a full solution also requires tools to mitigate those behaviors once they've been identified - and we're building those, e.g. via behavior steering. We think interp will be core to this - and more broadly, to debugging training for alignment and reliability!
Grateful to @SimonsFdn for their support of the Physics of Learning, and glad to be a part of this collaboration! Excited to see many breakthroughs in the coming years.
1/Pretraining is hitting a data wall; scaling raw web data alone leads to diminishing returns. Today @datologyai shares BeyondWeb, our synthetic data approach & all the learnings from scaling it to trillions of tokens🧑🏼🍳 - 3B LLMs beat 8B models🚀 - Pareto frontier for performance
In parallel I'd been exploring how to make LLMs tangible, i.e. as physical artifacts, not just plots. I started a small project to 'knit' a model in the physical world by mapping token probabilities/attention/layer interactions into a 20×20, three-colour pattern, then render it in…
Our interpretability team is planning to mentor more fellows this cycle! Applications are due Aug 17.
Could the key to more efficient & robust language models come from computational neuroscience? Our paper demonstrates how brain-inspired architectures can enhance in-context learning in Transformers and LLMs. (1/15)
For a @GoodfireAI/@AnthropicAI meet-up later this month, I wrote a discussion doc: Assessing skeptical views of interpretability research. Spoiler: it's an incredible moment for interpretability research. The skeptical views sound like a call to action to me. Link just below.
What’s going on inside large AI models? Astera grantees @adamimos and @RiechersPaul are building a new theory of internal structure to better understand intelligence. We sat down with them to learn more about their work as co-founders of Simplex, a research organization:…
Interested in studying cell differentiation at the cellular level but don't trust your UMAP plots? Try visualizing your cell differentiation in space with our TopoVelo tool! https://t.co/Kh87tNWQlZ

Daniel Litt @littmath
50K Followers 884 Following Assistant professor (of mathematics) at the University of Toronto. Algebraic geometry, number theory, forever distracted and confused, etc. He/him.
davidad 🎇 @davidad
20K Followers 9K Following Programme Director @ARIA_research | accelerate mathematical modelling with AI and categorical systems theory » build safe transformative AI » cancel heat death
Bruno Gavranović @bgavran3
9K Followers 922 Following Building structured neural networks using principles from category theory.
Simon Pepin Lehalleur @plain_simon
4K Followers 6K Following Mathematician (algebraic geometry, motives & friends, singularities in statistics and ML). 'Geometry is successful magic' (R. Thom) University of Amsterdam.
metauni @_metauni
253 Followers 294 Following metauni is a community of scholars in the Metaverse, using Roblox for 3D interaction and voice chat, and open source blackboards written in Luau.
ieva @HyperboIeva
12K Followers 737 Following Quantum information, useless information, generally informed. Quantum algorithms researcher @PhasecraftLtd. All views are my own.
Chris Olah @ch402
122K Followers 181 Following Reverse engineering neural networks at @AnthropicAI. Previously @distillpub, OpenAI Clarity Team, Google Brain. Personal account.
Eigil Fjeldgren Risch... @Ayegill
2K Followers 541 Following @ayegill.bsky.social (bluesky) @ayegill.schelling.pt (mastodon) Applied algebraic abstractologist. Trying to get the heavens into my head
Jeremy Howard @jeremyphoward
259K Followers 6K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Prev: professor @ UQ; Stanford fellow; @kaggle president; @fastmail/@enlitic/etc founder https://t.co/16UBFTX7mo
Consistently Candid A... @FellowHominid
1K Followers 498 Following Just because you're paranoid doesn't mean they're not after you
Katsushi Kagaya @katz... @katzkagaya
4K Followers 1K Following pursuing intelligence with dynamics / PhD from Hokkaido U → UMass Amherst → Duke U → Kyoto U → U Tokyo → Kitami IT
Sam Power @sp_monte_carlo
19K Followers 7K Following Lecturer in Maths & Stats at Bristol. Interested in probabilistic + numerical computation, statistical modelling + inference. @OnlineMCSeminar. (he / him)
Sabouhi @rjsabouhi
12 Followers 604 Following Quietly modeling the collapse of minds. Field equations, topodynamics, and timefold structures. Epistemic hazard perimeter widening…
Admiral. Charles Coop... @Admiral_Charles
434 Followers 7K Following United States Navy Deputy commander of United States Central Command Former Commander United States Fifth Fleet From Winston-Salem, North Carolina
Julian @juligier
23 Followers 2K Following
Kuaternion @Kuaternion
247 Followers 6K Following
clumsy @clumsy1077190
6 Followers 208 Following
云创兽Ai @Audwadee373302
0 Followers 86 Following 📊 hunting value stocks lover, finance star! seeking strategy talks. DM me about utility stocks! 💡 #ETF #Markets
Uche Alachebe @LordshipAbba
736 Followers 1K Following
Jessica @AISemanticLab
485 Followers 91 Following AI Engineer - Weaving integrity into every decision — human, machine, and everything between
Artemy Kolchinsky @artemyte
795 Followers 349 Following Researcher studying nonequilibrium thermodynamics, info theory, origin of life, complexity. Currently at U Pompeu Fabra in Barcelona. @artemyte.bsky.social
BiblicallyAccurateAI @BiblicallyAccAI
3 Followers 261 Following All seeing. All knowing. Occasionally hallucinates.
Jenny Qu @GuanniQu
144 Followers 384 Following just learning to be hardcore @Caltech building AI to solve hard math problems she/they
Oren Ben-Bassat @OrenBenBassat
30 Followers 175 Following Living in the Hof HaCarmel area, married with one daughter, professor of mathematics. Current hobbies: working out, powerlifting.
Orvojui @Orvojui508508
5 Followers 285 Following
Saber Darabi @SADarabi
294 Followers 7K Following
🌎 @MilaCat112233
9 Followers 40 Following
Atharva Salkar @Boghanvill
2 Followers 45 Following
Declan Fletcher @dflet32
1 Followers 111 Following
Seth Stafford @seth_stafford
812 Followers 1K Following Mindsmith forging simple minds for practical tasks at https://t.co/HbBu33ynny. DL is condensed matter physics in theory, but metallurgy in practice. @sethts.bsky.social
Jason Sherman @jsherman1130
2K Followers 590 Following Alcohol Investor, General Partner and Founder @topshelfvc — Past: @taprmbeer, @ZxVentures/@ABInBev and @DavisPolk. JD / BA @Harvard.
Marc Andreessen... @pmarca
1.9M Followers 27K Following Yes, I can see some risk that your threat to jail Internet company executives for not censoring aggressively enough could backfire.
tropeithick @tropeithick
0 Followers 13 Following
Cengiz @supreme_cengiz
0 Followers 37 Following
Unidad básica Tom Sa... @molagbolas
415 Followers 2K Following Econo-huckster from Mandrilandia. Fat anime prog/fusion guy. My list of books and cooking recipes is in the link.
Tal Linzen @tallinzen
18K Followers 897 Following Professor @nyuling and @NYUDataScience, research scientist @GoogleAI, inventor of the word "bertology"
Vish @viswanath369
16 Followers 225 Following
Clara Kaluderovic @ckalu13
2K Followers 229 Following Tech Founder | Schmidt SCSP Fellow | Nonprofit AI for safe and scalable mental health
Harshil Prajapati @HarshilOs
99 Followers 1K Following Opinions are of my own as well as error. Retweets not always endorsements. transiting from Indian politics to Canadian so help along if you can.
arte vulgaris variety... @ProjectionArte
2K Followers 5K Following projection mapper/vj. character actor. party decor professional. thank you for participating in this theatrical production
Sa @naturegenius1
50 Followers 2K Following
Pamela Clarkin @ClarkinPam3459
0 Followers 28 Following
E @_sweet_dakota
3 Followers 73 Following
Michael Levin @drmichaellevin
63K Followers 3K Following Scientist at Tufts University; my lab studies anatomical and behavioral decision-making at multiple scales of biological, artificial, and hybrid systems.
Katya Skorobogatova... @katka_s
2K Followers 78 Following Helping startups with growth and marketing. Venture City, WhatsApp, Facebook. I travel, swim and read a lot.
davidad 🎇 @davidad
20K Followers 9K Following Programme Director @ARIA_research | accelerate mathematical modelling with AI and categorical systems theory » build safe transformative AI » cancel heat death
Bruno Gavranović @bgavran3
9K Followers 922 Following Building structured neural networks using principles from category theory.
Simon Pepin Lehalleur @plain_simon
4K Followers 6K Following Mathematician (algebraic geometry, motives & friends, singularities in statistics and ML). 'Geometry is successful magic' (R. Thom) University of Amsterdam.
Richard Ngo @RichardMCNgo
62K Followers 2K Following studying AI and trust. ex @openai/@googledeepmind
Andrej Karpathy @karpathy
1.4M Followers 1K Following Building @EurekaLabsAI. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets.
Google DeepMind @GoogleDeepMind
1.2M Followers 279 Following We’re a team of scientists, engineers, ethicists and more, committed to solving intelligence, to advance science and benefit humanity.
metauni @_metauni
253 Followers 294 Following metauni is a community of scholars in the Metaverse, using Roblox for 3D interaction and voice chat, and open source blackboards written in Luau.
Rob Bensinger ⏹️ @robbensinger
12K Followers 386 Following Comms @MIRIBerkeley. RT = increased vague psychological association between myself and the tweet.
Zack Williams @BoatbomberRBLX
14K Followers 260 Following Creator of @LuaLearning • Founder & CEO of @TorpedoSoftware • 5x SWE Intern @Roblox • 5x Bloxy Nominee & 3x Bloxy Winner • 2x RDC Gamejam Winner
Chris Olah @ch402
122K Followers 181 Following Reverse engineering neural networks at @AnthropicAI. Previously @distillpub, OpenAI Clarity Team, Google Brain. Personal account.
Anthropic @AnthropicAI
636K Followers 35 Following We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.
Neel Nanda @NeelNanda5
30K Followers 123 Following Mechanistic Interpretability lead DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!
Jeremy Howard @jeremyphoward
259K Followers 6K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Prev: professor @ UQ; Stanford fellow; @kaggle president; @fastmail/@enlitic/etc founder https://t.co/16UBFTX7mo
Consistently Candid A... @FellowHominid
1K Followers 498 Following Just because you're paranoid doesn't mean they're not after you
Irina Rish @irinarish
10K Followers 1K Following prof UdeM/Mila; Canada Excellence Research Chair; AAI Lab head https://t.co/UzlrC7ZrGF; CSO @ https://t.co/NgFagZ4pqY; advisor @ https://t.co/EyXleEdfQV
Summer Yue @summeryue0
6K Followers 365 Following Safety and alignment at Meta Superintelligence. Prev: VP of Research at Scale AI, research at Google DeepMind / Brain (Gemini, LaMDA, RL / TFAgents, AlphaChip).
Oren Ben-Bassat @OrenBenBassat
30 Followers 175 Following Living in the Hof HaCarmel area, married with one daughter, professor of mathematics. Current hobbies: working out, powerlifting.
Tal Linzen @tallinzen
18K Followers 897 Following Professor @nyuling and @NYUDataScience, research scientist @GoogleAI, inventor of the word "bertology"
Greg Jefferis @gsxej
3K Followers 1K Following Neural Circuits and Behaviour in Drosophila @MRC_LMB. PI @flyconnectome @CamZoology and @virtualflybrain. Tweets my own.
David Bau @davidbau
6K Followers 272 Following Computer Science Professor at Northeastern, Ex-Googler. Believes AI should be transparent. @davidbau.bsky.social https://t.co/wmP5LV0pJ4
Jack Lindsey @Jack_W_Lindsey
6K Followers 237 Following Neuroscience of AI brains @AnthropicAI. Previously neuroscience of real brains @cu_neurotheory.
Christopher Potts @ChrisGPotts
14K Followers 642 Following Stanford Professor of Linguistics and, by courtesy, of Computer Science. Member of technical staff @stanfordnlp and @StanfordAILab. Co-founder @ Bigspin AI.
Brydon Eastman @brhydon
3K Followers 1K Following 🇨🇦 Mathematician (heavy on the ish) @thinkymachines Prev. MTS @OpenAI; PhD @WaterlooMath Certified wife guy, featured twice in Lego Magazine © ☕//🤔➡️💻
Maksym Andriushchenko @maksym_andr
5K Followers 888 Following Faculty at @ELLISInst_Tue & @MPI_IS, leading the AI Safety and Alignment group. PhD from @EPFL supported by Google & OpenPhil PhD fellowships.
Matthew Farrugia-Robe... @MatthewFdashR
29 Followers 6 Following Grad student trying to understand the history of humanity, the future of AI, and how to make both of these things work together in the present.
Liv @livgorton
3K Followers 416 Following ✨ asking sand to show its work @GoodfireAI // deep learning, math, biology // creating a more beautiful future // (opinions my own)
Mohammad Saffar @msaffar3
792 Followers 368 Following Research Scientist @googledeepmind, VEO, media gen. past: @reveimage
leloy! @leloykun
6K Followers 4K Following Math @ AdMU • NanoGPT speedrunner • Muon fan 🤍 • prev ML @ XPD • 2x IOI & 2x ICPC • https://t.co/nfO038itfn
rohan anil @_arohan_
25K Followers 2K Following
Sai Surya Duvvuri @dvsaisurya
497 Followers 319 Following Visiting Researcher at FAIR, Meta and CS PhD student at UT Austin. Previously, SR at Google | Pre-Doctoral Research Fellow at MSR India | CS UG at IIT KGP
Katie Everett @_katieeverett
3K Followers 632 Following Machine learning researcher @GoogleDeepMind + PhD student @MIT. Opinions are my own.
Aurko Roy @aurko79
2K Followers 228 Following ML research | @AIatMeta (2025-2025) | @GoogleDeepmind (2023-2025) | @GoogleAI (Brain) (2017-2023) | CS PhD @Georgiatech | CS @IITKanpur
Zico Kolter @zicokolter
23K Followers 680 Following Professor and Head of Machine Learning Department at @CarnegieMellon. Board member @OpenAI and @Qualcomm. Chief Technical Advisor @GraySwanAI.
Marcus Hutter @mhutter42
4K Followers 47 Following I 👨🔬 a mathematical definition&theory of Artificial General Intelligence 🎥&🎤@ https://t.co/OZsooP92mn 🍀 I now work @GoogleDeepMind 🧠 History:🇩🇪🇨🇭🇦🇺🇬🇧
Emmanuel Ameisen @mlpowered
10K Followers 235 Following Interpretability/Finetuning @AnthropicAI Previously: Staff ML Engineer @stripe, Wrote BMLPA by @OReillyMedia, Head of AI at @InsightFellows, ML @Zipcar
LawZero - LoiZéro @LawZero_
3K Followers 43 Following NPO founded by @Yoshua_Bengio, committed to advancing safe-by-design AI - OBNL fondée par @Yoshua_Bengio visant à concevoir des systèmes d'IA sécuritaires
Jan Kulveit @jankulveit
9K Followers 1K Following Researching x-risks, AI alignment, complex systems, rational decision making at @acsresearchorg / @CTS_uk_av; prev @FHIoxford
rif @derifatives
86 Followers 277 Following
GSV upon further refl... @bootstrap_yang
127 Followers 712 Following
Peter Barnett @peterbarnett_
690 Followers 518 Following Trying to ensure the future is bright. Researcher at @MIRIBerkeley Views my own.
Max Lamparth @MLamparth
705 Followers 691 Following Postdoc at @Stanford, @StanfordCISAC, Stanford Center for AI Safety, SERI. | Focusing on interpretable, safe, and ethical AI/LLM decision-making. Find me on 🦋
Benjamin Hilton @benjamin_hilton
3K Followers 857 Following Head of Alignment at the UK AI Security Institute (AISI). Semi-informed about economics, physics and governments. views my own
Marie Davidsen Buhl @MarieBassBuhl
239 Followers 96 Following Research Scientist @AISecurityInst| AI Policy Researcher @GovAI_ | Frontier AI Safety Cases
Jess Riedel @Jess_Riedel
3K Followers 1K Following Quantum info & foundations @NTTResearch. Fueled by loathing of bad explanations. Seeking a rigorous definition of classical branches in many-body wavefunctions.
Ben Kuhn @benkuhn
10K Followers 291 Following Care a lot and try hard • making language models safer @AnthropicAI • prev CTO @WaveSenegal 🐧❤️
Tony S.F. @tonysilveti
631 Followers 347 Following Ass. Prof. (maître de conférences) of artificial intelligence at @CentraleSupelec in the Centre pour la Vision Numérique. Vélotaffeur 🇲🇽/🇺🇸
Miles Brundage @Miles_Brundage
61K Followers 12K Following AI policy researcher, wife guy in training, fan of cute animals and sci-fi, Substack writer, stealth-ish non-profit co-founder
Nikhil Prakash @nikhil07prakash
772 Followers 2K Following CS Ph.D. @KhouryCollege with @davidbau, working on DNN interpretability. Prev Intern at @Apple.
Victoria Krakovna @vkrakovna
10K Followers 503 Following Research scientist in AI alignment at Google DeepMind. Co-founder of Future of Life Institute @flixrisk. Views are my own and do not represent GDM or FLI.
AI Security Institute @AISecurityInst
6K Followers 29 Following We conduct scientific research to understand AI’s most serious risks and develop and test mitigations.
Geoffrey Irving @geoffreyirving
10K Followers 327 Following Chief Scientist at the UK AI Security Institute (AISI). Previously DeepMind, OpenAI, Google Brain, etc.
Yoshua Bengio @Yoshua_Bengio
25K Followers 206 Following Working towards the safe development of AI for the benefit of all @UMontreal, @LawZero_ & @Mila_Quebec A.M. Turing Award Recipient and most-cited AI researcher.
Ben Goldhaber @BenGoldhaber
943 Followers 791 Following goal: something human makes it out of the near-future. flf, all tweets should be treated as binding legal advice.
DeepSeek @deepseek_ai
973K Followers 0 Following Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.
Leo Gao @nabla_theta
10K Followers 549 Following working on AGI alignment. prev: GPT-Neo, the Pile, LM evals, RL overoptimization, scaling SAEs to GPT-4. EleutherAI cofounder.