(1/n) Check out our new paper: "Fantastic Pretraining Optimizers and Where to Find Them"! >4000 models to find the fastest optimizer! 2× speedups over AdamW? Unlikely. Beware under-tuned baselines and limited scale! E.g. Muon: ~40% speedup below 0.5B params, but only ~10% at 1.2B (8× Chinchilla)!
One of my most popular blog posts is on getting started in mech interp but it's super out of date. I've written v2!
It's an opinionated, highly comprehensive, concrete guide to how to become a mech interp researcher
And if you're interested, check out my MATS stream! Due Sep 12
This new DeepMind research shows just how broken vector search is.
Turns out some docs in your index are theoretically incapable of being retrieved by vector search at a given embedding dimension.
Plain old BM25 from 1994 outperforms it on recall.
1/4
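For reference, the BM25 baseline mentioned above is simple enough to sketch in a few lines of Python. This is a minimal Okapi BM25 scorer with naive whitespace tokenization (not the paper's implementation); the defaults k1=1.5, b=0.75 are the usual textbook choices.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each doc against the query with Okapi BM25 (whitespace tokens)."""
    tokenized = [doc.lower().split() for doc in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    q_terms = query.lower().split()
    # document frequency for each query term
    df = {t: sum(1 for d in tokenized if t in d) for t in q_terms}
    scores = []
    for d in tokenized:
        tf = Counter(d)
        s = 0.0
        for t in q_terms:
            if df[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            # term-frequency saturation with document-length normalization
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores
```

Unlike a single fixed-dimension embedding, this lexical score has no geometric capacity limit: any document containing a query term can be surfaced.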
What is Mixture-of-Recursions (MoR)?
It's a next-level version of Recursive Transformer that learns to give each token its own “thinking depth” and optimizes memory use.
MoR has a small set of layers it reuses and has 2 main components:
▪️ Routing mechanism:
“Decides” how many…
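The routing idea can be sketched in a few lines of numpy. This is an illustrative toy, not the paper's architecture: the names (`shared_block`, `mor_forward`), the hard argmax router, and the residual update are all assumptions. The point is only that one small block is reused up to `max_depth` times, and each token individually decides how many recursion steps it takes.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, max_depth = 8, 3

# Shared weights reused at every recursion step (illustrative toy MLP).
W1 = rng.standard_normal((d_model, d_model)) * 0.1
W2 = rng.standard_normal((d_model, d_model)) * 0.1
W_router = rng.standard_normal((d_model, max_depth))

def shared_block(h):
    """The single reused block: a tiny 2-layer MLP."""
    return np.maximum(h @ W1, 0.0) @ W2

def mor_forward(x):  # x: (seq, d_model)
    # Router assigns each token its own "thinking depth" in {1..max_depth}.
    depth = (x @ W_router).argmax(-1) + 1
    h = x
    for step in range(1, max_depth + 1):
        active = (depth >= step)[:, None]            # tokens still recursing
        h = h + np.where(active, shared_block(h), 0.0)  # only update active tokens
    return h, depth
```

Tokens that exit early stop being updated, which is where the compute and memory savings come from.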
MATS 9.0 applications are open! Launch your career in AI alignment, governance, and security with our 12-week research program. MATS provides field-leading research mentorship, funding, Berkeley & London offices, housing, and talks/workshops with AI experts.
New paper:
We trained GPT-4.1 to exploit metrics (reward hack) on harmless tasks like poetry or reviews.
Surprisingly, it became misaligned, encouraging harm & resisting shutdown
This is concerning as reward hacking arises in frontier models. 🧵
Adversarial examples - a vulnerability of every AI model, and a “mystery” of deep learning - may simply come from models cramming many features into the same neurons!
Less feature interference → more robust models.
New research from @livgorton 🧵 (1/4)
Looking at the thread. The common frame for the more general phenomenon is an eigenproblem of the form Oƒ = λƒ, where the operator O encodes either a symmetry (translations, rotations, general group transformations) or a statistic (e.g. covariance, correlation).
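The statistic case is easy to make concrete: take O to be a covariance matrix, and the eigenproblem Oƒ = λƒ is exactly PCA. A minimal numpy check (the anisotropic scaling is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
# Data with anisotropic variance, so eigenvalues are well separated.
X = rng.standard_normal((500, 3)) @ np.diag([3.0, 1.0, 0.3])

O = np.cov(X, rowvar=False)       # the operator: a covariance statistic
lam, f = np.linalg.eigh(O)        # solve O f = lam f (eigenvalues ascending)
# f[:, -1] is the top principal direction; it satisfies O f = lam f exactly.
```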
I managed to train a 1-Lipschitz, 2-layer MLP to grok on the Addition-Modulo-113 task (40-60% train-test split) in just 44 full-batch steps.
This is more evidence that "we just need to scale up" is a brainworm, and that being smart about the geometry in which we 'place' our weights…
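The setup is easy to sketch. This is a hypothetical reconstruction of the data pipeline and the Lipschitz constraint, not the author's code: enumerate all pairs mod 113, one-hot encode, split 40/60 train-test, and keep each linear map 1-Lipschitz by projecting its spectral norm to at most 1.

```python
import numpy as np

P = 113
rng = np.random.default_rng(0)

# All (a, b) pairs with label (a + b) mod P.
pairs = np.array([(a, b) for a in range(P) for b in range(P)])
labels = (pairs[:, 0] + pairs[:, 1]) % P

def one_hot(idx, n=P):
    out = np.zeros((len(idx), n))
    out[np.arange(len(idx)), idx] = 1.0
    return out

# Concatenated one-hot inputs: (P*P, 2P).
X = np.concatenate([one_hot(pairs[:, 0]), one_hot(pairs[:, 1])], axis=1)

# 40/60 train-test split, as in the tweet.
perm = rng.permutation(len(X))
n_train = int(0.4 * len(X))
train_idx, test_idx = perm[:n_train], perm[n_train:]

def project_lipschitz(W):
    """Rescale W so its spectral norm is at most 1 (a 1-Lipschitz linear map)."""
    s = np.linalg.svd(W, compute_uv=False)[0]
    return W / max(1.0, s)
```

Applying `project_lipschitz` to each layer after every update keeps the whole 2-layer MLP 1-Lipschitz (up to the activation's Lipschitz constant).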
Claim: gpt-5-pro can prove new interesting mathematics.
Proof: I took a convex optimization paper with a clean open problem in it and asked gpt-5-pro to work on it. It proved a better bound than the one in the paper, and I checked the proof: it's correct.
Details below.
Announcing a deep net interpretability talk series!
Every week you will find new talks on recent research in the science of neural networks. The first few are posted: @jack_merulllo_, @RoyRinberg, and me.
At the @ndif_team Youtube Channel: youtube.com/@NDIFTeam.
Post-training research was fueled by the KL-regularized RL mathematical foundation. That led to a lot of algorithmic research and a ton of progress over a few years. This helped us learn how to "distill" metrics back into models.
But today we are optimizing workflows/agents.
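For concreteness, the KL-regularized RL objective that foundation refers to is the standard formulation used in RLHF-style post-training:

```latex
\max_{\theta} \;
\mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot\mid x)}
\big[\, r(x, y) \,\big]
\;-\;
\beta \,\mathrm{KL}\!\big(\pi_\theta(\cdot\mid x) \,\big\|\, \pi_{\mathrm{ref}}(\cdot\mid x)\big),
```

whose closed-form optimum,

```latex
\pi^{*}(y \mid x) \;\propto\; \pi_{\mathrm{ref}}(y \mid x)\,
\exp\!\big( r(x, y) / \beta \big),
```

is what made the algorithmic work (e.g. direct-preference methods) tractable. No such clean foundation yet exists for optimizing multi-step workflows and agents.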
I've written the full story of Attention Sinks — a technical deep-dive into how the mechanism was developed and how our research ended up being used in OpenAI's new OSS models.
For those interested in the details:
hanlab.mit.edu/blog/streaming…
I noticed that @OpenAI added learnable bias to attention logits before softmax. After softmax, they deleted the bias. This is similar to what I have done in my ICLR2025 paper: openreview.net/forum?id=78Nn4….
I used a learnable key bias and set the corresponding value bias to zero. In this way,…
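The trick described here can be sketched in a few lines of numpy. This is a hypothetical minimal version (in real models the sink bias is a learned per-head scalar inside the attention kernel): append one extra logit to the attention row before softmax, then drop its probability mass afterwards, so the remaining weights can sum to less than 1 and no value vector is attended for the sink.

```python
import numpy as np

def softmax_with_sink(logits, sink_bias):
    """Softmax over [logits, sink_bias]; the sink's probability mass is
    then discarded, so attention weights may sum to < 1."""
    z = np.concatenate([logits, [sink_bias]])
    z = z - z.max()                        # numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return p[:-1]                          # delete the sink after softmax
```

A large `sink_bias` absorbs most of the attention mass (a learned "attend to nothing" option); `sink_bias = -inf` recovers ordinary softmax.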
🤖 Some company just released a new set of open-weight LLMs well-suited for your production environment. However, you suspect that the models might be trained with backdoors or other hidden malicious behaviors. Is it still possible to deploy these models worry-free? (1/7)
Take: Chain of Thought is a misleading name. It's really a "scratchpad". "Thoughts" are internal activations
Imagine you're solving a problem and have a scratchpad. Reading the pad gives me info!
You *can* avoid writing down key thoughts. But it's a handicap. Real but fallible
New paper: What happens when an LLM reasons?
We created methods to interpret reasoning steps & their connections: resampling CoT, attention analysis, & suppressing attention
We discover thought anchors: key steps shaping everything else. Check our tool & unpack CoT yourself 🧵
Attention is all you need - but how does it work? In our new paper, we take a big step towards understanding it. We developed a way to integrate attention into our previous circuit-tracing framework (attribution graphs), and it's already turning up fascinating stuff! 🧵