Excited to share what I worked on during my time at Meta.
- We introduce a Triton-accelerated Transformer with *2-simplicial attention*—a tri-linear generalization of dot-product attention
- We show how to adapt RoPE to tri-linear forms
- We show 2-simplicial attention scales…
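For readers new to the idea, here is a minimal pure-Python sketch of one common way to write tri-linear attention. It is an illustration under stated assumptions, not the Triton kernel from the post. The function name `two_simplicial_attention`, the `1/sqrt(d)` scaling, the joint softmax over key pairs, and the elementwise value product are all assumptions chosen for clarity. RoPE and causal masking are left out.

```python
import math


def two_simplicial_attention(q, k1, k2, v1, v2):
    """Tri-linear attention sketch (hypothetical helper, not the authors' code).

    q:      n x d list of query vectors
    k1, k2: m x d lists of key vectors (two key projections)
    v1, v2: m x d lists of value vectors (two value projections)
    Returns an n x d list of output vectors.
    """
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)  # assumed scaling, mirroring dot-product attention
    out = []
    for qi in q:
        # Tri-linear logits: A[j][k] = scale * sum_l qi[l] * k1[j][l] * k2[k][l]
        logits = [
            [scale * sum(a * b * c for a, b, c in zip(qi, kj, kk)) for kk in k2]
            for kj in k1
        ]
        # Softmax taken jointly over all (j, k) pairs; subtract the max for stability
        peak = max(max(row) for row in logits)
        weights = [[math.exp(x - peak) for x in row] for row in logits]
        total = sum(sum(row) for row in weights)
        # Output: probability-weighted sum of elementwise products v1[j] * v2[k]
        oi = [0.0] * len(v1[0])
        for j, row in enumerate(weights):
            for k, w in enumerate(row):
                p = w / total
                for l in range(len(oi)):
                    oi[l] += p * v1[j][l] * v2[k][l]
        out.append(oi)
    return out
```

Each query scores every pair of positions rather than every single position, so this naive version costs O(n·m²·d), which is part of why a fused kernel matters in practice.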
We updated our codebase with a Colab notebook to finetune Gemma 3 (12B) using a TPU v6e-1 with just 32 GB of memory. We implemented everything from scratch in JAX, including sampling! We also updated our paper to be more explicit about parameter precision.
github.com/martin-marek/b…
Our IMO gold medal-winning AI pipeline is now model-agnostic. 🥇
What worked for Gemini 2.5 Pro now gets the same 5/6 score with GPT-5 & Grok 4. This confirms the power of our verification-and-refinement pipeline to improve base model capabilities.
The new code & results are…
🔍 Ever notice how attention layers only tweak the residual stream in a low-dimensional way? That low-rank writing is exactly why so many SAE features stay dead—until Active Subspace Init rescues them. 👇
#AI #ML #MechInterp
Every time I start making a research poster thinking I know how to make these things, I’m instantly corrected by the abomination that is my first draft.