Ted - 🥖/acc @ted_engineer

🇫🇷 / 25yo / dad / deep learning chef @duonlabshq · theoboyer.fr · Savoie · Joined August 2023

Tweets: 688
Followers: 179
Following: 754
Likes: 914

himanshu @himanshustwts

2 days ago

The Lore of Kalomaze! ⚡️ bringing a great pod with @kalomaze (20yo ml researcher, prime intellect) - we'd talked about training, finetuning, RL (environments and recipes), scaling, working at PI and a Lot of Lores! (link in replies)

24 39 330 40K 126

Ted - 🥖/acc @ted_engineer

a week ago

Mom, can we have trainable attention sinks at home ? Mom: Combining multiple tricks from the TL during the last week we can have efficient attention sinks without taking the trouble to write custom bwd kernels

2 9 88 10K 73

Ted - 🥖/acc @ted_engineer

a week ago

Funny how most of deep learning is about not preventing the model from learning

0 0 0 41 0

Ted - 🥖/acc @ted_engineer

a week ago

PI's Environments Hub is the american way of doing ARC-AGI-3

0 0 0 56 0

Andrej Karpathy @karpathy

a week ago

In era of pretraining, what mattered was internet text. You'd primarily want a large, diverse, high quality collection of internet documents to learn from. In era of supervised finetuning, it was conversations. Contract workers are hired to create answers for questions, a bit…

Prime Intellect @PrimeIntellect

a week ago

117 394 3K 1.2M 2K

261 850 7K 844K 5K

Prime Intellect @PrimeIntellect

a week ago

Introducing the Environments Hub RL environments are the key bottleneck to the next wave of AI progress, but big labs are locking them down We built a community platform for crowdsourcing open environments, so anyone can contribute to open-source AGI

117 394 3K 1.2M 2K

You Jiacheng @YouJiacheng

a week ago

lol great collaboration. glad to contribute 2 bits here.

Ted - 🥖/acc @ted_engineer

a week ago

2 9 88 10K 73

0 1 20 2K 4

Thien Tran @gaunernst

2 weeks ago

TIL you can use PyTorch's built-in varlen attention directly. Backward works too gist.github.com/gau-nernst/1d8… Say goodbye to flash-attn😂. Thanks @main_horse for pointing this out to me.

11 12 188 17K 151

Ted - 🥖/acc @ted_engineer

2 weeks ago

data > optimizer > HP > model size > model arch

0 0 1 29 0

will brown @willccbb

2 weeks ago

i'll confess i do have a very specific mission in mind with this project. the semi-vague private beta rollout is part of it. the set of tasks we're sourcing is part of it. the GPU bounties are part of it. the shitposts are part of it. the podcasts are part of it. mindshare is…

45 78 757 227K 419

leloy! @leloykun

2 weeks ago

I've finally solved steepest descent on Finsler-structured (matrix) manifolds more generally. This generalizes work by me, @jxbz, and @Jianlin_S on Muon, Orthogonal Muon, & Stiefel Muon. --- The general solution turned out to be much simpler than I thought. And it should…