Alexander Long @_AlexanderLong

Founder @PluralisHQ | ML PhD
Protocol Learning: Multi-participant, low-bandwidth model parallel
Pluralis.ai
Joined July 2023

Tweets: 362
Followers: 2K
Following: 990
Likes: 2K

Alexander Long @_AlexanderLong

4 days ago

People think about pretraining runs as single long monolithic loss curves but they're not like that even in the centralised case. You run stuff, move it through different data stages, go back and fork off a checkpoint, change some norm somewhere etc. etc.

Alex Hägele @haeggee

4 days ago

9 24 267 42K 216

2 0 12 935 1

Alexander Long @_AlexanderLong

2 weeks ago

perps on god

Shirin Ghaffary @shiringhaffary

3 weeks ago

12 24 349 132K 105

0 0 9 573 0

Alexander Long @_AlexanderLong

4 weeks ago

Obsessing over the SWE-bench chart is one of the most mid-curve things I've ever seen. Take a second and absorb what's actually been achieved here. Understandable to not like OpenAI since they're destroying your whole identity, but don't pretend it's the chart that's the issue.

2 0 18 643 1

Alexander Long @_AlexanderLong

a month ago

Every single datapoint pointing towards GPT-5 being really really good.

Eric Wallace @Eric_Wallace_

a month ago

113 384 3K 618K 528

1 0 18 1K 4

roon @tszzl

a month ago

pretraining is an elegant science, done by mathematicians who sit in cold rooms writing optimization theory on blackboards, engineers with total absorb of distributed systems of titanic scale posttraining is hair raising cowboy research where people drinking a lot of diet coke…

96 197 3K 261K 729

Alexander Long @_AlexanderLong

a month ago

Missing the point that it'll be a system whose behaviour is controlled by a few people, in companies that don't have the best track record when it comes to this kind of thing. It's fundamentally a level of power that's never existed before. It's that simple.

Haider. @slow_developer

a month ago

92 80 522 151K 188

1 1 22 1K 1

Alexander Long @_AlexanderLong

2 months ago

That's a wrap for ICML 2025. Incredible to watch the space go from "What are you talking about" to "That's impossible" to "Hmmm that's very interesting" in just over a year. @tha_ajanthan @hmdolatabadi

4 5 53 4K 1

vitrupo @vitrupo

2 months ago

Jack Dorsey says AI must be permissionless because constraint kills innovation. Five CEOs shouldn't dictate what brings humanity forward. Open source is the answer. To protect ourselves, we have to race ahead. Eliminating single points of failure before they become…

284 416 3K 452K 1K

Nat McAleese @nmca

2 months ago

If you claim to have seen this coming, you better have the manifold P&L to back it up

12 4 143 22K 15

Alexander Long @_AlexanderLong

2 months ago

Noam tends to not exaggerate.

Noam Brown @polynoamial

2 months ago

4 40 725 42K 51

2 1 11 777 0

Jake Brukhman @jbrukh

2 months ago

Welcome to DeAI Summer. My op ed in @CoinDesk on the progress we're making in decentralized AI. coindesk.com/opinion/2025/0…

10 16 56 6K 12

Alexander Long @_AlexanderLong

2 months ago

Totally agree - Flower Labs is another group actively publishing great stuff and now squarely focused on decentralised training. Should be a major datapoint for everyone still skeptical of the area - the Flower team is as legitimate as it gets and Nic Lane is pretty much top of the…

nic lane @niclane7

2 months ago

0 1 15 2K 1

1 1 16 1K 1

kel @kelxyz_

2 months ago

Pluralis + Prime Intellect + Ambient flips Ethereum by 2030

19 4 130 12K 59

Max Ryabinin @m_ryabinin

2 months ago

From my experience, getting a paper on decentralized DL accepted to top-level conferences can be quite tough. The motivation is not familiar to many reviewers, and standard experiment settings don't account for the problems you aim to solve. Hence, I'm very excited to see…

Alexander Long @_AlexanderLong

2 months ago

10 9 99 20K 29

2 7 43 8K 10

Alexander Long @_AlexanderLong

2 months ago

Feel like Meta closing models was very predictable. I explicitly said this would happen last year and explained why (from blog.pluralis.ai/p/article-2-pr…).

Shane Gu @shaneguML

2 months ago

59 46 817 170K 227

3 3 20 4K 1

Alexander Long @_AlexanderLong

2 months ago

Hidden in the article: "Furthermore, there have been some billion dollar offers that were not accepted by researcher/engineering leadership at OpenAI." I believe that if @dylan522p wrote it, it's true... but how can that be possible?

SemiAnalysis @SemiAnalysis_

2 months ago

40 58 704 166K 205

0 0 3 474 2

Greg Osuri 🇺🇸 deAI Summer 2025 @gregosuri

2 months ago

Spoke 50 minutes straight to a packed room of cracked AI researchers at ICML, presenting work by @akashnet_, @PrimeIntellect, @gensynai, @NousResearch, @PluralisHQ, and @GoogleDeepMind. There is now an enormous interest in DeAI. Mission (Partially) Accomplished.

Amanda @vakaytion

2 months ago

0 2 33 9K 1

16 34 198 11K 10

Alexander Long @_AlexanderLong

2 months ago

People forget policy-gradient-based RL is the most data-inefficient form of training. There are going to be major algorithmic advances in RL'ing the base models, probably using something like artificial curiosity (arxiv.org/pdf/1705.05363). But the current methods are not there.

Andrej Karpathy @karpathy

2 months ago

412 863 8K 1.1M 6K

1 0 10 654 2

Alexander Long @_AlexanderLong

2 months ago

For people not familiar with AI publishing: there are 3 main conferences every year, ICML, ICLR and NeurIPS. These are technical conferences and the equivalent of journals in other disciplines - they are the main publishing venue for AI. The competition to have papers at these…

Pluralis Research @PluralisHQ

2 months ago

11 14 120 28K 15

10 9 99 20K 29

Alexander Long @_AlexanderLong

2 months ago

Using beautiful Grafana dashboards for everything internally, so much nicer than TensorBoard. Wandb is still good but doesn't really work with decentralised training. Makes me wonder what the internal vis tooling is like at OpenAI - must be incredible.