XMaster96 @_XMaster96

Former Senior AI researcher @Aleph__Alpha
EVE Online player since 2013
Co-Founder Pageshift Entertainment - Building the worst best story telling AI
pageshift.ai
Joined April 2024

Tweets: 96
Followers: 112
Following: 66
Likes: 46

XMaster96 @_XMaster96

23 hours ago

I feel like it is such an achievement that people are starting to appreciate what we are doing here, considering how many times I was told to just simply build a ChatGPT wrapper instead of our own models...

pageshift.ai @pageshiftAI

3 days ago

3 4 26 4K 15


1 1 4 96 1


XMaster96 @_XMaster96

3 days ago

we were cooking HARD

pageshift.ai @pageshiftAI

3 days ago


3 4 26 4K 15


1 0 3 138 0

Tim @tymberger

a week ago

POV: Silicon Valley networking events. Founder 1: So what are you working on? Founder 2: We are building this AI Agent B2B SaaS business. What about you? Founder 1: Oh that’s so cool, we are also building an AI Agent B2B SaaS product. Founder 2: Awesome, we should connect on…

0 1 1 164 0

XMaster96 @_XMaster96

2 weeks ago

how the heck is Gemini (via the official chat interface) so unbelievably bad at using web search to look up documentation for a library it clearly does not know how to use

0 0 1 121 0

XMaster96 @_XMaster96

a month ago

After playing with gpt-oss for a bit, I sadly have to say that it gives me major Microsoft Phi vibes. Heavily overtrained on synthetic data and quite fragile in real-world settings.

0 0 5 274 0

XMaster96 @_XMaster96

a month ago

apparently I am not the only one who thinks that gpt-oss is bad

NomoreID @Hangsiin

a month ago


17 17 230 25K 74


0 0 2 202 0

XMaster96 @_XMaster96

2 months ago

Nice, we just got V5 and V6 TPUs 🥳

0 0 0 175 0


XMaster96 @_XMaster96

2 months ago

And do we get the base model checkpoint?

Qwen @Alibaba_Qwen

2 months ago


318 1K 9K 2.0M 4K


0 1 1 376 0

XMaster96 @_XMaster96

2 months ago

I can’t sleep right now so I started to read the source code of chatterbox from @resembleai and I really have to say, their audio tokenizer is damn smart, and the reason why their model sounds this good. They are basically doing diffusion inference steps to clean up their audio.…

0 0 1 106 0

XMaster96 @_XMaster96

2 months ago

Oh, damn @Alibaba_Qwen never published their pretrained Qwen3-32B model...

0 0 0 108 0

XMaster96 @_XMaster96

2 months ago

Somehow Claude is better at writing wrappers for the Gemini API than Gemini is

0 0 2 139 0

XMaster96 @_XMaster96

2 months ago

Just to quickly explain what I am working on: I need a dynamic sparse Mixture of Experts (MoE) kernel that allows for highly uneven, batch-based routing behavior. In a normal MoE training setting we assume / force an even usage of all experts across the full batch. Which is…

XMaster96 @_XMaster96

2 months ago

1 0 3 433 0

0 0 1 286 1

XMaster96 @_XMaster96

2 months ago

This is now the third time in a row that I was already lying in bed and got back up because I had a new idea for a parallel algorithm that would turn a really expensive dense multiplication into a really efficient sparse one. It would be so easy if TPUs would allow Vector…

1 0 3 433 0

XMaster96 @_XMaster96

2 months ago

I remember when GPT-1 was a joke

Tanishq Mathew Abraham, Ph.D. @iScienceLuvr

2 months ago


26 20 510 45K 19

0 0 2 170 0

XMaster96 @_XMaster96

2 months ago

Oh great, the normies have finally discovered year-old memes

Judd Rosenblatt @juddrosenblatt

2 months ago


348 1K 9K 1.8M 10K


0 0 2 354 0

XMaster96 @_XMaster96

3 months ago

We are re-writing our code base from Torch to JAX right now. And oh boy, it is a good feeling to finally use an XLA-based framework again. This is like waking up from a really long and bad dream

3 0 7 395 0

Tim @tymberger

3 months ago

I was in the audience, and one key point that wasn’t mentioned here was their argument around the Jevons Paradox: the idea that people will consume more content simply because it’s easier and cheaper to access. At pageshift, we strongly believe this will be the case. People…