Sir Hakase @codehakase, Twitter Profile

3 months ago

Random thought: If older LLMs scored >90% on benchmarks, why are they suddenly “bad” the moment a new one drops? Were they just trained to ace the test, not generalise?
