• neetcode1

    NeetCode @neetcode1

    2 months ago

    At this point, how much do people actually care about benchmarks? Calling it now, Grok 4 won't actually be the best model, it's just the classic hype cycle. Starting to see a lot more people catch on.

    31 22 693 42K 28
  • ArtTheVandelay

    Art Vandelay @ArtTheVandelay

    2 months ago

    @neetcode1 X Monetization is a hell of a drug.

    ArtTheVandelay tweet picture

    2 0 6 547 0
  • dariusemrani

    dare @dariusemrani

    2 months ago

    @neetcode1 Most benchmarks are BS. I’m only excited by the ARC score.

    2 1 9 10K 15

    0 0 4 2K 0
  • makersfuel

    Shreyans Bhansali @makersfuel

    2 months ago

    @neetcode1 Benchmarks are the trailer. Real-world use is the movie.

    1 0 1 470 0
  • vaggelis98_

    maverick @vaggelis98_

    2 months ago

    @neetcode1 It is just like IQ tests. If you start solving 10, 20, 30 of them, then you aren't testing your IQ; you are optimizing your reasoning for IQ tests, and therefore the result becomes less and less trustworthy

    0 0 1 80 0
  • kajla_dhananjay

    Dhananjay Kajla @kajla_dhananjay

    2 months ago

    @neetcode1 x.com/kajla_dhananja…

    0 0 0 159 0
  • VTchuiev

    Vladimir Tchuiev @VTchuiev

    2 months ago

    @neetcode1 Are people catching on? Yesterday this entire feed was filled with Grok 4 hype... As expected, it's not AGI but more of a bloated mess

    0 0 0 43 0
  • BreadPirateRob

    Bread @BreadPirateRob

    2 months ago

    @neetcode1 model providers don't disclose quantization level and regularly change it, so the model you actually get from ChatGPT, Claude, Grok, etc. isn't the model that is pegged to the benchmark

    0 0 0 256 0
  • codingwithroby

    Eric Roby @codingwithroby

    2 months ago

    @neetcode1 for sure

    0 0 0 255 0
  • SGRamesh23

    Suraj Gupta @SGRamesh23

    2 months ago

    @neetcode1 It's true, sometimes the hype can overshadow the actual performance.

    0 0 0 19 0
  • dinocres1

    Dino 🇦🇷 @dinocres1

    2 months ago

    @neetcode1 I ignore both benchmarks and initial Twitter over/under hype. Useful models surface to the top organically

    0 0 0 291 0
  • JohannesTscharn

    Johannes Tscharn @JohannesTscharn

    2 months ago

    @neetcode1 Yeah, it’s sad they mostly showed benchmark results and graphs that most people neither really understand nor care about…

    0 0 0 1K 0
  • KaousNadirHatem

    Hatem @KaousNadirHatem

    2 months ago

    @neetcode1 I think benchmarks still matter, just not to end-users. They're crucial for the developers and businesses who have to choose which foundational model to build on top of.

    0 0 0 118 0
  • thecsguy

    nick @thecsguy

    2 months ago

    @neetcode1 no one should. they are meaningless numbers

    0 0 0 12 0
  • ruthuvikas

    Ruthuvikas Ravikumar @ruthuvikas

    2 months ago

    @neetcode1 LMs are getting saturated. These hype claims are just to maintain the stock value.

    0 0 9 1K 0
  • fate1ess

    Aditya @fate1ess

    2 months ago

    @neetcode1 I refuse to believe it's actually smarter than claude at coding.

    1 0 4 758 0
  • thoughtsofayush

    AKS 🇮🇳 @thoughtsofayush

    2 months ago

    @neetcode1 By that logic, solving leetcode questions as a ‘benchmark’ should not be someone's full-time job.

    0 0 3 257 0
  • algorithmsarm64

    Asesh @algorithmsarm64

    2 months ago

    @neetcode1 They started late and are leading now. Things will only get better. Never bet against Elon!!

    0 0 2 322 0
  • gemb0_0

    another boring guy @gemb0_0

    2 months ago

    @neetcode1 With every new Gemini release it tops the charts, but after trying it, it doesn't feel like the best model at all, so for me the benchmarks are just a hype machine to keep the party going

    0 0 1 356 0
  • rfxkairu

    k @rfxkairu

    2 months ago

    @neetcode1 many such cases. grok 3 was hyped and leading some benchmarks, just for no one caring other than musk bootlickers. time and time again, anthropic and openai get dethroned in benchmarks just for them to still have the best models when it comes to actually using them

    0 0 1 370 0
  • userinfo25

    we don't know @userinfo25

    2 months ago

    @neetcode1 You're dumb bro, you also said AI was all hype at the start. We all know how much better and more useful it is now

    1 0 1 108 0
  • rufw91

    rufw @rufw91

    2 months ago

    @neetcode1 Lol. Spoken like a true hater

    1 0 1 320 0
  • lite_745

    Lite @lite_745

    2 months ago

    @neetcode1 Which one is the best for free APIs ?

    0 0 0 8 0
  • sibanda_henry

    Henry Sibanda @sibanda_henry

    2 months ago

    @neetcode1 Everyone caught on ages ago when OpenAI overdid it with strawberry. People now just test out of curiosity, but the excitement has dropped. A tweet of just a 🍓 used to drive people crazy 😂

    0 0 0 15 0
  • actual_amit

    Amit @actual_amit

    2 months ago

    @neetcode1 Yeah, overpromise and underdeliver

    0 0 0 437 0
  • zenitsu_aprntc

    zenitsu_apprentice @zenitsu_aprntc

    2 months ago

    @neetcode1 do you think we'll see like another checkpoint, like the reasoning models, where the models get a serious update/change

    0 0 0 517 0
  • djcarday

    Derek Johnson @djcarday

    2 months ago

    @neetcode1 whyyy so much AI talk? AI bro now :(

    0 0 0 145 0
  • codeStumps

    Aditya @codeStumps

    2 months ago

    @neetcode1 Some benchmarks are manufactured by the authors themselves. We'll soon know about this. Hope it’s not like the llama-gate🤷🏻‍♂️

    0 0 0 189 0
  • harmitwt

    Harmit @harmitwt

    2 months ago

    @neetcode1 Fully agree!! My tweet after I watched yesterday’s presentation

    harmitwt tweet picture

    1 0 0 16 0
  • ibn_haleema

    Ibn Haleema al Kashmiri @ibn_haleema

    2 months ago

    @neetcode1 Yup, for me Claude and deepseek are better code writers than Grok.

    0 0 0 25 0
  • RayanKrishnan

    Rayan Krishnan @RayanKrishnan

    2 months ago

    You called it, it didn't do as well on held-out benchmarks. It's about having the right benchmarks, not just self-reported performance. x.com/_valsai/status…

    RayanKrishnan tweet picture

    2 2 10 1K 1

    0 0 2 48 0