OpenRouter @OpenRouterAI, Twitter Profile

OpenRouter @OpenRouterAI

2 months ago

Qwen3 Coder has now passed Grok 4 in the Programming prompt rankings. Tied with Kimi!

46 93 1K 118K 268

Florian S @airesearch12

2 months ago

@OpenRouterAI @Alibaba_Qwen my personal feeling: Kimi-K2 > Qwen3-Coder > Grok 4

4 0 21 3K 2

Apple Lamps @lamps_apple

2 months ago

@OpenRouterAI @Alibaba_Qwen I like using grok-4 for code reviews. Then I have sonnet implement the suggestions! Lampcodereview.streamlit.app

0 1 3 1K 1

Himanshu Kumar @codewithimanshu

2 months ago

@OpenRouterAI @Alibaba_Qwen Impressive, but Grok's eval weaknesses with recursion.. might skew this a bit, no?

0 0 1 407 0

Reneil @reneil1337

2 months ago

@OpenRouterAI @Alibaba_Qwen head to head 🤯

0 0 1 40 0

michielh.eth @michieldoteth

2 months ago

@OpenRouterAI @Alibaba_Qwen METRICS HAVE SPOKEN 🗣 x.com/michieldoteth/…

0 1 2 412 0

0 0 0 66 0

@OpenRouterAI @Alibaba_Qwen Used it yesterday with OpenRouter and OpenCode. It was VERY good, probably a bit better than Sonnet imo. But so expensive - 30 min of use cost me 13 dollars. Tested Sonnet on OR - it was cheaper because of cache hits. I wish qwen had cache on OR, because the model is so good.

2 0 16 2K 2

Karim C @BrandGrowthOS

2 months ago

speaking of kimi k2 - been testing all access methods this week for our production migration

here's the real breakdown nobody talks about:

OFFICIAL API ($0.15 input / $2.50 output)
- 42 tokens/sec, 0.55s latency
- direct relationship for debugging/support
- hosted in china (latency considerations)
- most reliable for production systems

OPENROUTER ($0.55-1.00 input / $2.20-3.00 output)
- varies by backend provider
- perfect for multi-model workflows
- automatic failover = zero downtime
- openai SDK compatible (huge win)

GROQ ($1.00 input / $3.00 output)
- 250 tokens/sec (insanely fast)
- 4.6s first token (tradeoff)
- best for real-time applications
- US infrastructure

migrated our sales analysts from claude sonnet yesterday. went official API for production, openrouter for experimentation

all three crush sonnet on cost - 80% savings with identical quality. temperature 0.1 for deterministic outputs works perfectly

bottom line:
- production: official API
- multi-model testing: openrouter
- speed demons: groq

saved $3k/month vs claude. performance identical where it matters
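
The prices quoted in this post can be compared for a concrete workload. Below is a minimal sketch, assuming the figures are USD per million tokens (the post does not state the unit). The `monthly_cost` helper and the workload of 200M input / 50M output tokens are hypothetical, chosen only for illustration. The OpenRouter range is split into a low and a high entry.

```python
# Prices as quoted in the post: (input, output).
# ASSUMPTION: the unit is USD per million tokens; the post does not say.
PRICES_USD_PER_MTOK = {
    "official": (0.15, 2.50),
    "openrouter_low": (0.55, 2.20),
    "openrouter_high": (1.00, 3.00),
    "groq": (1.00, 3.00),
}


def monthly_cost(input_tokens: int, output_tokens: int, provider: str) -> float:
    """Return the USD cost of a workload at the quoted prices (hypothetical helper)."""
    price_in, price_out = PRICES_USD_PER_MTOK[provider]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200M input and 50M output tokens per month.
    for name in PRICES_USD_PER_MTOK:
        cost = monthly_cost(200_000_000, 50_000_000, name)
        print(f"{name:16s} ${cost:,.2f}")
```

Because output tokens are priced much higher than input tokens in every quoted tier, the ratio of output to input in the actual workload largely determines which access method comes out cheapest.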

2 0 14 2K 12

Dimitris Efstathiadis @Dimitris_Efsta

2 months ago

It’s not that good. 4.1 and Sonnet 4.0 are far better! Too much stuff for nothing. What are those benchmarks on? I gave it multiple shots and it was bad, bad. Not intermittent bad or interesting bad! It was straight up wrong, too much code that didn’t make sense and hallucinations galore..

4 0 6 1K 0

Shen Sean Chen @ShenSeanChen

2 months ago

@OpenRouterAI @Alibaba_Qwen Tested out a one shot 3d game generation using Qwen3 Coder. Doing pretty well. x.com/shenseanchen/s…

0 1 9 2K 3

0 0 5 583 2

jeff kazzee @JeffKazzee

2 months ago

@OpenRouterAI @Alibaba_Qwen Qwen is an amazing model - it's fucking SOTA. For anything I throw at it, it performs similarly to Opus. Good prompting (with the BMAD Method) leads to even BETTER outputs than vanilla Opus. Nothing beats Claude Code yet, other than models that are locked in LMArena.