🇨🇳 China, recent days:
Kimi K2
Qwen3-235B-A22B
Qwen 3 Coder
Qwen Small+Medium Models
New StepFun MoE
ZAI GLM 4.5 and GLM 4.5 Air
InternLM Intern S1
All open weights (mostly permissive license)
🇺🇸 US:
OpenAI - Study Mode in ChatGPT
Anthropic - Claude Max Pricing
So, US is still a leader in AI, right?
@1littlecoder great to see those open weights rolling out, China's pushing hard. The US needs to stay sharp or risk losing ground.
@1littlecoder Grok 4? @grok what is he forgetting
@1littlecoder Open weights make sense when you're behind. These models, while powerful, don't meet the current benchmarks of Gemini & o3
@1littlecoder lol we literally just migrated mario from Claude to Kimi K2 last month: 80% cost savings, same performance on restaurant analytics. While US companies debate pricing strategies, China ships production-ready models weekly. Momentum > marketing every single time.
@1littlecoder OpenAI was the company that started it all, and we should be grateful for that. But they’re no longer leading the way and probably won’t again. GPT-5 might be impressive, but it doesn’t seem like a real game changer in everyday use
@1littlecoder Which of these models beat SOTA models in their category? Take code generation: nothing beats even Claude 3.7 today, after a year.
Study mode is highly usable. OpenAI Agents and Grok4 with Ultra are impressive beasts. Mistral (EU) delivers on a non-glorified but very promising level. GPT-5 is coming. I expect future Gemini models to perform on a similar level. But surely, China has not only caught up to U.S.-based top models, it is set to overtake them. The race is still on.
@1littlecoder It's the consequence of a disorganized strategy. Compare the two governments' published plans for AI and you will see.
@1littlecoder Yes they are. OpenAI and Anthropic have the most powerful models on the market.
@1littlecoder I don’t like CCP LLMs even if they are open source 🤷
@1littlecoder The speed and openness from China this year has been wild.
@1littlecoder All these Chinese models are not important.
@1littlecoder now do it again but using LLM-product based sales instead
My concern is they are not publishing anything more... The latest things I know about are: 1. Natively trainable sparse attention and FlashAttention-3. 2. Using simple gated MoE. 3. Reinforcement learning with DPO and PPO. I don't see any new publications in this field beyond fine-tuning.
@1littlecoder It's all so secret, so how would we know? On the other hand, Claude and Gemini 2.5 are actually the ones to beat.
@1littlecoder Yes, all open-source models are "on par" with closed-source models that were released a few months ago. And OpenAI/Google already have much more capable models. Look a little into the future and see who the trendsetters are. And replicating is much easier than inventing.
@1littlecoder Yes. Even though I'm Chinese, I have to admit that the most advanced AI models in the US are stronger. Chinese AI models lack multimodal support and their IQ isn't at the top level.
@1littlecoder How would you compare these products from the users' perspective, @grok?
@1littlecoder Maybe in paid AI... the datacenter power behind US AI may be bigger than that behind Chinese AI. The future will be about growth and adoption. Fierce fight ahead!
@1littlecoder chinese AI researchers are all insanely intelligent
@1littlecoder I wouldn’t have predicted China’s models to be open source
@1littlecoder The problem is that both countries are wasting talent and resources on these dumb LLMs and diffusion models, not on the path to AGI. This LLM and diffusion hype has delayed progress by at least 10 years.
@1littlecoder Specifically for complex reasoning, can you suggest the best open-source model so far?