Cerebras Inference powers the world's top models - like Qwen3 Coder at 2000 tokens/s 🤯 And you can now access these models directly in @code with the @CerebrasSystems extension and an API key (that you can get for free from cerebras.ai!) aka.ms/VSCode/Cerebras
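For anyone wanting to hit the API directly rather than through the extension: Cerebras exposes an OpenAI-style chat-completions endpoint, so a request can be built with nothing but the standard library. The URL path and the model identifier below are assumptions for illustration, not confirmed by this thread; check the docs at cerebras.ai for the real values.

```python
import json

# Assumed endpoint and model id -- verify against the official Cerebras docs.
CEREBRAS_URL = "https://api.cerebras.ai/v1/chat/completions"
MODEL = "qwen-3-coder-480b"  # hypothetical identifier for Qwen3 Coder

def build_request(api_key: str, prompt: str) -> tuple[str, dict, bytes]:
    """Build an OpenAI-style chat-completion request for the Cerebras API."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return CEREBRAS_URL, headers, body
```

The returned triple can be sent with `urllib.request` or any HTTP client; the free API key from cerebras.ai goes in the `Authorization` header.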
@code @CerebrasSystems 💯💯 Works pretty awesomely 😂 the best option so far for using Cerebras Code
@code @CerebrasSystems It just rate-limits instantly; it’s only 15 requests/minute. You get much more token throughput and quality from just standard Sonnet 4
@code @CerebrasSystems @grok Free for how long?
@code @CerebrasSystems This will be a great perk for building ❤️ @Alibaba_Qwen
@code @CerebrasSystems I couldn't try it because of a 'Reason: 400 Please reduce the length of the messages or completion. Current length is 65591 while limit is 65536' error. Also, the context window is 131k and the starter package costs $50. I don't see the possible use cases for this product.
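One way to work around that 400 error is to trim conversation history client-side before sending. A minimal sketch, assuming the limit is measured per message content; the character count used here is only a rough stand-in for the server's token count, so a real tokenizer would be needed for accurate budgeting:

```python
def trim_history(messages, limit=65536, count_len=lambda m: len(m["content"])):
    """Drop the oldest non-system messages until the total length fits.

    `count_len` is a rough proxy for the server's token count -- swap in a
    real tokenizer for accurate budgeting against the 65536 limit.
    """
    msgs = list(messages)
    total = sum(count_len(m) for m in msgs)
    while total > limit and len(msgs) > 1:
        # Keep a leading system message, if any, and drop the next-oldest turn.
        idx = 1 if msgs[0].get("role") == "system" else 0
        total -= count_len(msgs[idx])
        del msgs[idx]
    return msgs
```

Run over the message list before each request, this keeps the payload under the limit at the cost of the oldest turns falling out of context.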
@code @CerebrasSystems Why is Gemini 2.5 Pro so lazy, stopping before it finishes any task??
@code @CerebrasSystems Don't even attempt it, guys; with such terrible rate limits it's pretty useless.
@code @CerebrasSystems It's always bugged me that Azure doesn't host Chinese models and serve them in @code for 0x, especially with the release of models like GLM 4.5 and Kimi K2 0905