OpenAI recently released its first open-weights model since GPT-2, entering a field led by DeepSeek and Alibaba's Qwen. Ankit (@GuptaAnkitV) breaks down these top OSS models, including what sets them apart under the hood (mixture-of-experts, long-context training, and post-training techniques that shape reasoning and alignment) and how different design choices lead to surprisingly similar performance.

00:00 – OpenAI OSS Launch
01:00 – Comparing Open Source LLM Architectures
01:46 – GPT OSS Overview
02:37 – Under The Hood of GPT OSS
03:25 – Qwen-3 Architecture
04:17 – Qwen-3 Training
05:12 – Qwen-3 Post-Training
06:08 – Qwen-3 Reasoning & RL Innovations
06:52 – DeepSeek V3 Overview
07:40 – DeepSeek V3.1 Updates
08:39 – Attention Mechanism (MLA)
09:39 – Comparing Model Sizes
10:35 – Long Context Strategies
11:25 – Reflections on Methods
12:00 – Takeaways
@ycombinator @GuptaAnkitV Would love to see a breakdown of GLM-4.5 one day👀
@ycombinator @GuptaAnkitV This one is super insightful Bookmarked this for this weekend
@ycombinator @GuptaAnkitV Open weights foster broader access, but careful consideration of responsible development remains crucial. This shift could democratize AI, prompting exciting innovation.
@ycombinator @GuptaAnkitV if you watched the video (or not), here are a few charts to capture this OSS👌 breakdown
@ycombinator @GuptaAnkitV pathetic cunts. do please go F yourself
@ycombinator @GuptaAnkitV Say it isn’t so!
@ycombinator @GuptaAnkitV This is really interesting, can’t wait to see how these models evolve!
@ycombinator @GuptaAnkitV This is actually pretty cool !
@ycombinator @GuptaAnkitV wild that such different training philosophies are all landing at roughly the same performance ceiling
@ycombinator @GuptaAnkitV Amazing video thanks for sharing!
@ycombinator @GuptaAnkitV open weights shift is strategic. openai realizes they can't win on inference costs alone. giving developers the weights while keeping the training infrastructure advantage is smart positioning. deepseek and qwen proved open can compete on quality.
@ycombinator @GuptaAnkitV Future is very exciting!!!
@ycombinator @GuptaAnkitV this is the first timestamped vid i saw in x
@ycombinator @GuptaAnkitV Been testing all three models this week. OpenAI's feels cleaner, but DeepSeek handles complex reasoning better in my experience.
@ycombinator @GuptaAnkitV fascinating deep dive into the latest oss ml models
@ycombinator @paulg @GuptaAnkitV @venice_mind how does this compare to the Qwen models currently being offered on @AskVenice ?
@ycombinator @paulg @GuptaAnkitV Wow, feels like things are heating up again in AI! Curious to see how OpenAI's new model stacks up.
@ycombinator @paulg @GuptaAnkitV Wow, things are really picking up in the AI world. Exciting times ahead!