Pragmatic Learning Theory, using tools from probability and statistics | PhD in Stats @warwickstats 🇬🇧 | MMathStat @warwickstats 🇬🇧 | yuanhez.github.io | Joined September 2022
OpenAI hasn’t open-sourced a base model since GPT-2 in 2019. They recently released GPT-OSS, which is reasoning-only...
or is it?
turns out that underneath the surface, there is still a strong base model. so we extracted it.
introducing gpt-oss-20b-base 🧵
Another bad news for reasoning LLMs 🤔
The paper claims Chain-of-Thought in Language Models is a brittle mirage bounded by training data: pattern matching rather than genuine inference. 🤯
Argues that chain of thought in LLMs is pattern replay bound to training…
Steal my ChatGPT prompt to master any topic using Feynman technique.
--------------------------------
FEYNMAN LEARNING COACH
--------------------------------
#CONTEXT:
Adopt the role of breakthrough learning architect. The user struggles with complex concepts that traditional…
"A Unified Theory of Language"
The paper argues language is a fast Bayesian pattern system, shaped by sexual selection to display intelligence.
It uses Construction Grammar, where a construction is a stored pairing that links form to meaning across words and gestures.…
Yet another paper in the line of the "illusion" of thinking abilities of LLMs, this time from Japan. 😃
The author’s core logic is that true reasoning requires 100% guaranteed correctness, where premises must give conclusive, relevant evidence for the conclusion, and LLMs can never…
A useful thing GPT-5 can do that wasn’t possible before powerful AI is to monitor complex topics, by asking it to give you scheduled reports.
Example: I have a weekly report on “reproducible, benchmarked evidence of autonomous or recursive self‑improvement in AI”
You don’t need GPT-5 or Claude 5...
You need better prompts.
MIT just confirmed what AI experts already knew:
Prompting drives 50% of performance.
Here’s how to level up without touching the model:
This one paper might kill the LLM agent hype.
NVIDIA just published a blueprint for agentic AI powered by Small Language Models.
And it makes a scary amount of sense.
Here’s the full breakdown:
Is Chain-of-Thought Reasoning of LLMs a Mirage?
... Our results reveal that CoT reasoning is a brittle mirage that vanishes when it is pushed beyond training distributions. This work offers a deeper understanding of why and when CoT reasoning fails, emphasizing the ongoing…
Beautiful @GoogleResearch paper.
LLMs can learn in context from examples in the prompt, can pick up new patterns while answering, yet their stored weights never change.
That behavior looks impossible if learning always means gradient descent.
The mechanisms through which this…
Confused about recent LLM RL results where models improve without any ground-truth signal? We were too. Until we looked at the reported numbers for the pre-RL models and realized they were severely underreported across papers. We compiled the discrepancies in a blog below 🧵👇
982 Followers 631 Following | Ph.D. student working on the foundations of AI/ML at @Penn.
Previously M.A. in Statistics @Wharton @Penn & B. Sc. in EE @ Sharif.
QR Intern @ Point72 (Cubist).
3K Followers 3K Following | Tweets about Bayesian statistics, Monte Carlo, etc. from a Reader in Statistics at the University of Warwick. Personal account.
1K Followers 244 Following | Department of Statistics, University of Warwick. Home of MORSE, CRiSM, AS&RU and APTS. Top 20 QS World University Rankings in Statistics & Operational Research.
3K Followers 744 Following | Director UCL Centre for AI and UiPath Distinguished Scientist. Co-founder https://t.co/Wx3VpUByR2. Pro: cycling, walking, EU. @[email protected]. Views my own.
268 Followers 230 Following | Assistant Professor @WarwickDCS 🇬🇧 | prev. @EPFL 🇨🇭 @KU_Leuven 🇧🇪 @sjtu1896 🇨🇳 | math foundations of ML | co-organizer Foundation of AI seminar @FAIS_Warwick
163K Followers 166 Following | Co-founder of Thinking Machines Lab @thinkymachines; Ex-VP, AI Safety & robotics, applied research @OpenAI; Author of Lil'Log
595K Followers 47K Following | Gallery of all things aesthetically pleasing 📸 images from multiple sources online | DM for credits, author claims or inquiries.
16K Followers 495 Following | Harvard Professor.
Full stack ML and AI.
Co-director of the Kempner Institute for the Study of Artificial and Natural Intelligence.
3K Followers 388 Following | I like Physics, Statistics, Machine learning, Computer Science & above all playing 🎸. Happy dad 👧 👧. Also professor @ EPFL. Views are my own.
691 Followers 664 Following | Work on foundations of AI, MLLM reliability/eval, optimization, probability/stats, AI 4 math/science/med; Prof & director of center on AIF4S @USC 🚲🏔️🥾🏊‍♂️
18K Followers 4K Following | Associate Professor at UC Berkeley. Former Research Scientist at Google DeepMind. ML/AI Researcher working on foundations of LLMs and deep learning.
3K Followers 531 Following | Professor of CSC at @Concordia (CRC chair) & @Mila_Quebec. Visiting prof @AIatMeta. Previously @AIatMeta, @criteo, @inria. Interested in the principles of ML.