Tingchen Fu @TingchenFu

Incoming PhD student @UniofOxford and @MetaAI, prev. Renmin University of China (RUC)

tingchenfu.github.io

Beijing, China

Joined September 2022

Tweets: 184

Followers: 220

Following: 644

Likes: 1K

Jakob Foerster @j_foerst

4 days ago

After thousands of papers on meta-learning, the approach that ended up being successful (ICL) was an accidental byproduct of language modeling. Serendipity at its best and a good reminder that research needs to be open-ended and pursue a diversity of goals to escape local minima.

11 23 258 24K 70

Feng Yao @fengyao1909

a month ago

Failing on large-scale RL with VeRL? ⚠️ Mixing an inference backend (vLLM/SGLang) with a training backend (FSDP/Megatron) secretly turns your RL off-policy, even if they share the same weights! 📉 Blog:…

13 117 717 128K 647

Owain Evans @OwainEvans_UK

2 months ago

New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵

292 1K 9K 1.9M 5K

Micah Goldblum @micahgoldblum

2 months ago

🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n

28 114 830 390K 646

Lorenzo Xiao @lrzneedresearch

2 months ago

my favorite joke of the year

19 5 260 54K 21

Rohan Paul @rohanpaul_ai

2 months ago

The Grok 4 benchmark chart (leaked version) is just beautiful. Did @xai really hit 45% on HLE (Humanity's Last Exam)? 🤯 The HLE test is so hard: it holds 2,500 expert-written questions spanning more than 100 subjects, including math, physics, computer science and…

TestingCatalog News 🗞 @testingcatalog

2 months ago

32 64 635 348K 95

5 7 52 34K 13

Valentina Pyatkin @valentina__py

2 months ago

💡Beyond math/code, instruction following with verifiable constraints is suitable to be learned with RLVR. But the set of constraints and verifier functions is limited and most models overfit on IFEval. We introduce IFBench to measure model generalization to unseen constraints.

5 95 353 48K 184

Yupeng Hou @yupenghou97

2 months ago

All reviews: positive; Meta: accept; Rejected by #RecSys2025. I get that decisions are complex, e.g., "maintaining a competitive ac rate expected for top-tier conf." Still, it is frustrating to see months of work from an amazing team dismissed in a single shot, with no further feedback.

3 6 24 2K 1

Fazl Barez @FazlBarez

2 months ago

Excited to share our paper: "Chain-of-Thought Is Not Explainability"! We unpack a critical misconception in AI: models explaining their Chain-of-Thought (CoT) steps aren't necessarily revealing their true reasoning. Spoiler: transparency of CoT can be an illusion. (1/9) 🧵

26 136 646 115K 462

Bo Liu (Benjamin Liu) @Benjamin_eecs

2 months ago

We've always been excited about self-play unlocking continuously improving agents. Our insight: RL selects generalizable CoT patterns from pretrained LLMs. Games provide perfect testing grounds with cheap, verifiable rewards. Self-play automatically discovers and reinforces…

4 50 273 65K 182

Feng Yao @fengyao1909

2 months ago

😵‍💫 Struggling with fine-tuning MoE? Meet DenseMixer, an MoE post-training method that offers a more precise router gradient, making MoE easier to train and better performing! Blog: fengyao.notion.site/moe-posttraini……

4 60 272 58K 218

Yafu Li @yafuly

2 months ago

Excited to share our ACL 2025 oral presentation—see you in Vienna!!

Jianhao (Elliott) Yan @yan_elliott

2 months ago

Excited to share our ACL 2025 oral presentation—see you in Vienna!!

2 5 16 2K 1

1 5 5 977 1

bycloud @bycloudai

2 months ago

many such cases

20 44 859 32K 70

Rohan Paul @rohanpaul_ai

2 months ago

🚨 CHINA’S BIGGEST PUBLIC AI DROP SINCE DEEPSEEK: @Baidu_Inc open-sources Ernie, 10 multimodal MoE variants 🔥 Surpasses DeepSeek-V3-671B-A37B-Base on 22 out of 28 benchmarks 🔓 All weights and code released under the commercially friendly Apache 2.0 license (available on…

27 97 434 52K 272

Andrei Lupu @_andreilupu

2 months ago

Theory of Mind (ToM) is crucial for next gen LLM Agents, yet current benchmarks suffer from multiple shortcomings. Enter 💽 Decrypto, an interactive benchmark for multi-agent reasoning and ToM in LLMs! Work done with @TimonWilli & @j_foerst at @AIatMeta & @FLAIR_Ox 🧵👇

4 29 104 22K 36

Joelle Pineau @jpineau1

2 months ago

I'm excited to be joining the board of the Laude Institute! We need more support and incentives for university researchers who have great ideas and early results to accelerate their work, and build new real-world solutions that have a meaningful impact on people and society.

Andy Konwinski @andykonwinski

2 months ago

57 121 1K 322K 462

25 22 304 49K 40

Jonny Cook @JonnyCoook

2 months ago

Can an LLM be programmed? In our new preprint, we show that LLMs can learn to evaluate programs for a range of inputs by being trained on the program source code alone – a phenomenon we call Programming by Backprop (PBB). 🧵⬇️

6 32 128 19K 70

Tanishq Mathew Abraham, Ph.D. @iScienceLuvr

3 months ago

EvoLM: In Search of Lost Language Model Training Dynamics "We present EvoLM, a model suite that enables systematic and transparent analysis of LMs' training dynamics across pre-training, continued pre-training, supervised fine-tuning, and reinforcement learning. By training…

8 25 136 10K 68

DailyPapers @HuggingPapers

3 months ago

Discrete Diffusion in Large Language and Multimodal Models: A Survey just released on Hugging Face Get an overview of research in discrete diffusion LLMs and MLLMs, which achieve performance comparable to autoregressive models with up to 10x faster inference!

6 94 378 46K 261

Chenxin An @AnChancy46881

3 months ago

🚨 4B open-recipe model beats Claude-4-Opus 🔓 100% open data, recipe, model weights and code. Introducing Polaris ✨, a post-training recipe for scaling RL on advanced reasoning models. 🥳 Check out how we boost open-recipe reasoning models to incredible performance levels…