🤝 Can LLM agents really understand us?
We introduce UserBench: a user-centric gym environment for benchmarking how well agents align with nuanced human intent, not just follow commands.
📄 arxiv.org/pdf/2507.22034
💻 github.com/SalesforceAIRe…
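To make "gym environment" concrete: below is a minimal, purely illustrative sketch of a user-simulation environment with hidden preferences. The class, methods, and reward rule are assumptions for exposition, not the actual UserBench API.

```python
# Illustrative sketch only: a gym-style loop where the "environment" is a
# simulated user with latent preferences. Names (UserEnv, reset, step) follow
# the usual gym convention and are NOT the actual UserBench API.

class UserEnv:
    def __init__(self, latent_preferences):
        self.prefs = latent_preferences  # hidden from the agent

    def reset(self):
        # The user states only a vague goal; true preferences stay hidden.
        return "I need to book a flight, something reasonable."

    def step(self, agent_message):
        # Reward the agent for surfacing and satisfying hidden preferences,
        # not just for executing the literal command.
        reward = sum(1.0 for p in self.prefs if p in agent_message)
        done = reward == len(self.prefs)
        user_reply = "That works." if done else "Hmm, not quite what I meant."
        return user_reply, reward, done

env = UserEnv(latent_preferences=["aisle seat", "no red-eye"])
obs = env.reset()
obs, reward, done = env.step("Booked an aisle seat on a daytime flight, no red-eye.")
```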
(1/4)🚨 Introducing Goedel-Prover V2 🚨
🔥🔥🔥 The strongest open-source theorem prover to date.
🥇 #1 on PutnamBench: Solves 64 problems—with far less compute.
🧠 New SOTA on MiniF2F:
* 32B model hits 90.4% at Pass@32, beating DeepSeek-Prover-V2-671B’s 82.4%.
* 8B > 671B: Our 8B…
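For context on what these leaderboards score: PutnamBench and MiniF2F count a problem as solved only if the model's emitted Lean proof type-checks. A toy example of the format (not a problem from either benchmark):

```lean
-- Toy illustration of the task format: the prover must emit a proof term
-- or tactic script that the Lean checker accepts.
theorem two_add_two : 2 + 2 = 4 := rfl
```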
Reward models (RMs) are key to language model post-training and inference pipelines, but little is known about the relative pros and cons of different RM types.
📰 We investigate why RMs implicitly defined by language models (LMs) often generalize worse than explicit RMs
🧵
1/6
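For readers outside this subarea, the two RM families being compared are standard: an explicit RM scores a response with a dedicated scalar head, while an implicit RM is read off a DPO-trained policy as a scaled log-ratio against a reference model. A minimal sketch of those textbook definitions (placeholder values, not the paper's code):

```python
import math

def implicit_reward(policy_logprob, ref_logprob, beta=0.1):
    # Implicit RM (DPO-style): r(x, y) = beta * (log pi(y|x) - log pi_ref(y|x)),
    # read off a DPO-trained policy rather than a dedicated scoring head.
    return beta * (policy_logprob - ref_logprob)

def bradley_terry_loss(r_chosen, r_rejected):
    # Both RM types are typically fit to pairwise preferences with the
    # Bradley-Terry objective: -log sigmoid(r_chosen - r_rejected).
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# Explicit RM: the rewards come from a scalar head on the LM.
# Implicit RM: the rewards come from the log-ratio above.
print(bradley_terry_loss(r_chosen=1.3, r_rejected=0.2))  # ~0.29
```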
🎥 Video is already a tough modality for reasoning. Egocentric video? Even tougher! It is longer, messier, and harder.
💡 How do we tackle these extremely long, information-dense sequences without exhausting GPU memory or hitting API limits?
We introduce 👓Ego-R1: A framework…
Can LLMs make rational decisions like human experts?
📖Introducing DecisionFlow: Advancing Large Language Model as Principled Decision Maker
We introduce a novel framework that constructs a semantically grounded decision space to evaluate trade-offs in hard decision-making…
(1/5) Want to make your LLM a skilled persuader?
Check out our latest paper: "ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind"!
For details:
📄Arxiv: arxiv.org/pdf/2505.22961
🛠️GitHub: github.com/ulab-uiuc/ToMAP
📢 New Paper Drop: From Solving to Modeling!
LLMs can solve math problems — but can they model the real world? 🌍
📄 arXiv: arxiv.org/pdf/2505.15068
💻 Code: github.com/qiancheng0/Mod…
Introducing ModelingAgent, a breakthrough system for real-world mathematical modeling with LLMs.
How do we improve test-time scalability?
- Separate thinking & solution phases to control performance under budget constraint
- Budget-Constrained Rollout + GRPO
- Outperforms baselines on math/code.
- Cuts token usage by 30% without hurting performance (see the sketch below)
huggingface.co/papers/2505.05…
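A hedged sketch of the two-phase, budget-constrained idea above; `generate` stands in for any LM sampling callable, and the tag names are illustrative, not the paper's API.

```python
# Hypothetical sketch: separate thinking and solution phases, each capped
# by its own token budget. `generate` is any LM sampling callable.

def budget_constrained_rollout(generate, prompt,
                               think_budget=512, solution_budget=256):
    # Phase 1: spend at most `think_budget` tokens on reasoning.
    thoughts = generate(prompt + "\n<think>",
                        max_new_tokens=think_budget, stop="</think>")
    # Phase 2: force the transition to the answer so the budget holds even
    # if the model would otherwise keep thinking.
    solution = generate(prompt + "\n<think>" + thoughts + "</think>\n<answer>",
                        max_new_tokens=solution_budget, stop="</answer>")
    return thoughts, solution
```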
🚀 Can we cast reward modeling as a reasoning task?
📖 Introducing our new paper:
RM-R1: Reward Modeling as Reasoning
📑 Paper: arxiv.org/pdf/2505.02387
💻 Code: github.com/RM-R1-UIUC/RM-…
Inspired by recent advances in long chain-of-thought (CoT) reasoning on reasoning-intensive tasks, we…
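"Reward modeling as reasoning" roughly means the judge generates a critique before committing to a verdict. A hedged sketch of that pattern; the prompt template and `llm` callable are illustrative stand-ins, not RM-R1's actual template or code.

```python
# Illustrative only: a generative judge that reasons before deciding.
# The template and `llm` callable are stand-ins, not RM-R1's actual code.

JUDGE_TEMPLATE = """Compare the two responses to the prompt below.
First write out your reasoning step by step, then end with exactly
'Verdict: A' or 'Verdict: B'.

Prompt: {prompt}
Response A: {a}
Response B: {b}
"""

def reasoning_judge(llm, prompt, response_a, response_b):
    output = llm(JUDGE_TEMPLATE.format(prompt=prompt, a=response_a, b=response_b))
    # The chain of thought comes first; the reward signal is the final verdict.
    return "A" if output.strip().endswith("Verdict: A") else "B"
```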
We introduce Gradient Variance Minimization (GVM)-RAFT, a principled dynamic sampling strategy that minimizes gradient variance to improve the efficiency of chain-of-thought (CoT) training in LLMs.
– Achieves 2–4× faster convergence than RAFT
– Improves accuracy on math…
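One way to picture "dynamic sampling to minimize gradient variance": give each prompt a share of the rollout budget proportional to its estimated variance contribution (square-root allocation, as in stratified sampling). A hedged sketch; the estimator and names are illustrative, not the GVM-RAFT implementation.

```python
# Hedged sketch of variance-proportional sample allocation; illustrative
# names, not the GVM-RAFT code.
import math

def allocate_rollouts(variance_estimates, total_budget):
    # Sample high-variance prompts more, in proportion to the square root
    # of their estimated gradient variance, to shrink total variance under
    # a fixed rollout budget.
    weights = [math.sqrt(v) for v in variance_estimates]
    z = sum(weights) or 1.0
    return [max(1, round(total_budget * w / z)) for w in weights]

# Prompt 3 contributes the most variance, so it gets the most samples.
print(allocate_rollouts([0.1, 0.4, 1.6], total_budget=32))  # [5, 9, 18]
```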
Thrilled to announce that our paper Sparse VideoGen got into #ICML2025! 🎉
Our new approach speeds up video generation by 2×. Details in the thread/paper.
Huge thanks to my collaborators!
Blog: svg-project.github.io
Paper: arxiv.org/abs/2502.01776
Code:…
Thrilled to share my first project at NVIDIA! ✨
Today’s language models are pre-trained on vast and chaotic Internet texts, but these texts are unstructured and poorly understood. We propose CLIMB — Clustering-based Iterative Data Mixture Bootstrapping — a fully automated…
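Schematically, a clustering-based iterative data-mixture loop might look like the sketch below: embed and cluster the raw corpus, train a small proxy model on a candidate mixture, score clusters against a target metric, and reweight. Every function name here is a placeholder, not the CLIMB code.

```python
# Schematic of a clustering-based iterative data-mixture loop; embed, kmeans,
# train_proxy, and eval_target are placeholders for whatever embedder and
# trainer you use, not NVIDIA's CLIMB implementation.

def climb_style_loop(docs, embed, kmeans, train_proxy, eval_target,
                     k=16, iterations=5):
    clusters = kmeans(embed(docs), k)            # group raw web text by topic
    weights = [1.0 / k] * k                      # start from a uniform mixture
    for _ in range(iterations):
        proxy = train_proxy(clusters, weights)   # small model on the mixture
        scores = eval_target(proxy, clusters)    # per-cluster utility signal
        z = sum(scores) or 1.0
        # Shift the mixture toward clusters that improve the target metric.
        weights = [0.5 * w + 0.5 * (s / z) for w, s in zip(weights, scores)]
    return weights
```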
🚀 Excited to share our latest work on Iterative-DPO for math reasoning! Inspired by DeepSeek-R1 & rule-based PPO, we trained Qwen2.5-MATH-7B on Numina-Math prompts. Our model achieves 47.0% pass@1 on AIME24, MATH500, AMC, Minerva-Math, OlympiadBench—outperforming…
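The DPO objective itself is standard: given log-probabilities of chosen and rejected responses under the policy and a frozen reference, minimize -log σ(β·margin). A minimal worked version below; the iterative, rule-based pair construction is summarized in comments as an assumption about the setup, not the authors' exact recipe.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Standard DPO objective on one preference pair:
    #   -log sigmoid(beta * ((logp_w - ref_w) - (logp_l - ref_l)))
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Iterative-DPO, schematically (assumed setup, not the authors' exact recipe):
# 1) sample k solutions per prompt from the current policy;
# 2) label each with a rule-based checker (e.g., exact match on the answer);
# 3) form (correct, incorrect) pairs and minimize dpo_loss over them;
# 4) repeat from step 1 with the updated policy.
print(dpo_loss(-12.0, -15.0, -13.0, -14.0))  # beta=0.1 -> margin 0.2
```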