Weiting (Steven) Tan @weiting_nlp

Ph.D. Candidate at @jhuclsp, Student Researcher @Bytedance Seed | Prev @AIatMeta @Amazon Alexa AI

steventan0110.github.io

USA

Joined July 2021

Tweets: 69

Followers: 211

Following: 290

Likes: 162

Dongfu Jiang @DongfuJiang

3 days ago

🚀 Excited to finally share our paper on VerlTool, released today after months of work since the initial release in late May! VerlTool is a high-efficiency, easy-to-use framework for Agentic RL with Tool use (ARLT), built on top of VeRL. It currently supports a wide range of…

Dongfu Jiang @DongfuJiang

3 months ago

5 72 376 73K 260

2 34 150 15K 93

Jason Weston @jaseweston

3 days ago

🌀Diversity Aware RL (DARLING)🌀 📝: arxiv.org/abs/2509.02534 - Jointly optimizes for quality & diversity using a learned partition function - Outperforms standard RL in quality AND diversity metrics, e.g. higher pass@1/p@k - Works for both non-verifiable & verifiable tasks 🧵1/5

4 78 394 55K 327

Benjamin Van Durme @ben_vandurme

6 months ago

Our latest on compressed representations: Key-Value Distillation (KVD). Query-independent transformer compression, with offline supervised distillation.

2 29 135 13K 72

DeepSeek @deepseek_ai

8 months ago

🛠️ DeepSeek-R1: Technical Highlights 📈 Large-scale RL in post-training 🏆 Significant performance boost with minimal labeled data 🔢 Math, code, and reasoning tasks on par with OpenAI-o1 📄 More details: github.com/deepseek-ai/De… 🐋 4/n

242 828 5K 1.8M 926

JHU Computer Science @JHUCompSci

8 months ago

Congratulations to Prof. Philipp Koehn on being named a Fellow of the @aclmeeting! cs.jhu.edu/news/philipp-k…

0 4 30 5K 0

Weiting (Steven) Tan @weiting_nlp

9 months ago

I had a great time helping host MASC-SLL at Hopkins last year. MASC-SLL is a great opportunity to connect with fellow AI/NLP/Speech researchers. If your organization is in the Mid-Atlantic region and is interested in hosting the event, please reach out!

MASC-ALL Conference @MASC_Conference

9 months ago

1 16 14 5K 0

0 1 4 1K 0

Tianjian Li @tli104

9 months ago

I have written a blog post explaining why both the chosen and the rejected log-probabilities decrease during DPO and, more interestingly, why this is to some extent a desired phenomenon. Link: tianjianl.github.io/blog/2024/dpo/

0 5 12 3K 4

Sherjil Ozair @sherjilozair

9 months ago

Very happy to hear that GANs are getting the Test of Time Award at NeurIPS 2024. The NeurIPS Test of Time Awards are given to papers that have stood the test of time for a decade. I took some time to reminisce about how GANs came about and how AI has evolved in the last decade.

16 119 981 218K 378

Weiting (Steven) Tan @weiting_nlp

11 months ago

Excited to see that SpiritLM is fully open-sourced now. It supports speech and text as both input and output. Please consider trying it at: github.com/facebookresear…

AI at Meta @AIatMeta

11 months ago

22 121 632 149K 209

0 1 4 653 0

Saining Xie @sainingxie

11 months ago

Representation matters. Representation matters. Representation matters, even for generative models. We might've been training our diffusion models the wrong way this whole time. Meet REPA: Training Diffusion Transformers is easier than you think! sihyun.me/REPA/ (🧵1/n)

29 267 2K 370K 1K

Haoran Xu @fe1ixxu

11 months ago

Multilingual models are usually heavily skewed in favor of high-resource languages. We change this with X-ALMA: an LLM-based translator committed to ensuring top-tier performance across 50 diverse languages, regardless of their resource levels! Paper: arxiv.org/pdf/2410.03115