Xilun Chen @ccsasuke

Research Scientist @ Meta FAIR | xilunchen.com | Seattle, WA | Joined March 2010

Tweets: 731
Followers: 570
Following: 512
Likes: 187

Gargi Ghosh @gargighosh

a week ago

New research from FAIR: Active Reading, a framework for learning a given set of material with self-generated learning strategies, for both general and expert domains (such as finance). Models absorb significantly more knowledge than with vanilla finetuning or the usual data augmentation strategies.

Jessy Lin @realJessyLin

a week ago

2 7 93 8K 57

0 11 27 4K 10

Jessy Lin @realJessyLin

a week ago

🔍 How do we teach an LLM to 𝘮𝘢𝘴𝘵𝘦𝘳 a body of knowledge? In new work with @AIatMeta, we propose Active Reading 📙: a way for models to teach themselves new things by self-studying their training data. Results: * 𝟔𝟔% on SimpleQA w/ an 8B model by studying the wikipedia…

15 158 1K 126K 1K

Xueguang Ma @xueguang_ma

4 weeks ago

🚀 Introducing BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent. It is a new Deep-Research evaluation benchmark built on top of BrowseComp. It features - 📚 a fixed, carefully curated corpus of web documents - ✅ human-verified positive…

10 36 225 44K 142

Rulin Shao @RulinShao

4 weeks ago

Factuality and logical reasoning (e.g., math, code) favor different sets of reasoning patterns. 🧑‍🍳 A fresh RL recipe to improve factuality is here — crafted by the amazing @ccsasuke!

Jason Weston @jaseweston

4 weeks ago

1 49 383 35K 295

0 5 82 7K 34

Jason Weston @jaseweston

4 weeks ago

...is today a good day for new paper posts? 🤖Learning to Reason for Factuality 🤖 📝: arxiv.org/abs/2508.05618
- New reward func for GRPO training of long CoTs for *factuality*
- Design stops reward hacking by favoring precision, detail AND quality
- Improves base model across…

1 49 383 35K 295

Xueguang Ma @xueguang_ma

4 months ago

Now accepted by #ACL2025 main. We propose a training framework that produces strong smaller retrievers by combining LLM data augmentation with LLM pruning, letting smaller retrievers improve together with advances in LLMs.

Xueguang Ma @xueguang_ma

6 months ago

1 23 77 11K 42

2 8 50 3K 9

Rulin Shao @RulinShao

4 months ago

Accepted by #ACL2025! Congrats @mingdachen and the team🥳 Several cool ideas:
- Maintain an explicit editable working memory during generation;
- Actively integrate external feedback (factual check w/ VeriScore);
A smart LM learns to memorize, a smarter LM learns to forget too!

Aran Komatsuzaki @arankomatsuzaki

8 months ago

2 27 274 70K 221

2 12 109 11K 30

AK @_akhaliq

4 months ago

Meta just dropped ReasonIR on Hugging Face
Training Retrievers for Reasoning Tasks

5 50 312 41K 174

AI at Meta @AIatMeta

5 months ago

Today is the start of a new era of natively multimodal AI innovation. Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality. Llama 4 Scout • 17B-active-parameter model…

843 2K 13K 3.4M 3K

Zhuang Liu @liuzhuang1234

6 months ago

New paper - Transformers, but without normalization layers (1/n)

77 599 4K 1.3M 2K

Matthew Finlayson @mattf1n

6 months ago

🧵 Adapting your LLM for new tasks is dangerous! A bad training set degrades models by encouraging hallucinations and other misbehavior. Our paper remedies this for RAG training by replacing gold responses with self-generated demonstrations. Check it out: arxiv.org/abs/2502.10

1 4 7 417 0

Xilun Chen @ccsasuke

6 months ago

Today we released DRAMA, a set of small (sub-1B) multilingual dense retrievers that perform strongly across multiple languages and tasks. It also offers flexible model sizes and embedding dimensionalities. Led by my awesome intern @xueguang_ma arxiv.org/abs/2502.18460

Xueguang Ma @xueguang_ma

6 months ago

1 23 77 11K 42

0 3 14 1K 2

Srini Iyer @sriniiyer88

9 months ago

New paper! Byte-Level models are finally competitive with tokenizer-based models with better inference efficiency and robustness! Dynamic patching is the answer! Read all about it here: dl.fbaipublicfiles.com/blt/BLT__Patch… (1/n)

1 22 87 16K 32

Jack Lin @jacklin_64

9 months ago

I will present our paper FLAME on factuality alignment for LLMs with @luyu_gao at #NeurIPS2024! 🎉 Join us at East Exhibit Hall A-C, Booth #3501 for a chat on Wed (Dec 11, 4:30--7:30 pm). Looking forward to connecting! More detail: neurips.cc/virtual/2024/p…

Xilun Chen @ccsasuke

a year ago

3 9 34 7K 15

0 6 14 3K 5

Akari Asai @AkariAsai

9 months ago

🚨 I’m on the job market this year! 🚨 I’m completing my @uwcse Ph.D. (2025), where I identify and tackle key LLM limitations like hallucinations by developing new models—Retrieval-Augmented LMs—to build more reliable real-world AI systems. Learn more in the thread! 🧵

26 119 821 126K 198

Minghan @alexlimh23

9 months ago

1/ Excited to share that our paper "NEST🪺: Nearest Neighbor Speculative Decoding for LLM Generation and Attribution" is accepted at #NeurIPS2024! 🚀 Catch us at the poster session on Thu, Dec 12, 4:30–7:30 PM PST, East Exhibit Hall A-C, #2201. [Details: neurips.cc/virtual/2024/p…]

Minghan @alexlimh23

a year ago

3 16 59 23K 40

2 7 24 12K 7

Jason Wei @_jasonwei

10 months ago

Excited to open-source a new hallucinations eval called SimpleQA! For a while it felt like there was no great benchmark for factuality, and so we created an eval that was simple, reliable, and easy-to-use for researchers. Main features of SimpleQA: 1. Very simple setup: there…

28 122 865 105K 554

Lili Yu (ICLR2025) @liliyu_lili

a year ago

🚀 Excited to share our latest work: Transfusion! A new multi-modal generative training combining language modeling and image diffusion in a single transformer! Huge shout to @violet_zct @omerlevy_ @michiyasunaga @arunbabu1234 @kushal_tirumala and other collaborators.

Chunting Zhou @violet_zct

a year ago

24 209 1K 198K 550

6 17 105 22K 29

Sasha Rush @srush_nlp

a year ago

Lillian!

Sarvnaz Karimi @sarvk

a year ago

1 9 92 17K 21

1 7 38 7K 5

AI at Meta @AIatMeta

a year ago

Last week we released Meta Chameleon: a new mixed-modal research model from Meta FAIR. Get the models ➡️ go.fb.me/4m87kk The 7B & 34B safety tuned models we’ve released can take any combination of text and images as input and produce text outputs using a new early…