Jacqueline He @jcqln_h

cs phd @uwnlp, prev. bse cs @princeton jacqueline-he.github.io Joined May 2018

Tweets

206
Followers

191
Following

66
Likes

6K

Ilia Shumailov🦔 @iliaishacked

3 months ago

Are modern large language models (LLMs) vulnerable to privacy attacks that can determine if given data was used for training? Models and datasets are quite large, so what should we even expect? Our new paper looks into this exact question. 🧵 (1/10)

1 21 113 20K 71

Shangbin Feng @shangbinfeng

3 months ago

Check out our work on LLMs and scientific knowledge updates!

Yike Wang @yikewang_

3 months ago

11 55 242 23K 128

0 11 54 5K 17

Jacqueline He @jcqln_h

3 months ago

congrats @kjha02 !! cool work 🎊🎉🎇

Kunal Jha @kjha02

3 months ago

0 3 50 6K 8

1 0 0 143 0

Zhiyuan Zeng @ZhiyuanZeng_

6 months ago

Is a single accuracy number all we can get from model evals?🤔 🚨Does NOT tell where the model fails 🚨Does NOT tell how to improve it Introducing EvalTree🌳 🔍identifying LM weaknesses in natural language 🚀weaknesses serve as actionable guidance (paper&demo 🔗in🧵) [1/n]

5 92 264 61K 147

Hamish Ivison @hamishivi

6 months ago

How well do data-selection methods work for instruction-tuning at scale? Turns out, when you look at large, varied data pools, lots of recent methods lag behind simple baselines, and a simple embedding-based method (RDS) does best! More below ⬇️ (1/8)

4 64 331 86K 282

Hamish Ivison @hamishivi

7 months ago

We trained a diffusion LM! 🔁 Adapted from Mistral v0.1/v0.3. 📊 Beats AR models in GSM8k when we finetune on math data. 📈 Performance improves by using more test-time compute (reward guidance or more diffusion steps). Check out @jaesungtae's thread for more details!

Jake Tae @jaesungtae

7 months ago

1 2 23 9K 9

1 8 39 4K 6

Stella Li @StellaLisy

7 months ago

Asking the right questions can make or break decisions in high-stake fields like medicine, law, and beyond✴️ Our new framework ALFA—ALignment with Fine-grained Attributes—teaches LLMs to PROACTIVELY seek information through better questions🏥❓ (co-led with @jiminmun_) 👉🏻🧵

7 43 198 24K 111

Ai2 @allen_ai

8 months ago

Can AI really help with literature reviews? 🧐 Meet Ai2 ScholarQA, an experimental solution that allows you to ask questions that require multiple scientific papers to answer. It gives more in-depth, detailed, and contextual answers with table comparisons, expandable sections…

14 74 221 41K 134

Hila Gonen @hila_gonen

9 months ago

Extremely excited to share that I will be joining @UBC_CS as an Assistant Professor this summer! I will be recruiting students this coming cycle!

15 18 146 11K 12

Akari Asai @AkariAsai

9 months ago

🚨 I’m on the job market this year! 🚨 I’m completing my @uwcse Ph.D. (2025), where I identify and tackle key LLM limitations like hallucinations by developing new models—Retrieval-Augmented LMs—to build more reliable real-world AI systems. Learn more in the thread! 🧵

26 119 821 126K 198

Jacqueline He @jcqln_h

10 months ago

Check out our OpenScholar project!! Huge congrats to @AkariAsai for leading the project — working with her has been a wonderful experience!! 🌟

Akari Asai @AkariAsai

10 months ago

37 294 1K 245K 738

0 0 7 222 0

Howard Yen @HowardYen1

11 months ago

Introducing HELMET, a long-context benchmark that supports >=128K length, covering 7 diverse applications. We evaluated 51 long-context models and found that HELMET provides more reliable signals for model development: github.com/princeton-nlp/… A 🧵 on why you should use HELMET⛑️