Stephanie Schoch @stephschoch

PhD candidate working on NLP and data contribution estimation @CS_UVA. Member of @UVA_ILP.

Joined September 2019

Tweets: 29
Followers: 70
Following: 188
Likes: 244

Andrew Lampinen @AndrewLampinen

4 months ago

How do language models generalize from information they learn in-context vs. via finetuning? We show that in-context learning can generalize more flexibly, illustrating key differences in the inductive biases of these modes of learning — and ways to improve finetuning. Thread: 1/

8 152 768 99K 689

Stephanie Schoch @stephschoch

4 months ago

Had a great time presenting this work at the NAACL 2025 Insights Workshop yesterday! We adapted a Monte Carlo sampling method to analyze the impact of the number of in-context examples. aclanthology.org/2025.insights-…

0 0 1 50 0

Stephanie Schoch @stephschoch

4 months ago

I’ll be presenting our work “In-Context Learning (and Unlearning) of Length Biases” at NAACL 25 in Hall 3 from 11AM-12:30PM today. Looking forward to chatting about ICL with everyone!

0 0 5 96 0

Alon Albalak @AlbalakAlon

a year ago

@cwolferesearch If you thought the information on data they release is interesting, you should check out our recent survey on data for LLMs. We include a TON more information about data processing, and most information Meta includes in the release isn't particularly new.

Alon Albalak @AlbalakAlon

2 years ago

10 76 309 110K 265

1 13 56 12K 39

Rafael Rafailov @ NeurIPS @rm_rafailov

a year ago

From the LLaMa 3 blogpost - they use a combination of rejection sampling, DPO and PPO for post-training. Really interested to know which tasks/parts of the process each algorithm benefits the most.

3 13 121 72K 64

Cameron R. Wolfe, Ph.D. @cwolferesearch

a year ago

LLaMA-3 is a prime example of why training a good LLM is almost entirely about data quality… TL;DR. Meta released LLaMA-3-8B/70B today and 95% of the technical info we have so far is related to data quality: - 15T tokens of pretraining data - More code during pretraining…

21 216 897 106K 552

Fred Oswald @FredOswald

a year ago

Will the Real Linda Please Stand up...to Large Language Models? Examining the Representativeness Heuristic in LLMs arxiv.org/abs/2404.01461 @PanDwww @ZilinXiao2 @hanjie_chen

1 6 12 8K 6

Jason Stock @itsstock

2 years ago

Chat with MLX 🚀 a high-performance macOS app linking your local docs to a custom large language model (LLM) on your machine 🧵 Now open-source in beta! github.com/mlx-chat/mlx-c… Collaboratively built by @itsstock & @parkersm1th

4 19 100 13K 86

Matthew Berman @MatthewBerman

2 years ago

OpenAI just dropped their Prompt Engineering guide. Here are 6 strategies they recommend for getting better results from LLMs:

69 609 5K 2.0M 12K

Alon Jacovi @alon_jacovi

2 years ago

Worried about test data being used in training? The LLM world is going through a data contamination crisis. Here's us trying to do something about it: Paper: arxiv.org/abs/2305.10160 Blog: medium.com/@alonjacovi/st… w/ @clu_avi @omerNLP @yoavgo

7 69 259 48K 110

Yangfeng Ji @yangfeng_ji

3 years ago

Our group released a Python package of data valuation in machine learning, Valda. It supports five methods (LOO, Influence Function, TMC-Shapley, Beta-Shapley, and CS-Shapley) via a unified API. Please try it out if you are interested: uvanlp.org/valda/ @stephschoch

2 6 45 8K 6

Yangfeng Ji @yangfeng_ji

3 years ago

Our work on class-wise Shapley values for data valuation has been accepted to #NeurIPS2022. Congratulations to my student @stephschoch and collaborator @haifengxu0! See you in New Orleans!

1 4 35 0 1

siggen_acl @siggen_acl

4 years ago

INLG 2022 will be 18-22 July, at Colby College (Waterville, Maine, USA)! Calls for papers, workshops, etc. available at inlgmeeting.github.io/calls.html

0 19 32 0 2

UVA ILP @UVA_ILP

4 years ago

UVA ILP Lab Group Photo: Fall 2021

2 2 38 0 0

Stephanie Schoch @stephschoch

4 years ago

Very excited to share this update: I passed my PhD Qualifying Examination! A big thank you to my committee and to my advisor @yangfeng_ji for all of his support and guidance!

1 0 10 0 0

0 0 10 0 0

Yangfeng Ji @yangfeng_ji

4 years ago

After three years @CS_UVA, my group finally has its Twitter account.

1 3 19 0 1

0 5 39 0 1

INLG 2025 @inlgmeeting

4 years ago

The commendation for outstanding position paper goes to "Underreporting of errors in NLG output, and what to do about it" by van Miltenburg, Clinciu, Dušek, Gkatzia, Inglis, Leppänen, Mahamood, Manning, Schoch, Thomson, & Wen