Shmulik Amar @pyshmulik

Building things · Israel · Joined September 2018

Tweets: 174 · Followers: 30 · Following: 62 · Likes: 1K

Eviatar Nachshoni @ENachshoni

2 weeks ago

🚨 New paper out! 📄 What happens when LLMs & RLMs face conflicting answers to a question? 🤔 They often ignore disagreement and confidently pick one “correct” answer. 🤯 📄 arxiv.org/pdf/2508.12355 #AI #LLM #NLP #MachineLearning

1 7 23 1K 8

Mosh Levy @mosh_levy

3 weeks ago

Producing reasoning texts boosts the capabilities of AI models, but do we humans correctly understand these texts? Our latest research suggests that we do not. This highlights a new angle on the "Are they transparent?" debate: they might be, but we misinterpret them. 🧵

8 27 136 27K 98

Aviya Maimon @AviyaMaimon

a month ago

🚨 New paper alert! 🚨 We propose an IQ Test for LLMs — a new way to evaluate models that goes beyond benchmarks and uncovers their core skills. Think: 🧠🤖 psychometrics for LLMs. 👇 (1/6)

1 14 29 4K 1

Aviya Maimon @AviyaMaimon

a month ago

We release: ✅ Code ✅ Leaderboard ✅ Skill matrices & tools Let’s shift to skill‑based evaluation for LLMs! Full paper here 👉 arxiv.org/abs/2507.20208 (6/6)

2 3 5 253 1

BIU NLP @biunlp

a month ago

Congrats @itaimond @Tzuf6 @rtsarfaty !

ACL 2025 @aclmeeting

a month ago

2 28 218 75K 131

2 4 31 887 0

Arie Cattan @ArieCattan

3 months ago

🚨 RAG is a popular approach but what happens when the retrieved sources provide conflicting information?🤔 We're excited to introduce our paper: “DRAGged into CONFLICTS: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs”🚀 A thread 🧵👇

2 14 37 2K 5

Ori Ernst @oriern1

3 months ago

🧵 New paper at Findings #ACL2025 @aclmeeting! Not all documents are processed equally well. Some consistently yield poor results across many models. But why? And can we predict that in advance? Work with Steven Koniaev and Jackie Cheung @Mila_Quebec @McGill_NLP #NLProc (1/n)

1 12 26 990 2

Aviv Slobodkin @lovodkin93

3 months ago

Check out our new paper on highly localized attributions, both in the input and the output!

0 7 27 1K 3

Eran Hirsch @hirscheran

3 months ago

🚨 Introducing LAQuer, accepted to #ACL2025 (main conf)! LAQuer provides more granular attribution for LLM generations: users can just highlight any output fact (top), and get attribution for that input snippet (bottom). This reduces the amount of text the user has to read by 2…

3 32 84 12K 22

Alon Eirew @AlonEirew

4 months ago

Excited to present our system demonstration paper on EventFull, an Event-Event Relation annotation tool, at #NAACL25. Come see us Thursday, May 1, at Poster Session I (16:00–17:30). (Paper and tool links at the end of the thread👇)

1 6 11 387 1

AK @_akhaliq

4 months ago

RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation

1 52 136 17K 40

Shmulik Amar @pyshmulik

5 months ago

0 0 0 7 0

Shir Ashury-Tahan @ShirAshuryTahan

6 months ago

LLMs struggle with tables—but how robust are they really? 🔍 ToRR goes beyond accuracy, testing real-world robustness across formats & tasks. 📊 Different formats, same data—models show brittle behavior affecting rankings. Prompt configuration is a key dimension for evaluation!🚀

2 13 36 5K 12

Lital Binyamin @litalby

6 months ago

🎉 I'm happy to share that our paper, Make It Count, has been accepted to #CVPR2025! A huge thanks to my amazing collaborators: @YoadTewel, @SegevHilit, @hirscheran, @RoyiRassin, and @GalChechik! 🔗 Paper page: make-it-count-paper.github.io Excited to share our key findings!