Our paper M-RewardBench was accepted to the ACL main conference: arxiv.org/abs/2410.15522
We construct a first-of-its-kind multilingual RM evaluation benchmark and use it to examine how several reward models perform in non-English settings, surfacing other interesting insights along the way.
🚀 We are excited to introduce Kaleidoscope, the largest culturally authentic exam benchmark.
📌 Most VLM benchmarks are English-centric or rely on translations, missing linguistic & cultural nuance. Kaleidoscope expands in-language multilingual 🌎 & multimodal 👀 evaluation of VLMs.
One standout project, “Evaluating Reward Models in Multilingual Settings,” introduced a benchmark dataset covering 23 languages, showed performance gaps between English and non-English languages, and highlighted the impact of translation quality.
📜:arxiv.org/abs/2410.15522
Thrilled to see INCLUDE accepted as a Spotlight at ICLR 2025! 🎉
This was a massive open science effort!
Amazing work led by @agromanou, @negarforoutan, and Anna ❤️
Was lovely collaborating with them as well as @Sree_Harsha_N, @rmahesh__, and others from the @CohereForAI community! 🙌
🔥 INCLUDE is an ambitious and critical release. Very proud of this cross-institutional collaboration.
The most extensive collection to date of in-language examinations from across the world. 🌎🌍🌏
Critical work to ensure AI progress does not overfit to knowledge of US exam subjects.
What would it take for AI evaluations to truly support our global experiences? 🌍
Our cross-institutional paper introduces INCLUDE, a multilingual LLM evaluation benchmark of local exams capturing in-language nuances & cultural context for truly localized AI evaluation.
🚀 Introducing INCLUDE 🌍: A multilingual LLM evaluation benchmark spanning 44 languages!
Contains *newly-collected* data, prioritizing *regional knowledge*.
Setting the stage for truly global AI evaluation.
Ready to see how your model measures up?
#AI #Multilingual #LLM #NLProc
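For readers curious about the mechanics: a minimal sketch of how accuracy is typically computed on a multiple-choice exam benchmark like INCLUDE. The dataset fields ("question", "choices", "answer") and the scoring callback are illustrative assumptions, not the benchmark's actual format or API.

```python
# Minimal sketch: multiple-choice accuracy for an exam-style benchmark.
# The schema and scorer below are assumptions for illustration only.

def mc_accuracy(score_choice, questions):
    """score_choice(question, choice) -> float; higher means more likely."""
    correct = 0
    for q in questions:
        # Predict the choice the model scores highest (in a real run this
        # would be e.g. per-choice log-likelihood under the LLM).
        pred = max(q["choices"], key=lambda c: score_choice(q["question"], c))
        correct += int(pred == q["answer"])
    return correct / len(questions)

# Toy usage with a stub scorer standing in for an LLM.
toy = [{"question": "2+2=?", "choices": ["3", "4"], "answer": "4"}]
print(mc_accuracy(lambda q, c: float(c == "4"), toy))  # 1.0
```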
🌍 As multilingual language models grow in reach and impact, the need for robust evaluation datasets intensifies.
🚨 We present a multilingual reward benchmarking dataset, designed to rigorously evaluate models and reveal any blind spots in current multilingual model training.
Evaluation drives progress ⛰️
We're excited to share our latest work! 🌍 We built a multilingual evaluation set to see how reward models really hold up across languages and ran extensive benchmarks on top LLMs.
Evaluation drives progress ⛰️
We're excited to share our latest work! 🌍 We built a multilingual evaluation set to see how reward models really hold up across languages and ran extensive benchmarks on top LLMs.
✨ New Evaluation Benchmark for Reward Models - We Go Multilingual! ✨
Introducing M-RewardBench: A massively multilingual RM evaluation benchmark covering 23 typologically diverse languages across 5 tasks.
Paper, code, dataset: m-rewardbench.github.io
Our contributions:
1/9
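A quick primer for context: reward-model benchmarks of this kind usually report pairwise accuracy, i.e. how often the RM scores the chosen response above the rejected one. Below is a minimal sketch under assumed field names (see m-rewardbench.github.io for the actual data format).

```python
# Minimal sketch: pairwise accuracy for a preference benchmark.
# Field names are assumptions for illustration, not the actual schema.

def rm_accuracy(reward_fn, pairs):
    """reward_fn(prompt, response) -> scalar reward."""
    wins = sum(
        reward_fn(p["prompt"], p["chosen"]) > reward_fn(p["prompt"], p["rejected"])
        for p in pairs
    )
    return wins / len(pairs)

# Toy usage with a stub reward function (here: longer is better).
toy = [{"prompt": "Hi", "chosen": "Hello there!", "rejected": "Hi"}]
print(rm_accuracy(lambda p, r: len(r), toy))  # 1.0
```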
Thrilled to share our work has been accepted at @EMNLP2024 (Findings)🎉🔥.
-𝗜𝘁𝗲𝗿𝗮𝘁𝗶𝘃𝗲 𝗔𝗹𝗶𝗴𝗻𝗺𝗲𝗻𝘁 𝗼𝗳 𝗟𝗟𝗠𝘀 ✅
-Curriculum DPO training ✅
-Impressive gains across Vicuna-Bench, WizardLM, MT-Bench, and UltraFeedback ✅
Paper - arxiv.org/abs/2403.07230
(1/2)
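For readers who want the gist of the ingredients above: a sketch of the standard DPO loss, plus one plausible way to order preference pairs into a curriculum. This is a generic illustration under assumed inputs (per-response log-probs, a per-pair difficulty score), not the paper's exact training recipe; see arxiv.org/abs/2403.07230 for the real method.

```python
import math

# Standard DPO loss for one (chosen, rejected) pair, given log-probs of
# each response under the policy and a frozen reference model.
def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))

print(round(dpo_loss(-5.0, -9.0, -6.0, -8.0), 3))  # ≈ 0.598

# One possible "curriculum": train on pairs sorted easiest-first by an
# assumed per-pair difficulty score (e.g. a reward-margin estimate).
pairs = [{"id": 1, "difficulty": 0.9}, {"id": 2, "difficulty": 0.2}]
schedule = sorted(pairs, key=lambda p: p["difficulty"])
print([p["id"] for p in schedule])  # [2, 1]
```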