Itay Levy @itayoush

Deep Learning Researcher @NVIDIA
linkedin.com/in/itay-levy-cs
Israel
Joined September 2020

Tweets: 64
Followers: 293
Following: 3K
Likes: 2K

The AI Timeline @TheAITimeline

9 months ago

Puzzle: Distillation-Based NAS for Inference-Optimized LLMs

Author's Explanation: x.com/itayoush/statu…

Overview: Puzzle introduces a distillation-based neural architecture search framework that significantly optimizes LLM inference on specific hardware, achieving a 2.17x…

Itay Levy @itayoush

9 months ago


Pavlo Molchanov @PavloMolchanov

11 months ago

🚀 @NeurIPSConf Spotlight! 🥳 Imagine fine-tuning an LLM with just a sparsity mask! In our latest work, we freeze the LLM and use 2:4 structured sparsity to learn binary masks for each linear layer. Thanks to NVIDIA Ampere’s 2:4 sparsity, we can achieve up to 2x compute…


Talor Abramovich @AbramovichTalor

12 months ago

We're launching EnIGMA, our state-of-the-art AI agent for offensive cybersec! It uses tools like Ghidra & pwntools, can debug, connect to servers, and exploit vulnerabilities to solve CTF challenges. Built with researchers from Princeton, NYU, and TAU. enigma-agent.github.io


Pavlo Molchanov @PavloMolchanov

12 months ago

🚀 Exciting news! We’ve just released a new LLM: Llama-3.1-Nemotron-51B = LLaMa-70B-Instruct + Block Distillation + NAS + Logits Distillation. It runs on a single H100 GPU with nearly the same accuracy! ⚡ This gives a 2.2x inference speed-up with MT Bench 8.99 ➡️ 8.94.…


Ben Bogin @ben_bogin

12 months ago

📢 New Benchmark: SUPER for Setting UP and Executing tasks from Research repositories

Reproducibility is crucial in science. We introduce SUPER to evaluate LLMs' capabilities in autonomously running experiments from research repositories. ⬇️ arxiv.org/pdf/2409.07440


Pavlo Molchanov @PavloMolchanov

a year ago

🚀 Our team is hiring! Join us to advance efficiency in deep learning at NVIDIA! 🚀 🔗 Apply here: bit.ly/nvdler-job Our team, Deep Learning Efficiency Research (nv-dler.github.io) at NVIDIA Research, is about a year old, and we are expanding. We're looking for…


Pavlo Molchanov @PavloMolchanov

a year ago

🌟 The best 8B Base model via pruning and distillation! 🚀 Introducing the Mistral-NeMo-Minitron-8B-Base model, which we derived from the recent Mistral-NeMo-12B. Our recipe: finetune the teacher on 100B tokens, prune to 8B params, run teacher-student distillation on <400B tokens. Result: the…