We're hiring!
Over the past few months, we’ve been building up our agent tech stack. Now we're ready to scale up.
If you live and breathe agentic systems and how they are going to impact work, DM me. We just opened a few engineering and product roles; see careers.graidd.com
Introducing 🌸BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks!
BigCodeBench goes beyond simple evals like HumanEval and MBPP and tests LLMs on more realistic and challenging coding tasks.
We are (finally) releasing the 🍷 FineWeb technical report!
In it, we detail and explain every processing decision we took, and we also introduce our newest dataset: 📚 FineWeb-Edu, a (web only) subset of FW filtered for high educational content.
Link: hf.co/spaces/Hugging…
It’s been a year since the release of @BigCodeProject’s 💫 StarCoder models and paper: May the source be with you! Join us as we celebrate the anniversary, and share what you’ve done using #StarCoder. Read how StarCoder has helped ServiceNow developers: servicenow.com/blogs/2024/big…
Self-Instruct for CodeLLMs! 👀 @BigCodeProject released a new StarCoder2-Instruct, the first entirely self-aligned code LLM trained with a transparent and permissive pipeline. 🧑🏻‍💻 It used itself to generate thousands of instruction-response pairs, which were then used to…
Test-of-time awards should maybe be handed out after a longer period of time but in my opinion this blog post (and the following) were incredibly prescient, and about one year later, everybody in LLMs is doing exactly what it suggested
Took some time to reflect on the past 1+ year of the @BigCodeProject: Here are a few of my learnings from leading it during this time and some ingredients I think are important for a successful open collaboration in ML.
What is BigCode?
BigCode is an open scientific collaboration…
Introducing: StarCoder2 and The Stack v2 ⭐️
StarCoder2 is trained with a 16k token context and repo-level information for 4T+ tokens. All built on The Stack v2 - the largest code dataset with 900B+ tokens.
All code, data and models are fully open!
hf.co/bigcode/starco…
Instruction Tuning Code LLMs Using #PEFT methods? Introducing 🌠
✨Astraios Model Suite: A suite of 28 instruct-tuned #StarCoder models using #OctoPack, covering 7 tuning methods & 4 model sizes, up to 16B parameters.
📝Extensive Evaluation: 5 tasks & 8 datasets in both Code Comprehension…
Exciting times: we are working on the next generation of StarCoder trained on a new dataset! 🚀
If you would like to have your code excluded from the training run you can check if your data is in the dataset and follow the link to opt-out:
huggingface.co/spaces/bigcode…
First promising results for pre-training with related documents in the context window, nicely addressing the data issue I explained in my last blog post.
Looks de-risked enough to go into llama-3.
arxiv.org/abs/2310.10638
107 Followers 5K Following | Explorer of ideas, lover of good conversations, and always ready to discover new perspectives. Let’s connect and see where our curiosity takes us!
365K Followers 6K Following | Chief Scientist, Google DeepMind & Google Research. Gemini Lead. Opinions stated here are my own, not those of Google. TensorFlow, MapReduce, Bigtable, ...
1K Followers 2K Following | Back in the EU after some time in sunny California and happy Copenhagen. Mistral, prev Photoroom, Meta (xformers, FairScale), EyeTribe (acq)
50K Followers 405 Following | @AnthropicAI. Prev. @Google Brain/DeepMind, founding team @OpenAI. Computer scientist; inventor of the VAE, Adam optimizer, and other methods. ML PhD.
3K Followers 6K Following | LLM for code and reasoning. PhD student at Cornell. Previously Student Researcher at @google. Previously intern at @theteamatx.
831 Followers 2K Following | CS PhD student @illinoisCDS. Research intern at AWS AI Labs @AmazonScience. Towards building advanced code LLMs with better reasoning and planning.
4K Followers 598 Following | Associate Professor in Machine Learning at the University of Oxford. Interested in automatic inductive bias selection using Bayesian tools.
8K Followers 242 Following | Llama3 pre-training lead. Partially to blame for things like the Cicero Diplomacy bot, BART, RoBERTa, kNN-LM, top-k sampling & Deal Or No Deal.
10K Followers 851 Following | I want to understand things deeply and explain them well. Building friendly AI @AnthropicAI. Give me anonymous feedback: https://t.co/7aBNrpbad8
4K Followers 431 Following | CTO AI and Fellow, Imec - Ex: Team Lead @ DeepMind and @GoogleDeepMind, Meta. Also CS professor (Liverpool/Leuven) and LFC fan.
5K Followers 6K Following | Scientist professor hacker writer citizen. Committed to building a better world through science technology and community. Director @swheritage. Follow ≠ endorse
24K Followers 3K Following | Co-founder @cradlebio. Previously designed the first apps for @uber, @bookingcom, @catawiki and many others. @jelleprins.com elsewhere.
18K Followers 325 Following | Founder & CEO @leptonai (now part of NVIDIA). @UCBerkeley @Tsinghua_Uni Alumni. Built decaf, caffe, ONNX, PyTorch 1.0. Former Google/Meta/Alibaba.