Really excited to share this!
We set out to develop a form of quantization at Weaviate with the following goals:
1. No training requirements, unlike product quantization.
2. Better recall than scalar or binary quantization.
3. Something we could unreservedly recommend as a better…
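For context on goal 2, here is a minimal sketch of the two baselines it names, binary and scalar quantization. It is not Weaviate's implementation and not the new method from the post; the function names and the fixed `lo`/`hi` range are assumptions made for illustration.

```python
def binary_quantize(vec):
    """Keep only the sign of each component: 1 if positive, else 0."""
    return [1 if x > 0 else 0 for x in vec]


def scalar_quantize(vec, lo, hi, levels=256):
    """Map each component from [lo, hi] onto an integer code in [0, levels - 1]."""
    scale = (levels - 1) / (hi - lo)
    # Clamp so that values outside [lo, hi] still get a valid code.
    return [min(levels - 1, max(0, round((x - lo) * scale))) for x in vec]


def scalar_dequantize(codes, lo, hi, levels=256):
    """Approximate the original values from their integer codes."""
    step = (hi - lo) / (levels - 1)
    return [lo + c * step for c in codes]


vec = [0.3, -1.2, 0.0, 2.0]
bits = binary_quantize(vec)                    # 1 bit per dimension
codes = scalar_quantize(vec, lo=-2.0, hi=2.0)  # 8 bits per dimension
approx = scalar_dequantize(codes, lo=-2.0, hi=2.0)
```

Neither baseline needs a training step, but both discard information: binary keeps one bit per dimension, and scalar rounding error is bounded by half a quantization step.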
Products with extensive/rich UIs (lots of sliders, switches, menus), with no scripting support, and built on opaque, custom, binary formats are ngmi in the era of heavy human+AI collaboration.
If an LLM can't read the underlying representations and manipulate them and all of the…
🍻 BEIR v2.1.0 is now released ~ after a long break! 🥳
1⃣ Supports the latest embedding models - like E5, Stella, ModernBERT, RepLLAMA, and NV-Embed!
2⃣ Easy util functions to store run files and metrics!
3⃣ Python upgraded from 3.6 to 3.9+!
Check out 👇
github.com/beir-cellar/be…
Blazingly fast keyword generation with KeyBERT v0.9 and Model2Vec 🔥
I have been a big fan of the amazing embedding models by @minishlab, so I had to integrate them with KeyBERT.
A release for the GPU-poor 😉
Nomic Embed Text V2 is now available
- First general purpose Mixture-of-Experts (MoE) embedding model
- SOTA performance on the multilingual MIRACL benchmark for its size
- Support for 100+ languages
- Truly open source - open training data, weights, & code
- Apache 2.0 License
Today, @minishlab released 2 more Static Embedding models: potion-base-32M & potion-retrieval-32M, with stronger performance than before, while still easily processing e.g. 50k sentences per second.
The text embeddings can be used for retrieval, classification, clustering, etc.
🧵
🧵 Excited to announce modernbert-embed-base, a new embedding model built on the newly released ModernBERT!
Trained on the public Nomic Embed datasets, modernbert-embed-base is a ~nomic-embed~ quality model with Matryoshka capabilities and brings the great advances of…
"I use jina-embeddings-v3 and set output_dim=2 for visualization; why does it perform so much worse than UMAP?" - An interesting question from one of our users.
Background: v3 supports Matryoshka Representation, allowing users to set any output dimension below 1024 with minimal…
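As a rough illustration of what setting a smaller output dimension does mechanically, here is a toy sketch of Matryoshka-style truncation: keep the leading components and re-normalize. The function name and example vector are hypothetical, this is not the jina-embeddings-v3 API, and it assumes the model was trained so that leading dimensions carry the most information.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize what is left."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to normalize
    return [x / norm for x in head]


full = [3.0, 4.0, 12.0, 0.0]        # stand-in for a full-size embedding
small = truncate_embedding(full, 2)  # only the first two coordinates survive
```

Truncation simply drops every coordinate after the cut, whereas UMAP fits a new 2D layout using the neighborhood structure of all points. That difference is one plausible reason a 2-dimensional truncation would look worse in a plot.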
sneak preview 🍿 of our new embedding model: cde-small-v1
cde-small-v1 is the text embedding model that we (@srush_nlp and i) have been working on at Cornell for about a year
tested the model yesterday on MTEB, the text embeddings benchmark; turns out we have state-of-the-art…
Our work on scaling RAG with a 1.4T token corpus was accepted at @NeurIPSConf
This work, led by @RulinShao, has many interesting findings, e.g.,
- RAG with massive corpus for better compute-optimal scaling
- thorough analysis on modeling / analysis choices at scale
Check it out! https://t.co/u7VdMGOs2S
DSPy is a SUPER exciting advancement for AI and building applications with LLMs!🧩🤯
Pioneered by frameworks such as LangChain and LlamaIndex, we can build much more powerful systems by chaining together LLM calls! This means that the output of one call to an LLM is the input to…
NeurIPS 2023 was such an amazing conference! @ecardenas300 and I learned a ton, met so many amazing people, and... put together our first in-person podcast series! 🍾
Super excited to share 10 interviews with @JayAlammar, @alexchaomander, @DivGarg9, @mgoin_, @lateinteraction,…
Hey everyone!! I am SUPER excited to publish the first episode of the AI-Native Database podcast series with @andy_pavlo and @bobvanluijt! 🎉🎉
This was an epic one! Beginning with the Self-Driving Database and all the opportunities to optimize DBs with AI/ML at both the…
Even if you are not an Open Source Absolutist, it is hard to overestimate how much value OSS has added to the world.
This goes for AI as for everything else.