Octopus v2: On-device language model for super agent
Presents a new method that empowers an on-device 2B model to outperform GPT-4 in both accuracy and latency, and decrease the context length by 95%
arxiv.org/abs/2404.01744
Apple's 3B LLM Outperforms GPT-4 🤯
📌 They also found that ReALM and GPT-4 perform very similarly on the unseen domain.
📌 ReALM significantly improves how conversational assistants like Siri or Alexa can understand the way humans naturally talk. Imagine you're…
I'm very excited to announce Weave, our new tools to track and evaluate your LLM apps.
Use Weave to:
🍩log and version LLM interactions and surrounding data, from development to production
🍩experiment with prompting techniques, model changes, and parameters
🍩evaluate your…
Intel presents LLaVA-Gemma
Accelerating Multimodal Foundation Models with a Compact Language Model
We train a suite of multimodal foundation models (MMFM) using the popular LLaVA framework with the recently released Gemma family of large language models (LLMs). Of particular
Language Models as Compilers
Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
Algorithmic reasoning refers to the ability to understand the complex patterns behind the problem and decompose them into a sequence of reasoning steps
Google announces Bigger is not Always Better
Scaling Properties of Latent Diffusion Models
We study the scaling properties of latent diffusion models (LDMs) with an emphasis on their sampling efficiency. While improved network architecture and inference algorithms have
Google presents Mixture-of-Depths
Dynamically allocating compute in transformer-based language models
Transformer-based language models spread FLOPs uniformly across input sequences. In this work we demonstrate that transformers can instead learn to dynamically allocate
If you have a certain combination of naïveté and self-delusion, you might think that superhuman AI is just around the corner.
It wasn't true in 2016.
And it's still not true today.
If you have a bit of a superiority complex, you might think that you will be the one producing…
The 2024 Brain Prize goes to pioneers of computational and theoretical neuroscience: Larry Abbott, @HSompolinsky, and Terry Sejnowski.
It's fabulous to see the field being recognized in a big way, and I can't think of a more deserving group of laureates for it.…
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The model is trained to generate videos of realistic or imaginative scenes from text instructions and…
Just ported Gemma from @GoogleDeepMind to MLX
github.com/ml-explore/mlx…
Gemma is almost identical to a Mistral / Llama style model with a couple of distinctions that you model mechanics might be interested in 👇
We need open source AI foundation models so that a highly diverse set of specialized models can be built on top of them.
We need a free and diverse set of AI assistants for the same reasons we need a free and diverse press.
They must reflect the diversity of languages, culture,…
Lots of confusion about what a world model is. Here is my definition:
Given:
- an observation x(t)
- a previous estimate of the state of the world s(t)
- an action proposal a(t)
- a latent variable proposal z(t)
A world model computes:
- representation: h(t) = Enc(x(t))
-…
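The one step of the definition above that survives in the post, h(t) = Enc(x(t)), can be sketched as follows. This is only an illustrative sketch: the `WorldModelInputs` container, the single tanh layer used for `enc`, and the weight values are all assumptions made for this example and are not taken from the post. The steps after the representation are truncated in the source, so they are not sketched here.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class WorldModelInputs:
    """Hypothetical container for the four inputs listed in the post."""
    x: List[float]  # observation x(t)
    s: List[float]  # previous estimate of the state of the world s(t)
    a: List[float]  # action proposal a(t)
    z: List[float]  # latent variable proposal z(t)


def enc(x: List[float], weights: List[List[float]]) -> List[float]:
    """Assumed encoder: a single tanh layer mapping x(t) to h(t)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


# Arbitrary illustrative values, not from the source.
inputs = WorldModelInputs(x=[1.0, 0.0], s=[0.0], a=[0.0], z=[0.0])
W = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

# Representation step: h(t) = Enc(x(t))
h = enc(inputs.x, W)
```

Only `x` is consumed here; `s`, `a`, and `z` are carried along because the post lists them as inputs to the later steps.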
7K Followers 6K Following
Center for Language and Speech Processing at @JohnsHopkins #NLProc #MachineLearning #AI https://t.co/6IXR5OSQtw
774 Followers 8K Following
CS PhD student at the University of Birmingham. Research interests: Automated Machine Learning (Bayesian optimization), Reinforcement Learning.
28 Followers 241 Following
Digital media advisor cum AI-domain speaker with 39 years of trade exposure; has travelled to more than 100 nations for work.
5K Followers 5K Following
In awe of knowledge's breadth, a learning journey, with teams to realize dreams, guided by scientific evidence. Only personal musings. Inquisitive, grow.
3K Followers 5K Following
A curious guy. Becoming a better human?
QUOTE:
Tell me about despair, yours, and I will tell you mine.
Meanwhile the world goes on.
--Mary Oliver, Wild Geese
53 Followers 175 Following
Discord: @gauravcr
Searching: ideas touching embedded and AI
Developing: an AI powered meditation app to help retrain the psyche
Alterok, N&W S4 Buildspace
760 Followers 4K Following
Follow Me If You Like
💻 Programming
🎆 Graphic Design/Ui Ux
⚙️ Software Engineering
🔥 App/Website Development
📊 Artificial intelligence
📩 DM For Collab
2K Followers 3K Following
Founder/CEO of Graphlit (@graphlit), Managed Knowledge API + MCP server. ex-MSFT, PA born, Seattle bred. Dad to dogs/humans. 🚀@zine_ai
5K Followers 828 Following
Postdoc @LTIatCMU. PhD from Ohio State @osunlp. Author of MMMU, MAmmoTH. Training & evaluating foundation models. Opinions are my own.
5K Followers 303 Following
I love building things. AppliedAI/ChatGPT @openai. Formerly, eng @airbnb, founder @fabric_app. Creator of the first @facebook Timeline, Memories, See Friendship
9K Followers 3K Following
🦹🏼‍♀️ VP of AI @weights_biases
👩🏼‍💻 Your neighborhood nerd in faux fur
🐉 Trainer of dragons & AI models
🔮 Forbes Technology Council
💃 Chaotic good
25K Followers 33 Following
World Labs is a spatial intelligence company building Large World Models to perceive, generate, and interact with the 3D world.
2K Followers 2K Following
Ph.D. Student @PrincetonCS. Prev @Stanford @UW @pika_labs @MSFTResearch @UofIllinois @ZJU_China. I used to work on computer vision, but it's not all I do.
2K Followers 941 Following
The world can be ugly and cruel to the most innocent. Consider donating to help children suffering from one of the worst things: https://t.co/PYZWj8o4OW
6K Followers 700 Following
👨‍💻 AI Research & Engineering @GroqInc. Occasional angel investor. I publish technical resources about LLMs on Substack. Opinions are my own.
10K Followers 37 Following
Team member at something young.
Adjunct Prof @ McGill.
Member of Mila, Quebec AI Institute.
Stream of consciousness is my own.