Evals are dead.
Let LLMs play arcade games instead.
I couldn't decide which model to use...
So I made them play Connect 4 against each other.
@OpenAI Vs @AnthropicAI
GPT-5 won almost every time.
Built with @browserbasehq
🔥 Excited to share our latest work: WebWatcher 🕵️‍♂️
An open-source multimodal agent that achieves new SOTA on multiple challenging vision-language (VL) deep research benchmarks — outperforming GPT-4o & Gemini!
Paper: arxiv.org/abs/2508.05748
Code: github.com/Alibaba-NLP/We…
soho is on fire rn. cursor, oai, gc all moving in.
running point for a gp who will own nyc for one of the best funds on the planet. office in… soho.
in town next week. looking to meet the best investors and angels here.
venture’s next 2-3 years will be the most…
I can’t believe I’m recommending a totally random AI summary I came across online but…
This 6-min NotebookLM video is an excellent summary of my perspective on the bitter lesson and what it means for AI engineers.
Link below.
The @slashapp team just launched Global USD.
It's the first platform that gives international businesses a real USD account—without an LLC, EIN, or compliance nightmares.
Businesses in 100+ countries can use it to send and receive ACH, wires, and USDC/USDT.
This is how all startups will raise money in the future
At Carry, we used a roll-up for our friends & family round to bring in ~200 small check investors
And we used it again during our Series A to let 100+ customers invest
It was & continues to be a no-brainer
We’re thrilled to see our advanced ML models and EMG hardware — that transform neural signals controlling muscles at the wrist into commands that seamlessly drive computer interactions — appearing in the latest edition of @Nature.
Read the story: nature.com/articles/s4158…
Find…
Today, we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the 2025 IMO with a general reasoning LLM—under the same time limits as humans, without tools. As remarkable as that sounds, it’s even more significant than the headline 🧵
Join us at Agentic AI Summit 2025 — August 2 at UC Berkeley, with ~2,000 in-person attendees and the leading minds in AI.
Building on the momentum of the 25K+ LLM Agents MOOC community, this is the largest and most cutting-edge event on #AgenticAI.
As 2025 emerges as the Year of…
A conversation with @patrickc on old programming languages, software at industrial scale, and AI's effect on economics/biology/Patrick's daily life.
00:15 - Why Patrick wrote his first startup in Smalltalk
03:35 - LISP chatbots
06:09 - Good ideas from esoteric programming…
We just unveiled Grok 4, the world’s smartest artificial intelligence. 🧵
Grok 4 outperforms all other models on the ARC-AGI benchmark, scoring 15.9% - nearly double that of the next best model - and establishing itself as the most intelligent AI to date.
Everyone is talking about Perplexity’s new product.
It's called Comet, an AI web browser that can browse the internet and do things for you.
Like Google Chrome + Siri (if Siri worked).
Some things it does:
• Clicks, scrolls, summarizes, compares prices, checks calendars,…
Been building something a little different.
It’s called XPMap, a retro map-style DeFi game.
You move across a pixel world, complete onchain quests, and actually learn how this stuff works.
Even if you don’t know anything about web3, XPMap acts as a bridge taking you from clicks…
Amjad Masad says true AGI means an AI that can enter new situations and quickly learn how to achieve goals
Even the latest models still struggle with tasks outside their training data.
The real breakthrough will come when AI can train itself -- like AlphaGo playing billions of…
271 Followers · 6K Following · Investment analyst based in the Baltics. This is my personal trading journal where I share my insights based on data analysis, and technicals! - DYOR/NFA
298 Followers · 2K Following · your favorite pro’s favorite pro 👷🏼 i went to @stanford then worked @google @morganstanley @baincapital and now I build decks (real ones)
2K Followers · 4K Following · Investment analyst based in the Baltics. This is my personal trading journal where I share my insights based on data analysis, and technicals! - DYOR/NFA Market
1K Followers · 2K Following · Living where the breeze lingers longer. 🌳
Wearing soft layers, chasing quiet joys, believing that slow moments make the deepest roots.
50K Followers · 41K Following · #AutonomousSupplyChain | Top 50 Outstanding #AI Business #Influencer | Top 22 AI Influencer to Follow by 2023 | Top 100 Global #ThoughtLeaders | @blueyonder
73K Followers · 137 Following · Every company has a story. Learn the playbooks that built the world’s greatest companies — and how you can apply them.
Hosted by @gilbert and @djrosent.
8K Followers · 304 Following · Founder & CEO, @CAForever. Raised $1bn+ to build a new city on 100+ square miles an hour north of SF/SV. For those who believe California's best days are ahead.
30K Followers · 2K Following · Jaded FinTech PM | White shoes, ABC pants, and a mint puff bar | 6 feet tall (pre-haircut) | Interests include birth control and gut health
46K Followers · 334 Following · Investment analyst based in the Baltics. This is my personal trading journal where I share my insights based on data analysis, and technicals! - DYOR/NFA
108K Followers · 9K Following · Tech founder, investor, philanthropist, driver of change. Iranian American immigrant working to spread the American Dream and to restore democracy to Iran.
159K Followers · 990 Following · Founder @teamSundial. Angel investor. Author of "The Making of a Manager" https://t.co/6HwJhCW5Hi. Obsessed with systems. Design + data person.