David J Wu @lightvector1

Researcher, game AI enthusiast, author of KataGo (https://t.co/rJKWY2qU5p)

Joined October 2020

Tweets: 27
Followers: 511
Following: 59
Likes: 28

David J Wu @lightvector1

a year ago

Wooo, tensor diagrams are cool. (Transformer self-attention layer, from greaterwrong.com/posts/BQKKQiBm…)

0 0 1 735 1

Thomas Ahle @thomasahle

a year ago

I always found the tensor notation in Fast Matrix Multiplication algorithms confusing. But using tensor diagrams it's pretty easy to see what's going on:

8 98 780 91K 762

Even though we've known from word2vec and much work since that LLM representations correlate well with human concepts (both in linear additivity, distance/clustering, etc), I still find it cool that it holds up with larger models so far. Lots of space to explore further.

Anthropic @AnthropicAI

a year ago

66 550 2K 750K 1K

0 0 10 2K 2

Samuel Sokota @ssokota

a year ago

SOTA AI for games like poker & Hanabi rely on search methods that don’t scale to games w/ large amounts of hidden information. In our ICLR paper, we introduce simple search methods that scale to large games & get SOTA for Hanabi w/ 100x less compute. 1/N arxiv.org/abs/2304.13138

5 52 330 60K 180

David J Wu @lightvector1

2 years ago

There are tons of articles on MCTS, which wastes compute whenever paths lead to the same state, but few on Monte-Carlo *Graph* Search, which doesn't. But implementing MCGS soundly can be tricky! Here's a doc on how to do it, and the theory behind it: github.com/lightvector/Ka…

3 20 120 10K 107

Leela Chess Zero @LeelaChessZero

2 years ago

@GoogleDeepMind ..and the blog post with more details is live at lczero.org/blog/2024/02/h…

0 3 10 2K 1

Leela Chess Zero @LeelaChessZero

2 years ago

In the recent paper arxiv.org/abs/2402.04494 @GoogleDeepMind introduced a transformer chess network, but didn't include Lc0 in their comparison. We've used transformers for a while, and our network is stronger with fewer parameters. More details soon.

3 17 91 7K 14

Samuel Sokota @ssokota

2 years ago

There are two shapes below: one is named “kiki” and one is named “bouba”. Which is which? This is the puzzle we consider in our ICML paper: Learning Intuitive Policies Using Action Features. 1/N arxiv.org/abs/2201.12658 ⚫ ✴

4 12 42 19K 14

Eugene Vinitsky (@RLC) 🍒🦋 @EugeneVinitsky

3 years ago

What is off-belief learning and how does it help us build agents that coordinate only in grounded ways? Part 1 of a new blog series on intuitive summaries of key ideas in multi-agent RL: eugenevinitsky.github.io/posts/Off-Beli…

2 18 66 24K 28

Lex Fridman @lexfridman

3 years ago

Here's my conversation with Noam Brown (@polynoamial), co-creator of AI systems that achieve superhuman level performance in games of poker and Diplomacy that involves strategic negotiations with humans. This was a fascinating, technical conversation. youtube.com/watch?v=2oHH4a…

65 119 1K 0 134

317070 @317070

3 years ago

Did you know, that you can build a virtual machine inside ChatGPT? And that you can use this machine to create files, program and even browse the internet? engraved.blog/building-a-vir…

223 2K 8K 0 2K

David J Wu @lightvector1

3 years ago

We know that search can be a powerful RL policy improvement method (e.g. search outperforms the raw policy by 2000 Elo in AlphaGoZero!). One challenge is how to get this kind of RL to be robust when also needing to remain compatible with humans or other agents. Our work on how:

Noam Brown @polynoamial

3 years ago

27 239 1K 0 288

0 2 23 0 1

David J Wu @lightvector1

4 years ago

We have a new paper out! It is well-known that in many games the raw policy of an SL model can blunder in silly ways even after extensive training. Search seems to capture a component of human planning that deep neural nets have difficulty fitting or modeling on their own.