Introducing the Distributional Successor Measure (DSM): a model of the range of possible futures an agent faces. As a distributional extension of the Successor Representation, it enables zero-shot distributional policy evaluation beyond the capabilities of existing methods. The DSM disentangles the Successor Representation (SR) into distinct future state occupancies by modeling a distribution over distributions of state. It is this additional layer of distributional modeling that allows for distributional return predictions, unlocking the ability to generalize over tasks and risk-aware objectives. This is joint work with @harwiltz and what was absolutely the dream team from @GoogleDeepMind / @Mila_Quebec / @GatsbyUCL: @ArthurGretton, @robinphysics, André Barreto, @wwdabney, @marcgbellemare, and Mark Rowland. Learn more about the DSM in our paper: 📄arxiv.org/abs/2402.08530 🧵Thread below 👇
The DSM is characterized as the fixed point of a (doubly-infinite-dimensional!) distributional Bellman equation, which recursively defines the distribution of futures encountered by a policy. To address the challenges of modeling a distribution over (infinite-dimensional) state distributions, we introduce δ-models: collections of generative models of state, which can be trained via gradient descent on a hierarchical Maximum Mean Discrepancy (MMD).
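For readers who want a feel for what a "hierarchical MMD" between collections of generative models might look like, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: each model is represented only by a batch of state samples, the inner kernel is a Gaussian kernel on states, and the outer kernel is a Gaussian kernel applied to the inner squared MMD. The function names, bandwidths, and the biased (V-statistic) estimator are all choices made for this example; see the paper for the actual loss.

```python
import numpy as np


def gaussian_gram(x, y, bandwidth):
    """Pairwise Gaussian kernel values between rows of x (n, d) and y (m, d)."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two sample sets."""
    return (
        gaussian_gram(x, x, bandwidth).mean()
        - 2.0 * gaussian_gram(x, y, bandwidth).mean()
        + gaussian_gram(y, y, bandwidth).mean()
    )


def hierarchical_mmd_squared(models_p, models_q, inner_bw=1.0, outer_bw=1.0):
    """Squared MMD between two collections of sample-based state distributions.

    models_p, models_q: arrays of shape (num_models, num_samples, state_dim),
    where each slice along the first axis holds samples from one generative
    model. Assumption for this sketch: the outer kernel between two models is
    exp(-MMD^2 / (2 * outer_bw^2)), with MMD^2 computed by the inner kernel.
    """

    def outer_gram(a, b):
        gram = np.empty((len(a), len(b)))
        for i, samples_a in enumerate(a):
            for j, samples_b in enumerate(b):
                inner = mmd_squared(samples_a, samples_b, inner_bw)
                gram[i, j] = np.exp(-inner / (2.0 * outer_bw ** 2))
        return gram

    return (
        outer_gram(models_p, models_p).mean()
        - 2.0 * outer_gram(models_p, models_q).mean()
        + outer_gram(models_q, models_q).mean()
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two collections of 4 "models", each emitting 32 two-dimensional states.
    collection_a = rng.normal(0.0, 1.0, size=(4, 32, 2))
    collection_b = rng.normal(3.0, 1.0, size=(4, 32, 2))
    print("same collection:   ", hierarchical_mmd_squared(collection_a, collection_a))
    print("shifted collection:", hierarchical_mmd_squared(collection_a, collection_b))
```

In a training loop this quantity would be minimized with respect to the parameters of the generative models by gradient descent, which needs an autodiff framework rather than plain NumPy; the sketch only shows how the two levels of kernels nest.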
@JesseFarebro Nice work! What did you use to create the animation in the initial tweet? Was it github.com/3b1b/manim ?