Microsoft just dropped VASA-1. This AI can make a single image sing and talk expressively from an audio reference. Similar to EMO from Alibaba.
10 wild examples:
1. Mona Lisa rapping Paparazzi
2. Realism and liveliness - example 1