I’m surprised at how much coherence I’ve gotten by trying to fine-tune RoBERTa into a language diffusion model. Pretty decent for a 6-year-old model with only 125 million parameters.
1
0
7
388
1
Download Video