Recent language models like BERT and ERNIE rely on trendy layers based on transformer networks and require large amounts of compute. Deep learning researcher @Smerity shows these layers and enormous GPUs may not be necessary: bit.ly/2QZXxWV
@DeepLearningAI_ @AI_Fund @Smerity "The attention mechanism is also readily extended to large contexts with minimal computation. Take that Sesame Street." This was their reaction: i.imgur.com/SzhmLOg.jpg
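For anyone wondering what a single attention head over a long context looks like in practice, here is a minimal sketch assuming a PyTorch-style setup; the class name SingleHeadAttention, the shapes, and the hyperparameters are illustrative only and are not taken from the paper's actual code.

```python
# Minimal single-headed attention sketch (illustrative, not the SHA-RNN code).
import torch
import torch.nn as nn

class SingleHeadAttention(nn.Module):
    def __init__(self, d_model):
        super().__init__()
        # One head means a single query/key/value projection each, so the
        # parameter count and compute stay modest even for long contexts.
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.scale = d_model ** -0.5

    def forward(self, x, memory):
        # x:      (batch, seq_len, d_model)  current hidden states
        # memory: (batch, mem_len, d_model)  cached context from earlier steps
        q = self.q(x)
        k = self.k(memory)
        v = self.v(memory)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v

# Example: attend over a 2048-token cached context with a single head.
attn = SingleHeadAttention(d_model=256)
x = torch.randn(1, 64, 256)
memory = torch.randn(1, 2048, 256)
out = attn(x, memory)
print(out.shape)  # torch.Size([1, 64, 256])
```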
@DeepLearningAI_ @Smerity I read this a while ago and thought it was a joke. Is this actually serious? I loved the "take that, Sesame Street" line.
@DeepLearningAI_ @egrefen @Smerity I like the tone of the abstract. Maybe I'll read it further. :)
@DeepLearningAI_ @Smerity Dumb question... what is the evergreen WikiText initiative? I can't find any reference to it anywhere. From the short description in the paper it sounds very interesting.