Andrej Karpathy and Justin Johnson take a deep dive into OpenAI's DALL-E and use it as an anchor point to recurse into some of the recent work on image generation in AI. Approximate agenda:
DALL-E blog post
https://openai.com/blog/dall-e/
ImageGPT
https://openai.com/blog/image-gpt/
VQ-VAE
https://arxiv.org/abs/1711.00937
VQ-VAE-2
https://arxiv.org/abs/1906.00446
Gumbel-Softmax / Concrete Distribution
https://arxiv.org/abs/1611.01144
https://arxiv.org/abs/1611.00712
VQGAN
https://arxiv.org/abs/2012.09841
Andrej's attempted re-implementation of VQ-VAE and Gumbel-Softmax:
https://github.com/karpathy/deep-vector-quantization/blob/main/model.py
You can see a video version of this episode on YouTube:
https://www.youtube.com/watch?v=gMc90bqHMSM
We reached out to all speakers and obtained their written consent to appear in this recording.