diff --git a/README.md b/README.md
index 2f26812..9e0ede8 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,8 @@
 
 Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
 
+The main novelty seems to be an extra layer of indirection with the prior network (whether it is a transformer or a diffusion network), which predicts an image embedding based on the text embedding from CLIP.
+
 This is SOTA for text-to-image now, but probably not for long.
 
 It may also explore an extension of using latent diffusion in the decoder