From 7b54195da4bd2f7ec73bc9372efbb1735c23253c Mon Sep 17 00:00:00 2001
From: Phil Wang
Date: Thu, 7 Apr 2022 09:53:56 -0700
Subject: [PATCH] explain to public

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 2f26812..9e0ede8 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,8 @@
 
 Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
 
+The main novelty seems to be an extra layer of indirection with the prior network (whether it is a transformer or a diffusion network), which predicts an image embedding based on the text embedding from CLIP.
+
 This is SOTA for text-to-image now, but probably not for long.
 
 It may also explore an extension of using latent diffusion in the decoder