diff --git a/README.md b/README.md
index 9e0ede8..49fe63f 100644
--- a/README.md
+++ b/README.md
@@ -4,11 +4,11 @@
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
-The main novelty seems to be an extra layer of indirection with the prior network (whether it is a transformer or a diffusion network), which predicts an image embedding based on the text embedding from CLIP.
+The main novelty seems to be an extra layer of indirection with the prior network (whether it is an autoregressive transformer or a diffusion network), which predicts an image embedding based on the text embedding from CLIP. Specifically, this repository will only build out the diffusion prior network, as it is the best-performing variant (though it incidentally uses a causal transformer as the denoising network 😂).
-This is SOTA for text-to-image now, but probably not for long.
+This model is currently SOTA for text-to-image.
-It may also explore an extension of using latent diffusion in the decoder
+It may also explore an extension that uses latent diffusion, from Rombach et al., in the decoder.
## Citations