Commit Graph

14 Commits

Author SHA1 Message Date
Phil Wang
59b8abe09e prepare unet to be conditioned on image embedding, optionally text encodings, and reminder for self to build conditional dropout for classifier free guidance 2022-04-12 12:38:56 -07:00
Phil Wang
40aa304b7e rename to DiffusionPriorNetwork in case ARPriorNetwork is ever built 2022-04-12 11:45:57 -07:00
Phil Wang
fd38eb83c4 complete the main contribution of the paper, the diffusion prior network, minus the diffusion training setup 2022-04-12 11:43:59 -07:00
Phil Wang
83aabd42ca move epsilon inside of square root for further stability in rmsnorm
improvise and use rmsnorm in convnext blocks too
2022-04-12 11:18:36 -07:00
Phil Wang
cf22affcbb bring in modified unet using convnext blocks https://arxiv.org/abs/2201.03545 2022-04-12 10:58:44 -07:00
Phil Wang
522f42f582 start using RMSNorm, used in Gopher and AlphaCode, and as a way to go complete bias-less (purportedly more stable according to PaLM) 2022-04-12 10:45:03 -07:00
Phil Wang
0a60818965 dropouts in transformer, also prep for classifier free guidance in decoder 2022-04-12 10:42:57 -07:00
Phil Wang
771fe0d0d2 also consider accepting tokenizer, so dalle2 forward pass can just be invoked as DALLE2(<prompt string>) 2022-04-12 10:29:29 -07:00
Phil Wang
df4dac4f5a bring in attention - it is all we need 2022-04-12 10:23:07 -07:00
Phil Wang
24b428bdfc readme 2022-04-12 10:12:42 -07:00
Phil Wang
62c0d321a6 sketch 2022-04-12 09:39:42 -07:00
Phil Wang
7cf1637d24 bring in the simple tokenizer released by openai, but also plan on leaving room for custom tokenizer with yttm 2022-04-12 09:23:17 -07:00
Phil Wang
4ff6d021c9 pin to newer version of CLIP that returns encoded text and images, get some helper functions ready for XCLIP 2022-04-12 08:54:47 -07:00
Phil Wang
f283bf25be scaffold 2022-04-07 07:29:34 -07:00