Commit Graph

217 Commits

Author SHA1 Message Date
Phil Wang
d4c8373635 complete conditional dropout mask creation for both prior network as well as image decoder unet for classifier free guidance 2022-04-12 14:04:08 -07:00
Phil Wang
74aec9d8ca further prepare attention for classifier free guidance 2022-04-12 13:01:18 -07:00
Phil Wang
7647be2569 prep for classifier free guidance for the image embedding diffusion step, even though not mentioned in paper 2022-04-12 12:57:09 -07:00
Phil Wang
59b8abe09e prepare unet to be conditioned on image embedding, optionally text encodings, and reminder for self to build conditional dropout for classifier free guidance 2022-04-12 12:38:56 -07:00
Phil Wang
40aa304b7e rename to DiffusionPriorNetwork in case ARPriorNetwork is ever built 2022-04-12 11:45:57 -07:00
Phil Wang
fd38eb83c4 complete the main contribution of the paper, the diffusion prior network, minus the diffusion training setup 2022-04-12 11:43:59 -07:00
Phil Wang
83aabd42ca move epsilon inside of square root for further stability in rmsnorm
improvise and use rmsnorm in convnext blocks too
2022-04-12 11:18:36 -07:00
Phil Wang
cf22affcbb bring in modified unet using convnext blocks https://arxiv.org/abs/2201.03545 2022-04-12 10:58:44 -07:00
Phil Wang
522f42f582 start using RMSNorm, used in Gopher and AlphaCode, and as a way to go complete bias-less (purportedly more stable according to PaLM) 2022-04-12 10:45:03 -07:00
Phil Wang
0a60818965 dropouts in transformer, also prep for classifier free guidance in decoder 2022-04-12 10:42:57 -07:00
Phil Wang
771fe0d0d2 also consider accepting tokenizer, so dalle2 forward pass can just be invoked as DALLE2(<prompt string>) 2022-04-12 10:29:29 -07:00
Phil Wang
df4dac4f5a bring in attention - it is all we need 2022-04-12 10:23:07 -07:00
Phil Wang
24b428bdfc readme 2022-04-12 10:12:42 -07:00
Phil Wang
62c0d321a6 sketch 2022-04-12 09:39:42 -07:00
Phil Wang
7cf1637d24 bring in the simple tokenizer released by openai, but also plan on leaving room for custom tokenizer with yttm 2022-04-12 09:23:17 -07:00
Phil Wang
4ff6d021c9 pin to newer version of CLIP that returns encoded text and images, get some helper functions ready for XCLIP 2022-04-12 08:54:47 -07:00
Phil Wang
f283bf25be scaffold 2022-04-07 07:29:34 -07:00