Commit Graph

42 Commits

Author SHA1 Message Date
Phil Wang
791d27326a add diffusion code for the image embedding. nearly all the code is there except for the cascading ddpm in the decoder (with upscaling etc) 2022-04-13 10:06:52 -07:00
Phil Wang
6d4e9c97bf todo 2022-04-12 20:50:29 -07:00
Phil Wang
40140b54d6 put on project manager hat 2022-04-12 17:51:23 -07:00
Phil Wang
33d69d3859 take care of DDPM decoder (DDPM for producing image embedding will have a separate objective, predicting directly the embedding rather than the noise [epsilon in paper]) 2022-04-12 17:48:41 -07:00
Phil Wang
862e5ba50e more sketches to base dalle2 class 2022-04-12 17:31:01 -07:00
Phil Wang
25d980ebbf complete naive conditioning of unet with image embedding, with ability to dropout for classifier free guidance 2022-04-12 17:27:39 -07:00
Phil Wang
d546a615c0 complete helper methods for doing condition scaling (classifier free guidance), for decoder unet and prior network 2022-04-12 16:11:16 -07:00
Phil Wang
d4c8373635 complete conditional dropout mask creation for both prior network as well as image decoder unet for classifier free guidance 2022-04-12 14:04:08 -07:00
Phil Wang
c814b2b278 sponsor project button 2022-04-12 13:34:02 -07:00
Phil Wang
74aec9d8ca further prepare attention for classifier free guidance 2022-04-12 13:01:18 -07:00
Phil Wang
7647be2569 prep for classifier free guidance for the image embedding diffusion step, even though not mentioned in paper 2022-04-12 12:57:09 -07:00
Phil Wang
59b8abe09e prepare unet to be conditioned on image embedding, optionally text encodings, and reminder for self to build conditional dropout for classifier free guidance 2022-04-12 12:38:56 -07:00
Phil Wang
46dde54948 for integration of X-CLIP automagically in the gaussian diffusion classes 2022-04-12 12:17:34 -07:00
Phil Wang
40aa304b7e rename to DiffusionPriorNetwork in case ARPriorNetwork is ever built 2022-04-12 11:45:57 -07:00
Phil Wang
fd38eb83c4 complete the main contribution of the paper, the diffusion prior network, minus the diffusion training setup 2022-04-12 11:43:59 -07:00
Phil Wang
83aabd42ca move epsilon inside of square root for further stability in rmsnorm; improvise and use rmsnorm in convnext blocks too 2022-04-12 11:18:36 -07:00
Phil Wang
cf22affcbb bring in modified unet using convnext blocks https://arxiv.org/abs/2201.03545 2022-04-12 10:58:44 -07:00
Phil Wang
522f42f582 start using RMSNorm, used in Gopher and AlphaCode, and as a way to go complete bias-less (purportedly more stable according to PaLM) 2022-04-12 10:45:03 -07:00
Phil Wang
0a60818965 dropouts in transformer, also prep for classifier free guidance in decoder 2022-04-12 10:42:57 -07:00
Phil Wang
604765b563 readme 2022-04-12 10:35:56 -07:00
Phil Wang
7bbc62f3d5 bring in pillow, for image encoding to and from 2022-04-12 10:29:55 -07:00
Phil Wang
771fe0d0d2 also consider accepting tokenizer, so dalle2 forward pass can just be invoked as DALLE2(<prompt string>) 2022-04-12 10:29:29 -07:00
Phil Wang
de75a8af76 link to yannic, since he is the best 2022-04-12 10:27:01 -07:00
Phil Wang
df4dac4f5a bring in attention - it is all we need 2022-04-12 10:23:07 -07:00
Phil Wang
24b428bdfc readme 2022-04-12 10:12:42 -07:00
Phil Wang
2ab042b862 create the eventual dream cli, like bigsleep library 2022-04-12 10:04:17 -07:00
Phil Wang
b93ad8b7a2 add cli file, use click 2022-04-12 09:58:53 -07:00
Phil Wang
f5e0aea140 get ready for CLI tool, just like stylegan2_pytorch (tag: 0.0.2) 2022-04-12 09:57:54 -07:00
Phil Wang
5e03b7f932 get ready for all the training related classes and functions 2022-04-12 09:54:50 -07:00
Phil Wang
62c0d321a6 sketch 2022-04-12 09:39:42 -07:00
Phil Wang
7cf1637d24 bring in the simple tokenizer released by openai, but also plan on leaving room for custom tokenizer with yttm (tag: 0.0.1) 2022-04-12 09:23:17 -07:00
Phil Wang
4ff6d021c9 pin to newer version of CLIP that returns encoded text and images, get some helper functions ready for XCLIP 2022-04-12 08:54:47 -07:00
Phil Wang
0070547e3b add a link to laion discord 2022-04-10 19:03:31 -07:00
Phil Wang
2dc8717bbe readme 2022-04-09 10:47:49 -07:00
Phil Wang
850271e2d9 bring in x-clip 2022-04-08 12:19:31 -07:00
Phil Wang
7b54195da4 explain to public 2022-04-07 09:53:56 -07:00
Phil Wang
0754a694ba cite katherine, as she was the true genesis of CLIP + diffusion (and now latent diffusion) 2022-04-07 09:26:28 -07:00
Phil Wang
c5d49db762 intent 2022-04-07 09:14:08 -07:00
Phil Wang
f283bf25be scaffold 2022-04-07 07:29:34 -07:00
Phil Wang
25fb133c83 diagram 2022-04-07 05:08:11 +00:00
Phil Wang
32b584d6c0 readme 2022-04-06 21:17:16 -07:00
Phil Wang
cfba049416 Initial commit 2022-04-06 21:14:09 -07:00