Commit Graph

167 Commits

Author SHA1 Message Date
Phil Wang
97e951221b bring in blur, as it will be used somewhere in the cascading DDPM in the decoder eventually, once i figure it out 2022-04-14 09:16:09 -07:00
Phil Wang
7fb3f695d5 offer continuously parameterized time embedding for diffusion prior network, remove a hyperparameter that may trip up people, if not set correctly 2022-04-14 08:28:11 -07:00
Phil Wang
7e93b9d3c8 make sure classifier free guidance condition scaling is exposed on DALLE2 forward function 2022-04-13 20:14:28 -07:00
Phil Wang
14ddbc159c cleanup 2022-04-13 18:24:32 -07:00
Phil Wang
5e06cde4cb always work in the l2normed space for image and text embeddings 2022-04-13 18:08:42 -07:00
Phil Wang
a1a8a78f21 fix everything and make sure it runs end to end, document everything in readme for public 2022-04-13 18:05:25 -07:00
Phil Wang
791d27326a add diffusion code for the image embedding. nearly all the code is there except for the cascading ddpm in the decoder (with upscaling etc) 2022-04-13 10:06:52 -07:00
Phil Wang
33d69d3859 take care of DDPM decoder (DDPM for producing image embedding will have a separate objective, predicting directly the embedding rather than the noise [epsilon in paper]) 2022-04-12 17:48:41 -07:00
Phil Wang
46dde54948 for integration of X-CLIP automagically in the gaussian diffusion classes 2022-04-12 12:17:34 -07:00
Phil Wang
fd38eb83c4 complete the main contribution of the paper, the diffusion prior network, minus the diffusion training setup 2022-04-12 11:43:59 -07:00
Phil Wang
7bbc62f3d5 bring in pillow, for image encoding to and from 2022-04-12 10:29:55 -07:00
Phil Wang
2ab042b862 create the eventual dream cli, like bigsleep library 2022-04-12 10:04:17 -07:00
Phil Wang
f5e0aea140 get ready for CLI tool, just like stylegan2_pytorch 2022-04-12 09:57:54 -07:00
Phil Wang
7cf1637d24 bring in the simple tokenizer released by openai, but also plan on leaving room for custom tokenizer with yttm 2022-04-12 09:23:17 -07:00
Phil Wang
4ff6d021c9 pin to newer version of CLIP that returns encoded text and images, get some helper functions ready for XCLIP 2022-04-12 08:54:47 -07:00
Phil Wang
850271e2d9 bring in x-clip 2022-04-08 12:19:31 -07:00
Phil Wang
f283bf25be scaffold 2022-04-07 07:29:34 -07:00