Author | Commit | Message | Date
Phil Wang | 1cce4225eb | 0.0.18 | 2022-04-17 07:29:34 -07:00
Phil Wang | c400d8758c | prepare for cascading diffusion in unet, save the full progressive upsampling architecture to be built next week | 2022-04-15 07:03:28 -07:00
Phil Wang | bece206699 | fix bug thanks to @jihoonerd | 2022-04-15 06:44:40 -07:00
Phil Wang | 6e27f617f1 | use t5 relative positional bias in prior network causal transformer, since it makes more sense than rotary embeddings | 2022-04-14 12:01:09 -07:00
Phil Wang | 9f55c24db6 | allow for decoder conditioning with the text encodings from CLIP, if it is passed in. use lazy linear to avoid researchers having to worry about text encoding dimensions, but remove later if it does not work well | 2022-04-14 11:46:45 -07:00
Phil Wang | 23c401a5d5 | use the eval decorator | 2022-04-14 10:13:43 -07:00
Phil Wang | 68e9883f59 | use cross attention for conditioning unet based on image embedding tokens (which opens up the door on conditioning on text encodings as well) | 2022-04-14 10:10:04 -07:00
Phil Wang | 95b018374a | start using swish glu everywhere, given success of PaLM | 2022-04-14 09:34:32 -07:00
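The "swish glu" (SwiGLU) activation adopted in the commit above, following PaLM, replaces a plain feedforward activation with a gated variant: the projected hidden vector is split in half, and one half gates the other through SiLU. A minimal pure-Python sketch of the elementwise math (the repository implements this as a PyTorch module; the function names here are illustrative):

```python
import math

def silu(g: float) -> float:
    """SiLU / swish activation: g * sigmoid(g)."""
    return g * (1.0 / (1.0 + math.exp(-g)))

def swiglu(hidden: list[float]) -> list[float]:
    """SwiGLU: split the projected hidden vector in half,
    then gate the first half with SiLU of the second half,
    elementwise: out[i] = value[i] * silu(gate[i])."""
    half = len(hidden) // 2
    value, gate = hidden[:half], hidden[half:]
    return [v * silu(g) for v, g in zip(value, gate)]
```

In a transformer feedforward block this typically means one linear layer projects to twice the hidden width, SwiGLU halves it back, and a second linear layer projects down.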
Phil Wang | f2c52d8239 | fix bug with classifier free guidance for prior network, even though it seems it may not be used | 2022-04-14 09:21:51 -07:00
Phil Wang | 97e951221b | bring in blur, as it will be used somewhere in the cascading DDPM in the decoder eventually, once i figure it out | 2022-04-14 09:16:09 -07:00
Phil Wang | 7fb3f695d5 | offer continuously parameterized time embedding for diffusion prior network, remove a hyperparameter that may trip up people, if not set correctly | 2022-04-14 08:28:11 -07:00
Phil Wang | 7e93b9d3c8 | make sure classifier free guidance condition scaling is exposed on DALLE2 forward function | 2022-04-13 20:14:28 -07:00
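The condition scaling mentioned in the commit above refers to classifier-free guidance: at sampling time the model's conditioned and unconditioned (null-condition) predictions are combined, and the scale controls how far the result is pushed toward the condition. A minimal sketch of the combination rule (function and parameter names here are hypothetical, not the repository's API):

```python
def cfg_combine(cond_pred: list[float],
                null_pred: list[float],
                scale: float) -> list[float]:
    """Classifier-free guidance: extrapolate from the unconditioned
    prediction toward the conditioned one.
    pred = null + scale * (cond - null); scale = 1 recovers cond_pred,
    scale > 1 strengthens the conditioning signal."""
    return [n + scale * (c - n) for c, n in zip(cond_pred, null_pred)]
```

Exposing the scale on the forward/sampling call lets users trade sample diversity for condition fidelity without retraining.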
Phil Wang | 14ddbc159c | cleanup | 2022-04-13 18:24:32 -07:00
Phil Wang | 5e06cde4cb | always work in the l2normed space for image and text embeddings | 2022-04-13 18:08:42 -07:00
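Working in the "l2normed space", as in the commit above, means every image and text embedding is rescaled to unit L2 norm, so dot products between embeddings reduce to cosine similarity, matching CLIP's convention. A minimal sketch on plain Python lists (the repository operates on PyTorch tensors):

```python
import math

def l2norm(vec: list[float], eps: float = 1e-12) -> list[float]:
    """Project an embedding onto the unit hypersphere by dividing
    by its L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]
```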
Phil Wang | a1a8a78f21 | fix everything and make sure it runs end to end, document everything in readme for public | 2022-04-13 18:05:25 -07:00
Phil Wang | 791d27326a | add diffusion code for the image embedding. nearly all the code is there except for the cascading ddpm in the decoder (with upscaling etc) | 2022-04-13 10:06:52 -07:00
Phil Wang | 33d69d3859 | take care of DDPM decoder (DDPM for producing image embedding will have a separate objective, predicting directly the embedding rather than the noise [epsilon in paper]) | 2022-04-12 17:48:41 -07:00
Phil Wang | 46dde54948 | for integration of X-CLIP automagically in the gaussian diffusion classes | 2022-04-12 12:17:34 -07:00
Phil Wang | fd38eb83c4 | complete the main contribution of the paper, the diffusion prior network, minus the diffusion training setup | 2022-04-12 11:43:59 -07:00
Phil Wang | 7bbc62f3d5 | bring in pillow, for image encoding to and from | 2022-04-12 10:29:55 -07:00
Phil Wang | 2ab042b862 | create the eventual dream cli, like bigsleep library | 2022-04-12 10:04:17 -07:00
Phil Wang | f5e0aea140 | get ready for CLI tool, just like stylegan2_pytorch | 2022-04-12 09:57:54 -07:00
Phil Wang | 7cf1637d24 | bring in the simple tokenizer released by openai, but also plan on leaving room for custom tokenizer with yttm | 2022-04-12 09:23:17 -07:00
Phil Wang | 4ff6d021c9 | pin to newer version of CLIP that returns encoded text and images, get some helper functions ready for XCLIP | 2022-04-12 08:54:47 -07:00
Phil Wang | 850271e2d9 | bring in x-clip | 2022-04-08 12:19:31 -07:00
Phil Wang | f283bf25be | scaffold | 2022-04-07 07:29:34 -07:00