DALLE2-pytorch

mirror of https://github.com/lucidrains/DALLE2-pytorch.git synced 2025-12-19 17:54:20 +01:00

Author	SHA1	Message	Date
Phil Wang	ba58ae0bf2	add two asserts to diffusion prior to ensure matching image embedding dimensions for clip, diffusion prior network, and what was set on diffusion prior	2022-08-28 10:11:37 -07:00
Phil Wang	1cc5d0afa7	upgrade to best downsample	2022-08-25 10:37:02 -07:00
Phil Wang	59fa101c4d	fix classifier free guidance for diffusion prior, thanks to @jaykim9870 for spotting the issue	2022-08-23 08:29:01 -07:00
Phil Wang	083508ff8e	cast attention matrix back to original dtype pre-softmax in attention	2022-08-20 10:56:01 -07:00
Phil Wang	7762edd0ff	make it work for @ethancohen123	2022-08-19 11:28:58 -07:00
Phil Wang	44e09d5a4d	add weight standardization behind feature flag, which may potentially work well with group norm	2022-08-14 11:34:45 -07:00
Phil Wang	34806663e3	make it so diffusion prior p_sample_loop returns unnormalized image embeddings	2022-08-13 10:03:40 -07:00
Phil Wang	dc816b1b6e	dry up some code around handling unet outputs with learned variance	2022-08-12 15:25:03 -07:00
Phil Wang	05192ffac4	fix self conditioning shape in diffusion prior	2022-08-12 12:30:03 -07:00
Phil Wang	9440411954	make self conditioning technique work with diffusion prior	2022-08-12 12:20:51 -07:00
Phil Wang	981d407792	comment	2022-08-12 11:41:23 -07:00
Phil Wang	7c5477b26d	bet on the new self-conditioning technique out of geoffrey hinton's group	2022-08-12 11:36:08 -07:00
Phil Wang	be3bb868bf	add gradient checkpointing for all resnet blocks	2022-08-02 19:21:44 -07:00
Phil Wang	f22e8c8741	make open clip available for use with dalle2 pytorch	2022-07-30 09:02:31 -07:00
Phil Wang	87432e93ad	quick fix for linear attention	2022-07-29 13:17:12 -07:00
Phil Wang	d167378401	add cosine sim for self attention as well, as a setting	2022-07-29 12:48:20 -07:00
Phil Wang	2d67d5821e	change up epsilon in layernorm in the case of using fp16, thanks to @Veldrovive for figuring out this stabilizes training	2022-07-29 12:41:02 -07:00
Phil Wang	748c7fe7af	allow for cosine sim cross attention, modify linear attention in attempt to resolve issue on fp16	2022-07-29 11:12:18 -07:00
Phil Wang	80046334ad	make sure entire readme runs without errors	2022-07-28 10:17:43 -07:00
Phil Wang	36fb46a95e	fix readme and a small bug in DALLE2 class	2022-07-28 08:33:51 -07:00
Phil Wang	07abfcf45b	rescale values in linear attention to mitigate overflows in fp16 setting	2022-07-27 12:27:38 -07:00
Phil Wang	406e75043f	add upsample combiner feature for the unets	2022-07-26 10:46:04 -07:00
Phil Wang	62043acb2f	fix repaint	2022-07-24 15:29:06 -07:00
Aidan Dempster	4145474bab	Improved upsampler training (#181). Sampling is now possible without the first decoder unet. Non-training unets are deleted in the decoder trainer, since they are never used and it is harder to merge the models if they have keys in this state dict. Fixed a mistake where clip was not re-added after saving.	2022-07-19 19:07:50 -07:00
Phil Wang	291377bb9c	@jacobwjs reports dynamic thresholding works very well and 0.95 is a better value	2022-07-19 11:31:56 -07:00
Phil Wang	723bf0abba	complete inpainting ability using inpaint_image and inpaint_mask passed into sample function for decoder	2022-07-19 09:26:55 -07:00
Phil Wang	d88c7ba56c	fix a bug with ddim and predict x0 objective	2022-07-18 19:04:26 -07:00
Phil Wang	3676a8ce78	comments	2022-07-18 15:02:04 -07:00
Phil Wang	da8e99ada0	fix sample bug	2022-07-18 13:50:22 -07:00
Phil Wang	6afb886cf4	complete imagen-like noise level conditioning	2022-07-18 13:43:57 -07:00
Phil Wang	a2ee3fa3cc	offer way to turn off initial cross embed convolutional module, for debugging upsampler artifacts	2022-07-15 17:29:10 -07:00
Phil Wang	a58a370d75	takes care of a grad strides error at https://github.com/lucidrains/DALLE2-pytorch/issues/196 thanks to @YUHANG-Ma	2022-07-14 15:28:34 -07:00
Phil Wang	1662bbf226	protect against random cropping for base unet	2022-07-14 12:49:43 -07:00
Phil Wang	a34f60962a	let the neural network peek at the low resolution conditioning one last time before making prediction, for upsamplers	2022-07-14 10:27:04 -07:00
Phil Wang	0b40cbaa54	just always use nearest neighbor interpolation when resizing for low resolution conditioning, for https://github.com/lucidrains/DALLE2-pytorch/pull/181	2022-07-13 20:59:43 -07:00
Phil Wang	f141144a6d	allow for using classifier free guidance for some unets but not others, by passing in a tuple of cond_scale during sampling for decoder, just in case it is causing issues for upsamplers	2022-07-13 13:12:30 -07:00
Phil Wang	f988207718	hack around some inplace error, also make sure for openai clip text encoding, only tokens after eos_id are masked out	2022-07-13 12:56:02 -07:00
Phil Wang	cc0f7a935c	fix non pixel shuffle upsample	2022-07-13 10:16:02 -07:00
Phil Wang	95a512cb65	fix a potential bug with conditioning with blurred low resolution image, blur should be applied only 50% of the time	2022-07-13 10:11:49 -07:00
Phil Wang	972ee973bc	fix issue with ddim and normalization of lowres conditioning image	2022-07-13 09:48:40 -07:00
Phil Wang	79e2a3bc77	only use the stable layernorm for final output norm in transformer	2022-07-13 07:56:30 -07:00
Phil Wang	349aaca56f	add yet another transformer stability measure	2022-07-12 17:49:16 -07:00
Phil Wang	3ee3c56d2a	add learned padding tokens, same strategy as dalle1, for diffusion prior, and get rid of masking in causal transformer	2022-07-12 17:33:14 -07:00
Phil Wang	775abc4df6	add setting to attend to all text encodings regardless of padding, for diffusion prior	2022-07-12 17:08:12 -07:00
Phil Wang	11b1d533a0	make sure text encodings being passed in have the correct batch dimension	2022-07-12 16:00:19 -07:00
Phil Wang	e76e89f9eb	remove text masking altogether in favor of deriving from text encodings (padded text encodings must be pad value of 0.)	2022-07-12 15:40:31 -07:00
Phil Wang	bb3ff0ac67	protect against bad text mask being passed into decoder	2022-07-12 15:33:13 -07:00
Phil Wang	1ec4dbe64f	one more fix for text mask, if the length of the text encoding exceeds max_text_len, add an assert for better error msg	2022-07-12 15:01:46 -07:00
Phil Wang	e0835acca9	generate text mask within the unet and diffusion prior itself from the text encodings, if not given	2022-07-12 12:54:59 -07:00
Phil Wang	1d9ef99288	add PixelShuffleUpsample thanks to @MalumaDev and @marunine for running the experiment and verifying absence of checkerboard artifacts	2022-07-11 16:07:23 -07:00

264 Commits