DALLE2-pytorch

mirror of https://github.com/lucidrains/DALLE2-pytorch.git synced 2025-12-19 09:44:19 +01:00

Author	SHA1	Message	Date
Romain Beaumont	2d25c89f35	Fix passing of l2norm_output to DiffusionPriorNetwork (#51)	2022-05-02 10:48:16 -07:00
Phil Wang	3fe96c208a	add ability to train diffusion prior with l2norm on output image embed	2022-05-02 09:53:20 -07:00
Phil Wang	0fc6c9cdf3	provide option to l2norm the output of the diffusion prior 0.0.91	2022-05-02 09:41:03 -07:00
Phil Wang	7ee0ecc388	mixed precision for training diffusion prior + save optimizer and scaler states	2022-05-02 09:31:04 -07:00
Phil Wang	1924c7cc3d	fix issue with mixed precision and gradient clipping 0.0.90	2022-05-02 09:20:19 -07:00
Phil Wang	f7df3caaf3	address not calculating average eval / test loss when training diffusion prior https://github.com/lucidrains/DALLE2-pytorch/issues/49	2022-05-02 08:51:41 -07:00
Phil Wang	fc954ee788	fix calculation of adaptive weight for vit-vqgan, thanks to @CiaoHe 0.0.89	2022-05-02 07:58:14 -07:00
Phil Wang	c1db2753f5	todo	2022-05-01 18:02:30 -07:00
Phil Wang	ad87bfe28f	switch to using linear attention for the sparse attention layers within unet, given success in GAN projects 0.0.88	2022-05-01 17:59:03 -07:00
Phil Wang	76c767b1ce	update deps, commit to using webdatasets, per @rom1504 consultation	2022-05-01 12:22:15 -07:00
Phil Wang	d991b8c39c	just clip the diffusion prior network parameters	2022-05-01 12:01:08 -07:00
Phil Wang	902693e271	todo	2022-05-01 11:57:08 -07:00
Phil Wang	35cd63982d	add gradient clipping, make sure weight decay is configurable, make sure learning rate is actually passed into get_optimizer, make sure model is set to training mode at beginning of each epoch	2022-05-01 11:55:38 -07:00
Kumar R	53ce6dfdf6	All changes implemented, current run happening. Link to wandb run in comments. (#43) * Train DiffusionPrior with pre-computed embeddings This is in response to https://github.com/lucidrains/DALLE2-pytorch/issues/29 - more metrics will get added.	2022-05-01 11:46:59 -07:00
Phil Wang	ad8d7a368b	product management	2022-05-01 11:26:21 -07:00
Phil Wang	b8cf1e5c20	more attention 0.0.87	2022-05-01 11:00:33 -07:00
Phil Wang	94aaa08d97	product management	2022-05-01 09:43:10 -07:00
Phil Wang	8b9bbec7d1	project management 0.0.86	2022-05-01 09:32:57 -07:00
Phil Wang	1bb9fc9829	add convnext backbone for vqgan-vae, still need to fix groupnorms in resnet encdec	2022-05-01 09:32:24 -07:00
Phil Wang	5e421bd5bb	let researchers do the hyperparameter search 0.0.85	2022-05-01 08:46:21 -07:00
Phil Wang	67fcab1122	add MLP based time conditioning to all convnexts, in addition to cross attention. also add an initial convolution, given convnext first depthwise conv 0.0.84	2022-05-01 08:41:02 -07:00
Phil Wang	5bfbccda22	port over vqgan vae trainer	2022-05-01 08:09:15 -07:00
Phil Wang	989275ff59	product management	2022-04-30 16:57:56 -07:00
Phil Wang	56408f4a40	project management	2022-04-30 16:57:02 -07:00
Phil Wang	d1a697ac23	allows one to shortcut sampling at a specific unet number, if one were to be training in stages	2022-04-30 16:05:13 -07:00
Phil Wang	ebe01749ed	DecoderTrainer sample method uses the exponentially moving averaged 0.0.81	2022-04-30 14:55:34 -07:00
Phil Wang	63195cc2cb	allow for division of loss prior to scaling, for gradient accumulation purposes 0.0.80	2022-04-30 12:56:47 -07:00
Phil Wang	a2ef69af66	take care of mixed precision, and make gradient accumulation do-able externally 0.0.79	2022-04-30 12:27:24 -07:00
Phil Wang	5fff22834e	be able to finely customize learning parameters for each unet, take care of gradient clipping 0.0.78	2022-04-30 11:56:05 -07:00
Phil Wang	a9421f49ec	simplify Decoder training for the public 0.0.77	2022-04-30 11:45:18 -07:00
Phil Wang	77fa34eae9	fix all clipping / clamping issues 0.0.76	2022-04-30 10:08:24 -07:00
Phil Wang	1c1e508369	fix all issues with text encodings conditioning in the decoder, using null padding tokens technique from dalle v1 0.0.75	2022-04-30 09:13:34 -07:00
Phil Wang	f19c99ecb0	fix decoder needing separate conditional dropping probabilities for image embeddings and text encodings, thanks to @xiankgx ! 0.0.74	2022-04-30 08:48:05 -07:00
Phil Wang	721a444686	Merge pull request #37 from ProGamerGov/patch-1 Fix spelling and grammatical errors	2022-04-30 08:19:07 -07:00
ProGamerGov	63450b466d	Fix spelling and grammatical errors	2022-04-30 09:18:13 -06:00
Phil Wang	20e7eb5a9b	cleanup	2022-04-30 07:22:57 -07:00
Phil Wang	e2f9615afa	use @clip-anytorch , thanks to @rom1504 0.0.73	2022-04-30 06:40:54 -07:00
Phil Wang	0d1c07c803	fix a bug with classifier free guidance, thanks to @xiankgx again! 0.0.72	2022-04-30 06:34:57 -07:00
Phil Wang	a389f81138	todo 0.0.71	2022-04-29 15:40:51 -07:00
Phil Wang	0283556608	fix example in readme, since api changed	2022-04-29 13:40:55 -07:00
Phil Wang	5063d192b6	now completely OpenAI CLIP compatible for training just take care of the logic for AdamW and transformers used namedtuples for clip adapter embedding outputs	2022-04-29 13:05:01 -07:00
Phil Wang	f4a54e475e	add some training fns	2022-04-29 09:44:55 -07:00
Phil Wang	fb662a62f3	fix another bug thanks to @xiankgx 0.0.65	2022-04-29 07:38:32 -07:00
Phil Wang	587c8c9b44	optimize for clarity	2022-04-28 21:59:13 -07:00
Phil Wang	aa900213e7	force first unet in the cascade to be conditioned on image embeds 0.0.64	2022-04-28 20:53:15 -07:00
Phil Wang	cb26187450	vqgan-vae codebook dims should be 256 or smaller 0.0.63	2022-04-28 08:59:03 -07:00
Phil Wang	625ce23f6b	🐛 0.0.62	2022-04-28 07:21:18 -07:00
Phil Wang	dbf4a281f1	make sure another CLIP can actually be passed in, as long as it is wrapped in an adapter extended from BaseClipAdapter 0.0.61	2022-04-27 20:45:27 -07:00
Phil Wang	4ab527e779	some extra asserts for text encoding of diffusion prior and decoder 0.0.60	2022-04-27 20:11:43 -07:00
Phil Wang	d0cdeb3247	add ability for DALL-E2 to return PIL images with `return_pil_images = True` on forward, for those who have no clue about deep learning	2022-04-27 19:58:06 -07:00

203 Commits