Commit Graph

191 Commits

Author SHA1 Message Date
Phil Wang
35cd63982d add gradient clipping, make sure weight decay is configurable, make sure learning rate is actually passed into get_optimizer, make sure model is set to training mode at beginning of each epoch 2022-05-01 11:55:38 -07:00
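(Illustration only, not the repository's code: a minimal PyTorch training-loop sketch of the four behaviors named in the commit above. The helper name get_optimizer is taken from the commit message; every other name and default value here is an assumption.)

```python
# Hypothetical sketch: configurable weight decay, learning rate actually
# reaching the optimizer, gradient clipping, and model.train() at the start
# of each epoch. Simplified stand-in, not the repository's trainer.
import torch
from torch.optim import AdamW


def get_optimizer(params, lr=3e-4, wd=1e-2):
    # learning rate and weight decay are both plumbed through, not hard-coded
    return AdamW(params, lr=lr, weight_decay=wd)


def train(model, dataloader, epochs=1, lr=3e-4, wd=1e-2, max_grad_norm=0.5):
    optimizer = get_optimizer(model.parameters(), lr=lr, wd=wd)

    for _ in range(epochs):
        model.train()  # make sure the model is in training mode every epoch

        for batch in dataloader:
            loss = model(batch)
            loss.backward()

            # clip gradients before the optimizer step
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)

            optimizer.step()
            optimizer.zero_grad()
```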
Kumar R
53ce6dfdf6 All changes implemented, current run happening. Link to wandb run in comments. (#43)
* Train DiffusionPrior with pre-computed embeddings

This is in response to https://github.com/lucidrains/DALLE2-pytorch/issues/29 - more metrics will get added.
2022-05-01 11:46:59 -07:00
Phil Wang
ad8d7a368b product management 2022-05-01 11:26:21 -07:00
Phil Wang
b8cf1e5c20 more attention 0.0.87 2022-05-01 11:00:33 -07:00
Phil Wang
94aaa08d97 product management 2022-05-01 09:43:10 -07:00
Phil Wang
8b9bbec7d1 project management 0.0.86 2022-05-01 09:32:57 -07:00
Phil Wang
1bb9fc9829 add convnext backbone for vqgan-vae, still need to fix groupnorms in resnet encdec 2022-05-01 09:32:24 -07:00
Phil Wang
5e421bd5bb let researchers do the hyperparameter search 0.0.85 2022-05-01 08:46:21 -07:00
Phil Wang
67fcab1122 add MLP based time conditioning to all convnexts, in addition to cross attention. also add an initial convolution, given convnext first depthwise conv 0.0.84 2022-05-01 08:41:02 -07:00
Phil Wang
5bfbccda22 port over vqgan vae trainer 2022-05-01 08:09:15 -07:00
Phil Wang
989275ff59 product management 2022-04-30 16:57:56 -07:00
Phil Wang
56408f4a40 project management 2022-04-30 16:57:02 -07:00
Phil Wang
d1a697ac23 allows one to shortcut sampling at a specific unet number, if one were to be training in stages 2022-04-30 16:05:13 -07:00
Phil Wang
ebe01749ed DecoderTrainer sample method uses the exponentially moving averaged unets 0.0.81 2022-04-30 14:55:34 -07:00
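(Illustration only: a simplified sketch of sampling from an exponential moving average of the weights, the idea the commit above refers to. The EMA class, its beta default, and the assumption that the wrapped model exposes a sample method are all hypothetical, not the repository's DecoderTrainer.)

```python
# Hypothetical EMA wrapper: keep a smoothed copy of the online model's weights
# and always sample from that copy. Not the repository's implementation.
import copy
import torch


class EMA:
    def __init__(self, model, beta=0.99):
        self.beta = beta
        self.online = model
        self.ema_model = copy.deepcopy(model)
        self.ema_model.requires_grad_(False)

    @torch.no_grad()
    def update(self):
        # ema <- beta * ema + (1 - beta) * online, parameter by parameter
        for p_ema, p in zip(self.ema_model.parameters(), self.online.parameters()):
            p_ema.lerp_(p, 1.0 - self.beta)

    @torch.no_grad()
    def sample(self, *args, **kwargs):
        # sampling always goes through the smoothed EMA copy,
        # assuming the wrapped model defines a .sample(...) method
        return self.ema_model.sample(*args, **kwargs)
```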
Phil Wang
63195cc2cb allow for division of loss prior to scaling, for gradient accumulation purposes 0.0.80 2022-04-30 12:56:47 -07:00
Phil Wang
a2ef69af66 take care of mixed precision, and make gradient accumulation do-able externally 0.0.79 2022-04-30 12:27:24 -07:00
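(Illustration only, covering the two commits above, 0.0.80 and 0.0.79: dividing the loss by the number of accumulation steps before any mixed-precision scaling, so that accumulated gradients average rather than sum. Assumed, simplified code, not the repository's trainer.)

```python
# Hypothetical sketch of gradient accumulation under mixed precision:
# the loss is divided by accum_steps *before* scaler.scale(...).backward().
import torch


def train_with_accumulation(model, optimizer, dataloader, accum_steps=4):
    scaler = torch.cuda.amp.GradScaler()
    optimizer.zero_grad()

    for step, batch in enumerate(dataloader):
        with torch.cuda.amp.autocast():
            loss = model(batch)

        # divide prior to scaling so accumulated gradients equal the mean
        scaler.scale(loss / accum_steps).backward()

        if (step + 1) % accum_steps == 0:
            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad()
```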
Phil Wang
5fff22834e be able to finely customize learning parameters for each unet, take care of gradient clipping 0.0.78 2022-04-30 11:56:05 -07:00
Phil Wang
a9421f49ec simplify Decoder training for the public 0.0.77 2022-04-30 11:45:18 -07:00
Phil Wang
77fa34eae9 fix all clipping / clamping issues 0.0.76 2022-04-30 10:08:24 -07:00
Phil Wang
1c1e508369 fix all issues with text encodings conditioning in the decoder, using null padding tokens technique from dalle v1 0.0.75 2022-04-30 09:13:34 -07:00
Phil Wang
f19c99ecb0 fix decoder needing separate conditional dropping probabilities for image embeddings and text encodings, thanks to @xiankgx ! 0.0.74 2022-04-30 08:48:05 -07:00
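(Illustration only of the fix described above: the decoder keeps separate classifier-free-guidance drop probabilities for the image embedding and the text encodings, dropped independently per sample. Function name, argument names, and default probabilities are all assumptions, not the repository's code.)

```python
# Hypothetical sketch: two independent conditional-dropout probabilities.
import torch


def drop_conditioning(image_embed, text_encodings,
                      image_cond_drop_prob=0.1,
                      text_cond_drop_prob=0.5):
    # image_embed: (batch, dim), text_encodings: (batch, seq, dim)
    batch = image_embed.shape[0]

    # per-sample keep masks, drawn independently for each conditioning signal
    keep_image = (torch.rand(batch, device=image_embed.device)
                  >= image_cond_drop_prob).float()
    keep_text = (torch.rand(batch, device=text_encodings.device)
                 >= text_cond_drop_prob).float()

    image_embed = image_embed * keep_image[:, None]
    text_encodings = text_encodings * keep_text[:, None, None]
    return image_embed, text_encodings
```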
Phil Wang
721a444686 Merge pull request #37 from ProGamerGov/patch-1
Fix spelling and grammatical errors
2022-04-30 08:19:07 -07:00
ProGamerGov
63450b466d Fix spelling and grammatical errors 2022-04-30 09:18:13 -06:00
Phil Wang
20e7eb5a9b cleanup 2022-04-30 07:22:57 -07:00
Phil Wang
e2f9615afa use @clip-anytorch , thanks to @rom1504 0.0.73 2022-04-30 06:40:54 -07:00
Phil Wang
0d1c07c803 fix a bug with classifier free guidance, thanks to @xiankgx again! 0.0.72 2022-04-30 06:34:57 -07:00
Phil Wang
a389f81138 todo 0.0.71 2022-04-29 15:40:51 -07:00
Phil Wang
0283556608 fix example in readme, since api changed 2022-04-29 13:40:55 -07:00
Phil Wang
5063d192b6 now completely OpenAI CLIP compatible for training
just take care of the logic for AdamW and transformers

used namedtuples for clip adapter embedding outputs
2022-04-29 13:05:01 -07:00
Phil Wang
f4a54e475e add some training fns 2022-04-29 09:44:55 -07:00
Phil Wang
fb662a62f3 fix another bug thanks to @xiankgx 0.0.65 2022-04-29 07:38:32 -07:00
Phil Wang
587c8c9b44 optimize for clarity 2022-04-28 21:59:13 -07:00
Phil Wang
aa900213e7 force first unet in the cascade to be conditioned on image embeds 0.0.64 2022-04-28 20:53:15 -07:00
Phil Wang
cb26187450 vqgan-vae codebook dims should be 256 or smaller 0.0.63 2022-04-28 08:59:03 -07:00
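(Illustration only of why a small codebook dimension helps: following the ViT-VQGAN paper, encoder features can be projected down to a low-dimensional space before the nearest-codebook-entry lookup, which tends to improve codebook usage. The class below and all its dimensions are assumptions, not the repository's VQ layer.)

```python
# Hypothetical low-dimensional vector quantizer with straight-through gradients.
import torch
from torch import nn


class LowDimVectorQuantizer(nn.Module):
    def __init__(self, enc_dim=512, codebook_dim=256, codebook_size=8192):
        super().__init__()
        self.project_in = nn.Linear(enc_dim, codebook_dim)   # down-project before lookup
        self.project_out = nn.Linear(codebook_dim, enc_dim)  # back up for the decoder
        self.codebook = nn.Embedding(codebook_size, codebook_dim)

    def forward(self, x):  # x: (batch, tokens, enc_dim)
        z = self.project_in(x)
        # nearest codebook entry by euclidean distance, in the low-dim space
        codes = self.codebook.weight[None].expand(z.shape[0], -1, -1)
        indices = torch.cdist(z, codes).argmin(dim=-1)
        quantized = self.codebook(indices)
        # straight-through estimator so gradients flow back to the encoder
        quantized = z + (quantized - z).detach()
        return self.project_out(quantized), indices
```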
Phil Wang
625ce23f6b 🐛 0.0.62 2022-04-28 07:21:18 -07:00
Phil Wang
dbf4a281f1 make sure another CLIP can actually be passed in, as long as it is wrapped in an adapter extended from BaseClipAdapter 0.0.61 2022-04-27 20:45:27 -07:00
Phil Wang
4ab527e779 some extra asserts for text encoding of diffusion prior and decoder 0.0.60 2022-04-27 20:11:43 -07:00
Phil Wang
d0cdeb3247 add ability for DALL-E2 to return PIL images with return_pil_images = True on forward, for those who have no clue about deep learning 2022-04-27 19:58:06 -07:00
Phil Wang
8c610aad9a only pass text encodings conditioning in diffusion prior if specified on initialization 0.0.58 2022-04-27 19:48:16 -07:00
Phil Wang
6700381a37 prepare for ability to integrate other clips other than x-clip 0.0.57 2022-04-27 19:35:05 -07:00
Phil Wang
20377f889a todo 2022-04-27 17:22:14 -07:00
Phil Wang
6edb1c5dd0 fix issue with ema class 0.0.56 2022-04-27 16:40:02 -07:00
Phil Wang
b093f92182 inform what is possible 2022-04-27 08:25:16 -07:00
Phil Wang
fa3bb6ba5c make sure cpu-only still works 0.0.55 2022-04-27 08:02:10 -07:00
Phil Wang
2705e7c9b0 attention-based upsampling claims unsupported by local experiments, removing 2022-04-27 07:51:04 -07:00
Phil Wang
77141882c8 complete vit-vqgan from https://arxiv.org/abs/2110.04627 0.0.54 2022-04-26 17:20:47 -07:00
Phil Wang
4075d02139 nevermind, it could be working, but only when i stabilize it with the feedforward layer + tanh as proposed in vit-vqgan paper (which will be built into the repository later for the latent diffusion) 2022-04-26 12:43:31 -07:00
Phil Wang
de0296106b be able to turn off warning for use of LazyLinear by passing in text embedding dimension for unet 0.0.52 2022-04-26 11:42:46 -07:00
Phil Wang
eafb136214 suppress a warning 0.0.51 2022-04-26 11:40:45 -07:00
Phil Wang
bfbcc283a3 DRY a tiny bit for gaussian diffusion related logic 2022-04-26 11:39:12 -07:00