Commit Graph

  • 93ba019069 product management Phil Wang 2022-05-05 07:39:51 -07:00
  • 8518684ae9 does not make much sense, as researchers may want to try predicting noise with the diffusion prior instead of predicting x0 0.0.105 Phil Wang 2022-05-05 07:37:00 -07:00
  • 1d5dc08810 take @crowsonkb 's suggestion at https://github.com/lucidrains/DALLE2-pytorch/issues/60#issue-1226116132 0.0.104 Phil Wang 2022-05-05 07:28:53 -07:00
  • d8d8b6caf1 dataloaders for decoder training, from @Veldrovive 0.0.102 Phil Wang 2022-05-05 07:09:45 -07:00
  • 15acc03bd4 Add a dataloader for training the decoder (#57) Aidan Dempster 2022-05-05 10:08:45 -04:00
  • 896f19786d remove convnext blocks, they are ill-suited for generative work, validated by early experimental results at https://github.com/lucidrains/video-diffusion-pytorch 0.0.101 Phil Wang 2022-05-05 07:07:21 -07:00
  • aec5575d09 take a bet on resize right, given Katherine is using it 0.0.100 Phil Wang 2022-05-04 19:26:45 -07:00
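The commit above swaps plain interpolation for Katherine Crowson's resize-right library, which resamples with proper anti-aliasing. A minimal sketch of how it might be used to downsample images for a cascade, assuming `pip install resize-right` (the 0.5 scale factor is illustrative, not the repository's setting):

```python
# minimal sketch of the resize-right API; the scale factor is illustrative
import torch
from resize_right import resize

images = torch.randn(4, 3, 256, 256)         # stand-in batch of images
low_res = resize(images, scale_factors=0.5)  # anti-aliased 128x128 downsample
```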
  • 9773f10d6c use inference mode whenever possible, cleanup 0.0.99 Phil Wang 2022-05-04 15:24:57 -07:00
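`torch.inference_mode` goes further than `torch.no_grad`: it disables autograd tracking and the view/version-counter bookkeeping entirely, making sampling cheaper. A minimal sketch of the pattern referenced above (the model is a stand-in):

```python
import torch

model = torch.nn.Linear(8, 8)  # stand-in for a real unet / sampler

@torch.inference_mode()  # no autograd graph is built inside
def sample(x):
    return model(x)

out = sample(torch.randn(1, 8))

# equivalently, as a context manager around any sampling loop
with torch.inference_mode():
    out = model(torch.randn(1, 8))
```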
  • a6bf8ddef6 advertise laion Phil Wang 2022-05-04 15:04:05 -07:00
  • 86e692d24f fix random crop probability 0.0.98 Phil Wang 2022-05-04 11:52:24 -07:00
  • 97b751209f allow for last unet in the cascade to be trained on crops, if it is convolution-only 0.0.97 Phil Wang 2022-05-04 11:48:41 -07:00
  • 74103fd8d6 product management Phil Wang 2022-05-04 11:20:50 -07:00
  • 1992d25cad project management 0.0.96 Phil Wang 2022-05-04 11:18:54 -07:00
  • 5b619c2fd5 make sure some hyperparameters for the unet blocks are configurable Phil Wang 2022-05-04 11:18:32 -07:00
  • 9359ad2e91 0.0.95 0.0.95 Phil Wang 2022-05-04 10:53:05 -07:00
  • 9ff228188b offer old resnet blocks, from the original DDPM paper, just in case convnexts are unsuitable for generative work Phil Wang 2022-05-04 10:52:47 -07:00
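For reference, the "old resnet blocks" in question follow the DDPM recipe of groupnorm, SiLU, then a 3x3 convolution, with a residual shortcut. A minimal sketch, with time conditioning omitted and illustrative dimensions:

```python
import torch
from torch import nn

class ResnetBlock(nn.Module):
    # minimal sketch of a DDPM-style resnet block (groupnorm -> SiLU -> conv)
    def __init__(self, dim_in, dim_out, groups=8):
        super().__init__()
        self.block1 = nn.Sequential(nn.GroupNorm(groups, dim_in), nn.SiLU(), nn.Conv2d(dim_in, dim_out, 3, padding=1))
        self.block2 = nn.Sequential(nn.GroupNorm(groups, dim_out), nn.SiLU(), nn.Conv2d(dim_out, dim_out, 3, padding=1))
        self.res_conv = nn.Conv2d(dim_in, dim_out, 1) if dim_in != dim_out else nn.Identity()

    def forward(self, x):
        return self.block2(self.block1(x)) + self.res_conv(x)

out = ResnetBlock(32, 64)(torch.randn(1, 32, 16, 16))
```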
  • 2d9963d30e Reporting metrics - Cosine similarity. (#55) Kumar R 2022-05-04 20:34:36 +05:30
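Cosine similarity between the prior's predicted image embedding and the ground-truth CLIP image embedding is a natural metric here, since CLIP embeddings are compared on the unit hypersphere. A hedged sketch of such a metric (the tensors are stand-ins for real embeddings):

```python
import torch
import torch.nn.functional as F

pred_image_embed = torch.randn(4, 512)    # stand-in for the prior's prediction
target_image_embed = torch.randn(4, 512)  # stand-in for CLIP's image embedding

sim = F.cosine_similarity(pred_image_embed, target_image_embed, dim=-1)
print(f'average cosine similarity: {sim.mean().item():.4f}')
```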
  • 58d9b422f3 0.0.94 0.0.94 Phil Wang 2022-05-04 07:42:33 -07:00
  • 44b319cb57 add missing import (#56) Ray Bell 2022-05-04 10:42:20 -04:00
  • c30f380689 final reminder Phil Wang 2022-05-03 08:18:53 -07:00
  • e4e884bb8b keep all doors open Phil Wang 2022-05-03 08:17:02 -07:00
  • 803ad9c17d product management again Phil Wang 2022-05-03 08:15:25 -07:00
  • a88dd6a9c0 todo Phil Wang 2022-05-03 08:09:02 -07:00
  • 72c16b496e Update train_diffusion_prior.py (#53) Kumar R 2022-05-03 11:14:57 +05:30
  • 81d83dd7f2 defaults align with paper (#52) z 2022-05-02 13:52:11 -07:00
  • fa66f7e1e9 todo Phil Wang 2022-05-02 12:57:15 -07:00
  • aa8d135245 allow laion to experiment with normformer in diffusion prior Phil Wang 2022-05-02 11:35:00 -07:00
  • 70282de23b add ability to turn on normformer settings, given @borisdayma reported good results and some personal anecdata 0.0.93 Phil Wang 2022-05-02 11:33:15 -07:00
  • 83f761847e todo Phil Wang 2022-05-02 10:52:39 -07:00
  • 11469dc0c6 makes more sense to keep this as True as default, for stability 0.0.92 Phil Wang 2022-05-02 10:50:55 -07:00
  • 2d25c89f35 Fix passing of l2norm_output to DiffusionPriorNetwork (#51) Romain Beaumont 2022-05-02 19:48:16 +02:00
  • 3fe96c208a add ability to train diffusion prior with l2norm on output image embed Phil Wang 2022-05-02 09:53:20 -07:00
  • 0fc6c9cdf3 provide option to l2norm the output of the diffusion prior 0.0.91 Phil Wang 2022-05-02 09:41:03 -07:00
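L2-normalizing the prior's output keeps the predicted image embedding on the same unit hypersphere that CLIP itself produces. A minimal sketch of the idea:

```python
import torch
import torch.nn.functional as F

def l2norm(t):
    # project embeddings onto the unit hypersphere, as CLIP does
    return F.normalize(t, dim=-1)

pred = l2norm(torch.randn(4, 512))
assert torch.allclose(pred.norm(dim=-1), torch.ones(4), atol=1e-5)
```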
  • 7ee0ecc388 mixed precision for training diffusion prior + save optimizer and scaler states Phil Wang 2022-05-02 09:31:04 -07:00
  • 1924c7cc3d fix issue with mixed precision and gradient clipping 0.0.90 Phil Wang 2022-05-02 09:20:19 -07:00
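This class of bug usually comes down to ordering: with `torch.cuda.amp`, gradients must be unscaled before `clip_grad_norm_`, otherwise the clip threshold is applied to scaled gradients. A sketch of the correct ordering, assuming a CUDA device (the model and hyperparameters are stand-ins):

```python
import torch

model = torch.nn.Linear(8, 8).cuda()
optimizer = torch.optim.AdamW(model.parameters())
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(4, 8, device='cuda')
with torch.cuda.amp.autocast():
    loss = model(x).mean()

scaler.scale(loss).backward()
scaler.unscale_(optimizer)  # unscale BEFORE clipping
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
scaler.step(optimizer)
scaler.update()
optimizer.zero_grad()
```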
  • f7df3caaf3 address not calculating average eval / test loss when training diffusion prior https://github.com/lucidrains/DALLE2-pytorch/issues/49 Phil Wang 2022-05-02 08:51:41 -07:00
  • fc954ee788 fix calculation of adaptive weight for vit-vqgan, thanks to @CiaoHe 0.0.89 Phil Wang 2022-05-02 07:57:28 -07:00
  • c1db2753f5 todo Phil Wang 2022-05-01 18:02:30 -07:00
  • ad87bfe28f switch to using linear attention for the sparse attention layers within unet, given success in GAN projects 0.0.88 Phil Wang 2022-05-01 17:59:03 -07:00
  • 76c767b1ce update deps, commit to using webdatasets, per @rom1504 consultation Phil Wang 2022-05-01 12:22:15 -07:00
  • d991b8c39c just clip the diffusion prior network parameters Phil Wang 2022-05-01 12:01:01 -07:00
  • 902693e271 todo Phil Wang 2022-05-01 11:57:08 -07:00
  • 35cd63982d add gradient clipping, make sure weight decay is configurable, make sure learning rate is actually passed into get_optimizer, make sure model is set to training mode at beginning of each epoch Phil Wang 2022-05-01 11:55:38 -07:00
  • 53ce6dfdf6 All changes implemented, current run happening. Link to wandb run in comments. (#43) Kumar R 2022-05-02 00:16:59 +05:30
  • ad8d7a368b product management Phil Wang 2022-05-01 11:26:21 -07:00
  • b8cf1e5c20 more attention 0.0.87 Phil Wang 2022-05-01 11:00:26 -07:00
  • 94aaa08d97 product management Phil Wang 2022-05-01 09:43:10 -07:00
  • 8b9bbec7d1 project management 0.0.86 Phil Wang 2022-05-01 09:32:57 -07:00
  • 1bb9fc9829 add convnext backbone for vqgan-vae, still need to fix groupnorms in the resnet encoder/decoder Phil Wang 2022-05-01 09:32:24 -07:00
  • 5e421bd5bb let researchers do the hyperparameter search 0.0.85 Phil Wang 2022-05-01 08:46:21 -07:00
  • 67fcab1122 add MLP based time conditioning to all convnexts, in addition to cross attention. also add an initial convolution, given convnext's first conv is depthwise 0.0.84 Phil Wang 2022-05-01 08:41:02 -07:00
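MLP-based time conditioning typically maps a sinusoidal embedding of the timestep through a small MLP to produce a per-block conditioning vector. A hedged sketch with illustrative dimensions:

```python
import math
import torch
from torch import nn

class SinusoidalPosEmb(nn.Module):
    # standard sinusoidal embedding of an integer timestep
    def __init__(self, dim):
        super().__init__()
        self.dim = dim

    def forward(self, t):
        half = self.dim // 2
        freqs = torch.exp(-math.log(10000) * torch.arange(half, device=t.device) / (half - 1))
        args = t[:, None].float() * freqs[None]
        return torch.cat((args.sin(), args.cos()), dim=-1)

dim = 64  # illustrative
time_mlp = nn.Sequential(SinusoidalPosEmb(dim), nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim))
t_emb = time_mlp(torch.tensor([0, 250, 999]))  # one conditioning vector per timestep
```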
  • 5bfbccda22 port over vqgan vae trainer Phil Wang 2022-05-01 08:09:15 -07:00
  • 989275ff59 product management Phil Wang 2022-04-30 16:57:56 -07:00
  • 56408f4a40 project management Phil Wang 2022-04-30 16:57:02 -07:00
  • d1a697ac23 allows one to shortcut sampling at a specific unet number, if one were to be training in stages Phil Wang 2022-04-30 16:05:13 -07:00
  • 8260fc933a allows one to shortcut sampling at a specific unet number, if one were to be training in stages 0.0.82 Phil Wang 2022-04-30 15:10:25 -07:00
  • ebe01749ed DecoderTrainer sample method uses the exponentially moving averaged unets 0.0.81 Phil Wang 2022-04-30 14:55:34 -07:00
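Sampling from exponentially moving averaged weights rather than the online weights is standard practice for diffusion models. A minimal sketch of the EMA update behind such a sample method (the decay value is illustrative):

```python
import copy
import torch

def ema_update(online, ema, decay=0.995):
    # blend the EMA copy toward the online weights: ema += (1 - decay) * (online - ema)
    with torch.no_grad():
        for p_online, p_ema in zip(online.parameters(), ema.parameters()):
            p_ema.lerp_(p_online, 1 - decay)

online_unet = torch.nn.Linear(8, 8)       # stand-in for a unet
ema_unet = copy.deepcopy(online_unet)     # sample from this copy, not the online model
ema_update(online_unet, ema_unet)
```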
  • 63195cc2cb allow for division of loss prior to scaling, for gradient accumulation purposes 0.0.80 Phil Wang 2022-04-30 12:56:47 -07:00
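Dividing the loss by the number of accumulation steps before backward makes the accumulated gradient equal the average over the effective batch; with amp, this division would happen before `scaler.scale(...)`. A minimal sketch without amp:

```python
import torch

model = torch.nn.Linear(8, 1)
optimizer = torch.optim.AdamW(model.parameters())
grad_accum_steps = 4  # illustrative

for step, batch in enumerate(torch.randn(16, 4, 8)):  # 16 stand-in micro-batches
    loss = model(batch).mean()
    (loss / grad_accum_steps).backward()  # divide before backward / scaling
    if (step + 1) % grad_accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```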
  • a2ef69af66 take care of mixed precision, and make gradient accumulation do-able externally 0.0.79 Phil Wang 2022-04-30 12:27:24 -07:00
  • 5fff22834e be able to finely customize learning parameters for each unet, take care of gradient clipping 0.0.78 Phil Wang 2022-04-30 11:56:05 -07:00
  • a9421f49ec simplify Decoder training for the public 0.0.77 Phil Wang 2022-04-30 11:45:18 -07:00
  • 77fa34eae9 fix all clipping / clamping issues 0.0.76 Phil Wang 2022-04-30 10:08:24 -07:00
  • 1c1e508369 fix all issues with text encodings conditioning in the decoder, using null padding tokens technique from dalle v1 0.0.75 Phil Wang 2022-04-30 09:13:34 -07:00
  • f19c99ecb0 fix decoder needing separate conditional dropping probabilities for image embeddings and text encodings, thanks to @xiankgx ! 0.0.74 Phil Wang 2022-04-30 08:47:56 -07:00
  • 721a444686 Merge pull request #37 from ProGamerGov/patch-1 Phil Wang 2022-04-30 08:19:07 -07:00
  • 63450b466d Fix spelling and grammatical errors ProGamerGov 2022-04-30 09:18:13 -06:00
  • 20e7eb5a9b cleanup Phil Wang 2022-04-30 07:22:57 -07:00
  • e2f9615afa use @clip-anytorch , thanks to @rom1504 0.0.73 Phil Wang 2022-04-30 06:40:54 -07:00
  • 0d1c07c803 fix a bug with classifier free guidance, thanks to @xiankgx again! 0.0.72 Phil Wang 2022-04-30 06:34:18 -07:00
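For context, classifier-free guidance runs the network both with and without conditioning and extrapolates from the unconditional prediction toward the conditional one. A sketch of the core formula (the scale of 3.0 is illustrative):

```python
import torch

def guided_pred(pred_cond, pred_uncond, cond_scale=3.0):
    # classifier-free guidance: extrapolate from unconditional toward conditional
    return pred_uncond + (pred_cond - pred_uncond) * cond_scale

noise = guided_pred(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64))
```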
  • a389f81138 todo 0.0.71 Phil Wang 2022-04-29 15:40:51 -07:00
  • 0283556608 fix example in readme, since api changed Phil Wang 2022-04-29 13:40:55 -07:00
  • 5063d192b6 now completely OpenAI CLIP compatible for training Phil Wang 2022-04-29 13:05:01 -07:00
  • 846162ef3e just take care of the logic for AdamW and transformers 0.0.70 Phil Wang 2022-04-29 11:43:26 -07:00
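The usual "AdamW and transformers" logic is to exclude biases and normalization parameters from weight decay. A hedged sketch of such a helper (this `get_optimizer` is illustrative, not the repository's exact implementation):

```python
import torch

def get_optimizer(model, lr=3e-4, wd=1e-2):
    # illustrative helper: no weight decay for biases / norm parameters (ndim < 2)
    decay, no_decay = [], []
    for param in model.parameters():
        if not param.requires_grad:
            continue
        (no_decay if param.ndim < 2 else decay).append(param)
    return torch.optim.AdamW([
        {'params': decay, 'weight_decay': wd},
        {'params': no_decay, 'weight_decay': 0.0},
    ], lr=lr)

opt = get_optimizer(torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.LayerNorm(8)))
```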
  • 39d3659ad9 now completely OpenAI CLIP compatible for training 0.0.67 Phil Wang 2022-04-29 11:26:24 -07:00
  • f4a54e475e add some training fns Phil Wang 2022-04-29 09:44:55 -07:00
  • fb662a62f3 fix another bug thanks to @xiankgx 0.0.65 Phil Wang 2022-04-29 07:38:32 -07:00
  • 587c8c9b44 optimize for clarity Phil Wang 2022-04-28 21:59:13 -07:00
  • aa900213e7 force first unet in the cascade to be conditioned on image embeds 0.0.64 Phil Wang 2022-04-28 20:53:15 -07:00
  • cb26187450 vqgan-vae codebook dims should be 256 or smaller 0.0.63 Phil Wang 2022-04-28 08:59:03 -07:00
  • 625ce23f6b 🐛 0.0.62 Phil Wang 2022-04-28 07:21:18 -07:00
  • dbf4a281f1 make sure another CLIP can actually be passed in, as long as it is wrapped in an adapter extended from BaseClipAdapter 0.0.61 Phil Wang 2022-04-27 20:45:27 -07:00
  • 4ab527e779 some extra asserts for text encoding of diffusion prior and decoder 0.0.60 Phil Wang 2022-04-27 20:11:43 -07:00
  • d0cdeb3247 add ability for DALL-E2 to return PIL images with return_pil_images = True on forward, for those who have no clue about deep learning Phil Wang 2022-04-27 19:58:06 -07:00
  • 8c2015fd39 add ability for DALL-E2 to return PIL images with return_pil_images = True on forward, for those who have no clue about deep learning 0.0.59 Phil Wang 2022-04-27 19:57:27 -07:00
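A usage sketch of the flag, assuming `dalle2` is an already-constructed, trained DALLE2 instance (prior, decoder, and weights omitted here):

```python
# hypothetical usage; `dalle2` must be a constructed, trained DALLE2 instance
images = dalle2(['cute puppy chasing after a squirrel'], return_pil_images = True)
images[0].save('./puppy.png')  # PIL.Image objects can be saved directly
```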
  • 8c610aad9a only pass text encodings conditioning in diffusion prior if specified on initialization 0.0.58 Phil Wang 2022-04-27 19:48:16 -07:00
  • 6700381a37 prepare for ability to integrate other clips other than x-clip 0.0.57 Phil Wang 2022-04-27 19:34:56 -07:00
  • 20377f889a todo Phil Wang 2022-04-27 17:22:14 -07:00
  • 6edb1c5dd0 fix issue with ema class 0.0.56 Phil Wang 2022-04-27 16:40:02 -07:00
  • b093f92182 inform what is possible Phil Wang 2022-04-27 08:25:16 -07:00
  • fa3bb6ba5c make sure cpu-only still works 0.0.55 Phil Wang 2022-04-27 08:02:10 -07:00
  • 2705e7c9b0 claims for attention-based upsampling are unsupported by local experiments, removing Phil Wang 2022-04-27 07:51:04 -07:00
  • 77141882c8 complete vit-vqgan from https://arxiv.org/abs/2110.04627 0.0.54 Phil Wang 2022-04-26 17:20:47 -07:00
  • e024971dc3 complete vit-vqgan from https://arxiv.org/abs/2110.04627 0.0.53 Phil Wang 2022-04-26 17:04:18 -07:00
  • 4075d02139 never mind, it could be working, but only when i stabilize it with the feedforward layer + tanh as proposed in the vit-vqgan paper (which will be built into the repository later for the latent diffusion) Phil Wang 2022-04-26 12:43:31 -07:00
  • de0296106b be able to turn off warning for use of LazyLinear by passing in text embedding dimension for unet 0.0.52 Phil Wang 2022-04-26 11:42:46 -07:00
  • eafb136214 suppress a warning 0.0.51 Phil Wang 2022-04-26 11:40:45 -07:00
  • bfbcc283a3 DRY a tiny bit for gaussian diffusion related logic Phil Wang 2022-04-26 11:39:12 -07:00
  • c30544b73a no CLIP altogether for training DiffusionPrior 0.0.50 Phil Wang 2022-04-26 10:23:34 -07:00
  • bdf5e9c009 todo Phil Wang 2022-04-26 09:56:54 -07:00
  • 9878be760b have the researcher explicitly state upfront whether to condition with text encodings in the cascading ddpm decoder, have the DALLE-2 class take care of passing in text if the feature is turned on 0.0.49 Phil Wang 2022-04-26 09:47:09 -07:00