Phil Wang
b494ed81d4
take care of the backward pass within the trainer classes for the diffusion prior and decoder, readying gradient accumulation as well (still unsure whether the loss should be backpropagated within the autocast block)
2022-05-14 15:49:24 -07:00
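A minimal sketch of the pattern this commit touches, assuming standard torch.cuda.amp usage (the function and variable names are illustrative, not the repo's API): the forward pass and loss run under autocast, while backward() is called outside the block, which is what the PyTorch AMP docs recommend.

```python
import torch
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()

def training_step(model, optimizer, batch):
    # illustrative trainer step: forward (and loss) in mixed precision...
    optimizer.zero_grad()
    with autocast():
        loss = model(batch)
    # ...but backward outside the autocast block, per the AMP recipe
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```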
Phil Wang
ff3474f05c
normalize conditioning tokens outside of cross attention blocks
2022-05-14 14:23:52 -07:00
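A sketch of what "outside of cross attention blocks" could look like: the LayerNorm on the conditioning tokens is applied once, up front, rather than inside every attention block. Module names are illustrative, not the repo's actual classes.

```python
import torch
from torch import nn

class CrossAttentionStack(nn.Module):
    def __init__(self, dim, context_dim, depth=4, heads=8):
        super().__init__()
        # one shared norm for the conditioning tokens, hoisted out of the blocks
        self.norm_context = nn.LayerNorm(context_dim)
        self.layers = nn.ModuleList([
            nn.MultiheadAttention(dim, heads, kdim=context_dim,
                                  vdim=context_dim, batch_first=True)
            for _ in range(depth)
        ])

    def forward(self, x, context):
        context = self.norm_context(context)  # normalized once for all layers
        for attn in self.layers:
            out, _ = attn(x, context, context)
            x = x + out
        return x
```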
Light-V
6f76652d11
fix typo in README.md (#85)
...
The default config for CLIP from OpenAI should be ViT-B/32
2022-05-11 13:38:16 -07:00
Phil Wang
908088cfea
wrap up cross embed layer feature
2022-05-10 12:19:34 -07:00
Phil Wang
8dc8a3de0d
product management
2022-05-10 11:51:38 -07:00
Phil Wang
35f89556ba
bring in the cross embed layer from the CrossFormer paper for the initial convolution in the unet
2022-05-10 11:50:38 -07:00
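For reference, a sketch of the CrossFormer-style cross embed layer the commit refers to: parallel convolutions with different kernel sizes (same stride) whose outputs are concatenated along the channel dimension. The channel split below is one plausible choice, not necessarily the repo's.

```python
import torch
from torch import nn

class CrossEmbedLayer(nn.Module):
    def __init__(self, dim_in, dim_out, kernel_sizes=(3, 7, 15), stride=1):
        super().__init__()
        # split the output channels across the kernel sizes
        dims = [dim_out // len(kernel_sizes)] * (len(kernel_sizes) - 1)
        dims.append(dim_out - sum(dims))
        # paddings keep every branch at the same spatial resolution
        self.convs = nn.ModuleList([
            nn.Conv2d(dim_in, d, k, stride=stride, padding=(k - stride) // 2)
            for d, k in zip(dims, kernel_sizes)
        ])

    def forward(self, x):
        return torch.cat([conv(x) for conv in self.convs], dim=1)
```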
Phil Wang
fc8fce38fb
make sure the cascading DDPM can be trained unconditionally, readying one-command CLI training for the public
2022-05-10 10:48:10 -07:00
Phil Wang
a1bfb03ba4
project management
2022-05-10 10:13:51 -07:00
Phil Wang
b1e7b5f6bb
make sure resnet groups in the unet are finely customizable
2022-05-10 10:12:50 -07:00
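The usual pattern for "finely customizable" per-stage hyperparameters is broadcasting a scalar to a per-resolution tuple; a sketch under that assumption (the helper name and values are illustrative):

```python
from torch import nn

def cast_tuple(val, length):
    # broadcast a single value to one entry per unet stage
    return val if isinstance(val, tuple) else (val,) * length

resnet_groups = cast_tuple(8, 4)               # -> (8, 8, 8, 8)
resnet_groups = cast_tuple((8, 8, 16, 16), 4)  # already per-stage, kept as-is

# e.g. one GroupNorm group count per stage (64 channels here for illustration)
norms = [nn.GroupNorm(groups, 64) for groups in resnet_groups]
```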
Phil Wang
64f7be1926
some cleanup
2022-05-09 16:50:21 -07:00
Kumar R
8647cb5e76
Val loss changes, with quite a few other changes; this replaces the earlier PR (https://github.com/lucidrains/DALLE2-pytorch/pull/67) (#77)
...
* Val_loss changes - not rebased with lucidrains' master
* Val Loss changes - now rebased with lucidrains' master
* train_diffusion_prior.py updates
* dalle2_pytorch.py updates
* __init__.py changes
* Update train_diffusion_prior.py and dalle2_pytorch.py (several follow-up revisions)
* Update README.md (several follow-up revisions)
2022-05-09 08:53:29 -07:00
Phil Wang
53c189e46a
give more surface area for attention in diffusion prior
2022-05-09 08:08:11 -07:00
Phil Wang
c87b84a259
todo
2022-05-07 09:21:08 -07:00
Phil Wang
8b05468653
todo
2022-05-07 08:33:45 -07:00
Piero Rolando
fd53fa17db
Fix a typo in README (#70)
...
Change "pyhon" for "python" (correct)
2022-05-06 16:53:36 -07:00
Phil Wang
09e9eaa5a6
project management
2022-05-06 09:00:22 -07:00
Phil Wang
e6d752cf4a
reprioritize
2022-05-06 08:55:26 -07:00
Phil Wang
0be1e0d64c
support CoCa, which seems to be better than CLIP (has an autoregressive text encoder) https://arxiv.org/abs/2205.01917
2022-05-06 08:27:12 -07:00
Phil Wang
98df1ba51e
add diffusion prior trainer, which automatically takes care of the exponential moving average (for training and sampling), as well as mixed precision and gradient clipping
2022-05-06 08:11:09 -07:00
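A sketch of what such a trainer automates, assuming the standard recipe for each piece (the class and argument names are illustrative, not the repo's API): an EMA copy used for sampling, autocast for mixed precision, and gradient clipping after unscaling.

```python
import copy
import torch
from torch.cuda.amp import autocast, GradScaler

class PriorTrainerSketch:
    def __init__(self, prior, lr=3e-4, ema_decay=0.999, max_grad_norm=0.5):
        self.prior = prior
        self.ema_prior = copy.deepcopy(prior)  # EMA weights, used at sample time
        self.ema_decay = ema_decay
        self.max_grad_norm = max_grad_norm
        self.opt = torch.optim.Adam(prior.parameters(), lr=lr)
        self.scaler = GradScaler()

    @torch.no_grad()
    def update_ema(self):
        for p_ema, p in zip(self.ema_prior.parameters(), self.prior.parameters()):
            p_ema.lerp_(p, 1.0 - self.ema_decay)  # ema = decay*ema + (1-decay)*p

    def step(self, *args, **kwargs):
        self.opt.zero_grad()
        with autocast():
            loss = self.prior(*args, **kwargs)
        self.scaler.scale(loss).backward()
        self.scaler.unscale_(self.opt)  # unscale before clipping
        torch.nn.utils.clip_grad_norm_(self.prior.parameters(), self.max_grad_norm)
        self.scaler.step(self.opt)
        self.scaler.update()
        self.update_ema()
        return loss.item()
```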
Phil Wang
79fabc4341
reorg readme
2022-05-05 07:54:12 -07:00
Kumar R
f7ef4bde38
Added some documentation for the diffusion prior in README.md (#62)
...
* Delete README.md
* Create README.md
* Update README.md
* Update README.md
2022-05-05 07:51:31 -07:00
Phil Wang
93ba019069
product management
2022-05-05 07:39:51 -07:00
Aidan Dempster
15acc03bd4
Add a dataloader for training the decoder (#57)
...
* Added dataloader and updated requirements
* Added option to set embedding shard width separately from webdataset shard length.
There must be a better way to do this.
* Changed embedding loader to read using fsspec
* Moved the loader into a more compatible location
* Removed unnecessary package
* Fixed typo (Embeding -> Embedding)
* Simplified example embedding finder code to remove unnecessary get_file_list function
* Added example usage of ImageEmbeddingDataset
* Changed the name of create_dataloader to be more verbose
Added a dataloaders __init__.py
2022-05-05 07:08:45 -07:00
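A hypothetical usage sketch reconstructed from the PR notes above; the import path, function name, and argument names may not match the repo exactly.

```python
# hypothetical: webdataset image shards paired with precomputed CLIP image
# embeddings read via fsspec, as described in the PR bullets above
from dalle2_pytorch.dataloaders import create_image_embedding_dataloader

dataloader = create_image_embedding_dataloader(
    tar_url="s3://bucket/images/{00000..00999}.tar",  # webdataset shards
    embeddings_url="s3://bucket/embeddings/",          # embedding shards
    shard_width=5,   # embedding shard width, separate from webdataset shard length
    batch_size=32,
    num_workers=4,
)

for images, image_embeddings in dataloader:
    ...  # train the decoder on (image, embedding) pairs
```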
Phil Wang
896f19786d
remove convnext blocks, as they are ill-suited for generative work, validated by early experimental results at https://github.com/lucidrains/video-diffusion-pytorch
2022-05-05 07:07:21 -07:00
Phil Wang
a6bf8ddef6
advertise laion
2022-05-04 15:04:05 -07:00
Phil Wang
97b751209f
allow the last unet in the cascade to be trained on crops, if it is convolution-only
2022-05-04 11:48:48 -07:00
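A sketch of the idea, assuming torchvision transforms (the helper name is illustrative): since a convolution-only unet has no fixed spatial dependency, the final unet of the cascade can train on random crops, with the low-resolution conditioning image cropped at the same coordinates so the pair stays aligned.

```python
from torchvision.transforms import RandomCrop
import torchvision.transforms.functional as TF

CROP_SIZE = 256  # illustrative

def crop_pair(hires_image, upsampled_lowres):
    # sample one set of crop coordinates and apply it to both the target
    # and its (already upsampled) low-resolution conditioning image
    i, j, h, w = RandomCrop.get_params(hires_image, (CROP_SIZE, CROP_SIZE))
    return TF.crop(hires_image, i, j, h, w), TF.crop(upsampled_lowres, i, j, h, w)
```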
Phil Wang
74103fd8d6
product management
2022-05-04 11:20:50 -07:00
Phil Wang
1992d25cad
project management
2022-05-04 11:18:54 -07:00
Phil Wang
9ff228188b
offer old resnet blocks, from the original DDPM paper, just in case convnexts are unsuitable for generative work
2022-05-04 10:52:58 -07:00
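For reference, a sketch of the classic DDPM-style resnet block (two groupnorm -> SiLU -> conv units plus a residual shortcut); time-embedding conditioning is omitted for brevity, and the class name is illustrative.

```python
import torch
from torch import nn

class DDPMResnetBlock(nn.Module):
    def __init__(self, dim, dim_out, groups=8):
        super().__init__()
        self.block1 = nn.Sequential(
            nn.GroupNorm(groups, dim), nn.SiLU(),
            nn.Conv2d(dim, dim_out, 3, padding=1))
        self.block2 = nn.Sequential(
            nn.GroupNorm(groups, dim_out), nn.SiLU(),
            nn.Conv2d(dim_out, dim_out, 3, padding=1))
        # 1x1 projection on the shortcut when the channel count changes
        self.res_conv = nn.Conv2d(dim, dim_out, 1) if dim != dim_out else nn.Identity()

    def forward(self, x):
        return self.block2(self.block1(x)) + self.res_conv(x)
```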
Phil Wang
c30f380689
final reminder
2022-05-03 08:18:53 -07:00
Phil Wang
e4e884bb8b
keep all doors open
2022-05-03 08:17:02 -07:00
Phil Wang
803ad9c17d
product management again
2022-05-03 08:15:25 -07:00
Phil Wang
a88dd6a9c0
todo
2022-05-03 08:09:02 -07:00
Phil Wang
fa66f7e1e9
todo
2022-05-02 12:57:15 -07:00
Phil Wang
70282de23b
add ability to turn on normformer settings, given that @borisdayma reported good results, plus some personal anecdata
2022-05-02 11:33:15 -07:00
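One of the NormFormer tweaks, sketched: an extra LayerNorm after the feedforward nonlinearity, on top of the usual pre-norm (NormFormer also adds post-attention norms and head scaling, omitted here; the class name is illustrative).

```python
import torch
from torch import nn

class NormformerFeedForward(nn.Module):
    def __init__(self, dim, mult=4):
        super().__init__()
        inner = dim * mult
        self.net = nn.Sequential(
            nn.LayerNorm(dim),    # standard pre-norm
            nn.Linear(dim, inner),
            nn.GELU(),
            nn.LayerNorm(inner),  # the added normformer norm
            nn.Linear(inner, dim),
        )

    def forward(self, x):
        return self.net(x)
```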
Phil Wang
83f761847e
todo
2022-05-02 10:52:39 -07:00
Phil Wang
c1db2753f5
todo
2022-05-01 18:02:30 -07:00
Phil Wang
ad87bfe28f
switch to using linear attention for the sparse attention layers within unet, given success in GAN projects
2022-05-01 17:59:03 -07:00
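A sketch of linear attention over feature maps in the style used across lucidrains' GAN projects (not necessarily this repo's exact implementation): softmax is applied to queries along the feature dimension and to keys along the sequence dimension, so a key-value summary is computed first and the cost is linear in the number of pixels.

```python
import torch
from torch import nn

class LinearAttention(nn.Module):
    def __init__(self, dim, heads=8, dim_head=32):
        super().__init__()
        self.heads, self.dim_head = heads, dim_head
        inner = heads * dim_head
        self.to_qkv = nn.Conv2d(dim, inner * 3, 1, bias=False)
        self.to_out = nn.Conv2d(inner, dim, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        qkv = self.to_qkv(x).reshape(b, self.heads, 3 * self.dim_head, h * w)
        q, k, v = qkv.chunk(3, dim=2)  # each (b, heads, dim_head, n)
        q = q.softmax(dim=2)           # over the feature dimension
        k = k.softmax(dim=-1)          # over the sequence dimension
        context = torch.einsum('bhdn,bhen->bhde', k, v)    # key-value summary
        out = torch.einsum('bhde,bhdn->bhen', context, q)  # linear in n
        return self.to_out(out.reshape(b, -1, h, w))
```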
Phil Wang
902693e271
todo
2022-05-01 11:57:08 -07:00
Phil Wang
ad8d7a368b
product management
2022-05-01 11:26:21 -07:00
Phil Wang
94aaa08d97
product management
2022-05-01 09:43:10 -07:00
Phil Wang
8b9bbec7d1
project management
2022-05-01 09:32:57 -07:00
Phil Wang
5bfbccda22
port over vqgan vae trainer
2022-05-01 08:09:15 -07:00
Phil Wang
989275ff59
product management
2022-04-30 16:57:56 -07:00
Phil Wang
56408f4a40
project management
2022-04-30 16:57:02 -07:00
Phil Wang
d1a697ac23
allow one to shortcut sampling at a specific unet number, if one is training in stages
2022-04-30 16:05:13 -07:00
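A hypothetical usage sketch; the keyword name below illustrates the shortcut this commit adds and is not a guaranteed match for the repo's signature.

```python
# stop the cascade early, e.g. while only the first unet has been trained
images = decoder.sample(image_embed=image_embed, stop_at_unet_number=1)
```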
Phil Wang
ebe01749ed
DecoderTrainer sample method uses the exponentially moving averaged weights
2022-04-30 14:55:34 -07:00
Phil Wang
a2ef69af66
take care of mixed precision, and make gradient accumulation doable externally
2022-04-30 12:27:24 -07:00
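A sketch of gradient accumulation done externally under mixed precision, assuming the standard torch.cuda.amp recipe (function and variable names are illustrative): the caller loops over micro-batches and only steps the optimizer once per accumulation window.

```python
import torch
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()
ACCUM = 4  # illustrative accumulation factor

def accumulated_step(model, optimizer, micro_batches):
    # micro_batches is an iterable of ACCUM chunks of one effective batch
    optimizer.zero_grad()
    for chunk in micro_batches:
        with autocast():
            loss = model(chunk) / ACCUM  # scale so gradients average out
        scaler.scale(loss).backward()    # gradients accumulate across chunks
    scaler.step(optimizer)               # one optimizer step per window
    scaler.update()
```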
Phil Wang
a9421f49ec
simplify Decoder training for the public
2022-04-30 11:45:18 -07:00
Phil Wang
f19c99ecb0
fix the decoder to use separate conditional dropping probabilities for image embeddings and text encodings, thanks to @xiankgx!
2022-04-30 08:48:05 -07:00
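A hypothetical usage sketch of the two independent classifier-free-guidance dropout probabilities this fix introduces; the constructor arguments shown are reconstructed from the commit message and may not match the repo's exact signature.

```python
from dalle2_pytorch import Unet, Decoder

unet = Unet(dim=128, image_embed_dim=512, cond_dim=128, dim_mults=(1, 2, 4, 8))

decoder = Decoder(
    unet=unet,
    image_size=256,
    timesteps=1000,
    image_cond_drop_prob=0.1,  # drop the image embedding 10% of the time
    text_cond_drop_prob=0.5,   # drop the text encodings more aggressively
)
```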