Phil Wang
99778e12de
trainer classes now take care of auto-casting numpy arrays to torch tensors, and of setting the correct device based on the model's parameter devices
2022-05-15 15:25:45 -07:00
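The gist of the change, as a hedged sketch (the helper name and trainer shape are illustrative, not the repo's actual API): infer the device from the wrapped model's parameters and coerce numpy inputs before the forward call.

```python
import numpy as np
import torch

def cast_inputs(model, *args):
    # infer the target device from the model's own parameters
    device = next(model.parameters()).device
    out = []
    for arg in args:
        if isinstance(arg, np.ndarray):
            arg = torch.from_numpy(arg)   # auto-cast numpy to torch
        if torch.is_tensor(arg):
            arg = arg.to(device)          # move onto the model's device
        out.append(arg)
    return out
```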
Phil Wang
7b7a62044a
use eval vs training mode to determine whether to call backprop on trainer forward
2022-05-15 14:20:59 -07:00
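A rough sketch of the behavior described here, assuming a hypothetical trainer wrapper: `.train()` / `.eval()` toggles `module.training`, which then decides whether the trainer's forward also runs the backward pass.

```python
import torch.nn as nn

class TrainerSketch(nn.Module):  # illustrative, not the repo's class
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, *args, **kwargs):
        loss = self.model(*args, **kwargs)
        # .train() / .eval() on the wrapped model flips this flag
        if self.model.training:
            loss.backward()
        return loss
```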
Phil Wang
68e7d2f241
make sure gradient accumulation feature works even if all arguments passed in are keyword arguments
2022-05-15 11:16:16 -07:00
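Illustrative only (helper name assumed): for gradient accumulation to split every input, keyword-argument tensors have to be chunked along the batch dimension just like positional ones.

```python
import torch

def split_inputs(split_size, *args, **kwargs):
    # split every tensor input, positional or keyword, along the batch dim
    split_args = [arg.split(split_size, dim=0) for arg in args]
    split_kwargs = {k: v.split(split_size, dim=0) for k, v in kwargs.items()}
    all_chunks = split_args + list(split_kwargs.values())
    for i in range(len(all_chunks[0])):
        chunked_args = tuple(chunks[i] for chunks in split_args)
        chunked_kwargs = {k: chunks[i] for k, chunks in split_kwargs.items()}
        yield chunked_args, chunked_kwargs
```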
Phil Wang
f7eee09d8b
0.2.30
2022-05-15 09:56:59 -07:00
Phil Wang
4ec6d0ba81
the backward pass is not recommended under the autocast context, per the pytorch docs
2022-05-14 18:26:19 -07:00
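The pattern the pytorch amp docs recommend, sketched with an assumed model, optimizer, and batch: run the forward under autocast, but exit the context before calling backward so gradients are produced in full precision.

```python
from torch.cuda.amp import GradScaler, autocast

def train_step(model, optimizer, scaler: GradScaler, data):
    optimizer.zero_grad()
    with autocast():
        loss = model(data)  # forward (and loss) under autocast
    # backward happens outside the autocast block
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss
```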
Phil Wang
aee92dba4a
simplify more
2022-05-14 17:16:46 -07:00
Phil Wang
b0cd5f24b6
take care of gradient accumulation automatically for researchers, by passing in a max_batch_size on the decoder or diffusion prior trainer forward
2022-05-14 17:04:09 -07:00
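A hypothetical sketch of the interface described (class name and argument handling are illustrative): forward accepts a max_batch_size, splits the batch into sub-batches, and accumulates scaled gradients.

```python
import torch.nn as nn

class DecoderTrainerSketch(nn.Module):  # illustrative only
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x, max_batch_size=None, **kwargs):
        if max_batch_size is None:
            loss = self.model(x, **kwargs)
            loss.backward()
            return loss

        chunks = x.split(max_batch_size, dim=0)
        total = 0.
        for chunk in chunks:
            # scale each sub-batch loss so the accumulated gradient
            # matches a single full-batch backward
            loss = self.model(chunk, **kwargs) / len(chunks)
            loss.backward()
            total += loss.item()
        return total
```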
Phil Wang
b494ed81d4
take care of backwards within trainer classes for diffusion prior and decoder, readying to take care of gradient accumulation as well (also unsure whether the loss should be backpropagated within the autocast block)
2022-05-14 15:49:24 -07:00
Phil Wang
ff3474f05c
normalize conditioning tokens outside of cross attention blocks
2022-05-14 14:23:52 -07:00
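A minimal sketch of the idea (module names assumed): layer-norm the conditioning tokens once, outside the cross attention blocks, instead of re-normalizing inside each block.

```python
import torch.nn as nn

class UnetSketch(nn.Module):  # hypothetical stand-in
    def __init__(self, cond_dim):
        super().__init__()
        self.norm_cond = nn.LayerNorm(cond_dim)

    def forward(self, x, cond_tokens):
        # normalize once, outside the cross attention blocks
        cond_tokens = self.norm_cond(cond_tokens)
        # ... cond_tokens then feed every cross attention block ...
        return x
```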
Phil Wang
d5293f19f1
line up with paper
2022-05-14 13:57:00 -07:00
Phil Wang
e697183849
be able to customize adam eps
2022-05-14 13:55:04 -07:00
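What exposing the epsilon looks like in practice (function name illustrative): eps guards the denominator of Adam's update, and raising it can help stability, particularly under mixed precision.

```python
from torch.optim import Adam

def get_optimizer(params, lr=1e-4, eps=1e-8, **kwargs):
    # eps is now a pass-through hyperparameter instead of a fixed default
    return Adam(params, lr=lr, eps=eps, **kwargs)
```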
Phil Wang
591d37e266
lower default initial learning rate to what Jonathan Ho had in his original repo
2022-05-14 13:22:43 -07:00
Phil Wang
d1f02e8f49
always use sandwich norm for attention layer
2022-05-14 12:13:41 -07:00
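A sketch of sandwich norm, assuming a generic attention module: layernorm both before and after the attention branch, with the residual added last.

```python
import torch.nn as nn

class SandwichAttention(nn.Module):  # illustrative wrapper
    def __init__(self, dim, attn: nn.Module):
        super().__init__()
        self.pre_norm = nn.LayerNorm(dim)
        self.post_norm = nn.LayerNorm(dim)
        self.attn = attn

    def forward(self, x):
        # norm going in, norm coming out, residual added after both
        out = self.attn(self.pre_norm(x))
        return self.post_norm(out) + x
```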
Phil Wang
9faab59b23
use post-attn-branch layernorm in an attempt to stabilize cross attention conditioning in decoder
2022-05-14 11:58:09 -07:00
Phil Wang
5d27029e98
make sure lowres conditioning image is properly normalized to -1 to 1 for cascading ddpm
2022-05-14 01:23:54 -07:00
Phil Wang
3115fa17b3
fix everything around automatically normalizing images to -1 to 1 for ddpm training
2022-05-14 01:17:11 -07:00
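The usual ddpm convention behind both of these normalization commits: images arrive in [0, 1] and are mapped to [-1, 1] for training, then mapped back for viewing or for feeding CLIP.

```python
def normalize_neg_one_to_one(img):
    # [0, 1] -> [-1, 1], the range the ddpm is trained in
    return img * 2 - 1

def unnormalize_zero_to_one(normed_img):
    # [-1, 1] -> [0, 1], for display or for CLIP preprocessing
    return (normed_img + 1) * 0.5
```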
Phil Wang
124d8577c8
move the inverse normalization function, called before image embeddings are derived from clip, into the diffusion prior and decoder classes
2022-05-14 00:37:52 -07:00
Phil Wang
2277b47ffd
make sure learned variance can work for any number of unets in the decoder; defaults to the first unet, which the paper suggests was used
2022-05-12 14:18:15 -07:00
Phil Wang
924455d97d
align the ema model device back after sampling from the cascading ddpm in the decoder
2022-05-11 19:56:54 -07:00
Phil Wang
6021945fc8
default to l2 loss
2022-05-11 19:24:51 -07:00
Phil Wang
3dda2570ed
fix amp issue for https://github.com/lucidrains/DALLE2-pytorch/issues/82
2022-05-11 08:21:39 -07:00
Phil Wang
2f3c02dba8
numerical accuracy for noise schedule parameters
2022-05-10 15:28:46 -07:00
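One way the accuracy concern is typically addressed (the cosine schedule here is one example, not necessarily the exact code): compute the schedule in float64 so the cumulative products don't lose precision before being cast down.

```python
import math
import torch

def cosine_beta_schedule(timesteps, s=0.008):
    # float64 throughout, so alphas_cumprod stays accurate near 0 and 1
    steps = timesteps + 1
    x = torch.linspace(0, timesteps, steps, dtype=torch.float64)
    alphas_cumprod = torch.cos(((x / timesteps) + s) / (1 + s) * math.pi * 0.5) ** 2
    alphas_cumprod = alphas_cumprod / alphas_cumprod[0]
    betas = 1 - (alphas_cumprod[1:] / alphas_cumprod[:-1])
    return torch.clip(betas, 0, 0.999)
```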
Phil Wang
908088cfea
wrap up cross embed layer feature
2022-05-10 12:19:34 -07:00
Phil Wang
35f89556ba
bring in the cross embed layer from Crossformer paper for initial convolution in unet
2022-05-10 11:50:38 -07:00
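A sketch of the Crossformer idea (the channel-splitting scheme is assumed, not necessarily the repo's): run several convolutions with different kernel sizes but the same stride over the input, and concatenate their features along channels.

```python
import torch
import torch.nn as nn

class CrossEmbedLayer(nn.Module):
    def __init__(self, dim_in, dim_out, kernel_sizes=(3, 7, 15), stride=1):
        super().__init__()
        # divide the output channels across the kernel sizes,
        # remainder going to the last conv
        dim_each = dim_out // len(kernel_sizes)
        dims = [dim_each] * (len(kernel_sizes) - 1)
        dims.append(dim_out - sum(dims))
        self.convs = nn.ModuleList([
            nn.Conv2d(dim_in, d, k, stride=stride, padding=(k - stride) // 2)
            for d, k in zip(dims, kernel_sizes)
        ])

    def forward(self, x):
        # multi-scale features concatenated along the channel dim
        return torch.cat([conv(x) for conv in self.convs], dim=1)
```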
Phil Wang
2b55f753b9
fix new issue with github actions and auto pypi package uploading
2022-05-10 10:51:15 -07:00
Phil Wang
fc8fce38fb
make sure cascading DDPM can be trained unconditionally, to prepare for one-command CLI training for the public
2022-05-10 10:48:10 -07:00
Phil Wang
b1e7b5f6bb
make sure resnet groups in unet are finely customizable
2022-05-10 10:12:50 -07:00
Phil Wang
9b322ea634
patch
2022-05-09 19:46:19 -07:00
Phil Wang
ba64ea45cc
0.2.3
2022-05-09 16:50:31 -07:00
Phil Wang
db805e73e1
fix a bug with numerical stability in attention, sorry! 🐛
2022-05-09 16:23:37 -07:00
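The standard numerical-stability trick likely at play: subtract the row max from the attention logits before softmax, which leaves the result mathematically unchanged but avoids overflow in half precision.

```python
import torch

def stable_attention_scores(q, k, scale):
    sim = torch.einsum('b i d, b j d -> b i j', q, k) * scale
    # subtracting the row max leaves softmax unchanged but keeps logits bounded
    sim = sim - sim.amax(dim=-1, keepdim=True).detach()
    return sim.softmax(dim=-1)
```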
Phil Wang
e46eaec817
deal the diffusion prior problem yet another blow
2022-05-09 11:08:52 -07:00
Phil Wang
53c189e46a
give more surface area for attention in diffusion prior
2022-05-09 08:08:11 -07:00
Phil Wang
dde51fd362
revert restriction on classifier free guidance for diffusion prior, given @crowsonkb's advice
2022-05-07 20:55:41 -07:00
Phil Wang
4010aec033
turn off classifier free guidance if predicting x_start for diffusion prior
2022-05-07 09:38:17 -07:00
Phil Wang
830afd3c15
use sinusoidal time embeddings for the diffusion prior as well, for the continuous version
2022-05-07 08:32:43 -07:00
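A sketch of the transformer-style sinusoidal embedding commonly used for diffusion timesteps; because it is a pure function of t, it works for continuous times as well as integer steps.

```python
import math
import torch
import torch.nn as nn

class SinusoidalPosEmb(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.dim = dim

    def forward(self, t):
        # geometric ladder of frequencies from 1 down to 1/10000
        half_dim = self.dim // 2
        freqs = torch.exp(
            torch.arange(half_dim, device=t.device) * -(math.log(10000) / (half_dim - 1))
        )
        args = t[:, None] * freqs[None, :]
        return torch.cat((args.sin(), args.cos()), dim=-1)
```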
Phil Wang
8f93729d19
when in doubt, make it a hyperparameter
2022-05-07 07:52:17 -07:00
Phil Wang
85ed77d512
fix a potentially huge bug thanks to @CiaoHe https://github.com/lucidrains/DALLE2-pytorch/issues/71
2022-05-07 05:05:54 -07:00
Phil Wang
3676ef4d49
make sure vqgan-vae trainer supports mixed precision
2022-05-06 10:44:16 -07:00
Phil Wang
28e944f328
make sure openai clip adapter outputs l2normed embeddings
2022-05-06 10:12:03 -07:00
Phil Wang
14e63a3f67
also offer l2norm clamping in diffusion prior during training, if one is using the predict-x0 objective
2022-05-06 10:05:14 -07:00
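Both this commit and the clip adapter one above come down to the same helper: CLIP embeddings live on the unit hypersphere, so a predicted x0 can be clamped back onto it with an l2-normalize.

```python
import torch.nn.functional as F

def l2norm(t):
    # project onto the unit hypersphere, matching CLIP's embedding space
    return F.normalize(t, dim=-1)
```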
Phil Wang
ad20a14a4d
bring in rotary embeddings for diffusion prior causal transformer (the most powerful relative positional encoding, used in PaLM) - 0.1.0 because of breaking change
2022-05-06 08:45:30 -07:00
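A compact sketch of rotary embeddings (function names illustrative): rotate query/key feature pairs by position-dependent angles, so the attention dot product becomes a function of relative position.

```python
import torch

def rotary_freqs(seq_len, dim, device, theta=10000):
    # one rotation frequency per feature pair
    inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2, device=device).float() / dim))
    t = torch.arange(seq_len, device=device).float()
    freqs = torch.einsum('i,j->ij', t, inv_freq)
    return torch.cat((freqs, freqs), dim=-1)  # (seq_len, dim)

def rotate_half(x):
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary(pos, x):
    # x: (..., seq_len, dim); pos: (seq_len, dim)
    return x * pos.cos() + rotate_half(x) * pos.sin()
```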
Phil Wang
0be1e0d64c
support CoCa, which seems to be better than CLIP (has an autoregressive text encoder) https://arxiv.org/abs/2205.01917
2022-05-06 08:27:12 -07:00
Phil Wang
98df1ba51e
add diffusion prior trainer, which automatically takes care of the exponential moving average (training and sampling), as well as mixed precision and gradient clipping
2022-05-06 08:11:09 -07:00
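A minimal exponential-moving-average sketch, not the repo's actual implementation: keep a shadow copy of the model and lerp its weights toward the online weights each step; the shadow copy is the one used for sampling.

```python
import copy
import torch

class EMA:
    def __init__(self, model, beta=0.99):
        self.model = model
        self.ema_model = copy.deepcopy(model)
        self.beta = beta

    @torch.no_grad()
    def update(self):
        # ema = beta * ema + (1 - beta) * online
        for p, ema_p in zip(self.model.parameters(), self.ema_model.parameters()):
            ema_p.lerp_(p, 1 - self.beta)
```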
Phil Wang
878b555ef7
fix training with clip
2022-05-06 07:37:57 -07:00
Phil Wang
c76a964fd6
allow CLIP to be optional in Decoder, and allow DecoderTrainer to train off pre-encoded image embeddings
2022-05-05 08:11:01 -07:00
Phil Wang
8518684ae9
does not make much sense, as researchers may want to try predicting noise with the diffusion prior instead of predicting x0
2022-05-05 07:37:00 -07:00
Phil Wang
1d5dc08810
take @crowsonkb 's suggestion at https://github.com/lucidrains/DALLE2-pytorch/issues/60#issue-1226116132
2022-05-05 07:28:53 -07:00
Phil Wang
d8d8b6caf1
dataloaders for decoder training, from @Veldrovive
2022-05-05 07:09:45 -07:00
Aidan Dempster
15acc03bd4
Add a dataloader for training the decoder (#57)
...
* Added dataloader and updated requirements
* Added option to set embedding shard width separately from webdataset shard length.
There must be a better way to do this.
* Changed embedding loader to read using fsspec
* Moved the loader into a more compatible location
* Removed unnecessary package
* Fixed typo (Embeding -> Embedding)
* Simplified example embedding finder code to remove unnecessary get_file_list function
* Added example usage of ImageEmbeddingDataset
* Changed the name of create_dataloader to be more verbose
* Added a dataloaders __init__.py
2022-05-05 07:08:45 -07:00
Phil Wang
896f19786d
remove convnext blocks; they are ill-suited for generative work, as validated by early experimental results at https://github.com/lucidrains/video-diffusion-pytorch
2022-05-05 07:07:21 -07:00