Phil Wang | 36c5079bd7 | LazyLinear is not mature; make users pass in text_embed_dim if text conditioning is turned on | 2022-05-15 18:56:52 -07:00
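
A rough sketch of the intent, using an illustrative module rather than the repo's actual class or signature: instead of relying on nn.LazyLinear to infer the text embedding dimension at the first forward pass, an explicit text_embed_dim is required whenever text conditioning is enabled.

```python
import torch
from torch import nn

class TextConditioner(nn.Module):
    # hypothetical module, for illustration only: requires text_embed_dim up front
    # instead of deferring shape inference to nn.LazyLinear
    def __init__(self, dim, cond_on_text = False, text_embed_dim = None):
        super().__init__()
        self.cond_on_text = cond_on_text

        if cond_on_text:
            assert text_embed_dim is not None, 'text_embed_dim must be passed in when text conditioning is turned on'
            self.to_text_cond = nn.Linear(text_embed_dim, dim)
        else:
            self.to_text_cond = None

    def forward(self, x, text_embeds = None):
        if self.cond_on_text:
            assert text_embeds is not None
            x = x + self.to_text_cond(text_embeds)
        return x
```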
Phil Wang | 99778e12de | trainer classes now take care of auto-casting numpy to torch tensors, and of setting the correct device based on the model's parameter devices | 2022-05-15 15:25:45 -07:00
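
A minimal sketch of what this could look like inside a trainer forward (the helper below is hypothetical, not the repo's code): numpy arrays are converted to torch tensors, and everything is moved to whichever device the model's parameters live on.

```python
import numpy as np
import torch
from torch import nn

def cast_inputs(model: nn.Module, *args):
    # infer the target device from the model's own parameters
    device = next(model.parameters()).device

    out = []
    for arg in args:
        if isinstance(arg, np.ndarray):
            # auto-cast numpy arrays to torch tensors
            arg = torch.from_numpy(arg)
        if torch.is_tensor(arg):
            arg = arg.to(device)
        out.append(arg)
    return out
```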
Phil Wang | 7b7a62044a | use eval vs training mode to determine whether to call backprop on trainer forward | 2022-05-15 14:20:59 -07:00
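
One way to picture this, as a simplified stand-in for the actual trainer classes: the trainer's forward checks the standard nn.Module training flag and calls backward only when the module is in train mode.

```python
import torch
from torch import nn

class Trainer(nn.Module):
    # simplified, hypothetical stand-in for the decoder / diffusion prior trainers
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, *args, **kwargs):
        # the wrapped model is assumed to return a scalar loss
        loss = self.model(*args, **kwargs)

        # self.training is toggled by .train() / .eval()
        if self.training:
            loss.backward()

        return loss
```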
Phil Wang | e66c7b0249 | fix incorrect naming | 2022-05-15 11:23:52 -07:00

Phil Wang | 68e7d2f241 | make sure the gradient accumulation feature works even if all arguments passed in are keyword arguments | 2022-05-15 11:16:16 -07:00

Phil Wang | aa6772dcff | make sure the optimizer and scaler are reloaded on resume for the diffusion prior training script; move from argparse to click | 2022-05-15 10:48:10 -07:00
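
Resuming could then look roughly like the sketch below; the checkpoint keys and helper names are assumptions for illustration, not the script's actual format. The point is that the optimizer and GradScaler state dicts are saved alongside the model and restored before training continues.

```python
import torch

def save_checkpoint(path, model, optimizer, scaler):
    # persist model weights plus optimizer and AMP scaler state
    torch.save({
        'model': model.state_dict(),
        'optimizer': optimizer.state_dict(),
        'scaler': scaler.state_dict()
    }, path)

def load_checkpoint(path, model, optimizer, scaler):
    ckpt = torch.load(path, map_location = 'cpu')
    model.load_state_dict(ckpt['model'])
    # without these two, resuming silently resets momentum and loss scaling
    optimizer.load_state_dict(ckpt['optimizer'])
    scaler.load_state_dict(ckpt['scaler'])
```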
Phil Wang | 89de5af63e | make the training code experiment-tracker agnostic | 2022-05-15 09:56:40 -07:00

Phil Wang | 4ec6d0ba81 | the backwards pass is not recommended under the autocast context, per the PyTorch docs | 2022-05-14 18:26:19 -07:00
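
This follows the PyTorch AMP docs, which recommend running only the forward pass (and loss computation) under autocast, with backward and the optimizer step outside the context. A minimal version of the pattern, assuming a CUDA device is available:

```python
import torch
from torch.cuda.amp import autocast, GradScaler

model = torch.nn.Linear(10, 10).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr = 3e-4)
scaler = GradScaler()

x = torch.randn(8, 10).cuda()

optimizer.zero_grad()

# forward pass and loss computation under autocast
with autocast():
    loss = model(x).pow(2).mean()

# backward and optimizer step outside the autocast block
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```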
Phil Wang | aee92dba4a | simplify more | 2022-05-14 17:16:46 -07:00

Phil Wang | b0cd5f24b6 | take care of gradient accumulation automatically for researchers, by passing in a max_batch_size on the decoder or diffusion prior trainer forward | 2022-05-14 17:04:09 -07:00
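
In spirit (a sketch only; the helper below and the assumption that the model returns a scalar loss are illustrative): any batch larger than max_batch_size is split into chunks, each chunk's loss is weighted by its share of the full batch, and gradients accumulate across chunks before a single optimizer step.

```python
import torch

def forward_with_accumulation(model, optimizer, images, max_batch_size = None):
    batch_size = images.shape[0]
    chunk_size = max_batch_size if max_batch_size is not None else batch_size

    total_loss = 0.
    for chunk in images.split(chunk_size, dim = 0):
        loss = model(chunk)

        # weight each chunk's loss by its fraction of the full batch,
        # so the accumulated gradient matches a single large-batch step
        weight = chunk.shape[0] / batch_size
        (loss * weight).backward()

        total_loss += loss.item() * weight

    optimizer.step()
    optimizer.zero_grad()
    return total_loss
```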
Phil Wang | b494ed81d4 | take care of the backwards pass within the trainer classes for the diffusion prior and decoder, readying to handle gradient accumulation as well (plus, unsure whether the loss backward should happen within the autocast block) | 2022-05-14 15:49:24 -07:00

Phil Wang | d5293f19f1 | line up with the paper | 2022-05-14 13:57:00 -07:00

Phil Wang | e697183849 | be able to customize the Adam eps | 2022-05-14 13:55:04 -07:00

Phil Wang | 591d37e266 | lower the default initial learning rate to what Jonathan Ho had in his original repo | 2022-05-14 13:22:43 -07:00

Phil Wang | 924455d97d | align the EMA model device back after sampling from the cascading DDPM in the decoder | 2022-05-11 19:56:54 -07:00

Phil Wang | 9b322ea634 | patch | 2022-05-09 19:46:19 -07:00

Phil Wang | 64f7be1926 | some cleanup | 2022-05-09 16:50:21 -07:00

Phil Wang | 98df1ba51e | add diffusion prior trainer, which automatically takes care of the exponential moving average (training and sampling), as well as mixed precision and gradient clipping | 2022-05-06 08:11:09 -07:00

Phil Wang | 1924c7cc3d | fix issue with mixed precision and gradient clipping | 2022-05-02 09:20:19 -07:00

Phil Wang | ebe01749ed | DecoderTrainer sample method uses the exponentially moving averaged unets | 2022-04-30 14:55:34 -07:00

Phil Wang | 63195cc2cb | allow for division of the loss prior to scaling, for gradient accumulation purposes | 2022-04-30 12:56:47 -07:00

Phil Wang | a2ef69af66 | take care of mixed precision, and make gradient accumulation doable externally | 2022-04-30 12:27:24 -07:00

Phil Wang | 5fff22834e | be able to finely customize learning parameters for each unet, and take care of gradient clipping | 2022-04-30 11:56:05 -07:00
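
A sketch of per-unet optimization with gradient clipping; the exact knobs exposed here (per-unet lr, a shared eps, max_grad_norm) are assumptions about the interface, not its actual parameters.

```python
import torch
from torch import nn

def build_optimizers(unets, lrs, eps = 1e-8):
    # one optimizer per unet in the cascade, each with its own learning rate
    return [
        torch.optim.Adam(unet.parameters(), lr = lr, eps = eps)
        for unet, lr in zip(unets, lrs)
    ]

def step_unet(unet, optimizer, loss, max_grad_norm = 0.5):
    optimizer.zero_grad()
    loss.backward()

    # clip gradients before the optimizer step
    nn.utils.clip_grad_norm_(unet.parameters(), max_grad_norm)
    optimizer.step()
```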
Phil Wang | a9421f49ec | simplify Decoder training for the public | 2022-04-30 11:45:18 -07:00

Phil Wang | 5063d192b6 | now completely OpenAI CLIP compatible for training; just take care of the logic for AdamW and transformers; use namedtuples for clip adapter embedding outputs | 2022-04-29 13:05:01 -07:00

Phil Wang | f4a54e475e | add some training functions | 2022-04-29 09:44:55 -07:00

Phil Wang | 6edb1c5dd0 | fix issue with the EMA class | 2022-04-27 16:40:02 -07:00

Phil Wang | 45262a4bb7 | bring in the exponential moving average wrapper, to get ready for training | 2022-04-25 19:24:13 -07:00
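
A bare-bones version of such a wrapper, as a sketch rather than the repo's actual EMA class: keep a frozen copy of the model, nudge its weights toward the online model's after each update, and route sampling through the copy.

```python
import copy
import torch
from torch import nn

class EMA(nn.Module):
    # minimal exponential moving average wrapper (illustrative only)
    def __init__(self, model, beta = 0.99):
        super().__init__()
        self.beta = beta
        self.online_model = model
        self.ema_model = copy.deepcopy(model)
        self.ema_model.requires_grad_(False)

    @torch.no_grad()
    def update(self):
        # ema <- beta * ema + (1 - beta) * online
        for ema_p, online_p in zip(self.ema_model.parameters(), self.online_model.parameters()):
            ema_p.lerp_(online_p, 1. - self.beta)

    def forward(self, *args, **kwargs):
        # inference / sampling goes through the averaged copy
        return self.ema_model(*args, **kwargs)
```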
Phil Wang | 5e03b7f932 | get ready for all the training-related classes and functions | 2022-04-12 09:54:50 -07:00