Author | Commit | Message | Date
Phil Wang | 41ca896413 | depend on huggingface accelerate, move appreciation thread up for visibility | 2022-06-19 08:50:35 -07:00
Phil Wang | a851168633 | make youtokentome optional package, due to reported installation difficulties | 2022-06-01 09:25:35 -07:00
Phil Wang | 6f8b90d4d7 | add packaging package | 2022-05-30 11:45:00 -07:00
Phil Wang | b588286288 | fix version | 2022-05-30 11:06:34 -07:00
Phil Wang | b693e0be03 | default number of resnet blocks per layer in unet to 2 (in imagen it was 3 for base 64x64) | 2022-05-30 10:06:48 -07:00
Phil Wang | a0bed30a84 | additional conditioning on image embedding by summing to time embeddings (for FiLM like conditioning in subsequent layers), from passage found in paper by @mhh0318 | 2022-05-30 09:26:51 -07:00
Phil Wang | a13d2d89c5 | 0.5.7 | 2022-05-29 07:40:25 -07:00
Phil Wang | b8af2210df | make sure diffusion prior can be instantiated from pydantic class without clip | 2022-05-26 08:47:30 -07:00
Phil Wang | f4fe6c570d | allow for full customization of number of resnet blocks per down or upsampling layers in unet, as in imagen | 2022-05-26 08:33:31 -07:00
Phil Wang | 6161b61c55 | 0.5.4 | 2022-05-25 09:32:17 -07:00
Phil Wang | f326a95e26 | 0.5.3 | 2022-05-25 09:07:28 -07:00
Phil Wang | f23fab7ef7 | switch over to scale shift conditioning, as it seems like Imagen and Glide used it and it may be important | 2022-05-24 21:46:12 -07:00
Phil Wang | 857b9fbf1e | allow for one to stop grouping out weight decayable parameters, to debug optimizer state dict problem | 2022-05-24 21:42:32 -07:00
Phil Wang | 8864fd0aa7 | bring in the dynamic thresholding technique from the Imagen paper, which purportedly improves classifier free guidance for the cascading ddpm | 2022-05-24 18:15:14 -07:00
Phil Wang | fa533962bd | just use an assert to make sure clip image channels is never different than the channels of the diffusion prior and decoder, if clip is given | 2022-05-22 22:43:14 -07:00
Phil Wang | 276abf337b | fix and cleanup image size determination logic in decoder | 2022-05-22 22:28:45 -07:00
Phil Wang | ae42d03006 | allow for saving of additional fields on save method in trainers, and return loaded objects from the load method | 2022-05-22 22:14:25 -07:00
Phil Wang | 4d346e98d9 | allow for config driven creation of clip-less diffusion prior | 2022-05-22 20:36:20 -07:00
Phil Wang | 5c397c9d66 | move neural network creations off the configuration file into the pydantic classes | 2022-05-22 19:18:18 -07:00
Phil Wang | 0f4edff214 | derived value for image preprocessing belongs to the data config class | 2022-05-22 18:42:40 -07:00
Phil Wang | 501a8c7c46 | small cleanup | 2022-05-22 15:39:38 -07:00
Phil Wang | 49de72040c | fix decoder trainer optimizer loading (since there are multiple for each unet), also save and load step number correctly | 2022-05-22 15:21:00 -07:00
Phil Wang | 271a376eaf | 0.4.3 | 2022-05-22 15:10:28 -07:00
Phil Wang | c12e067178 | let the pydantic config base model take care of loading configuration from json path | 2022-05-22 14:47:23 -07:00
Phil Wang | c6629c431a | make training splits into its own pydantic base model, validate it sums to 1, make decoder script cleaner | 2022-05-22 14:43:22 -07:00
Phil Wang | a1ef023193 | use pydantic to manage decoder training configs + defaults and refactor training script | 2022-05-22 14:27:40 -07:00
Phil Wang | d49eca62fa | dep | 2022-05-21 11:27:52 -07:00
Phil Wang | 8b0d459b25 | move config parsing logic to own file, consider whether to find an off-the-shelf solution at future date | 2022-05-21 10:30:10 -07:00
Phil Wang | 80497e9839 | accept unets as list for decoder | 2022-05-20 20:31:26 -07:00
Phil Wang | f526f14d7c | bump | 2022-05-20 20:20:40 -07:00
Aidan Dempster | 022c94e443 | Added single GPU training script for decoder (#108): added config files for training; changed example image generation to be more efficient; added configuration description to README; removed unused import | 2022-05-20 19:46:19 -07:00
Phil Wang | 430961cb97 | it was correct the first time, my bad | 2022-05-20 18:05:15 -07:00
Phil Wang | 721f9687c1 | fix wandb logging in tracker, and do some cleanup | 2022-05-20 17:27:43 -07:00
Phil Wang | db0642c4cd | quick fix for @marunine | 2022-05-18 20:22:52 -07:00
Phil Wang | bb86ab2404 | update sample, and set default gradient clipping value for decoder training | 2022-05-16 17:38:30 -07:00
Phil Wang | c7ea8748db | default decoder learning rate to what was in the paper | 2022-05-16 13:33:54 -07:00
Phil Wang | 13382885d9 | final update to dalle2 repository for a while - sampling from prior in chunks automatically with max_batch_size keyword given | 2022-05-16 12:57:31 -07:00
Phil Wang | 164d9be444 | use a decorator and take care of sampling in chunks (max_batch_size keyword), in case one is sampling a huge grid of images | 2022-05-16 12:34:28 -07:00
Phil Wang | 89ff04cfe2 | final tweak to EMA class | 2022-05-16 11:54:34 -07:00
Phil Wang | f4016f6302 | allow for overriding use of EMA during sampling in decoder trainer with use_non_ema keyword, also fix some issues with automatic normalization of images and low res conditioning image if latent diffusion is in play | 2022-05-16 11:18:30 -07:00
Phil Wang | 1212f7058d | allow text encodings and text mask to be passed in on forward and sampling for Decoder class | 2022-05-16 10:40:32 -07:00
Phil Wang | dab106d4e5 | back to no_grad for now, also keep track and restore unet devices in one_unet_in_gpu contextmanager | 2022-05-16 09:36:14 -07:00
Phil Wang | bb151ca6b1 | unet_number on decoder trainer only needs to be passed in if there is greater than 1 unet, so that unconditional training of a single ddpm is seamless (experiment in progress locally) | 2022-05-16 09:17:17 -07:00
Phil Wang | ecf9e8027d | make sure classifier free guidance is used only if conditional dropout is present on the DiffusionPrior and Decoder classes. also make sure prior can have a different conditional scale than decoder | 2022-05-15 19:09:38 -07:00
Phil Wang | 36c5079bd7 | LazyLinear is not mature, make users pass in text_embed_dim if text conditioning is turned on | 2022-05-15 18:56:52 -07:00
Phil Wang | 4a4c7ac9e6 | cond drop prob for diffusion prior network should default to 0 | 2022-05-15 18:47:45 -07:00
Phil Wang | 11d4e11f10 | allow for training unconditional ddpm or cascading ddpms | 2022-05-15 16:54:56 -07:00
Phil Wang | 99778e12de | trainer classes now takes care of auto-casting numpy to torch tensors, and setting correct device based on model parameter devices | 2022-05-15 15:25:45 -07:00
Phil Wang | 7b7a62044a | use eval vs training mode to determine whether to call backprop on trainer forward | 2022-05-15 14:20:59 -07:00
Phil Wang | 68e7d2f241 | make sure gradient accumulation feature works even if all arguments passed in are keyword arguments | 2022-05-15 11:16:16 -07:00