Phil Wang
ee75515c7d
remove forcing of softmax in f32, in case it is interfering with deepspeed
2022-07-05 16:53:58 -07:00
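A minimal sketch of the kind of toggle this commit relaxes; the function and flag names here are illustrative, not the repository's actual API:

```python
import torch

def stable_softmax(sim, dim=-1, force_fp32=False):
    # optionally compute the attention softmax in float32 and cast back;
    # leaving the dtype alone lets mixed-precision engines such as DeepSpeed
    # manage precision themselves
    if not force_fp32:
        return sim.softmax(dim=dim)
    dtype = sim.dtype
    return sim.float().softmax(dim=dim).to(dtype)
```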
Phil Wang
b9a908ff75
bring in two tricks from the cogview paper for reducing the chances of overflow, for attention and layernorm
2022-07-05 14:27:04 -07:00
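For reference, the two CogView (PB-relax) stabilizations roughly amount to subtracting the row-wise max from the attention logits before softmax, and rescaling activations before layernorm; a sketch under that assumption:

```python
import torch
from torch import nn

def stable_attention_softmax(sim):
    # subtract the (detached) row-wise max so large fp16 logits cannot
    # overflow; the softmax output is mathematically unchanged
    sim = sim - sim.amax(dim=-1, keepdim=True).detach()
    return sim.softmax(dim=-1)

class StableLayerNorm(nn.Module):
    # divide by the max magnitude before layernorm; layernorm is invariant
    # to positive per-row scaling, so only the overflow risk changes
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        x = x / x.abs().amax(dim=-1, keepdim=True).detach().clamp(min=1e-5)
        return self.norm(x)
```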
Phil Wang
e1fe3089df
do bias-less layernorm manually
2022-07-05 13:09:58 -07:00
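At the time, `nn.LayerNorm` exposed no bias-free option, hence writing it out manually; a sketch of the usual pattern:

```python
import torch
from torch import nn

class LayerNorm(nn.Module):
    # layernorm with a learned gain but no bias term
    def __init__(self, dim, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.g = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        var = torch.var(x, dim=-1, unbiased=False, keepdim=True)
        mean = torch.mean(x, dim=-1, keepdim=True)
        return (x - mean) * (var + self.eps).rsqrt() * self.g
```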
Phil Wang
3d23ba4aa5
add ability to specify full self attention on specific stages in the unet
2022-07-01 10:22:07 -07:00
Phil Wang
7b0edf9e42
allow for returning low resolution conditioning image on forward through decoder with return_lowres_cond_image flag
2022-07-01 09:35:39 -07:00
Phil Wang
a922a539de
bring back convtranspose2d upsampling, allow for nearest upsample with hyperparam, change kernel size of last conv to 1, make configurable, cleanup
2022-07-01 09:21:47 -07:00
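A sketch of the two upsampling choices described here (names are illustrative):

```python
from torch import nn

def Upsample(dim, use_nearest=False):
    # either a learned transposed convolution, or nearest-neighbor upsampling
    # followed by a 3x3 conv, selectable via hyperparameter
    if use_nearest:
        return nn.Sequential(
            nn.Upsample(scale_factor=2, mode='nearest'),
            nn.Conv2d(dim, dim, 3, padding=1),
        )
    return nn.ConvTranspose2d(dim, dim, 4, stride=2, padding=1)
```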
Phil Wang
8f2466f1cd
blur sigma for upsampling training was 0.6 in the paper, make that the default value
2022-06-30 17:03:16 -07:00
Phil Wang
908ab83799
add skip connections for all intermediate resnet blocks, also add an extra resnet block for the memory efficient version of the unet, and time-condition both the initial resnet block and the last one before the output

2022-06-29 08:16:58 -07:00
Phil Wang
6a11b9678b
bring in the skip connection scaling factor, used by imagen in their unets, cite original paper using it
2022-06-26 21:59:55 -07:00
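The scaling factor in question is 1/sqrt(2); a sketch of how a unet might apply it when fusing skip connections (helper name is a placeholder):

```python
import torch

SKIP_SCALE = 2 ** -0.5  # 1/sqrt(2) keeps variance roughly constant across the skip

def fuse_skip(x, skip):
    # scale the stored skip connection before concatenating it back in
    return torch.cat((x, skip * SKIP_SCALE), dim=1)
```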
Phil Wang
b90364695d
fix remaining issues with deriving cond_on_text_encodings from child unet settings
2022-06-26 21:07:42 -07:00
zion
868c001199
bug fixes for text conditioning update (#175)
2022-06-26 16:12:32 -07:00
Phil Wang
032e83b0e0
nevermind, do not enforce text encodings on first unet
2022-06-26 12:45:05 -07:00
Phil Wang
2e85e736f3
remove unnecessary decoder setting, and if not unconditional, always make sure the first unet is conditionable on text
2022-06-26 12:32:17 -07:00
zion
c453f468b1
autoswitch tqdm for notebooks (#171)
avoids printing the `tqdm` progress bar to a newline in notebooks when detected
2022-06-25 16:37:06 -07:00
Phil Wang
f545ce18f4
be able to turn off p2 loss reweighting for upsamplers
2022-06-20 09:43:31 -07:00
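P2 reweighting (Choi et al., "Perception Prioritized Training of Diffusion Models") weights the loss at each timestep by (k + SNR)^-gamma; setting gamma to 0 recovers uniform weighting, which is effectively the off switch for the upsamplers. A sketch:

```python
import torch

def p2_loss_weight(alphas_cumprod, k=1.0, gamma=1.0):
    # w_t = (k + SNR(t)) ** -gamma; gamma = 0 gives w_t = 1 everywhere,
    # i.e. reweighting turned off
    snr = alphas_cumprod / (1 - alphas_cumprod)
    return (k + snr) ** -gamma
```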
Phil Wang
fc7abf624d
in the paper, blur sigma was 0.6
2022-06-20 09:05:08 -07:00
Phil Wang
138079ca83
allow for setting the beta schedule of each unet in the decoder differently, as the paper used cosine, cosine, linear
2022-06-20 08:56:37 -07:00
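The two schedule shapes involved, in their standard forms:

```python
import math
import torch

def cosine_beta_schedule(timesteps, s=0.008):
    # Nichol & Dhariwal's cosine schedule
    steps = timesteps + 1
    x = torch.linspace(0, timesteps, steps, dtype=torch.float64)
    alphas_cumprod = torch.cos(((x / timesteps) + s) / (1 + s) * math.pi * 0.5) ** 2
    alphas_cumprod = alphas_cumprod / alphas_cumprod[0]
    betas = 1 - (alphas_cumprod[1:] / alphas_cumprod[:-1])
    return torch.clip(betas, 0, 0.999)

def linear_beta_schedule(timesteps):
    # original DDPM linear schedule, rescaled for arbitrary step counts
    scale = 1000 / timesteps
    return torch.linspace(scale * 1e-4, scale * 0.02, timesteps, dtype=torch.float64)
```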
Aidan Dempster
58892135d9
Distributed Training of the Decoder (#121)
* Converted decoder trainer to use accelerate
* Fixed issue where metric evaluation would hang on distributed mode
* Implemented functional saving
Loading still fails due to some issue with the optimizer
* Fixed issue with loading decoders
* Fixed issue with tracker config
* Fixed issue with amp
Updated logging to be more logical
* Saving checkpoint now saves position in training as well
Fixed an issue with running out of gpu space due to loading weights into the gpu twice
* Fixed ema for distributed training
* Fixed issue where get_pkg_version was reintroduced
* Changed decoder trainer to upload config as a file
Fixed issue where loading best would error
2022-06-19 09:25:54 -07:00
Phil Wang
6651eafa93
one more residual, after seeing good results on unconditional generation locally
2022-06-16 11:18:02 -07:00
Phil Wang
e6bb75e5ab
fix missing residual for highest resolution of the unet
2022-06-15 20:09:43 -07:00
Phil Wang
b7f9607258
make memory efficient unet design from imagen toggle-able
2022-06-15 13:40:26 -07:00
Phil Wang
2219348a6e
adopt similar unet architecture as imagen
2022-06-15 12:18:21 -07:00
Phil Wang
9eea9b9862
add p2 loss reweighting for decoder training as an option
2022-06-14 10:58:57 -07:00
Phil Wang
5d958713c0
fix classifier free guidance for image hiddens summed to time hiddens, thanks to @xvjiarui for finding this bug
2022-06-13 21:01:50 -07:00
Phil Wang
0f31980362
cleanup
2022-06-07 17:31:38 -07:00
Kashif Rasul
1a81670718
fix quadratic_beta_schedule (#141)
2022-06-06 08:45:14 -07:00
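Without the diff at hand, the corrected schedule presumably interpolates in sqrt(beta) space, which is the standard formulation of a quadratic schedule:

```python
import torch

def quadratic_beta_schedule(timesteps, beta_start=1e-4, beta_end=0.02):
    # interpolate linearly between the square roots of the endpoints,
    # then square, so the betas grow quadratically
    return torch.linspace(beta_start ** 0.5, beta_end ** 0.5, timesteps) ** 2
```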
Phil Wang
ffd342e9d0
allow for an option to constrain the variance interpolation fraction coming out of the unet for learned variance, if it is turned on
2022-06-03 09:34:57 -07:00
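In the learned-variance parameterization the unet predicts an interpolation fraction between log beta_t and log beta-tilde_t; one way to constrain it (the sigmoid squash is an assumption here, not necessarily the repository's choice):

```python
import torch

def interpolated_log_variance(frac, log_beta, log_beta_tilde, constrain=True):
    # squash the predicted fraction into [0, 1] so the interpolated variance
    # can never leave the [beta_tilde_t, beta_t] range
    if constrain:
        frac = frac.sigmoid()
    return frac * log_beta + (1 - frac) * log_beta_tilde
```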
Phil Wang
8cc278447e
just cast to right types for blur sigma and kernel size augs
2022-06-02 11:21:58 -07:00
Phil Wang
38cd62010c
allow for random blur sigma and kernel size augmentations on low res conditioning (need to reread paper to see if the augmentation value needs to be fed into the unet for conditioning as well)
2022-06-02 11:11:25 -07:00
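A sketch of such an augmentation using torchvision; the ranges and name are placeholders:

```python
import random
import torchvision.transforms.functional as TF

def random_blur_aug(lowres_cond_image, sigma_range=(0.4, 0.6), kernel_sizes=(3, 5)):
    # draw a random blur strength and (odd) kernel size for the low resolution
    # conditioning image, as noise-conditioning augmentation for the upsampler
    sigma = random.uniform(*sigma_range)
    kernel_size = random.choice(kernel_sizes)
    return TF.gaussian_blur(lowres_cond_image, kernel_size=kernel_size, sigma=sigma)
```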
Phil Wang
b693e0be03
default number of resnet blocks per layer in unet to 2 (in imagen it was 3 for base 64x64)
2022-05-30 10:06:48 -07:00
Phil Wang
a0bed30a84
additional conditioning on image embedding by summing to time embeddings (for FiLM-like conditioning in subsequent layers), from a passage in the paper found by @mhh0318
2022-05-30 09:26:51 -07:00
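A sketch of the summing described (class and attribute names are illustrative):

```python
from torch import nn

class TimeImageConditioner(nn.Module):
    # project the image embedding into the time conditioning dimension and
    # sum, so downstream FiLM / scale-shift layers see both signals
    def __init__(self, image_embed_dim, time_cond_dim):
        super().__init__()
        self.to_time_cond = nn.Linear(image_embed_dim, time_cond_dim)

    def forward(self, time_emb, image_embed):
        return time_emb + self.to_time_cond(image_embed)
```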
Phil Wang
f4fe6c570d
allow for full customization of number of resnet blocks per down or upsampling layers in unet, as in imagen
2022-05-26 08:33:31 -07:00
Phil Wang
f23fab7ef7
switch over to scale shift conditioning, as it seems like Imagen and Glide used it and it may be important
2022-05-24 21:46:12 -07:00
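Scale-shift (FiLM-style) conditioning replaces plain addition of the conditioning vector with a learned per-channel scale and shift applied after normalization; a sketch (group count and names are placeholders):

```python
import torch
from torch import nn

class ScaleShiftBlock(nn.Module):
    def __init__(self, dim, cond_dim, groups=8):
        super().__init__()
        self.norm = nn.GroupNorm(groups, dim)
        self.to_scale_shift = nn.Linear(cond_dim, dim * 2)

    def forward(self, x, cond):
        scale, shift = self.to_scale_shift(cond).chunk(2, dim=-1)
        scale = scale[..., None, None]  # broadcast over height and width
        shift = shift[..., None, None]
        return self.norm(x) * (scale + 1) + shift
```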
Phil Wang
8864fd0aa7
bring in the dynamic thresholding technique from the Imagen paper, which purportedly improves classifier free guidance for the cascading ddpm
2022-05-24 18:15:14 -07:00
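Dynamic thresholding, as described in the Imagen paper: clamp the predicted x0 to the value s at a chosen percentile of its absolute values (at least 1), then divide by s. A sketch:

```python
import torch

def dynamic_threshold(x0, percentile=0.9):
    s = torch.quantile(x0.flatten(start_dim=1).abs(), percentile, dim=-1)
    s.clamp_(min=1.0)                          # never shrink below static [-1, 1]
    s = s.view(-1, *((1,) * (x0.ndim - 1)))    # broadcast per sample
    return x0.clamp(-s, s) / s                 # clamp, then rescale to [-1, 1]
```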
Phil Wang
fa533962bd
just use an assert to make sure clip image channels never differ from the channels of the diffusion prior and decoder, if clip is given
2022-05-22 22:43:14 -07:00
Phil Wang
276abf337b
fix and cleanup image size determination logic in decoder
2022-05-22 22:28:45 -07:00
Phil Wang
ae42d03006
allow for saving of additional fields on save method in trainers, and return loaded objects from the load method
2022-05-22 22:14:25 -07:00
Phil Wang
5c397c9d66
move neural network creations off the configuration file into the pydantic classes
2022-05-22 19:18:18 -07:00
Phil Wang
80497e9839
accept unets as list for decoder
2022-05-20 20:31:26 -07:00
Phil Wang
db0642c4cd
quick fix for @marunine
2022-05-18 20:22:52 -07:00
Phil Wang
f4016f6302
allow for overriding use of EMA during sampling in decoder trainer with use_non_ema keyword, also fix some issues with automatic normalization of images and low res conditioning image if latent diffusion is in play
2022-05-16 11:18:30 -07:00
Phil Wang
1212f7058d
allow text encodings and text mask to be passed in on forward and sampling for Decoder class
2022-05-16 10:40:32 -07:00
Phil Wang
dab106d4e5
back to no_grad for now, also keep track of and restore unet devices in the one_unet_in_gpu contextmanager
2022-05-16 09:36:14 -07:00
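A rough sketch of what such a contextmanager could look like; this is a guess at the behavior, not the repository's code:

```python
from contextlib import contextmanager

@contextmanager
def one_unet_in_gpu(unets, index, device='cuda'):
    # remember where each unet currently lives so it can be restored
    devices = [next(unet.parameters()).device for unet in unets]
    for unet in unets:
        unet.cpu()
    unets[index].to(device)  # only the active unet occupies gpu memory
    try:
        yield unets[index]
    finally:
        for unet, dev in zip(unets, devices):
            unet.to(dev)
```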
Phil Wang
ecf9e8027d
make sure classifier free guidance is used only if conditional dropout is present on the DiffusionPrior and Decoder classes. also make sure prior can have a different conditional scale than decoder
2022-05-15 19:09:38 -07:00
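The guidance blend itself, for context; `cond_drop_prob` as a forward kwarg is an assumption here:

```python
def forward_with_cond_scale(model, x, t, text_embed, cond_scale=1.0):
    # classifier free guidance: extrapolate from the unconditional prediction
    # toward the conditional one; only meaningful if the model was trained
    # with a nonzero conditional dropout probability
    logits = model(x, t, text_embed, cond_drop_prob=0.0)
    if cond_scale == 1.0:
        return logits
    null_logits = model(x, t, text_embed, cond_drop_prob=1.0)
    return null_logits + (logits - null_logits) * cond_scale
```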
Phil Wang
36c5079bd7
LazyLinear is not mature, make users pass in text_embed_dim if text conditioning is turned on
2022-05-15 18:56:52 -07:00
Phil Wang
4a4c7ac9e6
cond drop prob for diffusion prior network should default to 0
2022-05-15 18:47:45 -07:00
Phil Wang
11d4e11f10
allow for training unconditional ddpm or cascading ddpms
2022-05-15 16:54:56 -07:00
Phil Wang
156fe5ed9f
final cleanup for the day
2022-05-15 12:38:41 -07:00
Phil Wang
ff3474f05c
normalize conditioning tokens outside of cross attention blocks
2022-05-14 14:23:52 -07:00
Phil Wang
d1f02e8f49
always use sandwich norm for attention layer
2022-05-14 12:13:41 -07:00
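Sandwich norm (as in CogView) places a layernorm both before and after the attention sublayer, inside the residual branch; a sketch:

```python
from torch import nn

class SandwichNormAttention(nn.Module):
    # layernorm before AND after the attention sublayer, inside the residual
    def __init__(self, dim, attn):
        super().__init__()
        self.prenorm = nn.LayerNorm(dim)
        self.postnorm = nn.LayerNorm(dim)
        self.attn = attn

    def forward(self, x, **kwargs):
        return x + self.postnorm(self.attn(self.prenorm(x), **kwargs))
```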