Phil Wang
8cc278447e
just cast to the right types for blur sigma and kernel size augs
2022-06-02 11:21:58 -07:00
Phil Wang
38cd62010c
allow for random blur sigma and kernel size augmentations on low-res conditioning (need to reread the paper to see if the augmentation value needs to be fed into the unet for conditioning as well)
2022-06-02 11:11:25 -07:00
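The two entries above add a random gaussian blur augmentation to the low-res conditioning image, with a follow-up cast to the types torchvision expects. A minimal sketch, assuming illustrative names and ranges (not the repo's actual API):

```python
import random
import torch
import torchvision.transforms.functional as TF

def random_blur(lowres_cond_img: torch.Tensor,
                sigma_range=(0.4, 0.6),
                kernel_sizes=(3, 5, 7)) -> torch.Tensor:
    # cast to the right types: torchvision wants a float sigma
    # and an odd integer kernel size
    sigma = float(random.uniform(*sigma_range))
    kernel_size = int(random.choice(kernel_sizes))
    return TF.gaussian_blur(lowres_cond_img, kernel_size=kernel_size, sigma=sigma)
```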
Ryan Russell
1cc288af39
Improve Readability (#133)
...
Signed-off-by: Ryan Russell <git@ryanrussell.org>
2022-06-01 13:28:02 -07:00
Phil Wang
a851168633
make youtokentome optional package, due to reported installation difficulties
2022-06-01 09:25:35 -07:00
Phil Wang
1ffeecd0ca
lower default ema beta value
2022-05-31 11:55:21 -07:00
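For context on the entry above: the EMA beta sets how slowly the shadow weights track the online model, so a lower beta tracks faster. A minimal sketch of the update rule (the beta value here is illustrative, not the repo's new default):

```python
import torch

@torch.no_grad()
def ema_update(ema_model, online_model, beta=0.99):
    # shadow <- beta * shadow + (1 - beta) * online
    for ema_p, online_p in zip(ema_model.parameters(), online_model.parameters()):
        ema_p.lerp_(online_p, 1.0 - beta)
```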
Phil Wang
3df899f7a4
patch
2022-05-31 09:03:43 -07:00
Aidan Dempster
09534119a1
Fixed non-deterministic optimizer creation (#130)
2022-05-31 09:03:20 -07:00
Phil Wang
6f8b90d4d7
add packaging package
2022-05-30 11:45:00 -07:00
Phil Wang
b588286288
fix version
2022-05-30 11:06:34 -07:00
Phil Wang
b693e0be03
default number of resnet blocks per layer in unet to 2 (in imagen it was 3 for base 64x64)
2022-05-30 10:06:48 -07:00
Phil Wang
a0bed30a84
additional conditioning on the image embedding by summing it into the time embeddings (for FiLM-like conditioning in subsequent layers), from a passage in the paper found by @mhh0318
2022-05-30 09:26:51 -07:00
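A sketch of the conditioning described above, with hypothetical names: the image embedding is projected and summed into the time embedding, so every later layer that is FiLM-conditioned on time also sees the image signal.

```python
import torch
import torch.nn as nn

class TimeImageCond(nn.Module):
    # project the image embedding into the time-conditioning space
    def __init__(self, image_embed_dim: int, time_cond_dim: int):
        super().__init__()
        self.to_cond = nn.Linear(image_embed_dim, time_cond_dim)

    def forward(self, time_emb: torch.Tensor, image_embed: torch.Tensor) -> torch.Tensor:
        # the summed conditioning flows into subsequent FiLM (scale-shift) layers
        return time_emb + self.to_cond(image_embed)
```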
zion
44d4b1bba9
overhaul prior dataloader (#122)
...
add readme for loader
2022-05-29 07:39:59 -07:00
Phil Wang
b8af2210df
make sure diffusion prior can be instantiated from pydantic class without clip
2022-05-26 08:47:30 -07:00
Phil Wang
f4fe6c570d
allow for full customization of the number of resnet blocks per down- or upsampling layer in the unet, as in imagen
2022-05-26 08:33:31 -07:00
zion
1ed0f9d80b
use deterministic optimizer params (#116)
2022-05-25 09:31:43 -07:00
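One plausible reading of the fix in #116, sketched below as an assumption rather than the PR's exact code: build the optimizer's parameter list in name-sorted order instead of from an unordered set, so state dicts line up across runs and reloads.

```python
import torch

def deterministic_params(model: torch.nn.Module):
    # sort by parameter name so creation order is stable across runs
    return [param for _, param in sorted(model.named_parameters(), key=lambda kv: kv[0])]
```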
zion
d7a0a2ce4b
add more support for configuring prior (#113)
2022-05-25 09:06:50 -07:00
Phil Wang
f23fab7ef7
switch over to scale-shift conditioning, as it seems Imagen and GLIDE used it and it may be important
2022-05-24 21:46:12 -07:00
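Scale-shift (FiLM) conditioning, as referenced above, splits the conditioning vector into a per-channel scale and shift applied after normalization, rather than summing it into the features. An illustrative sketch (group count and layer names are assumptions; dim must be divisible by groups):

```python
import torch
import torch.nn as nn

class ScaleShiftBlock(nn.Module):
    def __init__(self, dim: int, cond_dim: int, groups: int = 8):
        super().__init__()
        self.norm = nn.GroupNorm(groups, dim)
        self.to_scale_shift = nn.Linear(cond_dim, dim * 2)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # (b, 2c) -> per-channel scale and shift, broadcast over h, w
        scale, shift = self.to_scale_shift(cond).chunk(2, dim=1)
        return self.norm(x) * (scale[..., None, None] + 1) + shift[..., None, None]
```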
Phil Wang
857b9fbf1e
allow one to disable the grouping of weight-decayable parameters, to debug an optimizer state dict problem
2022-05-24 21:42:32 -07:00
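The grouping that the entry above makes optional typically looks like this sketch: multi-dimensional weights get weight decay, while biases and norm parameters are exempt (a common heuristic, not necessarily the repo's exact rule).

```python
import torch

def param_groups(model: torch.nn.Module, weight_decay: float = 1e-2):
    decay, no_decay = [], []
    for param in model.parameters():
        # heuristic: 1-d params are biases / norm weights -> no decay
        (decay if param.ndim > 1 else no_decay).append(param)
    return [
        {'params': decay, 'weight_decay': weight_decay},
        {'params': no_decay, 'weight_decay': 0.0},
    ]
```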
Phil Wang
8864fd0aa7
bring in the dynamic thresholding technique from the Imagen paper, which purportedly improves classifier-free guidance for the cascading ddpm
2022-05-24 18:15:14 -07:00
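Dynamic thresholding per the Imagen paper: at each sampling step, clamp the predicted x0 to a per-sample quantile s of its absolute values and rescale by s, which keeps pixels in range under strong guidance. A sketch (the percentile here is illustrative; Imagen treats it as a hyperparameter):

```python
import torch

def dynamic_threshold(x0: torch.Tensor, percentile: float = 0.9) -> torch.Tensor:
    # per-sample quantile of |x0|, never below the static threshold of 1
    s = torch.quantile(x0.flatten(1).abs(), percentile, dim=1)
    s = s.clamp(min=1.0)[:, None, None, None]
    return x0.clamp(-s, s) / s
```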
Phil Wang
fa533962bd
just use an assert to make sure the clip image channels never differ from the channels of the diffusion prior and decoder, if clip is given
2022-05-22 22:43:14 -07:00
Phil Wang
276abf337b
fix and cleanup image size determination logic in decoder
2022-05-22 22:28:45 -07:00
Phil Wang
ae42d03006
allow for saving of additional fields on save method in trainers, and return loaded objects from the load method
2022-05-22 22:14:25 -07:00
Phil Wang
4d346e98d9
allow for config driven creation of clip-less diffusion prior
2022-05-22 20:36:20 -07:00
Phil Wang
5c397c9d66
move neural network creations off the configuration file into the pydantic classes
2022-05-22 19:18:18 -07:00
Phil Wang
0f4edff214
derived value for image preprocessing belongs in the data config class
2022-05-22 18:42:40 -07:00
Phil Wang
501a8c7c46
small cleanup
2022-05-22 15:39:38 -07:00
Phil Wang
49de72040c
fix decoder trainer optimizer loading (since there are multiple optimizers, one for each unet); also save and load the step number correctly
2022-05-22 15:21:00 -07:00
Phil Wang
e527002472
take care of saving and loading functions on the diffusion prior and decoder training classes
2022-05-22 15:10:15 -07:00
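The two entries above concern checkpointing in the trainers. Since the decoder trainer keeps one optimizer per unet, a checkpoint plausibly stores a list of optimizer state dicts plus the global step; the layout below is a hypothetical sketch, not the repo's exact schema.

```python
import torch

def save_checkpoint(path, decoder, optimizers, step: int):
    torch.save({
        'model': decoder.state_dict(),
        'optimizers': [opt.state_dict() for opt in optimizers],  # one per unet
        'step': step,
    }, path)

def load_checkpoint(path, decoder, optimizers):
    ckpt = torch.load(path)
    decoder.load_state_dict(ckpt['model'])
    for opt, state in zip(optimizers, ckpt['optimizers']):
        opt.load_state_dict(state)
    return ckpt['step']
```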
Phil Wang
c12e067178
let the pydantic config base model take care of loading configuration from json path
2022-05-22 14:47:23 -07:00
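A minimal sketch of the entry above, assuming pydantic v1 and an illustrative method name:

```python
import json
from pathlib import Path
from pydantic import BaseModel

class BaseConfig(BaseModel):
    @classmethod
    def from_json_path(cls, json_path):
        # parse the json file, then let pydantic validate the fields
        config = json.loads(Path(json_path).read_text())
        return cls(**config)
```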
Phil Wang
c6629c431a
make training splits into their own pydantic base model, validate they sum to 1, and make the decoder script cleaner
2022-05-22 14:43:22 -07:00
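A sketch of the split validation in pydantic v1 style (field names and defaults are assumptions):

```python
from pydantic import BaseModel, root_validator

class TrainSplitConfig(BaseModel):
    train: float = 0.75
    val: float = 0.15
    test: float = 0.10

    @root_validator
    def splits_sum_to_one(cls, values):
        # reject configs whose train/val/test fractions do not sum to 1
        total = values['train'] + values['val'] + values['test']
        if abs(total - 1.0) > 1e-6:
            raise ValueError(f'splits must sum to 1, got {total}')
        return values
```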
Phil Wang
a1ef023193
use pydantic to manage decoder training configs + defaults and refactor training script
2022-05-22 14:27:40 -07:00
Phil Wang
8b0d459b25
move config parsing logic to its own file; consider whether to find an off-the-shelf solution at a future date
2022-05-21 10:30:10 -07:00
Phil Wang
80497e9839
accept unets as list for decoder
2022-05-20 20:31:26 -07:00
Phil Wang
8997f178d6
small cleanup with timer
2022-05-20 20:05:01 -07:00
Aidan Dempster
022c94e443
Added single-GPU training script for decoder (#108)
...
Added config files for training
Changed example image generation to be more efficient
Added configuration description to README
Removed unused import
2022-05-20 19:46:19 -07:00
Phil Wang
430961cb97
it was correct the first time, my bad
2022-05-20 18:05:15 -07:00
Phil Wang
721f9687c1
fix wandb logging in tracker, and do some cleanup
2022-05-20 17:27:43 -07:00
Aidan Dempster
e0524a6aff
Implemented the wandb tracker (#106)
...
Added a base_path parameter to all trackers for storing any local information they need
2022-05-20 16:39:23 -07:00
Aidan Dempster
c85e0d5c35
Update decoder dataloader (#105)
...
* Updated the decoder dataloader
Removed unnecessary logging for required packages
Switched to using index width instead of shard width
Added the ability to select extra keys to return from the webdataset
* Added README for decoder loader
2022-05-20 16:38:55 -07:00
Phil Wang
db0642c4cd
quick fix for @marunine
2022-05-18 20:22:52 -07:00
Phil Wang
bb86ab2404
update sample, and set default gradient clipping value for decoder training
2022-05-16 17:38:30 -07:00
Phil Wang
c7ea8748db
default decoder learning rate to what was in the paper
2022-05-16 13:33:54 -07:00
Phil Wang
13382885d9
final update to the dalle2 repository for a while - sampling from the prior in chunks automatically when the max_batch_size keyword is given
2022-05-16 12:57:31 -07:00
Phil Wang
164d9be444
use a decorator and take care of sampling in chunks (max_batch_size keyword), in case one is sampling a huge grid of images
2022-05-16 12:34:28 -07:00
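The max_batch_size behavior in the two entries above can be sketched as follows: split the requested batch along the batch dimension, sample each piece, and concatenate. The repo wraps this in a decorator; this is a plain-function sketch with assumed names.

```python
import torch

def sample_in_chunks(sample_fn, *args, max_batch_size=None):
    # args are batched tensors; without a cap, sample everything at once
    if max_batch_size is None:
        return sample_fn(*args)
    # split every tensor argument into aligned chunks along dim 0
    chunked_args = zip(*[arg.split(max_batch_size, dim=0) for arg in args])
    return torch.cat([sample_fn(*chunk) for chunk in chunked_args], dim=0)
```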
Phil Wang
89ff04cfe2
final tweak to EMA class
2022-05-16 11:54:34 -07:00
Phil Wang
f4016f6302
allow for overriding the use of EMA during sampling in the decoder trainer with the use_non_ema keyword; also fix some issues with automatic normalization of images and the low-res conditioning image when latent diffusion is in play
2022-05-16 11:18:30 -07:00
Phil Wang
1212f7058d
allow text encodings and text mask to be passed in on forward and sampling for Decoder class
2022-05-16 10:40:32 -07:00
Phil Wang
dab106d4e5
back to no_grad for now, also keep track and restore unet devices in one_unet_in_gpu contextmanager
2022-05-16 09:36:14 -07:00
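A sketch of the one_unet_in_gpu context manager as amended above: record where each unet lives, keep only the requested one on the GPU, and restore all devices on exit.

```python
from contextlib import contextmanager
import torch

@contextmanager
def one_unet_in_gpu(unets, index: int):
    # record original devices so they can be restored afterwards
    devices = [next(unet.parameters()).device for unet in unets]
    for unet in unets:
        unet.cpu()
    unets[index].cuda()
    try:
        yield unets[index]
    finally:
        # restore every unet to the device it started on
        for unet, device in zip(unets, devices):
            unet.to(device)
```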
Phil Wang
bb151ca6b1
unet_number on the decoder trainer only needs to be passed in if there is more than one unet, so that unconditional training of a single ddpm is seamless (experiment in progress locally)
2022-05-16 09:17:17 -07:00
zion
4a59dea4cf
Migrate to text-conditioned prior training (#95)
...
* migrate to conditioned prior
* unify reader logic with a wrapper (#1)
* separate out reader logic
* support both training methods
* Update train prior to use embedding wrapper (#3)
* Support Both Methods
* bug fixes
* small bug fixes
* embedding only wrapper bug
* use smaller val perc
* final bug fix for embedding-only
Co-authored-by: nousr <>
2022-05-15 20:16:38 -07:00