Phil Wang
0b40cbaa54
just always use nearest neighbor interpolation when resizing for low resolution conditioning, for https://github.com/lucidrains/DALLE2-pytorch/pull/181
2022-07-13 20:59:43 -07:00
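A minimal sketch of the resizing behavior this commit pins down, using torch.nn.functional.interpolate; the helper name is illustrative, not the repo's actual function:

```python
import torch
import torch.nn.functional as F

def resize_image_to(image, target_size):
    # always use nearest neighbor interpolation when resizing the
    # low resolution conditioning image, so no interpolation
    # artifacts leak into the upsampler's conditioning signal
    if image.shape[-1] == target_size:
        return image
    return F.interpolate(image, size=target_size, mode='nearest')

lowres = torch.randn(1, 3, 64, 64)
cond = resize_image_to(lowres, 256)  # (1, 3, 256, 256)
```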
Phil Wang
f141144a6d
allow for using classifier free guidance for some unets but not others, by passing in a tuple of cond_scale during sampling for decoder, just in case it is causing issues for upsamplers
2022-07-13 13:12:30 -07:00
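A hedged sketch of per-unet classifier free guidance; the unet call signature here is assumed for illustration only:

```python
def guided_prediction(unet, x, t, text_cond, cond_scale=1.0):
    # classifier free guidance: blend conditional and unconditional
    # predictions; cond_scale = 1 recovers the purely conditional output
    cond_out = unet(x, t, text_cond=text_cond)
    if cond_scale == 1.0:
        return cond_out
    null_out = unet(x, t, text_cond=None)
    return null_out + (cond_out - null_out) * cond_scale

# sampling with a tuple of scales, e.g. guidance on the base unet
# but none on the upsampler, in case guidance hurts upsampling
cond_scales = (2.0, 1.0)
```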
Phil Wang
f988207718
hack around an inplace error; also make sure that for openai clip text encoding, only tokens after eos_id are masked out
2022-07-13 12:56:02 -07:00
Phil Wang
b2073219f0
foolproof sampling for decoder to always use eval mode (and restore training state afterwards)
2022-07-13 10:21:00 -07:00
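A minimal sketch of the eval-and-restore pattern this describes, assuming a plain nn.Module:

```python
import torch.nn as nn
from contextlib import contextmanager

@contextmanager
def eval_mode(model: nn.Module):
    # switch to eval mode for sampling, then restore the module's
    # top-level training flag afterwards, even if sampling raises
    was_training = model.training
    model.eval()
    try:
        yield model
    finally:
        model.train(was_training)
```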
Phil Wang
cc0f7a935c
fix non pixel shuffle upsample
2022-07-13 10:16:02 -07:00
Phil Wang
95a512cb65
fix a potential bug with conditioning on the blurred low resolution image; blur should be applied only 50% of the time
2022-07-13 10:11:49 -07:00
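A sketch of the 50% blur augmentation using torchvision's gaussian_blur; the kernel size and sigma are placeholder values:

```python
import torch
import torchvision.transforms.functional as TF

def maybe_blur(lowres_image, blur_prob=0.5):
    # blur the low resolution conditioning image only half the time,
    # so the upsampler also trains on clean inputs and does not
    # learn to expect blur at inference
    if torch.rand(()).item() < blur_prob:
        return TF.gaussian_blur(lowres_image, kernel_size=[3, 3], sigma=[0.6, 0.6])
    return lowres_image
```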
Phil Wang
972ee973bc
fix issue with ddim and normalization of lowres conditioning image
2022-07-13 09:48:40 -07:00
Phil Wang
79e2a3bc77
only use the stable layernorm for final output norm in transformer
2022-07-13 07:56:30 -07:00
Phil Wang
349aaca56f
add yet another transformer stability measure
2022-07-12 17:49:16 -07:00
Phil Wang
3ee3c56d2a
add learned padding tokens, same strategy as dalle1, for diffusion prior, and get rid of masking in causal transformer
2022-07-12 17:33:14 -07:00
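A rough sketch of dalle1-style learned padding tokens, with hypothetical names; real positions keep their encodings while padding slots are swapped for learned embeddings, removing the need for an attention mask:

```python
import torch
import torch.nn as nn

class LearnedPadding(nn.Module):
    # instead of masking out padding positions in the causal transformer,
    # substitute a learned embedding per padding slot (dalle1 strategy)
    def __init__(self, max_text_len, dim):
        super().__init__()
        self.pad_tokens = nn.Parameter(torch.randn(max_text_len, dim))

    def forward(self, text_encodings, mask):
        # mask: (batch, seq) boolean, True where a real token exists
        b, n, _ = text_encodings.shape
        pad = self.pad_tokens[:n].unsqueeze(0).expand(b, -1, -1)
        return torch.where(mask.unsqueeze(-1), text_encodings, pad)
```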
Phil Wang
cd26c6b17d
0.22.3
2022-07-12 17:08:31 -07:00
Phil Wang
775abc4df6
add setting to attend to all text encodings regardless of padding, for diffusion prior
2022-07-12 17:08:12 -07:00
Phil Wang
11b1d533a0
make sure text encodings being passed in have the correct batch dimension
2022-07-12 16:00:19 -07:00
Phil Wang
e76e89f9eb
remove text masking altogether in favor of deriving it from the text encodings (padded text encodings must use a pad value of 0.)
2022-07-12 15:40:31 -07:00
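A minimal sketch of deriving the mask from the encodings themselves, which is why the pad value must be exactly 0:

```python
import torch

def derive_text_mask(text_encodings):
    # a position is treated as padding if its encoding vector is
    # all zeros, so padded text encodings must use a pad value of 0.
    return (text_encodings != 0).any(dim=-1)

enc = torch.zeros(2, 4, 8)
enc[:, :2] = torch.randn(2, 2, 8)
mask = derive_text_mask(enc)  # True only for the first two positions
```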
Phil Wang
bb3ff0ac67
protect against bad text mask being passed into decoder
2022-07-12 15:33:13 -07:00
Phil Wang
1ec4dbe64f
one more fix for the text mask: if the length of the text encoding exceeds max_text_len, add an assert for a better error msg
2022-07-12 15:01:46 -07:00
Phil Wang
e0835acca9
generate text mask within the unet and diffusion prior itself from the text encodings, if not given
2022-07-12 12:54:59 -07:00
Phil Wang
1d9ef99288
add PixelShuffleUpsample, thanks to @MalumaDev and @marunine for running the experiment and verifying the absence of checkerboard artifacts
2022-07-11 16:07:23 -07:00
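A simplified sketch of a pixel shuffle upsample block; the repo's version may differ in kernel size and initialization:

```python
import torch
import torch.nn as nn

def PixelShuffleUpsample(dim, dim_out=None):
    # expand channels 4x with a 1x1 conv, then rearrange them
    # spatially with pixel shuffle, avoiding the checkerboard
    # artifacts of transposed convolutions
    dim_out = dim_out if dim_out is not None else dim
    return nn.Sequential(
        nn.Conv2d(dim, dim_out * 4, 1),
        nn.SiLU(),
        nn.PixelShuffle(2)
    )

up = PixelShuffleUpsample(64)
x = torch.randn(1, 64, 16, 16)
assert up(x).shape == (1, 64, 32, 32)
```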
Phil Wang
bdd62c24b3
zero init final projection in unet, since openai and @crowsonkb are both doing it
2022-07-11 13:22:06 -07:00
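A sketch of the zero init, assuming the unet's output head is a simple conv:

```python
import torch.nn as nn

# zero-initializing the final projection makes the network predict
# a zero residual at the start of training, which tends to
# stabilize early diffusion training
final_proj = nn.Conv2d(64, 3, 1)
nn.init.zeros_(final_proj.weight)
nn.init.zeros_(final_proj.bias)
```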
Phil Wang
1f1557c614
make it so that even if the text mask is omitted, it will be derived based on whether the text encodings are all 0s or not, simplifying dataloading
2022-07-11 10:56:19 -07:00
Phil Wang
7ea314e2f0
allow for final l2norm clamping of the sampled image embed
2022-07-10 09:44:38 -07:00
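A sketch of the l2norm clamp on the sampled image embedding; the scale argument is illustrative:

```python
import torch.nn.functional as F

def l2norm_clamp_embed(image_embed, image_embed_scale=1.0):
    # project the sampled image embedding back onto the hypersphere
    # that CLIP embeddings live on, optionally rescaled
    return F.normalize(image_embed, dim=-1) * image_embed_scale
```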
Phil Wang
3dae43fa0e
fix misnamed variable, thanks to @nousr
2022-07-09 19:01:37 -07:00
Phil Wang
a598820012
do not noise for the last step in ddim
2022-07-09 18:38:40 -07:00
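A sketch of a ddim update in which the stochastic term is skipped on the final step; alpha_next denotes the cumulative alpha of the next timestep, following the ddim paper's notation:

```python
import math
import torch

def ddim_step(x0_pred, eps, alpha_next, sigma, is_last_step):
    # alpha_next and sigma are python floats; the noise term is
    # dropped on the final step so the finished sample is not
    # perturbed after the last denoising update
    c = math.sqrt(max(1.0 - alpha_next - sigma ** 2, 0.0))
    x_next = math.sqrt(alpha_next) * x0_pred + c * eps
    if not is_last_step:
        x_next = x_next + sigma * torch.randn_like(x_next)
    return x_next
```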
Phil Wang
4878762627
fix for small validation bug for sampling steps
2022-07-09 17:31:54 -07:00
Phil Wang
47ae17b36e
more informative error for something that tripped me up
2022-07-09 17:28:14 -07:00
Phil Wang
b7e22f7da0
complete ddim integration of diffusion prior as well as decoder for each unet, feature complete for https://github.com/lucidrains/DALLE2-pytorch/issues/157
2022-07-09 17:25:34 -07:00
Phil Wang
097afda606
0.18.0
2022-07-08 18:18:38 -07:00
Aidan Dempster
5c520db825
Added deepspeed support ( #195 )
2022-07-08 18:18:08 -07:00
Phil Wang
3070610231
just force it so a researcher can never pass in an image smaller than the size required for CLIP or CoCa
2022-07-08 18:17:29 -07:00
Phil Wang
081d8d3484
0.17.0
2022-07-08 13:36:26 -07:00
Aidan Dempster
a71f693a26
Add the ability to auto restart the last run when started after a crash ( #191 )
...
* Added autoresume after crash functionality to the trackers
* Updated documentation
* Clarified what goes in the autorestart object
* Fixed style issues
* Unraveled conditional block
* Changed to using a helper function to get the step count
2022-07-08 13:35:40 -07:00
Phil Wang
d7bc5fbedd
expose num_steps_taken helper method on trainer to retrieve number of training steps of each unet
2022-07-08 13:00:56 -07:00
Phil Wang
8c823affff
allow control over the use of the nearest interp method for downsampling the low res conditioning, in addition to being able to turn it off
2022-07-08 11:44:43 -07:00
Phil Wang
ec7cab01d9
extra insurance that the diffusion prior is on the correct device when using the trainer, whether an accelerator or an explicit device was given
2022-07-07 10:08:33 -07:00
Phil Wang
46be8c32d3
fix a potential issue in the low resolution conditioner, when downsampling and then upsampling using resize right, thanks to @marunine
2022-07-07 09:41:49 -07:00
Phil Wang
900f086a6d
fix condition_on_text_encodings in dalle2 orchestrator class, fix readme
2022-07-07 07:43:41 -07:00
Phil Wang
6a59c7093d
more shots in the dark regarding fp16 with learned variance for deepspeed issue
2022-07-06 19:05:50 -07:00
Phil Wang
a6cdbe0b9c
relax learning rate constraint, as @rom1504 wants to try a higher one
2022-07-06 18:09:11 -07:00
Phil Wang
e928ae5c34
default the device to the one the diffusion prior parameters are on, if the trainer was given neither an accelerator nor a device
2022-07-06 12:47:48 -07:00
Phil Wang
1bd8a7835a
attempting to fix an issue with deepspeed fp16 seeing overflowing gradients
2022-07-06 08:27:34 -07:00
Phil Wang
f33453df9f
debugging with Aidan
2022-07-05 18:22:43 -07:00
Phil Wang
1e4bb2bafb
cast long as float before deriving sinusoidal pos emb
2022-07-05 18:01:22 -07:00
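A sketch of why the cast matters: timesteps arrive as int64, and the sinusoidal exp/sin/cos math needs floats:

```python
import math
import torch

def sinusoidal_pos_emb(t, dim):
    # timesteps come in as int64; cast to float first, since the
    # frequency multiplication below is undefined on integer tensors
    t = t.float()
    half_dim = dim // 2
    freqs = torch.exp(
        torch.arange(half_dim, device=t.device) * -(math.log(10000) / (half_dim - 1))
    )
    args = t[:, None] * freqs[None, :]
    return torch.cat((args.sin(), args.cos()), dim=-1)

emb = sinusoidal_pos_emb(torch.arange(8), 16)  # (8, 16)
```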
Phil Wang
ee75515c7d
remove forcing of softmax in f32, in case it is interfering with deepspeed
2022-07-05 16:53:58 -07:00
Phil Wang
ec68243479
set ability to do warmup steps for each unet during training
2022-07-05 16:24:16 -07:00
Phil Wang
3afdcdfe86
need to keep track of training steps separately for each unet in decoder trainer
2022-07-05 15:17:59 -07:00
Phil Wang
b9a908ff75
bring in two tricks from the cogview paper for reducing the chances of overflow, for attention and layernorm
2022-07-05 14:27:04 -07:00
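A sketch of the two cogview-inspired stability tricks as commonly implemented; the names here are illustrative:

```python
import torch
import torch.nn as nn

def stable_attention_scores(sim):
    # trick 1 (pb-relax): subtract the per-query max logit before
    # softmax so fp16 attention does not overflow
    return (sim - sim.amax(dim=-1, keepdim=True).detach()).softmax(dim=-1)

class StableLayerNorm(nn.Module):
    # trick 2: rescale activations by their max before layernorm,
    # keeping intermediate values within fp16 range
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        x = x / x.amax(dim=-1, keepdim=True).detach()
        return self.norm(x)
```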
Phil Wang
e1fe3089df
do bias-less layernorm manually
2022-07-05 13:09:58 -07:00
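A minimal sketch of a manual bias-less layernorm, with a learned gain but no bias:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerNorm(nn.Module):
    # layernorm written out manually so the bias can be dropped
    # (older pytorch versions do not expose a bias-free nn.LayerNorm)
    def __init__(self, dim):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        return F.layer_norm(x, x.shape[-1:], weight=self.gamma)
```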
Phil Wang
ec5a77fc55
0.15.4
2022-07-02 08:56:34 -07:00
Aidan Dempster
fac63c61bc
Fixed variable naming issue ( #183 )
2022-07-02 08:56:03 -07:00
Phil Wang
3d23ba4aa5
add ability to specify full self attention on specific stages in the unet
2022-07-01 10:22:07 -07:00