Commit Graph

  • c52ce58e10 update Phil Wang 2022-07-14 10:54:51 -07:00
  • a34f60962a let the neural network peek at the low resolution conditioning one last time before making prediction, for upsamplers v0.24.0 Phil Wang 2022-07-14 10:27:04 -07:00
  • 0b40cbaa54 just always use nearest neighbor interpolation when resizing for low resolution conditioning, for https://github.com/lucidrains/DALLE2-pytorch/pull/181 v0.23.10 Phil Wang 2022-07-13 20:59:43 -07:00
  • f141144a6d allow for using classifier free guidance for some unets but not others, by passing in a tuple of cond_scale during sampling for decoder, just in case it is causing issues for upsamplers v0.23.9 Phil Wang 2022-07-13 13:12:30 -07:00
  • f988207718 hack around some inplace error, also make sure for openai clip text encoding, only tokens after eos_id are masked out v0.23.8 Phil Wang 2022-07-13 12:56:02 -07:00
  • b2073219f0 foolproof sampling for decoder to always use eval mode (and restore training state afterwards) v0.23.7 Phil Wang 2022-07-13 10:21:00 -07:00
  • cc0f7a935c fix non pixel shuffle upsample v0.23.6 Phil Wang 2022-07-13 10:16:02 -07:00
  • 95a512cb65 fix a potential bug with conditioning with blurred low resolution image, blur should be applied only 50% of the time v0.23.5 Phil Wang 2022-07-13 10:11:49 -07:00
  • 972ee973bc fix issue with ddim and normalization of lowres conditioning image v0.23.4 Phil Wang 2022-07-13 09:48:40 -07:00
  • 79e2a3bc77 only use the stable layernorm for final output norm in transformer v0.23.3 Phil Wang 2022-07-13 07:56:25 -07:00
  • 544cdd0b29 Reverted to using basic dataloaders (#205) Aidan Dempster 2022-07-12 21:22:27 -04:00
  • 349aaca56f add yet another transformer stability measure v0.23.2 Phil Wang 2022-07-12 17:49:16 -07:00
  • 3ee3c56d2a add learned padding tokens, same strategy as dalle1, for diffusion prior, and get rid of masking in causal transformer v0.23.1 Phil Wang 2022-07-12 17:33:14 -07:00
  • 5ffc341061 add learned padding tokens, same strategy as dalle1, for diffusion prior, and get rid of masking in causal transformer v0.23.0 Phil Wang 2022-07-12 17:32:09 -07:00
  • cd26c6b17d 0.22.3 v0.22.3 Phil Wang 2022-07-12 17:08:31 -07:00
  • 775abc4df6 add setting to attend to all text encodings regardless of padding, for diffusion prior Phil Wang 2022-07-12 17:08:12 -07:00
  • 11b1d533a0 make sure text encodings being passed in have the correct batch dimension v0.22.1 Phil Wang 2022-07-12 16:00:19 -07:00
  • e76e89f9eb remove text masking altogether in favor of deriving from text encodings (padded text encodings must be pad value of 0.) v0.22.2 Phil Wang 2022-07-12 15:40:31 -07:00
  • bb3ff0ac67 protect against bad text mask being passed into decoder v0.21.3 Phil Wang 2022-07-12 15:33:13 -07:00
  • 1ec4dbe64f one more fix for text mask, if the length of the text encoding exceeds max_text_len, add an assert for better error msg v0.21.2 Phil Wang 2022-07-12 15:01:46 -07:00
  • e0835acca9 generate text mask within the unet and diffusion prior itself from the text encodings, if not given v0.21.1 Phil Wang 2022-07-12 12:54:54 -07:00
  • e055793e5d shoutout for @MalumaDev Phil Wang 2022-07-11 16:12:35 -07:00
  • 1d9ef99288 add PixelShuffleUpsample thanks to @MalumaDev and @marunine for running the experiment and verifying absence of checkerboard artifacts v0.21.0 Phil Wang 2022-07-11 16:07:23 -07:00
  • bdd62c24b3 zero init final projection in unet, since openai and @crowsonkb are both doing it v0.20.1 Phil Wang 2022-07-11 13:22:06 -07:00
  • 1f1557c614 make it so even if text mask is omitted, it will be derived based on whether text encodings are all 0s or not, simplify dataloading v0.20.0 Phil Wang 2022-07-11 10:56:11 -07:00
  • 1a217e99e3 Unet parameter count is now shown (#202) Aidan Dempster 2022-07-10 19:45:59 -04:00
  • 7ea314e2f0 allow for final l2norm clamping of the sampled image embed v0.19.6 Phil Wang 2022-07-10 09:44:31 -07:00
  • 4173e88121 more accurate readme Phil Wang 2022-07-09 20:57:26 -07:00
  • 3dae43fa0e fix misnamed variable, thanks to @nousr v0.19.5 Phil Wang 2022-07-09 19:01:37 -07:00
  • a598820012 do not noise for the last step in ddim v0.19.4 Phil Wang 2022-07-09 18:38:40 -07:00
  • 4878762627 fix for small validation bug for sampling steps v0.19.3 Phil Wang 2022-07-09 17:31:54 -07:00
  • 47ae17b36e more informative error for something that tripped me up v0.19.2 Phil Wang 2022-07-09 17:28:14 -07:00
  • b7e22f7da0 complete ddim integration of diffusion prior as well as decoder for each unet, feature complete for https://github.com/lucidrains/DALLE2-pytorch/issues/157 v0.19.1 Phil Wang 2022-07-09 17:25:34 -07:00
  • 35e6bd4f43 complete ddim integration of diffusion prior as well as decoder for each unet, feature complete for https://github.com/lucidrains/DALLE2-pytorch/issues/157 v0.19.0 Phil Wang 2022-07-09 17:21:51 -07:00
  • 68de937aac Fix decoder test by fixing the resizing output size (#197) Romain Beaumont 2022-07-09 16:48:07 +02:00
  • 3a1dea7d97 Fix decoder test by fixing the resizing output size fix_resizing_in_test Romain Beaumont 2022-07-09 11:25:32 +02:00
  • 097afda606 0.18.0 v0.18.0 Phil Wang 2022-07-08 18:18:32 -07:00
  • 5c520db825 Added deepspeed support (#195) Aidan Dempster 2022-07-08 21:18:08 -04:00
  • 3070610231 just force it so researchers can never pass in an image that is smaller than the size required for CLIP or CoCa v0.17.1 Phil Wang 2022-07-08 18:17:29 -07:00
  • 870aeeca62 Fixed issue where evaluation would error when large image was loaded (#194) Aidan Dempster 2022-07-08 20:11:34 -04:00
  • f28dc6dc01 setup simple ci (#193) Romain Beaumont 2022-07-09 01:51:56 +02:00
  • 081d8d3484 0.17.0 v0.17.0 Phil Wang 2022-07-08 13:36:26 -07:00
  • a71f693a26 Add the ability to auto restart the last run when started after a crash (#191) Aidan Dempster 2022-07-08 16:35:40 -04:00
  • d7bc5fbedd expose num_steps_taken helper method on trainer to retrieve number of training steps of each unet v0.16.19 Phil Wang 2022-07-08 13:00:56 -07:00
  • 8c823affff allow for control over use of nearest interp method of downsampling low res conditioning, in addition to being able to turn it off v0.16.18 Phil Wang 2022-07-08 11:44:43 -07:00
  • ec7cab01d9 extra insurance that diffusion prior is on the correct device, when using the trainer with an accelerator or when a device was given v0.16.17 Phil Wang 2022-07-07 10:08:33 -07:00
  • 46be8c32d3 fix a potential issue in the low resolution conditioner, when downsampling and then upsampling using resize right, thanks to @marunine v0.16.16 Phil Wang 2022-07-07 09:41:49 -07:00
  • 900f086a6d fix condition_on_text_encodings in dalle2 orchestrator class, fix readme Phil Wang 2022-07-07 07:43:41 -07:00
  • 88f516b5db fix condition_on_text_encodings in dalle2 orchestrator class, fix readme v0.16.15 Phil Wang 2022-07-07 07:42:13 -07:00
  • b3e646fd3b add readme for prior (#159) zion 2022-07-06 20:50:52 -07:00
  • 6a59c7093d more shots in the dark regarding fp16 with learned variance for deepspeed issue v0.16.14 Phil Wang 2022-07-06 19:05:50 -07:00
  • a6cdbe0b9c relax learning rate constraint, as @rom1504 wants to try a higher one v0.16.13 Phil Wang 2022-07-06 18:09:11 -07:00
  • e928ae5c34 default the device to the device that the diffusion prior parameters are on, if the trainer was never given the accelerator nor device v0.16.12 Phil Wang 2022-07-06 12:47:48 -07:00
  • 5943498cf2 default the device to the device that the diffusion prior parameters are on, if the trainer was never given the accelerator nor device v0.16.11 Phil Wang 2022-07-06 12:46:46 -07:00
  • 1bd8a7835a attempting to fix issue with deepspeed fp16 seeing overflowing gradient v0.16.10 Phil Wang 2022-07-06 08:27:34 -07:00
  • f33453df9f debugging with Aidan v0.16.9 Phil Wang 2022-07-05 18:22:43 -07:00
  • 1e4bb2bafb cast long as float before deriving sinusoidal pos emb v0.16.8 Phil Wang 2022-07-05 18:01:22 -07:00
  • ee75515c7d remove forcing of softmax in f32, in case it is interfering with deepspeed v0.16.7 Phil Wang 2022-07-05 16:53:58 -07:00
  • ec68243479 set ability to do warmup steps for each unet during training v0.16.6 Phil Wang 2022-07-05 16:24:16 -07:00
  • cf95d37e98 set ability to do warmup steps for each unet during training v0.16.5 Phil Wang 2022-07-05 16:20:49 -07:00
  • 3afdcdfe86 need to keep track of training steps separately for each unet in decoder trainer v0.16.3 Phil Wang 2022-07-05 15:17:59 -07:00
  • b9a908ff75 bring in two tricks from the cogview paper for reducing the chances of overflow, for attention and layernorm v0.16.2 Phil Wang 2022-07-05 14:27:04 -07:00
  • 3bdf85a5e9 bring in two tricks from the cogview paper for reducing the chances of overflow, for attention and layernorm v0.16.1 Phil Wang 2022-07-05 14:21:21 -07:00
  • e1fe3089df do bias-less layernorm manually v0.16.0 Phil Wang 2022-07-05 13:09:50 -07:00
  • 6d477d7654 link to dalle2 laion Phil Wang 2022-07-05 11:43:07 -07:00
  • 531fe4b62f status Phil Wang 2022-07-05 10:46:55 -07:00
  • ec5a77fc55 0.15.4 v0.15.4 Phil Wang 2022-07-02 08:56:34 -07:00
  • fac63c61bc Fixed variable naming issue (#183) Aidan Dempster 2022-07-02 11:56:03 -04:00
  • 3d23ba4aa5 add ability to specify full self attention on specific stages in the unet v0.15.3 Phil Wang 2022-07-01 10:22:07 -07:00
  • 282c35930f 0.15.2 v0.15.2 Phil Wang 2022-07-01 09:40:06 -07:00
  • 27b0f7ca0d Overhauled the tracker system (#172) Aidan Dempster 2022-07-01 12:39:40 -04:00
  • 7b0edf9e42 allow for returning low resolution conditioning image on forward through decoder with return_lowres_cond_image flag v0.15.1 Phil Wang 2022-07-01 09:35:39 -07:00
  • a922a539de bring back convtranspose2d upsampling, allow for nearest upsample with hyperparam, change kernel size of last conv to 1, make configurable, cleanup v0.15.0 Phil Wang 2022-07-01 09:21:47 -07:00
  • 8f2466f1cd blur sigma for upsampling training was 0.6 in the paper, make that the default value v0.14.1 Phil Wang 2022-06-30 17:03:16 -07:00
  • 908ab83799 add skip connections for all intermediate resnet blocks, also add an extra resnet block for memory efficient version of unet, time condition for both initial resnet block and last one before output v0.14.0 Phil Wang 2022-06-29 08:16:58 -07:00
  • 46a2558d53 bug in pydantic decoder config class v0.12.4 Phil Wang 2022-06-29 07:17:35 -07:00
  • 86109646e3 fix a bug of name error (#179) yytdfc 2022-06-29 22:16:44 +08:00
  • 6a11b9678b bring in the skip connection scaling factor, used by imagen in their unets, cite original paper using it v0.12.3 Phil Wang 2022-06-26 21:59:55 -07:00
  • b90364695d fix remaining issues with deriving cond_on_text_encodings from child unet settings v0.12.2 Phil Wang 2022-06-26 21:07:42 -07:00
  • 868c001199 bug fixes for text conditioning update (#175) zion 2022-06-26 18:12:32 -05:00
  • 032e83b0e0 nevermind, do not enforce text encodings on first unet v0.12.1 Phil Wang 2022-06-26 12:45:05 -07:00
  • 2e85e736f3 remove unnecessary decoder setting, and if not unconditional, always make sure the first unet is condition-able on text v0.12.0 Phil Wang 2022-06-26 12:32:09 -07:00
  • f5760bdb92 Add data flexibility to decoder trainer (#165) Aidan Dempster 2022-06-25 22:05:20 -04:00
  • c453f468b1 autoswitch tqdm for notebooks (#171) zion 2022-06-25 18:37:06 -05:00
  • 98f0c17759 add samples-seen and ema decay (#166) zion 2022-06-24 17:12:09 -05:00
  • a5b9fd6ca8 product management Phil Wang 2022-06-24 08:15:05 -07:00
  • 4b994601ae just make sure decoder learning rate is reasonable and help out budding researchers v0.11.5 Phil Wang 2022-06-23 11:29:21 -07:00
  • fddf66e91e fix params in decoder (#162) zion 2022-06-22 16:45:01 -05:00
  • c8422ffd5d fix EMA updating buffers with non-float tensors 0.11.4 Phil Wang 2022-06-22 07:16:39 -07:00
  • 2aadc23c7c Fix train decoder config example (#160) Conight 2022-06-22 13:17:06 +08:00
  • c098f57e09 EMA for vqgan vae comes from ema_pytorch now Phil Wang 2022-06-20 15:29:08 -07:00
  • 0021535c26 move ema to external repo v0.11.3 Phil Wang 2022-06-20 11:48:32 -07:00
  • 56883910fb cleanup Phil Wang 2022-06-20 11:14:50 -07:00
  • 893f270012 project management Phil Wang 2022-06-20 10:00:22 -07:00
  • f545ce18f4 be able to turn off p2 loss reweighting for upsamplers v0.11.2 Phil Wang 2022-06-20 09:43:31 -07:00
  • fc7abf624d in paper, blur sigma was 0.6 v0.11.1 Phil Wang 2022-06-20 09:05:08 -07:00
  • 67f0740777 small cleanup Phil Wang 2022-06-20 08:59:51 -07:00
  • 138079ca83 allow for setting beta schedules of unets differently in the decoder, as what was used in the paper was cosine, cosine, linear v0.11.0 Phil Wang 2022-06-20 08:56:32 -07:00
  • f5a906f5d3 prior train script bug fixes (#153) zion 2022-06-19 17:55:15 -05:00
  • 0215237fc6 update status Phil Wang 2022-06-19 09:42:24 -07:00
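
The sketches below illustrate, in plain PyTorch, a few of the techniques referenced in the commits above. They are hedged approximations for orientation only, not the repository's actual code.

For the eval-mode sampling safeguard of v0.23.7 (b2073219f0), a minimal context manager could switch a model to eval for sampling and then restore whatever training state it was in. The helper name eval_mode is hypothetical:

    from contextlib import contextmanager
    from torch import nn

    @contextmanager
    def eval_mode(model: nn.Module):
        # remember whether the model was training, switch to eval for sampling
        was_training = model.training
        model.eval()
        try:
            yield model
        finally:
            # restore the original training/eval state no matter what happened
            model.train(was_training)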
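For the low-resolution conditioning augmentation touched by v0.23.5 and v0.14.1 (blur sigma 0.6, applied only about half the time during training), a sketch using torchvision's gaussian_blur might look like the following. The helper name and the kernel size of 3 are assumptions:

    import random
    from torch import Tensor
    from torchvision.transforms.functional import gaussian_blur

    def maybe_blur_lowres_cond(lowres_image: Tensor, sigma: float = 0.6, prob: float = 0.5) -> Tensor:
        # blur the low resolution conditioning image only part of the time
        if random.random() < prob:
            return gaussian_blur(lowres_image, kernel_size=3, sigma=sigma)
        return lowres_image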
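For deriving the text mask directly from zero-padded text encodings (the v0.20.0 through v0.22.2 commits), the idea is that a sequence position counts as real text if any of its encoding values is non-zero. A minimal sketch, with an illustrative function name:

    import torch

    def derive_text_mask(text_encodings: torch.Tensor) -> torch.Tensor:
        # (batch, seq, dim) -> (batch, seq) boolean mask
        return (text_encodings != 0).any(dim=-1)

    encodings = torch.randn(2, 77, 512)
    encodings[:, 40:] = 0.              # simulate zero padding past the real tokens
    mask = derive_text_mask(encodings)  # True for real tokens, False for padding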
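For the pixel-shuffle upsample added in v0.21.0, which avoids the checkerboard artifacts of transposed convolutions, a generic block widens the channels by 4x and then rearranges them into a 2x spatial upsample. The exact layer composition in the repo may differ:

    import torch
    from torch import nn

    class PixelShuffleUpsample(nn.Module):
        def __init__(self, dim, dim_out=None):
            super().__init__()
            dim_out = dim_out if dim_out is not None else dim
            # produce 4x the channels, then rearrange them into a 2x spatial upsample
            self.net = nn.Sequential(
                nn.Conv2d(dim, dim_out * 4, kernel_size=1),
                nn.SiLU(),
                nn.PixelShuffle(2),
            )

        def forward(self, x):
            return self.net(x)

    x = torch.randn(1, 64, 32, 32)
    assert PixelShuffleUpsample(64)(x).shape == (1, 64, 64, 64)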
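Zero-initializing the unet's final projection (v0.20.1) means the network starts out predicting zeros, which tends to stabilize early diffusion training. The channel counts below are placeholders:

    import torch
    from torch import nn

    final_proj = nn.Conv2d(64, 3, kernel_size=1)   # channel counts are illustrative
    nn.init.zeros_(final_proj.weight)
    if final_proj.bias is not None:
        nn.init.zeros_(final_proj.bias)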
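For the DDIM detail fixed in v0.19.4 (do not noise on the last step), a generic DDIM update only injects fresh noise when there is a next timestep. This is a textbook DDIM step, not the repo's exact implementation; alpha and alpha_next denote cumulative alpha-bar values:

    import torch

    def ddim_step(x, pred_noise, alpha, alpha_next, eta=0., is_last_step=False):
        x0 = (x - (1 - alpha).sqrt() * pred_noise) / alpha.sqrt()
        sigma = eta * ((1 - alpha / alpha_next) * (1 - alpha_next) / (1 - alpha)).sqrt()
        c = (1 - alpha_next - sigma ** 2).sqrt()
        # the fix: never inject fresh noise on the final step
        noise = torch.zeros_like(x) if is_last_step else torch.randn_like(x)
        return x0 * alpha_next.sqrt() + c * pred_noise + sigma * noise

    x = torch.randn(1, 3, 64, 64)
    out = ddim_step(x, torch.randn_like(x), torch.tensor(0.5), torch.tensor(0.6),
                    eta=1., is_last_step=True)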
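Finally, the Imagen-style skip-connection scaling factor brought in by v0.12.3 simply scales skip activations by 2 ** -0.5 before they are merged back in, keeping the variance of the combined features roughly constant. The tensor shapes here are illustrative:

    import torch

    skip_scale = 2 ** -0.5
    hidden = torch.randn(1, 64, 32, 32)
    skip = torch.randn(1, 64, 32, 32)
    merged = torch.cat((hidden, skip * skip_scale), dim=1)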