Phil Wang | a6bf8ddef6 | advertise laion | 2022-05-04 15:04:05 -07:00
Phil Wang | 97b751209f | allow for last unet in the cascade to be trained on crops, if it is convolution-only | 2022-05-04 11:48:48 -07:00
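
A minimal sketch of how the cropped training above might be configured; the `random_crop_sizes` kwarg and its values are assumptions, not verified against this commit:

```python
from dalle2_pytorch import Unet, Decoder

# two-unet cascade; the second (upsampler) unet is assumed convolution-only,
# so its denoising loss can be computed on random crops of the full image
unet1 = Unet(dim = 128, image_embed_dim = 512, cond_dim = 128, dim_mults = (1, 2, 4, 8))
unet2 = Unet(dim = 64, image_embed_dim = 512, cond_dim = 128, dim_mults = (1, 2, 4, 8))

decoder = Decoder(
    unet = (unet1, unet2),
    image_sizes = (128, 256),
    random_crop_sizes = (None, 64),  # assumed kwarg: crop only the last unet's inputs
    timesteps = 1000
)
```
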
Phil Wang | 74103fd8d6 | product management | 2022-05-04 11:20:50 -07:00
Phil Wang | 1992d25cad | project management | 2022-05-04 11:18:54 -07:00
Phil Wang | 9ff228188b | offer old resnet blocks, from the original DDPM paper, just in case convnexts are unsuitable for generative work | 2022-05-04 10:52:58 -07:00
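
A sketch of opting out of convnext blocks in favor of the original DDPM resnet blocks; the `use_convnext` flag name is an assumption about what this commit exposes:

```python
from dalle2_pytorch import Unet

unet = Unet(
    dim = 128,
    image_embed_dim = 512,
    cond_dim = 128,
    dim_mults = (1, 2, 4, 8),
    use_convnext = False  # assumed flag: fall back to DDPM-style resnet blocks
)
```
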
Phil Wang | c30f380689 | final reminder | 2022-05-03 08:18:53 -07:00
Phil Wang | e4e884bb8b | keep all doors open | 2022-05-03 08:17:02 -07:00
Phil Wang | 803ad9c17d | product management again | 2022-05-03 08:15:25 -07:00
Phil Wang | a88dd6a9c0 | todo | 2022-05-03 08:09:02 -07:00
Phil Wang | fa66f7e1e9 | todo | 2022-05-02 12:57:15 -07:00
Phil Wang | 70282de23b | add ability to turn on normformer settings, given @borisdayma reported good results and some personal anecdata | 2022-05-02 11:33:15 -07:00
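
A sketch of switching the setting on for the prior's causal transformer; `normformer` as a `DiffusionPriorNetwork` kwarg is an assumption:

```python
from dalle2_pytorch import DiffusionPriorNetwork

prior_network = DiffusionPriorNetwork(
    dim = 512,
    depth = 6,
    dim_head = 64,
    heads = 8,
    normformer = True  # assumed kwarg: adds the extra LayerNorms from the NormFormer paper
)
```
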
Phil Wang | 83f761847e | todo | 2022-05-02 10:52:39 -07:00
Phil Wang | c1db2753f5 | todo | 2022-05-01 18:02:30 -07:00
Phil Wang | ad87bfe28f | switch to using linear attention for the sparse attention layers within unet, given success in GAN projects | 2022-05-01 17:59:03 -07:00
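
The unet's optional per-resolution attention layers are toggled with `sparse_attn` (the flag name is assumed from the README of this era) and, after this commit, run linear attention internally:

```python
from dalle2_pytorch import Unet

unet = Unet(
    dim = 128,
    image_embed_dim = 512,
    cond_dim = 128,
    dim_mults = (1, 2, 4, 8),
    sparse_attn = True  # these layers now use linear attention under the hood
)
```
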
Phil Wang | 902693e271 | todo | 2022-05-01 11:57:08 -07:00
Phil Wang | ad8d7a368b | product management | 2022-05-01 11:26:21 -07:00
Phil Wang | 94aaa08d97 | product management | 2022-05-01 09:43:10 -07:00
Phil Wang | 8b9bbec7d1 | project management | 2022-05-01 09:32:57 -07:00
Phil Wang | 5bfbccda22 | port over vqgan vae trainer | 2022-05-01 08:09:15 -07:00
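
A rough usage sketch of the ported trainer; the constructor arguments are assumptions modeled on lucidrains' other VQGAN-VAE trainers:

```python
from dalle2_pytorch import VQGanVAE, VQGanVAETrainer

vae = VQGanVAE(
    dim = 32,
    image_size = 256
)

trainer = VQGanVAETrainer(
    vae,
    folder = '/path/to/images',  # hypothetical image folder
    num_train_steps = 100_000,
    lr = 3e-4,
    batch_size = 4,
    grad_accum_every = 8
)

trainer.train()
```
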
Phil Wang | 989275ff59 | product management | 2022-04-30 16:57:56 -07:00
Phil Wang | 56408f4a40 | project management | 2022-04-30 16:57:02 -07:00
Phil Wang | d1a697ac23 | allow one to shortcut sampling at a specific unet number, if training in stages | 2022-04-30 16:05:13 -07:00
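
In staged training, sampling can be cut short at a given unet; a sketch assuming a `stop_at_unet_number` kwarg on `Decoder.sample`:

```python
import torch
from dalle2_pytorch import Unet, Decoder

unet1 = Unet(dim = 128, image_embed_dim = 512, cond_dim = 128, dim_mults = (1, 2, 4, 8))
unet2 = Unet(dim = 64, image_embed_dim = 512, cond_dim = 128, dim_mults = (1, 2, 4, 8))

decoder = Decoder(unet = (unet1, unet2), image_sizes = (128, 256), timesteps = 100)

image_embed = torch.randn(4, 512)

# return the 128x128 output of the first unet while the second is still untrained
images = decoder.sample(image_embed = image_embed, stop_at_unet_number = 1)
```
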
Phil Wang | ebe01749ed | DecoderTrainer sample method uses the exponentially moving averaged unets | 2022-04-30 14:55:34 -07:00
Phil Wang | a2ef69af66 | take care of mixed precision, and make gradient accumulation doable externally | 2022-04-30 12:27:24 -07:00
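
The two trainer commits above suggest a loop like the following; `amp`, the trainer handling the scaled backward internally, and the EMA-backed `sample` method are all assumptions about the API at this point:

```python
import torch
from dalle2_pytorch import Unet, Decoder, DecoderTrainer

unet = Unet(dim = 128, image_embed_dim = 512, cond_dim = 128, dim_mults = (1, 2, 4, 8))
decoder = Decoder(unet = unet, image_sizes = (128,), timesteps = 100)

trainer = DecoderTrainer(decoder, lr = 3e-4, wd = 1e-2, amp = True)  # mixed precision

images = torch.randn(8, 3, 128, 128)
image_embed = torch.randn(8, 512)

for _ in range(4):  # gradient accumulation is left to the caller
    loss = trainer(images, image_embed = image_embed, unet_number = 1)

trainer.update(unet_number = 1)  # optimizer step, then EMA update

# sampling goes through the exponentially moving averaged unets
samples = trainer.sample(image_embed = image_embed)
```
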
Phil Wang | a9421f49ec | simplify Decoder training for the public | 2022-04-30 11:45:18 -07:00
Phil Wang | f19c99ecb0 | fix decoder needing separate conditional dropping probabilities for image embeddings and text encodings, thanks to @xiankgx! | 2022-04-30 08:48:05 -07:00
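
After the fix, the two conditioning streams get independent classifier-free-guidance dropout rates, matching the README:

```python
from dalle2_pytorch import Unet, Decoder

unet = Unet(
    dim = 128,
    image_embed_dim = 512,
    cond_dim = 128,
    dim_mults = (1, 2, 4, 8),
    cond_on_text_encodings = True
)

decoder = Decoder(
    unet = unet,
    image_sizes = (128,),
    timesteps = 100,
    image_cond_drop_prob = 0.1,  # drop image embeddings 10% of the time
    text_cond_drop_prob = 0.5    # drop text encodings half the time
)
```
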
ProGamerGov | 63450b466d | Fix spelling and grammatical errors | 2022-04-30 09:18:13 -06:00
Phil Wang | e2f9615afa | use @clip-anytorch, thanks to @rom1504 | 2022-04-30 06:40:54 -07:00
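
With the clip-anytorch fork, a pretrained OpenAI CLIP can be dropped in as the adapter, as in the README:

```python
from dalle2_pytorch import OpenAIClipAdapter, DiffusionPrior, DiffusionPriorNetwork

clip = OpenAIClipAdapter()  # downloads a pretrained OpenAI CLIP by default

prior_network = DiffusionPriorNetwork(dim = 512, depth = 6, dim_head = 64, heads = 8)

diffusion_prior = DiffusionPrior(
    net = prior_network,
    clip = clip,
    timesteps = 100,
    cond_drop_prob = 0.2
)
```
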
Phil Wang | a389f81138 | todo | 2022-04-29 15:40:51 -07:00
Phil Wang | 0283556608 | fix example in readme, since api changed | 2022-04-29 13:40:55 -07:00
Phil Wang | 5063d192b6 | now completely OpenAI CLIP compatible for training; just take care of the logic for AdamW and transformers; used namedtuples for clip adapter embedding outputs | 2022-04-29 13:05:01 -07:00
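
The namedtuple outputs let callers unpack the pooled embedding and the per-token encodings by position or name; the field layout shown here is an assumption:

```python
import torch
from dalle2_pytorch import OpenAIClipAdapter

clip = OpenAIClipAdapter()

text = torch.randint(0, 49408, (4, 77))  # OpenAI CLIP token ids

# assumed namedtuple layout: (text_embed, text_encodings)
text_embed, text_encodings = clip.embed_text(text)
```
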
Phil Wang | 6700381a37 | prepare for ability to integrate CLIPs other than x-clip | 2022-04-27 19:35:05 -07:00
Phil Wang | 20377f889a | todo | 2022-04-27 17:22:14 -07:00
Phil Wang | b093f92182 | inform what is possible | 2022-04-27 08:25:16 -07:00
Phil Wang | 2705e7c9b0 | attention-based upsampling claims unsupported by local experiments, removing | 2022-04-27 07:51:04 -07:00
Phil Wang | 77141882c8 | complete vit-vqgan from https://arxiv.org/abs/2110.04627 | 2022-04-26 17:20:47 -07:00
Phil Wang | 4075d02139 | never mind, it could be working, but only when I stabilize it with the feedforward layer + tanh as proposed in the vit-vqgan paper (which will be built into the repository later for the latent diffusion) | 2022-04-26 12:43:31 -07:00
Phil Wang | bfbcc283a3 | DRY up the gaussian diffusion related logic a tiny bit | 2022-04-26 11:39:12 -07:00
Phil Wang | c30544b73a | allow for no CLIP at all when training DiffusionPrior | 2022-04-26 10:23:41 -07:00
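
Removing CLIP from prior training presumably means constructing `DiffusionPrior` from the embedding dimension alone; `image_embed_dim` is the assumed stand-in kwarg:

```python
from dalle2_pytorch import DiffusionPrior, DiffusionPriorNetwork

prior_network = DiffusionPriorNetwork(dim = 512, depth = 6, dim_head = 64, heads = 8)

diffusion_prior = DiffusionPrior(
    net = prior_network,
    image_embed_dim = 512,  # assumed kwarg: replaces a full CLIP instance
    timesteps = 100,
    cond_drop_prob = 0.2
)
```
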
Phil Wang | bdf5e9c009 | todo | 2022-04-26 09:56:54 -07:00
Phil Wang | 9878be760b | have researcher explicitly state upfront whether to condition with text encodings in the cascading ddpm decoder; have the DALLE-2 class take care of passing in text if the feature is turned on | 2022-04-26 09:47:09 -07:00
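
Text conditioning is now stated explicitly on each unet, with the `DALLE2` class forwarding the raw text only when the flag is on; the flag matches the README:

```python
from dalle2_pytorch import Unet

unet = Unet(
    dim = 128,
    image_embed_dim = 512,
    cond_dim = 128,
    dim_mults = (1, 2, 4, 8),
    cond_on_text_encodings = True  # declared upfront by the researcher
)
```
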
Phil Wang | 7ba6357c05 | allow for training the Prior network with precomputed CLIP embeddings (or text encodings) | 2022-04-26 09:29:51 -07:00
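
With precomputed CLIP embeddings, prior training skips the CLIP forward pass entirely; this mirrors the README's pattern of passing `text_embed` and `image_embed` directly:

```python
import torch
from dalle2_pytorch import DiffusionPrior, DiffusionPriorNetwork

prior_network = DiffusionPriorNetwork(dim = 512, depth = 6, dim_head = 64, heads = 8)

diffusion_prior = DiffusionPrior(
    net = prior_network,
    image_embed_dim = 512,
    timesteps = 100,
    cond_drop_prob = 0.2
)

# embeddings precomputed offline with CLIP
text_embed = torch.randn(4, 512)
image_embed = torch.randn(4, 512)

loss = diffusion_prior(text_embed = text_embed, image_embed = image_embed)
loss.backward()
```
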
Phil Wang | 0b28ee0d01 | revert to the old upsampling, as the paper's method does not work | 2022-04-26 07:39:04 -07:00
Phil Wang | 13a58a78c4 | scratch off todo | 2022-04-25 19:01:30 -07:00
Phil Wang | 3b520dfa85 | bring in attention-based upsampling to strengthen vqgan-vae; seems to work as advertised in initial GAN experiments | 2022-04-25 17:27:45 -07:00
Phil Wang | 79198c6ae4 | keep readme simple for the reader | 2022-04-25 17:21:45 -07:00
Phil Wang | 77a246b1b9 | todo | 2022-04-25 08:48:28 -07:00
Phil Wang | f93a3f6ed8 | reprioritize | 2022-04-25 08:44:27 -07:00
Phil Wang | fb8a66a2de | just in case latent diffusion performs better with prediction of x0 instead of epsilon, open up the research avenue | 2022-04-24 10:04:22 -07:00
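
Opening that avenue presumably amounts to a flag on the diffusion objective for regressing x0 directly instead of the noise; `predict_x_start` is the assumed name:

```python
from dalle2_pytorch import Unet, Decoder

unet = Unet(dim = 128, image_embed_dim = 512, cond_dim = 128, dim_mults = (1, 2, 4, 8))

decoder = Decoder(
    unet = unet,
    image_sizes = (128,),
    timesteps = 1000,
    predict_x_start = True  # assumed flag: predict x0 rather than epsilon
)
```
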
Phil Wang | d5318aef4f | todo | 2022-04-23 08:23:08 -07:00