Phil Wang | 0283556608 | fix example in readme, since api changed | 2022-04-29 13:40:55 -07:00
Phil Wang | 5063d192b6 | now completely OpenAI CLIP compatible for training; just take care of the logic for AdamW and transformers; used namedtuples for clip adapter embedding outputs | 2022-04-29 13:05:01 -07:00
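A minimal sketch of what namedtuple embedding outputs for a clip adapter can look like; the type and field names below are hypothetical, chosen only to illustrate the commit above.

```python
from collections import namedtuple

# hypothetical names - the commit only says the clip adapter returns namedtuples
EmbeddedText = namedtuple('EmbeddedText', ['text_embed', 'text_encodings'])

def embed_text(adapter, text_tokens):
    text_encodings = adapter.text_transformer(text_tokens)  # per-token encodings (assumed method)
    text_embed = text_encodings[:, 0]                       # pooled embedding (assumed pooling)
    # call sites can now read out.text_embed / out.text_encodings instead of out[0] / out[1]
    return EmbeddedText(text_embed = text_embed, text_encodings = text_encodings)
```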
Phil Wang | f4a54e475e | add some training fns | 2022-04-29 09:44:55 -07:00
Phil Wang | fb662a62f3 | fix another bug thanks to @xiankgx (0.0.65) | 2022-04-29 07:38:32 -07:00
Phil Wang | 587c8c9b44 | optimize for clarity | 2022-04-28 21:59:13 -07:00
Phil Wang | aa900213e7 | force first unet in the cascade to be conditioned on image embeds (0.0.64) | 2022-04-28 20:53:15 -07:00
Phil Wang | cb26187450 | vqgan-vae codebook dims should be 256 or smaller (0.0.63) | 2022-04-28 08:59:03 -07:00
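The small codebook dimension echoes the factorized-codes idea from ViT-VQGAN (completed in commit 77141882c8 below): quantize in a projected low-dimensional space rather than at full encoder width. A sketch under that assumption, with hypothetical names and sizes; the straight-through estimator is omitted for brevity.

```python
import torch
from torch import nn

class FactorizedCodebook(nn.Module):
    # hypothetical module illustrating why codebook dims stay at 256 or smaller
    def __init__(self, enc_dim = 512, codebook_dim = 256, codebook_size = 8192):
        super().__init__()
        assert codebook_dim <= 256, 'vqgan-vae codebook dims should be 256 or smaller'
        self.project_in = nn.Linear(enc_dim, codebook_dim)   # down to lookup space
        self.codes = nn.Embedding(codebook_size, codebook_dim)
        self.project_out = nn.Linear(codebook_dim, enc_dim)  # back up for the decoder

    def forward(self, x):                                    # x: (batch, seq, enc_dim)
        b, s, _ = x.shape
        z = self.project_in(x)
        dists = torch.cdist(z.reshape(b * s, -1), self.codes.weight)  # distance to every code
        ids = dists.argmin(dim = -1).reshape(b, s)                    # nearest-code indices
        return self.project_out(self.codes(ids)), ids
```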
Phil Wang | 625ce23f6b | 🐛 (0.0.62) | 2022-04-28 07:21:18 -07:00
Phil Wang | dbf4a281f1 | make sure another CLIP can actually be passed in, as long as it is wrapped in an adapter extended from BaseClipAdapter (0.0.61) | 2022-04-27 20:45:27 -07:00
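Per the commit above, a third-party CLIP just needs to be wrapped in a subclass of BaseClipAdapter. Only the BaseClipAdapter name comes from the commit; the import path and abstract methods below are assumptions.

```python
from dalle2_pytorch.dalle2_pytorch import BaseClipAdapter  # assumed import path

class MyClipAdapter(BaseClipAdapter):
    @property
    def dim_latent(self):
        return 512  # latent dimension of the wrapped CLIP (assumed property)

    def embed_text(self, text):
        # assumed contract: return (pooled text embedding, per-token encodings)
        return self.clip.encode_text(text), None

    def embed_image(self, image):
        return self.clip.encode_image(image), None
```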
Phil Wang | 4ab527e779 | some extra asserts for text encoding of diffusion prior and decoder (0.0.60) | 2022-04-27 20:11:43 -07:00
Phil Wang | d0cdeb3247 | add ability for DALL-E2 to return PIL images with return_pil_images = True on forward, for those who have no clue about deep learning | 2022-04-27 19:58:06 -07:00
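A sketch of the call this enables, assuming an already-assembled DALLE2 instance; the flag name comes straight from the commit.

```python
# dalle2 = DALLE2(prior = ..., decoder = ...), assembled as in the readme
images = dalle2(
    ['cute puppy chasing after a squirrel'],
    return_pil_images = True  # hand back PIL.Image objects instead of tensors
)
images[0].save('./puppy.png')
```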
Phil Wang | 8c610aad9a | only pass text encodings as conditioning in the diffusion prior if specified at initialization (0.0.58) | 2022-04-27 19:48:16 -07:00
Phil Wang | 6700381a37 | prepare for the ability to integrate CLIPs other than x-clip (0.0.57) | 2022-04-27 19:35:05 -07:00
Phil Wang | 20377f889a | todo | 2022-04-27 17:22:14 -07:00
Phil Wang | 6edb1c5dd0 | fix issue with ema class (0.0.56) | 2022-04-27 16:40:02 -07:00
Phil Wang | b093f92182 | inform what is possible | 2022-04-27 08:25:16 -07:00
Phil Wang | fa3bb6ba5c | make sure cpu-only still works (0.0.55) | 2022-04-27 08:02:10 -07:00
Phil Wang | 2705e7c9b0 | attention-based upsampling claims not borne out by local experiments; removing | 2022-04-27 07:51:04 -07:00
Phil Wang | 77141882c8 | complete vit-vqgan from https://arxiv.org/abs/2110.04627 (0.0.54) | 2022-04-26 17:20:47 -07:00
Phil Wang | 4075d02139 | never mind, it could be working, but only when stabilized with the feedforward layer + tanh as proposed in the vit-vqgan paper (which will be built into the repository later for the latent diffusion) | 2022-04-26 12:43:31 -07:00
Phil Wang | de0296106b | be able to turn off the warning for use of LazyLinear by passing in the text embedding dimension for the unet (0.0.52) | 2022-04-26 11:42:46 -07:00
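A sketch of stating the text embedding dimension up front; the keyword names follow readme-style constructors but are assumptions here. Knowing the dimension at construction time lets the conditioning projection be a regular nn.Linear instead of nn.LazyLinear, which is what silences the warning.

```python
from dalle2_pytorch import Unet

unet = Unet(
    dim = 128,
    image_embed_dim = 512,
    text_embed_dim = 512,   # assumed keyword: stating the dim avoids LazyLinear entirely
    cond_dim = 128,
    dim_mults = (1, 2, 4, 8)
)
```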
Phil Wang | eafb136214 | suppress a warning (0.0.51) | 2022-04-26 11:40:45 -07:00
Phil Wang | bfbcc283a3 | DRY a tiny bit of the gaussian diffusion related logic | 2022-04-26 11:39:12 -07:00
Phil Wang | c30544b73a | allow training the DiffusionPrior with no CLIP at all (0.0.50) | 2022-04-26 10:23:41 -07:00
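Read together with commit 7ba6357c05 further down, this suggests the prior can be constructed without any CLIP and trained on precomputed embeddings. A sketch under that assumption:

```python
import torch
from dalle2_pytorch import DiffusionPrior, DiffusionPriorNetwork

prior_network = DiffusionPriorNetwork(dim = 512, depth = 6, dim_head = 64, heads = 8)

diffusion_prior = DiffusionPrior(
    net = prior_network,
    image_embed_dim = 512,  # assumed keyword for the clip-less path
    timesteps = 100
)

# precomputed CLIP text / image embeddings stand in for a wrapped CLIP
text_embed  = torch.randn(4, 512)
image_embed = torch.randn(4, 512)

loss = diffusion_prior(text_embed = text_embed, image_embed = image_embed)
loss.backward()
```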
Phil Wang | bdf5e9c009 | todo | 2022-04-26 09:56:54 -07:00
Phil Wang | 9878be760b | have the researcher explicitly state upfront whether to condition with text encodings in the cascading ddpm decoder; have the DALLE-2 class take care of passing in the text if the feature is turned on (0.0.49) | 2022-04-26 09:47:09 -07:00
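A sketch of the upfront setting; `condition_on_text_encodings` is an assumed keyword name, and unet1, unet2, clip are taken as already defined.

```python
from dalle2_pytorch import Decoder

decoder = Decoder(
    unet = (unet1, unet2),               # cascading ddpm
    image_sizes = (128, 256),
    clip = clip,
    condition_on_text_encodings = True,  # stated explicitly at construction (assumed keyword)
    timesteps = 1000
)
```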
Phil Wang | 7ba6357c05 | allow for training the Prior network with precomputed CLIP embeddings (or text encodings) (0.0.48) | 2022-04-26 09:29:51 -07:00
Phil Wang | 76e063e8b7 | refactor so that the causal transformer in the diffusion prior network can be conditioned without text encodings (for LAION's parallel efforts, although the paper suggests they are needed) (0.0.47) | 2022-04-26 09:00:11 -07:00
Phil Wang | 4d25976f33 | make sure non-latent diffusion still works (0.0.46) | 2022-04-26 08:36:00 -07:00
Phil Wang | 0b28ee0d01 | revert back to the old upsampling; the paper's method does not work | 2022-04-26 07:39:04 -07:00
Phil Wang | 45262a4bb7 | bring in the exponential moving average wrapper, to get ready for training | 2022-04-25 19:24:13 -07:00
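The wrapper itself isn't shown in the log; below is a generic sketch of the exponential moving average update such a wrapper performs, not the repository's exact implementation.

```python
import copy
import torch

class EMA:
    def __init__(self, model, beta = 0.995):
        self.model = model
        self.ema_model = copy.deepcopy(model)  # shadow copy updated without gradients
        self.beta = beta

    @torch.no_grad()
    def update(self):
        for online_p, ema_p in zip(self.model.parameters(), self.ema_model.parameters()):
            # ema <- beta * ema + (1 - beta) * online
            ema_p.lerp_(online_p, 1. - self.beta)
```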
Phil Wang | 13a58a78c4 | scratch off todo | 2022-04-25 19:01:30 -07:00
Phil Wang | f75d49c781 | start a file for all attention-related modules; use attention-based upsampling in the unets in dalle-2 (0.0.45) | 2022-04-25 18:59:10 -07:00
Phil Wang | 3b520dfa85 | bring in attention-based upsampling to strengthen vqgan-vae; seems to work as advertised in initial GAN experiments (0.0.44) | 2022-04-25 17:27:45 -07:00
Phil Wang | 79198c6ae4 | keep readme simple for the reader | 2022-04-25 17:21:45 -07:00
Phil Wang | 77a246b1b9 | todo | 2022-04-25 08:48:28 -07:00
Phil Wang | f93a3f6ed8 | reprioritize | 2022-04-25 08:44:27 -07:00
Phil Wang | 8f2a0c7e00 | better naming (0.0.43) | 2022-04-25 07:44:33 -07:00
Phil Wang | 863f4ef243 | take care of the logic for setting all latent diffusion to predict x0, if needed (0.0.42) | 2022-04-24 10:06:42 -07:00
Phil Wang | fb8a66a2de | just in case latent diffusion performs better with prediction of x0 instead of epsilon, open up the research avenue (0.0.41) | 2022-04-24 10:04:22 -07:00
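For context on the choice these two commits open up: a DDPM network can be trained to predict either the noise epsilon or the clean image x0, and the two targets are interchangeable through the forward-process identity. Standard DDPM algebra, not repository code; the `predict_x_start` flag name is an assumption.

```python
def predict_x0_from_noise(x_t, t, noise, alphas_cumprod):
    # forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise
    # solving for x0 converts an epsilon-prediction into an x0-prediction
    abar_t = alphas_cumprod[t]
    return (x_t - (1 - abar_t).sqrt() * noise) / abar_t.sqrt()

# the training target just switches between the two parameterizations:
#   predict_x_start = True  -> loss = mse(model_out, x0)
#   predict_x_start = False -> loss = mse(model_out, noise)
```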
Phil Wang | 579d4b42dd | does not seem right to clip for the prior diffusion part (0.0.40) | 2022-04-24 09:51:18 -07:00
Phil Wang | 473808850a | some outlines for the eventual CLI endpoint | 2022-04-24 09:27:15 -07:00
Phil Wang | d5318aef4f | todo | 2022-04-23 08:23:08 -07:00
Phil Wang | f82917e1fd | prepare for turning off the gradient penalty; as shown in the GAN literature, GP only needs to be applied on 1 out of every 4 iterations (0.0.39) | 2022-04-23 07:52:10 -07:00
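A sketch of the lazy-regularization schedule referred to, applying the gradient penalty only on every fourth discriminator step; the keyword names are illustrative.

```python
apply_grad_penalty_every = 4

for step, images in enumerate(dataloader):
    # only pay for the expensive gradient penalty term 1 out of every 4 iterations
    apply_gp = (step % apply_grad_penalty_every) == 0
    loss = vae(images, return_discr_loss = True, apply_grad_penalty = apply_gp)
    loss.backward()
```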
Phil Wang | 05b74be69a | use the null container pattern to clean up some conditionals; save more cleanup for next week (0.0.38) | 2022-04-22 15:23:18 -07:00
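The null container pattern mentioned above replaces `if x is not None` branching with an always-safe stand-in object; a generic Python sketch:

```python
class NullContainer:
    # stands in for an absent collection so call sites can index or iterate
    # unconditionally instead of branching on None everywhere
    def __getitem__(self, key):
        return None

    def __iter__(self):
        return iter(())

# before: if cond_fns is not None: fn = cond_fns[i] ... else: fn = None
cond_fns = NullContainer()  # or a real list when conditioning is present
fn = cond_fns[0]            # simply None when nothing was supplied
```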
Phil Wang | a8b5d5d753 | last tweak of readme | 2022-04-22 14:16:43 -07:00
Phil Wang | 976ef7f87c | project management | 2022-04-22 14:15:42 -07:00
Phil Wang | fd175bcc0e | readme | 2022-04-22 14:13:33 -07:00
Phil Wang | 76b32f18b3 | first pass at complete DALL-E2 + Latent Diffusion integration; latent diffusion on any layer(s) of the cascading ddpm in the decoder | 2022-04-22 13:53:13 -07:00
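Going by the commit, any unet in the decoder cascade can be switched to latent diffusion by pairing it with a VQGAN-VAE; the wiring below is an assumed sketch of how that might look, with all constructor arguments hypothetical.

```python
from dalle2_pytorch import Decoder, VQGanVAE  # VQGanVAE export assumed

vae = VQGanVAE(dim = 32, image_size = 256)    # assumed constructor args

decoder = Decoder(
    unet = (unet1, unet2),    # unet1, unet2 assumed defined
    image_sizes = (256, 512),
    clip = clip,              # assumed defined
    vae = (vae, None),        # assumed: one vae per unet; None keeps that stage in pixel space
    timesteps = 1000
)
```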
Phil Wang | f2d5b87677 | todo | 2022-04-22 11:39:58 -07:00