Mirror of https://github.com/lucidrains/DALLE2-pytorch.git, synced 2026-02-15 09:04:25 +01:00
Compare commits
6 Commits
| Author | SHA1 | Date |
|---|---|---|
|  | 4b912a38c6 |  |
|  | f97e55ec6b |  |
|  | 291377bb9c |  |
|  | 7f120a8b56 |  |
|  | 8c003ab1e1 |  |
|  | 723bf0abba |  |
README.md (106 lines changed)
@@ -628,6 +628,82 @@ images = dalle2(
 Now you'll just have to worry about training the Prior and the Decoder!
 
+## Inpainting
+
+Inpainting is also built into the `Decoder`. You simply have to pass in the `inpaint_image` and `inpaint_mask` (a boolean tensor where `True` indicates which regions of the inpaint image to keep).
+
+This repository uses the formulation put forth by <a href="https://arxiv.org/abs/2201.09865">Lugmayr et al. in RePaint</a>.
+
+```python
+import torch
+from dalle2_pytorch import Unet, Decoder, CLIP
+
+# trained clip from step 1
+
+clip = CLIP(
+    dim_text = 512,
+    dim_image = 512,
+    dim_latent = 512,
+    num_text_tokens = 49408,
+    text_enc_depth = 6,
+    text_seq_len = 256,
+    text_heads = 8,
+    visual_enc_depth = 6,
+    visual_image_size = 256,
+    visual_patch_size = 32,
+    visual_heads = 8
+).cuda()
+
+# unet for the decoder (add more stages a la cascading DDPM if desired)
+
+unet = Unet(
+    dim = 16,
+    image_embed_dim = 512,
+    cond_dim = 128,
+    channels = 3,
+    dim_mults = (1, 1, 1, 1)
+).cuda()
+
+# decoder, which contains the unet(s) and clip
+
+decoder = Decoder(
+    clip = clip,
+    unet = (unet,),       # insert unets in order of lowest to highest resolution (you can have as many stages as you want here)
+    image_sizes = (256,), # resolutions, one per unet. these must be unique and in ascending order (matches with the unets passed in)
+    timesteps = 1000,
+    image_cond_drop_prob = 0.1,
+    text_cond_drop_prob = 0.5
+).cuda()
+
+# mock images (get a lot of this)
+
+images = torch.randn(4, 3, 256, 256).cuda()
+
+# feed images into decoder, specifying which unet you want to train
+# each unet can be trained separately, which is one of the benefits of the cascading DDPM scheme
+
+loss = decoder(images, unet_number = 1)
+loss.backward()
+
+# do the above for many steps
+
+mock_image_embed = torch.randn(1, 512).cuda()
+
+# then to do inpainting
+
+inpaint_image = torch.randn(1, 3, 256, 256).cuda() # (batch, channels, height, width)
+inpaint_mask = torch.ones(1, 256, 256).bool().cuda() # (batch, height, width)
+
+inpainted_images = decoder.sample(
+    image_embed = mock_image_embed,
+    inpaint_image = inpaint_image, # just pass in the inpaint image
+    inpaint_mask = inpaint_mask    # and the mask
+)
+
+inpainted_images.shape # (1, 3, 256, 256)
+```
+
 ## Experimental
 
 ### DALL-E2 with Latent Diffusion
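As background for the hunk above: the RePaint formulation needs only one extra operation inside the denoising loop. At every timestep the known image is forward-noised to the current noise level and pasted over the sample wherever the mask is `True`, so the unet only ever generates the masked-out regions. A minimal, standalone sketch of that projection (names here are illustrative, not the repository's API):

```python
import torch

def repaint_step(img, noised_known_image, keep_mask):
    # keep_mask == True: take the forward-noised known image
    # keep_mask == False: keep the freshly denoised sample
    return torch.where(keep_mask, noised_known_image, img)

img = torch.randn(1, 3, 64, 64)            # current sample x_t mid-way through sampling
noised_known = torch.randn(1, 3, 64, 64)   # stand-in for q_sample(inpaint_image, t)
keep_mask = torch.zeros(1, 1, 64, 64, dtype = torch.bool)
keep_mask[..., :32, :] = True              # keep the top half of the known image
img = repaint_step(img, noised_known, keep_mask)
```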
@@ -991,26 +1067,12 @@ dataset = ImageEmbeddingDataset(
 )
 ```
 
-### Scripts (wip)
+### Scripts
 
 #### `train_diffusion_prior.py`
 
 For detailed information on training the diffusion prior, please refer to the [dedicated readme](prior.md)
 
-## CLI (wip)
-
-```bash
-$ dream 'sharing a sunset at the summit of mount everest with my dog'
-```
-
-Once built, images will be saved to the same directory the command is invoked
-
-<a href="https://github.com/lucidrains/big-sleep">template</a>
-
-## Training CLI (wip)
-
-<a href="https://github.com/lucidrains/stylegan2-pytorch">template</a>
-
 ## Todo
 
 - [x] finish off gaussian diffusion class for latent embedding - allow for prediction of epsilon
|
||||||
@@ -1049,8 +1111,8 @@ Once built, images will be saved to the same directory the command is invoked
 - [x] test out grid attention in cascading ddpm locally, decide whether to keep or remove https://arxiv.org/abs/2204.01697 (keeping, seems to be fine)
 - [x] allow for unet to be able to condition non-cross attention style as well
 - [x] speed up inference, read up on papers (ddim)
-- [ ] add inpainting ability using resampler from repaint paper https://arxiv.org/abs/2201.09865
-- [ ] become an expert with unets, cleanup unet code, make it fully configurable, port all learnings over to https://github.com/lucidrains/x-unet (test out unet² in ddpm repo) - consider https://github.com/lucidrains/uformer-pytorch attention-based unet
+- [x] add inpainting ability using resampler from repaint paper https://arxiv.org/abs/2201.09865
+- [ ] try out the nested unet from https://arxiv.org/abs/2005.09007 after hearing several positive testimonies from researchers, for segmentation anyhow
 - [ ] interface out the vqgan-vae so a pretrained one can be pulled off the shelf to validate latent diffusion + DALL-E2
 
 ## Citations
@@ -1169,4 +1231,14 @@ Once built, images will be saved to the same directory the command is invoked
 }
 ```
 
+```bibtex
+@article{Lugmayr2022RePaintIU,
+    title   = {RePaint: Inpainting using Denoising Diffusion Probabilistic Models},
+    author  = {Andreas Lugmayr and Martin Danelljan and Andr{\'e}s Romero and Fisher Yu and Radu Timofte and Luc Van Gool},
+    journal = {ArXiv},
+    year    = {2022},
+    volume  = {abs/2201.09865}
+}
+```
+
 *Creating noise from data is easy; creating data from noise is generative modeling.* - <a href="https://arxiv.org/abs/2011.13456">Yang Song's paper</a>
@@ -74,9 +74,6 @@ Settings for controlling the training hyperparameters.
 | `validation_samples` | No | `None` | The number of samples to use for validation. `None` means the entire validation set. |
 | `use_ema` | No | `True` | Whether to use exponential moving average models for sampling. |
 | `ema_beta` | No | `0.99` | The ema coefficient. |
-| `save_all` | No | `False` | If True, preserves a checkpoint for every epoch. |
-| `save_latest` | No | `True` | If True, overwrites the `latest.pth` every time the model is saved. |
-| `save_best` | No | `True` | If True, overwrites the `best.pth` every time the model has a lower validation loss than all previous models. |
 | `unet_training_mask` | No | `None` | A boolean array of the same length as the number of unets. If false, the unet is frozen. A value of `None` trains all unets. |
 
 **<ins>Evaluate</ins>:**
@@ -163,9 +160,10 @@ All save locations have these configuration options
 | Option | Required | Default | Description |
 | ------ | -------- | ------- | ----------- |
 | `save_to` | Yes | N/A | Must be `local`, `huggingface`, or `wandb`. |
-| `save_latest_to` | No | `latest.pth` | Sets the relative path to save the latest model to. |
-| `save_best_to` | No | `best.pth` | Sets the relative path to save the best model to every time the model has a lower validation loss than all previous models. |
-| `save_type` | No | `'checkpoint'` | The type of save. `'checkpoint'` saves a checkpoint, `'model'` saves a model without any fluff (Saves with ema if ema is enabled). |
+| `save_latest_to` | No | `None` | Sets the relative path to save the latest model to. |
+| `save_best_to` | No | `None` | Sets the relative path to save the best model to every time the model has a lower validation loss than all previous models. |
+| `save_meta_to` | No | `None` | The path to save metadata files in. This includes the config files used to start the training. |
+| `save_type` | No | `checkpoint` | The type of save. `checkpoint` saves a checkpoint, `model` saves a model without any fluff (saves with ema if ema is enabled). |
 
 If using `local`
 | Option | Required | Default | Description |
@@ -177,7 +175,6 @@ If using `huggingface`
 | ------ | -------- | ------- | ----------- |
 | `save_to` | Yes | N/A | Must be `huggingface`. |
 | `huggingface_repo` | Yes | N/A | The huggingface repository to save to. |
-| `huggingface_base_path` | Yes | N/A | The base path that checkpoints will be saved under. |
 | `token_path` | No | `None` | If logging in with the huggingface cli is not possible, point to a token file instead. |
 
 If using `wandb`
@@ -56,9 +56,6 @@
         "use_ema": true,
         "ema_beta": 0.99,
         "amp": false,
-        "save_all": false,
-        "save_latest": true,
-        "save_best": true,
         "unet_training_mask": [true]
     },
     "evaluate": {
@@ -96,14 +93,15 @@
     },
 
     "save": [{
-        "save_to": "wandb"
+        "save_to": "wandb",
+        "save_latest_to": "latest.pth"
     }, {
         "save_to": "huggingface",
         "huggingface_repo": "Veldrovive/test_model",
 
-        "save_all": true,
-        "save_latest": true,
-        "save_best": true,
+        "save_latest_to": "path/to/model_dir/latest.pth",
+        "save_best_to": "path/to/model_dir/best.pth",
+        "save_meta_to": "path/to/directory/for/assorted/files",
 
         "save_type": "model"
     }]
@@ -61,9 +61,6 @@
         "use_ema": true,
         "ema_beta": 0.99,
         "amp": false,
-        "save_all": false,
-        "save_latest": true,
-        "save_best": true,
         "unet_training_mask": [true]
     },
     "evaluate": {
@@ -96,7 +93,8 @@
     },
 
     "save": [{
-        "save_to": "local"
+        "save_to": "local",
+        "save_latest_to": "latest.pth"
     }]
     }
 }
@@ -2115,7 +2115,7 @@ class Decoder(nn.Module):
         unconditional = False,          # set to True for generating images without conditioning
         auto_normalize_img = True,      # whether to take care of normalizing the image from [0, 1] to [-1, 1] and back automatically - you can turn this off if you want to pass in the [-1, 1] ranged image yourself from the dataloader
         use_dynamic_thres = False,      # from the Imagen paper
-        dynamic_thres_percentile = 0.9,
+        dynamic_thres_percentile = 0.95,
         p2_loss_weight_gamma = 0.,      # p2 loss weight, from https://arxiv.org/abs/2204.00227 - 0 is equivalent to weight of 1 across time - 1. is recommended
         p2_loss_weight_k = 1,
         ddim_sampling_eta = 1.          # can be set to 0. for deterministic sampling afaict
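For context on `use_dynamic_thres` above: dynamic thresholding, introduced in the Imagen paper (https://arxiv.org/abs/2205.11487), clamps the predicted `x0` at a per-sample percentile `s` of its absolute values and rescales by `s`, instead of hard-clipping to `[-1, 1]`; this avoids saturated, washed-out samples at high guidance scales. A rough sketch of the idea (illustrative code, not the repository's exact implementation):

```python
import torch

def dynamic_threshold(x0, percentile = 0.95):
    # per-sample percentile of |x0|, floored at 1 so images already
    # within [-1, 1] are left untouched
    s = torch.quantile(x0.flatten(1).abs(), percentile, dim = 1)
    s = s.clamp(min = 1.).view(-1, *((1,) * (x0.ndim - 1)))
    return x0.clamp(-s, s) / s  # clamp to [-s, s], then rescale into [-1, 1]
```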
@@ -2415,20 +2415,51 @@ class Decoder(nn.Module):
         return model_mean + nonzero_mask * (0.5 * model_log_variance).exp() * noise
 
     @torch.no_grad()
-    def p_sample_loop_ddpm(self, unet, shape, image_embed, noise_scheduler, predict_x_start = False, learned_variance = False, clip_denoised = True, lowres_cond_img = None, text_encodings = None, cond_scale = 1, is_latent_diffusion = False, lowres_noise_level = None):
+    def p_sample_loop_ddpm(
+        self,
+        unet,
+        shape,
+        image_embed,
+        noise_scheduler,
+        predict_x_start = False,
+        learned_variance = False,
+        clip_denoised = True,
+        lowres_cond_img = None,
+        text_encodings = None,
+        cond_scale = 1,
+        is_latent_diffusion = False,
+        lowres_noise_level = None,
+        inpaint_image = None,
+        inpaint_mask = None
+    ):
         device = self.device
 
         b = shape[0]
         img = torch.randn(shape, device = device)
 
+        if exists(inpaint_image):
+            inpaint_image = self.normalize_img(inpaint_image)
+            inpaint_image = resize_image_to(inpaint_image, shape[-1], nearest = True)
+            inpaint_mask = rearrange(inpaint_mask, 'b h w -> b 1 h w').float()
+            inpaint_mask = resize_image_to(inpaint_mask, shape[-1], nearest = True)
+            inpaint_mask = inpaint_mask.bool()
+
         if not is_latent_diffusion:
             lowres_cond_img = maybe(self.normalize_img)(lowres_cond_img)
 
         for i in tqdm(reversed(range(0, noise_scheduler.num_timesteps)), desc = 'sampling loop time step', total = noise_scheduler.num_timesteps):
+            times = torch.full((b,), i, device = device, dtype = torch.long)
+
+            if exists(inpaint_image):
+                # following the repaint paper
+                # https://arxiv.org/abs/2201.09865
+                noised_inpaint_image = noise_scheduler.q_sample(inpaint_image, t = times)
+                img = (img * ~inpaint_mask) + (noised_inpaint_image * inpaint_mask)
+
             img = self.p_sample(
                 unet,
                 img,
-                torch.full((b,), i, device = device, dtype = torch.long),
+                times,
                 image_embed = image_embed,
                 text_encodings = text_encodings,
                 cond_scale = cond_scale,
@@ -2440,11 +2471,32 @@ class Decoder(nn.Module):
                 clip_denoised = clip_denoised
             )
 
+        if exists(inpaint_image):
+            img = (img * ~inpaint_mask) + (inpaint_image * inpaint_mask)
+
         unnormalize_img = self.unnormalize_img(img)
         return unnormalize_img
 
     @torch.no_grad()
-    def p_sample_loop_ddim(self, unet, shape, image_embed, noise_scheduler, timesteps, eta = 1., predict_x_start = False, learned_variance = False, clip_denoised = True, lowres_cond_img = None, text_encodings = None, cond_scale = 1, is_latent_diffusion = False, lowres_noise_level = None):
+    def p_sample_loop_ddim(
+        self,
+        unet,
+        shape,
+        image_embed,
+        noise_scheduler,
+        timesteps,
+        eta = 1.,
+        predict_x_start = False,
+        learned_variance = False,
+        clip_denoised = True,
+        lowres_cond_img = None,
+        text_encodings = None,
+        cond_scale = 1,
+        is_latent_diffusion = False,
+        lowres_noise_level = None,
+        inpaint_image = None,
+        inpaint_mask = None
+    ):
         batch, device, total_timesteps, alphas, eta = shape[0], self.device, noise_scheduler.num_timesteps, noise_scheduler.alphas_cumprod_prev, self.ddim_sampling_eta
 
         times = torch.linspace(0., total_timesteps, steps = timesteps + 2)[:-1]
@@ -2452,6 +2504,13 @@ class Decoder(nn.Module):
         times = list(reversed(times.int().tolist()))
         time_pairs = list(zip(times[:-1], times[1:]))
 
+        if exists(inpaint_image):
+            inpaint_image = self.normalize_img(inpaint_image)
+            inpaint_image = resize_image_to(inpaint_image, shape[-1], nearest = True)
+            inpaint_mask = rearrange(inpaint_mask, 'b h w -> b 1 h w').float()
+            inpaint_mask = resize_image_to(inpaint_mask, shape[-1], nearest = True)
+            inpaint_mask = inpaint_mask.bool()
+
         img = torch.randn(shape, device = device)
 
         if not is_latent_diffusion:
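A side note on the resizing above: the boolean mask is resized with nearest-neighbor interpolation so it stays strictly binary at whatever resolution the current unet works at (bilinear resizing would produce fractional values along mask edges). A standalone illustration using `torch.nn.functional` (the repository routes this through its own `resize_image_to` helper):

```python
import torch
import torch.nn.functional as F

mask = torch.ones(1, 256, 256, dtype = torch.bool)  # (batch, height, width), as passed to sample
mask4d = mask[:, None].float()                      # (batch, 1, height, width) for interpolate
resized = F.interpolate(mask4d, size = 64, mode = 'nearest')
mask64 = resized.bool()                             # still exactly 0 or 1 after a nearest resize
```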
@@ -2463,6 +2522,12 @@ class Decoder(nn.Module):
 
             time_cond = torch.full((batch,), time, device = device, dtype = torch.long)
 
+            if exists(inpaint_image):
+                # following the repaint paper
+                # https://arxiv.org/abs/2201.09865
+                noised_inpaint_image = noise_scheduler.q_sample(inpaint_image, t = time_cond)
+                img = (img * ~inpaint_mask) + (noised_inpaint_image * inpaint_mask)
+
             pred = unet.forward_with_cond_scale(img, time_cond, image_embed = image_embed, text_encodings = text_encodings, cond_scale = cond_scale, lowres_cond_img = lowres_cond_img, lowres_noise_level = lowres_noise_level)
 
             if learned_variance:
@@ -2486,6 +2551,9 @@ class Decoder(nn.Module):
                 c1 * noise + \
                 c2 * pred_noise
 
+        if exists(inpaint_image):
+            img = (img * ~inpaint_mask) + (inpaint_image * inpaint_mask)
+
         img = self.unnormalize_img(img)
         return img
 
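For reference on `c1` and `c2` in the update above: they are the DDIM coefficients from Song et al. (https://arxiv.org/abs/2010.02502), where `eta` scales the per-step noise and `eta = 0` recovers the deterministic sampler. A sketch under the convention that `alpha` and `alpha_next` are the cumulative alpha products at the current and next (earlier) timestep:

```python
import math

def ddim_coefficients(alpha, alpha_next, eta = 1.):
    # c1 is the noise scale sigma from the DDIM paper; eta = 0 makes it vanish
    c1 = eta * math.sqrt((1 - alpha / alpha_next) * (1 - alpha_next) / (1 - alpha))
    # c2 weights the predicted noise so the step stays variance-preserving
    c2 = math.sqrt(1 - alpha_next - c1 ** 2)
    return c1, c2
```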
@@ -2585,6 +2653,8 @@ class Decoder(nn.Module):
         cond_scale = 1.,
         stop_at_unet_number = None,
         distributed = False,
+        inpaint_image = None,
+        inpaint_mask = None
     ):
         assert self.unconditional or exists(image_embed), 'image embed must be present on sampling from decoder unless if trained unconditionally'
 
@@ -2598,6 +2668,8 @@ class Decoder(nn.Module):
         assert not (self.condition_on_text_encodings and not exists(text_encodings)), 'text or text encodings must be passed into decoder if specified'
         assert not (not self.condition_on_text_encodings and exists(text_encodings)), 'decoder specified not to be conditioned on text, yet it is presented'
 
+        assert not (exists(inpaint_image) ^ exists(inpaint_mask)), 'inpaint_image and inpaint_mask (boolean mask of [batch, height, width]) must be both given for inpainting'
+
         img = None
         is_cuda = next(self.parameters()).is_cuda
 
@@ -2609,6 +2681,8 @@ class Decoder(nn.Module):
             context = self.one_unet_in_gpu(unet = unet) if is_cuda and not distributed else null_context()
 
             with context:
+                # prepare low resolution conditioning for upsamplers
+
                 lowres_cond_img = lowres_noise_level = None
                 shape = (batch_size, channel, image_size, image_size)
 
@@ -2619,12 +2693,16 @@ class Decoder(nn.Module):
                     lowres_noise_level = torch.full((batch_size,), int(self.lowres_noise_sample_level * 1000), dtype = torch.long, device = self.device)
                     lowres_cond_img, _ = lowres_cond.noise_image(lowres_cond_img, lowres_noise_level)
 
+                # latent diffusion
+
                 is_latent_diffusion = isinstance(vae, VQGanVAE)
                 image_size = vae.get_encoded_fmap_size(image_size)
                 shape = (batch_size, vae.encoded_dim, image_size, image_size)
 
                 lowres_cond_img = maybe(vae.encode)(lowres_cond_img)
 
+                # denoising loop for image
+
                 img = self.p_sample_loop(
                     unet,
                     shape,
@@ -2638,7 +2716,9 @@ class Decoder(nn.Module):
                     lowres_noise_level = lowres_noise_level,
                     is_latent_diffusion = is_latent_diffusion,
                     noise_scheduler = noise_scheduler,
-                    timesteps = sample_timesteps
+                    timesteps = sample_timesteps,
+                    inpaint_image = inpaint_image,
+                    inpaint_mask = inpaint_mask
                 )
 
                 img = vae.decode(img)
@@ -4,13 +4,15 @@ import json
 from pathlib import Path
 import shutil
 from itertools import zip_longest
-from typing import Optional, List, Union
+from typing import Any, Optional, List, Union
 from pydantic import BaseModel
 
 import torch
+from dalle2_pytorch.dalle2_pytorch import Decoder, DiffusionPrior
 from dalle2_pytorch.utils import import_or_print_error
 from dalle2_pytorch.trainer import DecoderTrainer, DiffusionPriorTrainer
+from dalle2_pytorch.version import __version__
+from packaging import version
 
 # constants
 
@@ -21,16 +23,6 @@ DEFAULT_DATA_PATH = './.tracker-data'
 def exists(val):
     return val is not None
 
-# load file functions
-
-def load_wandb_file(run_path, file_path, **kwargs):
-    wandb = import_or_print_error('wandb', '`pip install wandb` to use the wandb recall function')
-    file_reference = wandb.restore(file_path, run_path=run_path)
-    return file_reference.name
-
-def load_local_file(file_path, **kwargs):
-    return file_path
-
 class BaseLogger:
     """
     An abstract class representing an object that can log data.
@@ -234,7 +226,7 @@ class LocalLoader(BaseLoader):
 
     def init(self, logger: BaseLogger, **kwargs) -> None:
         # Makes sure the file exists to be loaded
-        if not self.file_path.exists():
+        if not self.file_path.exists() and not self.only_auto_resume:
             raise FileNotFoundError(f'Model not found at {self.file_path}')
 
     def recall(self) -> dict:
@@ -283,9 +275,9 @@ def create_loader(loader_type: str, data_path: str, **kwargs) -> BaseLoader:
 class BaseSaver:
     def __init__(self,
         data_path: str,
-        save_latest_to: Optional[Union[str, bool]] = 'latest.pth',
-        save_best_to: Optional[Union[str, bool]] = 'best.pth',
-        save_meta_to: str = './',
+        save_latest_to: Optional[Union[str, bool]] = None,
+        save_best_to: Optional[Union[str, bool]] = None,
+        save_meta_to: Optional[str] = None,
         save_type: str = 'checkpoint',
         **kwargs
     ):
@@ -295,10 +287,10 @@ class BaseSaver:
         self.save_best_to = save_best_to
         self.saving_best = save_best_to is not None and save_best_to is not False
         self.save_meta_to = save_meta_to
+        self.saving_meta = save_meta_to is not None
         self.save_type = save_type
         assert save_type in ['checkpoint', 'model'], '`save_type` must be one of `checkpoint` or `model`'
-        assert self.save_meta_to is not None, '`save_meta_to` must be provided'
-        assert self.saving_latest or self.saving_best, '`save_latest_to` or `save_best_to` must be provided'
+        assert self.saving_latest or self.saving_best or self.saving_meta, 'At least one saving option must be specified'
 
     def init(self, logger: BaseLogger, **kwargs) -> None:
         raise NotImplementedError
@@ -459,6 +451,11 @@ class Tracker:
             print(f'\n\nWARNING: RUN HAS BEEN AUTO-RESUMED WITH THE LOGGER TYPE {self.logger.__class__.__name__}.\nIf this was not your intention, stop this run and set `auto_resume` to `False` in the config.\n\n')
             print(f"New logger config: {self.logger.__dict__}")
 
+        self.save_metadata = dict(
+            version = version.parse(__version__)
+        )  # Data that will be saved alongside the checkpoint or model
+        self.blacklisted_checkpoint_metadata_keys = ['scaler', 'optimizer', 'model', 'version', 'step', 'steps']  # These keys would cause us to error if we try to save them as metadata
+
         assert self.logger is not None, '`logger` must be set before `init` is called'
         if self.dummy_mode:
             # The only thing we need is a loader
@@ -507,8 +504,15 @@ class Tracker:
         # Save the config under config_name in the root folder of data_path
         shutil.copy(current_config_path, self.data_path / config_name)
         for saver in self.savers:
-            remote_path = Path(saver.save_meta_to) / config_name
-            saver.save_file(current_config_path, str(remote_path))
+            if saver.saving_meta:
+                remote_path = Path(saver.save_meta_to) / config_name
+                saver.save_file(current_config_path, str(remote_path))
+
+    def add_save_metadata(self, state_dict_key: str, metadata: Any):
+        """
+        Adds a new piece of metadata that will be saved along with the model or decoder.
+        """
+        self.save_metadata[state_dict_key] = metadata
 
     def _save_state_dict(self, trainer: Union[DiffusionPriorTrainer, DecoderTrainer], save_type: str, file_path: str, **kwargs) -> Path:
         """
@@ -518,24 +522,34 @@ class Tracker:
         """
         assert save_type in ['checkpoint', 'model']
         if save_type == 'checkpoint':
-            trainer.save(file_path, overwrite=True, **kwargs)
+            # Create a metadata dict without the blacklisted keys so we do not error when we create the state dict
+            metadata = {k: v for k, v in self.save_metadata.items() if k not in self.blacklisted_checkpoint_metadata_keys}
+            trainer.save(file_path, overwrite=True, **kwargs, **metadata)
         elif save_type == 'model':
             if isinstance(trainer, DiffusionPriorTrainer):
                 prior = trainer.ema_diffusion_prior.ema_model if trainer.use_ema else trainer.diffusion_prior
-                state_dict = trainer.unwrap_model(prior).state_dict()
-                torch.save(state_dict, file_path)
+                prior: DiffusionPrior = trainer.unwrap_model(prior)
+                # Remove CLIP if it is part of the model
+                prior.clip = None
+                model_state_dict = prior.state_dict()
             elif isinstance(trainer, DecoderTrainer):
-                decoder = trainer.accelerator.unwrap_model(trainer.decoder)
+                decoder: Decoder = trainer.accelerator.unwrap_model(trainer.decoder)
+                # Remove CLIP if it is part of the model
+                decoder.clip = None
                 if trainer.use_ema:
                     trainable_unets = decoder.unets
                     decoder.unets = trainer.unets # Swap EMA unets in
-                    state_dict = decoder.state_dict()
+                    model_state_dict = decoder.state_dict()
                     decoder.unets = trainable_unets # Swap back
                 else:
-                    state_dict = decoder.state_dict()
-                torch.save(state_dict, file_path)
+                    model_state_dict = decoder.state_dict()
             else:
                 raise NotImplementedError('Saving this type of model with EMA mode enabled is not yet implemented. Actually, how did you get here?')
+            state_dict = {
+                **self.save_metadata,
+                'model': model_state_dict
+            }
+            torch.save(state_dict, file_path)
         return Path(file_path)
 
     def save(self, trainer, is_best: bool, is_latest: bool, **kwargs):
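With the change above, a file written with `save_type = 'model'` is no longer a bare state dict: whatever was registered through `add_save_metadata` is saved alongside the weights, which now live under a `'model'` key. A toy round trip illustrating the layout (the model and keys here are illustrative):

```python
import torch
import torch.nn as nn

net = nn.Linear(4, 4)

# metadata keys (e.g. 'version', 'config') sit next to the weights
state = {'version': '0.26.2', 'config': {'dim': 4}, 'model': net.state_dict()}
torch.save(state, 'model.pth')

loaded = torch.load('model.pth')
net.load_state_dict(loaded['model'])  # the bare weights live under 'model'
print(loaded['version'])              # metadata travels with the file
```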
@@ -1 +1 @@
-__version__ = '0.25.2'
+__version__ = '0.26.2'
@@ -513,6 +513,7 @@ def create_tracker(accelerator: Accelerator, config: TrainDecoderConfig, config_
     }
     tracker: Tracker = tracker_config.create(config, accelerator_config, dummy_mode=dummy)
     tracker.save_config(config_path, config_name='decoder_config.json')
+    tracker.add_save_metadata(state_dict_key='config', metadata=config.dict())
     return tracker
 
 def initialize_training(config: TrainDecoderConfig, config_path):