Sampling is now possible without the first decoder unet
Non-training unets are deleted in the decoder trainer since they are never used, and it is harder to merge the models if they have keys in this state dict
Fixed a mistake where clip was not re-added after saving
The default save location is now None, so if keys are not specified the corresponding checkpoint type is not saved.
Models and checkpoints are now both saved with version number and the
config used to create them in order to simplify loading.
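The versioned save format described above can be sketched as follows. This is a hypothetical illustration, not the project's actual code; `build_checkpoint` and `load_checkpoint` are stand-in names, and `model_state`/`config` represent the real state dict and creation config.

```python
# Minimal sketch: bundle the version number and creation config into the
# checkpoint so a loader can validate compatibility before restoring weights.

def build_checkpoint(model_state, config, version="1.0.0"):
    return {
        "version": version,     # version number the checkpoint was saved with
        "config": config,       # config used to create the model
        "model": model_state,   # the model's weights (state dict)
    }

def load_checkpoint(ckpt, expected_version="1.0.0"):
    if ckpt.get("version") != expected_version:
        raise ValueError(
            f"checkpoint version {ckpt.get('version')} != {expected_version}"
        )
    return ckpt["model"], ckpt["config"]
```

Storing the config alongside the weights means loading no longer requires the caller to reconstruct the model configuration by hand.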
Documentation was fixed to be in line with current usage.
* Added autoresume after crash functionality to the trackers
* Updated documentation
* Clarified what goes in the autorestart object
* Fixed style issues
Unraveled conditional block
Changed to using a helper function to get the step count
* Overhauled the tracker system
Separated the logging and saving capabilities
Changed creation to be consistent and initializing behavior to be defined by a class initializer instead of in the training script
Added class separation between different types of loaders and savers to make the system more explicit
* Changed the saver system to only save the checkpoint once
* Added better error handling for saving checkpoints
* Fixed an error where wandb would fail when passed arbitrary kwargs
* Fixed variable naming issues for improved saver
Added more logging during long pauses
* Fixed which methods need to be dummied out to immediately return
Added the ability to set whether to find unused parameters
* Added more logging for when a wandb loader fails
* Added the ability to train decoder with text embeddings
* Added the ability to train using embeddings generated on the fly with clip
* Clip now generates embeddings for whatever is not precomputed
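The fallback behavior above can be sketched as a small helper. This is an illustrative sketch, not the project's code; `embed_fn` stands in for a real clip embedding call, and `precomputed` for the mapping of already-computed embeddings.

```python
# Minimal sketch: reuse precomputed embeddings where available, and invoke
# clip (embed_fn) only for the items that were not precomputed.

def get_embeddings(keys, precomputed, embed_fn):
    embeddings = {}
    for key in keys:
        if key in precomputed:
            embeddings[key] = precomputed[key]  # reuse precomputed embedding
        else:
            embeddings[key] = embed_fn(key)     # generate with clip on the fly
    return embeddings
```

This keeps training fast on datasets with precomputed embeddings while still supporting datasets that have none.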
* Converted decoder trainer to use accelerate
* Fixed issue where metric evaluation would hang on distributed mode
* Implemented functional saving
Loading still fails due to some issue with the optimizer
* Fixed issue with loading decoders
* Fixed issue with tracker config
* Fixed issue with amp
Updated logging to be more logical
* Saving a checkpoint now saves the position in training as well
Fixed an issue where the GPU ran out of memory due to the weights being loaded onto the GPU twice
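The usual fix for this class of bug is to load the checkpoint into host memory first and transfer a single copy to the GPU. A minimal sketch, assuming a standard `state_dict` checkpoint (`load_weights` is a hypothetical helper, not the project's API):

```python
import torch

def load_weights(model, path, device="cuda"):
    # map_location="cpu" keeps the loaded tensors in host memory instead of
    # materializing a second copy on the GPU alongside the model's own weights
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state)  # copy weights into the model's parameters
    return model.to(device)       # single transfer to the target device
```

Without `map_location`, `torch.load` restores tensors to the device they were saved from, so a GPU-saved checkpoint briefly occupies GPU memory twice.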
* Fixed ema for distributed training
* Fixed issue where get_pkg_version was reintroduced
* Changed decoder trainer to upload config as a file
Fixed issue where loading the best checkpoint would error