MLOps principles for Butterfly Image generation with Stable Diffusion

Project repository for the DTU course Machine Learning Operations (02476). The primary focus is to apply MLOps principles to a larger deep learning project.

Made by:

Andreas H: s194235
Andreas H: s194238
Yucheng F: s194241
Christian A: s194255
Malthe A: s194257

Overall goal of the project

The goal of the project is to train a Stable Diffusion generative model to generate photorealistic images of butterflies and implement MLOps practices for efficient training and deployment to the cloud.

Data-specific framework   Training framework   Utility framework
Hugging Face Diffusers    PyTorch Lightning    Hydra

We are using Hugging Face Diffusers as our main framework. To keep the model code simple, we will write it with PyTorch Lightning, and we will use Hydra to configure the models. Furthermore, we will use Weights & Biases (W&B) to log relevant information about model training.

The Hugging Face framework provides convenience functions for loading the data that we are going to use. The PyTorch Lightning ecosystem (for example TorchMetrics) has tools to evaluate the quality of the generated images, such as the Inception score.
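To make the Inception score mentioned above concrete, here is a minimal sketch of the formula itself, IS = exp(mean over images of KL(p(y|x) || p(y))). It assumes the per-image class probabilities have already been produced by a pretrained classifier; the function name `inception_score` and the toy inputs are illustrative and are not taken from this repository.

```python
import math


def inception_score(probs):
    """Compute the Inception score from per-image class probabilities.

    probs: list of lists, one probability vector p(y|x) per image.
    In practice these come from a pretrained Inception network;
    here they are assumed to be given.
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    kl_sum = 0.0
    for p in probs:
        for j in range(k):
            # Terms with p[j] == 0 contribute nothing to the KL divergence.
            if p[j] > 0.0:
                kl_sum += p[j] * math.log(p[j] / marginal[j])
    # Exponential of the mean KL divergence between p(y|x) and p(y).
    return math.exp(kl_sum / n)


# Toy inputs: confident and diverse predictions versus uninformative ones.
confident = [[1.0, 0.0], [0.0, 1.0]]
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(inception_score(confident), inception_score(uniform))
```

A higher score means each image is classified confidently while the set of images covers many classes. The score is bounded below by 1 and above by the number of classes.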

The deep learning model is an unconditional stable diffusion model. It generates images by learning to remove noise that has been added to training images. It is unconditional because it generates images directly from noise, without an additional input such as a text prompt.
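The idea of "learning to remove noise" can be sketched in a few lines. The example below is not the project's training code: it assumes a DDPM-style linear noise schedule (the values 1e-4 to 0.02 over 1000 steps are a common default, used here as an assumption), and the helper names `make_alpha_bars`, `add_noise` and `estimate_x0` are hypothetical. It shows how a clean sample is mixed with Gaussian noise, and how a perfect noise prediction would let us recover the clean sample.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars = []
    prod = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def add_noise(x0, noise, alpha_bar):
    """Forward process: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, noise)]


def estimate_x0(xt, predicted_noise, alpha_bar):
    """Invert the forward process given a (predicted) noise vector."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [(x - b * e) / a for x, e in zip(xt, predicted_noise)]


random.seed(0)
alpha_bars = make_alpha_bars()
x0 = [0.1, -0.5, 0.9]                      # stand-in for image pixels
noise = [random.gauss(0.0, 1.0) for _ in x0]
xt = add_noise(x0, noise, alpha_bars[500])  # noisy sample at step 500
# A trained network would predict `noise` from `xt`; here we use the true noise.
recovered = estimate_x0(xt, noise, alpha_bars[500])
print(recovered)
```

In a real model the noise prediction comes from a neural network and is only approximate, so sampling removes the noise gradually over many steps rather than in a single inversion.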

Data

We are using the “Smithsonian butterflies subset” dataset, which contains 1000 images of butterflies.

Our MLOps stack

To set up the repo

git pull
dvc pull
pip install -r requirements.txt

Alternatively, build the Docker image from the dockerfile:

docker build -f env.dockerfile . -t env:latest

Project Organization

├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── predict_model.py
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.readthedocs.io
