Llama in a container

This README provides guidance for setting up a Dockerized, CUDA-enabled environment running several services: llama-cpp-python, Stable Diffusion, MariaDB, MongoDB, Redis, and Grafana.

Prerequisites

Setup

  1. Set the environment variables in .env (do not commit changes to this file; its defaults are not production-ready).
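The exact variable names depend on docker-compose.yml; a hypothetical .env might look like the following. OPENAI_BASE_URL and OPENAI_API_KEY are described later in this README; the database entries are placeholder examples, not the project's confirmed keys.

```
# Hypothetical example values — check docker-compose.yml for the real variable names
OPENAI_BASE_URL=http://localhost:5001/v1
OPENAI_API_KEY=sk-local-placeholder
MARIADB_ROOT_PASSWORD=changeme
MONGO_INITDB_ROOT_PASSWORD=changeme
```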

Building and Running Containers

```shell
cd llama-docker
docker build -t base_image -f docker/Dockerfile.base .  # build the base image
docker build -t cuda_image -f docker/Dockerfile.cuda .  # build the CUDA image
docker compose up --build -d  # build and start the containers, detached

## Useful commands
docker compose up -d                      # start the containers
docker compose stop                       # stop the containers
docker compose up --build -d              # rebuild and restart the containers
docker ps                                 # list running containers
docker logs {container id}                # show container logs
docker exec -it {container id} /bin/bash  # open a shell inside a container
```

FastAPI/WebUI

  • Access: http://{ip address}:5000 (see the Initial Run guide)
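A quick way to confirm the WebUI container is reachable is a stdlib HTTP probe; a minimal sketch, assuming the service answers on `/` at port 5000 (the host and path here are illustrative, not from the repo):

```python
import urllib.error
import urllib.request


def webui_url(host: str, port: int = 5000) -> str:
    """Build the WebUI base URL (port 5000, per this README)."""
    return f"http://{host}:{port}"


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if anything answers HTTP at the URL, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server responded, just not with 2xx
    except OSError:
        return False  # connection refused, timeout, DNS failure, ...


if __name__ == "__main__":
    print(is_up(webui_url("127.0.0.1")))
```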

llama-cpp-python OpenAI Compatible Server

  • Access: http://{ip address}:5001/docs
  • Model folder configuration: see docker-compose.yml and the OpenAI Compatible API guide

llama-cpp-python OpenAI Compatible Server API Configuration

  • Multi-model support: see the Configuration and Multi-model Support guide
  • Configuration file: llama_config.json
  • OPENAI_BASE_URL and OPENAI_API_KEY are set in .env
    • These variables make clients interchangeable between the local server and the hosted OpenAI API.
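Because the server speaks the OpenAI wire format, any client that reads OPENAI_BASE_URL and OPENAI_API_KEY works against either backend. A stdlib-only sketch (the model name "local-model" and the fallback URL are placeholder assumptions; real model names come from llama_config.json):

```python
import json
import os
import urllib.request


def build_chat_request(model: str, prompt: str) -> dict:
    """Payload for the OpenAI-compatible /chat/completions endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str, model: str = "local-model") -> str:
    # OPENAI_BASE_URL / OPENAI_API_KEY come from .env, as described above;
    # the defaults below are illustrative fallbacks.
    base = os.environ.get("OPENAI_BASE_URL", "http://localhost:5001/v1")
    key = os.environ.get("OPENAI_API_KEY", "none")
    req = urllib.request.Request(
        f"{base}/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Pointing OPENAI_BASE_URL at https://api.openai.com/v1 with a real key switches the same code to the hosted API.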

Stable Diffusion

  • Access: http://{ip address}:5002/docs
  • Images are saved to assets/sd_images (see the Stable Diffusion API guide)
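Calling the image service from a script follows the same pattern as the chat endpoint. The endpoint path and field names below (`/txt2img`, `prompt`, `steps`) are assumptions for illustration; check the live schema at http://{ip address}:5002/docs before relying on them:

```python
import json
import urllib.request


def build_txt2img_request(prompt: str, steps: int = 20) -> dict:
    """Hypothetical payload — verify field names against the /docs page."""
    return {"prompt": prompt, "steps": steps}


def generate(host: str, prompt: str) -> int:
    # Path "/txt2img" is an assumption. The container writes finished
    # images to assets/sd_images on its own, per this README.
    req = urllib.request.Request(
        f"http://{host}:5002/txt2img",
        data=json.dumps(build_txt2img_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```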

Grafana

  • http://{ip address}:7000
  • Default username/password: admin/admin

DB

  • MariaDB {ip address}:6000
  • MongoDB {ip address}:6001
  • Redis {ip address}:6002
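The three databases are reachable at the ports above with any standard client; a small helper that assembles connection URIs from a host (the credentials here are placeholders — the real ones come from .env):

```python
def db_uris(host: str, user: str = "root", password: str = "changeme") -> dict:
    """Connection URIs for the published ports 6000/6001/6002 in this README.

    Credentials are illustrative defaults; substitute the values from .env.
    """
    return {
        "mariadb": f"mysql://{user}:{password}@{host}:6000",
        "mongodb": f"mongodb://{user}:{password}@{host}:6001",
        "redis": f"redis://{host}:6002/0",
    }
```

These strings can be passed to drivers such as mysqlclient, pymongo, or redis-py.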

Additional Resources
