Llama in a container

This README provides guidance for setting up a Dockerized, CUDA-enabled environment running several services: llama-cpp-python, Stable Diffusion, MariaDB, MongoDB, Redis, and Grafana.

Prerequisites

Setup

  1. Set the environment variables in .env (do not commit changes to this file; its defaults are not production-ready).
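The exact variable names depend on docker-compose.yml; a hypothetical .env might look like the following. OPENAI_BASE_URL and OPENAI_API_KEY are described later in this README; the database entries are placeholder examples, not the project's confirmed keys.

```
# Hypothetical example values — check docker-compose.yml for the real variable names
OPENAI_BASE_URL=http://localhost:5001/v1
OPENAI_API_KEY=sk-local-placeholder
MARIADB_ROOT_PASSWORD=changeme
MONGO_INITDB_ROOT_PASSWORD=changeme
```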

Building and Running Containers

```shell
cd llama-docker
docker build -t base_image -f docker/Dockerfile.base .  # build the base image
docker build -t cuda_image -f docker/Dockerfile.cuda .  # build the CUDA image
docker compose up --build -d  # build and start the containers, detached

## Useful commands
docker compose up -d                      # start the containers
docker compose stop                       # stop the containers
docker compose up --build -d              # rebuild and restart the containers
docker ps                                 # list running containers
docker logs {container id}                # show container logs
docker exec -it {container id} /bin/bash  # open a shell inside a container
```

FastAPI/WebUI

  • Access: http://{ip address}:5000 (see the Initial Run guide)
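A quick way to confirm the WebUI container is reachable is a stdlib HTTP probe; a minimal sketch, assuming the service answers on `/` at port 5000 (the host and path here are illustrative, not from the repo):

```python
import urllib.error
import urllib.request


def webui_url(host: str, port: int = 5000) -> str:
    """Build the WebUI base URL (port 5000, per this README)."""
    return f"http://{host}:{port}"


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if anything answers HTTP at the URL, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server responded, just not with 2xx
    except OSError:
        return False  # connection refused, timeout, DNS failure, ...


if __name__ == "__main__":
    print(is_up(webui_url("127.0.0.1")))
```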

llama-cpp-python OpenAI Compatible Server

  • Access: http://{ip address}:5001/docs
  • Model folder configuration: see docker-compose.yml and the OpenAI Compatible API guide

llama-cpp-python OpenAI Compatible Server API Configuration

  • Multi-model support: see the Configuration and Multi-model Support guide
  • Configuration file: llama_config.json
  • OPENAI_BASE_URL and OPENAI_API_KEY are set in .env
    • These variables make clients interchangeable between the local server and the hosted OpenAI API.
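Because the server speaks the OpenAI wire format, any client that reads OPENAI_BASE_URL and OPENAI_API_KEY works against either backend. A stdlib-only sketch (the model name "local-model" and the fallback URL are placeholder assumptions; real model names come from llama_config.json):

```python
import json
import os
import urllib.request


def build_chat_request(model: str, prompt: str) -> dict:
    """Payload for the OpenAI-compatible /chat/completions endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str, model: str = "local-model") -> str:
    # OPENAI_BASE_URL / OPENAI_API_KEY come from .env, as described above;
    # the defaults below are illustrative fallbacks.
    base = os.environ.get("OPENAI_BASE_URL", "http://localhost:5001/v1")
    key = os.environ.get("OPENAI_API_KEY", "none")
    req = urllib.request.Request(
        f"{base}/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Pointing OPENAI_BASE_URL at https://api.openai.com/v1 with a real key switches the same code to the hosted API.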

Stable Diffusion

  • Access: http://{ip address}:5002/docs
  • Images are saved to assets/sd_images (see the Stable Diffusion API guide)
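Calling the image service from a script follows the same pattern as the chat endpoint. The endpoint path and field names below (`/txt2img`, `prompt`, `steps`) are assumptions for illustration; check the live schema at http://{ip address}:5002/docs before relying on them:

```python
import json
import urllib.request


def build_txt2img_request(prompt: str, steps: int = 20) -> dict:
    """Hypothetical payload — verify field names against the /docs page."""
    return {"prompt": prompt, "steps": steps}


def generate(host: str, prompt: str) -> int:
    # Path "/txt2img" is an assumption. The container writes finished
    # images to assets/sd_images on its own, per this README.
    req = urllib.request.Request(
        f"http://{host}:5002/txt2img",
        data=json.dumps(build_txt2img_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```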

Grafana

  • http://{ip address}:7000
  • Default username/password: admin/admin

DB

  • MariaDB {ip address}:6000
  • MongoDB {ip address}:6001
  • Redis {ip address}:6002
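The three databases are reachable at the ports above with any standard client; a small helper that assembles connection URIs from a host (the credentials here are placeholders — the real ones come from .env):

```python
def db_uris(host: str, user: str = "root", password: str = "changeme") -> dict:
    """Connection URIs for the published ports 6000/6001/6002 in this README.

    Credentials are illustrative defaults; substitute the values from .env.
    """
    return {
        "mariadb": f"mysql://{user}:{password}@{host}:6000",
        "mongodb": f"mongodb://{user}:{password}@{host}:6001",
        "redis": f"redis://{host}:6002/0",
    }
```

These strings can be passed to drivers such as mysqlclient, pymongo, or redis-py.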

Additional Resources
