Running with Docker

This project can be run using Docker and Docker Compose. Install both if they are not already available on your system.

There are two separate configurations available: one for running with NVIDIA GPU support and another for CPU-only execution.

IMPORTANT: Make sure to also clone the ric-messages git submodule located in the src folder with:

git submodule update --init
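
Alternatively, when cloning the repository from scratch, the submodule can be fetched in the same step (the repository URL below is a placeholder):

# <repository-url> is a placeholder; substitute the actual repository location
git clone --recurse-submodules <repository-url>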

With GPU Support

To run the application with GPU acceleration, you will need to have the NVIDIA Container Toolkit installed on your system.
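
As a quick sanity check, you can verify that containers can access the GPU before starting the stack:

# should print the GPU details if the NVIDIA Container Toolkit is set up correctly
docker run --rm --gpus all ubuntu nvidia-smi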

Once you have the toolkit installed, you can run the application using the following command:

docker compose up -d

This will build and run the tts and tts-node services.

The tts-model-downloader service will first download the specified TTS models; once they are available, the tts service starts the llama-swap server with GPU support.
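
The first startup can take a while because the models need to be downloaded. You can follow the progress in the service logs:

# stream the logs of the model downloader and the TTS server
docker compose logs -f tts-model-downloader tts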

Important: The ROS2 node uses rmw_zenoh for ROS2 communication, so a running Zenoh router must be reachable. Use the provided zenoh_router for this purpose.
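
If no router is running yet, one way to start one manually on the host (matching the default tcp/host.docker.internal:7447 endpoint used by the node) is the daemon shipped with rmw_zenoh, assuming a sourced ROS2 installation with rmw_zenoh on the host:

# starts a Zenoh router, listening on port 7447 by default
ros2 run rmw_zenoh_cpp rmw_zenohd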

CPU-Only

If you do not have a compatible NVIDIA GPU, you can run the application in CPU-only mode.

To do this, use the compose.cpu.yaml file:

docker compose -f compose.cpu.yaml up

This will start the same services, but the tts service will be configured to run entirely on the CPU.

Note that CPU-only execution will be considerably slower than running with GPU acceleration.

Services

The Docker Compose configurations define three main services: tts-model-downloader, tts, and tts-node.

The tts-model-downloader Service

This service is responsible for downloading the required Orpheus TTS models from Hugging Face.

  • Uses a lightweight Python Alpine image to install huggingface_hub[cli]
  • Downloads two models:
      • English model: isaiahbjork/orpheus-3b-0.1-ft-Q4_K_M-GGUF
      • German model: TheVisitorX/3b-de-ft-research_release-Q4_K_M-GGUF
  • Models are stored in the ./.models directory on the host system
  • Runs as an initialization step before the other services start
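
If you want to pre-seed the models manually instead of relying on this service, a roughly equivalent download looks like the following; the target subdirectories are assumptions, so match them to what the llama-swap configuration expects:

# manual download sketch; the --local-dir targets are assumptions
pip install "huggingface_hub[cli]"
huggingface-cli download isaiahbjork/orpheus-3b-0.1-ft-Q4_K_M-GGUF --local-dir ./.models/orpheus-en
huggingface-cli download TheVisitorX/3b-de-ft-research_release-Q4_K_M-GGUF --local-dir ./.models/orpheus-de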

The tts Service

This service runs the llama-swap server, which manages the TTS model instances and provides an OpenAI-compatible API endpoint.

  • Uses pre-built Docker images from ghcr.io/mostlygeek/llama-swap (cuda for GPU, cpu for CPU-only)
  • Manages multiple TTS models through llama-swap configuration
  • Uses llama-swap.multi.config.yaml for multi-model support
  • Exposes port 8080 internally for the TTS API
  • Includes health checks to ensure proper startup sequence
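
Because port 8080 is only exposed on the internal Docker network, a quick smoke test has to run from inside that network. A minimal sketch, assuming curl is available in the tts-node image:

# lists the models llama-swap currently serves via its OpenAI-compatible API
docker compose exec tts-node curl http://tts:8080/v1/models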

Environment Variables

Variable               | Description                                      | Default Value
LLAMA_ARG_N_PARALLEL   | Number of requests to process in parallel        | 2
LLAMA_ARG_THREADS      | Number of threads to use (-1 for all available)  | -1
LLAMA_ARG_N_GPU_LAYERS | Number of model layers to offload to GPU         | 49
LLAMA_ARG_NO_WEBUI     | Disable the web interface                        | true
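
Assuming the compose file wires these variables through with ${VAR:-default}-style substitution, they can be overridden in a .env file next to the compose file:

# .env — example overrides; only effective if the compose file uses variable substitution
LLAMA_ARG_N_PARALLEL=4
LLAMA_ARG_THREADS=8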

The tts-node Service

This service runs the ROS2 client node that acts as a bridge between the ROS2 ecosystem and the tts service.

  • Uses harbor.hb.dfki.de/helloric/ros_tts:latest (VPN required) or builds from the local Dockerfile
  • The node provides a ROS2 service at /tts that allows other ROS2 nodes to send text and receive audio
  • Supports both English and German text-to-speech conversion
  • Communicates with the tts service over the internal Docker network
  • Configured to start only after the tts service is healthy and running
  • Uses Zenoh as RMW implementation by default

Environment Variables

Variable                      | Description                                                                 | Default Value
LLAMACPP_URL                  | URL of the llama-swap server's completions endpoint                         | http://tts:8080/v1/completions
PYTHONUNBUFFERED              | Prevents Python from buffering stdout and stderr                            | 1
RMW_IMPLEMENTATION            | ROS2 middleware implementation                                              | rmw_zenoh_cpp
ROS_AUTOMATIC_DISCOVERY_RANGE | Disables automatic discovery in ROS2                                        | OFF
ZENOH_ROUTER_CHECK_ATTEMPTS   | Number of attempts to check for Zenoh router. 0 means wait indefinitely     | 0
ZENOH_CONFIG_OVERRIDE         | Zenoh configuration override, see rmw_zenoh                                 | mode="client";connect/endpoints=["tcp/host.docker.internal:7447"]
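
For example, to connect the node to a Zenoh router on a different machine, you could override the endpoint at startup, assuming the compose file passes this variable through from the host environment:

# the router address is a placeholder; adjust it to your network
ZENOH_CONFIG_OVERRIDE='mode="client";connect/endpoints=["tcp/192.168.1.42:7447"]' docker compose up -d tts-node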

Usage

Create a ROS2 client for the /tts service and call it. The service uses the ric_messages/srv/TextToAudioBytes interface. For the exact definition, check out the ric_messages repository; for usage examples, see the provided service examples.
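
A minimal sketch of calling the service from the ROS2 CLI, run inside the tts-node container so the Zenoh settings are already in place. The request fields (text, language) are illustrative assumptions; check the TextToAudioBytes definition for the actual ones:

# field names are hypothetical; see ric_messages/srv/TextToAudioBytes for the real interface
# assumes the image makes the ros2 CLI available in non-interactive shells
docker compose exec tts-node ros2 service call /tts ric_messages/srv/TextToAudioBytes "{text: 'Hello world', language: 'en'}"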