# Running with Docker
This project can be run using Docker and Docker Compose. Install them (see the Docker documentation) if they are not already available.
There are two separate configurations available: one for running with NVIDIA GPU support and another for CPU-only execution.
IMPORTANT: Make sure to also clone the ric-messages git submodule located in the src folder, e.g. with the standard submodule command:
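```bash
# Fetch all git submodules, including src/ric-messages
git submodule update --init --recursive
```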
## With GPU Support
To run the application with GPU acceleration, you will need to have the NVIDIA Container Toolkit installed on your system.
Once you have the toolkit installed, you can run the application using the following command:
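Assuming the GPU configuration lives in the default compose file, this is typically:

```bash
# Build the images and start the GPU-enabled stack
docker compose up --build
```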
This will build and run the tts and tts-node services.
The tts service will automatically download the specified TTS models and start the llama-swap server with GPU support.
Important: The ROS2 node uses rmw_zenoh for ROS2 communication. Use the provided zenoh_router for this purpose.
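The node expects to reach a router at `host.docker.internal:7447` (see `ZENOH_CONFIG_OVERRIDE` below). If you need to start a router on the host manually, rmw_zenoh also ships a standalone one that can be launched from a sourced ROS2 environment (assuming `rmw_zenoh_cpp` is installed there):

```bash
# Start a Zenoh router on the host; it listens on TCP port 7447 by default
ros2 run rmw_zenoh_cpp rmw_zenohd
```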
## CPU-Only
If you do not have a compatible NVIDIA GPU, you can run the application in CPU-only mode.
To do this, use the compose.cpu.yaml file:
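A typical invocation is:

```bash
# Build and start the CPU-only stack using the dedicated compose file
docker compose -f compose.cpu.yaml up --build
```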
This will start the same services, but the tts service will be configured to run entirely on the CPU.
Note that CPU-only execution will be considerably slower than running on a GPU.
## Services
The Docker Compose configurations define three main services: tts-model-downloader, tts, and tts-node.
### The tts-model-downloader Service
This service is responsible for downloading the required Orpheus TTS models from Hugging Face.
- Uses a lightweight Python Alpine image to install `huggingface_hub[cli]`
- Downloads two models (roughly equivalent to the commands sketched below):
  - English model: `isaiahbjork/orpheus-3b-0.1-ft-Q4_K_M-GGUF`
  - German model: `TheVisitorX/3b-de-ft-research_release-Q4_K_M-GGUF`
- Models are stored in the `./.models` directory on the host system
- Runs as an initialization step before other services start
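For reference, the download step is roughly equivalent to running these commands manually (a sketch; the exact flags used in the compose file may differ):

```bash
# Install the Hugging Face CLI and pull both GGUF models into ./.models
pip install "huggingface_hub[cli]"
huggingface-cli download isaiahbjork/orpheus-3b-0.1-ft-Q4_K_M-GGUF --local-dir ./.models
huggingface-cli download TheVisitorX/3b-de-ft-research_release-Q4_K_M-GGUF --local-dir ./.models
```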
### The tts Service
This service runs the llama-swap server, which manages the TTS model instances and provides an OpenAI-compatible API endpoint.
- Uses pre-built Docker images from `ghcr.io/mostlygeek/llama-swap` (`cuda` for GPU, `cpu` for CPU-only)
- Manages multiple TTS models through llama-swap configuration
- Uses `llama-swap.multi.config.yaml` for multi-model support
- Exposes port 8080 internally for the TTS API
- Includes health checks to ensure proper startup sequence
#### Environment Variables
| Variable | Description | Default Value |
|---|---|---|
| `LLAMA_ARG_N_PARALLEL` | Number of requests to process in parallel | `2` |
| `LLAMA_ARG_THREADS` | Number of threads to use (`-1` for all available) | `-1` |
| `LLAMA_ARG_N_GPU_LAYERS` | Number of model layers to offload to GPU | `49` |
| `LLAMA_ARG_NO_WEBUI` | Disable the web interface | `true` |
### The tts-node Service
This service runs the ROS2 client node that acts as a bridge between the ROS2 ecosystem and the tts service.
- Uses `harbor.hb.dfki.de/helloric/ros_tts:latest` (VPN required) or builds from the local Dockerfile
- The node provides a ROS2 service at `/tts` that allows other ROS2 nodes to send text and receive audio
- Supports both English and German text-to-speech conversion
- Communicates with the `tts` service over the internal Docker network
- Configured to start only after the `tts` service is healthy and running
- Uses Zenoh as RMW implementation by default
#### Environment Variables
| Variable | Description | Default Value |
|---|---|---|
| `LLAMACPP_URL` | URL of the llama-swap server's completions endpoint | `http://tts:8080/v1/completions` |
| `PYTHONUNBUFFERED` | Prevents Python from buffering stdout and stderr | `1` |
| `RMW_IMPLEMENTATION` | ROS2 middleware implementation | `rmw_zenoh_cpp` |
| `ROS_AUTOMATIC_DISCOVERY_RANGE` | Disables automatic discovery in ROS2 | `OFF` |
| `ZENOH_ROUTER_CHECK_ATTEMPTS` | Number of attempts to check for the Zenoh router; `0` means wait indefinitely | `0` |
| `ZENOH_CONFIG_OVERRIDE` | Zenoh configuration override, see rmw_zenoh | `mode="client";connect/endpoints=["tcp/host.docker.internal:7447"]` |
## Usage
Create a ROS2 client for the `/tts` service and call it. The service uses the `ric_messages/srv/TextToAudioBytes` interface; for the exact definition, check out the ric_messages repository.
For usage examples, check out the service client.
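As a minimal sketch, a Python client might look like the following. The request and response field names used here (`text`, `audio_data`) are assumptions for illustration only; the actual fields are defined in the ric_messages repository.

```python
# Minimal sketch of a ROS2 client for the /tts service.
# NOTE: the request/response field names below (text, audio_data) are
# assumptions -- check ric_messages/srv/TextToAudioBytes for the real ones.
import rclpy
from rclpy.node import Node
from ric_messages.srv import TextToAudioBytes


def main():
    rclpy.init()
    node = Node('tts_client')
    client = node.create_client(TextToAudioBytes, '/tts')
    client.wait_for_service()

    request = TextToAudioBytes.Request()
    request.text = 'Hello from the TTS service'  # assumed field name

    future = client.call_async(request)
    rclpy.spin_until_future_complete(node, future)
    response = future.result()

    # The response is expected to carry the synthesized audio as bytes.
    node.get_logger().info(f'Received {len(response.audio_data)} audio bytes')  # assumed field name

    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```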