
Running the Service

This document explains how to run the ros_stt service node and interact with it.

How to Run the Service

To run the STT service, first source your ROS2 environment, then use the ros2 run command.

Start the service node with the following command. This starts the STT node and makes it available to receive audio for transcription.

# Source your ROS2 workspace
source install/setup.bash

ros2 run stt service

When the service runs in Docker, the steps above are already done inside the container. To open a shell in the container, run:

docker exec -it stt-node bash

Arguments

You can customize the behavior of the node by passing the following ROS parameters.

ros2 run stt service --ros-args -p <argument>:=<value>

| Argument | Description | Default Value |
| --- | --- | --- |
| server_url | The URL of the whisper.cpp server endpoint. | http://localhost:8080/inference |

Service Definition

/stt

  • Type: ric_messages/srv/AudioBytesToText
  • Description: Takes an audio input and returns the transcribed text and detected language.
  • Request: uint8[] audio
  • Response: string text, string language

To call the service from the command line, you need to provide the audio as an array of bytes.

ros2 service call /stt ric_messages/srv/AudioBytesToText "{'audio': [0, 1, 2, ...]}"

Note: Typing a raw audio byte array on the command line is impractical for real audio files, so this method is mainly useful for testing with very short or empty audio clips. For actual use, create a dedicated ROS2 client node in Python or C++ that reads an audio file and sends the request. The convert.py script shows how to convert a .wav file into a JSON file with the required byte array format. To send that JSON file to the service, pass it on standard input:

ros2 service call /stt ric_messages/srv/AudioBytesToText --stdin < encoded_audio.json
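The conversion step mentioned in the note can be sketched in a few lines of Python. This is not the project's convert.py, only a minimal illustration of the idea. It assumes the service expects the complete contents of the .wav file (header included) rather than decoded samples, and the function name write_request_json is hypothetical.

```python
import json
from pathlib import Path


def write_request_json(wav_path, json_path):
    """Write a request file of the form {"audio": [b0, b1, ...]}.

    Assumption: the service wants every byte of the .wav file as-is.
    Returns the number of bytes that were encoded.
    """
    data = Path(wav_path).read_bytes()
    # Each byte becomes an integer in 0..255, matching the uint8[] audio field.
    payload = {"audio": list(data)}
    Path(json_path).write_text(json.dumps(payload))
    return len(data)
```

Calling write_request_json("sample.wav", "encoded_audio.json") would produce a file that can be passed to the --stdin command above. The JSON grows to several characters per audio byte, so this route suits short clips better than long recordings.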

Response Example

A successful call will return a response object containing the transcribed text and language:

response:
ric_messages.srv.AudioBytesToText_Response(
    text='Hello world.',
    language='en'
)
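For real audio files, the dedicated client node recommended in the note above might look like the following rclpy sketch. It relies only on what this page states: the /stt service name, the ric_messages/srv/AudioBytesToText type, and the audio, text, and language fields. The node name stt_file_client, the function names, and the timeout are hypothetical, and the code assumes the service accepts the raw bytes of the file unchanged.

```python
import array
from pathlib import Path


def read_audio_bytes(path):
    """Read a file and return its bytes as an unsigned-byte array (for uint8[])."""
    return array.array("B", Path(path).read_bytes())


def transcribe(path, timeout_sec=5.0):
    """Send one audio file to /stt and return (text, language)."""
    # ROS imports stay local so read_audio_bytes works without ROS installed.
    import rclpy
    from ric_messages.srv import AudioBytesToText

    rclpy.init()
    node = rclpy.create_node("stt_file_client")  # hypothetical node name
    try:
        client = node.create_client(AudioBytesToText, "/stt")
        if not client.wait_for_service(timeout_sec=timeout_sec):
            raise RuntimeError("/stt service is not available")

        request = AudioBytesToText.Request()
        request.audio = read_audio_bytes(path)

        # Send the request and block until the response arrives.
        future = client.call_async(request)
        rclpy.spin_until_future_complete(node, future)
        response = future.result()
        return response.text, response.language
    finally:
        node.destroy_node()
        rclpy.shutdown()
```

With the service running and the workspace sourced, transcribe("sample.wav") would return the same two fields shown in the response example, as a plain tuple.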