Chat Service
The chat service is a ROS2 node that orchestrates a full speech-to-speech chat pipeline. It takes an audio input, processes it through Speech-to-Text (STT), a Large Language Model (LLM), and Text-to-Speech (TTS) to generate a spoken response.
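As a rough picture of the data flow, the pipeline is three calls chained together. The sketch below is not the node's actual code: the three stage functions are hypothetical stand-ins for the STT, LLM, and TTS service calls, and it ignores the metadata the real service returns.

```python
from typing import Callable


def run_chat_pipeline(
    audio_in: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """Chain the three stages: audio -> text -> reply text -> reply audio."""
    transcript = stt(audio_in)  # Speech-to-Text
    reply = llm(transcript)     # Large Language Model
    return tts(reply)           # Text-to-Speech


if __name__ == "__main__":
    # Stub stages so the sketch runs without ROS2 or any model.
    audio_out = run_chat_pipeline(
        b"\x00\x01",
        stt=lambda audio: "hello",
        llm=lambda text: text.upper(),
        tts=lambda text: text.encode("utf-8"),
    )
    print(audio_out)
```

Passing the stages in as callables keeps the ordering logic separate from the transport, which makes it easy to try out without any of the three services running.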
Dependencies
The chat_node requires the following services to be running and available:
- /stt (ric_messages/srv/AudioBytesToText)
- /llm (ric_messages/srv/LLMChat)
- /tts (ric_messages/srv/TextToAudioBytes)
The node will wait indefinitely for these services to become available before it starts processing requests.
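The waiting behaviour can be sketched as a polling loop. This is an illustration under assumptions, not the node's source: wait_for_services and FakeClient are hypothetical names, and the only real API relied on is the wait_for_service(timeout_sec=...) method that rclpy service clients provide, which returns True once the service is available.

```python
def wait_for_services(clients, timeout_sec=1.0, log=print):
    """Block until every client reports its service as available.

    `clients` maps a service name to an object exposing
    `wait_for_service(timeout_sec=...) -> bool`, as rclpy clients do.
    There is no overall deadline, so this waits indefinitely.
    """
    for name, client in clients.items():
        while not client.wait_for_service(timeout_sec=timeout_sec):
            log(f"Waiting for service {name} ...")


class FakeClient:
    """Stand-in for an rclpy client that becomes ready after N polls."""

    def __init__(self, polls_until_ready):
        self.remaining = polls_until_ready

    def wait_for_service(self, timeout_sec=None):
        if self.remaining == 0:
            return True
        self.remaining -= 1
        return False


if __name__ == "__main__":
    wait_for_services(
        {"/stt": FakeClient(2), "/llm": FakeClient(0), "/tts": FakeClient(1)}
    )
    print("All services available")
```

In a real node the dictionary values would be the clients returned by create_client for /stt, /llm, and /tts.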
How to Run the Service
To run the chat service, you first need to source your ROS2 environment and then use the ros2 run command.
When running the service in Docker, you can enter the container with the following command, where the above steps are already done:
Parameters
You can customize the behavior of the node by passing the following ROS parameters.
| Parameter | Description | Default |
|---|---|---|
| stt_service | The service name for the Speech-to-Text (STT) node. | /stt |
| llm_service | The service name for the Large Language Model (LLM) node. | /llm |
| tts_service | The service name for the Text-to-Speech (TTS) node. | /tts |
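ROS2 parameters are passed on the command line after --ros-args, one -p name:=value pair per parameter. The helper below is a small sketch that builds those trailing arguments from the table above; ros_param_args is a hypothetical name, /my_stt is an example value, and the package and executable names for ros2 run are deliberately left out because they are not stated here.

```python
# Parameter names and defaults as listed in the table above.
DEFAULTS = {
    "stt_service": "/stt",
    "llm_service": "/llm",
    "tts_service": "/tts",
}


def ros_param_args(overrides):
    """Return the arguments to append to a `ros2 run` command.

    Only the overridden parameters are emitted; anything not listed keeps
    its default. Unknown names are rejected to catch typos early.
    """
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
    if not overrides:
        return []
    args = ["--ros-args"]
    for name, value in overrides.items():
        args += ["-p", f"{name}:={value}"]
    return args


if __name__ == "__main__":
    # Example: point the node at an STT service with a different name.
    print(" ".join(ros_param_args({"stt_service": "/my_stt"})))
```

The printed arguments go at the end of whatever ros2 run command you use to start the node.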
How to Call the Service
You can call the /chat service using the ros2 service call command or the provided Python script.
Service Definition
/chat
- Type: ric_messages/srv/Chat
- Description: Takes an audio input, processes it through STT, LLM, and TTS, and returns the generated audio response along with metadata.
- Request: uint8[] audio
- Response: string language, string text, string emotion, uint8[] audio

To call the service from the command line, you need to provide the audio as an array of bytes.
Note: Providing a raw audio byte array via the command line is impractical for real audio files. This method is primarily for testing with very short or empty audio clips.
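To see why, it helps to look at what the request looks like on the command line. ros2 service call takes the service name, the service type, and the request as a YAML string, so every byte becomes a decimal number in a list. The sketch below builds that string; chat_request_yaml and service_call_command are hypothetical helper names.

```python
import shlex


def chat_request_yaml(audio: bytes) -> str:
    """Format raw bytes as the YAML request body for ric_messages/srv/Chat."""
    values = ", ".join(str(byte) for byte in audio)
    return "{audio: [" + values + "]}"


def service_call_command(audio: bytes) -> str:
    """Return a shell-quoted `ros2 service call` command line for /chat."""
    argv = [
        "ros2", "service", "call", "/chat", "ric_messages/srv/Chat",
        chat_request_yaml(audio),
    ]
    return shlex.join(argv)


if __name__ == "__main__":
    # An empty clip and a three-byte clip, suitable for quick tests.
    print(service_call_command(b""))
    print(service_call_command(bytes([0, 1, 255])))
```

Each byte costs up to five characters in the YAML list, so even one second of audio produces a command line far too long to type or read comfortably.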
Using the Example Python Script
For actual use, a ROS2 client such as call_chat_service.py is recommended; it reads an audio file and sends the ROS2 request.
call_chat_service.py can be used as a test script while the service is running.
Use it as follows:
- <input.wav>: The path to the input WAV file to be sent to the service.
- <output.wav>: The path where the generated output WAV file will be saved.
Note that the script only writes the response audio; the language, text, and emotion fields of the response are discarded.
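If you need the metadata as well, a small client of your own can keep it. The following is a minimal sketch, not the contents of call_chat_service.py: the node name chat_client_sketch and the helper names are hypothetical, and it assumes the service accepts the input file's bytes as they are, which is not confirmed here.

```python
import array
import sys
from pathlib import Path


def to_uint8_array(data: bytes) -> array.array:
    """Convert raw bytes to the unsigned-byte array used for uint8[] fields."""
    return array.array("B", data)


def call_chat(audio: array.array):
    """Send one request to /chat and return the full response object."""
    # Imported here so the helper above works without a ROS2 installation.
    import rclpy
    from ric_messages.srv import Chat

    rclpy.init()
    node = rclpy.create_node("chat_client_sketch")  # hypothetical node name
    try:
        client = node.create_client(Chat, "/chat")
        while not client.wait_for_service(timeout_sec=1.0):
            node.get_logger().info("Waiting for /chat ...")
        request = Chat.Request()
        request.audio = audio
        future = client.call_async(request)
        rclpy.spin_until_future_complete(node, future)
        return future.result()
    finally:
        node.destroy_node()
        rclpy.shutdown()


# Runs only when an input path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    response = call_chat(to_uint8_array(Path(sys.argv[1]).read_bytes()))
    # Keep the metadata instead of discarding it.
    print(response.language, response.emotion, response.text)
    print(f"{len(response.audio)} bytes of audio returned")
```

The ROS2 imports are deferred to call_chat so that the file can still be imported on a machine without rclpy, for example to test the byte conversion.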
Important: Before running the script, you may need to configure the output audio parameters (sample rate, bit depth, channels) at the top of the call_chat_service.py file to match the format produced by your TTS service. The script requires rclpy to call the /chat service directly. The devcontainer already contains all requirements.
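The reason these parameters matter is that a WAV header has to describe the samples that follow it. The sketch below shows how such settings feed into Python's standard wave module. It assumes the response audio is headerless PCM, and the constant values are placeholders rather than the script's actual defaults.

```python
import io
import wave

# Placeholder output format; adjust to match what your TTS service produces.
SAMPLE_RATE = 22050  # Hz
BIT_DEPTH = 16       # bits per sample
CHANNELS = 1


def write_wav(target, pcm, sample_rate=SAMPLE_RATE, bit_depth=BIT_DEPTH,
              channels=CHANNELS):
    """Wrap raw PCM bytes in a WAV container.

    `target` is a file path or a writable binary file object.
    """
    frame_size = channels * (bit_depth // 8)
    if len(pcm) % frame_size:
        raise ValueError("PCM length is not a whole number of frames")
    with wave.open(target, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(bit_depth // 8)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)


if __name__ == "__main__":
    # Two silent 16-bit mono samples, written to memory and read back.
    buffer = io.BytesIO()
    write_wav(buffer, bytes(4))
    buffer.seek(0)
    with wave.open(buffer, "rb") as wav:
        print(wav.getframerate(), wav.getsampwidth(),
              wav.getnchannels(), wav.getnframes())
```

If the values do not match the TTS output, the file will still be written, but it will play back at the wrong speed or as noise.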
Response Example
A successful call will return a response object containing the generated audio and metadata: