# Code documentation

The `chat_node` is the central component that orchestrates the entire chat workflow. It integrates multiple services to process an audio input and generate an audio response from it.
```mermaid
graph TD
    subgraph User
        A[Client]
    end
    subgraph "ROS2 Services"
        B(chat_node)
        C(STT Service /stt)
        D(LLM Service /llm)
        E(TTS Service /tts)
    end
    A -- "UI Audio" --> B
    B -- "UI Audio" --> C
    C -- "Transcription + Language" --> B
    B -- "Transcription" --> D
    D -- "Response" --> B
    B -- "Response + Language" --> E
    E -- "Response Audio" --> B
    B -- "Response Audio + Response + Language + Emotion" --> A
```
The `chat_node` performs the following steps:

- **Receives an audio request:** It listens on the `/chat` service (`ric_messages/srv/Chat`) for an incoming request containing audio data.
- **Speech-to-Text (STT):** It calls the `/stt` service (`ric_messages/srv/AudioBytesToText`) to transcribe the input audio into text and detect the language (English and German are supported).
- **Language Model (LLM):** It sends the transcribed text to the `/llm` service (`ric_messages/srv/LLMChat`) to generate a meaningful and context-aware response. The node also extracts an emotion tag from the LLM's response (e.g., `{calm}`).
- **Text-to-Speech (TTS):** It takes the cleaned text response from the LLM and calls the `/tts` service (`ric_messages/srv/TextToAudioBytes`) to convert it back into audio.
- **Returns the response:** It sends the final generated audio, language, text, and emotion back to the original caller.
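The pipeline above can be sketched in plain `asyncio`, independent of ROS2. The function names, return shapes, and the exact `{emotion}` tag format below are assumptions for illustration, not the actual node code; in the real node each stub would be a ROS2 service call.

```python
import asyncio
import re

# Hypothetical stand-ins for the /stt, /llm, and /tts service calls;
# signatures and return values are assumptions for illustration.
async def send_stt_request(audio: bytes) -> tuple[str, str]:
    await asyncio.sleep(0)  # placeholder for the real service round-trip
    return "hello robot", "en"

async def send_llm_request(text: str) -> str:
    await asyncio.sleep(0)
    return "{calm} Hello! How can I help you?"

async def send_tts_request(text: str, language: str) -> bytes:
    await asyncio.sleep(0)
    return b"\x00\x01"  # fake audio bytes

def extract_emotion(response: str) -> tuple[str, str]:
    """Split a leading emotion tag like '{calm}' from the LLM response."""
    match = re.match(r"\{(\w+)\}\s*(.*)", response, re.DOTALL)
    if match:
        return match.group(1), match.group(2)
    return "neutral", response

async def chat_pipeline(audio: bytes) -> dict:
    # STT -> LLM -> emotion extraction -> TTS, mirroring the steps above.
    text, language = await send_stt_request(audio)
    raw_response = await send_llm_request(text)
    emotion, cleaned = extract_emotion(raw_response)
    response_audio = await send_tts_request(cleaned, language)
    return {
        "audio": response_audio,
        "text": cleaned,
        "language": language,
        "emotion": emotion,
    }

result = asyncio.run(chat_pipeline(b"..."))
print(result["emotion"], result["language"], result["text"])
```

Note how the emotion tag is stripped before TTS, so only the cleaned text is spoken while the tag travels back to the caller as metadata.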
## Asynchronous Processing

The `chat_node` is designed to be highly responsive and efficient, leveraging asynchronous programming to handle long-running tasks without blocking the main execution thread. This is crucial for a robotics system where nodes must remain responsive to other events.
- **`asyncio` concurrency:** The node uses Python's `asyncio` library to manage concurrent operations. The `chat_callback` and the service client calls (`_send_stt_request`, `_send_llm_request`, `_send_tts_request`) are defined as `async` functions. This allows the node to `await` the results of the service calls (which can take a significant amount of time) without freezing: while waiting for a response from one service, the node can still process other events.
- **Reentrant callbacks:** The `/chat` service is created with a `ReentrantCallbackGroup`. This allows the `chat_callback` to be invoked again while a previous call is still `await`-ing a long-running operation (such as a call to the LLM service), which prevents deadlocks and lets the node handle multiple concurrent requests to the `/chat` service. In other words, a new call to `/chat` that arrives while the previous request is still being processed is not dropped or "eaten up"; it is accepted and handled alongside the in-flight request.
- **Event-loop integration:** The `main` and `spin_node` functions integrate the `asyncio` event loop with the ROS2 event system (`rclpy`). The `spin_node` function calls `rclpy.spin_once()` inside an `async` loop, allowing both ROS2 callbacks and `asyncio` tasks to be processed concurrently. The `await asyncio.sleep(0.05)` between iterations yields control so the event loop can run other `asyncio` tasks.
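The `spin_node` pattern can be sketched without a ROS2 installation by replacing `rclpy.spin_once()` with a stand-in. This is a minimal illustration of the interleaving described above, not the node's actual `main`:

```python
import asyncio

# Stand-in state to show that both the "spin" side and an asyncio task
# make progress on the same event loop.
events_processed = 0
results: list[int] = []

def spin_once_stub() -> None:
    """Stand-in for rclpy.spin_once(node, timeout_sec=0):
    process any pending ROS2 callbacks without blocking."""
    global events_processed
    events_processed += 1

async def spin_node(iterations: int) -> None:
    for _ in range(iterations):
        spin_once_stub()           # drain pending ROS2 work
        await asyncio.sleep(0.05)  # yield so other asyncio tasks can run

async def background_task() -> None:
    # Runs interleaved with spin_node on the same event loop,
    # just like an in-flight chat_callback would.
    for i in range(3):
        results.append(i)
        await asyncio.sleep(0.01)

async def main() -> None:
    await asyncio.gather(spin_node(5), background_task())

asyncio.run(main())
print(events_processed, results)
```

Without the `await asyncio.sleep(...)` in the spin loop, `background_task` would be starved until spinning finished; the sleep is what gives the rest of the event loop a chance to run.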