Code documentation
The chat_node is a central component that orchestrates the entire chat workflow.
It integrates multiple services to process an incoming audio request and generate an audio response from it.
```mermaid
graph TD
    subgraph User
        A[Client]
    end
    subgraph "ROS2 Services"
        B(chat_node)
        C(STT Service /stt)
        D(LLM Service /llm)
        E(TTS Service /tts)
    end
    A -- "UI Audio" --> B
    B -- "UI Audio" --> C
    C -- "Transcription + Language" --> B
    B -- "Transcription" --> D
    D -- "Response" --> B
    B -- "Response + Language" --> E
    E -- "Response Audio" --> B
    B -- "Response Audio + Response + Language + Emotion" --> A
```
The chat_node performs the following steps (a code sketch of this pipeline follows the list):
- Receives an audio request: It listens on the `/chat` service (`ric_messages/srv/Chat`) for an incoming request containing audio data.
- Speech-to-Text (STT): It calls the `/stt` service (`ric_messages/srv/AudioBytesToText`) to transcribe the input audio into text and detect the language (supports English and German).
- Language Model (LLM): It sends the transcribed text to the `/llm` service (`ric_messages/srv/LLMChat`) to generate a meaningful and context-aware response. The node also extracts an emotion from the LLM's response (e.g. `{calm}`).
- Text-to-Speech (TTS): It takes the cleaned text response from the LLM and calls the `/tts` service (`ric_messages/srv/TextToAudioBytes`) to convert it back into audio.
- Returns Response: It sends the final generated audio, language, text, and emotion back to the original caller.
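Taken together, these steps amount to a single asynchronous callback that chains the three service clients. The excerpt below is only a sketch of that pattern: the service types and the `_send_*_request` helper names come from the description above, while the request/response field names (`audio`, `text`, `language`, `emotion`), the client attributes, and the `_extract_emotion` helper are assumptions about the actual interfaces.

```python
# Illustrative excerpt of the chat_node class; field names and the
# _extract_emotion helper are assumptions, not the real interface.
from ric_messages.srv import AudioBytesToText, LLMChat, TextToAudioBytes


class ChatNodeSketch:
    async def _send_stt_request(self, stt_req):
        # call_async returns an rclpy Future, which can be awaited inside a
        # coroutine callback; the executor resumes the coroutine once the
        # STT response arrives instead of blocking the thread.
        return await self.stt_client.call_async(stt_req)

    async def _send_llm_request(self, llm_req):
        return await self.llm_client.call_async(llm_req)

    async def _send_tts_request(self, tts_req):
        return await self.tts_client.call_async(tts_req)

    async def chat_callback(self, request, response):
        # 1) STT: transcribe the incoming audio and detect its language.
        stt_req = AudioBytesToText.Request()
        stt_req.audio = request.audio                     # field name assumed
        stt_res = await self._send_stt_request(stt_req)

        # 2) LLM: generate a context-aware reply from the transcription.
        llm_req = LLMChat.Request()
        llm_req.text = stt_res.text                       # field name assumed
        llm_res = await self._send_llm_request(llm_req)

        # 3) Split the emotion tag (e.g. "{calm}") from the reply text.
        emotion, clean_text = self._extract_emotion(llm_res.text)

        # 4) TTS: synthesize the cleaned reply in the detected language.
        tts_req = TextToAudioBytes.Request()
        tts_req.text = clean_text                         # field names assumed
        tts_req.language = stt_res.language
        tts_res = await self._send_tts_request(tts_req)

        # 5) Return audio, text, language, and emotion to the caller.
        response.audio = tts_res.audio
        response.text = clean_text
        response.language = stt_res.language
        response.emotion = emotion
        return response
```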
Asynchronous Processing
The chat_node is designed to be highly responsive and efficient, leveraging asynchronous programming to handle long-running tasks without blocking the main execution thread. This is crucial for a robotics system where nodes must remain responsive to other events.
- The node uses Python's `asyncio` library to manage concurrent operations. The `chat_callback` and the service client calls (`_send_stt_request`, `_send_llm_request`, `_send_tts_request`) are defined as `async` functions. This allows the node to `await` the results of the service calls, which can take a significant amount of time, without freezing: while waiting for a response from one service, the node can still process other events.
- The `/chat` service is created with a `ReentrantCallbackGroup`. This allows `chat_callback` to be invoked again while a previous call is still `await`-ing a long-running operation (such as a call to the LLM service), which prevents deadlocks and lets the node handle multiple concurrent requests to the `/chat` service. In other words, a request that arrives on `/chat` while a previous one is still being processed is not swallowed; it is queued and handled in turn.
- The `main` and `spin_node` functions integrate the `asyncio` event loop with the ROS2 event system (`rclpy`). The `spin_node` function calls `rclpy.spin_once()` inside an `async` loop, so both ROS2 callbacks and `asyncio` tasks are processed concurrently, and the `await asyncio.sleep(0.05)` yields control so the event loop can run other `asyncio` tasks (a minimal sketch of this setup follows the list).
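Putting these pieces together, a minimal setup could look like the sketch below. The `/chat` service name, the use of `ReentrantCallbackGroup`, the `spin_once()` loop, and the `asyncio.sleep(0.05)` come from the description above; the constructor details and the exact bodies of `main` and `spin_node` are assumptions rather than the node's actual implementation.

```python
# Minimal sketch of the node setup and the asyncio/rclpy integration.
import asyncio

import rclpy
from rclpy.callback_groups import ReentrantCallbackGroup
from rclpy.node import Node

from ric_messages.srv import Chat


class ChatNode(Node):
    def __init__(self):
        super().__init__("chat_node")
        # A reentrant group lets chat_callback be entered again while an
        # earlier call is still awaiting the STT, LLM, or TTS service.
        cb_group = ReentrantCallbackGroup()
        self.create_service(Chat, "/chat", self.chat_callback,
                            callback_group=cb_group)

    async def chat_callback(self, request, response):
        # STT -> LLM -> TTS pipeline (see the earlier sketch).
        return response


async def spin_node(node):
    # Interleave ROS2 callback processing with the asyncio event loop:
    # spin_once() handles pending ROS2 work without blocking, and the
    # short sleep yields control so queued asyncio tasks can run.
    while rclpy.ok():
        rclpy.spin_once(node, timeout_sec=0)
        await asyncio.sleep(0.05)


def main():
    rclpy.init()
    node = ChatNode()
    try:
        asyncio.run(spin_node(node))
    except KeyboardInterrupt:
        pass
    finally:
        node.destroy_node()
        if rclpy.ok():
            rclpy.shutdown()


if __name__ == "__main__":
    main()
```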