Google GenAI SDK (Gemini Live)
This guide shows how to integrate the AgenTao SDK with the Google GenAI SDK for bidirectional real-time audio streaming with Gemini’s native audio model.
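The integration bridges two real-time streams:
- Caller to Gemini: audio read from call.audio_stream() is forwarded to the Gemini Live session with send_realtime_input().
- Gemini to caller: audio generated by Gemini is played back to the caller with send_audio().

Interruption handling is built in: when Gemini detects that the user is speaking over the model, the outgoing audio buffer is cleared immediately.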
The stream_to_gemini() coroutine reads audio chunks from call.audio_stream() and forwards them to the Gemini Live session using send_realtime_input(). The audio is wrapped in a types.Blob with the PCM MIME type.
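The forwarding step can be sketched roughly as follows. This is a minimal illustration, not the SDK's own implementation: it assumes call.audio_stream() is an async iterator that yields raw PCM bytes, and it uses a small Blob dataclass as a stand-in for google.genai.types.Blob so the example runs without the SDK installed. FakeCall, FakeSession, and demo_uplink are hypothetical helpers that exist only for the demonstration.

```python
import asyncio
from dataclasses import dataclass

PCM_MIME_TYPE = "audio/pcm;rate=24000"


@dataclass
class Blob:
    """Stand-in for google.genai.types.Blob so this sketch runs without the SDK."""
    data: bytes
    mime_type: str


async def stream_to_gemini(call, session, blob_cls=Blob):
    """Forward caller audio chunks to the Gemini Live session."""
    # Assumption: call.audio_stream() is an async iterator yielding raw PCM bytes.
    async for chunk in call.audio_stream():
        await session.send_realtime_input(
            audio=blob_cls(data=chunk, mime_type=PCM_MIME_TYPE)
        )


class FakeCall:
    """Hypothetical call object that yields a fixed list of audio chunks."""

    def __init__(self, chunks):
        self._chunks = chunks

    async def audio_stream(self):
        for chunk in self._chunks:
            yield chunk


class FakeSession:
    """Hypothetical session that records what would be sent to Gemini."""

    def __init__(self):
        self.sent = []

    async def send_realtime_input(self, *, audio):
        self.sent.append(audio)


def demo_uplink():
    """Run the uplink against the fakes and return the recorded blobs."""
    call = FakeCall([b"\x00\x01", b"\x02\x03"])
    session = FakeSession()
    asyncio.run(stream_to_gemini(call, session))
    return session.sent
```

In a real integration, passing blob_cls=types.Blob (from google.genai) and a live session in place of the fakes should leave the body of stream_to_gemini() unchanged.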
The receive_from_gemini() coroutine listens for Gemini responses:
- When content.interrupted is True, the caller has started speaking over the model. clear_send_audio_buffer() is called to immediately stop any queued audio.
- When content.model_turn contains inline_data, the raw audio bytes are sent to the caller via send_audio().

Both coroutines run concurrently via asyncio.gather(). This allows the system to listen to the caller and send AI responses at the same time without blocking.
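A minimal sketch of the receive side and the concurrent run might look like this. It assumes that session.receive() is an async iterator of server messages, that the content object described above is message.server_content, and that send_audio() and clear_send_audio_buffer() are awaitable; none of these details are confirmed by this guide. FakeCall, FakeSession, the message builders, and run_bridge are hypothetical helpers used only to make the example runnable.

```python
import asyncio
from types import SimpleNamespace


async def receive_from_gemini(call, session):
    """Relay Gemini audio to the caller and honor interruptions."""
    # Assumption: session.receive() is an async iterator of server messages.
    async for message in session.receive():
        content = message.server_content
        if content is None:
            continue
        if content.interrupted:
            # Caller is talking over the model: drop audio that is still queued.
            await call.clear_send_audio_buffer()
            continue
        if content.model_turn:
            for part in content.model_turn.parts:
                if part.inline_data:
                    await call.send_audio(part.inline_data.data)


async def run_bridge(call, session, uplink):
    """Run both directions at once; returns when both coroutines finish."""
    await asyncio.gather(uplink, receive_from_gemini(call, session))


def audio_message(data):
    """Hypothetical builder for a message carrying model audio."""
    part = SimpleNamespace(inline_data=SimpleNamespace(data=data))
    turn = SimpleNamespace(parts=[part])
    content = SimpleNamespace(interrupted=False, model_turn=turn)
    return SimpleNamespace(server_content=content)


def interrupted_message():
    """Hypothetical builder for an interruption notice."""
    content = SimpleNamespace(interrupted=True, model_turn=None)
    return SimpleNamespace(server_content=content)


class FakeCall:
    """Hypothetical call object that records what the caller would hear."""

    def __init__(self):
        self.events = []

    async def send_audio(self, data):
        self.events.append(("audio", data))

    async def clear_send_audio_buffer(self):
        self.events.append(("clear", None))


class FakeSession:
    """Hypothetical session that replays a fixed list of messages."""

    def __init__(self, messages):
        self._messages = messages

    async def receive(self):
        for message in self._messages:
            yield message


def demo_downlink():
    """Replay two audio messages around an interruption and return the events."""
    call = FakeCall()
    session = FakeSession(
        [audio_message(b"hi"), interrupted_message(), audio_message(b"ok")]
    )

    async def uplink():
        # Stands in for stream_to_gemini() in this demonstration.
        await asyncio.sleep(0)

    asyncio.run(run_bridge(call, session, uplink()))
    return call.events
```

With the real google-genai session, the receive iterator may end when a model turn completes, so a long-running bridge would likely wrap the async for loop in an outer loop that keeps listening for the next turn.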
Both AgenTao and Gemini native audio use 16-bit linear PCM at 24 kHz, mono (audio/pcm;rate=24000), so no transcoding is needed.
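For sizing buffers, the format above implies a fixed byte rate. The short sketch below works through the arithmetic; the 20 ms frame length in the example is an illustrative assumption, not a value taken from either SDK.

```python
SAMPLE_RATE_HZ = 24_000   # from audio/pcm;rate=24000
BYTES_PER_SAMPLE = 2      # 16-bit linear PCM
CHANNELS = 1              # mono

BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS


def bytes_for_duration(ms):
    """Number of PCM bytes needed to hold `ms` milliseconds of audio."""
    return BYTES_PER_SECOND * ms // 1000


def duration_ms(num_bytes):
    """Playback duration in milliseconds of a PCM buffer of `num_bytes` bytes."""
    return num_bytes * 1000 / BYTES_PER_SECOND


# One second of audio is 48,000 bytes; an assumed 20 ms frame is 960 bytes.
frame_bytes = bytes_for_duration(20)
```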