Azure.AI.VoiceLive Namespace
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Classes
| AnimationOptions |
Configuration for animation outputs including blendshapes and visemes metadata. |
| AssistantMessageItem |
The AssistantMessageItem. |
| AudioEchoCancellation |
Echo cancellation configuration for server-side audio processing. |
| AudioInputTranscriptionOptions |
Configuration for input audio transcription. |
| AudioNoiseReduction |
Configuration for input audio noise reduction. |
| AvatarConfiguration |
Configuration for avatar streaming and behavior during the session. |
| AzureAIVoiceLiveContext |
Context class that is filled in by System.ClientModel.SourceGeneration. For more information, see https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/core/System.ClientModel/src/docs/ModelReaderWriterContext.md |
| AzureCustomVoice |
Azure custom voice configuration. |
| AzurePersonalVoice |
Azure personal voice configuration. |
| AzureSemanticEouDetection |
Azure semantic end-of-utterance detection (default). |
| AzureSemanticEouDetectionEn |
Azure semantic end-of-utterance detection (default). |
| AzureSemanticEouDetectionMultilingual |
Azure semantic end-of-utterance detection (default). |
| AzureSemanticVadTurnDetection |
Base model for VAD-based turn detection. |
| AzureSemanticVadTurnDetectionEn |
Base model for VAD-based turn detection. |
| AzureSemanticVadTurnDetectionMultilingual |
Base model for VAD-based turn detection. |
| AzureStandardVoice |
Azure standard voice configuration. |
| AzureVoice |
Base for Azure voice configurations. Please note this is the abstract base class. The derived classes available for instantiation are: AzureCustomVoice, AzureStandardVoice, and AzurePersonalVoice. |
| CachedTokenDetails |
Details of cached token usage. |
| ConversationRequestItem |
Base for any response item; discriminated by |
| EouDetection |
Top-level union for end-of-utterance (EOU) semantic detection configuration. Please note this is the abstract base class. The derived classes available for instantiation are: AzureSemanticEouDetection, AzureSemanticEouDetectionEn, and AzureSemanticEouDetectionMultilingual. |
| FunctionCallItem |
A function call item within a conversation. |
| FunctionCallOutputItem |
A function call output item within a conversation. |
| IceServer |
ICE server configuration for WebRTC connection negotiation. |
| InputAudioContentPart |
Input audio content part. |
| InputTextContentPart |
Input text content part. |
| InputTokenDetails |
Details of input token usage. |
| LogProbProperties |
A single log probability entry for a token. |
| MaxResponseOutputTokensOption | |
| MessageContentPart |
Base for any message content part; discriminated by |
| MessageItem |
A message item within a conversation. |
| NoTurnDetection |
Disables turn detection. |
| OpenAIVoice |
OpenAI voice configuration with explicit type field. This provides a unified interface for OpenAI voices, complementing the existing string-based OAIVoice for backward compatibility. |
| OutputTextContentPart |
Output text content part. |
| OutputTokenDetails |
Details of output token usage. |
| RequestAudioContentPart |
An audio content part for a request. |
| RequestTextContentPart |
A text content part for a request. |
| ResponseAudioContentPart |
An audio content part for a response. |
| ResponseCancelledDetails |
Details for a cancelled response. |
| ResponseFailedDetails |
Details for a failed response. |
| ResponseFunctionCallItem |
A function call item within a conversation. |
| ResponseFunctionCallOutputItem |
A function call output item within a conversation. |
| ResponseIncompleteDetails |
Details for an incomplete response. |
| ResponseStatusDetails |
Base for all non-success response details. Please note this is the abstract base class. The derived classes available for instantiation are: ResponseCancelledDetails, ResponseIncompleteDetails, and ResponseFailedDetails. |
| ResponseTextContentPart |
A text content part for a response. |
| ResponseTokenStatistics |
Overall usage statistics for a response. |
| ServerVadTurnDetection |
Base model for VAD-based turn detection. |
| SessionResponse |
The response resource. |
| SessionResponseItem |
Base for any response item; discriminated by |
| SessionResponseMessageItem |
Base type for message item within a conversation. |
| SessionUpdate |
A voicelive server event. Please note this is the abstract base class. The derived classes available for instantiation are: SessionUpdateError, SessionUpdateSessionCreated, SessionUpdateSessionUpdated, SessionUpdateAvatarConnecting, SessionUpdateInputAudioBufferCommitted, SessionUpdateInputAudioBufferCleared, SessionUpdateInputAudioBufferSpeechStarted, SessionUpdateInputAudioBufferSpeechStopped, SessionUpdateConversationItemCreated, SessionUpdateConversationItemInputAudioTranscriptionCompleted, SessionUpdateConversationItemInputAudioTranscriptionFailed, SessionUpdateConversationItemTruncated, SessionUpdateConversationItemDeleted, SessionUpdateResponseCreated, SessionUpdateResponseDone, SessionUpdateResponseOutputItemAdded, SessionUpdateResponseOutputItemDone, SessionUpdateResponseContentPartAdded, SessionUpdateResponseContentPartDone, SessionUpdateResponseTextDelta, SessionUpdateResponseTextDone, SessionUpdateResponseAudioTranscriptDelta, SessionUpdateResponseAudioTranscriptDone, SessionUpdateResponseAudioDelta, SessionUpdateResponseAudioDone, SessionUpdateResponseAnimationBlendshapeDelta, SessionUpdateResponseAnimationBlendshapeDone, SessionUpdateResponseAudioTimestampDelta, SessionUpdateResponseAudioTimestampDone, SessionUpdateResponseAnimationVisemeDelta, SessionUpdateResponseAnimationVisemeDone, SessionUpdateConversationItemInputAudioTranscriptionDelta, SessionUpdateConversationItemRetrieved, SessionUpdateResponseFunctionCallArgumentsDelta, and SessionUpdateResponseFunctionCallArgumentsDone. |
| SessionUpdateAvatarConnecting |
Sent when the server is in the process of establishing an avatar media connection and provides its SDP answer. |
| SessionUpdateConversationItemCreated |
Returned when a conversation item is created. There are several scenarios that produce this event: |
| SessionUpdateConversationItemDeleted |
Returned when an item in the conversation is deleted by the client with a |
| SessionUpdateConversationItemInputAudioTranscriptionCompleted |
This event is the output of audio transcription for user audio written to the user audio buffer. Transcription begins when the input audio buffer is committed by the client or server (in |
| SessionUpdateConversationItemInputAudioTranscriptionDelta |
Returned when the text value of an input audio transcription content part is updated. |
| SessionUpdateConversationItemInputAudioTranscriptionFailed |
Returned when input audio transcription is configured, and a transcription request for a user message failed. These events are separate from other |
| SessionUpdateConversationItemRetrieved |
Returned when a conversation item is retrieved with |
| SessionUpdateConversationItemTruncated |
Returned when an earlier assistant audio message item is truncated by the client with a |
| SessionUpdateError |
Returned when an error occurs, which could be a client problem or a server problem. Most errors are recoverable and the session stays open. Implementors should monitor and log error messages by default. |
| SessionUpdateErrorDetails |
Details of the error. |
| SessionUpdateInputAudioBufferCleared |
Returned when the input audio buffer is cleared by the client with a |
| SessionUpdateInputAudioBufferCommitted |
Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode. The |
| SessionUpdateInputAudioBufferSpeechStarted |
The SessionUpdateInputAudioBufferSpeechStarted. |
| SessionUpdateInputAudioBufferSpeechStopped |
The SessionUpdateInputAudioBufferSpeechStopped. |
| SessionUpdateResponseAnimationBlendshapeDelta |
Represents a delta update of blendshape animation frames for a specific output of a response. |
| SessionUpdateResponseAnimationBlendshapeDone |
Indicates the completion of blendshape animation processing for a specific output of a response. |
| SessionUpdateResponseAnimationVisemeDelta |
Represents a viseme ID delta update for animation based on audio. |
| SessionUpdateResponseAnimationVisemeDone |
Indicates completion of viseme animation delivery for a response. |
| SessionUpdateResponseAudioDelta |
Returned when the model-generated audio is updated. |
| SessionUpdateResponseAudioDone |
Returned when the model-generated audio is done. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateResponseAudioTimestampDelta |
Represents a word-level audio timestamp delta for a response. |
| SessionUpdateResponseAudioTimestampDone |
Indicates completion of audio timestamp delivery for a response. |
| SessionUpdateResponseAudioTranscriptDelta |
Returned when the model-generated transcription of audio output is updated. |
| SessionUpdateResponseAudioTranscriptDone |
Returned when the model-generated transcription of audio output is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateResponseContentPartAdded |
Returned when a new content part is added to an assistant message item during response generation. |
| SessionUpdateResponseContentPartDone |
Returned when a content part is done streaming in an assistant message item. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateResponseCreated |
Returned when a new Response is created. The first event of response creation, where the response is in an initial state of |
| SessionUpdateResponseDone |
Returned when a Response is done streaming. Always emitted, no matter the final state. The Response object included in the |
| SessionUpdateResponseFunctionCallArgumentsDelta |
Returned when the model-generated function call arguments are updated. |
| SessionUpdateResponseFunctionCallArgumentsDone |
Returned when the model-generated function call arguments are done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateResponseOutputItemAdded |
Returned when a new Item is created during Response generation. |
| SessionUpdateResponseOutputItemDone |
Returned when an Item is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateResponseTextDelta |
Returned when the text value of a "text" content part is updated. |
| SessionUpdateResponseTextDone |
Returned when the text value of a "text" content part is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateSessionCreated |
Returned when a Session is created. Emitted automatically when a new connection is established as the first server event. This event will contain the default Session configuration. |
| SessionUpdateSessionUpdated |
Returned when a session is updated with a |
| SystemMessageItem |
The SystemMessageItem. |
| ToolChoiceOption |
Represents constraints placed on tool calls made by the model. |
| TurnDetection |
Top-level union for turn detection configuration. Please note this is the abstract base class. The derived classes available for instantiation are: ServerVadTurnDetection, AzureSemanticVadTurnDetection, AzureSemanticVadTurnDetectionEn, and AzureSemanticVadTurnDetectionMultilingual. |
| UserMessageItem |
The UserMessageItem. |
| VideoBackground |
Defines a video background, either a solid color or an image URL (mutually exclusive). |
| VideoCrop |
Defines a video crop rectangle. |
| VideoParams |
Video streaming parameters for avatar. |
| VideoResolution |
Resolution of the video feed in pixels. |
| VoiceLiveClient |
The VoiceLiveClient. |
| VoiceLiveClientOptions |
Client options for VoiceLiveClient. |
| VoiceLiveContentPart |
Base for any content part; discriminated by |
| VoiceLiveErrorDetails |
Error object returned in case of API failure. |
| VoiceLiveFunctionDefinition |
The definition of a function tool as used by the voicelive endpoint. |
| VoiceLiveModelFactory |
A factory class for creating instances of the models for mocking. |
| VoiceLiveResponse |
The response resource. |
| VoiceLiveSession |
Represents a WebSocket-based session for real-time voice communication with the Azure VoiceLive service. |
| VoiceLiveSessionOptions |
The VoiceLiveRequestSession. |
| VoiceLiveSessionResponse |
Base for session configuration in the response. |
| VoiceLiveToolDefinition |
The base representation of a voicelive tool definition. Please note this is the abstract base class. The derived classes available for instantiation are: VoiceLiveFunctionDefinition. |
| VoiceProvider |
Base interface for the different voice types supported by the VoiceLive service. |
Structs
| AnimationOutputType |
Specifies the types of animation data to output. |
| AudioInputTranscriptionOptionsModel | |
| AudioNoiseReductionType | |
| AudioTimestampType |
Output timestamp types supported in audio response content. |
| EouThresholdLevel |
Threshold level settings for Azure semantic end-of-utterance detection. |
| InputAudioFormat |
Input audio format types supported. |
| InteractionModality |
Supported modalities for the session. |
| ItemParamStatus |
Indicates the processing status of an item or parameter. |
| OAIVoice |
Supported OpenAI voice names (string enum). |
| OutputAudioFormat |
Output audio format types supported. |
| PersonalVoiceModels |
PersonalVoice models. |
| ResponseCancelledDetailsReason | |
| ResponseIncompleteDetailsReason | |
| ResponseMessageRole | |
| SessionResponseItemStatus |
Indicates the processing status of a response item. |
| SessionResponseStatus |
Terminal status of a response. |
| ToolChoiceLiteral |
The available set of mode-level, string literal tool_choice options for the voicelive endpoint. |
Enums
| SessionUpdateModality | |
| VoiceLiveClientOptions.ServiceVersion |
The version of the service to use. |