Azure Real-Time diarization

Karyna Khinevich 0 Reputation points
2025-07-11T10:34:16.3133333+00:00

Hi! I am working on a project in Python, in which I use Azure AI Speech Service.

More specifically, I implemented real-time diarization using the azure.cognitiveservices.speech.transcription.ConversationTranscriber class. Now I am working on speaker recognition, so that instead of Guest-1, the transcription displays the name of the speaker, which I previously saved in the system.

I found a suitable Participant class for this, to which I need to pass a voice signature, but the services that let you obtain a voice signature are either unavailable in Python or will be deprecated.

What alternatives does Azure currently offer for using the azure.cognitiveservices.speech.transcription.Participant class and similar ones? Or are these classes also planned to be deprecated?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. Amira Bedhiafi 36,716 Reputation points Volunteer Moderator
    2025-08-25T18:59:29.0133333+00:00

    Hello Karyna!

    Thank you for posting on Microsoft Learn.

    The Participant and voice signature flow belongs to the meeting transcription scenario, and Microsoft has announced that Speaker Recognition (voice profiles / voice signatures) will be retired on September 30, 2025. After that date, apps won’t be able to use speaker recognition. Meanwhile, real-time diarization with ConversationTranscriber intentionally does not use voice signatures; it only returns generic IDs such as Guest-1 and Guest-2.

    https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speaker-recognition-overview

    You can stay with ConversationTranscriber + diarization and map names yourself: keep a dictionary of {speakerId -> displayName} and let users claim a name the first time they speak, or pre-map speakers by device/channel if you capture multichannel audio.

    This is the supported real-time path. It returns stable speaker IDs per participant, but not identities. https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-stt-diarization
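
The name-mapping idea above can be sketched roughly as follows. This is only a minimal illustration, not an official Azure sample: `SpeakerNameMap`, `format_line`, and `on_transcribed` are hypothetical names, and the handler assumes the event object exposes `evt.result.speaker_id` and `evt.result.text`, as in the ConversationTranscriber diarization quickstart. It also assumes the IDs are only meaningful within one transcription session, so the map should be rebuilt for each session.

```python
from dataclasses import dataclass, field


@dataclass
class SpeakerNameMap:
    """Hypothetical helper: maps diarization IDs (e.g. "Guest-1") to display names."""

    names: dict[str, str] = field(default_factory=dict)

    def claim(self, speaker_id: str, display_name: str) -> None:
        # Called when a user claims an ID, e.g. the first time they speak.
        self.names[speaker_id] = display_name

    def resolve(self, speaker_id: str) -> str:
        # Fall back to the raw ID while nobody has claimed it yet.
        return self.names.get(speaker_id, speaker_id)


speaker_names = SpeakerNameMap()


def format_line(speaker_id: str, text: str) -> str:
    """Build one transcript line with the display name substituted in."""
    return f"{speaker_names.resolve(speaker_id)}: {text}"


def on_transcribed(evt) -> None:
    # Assumption: evt.result carries speaker_id and text, as in the quickstart.
    # Would be wired up with: conversation_transcriber.transcribed.connect(on_transcribed)
    print(format_line(evt.result.speaker_id, evt.result.text))
```

For example, after `speaker_names.claim("Guest-1", "Karyna")`, lines from Guest-1 are printed under the name Karyna, while unclaimed speakers keep their generic ID.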

    For recordings, you can use the Fast Transcription REST API with diarization, then do post-processing to attach names. https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-speech-to-text
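
The post-processing step could look something like the sketch below. It assumes the transcription response is JSON with a `phrases` list whose items carry a numeric `speaker` and a `text` field; please verify that shape against the actual response you receive. `attach_names` and the sample data are hypothetical and only for illustration.

```python
def attach_names(response: dict, names: dict[int, str]) -> list[str]:
    """Return transcript lines with diarization speaker numbers replaced by names.

    Assumption: response["phrases"] is a list of dicts with "speaker" and "text".
    """
    lines = []
    for phrase in response.get("phrases", []):
        speaker = phrase.get("speaker")
        # Keep a generic label for speakers that have no known name.
        label = names.get(speaker, f"Speaker {speaker}")
        lines.append(f"{label}: {phrase.get('text', '')}")
    return lines


# Made-up sample in the assumed shape, not real service output.
sample_response = {
    "phrases": [
        {"speaker": 1, "text": "Good morning."},
        {"speaker": 2, "text": "Hi, shall we start?"},
    ]
}

for line in attach_names(sample_response, {1: "Karyna"}):
    print(line)
```

The mapping from speaker number to name still has to come from your own data, for example from a user confirming which speaker they are in the recording.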

    1 person found this answer helpful.