Create a project for professional voice

All it takes to get started are a handful of audio files and the associated transcriptions. See if custom voice supports your language and region.

Start fine-tuning

In the Azure AI Foundry portal, you can fine-tune some Azure AI services models. For example, you can fine-tune a professional voice model.

To fine-tune a professional voice model, follow these steps:

  1. Go to your Azure AI Foundry project in the Azure AI Foundry portal. If you need to create a project, see Create an Azure AI Foundry project.

  2. Select Fine-tuning from the left pane.

  3. Select AI Service fine-tuning > + Fine-tune.

    Screenshot of the page to select fine-tuning of Azure AI Services models.

  4. In the wizard, select Custom voice (professional voice fine-tuning).

  5. Select Next.

  6. Follow the instructions provided by the wizard to create your fine-tuning workspace.

Continue fine-tuning

Go to the Azure AI Speech documentation to learn how to continue fine-tuning your professional voice model.

View fine-tuned models

After fine-tuning, you can access your custom voice models and deployments from the Fine-tuning page.

  1. Sign in to the Azure AI Foundry portal.

  2. Select Fine-tuning from the left pane.

  3. Select AI Service fine-tuning. You can view the status of your fine-tuning tasks and the models that were created.

    Screenshot of the page to view fine-tuned AI services models.

Custom voice content such as data, models, tests, and endpoints is organized into projects in Speech Studio. Each project is specific to a country/region, a language, and the gender of the voice you want to create. For example, you might create a project for a female voice for your call center's chat bots that use English in the United States.

Start fine-tuning

To fine-tune a professional voice model, follow these steps:

  1. Sign in to Speech Studio.

  2. Select the subscription and Speech resource to work with.

    Important

    Custom voice training is currently only available in some regions. After your voice model is trained in a supported region, you can copy it to a Speech resource in another region as needed. See footnotes in the regions table for more information.

  3. Select Custom voice > Create a project.

  4. Select Custom neural voice Pro > Next.

  5. Follow the instructions provided by the wizard to create your project.

Select the new project by name or select Go to project. You see these menu items in the left panel: Set up voice talent, Prepare training data, Train model, and Deploy model.

Professional voice projects contain the voice talent consent statement, training datasets, voice models, and endpoints.

Each project is specific to a country/region and language, and the gender of the voice you want to create. For example, you might create a project for a female voice for your call center's chat bots that use English in the United States.

Create a project

To create a professional voice project, use the Projects_Create operation of the custom voice API. Construct the request body according to the following instructions:

  • Set the required kind property to ProfessionalVoice. The kind can't be changed later.
  • Optionally, set the description property for the project description. The project description can be changed later.

Make an HTTP PUT request using the URI as shown in the following Projects_Create example.

  • Replace YourResourceKey with your Speech resource key.
  • Replace YourResourceRegion with your Speech resource region.
  • Replace ProjectId with a project ID of your choice. The case-sensitive ID must be unique within your Speech resource. The ID is used in the project's URI and can't be changed later.

curl -v -X PUT -H "Ocp-Apim-Subscription-Key: YourResourceKey" -H "Content-Type: application/json" -d '{
  "description": "Project description",
  "kind": "ProfessionalVoice"
}' "https://YourResourceRegion.api.cognitive.microsoft.com/customvoice/projects/ProjectId?api-version=2024-02-01-preview"
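
If you create projects from a script, it can help to assemble the URI and the request body from variables instead of editing the command by hand. The following is a minimal sketch under stated assumptions, not part of the custom voice API: the function names project_url, project_body, and create_project, and the SPEECH_KEY environment variable, are hypothetical names chosen for illustration. The URI pattern, headers, and body properties are taken from the Projects_Create example above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the custom voice API) that assemble
# the pieces of the Projects_Create request shown in the example above.

# Build the request URI from a Speech resource region and a project ID.
project_url() {
  printf 'https://%s.api.cognitive.microsoft.com/customvoice/projects/%s?api-version=2024-02-01-preview\n' \
    "$1" "$2"
}

# Build the JSON request body. "kind" is always ProfessionalVoice;
# "description" is included only when a non-empty value is passed.
# Assumption: the description contains no double quotes or backslashes,
# because this sketch does no JSON escaping.
project_body() {
  if [ -n "${1:-}" ]; then
    printf '{"description": "%s", "kind": "ProfessionalVoice"}\n' "$1"
  else
    printf '{"kind": "ProfessionalVoice"}\n'
  fi
}

# Send the PUT request: create_project REGION PROJECT_ID [DESCRIPTION].
# Reads the resource key from SPEECH_KEY (an assumed variable name)
# rather than hard-coding it in the script.
create_project() {
  curl -sS -X PUT \
    -H "Ocp-Apim-Subscription-Key: ${SPEECH_KEY}" \
    -H "Content-Type: application/json" \
    -d "$(project_body "${3:-}")" \
    "$(project_url "$1" "$2")"
}
```

With SPEECH_KEY exported in the environment, a call such as create_project YourResourceRegion ProjectId "Project description" sends the same request as the curl command above. Only create_project contacts the service; the other two functions just print strings, so you can inspect the URI and body before sending anything.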

You should receive a response body in the following format:

{
  "id": "ProjectId",
  "description": "Project description",
  "kind": "ProfessionalVoice",
  "createdDateTime": "2023-04-01T05:30:00.000Z"
}

You use the project id in subsequent API requests to add voice talent consent and create a training set.
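
Because later requests need the project id, a script might capture it from the response body. The following is a rough sketch, assuming a response shaped like the example above; project_id_from_response is a hypothetical helper name, and the text-based matching is only an approximation. A JSON-aware tool is more robust if one is available.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first "id" field found in
# a JSON response body read from standard input.
# Assumption: "id" is a top-level string field, as in the example response.
project_id_from_response() {
  # Put each field on its own line, then print the value of the first
  # line that starts with the "id" key.
  tr '{,' '\n\n' |
    sed -n 's/^[[:space:]]*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p' |
    head -n 1
}
```

For example, you could pipe the output of the Projects_Create request into project_id_from_response and store the result in a variable for the consent and training set requests.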
