Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Important
Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
In this article, you learn how to create the resources required to use Azure AI Foundry Models in your projects.
Understand the resources
Azure AI Foundry Models is a capability in Azure AI Foundry Services (formerly known Azure AI Services). You can create model deployments under the resource to consume their predictions. You can also connect the resource to Azure AI Hubs and Projects in Azure AI Foundry to create intelligent applications if needed. The following picture shows the high level architecture.
Azure AI Foundry Services don't require AI projects or AI hubs to operate and you can create them to consume flagship models from your applications. However, additional capabilities are available if you deploy an Azure AI project and hub, including playground, or agents.
The tutorial helps you create:
- An Azure AI Foundry resource.
- A model deployment for each of the models supported with standard deployments.
- (Optionally) An Azure AI project and hub.
- (Optionally) A connection between the hub and the models in Azure AI Foundry.
Prerequisites
To complete this article, you need:
- An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. Read Upgrade from GitHub Models to Azure AI Foundry if that's your case.
Create the resources
To create a project with an Azure AI Foundry (formerly known Azure AI Services) resource, follow these steps:
Go to Azure AI Foundry portal.
On the landing page, select Create project.
Give the project a name, for example "my-project".
In this tutorial, we create a brand new project under a new AI hub, hence, select Create new hub.
Give the hub a name, for example "my-hub" and select Next.
The wizard updates with details about the resources that are going to be created. Select Azure resources to be created to see the details.
You can see that the following resources are created:
Property Description Resource group The main container for all the resources in Azure. This helps get resources that work together organized. It also helps to have a scope for the costs associated with the entire project. Location The region of the resources that you're creating. Hub The main container for AI projects in Azure AI Foundry. Hubs promote collaboration and allow you to store information for your projects. AI Foundry In this tutorial, a new account is created, but Azure AI Foundry Services can be shared across multiple hubs and projects. Hubs use a connection to the resource to have access to the model deployments available there. To learn how, you can create connections between projects and Azure AI Foundry to consume Azure AI Foundry Models you can read Connect your AI project. Select Create. The resources creation process starts.
Once completed, your project is ready to be configured.
To use Azure AI Foundry Models, you need to add model deployments.
Next steps
You can decide and configure which models are available for inference in your Azure AI Foundry resource. When a given model is configured, you can then generate predictions from it by indicating its model name or deployment name on your requests. No further changes are required in your code to use it.
In this article, you'll learn how to add a new model to Azure AI Foundry.
Prerequisites
To complete this article, you need:
An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. Read Upgrade from GitHub Models to Azure AI Foundry Models if that's your case.
An Azure AI Foundry resource (formerly known as Azure AI Services). For more information, see Create and configure all the resources for Azure AI Foundry Models.
Models from Partners and Community require access to Azure Marketplace. Ensure you have the permissions required to subscribe to model offerings. Models Sold Directly by Azure don't have this requirement.
Install the Azure CLI and the
cognitiveservices
extension for Azure AI Services:az extension add -n cognitiveservices
Some of the commands in this tutorial use the
jq
tool, which might not be installed in your system. For installation instructions, see Downloadjq
.Identify the following information:
Your Azure subscription ID.
Your Azure AI Services resource name.
The resource group where the Azure AI Services resource is deployed.
Add models
To add a model, you first need to identify the model that you want to deploy. You can query the available models as follows:
Log in into your Azure subscription:
az login
If you have more than 1 subscription, select the subscription where your resource is located:
az account set --subscription $subscriptionId
Set the following environment variables with the name of the Azure AI Services resource you plan to use and resource group.
accountName="<ai-services-resource-name>" resourceGroupName="<resource-group>" ___location="eastus2"
If you don't have an Azure AI Services account create yet, you can create one as follows:
az cognitiveservices account create -n $accountName -g $resourceGroupName --custom-___domain $accountName --___location $___location --kind AIServices --sku S0
Let's see first which models are available to you and under which SKU. SKUs, also known as deployment types, define how Azure infrastructure is used to process requests. Models may offer different deployment types. The following command list all the model definitions available:
az cognitiveservices account list-models \ -n $accountName \ -g $resourceGroupName \ | jq '.[] | { name: .name, format: .format, version: .version, sku: .skus[0].name, capacity: .skus[0].capacity.default }'
Outputs look as follows:
{ "name": "Phi-3.5-vision-instruct", "format": "Microsoft", "version": "2", "sku": "GlobalStandard", "capacity": 1 }
Identify the model you want to deploy. You need the properties
name
,format
,version
, andsku
. The propertyformat
indicates the provider offering the model. Capacity might also be needed depending on the type of deployment.Add the model deployment to the resource. The following example adds
Phi-3.5-vision-instruct
:az cognitiveservices account deployment create \ -n $accountName \ -g $resourceGroupName \ --deployment-name Phi-3.5-vision-instruct \ --model-name Phi-3.5-vision-instruct \ --model-version 2 \ --model-format Microsoft \ --sku-capacity 1 \ --sku-name GlobalStandard
The model is ready to be consumed.
You can deploy the same model multiple times if needed as long as it's under a different deployment name. This capability might be useful in case you want to test different configurations for a given model, including content filters.
Use the model
Deployed models in can be consumed using the Azure AI model's inference endpoint for the resource. When constructing your request, indicate the parameter model
and insert the model deployment name you have created. You can programmatically get the URI for the inference endpoint using the following code:
Inference endpoint
az cognitiveservices account show -n $accountName -g $resourceGroupName | jq '.properties.endpoints["Azure AI Model Inference API"]'
To make requests to the Azure AI Foundry Models endpoint, append the route models
, for example https://<resource>.services.ai.azure.com/models
. You can see the API reference for the endpoint at Azure AI Foundry Models API reference page.
Inference keys
az cognitiveservices account keys list -n $accountName -g $resourceGroupName
Manage deployments
You can see all the deployments available using the CLI:
Run the following command to see all the active deployments:
az cognitiveservices account deployment list -n $accountName -g $resourceGroupName
You can see the details of a given deployment:
az cognitiveservices account deployment show \ --deployment-name "Phi-3.5-vision-instruct" \ -n $accountName \ -g $resourceGroupName
You can delete a given deployment as follows:
az cognitiveservices account deployment delete \ --deployment-name "Phi-3.5-vision-instruct" \ -n $accountName \ -g $resourceGroupName
Important
Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
In this article, you learn how to create the resources required to use Azure AI Foundry Models in your projects.
Understand the resources
Azure AI Foundry Models is a capability in Azure AI Foundry Services (formerly known Azure AI Services). You can create model deployments under the resource to consume their predictions. You can also connect the resource to Azure AI Hubs and Projects in Azure AI Foundry to create intelligent applications if needed. The following picture shows the high level architecture.
Azure AI Foundry Services don't require AI projects or AI hubs to operate and you can create them to consume flagship models from your applications. However, additional capabilities are available if you deploy an Azure AI project and hub, including playground, or agents.
The tutorial helps you create:
- An Azure AI Foundry resource.
- A model deployment for each of the models supported with standard deployments.
- (Optionally) An Azure AI project and hub.
- (Optionally) A connection between the hub and the models in Azure AI Foundry.
Prerequisites
To complete this article, you need:
- An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. Read Upgrade from GitHub Models to Azure AI Foundry if that's your case.
Install the Azure CLI.
Identify the following information:
- Your Azure subscription ID.
About this tutorial
The example in this article is based on code samples contained in the Azure-Samples/azureai-model-inference-bicep repository. To run the commands locally without having to copy or paste file content, use the following commands to clone the repository and go to the folder for your coding language:
git clone https://github.com/Azure-Samples/azureai-model-inference-bicep
The files for this example are in:
cd azureai-model-inference-bicep/infra
Permissions required to subscribe to Models from Partners and Community
Models from Partners and Community available for deployment (for example, Cohere models) require Azure Marketplace. Model providers define the license terms and set the price for use of their models using Azure Marketplace.
When deploying third-party models, ensure you have the following permissions in your account:
- On the Azure subscription:
Microsoft.MarketplaceOrdering/agreements/offers/plans/read
Microsoft.MarketplaceOrdering/agreements/offers/plans/sign/action
Microsoft.MarketplaceOrdering/offerTypes/publishers/offers/plans/agreements/read
Microsoft.Marketplace/offerTypes/publishers/offers/plans/agreements/read
Microsoft.SaaS/register/action
- On the resource group—to create and use the SaaS resource:
Microsoft.SaaS/resources/read
Microsoft.SaaS/resources/write
Create the resources
Follow these steps:
Use the template
modules/ai-services-template.bicep
to describe your Azure AI Services resource:modules/ai-services-template.bicep
@description('Location of the resource.') param ___location string = resourceGroup().___location @description('Name of the Azure AI Services account.') param accountName string @description('The resource model definition representing SKU') param sku string = 'S0' @description('Whether or not to allow keys for this account.') param allowKeys bool = true @allowed([ 'Enabled' 'Disabled' ]) @description('Whether or not public endpoint access is allowed for this account.') param publicNetworkAccess string = 'Enabled' @allowed([ 'Allow' 'Deny' ]) @description('The default action for network ACLs.') param networkAclsDefaultAction string = 'Allow' resource account 'Microsoft.CognitiveServices/accounts@2023-05-01' = { name: accountName ___location: ___location identity: { type: 'SystemAssigned' } sku: { name: sku } kind: 'AIServices' properties: { customSubDomainName: accountName publicNetworkAccess: publicNetworkAccess networkAcls: { defaultAction: networkAclsDefaultAction } disableLocalAuth: allowKeys } } output endpointUri string = 'https://${account.outputs.name}.services.ai.azure.com/models' output id string = account.id
Use the template
modules/ai-services-deployment-template.bicep
to describe model deployments:modules/ai-services-deployment-template.bicep
@description('Name of the Azure AI services account') param accountName string @description('Name of the model to deploy') param modelName string @description('Version of the model to deploy') param modelVersion string @allowed([ 'AI21 Labs' 'Cohere' 'Core42' 'DeepSeek' 'Meta' 'Microsoft' 'Mistral AI' 'OpenAI' ]) @description('Model provider') param modelPublisherFormat string @allowed([ 'GlobalStandard' 'Standard' 'GlobalProvisioned' 'Provisioned' ]) @description('Model deployment SKU name') param skuName string = 'GlobalStandard' @description('Content filter policy name') param contentFilterPolicyName string = 'Microsoft.DefaultV2' @description('Model deployment capacity') param capacity int = 1 resource modelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2024-04-01-preview' = { name: '${accountName}/${modelName}' sku: { name: skuName capacity: capacity } properties: { model: { format: modelPublisherFormat name: modelName version: modelVersion } raiPolicyName: contentFilterPolicyName == null ? 'Microsoft.Nill' : contentFilterPolicyName } }
For convenience, we define the model we want to have available in the service using a JSON file. The file infra/models.json contains a list of JSON object with keys
name
,version
,provider
, andsku
, which defines the models the deployment will provision. Since the models support standard deployments, adding model deployments doesn't incur on extra cost. Modify the file by removing/adding the model entries you want to have available. The following example shows only the first 7 lines of the JSON file:models.json
[ { "name": "AI21-Jamba-1.5-Large", "version": "1", "provider": "AI21 Labs", "sku": "GlobalStandard" },
If you plan to use projects (recommended), you need the templates for creating a project, hub, and a connection to the Azure AI Services resource:
modules/project-hub-template.bicep
param ___location string = resourceGroup().___location @description('Name of the Azure AI hub') param hubName string = 'hub-dev' @description('Name of the Azure AI project') param projectName string = 'intelligent-apps' @description('Name of the storage account used for the workspace.') param storageAccountName string = replace(hubName, '-', '') param keyVaultName string = replace(hubName, 'hub', 'kv') param applicationInsightsName string = replace(hubName, 'hub', 'log') @description('The container registry resource id if you want to create a link to the workspace.') param containerRegistryName string = replace(hubName, '-', '') @description('The tags for the resources') param tagValues object = { owner: 'santiagxf' project: 'intelligent-apps' environment: 'dev' } var tenantId = subscription().tenantId var resourceGroupName = resourceGroup().name var storageAccountId = resourceId(resourceGroupName, 'Microsoft.Storage/storageAccounts', storageAccountName) var keyVaultId = resourceId(resourceGroupName, 'Microsoft.KeyVault/vaults', keyVaultName) var applicationInsightsId = resourceId(resourceGroupName, 'Microsoft.Insights/components', applicationInsightsName) var containerRegistryId = resourceId( resourceGroupName, 'Microsoft.ContainerRegistry/registries', containerRegistryName ) resource storageAccount 'Microsoft.Storage/storageAccounts@2019-04-01' = { name: storageAccountName ___location: ___location sku: { name: 'Standard_LRS' } kind: 'StorageV2' properties: { encryption: { services: { blob: { enabled: true } file: { enabled: true } } keySource: 'Microsoft.Storage' } supportsHttpsTrafficOnly: true } tags: tagValues } resource keyVault 'Microsoft.KeyVault/vaults@2019-09-01' = { name: keyVaultName ___location: ___location properties: { tenantId: tenantId sku: { name: 'standard' family: 'A' } enableRbacAuthorization: true accessPolicies: [] } tags: tagValues } resource applicationInsights 'Microsoft.Insights/components@2018-05-01-preview' = { name: applicationInsightsName ___location: ___location kind: 'web' properties: { Application_Type: 'web' } tags: tagValues } resource containerRegistry 'Microsoft.ContainerRegistry/registries@2019-05-01' = { name: containerRegistryName ___location: ___location sku: { name: 'Standard' } properties: { adminUserEnabled: true } tags: tagValues } resource hub 'Microsoft.MachineLearningServices/workspaces@2024-07-01-preview' = { name: hubName kind: 'Hub' ___location: ___location identity: { type: 'systemAssigned' } sku: { tier: 'Standard' name: 'standard' } properties: { description: 'Azure AI hub' friendlyName: hubName storageAccount: storageAccountId keyVault: keyVaultId applicationInsights: applicationInsightsId containerRegistry: (empty(containerRegistryName) ? null : containerRegistryId) encryption: { status: 'Disabled' keyVaultProperties: { keyVaultArmId: keyVaultId keyIdentifier: '' } } hbiWorkspace: false } tags: tagValues } resource project 'Microsoft.MachineLearningServices/workspaces@2024-07-01-preview' = { name: projectName kind: 'Project' ___location: ___location identity: { type: 'systemAssigned' } sku: { tier: 'Standard' name: 'standard' } properties: { description: 'Azure AI project' friendlyName: projectName hbiWorkspace: false hubResourceId: hub.id } tags: tagValues }
modules/ai-services-connection-template.bicep
@description('Name of the hub where the connection will be created') param hubName string @description('Name of the connection') param name string @description('Category of the connection') param category string = 'AIServices' @allowed(['AAD', 'ApiKey', 'ManagedIdentity', 'None']) param authType string = 'AAD' @description('The endpoint URI of the connected service') param endpointUri string @description('The resource ID of the connected service') param resourceId string = '' @secure() param key string = '' resource connection 'Microsoft.MachineLearningServices/workspaces/connections@2024-04-01-preview' = { name: '${hubName}/${name}' properties: { category: category target: endpointUri authType: authType isSharedToAll: true credentials: authType == 'ApiKey' ? { key: key } : null metadata: { ApiType: 'Azure' ResourceId: resourceId } } }
Define the main deployment:
deploy-with-project.bicep
@description('Location to create the resources in') param ___location string = resourceGroup().___location @description('Name of the resource group to create the resources in') param resourceGroupName string = resourceGroup().name @description('Name of the AI Services account to create') param accountName string = 'azurei-models-dev' @description('Name of the project hub to create') param hubName string = 'hub-azurei-dev' @description('Name of the project to create in the project hub') param projectName string = 'intelligent-apps' @description('Path to a JSON file with the list of models to deploy. Each model is a JSON object with the following properties: name, version, provider') var models = json(loadTextContent('models.json')) module aiServicesAccount 'modules/ai-services-template.bicep' = { name: 'aiServicesAccount' scope: resourceGroup(resourceGroupName) params: { accountName: accountName ___location: ___location } } module projectHub 'modules/project-hub-template.bicep' = { name: 'projectHub' scope: resourceGroup(resourceGroupName) params: { hubName: hubName projectName: projectName } } module aiServicesConnection 'modules/ai-services-connection-template.bicep' = { name: 'aiServicesConnection' scope: resourceGroup(resourceGroupName) params: { name: accountName authType: 'AAD' endpointUri: aiServicesAccount.outputs.endpointUri resourceId: aiServicesAccount.outputs.id hubName: hubName } dependsOn: [ projectHub ] } @batchSize(1) module modelDeployments 'modules/ai-services-deployment-template.bicep' = [ for (item, i) in models: { name: 'deployment-${item.name}' scope: resourceGroup(resourceGroupName) params: { accountName: accountName modelName: item.name modelVersion: item.version modelPublisherFormat: item.provider skuName: item.sku } dependsOn: [ aiServicesAccount ] } ] output endpoint string = aiServicesAccount.outputs.endpointUri
Log into Azure:
az login
Ensure you are in the right subscription:
az account set --subscription "<subscription-id>"
Run the deployment:
RESOURCE_GROUP="<resource-group-name>" az deployment group create \ --resource-group $RESOURCE_GROUP \ --template-file deploy-with-project.bicep
If you want to deploy only the Azure AI Services resource and the model deployments, use the following deployment file:
deploy.bicep
@description('Location to create the resources in') param ___location string = resourceGroup().___location @description('Name of the resource group to create the resources in') param resourceGroupName string = resourceGroup().name @description('Name of the AI Services account to create') param accountName string = 'azurei-models-dev' @description('Path to a JSON file with the list of models to deploy. Each model is a JSON object with the following properties: name, version, provider') var models = json(loadTextContent('models.json')) module aiServicesAccount 'modules/ai-services-template.bicep' = { name: 'aiServicesAccount' scope: resourceGroup(resourceGroupName) params: { accountName: accountName ___location: ___location } } @batchSize(1) module modelDeployments 'modules/ai-services-deployment-template.bicep' = [ for (item, i) in models: { name: 'deployment-${item.name}' scope: resourceGroup(resourceGroupName) params: { accountName: accountName modelName: item.name modelVersion: item.version modelPublisherFormat: item.provider skuName: item.sku } dependsOn: [ aiServicesAccount ] } ] output endpoint string = aiServicesAccount.outputs.endpointUri
Run the deployment:
RESOURCE_GROUP="<resource-group-name>" az deployment group create \ --resource-group $RESOURCE_GROUP \ --template-file deploy.bicep
The template outputs the Azure AI Foundry Models endpoint that you can use to consume any of the model deployments you have created.