Note
This document refers to the Microsoft Foundry (classic) portal.
Note
This document refers to the Microsoft Foundry (new) portal.
Managing Microsoft Foundry costs effectively starts with planning. This article shows you how to estimate expenses before deployment, track spending in real-time, and set up alerts to avoid budget surprises.
What you'll learn:
- Estimate costs using the Azure pricing calculator
- Monitor actual spending across different model types
- Create budgets and alerts to control expenses
- Understand billing differences between Azure-hosted and partner models
This article describes how to plan for and manage costs for Microsoft Foundry. First, use the Azure pricing calculator to help plan for Foundry costs before you add resources. Next, as you add Azure resources, review the estimated costs. After you start using Azure resources, use cost management features to set budgets and monitor costs.
Tip
Foundry doesn't have a specific page in the Azure pricing calculator. Foundry is composed of several other Azure services, some of which are optional. This article shows how to use the pricing calculator to estimate costs for these services.
You use Foundry Tools in Foundry portal. Costs for Foundry Tools are only a portion of the monthly costs in your Azure bill. You're billed for all services and resources used in your Azure subscription, including third-party services. You can also review forecasted costs and identify spending trends to find areas where you might want to act.
Prerequisites
To view and analyze costs, you need:
- An Azure account with read access to Cost Management data
- One of the supported Azure account types
Need to grant access? See how to assign access to Cost Management data.
Estimate costs before using Foundry Tools
Use the Azure pricing calculator to estimate costs before you add Foundry Tools.
Search for and select a product, such as Azure Language in Foundry Tools, in the Azure pricing calculator.
Select more than one product to estimate costs for multiple products. For example, search for and select Azure AI Search to add potential costs.
As you add new resources to your project, return to this calculator and add the same resource here to update your cost estimates.
Costs associated with Foundry
When you create a Foundry resource, you pay to use services like Azure OpenAI, Speech, Content Safety, Vision, Document Intelligence, and Language. Costs vary for each service and for some features within each service. Find more details on the Foundry Tools pricing page.
Understand the billing model for Foundry Tools
Foundry Tools run on Azure infrastructure that accrues costs when you deploy a new resource. Extra infrastructure can also accrue cost, so you need to manage that cost when you make changes to deployed resources.
When you create or use Foundry Tools resources, you're charged based on the services that you use. Two billing models are available for Foundry Tools:
Serverless API: With serverless API pricing, you're billed according to the Foundry Tools offering you use, based on its billing information.
Commitment tiers: With commitment tier pricing, you commit to using several service features for a fixed fee, so you have a predictable total cost based on the needs of your workload. You're billed based on the plan you choose. For information on available services, how to sign up, and considerations when buying a plan, see Quickstart: Purchase commitment tier pricing.
Note
If you use the resource above the quota provided by the commitment plan, you pay for the extra usage as described in the overage amount in the Azure portal when you buy a commitment plan.
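The commitment tier arithmetic described above can be sketched as a fixed fee plus overage on any usage beyond the included quota. This is a minimal sketch, not the billing implementation: the function name, the unit of usage, and every number below are hypothetical. The actual fee, quota, and overage amount are shown in the Azure portal when you buy a commitment plan.

```python
def commitment_tier_cost(fixed_fee, included_units, used_units, overage_rate_per_unit):
    """Estimate a billing period's cost under a commitment tier.

    All arguments are hypothetical inputs: look up the real fee, quota,
    and overage amount for your plan in the Azure portal.
    """
    # Usage up to the included quota is covered by the fixed fee.
    overage_units = max(0, used_units - included_units)
    return fixed_fee + overage_units * overage_rate_per_unit


# Hypothetical plan: 500.00 fixed fee, 1,000 included units, 0.25 per extra unit.
within_quota = commitment_tier_cost(500.0, 1_000, 800, 0.25)
over_quota = commitment_tier_cost(500.0, 1_000, 1_200, 0.25)
print(within_quota, over_quota)
```

Staying within the quota costs only the fixed fee, which is what makes the total predictable.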
Understand the billing model for Foundry Models
Token-based pricing
Language models understand and process inputs by breaking them down into tokens. For reference, each token is roughly four characters for typical English text. Models that can process images or audio break them down into tokens too for billing purposes. The number of tokens per image or audio content depends on the model and the resolution of the input.
Costs per token vary depending on which model series you choose, but in all cases, models deployed in Foundry are charged by the number of tokens processed. For example, Azure OpenAI chat completions model inference is charged per 1,000 tokens with different rates depending on the model and deployment type. On the pricing page, rates for most models are now listed per 1 million tokens.
Token costs are for both input and output. For example, suppose you have a 1,000-token JavaScript code sample that you ask a model to convert to Python. You pay for approximately 1,000 tokens for the initial input request sent, and 1,000 more tokens for the output that is received in response for a total of 2,000 tokens.
In practice, for this type of completion call, the token input/output isn't perfectly 1:1. A conversion from one programming language to another can result in a longer or shorter output depending on many factors. One such factor is the value assigned to the max_tokens parameter.
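The input-plus-output arithmetic can be sketched as follows. This is a rough estimate under stated assumptions: the function name is illustrative, and the per-million-token prices are hypothetical placeholders, not actual rates. Check the pricing page for the model and deployment type you use.

```python
def estimate_token_cost(input_tokens, output_tokens,
                        input_price_per_million, output_price_per_million):
    """Estimate the cost of one request from its token counts.

    Prices are expressed per 1 million tokens. Input and output tokens
    are priced separately because their rates usually differ.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# The JavaScript-to-Python example: about 1,000 tokens in and 1,000 tokens out.
# 2.50 and 10.00 are hypothetical prices per 1 million tokens.
cost = estimate_token_cost(1_000, 1_000, 2.50, 10.00)
print(f"Estimated cost: {cost:.4f}")
```

Because the output length isn't known in advance, an estimate like this is most useful with a range of output token counts rather than a single value.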
Models sold directly by Azure
Models sold directly by Azure (including Azure OpenAI) are charged directly. They appear as billing meters under each Foundry resource. Microsoft handles this billing directly. When you inspect your bill, you see billing meters that account for inputs and outputs for each consumed model.
Models from partners and community
Models provided by third-party providers, such as Cohere, are billed using Azure Marketplace. Unlike Microsoft billing meters, those entries are associated with the resource group where your Foundry resource is deployed instead of with the Foundry resource itself. Because model providers charge you directly, you see entries under the category Marketplace and Service Name SaaS accounting for inputs and outputs for each consumed model.
Important
This distinction between Models Sold Directly by Azure (including Azure OpenAI) and Models from Partners and Community only affects how the model is made available to you and how you are charged. In all cases, models are hosted within Azure cloud, and there's no interaction with external services or providers.
Fine-tuned models
Azure OpenAI fine-tuning models are charged based on the number of tokens in your training file. For the latest prices, see the official pricing page.
Once your fine-tuned model is deployed, you're also charged based on:
- Hosting hours.
- Inference per 1,000 tokens (broken down by input usage and output usage).
The hosting hours cost is important to be aware of because after a fine-tuned model is deployed, it continues to incur an hourly cost regardless of whether you're actively using it. Monitor deployed fine-tuned model costs closely.
Important
After you deploy a customized model, if at any time the deployment remains inactive for more than 15 days, the deployment is deleted. The deployment of a customized model is inactive if the model was deployed more than 15 days ago and no completions or chat completions calls were made to it during a continuous 15-day period.
The deletion of an inactive deployment doesn't delete or affect the underlying customized model, and the customized model can be redeployed at any time.
Each customized (fine-tuned) model that's deployed incurs an hourly hosting cost regardless of whether completions or chat completions calls are being made to the model.
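To see why hosting hours matter, the following sketch adds the hourly hosting charge to token-based inference charges for a deployed fine-tuned model. The function name and all rates are hypothetical placeholders. Training cost is not included here.

```python
def fine_tuned_deployment_cost(hosting_rate_per_hour, hours_deployed,
                               input_tokens, output_tokens,
                               input_price_per_1k, output_price_per_1k):
    """Estimate the cost of a deployed fine-tuned model over a period.

    Hosting accrues for every deployed hour, whether or not the model
    receives any calls. Inference is charged per 1,000 tokens.
    """
    hosting = hosting_rate_per_hour * hours_deployed
    inference = (input_tokens / 1_000 * input_price_per_1k
                 + output_tokens / 1_000 * output_price_per_1k)
    return hosting + inference


# Hypothetical: 2.00 per hosting hour, deployed for 720 hours (30 days),
# with no calls made at all during that time.
idle_cost = fine_tuned_deployment_cost(2.00, 720, 0, 0, 0.5, 1.5)
print(f"Cost of an idle deployment: {idle_cost:.2f}")
```

An idle deployment still accrues the full hosting charge, which is the reason to monitor or remove deployments you aren't using.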
HTTP Error response code and billing status
If the service performs processing, you're charged even if the status code isn't successful (not 200). For example, a 400 error due to a content filter or input limit, or a 408 error due to a timeout.
If the service doesn't perform processing, you aren't charged. For example, a 401 error due to authentication or a 429 error due to exceeding the rate limit.
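When you reconcile request logs against your bill, the examples above can be captured in a small lookup. This is a simplification: the deciding factor is whether the service performed processing, not the status code alone. The table below encodes only the cases named in this section, and the function name is illustrative.

```python
# Status codes taken from the examples in this section only.
# True means the request was processed and is charged; False means it isn't.
DOCUMENTED_EXAMPLES = {
    200: True,   # successful request
    400: True,   # for example, content filter or input limit
    408: True,   # for example, timeout after processing started
    401: False,  # authentication failure, no processing
    429: False,  # rate limit exceeded, no processing
}


def is_charged(status_code):
    """Return True or False for documented cases, or None when unknown."""
    return DOCUMENTED_EXAMPLES.get(status_code)


for code in (200, 400, 401, 429, 500):
    print(code, is_charged(code))
```

Returning None for any other code keeps the lookup from guessing about cases this article doesn't cover.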
Monitor costs
As you use Foundry, you incur costs. Azure resource usage unit costs vary by time intervals (seconds, minutes, hours, and days) or by unit usage (bytes, megabytes, and so on). You can see the incurred costs in cost analysis.
When you use cost analysis, you view costs in graphs and tables for different time intervals. Some examples are by day, current and prior month, and year. You also view costs against budgets and forecasted costs. Switching to longer views over time can help you identify spending trends so you can see where overspending might occur. If you create budgets, you can also easily see where they're exceeded.
You can access cost information from either the Microsoft Foundry portal or the Azure portal.
Important
Your Foundry costs are only a subset of your overall application or solution costs. You need to monitor costs for all Azure resources used in your application or solution.
Configure permissions to view costs
You need the AI User role and Cost Management Reader role at the resource group or subscription level to view the costs.
Alternatively, you can create a custom role with the following permissions:
- Microsoft.Consumption/*/read
- Microsoft.CostManagement/*/read
- Microsoft.Resources/subscriptions/read
- Microsoft.CognitiveServices/accounts/AIServices/usage/read
Note
You need the Owner role at the subscription or resource group scope to create custom roles in that scope.
To create a custom role, use one of the following articles:
For more information about custom roles, see Azure custom roles.
To create a custom role, construct a role definition JSON file that specifies the permission and scope for the role. The following example defines a custom Foundry Cost Reader role scoped at a specific resource level:
{
  "Name": "Foundry Cost Reader",
  "IsCustom": true,
  "Description": "Can see cost metrics in Foundry",
  "Actions": [
    "Microsoft.Consumption/*/read",
    "Microsoft.CostManagement/*/read",
    "Microsoft.Resources/subscriptions/read",
    "Microsoft.CognitiveServices/accounts/AIServices/usage/read"
  ],
  "NotActions": [],
  "DataActions": [],
  "NotDataActions": [],
  "AssignableScopes": [
    "/subscriptions/<subscriptionId>/resourceGroups/<resourceGroupName>/providers/Microsoft.CognitiveServices/accounts/<foundryResourceName>"
  ]
}
Replace <subscriptionId>, <resourceGroupName>, and <foundryResourceName> with your actual values.
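If you manage several Foundry resources, you might generate the role definition file instead of editing it by hand. The following is a minimal sketch that fills the three placeholders and writes the same JSON structure shown above. The function name, the output file name, and the sample values are hypothetical.

```python
import json


def build_cost_reader_role(subscription_id, resource_group, foundry_resource):
    """Build the Foundry Cost Reader role definition for one Foundry resource."""
    scope = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.CognitiveServices/accounts/{foundry_resource}"
    )
    return {
        "Name": "Foundry Cost Reader",
        "IsCustom": True,
        "Description": "Can see cost metrics in Foundry",
        "Actions": [
            "Microsoft.Consumption/*/read",
            "Microsoft.CostManagement/*/read",
            "Microsoft.Resources/subscriptions/read",
            "Microsoft.CognitiveServices/accounts/AIServices/usage/read",
        ],
        "NotActions": [],
        "DataActions": [],
        "NotDataActions": [],
        "AssignableScopes": [scope],
    }


# Hypothetical values: replace them with your own before use.
role = build_cost_reader_role(
    "00000000-0000-0000-0000-000000000000", "my-resource-group", "my-foundry"
)
with open("foundry-cost-reader.json", "w", encoding="utf-8") as handle:
    json.dump(role, handle, indent=2)
```

The resulting file can then be used with whichever custom role creation method you prefer.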
Monitor in Foundry portal
- Sign in to Microsoft Foundry. Make sure the New Foundry toggle is on. These steps refer to Foundry (new).
- Use the sections below to monitor costs.
Note
These are estimated values and do not reflect any discounts or special contracted pricing that may appear on your final bill. They also cover only standard deployment costs, not provisioned throughput offerings.
Agent costs
- Select Operate in the upper-right navigation.
- Select Overview in the left pane.
- At the top of the page, select the subscription, one or more projects, and a date range.
- The Estimated cost tile shows estimates of all the agents for the selected project(s) for the selected dates. These estimates do not currently include prompt agent and non-Foundry agent costs.
For individual agent estimates:
- Select Assets in the left pane.
- Select the Agents tab.
- The Estimated costs column shows monthly estimates based on the agent configuration and usage patterns.
For more details of an individual agent:
- Select Build in the upper-right navigation.
- Select Agents in the left pane.
- Select an agent.
- Select the Monitor tab for the agent.
- Set the date range in the upper-right corner.
- Operational metrics show the token cost and usage for the given range.
Model deployment costs
- Select Build in the upper-right navigation.
- Select Models in the left pane.
- Select a model.
- Select the Monitor tab.
- Select the date range in the upper right corner. You see the total cost along with an estimated cost chart for the given range.
When you select the View More Details or the Azure Cost Management link, you're directed to the Cost Management section of the Azure portal. The costs displayed there reflect the aggregated charges for the entire Cognitive Services account, so they differ from the costs shown here, which are specific to the selected model. These costs are available only in USD, not in the user's billing currency.
Note
Token and request charts can sometimes show lower values than the Estimated cost view because late‑arrival usage events may not be included in those charts. If there’s a mismatch, rely on Estimated cost as the most accurate view, and note that your Azure Cost Management invoice remains the final source of truth.
Monitor in Azure portal
Here's an example of how to monitor costs in the Azure portal. The costs are used as an example only. Your costs vary depending on the services that you use and the amount of usage.
Sign in to the Azure portal
You can view costs for a resource group or for an individual Foundry resource.
Tip
To open your Resource group:
- Sign in to Microsoft Foundry. Make sure the New Foundry toggle is off. These steps refer to Foundry (classic).
- Select your project, then select Management center from the left menu.
- Under the Resource heading, select Overview.
- Under the Resource properties, select the link to open it directly in the Azure portal.
Tip
To open your Foundry resource in Azure portal:
- Sign in to Microsoft Foundry. Make sure the New Foundry toggle is on. These steps refer to Foundry (new).
- Select Operate from the upper-right navigation.
- Select Admin.
- Select the link for the parent resource in the second column.
- Select Manage this resource in the Azure portal under the View resource heading in the upper-right.
In the Azure portal, for either your resource group or Foundry resource, select Cost analysis under Cost Management.
You see the cost overview. You can also add filters such as deployment level tags and user defined tags. For example, to see the costs based on model deployment:
Select Costs by resource > Resources to open the Cost analysis page.
You can see cost of your Foundry resource and the split of that cost across multiple model deployments under that resource.
Understand cost breakdown by meter
To understand the breakdown of the cost, use the Cost Analysis tool in Azure portal. Follow these steps to understand the cost of inference:
Sign in to the Azure portal and select the resource group that contains the project you want to monitor.
In Azure portal, select Cost analysis under Cost Management.
By default, cost analysis is scoped to the selected resource group.
Important
Scope Cost Analysis to the resource group where you deployed the Foundry resource. The cost meters associated with Models from Partners and Community display under the resource group instead of the Foundry resource.
Change Group by to Meter. You can now see that, for this resource group, the costs come from different model series.
Models sold directly by Azure
Models sold directly by Azure (including Azure OpenAI) are charged directly. They appear as billing meters under each Foundry resource. Microsoft handles this billing directly. When you inspect your bill, you see billing meters that account for inputs and outputs for each consumed model.
Models from partners and community
Models provided by third-party providers, like Cohere, are billed using Azure Marketplace. Unlike Microsoft billing meters, those entries are associated with the resource group where your Foundry resource is deployed instead of with the Foundry resource itself. Because model providers charge you directly, you see entries under the category Marketplace and Service Name SaaS accounting for inputs and outputs for each consumed model.
Important
This distinction between Models Sold Directly by Azure (including Azure OpenAI) and Models from Partners and Community only affects how the model is made available to you and how you are charged. In all cases, models are hosted within Azure cloud and there's no interaction with external services or providers.
Monitor costs by resource
You can get more detailed billing information by grouping costs by resource:
In Cost Analysis, select View > Cost by resource.
Now you can see the resources generating each of the billing meters. To understand the breakdown of what makes up that cost, it can help to change Group by to Meter and switch the chart type to Line.
Azure OpenAI models and Microsoft models are displayed as meters under each Foundry Tool resource.
Some providers' models are displayed as meters under Global resources. The word Global isn't related to the SKU of the model deployment (for instance, Global standard). If you have multiple Foundry Tools resources, your bill contains one entry for each model for each Foundry Tools resource. The resource meters have the format [model-name]-[GUID], where [GUID] is a unique identifier associated with a given Foundry Tools resource. You see billing meters accounting for inputs and outputs for each model you consumed.
It's important to understand scope when you evaluate costs associated with Foundry Tools. If your resources are part of the same resource group, you can scope Cost Analysis at that level to understand the effect on costs. If your resources are spread across multiple resource groups, you can scope to the subscription level.
When scoped at a higher level, you often need to add more filters to focus on Azure OpenAI usage. When scoped at the subscription level, you see many other resources that you might not care about in the context of Azure OpenAI cost management. When you scope at the subscription level, navigate to the full Cost analysis tool under the Cost Management service.
Here's an example of how to use the Cost analysis tool to see your accumulated costs for a subscription or resource group:
- Search for Cost Management in the top Azure search bar to navigate to the full service experience, which includes more options such as creating budgets.
- If necessary, select change if the Scope: isn't pointing to the resource group or subscription you want to analyze.
- On the left, select Reporting + analytics > Cost analysis.
- On the All views tab, select Accumulated costs.
The cost analysis dashboard shows the accumulated costs that are analyzed depending on what you specified for Scope.
If you try to add a filter by service, you can't find Azure OpenAI in the list. This happens because Azure OpenAI shares the Cognitive Services service-level filter with a subset of Foundry Tools. To see all Azure OpenAI resources across a subscription without any other type of Foundry Tools resource, scope to Service tier: Azure OpenAI instead.
Monitor costs for models in Azure Marketplace
Azure Marketplace offers serverless API deployments. Model publishers might apply different costs depending on the offering. Each project in the Foundry portal has its own subscription with the offering, which you can use to monitor the costs and consumption happening on that project. Use Microsoft Cost Management to monitor the costs:
Sign in to the Azure portal
Select the portal menu icon to open the left pane.
On the left pane, select Cost Management + Billing and then select Cost Management.
On the left pane, under the section for Reporting + analytics, select Cost Analysis.
Select a view such as Resources. The cost associated with each resource is displayed.
On the Type column, select the filter icon to filter all the resources of type microsoft.saas/resources. This type corresponds to resources created from offers available in Azure Marketplace. For convenience, you can filter by resource types containing the string SaaS.
One resource is displayed for each model offer per project. Naming of those resources is [Model offer name]-[GUID].
Expand the resource details to see each of the cost meters associated with the resource.
- Tier represents the offering.
- Product is the specific product inside the offering.
Some model providers might use the same name for both.
Tip
Remember that one resource is created per project, for each plan that your project subscribes to.
When you expand the details, costs are reported for each of the meters associated with the offering. Each meter might track a different source of cost, such as inferencing or fine-tuning. The following meters are displayed (when some cost is associated with them):
| Meter | Group | Description |
| --- | --- | --- |
| paygo-inference-input-tokens | Base model | Costs associated with the tokens used as input for inference of a base model. |
| paygo-inference-output-tokens | Base model | Costs associated with the tokens generated as output for the inference of a base model. |
| paygo-finetuned-model-inference-hosting | Fine-tuned model | Costs associated with the hosting of an inference endpoint for a fine-tuned model. This value isn't the cost of hosting the model, but the cost of having an endpoint serving it. |
| paygo-finetuned-model-inference-input-tokens | Fine-tuned model | Costs associated with the tokens used as input for inference of a fine-tuned model. |
| paygo-finetuned-model-inference-output-tokens | Fine-tuned model | Costs associated with the tokens generated as output for the inference of a fine-tuned model. |
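To compare base model spend with fine-tuned model spend, you can roll the meters up by their group. The sketch below uses the meter names and groups listed above. The cost figures are made-up sample values, and the function name is illustrative.

```python
# Meter-to-group mapping taken from the list of meters in this section.
METER_GROUPS = {
    "paygo-inference-input-tokens": "Base model",
    "paygo-inference-output-tokens": "Base model",
    "paygo-finetuned-model-inference-hosting": "Fine-tuned model",
    "paygo-finetuned-model-inference-input-tokens": "Fine-tuned model",
    "paygo-finetuned-model-inference-output-tokens": "Fine-tuned model",
}


def cost_by_group(meter_costs):
    """Sum per-meter costs into per-group totals."""
    totals = {}
    for meter, cost in meter_costs.items():
        # Meters that aren't in the mapping are reported separately.
        group = METER_GROUPS.get(meter, "Unknown")
        totals[group] = totals.get(group, 0.0) + cost
    return totals


# Made-up sample costs for one Marketplace resource.
sample_costs = {
    "paygo-inference-input-tokens": 3.0,
    "paygo-inference-output-tokens": 9.0,
    "paygo-finetuned-model-inference-hosting": 24.0,
}
group_totals = cost_by_group(sample_costs)
print(group_totals)
```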
Create budgets
Prevent cost overruns with automated alerts. Create budgets that track your spending limits and set up alerts to notify you when costs approach or exceed thresholds.
Best practice: Create budgets and alerts for Azure subscriptions and resource groups as part of an overall cost monitoring strategy.
Create budgets with filters for specific resources or services in Azure if you want more granularity in your monitoring. Filters help ensure that you don't accidentally create new resources that cost more money. For more about filter options when you create a budget, see Group and filter options.
Important
While OpenAI has an option for hard limits that prevent you from going over your budget, Azure OpenAI doesn't currently provide this functionality. You can start automation from action groups as part of your budget notifications to take more advanced actions, but this functionality requires additional custom development.
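Budget alerts fire when spending reaches a percentage of the budget amount. The following sketch shows that threshold logic in isolation, for example inside custom automation that you start from an action group. The function name, the threshold percentages, and the amounts are hypothetical. It doesn't call any Azure API.

```python
def crossed_thresholds(budget_amount, actual_spend, thresholds_percent=(50, 80, 100)):
    """Return the alert thresholds (in percent) that current spend has reached."""
    if budget_amount <= 0:
        raise ValueError("budget_amount must be positive")
    percent_used = actual_spend / budget_amount * 100
    return [t for t in thresholds_percent if percent_used >= t]


# Hypothetical monthly budget of 1,000 with 850 spent so far.
alerts = crossed_thresholds(1_000, 850)
print(alerts)
```

Because alerts notify rather than block, any action such as pausing a workload has to be implemented in the automation itself.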
Export cost data
You can export your cost data to a storage account. Exporting data is helpful when you or others need to do additional data analysis for costs. For example, finance teams can analyze the data by using Excel or Power BI. You can export your costs on a daily, weekly, or monthly schedule and set a custom date range. Exporting cost data is the recommended way to retrieve cost datasets.
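Once an export lands in your storage account, a short script is often enough for a first pass over the data. The following sketch totals cost per meter from CSV text. The column names MeterName and CostInBillingCurrency and the sample rows are assumptions for illustration: check the header row of your own export, because column names vary by export type and account type.

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for a downloaded export file.
SAMPLE_EXPORT = """MeterName,CostInBillingCurrency
input-tokens-meter,1.50
output-tokens-meter,6.00
input-tokens-meter,2.50
"""


def total_cost_by_meter(csv_text, meter_column="MeterName",
                        cost_column="CostInBillingCurrency"):
    """Sum the cost column for each distinct meter name."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[meter_column]] += float(row[cost_column])
    return dict(totals)


totals = total_cost_by_meter(SAMPLE_EXPORT)
# Print the most expensive meters first.
for meter, cost in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{meter}: {cost:.2f}")
```

For a real export, read the file from disk and pass its contents to the same function.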
Other costs that might accrue
Enabling capabilities such as sending data to Azure Monitor Logs and alerting incurs extra costs for those services. These costs are visible under those other services and at the subscription level, but aren't visible when scoped just to your Foundry resource.
Using Azure Prepayment
You can pay for Models Sold Directly by Azure charges with your Azure Prepayment (previously called monetary commitment) credit. However, you can't use Azure Prepayment credit to pay for charges for other provider models because they're billed through Azure Marketplace.
For more information, see Azure pricing calculator.
Related content
- Foundry management center
- Foundry status dashboard
- Learn how to optimize your cloud investment with cost management.
- Learn more about managing costs with cost analysis.
- Learn about how to prevent unexpected costs.
- Take the Cost Management guided learning course.