The large language model (LLM) tool in prompt flow lets you use large language models from OpenAI, Azure OpenAI in Azure AI Foundry Models, or any language model supported by the Azure AI model inference API for natural language processing.
Prompt flow provides several large language model APIs:
- Completion: OpenAI's completion models generate text based on provided prompts.
- Chat: OpenAI's chat models and the Azure AI chat models facilitate interactive conversations with text-based inputs and responses.
The Embeddings API isn't available in the LLM tool. Use the embedding tool to generate embeddings with OpenAI or Azure OpenAI.
Note
The LLM tool in prompt flow does not support reasoning models (such as OpenAI o1 or o3). For reasoning model integration, use the Python tool to call the model APIs directly. For more information, see Call a reasoning model from the Python tool.
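For example, a Python tool node along the following lines can call a reasoning model deployment directly. This is a minimal sketch that assumes the openai v1 SDK, a hypothetical Azure OpenAI deployment named `o1-deployment`, and an endpoint and key supplied through environment variables; adjust it to your own deployment and API version.

```python
import os

from openai import AzureOpenAI
from promptflow.core import tool  # promptflow >= 1.0; older flows use `from promptflow import tool`


@tool
def call_reasoning_model(question: str) -> str:
    # Assumption: endpoint and key are provided as environment variables.
    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-12-01-preview",  # assumption: an API version that supports reasoning models
    )
    response = client.chat.completions.create(
        model="o1-deployment",  # hypothetical deployment name for a reasoning model
        messages=[{"role": "user", "content": question}],
        max_completion_tokens=2048,  # reasoning models take max_completion_tokens rather than max_tokens
    )
    return response.choices[0].message.content
```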
Prerequisites
Create OpenAI resources:
OpenAI:
- Sign up for an account on the OpenAI website.
- Sign in and find your personal API key.
Azure OpenAI:
- Create Azure OpenAI resources by following these instructions. Use only ASCII characters in Azure OpenAI resource group names. Prompt flow doesn't support non-ASCII characters in resource group names.
Models deployed to standard deployments:
- Deploy the model that you want from the model catalog to an endpoint by using a standard deployment.
- To use models deployed to standard deployments and supported by the Azure AI model inference API, like the Mistral, Cohere, Meta Llama, or Microsoft family of models (among others), create a connection in your project to your endpoint.
Connections
Set up connections to provisioned resources in prompt flow.
Type | Name | API key | API type | API version |
---|---|---|---|---|
OpenAI | Required | Required | - | - |
Azure OpenAI - API key | Required | Required | Required | Required |
Azure OpenAI - Microsoft Entra ID | Required | - | - | Required |
Serverless model | Required | Required | - | - |
Tip
- To use the Microsoft Entra ID auth type for an Azure OpenAI connection, assign either the Cognitive Services OpenAI User or Cognitive Services OpenAI Contributor role to the user or user-assigned managed identity.
- Learn more about how to use user identity to submit a flow run.
- Learn more about how to configure Azure OpenAI with managed identities.
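In code-first workflows, you can also create these connections programmatically. The following is a hedged sketch that assumes the promptflow Python package and placeholder endpoint and key values; the connection name is what the LLM tool references later.

```python
from promptflow.client import PFClient
from promptflow.entities import AzureOpenAIConnection

pf = PFClient()

# Placeholder values; the connection name is what the LLM tool references.
connection = AzureOpenAIConnection(
    name="my_azure_open_ai_connection",
    api_key="<your-api-key>",
    api_base="https://<your-resource-name>.openai.azure.com/",
    api_type="azure",
    api_version="2024-02-01",  # assumption: any supported Azure OpenAI API version
)

pf.connections.create_or_update(connection)
print(pf.connections.get("my_azure_open_ai_connection"))
```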
Inputs
The following sections describe the inputs for the Text completion and Chat APIs.
Text completion
Name | Type | Description | Required |
---|---|---|---|
prompt | string | Text prompt for the language model. | Yes |
model, deployment_name | string | Language model to use. | Yes |
max_tokens | integer | Maximum number of tokens to generate in the completion. Default is 16. | No |
temperature | float | Randomness of the generated text. Default is 1. | No |
stop | list | Stopping sequence for the generated text. Default is null. | No |
suffix | string | Text appended to the end of the completion. | No |
top_p | float | Probability mass cutoff for nucleus sampling; the model considers only the tokens in the top top_p probability mass. Default is 1. | No |
logprobs | integer | Number of log probabilities to generate. Default is null. | No |
echo | boolean | Value that indicates whether to echo back the prompt in the response. Default is false. | No |
presence_penalty | float | Value that penalizes tokens that already appear in the text so far, encouraging the model to introduce new topics. Default is 0. | No |
frequency_penalty | float | Value that penalizes tokens in proportion to how frequently they appear in the text so far, discouraging verbatim repetition. Default is 0. | No |
best_of | integer | Number of best completions to generate. Default is 1. | No |
logit_bias | dictionary | Logit bias for the language model. Default is an empty dictionary. | No |
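To make the defaults concrete, the following sketch shows roughly how these inputs map onto the underlying OpenAI completions endpoint. It assumes the openai v1 SDK and uses gpt-3.5-turbo-instruct as an example model; in prompt flow, the LLM tool passes these parameters for you, so this is only an illustration.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # example completion-capable model
    prompt="Write a tagline for an ice cream shop.",
    max_tokens=16,          # default 16
    temperature=1,          # default 1
    stop=None,              # default null
    suffix=None,            # text appended after the completion, if any
    top_p=1,                # default 1
    logprobs=None,          # default null
    echo=False,             # default false
    presence_penalty=0,     # default 0
    frequency_penalty=0,    # default 0
    best_of=1,              # default 1
    logit_bias={},          # default empty dictionary
)
print(response.choices[0].text)
```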
Chat
Name | Type | Description | Required |
---|---|---|---|
prompt | string | Text prompt that the language model uses for a response. | Yes |
model, deployment_name | string | Language model to use. This parameter isn't required if the model is deployed to a standard deployment. | Yes* |
max_tokens | integer | Maximum number of tokens to generate in the response. Default is inf. | No |
temperature | float | Randomness of the generated text. Default is 1. | No |
stop | list | Stopping sequence for the generated text. Default is null. | No |
top_p | float | Probability mass cutoff for nucleus sampling; the model considers only the tokens in the top top_p probability mass. Default is 1. | No |
presence_penalty | float | Value that penalizes tokens that already appear in the text so far, encouraging the model to introduce new topics. Default is 0. | No |
frequency_penalty | float | Value that penalizes tokens in proportion to how frequently they appear in the text so far, discouraging verbatim repetition. Default is 0. | No |
logit_bias | dictionary | Logit bias for the language model. Default is an empty dictionary. | No |
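As an illustration of how the chat inputs line up with the underlying chat completions endpoint, the following sketch assumes the openai v1 SDK and an example Azure OpenAI deployment name. In the LLM tool, the rendered prompt supplies the role-tagged messages and the tool passes the remaining parameters for you.

```python
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-4o-deployment",  # deployment_name when the model is hosted on Azure OpenAI
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is prompt flow?"},
    ],
    temperature=1,          # default 1
    top_p=1,                # default 1
    stop=None,              # default null
    presence_penalty=0,     # default 0
    frequency_penalty=0,    # default 0
    logit_bias={},          # default empty dictionary
)
print(response.choices[0].message.content)
```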
Outputs
API | Return type | Description |
---|---|---|
Completion | string | The text of one predicted completion |
Chat | string | The text of one response in the conversation |
Use the LLM tool
1. Set up and select the connections to OpenAI resources or to a standard deployment.
2. Configure the large language model API and its parameters.
3. Prepare the prompt with guidance.
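The prompt is a Jinja template whose placeholders are filled from the flow inputs at run time. The following minimal sketch, which assumes the jinja2 package and hypothetical input names, shows how a chat prompt with system and user role markers is rendered before it's sent to the model.

```python
from jinja2 import Template

# Hypothetical chat prompt template with role markers and flow-input placeholders.
CHAT_PROMPT = """\
system:
You are a helpful assistant. Answer concisely and use only the given context.

user:
Context: {{ context }}
Question: {{ question }}
"""

rendered = Template(CHAT_PROMPT).render(
    context="Prompt flow is a development tool for building LLM applications.",
    question="What does the LLM tool do?",
)
print(rendered)
```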