The large language model (LLM) tool in prompt flow lets you use large language models from OpenAI, Azure OpenAI in Azure AI Foundry Models, or any language model supported by the Azure AI model inference API for natural language processing.
Prompt flow provides several large language model APIs:
- Completion: OpenAI's completion models generate text based on provided prompts.
- Chat: OpenAI's chat models and the Azure AI chat models facilitate interactive conversations with text-based inputs and responses.
The Embeddings API isn't available in the LLM tool. Use the embedding tool to generate embeddings with OpenAI or Azure OpenAI.
Note
The LLM tool in prompt flow does not support reasoning models (such as OpenAI o1 or o3). For reasoning model integration, use the Python tool to call the model APIs directly. For more information, see Call a reasoning model from the Python tool.
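For example, a Python tool node along the following lines can call a reasoning model deployment directly. This is a minimal sketch that assumes the openai v1 SDK, a hypothetical Azure OpenAI deployment named `o1-deployment`, and an endpoint and key supplied through environment variables; adjust it to your own deployment and API version.

```python
import os

from openai import AzureOpenAI
from promptflow.core import tool  # promptflow >= 1.0; older flows use `from promptflow import tool`


@tool
def call_reasoning_model(question: str) -> str:
    # Assumption: endpoint and key are provided as environment variables.
    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-12-01-preview",  # assumption: an API version that supports reasoning models
    )
    response = client.chat.completions.create(
        model="o1-deployment",  # hypothetical deployment name for a reasoning model
        messages=[{"role": "user", "content": question}],
        max_completion_tokens=2048,  # reasoning models take max_completion_tokens rather than max_tokens
    )
    return response.choices[0].message.content
```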
Prerequisites
Create OpenAI resources:
OpenAI:
- Sign up for an account on the OpenAI website.
- Sign in and find your personal API key.
Azure OpenAI:
- Create Azure OpenAI resources by following these instructions. Use only ASCII characters in Azure OpenAI resource group names. Prompt flow doesn't support non-ASCII characters in resource group names.
Models deployed to standard deployments:
- Deploy the model that you want from the model catalog to an endpoint by using a standard deployment.
- To use models deployed to standard deployments and supported by the Azure AI model inference API, like the Mistral, Cohere, Meta Llama, or Microsoft family of models (among others), create a connection in your project to your endpoint.
Connections
Set up connections to provisioned resources in prompt flow.
Type | Name | API key | API type | API version |
---|---|---|---|---|
OpenAI | Required | Required | - | - |
Azure OpenAI - API key | Required | Required | Required | Required |
Azure OpenAI - Microsoft Entra ID | Required | - | - | Required |
Serverless model | Required | Required | - | - |
Tip
- To use the Microsoft Entra ID auth type for an Azure OpenAI connection, assign either the Cognitive Services OpenAI User or Cognitive Services OpenAI Contributor role to the user or user-assigned managed identity.
- Learn more about how to use user identity to submit a flow run.
- Learn more about how to configure Azure OpenAI with managed identities.
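In code-first workflows, you can also create these connections programmatically. The following is a hedged sketch that assumes the promptflow Python package and placeholder endpoint and key values; the connection name is what the LLM tool references later.

```python
from promptflow.client import PFClient
from promptflow.entities import AzureOpenAIConnection

pf = PFClient()

# Placeholder values; the connection name is what the LLM tool references.
connection = AzureOpenAIConnection(
    name="my_azure_open_ai_connection",
    api_key="<your-api-key>",
    api_base="https://<your-resource-name>.openai.azure.com/",
    api_type="azure",
    api_version="2024-02-01",  # assumption: any supported Azure OpenAI API version
)

pf.connections.create_or_update(connection)
print(pf.connections.get("my_azure_open_ai_connection"))
```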
Inputs
The following sections describe the inputs for the Text completion and Chat APIs.
Text completion
Name | Type | Description | Required |
---|---|---|---|
prompt | string | Text prompt for the language model. | Yes |
model, deployment_name | string | Language model to use. | Yes |
max_tokens | integer | Maximum number of tokens to generate in the completion. Default is 16. | No |
temperature | float | Randomness of the generated text. Default is 1. | No |
stop | list | Stopping sequence for the generated text. Default is null. | No |
suffix | string | Text appended to the end of the completion. | No |
top_p | float | Probability mass cutoff for nucleus sampling; the model considers only the tokens in the top top_p probability mass. Default is 1. | No |
logprobs | integer | Number of log probabilities to generate. Default is null. | No |
echo | boolean | Value that indicates whether to echo back the prompt in the response. Default is false. | No |
presence_penalty | float | Value that penalizes tokens that already appear in the text so far, encouraging the model to introduce new topics. Default is 0. | No |
frequency_penalty | float | Value that penalizes tokens in proportion to how frequently they appear in the text so far, discouraging verbatim repetition. Default is 0. | No |
best_of | integer | Number of best completions to generate. Default is 1. | No |
logit_bias | dictionary | Logit bias for the language model. Default is an empty dictionary. | No |
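To make the defaults concrete, the following sketch shows roughly how these inputs map onto the underlying OpenAI completions endpoint. It assumes the openai v1 SDK and uses gpt-3.5-turbo-instruct as an example model; in prompt flow, the LLM tool passes these parameters for you, so this is only an illustration.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # example completion-capable model
    prompt="Write a tagline for an ice cream shop.",
    max_tokens=16,          # default 16
    temperature=1,          # default 1
    stop=None,              # default null
    suffix=None,            # text appended after the completion, if any
    top_p=1,                # default 1
    logprobs=None,          # default null
    echo=False,             # default false
    presence_penalty=0,     # default 0
    frequency_penalty=0,    # default 0
    best_of=1,              # default 1
    logit_bias={},          # default empty dictionary
)
print(response.choices[0].text)
```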
Chat
Name | Type | Description | Required |
---|---|---|---|
prompt | string | Text prompt that the language model uses for a response. | Yes |
model, deployment_name | string | Language model to use. This parameter isn't required if the model is deployed to a standard deployment. | Yes* |
max_tokens | integer | Maximum number of tokens to generate in the response. Default is inf. | No |
temperature | float | Randomness of the generated text. Default is 1. | No |
stop | list | Stopping sequence for the generated text. Default is null. | No |
top_p | float | Probability mass cutoff for nucleus sampling; the model considers only the tokens in the top top_p probability mass. Default is 1. | No |
presence_penalty | float | Value that penalizes tokens that already appear in the text so far, encouraging the model to introduce new topics. Default is 0. | No |
frequency_penalty | float | Value that penalizes tokens in proportion to how frequently they appear in the text so far, discouraging verbatim repetition. Default is 0. | No |
logit_bias | dictionary | Logit bias for the language model. Default is an empty dictionary. | No |
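As an illustration of how the chat inputs line up with the underlying chat completions endpoint, the following sketch assumes the openai v1 SDK and an example Azure OpenAI deployment name. In the LLM tool, the rendered prompt supplies the role-tagged messages and the tool passes the remaining parameters for you.

```python
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-4o-deployment",  # deployment_name when the model is hosted on Azure OpenAI
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is prompt flow?"},
    ],
    temperature=1,          # default 1
    top_p=1,                # default 1
    stop=None,              # default null
    presence_penalty=0,     # default 0
    frequency_penalty=0,    # default 0
    logit_bias={},          # default empty dictionary
)
print(response.choices[0].message.content)
```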
Outputs
API | Return type | Description |
---|---|---|
Completion | string | The text of one predicted completion |
Chat | string | The text of one response in the conversation |
Use the LLM tool
1. Set up and select the connections to OpenAI resources or to a standard deployment.
2. Configure the large language model API and its parameters.
3. Prepare the prompt with guidance.
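The prompt is a Jinja template whose placeholders are filled from the flow inputs at run time. The following minimal sketch, which assumes the jinja2 package and hypothetical input names, shows how a chat prompt with system and user role markers is rendered before it's sent to the model.

```python
from jinja2 import Template

# Hypothetical chat prompt template with role markers and flow-input placeholders.
CHAT_PROMPT = """\
system:
You are a helpful assistant. Answer concisely and use only the given context.

user:
Context: {{ context }}
Question: {{ question }}
"""

rendered = Template(CHAT_PROMPT).render(
    context="Prompt flow is a development tool for building LLM applications.",
    question="What does the LLM tool do?",
)
print(rendered)
```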