Authentication for AI agents

AI agents often need to authenticate to other resources to complete tasks. For example, a deployed agent might need to access a Vector Search index to query unstructured data or the Prompt Registry to load dynamic prompts.

This page covers the authentication methods available when developing and deploying agents using Mosaic AI Agent Framework.

Authentication methods

The following table compares the available authentication methods. You can mix and match any of these approaches:

Method Description Security posture Setup complexity
Automatic authentication passthrough Agent runs with the permissions of the user who deployed it; Databricks automatically manages short-lived credentials for declared resources Short-lived credentials with automatic rotation Low - declare dependencies at logging time
On-behalf-of-user authentication (OBO) Agent runs with the permissions of the end user making the request Uses the end user's credentials with restricted scopes Medium - requires scope declaration and runtime initialization
Manual authentication Explicitly provide credentials using environment variables Long-lived credentials require rotation management High - requires manual credential management

Choose the right authentication method for your resource

Use this flowchart to choose the right authentication method for each resource. You can combine methods as needed, and an agent can use a different method for each resource depending on its use case.

  1. Is per-user access control or user-attributed auditing required? If yes, use on-behalf-of-user authentication for that resource.

  2. Do all resources support automatic authentication? If yes, use automatic authentication passthrough. Otherwise, use manual authentication for the unsupported resources.
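
As a rough sketch, assuming only the three methods above, the per-resource decision logic looks like:

```python
def choose_auth_method(needs_per_user_access: bool,
                       all_resources_support_automatic: bool) -> str:
    """Pick an authentication method for a resource, following the flowchart."""
    if needs_per_user_access:
        # Per-user access control or user-attributed auditing is required
        return "on-behalf-of-user"
    if all_resources_support_automatic:
        # Simplest option when every dependency supports it
        return "automatic passthrough"
    return "manual"
```

For example, an agent that only reads a supported Vector Search index with no per-user requirements would land on automatic passthrough.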

Automatic authentication passthrough

Automatic authentication passthrough is the simplest method for accessing Databricks-managed resources. Declare resource dependencies when logging the agent, and Databricks automatically provisions, rotates, and manages short-lived credentials when the agent is deployed.

This authentication behavior is similar to the "Run as owner" behavior for Databricks dashboards. Downstream resources like Unity Catalog tables are accessed using the credentials of a service principal with least-privilege access to only the resources the agent needs.

How automatic authentication passthrough works

When an agent is served behind an endpoint using automatic authentication passthrough, Databricks performs the following steps:

  1. Permission verification: Databricks verifies that the endpoint creator can access all dependencies specified during agent logging.

  2. Service principal creation and grants: A service principal is created for the agent model version and is automatically granted read access to agent resources.

    Note

    The system-generated service principal does not appear in API or UI listings. If the agent model version is removed from the endpoint, the service principal is also deleted.

  3. Credential provisioning and rotation: Short-lived credentials (an M2M OAuth token) for the service principal are injected into the endpoint, allowing agent code to access Databricks resources. Databricks also rotates the credentials, ensuring that your agent has continued, secure access to dependent resources.

Supported resources for automatic authentication passthrough

The following table lists the Databricks resources that support automatic authentication passthrough and the permissions the endpoint creator must have when deploying the agent.

Note

Unity Catalog resources also require USE SCHEMA on the parent schema and USE CATALOG on the parent catalog.

Resource type Permission Minimum MLflow version
SQL Warehouse Use Endpoint 2.16.1 or above
Model Serving endpoint Can Query 2.13.1 or above
Unity Catalog Function EXECUTE 2.16.1 or above
Genie space Can Run 2.17.1 or above
Vector Search index Can Use 2.13.1 or above
Unity Catalog Table SELECT 2.18.0 or above
Unity Catalog Connection Use Connection 2.17.1 or above
Lakebase databricks_superuser 3.3.2 or above

Implement automatic authentication passthrough

To enable automatic authentication passthrough, specify dependent resources when you log the agent. Use the resources parameter of the log_model() API:

Note

Remember to log all downstream dependent resources, too. For example, if you log a Genie Space, you must also log its tables, SQL Warehouses, and Unity Catalog functions.

import mlflow
from mlflow.models.resources import (
  DatabricksVectorSearchIndex,
  DatabricksServingEndpoint,
  DatabricksSQLWarehouse,
  DatabricksFunction,
  DatabricksGenieSpace,
  DatabricksTable,
  DatabricksUCConnection,
  DatabricksApp,
  DatabricksLakebase
)

with mlflow.start_run():
  logged_agent_info = mlflow.pyfunc.log_model(
    python_model="agent.py",
    artifact_path="agent",
    input_example=input_example,
    example_no_conversion=True,
    # Specify resources for automatic authentication passthrough
    resources=[
      DatabricksVectorSearchIndex(index_name="prod.agents.databricks_docs_index"),
      DatabricksServingEndpoint(endpoint_name="databricks-meta-llama-3-3-70b-instruct"),
      DatabricksServingEndpoint(endpoint_name="databricks-bge-large-en"),
      DatabricksSQLWarehouse(warehouse_id="your_warehouse_id"),
      DatabricksFunction(function_name="ml.tools.python_exec"),
      DatabricksGenieSpace(genie_space_id="your_genie_space_id"),
      DatabricksTable(table_name="your_table_name"),
      DatabricksUCConnection(connection_name="your_connection_name"),
      DatabricksApp(app_name="app_name"),
      DatabricksLakebase(database_instance_name="lakebase_instance_name"),
    ]
  )

On-behalf-of-user authentication

Important

This feature is in Public Preview.

On-behalf-of-user (OBO) authentication lets an agent act as the Databricks user who runs the query. This provides:

  • Per-user access to sensitive data
  • Fine-grained data controls enforced by Unity Catalog
  • Security tokens restricted ("downscoped") to only the APIs your agent declares, reducing the risk of misuse

Requirements

  • On-behalf-of-user authentication requires MLflow 2.22.1 and above.
  • On-behalf-of-user authentication is disabled by default and must be enabled by a workspace admin. Review the security considerations before enabling this feature.

OBO supported resources

Agents with OBO authentication can access the following Databricks resources:

Databricks resource Compatible clients
Vector Search Index databricks_langchain.VectorSearchRetrieverTool, databricks_openai.VectorSearchRetrieverTool, VectorSearchClient
Model Serving Endpoint databricks.sdk.WorkspaceClient
SQL Warehouse databricks.sdk.WorkspaceClient
UC Connections databricks.sdk.WorkspaceClient
UC Tables and Functions databricks.sdk.WorkspaceClient (to access UC tables, issue SQL queries through the SQL Statement Execution API)
Genie Space databricks.sdk.WorkspaceClient (recommended), databricks_langchain.GenieAgent, or databricks_ai_bridge.GenieAgent
Model Context Protocol (MCP) databricks_mcp.DatabricksMCPClient

Implement OBO authentication

To enable on-behalf-of-user authentication, complete the following steps:

  1. Update SDK calls to specify that resources are accessed on behalf of the end user.
  2. Update agent code to initialize OBO access inside the predict function, not in __init__, because the user identity is only known at runtime.
  3. When logging the agent for deployment, declare the Databricks REST API scopes that the agent requires.

The following snippets demonstrate how to configure on-behalf-of-user access to different Databricks resources. When initializing tools, handle permission errors gracefully by wrapping initialization in a try-except block.

Vector Search Retriever Tool

import logging

from databricks.sdk import WorkspaceClient
from databricks_ai_bridge import ModelServingUserCredentials
from databricks_langchain import VectorSearchRetrieverTool

_logger = logging.getLogger(__name__)

# Configure a Databricks SDK WorkspaceClient to use on-behalf-of-end-user
# authentication
user_client = WorkspaceClient(credentials_strategy=ModelServingUserCredentials())

vector_search_tools = []
# Exclude exception handling if the agent should fail
# when users lack access to all required Databricks resources
try:
  tool = VectorSearchRetrieverTool(
    index_name="<index_name>",
    description="...",
    tool_name="...",
    workspace_client=user_client,  # Specify the user-authorized client
  )
  vector_search_tools.append(tool)
except Exception:
  _logger.debug("Skipping tool because the user does not have permissions")

Vector Search Client

import logging

from databricks.vector_search.client import VectorSearchClient
from databricks.vector_search.utils import CredentialStrategy

_logger = logging.getLogger(__name__)

# Configure a Vector Search client to use on-behalf-of-end-user
# authentication
user_authenticated_vsc = VectorSearchClient(credential_strategy=CredentialStrategy.MODEL_SERVING_USER_CREDENTIALS)
# Exclude exception handling if the agent should fail when
# users lack access to all required Databricks resources
try:
  vs_index = user_authenticated_vsc.get_index(endpoint_name="endpoint_name", index_name="index_name")
  ...
except Exception as e:
  _logger.debug("Skipping Vector Index because user does not have permissions")

MCP

from databricks.sdk import WorkspaceClient
from databricks_ai_bridge import ModelServingUserCredentials
from databricks_mcp import DatabricksMCPClient

# Configure a Databricks SDK WorkspaceClient to use on behalf of end
# user authentication
user_client = WorkspaceClient(credentials_strategy=ModelServingUserCredentials())

mcp_client = DatabricksMCPClient(
    server_url="<mcp_server_url>",
    workspace_client=user_client, # Specify the user client here
  )

Model Serving Endpoint

import logging

from databricks.sdk import WorkspaceClient
from databricks_ai_bridge import ModelServingUserCredentials

_logger = logging.getLogger(__name__)

# Configure a Databricks SDK WorkspaceClient to use on-behalf-of-end-user
# authentication
user_client = WorkspaceClient(credentials_strategy=ModelServingUserCredentials())

# Exclude exception handling if the agent should fail
# when users lack access to all required Databricks resources
try:
  user_client.serving_endpoints.query("endpoint_name", input="")
except Exception as e:
  _logger.debug("Skipping Model Serving Endpoint due to no permissions")

UC Connections

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ExternalFunctionRequestHttpMethod
from databricks_ai_bridge import ModelServingUserCredentials

# Configure a Databricks SDK WorkspaceClient to use on behalf of end
# user authentication
user_client = WorkspaceClient(credentials_strategy=ModelServingUserCredentials())

user_client.serving_endpoints.http_request(
  conn="connection_name",
  method=ExternalFunctionRequestHttpMethod.POST,
  path="/api/v1/resource",
  json={"key": "value"},
  headers={"extra_header_key": "extra_header_value"},
)
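
UC Tables (SQL Statement Execution API)

As noted in the supported resources table, OBO access to Unity Catalog tables goes through SQL queries issued via the SQL Statement Execution API. A minimal sketch, assuming a hypothetical table name and a warehouse ID placeholder:

```python
def table_select(table_name: str, limit: int = 100) -> str:
  # Build the SQL statement the agent submits through the Statement Execution API
  return f"SELECT * FROM {table_name} LIMIT {limit}"

# Exclude exception handling if the agent should fail when
# users lack access to all required Databricks resources
try:
  from databricks.sdk import WorkspaceClient
  from databricks_ai_bridge import ModelServingUserCredentials

  # Configure a Databricks SDK WorkspaceClient to use on-behalf-of-end-user
  # authentication
  user_client = WorkspaceClient(credentials_strategy=ModelServingUserCredentials())
  response = user_client.statement_execution.execute_statement(
    statement=table_select("catalog.schema.my_table"),  # hypothetical table name
    warehouse_id="<warehouse_id>",
  )
except Exception:
  pass  # User lacks access, or code is running outside Model Serving
```

Remember that the user also needs SELECT on the table and access to the SQL warehouse for the query to succeed.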

Genie Spaces (WorkspaceClient)

import logging

from databricks.sdk import WorkspaceClient
from databricks_ai_bridge import ModelServingUserCredentials
from databricks_langchain.genie import GenieAgent

_logger = logging.getLogger(__name__)

# Configure a Databricks SDK WorkspaceClient to use on-behalf-of-end-user
# authentication
user_client = WorkspaceClient(credentials_strategy=ModelServingUserCredentials())

genie_agent = GenieAgent(
    genie_space_id="space-id",
    genie_agent_name="Genie",
    description="This Genie space has access to sales data in Europe",
    client=user_client,
)

# Exclude exception handling if the agent should fail
# when users lack access to all required Databricks resources
try:
    response = genie_agent.invoke("Your query here")
except Exception:
    _logger.debug("Skipping Genie due to no permissions")

Genie Spaces (LangChain)

from databricks.sdk import WorkspaceClient
from databricks_ai_bridge import ModelServingUserCredentials
from databricks_langchain.genie import GenieAgent

# Configure a Databricks SDK WorkspaceClient to use on behalf of end
# user authentication
user_client = WorkspaceClient(credentials_strategy=ModelServingUserCredentials())

genie_agent = GenieAgent(
    genie_space_id="<genie_space_id>",
    genie_agent_name="Genie",
    description="Genie_description",
    client=user_client, # Specify the user client here
  )

Initialize the agent in the predict function

Because the user's identity is only known at query time, you must access OBO resources inside predict or predict_stream, not in the agent's __init__ method. This ensures that resources are isolated between invocations.

from databricks.sdk import WorkspaceClient
from databricks_ai_bridge import ModelServingUserCredentials
from mlflow.pyfunc import ResponsesAgent
from mlflow.types.responses import ResponsesAgentRequest, ResponsesAgentResponse

class OBOResponsesAgent(ResponsesAgent):
  def initialize_agent(self):
    user_client = WorkspaceClient(
      credentials_strategy=ModelServingUserCredentials()
    )
    system_authorized_client = WorkspaceClient()
    # Use the clients above to access resources with either system
    # or user authentication, and return the initialized agent
    ...

  def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
    agent = self.initialize_agent()  # Initialize the agent inside predict
    return agent.predict(request)

Declare REST API scopes when logging the agent

When you log your OBO agent for deployment, you must list the Databricks REST API scopes that your agent calls on the user's behalf. This ensures the agent follows the principle of least privilege: tokens are restricted to just the APIs your agent requires, reducing the chance of unauthorized actions or token misuse.

The following table lists the scopes required to access several common types of Databricks resources:

Resource type Required API scope
Model Serving endpoints serving.serving-endpoints
Vector Search endpoints vectorsearch.vector-search-endpoints
Vector Search indexes vectorsearch.vector-search-indexes
SQL warehouses sql.warehouses, sql.statement-execution
Genie spaces dashboards.genie
UC connections catalog.connections and serving.serving-endpoints
Databricks Apps apps.apps
MCP Genie spaces mcp.genie
MCP UC functions mcp.functions
MCP Vector Search mcp.vectorsearch
MCP external functions mcp.external

To enable on-behalf-of-user authentication, pass an MLflow AuthPolicy to log_model():

import mlflow
from mlflow.models.auth_policy import AuthPolicy, SystemAuthPolicy, UserAuthPolicy
from mlflow.models.resources import DatabricksServingEndpoint

# System policy: resources accessed with system credentials
system_policy = SystemAuthPolicy(
    resources=[DatabricksServingEndpoint(endpoint_name="my_endpoint")]
)

# User policy: API scopes for OBO access
user_policy = UserAuthPolicy(api_scopes=[
    "serving.serving-endpoints",
    "vectorsearch.vector-search-endpoints",
    "vectorsearch.vector-search-indexes"
])

# Log the agent with both policies
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        name="agent",
        python_model="agent.py",
        auth_policy=AuthPolicy(
            system_auth_policy=system_policy,
            user_auth_policy=user_policy
        )
    )

OBO authentication for OpenAI clients

For agents that use the OpenAI client, use the Databricks SDK to authenticate automatically during deployment. The Databricks SDK provides a wrapper, get_open_ai_client(), that constructs an OpenAI client with authentication automatically configured:

# In a notebook cell: %pip install databricks-sdk[openai]
from databricks.sdk import WorkspaceClient

def openai_client():
  w = WorkspaceClient()
  return w.serving_endpoints.get_open_ai_client()

Then, specify the Model Serving endpoint in the resources parameter when logging the agent so that it authenticates automatically at deployment time.

OBO security considerations

Consider the following security considerations before enabling on-behalf-of-user authentication with agents.

Expanded resource access: Agents can access sensitive resources on behalf of users. While scopes restrict APIs, endpoints might allow more actions than your agent explicitly requests. For example, the serving.serving-endpoints API scope grants an agent permission to run a serving endpoint on behalf of the user. However, the serving endpoint can access additional API scopes that the original agent isn't authorized to use.

OBO example notebook

The following notebook shows you how to create an agent with Vector Search using on-behalf-of-user authorization.


Manual authentication

Manual authentication lets you explicitly specify credentials during agent deployment. This method offers the most flexibility but requires more setup and ongoing credential management. Use this method when:

  • The dependent resource does not support automatic authentication passthrough
  • The agent needs to use credentials other than those of the agent deployer
  • The agent accesses external resources or APIs outside of Databricks
  • The deployed agent accesses the prompt registry

Important

Overriding security environment variables disables automatic passthrough for other resources your agent depends on.

OAuth is the recommended approach for manual authentication because it provides secure, token-based authentication for service principals with automatic token refresh:

  1. Create a service principal and generate OAuth credentials.

  2. Grant the service principal privileges to access the Databricks resources that the agent needs. To access the prompt registry, grant CREATE FUNCTION, EXECUTE, and MANAGE permissions on the Unity Catalog schema used to store prompts.

  3. Create Databricks secrets for the OAuth credentials.

  4. Configure the OAuth credentials in the agent code:

    import os
    
    # Configure OAuth authentication for Prompt Registry access
    # Replace with actual secret scope and key names
    secret_scope_name = "your-secret-scope"
    client_id_key = "oauth-client-id"
    client_secret_key = "oauth-client-secret"
    
    os.environ["DATABRICKS_HOST"] = "https://<your-workspace-url>"
    os.environ["DATABRICKS_CLIENT_ID"] = dbutils.secrets.get(scope=secret_scope_name, key=client_id_key)
    os.environ["DATABRICKS_CLIENT_SECRET"] = dbutils.secrets.get(scope=secret_scope_name, key=client_secret_key)
    
  5. Use the secrets to connect to the workspace:

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient(
      host=os.environ["DATABRICKS_HOST"],
      client_id=os.environ["DATABRICKS_CLIENT_ID"],
      client_secret=os.environ["DATABRICKS_CLIENT_SECRET"],
    )
    
  6. When deploying with agents.deploy(), include the OAuth credentials as environment variables:

    agents.deploy(
        UC_MODEL_NAME,
        uc_registered_model_info.version,
        environment_vars={
            "DATABRICKS_HOST": "https://<your-workspace-url>",
            "DATABRICKS_CLIENT_ID": f"{{{{secrets/{secret_scope_name}/{client_id_key}}}}}",
            "DATABRICKS_CLIENT_SECRET": f"{{{{secrets/{secret_scope_name}/{client_secret_key}}}}}"
        },
    )
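
The quadruple braces in the f-strings above are an escaping detail worth noting: doubled braces in an f-string render as literal braces, so four braces yield the two literal braces of the {{secrets/scope/key}} reference that Model Serving resolves at deployment time. A quick check:

```python
secret_scope_name = "your-secret-scope"
client_id_key = "oauth-client-id"

# {{{{ and }}}} escape to the literal {{ and }} of a Databricks secret reference
ref = f"{{{{secrets/{secret_scope_name}/{client_id_key}}}}}"
# ref == "{{secrets/your-secret-scope/oauth-client-id}}"
```
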
    

PAT authentication

Personal Access Token (PAT) authentication provides a simpler setup for development and testing environments, though it requires more manual credential management:

  1. Get a PAT using a service principal or personal account:

    Service principal (recommended for security):

    1. Create a service principal.
    2. Grant the service principal privileges to access the Databricks resources that the agent needs. To access the prompt registry, grant CREATE FUNCTION, EXECUTE, and MANAGE permissions on the Unity Catalog schema used to store prompts.
    3. Create a PAT for the service principal.

    Personal account:

    1. Create a PAT for a personal account.
  2. Store the PAT securely by creating a Databricks secret for it.

  3. Configure PAT authentication in the agent code:

    import os
    
    # Configure PAT authentication for Prompt Registry access
    # Replace with your actual secret scope and key names
    secret_scope_name = "your-secret-scope"
    secret_key_name = "your-pat-key"
    
    os.environ["DATABRICKS_HOST"] = "https://<your-workspace-url>"
    os.environ["DATABRICKS_TOKEN"] = dbutils.secrets.get(scope=secret_scope_name, key=secret_key_name)
    
    # Validate configuration
    assert os.environ["DATABRICKS_HOST"], "DATABRICKS_HOST must be set"
    assert os.environ["DATABRICKS_TOKEN"], "DATABRICKS_TOKEN must be set"
    
  4. When deploying the agent using agents.deploy(), include the PAT as an environment variable:

    agents.deploy(
        UC_MODEL_NAME,
        uc_registered_model_info.version,
        environment_vars={
            "DATABRICKS_HOST": "https://<your-workspace-url>",
            "DATABRICKS_TOKEN": f"{{{{secrets/{secret_scope_name}/{secret_key_name}}}}}"
        },
    )