Use a Blob indexer to ingest RBAC scopes metadata

Note

This feature is currently in public preview. This preview is provided without a service-level agreement and isn't recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Azure Storage allows for role-based access on containers in blob storage, where roles like Storage Blob Data Reader or Storage Blob Data Contributor determine whether someone has access to content. Starting in the 2025-05-01-preview REST API, you can include RBAC scope alongside document ingestion in Azure AI Search and use those permissions to control access to search results. If you have rights to the content, you can see those results in a search query. If you don't have rights (more specifically, a role assignment on the blob container), you can't see those results, even if you personally have a Search Index Data Reader assignment on the index.

RBAC scope is set at the container level and flows to all blobs (documents) through permission inheritance. RBAC scope is captured during indexing as permission metadata. You can use the push APIs to upload and index content and permission metadata manually (see Indexing Permissions using the push REST API), or you can use an indexer to automate data ingestion. This article focuses on the indexer approach.

At query time, the identity of the caller is included in the request header via the x-ms-query-source-authorization parameter. The identity must match the permission metadata on documents if the user is to see the search results.
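As a rough illustration of the query-time header, the following Python sketch builds (but doesn't send) a search request. The function name, the endpoint, and both token variables are hypothetical. Passing the raw user token as the value of x-ms-query-source-authorization is an assumption, so check the query documentation for the exact format.

```python
import json
import urllib.request

API_VERSION = "2025-05-01-preview"


def build_query_request(endpoint, index_name, search_text, search_token, user_token):
    """Build (but do not send) a search request that carries the caller's identity."""
    url = f"{endpoint}/indexes/{index_name}/docs/search?api-version={API_VERSION}"
    headers = {
        "Content-Type": "application/json",
        # Authorizes the call against the search service itself.
        "Authorization": f"Bearer {search_token}",
        # Identity that is compared against permission metadata on documents.
        # Assumption: the raw token is passed as the header value.
        "x-ms-query-source-authorization": user_token,
    }
    body = json.dumps({"search": search_text}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical endpoint and placeholder tokens, for illustration only.
request = build_query_request(
    "https://my-service.search.windows.net", "my-index", "*",
    "<search-token>", "<user-token>",
)
print(request.get_method(), request.full_url)
```

Sending the request (for example, with urllib.request.urlopen) is left out on purpose, because it requires a live service and valid tokens.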

The indexer approach is built on this foundation:

  • Role-based access control (Azure RBAC). There's no support for Attribute-based access control (Azure ABAC).

  • An Azure AI Search indexer for blobs that retrieves and ingests data and metadata, including permission filters. To get permission filter support, you must use the 2025-05-01-preview REST API or a preview package of an Azure SDK that supports the feature.

  • An index in Azure AI Search containing the ingested documents and corresponding permissions. Permission metadata is stored as fields in the index. To set up queries that respect the permission filters, you must use the 2025-05-01-preview REST API or a preview package of an Azure SDK that supports the feature.

Prerequisites

Configure Blob storage

Verify your blob container uses role-based access.

  1. Sign in to the Azure portal and find your storage account.

  2. Expand containers and select the container that has the blobs you want to index.

  3. Select Access Control (IAM) to check role assignments. Users and groups with Storage Blob Data Reader or Storage Blob Data Contributor will have access to search documents in the index after the container is indexed.

Authorization

For indexer execution, your search service identity must have Storage Blob Data Reader permission. For more information, see Connect to Azure Storage using a managed identity.

The client issuing the API calls must have Search Service Contributor permission to create objects, Search Index Data Contributor permission to perform data import, and Search Index Data Reader permission to query an index. For more information, see Connect to Azure AI Search using roles.

Configure indexing

In Azure AI Search, configure an indexer, data source, and index to pull permission metadata from blobs.

Create the data source

  • Data source type must be azureblob.

  • Data source parsing mode must be the default.

  • Data source must have indexerPermissionOptions with rbacScope.

JSON example with a system-assigned managed identity:

{
    "name" : "my-blob-datasource",
    "type": "azureblob",
    "indexerPermissionOptions": ["rbacScope"],
    "credentials": {
        "connectionString": "ResourceId=/subscriptions/<your subscription ID>/resourceGroups/<your resource group name>/providers/Microsoft.Storage/storageAccounts/<your storage account name>/;"
    },
    "container": {
        "name": "<your-container-name>",
        "query": "<optional-query-used-for-selecting-specific-blobs>"
    }
}

JSON example with a user-assigned managed identity, specified in the identity property:

{
    "name" : "my-blob-datasource",
    "type": "azureblob",
    "indexerPermissionOptions": ["rbacScope"],
    "credentials": {
        "connectionString": "ResourceId=/subscriptions/<your subscription ID>/resourceGroups/<your resource group name>/providers/Microsoft.Storage/storageAccounts/<your storage account name>/;"
    },
    "container": {
        "name": "<your-container-name>",
        "query": "<optional-query-used-for-selecting-specific-blobs>"
    },
    "identity": {
        "@odata.type": "#Microsoft.Azure.Search.DataUserAssignedIdentity",
        "userAssignedIdentity": "/subscriptions/{subscription-ID}/resourceGroups/{resource-group-name}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{user-assigned-managed-identity-name}"
    }
}
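If you generate data source definitions from a script, a minimal Python sketch might look like the following. The helper names (build_datasource, build_put_request) and the sample endpoint are hypothetical. The payload mirrors the JSON examples above, and the request is only constructed, not sent.

```python
import json
import urllib.request

API_VERSION = "2025-05-01-preview"


def build_datasource(name, storage_resource_id, container, user_assigned_identity=None):
    """Return a data source definition that asks the indexer to capture RBAC scope."""
    definition = {
        "name": name,
        "type": "azureblob",
        # Required for permission metadata ingestion.
        "indexerPermissionOptions": ["rbacScope"],
        "credentials": {"connectionString": f"ResourceId={storage_resource_id};"},
        "container": {"name": container},
    }
    if user_assigned_identity is not None:
        # Omit this block to fall back to the system-assigned identity.
        definition["identity"] = {
            "@odata.type": "#Microsoft.Azure.Search.DataUserAssignedIdentity",
            "userAssignedIdentity": user_assigned_identity,
        }
    return definition


def build_put_request(endpoint, definition, bearer_token):
    """Build (but do not send) the PUT request that creates or updates the data source."""
    url = f"{endpoint}/datasources/{definition['name']}?api-version={API_VERSION}"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {bearer_token}",
    }
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Placeholder values, for illustration only.
datasource = build_datasource(
    "my-blob-datasource",
    "/subscriptions/<your subscription ID>/resourceGroups/<your resource group name>"
    "/providers/Microsoft.Storage/storageAccounts/<your storage account name>/",
    "<your-container-name>",
)
print(json.dumps(datasource, indent=4))
```

Keeping the payload builder separate from the request builder makes it easy to inspect the definition before anything is sent to the service.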

Create permission fields in the index

In Azure AI Search, make sure your index contains field definitions for the permission metadata. Permission metadata can be indexed when indexerPermissionOptions is specified in the data source definition.

Recommended schema attributes for RBAC scope:

  • An RBAC scope field with permissionFilter set to rbacScope.
  • The permissionFilterOption property set to enabled, to enable filtering at query time.
  • Use string fields for permission metadata.
  • Set filterable to true on all fields.

Notice that retrievable is false in the following example. You can set it to true during development to verify that permissions are present, but remember to set it back to false before deploying to a production environment so that security principal identities aren't visible in results.

JSON schema example:

{
  ...
  "fields": [
    ...
    { 
        "name": "RbacScope", 
        "type": "Edm.String", 
        "permissionFilter": "rbacScope", 
        "filterable": true, 
        "retrievable": false 
    }
  ],
  "permissionFilterOption": "enabled"
}
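Before creating the index, you might want to check a definition against the recommendations in this section. The following Python sketch does that for a dictionary shaped like the JSON example above. The function name check_permission_fields and the sample index are hypothetical, and the check is a local convenience, not a service feature.

```python
def check_permission_fields(index_definition):
    """Return a list of problems found in the permission-related parts of an index."""
    problems = []
    if index_definition.get("permissionFilterOption") != "enabled":
        problems.append("permissionFilterOption is not 'enabled'")

    scope_fields = [
        field for field in index_definition.get("fields", [])
        if field.get("permissionFilter") == "rbacScope"
    ]
    if not scope_fields:
        problems.append("no field has permissionFilter set to 'rbacScope'")

    for field in scope_fields:
        name = field.get("name", "<unnamed>")
        if field.get("type") != "Edm.String":
            problems.append(f"{name}: type should be Edm.String")
        if field.get("filterable") is not True:
            problems.append(f"{name}: filterable should be true")
        if field.get("retrievable") is not False:
            problems.append(f"{name}: retrievable should be false outside development")
    return problems


# Sample index that mirrors the JSON example (other fields omitted).
index = {
    "name": "my-index",
    "fields": [
        {
            "name": "RbacScope",
            "type": "Edm.String",
            "permissionFilter": "rbacScope",
            "filterable": True,
            "retrievable": False,
        }
    ],
    "permissionFilterOption": "enabled",
}
print(check_permission_fields(index))
```

An empty list means the definition follows the recommendations listed above. Any other result names the field and the attribute to fix.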

Configure the indexer

Field mappings within an indexer set the data path to fields in an index. Source and destination fields that vary by name or data type require an explicit field mapping. The following metadata field in Azure Blob Storage might need a field mapping if you vary the field name:

  • metadata_rbac_scope (Edm.String) - the container RBAC scope.

Specify fieldMappings in the indexer to route the permission metadata to target fields during indexing.

JSON schema example:

{
  ...
  "fieldMappings": [
    { "sourceFieldName": "metadata_rbac_scope", "targetFieldName": "RbacScope" }
  ]
}
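For a scripted setup, the indexer definition with its field mapping can be assembled as in this Python sketch. The function name build_indexer and the sample object names are hypothetical. The name, dataSourceName, and targetIndexName properties are the standard indexer properties that tie the data source and index together.

```python
import json


def build_indexer(name, data_source_name, target_index_name, scope_field="RbacScope"):
    """Return an indexer definition that routes RBAC scope metadata to an index field."""
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": target_index_name,
        "fieldMappings": [
            # Source is the blob metadata field; target is the index field
            # that has permissionFilter set to rbacScope.
            {"sourceFieldName": "metadata_rbac_scope", "targetFieldName": scope_field}
        ],
    }


# Placeholder object names, for illustration only.
indexer = build_indexer("my-blob-indexer", "my-blob-datasource", "my-index")
print(json.dumps(indexer, indent=2))
```

If the index field is named differently, pass that name as scope_field so that the mapping still points at the permission field.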

Deletion tracking

To effectively manage blob deletion, ensure that you have enabled deletion tracking before your indexer runs for the first time. This feature allows the system to detect deleted blobs from your source and have them deleted from the index.
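One way to enable deletion tracking is to add a deletion detection policy to the data source definition. The following Python sketch assumes that native blob soft delete is enabled on the storage account and uses the native soft delete policy type. The helper name with_deletion_tracking is hypothetical, and other policy types might suit your scenario better.

```python
import copy
import json

# Policy type for storage accounts that use native blob soft delete (assumed here).
NATIVE_SOFT_DELETE = "#Microsoft.Azure.Search.NativeBlobSoftDeleteDeletionDetectionPolicy"


def with_deletion_tracking(datasource):
    """Return a copy of a data source definition with a deletion detection policy."""
    updated = copy.deepcopy(datasource)
    updated["dataDeletionDetectionPolicy"] = {"@odata.type": NATIVE_SOFT_DELETE}
    return updated


# Minimal sample data source (credentials omitted), for illustration only.
base = {
    "name": "my-blob-datasource",
    "type": "azureblob",
    "indexerPermissionOptions": ["rbacScope"],
    "container": {"name": "<your-container-name>"},
}
print(json.dumps(with_deletion_tracking(base), indent=2))
```

The helper returns a copy so that the original definition stays unchanged, which makes it easier to compare the two before the first indexer run.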
