Collection policy reference

Microsoft Purview collection policies have many configurable components. To create an effective policy, you need to understand the purpose of each component and how its configuration changes the policy's behavior. This article describes each component of a collection policy in detail.

Before you begin

If you're new to collection policies, here's a list of the core articles you need as you implement them in your organization:

  1. Collection Policies solution overview (preview)
  2. Collection policy reference (preview): the article you're reading now, which introduces all the components of a collection policy and how each one influences the policy's behavior
  3. Create and Deploy collection policies (preview)

Conditions

Specify conditions to define what data to detect.

Note

Conditions are optional; however, some might be required for additional settings.

Collection policies support four conditions:

Condition | More information
Content contains classifiers | Sensitive information types and trainable classifiers to detect. Can be scoped to all classifiers, all classifiers except selected ones, or specific classifiers. NOTE: The devices data source doesn't support trainable classifiers. Devices ignore any selected trainable classifiers.
Document size equals or is greater than | Detect files with a size that is equal to or greater than a specified number of bytes, kilobytes (KB), megabytes (MB), gigabytes (GB), or terabytes (TB). This condition only applies to the devices data source.
Document is equal to or smaller than | Detect files with a size that is equal to or smaller than a specified number of bytes, kilobytes (KB), megabytes (MB), gigabytes (GB), or terabytes (TB). This condition only applies to the devices data source.
File extension is | Detect files with specified file extensions. This condition only applies to the devices data source.
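
As a rough illustration of how the three device-only file conditions could be evaluated together, the following Python sketch models them with a hypothetical `FileConditions` class and `file_matches` function. Neither name is part of Microsoft Purview. The sketch assumes binary units (1 KB = 1,024 bytes), case-insensitive extension matching, and that all configured conditions must match; the article doesn't specify any of these.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import FrozenSet, Optional, Tuple

# Assumption: binary multiples. The article doesn't say which convention applies.
UNIT_BYTES = {"bytes": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


@dataclass(frozen=True)
class FileConditions:
    """Hypothetical model of the device-only file conditions."""
    min_size: Optional[Tuple[int, str]] = None  # "Document size equals or is greater than"
    max_size: Optional[Tuple[int, str]] = None  # "Document is equal to or smaller than"
    extensions: FrozenSet[str] = frozenset()    # "File extension is" (lowercase, no dot)


def to_bytes(value: int, unit: str) -> int:
    """Convert a (value, unit) pair to a byte count."""
    return value * UNIT_BYTES[unit]


def file_matches(name: str, size_bytes: int, cond: FileConditions) -> bool:
    """Return True if the file satisfies every configured condition.

    Assumption: unset conditions are skipped and set conditions are ANDed.
    """
    if cond.min_size is not None and size_bytes < to_bytes(*cond.min_size):
        return False
    if cond.max_size is not None and size_bytes > to_bytes(*cond.max_size):
        return False
    if cond.extensions:
        ext = PurePath(name).suffix.lstrip(".").lower()
        if ext not in cond.extensions:
            return False
    return True
```

For example, `FileConditions(min_size=(1, "MB"), extensions=frozenset({"docx"}))` would match a 1 MB file named `Report.DOCX` in this sketch, but not a `.pdf` file of any size.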

Activities

Choose which activities to detect. Supported activities are specific to the data sources you want to include.

Tip

You can mix activities that support different data sources in a single policy, but you must add all applicable data sources to the policy to support the selected activities.

Activity | Description | Data source
Text sent to or shared with cloud or AI app | When raw text is uploaded to a cloud app, including generative AI prompts, form submissions, and messages | Cloud apps; Generative AI
File uploaded to or shared with cloud or AI app | When a binary file is uploaded to a cloud app or generative AI service | Cloud apps; Generative AI
Text received from cloud or AI app | When raw text is downloaded from a cloud app, including generative AI responses | Cloud apps; Generative AI
File downloaded from cloud or AI app | When a binary file is downloaded from a cloud app or generative AI service | Cloud apps; Generative AI
Archive created | When an archive file is created on an onboarded endpoint device | Devices
File accessed by unallowed app | When a file is accessed by a restricted app or app group on an onboarded endpoint device | Devices
File archived | When a file is added to an archive on an onboarded endpoint device | Devices
File copied to network share | When a file is copied to a network share on an onboarded endpoint device | Devices
File copied to remote desktop session | When a file is copied to a remote computer through a remote desktop session on an onboarded endpoint device | Devices
File copied to removable media | When a file is copied to removable media, such as a USB flash drive, on an onboarded endpoint device | Devices
File created | When a file is created on an onboarded endpoint device | Devices
File created on network share | When a file is created on a network share from an onboarded endpoint device | Devices
File created on removable media | When a file is created on removable media, such as a USB flash drive, from an onboarded endpoint device | Devices
File deleted | When a file is deleted from an onboarded endpoint device | Devices
File modified | When a file is modified on an onboarded endpoint device | Devices
File printed | When a file is printed from an onboarded endpoint device | Devices
File read | When a file is read from an onboarded endpoint device | Devices
File renamed | When a file is renamed on an onboarded endpoint device | Devices
File transferred by Bluetooth | When a file is transferred by Bluetooth from an onboarded endpoint device | Devices
File uploaded to cloud | When a file is uploaded to the cloud from an onboarded endpoint device | Devices
Removable media mount | When removable media, such as a USB flash drive, is mounted on an onboarded endpoint device | Devices
Removable media unmount | When removable media, such as a USB flash drive, is unmounted on an onboarded endpoint device | Devices

Data sources

Data sources define where the policy applies and are directly tied to the activities added to the policy.

The following data sources are supported:

Data source | More information | Supported activities
Devices (preview) | Windows devices onboarded to Microsoft 365 and managed by your org. | Activities listed with the Devices data source in the Activities table
Copilot experiences (preview) | Includes Copilot in Fabric and Security Copilot only, with support for more experiences coming soon. | Text sent to or shared with cloud or AI app; Text received from cloud or AI app
Enterprise AI (preview) | Non-Copilot AI apps that are onboarded or connected to your org using methods like Microsoft Entra registration, Azure AI services, or Purview Data Map connectors. | Text sent to or shared with cloud or AI app; Text received from cloud or AI app
Unmanaged cloud apps (preview) | Cloud apps from the Defender for Cloud Apps catalog that aren't set up for single sign-on (SSO), allowing users to access personal data through a browser, app, add-in, or API. Policies only detect data while it's being shared or transferred (data in motion) via network detection. | Text sent to or shared with cloud or AI app; Text received from cloud or AI app; File uploaded to or shared with cloud or AI app; File downloaded from cloud or AI app
Adaptive app scopes (preview) | Groups of apps whose membership is determined by app metadata, such as category. Currently, only "All unmanaged AI apps" (all unmanaged cloud apps categorized as generative AI) is supported. | Text sent to or shared with cloud or AI app; Text received from cloud or AI app; File uploaded to or shared with cloud or AI app; File downloaded from cloud or AI app
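
Because each selected activity needs a data source that supports it, it can help to check a planned policy before you build it. The following Python sketch encodes the cloud and AI rows of the table above and reports activities that no selected data source covers. The `SUPPORTED` mapping and the `uncovered_activities` function are hypothetical helpers for planning, not a Microsoft Purview API, and the Devices data source is left out for brevity.

```python
# Activity names as listed in this article.
TEXT_ACTIVITIES = {
    "Text sent to or shared with cloud or AI app",
    "Text received from cloud or AI app",
}
FILE_ACTIVITIES = {
    "File uploaded to or shared with cloud or AI app",
    "File downloaded from cloud or AI app",
}

# Hypothetical lookup built from the data sources table (Devices omitted).
SUPPORTED = {
    "Copilot experiences": TEXT_ACTIVITIES,
    "Enterprise AI": TEXT_ACTIVITIES,
    "Unmanaged cloud apps": TEXT_ACTIVITIES | FILE_ACTIVITIES,
    "Adaptive app scopes": TEXT_ACTIVITIES | FILE_ACTIVITIES,
}


def uncovered_activities(activities, data_sources):
    """Return, sorted, the activities that none of the data sources support."""
    covered = set()
    for source in data_sources:
        # Unknown data source names contribute no coverage.
        covered |= SUPPORTED.get(source, set())
    return sorted(set(activities) - covered)
```

In this sketch, pairing "File downloaded from cloud or AI app" with only the Copilot experiences data source returns that activity as uncovered, which signals that another data source, such as Unmanaged cloud apps, would need to be added.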

Scoping data sources to users and groups

For each data source, you can choose to scope to the following:

  • All users and groups (default)
  • Specific users and groups
  • All except specific users and groups

Note

Excluded users and groups take precedence over any included users or groups.
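
The precedence rule can be pictured with a small Python sketch. The `Scope` class and `in_scope` function are hypothetical, and representing a scope as an optional include set plus an exclude set is an assumption made for illustration; the article only states that exclusions win over inclusions.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class Scope:
    """Hypothetical scope for one data source."""
    included: Optional[Set[str]] = None              # None means "All users and groups"
    excluded: Set[str] = field(default_factory=set)  # users or groups to exclude


def in_scope(user: str, groups: Iterable[str], scope: Scope) -> bool:
    """Return True if the policy applies to the user for this data source."""
    identities = {user} | set(groups)
    # Exclusions are checked first, so they override any inclusion.
    if identities & scope.excluded:
        return False
    if scope.included is None:
        return True
    return bool(identities & scope.included)
```

With `Scope(included={"Finance"}, excluded={"alice@contoso.com"})`, a Finance member other than Alice is in scope, while Alice is out of scope even though her group is included.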

Other collection policy settings

Depending on the conditions, activities, and data sources specified, there might be other collection policy settings to configure. When a setting is disabled or grayed out, the current policy configuration isn't compatible with that setting.

Content capture for AI interactions

To help comply with regulatory requirements, you can decide whether to capture and store all detected prompts and responses from any generative AI data sources added to the policy. This makes it easy to discover and protect the captured content later with other Microsoft Purview policies and solutions. This capability doesn't include content in files shared with generative AI, and only applies to the following data sources:

  • Copilot experiences
  • Enterprise AI
  • Unmanaged cloud apps categorized as generative AI
  • All unmanaged AI apps adaptive app scope

Without this setting enabled, content detected in prompts and responses is limited to sensitive information only.

Note

To capture AI content, you must have the Content contains classifiers condition set to All.
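
Taken together, the data source list and the classifier requirement above amount to a simple precondition. The following Python sketch expresses it. The `capture_available` function, the `"All"` scope string, and the data source labels are hypothetical stand-ins for the portal settings, not a Microsoft Purview API.

```python
# Data sources that this article lists as eligible for content capture.
CAPTURE_SOURCES = {
    "Copilot experiences",
    "Enterprise AI",
    "Unmanaged cloud apps categorized as generative AI",
    "All unmanaged AI apps adaptive app scope",
}


def capture_available(classifier_scope: str, data_sources) -> bool:
    """Return True if content capture could be enabled for this configuration.

    Assumption: classifier_scope is "All" when the Content contains
    classifiers condition is scoped to all classifiers.
    """
    has_ai_source = any(source in CAPTURE_SOURCES for source in data_sources)
    return classifier_scope == "All" and has_ai_source
```

A configuration that scopes classifiers to specific ones, or that contains only the Devices data source, returns `False` in this sketch.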

Cloud apps detection

If any unmanaged cloud app or adaptive app scopes data sources have been added to the policy, you must choose how to detect this data. You can choose:

  • Network - Detect sensitive data shared with unmanaged cloud apps through browsers, apps, APIs, and more, with an integrated Secure Service Edge (SSE) provider and Purview network data security.

Note

Detecting cloud app activity over the network is a pay-as-you-go capability. Learn more about pay-as-you-go billing.

Privacy notice for Enterprise AI and Network Data Security

Enterprise AI data sources and network data security integrations might require integration with a third-party app or provider. If you choose to enable a third-party integration, the third party has access to and might store some policy configuration, including user identifiers. In that case, the third party's terms, conditions, and privacy policy govern the use and storage of this data.