Ingestion Jobs - Create
Creates an ingestion job with the specified job id.
PUT {endpoint}/openai/ingestion/jobs/{job-id}?api-version=2025-03-01-preview
URI Parameters
| Name | In | Required | Type | Description |
|---|---|---|---|---|
| endpoint | path | True | string (url) | Supported Cognitive Services endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI account name). |
| job-id | path | True | string | The id of the job that will be created. |
| api-version | query | True | string | The requested API version. |
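The three URI parameters combine into the request URL shown at the top of this page. A minimal sketch of that assembly, assuming Python's standard library; the function name `build_ingestion_job_url` is hypothetical and not part of any SDK:

```python
from urllib.parse import quote, urlencode


def build_ingestion_job_url(endpoint: str, job_id: str,
                            api_version: str = "2025-03-01-preview") -> str:
    """Assemble {endpoint}/openai/ingestion/jobs/{job-id}?api-version=..."""
    base = endpoint.rstrip("/")  # tolerate a trailing slash on the endpoint
    path = "/openai/ingestion/jobs/" + quote(job_id, safe="")  # escape the path segment
    query = urlencode({"api-version": api_version})
    return f"{base}{path}?{query}"


url = build_ingestion_job_url("https://aoairesource.openai.azure.com", "ingestion-job")
```

Percent-encoding the job id is a defensive choice in this sketch; the page does not state which characters a job id may contain.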
Request Header
| Name | Required | Type | Description |
|---|---|---|---|
| mgmt-user-token | | string | The token used to access the workspace (needed only for user compute jobs). |
| aml-user-token | | string | The token used to access the resources within the job in the workspace (needed only for user compute jobs). |
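Both headers are optional and apply only to user compute jobs, so a client can leave them out otherwise. A small sketch of that rule; the helper name `user_compute_headers` is hypothetical, and how the tokens are obtained is outside the scope of this page:

```python
from typing import Dict, Optional


def user_compute_headers(mgmt_user_token: Optional[str] = None,
                         aml_user_token: Optional[str] = None) -> Dict[str, str]:
    """Return only the token headers that were actually supplied."""
    headers: Dict[str, str] = {}
    if mgmt_user_token:
        headers["mgmt-user-token"] = mgmt_user_token
    if aml_user_token:
        headers["aml-user-token"] = aml_user_token
    return headers
```

For a system compute job the function is called without arguments and contributes no headers.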
Request Body
The request body can be one of the following:
| Name | Description |
|---|---|
| IngestionJobSystemCompute | |
| IngestionJobUserCompute | |
IngestionJobSystemCompute
| Name | Required | Type | Description |
|---|---|---|---|
| kind | True | string: SystemCompute | IngestionJobType |
| completionAction | | IngestionJobCompletionAction | The completion action. |
| dataRefreshIntervalInHours | | integer | |
| datasource | | SystemComputeDatasource | SystemComputeDatasource |
| jobId | | string | |
| searchServiceConnection | | BaseConnection | BaseConnection |
IngestionJobUserCompute
| Name | Required | Type | Description |
|---|---|---|---|
| kind | True | string: UserCompute | IngestionJobType |
| workspaceId | True | string | |
| compute | | JobCompute | JobCompute |
| dataRefreshIntervalInHours | | integer | |
| datasource | | UserComputeDatasource | UserComputeDatasource |
| jobId | | string | |
| target | | TargetIndex | TargetIndex |
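The two body shapes differ in which properties are marked required: `kind` for both, plus `workspaceId` for user compute jobs. A client-side pre-check could be sketched as follows; `REQUIRED_FIELDS` and `missing_required_fields` are hypothetical names, and the service may enforce rules beyond those listed in the tables above:

```python
from typing import List

# Required properties per job kind, taken from the "Required" column above.
REQUIRED_FIELDS = {
    "SystemCompute": ("kind",),
    "UserCompute": ("kind", "workspaceId"),
}


def missing_required_fields(body: dict) -> List[str]:
    """List required properties that are absent from a request body."""
    kind = body.get("kind")
    if kind not in REQUIRED_FIELDS:
        raise ValueError(f"unknown or missing kind: {kind!r}")
    return [name for name in REQUIRED_FIELDS[kind] if name not in body]
```

An empty list means the body passes this check only; it does not guarantee that the service accepts the request.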
Responses
| Name | Type | Description |
|---|---|---|
| 200 OK | IngestionJob | Success |
| Other Status Codes | ErrorResponse | An error occurred. |
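A caller therefore has two cases to handle: a 200 response carrying the job, and any other status carrying an error payload. The sketch below assumes the error body follows the ErrorResponse shape from the Definitions section (an `error` object with `code` and `message`); `IngestionJobError` and `interpret_response` are hypothetical names:

```python
import json


class IngestionJobError(Exception):
    """Raised for any non-200 response; carries the service error code."""

    def __init__(self, status: int, code, message):
        super().__init__(f"HTTP {status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def interpret_response(status: int, body_text: str) -> dict:
    """Return the job on 200 OK, otherwise raise IngestionJobError."""
    payload = json.loads(body_text) if body_text else {}
    if status == 200:
        return payload
    error = payload.get("error") or {}
    raise IngestionJobError(status, error.get("code"), error.get("message"))
```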
Security
api-key
API key authentication
Type: apiKey
In: header
OAuth2Auth
OAuth2 authentication
Type: oauth2
Flow: implicit
Authorization URL: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
Scopes
| Name | Description |
|---|---|
| https://cognitiveservices.azure.com/.default | |
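The two schemes translate into different request headers. A minimal sketch, assuming the API key travels in a header named `api-key` and the OAuth2 access token is sent as a standard `Authorization: Bearer` header; `auth_headers` is a hypothetical helper, and acquiring the token for the scope above is not shown:

```python
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None,
                 bearer_token: Optional[str] = None) -> Dict[str, str]:
    """Build the authentication header for exactly one of the two schemes."""
    if (api_key is None) == (bearer_token is None):
        raise ValueError("provide exactly one of api_key or bearer_token")
    if api_key is not None:
        return {"api-key": api_key}
    return {"Authorization": f"Bearer {bearer_token}"}
```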
Examples
Create a system-compute ingestion job
Sample request
PUT {endpoint}/openai/ingestion/jobs/ingestion-job?api-version=2025-03-01-preview
{
  "kind": "SystemCompute",
  "searchServiceConnection": {
    "kind": "EndpointWithManagedIdentity",
    "endpoint": "https://aykame-dev-search.search.windows.net"
  },
  "datasource": {
    "kind": "Storage",
    "connection": {
      "kind": "EndpointWithManagedIdentity",
      "endpoint": "https://mystorage.blob.core.windows.net/",
      "resourceId": "/subscriptions/1234567-abcd-1234-5678-1234abcd/resourceGroups/my-resource/providers/Microsoft.Storage/storageAccounts/mystorage"
    },
    "containerName": "container",
    "chunking": {
      "maxChunkSizeInTokens": 2048
    },
    "embeddings": [
      {
        "connection": {
          "kind": "RelativeConnection"
        },
        "deploymentName": "Ada"
      }
    ]
  },
  "dataRefreshIntervalInHours": 24,
  "completionAction": "keepAllAssets"
}
Sample response
operation-location: https://aoairesource.openai.azure.com/openai/ingestion/jobs/ingestion-job/runs/72a2792ef7d24ba7b82c7fe4a37e379f?api-version=2025-03-01-preview
{
  "kind": "SystemCompute",
  "jobId": "ingestion-job",
  "searchServiceConnection": {
    "kind": "EndpointWithManagedIdentity",
    "endpoint": "https://aykame-dev-search.search.windows.net"
  },
  "datasource": {
    "kind": "Storage",
    "connection": {
      "kind": "EndpointWithManagedIdentity",
      "endpoint": "https://mystorage.blob.core.windows.net/",
      "resourceId": "/subscriptions/1234567-abcd-1234-5678-1234abcd/resourceGroups/my-resource/providers/Microsoft.Storage/storageAccounts/mystorage"
    },
    "containerName": "container",
    "chunking": {
      "maxChunkSizeInTokens": 2048
    },
    "embeddings": [
      {
        "connection": {
          "kind": "RelativeConnection"
        },
        "deploymentName": "Ada"
      }
    ]
  },
  "dataRefreshIntervalInHours": 24,
  "completionAction": "keepAllAssets"
}
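The sample request above can be prepared with Python's standard library alone. This is a sketch under assumptions, not an official client: it uses API key authentication with an `api-key` header, sends the body as JSON, and `build_create_request` is a hypothetical helper. The request object is only constructed here; nothing is sent.

```python
import json
import urllib.request


def build_create_request(endpoint: str, job_id: str, body: dict, api_key: str,
                         api_version: str = "2025-03-01-preview") -> urllib.request.Request:
    """Prepare (but do not send) the PUT request that creates an ingestion job."""
    url = (f"{endpoint.rstrip('/')}/openai/ingestion/jobs/{job_id}"
           f"?api-version={api_version}")
    data = json.dumps(body).encode("utf-8")  # serialize the job definition
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json", "api-key": api_key},
    )


request = build_create_request(
    "https://aoairesource.openai.azure.com",
    "ingestion-job",
    {"kind": "SystemCompute", "dataRefreshIntervalInHours": 24},
    api_key="<your-api-key>",  # placeholder, not a real key
)
```

Passing `request` to `urllib.request.urlopen` would send it; the response headers and JSON body can then be read from the returned object.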
Create a user-compute ingestion job
Sample request
PUT {endpoint}/openai/ingestion/jobs/ingestion-job?api-version=2025-03-01-preview
{
  "kind": "UserCompute",
  "workspaceId": "/subscriptions/f375b912-331c-4fc5-8e9f-2d7205e3e036/resourceGroups/adrama-copilot-demo/providers/Microsoft.MachineLearningServices/workspaces/adrama-rag-dev",
  "compute": {
    "kind": "ServerlessCompute"
  },
  "target": {
    "kind": "AzureAISearch",
    "connectionId": "/subscriptions/f375b912-331c-4fc5-8e9f-2d7205e3e036/resourceGroups/adrama-copilot-demo/providers/Microsoft.MachineLearningServices/workspaces/adrama-rag-dev/connections/search-connection"
  },
  "datasource": {
    "kind": "Dataset",
    "datasetId": "azureml://locations/centraluseuap/workspaces/83317fe6-efa6-4e4a-b020-d0edd11ec382/data/PlainText/versions/1",
    "datasetType": "uri_folder"
  }
}
Sample response
operation-location: https://aoairesource.openai.azure.com/openai/ingestion/jobs/ingestion-job/runs/72a2792ef7d24ba7b82c7fe4a37e379f?api-version=2025-03-01-preview
{
  "kind": "UserCompute",
  "jobId": "ingestion-job",
  "workspaceId": "/subscriptions/f375b912-331c-4fc5-8e9f-2d7205e3e036/resourceGroups/adrama-copilot-demo/providers/Microsoft.MachineLearningServices/workspaces/adrama-rag-dev",
  "compute": {
    "kind": "ServerlessCompute"
  },
  "target": {
    "kind": "AzureAISearch",
    "connectionId": "/subscriptions/f375b912-331c-4fc5-8e9f-2d7205e3e036/resourceGroups/adrama-copilot-demo/providers/Microsoft.MachineLearningServices/workspaces/adrama-rag-dev/connections/search-connection"
  },
  "datasource": {
    "kind": "Dataset",
    "datasetId": "azureml://locations/centraluseuap/workspaces/83317fe6-efa6-4e4a-b020-d0edd11ec382/data/PlainText/versions/1",
    "datasetType": "uri_folder"
  }
}
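Both sample responses carry an operation location URL that points at a run of the newly created job. A sketch of extracting the job id and run id from that value, assuming the path layout seen in the samples (`/openai/ingestion/jobs/{job-id}/runs/{run-id}`); `parse_operation_location` is a hypothetical helper:

```python
from urllib.parse import parse_qs, urlparse


def parse_operation_location(value: str) -> dict:
    """Split an operation location URL into job id, run id and API version."""
    parsed = urlparse(value)
    parts = parsed.path.strip("/").split("/")
    # Layout assumed from the samples: openai/ingestion/jobs/{job-id}/runs/{run-id}
    if (len(parts) != 6 or parts[:3] != ["openai", "ingestion", "jobs"]
            or parts[4] != "runs"):
        raise ValueError(f"unexpected operation location path: {parsed.path!r}")
    api_version = parse_qs(parsed.query).get("api-version", [None])[0]
    return {"job_id": parts[3], "run_id": parts[5], "api_version": api_version}
```

Rejecting unexpected paths keeps the sketch from silently returning the wrong segment if the layout differs from the samples.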
Definitions
| Name | Description |
|---|---|
| AzureAISearchIndex | Azure AI Search Index. |
| ChunkingSettings | ChunkingSettings |
| ComputeType | The compute type. |
| ConnectionStringConnection | Connection string connection. |
| ConnectionType | The connection type. |
| CosmosDBIndex | CosmosDB Index. |
| CrawlingSettings | CrawlingSettings |
| CustomCompute | Custom compute. |
| DeploymentConnection | Relative deployment connection. |
| EndpointKeyConnection | Endpoint key connection. |
| EndpointMIConnection | Endpoint Managed Identity connection. |
| Error | Error |
| ErrorCode | ErrorCode |
| ErrorResponse | ErrorResponse |
| GenericEmbeddingSettings | ConnectionEmbeddingSettings |
| IngestionJobCompletionAction | The completion action. |
| IngestionJobSystemCompute | |
| IngestionJobType | IngestionJobType |
| IngestionJobUserCompute | |
| InnerError | InnerError |
| InnerErrorCode | InnerErrorCode |
| PineconeIndex | Pinecone Index. |
| ServerlessCompute | Serverless compute. |
| SystemComputeDatasourceType | The datasource type. |
| SystemComputeStorage | SystemComputeStorage |
| SystemComputeUrl | SystemComputeUrl |
| TargetType | The target type. |
| UserComputeDataset | UserComputeStorage |
| UserComputeDatasourceType | The datasource type. |
| UserComputeUrl | UserComputeUrl |
| WorkspaceConnection | AML Workspace connection. |
| WorkspaceConnectionEmbeddingSettings | WorkspaceConnectionEmbeddingSettings |
AzureAISearchIndex
Azure AI Search Index.
| Name | Type | Description |
|---|---|---|
| connectionId | string | The id of the connection pointing to the Azure AI Search Index. |
| kind | string: AzureAISearch | The target type. |
ChunkingSettings
ChunkingSettings
| Name | Type | Description |
|---|---|---|
| maxChunkSizeInTokens | integer | |
ComputeType
The compute type.
| Value | Description |
|---|---|
| ServerlessCompute | Serverless user compute. |
| CustomCompute | Custom user compute. |
ConnectionStringConnection
Connection string connection.
| Name | Type | Description |
|---|---|---|
| connectionString | string | Connection string |
| kind | string: ConnectionString | The connection type. |
ConnectionType
The connection type.
| Value | Description |
|---|---|
| EndpointWithKey | Endpoint and key connection. |
| ConnectionString | Connection string. |
| EndpointWithManagedIdentity | Endpoint and managed identity. |
| WorkspaceConnection | AML Workspace connection. |
| RelativeConnection | Relative deployment |
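Each connection type corresponds to one of the connection definitions in this section, and each definition carries its own set of properties. A sketch of a builder that rejects properties not defined for a given `kind`; `CONNECTION_FIELDS` and `make_connection` are hypothetical names, and because the definitions do not mark properties as required, nothing is enforced as mandatory:

```python
# Properties each connection kind defines, per the connection definitions
# in this section (ConnectionStringConnection, EndpointKeyConnection, ...).
CONNECTION_FIELDS = {
    "EndpointWithKey": {"endpoint", "key"},
    "ConnectionString": {"connectionString"},
    "EndpointWithManagedIdentity": {"endpoint", "resourceId"},
    "WorkspaceConnection": {"connectionId"},
    "RelativeConnection": set(),
}


def make_connection(kind: str, **fields: str) -> dict:
    """Build a connection object, rejecting properties the kind does not define."""
    if kind not in CONNECTION_FIELDS:
        raise ValueError(f"unknown connection kind: {kind!r}")
    unexpected = set(fields) - CONNECTION_FIELDS[kind]
    if unexpected:
        raise ValueError(f"properties not defined for {kind}: {sorted(unexpected)}")
    return {"kind": kind, **fields}


search_connection = make_connection(
    "EndpointWithManagedIdentity",
    endpoint="https://aykame-dev-search.search.windows.net",
)
```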
CosmosDBIndex
CosmosDB Index.
| Name | Type | Description |
|---|---|---|
| collectionName | string | The name of the cosmos DB collection. |
| connectionId | string | The id of the connection pointing to the cosmos DB. |
| databaseName | string | The name of the cosmos DB database. |
| kind | string: CosmosDB | The target type. |
CrawlingSettings
CrawlingSettings
| Name | Type | Description |
|---|---|---|
| maxCrawlDepth | integer | |
| maxCrawlTimeInMins | integer | |
| maxDownloadTimeInMins | integer | |
| maxFileSize | integer | |
| maxFiles | integer | |
| maxRedirects | integer | |
CustomCompute
Custom compute.
| Name | Type | Description |
|---|---|---|
| computeId | string | Id of the custom compute |
| kind | string: CustomCompute | The compute type. |
DeploymentConnection
Relative deployment connection.
| Name | Type | Description |
|---|---|---|
| kind | string: RelativeConnection | The connection type. |
EndpointKeyConnection
Endpoint key connection.
| Name | Type | Description |
|---|---|---|
| endpoint | string | Endpoint |
| key | string | Key |
| kind | string: EndpointWithKey | The connection type. |
EndpointMIConnection
Endpoint Managed Identity connection.
| Name | Type | Description |
|---|---|---|
| endpoint | string | Endpoint |
| kind | string: EndpointWithManagedIdentity | The connection type. |
| resourceId | string | Resource Id |
Error
Error
| Name | Type | Description |
|---|---|---|
| code | ErrorCode | |
| details | Error[] | The error details if available. |
| innererror | InnerError | |
| message | string minLength: 1 | The message of this error. |
| target | string | The location where the error happened if available. |
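Because an Error can nest further Error objects in `details` and a chain of InnerError objects in `innererror`, the codes in a failure are spread over a small tree. A sketch of collecting all of them in one pass; `iter_error_codes` is a hypothetical helper, and the traversal order (top level, inner chain, then details) is a choice made here:

```python
from typing import Iterator


def iter_error_codes(error: dict) -> Iterator[str]:
    """Yield the error's code, then its innererror chain, then each detail."""
    if error.get("code"):
        yield error["code"]
    inner = error.get("innererror")
    while inner:  # follow the InnerError chain until it ends
        if inner.get("code"):
            yield inner["code"]
        inner = inner.get("innererror")
    for detail in error.get("details") or []:
        yield from iter_error_codes(detail)  # details are full Error objects
```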
ErrorCode
ErrorCode
| Value | Description |
|---|---|
| conflict | The requested operation conflicts with the current resource state. |
| invalidPayload | The request data is invalid for this operation. |
| forbidden | The operation is forbidden for the current user/api key. |
| notFound | The resource is not found. |
| unexpectedEntityState | The operation cannot be executed in the current resource's state. |
| itemDoesAlreadyExist | The item already exists. |
| serviceUnavailable | The service is currently not available. |
| internalFailure | Internal error. Please retry. |
| quotaExceeded | Quota exceeded. |
| jsonlValidationFailed | Validation of jsonl data failed. |
| fileImportFailed | Import of file failed. |
| tooManyRequests | Too many requests. Please retry later. |
| unauthorized | The current user/api key is not authorized for the operation. |
| contentFilter | Image generation failed as a result of our safety system. |
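Some of these descriptions suggest a transient condition ("Please retry", "currently not available"). A sketch of a retry policy built on that reading; treating exactly these three codes as retryable is an assumption of this example rather than documented service behavior, and `RETRYABLE_CODES`, `is_retryable`, and `backoff_delays` are hypothetical names:

```python
from typing import List

# Assumption: codes whose descriptions hint at a transient condition.
RETRYABLE_CODES = frozenset({"serviceUnavailable", "internalFailure", "tooManyRequests"})


def is_retryable(code: str) -> bool:
    """Return True when the error code suggests that a retry may succeed."""
    return code in RETRYABLE_CODES


def backoff_delays(attempts: int, base_seconds: float = 1.0) -> List[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ..."""
    return [base_seconds * (2 ** i) for i in range(attempts)]
```

Codes such as `conflict` or `invalidPayload` describe a problem with the request itself, so this sketch does not retry them.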
ErrorResponse
ErrorResponse
| Name | Type | Description |
|---|---|---|
| error | Error | |
GenericEmbeddingSettings
ConnectionEmbeddingSettings
| Name | Type | Description |
|---|---|---|
| connection | BaseConnection | BaseConnection |
| deploymentName | string | |
| modelName | string | |
IngestionJobCompletionAction
The completion action.
| Value | Description |
|---|---|
| cleanUpTempAssets | Will clean up intermediate assets created during the ingestion process. |
| keepAllAssets | Will not clean up any of the intermediate assets created during the ingestion process. |
IngestionJobSystemCompute
| Name | Type | Description |
|---|---|---|
| completionAction | IngestionJobCompletionAction | The completion action. |
| dataRefreshIntervalInHours | integer | |
| datasource | SystemComputeDatasource | SystemComputeDatasource |
| jobId | string | |
| kind | string: SystemCompute | IngestionJobType |
| searchServiceConnection | BaseConnection | BaseConnection |
IngestionJobType
IngestionJobType
| Value | Description |
|---|---|
| SystemCompute | Jobs that run on service owned resources. |
| UserCompute | Jobs that run on user owned workspace. |
IngestionJobUserCompute
| Name | Type | Description |
|---|---|---|
| compute | JobCompute | JobCompute |
| dataRefreshIntervalInHours | integer | |
| datasource | UserComputeDatasource | UserComputeDatasource |
| jobId | string | |
| kind | string: UserCompute | IngestionJobType |
| target | TargetIndex | TargetIndex |
| workspaceId | string | |
InnerError
InnerError
| Name | Type | Description |
|---|---|---|
| code | InnerErrorCode | |
| innererror | InnerError | |
InnerErrorCode
InnerErrorCode
| Value | Description |
|---|---|
| invalidPayload | The request data is invalid for this operation. |
PineconeIndex
Pinecone Index.
| Name | Type | Description |
|---|---|---|
| connectionId | string | The id of the connection pointing to the pinecone. |
| kind | string: Pinecone | The target type. |
ServerlessCompute
Serverless compute.
| Name | Type | Description |
|---|---|---|
| instanceCount | integer | The count of instances to run the job on. |
| kind | string: ServerlessCompute | The compute type. |
| sku | string | SKU Level |
SystemComputeDatasourceType
The datasource type.
| Value | Description |
|---|---|
| Storage | Azure Storage Account. |
| Urls | URLs. |
SystemComputeStorage
SystemComputeStorage
| Name | Type | Description |
|---|---|---|
| chunking | ChunkingSettings | |
| connection | BaseConnection | BaseConnection |
| containerName | string | container name |
| embeddings | GenericEmbeddingSettings[] | ConnectionEmbeddingSettings |
| kind | string: Storage | The datasource type. |
SystemComputeUrl
SystemComputeUrl
| Name | Type | Description |
|---|---|---|
| chunking | ChunkingSettings | |
| connection | BaseConnection | BaseConnection |
| containerName | string | container name |
| crawling | CrawlingSettings | |
| embeddings | GenericEmbeddingSettings[] | ConnectionEmbeddingSettings |
| kind | string: Urls | The datasource type. |
| urls | string[] | |
TargetType
The target type.
| Value | Description |
|---|---|
| AzureAISearch | Azure AI Search Index. |
| CosmosDB | CosmosDB Index. |
| Pinecone | Pinecone Index. |
UserComputeDataset
UserComputeStorage
| Name | Type | Description |
|---|---|---|
| chunking | ChunkingSettings | |
| datasetId | string | |
| datasetType | string | |
| embeddings | WorkspaceConnectionEmbeddingSettings | |
| kind | string: Dataset | The datasource type. |
UserComputeDatasourceType
The datasource type.
| Value | Description |
|---|---|
| Dataset | Workspace Dataset. |
| Urls | URLs. |
UserComputeUrl
UserComputeUrl
| Name | Type | Description |
|---|---|---|
| chunking | ChunkingSettings | |
| crawling | CrawlingSettings | |
| embeddings | WorkspaceConnectionEmbeddingSettings | |
| kind | string: Urls | The datasource type. |
| urls | string[] | |
WorkspaceConnection
AML Workspace connection.
| Name | Type | Description |
|---|---|---|
| connectionId | string | ConnectionId |
| kind | string: WorkspaceConnection | The connection type. |
WorkspaceConnectionEmbeddingSettings
WorkspaceConnectionEmbeddingSettings
| Name | Type | Description |
|---|---|---|
| connectionId | string | |
| deploymentName | string | |
| modelName | string | |