Document Layout skill

2025-07-29

Note

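This feature is currently in public preview. The preview is provided without a service-level agreement and isn't recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.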

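The Document Layout skill analyzes a document to detect its structure and characteristics, and produces a syntactical representation of the document in Markdown or text format. You can use it to extract text and images. Image extraction includes location metadata that preserves each image's position within the document. Keeping images close to their related content is useful in retrieval-augmented generation (RAG) workloads and multimodal search scenarios.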

This article is the reference documentation for the Document Layout skill. For usage information, see "How to chunk and vectorize by document layout".

This skill uses the Document Intelligence layout model provided in Azure AI Document Intelligence.

This skill is bound to a billable Azure AI multi-service resource for transactions that exceed 20 documents per indexer per day. Execution of built-in skills is charged at the existing Azure AI services Standard price.

Tip

This skill is commonly used on content such as PDFs that have structure and images. The related tutorials demonstrate image verbalization with two different data chunking techniques.

Limitations

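During the public preview, this skill has the following limitation:

The skill isn't suitable for large documents that require more than 5 minutes of processing in the AI Document Intelligence layout model. The skill times out, but charges still apply to the AI Services multi-service resource if it's attached to the skillset for billing purposes. To avoid unnecessary costs, make sure documents are optimized to stay within the processing limits.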

Supported regions

The Document Layout skill calls the Document Intelligence public preview version 2024-07-31-preview.

Supported regions vary by modality and by how the skill connects to the Document Intelligence layout model.

Approach	Requirement
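Import and vectorize data wizard	Create an Azure AI multi-service resource in one of these regions to get the portal experience: East US, West Europe, North Central US.
Programmatic, using Microsoft Entra ID authentication (preview) for billing	Create Azure AI Search in one of these regions: East US, West Europe, North Central US, West US 2. Create the Azure AI multi-service resource in any region listed in the Product Availability by Region table.
Programmatic, using a multi-service resource API key for billing	Create the Azure AI Search service and the AI multi-service resource in the same region: East US, West Europe, North Central US, West US 2.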

Supported file formats

This skill recognizes the following file formats.

.PDF
.JPEG
.JPG
.PNG
.BMP
.TIFF
.DOCX
.XLSX
.PPTX
.HTML

Supported languages

For printed text, see the languages supported by the Azure AI Document Intelligence layout model.

Supported parameters

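Several parameters are version-specific. The skill parameters table notes the API version in which each parameter was introduced, so that you know how to configure the skill. To use version-specific features such as image and location metadata extraction in the 2025-05-01-preview REST API, you can use the Azure portal, target 2025-05-01-preview directly, or check an Azure SDK change log to see whether it supports the new parameters.

The Azure portal supports most preview features and can be used to create or update a skillset. To update a Document Layout skill, edit the skillset JSON definition and add the new preview parameters.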

@odata.type

Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill

Data limits

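For PDF and TIFF, up to 2,000 pages can be processed (with a Free tier subscription, only the first two pages are processed).
The file size limit for analyzing documents is 500 MB for the Azure AI Document Intelligence paid (S0) tier and 4 MB for the Azure AI Document Intelligence Free (F0) tier. Even within those limits, indexing is still subject to the indexing limits of your search service tier.
Image dimensions must be between 50 pixels x 50 pixels and 10,000 pixels x 10,000 pixels.
If a PDF is password-locked, remove the lock before you run the indexer.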
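
As a rough illustration of the image size limit above, the following Python sketch checks dimensions before a document is sent for indexing. The function name and constants are hypothetical, and treating both bounds as inclusive and as applying to each side independently is an assumption, not documented behavior.

```python
# Limits copied from the data limits list; inclusive bounds are an assumption.
MIN_IMAGE_SIDE_PX = 50
MAX_IMAGE_SIDE_PX = 10_000


def image_within_limits(width_px: int, height_px: int) -> bool:
    """Return True if both sides fall within the documented pixel range."""
    return all(
        MIN_IMAGE_SIDE_PX <= side <= MAX_IMAGE_SIDE_PX
        for side in (width_px, height_px)
    )


if __name__ == "__main__":
    print(image_within_limits(800, 600))   # inside the range
    print(image_within_limits(40, 600))    # one side too small
```

A check like this only filters obvious outliers on the client side; the service remains the authority on what it accepts.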

Skill parameters

Parameters are case-sensitive.

Parameter name	Version	Allowed Values	Description
`outputMode`	2024-11-01-preview	`oneToMany`	Controls the cardinality of the output produced by the skill.
`markdownHeaderDepth`	2024-11-01-preview	`h1`, `h2`, `h3`, `h4`, `h5`, `h6` (default)	Only applies if `outputFormat` is set to `markdown`. This parameter describes the deepest nesting level that should be considered. For example, if `markdownHeaderDepth` is `h3`, any deeper sections, such as `h4`, are rolled into `h3`.
`outputFormat`	2025-05-01-preview	`markdown` (default), `text`	New. Controls the format of the output generated by the skill.
`extractionOptions`	2025-05-01-preview	`["images"]`, `["images", "locationMetadata"]`, `["locationMetadata"]`	New. Identifies any extra content extracted from the document. Define an array of enums that correspond to the content to include in the output. For example, if `extractionOptions` is `["images", "locationMetadata"]`, the output includes images and location metadata, which provides page location information about where the content was extracted, such as a page number or section. This parameter applies to both output formats.
`chunkingProperties`	2025-05-01-preview	See below.	New. Only applies if `outputFormat` is set to `text`. Options that encapsulate how to chunk text content while recomputing other metadata.

ChunkingProperties Parameter	Version	Allowed Values	Description
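`unit`	2025-05-01-preview	`Characters`. Currently the only allowed value. Chunk length is measured in characters, as opposed to words or tokens.	New. Controls the cardinality of the chunk unit.
`maximumLength`	2025-05-01-preview	Any integer between 300 and 50000	New. The maximum chunk length in characters, as measured by String.Length.
`overlapLength`	2025-05-01-preview	Integer. The value must be less than half of `maximumLength`.	New. The length of overlap provided between two text chunks.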
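
The numeric constraints in the table above can be checked before a skillset is submitted. The following Python sketch is one way to do that; `validate_chunking_properties` is a hypothetical helper, not part of any Azure SDK, and the requirement that `overlapLength` be non-negative is an assumption.

```python
def validate_chunking_properties(props: dict) -> list:
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    max_len = props.get("maximumLength")
    overlap = props.get("overlapLength")

    # Documented range: any integer between 300 and 50000.
    if not isinstance(max_len, int) or not 300 <= max_len <= 50000:
        problems.append("maximumLength must be an integer between 300 and 50000")

    # Assumption: a negative overlap is not meaningful.
    if not isinstance(overlap, int) or overlap < 0:
        problems.append("overlapLength must be a non-negative integer")
    # Documented rule: overlapLength must be less than half of maximumLength.
    elif isinstance(max_len, int) and overlap * 2 >= max_len:
        problems.append("overlapLength must be less than half of maximumLength")

    return problems


if __name__ == "__main__":
    print(validate_chunking_properties({"maximumLength": 2000, "overlapLength": 200}))
    print(validate_chunking_properties({"maximumLength": 2000, "overlapLength": 1000}))
```

The helper deliberately ignores `unit`, because the table lists only one allowed value for it.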

Skill inputs

Input name	Description
`file_data`	The file that content should be extracted from.

The "file_data" input must be an object defined as follows:

{
  "$type": "file",
  "data": "BASE64 encoded string of the file"
}

Alternatively, it can be defined as follows:

{
  "$type": "file",
  "url": "URL to download file",
  "sasToken": "OPTIONAL: SAS token for authentication if the URL provided is for a file in blob storage"
}

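The file reference object can be generated in one of the following ways:

Setting the allowSkillsetToReadFileData parameter on your indexer definition to true. This setting creates a path, /document/file_data, which is an object representing the original file data downloaded from your blob data source. This parameter only applies to files in Azure Blob Storage.
Having a custom skill that returns a JSON object definition that provides $type together with either data, or url and sasToken. The $type parameter must be set to file, and data must be the base64-encoded byte array of the file content. The url parameter must be a valid URL with access for downloading the file at that location.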
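
As a minimal sketch of the second option, the following Python functions build the two file reference shapes shown earlier. The function names are hypothetical, and the sketch covers only the object itself, not the surrounding custom skill response.

```python
import base64
from typing import Optional


def file_reference_from_bytes(content: bytes) -> dict:
    """Build a file reference that carries the file inline as base64 text."""
    return {
        "$type": "file",
        "data": base64.b64encode(content).decode("ascii"),
    }


def file_reference_from_url(url: str, sas_token: Optional[str] = None) -> dict:
    """Build a file reference that points at a downloadable URL."""
    reference = {"$type": "file", "url": url}
    if sas_token:
        # Optional: only needed when the URL targets a file in blob storage.
        reference["sasToken"] = sas_token
    return reference


if __name__ == "__main__":
    print(file_reference_from_bytes(b"Hello, World!"))
```

Passing the file by URL keeps the skill payload small, while inline data avoids a second download at the cost of a larger request.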

Skill outputs

Output name	Description
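`markdown_document`	Only applies if `outputFormat` is set to `markdown`. A collection of "sections" objects, which represent each individual section in the Markdown document.
`text_sections`	Only applies if `outputFormat` is set to `text`. A collection of text chunk objects, which represent the text within the bounds of a page (factoring in any further chunking that is configured), including any section headers themselves. The text chunk object includes `locationMetadata` if applicable.
`normalized_images`	Only applies if `outputFormat` is set to `text` and `extractionOptions` includes `images`. A collection of images extracted from the document, including `locationMetadata` if applicable.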

Sample definition for markdown output mode

{
  "skills": [
    {
      "description": "Analyze a document",
      "@odata.type": "#Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill",
      "context": "/document",
      "outputMode": "oneToMany", 
      "markdownHeaderDepth": "h3", 
      "inputs": [
        {
          "name": "file_data",
          "source": "/document/file_data"
        }
      ],
      "outputs": [
        {
          "name": "markdown_document", 
          "targetName": "markdown_document" 
        }
      ]
    }
  ]
}

Sample output for markdown output mode

{
  "markdown_document": [
    { 
      "content": "Hi this is Jim \r\nHi this is Joe", 
      "sections": { 
        "h1": "Foo", 
        "h2": "Bar", 
        "h3": "" 
      },
      "ordinal_position": 0
    }, 
    { 
      "content": "Hi this is Lance",
      "sections": { 
         "h1": "Foo", 
         "h2": "Bar", 
         "h3": "Boo" 
      },
      "ordinal_position": 1
    } 
  ] 
}

The value of markdownHeaderDepth controls the number of keys in the "sections" dictionary. In the example skill definition, markdownHeaderDepth is "h3", so the "sections" dictionary has three keys: h1, h2, and h3.
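
To show how the "sections" dictionary might be consumed downstream, the following Python sketch joins the non-empty header values of each entry into a heading path. The `heading_path` helper and the " > " separator are hypothetical choices for illustration; the data mirrors the sample output above.

```python
# Entries copied from the sample output for markdown output mode.
markdown_document = [
    {
        "content": "Hi this is Jim \r\nHi this is Joe",
        "sections": {"h1": "Foo", "h2": "Bar", "h3": ""},
        "ordinal_position": 0,
    },
    {
        "content": "Hi this is Lance",
        "sections": {"h1": "Foo", "h2": "Bar", "h3": "Boo"},
        "ordinal_position": 1,
    },
]


def heading_path(entry: dict, separator: str = " > ") -> str:
    """Join the non-empty header values in h1..h6 order."""
    headers = entry["sections"]
    # Keys h1 to h6 sort correctly as plain strings.
    return separator.join(headers[key] for key in sorted(headers) if headers[key])


if __name__ == "__main__":
    for entry in markdown_document:
        print(entry["ordinal_position"], heading_path(entry))
```

A path like this can be stored alongside each chunk so that search results retain their position in the document outline.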

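Example for text output mode with image and metadata extraction

This example shows how to use the new parameters introduced in 2025-05-01-preview to output text content in fixed-size chunks and to extract images, along with location metadata, from the document.

Sample definition for text output mode with image and metadata extraction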

{
  "skills": [
    {
      "description": "Analyze a document",
      "@odata.type": "#Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill",
      "context": "/document",
      "outputMode": "oneToMany",
      "outputFormat": "text",
      "extractionOptions": ["images", "locationMetadata"],
      "chunkingProperties": {     
          "unit": "characters",
          "maximumLength": 2000, 
          "overlapLength": 200
      },
      "inputs": [
        {
          "name": "file_data",
          "source": "/document/file_data"
        }
      ],
      "outputs": [
        { 
          "name": "text_sections", 
          "targetName": "text_sections" 
        }, 
        { 
          "name": "normalized_images", 
          "targetName": "normalized_images" 
        } 
      ]
    }
  ]
}

Sample output for text output mode with image and metadata extraction

{
  "text_sections": [
      {
        "id": "1_7e6ef1f0-d2c0-479c-b11c-5d3c0fc88f56",
        "content": "the effects of analyzers using Analyze Text (REST). For more information about analyzers, see Analyzers for text processing.During indexing, an indexer only checks field names and types. There's no validation step that ensures incoming content is correct for the corresponding search field in the index.Create an indexerWhen you're ready to create an indexer on a remote search service, you need a search client. A search client can be the Azure portal, a REST client, or code that instantiates an indexer client. We recommend the Azure portal or REST APIs for early development and proof-of-concept testing.Azure portal1. Sign in to the Azure portal 2, then find your search service.2. On the search service Overview page, choose from two options:· Import data wizard: The wizard is unique in that it creates all of the required elements. Other approaches require a predefined data source and index.All services > Azure Al services | Al Search >demo-search-svc Search serviceSearchAdd indexImport dataImport and vectorize dataOverviewActivity logEssentialsAccess control (IAM)Get startedPropertiesUsageMonitoring· Add indexer: A visual editor for specifying an indexer definition.",
        "locationMetadata": {
          "pageNumber": 1,
          "ordinalPosition": 0,
          "boundingPolygons": "[[{\"x\":1.5548,\"y\":0.4036},{\"x\":6.9691,\"y\":0.4033},{\"x\":6.9691,\"y\":0.8577},{\"x\":1.5548,\"y\":0.8581}],[{\"x\":1.181,\"y\":1.0627},{\"x\":7.1393,\"y\":1.0626},{\"x\":7.1393,\"y\":1.7363},{\"x\":1.181,\"y\":1.7365}],[{\"x\":1.1923,\"y\":2.1466},{\"x\":3.4585,\"y\":2.1496},{\"x\":3.4582,\"y\":2.4251},{\"x\":1.1919,\"y\":2.4221}],[{\"x\":1.1813,\"y\":2.6518},{\"x\":7.2464,\"y\":2.6375},{\"x\":7.2486,\"y\":3.5913},{\"x\":1.1835,\"y\":3.6056}],[{\"x\":1.3349,\"y\":3.9489},{\"x\":2.1237,\"y\":3.9508},{\"x\":2.1233,\"y\":4.1128},{\"x\":1.3346,\"y\":4.111}],[{\"x\":1.5705,\"y\":4.5322},{\"x\":5.801,\"y\":4.5326},{\"x\":5.801,\"y\":4.7311},{\"x\":1.5704,\"y\":4.7307}]]"
        },
        "sections": ["sectionHeading"]
      },
      {
        "id": "2_25134f52-04c3-415a-ab3d-80729bd58e67",
        "content": "All services > Azure Al services | Al Search >demo-search-svc | Indexers Search serviceSearch0«Add indexerRefreshDelete:selected: TagsFilter by name ...:selected: Diagnose and solve problemsSearch managementStatusNameIndexesIndexers*Data sourcesRun the indexerBy default, an indexer runs immediately when you create it on the search service. You can override this behavior by setting disabled to true in the indexer definition. Indexer execution is the moment of truth where you find out if there are problems with connections, field mappings, or skillset construction.There are several ways to run an indexer:· Run on indexer creation or update (default).. Run on demand when there are no changes to the definition, or precede with reset for full indexing. For more information, see Run or reset indexers.· Schedule indexer processing to invoke execution at regular intervals.Scheduled execution is usually implemented when you have a need for incremental indexing so that you can pick up the latest changes. As such, scheduling has a dependency on change detection.Indexers are one of the few subsystems that make overt outbound calls to other Azure resources. In terms of Azure roles, indexers don't have separate identities; a connection from the search engine to another Azure resource is made using the system or user- assigned managed identity of a search service. If the indexer connects to an Azure resource on a virtual network, you should create a shared private link for that connection. For more information about secure connections, see Security in Azure Al Search.Check results",
        "locationMetadata": {
          "pageNumber": 2,
          "ordinalPosition": 1,
          "boundingPolygons": "[[{\"x\":2.2041,\"y\":0.4109},{\"x\":4.3967,\"y\":0.4131},{\"x\":4.3966,\"y\":0.5505},{\"x\":2.204,\"y\":0.5482}],[{\"x\":2.5042,\"y\":0.6422},{\"x\":4.8539,\"y\":0.6506},{\"x\":4.8527,\"y\":0.993},{\"x\":2.5029,\"y\":0.9845}],[{\"x\":2.3705,\"y\":1.1496},{\"x\":2.6859,\"y\":1.15},{\"x\":2.6858,\"y\":1.2612},{\"x\":2.3704,\"y\":1.2608}],[{\"x\":3.7418,\"y\":1.1709},{\"x\":3.8082,\"y\":1.171},{\"x\":3.8081,\"y\":1.2508},{\"x\":3.7417,\"y\":1.2507}],[{\"x\":3.9692,\"y\":1.1445},{\"x\":4.0541,\"y\":1.1445},{\"x\":4.0542,\"y\":1.2621},{\"x\":3.9692,\"y\":1.2622}],[{\"x\":4.5326,\"y\":1.2263},{\"x\":5.1065,\"y\":1.229},{\"x\":5.106,\"y\":1.346},{\"x\":4.5321,\"y\":1.3433}],[{\"x\":5.5508,\"y\":1.2267},{\"x\":5.8992,\"y\":1.2268},{\"x\":5.8991,\"y\":1.3408},{\"x\":5.5508,\"y\":1.3408}]]"
        },
        "sections": ["sectionHeading", "title"]
       }
    ],
    "normalized_images": [ 
        { 
            "id": "1_550e8400-e29b-41d4-a716-446655440000", 
            "data": "SGVsbG8sIFdvcmxkIQ==", 
            "imagePath": "aHR0cHM6Ly9henNyb2xsaW5nLmJsb2IuY29yZS53aW5kb3dzLm5ldC9tdWx0aW1vZGFsaXR5L0NyZWF0ZUluZGV4ZXJwNnA3LnBkZg2/normalized_images_0.jpg",  
            "locationMetadata": {
              "pageNumber": 1,
              "ordinalPosition": 0,
              "boundingPolygons": "[[{\"x\":2.0834,\"y\":6.2245},{\"x\":7.1818,\"y\":6.2244},{\"x\":7.1816,\"y\":7.9375},{\"x\":2.0831,\"y\":7.9377}]]"
            }
        },
        { 
            "id": "2_123e4567-e89b-12d3-a456-426614174000", 
            "data": "U29tZSBtb3JlIGV4YW1wbGUgdGV4dA==", 
            "imagePath": "aHR0cHM6Ly9henNyb2xsaW5nLmJsb2IuY29yZS53aW5kb3dzLm5ldC9tdWx0aW1vZGFsaXR5L0NyZWF0ZUluZGV4ZXJwNnA3LnBkZg2/normalized_images_1.jpg",  
            "locationMetadata": {
              "pageNumber": 2,
              "ordinalPosition": 1,
              "boundingPolygons": "[[{\"x\":2.0784,\"y\":0.3734},{\"x\":7.1837,\"y\":0.3729},{\"x\":7.183,\"y\":2.8611},{\"x\":2.0775,\"y\":2.8615}]]"
            } 
        }
    ] 
}

The skill uses Azure AI Document Intelligence to compute locationMetadata. For details on how pages and bounding polygon coordinates are defined, see "Document Intelligence layout model".
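
In the sample output, boundingPolygons is a string that itself contains JSON, so it needs a second parsing step before the coordinates can be used. The following Python sketch decodes it and derives an axis-aligned bounding box; both helper names are hypothetical, and the sketch makes no assumption about the coordinate unit.

```python
import json


def parse_bounding_polygons(location_metadata: dict) -> list:
    """Decode the JSON string into a list of polygons of (x, y) tuples."""
    polygons = json.loads(location_metadata["boundingPolygons"])
    return [[(point["x"], point["y"]) for point in polygon] for polygon in polygons]


def bounding_box(polygon: list) -> tuple:
    """Return (min_x, min_y, max_x, max_y) for one polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


# locationMetadata of the first image in the sample output.
sample_metadata = {
    "pageNumber": 1,
    "ordinalPosition": 0,
    "boundingPolygons": '[[{"x":2.0834,"y":6.2245},{"x":7.1818,"y":6.2244},'
                        '{"x":7.1816,"y":7.9375},{"x":2.0831,"y":7.9377}]]',
}

if __name__ == "__main__":
    first_polygon = parse_bounding_polygons(sample_metadata)[0]
    print(bounding_box(first_polygon))
```

A text chunk can carry several polygons, one per region it spans, so callers would typically loop over the full list rather than take only the first entry.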

imagePath represents the relative path of a stored image. If a knowledge store file projection is configured in the skillset, this path matches the relative path of the image stored in the knowledge store.