Quickstart: Use the Azure AI Content Understanding REST API

2025-07-23

This quickstart shows you how to use the Content Understanding REST API to get structured data from multimodal content in document, image, audio, and video files.
Try Content Understanding with no code on Azure AI Foundry

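Prerequisites

To get started, you need an active Azure subscription. If you don't have an Azure account, create a free account.
Once you have an Azure subscription, create an Azure AI Foundry resource in the Azure portal. Make sure to create it in a supported region.
- This resource is listed under AI Foundry > AI Foundry in the portal.
This guide uses the cURL command-line tool. If it isn't installed, you can download the appropriate version for your development environment.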

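Get started with a prebuilt analyzer

An analyzer defines how your content is processed and which insights are extracted. Content Understanding provides prebuilt analyzers for common use cases. You can customize a prebuilt analyzer to better fit your specific needs and use cases. This quickstart uses the prebuilt document, image, audio, and video analyzers to help you get started.

Send a file for analysis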

Before you run the following cURL command, make the following changes to the HTTP request:

Replace {endpoint} and {key} with the corresponding values from your Azure AI Foundry instance in the Azure portal.
Replace {analyzerId} with the prebuilt analyzer for your content type.
Replace {fileUrl} with a publicly accessible URL of the file to analyze, such as a path to an Azure Storage Blob with a shared access signature (SAS), or use the sample URL listed for that analyzer.

- Document: prebuilt-documentAnalyzer extracts text and layout elements such as paragraphs, sections, and tables from a document. Sample URL: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/invoice.pdf
- Image: prebuilt-imageAnalyzer generates a description of an image. Sample URL: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/pieChart.jpg
- Audio: prebuilt-audioAnalyzer extracts the audio transcript, generates a summary, and performs speaker labeling. Sample URL: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/audio.wav
- Video: prebuilt-videoAnalyzer extracts keyframes, transcripts, and chapter segments from a video. Sample URL: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/FlightSimulator.mp4

POST request

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-05-01-preview" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d "{\"url\":\"{fileUrl}\"}"
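
For readers who prefer a script to the command line, the same POST request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the function name build_analyze_request and the constant API_VERSION are hypothetical names chosen for illustration, while the URL path, headers, and JSON body mirror the cURL command above.

```python
import json
import urllib.request

# API version taken from the cURL command above.
API_VERSION = "2025-05-01-preview"


def build_analyze_request(
    endpoint: str, key: str, analyzer_id: str, file_url: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts an analysis.

    Hypothetical helper: it only assembles the same URL, headers, and
    JSON body that the cURL command uses.
    """
    url = (
        f"{endpoint.rstrip('/')}/contentunderstanding/analyzers/"
        f"{analyzer_id}:analyze?api-version={API_VERSION}"
    )
    body = json.dumps({"url": file_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen sends the request. Keeping the construction separate from the network call makes it possible to inspect the URL and body before anything is sent.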

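POST response

The response includes a request-id that you use to retrieve the result of the asynchronous analysis operation. The Operation-Location header also provides the direct URL for accessing the analysis result.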

HTTP/1.1 202 Accepted
Transfer-Encoding: chunked
Content-Type: application/json
request-id: aaa-bbb-ccc-ddd
x-ms-request-id: aaa-bbb-ccc-ddd
Operation-Location: {endpoint}/contentunderstanding/analyzerResults/{request-id}?api-version=2025-05-01-preview
api-supported-versions: 2024-12-01-preview,2025-05-01-preview,2025-10-01
x-envoy-upstream-service-time: 800
apim-request-id: {request-id}
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
x-ms-region: West US
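
If you capture the output of curl -i in a script, the result URL can be pulled out of the headers shown above. The following is a rough sketch under stated assumptions: parse_response_headers and result_url are hypothetical helper names, the input is assumed to be the raw text printed by curl -i, and the fallback URL is built from the same path that appears in the Operation-Location example.

```python
def parse_response_headers(raw: str) -> dict[str, str]:
    """Parse the header lines of a `curl -i` response into a dict.

    Header names are lowercased because HTTP header names are
    case-insensitive. The first line (the status line) is skipped.
    """
    headers: dict[str, str] = {}
    lines = raw.strip().splitlines()
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def result_url(
    headers: dict[str, str], endpoint: str, api_version: str = "2025-05-01-preview"
) -> str:
    """Prefer Operation-Location; otherwise build the URL from request-id."""
    location = headers.get("operation-location")
    if location:
        return location
    request_id = headers["request-id"]
    return (
        f"{endpoint.rstrip('/')}/contentunderstanding/analyzerResults/"
        f"{request_id}?api-version={api_version}"
    )
```

Using the Operation-Location value when it is present avoids rebuilding the URL by hand; the request-id fallback is included only to show how the two values relate.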

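Get the analysis result

Use the request-id from the POST response to retrieve the result of the analysis.

Replace {endpoint} and {key} with the endpoint and key values from your Azure AI Foundry instance in the Azure portal.
Replace {request-id} with the request-id from the POST response, or use the full URL from the Operation-Location response header.

GET request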

curl -i -X GET "{endpoint}/contentunderstanding/analyzerResults/{request-id}?api-version=2025-05-01-preview" \
  -H "Ocp-Apim-Subscription-Key: {key}"

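GET response

The 200 (OK) JSON response includes a status field that indicates the status of the operation. If the operation isn't complete, the value of status is Running or NotStarted. In that case, send the GET request again, either manually or through a script. Wait at least one second between calls.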
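
The retry behavior described above can be scripted. The following Python sketch repeats the GET request until status is no longer Running or NotStarted, waiting one second between calls. The names fetch_result and poll_until_done, the max_attempts limit, and the injectable sleep parameter are assumptions made for illustration; they are not part of the service or of any SDK.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_result(result_url: str, key: str) -> dict:
    """Send one GET request for the analysis result and decode the JSON body."""
    request = urllib.request.Request(
        result_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_until_done(
    fetch: Callable[[], dict],
    interval_seconds: float = 1.0,
    max_attempts: int = 120,  # assumed upper bound, not a service limit
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until status is neither Running nor NotStarted.

    Any other status is returned to the caller unchanged, so the caller
    still has to check whether the operation actually succeeded.
    """
    for _ in range(max_attempts):
        payload = fetch()
        if payload.get("status") not in ("Running", "NotStarted"):
            return payload
        sleep(interval_seconds)  # wait at least one second between calls
    raise TimeoutError(f"analysis still in progress after {max_attempts} attempts")
```

A typical call would be poll_until_done(lambda: fetch_result(url, key)), where url is the Operation-Location value from the POST response. Passing the fetch step in as a function keeps the loop independent of the network, which also makes it easy to try out with canned responses.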

{
  "id": "<request-id>",
  "status": "Succeeded",
  "result": {
    "analyzerId": "prebuilt-documentAnalyzer",
    "apiVersion": "2025-05-01-preview",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "markdown": "CONTOSO LTD.\n\n\n# INVOICE\n\nContoso Headquarters\n123 456th St...",
        "fields": {
          "Summary": {
            "type": "string",
            "valueString": "This document is an invoice issued by Contoso Ltd. to Microsoft Corporation for services rendered during the period of 10/14/2019 to 11/14/2019..."
          }
        },
        "kind": "document",
        "startPageNumber": 1,
        "endPageNumber": 1,
        "unit": "inch",
        "pages": [
          {
            "pageNumber": 1,
            "angle": -0.0039,
            "width": 8.5,
            "height": 11,
            "spans": [ { "offset": 0, "length": 1650 } ],
            "words": [
              {
                "content": "CONTOSO",
                "span": { "offset": 0, "length": 7 },
                "confidence": 0.998,
                "source": "D(1,0.5739,0.6582,1.7446,0.6595,1.7434,0.8952,0.5729,0.8915)"
              }, ...
            ],
            "lines": [
              {
                "content": "CONTOSO LTD.",
                "source": "D(1,0.5734,0.6563,2.335,0.6601,2.3345,0.8933,0.5729,0.8895)",
                "span": { "offset": 0, "length": 12 }
              }, ...
            ]
          }
        ],
        "paragraphs": [
          {
            "content": "CONTOSO LTD.",
            "source": "D(1,0.5734,0.6563,2.335,0.6601,2.3345,0.8933,0.5729,0.8895)",
            "span": { "offset": 0, "length": 12 }
          },
          {
            "role": "title",
            "content": "INVOICE",
            "source": "D(1,7.0515,0.5614,8.0064,0.5628,8.006,0.791,7.0512,0.7897)",
            "span": { "offset": 15, "length": 9 }
          }, ...
        ],
        "sections": [
          {
            "span": { "offset": 0, "length": 1649 },
            "elements": [ "/sections/1", "/sections/2" ]
          }, ...
        ],
        "tables": [
          {
            "rowCount": 2,
            "columnCount": 6,
            "cells": [
              {
                "kind": "columnHeader",
                "rowIndex": 0,
                "columnIndex": 0,
                "rowSpan": 1,
                "columnSpan": 1,
                "content": "SALESPERSON",
                "source": "D(1,0.5389,4.5514,1.7505,4.5514,1.7505,4.8364,0.5389,4.8364)",
                "span": { "offset": 512, "length": 11 },
                "elements": [ "/paragraphs/19" ]
              }, ...
            ],
            "source": "D(1,0.4885,4.5543,8.0163,4.5539,8.015,5.1207,0.4879,5.1209)",
            "span": { "offset": 495, "length": 228 }
          }, ...
        ]
      }
    ]
  }
}

{
  "id": "<request-id>",
  "status": "Succeeded",
  "result": {
    "analyzerId": "prebuilt-imageAnalyzer",
    "apiVersion": "2025-05-01-preview",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "markdown": "![image](image)\n",
        "fields": {
          "Summary": {
            "type": "string",
            "valueString": "The image is a pie chart depicting the distribution of hours worked per week among a group of individuals. The chart is divided into four segments: 60+ hours (37.8%), 50-60 hours (36.6%), 40-50 hours (18.9%), and 1-39 hours (6.7%). Each segment is labeled with its corresponding percentage and color-coded for clarity."
          }
        },
        "kind": "document",
        "startPageNumber": 1,
        "endPageNumber": 1,
        "unit": "pixel",
        "pages": [
          {
            "pageNumber": 1,
            "spans": []
          }
        ]
      }
    ]
  }
}

{
  "id": "<request-id>",
  "status": "Succeeded",
  "result": {
    "analyzerId": "prebuilt-audioAnalyzer",
    "apiVersion": "2025-05-01-preview",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "stringEncoding": "utf8",
    "warnings": [],
    "contents": [
      {
        "markdown": "# Audio: 00:00.000 => 01:54.670\n\nTranscript\n```\nWEBVTT\n\n00:00.080 --> 00:02.160\n<v Speaker 1>Thank you for calling Woodgrove Travel...",
        "fields": {
          "Summary": {
            "type": "string",
            "valueString": "John Smith contacted Woodgrove Travel to report a negative experience with his flight from New York City to Los Angeles..."
          }
        },
        "kind": "audioVisual",
        "startTimeMs": 0,
        "endTimeMs": 114670,
        "transcriptPhrases": [
          {
            "speaker": "Speaker 1",
            "startTimeMs": 80,
            "endTimeMs": 2160,
            "text": "Thank you for calling Woodgrove Travel.",
            "words": [
              {
                "startTimeMs": 80,
                "endTimeMs": 280,
                "text": "Thank"
              }, ...
            ]
          }, ...
        ]
      }
    ]
  }
}

{
  "id": "<request-id>",
  "status": "Succeeded",
  "result": {
    "analyzerId": "prebuilt-videoAnalyzer",
    "apiVersion": "2025-05-01-preview",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "markdown": "# Video: 00:00.000 => 00:43.866\nWidth: 1080\nHeight: 608\n\n## Segment 1: 00:00.000 => 00:07.367\nThe video begins with a scenic aerial view featuring the Flight Simulator and Microsoft Azure AI logos...\n\nTranscript\n```\nWEBVTT\n\n00:01.400 --> 00:06.560\n<Speaker 1 Speaker>When it comes to the neural TTS, in order to get a good voice, it's better to have good data.\n```\n\nKey Frames\n- 00:00.726 ![](keyFrame.726.jpg)...",
        "fields": {
          "Segments": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  "SegmentId": {
                    "type": "string",
                    "valueString": "1"
                  }
                }
              }, ...
            ]
          }
        },
        "kind": "audioVisual",
        "startTimeMs": 0,
        "endTimeMs": 43866,
        "width": 1080,
        "height": 608,
        "KeyFrameTimesMs": [ 726, 2046, ... ],
        "transcriptPhrases": [
          {
            "speaker": "Speaker 1",
            "startTimeMs": 1400,
            "endTimeMs": 6560,
            "text": "When it comes to the neural TTS, in order to get a good voice, it's better to have good data.",
            "words": []
          }, ...
        ],
        "cameraShotTimesMs": [ 1467, 3233, ... ],
        "segments": [
          {
            "startTimeMs": 0,
            "endTimeMs": 7367,
            "description": "The video begins with a scenic aerial view featuring the Flight Simulator and Microsoft Azure AI logos...",
            "segmentId": "1"
          }, ...
        ]
      }
    ]
  }
}
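
Once a response with status Succeeded is available, the fields shown in the samples above can be read with ordinary dictionary access. The following is a small sketch, assuming the payload has the same shape as those samples; summarize_result is a hypothetical helper name, and the Summary field is treated as optional because the video sample returns Segments instead.

```python
def summarize_result(payload: dict) -> list[dict]:
    """Collect kind, markdown, and the Summary string for each content item.

    The Summary entry is None when the analyzer did not return that field.
    """
    items = []
    for content in payload.get("result", {}).get("contents", []):
        summary = content.get("fields", {}).get("Summary", {}).get("valueString")
        items.append(
            {
                "kind": content.get("kind"),
                "markdown": content.get("markdown", ""),
                "summary": summary,
            }
        )
    return items
```

Using get with defaults throughout means a response that lacks contents or fields yields an empty or partial result rather than raising a KeyError.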

Next steps

Learn more about creating a custom analyzer for your use case.