Azure OpenAI Responses API

2025-09-19

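The Responses API is a new stateful API from Azure OpenAI. It brings together the best capabilities of the chat completions and assistants APIs in one unified experience. The Responses API also adds support for the new computer-use-preview model, which powers the computer use capability.

Responses API

API support

The v1 API is required for access to the latest features.

Region availability

The Responses API is currently available in the following regions: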

australiaeast
eastus
eastus2
francecentral
japaneast
norwayeast
polandcentral
southindia
swedencentral
switzerlandnorth
uaenorth
uksouth
westus
westus3

Model support

gpt-5-codex (Version: 2025-09-11)
gpt-5 (Version: 2025-08-07)
gpt-5-mini (Version: 2025-08-07)
gpt-5-nano (Version: 2025-08-07)
gpt-5-chat (Version: 2025-08-07)
gpt-5-chat (Version: 2025-10-03)
gpt-5-codex (Version: 2025-09-15)
gpt-4o (Versions: 2024-11-20, 2024-08-06, 2024-05-13)
gpt-4o-mini (Version: 2024-07-18)
computer-use-preview
gpt-4.1 (Version: 2025-04-14)
gpt-4.1-nano (Version: 2025-04-14)
gpt-4.1-mini (Version: 2025-04-14)
gpt-image-1 (Version: 2025-04-15)
gpt-image-1-mini (Version: 2025-10-06)
o1 (Version: 2024-12-17)
o3-mini (Version: 2025-01-31)
o3 (Version: 2025-04-16)
o4-mini (Version: 2025-04-16)

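Not every model is available in every region supported by the Responses API. Check the models page for model region availability.

Note

Not currently supported:

The web search tool
Image generation using multi-turn editing and streaming (coming soon)
Images can't be uploaded as a file and then referenced as input (coming soon)

There are known issues with the following:

PDF is now supported as an input file, but setting the file upload purpose to user_data is not currently supported.
Using background mode together with streaming causes performance issues. This issue is expected to be resolved soon.

Reference documentation

Responses API reference documentation

Getting started with the Responses API

To access the Responses API commands, you need to upgrade your version of the OpenAI library.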

pip install --upgrade openai

Generate a text response

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
)

response = client.responses.create(   
  model="gpt-4.1-nano", # Replace with your model deployment name 
  input="This is a test.",
)

print(response.model_dump_json(indent=2))

Important

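Use API keys with caution. Don't include the API key directly in your code, and never post it publicly. If you use an API key, store it securely in Azure Key Vault. For more information about using API keys securely in your apps, see API keys with Azure Key Vault.

For more information about AI services security, see Authenticate requests to Azure AI services.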

from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
  api_key=token_provider,
)

response = client.responses.create(
    model="gpt-4.1-nano",
    input= "This is a test" 
)

print(response.model_dump_json(indent=2))

Microsoft Entra ID

curl -X POST https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
     "model": "gpt-4o",
     "input": "This is a test"
    }'

API key

curl -X POST https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -d '{
     "model": "gpt-4.1-nano",
     "input": "This is a test"
    }'

Output:

{
  "id": "resp_67cb32528d6881909eb2859a55e18a85",
  "created_at": 1741369938.0,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "gpt-4o-2024-08-06",
  "object": "response",
  "output": [
    {
      "id": "msg_67cb3252cfac8190865744873aada798",
      "content": [
        {
          "annotations": [],
          "text": "Great! How can I help you today?",
          "type": "output_text"
        }
      ],
      "role": "assistant",
      "status": null,
      "type": "message"
    }
  ],
  "output_text": "Great! How can I help you today?",
  "parallel_tool_calls": null,
  "temperature": 1.0,
  "tool_choice": null,
  "tools": [],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": null,
  "status": "completed",
  "text": null,
  "truncation": null,
  "usage": {
    "input_tokens": 20,
    "output_tokens": 11,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 31
  },
  "user": null,
  "reasoning_effort": null
}
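The sample output above nests the generated text inside output, then content, then parts of type output_text. As a minimal sketch of how that structure can be walked, the snippet below parses a trimmed copy of the sample JSON. The helper name collect_output_text is hypothetical and not part of the OpenAI library; when you use the Python client, response.output_text already provides the same convenience.

```python
import json

# Trimmed copy of the sample output above; only the fields used here are kept.
SAMPLE = """
{
  "id": "resp_67cb32528d6881909eb2859a55e18a85",
  "status": "completed",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {"type": "output_text", "text": "Great! How can I help you today?", "annotations": []}
      ]
    }
  ]
}
"""


def collect_output_text(response: dict) -> str:
    """Concatenate every output_text part found in the message items (hypothetical helper)."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip tool calls, reasoning items, and other non-message output
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)


response = json.loads(SAMPLE)
print(collect_output_text(response))
```

Checking the item and part types before reading text keeps the helper safe when the output also contains tool calls or reasoning items.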

Retrieve a response

To retrieve a response from a previous call to the Responses API:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
)

response = client.responses.retrieve("resp_67cb61fa3a448190bcf2c42d96f0d1a8")

Important

For more information about AI services security, see Authenticate requests to Azure AI services.

from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
  api_key=token_provider,
)

response = client.responses.retrieve("resp_67cb61fa3a448190bcf2c42d96f0d1a8")

print(response.model_dump_json(indent=2))

Microsoft Entra ID

curl -X GET https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/{response_id} \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"

API key

curl -X GET https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/{response_id} \
  -H "Content-Type: application/json" \
  -H "api-key: $AZURE_OPENAI_API_KEY"

{
  "id": "resp_67cb61fa3a448190bcf2c42d96f0d1a8",
  "created_at": 1741382138.0,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "gpt-4o-2024-08-06",
  "object": "response",
  "output": [
    {
      "id": "msg_67cb61fa95588190baf22ffbdbbaaa9d",
      "content": [
        {
          "annotations": [],
          "text": "Hello! How can I assist you today?",
          "type": "output_text"
        }
      ],
      "role": "assistant",
      "status": null,
      "type": "message"
    }
  ],
  "parallel_tool_calls": null,
  "temperature": 1.0,
  "tool_choice": null,
  "tools": [],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": null,
  "status": "completed",
  "text": null,
  "truncation": null,
  "usage": {
    "input_tokens": 20,
    "output_tokens": 11,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 31
  },
  "user": null,
  "reasoning_effort": null
}

Delete a response

By default, response data is retained for 30 days. To delete a response, call client.responses.delete("{response_id}").

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.delete("resp_67cb61fa3a448190bcf2c42d96f0d1a8")

print(response)

Chaining responses together

You can chain responses together by taking the response.id from the previous response and passing it as the previous_response_id parameter.

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4o",  # replace with your model deployment name
    input="Define and explain the concept of catastrophic forgetting?"
)

second_response = client.responses.create(
    model="gpt-4o",  # replace with your model deployment name
    previous_response_id=response.id,
    input=[{"role": "user", "content": "Explain this at a level that could be understood by a college freshman"}]
)
print(second_response.model_dump_json(indent=2))

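Note from the output that even though the first input question was never sent with the second_response API call, passing previous_response_id gives the model the full context of the previous question and response when it answers the new question.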

Output:

{
  "id": "resp_67cbc9705fc08190bbe455c5ba3d6daf",
  "created_at": 1741408624.0,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "metadata": {},
  "model": "gpt-4o-2024-08-06",
  "object": "response",
  "output": [
    {
      "id": "msg_67cbc970fd0881908353a4298996b3f6",
      "content": [
        {
          "annotations": [],
          "text": "Sure! Imagine you are studying for exams in different subjects like math, history, and biology. You spend a lot of time studying math first and get really good at it. But then, you switch to studying history. If you spend all your time and focus on history, you might forget some of the math concepts you learned earlier because your brain fills up with all the new history facts. \n\nIn the world of artificial intelligence (AI) and machine learning, a similar thing can happen with computers. We use special programs called neural networks to help computers learn things, sort of like how our brain works. But when a neural network learns a new task, it can forget what it learned before. This is what we call \"catastrophic forgetting.\"\n\nSo, if a neural network learned how to recognize cats in pictures, and then you teach it how to recognize dogs, it might get really good at recognizing dogs but suddenly become worse at recognizing cats. This happens because the process of learning new information can overwrite or mess with the old information in its \"memory.\"\n\nScientists and engineers are working on ways to help computers remember everything they learn, even as they keep learning new things, just like students have to remember math, history, and biology all at the same time for their exams. They use different techniques to make sure the neural network doesn’t forget the important stuff it learned before, even when it gets new information.",
          "type": "output_text"
        }
      ],
      "role": "assistant",
      "status": null,
      "type": "message"
    }
  ],
  "parallel_tool_calls": null,
  "temperature": 1.0,
  "tool_choice": null,
  "tools": [],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": "resp_67cbc96babbc8190b0f69aedc655f173",
  "reasoning": null,
  "status": "completed",
  "text": null,
  "truncation": null,
  "usage": {
    "input_tokens": 405,
    "output_tokens": 285,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 690
  },
  "user": null,
  "reasoning_effort": null
}
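The chaining pattern generalizes to any number of turns: each call passes the id returned by the previous one. The sketch below illustrates that loop. The function run_chained_turns and the fake_create stand-in are hypothetical and exist only so the example runs without a network call; with a real client you would pass client.responses.create as the create argument.

```python
from types import SimpleNamespace


def run_chained_turns(create, model, prompts):
    """Send prompts in order, linking each call to the previous response id."""
    previous_id = None
    responses = []
    for prompt in prompts:
        kwargs = {"model": model, "input": prompt}
        if previous_id is not None:
            kwargs["previous_response_id"] = previous_id
        response = create(**kwargs)
        responses.append(response)
        previous_id = response.id
    return responses


# Stand-in for client.responses.create that records the arguments it receives.
calls = []


def fake_create(**kwargs):
    calls.append(kwargs)
    return SimpleNamespace(id=f"resp_{len(calls)}")


responses = run_chained_turns(
    fake_create,
    "gpt-4o",  # replace with your model deployment name
    ["Define catastrophic forgetting.", "Explain it for a college freshman."],
)
print([call.get("previous_response_id") for call in calls])
```

Only the first call omits previous_response_id; every later call carries the id of the turn before it.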

Chaining responses manually

Alternatively, you can chain responses together manually using the method below:

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)


inputs = [{"type": "message", "role": "user", "content": "Define and explain the concept of catastrophic forgetting?"}] 
  
response = client.responses.create(  
    model="gpt-4o",  # replace with your model deployment name  
    input=inputs  
)  
  
inputs += response.output

inputs.append({"role": "user", "type": "message", "content": "Explain this at a level that could be understood by a college freshman"}) 
               

second_response = client.responses.create(  
    model="gpt-4o",  
    input=inputs
)  
      
print(second_response.model_dump_json(indent=2))

Streaming

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    input = "This is a test",
    model = "o4-mini", # replace with model deployment name
    stream = True
)

for event in response:
    if event.type == 'response.output_text.delta':
        print(event.delta, end='')
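If you need the complete text after streaming finishes, you can accumulate the deltas instead of only printing them. The following is a minimal sketch under that assumption. The helper collect_stream_text is hypothetical, and the SimpleNamespace objects are stand-ins for the event objects that the client yields, so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream_text(events) -> str:
    """Join the text deltas from a stream of events, ignoring other event types."""
    chunks = []
    for event in events:
        if event.type == "response.output_text.delta":
            chunks.append(event.delta)
    return "".join(chunks)


# Stand-in events; a real stream comes from client.responses.create(..., stream=True).
fake_events = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Hello"),
    SimpleNamespace(type="response.output_text.delta", delta=", world"),
    SimpleNamespace(type="response.completed"),
]

print(collect_stream_text(fake_events))
```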

Function calling

The Responses API supports function calling.

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(  
    model="gpt-4o",  # replace with your model deployment name  
    tools=[  
        {
            "type": "function",
            "name": "get_weather",
            "description": "Get the weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                },
                "required": ["location"],
            },
        }
    ],  
    input=[{"role": "user", "content": "What's the weather in San Francisco?"}],  
)  

print(response.model_dump_json(indent=2))  
  
# To provide output to tools, add a response for each tool call to an array passed  
# to the next response as `input`  
input = []  
for output in response.output:  
    if output.type == "function_call":  
        match output.name:  
            case "get_weather":  
                input.append(  
                    {  
                        "type": "function_call_output",  
                        "call_id": output.call_id,  
                        "output": '{"temperature": "70 degrees"}',  
                    }  
                )  
            case _:  
                raise ValueError(f"Unknown function call: {output.name}")  
  
second_response = client.responses.create(  
    model="gpt-4o",  
    previous_response_id=response.id,  
    input=input  
)  

print(second_response.model_dump_json(indent=2))
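The example above returns a hard-coded tool result. In practice you would usually parse the arguments the model supplied and run a real function. The sketch below shows one way to do that with a dispatch table. The get_weather implementation, the HANDLERS mapping, and build_tool_outputs are all hypothetical, and the SimpleNamespace item stands in for a function_call output item so the example runs offline.

```python
import json
from types import SimpleNamespace


def get_weather(location: str) -> dict:
    """Hypothetical local implementation that returns a canned value."""
    return {"location": location, "temperature": "70 degrees"}


# Maps tool names declared in the request to local Python callables.
HANDLERS = {"get_weather": get_weather}


def build_tool_outputs(output_items) -> list:
    """Run each function_call item and build the matching function_call_output items."""
    results = []
    for item in output_items:
        if item.type != "function_call":
            continue
        handler = HANDLERS.get(item.name)
        if handler is None:
            raise ValueError(f"Unknown function call: {item.name}")
        arguments = json.loads(item.arguments)  # arguments arrive as a JSON string
        results.append(
            {
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": json.dumps(handler(**arguments)),
            }
        )
    return results


fake_output = [
    SimpleNamespace(
        type="function_call",
        name="get_weather",
        call_id="call_123",
        arguments='{"location": "San Francisco"}',
    )
]
tool_outputs = build_tool_outputs(fake_output)
print(tool_outputs)
```

The resulting list can be passed as input to the follow-up request together with previous_response_id, as in the example above.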

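Code Interpreter

The Code Interpreter tool enables models to write and execute Python code in a secure, sandboxed environment. It supports a range of advanced tasks, including:

Processing files with varied data formats and structures
Generating files that include data and visualizations (for example, graphs)
Iteratively writing and running code to solve problems; the model can debug and retry code until it succeeds
Enhancing visual reasoning in supported models (for example, o3, o4-mini) by enabling image transformations such as cropping, zooming, and rotation

This tool is especially useful for scenarios involving data analysis, mathematical computation, and code generation.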

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses?api-version=preview \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
        "model": "gpt-4.1",
        "tools": [
            { "type": "code_interpreter", "container": {"type": "auto"} }
        ],
        "instructions": "You are a personal math tutor. When asked a math question, write and run code using the python tool to answer the question.",
        "input": "I need to solve the equation 3x + 11 = 14. Can you help me?"
    }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

instructions = "You are a personal math tutor. When asked a math question, write and run code using the python tool to answer the question."

response = client.responses.create(
    model="gpt-4.1",
    tools=[
        {
            "type": "code_interpreter",
            "container": {"type": "auto"}
        }
    ],
    instructions=instructions,
    input="I need to solve the equation 3x + 11 = 14. Can you help me?",
)

print(response.output)

Containers

Important

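Code Interpreter incurs additional charges beyond the token-based fees for Azure OpenAI usage. If the Responses API calls Code Interpreter simultaneously in two different threads, two Code Interpreter sessions are created. Each session is active by default for 1 hour with an idle timeout of 20 minutes.

The Code Interpreter tool requires a container: a fully sandboxed virtual machine where the model can execute Python code. Containers can include uploaded files or files generated during execution.

To create a container, specify "container": { "type": "auto", "file_ids": ["file-1", "file-2"] } in the tool configuration when creating a new Response object. This automatically creates a new container, or reuses an active container that was used by a previous code_interpreter_call in the model's context. The code_interpreter_call in the API output contains the container_id that was generated. The container expires if it is not used for 20 minutes.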
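As a rough sketch of the container configuration described above, the helpers below build the tool entry and read container IDs back from a response's output items. Both function names are hypothetical, the file IDs are placeholders, and the SimpleNamespace objects stand in for real output items so the example runs offline.

```python
from types import SimpleNamespace


def code_interpreter_tool(file_ids=None) -> dict:
    """Build a code_interpreter tool entry that uses an auto-created container."""
    container = {"type": "auto"}
    if file_ids:
        container["file_ids"] = list(file_ids)
    return {"type": "code_interpreter", "container": container}


def container_ids(output_items) -> list:
    """Return the distinct container_id values from code_interpreter_call items."""
    seen = []
    for item in output_items:
        if item.type == "code_interpreter_call" and item.container_id not in seen:
            seen.append(item.container_id)
    return seen


tool = code_interpreter_tool(["file-1", "file-2"])  # placeholder file IDs
fake_output = [
    SimpleNamespace(type="code_interpreter_call", container_id="cntr_abc"),
    SimpleNamespace(type="message"),
    SimpleNamespace(type="code_interpreter_call", container_id="cntr_abc"),
]
print(tool)
print(container_ids(fake_output))
```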

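File inputs and outputs

When running Code Interpreter, the model can create its own files. For example, if you ask it to generate a plot or create a CSV, it creates those files directly in your container. It cites these files in the annotations of its next message.

Any files in the model input are automatically uploaded to the container. You do not have to upload them to the container explicitly.

Supported files

File format	MIME type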
`.c`	text/x-c
`.cs`	text/x-csharp
`.cpp`	text/x-c++
`.csv`	text/csv
`.doc`	application/msword
`.docx`	application/vnd.openxmlformats-officedocument.wordprocessingml.document
`.html`	text/html
`.java`	text/x-java
`.json`	application/json
`.md`	text/markdown
`.pdf`	application/pdf
`.php`	text/x-php
`.pptx`	application/vnd.openxmlformats-officedocument.presentationml.presentation
`.py`	text/x-python
`.py`	text/x-script.python
`.rb`	text/x-ruby
`.tex`	text/x-tex
`.txt`	text/plain
`.css`	text/css
`.js`	text/javascript
`.sh`	application/x-sh
`.ts`	application/typescript
`.csv`	application/csv
`.jpeg`	image/jpeg
`.jpg`	image/jpeg
`.gif`	image/gif
`.pkl`	application/octet-stream
`.png`	image/png
`.tar`	application/x-tar
`.xlsx`	application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
`.xml`	application/xml or text/xml
`.zip`	application/zip
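Before uploading a file for Code Interpreter, it can help to check its extension against the table above. The snippet below is a minimal sketch of such a check. It copies only a subset of the table for brevity, and the helper mime_type_for is hypothetical.

```python
from pathlib import Path

# Subset of the supported-files table above; extend it with the remaining rows as needed.
SUPPORTED_MIME_TYPES = {
    ".csv": "text/csv",
    ".json": "application/json",
    ".md": "text/markdown",
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".py": "text/x-python",
    ".txt": "text/plain",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".zip": "application/zip",
}


def mime_type_for(path: str):
    """Return the MIME type for a supported extension, or None if it is not listed."""
    return SUPPORTED_MIME_TYPES.get(Path(path).suffix.lower())


for name in ["report.PDF", "data.csv", "movie.mp4"]:
    print(name, mime_type_for(name))
```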

List input items

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.input_items.list("resp_67d856fcfba0819081fd3cffee2aa1c0")

print(response.model_dump_json(indent=2))

Output:

{
  "data": [
    {
      "id": "msg_67d856fcfc1c8190ad3102fc01994c5f",
      "content": [
        {
          "text": "This is a test.",
          "type": "input_text"
        }
      ],
      "role": "user",
      "status": "completed",
      "type": "message"
    }
  ],
  "has_more": false,
  "object": "list",
  "first_id": "msg_67d856fcfc1c8190ad3102fc01994c5f",
  "last_id": "msg_67d856fcfc1c8190ad3102fc01994c5f"
}

Image input

Image URL

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4o",
    input=[
        {
            "role": "user",
            "content": [
                { "type": "input_text", "text": "what is in this image?" },
                {
                    "type": "input_image",
                    "image_url": "<image_URL>"
                }
            ]
        }
    ]
)

print(response)

Base64-encoded image

import base64
import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

# Path to your image
image_path = "path_to_your_image.jpg"

# Getting the Base64 string
base64_image = encode_image(image_path)

response = client.responses.create(
    model="gpt-4o",
    input=[
        {
            "role": "user",
            "content": [
                { "type": "input_text", "text": "what is in this image?" },
                {
                    "type": "input_image",
                    "image_url": f"data:image/jpeg;base64,{base64_image}"
                }
            ]
        }
    ]
)

print(response)
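The example above hard-codes image/jpeg in the data URL. As a sketch of a slightly more general approach, the helper below infers the MIME type from the file name using the standard library's mimetypes module. The function name image_content_part is hypothetical, and the bytes used in the demo are a placeholder rather than a real image.

```python
import base64
import mimetypes


def image_content_part(image_bytes: bytes, filename: str) -> dict:
    """Build an input_image content part with a data URL matching the file type."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {filename}")
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {"type": "input_image", "image_url": f"data:{mime_type};base64,{encoded}"}


# Placeholder bytes; in practice read them with open(path, "rb").read().
part = image_content_part(b"\x89PNG placeholder", "chart.png")
print(part["image_url"][:30])
```

The returned dictionary can be placed in the content list of a user message next to an input_text part.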

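File input

Models with vision capabilities support PDF input. PDF files can be provided either as Base64-encoded data or as file IDs. To help the model interpret PDF content, both the extracted text and an image of each page are included in the model's context. This is useful when key information is conveyed through diagrams or other non-textual content.

Note

All extracted text and images are placed in the model's context. Make sure you understand the pricing and token usage implications of using PDFs as input.
In a single API request, the size of the content uploaded across multiple inputs (files) must be within the model's context length.
Only models that support both text and image inputs can accept PDF files as input.
A purpose of user_data is not currently supported. As a temporary workaround, set the purpose to assistants.

Convert PDF to Base64 and analyze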

import base64
import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

with open("PDF-FILE-NAME.pdf", "rb") as f: # assumes PDF is in the same directory as the executing script
    data = f.read()

base64_string = base64.b64encode(data).decode("utf-8")

response = client.responses.create(
    model="gpt-4o-mini", # model deployment name
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "filename": "PDF-FILE-NAME.pdf",
                    "file_data": f"data:application/pdf;base64,{base64_string}",
                },
                {
                    "type": "input_text",
                    "text": "Summarize this PDF",
                },
            ],
        },
    ]
)

print(response.output_text)

Upload PDF and analyze

Upload the PDF file. A purpose of user_data is not currently supported. As a workaround, set the purpose to assistants.

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)


# Upload a file with a purpose of "assistants"
file = client.files.create(
  file=open("nucleus_sampling.pdf", "rb"), # This assumes a .pdf file in the same directory as the executing script
  purpose="assistants"
)

print(file.model_dump_json(indent=2))
file_id = file.id

Output:

{
  "id": "assistant-KaVLJQTiWEvdz8yJQHHkqJ",
  "bytes": 4691115,
  "created_at": 1752174469,
  "filename": "nucleus_sampling.pdf",
  "object": "file",
  "purpose": "assistants",
  "status": "processed",
  "expires_at": null,
  "status_details": null
}

Next, take the value of the id and pass it to the model as the file_id for processing:

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4o-mini",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "file_id":"assistant-KaVLJQTiWEvdz8yJQHHkqJ"
                },
                {
                    "type": "input_text",
                    "text": "Summarize this PDF",
                },
            ],
        },
    ]
)

print(response.output_text)

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/files \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -F purpose="assistants" \
  -F file="@your_file.pdf"

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
        "model": "gpt-4.1",
        "input": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "input_file",
                        "file_id": "assistant-123456789"
                    },
                    {
                        "type": "input_text",
                        "text": "ASK SOME QUESTION RELATED TO UPLOADED PDF"
                    }
                ]
            }
        ]
    }'

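Using remote MCP servers

You can extend the capabilities of your model by connecting it to tools hosted on remote Model Context Protocol (MCP) servers. These servers are maintained by developers and organizations and expose tools that can be accessed by MCP-compatible clients, such as the Responses API.

Model Context Protocol (MCP) is an open standard that defines how applications provide tools and contextual data to large language models (LLMs). It enables consistent, scalable integration of external tools into model workflows.

The following example demonstrates how to use a fictitious MCP server to query information about the Azure REST API. This allows the model to retrieve and reason over repository content in real time.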

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
  "model": "gpt-4.1",
  "tools": [
    {
      "type": "mcp",
      "server_label": "github",
      "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
      "require_approval": "never"
    }
  ],
  "input": "What is this repo in 100 words?"
}'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)
response = client.responses.create(
    model="gpt-4.1", # replace with your model deployment name 
    tools=[
        {
            "type": "mcp",
            "server_label": "github",
            "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
            "require_approval": "never"
        },
    ],
    input="What transport protocols are supported in the 2025-03-26 version of the MCP spec?",
)

print(response.output_text)

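The MCP tool works only in the Responses API and is available across all newer models (gpt-4o, gpt-4.1, and the reasoning models). When you use the MCP tool, you pay only for the tokens used when importing tool definitions or making tool calls. There are no additional fees.

Approvals

By default, the Responses API requires explicit approval before any data is shared with a remote MCP server. This approval step helps ensure transparency and gives you control over what information is sent externally.

We recommend reviewing all data shared with remote MCP servers and optionally logging it for auditing purposes.

When an approval is required, the model returns an mcp_approval_request item in the response output. This object contains the details of the pending request and lets you inspect or modify the data before proceeding.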

{
  "id": "mcpr_682bd9cd428c8198b170dc6b549d66fc016e86a03f4cc828",
  "type": "mcp_approval_request",
  "arguments": {},
  "name": "fetch_azure_rest_api_docs",
  "server_label": "github"
}

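To proceed with the remote MCP call, you must respond to the approval request by creating a new response object that includes an mcp_approval_response item. This object confirms your intent to allow the model to send the specified data to the remote MCP server.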

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
  "model": "gpt-4.1",
  "tools": [
    {
      "type": "mcp",
      "server_label": "github",
      "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
      "require_approval": "never"
    }
  ],
  "previous_response_id": "resp_682f750c5f9c8198aee5b480980b5cf60351aee697a7cd77",
  "input": [{
    "type": "mcp_approval_response",
    "approve": true,
    "approval_request_id": "mcpr_682bd9cd428c8198b170dc6b549d66fc016e86a03f4cc828"
  }]
}'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4.1", # replace with your model deployment name 
    tools=[
        {
            "type": "mcp",
            "server_label": "github",
            "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
            "require_approval": "never"
        },
    ],
    previous_response_id="resp_682f750c5f9c8198aee5b480980b5cf60351aee697a7cd77",
    input=[{
        "type": "mcp_approval_response",
        "approve": True,
        "approval_request_id": "mcpr_682bd9cd428c8198b170dc6b549d66fc016e86a03f4cc828"
    }],
)
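The approval flow above can be automated with a small policy function. The sketch below scans a response's output for mcp_approval_request items and builds the matching mcp_approval_response inputs. The helper approval_responses and the allow-list policy are hypothetical, and the SimpleNamespace object stands in for a real output item so the example runs offline.

```python
from types import SimpleNamespace

# Hypothetical allow-list of MCP tool names that may run without manual review.
ALLOWED_TOOLS = {"fetch_azure_rest_api_docs"}


def approval_responses(output_items, is_allowed) -> list:
    """Build one mcp_approval_response item per pending mcp_approval_request."""
    responses = []
    for item in output_items:
        if item.type != "mcp_approval_request":
            continue
        responses.append(
            {
                "type": "mcp_approval_response",
                "approval_request_id": item.id,
                "approve": bool(is_allowed(item)),
            }
        )
    return responses


fake_output = [
    SimpleNamespace(
        type="mcp_approval_request",
        id="mcpr_682bd9cd428c8198b170dc6b549d66fc016e86a03f4cc828",
        name="fetch_azure_rest_api_docs",
        server_label="github",
        arguments={},
    )
]
approvals = approval_responses(fake_output, lambda item: item.name in ALLOWED_TOOLS)
print(approvals)
```

The resulting list is what you would pass as input, together with previous_response_id, in the follow-up request. Logging each request before approving it supports the auditing recommendation above.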

Authentication

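Unlike the GitHub MCP server, most remote MCP servers require authentication. The MCP tool in the Responses API supports custom headers, so you can connect to these servers securely using the authentication scheme they require.

You can specify headers such as API keys, OAuth access tokens, or other credentials directly in your request. The most commonly used header is the Authorization header.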

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
        "model": "gpt-4.1",
        "input": "What is this repo in 100 words?",
        "tools": [
            {
                "type": "mcp",
                "server_label": "github",
                "server_url": "https://contoso.com/Azure/azure-rest-api-specs",
                "headers": {
                    "Authorization": "Bearer $YOUR_API_KEY"
                }
            }
        ]
    }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model="gpt-4.1",
    input="What is this repo in 100 words?",
    tools=[
        {
            "type": "mcp",
            "server_label": "github",
            "server_url": "https://gitmcp.io/Azure/azure-rest-api-specs",
            "headers": {
                "Authorization": "Bearer $YOUR_API_KEY"  # replace with your MCP server credential
            }
        }
    ]
)

print(response.output_text)
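Rather than embedding a credential in the request literal, you may prefer to read it from the environment when building the tool entry. The following is a minimal sketch under that assumption. The helper mcp_tool and the MCP_SERVER_TOKEN variable name are hypothetical; the dictionary keys match the MCP tool examples above.

```python
import os


def mcp_tool(server_label: str, server_url: str, token=None, require_approval=None) -> dict:
    """Build an MCP tool entry, adding an Authorization header only when a token is given."""
    tool = {"type": "mcp", "server_label": server_label, "server_url": server_url}
    if token:
        tool["headers"] = {"Authorization": f"Bearer {token}"}
    if require_approval is not None:
        tool["require_approval"] = require_approval
    return tool


# MCP_SERVER_TOKEN is a hypothetical environment variable name.
tool = mcp_tool(
    "github",
    "https://contoso.com/Azure/azure-rest-api-specs",
    token=os.getenv("MCP_SERVER_TOKEN"),
)
print(sorted(tool))
```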

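Background tasks

Background mode lets you run long-running tasks asynchronously using models such as o3 and o1-pro. This is especially useful for complex reasoning tasks that can take several minutes to complete, such as those handled by agents like Codex or Deep Research.

Enabling background mode helps you avoid timeouts and maintain reliability during extended operations. When a request is sent with "background": true, the task is processed asynchronously, and you can poll for its status over time.

To start a background task, set the background parameter to true in your request: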

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
    "model": "o3",
    "input": "Write me a very long story",
    "background": true
  }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model = "o3",
    input = "Write me a very long story",
    background = True
)

print(response.status)

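Use the GET endpoint to check the status of a background response. Continue polling while the status is queued or in_progress. Once the response reaches a final (terminal) state, it is available for retrieval.

curl -X GET https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/resp_1234567890 \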
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"

from time import sleep
import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.create(
    model = "o3",
    input = "Write me a very long story",
    background = True
)

while response.status in {"queued", "in_progress"}:
    print(f"Current status: {response.status}")
    sleep(2)
    response = client.responses.retrieve(response.id)

print(f"Final status: {response.status}\nOutput:\n{response.output_text}")
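The polling loop above sleeps for a fixed two seconds and never gives up. For longer tasks, one option is to back off between polls and enforce an overall timeout. The sketch below shows that variation. The function wait_for_response and its parameters are hypothetical, and the fake retrieve callable stands in for client.responses.retrieve so the example runs offline.

```python
import time
from types import SimpleNamespace


def wait_for_response(retrieve, response_id, timeout=600.0,
                      initial_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Poll until the response leaves the queued/in_progress states or the timeout elapses."""
    delay = initial_delay
    deadline = time.monotonic() + timeout
    while True:
        response = retrieve(response_id)
        if response.status not in {"queued", "in_progress"}:
            return response
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"{response_id} still {response.status} after {timeout} seconds")
        sleep(delay)
        delay = min(delay * 2, max_delay)  # exponential backoff, capped at max_delay


# Offline demo: statuses are served from a list and sleeps are recorded instead of awaited.
statuses = iter(["queued", "in_progress", "completed"])
delays = []
final = wait_for_response(
    retrieve=lambda response_id: SimpleNamespace(id=response_id, status=next(statuses)),
    response_id="resp_1234567890",
    sleep=delays.append,
)
print(final.status, delays)
```

With a real client, pass client.responses.retrieve as retrieve and leave sleep at its default.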

You can cancel an in-progress background task using the cancel endpoint. Canceling is idempotent: subsequent calls return the final response object.

curl -X POST https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/resp_1234567890/cancel \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

response = client.responses.cancel("resp_1234567890")

print(response.status)

Stream a background response

To stream a background response, set both background and stream to true. This is useful if you want to resume streaming later after a dropped connection. Use the sequence_number from each event to track your position.

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
    "model": "o3",
    "input": "Write me a very long story",
    "background": true,
    "stream": true
  }'

import os
from openai import OpenAI

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
  api_key=os.getenv("AZURE_OPENAI_API_KEY")  
)

# Fire off an async response but also start streaming immediately
stream = client.responses.create(
    model="o3",
    input="Write me a very long story",
    background=True,
    stream=True,
)

cursor = None
for event in stream:
    print(event)
    cursor = event.sequence_number  # events are objects, so use attribute access

Note

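Background responses currently have a higher time-to-first-token latency than synchronous responses. Improvements are underway to reduce this gap.

Limitations

Background mode requires store=true. Stateless requests are not supported.
You can resume streaming only if the original request included stream=true.
To cancel a synchronous response, terminate the connection directly.

Resume streaming from a specific point

curl "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/resp_1234567890?stream=true&starting_after=42" \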
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN"
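To connect the cursor tracked while streaming with the resume request above, the sketch below builds the resume URL from the last sequence number you processed. The helper resume_stream_url is hypothetical; the stream and starting_after query parameters are the ones shown in the curl example.

```python
from urllib.parse import urlencode


def resume_stream_url(base_url: str, response_id: str, last_sequence_number: int) -> str:
    """Build the GET URL that resumes a background stream after the given event."""
    query = urlencode({"stream": "true", "starting_after": last_sequence_number})
    return f"{base_url.rstrip('/')}/responses/{response_id}?{query}"


url = resume_stream_url(
    "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
    "resp_1234567890",
    42,
)
print(url)
```

When you pass this URL to a shell command, quote it so that the & character is not interpreted by the shell.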

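Encrypted reasoning items

When you use the Responses API in stateless mode, either by setting store to false or because your organization is enrolled in zero data retention, you must still preserve reasoning context across conversation turns. To do this, include encrypted reasoning items in your API requests.

To retain reasoning items across turns, add reasoning.encrypted_content to the include parameter in your request. This ensures that the response includes an encrypted version of the reasoning trace, which can be passed along in future requests.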

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
    "model": "o4-mini",
    "reasoning": {"effort": "medium"},
    "input": "What is the weather like today?",
    "tools": [<YOUR_FUNCTION GOES HERE>],
    "include": ["reasoning.encrypted_content"]
  }'
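As a rough sketch of how a stateless conversation might carry reasoning forward, the helpers below build the request arguments with store set to false and assemble the next turn's input from the previous input, the previous output items, and the new user message. The function names are hypothetical, and the dictionaries in the demo are simplified stand-ins for real output items, so treat this as an illustration of the data flow rather than a definitive implementation.

```python
def stateless_request(model: str, input_items: list, **extra) -> dict:
    """Build keyword arguments for a stateless call that returns encrypted reasoning."""
    return {
        "model": model,
        "input": input_items,
        "store": False,
        "include": ["reasoning.encrypted_content"],
        **extra,
    }


def next_turn_input(previous_input: list, previous_output: list, user_text: str) -> list:
    """Carry earlier items, including reasoning items, into the next turn's input."""
    return [*previous_input, *previous_output, {"role": "user", "content": user_text}]


first_input = [{"role": "user", "content": "What is the weather like today?"}]
# Simplified stand-ins for the output items of the first response.
first_output = [
    {"type": "reasoning", "id": "rs_example", "summary": [], "encrypted_content": "<opaque>"},
    {"type": "message", "role": "assistant",
     "content": [{"type": "output_text", "text": "Which city?"}]},
]

second_request = stateless_request(
    "o4-mini",  # replace with your model deployment name
    next_turn_input(first_input, first_output, "Seattle"),
    reasoning={"effort": "medium"},
)
print(second_request["include"], len(second_request["input"]))
```

With a real client, the dictionary could be expanded into client.responses.create(**second_request).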

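Image generation (preview)

The Responses API enables image generation as part of conversations and multi-step workflows. It supports image inputs and outputs within context and includes built-in tools for generating and editing images.

Compared to the standalone Image API, the Responses API offers several advantages:

Streaming: Display partial image outputs during generation to improve perceived latency.
Flexible inputs: Accept image file IDs as inputs, in addition to raw image bytes.

Note

The image generation tool in the Responses API is supported only by the gpt-image-1 series models. You can, however, call this model from the following supported models: gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, and the gpt-5 series models.

The Responses API image generation tool does not currently support streaming mode. To use streaming mode and generate partial images, call the image generation API directly, outside of the Responses API.

Use the Responses API if you want to build conversational image experiences with GPT Image.

import base64
from openai import OpenAI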
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = OpenAI(  
  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
  api_key=token_provider,
  default_headers={"x-ms-oai-image-generation-deployment":"gpt-image-1", "api_version":"preview"}
)

response = client.responses.create(
    model="o3",
    input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools=[{"type": "image_generation"}],
)

# Save the image to a file
image_data = [
    output.result
    for output in response.output
    if output.type == "image_generation_call"
]
    
if image_data:
    image_base64 = image_data[0]
    with open("otter.png", "wb") as f:
        f.write(base64.b64decode(image_base64))

Reasoning models

For examples of how to use reasoning models with the Responses API, see the reasoning models guide.

Computer use

Computer use with Playwright has moved to the dedicated computer use model guide.
