Project 2 Image Captioning
In this tutorial, you’ll build a Python script that takes an image URL and generates a descriptive caption using GPT-4’s vision capabilities. Rather than generating images with DALL·E, we’ll call the Chat Completions endpoint with a vision-capable GPT-4 model, which can process image URLs directly and describe what it “sees.”
Table of Contents
- Prerequisites
- Installation
- Initialize the OpenAI Client
- Define the Image URL
- Generate Captions Function
- Run the Script
- Sample Output
Prerequisites
- Python 3.7+
- pip package manager
- An OpenAI API key with GPT-4 access
- Internet connectivity to fetch the image
Warning
Never hard-code your API key in a public repository. Use environment variables or a secure vault.
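One way to follow this advice is to read the key from an environment variable at startup and fail early if it is missing. The sketch below assumes the variable is named OPENAI_API_KEY, which is the name the OpenAI SDK looks for by default; load_api_key is a hypothetical helper for illustration, not part of the SDK.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in env_var, or raise if it is unset or blank."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the script."
        )
    return key
```

Failing at startup with a clear message is usually easier to debug than an authentication error returned by the first API call.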
Installation
Install the official OpenAI Python client:
pip install openai
Initialize the OpenAI Client
Import and initialize the client with your API key:
import os
from openai import OpenAI
# Read the API key from the OPENAI_API_KEY environment variable instead of hard-coding it
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
Note
You can find the latest OpenAI Python SDK and examples in the openai-python GitHub repo.
Define the Image URL
Specify the publicly accessible image URL you want to caption:
# Image URL to caption
image_url = "https://assets-prd.ignimgs.com/2022/06/10/netflix-one-piece-1654901410673.jpg"
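Because the image is fetched remotely from the URL, a local file path or a malformed URL will only fail once the request has been sent. A quick syntactic check can catch obvious mistakes earlier. The sketch below uses a hypothetical helper, is_http_url, and only inspects the URL's shape; it does not confirm that the URL is reachable or that it points to an image.

```python
from urllib.parse import urlparse


def is_http_url(url: str) -> bool:
    """Return True if url has an http(s) scheme and a host component."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

For example, is_http_url("cat.jpg") is False because the string has neither a scheme nor a host.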
Generate Captions Function
Create a helper function that sends a chat completion request to GPT-4, including both a text prompt and the image URL. We’ll cap the response at 125 tokens to keep captions concise.
def generate_captions(image_url: str) -> str:
    # Use a vision-capable model; the base "gpt-4" model does not accept image input
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What is this image?"},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        max_tokens=125,  # keep the caption concise
    )
    # Extract the generated caption
    return response.choices[0].message.content
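The message payload is the part of the request most worth understanding: a single user message whose content is a list mixing a text part and an image part. As a minimal sketch, the payload can be built by a separate function so the prompt is configurable. build_vision_messages is a hypothetical helper; the optional detail field ("low", "high", or "auto") is part of the image_url content format and trades image fidelity against token cost.

```python
from typing import Any, Dict, List


def build_vision_messages(
    prompt: str, image_url: str, detail: str = "auto"
) -> List[Dict[str, Any]]:
    """Build a single user message containing a text part and an image part."""
    if detail not in ("low", "high", "auto"):
        raise ValueError("detail must be 'low', 'high', or 'auto'")
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {"url": image_url, "detail": detail},
                },
            ],
        }
    ]
```

The returned list can be passed as the messages argument of client.chat.completions.create in place of the inline literal.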
Run the Script
Use the function and print the returned caption:
if __name__ == "__main__":
    caption = generate_captions(image_url)
    print(caption)
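If you would rather pass the image URL on the command line than edit the script, a small argparse wrapper is one option. In this sketch, main is a hypothetical entry point that takes the captioning function as a parameter, so it can be exercised without calling the API.

```python
import argparse
from typing import Callable, List, Optional


def main(caption_fn: Callable[[str], str], argv: Optional[List[str]] = None) -> str:
    """Parse an image URL from the command line and print its caption."""
    parser = argparse.ArgumentParser(description="Caption an image by URL.")
    parser.add_argument("image_url", help="publicly accessible image URL")
    # With argv=None, argparse reads the arguments from sys.argv[1:]
    args = parser.parse_args(argv)
    caption = caption_fn(args.image_url)
    print(caption)
    return caption
```

In the script itself, the call under the `if __name__ == "__main__":` guard would be main(generate_captions).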
Sample Output
This image features characters from the anime and manga "One Piece." In the center is Monkey D. Luffy wearing his trademark straw hat. To his left stands Sanji with blond hair, and to his right is Nami, recognizable by her orange hair. They are members of the Straw Hat Pirates.
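Because the request caps the response at 125 tokens, a longer description can be cut off mid-sentence. The Chat Completions response reports this through finish_reason, which is "length" when the token limit was reached. The sketch below checks that field; caption_was_truncated is a hypothetical helper, and fake_response is a stand-in object used only so the example runs without an API call.

```python
from types import SimpleNamespace


def caption_was_truncated(response) -> bool:
    """Return True if the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"


# Stand-in object shaped like a chat completion response, for illustration only.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="length",
            message=SimpleNamespace(content="A partial cap"),
        )
    ]
)
print(caption_was_truncated(fake_response))  # prints True
```

When the check returns True, raising max_tokens or asking for a shorter caption in the prompt are two possible remedies.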