Mastering Generative AI with OpenAI
Audio Transcription Translation
Demo Audio Translation
In this tutorial, we’ll demonstrate how to translate a short Spanish audio clip into English text using OpenAI’s Whisper API. We'll process a 20-second MP3 segment (up to 25 MB) extracted from an Easy Spanish YouTube video and send it to the API in one request.
Prerequisites
- Python 3.7+
openai
Python SDK- An OpenAI API key
Install the SDK with:
pip install --upgrade openai
Note
Ensure your MP3 file is under 25 MB. Whisper supports formats like MP3, WAV, and FLAC.
Translation Code Example
import os
import openai
import IPython.display as ipd
# 1. Configure API key
openai.api_key = os.getenv("OPENAI_API_KEY")
# 2. Load and play the Spanish audio clip
file_name = "data/Spanish.mp3"
audio_file = open(file_name, "rb")
ipd.display(ipd.Audio(file_name))
# 3. Call Whisper for translation
result = openai.Audio.translate("whisper-1", audio_file)
# 4. Output the English translation
print(result.text)
Step-by-Step Breakdown
Step | Action | Code Snippet |
---|---|---|
1 | Configure the OpenAI API key | openai.api_key = os.getenv("OPENAI_API_KEY") |
2 | Load and display the MP3 clip inline | ipd.display(ipd.Audio(file_name)) |
3 | Translate audio using whisper-1 | openai.Audio.translate("whisper-1", audio_file) |
4 | Print the translated English text | print(result.text) |
Warning
Keep your API key secure. Do not hard-code it in public repositories.
Next Steps
Once you have the translated text, you can pass it to GPT-4 (or any other LLM) for further processing—such as summarization, sentiment analysis, or content moderation.
Links and References
Watch Video
Watch video content
Practice Lab
Practice lab