Text to Speech
Build a text-to-speech pipeline in Python by combining OpenAI’s Chat Completions API with gTTS, a community-maintained library that wraps Google Translate’s text-to-speech service. The pipeline generates a natural language response from an LLM and speaks it aloud automatically.
Prerequisites
1. Install Dependencies
| Package | Purpose | Install Command |
|---|---|---|
| gTTS | Python library that interfaces with Google Translate's text-to-speech service | pip install gTTS |
| OpenAI Python client | Official OpenAI API SDK | pip install openai |
| Audio playback utility | Play MP3 files (macOS: afplay, preinstalled; Linux: mpg123 or mpg321) | macOS: none needed for afplay (optionally brew install mpg123); Debian/Ubuntu: sudo apt install mpg123 |
Note
This example was tested on Python 3.7+. Recent releases of the openai package may require a newer Python version, so check the package's stated requirements if installation fails.
2. Set Your OpenAI API Key
export OPENAI_API_KEY="your_openai_api_key"
Warning
Never commit your API key to public repositories. Use a secure vault or environment manager in production.
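A missing key otherwise only surfaces once the client is created or the first request is sent, so it can help to fail early with a clear message. The following is a minimal sketch; require_api_key is a hypothetical helper name, not part of the OpenAI SDK, and the env parameter exists only so the check can be exercised without touching the real environment.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the OpenAI SDK.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        # Point the reader back to the export step instead of failing later.
        raise RuntimeError(f"{name} is not set; export it before running.")
    return key
```

Calling require_api_key() at the top of the script, before the client is initialized, turns a confusing authentication failure into an immediate, readable error.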
Imports and Client Initialization
Begin by importing os and gTTS, then initialize the OpenAI client:
import os
from gtts import gTTS
from openai import OpenAI
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
1. Define the Prompt
Decide what you want the model to say. For example:
prompt = "Tell me a story about a brave knight who loves basketball."
2. Text-to-Speech Function
Convert text to speech and play the resulting MP3:
def text_to_speech(text: str, lang: str = "en", slow: bool = False) -> None:
    """
    Generate speech from text using gTTS, save as MP3, and play it.
    """
    tts = gTTS(text=text, lang=lang, slow=slow)
    filename = "tts_output.mp3"
    tts.save(filename)
    # macOS uses 'afplay'; Linux users can install 'mpg123' or 'mpg321'
    os.system(f"afplay {filename}")
Warning
Adjust the playback command (afplay, mpg123, or mpg321) based on your operating system.
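Rather than editing the command by hand, the player can be chosen at runtime. The sketch below assumes only the three players named in this guide and does not handle Windows. The helper names playback_command and play are hypothetical, and the system and which parameters are there so the selection logic can be tried without the players installed.

```python
import platform
import shutil
import subprocess


def playback_command(filename, system=None, which=shutil.which):
    """Pick an MP3 player command for the current OS.

    Hypothetical helper: returns an argument list for subprocess.run,
    or raises RuntimeError if none of the known players is on PATH.
    """
    system = system or platform.system()
    # macOS ships afplay; elsewhere try mpg123 first, then mpg321.
    candidates = ["afplay"] if system == "Darwin" else ["mpg123", "mpg321"]
    for player in candidates:
        if which(player):
            return [player, filename]
    raise RuntimeError(f"No MP3 player found; tried: {', '.join(candidates)}")


def play(filename):
    # Passing a list (no shell) avoids quoting problems with unusual filenames.
    subprocess.run(playback_command(filename), check=True)
```

Inside text_to_speech(), the os.system(...) line could then be swapped for play(filename).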
3. Generate Text from OpenAI
Send the prompt to the Chat API and retrieve the response:
def generate_text(
    prompt: str,
    model: str = "gpt-3.5-turbo",
    temperature: float = 0.8,
    max_tokens: int = 150
) -> str:
    """
    Generate a chat completion for the given prompt.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_tokens=max_tokens
    )
    return response.choices[0].message.content
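With max_tokens set to 150, a story may be cut off mid-sentence, and the spoken audio would then end abruptly. Each choice in a Chat Completions response carries a finish_reason, which is "length" when the token limit ended the reply. The sketch below shows one way to detect that. The helper text_and_truncation is hypothetical, and the SimpleNamespace object is only a stand-in that mimics the response shape so the example runs without an API call.

```python
from types import SimpleNamespace


def text_and_truncation(response):
    """Return (text, was_truncated) from a chat completion response.

    Hypothetical helper. It relies only on response.choices[0].message.content
    and response.choices[0].finish_reason.
    """
    choice = response.choices[0]
    text = choice.message.content or ""
    # finish_reason == "length" means the max_tokens limit cut the reply short.
    return text, choice.finish_reason == "length"


# Stand-in mimicking the response shape (an assumption for offline use).
fake = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(content="Once upon a time"),
            finish_reason="length",
        )
    ]
)
text, truncated = text_and_truncation(fake)
if truncated:
    print("Warning: reply hit max_tokens and may end mid-sentence.")
```

If truncation shows up often, raising max_tokens or asking for a shorter story in the prompt are the two obvious adjustments.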
4. Combine Generation and Speech
Create a helper that prints the generated text, then speaks it:
def gen_and_speak(prompt: str) -> None:
    """
    Generate text from the prompt, display it, and play the speech.
    """
    text = generate_text(prompt)
    print("Generated Text:\n")
    print(text, "\n")
    text_to_speech(text)
5. Entry Point
Run the full pipeline with your defined prompt:
if __name__ == "__main__":
    gen_and_speak(prompt)
Example Console Output
$ python3 text_to_speech_pipeline.py
Generated Text:
Once upon a time in the kingdom of Eldoria, there lived a brave knight named Sir Cedric...
# (Audio will start playing automatically)
Next Steps & Extensions
- Support multiple languages by changing lang in text_to_speech().
- Experiment with different voices and speech parameters.
- Integrate other audio libraries like pydub or playsound for advanced playback.
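For the first two items, gTTS accepts a tld argument alongside lang and slow. It selects the regional Google Translate domain and so affects the accent. A minimal sketch follows, assuming a small made-up preset table: the preset labels and the helpers gtts_kwargs and save_speech are hypothetical, and only lang, tld, and slow are actual gTTS parameters. gTTS is imported lazily so the preset logic can be tried on its own.

```python
# Hypothetical presets: label -> keyword arguments for gTTS.
# The labels are invented for this sketch; lang and tld are gTTS parameters.
PRESETS = {
    "en-us": {"lang": "en", "tld": "com"},
    "en-uk": {"lang": "en", "tld": "co.uk"},
    "en-au": {"lang": "en", "tld": "com.au"},
    "fr": {"lang": "fr", "tld": "fr"},
    "es": {"lang": "es", "tld": "es"},
}


def gtts_kwargs(preset, slow=False):
    """Build the keyword arguments for gTTS from a preset label."""
    if preset not in PRESETS:
        raise ValueError(f"Unknown preset {preset!r}; choose from {sorted(PRESETS)}")
    return {**PRESETS[preset], "slow": slow}


def save_speech(text, filename, preset="en-us", slow=False):
    # Imported here so gtts_kwargs can be used without gTTS installed.
    from gtts import gTTS

    gTTS(text=text, **gtts_kwargs(preset, slow)).save(filename)
    return filename
```

For example, save_speech(text, "tts_output.mp3", preset="en-uk", slow=True) would request slower British-accented English, provided the model's reply is in English.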