Text to Speech
Build a seamless text-to-speech pipeline in Python by combining OpenAI’s Chat API with Google’s gTTS library. Generate natural language responses from an LLM and have them spoken aloud automatically.
Prerequisites
1. Install Dependencies
Package | Purpose | Install Command |
---|---|---|
gTTS | Google Text-to-Speech Python client | pip install gTTS |
OpenAI Python client | Official OpenAI API SDK | pip install openai |
Audio playback utility | Play MP3 files (macOS: afplay, built in; Linux: mpg123 or mpg321) | sudo apt install mpg123 (Linux); brew install mpg123 (macOS, optional) |
Note
This example is tested on Python 3.7+. If you use a different version, adjust commands as needed.
2. Set Your OpenAI API Key
export OPENAI_API_KEY="your_openai_api_key"
Warning
Never commit your API key to public repositories. Use a secure vault or environment manager in production.
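Because the key is read from the environment, a missing variable only surfaces later as an authentication error. The sketch below fails early with a clearer message. The helper name `require_api_key` is hypothetical and not part of the OpenAI SDK; it only inspects the `OPENAI_API_KEY` variable set above.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the OpenAI SDK.
    """
    # Default to the real process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            'OPENAI_API_KEY is not set. Run: export OPENAI_API_KEY="your_openai_api_key"'
        )
    return key
```

The returned value could then be passed as `api_key` when the client is created.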
Imports and Client Initialization
Begin by importing the standard library's os module and gTTS, then initialize the OpenAI client (the OpenAI class is available in version 1.0 and later of the openai package):
import os
from gtts import gTTS
from openai import OpenAI
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
1. Define the Prompt
Decide what you want the model to say. For example:
prompt = "Tell me a story about a brave knight who loves basketball."
2. Text-to-Speech Function
Convert text to speech and play the resulting MP3:
def text_to_speech(text: str, lang: str = "en", slow: bool = False) -> None:
    """
    Generate speech from text using gTTS, save as MP3, and play it.
    """
    tts = gTTS(text=text, lang=lang, slow=slow)
    filename = "tts_output.mp3"
    tts.save(filename)
    # macOS uses 'afplay'; Linux users can install 'mpg123' or 'mpg321'
    os.system(f"afplay {filename}")
Warning
Adjust the playback command (afplay, mpg123, or mpg321) based on your operating system.
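One way to avoid editing the playback command by hand is to pick the player at runtime. This is a minimal sketch, assuming only the three players named above. The function name `playback_command` and its `system` and `which` parameters are hypothetical; they exist so the selection logic can be exercised without a real player installed.

```python
import platform
import shutil
from typing import Callable, List, Optional


def playback_command(
    filename: str,
    system: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return an argument list for an MP3 player available on this machine.

    Hypothetical helper for illustration.
    """
    # platform.system() returns "Darwin" on macOS and "Linux" on Linux.
    system = system or platform.system()
    # afplay ships with macOS; mpg123 and mpg321 are the Linux options named above.
    candidates = ["afplay"] if system == "Darwin" else ["mpg123", "mpg321"]
    for player in candidates:
        # shutil.which returns None when the executable is not on PATH.
        if which(player):
            return [player, filename]
    raise RuntimeError(f"No MP3 player found; tried: {', '.join(candidates)}")
```

Inside `text_to_speech`, the `os.system(...)` line could then become `subprocess.run(playback_command(filename), check=True)` after adding `import subprocess`. Passing an argument list also sidesteps shell quoting problems if the filename ever contains spaces.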
3. Generate Text from OpenAI
Send the prompt to the Chat API and retrieve the response:
def generate_text(
    prompt: str,
    model: str = "gpt-3.5-turbo",
    temperature: float = 0.8,
    max_tokens: int = 150
) -> str:
    """
    Generate a chat completion for the given prompt.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_tokens=max_tokens
    )
    return response.choices[0].message.content
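With `max_tokens=150`, a story prompt can be cut off partway through a sentence, and the speech then ends abruptly. The sketch below trims the text back to the last complete sentence before it is spoken. The helper `trim_to_last_sentence` is hypothetical and uses a deliberately simple rule (the last `.`, `!`, or `?`), so it would also drop a closing quotation mark that follows the final period.

```python
def trim_to_last_sentence(text: str) -> str:
    """Drop a trailing partial sentence, if any, from generated text.

    Hypothetical helper for illustration; the rule is intentionally simple.
    """
    stripped = text.strip()
    # Position of the last sentence-ending punctuation mark, or -1 if none exists.
    cut = max(stripped.rfind(ch) for ch in ".!?")
    if cut == -1:
        # No sentence boundary at all: return the text unchanged rather than emptying it.
        return stripped
    return stripped[: cut + 1]
```

A possible use is `text_to_speech(trim_to_last_sentence(text))`. Trimming could also be limited to responses where `response.choices[0].finish_reason` equals `"length"`, which is how the API signals that the token limit was reached.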
4. Combine Generation and Speech
Create a helper that prints the generated text, then speaks it:
def gen_and_speak(prompt: str) -> None:
    """
    Generate text from the prompt, display it, and play the speech.
    """
    text = generate_text(prompt)
    print("Generated Text:\n")
    print(text, "\n")
    text_to_speech(text)
5. Entry Point
Run the full pipeline with your defined prompt:
if __name__ == "__main__":
    gen_and_speak(prompt)
Example Console Output
$ python3 text_to_speech_pipeline.py
Generated Text:
Once upon a time in the kingdom of Eldoria, there lived a brave knight named Sir Cedric...
# (Audio will start playing automatically)
Next Steps & Extensions
- Support multiple languages by changing lang in text_to_speech().
- Experiment with different voices and speech parameters.
- Integrate other audio libraries like pydub or playsound for advanced playback.