Text to Speech

Build a text-to-speech pipeline in Python by combining OpenAI’s Chat Completions API with gTTS, a Python library that interfaces with Google Translate’s text-to-speech service. The script generates a natural language response from an LLM, saves it as an MP3 file, and plays it aloud automatically.

Prerequisites

1. Install Dependencies

Package | Purpose | Install Command
gTTS | Google Text-to-Speech Python client | pip install gTTS
OpenAI Python client | Official OpenAI API SDK | pip install openai
Audio playback utility | Play MP3 files (macOS: afplay; Linux: mpg123 or mpg321) | brew install mpg123 or sudo apt install mpg123

Note

This example is tested on Python 3.7+. If you use a different version, adjust commands as needed.

2. Set Your OpenAI API Key

export OPENAI_API_KEY="your_openai_api_key"

Warning

Never commit your API key to public repositories. Use a secure vault or environment manager in production.
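
If the variable is missing, it helps to stop early with a readable message. The following is a minimal sketch of such a check; require_api_key is a hypothetical helper introduced here for illustration and is not part of the OpenAI SDK.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    # 'env' is injectable so the check can be exercised without touching os.environ.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it in your shell before running the script."
        )
    return key
```

The returned value could then be passed as api_key when the client is created.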


Imports and Client Initialization

Begin by importing os, gTTS, and the OpenAI client class, then initialize the client:

import os
from gtts import gTTS
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

1. Define the Prompt

Decide what you want the model to say. For example:

prompt = "Tell me a story about a brave knight who loves basketball."

2. Text-to-Speech Function

Convert text to speech and play the resulting MP3:

def text_to_speech(text: str, lang: str = "en", slow: bool = False) -> None:
    """
    Generate speech from text using gTTS, save as MP3, and play it.
    """
    tts = gTTS(text=text, lang=lang, slow=slow)
    filename = "tts_output.mp3"
    tts.save(filename)
    # macOS uses 'afplay'; Linux users can install 'mpg123' or 'mpg321'
    os.system(f"afplay {filename}")

Warning

Adjust the playback command (afplay, mpg123, or mpg321) based on your operating system.
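
Rather than editing the command by hand, the player can be chosen at runtime. The sketch below assumes only the three players named above and does not cover Windows; playback_command is a hypothetical helper, and the which parameter exists only so the lookup can be swapped out.

```python
import platform
import shutil


def playback_command(filename, system=None, which=shutil.which):
    """Return an argument list for playing an MP3, or None if no player is found."""
    system = system or platform.system()
    # afplay ships with macOS; mpg123 and mpg321 are the Linux options named above.
    candidates = ["afplay"] if system == "Darwin" else ["mpg123", "mpg321"]
    for player in candidates:
        if which(player):
            return [player, filename]
    return None
```

Inside text_to_speech(), the result could be passed to subprocess.run(cmd) in place of the os.system call, after checking that it is not None.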


3. Generate Text from OpenAI

Send the prompt to the Chat API and retrieve the response:

def generate_text(
    prompt: str,
    model: str = "gpt-3.5-turbo",
    temperature: float = 0.8,
    max_tokens: int = 150
) -> str:
    """
    Generate a chat completion for the given prompt.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_tokens=max_tokens
    )
    return response.choices[0].message.content
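
With max_tokens set to 150, a story can be cut off mid-sentence, which sounds abrupt when spoken. One possible way to handle this is to check the choice's finish_reason, which is "length" when the token limit was hit, and trim the trailing fragment. The helper names below are hypothetical, and the response object is a stand-in shaped like the SDK's, used only so the sketch runs without an API call.

```python
from types import SimpleNamespace


def extract_text(response):
    """Return (text, truncated) from a chat completion response."""
    choice = response.choices[0]
    text = choice.message.content or ""
    # finish_reason == "length" means the reply hit the max_tokens limit.
    truncated = choice.finish_reason == "length"
    return text, truncated


def trim_to_last_sentence(text):
    """Drop a trailing partial sentence so speech does not stop mid-phrase."""
    cut = max(text.rfind("."), text.rfind("!"), text.rfind("?"))
    return text[: cut + 1] if cut != -1 else text


# Stand-in object shaped like the SDK response, for illustration only.
fake = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(content="Sir Cedric rode out. He picked up the"),
            finish_reason="length",
        )
    ]
)
text, truncated = extract_text(fake)
if truncated:
    text = trim_to_last_sentence(text)
print(text)  # Sir Cedric rode out.
```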

4. Combine Generation and Speech

Create a helper that prints the generated text, then speaks it:

def gen_and_speak(prompt: str) -> None:
    """
    Generate text from the prompt, display it, and play the speech.
    """
    text = generate_text(prompt)
    print("Generated Text:\n")
    print(text, "\n")
    text_to_speech(text)

5. Entry Point

Run the full pipeline with your defined prompt:

if __name__ == "__main__":
    gen_and_speak(prompt)
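
To try different prompts without editing the file, the prompt could be read from the command line. This is a sketch using the standard argparse module; parse_prompt and DEFAULT_PROMPT are hypothetical names, and the default falls back to the prompt defined earlier.

```python
import argparse

DEFAULT_PROMPT = "Tell me a story about a brave knight who loves basketball."


def parse_prompt(argv=None):
    """Return the prompt given on the command line, or the default one."""
    parser = argparse.ArgumentParser(description="Generate text and speak it.")
    # nargs="*" accepts an unquoted multi-word prompt, or no prompt at all.
    parser.add_argument("prompt", nargs="*", help="prompt words (optional)")
    args = parser.parse_args(argv)
    return " ".join(args.prompt) or DEFAULT_PROMPT
```

The entry point would then call gen_and_speak(parse_prompt()) instead of passing the module-level prompt.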

Example Console Output

$ python3 text_to_speech_pipeline.py
Generated Text:

Once upon a time in the kingdom of Eldoria, there lived a brave knight named Sir Cedric...
# (Audio will start playing automatically)

Next Steps & Extensions

  • Support multiple languages by changing lang in text_to_speech().
  • Experiment with speech parameters: gTTS offers the slow flag and a tld argument that selects a regional accent, but not a choice of distinct voices.
  • Integrate other audio libraries like pydub or playsound for advanced playback.
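
As a starting point for the language and accent options, the sketch below maps preset names to gTTS keyword arguments. The preset names and the gtts_kwargs helper are hypothetical; the lang and tld values follow the localized accent examples in the gTTS documentation and should be checked against the version you have installed.

```python
# Hypothetical preset names mapped to gTTS keyword arguments.
VOICE_PRESETS = {
    "english-uk": {"lang": "en", "tld": "co.uk"},
    "english-au": {"lang": "en", "tld": "com.au"},
    "french": {"lang": "fr", "tld": "fr"},
}


def gtts_kwargs(preset, slow=False):
    """Build keyword arguments for gTTS(text=..., **kwargs) from a preset name."""
    if preset not in VOICE_PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    return {**VOICE_PRESETS[preset], "slow": slow}
```

A call such as gTTS(text=text, **gtts_kwargs("english-uk")) would then replace the fixed lang argument in text_to_speech().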
