Demo: Creating an App Using the OpenAI Python Client with Ollama
Guide to building a Flask AI Poem Generator using the OpenAI Python client and local Ollama backend, covering setup, example code, environment configuration, and production considerations
This guide demonstrates how to structure a small web application that uses the OpenAI Python client to talk to a local Ollama backend (or switch later to OpenAI's hosted API with minimal changes). We'll build a simple Flask-based AI Poem Generator that sends user prompts to a chat completions endpoint and renders the model's output. Using a client library (instead of curl) helps keep your app code clean and portable across local development and hosted providers.
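To make that portability concrete: switching between a local Ollama backend and OpenAI's hosted API comes down to changing two environment variables. The sketch below uses illustrative values (Ollama's default OpenAI-compatible endpoint is `http://localhost:11434/v1`, and it accepts any API key):

```python
import os

# Point the client at local Ollama's OpenAI-compatible API.
# (Illustrative defaults; Ollama accepts any API key value.)
os.environ["LLM_ENDPOINT"] = "http://localhost:11434/v1"
os.environ["OPENAI_API_KEY"] = "ollama"

# To target OpenAI's hosted API instead, only these two values change:
#   LLM_ENDPOINT=https://api.openai.com/v1
#   OPENAI_API_KEY=<your real API key>
print(os.environ["LLM_ENDPOINT"])
```

The application code itself stays identical in both cases; only the configuration differs.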
This example exposes a single route ("/"): GET renders a small form; POST sends the prompt to the chat completions endpoint using the OpenAI Python client and displays the returned poem.
```python
# server.py
import os

from flask import Flask, request, render_template_string
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()  # take environment variables from .env

app = Flask(__name__)
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    base_url=os.environ.get("LLM_ENDPOINT"),
)

# Simple HTML template for the app UI
HTML_TEMPLATE = """
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8" />
  <title>AI Poem Generator</title>
  <style>
    body { font-family: Arial, sans-serif; padding: 2rem; background: #f9f9f9; }
    .container { max-width: 800px; margin: 0 auto; background: #fff; padding: 1.5rem;
                 border-radius: 8px; box-shadow: 0 2px 8px rgba(0,0,0,0.05); }
    textarea { width: 100%; height: 120px; padding: 0.5rem; margin-bottom: 0.75rem; }
    button { padding: 0.5rem 1rem; }
    pre { background-color: #f4f4f4; padding: 10px; border-radius: 5px; overflow-x: auto; }
  </style>
</head>
<body>
  <div class="container">
    <h1>AI Poem Generator</h1>
    <form method="POST">
      <textarea name="input" placeholder="Enter your prompt here..."></textarea>
      <br />
      <button type="submit">Generate Poem</button>
    </form>
    {% if poem %}
    <h2>Your AI-Generated Poem</h2>
    <pre>{{ poem }}</pre>
    {% endif %}
  </div>
</body>
</html>
"""


@app.route("/", methods=["GET", "POST"])
def index():
    poem = None
    if request.method == "POST":
        try:
            input_message = request.form["input"]
            response = client.chat.completions.create(
                model=os.environ.get("MODEL"),
                messages=[
                    {"role": "system", "content": "You are an AI assistant specialized in writing poems."},
                    {"role": "user", "content": input_message},
                ],
            )
            # Extract the generated text from the response
            # Structure: response.choices[0].message.content
            poem = response.choices[0].message.content
        except Exception as e:
            # Log the exception, and show a friendly error to the user
            print("Error:", str(e))
            poem = "An error occurred when trying to fetch your poem."

    return render_template_string(HTML_TEMPLATE, poem=poem)


if __name__ == "__main__":
    port = int(os.getenv("PORT", 3000))
    # Development server: do not use this directly in production
    app.run(host="0.0.0.0", port=port)
```
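For reference, the line `response.choices[0].message.content` mirrors the JSON structure of a chat completions response. The snippet below is a minimal sketch that parses a trimmed, hard-coded sample payload (illustrative values, not a live API call) to show where the generated text lives:

```python
import json

# Trimmed sample of a chat completions response body (illustrative values;
# real responses carry additional fields such as "id", "model", and "usage").
sample = """
{
  "choices": [
    {"index": 0, "message": {"role": "assistant", "content": "Soft paws at dawn..."}}
  ]
}
"""

data = json.loads(sample)
# The app reads response.choices[0].message.content; the raw JSON equivalent:
poem = data["choices"][0]["message"]["content"]
print(poem)  # -> Soft paws at dawn...
```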
Store runtime configuration in a .env file at the project root. This keeps credentials and endpoints out of your code and makes switching between local Ollama and OpenAI hosted APIs straightforward.
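A sample `.env` for local development might look like the following (values are illustrative: the endpoint is Ollama's default OpenAI-compatible URL, and the model name assumes you have pulled `llama3` locally):

```shell
OPENAI_API_KEY=ollama
LLM_ENDPOINT=http://localhost:11434/v1
MODEL=llama3
PORT=3000
```

To switch to OpenAI's hosted API later, you would only edit this file, not the application code.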
| Variable | Purpose | Example |
| --- | --- | --- |
| `OPENAI_API_KEY` | API key used by the OpenAI client. For local Ollama testing this can be any value; replace with a real key when using OpenAI hosted APIs. | `ollama` |
| `LLM_ENDPOINT` | Base URL the client sends requests to. Point it at Ollama's OpenAI-compatible endpoint locally, or at OpenAI's API when hosted. | `http://localhost:11434/v1` |
| `MODEL` | Model name passed to the chat completions call. | `llama3` |
| `PORT` | Port the Flask app listens on (the code defaults to 3000). | `3000` |

(The example values for `LLM_ENDPOINT` and `MODEL` assume a local Ollama install with the `llama3` model pulled; substitute whatever model you are running.)
Install the dependencies (`pip install flask openai python-dotenv`), then start the app with `python server.py`. You should see the Flask development server start:
```
 * Serving Flask app 'server'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:3000
Press CTRL+C to quit
```
Open http://127.0.0.1:3000, enter a prompt such as “Write a short poem about cats” and click “Generate Poem.” The app will POST the prompt to the model and display the returned poem.
This demo illustrates a minimal integration pattern. To prepare for production:
- Use a production WSGI server (gunicorn/uvicorn) instead of Flask's dev server.
- Store secrets securely (e.g., a secret manager or environment variable service).
- Add robust error handling, rate limiting, and input validation/sanitization.
- Cache or paginate long-running requests and handle model timeouts gracefully.
- When switching to OpenAI hosted APIs, change the `LLM_ENDPOINT` and set a valid `OPENAI_API_KEY`.
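As a sketch of the first point, the app could be served with gunicorn. The `server:app` target matches the `server.py` module above; the worker count and bind address are arbitrary illustrative choices:

```shell
pip install gunicorn
# 2 worker processes, binding the same port the dev server used
gunicorn --workers 2 --bind 0.0.0.0:3000 server:app
```

Unlike the Flask development server, gunicorn handles concurrent requests across multiple worker processes and is designed for production traffic.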
In short, this is a simple demo intended for local development; apply the considerations above before deploying it anywhere user-facing.