Mastering Generative AI with OpenAI

Understanding Tokens and API Parameters

A Closer Look at the OpenAI API

In this lesson, we’ll explore the OpenAI platform and its modular architecture. You’ll learn how each component fits together to deliver Generative AI as a Service and where tokens and API parameters come into play.

OpenAI Platform Architecture

The OpenAI platform provides a seamless developer experience through four core layers:

| Component | Responsibility | Examples |
|---|---|---|
| Foundation Models | Pretrained language and vision AI | gpt-4, text-embedding-3 |
| Services Layer | API request orchestration, model management, user access, and security | Authentication, rate limiting |
| RESTful API Endpoint | Scalable HTTP interface for model inference and management | POST /v1/completions |
| Official SDKs & CLI | Client libraries and utilities for integration | OpenAI Python client, CLI |

[Figure: "OpenAI – 10,000-Ft. Overview" flowchart, showing how tools, libraries, and customer apps interact with the API, the OpenAI services layer, and the foundation models.]

Customer applications, whether a web app, a mobile client, or a backend service, interact with these layers by calling the RESTful API directly or by using one of the official SDKs.
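
As a quick illustration of the SDK path, here is a minimal sketch using the official OpenAI Python client (this assumes the openai package, version 1.x or later, is installed and that OPENAI_API_KEY is set in your environment; the model name is an example of one served by the completions endpoint):

from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment by default.
client = OpenAI()

# Under the hood this wraps the REST endpoint POST /v1/completions.
response = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="Say hello to the OpenAI platform.",
    max_tokens=20,
)

print(response.choices[0].text)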

Understanding Tokens and API Parameters

When you send a request to the OpenAI API, there are several parameters you can tune. The most important are listed below, and the sketch after the list shows how each one maps onto a request:

  • Model: Choose which foundation model to use.
  • Prompt: The text input that the model will complete.
  • Max tokens: Limits the length of the generated response.
  • Temperature: Controls randomness in the output (0.0–2.0); lower values give more focused, deterministic completions.
  • Top_p: Enables nucleus sampling (0.0–1.0), restricting token choices to the smallest set whose cumulative probability reaches top_p.
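
A minimal sketch of how these parameters appear in a single request with the official Python client (assuming the openai 1.x package, an OPENAI_API_KEY environment variable, and a completion-capable model such as gpt-3.5-turbo-instruct):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",           # which foundation model to use
    prompt="Write a haiku about recursion.",  # the text the model will complete
    max_tokens=60,       # upper bound on the length of the generated response
    temperature=0.7,     # higher values increase randomness
    top_p=1.0,           # nucleus sampling threshold; tune temperature or top_p, not both
)

print(response.choices[0].text)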

Note

Every API call consumes tokens based on the length of your prompt and the response. Monitor your usage in the OpenAI dashboard to manage costs.
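
If you want to estimate token usage before sending a request, the tiktoken library (a separate package published by OpenAI) can tokenize your prompt locally. A minimal sketch, assuming tiktoken is installed:

import tiktoken

# Pick the tokenizer that matches your target model.
encoding = tiktoken.encoding_for_model("gpt-4")

prompt = "Explain the principle of recursion in computer science."
prompt_tokens = encoding.encode(prompt)

# The prompt token count plus max_tokens bounds what a single call can consume.
print(f"Prompt uses {len(prompt_tokens)} tokens")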

Example: Text Completion Request

curl https://api.openai.com/v1/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "prompt": "Explain the principle of recursion in computer science.",
    "max_tokens": 150,
    "temperature": 0.7
  }'

Parsing the Response

A typical response object contains:

| Field | Description |
|---|---|
| id | Unique identifier for the request |
| choices | Array of completion options (usually one) |
| usage | Token consumption details (prompt_tokens, completion_tokens, total_tokens) |

{
  "id": "cmpl-6abc123",
  "object": "text_completion",
  "created": 1675072800,
  "model": "gpt-4",
  "choices": [
    {
      "text": "Recursion in computer science refers to a function calling itself... ",
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  }
}
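
In Python, you might extract the generated text and the usage figures from this payload as sketched below (the raw_body string stands in for the JSON you would receive in the HTTP response):

import json

# The JSON payload shown above, e.g. captured from the HTTP response body.
raw_body = '''
{
  "id": "cmpl-6abc123",
  "object": "text_completion",
  "created": 1675072800,
  "model": "gpt-3.5-turbo-instruct",
  "choices": [{"text": "Recursion in computer science refers to a function calling itself...",
               "index": 0, "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 10, "completion_tokens": 20, "total_tokens": 30}
}
'''

data = json.loads(raw_body)

completion_text = data["choices"][0]["text"]
usage = data["usage"]

print(completion_text.strip())
print(f"Tokens used: {usage['prompt_tokens']} prompt + "
      f"{usage['completion_tokens']} completion = {usage['total_tokens']} total")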
