Running Local LLMs With Ollama

Getting Started With Ollama

Demo: Running Your First Model

Assuming you’ve already installed the Ollama application and CLI on your machine, this guide walks you through running your first large language model (LLM)—Meta’s Llama 3.2—entirely offline. Ollama supports many popular models, so feel free to substitute your preferred one.

Prerequisites

  • Ollama CLI installed and configured
  • At least 4 GB of free RAM
  • ~2 GB of disk space for the Llama 3.2 model
  • A modern terminal (macOS, Linux, or WSL on Windows)

Warning

Downloading and storing LLMs locally can consume significant disk space and memory. Ensure you have adequate resources before proceeding.
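
Before pulling anything, it helps to confirm the headroom you actually have. A quick check, assuming a Linux or WSL shell (on macOS, check available memory via Activity Monitor instead):

df -h ~      # free disk space on your home volume
free -h      # available RAM (Linux/WSL)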

1. Pulling and Running the Model

Open your terminal and execute:

ollama run llama3.2

Since this is your first run, Ollama will fetch the model from the registry—similar to Docker pulling an image:

> ollama run llama3.2
pulling manifest
pulling dde5aa3fc5ff... 100%
pulling 966de95ca8a6... 100%
...
verifying sha256 digest
writing manifest
success

Once the download completes, Ollama verifies integrity, writes the manifest, and presents an interactive >>> prompt.
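
If you would rather download the model first and start chatting later, you can pull it separately and then confirm it is stored locally:

ollama pull llama3.2     # download the model without starting a session
ollama list              # show models available on this machine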

Note

All inference happens locally—your data remains on your device and no internet connection is needed after download.
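
Behind the scenes, the CLI talks to a local Ollama server, which by default listens on port 11434. As a rough sketch, you can send a prompt to it directly over HTTP with curl (the endpoint and payload below follow Ollama's generate API):

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'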

2. Basic Interaction

At the >>> prompt, type any message:

>>> hey! how are you feeling?

You should see a response like:

I’m just a language model, I don’t have feelings or emotions like humans do. However, I’m functioning properly and ready to help with any questions or tasks you may have! How about you? How’s your day going so far?
>>> Send a message (/? for help)
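
You can also skip the interactive prompt entirely: ollama run accepts a prompt as a command-line argument and prints a single response before returning to your shell:

ollama run llama3.2 "Summarize DevOps in one sentence."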

3. Experimenting with Prompts

Try a creative prompt:

>>> Compose a poem on DevOps.

The model generates a multi-paragraph poem in seconds. To discover built-in session commands, enter:

>>> /?

Available Session Commands

Command           Description
/set              Set session variables
/show             Display model information
/load <model>     Load a different model or session
/save <model>     Save your current session state
/clear            Clear conversation context
/bye              Exit the interactive session
/? or /help       Show help for commands

Use triple quotes (""") to begin and end a multi-line message.
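
For example, a multi-line prompt looks roughly like this (the continuation prompt may render slightly differently depending on your Ollama version):

>>> """Explain the difference between
... continuous integration and
... continuous deployment."""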

4. Resetting Context

To clear memory of previous interactions:

>>> /clear
Cleared session context
>>> Send a message (/? for help)

Now the model won’t recall what you asked before clearing. This is useful for isolated tests or debugging prompts.

5. Performing Calculations

Models running in Ollama can even handle simple math. For example:

>>> What is 12% compound interest on 5000 over 5 years?

Sample response:

FV = PV × (1 + r)^n

Where:
  FV = Future Value
  PV = Present Value = $5000
  r  = Annual rate = 12% = 0.12
  n  = Number of years = 5

Calculation:
  FV = $5000 × (1 + 0.12)^5
     ≈ $8,811.71
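
You can double-check the model's arithmetic from your own shell, for instance with awk's exponentiation operator:

awk 'BEGIN { printf "FV = $%.2f\n", 5000 * 1.12^5 }'
# prints: FV = $8811.71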

6. Exiting the Session

When you’re finished, leave the chat with:

>>> /bye
Goodbye!

You’ll return to your normal terminal prompt.
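
If you later want to reclaim the disk space, you can list your downloaded models and remove any you no longer need:

ollama list              # show models stored locally
ollama rm llama3.2       # delete the model files from disk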


You’ve now run and interacted with a large language model locally using Ollama. Explore other models, tweak prompts, and enjoy full offline inference for enhanced privacy and performance. Happy experimenting!
