Demo: Essential Ollama CLI Commands

In this guide, we’ll walk through the most common Ollama CLI commands with practical examples. Whether you’re spinning up a model server, inspecting metadata, or managing local models, these commands form the foundation of any local LLM workflow.

Table of Contents

  1. List All Commands
  2. Run a Model Interactively
  3. Stop a Running Model
  4. View Local Models
  5. Remove a Model
  6. Download Models Without Running
  7. Show Model Details
  8. Monitor Active Models

1. List All Commands

To discover every available Ollama subcommand, simply run:

ollama

You’ll see output similar to:

Available Commands:
  serve      Start Ollama model server
  create     Create a model from a Modelfile
  show       Show information for a model
  run        Run a model
  stop       Stop a running model
  pull       Pull a model from a registry
  push       Push a model to a registry
  list       List local models
  ps         List running models
  cp         Copy a model
  rm         Remove a model
  help       Help about any command

Flags:
  -h, --help     Help for ollama
  -v, --version  Show version information

Tip

Append --help to any command to view detailed usage information, for example:

ollama run --help
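
The -v flag makes a quick health check in scripts: it prints the client version, and recent releases also warn if they can’t reach a running Ollama server.

ollama -v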

2. Run a Model Interactively

Launch an interactive chat session with a model (e.g., llama3.2):

ollama run llama3.2

Once running, type your queries:

>>> What is the time?
I'm not sure what time you are referring to, as I'm a large language model without real-time access...

To exit the chat, enter:

>>> /bye
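
Interactive chat isn’t the only mode: ollama run also accepts a one-shot prompt as an argument, which makes it easy to use from scripts. A minimal sketch (the model name, prompt, and notes.txt file are just examples):

# One-shot prompt: print the response and exit
ollama run llama3.2 "Explain what a context window is in one sentence."

# Feed a file's contents into the prompt via command substitution
ollama run llama3.2 "Summarize the following text: $(cat notes.txt)"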

3. Stop a Running Model

Exiting the chat doesn’t unload the model: Ollama keeps it in memory for a few minutes (five minutes by default) so follow-up requests start faster. To unload it immediately:

ollama stop llama3.2

Warning

Leaving unused models running can consume memory and GPU resources. Always stop models you’re no longer using.
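
If several models are loaded, you can combine ollama ps (covered below) with standard shell tools to stop everything at once. A minimal sketch, assuming the default ollama ps output format with the model name in the first column:

# Skip the header row, then stop each running model by name
ollama ps | awk 'NR>1 {print $1}' | while read -r name; do
  ollama stop "$name"
done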


4. View Local Models

List all models downloaded on your machine:

ollama list
NAME               ID              SIZE      MODIFIED
llava:latest       8dd30f6b0cb1    4.7 GB    48 minutes ago
llama3.2:latest    a80c4f17acd5    2.0 GB    About an hour ago
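
The tabular output is easy to post-process with standard tools. For example, a sketch that prints just the model names (assuming the NAME column comes first, as shown above):

# Print only model names, skipping the header row
ollama list | awk 'NR>1 {print $1}'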

5. Remove a Model

Free up disk space by deleting a model:

ollama rm llava

Confirm removal:

ollama list
NAME               ID              SIZE      MODIFIED
llama3.2:latest    a80c4f17acd5    2.0 GB    About an hour ago
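
In a cleanup script you may want to delete a model only if a local copy actually exists. A minimal sketch that builds on the ollama list parsing shown earlier (llava is just an example name):

# Remove llava only if it appears in the local model list
if ollama list | awk 'NR>1 {print $1}' | grep -q '^llava:'; then
  ollama rm llava
fi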

6. Download Models Without Running

Use ollama pull to fetch a model without immediately launching it. For example, to pull Mistral 7B:

(The Mistral 7B page in the Ollama model library shows the model’s versions, parameter counts, and license.)

ollama pull mistral

You’ll see progress bars like:

pulling manifest
pulling ff82381e2bea... 28%

Note

Press Ctrl+C at any point to abort the download.
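
pull also accepts an explicit tag when you want a specific variant rather than the default, and re-pulling an existing model downloads only the layers that changed. For example (the tags shown are examples from the Ollama library):

# Pull a specific tag instead of the default
ollama pull mistral:7b

# Re-pull to pick up any updates to an already-downloaded model
ollama pull llama3.2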


7. Show Model Details

Inspect model metadata, architecture, and licensing:

ollama show llama3.2
Model
  architecture     llama
  parameters       3.2B
  context length   131072
  embedding length 3072
  quantization     Q4_K_M

Parameters
  stop  "<|start_header_id|>"
  stop  "<|end_header_id|>"
  stop  "<|eot_id|>"

License
  LLAMA 3.2 COMMUNITY LICENSE AGREEMENT
  Release Date: September 25, 2024

Understanding the license ensures you comply with usage restrictions.
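
show also has flags to print individual pieces of this metadata, which pairs nicely with ollama create: you can dump a model’s Modelfile, tweak it, and build a variant. A sketch using flags available in recent Ollama releases:

# Print just the Modelfile that defines the model
ollama show --modelfile llama3.2

# Print only the license text
ollama show --license llama3.2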


8. Monitor Active Models

Similar to Docker’s ps, this command lists all currently running models:

ollama ps

After launching a model in one terminal:

ollama run llama3.2

Open another terminal and run:

ollama ps
NAME               ID              SIZE      PROCESSOR    UNTIL
llama3.2:latest    a80c4f17acd5    4.0 GB    100% GPU     4 minutes from now

To stop the model:

ollama stop llama3.2

Verify no active models remain:

ollama ps
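
Because the Ollama server (started with ollama serve, or by the desktop app) exposes an HTTP API on port 11434 by default, you can also query a model without the CLI. A minimal sketch using the documented /api/generate endpoint:

# Request a single, non-streamed completion from the local server
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'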
