Assuming you’ve already installed the Ollama application and CLI on your machine, this guide walks you through running your first large language model (LLM), Meta’s Llama 3.2, entirely offline. Ollama supports many popular models, so feel free to substitute your preferred one.
Prerequisites
- Ollama CLI installed and configured
- At least 4 GB of free RAM
- ~2 GB of disk space for the Llama 3.2 model
- A modern terminal (macOS, Linux, or WSL on Windows)
Downloading and storing LLMs locally can consume significant disk space and memory. Ensure you have adequate resources before proceeding.
1. Pulling and Running the Model
Open your terminal and execute the command below. Ollama downloads the model on first use and then drops you into an interactive >>> prompt.
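A minimal invocation, assuming the llama3.2 tag from the Ollama model library (the first run downloads the weights; later runs start immediately):

```
ollama run llama3.2
```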
All inference happens locally—your data remains on your device and no internet connection is needed after download.
2. Basic Interaction
At the >>> prompt, type any message and press Enter; the model streams its reply directly in the terminal.
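A sample exchange; the reply shown here is illustrative rather than verbatim output:

```
>>> Why is the sky blue?
The sky appears blue because shorter (blue) wavelengths of sunlight are
scattered by the atmosphere more strongly than longer wavelengths.
```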
3. Experimenting with Prompts
Try a creative prompt, such as asking for a short poem or a story idea; anything you would ask a chat assistant works here (see the example below).
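For instance, ask for something the model has to invent; the reply below is a made-up illustration of the kind of output to expect:

```
>>> Write a two-line poem about running an LLM offline.
No cloud above, no wire below,
yet still the quiet answers flow.
```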
Available Session Commands
| Command | Description |
|---|---|
| /set | Set session variables |
| /show | Display model information |
| /load <model> | Load a different model or session |
| /save <model> | Save your current session state |
| /clear | Clear conversation context |
| /bye | Exit the interactive session |
| /? or /help | Show help for commands |
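As a quick illustration of a few of these commands (the subcommand info and the session name my-chat are assumptions for this example, not values from the guide):

```
>>> /show info
>>> /save my-chat
>>> /load my-chat
```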
""") to start a multi-line message.
4. Resetting Context
To clear the model’s memory of previous interactions, use the /clear command at the >>> prompt.
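In the session it looks like this (the confirmation text may vary between Ollama versions):

```
>>> /clear
Cleared session context
```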
5. Performing Calculations
Ollama LLMs can even handle simple math; you can ask the model to evaluate an expression right at the prompt.
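A sample exchange (the wording of the reply will differ, and language models can slip on larger calculations):

```
>>> What is 128 * 46?
128 * 46 = 5,888
```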
6. Exiting the Session
When you’re finished, leave the chat with the /bye command (Ctrl+D also works).
You’ve now run and interacted with a large language model locally using Ollama. Explore other models, tweak prompts, and enjoy fully offline inference for better privacy and performance. Happy experimenting!