← Back to AI & LLMs
AI & LLMs by @timverhoogt

ollama-local

Manage and use local Ollama models

New
Source Code

Ollama Local

Work with local Ollama models for inference, embeddings, and tool use.

Configuration

Set your Ollama host (defaults to http://localhost:11434):

export OLLAMA_HOST="http://localhost:11434"
# Or for remote server:
export OLLAMA_HOST="http://192.168.1.100:11434"

Quick Reference

# List models
python3 scripts/ollama.py list

# Pull a model
python3 scripts/ollama.py pull llama3.1:8b

# Remove a model
python3 scripts/ollama.py rm modelname

# Show model details
python3 scripts/ollama.py show qwen3:4b

# Chat with a model
python3 scripts/ollama.py chat qwen3:4b "What is the capital of France?"

# Chat with system prompt
python3 scripts/ollama.py chat llama3.1:8b "Review this code" -s "You are a code reviewer"

# Generate completion (non-chat)
python3 scripts/ollama.py generate qwen3:4b "Once upon a time"

# Get embeddings
python3 scripts/ollama.py embed bge-m3 "Text to embed"

Model Selection

See references/models.md for full model list and selection guide.

Quick picks:

  • Fast answers: qwen3:4b
  • Coding: qwen2.5-coder:7b
  • General: llama3.1:8b
  • Reasoning: deepseek-r1:8b

Tool Use

Some local models support function calling. Use ollama_tools.py:

# Single request with tools
python3 scripts/ollama_tools.py single qwen2.5-coder:7b "What's the weather in Amsterdam?"

# Full tool loop (model calls tools, gets results, responds)
python3 scripts/ollama_tools.py loop qwen3:4b "Search for Python tutorials and summarize"

# Show available example tools
python3 scripts/ollama_tools.py tools

Tool-capable models: qwen2.5-coder, qwen3, llama3.1, mistral

OpenClaw Sub-Agents

Spawn local model sub-agents with sessions_spawn:

# Example: spawn a coding agent
sessions_spawn(
    task="Review this Python code for bugs",
    model="ollama/qwen2.5-coder:7b",
    label="code-review"
)

Model path format: ollama/<model-name>

Parallel Agents (Think Tank Pattern)

Spawn multiple local agents for collaborative tasks:

agents = [
    {"label": "architect", "model": "ollama/gemma3:12b", "task": "Design the system architecture"},
    {"label": "coder", "model": "ollama/qwen2.5-coder:7b", "task": "Implement the core logic"},
    {"label": "reviewer", "model": "ollama/llama3.1:8b", "task": "Review for bugs and improvements"},
]

for a in agents:
    sessions_spawn(task=a["task"], model=a["model"], label=a["label"])

Direct API

For custom integrations, use the Ollama API directly:

# Chat
curl $OLLAMA_HOST/api/chat -d '{
  "model": "qwen3:4b",
  "messages": [{"role": "user", "content": "Hello"}],
  "stream": false
}'

# Generate
curl $OLLAMA_HOST/api/generate -d '{
  "model": "qwen3:4b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'

# List models
curl $OLLAMA_HOST/api/tags

# Pull model
curl $OLLAMA_HOST/api/pull -d '{"name": "phi3:mini"}'

Troubleshooting

Connection refused?

  • Check Ollama is running: ollama serve
  • Verify OLLAMA_HOST is correct
  • For remote servers, ensure firewall allows port 11434

Model not loading?

  • Check VRAM: larger models may need CPU offload
  • Try a smaller model first

Slow responses?

  • Model may be running on CPU
  • Use smaller quantization (e.g., :7b instead of :30b)

OpenClaw sub-agent falls back to default model?

  • Ensure ollama:default auth profile exists in OpenClaw config
  • Check model path format: ollama/modelname:tag