Ollama
Local model runner with an OpenAI-compatible API. The default backend for musegpt.
Ollama runs models locally on your machine and serves an OpenAI-compatible HTTP API at http://localhost:11434, which musegpt connects to out of the box.
Configuration
[backend]
name = "ollama"
url = "http://localhost:11434"
model = "llama3.2"
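Before launching musegpt you can confirm that the configured url is reachable and that the configured model has been pulled. A minimal sketch (plain Python with the requests package, not part of musegpt) that queries Ollama's /api/tags endpoint, which lists the models available locally:

import requests

OLLAMA_URL = "http://localhost:11434"   # matches the url value above
MODEL = "llama3.2"                      # matches the model value above

# GET /api/tags returns the models Ollama has stored locally.
resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
resp.raise_for_status()

names = [m["name"] for m in resp.json()["models"]]
# Local names usually carry a tag suffix, e.g. "llama3.2:latest".
if any(n == MODEL or n.startswith(MODEL + ":") for n in names):
    print(f"{MODEL} is available")
else:
    print(f"{MODEL} not found; run `ollama pull {MODEL}` first")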
Supported Features
- Streaming chat completions via SSE (see the sketch after this list)
- Structured output (JSON mode)
- Model switching at runtime
- GPU acceleration (automatic)
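Streaming follows the standard OpenAI server-sent-events format: each event is a data: line carrying a JSON chunk, and the stream ends with data: [DONE]. A minimal sketch (plain Python with requests; the prompt is only an illustration) that prints the deltas as they arrive:

import json
import requests

resp = requests.post(
    "http://localhost:11434/v1/chat/completions",
    json={
        "model": "llama3.2",
        "messages": [{"role": "user", "content": "Name three chord progressions."}],
        "stream": True,  # ask the server for SSE chunks instead of one final reply
    },
    stream=True,  # let requests yield the response body line by line
)
resp.raise_for_status()

for line in resp.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        break  # end of stream
    chunk = json.loads(payload)
    delta = chunk["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)
print()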
Default Endpoint
POST http://localhost:11434/v1/chat/completions
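Because this endpoint mirrors the OpenAI chat completions API, any OpenAI-compatible client can talk to it. A minimal sketch using the official openai Python package (the api_key value is a placeholder; Ollama ignores it, but the client requires one, and the prompt is only an illustration):

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible base URL
    api_key="ollama",                      # required by the client, unused by Ollama
)

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Suggest a tempo for a lo-fi track."}],
)
print(response.choices[0].message.content)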