
Ollama

Local model runner with an OpenAI-compatible API. The default backend for musegpt.

Ollama is the default inference backend for musegpt. It runs models locally and exposes an OpenAI-compatible API at localhost:11434.
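
For example, any OpenAI-compatible client can reach the local server by overriding its base URL. The sketch below uses the Python openai package; the api_key value is a placeholder (Ollama does not check it, but the client requires one), and the model name matches the configuration shown below.

# A minimal sketch: pointing the OpenAI Python client at the local Ollama server.
# The api_key is a placeholder -- Ollama ignores it, but the client needs a value.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)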

Configuration

[backend]
name = "ollama"
url = "http://localhost:11434"
model = "llama3.2"

Supported Features

  • Streaming chat completions via SSE (see the sketch after this list)
  • Structured output (JSON mode)
  • Model switching at runtime
  • GPU acceleration (automatic)
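
Streaming maps onto the standard OpenAI stream parameter. The sketch below, again using the Python openai package against the default endpoint, prints tokens as they arrive; the prompt and model name are illustrative.

# A sketch of streaming chat completions over SSE (assumes the openai Python
# package and the default local endpoint; model name matches the config above).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

stream = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Explain SSE in two sentences."}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental delta; some chunks may be empty.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()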

Default Endpoint

POST http://localhost:11434/v1/chat/completions
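
As a sketch, the same endpoint can also be exercised with a raw HTTP POST; the request body follows the OpenAI chat-completions schema, and the model name is the one from the configuration above. The requests package is assumed to be installed.

# Raw-HTTP sketch of the default endpoint (assumes the requests package).
import requests

resp = requests.post(
    "http://localhost:11434/v1/chat/completions",
    json={
        "model": "llama3.2",
        "messages": [{"role": "user", "content": "ping"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])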
