local
4 documents categorized under local.
llama.cpp
High-performance local inference engine with an OpenAI-compatible server mode; a request sketch follows the list below.
Ollama
Local model runner with an OpenAI-compatible API, and the default backend for musegpt; see the client sketch below.
vLLM
High-throughput LLM serving engine with an OpenAI-compatible API; see the streaming sketch below.
whisper.cpp
Local speech recognition engine for audio-to-text transcription in the musegpt audio pipeline; see the transcription sketch below.
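
To make the OpenAI-compatibility claims concrete, here is a minimal sketch of querying llama.cpp's llama-server over plain HTTP. The model path in the comment is hypothetical; 8080 is llama-server's default port.

```python
import requests

# Assumes llama-server is already running, e.g.:
#   llama-server -m ./models/model.gguf --port 8080
# (the model path is hypothetical; 8080 is llama-server's default port)
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Say hello in one word."}],
        "max_tokens": 16,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```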
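Because Ollama exposes the same API shape, the standard openai Python client works against it by overriding base_url. A minimal sketch, assuming a model such as llama3 has already been pulled; the model name is an assumption, and the API key is a required but ignored placeholder.

```python
from openai import OpenAI

# Ollama serves an OpenAI-compatible API under /v1 on its default port 11434.
# The model must already be pulled (e.g. `ollama pull llama3`); the model
# name below is an assumption -- use whichever model you have locally.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
reply = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Summarize what Ollama does in one sentence."}],
)
print(reply.choices[0].message.content)
```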
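For vLLM, the same client also supports streaming, which suits its high-throughput serving model. A sketch assuming a server started with `vllm serve`; the served model name is an assumption, and 8000 is vLLM's default port.

```python
from openai import OpenAI

# Assumes a vLLM server started with, e.g.:
#   vllm serve mistralai/Mistral-7B-Instruct-v0.2
# (the model name is an assumption; 8000 is vLLM's default port)
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
stream = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    messages=[{"role": "user", "content": "Explain KV caching briefly."}],
    stream=True,  # tokens arrive incrementally as the server generates them
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```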
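whisper.cpp is typically driven as a command-line tool; below is a sketch of invoking it from Python and capturing the transcript. The binary name and model path are assumptions (recent builds ship the binary as whisper-cli; older ones named it main), and the input is expected to be a 16 kHz mono WAV file.

```python
import subprocess

# Invokes the whisper.cpp CLI on a WAV file and captures the transcript.
# The binary and model paths are assumptions -- adjust to your build and
# to whichever ggml model file you have downloaded.
result = subprocess.run(
    [
        "./whisper-cli",
        "-m", "models/ggml-base.en.bin",  # assumed model file
        "-f", "audio.wav",                # 16 kHz mono WAV input
        "--no-timestamps",                # emit plain text only
    ],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```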