Threading Model
Three-thread architecture for real-time audio, responsive UI, and background inference in the musegpt VST3 plugin.
musegpt runs inference via HTTP API calls to external backends. These calls can take anywhere from milliseconds to minutes, particularly for streamed responses. The threading model ensures the audio thread is never blocked, the UI remains responsive, and inference runs safely in the background.
The Three Threads
Audio Thread (Real-Time, DAW-Controlled)
- Calls processBlock() at ~1-10 ms intervals
- Must never block: no allocations, no locks, no syscalls, no HTTP, no I/O
- Reads inference results (e.g., generated MIDI) from a lock-free FIFO
- Never initiates inference requests
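The audio thread's read path can be illustrated with a single-producer/single-consumer ring buffer. This is a minimal sketch, not musegpt's actual implementation: `MidiEvent`, `SpscFifo`, and `drainMidi` are hypothetical names, and the fixed power-of-two capacity is an assumption made to keep indexing cheap. The background thread is the only caller of `push`, and the audio thread is the only caller of `pop`.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: one MIDI event produced by inference.
struct MidiEvent {
    std::uint8_t status;
    std::uint8_t data1;
    std::uint8_t data2;
    std::uint32_t sampleOffset;
};

// Single-producer/single-consumer ring buffer with preallocated storage.
// Indices grow monotonically and are masked on access, so every slot is usable.
// Assumes std::atomic<std::size_t> is lock-free on the target platform.
template <typename T, std::size_t Capacity>
class SpscFifo {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer side (background thread). Returns false when the buffer is full.
    bool push(const T& item) noexcept {
        const std::size_t w = write_.load(std::memory_order_relaxed);
        const std::size_t r = read_.load(std::memory_order_acquire);
        if (w - r == Capacity) return false;
        slots_[w & (Capacity - 1)] = item;
        write_.store(w + 1, std::memory_order_release);
        return true;
    }

    // Consumer side (audio thread). Returns false when the buffer is empty.
    bool pop(T& out) noexcept {
        const std::size_t r = read_.load(std::memory_order_relaxed);
        const std::size_t w = write_.load(std::memory_order_acquire);
        if (r == w) return false;
        out = slots_[r & (Capacity - 1)];
        read_.store(r + 1, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> write_{0};
    std::atomic<std::size_t> read_{0};
};

// What the audio callback would do each block: take whatever is ready, never wait.
template <std::size_t N>
std::size_t drainMidi(SpscFifo<MidiEvent, N>& fifo, MidiEvent* out,
                      std::size_t maxEvents) noexcept {
    std::size_t count = 0;
    while (count < maxEvents && fifo.pop(out[count])) ++count;
    return count;
}
```

Neither `push` nor `pop` allocates, locks, or makes a syscall, which is what lets `drainMidi` run inside processBlock(). When the buffer is full the producer gets `false` back and decides what to do; the audio thread is never asked to wait.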
UI Thread (Main Thread)
- Handles user interaction and GUI repaints
- Enqueues inference requests to the background thread (non-blocking)
- Reads streaming response tokens from a lock-free queue
- Never waits on inference completion
Background Thread (Inference Worker)
- Owns the HTTP client lifecycle
- Picks up requests from a thread-safe queue
- Makes blocking HTTP calls to the configured inference backend
- Streams response tokens back via a lock-free queue
- The only thread that is allowed to block on I/O
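One way to picture the worker is a thread that sleeps on a condition variable until a request arrives, then runs the blocking call outside the lock. This is a sketch under assumptions: `InferenceRequest`, `InferenceWorker`, and the `Handler` callback (standing in for the real HTTP call) are hypothetical names and do not come from the musegpt codebase.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical request type; the real one likely carries more fields.
struct InferenceRequest {
    std::string prompt;
};

class InferenceWorker {
public:
    // Stand-in for the blocking HTTP call to the inference backend.
    using Handler = std::function<void(const InferenceRequest&)>;

    explicit InferenceWorker(Handler handler)
        : handler_(std::move(handler)), thread_([this] { run(); }) {}

    // Signals shutdown, lets the worker finish queued requests, then joins.
    ~InferenceWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        thread_.join();
    }

    // Called from the UI thread: holds the mutex only long enough to enqueue.
    void submit(InferenceRequest request) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(request));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            InferenceRequest request;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and nothing left to do
                request = std::move(queue_.front());
                queue_.pop_front();
            }
            handler_(request);  // blocking work happens outside the lock
        }
    }

    Handler handler_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<InferenceRequest> queue_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so it starts after the other members exist
};
```

The mutex here only guards the queue between the UI and background threads. The audio thread never touches this class.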
Data Flow
sequenceDiagram
participant UI as UI Thread
participant BG as Background Thread
participant Audio as Audio Thread
UI->>BG: user sends message (queue)
BG->>BG: HTTP call to backend
BG->>UI: streaming tokens (lock-free)
UI->>UI: update display
BG->>Audio: MIDI result (lock-free FIFO)
Audio->>Audio: read MIDI (never blocks)
Rules
- No mutexes on the audio thread. All communication with the audio thread uses lock-free structures only (SPSC ring buffers, atomic flags).
- Audio thread reads, never writes requests. It consumes results from a lock-free FIFO. It never initiates inference.
- UI thread never waits on inference. It enqueues requests and polls/receives responses asynchronously.
- Background thread owns all I/O. It is the only thread that makes HTTP calls or touches the network.
- UI-to-background communication uses a thread-safe queue. A mutex is acceptable here since neither thread is real-time.
- Backend lifecycle is managed on the background thread. Starting, stopping, and health-checking backends never happens on the audio or UI thread.
- Streaming tokens are delivered incrementally. The background thread pushes tokens as they arrive via SSE; the UI thread consumes them for display without waiting for completion.
- Cancellation is cooperative. The UI thread sets an atomic cancel flag; the background thread checks it between tokens and aborts the HTTP call.
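The last two rules, incremental delivery and cooperative cancellation, fit in one loop. The sketch below assumes a hypothetical `nextToken` callable that stands in for reading one SSE event from the HTTP response, and a hypothetical `deliver` callable that stands in for pushing a token onto the UI-facing queue. The real HTTP client and queue types are not shown.

```cpp
#include <atomic>
#include <functional>
#include <string>

enum class StreamResult { Completed, Cancelled };

// Runs on the background thread.
// nextToken: fills in the next token and returns true, or returns false at end of stream.
// deliver:   hands one token to the UI side as soon as it arrives.
// cancelRequested: set by the UI thread, checked here between tokens.
StreamResult streamTokens(const std::function<bool(std::string&)>& nextToken,
                          const std::function<void(const std::string&)>& deliver,
                          const std::atomic<bool>& cancelRequested) {
    std::string token;
    for (;;) {
        if (cancelRequested.load(std::memory_order_acquire)) {
            // This is the point where the real worker would abort the HTTP call.
            return StreamResult::Cancelled;
        }
        if (!nextToken(token)) return StreamResult::Completed;
        deliver(token);
    }
}
```

Because the flag is only checked between tokens, cancellation latency is bounded by how long one `nextToken` call blocks. A real HTTP client would probably also need a read timeout so a stalled stream still notices the flag.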
Testing Strategy
- Interface contract tests: verify the InferenceBackend abstraction behaves correctly using a mock backend
- Async request/response tests: submit a request, confirm the response arrives on a callback, confirm the caller never blocks
- Audio thread safety tests: prove the audio thread can read results without blocking, even during active inference
- Concurrent stress tests: multiple requests, backend start/stop during inflight requests, rapid submit/cancel
- ThreadSanitizer (TSan): all tests compile and run clean with -fsanitize=thread
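An interface contract test against a mock might look like the following. The shape of `InferenceBackend` shown here (a single `generate` method with a token callback) is purely an assumption for illustration, and the real abstraction may differ. `MockBackend` and `collect` are likewise hypothetical helpers.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Assumed shape of the abstraction; not taken from the musegpt source.
class InferenceBackend {
public:
    virtual ~InferenceBackend() = default;
    // Calls onToken once per streamed token; returns true if the stream ended normally.
    virtual bool generate(const std::string& prompt,
                          const std::function<void(const std::string&)>& onToken) = 0;
};

// Mock that replays a canned token sequence without touching the network.
class MockBackend : public InferenceBackend {
public:
    explicit MockBackend(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

    bool generate(const std::string& prompt,
                  const std::function<void(const std::string&)>& onToken) override {
        lastPrompt = prompt;
        for (const auto& token : tokens_) onToken(token);
        return true;
    }

    std::string lastPrompt;  // recorded so a test can check what was sent

private:
    std::vector<std::string> tokens_;
};

// Helper a test could use: concatenates every token the backend streams.
std::string collect(InferenceBackend& backend, const std::string& prompt) {
    std::string out;
    backend.generate(prompt, [&out](const std::string& token) { out += token; });
    return out;
}
```

A contract test would then build a `MockBackend` with known tokens, call `collect`, and check that the tokens arrive in order and that the prompt was forwarded unchanged. The same assertions could later run against a real backend behind the same interface.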