AIRGAP Monitor
The single entry point for LLM server status monitoring and model selection.
Overview
AIRGAP Monitor is the extension that exclusively owns model selection and LLM server status monitoring in AIRGAP Studio. It surfaces the active model, llama-server status, and response metrics in one place, and acts as the single source of truth for model switching.
Other extensions that consume LLMs — such as AIRGAP Assistant and AIRGAP Lite Assistant — do not expose a model selection UI. All model switches must go through AIRGAP Monitor.
Monitor Panel
Click the Monitor icon in the Activity Bar to see the following information in a single view:
- Active model — model name, context size, GPU/CPU mode
- llama-server status — /health response, port, uptime
- Performance metrics — memory usage, tokens/sec throughput, TTFT (time to first token)
- Model metadata — chat template, recommended context window, compatibility profile
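As an illustration of how the two response metrics relate, the sketch below derives TTFT and throughput from three timestamps. It is not the extension's code: the ResponseTiming shape and the computeMetrics name are hypothetical, and measuring throughput over the generation window only (excluding the wait for the first token) is an assumption.

```typescript
// Hypothetical timing record for one streamed response (illustrative names).
interface ResponseTiming {
  requestSentMs: number;   // when the request was sent
  firstTokenMs: number;    // when the first token arrived
  lastTokenMs: number;     // when the final token arrived
  tokensGenerated: number; // total tokens in the response
}

interface ResponseMetrics {
  ttftMs: number;       // time to first token, in milliseconds
  tokensPerSec: number; // generation throughput
}

function computeMetrics(t: ResponseTiming): ResponseMetrics {
  const ttftMs = t.firstTokenMs - t.requestSentMs;
  const generationMs = t.lastTokenMs - t.firstTokenMs;
  // Assumption: the first token marks the start of the window, so only the
  // remaining tokens count toward throughput. Zero-length windows report 0.
  const tokensPerSec =
    generationMs > 0 ? ((t.tokensGenerated - 1) * 1000) / generationMs : 0;
  return { ttftMs, tokensPerSec };
}

// Usage with made-up numbers.
const sample = computeMetrics({
  requestSentMs: 0,
  firstTokenMs: 250,
  lastTokenMs: 2250,
  tokensGenerated: 101,
});
console.log(`TTFT ${sample.ttftMs} ms, ${sample.tokensPerSec} tokens/sec`);
```

Keeping TTFT separate from throughput makes it easier to tell a slow model load or prompt evaluation (high TTFT) apart from slow generation (low tokens/sec).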
Model Selection Workflow
- Open the Command Palette (Ctrl+Shift+P) → run AIRGAP: Select LLM Model
- Choose a model from the QuickPick list
- llama-server restarts automatically (your current work is not interrupted)
- The active model is propagated automatically to every LLM-consuming extension
- The read-only model badge in AIRGAP Assistant / AIRGAP Lite Assistant refreshes to show the new model
A switch usually completes within seconds to tens of seconds. The first load may take longer depending on model size and disk speed.
Model Change IPC Flow
Internally, a model switch goes through the following stages:
- Request emit — AIRGAP Monitor atomically writes config-request.json
- Launcher detection — The C# Launcher's FileSystemWatcher picks up the change
- Request merge — MergeRequest combines the new request with the current configuration and produces a new llama-server argument set
- Server restart — RestartWithNewModelAsync stops the existing llama-server and spawns it with the new model
- State refresh — The Launcher updates config.json, and Monitor re-emits active-model.json v2
- Subscriber refresh — LLM consumers subscribe to active-model.json via read-only fs.watch, so their UI updates automatically
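The request emit and request merge stages can be sketched in TypeScript as follows. This is an illustration under stated assumptions, not the actual implementation: the real merge runs in the C# Launcher, the ServerConfig fields are hypothetical because the schema of config-request.json is not given here, and the mapping to --model, --ctx-size, --n-gpu-layers, and --port is an assumed subset of llama-server options. The atomic write uses the common write-to-temp-then-rename pattern, so a watcher never observes a half-written file.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical configuration shape; the real schema is not documented here.
interface ServerConfig {
  modelPath: string;
  contextSize: number;
  gpuLayers: number;
  port: number;
}

// A request only names the fields it wants to change.
type ModelRequest = Partial<ServerConfig>;

// Request emit: write to a temp file in the same directory, then rename it
// over the target so readers see either the old file or the complete new one.
function writeJsonAtomic(targetPath: string, data: unknown): void {
  const tmpPath = path.join(
    path.dirname(targetPath),
    `.${path.basename(targetPath)}.${process.pid}.tmp`,
  );
  fs.writeFileSync(tmpPath, JSON.stringify(data, null, 2), "utf8");
  fs.renameSync(tmpPath, targetPath);
}

// Request merge: fields present in the request override the current values.
function mergeRequest(current: ServerConfig, request: ModelRequest): ServerConfig {
  return { ...current, ...request };
}

// Translate the merged configuration into a llama-server argument list.
function toServerArgs(config: ServerConfig): string[] {
  return [
    "--model", config.modelPath,
    "--ctx-size", String(config.contextSize),
    "--n-gpu-layers", String(config.gpuLayers),
    "--port", String(config.port),
  ];
}

// Usage with made-up values, written to a scratch directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "airgap-demo-"));
const requestPath = path.join(demoDir, "config-request.json");
const request: ModelRequest = { modelPath: "models/example.gguf", contextSize: 8192 };
writeJsonAtomic(requestPath, request);

const current: ServerConfig = {
  modelPath: "models/previous.gguf",
  contextSize: 4096,
  gpuLayers: 0,
  port: 8080,
};
const merged = mergeRequest(current, JSON.parse(fs.readFileSync(requestPath, "utf8")));
console.log(toServerArgs(merged).join(" "));
```

The temp file is created next to the target because a rename is only atomic within a single filesystem.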
This flow enforces the principle that AIRGAP Monitor is the single source of truth for model switching. LLM-consuming extensions cannot write config-request.json / config.json / active-model.json directly.
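On the consumer side, a read-only subscriber might look like the sketch below. The ActiveModel fields are hypothetical, since the v2 schema of active-model.json is not shown in this document, and the function names are illustrative. The sketch watches the containing directory rather than the file itself, an assumption made because a file that is replaced via rename can drop out of a file-level watch on some platforms. It only ever reads the file.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical subset of active-model.json v2; the real schema may differ.
interface ActiveModel {
  version: number;
  modelName: string;
}

// Parse defensively: return null for invalid or partially written content.
function parseActiveModel(raw: string): ActiveModel | null {
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value !== "object" || value === null) return null;
    const record = value as Record<string, unknown>;
    const version = record["version"];
    const modelName = record["modelName"];
    if (typeof version !== "number" || typeof modelName !== "string") return null;
    return { version, modelName };
  } catch {
    return null;
  }
}

// Watch the directory and react only to events for the target file.
// The subscriber never writes; it re-reads the file and notifies the UI.
function watchActiveModel(
  filePath: string,
  onChange: (model: ActiveModel) => void,
): fs.FSWatcher {
  const baseName = path.basename(filePath);
  return fs.watch(path.dirname(filePath), (_eventType, filename) => {
    // Some platforms do not report a filename; re-read in that case too.
    if (filename !== null && filename !== baseName) return;
    try {
      const model = parseActiveModel(fs.readFileSync(filePath, "utf8"));
      if (model !== null) onChange(model);
    } catch {
      // The file may be briefly missing during a replace; wait for the next event.
    }
  });
}
```

A consumer would call watchActiveModel once on activation, update its model badge in the callback, and call close() on the returned watcher when it is disposed.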
Supported Models Catalog
For the list of supported models, recommended specs, and compatibility profiles, see Supported Models. New models are registered in the single sources of truth — phase3/models-metadata.json + model-version.json — and then appear automatically in the AIRGAP Monitor QuickPick.
Notes
- AIRGAP Assistant and AIRGAP Lite Assistant do not provide an in-extension UI for changing the model, and one will not be added in the future.
- Direct edits to config-request.json / config.json / active-model.json / models-metadata.json are forbidden — an inconsistent state can prevent llama-server from starting.
- The first response right after a model switch may be slow (initial GGUF memory mapping).
Related Documents
- AIRGAP Lite Assistant — Default assistant
- AIRGAP Assistant — Autonomous agent assistant
- Switching AI Assistants — Assistant switching procedure
- Supported Models — Model catalog