AIRGAP Studio

AIRGAP Monitor

The single entry point for LLM server status monitoring and model selection.

Overview

AIRGAP Monitor is the extension that exclusively owns model selection and LLM server status monitoring in AIRGAP Studio. It surfaces the active model, llama-server status, and response metrics in one place, and acts as the single source of truth for model switching.

Other extensions that consume LLMs — such as AIRGAP Assistant and AIRGAP Lite Assistant — do not expose a model selection UI. All model switches must go through AIRGAP Monitor.

Monitor Panel

Click the Monitor icon in the Activity Bar to see the following information in a single view:

  • Active model — model name, context size, GPU/CPU mode
  • llama-server status — /health response, port, uptime
  • Performance metrics — memory usage, tokens/sec throughput, TTFT (time to first token)
  • Model metadata — chat template, recommended context window, compatibility profile
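
As a rough illustration of the two response metrics, the sketch below derives TTFT and tokens/sec from per-request timestamps. The `CompletionTiming` record and its field names are hypothetical, and measuring throughput over the decode window (after the first token) is one common convention, not necessarily the one AIRGAP Monitor uses.

```typescript
// Hypothetical timing record for one completion; field names are illustrative.
interface CompletionTiming {
  requestSentMs: number;   // timestamp when the request was sent
  firstTokenMs: number;    // timestamp when the first token arrived
  lastTokenMs: number;     // timestamp when the final token arrived
  generatedTokens: number; // total number of tokens produced
}

interface ResponseMetrics {
  ttftMs: number;       // time to first token, in milliseconds
  tokensPerSec: number; // decode throughput
}

function computeMetrics(t: CompletionTiming): ResponseMetrics {
  const ttftMs = t.firstTokenMs - t.requestSentMs;
  const decodeMs = t.lastTokenMs - t.firstTokenMs;
  // Only the tokens after the first one are produced inside the decode window.
  const decodedTokens = Math.max(t.generatedTokens - 1, 0);
  const tokensPerSec = decodeMs > 0 ? (decodedTokens * 1000) / decodeMs : 0;
  return { ttftMs, tokensPerSec };
}
```

Under this definition, a slow first response after a model switch raises TTFT but leaves tokens/sec largely unaffected.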

Model Selection Workflow

  1. Open the Command Palette (Ctrl+Shift+P) → run AIRGAP: Select LLM Model
  2. Choose a model from the QuickPick list
  3. llama-server restarts automatically (your current work is not interrupted)
  4. The active model is propagated automatically to every LLM-consuming extension
  5. The model badge in AIRGAP Assistant / AIRGAP Lite Assistant refreshes in read-only mode

A switch usually completes within seconds to tens of seconds. The first load may take longer depending on model size and disk speed.

Model Change IPC Flow

Internally, a model switch goes through the following stages:

  1. Request emit — AIRGAP Monitor atomically writes config-request.json
  2. Launcher detection — The C# Launcher's FileSystemWatcher picks up the change
  3. Request merge — MergeRequest combines the new request with the current configuration and produces a new llama-server argument set
  4. Server restart — RestartWithNewModelAsync stops the existing llama-server and spawns it with the new model
  5. State refresh — The Launcher updates config.json, and Monitor re-emits active-model.json v2
  6. Subscriber refresh — LLM consumers subscribe to active-model.json via read-only fs.watch, so their UI updates automatically
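
Stages 1 and 3 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the `ModelConfig` fields and the helper names `writeJsonAtomic`, `mergeRequest`, and `toLlamaServerArgs` are hypothetical, and the real merge runs inside the C# Launcher. The command-line flags shown (`--model`, `--ctx-size`, `--n-gpu-layers`, `--port`) are standard llama-server options.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical config shape; the real JSON schema is not documented here.
interface ModelConfig {
  modelPath: string;
  contextSize: number;
  gpuLayers: number;
  port: number;
}

// Stage 1: write to a temp file next to the target, then rename over it,
// so a watcher never observes a half-written request file.
function writeJsonAtomic(target: string, data: unknown): void {
  const tmp = `${target}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(data, null, 2), "utf8");
  fs.renameSync(tmp, target);
}

// Stage 3: fields present in the request override the current configuration.
function mergeRequest(current: ModelConfig, request: Partial<ModelConfig>): ModelConfig {
  return { ...current, ...request };
}

// Turn the merged configuration into a llama-server argument list.
function toLlamaServerArgs(c: ModelConfig): string[] {
  return [
    "--model", c.modelPath,
    "--ctx-size", String(c.contextSize),
    "--n-gpu-layers", String(c.gpuLayers),
    "--port", String(c.port),
  ];
}

// Usage: emit a request into a scratch directory and compute the new arguments.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "airgap-demo-"));
const requestPath = path.join(dir, "config-request.json");
const request: Partial<ModelConfig> = { modelPath: "models/example.gguf" };
writeJsonAtomic(requestPath, request);

const current: ModelConfig = { modelPath: "models/old.gguf", contextSize: 4096, gpuLayers: 0, port: 8080 };
const args = toLlamaServerArgs(mergeRequest(current, request));
```

Writing the temp file in the same directory as the target keeps the rename on a single volume, which is what allows it to replace the file in one step.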

This flow enforces the principle that AIRGAP Monitor is the single source of truth for model switching. LLM-consuming extensions cannot write config-request.json / config.json / active-model.json directly.
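
On the consumer side, a read-only subscriber might look like the sketch below. The `ActiveModel` fields and the rule that only `version: 2` payloads are accepted are assumptions made for illustration; only `fs.watch` and `active-model.json` come from the flow described above. The subscriber only ever reads the file.

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical shape of active-model.json v2; the real schema may differ.
interface ActiveModel {
  version: number;
  modelName: string;
  contextSize: number;
}

// Returns undefined for malformed, partial, or non-v2 payloads.
function parseActiveModel(text: string): ActiveModel | undefined {
  try {
    const raw = JSON.parse(text);
    if (
      raw !== null &&
      typeof raw === "object" &&
      raw.version === 2 &&
      typeof raw.modelName === "string" &&
      typeof raw.contextSize === "number"
    ) {
      return { version: 2, modelName: raw.modelName, contextSize: raw.contextSize };
    }
  } catch {
    // Malformed JSON: ignore and wait for the next change event.
  }
  return undefined;
}

// Watches the containing directory so that a file replaced via rename is
// still observed. Never writes to the file.
function watchActiveModel(file: string, onChange: (m: ActiveModel) => void): fs.FSWatcher {
  const name = path.basename(file);
  return fs.watch(path.dirname(file), (_event, changed) => {
    if (changed !== null && changed !== name) return; // unrelated file
    let text: string;
    try {
      text = fs.readFileSync(file, "utf8");
    } catch {
      return; // file briefly missing during replacement
    }
    const model = parseActiveModel(text);
    if (model) onChange(model);
  });
}
```

A consumer would call `watchActiveModel(pathToActiveModelJson, updateBadge)` on activation and call `close()` on the returned watcher on deactivation; both `pathToActiveModelJson` and `updateBadge` are placeholders here.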

Supported Models Catalog

For the list of supported models, recommended specs, and compatibility profiles, see Supported Models. New models are registered in phase3/models-metadata.json and model-version.json, which together serve as the source of truth for the catalog, and then appear automatically in the AIRGAP Monitor QuickPick.
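
To illustrate how catalog entries could surface in the QuickPick, the sketch below maps metadata records to list items. The `ModelMetadata` fields are hypothetical (the schema of models-metadata.json is not described here); `PickItem` mirrors the `label`, `description`, and `detail` fields of a VS Code QuickPick item.

```typescript
// Hypothetical catalog entry; the real models-metadata.json schema may differ.
interface ModelMetadata {
  id: string;
  displayName: string;
  recommendedContext: number;
  compatibilityProfile: string;
}

// Mirrors the label/description/detail fields of a VS Code QuickPickItem.
interface PickItem {
  label: string;
  description: string;
  detail: string;
}

// Sorts entries by display name and marks the currently active model.
function toPickItems(models: ModelMetadata[], activeId?: string): PickItem[] {
  return [...models]
    .sort((a, b) => a.displayName.localeCompare(b.displayName))
    .map((m) => ({
      label: m.id === activeId ? `${m.displayName} (active)` : m.displayName,
      description: m.compatibilityProfile,
      detail: `Recommended context: ${m.recommendedContext} tokens`,
    }));
}
```

Because the list is derived entirely from the metadata, registering a new entry is enough for it to appear without any UI change.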

Notes

  • AIRGAP Assistant and AIRGAP Lite Assistant do not provide an in-extension UI for changing the model, and one will not be added in the future.
  • Direct edits to config-request.json / config.json / active-model.json / models-metadata.json are forbidden — an inconsistent state can prevent llama-server from starting.
  • The first response right after a model switch may be slow (initial GGUF memory mapping).