LLM Server Troubleshooting
Diagnostic and resolution steps when llama-server is not working
The AI engine in AIRGAP Studio runs as an embedded llama-server process. Use the following steps to diagnose problems when AI responses do not appear or when the chat panel is empty.
1. AI Engine Health Check
In AIRGAP Studio, llama-server listens on port 11434 by default. The /health endpoint returns HTTP 200 when the engine is healthy.
PowerShell
Invoke-WebRequest http://127.0.0.1:11434/health -UseBasicParsing
curl
curl http://127.0.0.1:11434/health
Healthy response:
HTTP/1.1 200 OK
{"status":"ok"}
If you get a connection refused error or no response, llama-server is not running, most likely because the Launcher has not started it. Restart the Launcher; if the issue persists, check the Launcher log (Section 7).
2. Compatibility Proxy Health Check
Mistral and some other models go through a Compatibility Proxy on port 11433 for per-model chat-format normalization. The proxy is started and managed automatically by the Launcher.
curl http://127.0.0.1:11433/health
If you do not get a healthy response, restart the Launcher. The Launcher spawns the proxy only after the llama-server health check returns 200 OK, so port 11434 must be healthy first.
The proxy listens only on loopback (127.0.0.1) and cannot be reached from outside the machine.
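Because the proxy depends on the engine, it helps to probe the two ports in that order. The following is a minimal PowerShell sketch of that check. Test-AirgapHealth is a hypothetical helper name used for illustration, not a command shipped with AIRGAP Studio, and the 5-second timeout is an assumption.

```powershell
# Hypothetical helper: returns $true only when /health answers with HTTP 200.
function Test-AirgapHealth {
    param(
        [Parameter(Mandatory)] [int] $Port,
        [int] $TimeoutSec = 5   # assumed timeout, adjust as needed
    )
    try {
        $response = Invoke-WebRequest -Uri "http://127.0.0.1:$Port/health" -UseBasicParsing -TimeoutSec $TimeoutSec
        return ($response.StatusCode -eq 200)
    }
    catch {
        # Connection refused, timeouts, and non-2xx statuses all land here.
        return $false
    }
}

# Check the engine first; the proxy is only started once the engine is healthy.
if (-not (Test-AirgapHealth -Port 11434)) {
    Write-Output 'llama-server (11434) is not healthy. Fix this first; the proxy depends on it.'
}
elseif (-not (Test-AirgapHealth -Port 11433)) {
    Write-Output 'llama-server is healthy but the Compatibility Proxy (11433) is not. Restart the Launcher.'
}
else {
    Write-Output 'llama-server and the Compatibility Proxy both report healthy.'
}
```

Checking in this order avoids chasing a proxy problem that is really an engine problem.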
3. Port Conflict Diagnosis
Main ports used by AIRGAP Studio:
| Port | Purpose |
|---|---|
| 11434 | llama-server (AI engine) |
| 11433 | Compatibility Proxy |
| 9222 | VSCodium Chrome DevTools (debug only) |
If another program holds one of these ports, the Launcher cannot start.
# Find the process holding port 11434
Get-NetTCPConnection -LocalPort 11434 | Select-Object LocalAddress, LocalPort, State, OwningProcess
# Look up the program by its PID
Get-Process -Id <PID>
Stop the conflicting program or reconfigure its port, then restart the Launcher.
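To check all three ports in one pass, the two commands above can be combined into a loop. This is a sketch that assumes a Windows host where the Get-NetTCPConnection cmdlet is available; the port list comes from the table in this section.

```powershell
# Ports from the table in this section.
$ports = 11434, 11433, 9222

foreach ($port in $ports) {
    # A port with no listener is the normal case, so suppress the "not found" error.
    $listeners = Get-NetTCPConnection -LocalPort $port -State Listen -ErrorAction SilentlyContinue
    if (-not $listeners) {
        Write-Output "Port ${port}: free"
        continue
    }
    # One process can hold several sockets (IPv4 and IPv6), so de-duplicate the PIDs.
    foreach ($procId in ($listeners.OwningProcess | Sort-Object -Unique)) {
        $proc = Get-Process -Id $procId -ErrorAction SilentlyContinue
        $name = if ($proc) { $proc.ProcessName } else { 'unknown' }
        Write-Output "Port ${port}: held by $name (PID $procId)"
    }
}
```

Run this with AIRGAP Studio fully closed. While Studio is running, its own processes are expected to hold ports 11434 and 11433.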
4. Less than 4GB VRAM — Automatic CPU Fallback
AIRGAP Studio measures system VRAM and automatically switches to CPU inference when VRAM is less than 4096MB (4GB). No configuration is required.
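The threshold rule can be illustrated with a small sketch. Get-InferenceMode is a hypothetical function written for this example; it only mirrors the "less than 4096MB" rule stated above and does not reflect how the Launcher actually measures VRAM.

```powershell
# Hypothetical illustration of the documented rule: below 4096 MB of VRAM means CPU inference.
function Get-InferenceMode {
    param([Parameter(Mandatory)] [int] $VramMB)
    $thresholdMB = 4096
    if ($VramMB -lt $thresholdMB) { 'CPU' } else { 'GPU' }
}

Get-InferenceMode -VramMB 2048   # CPU
Get-InferenceMode -VramMB 4096   # GPU (exactly 4096 MB is not "less than" the threshold)
Get-InferenceMode -VramMB 8192   # GPU
```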
Symptoms
- AI responses arrive but are noticeably slow (CPU inference).
- The Launcher log records GPU/CPU fallback messages.
Resolution
- Upgrade to a GPU with more VRAM (4GB minimum, 8GB+ recommended for comfortable use).
- Or switch to a smaller model:
  - Qwen3 1.7B: lightest option, fast responses even on CPU
  - Granite 4.0 Micro: optimized for English code authoring
Change models with the AIRGAP: Select LLM Model command (Section 9).
5. OOM (Out-Of-Memory)
If VRAM or RAM is exhausted during model loading or inference, the Launcher automatically falls back to a smaller model or CPU mode.
Symptoms
- Responses work initially but stop after a long context input.
- The Launcher log shows messages like "OOM", "out of memory", or "failed to allocate".
Resolution
- Check the Launcher log (Section 7) for OOM messages.
- Switch to a smaller model.
- Close other GPU-intensive programs (games, browsers with hardware acceleration, etc.).
6. Model .gguf File Not Found
The AI engine needs at least one .gguf model file bundled with the installer.
Symptoms
- The Launcher starts but immediately exits with "No model found" or a similar error.
- AI responses never appear.
How to Verify
Check that the models/ directory under the install location contains at least one .gguf file.
- Dev build: build/vscodium/models/
- Installed: %LOCALAPPDATA%\AIRGAP\models\ (path varies by install location)
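The check can be scripted. The sketch below assumes the default installed path from the list above; Get-GgufModel is a hypothetical helper name, and the size column is added only as a convenience for spotting truncated files.

```powershell
# Hypothetical helper: list .gguf files in a models directory (empty result if the directory is missing).
function Get-GgufModel {
    param([Parameter(Mandatory)] [string] $ModelsDir)
    if (-not (Test-Path -LiteralPath $ModelsDir -PathType Container)) {
        return
    }
    Get-ChildItem -LiteralPath $ModelsDir -Filter '*.gguf' -File
}

# Default installed location; change this if Studio was installed elsewhere.
$modelsDir = "$env:LOCALAPPDATA\AIRGAP\models"
$models = @(Get-GgufModel -ModelsDir $modelsDir)

if ($models.Count -eq 0) {
    Write-Output "No .gguf files found under $modelsDir"
}
else {
    # Size in GB helps spot truncated or partially copied files.
    $models | Select-Object Name, @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Length / 1GB, 2) } }
}
```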
Resolution
Run the Modelpack installer separately to add the model files. The AIRGAP Studio installer is split into three packages:
- Full — Studio + Modelpack (installed in one step)
- IDE-Only — Studio only (no models)
- Modelpack — Model files only (when the IDE is already installed)
If you installed IDE-Only, also run the Modelpack installer.
7. Launcher Log Location
The most important diagnostic information is in the Launcher log.
| Environment | Log Path |
|---|---|
| Installed (prod) | %LOCALAPPDATA%\AIRGAP\logs\launcher.log |
| Dev build (dev) | phase3/scripts/dev-launcher.log |
Look for the following keywords in the last 100 lines of the log:
- health check failed: Health check failure
- port .* in use: Port conflict
- VRAM/GPU/CPU fallback: GPU/CPU fallback
- OOM/out of memory: Memory exhaustion
- model not found: Missing model file
# Tail the last 100 lines of the log in PowerShell
Get-Content "$env:LOCALAPPDATA\AIRGAP\logs\launcher.log" -Tail 100
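Scanning for the keywords listed above can also be automated. The following sketch maps each keyword to its meaning and reports which ones appear. Find-LauncherIssue is a hypothetical helper, and the regular expressions are one interpretation of the keyword list (for example, treating "VRAM" and "GPU/CPU fallback" as alternatives); the actual log wording may differ.

```powershell
# Hypothetical helper: report which known keywords appear in a set of log lines.
function Find-LauncherIssue {
    param([string[]] $Lines)

    # Regex pattern -> meaning. Matching is case-insensitive (Select-String default).
    $patterns = [ordered]@{
        'health check failed'    = 'Health check failure'
        'port .* in use'         = 'Port conflict'
        'VRAM|GPU/CPU fallback'  = 'GPU/CPU fallback'
        '\bOOM\b|out of memory'  = 'Memory exhaustion'
        'model not found'        = 'Missing model file'
    }

    foreach ($entry in $patterns.GetEnumerator()) {
        $hits = @($Lines | Select-String -Pattern $entry.Key)
        if ($hits.Count -gt 0) {
            [pscustomobject]@{
                Issue    = $entry.Value
                Count    = $hits.Count
                LastLine = $hits[-1].Line   # most recent matching line
            }
        }
    }
}

$logPath = "$env:LOCALAPPDATA\AIRGAP\logs\launcher.log"
if (Test-Path -LiteralPath $logPath) {
    Find-LauncherIssue -Lines (Get-Content -LiteralPath $logPath -Tail 100)
}
```

An empty result means none of the keywords appeared in the last 100 lines, not that the log is free of errors.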
8. Chat Webview Goes Black
Sometimes AI responses work correctly but the chat panel appears as a black screen.
Cause
The VSCodium webview is rendered inside an OOP (Out-Of-Process) iframe, which can occasionally fail to paint due to a transient rendering issue. This is unrelated to the AI engine or network.
Resolution
Restart VSCodium:
- Command Palette (Ctrl+Shift+P) → Developer: Reload Window
- Or close the IDE from the Launcher and start it again
A single restart usually resolves the issue. If it recurs, update your GPU driver and close other GPU-intensive applications.
9. No Response After Model Change
After switching the model with AIRGAP: Select LLM Model, AI responses may not arrive immediately.
Expected Behavior
- Command Palette → AIRGAP: Select LLM Model → pick a model
- The Launcher automatically restarts llama-server (this can take seconds to tens of seconds)
- AI responses resume once the new model is fully loaded
How to Verify
In the Launcher log (Section 7), look for a line like:
Starting llama-server with model <model name>
If this line is missing, or if it appears but is not followed by a 200 OK health check:
- Restart the Launcher.
- If the problem persists, try switching to a smaller model (memory pressure is a likely cause).
- Attach the Launcher log when contacting support.
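Since loading a new model takes time, a single health probe right after switching can fail even when nothing is wrong. A polling loop is one way to tell a slow load from a real failure. The sketch below is an assumption-laden example: Wait-LlamaServerHealthy is a hypothetical helper, and the 120-second limit and 2-second interval are arbitrary choices, not values used by the Launcher.

```powershell
# Hypothetical helper: poll /health until it returns 200 or the deadline passes.
function Wait-LlamaServerHealthy {
    param(
        [int] $Port = 11434,
        [int] $TimeoutSec = 120,   # assumed upper bound; larger models may need longer
        [int] $IntervalSec = 2
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSec)
    while ((Get-Date) -lt $deadline) {
        try {
            $response = Invoke-WebRequest -Uri "http://127.0.0.1:$Port/health" -UseBasicParsing -TimeoutSec 5
            if ($response.StatusCode -eq 200) { return $true }
        }
        catch {
            # Not ready yet: connection refused, or a non-200 status while the model loads.
        }
        Start-Sleep -Seconds $IntervalSec
    }
    return $false
}
```

Calling Wait-LlamaServerHealthy after picking a model returns $true as soon as the engine is ready. A $false result after the full wait suggests the restart failed, at which point the steps above apply.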
Related Documents
- Terminal Troubleshooting — PowerShell / PATH / encoding issues
- Git Checkpoints Troubleshooting — Workspace history issues
- Getting Started — Initial install and setup