New defaults:
OLLAMA_NO_CLOUD=1 - "Ollama can run in local only mode by disabling
Ollama’s cloud features. By turning off Ollama’s cloud features, you
will lose the ability to use Ollama’s cloud models and web search."
https://ollama.com/blog/web-search
https://docs.ollama.com/faq#how-do-i-disable-ollama%E2%80%99s-cloud-features
example output:
```
ollama run minimax-m2.7:cloud
Error: ollama cloud is disabled: remote model details are unavailable
```
Cloud features can be safely disabled here because logging in to Ollama
cloud requires clicking a link, and there's no real way to do that in
NOMAD outside of looking at the nomad_ollama logs.
The OLLAMA_NO_CLOUD default can be turned off in settings in case there's
a model out there that doesn't play nice, but that doesn't seem necessary
so far.
OLLAMA_FLASH_ATTENTION=1 - "Flash Attention is a feature of most modern
models that can significantly reduce memory usage as the context size
grows."
Tested with llama3.2:
```
docker logs nomad_ollama --tail 1000 2>&1 |grep --color -i flash_attn
llama_context: flash_attn = enabled
```
And with second_constantine/deepseek-coder-v2, which is based on
https://huggingface.co/lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF,
a model that specifically calls out that you should disable flash
attention. During testing, it seems Ollama can do this for you
automatically:
```
docker logs nomad_ollama --tail 1000 2>&1 |grep --color -i flash_attn
llama_context: flash_attn = disabled
```
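For reference, a minimal sketch of how these defaults could be passed to the
nomad_ollama container via dockerode. The settings object and buildOllamaEnv
are illustrative, not the actual NOMAD code; only the OLLAMA_* variable names
come from the Ollama docs quoted above.
```typescript
import Docker from 'dockerode';

// Illustrative only: the settings shape and helper name are assumptions;
// the OLLAMA_* variable names are the documented ones.
function buildOllamaEnv(settings: { disableCloud: boolean; flashAttention: boolean }): string[] {
  const env: string[] = [];
  if (settings.disableCloud) env.push('OLLAMA_NO_CLOUD=1');          // new default
  if (settings.flashAttention) env.push('OLLAMA_FLASH_ATTENTION=1'); // new default
  return env;
}

async function createOllamaContainer(docker: Docker): Promise<void> {
  await docker.createContainer({
    name: 'nomad_ollama',
    Image: 'ollama/ollama',
    Env: buildOllamaEnv({ disableCloud: true, flashAttention: true }),
  });
}
```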
GPU detection results were only applied at container creation time and
never persisted. If live detection failed transiently (Docker daemon
hiccup, runtime temporarily unavailable), Ollama would silently fall
back to CPU-only mode, with no way to recover short of a force-reinstall.
Now _detectGPUType() persists successful detections to the KV store
(gpu.type = 'nvidia' | 'amd') and uses the saved value as a fallback
when live detection returns nothing. This ensures GPU config survives
across container recreations regardless of transient detection failures.
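A rough sketch of the persist-and-fallback behaviour, assuming a simple async
key/value interface. The KvStore method names and the liveDetect parameter are
illustrative; only the gpu.type key and its 'nvidia' | 'amd' values come from
this change.
```typescript
type GpuType = 'nvidia' | 'amd' | null;

// Minimal KV interface assumed for illustration; the real store may differ.
interface KvStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

// A successful live detection is saved under gpu.type, and the saved value is
// reused when live detection transiently returns nothing.
async function detectGpuType(kv: KvStore, liveDetect: () => Promise<GpuType>): Promise<GpuType> {
  const live = await liveDetect();
  if (live) {
    await kv.set('gpu.type', live); // persist so container recreation keeps GPU config
    return live;
  }
  const saved = await kv.get('gpu.type');
  return saved === 'nvidia' || saved === 'amd' ? saved : null;
}
```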
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous lspci-based GPU detection fails inside Docker containers
because lspci isn't available, causing Ollama to always run CPU-only
even when a GPU + NVIDIA Container Toolkit are present on the host.
Replace with Docker API runtime check (docker.info() -> Runtimes) as
primary detection method. This works from inside any container via the
mounted Docker socket and confirms both GPU presence and toolkit
installation. Keep lspci as fallback for host-based installs and AMD.
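A sketch of the Docker-API-first check, assuming dockerode and a stubbed lspci
fallback; the function names here are illustrative.
```typescript
import Docker from 'dockerode';

// Hypothetical fallback; the real implementation parses lspci output on the host.
async function detectViaLspci(): Promise<'nvidia' | 'amd' | null> {
  return null;
}

// Docker-API-first detection: the NVIDIA Container Toolkit registers an OCI
// runtime named "nvidia", so seeing it in docker info confirms both the GPU
// setup and the toolkit, even from inside a container with the socket mounted.
async function detectGpuRuntime(docker: Docker): Promise<'nvidia' | 'amd' | null> {
  try {
    const info = await docker.info();
    const runtimes = Object.keys(info?.Runtimes ?? {});
    if (runtimes.includes('nvidia')) return 'nvidia';
  } catch {
    // Docker API unavailable (transient or socket not mounted); fall through.
  }
  return detectViaLspci(); // host-based installs and AMD still use lspci
}
```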
Also add Docker-based GPU detection to benchmark hardware info — exec
nvidia-smi inside the Ollama container to get the actual GPU model name
instead of showing "Not detected".
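One way this exec could look with dockerode; the nvidia-smi flags are standard,
but the error handling and Tty usage here are assumptions about the actual
implementation.
```typescript
import Docker from 'dockerode';

// Sketch: query the GPU model by exec-ing nvidia-smi inside the Ollama
// container. Tty: true keeps the exec output as plain, un-multiplexed text.
async function getGpuName(docker: Docker): Promise<string | null> {
  try {
    const container = docker.getContainer('nomad_ollama');
    const exec = await container.exec({
      Cmd: ['nvidia-smi', '--query-gpu=name', '--format=csv,noheader'],
      AttachStdout: true,
      AttachStderr: true,
      Tty: true,
    });
    const stream = await exec.start({});
    const chunks: Buffer[] = [];
    await new Promise<void>((resolve, reject) => {
      stream.on('data', (chunk: Buffer) => chunks.push(chunk));
      stream.on('end', resolve);
      stream.on('error', reject);
    });
    const name = Buffer.concat(chunks).toString('utf8').trim();
    return name.length > 0 ? name : null; // e.g. an RTX model name
  } catch {
    return null; // no GPU / nvidia-smi missing -> UI keeps showing "Not detected"
  }
}
```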
Tested on nomad3 (Intel Core Ultra 9 285HX + RTX 5060): AI performance
went from 12.7 tok/s (CPU) to 281.4 tok/s (GPU) — a 22x improvement.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comprehensive benchmarking capability to measure server performance:
Backend:
- BenchmarkService with CPU, memory, disk, and AI benchmarks using sysbench
- Database migrations for benchmark_results and benchmark_settings tables
- REST API endpoints for running benchmarks and retrieving results
- CLI commands: benchmark:run, benchmark:results, benchmark:submit
- BullMQ job for async benchmark execution with SSE progress updates
- Synchronous mode option (?sync=true) for simpler local dev setup
Frontend:
- Benchmark settings page with circular gauges for scores
- NOMAD Score display with weighted composite calculation
- System Performance section (CPU, Memory, Disk Read/Write)
- AI Performance section (tokens/sec, time to first token)
- Hardware Information display
- Expandable Benchmark Details section
- Progress simulation during sync benchmark execution
Easy Setup Integration:
- Added System Benchmark to Additional Tools section
- Built-in capability pattern for non-Docker features
- Click-to-navigate behavior for built-in tools
Fixes:
- Docker log multiplexing issue (Tty: true) for proper output parsing
- Consolidated disk benchmarks into single container execution
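To make the sysbench approach and the Tty fix concrete, a hedged sketch of a
one-shot CPU benchmark container; the image name, prime limit, and parsing are
assumptions, not the actual BenchmarkService code.
```typescript
import Docker from 'dockerode';

// Sketch of a one-shot sysbench CPU run. Tty: true is the multiplexing fix
// noted in Fixes: the log buffer comes back as plain text and can be parsed
// directly instead of carrying Docker's stream-framing headers.
async function runCpuBenchmark(docker: Docker): Promise<number | null> {
  const container = await docker.createContainer({
    Image: 'severalnines/sysbench', // any image that ships sysbench works
    Cmd: ['sysbench', 'cpu', '--cpu-max-prime=20000', 'run'],
    Tty: true,
  });
  try {
    await container.start();
    await container.wait();
    const logs = await container.logs({ stdout: true, stderr: true });
    // sysbench prints "events per second: <number>" in its summary
    const match = logs.toString().match(/events per second:\s+([\d.]+)/);
    return match ? parseFloat(match[1]) : null;
  } finally {
    await container.remove({ force: true });
  }
}
```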
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Detect Windows platform and use named pipe (//./pipe/docker_engine)
instead of Unix socket for Docker Desktop compatibility
- Add NOMAD_STORAGE_PATH environment variable for configurable
storage paths across different platforms
- Update seeder to use environment variable with Linux default
- Document new environment variable in .env.example
This enables local development on Windows machines with Docker Desktop
while maintaining Linux production compatibility.
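A sketch of the platform check, assuming dockerode; the Linux storage default
shown is a placeholder, the real value lives in the seeder and .env.example.
```typescript
import Docker from 'dockerode';

// Platform-aware Docker client: dockerode accepts a named pipe path as
// socketPath on Windows (Docker Desktop), and the Unix socket elsewhere.
function createDockerClient(): Docker {
  if (process.platform === 'win32') {
    return new Docker({ socketPath: '//./pipe/docker_engine' });
  }
  return new Docker({ socketPath: '/var/run/docker.sock' });
}

// Storage root comes from NOMAD_STORAGE_PATH; the default below is only a
// placeholder for the Linux default used by the seeder.
const storagePath = process.env.NOMAD_STORAGE_PATH ?? '/path/to/linux/default';
```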
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>