project-nomad

mirror of https://github.com/Crosstalk-Solutions/project-nomad.git synced 2026-05-24 05:15:05 +02:00

Chris Sherwood d2f2172b3c fix(System): correct AMD VRAM in Graphics card + harden log probe

Two related fixes to make the System Information page reliably show real GPU info instead of misleading lspci BAR0 readings or N/A.

1. Generalize bogus-VRAM detection to AMD.

Same root cause as #835 (NVIDIA showing 32 MB), this time for AMD: lspci parses the first PCI memory Region (BAR0, typically 1-16 MiB on Navi cards) as `vram`. On NOMAD8 (Threadripper 3960X + Radeon RX 6800), the System Information page showed "1 MB" instead of "16 GB". PR #850 fixed this for NVIDIA by clearing the bogus value and re-running the Ollama log probe; the check was vendor-gated to NVIDIA only.

`isBogusNvidiaVram` becomes `isBogusDgpuVram` with an `isDiscreteGpuVendor` helper matching /nvidia|advanced micro devices|amd|ati/i. Same 256-MiB threshold: no real discrete GPU has less than that, while Intel iGPUs (which legitimately report small shared-memory VRAM via lspci) are left untouched. The probe gate condition is similarly renamed.

2. Read Ollama logs from the startup window, not tail:N.

`getOllamaInferenceComputeFromLogs()` was reading the last 500 log lines and grepping for the "inference compute" line. That line is written once during Ollama's GPU discovery phase, within seconds of startup. Under active embedding workloads we measured >1000 log lines/min, which pushes the line past any reasonable tail within minutes, at which point the probe returns null and the UI flips to "GPU Not Accessible" even though Ollama is happily using the GPU (size_vram > 0 in /api/ps).

Switch from `tail: 500` to `since: containerStartedAt, until: containerStartedAt + 300s`. The 5-minute window is bounded regardless of container uptime and always captures Ollama's GPU discovery output. The inference-compute line is emitted in the first few seconds of startup, so 5 min is generous headroom.
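The startup-window idea in fix 2 can be sketched as follows. This is a minimal illustration, not the project's code: `startupLogWindow` and `findInferenceComputeLine` are hypothetical names, and it assumes the container start time is available as an ISO 8601 string and that the log reader accepts `since`/`until` as Unix timestamps in seconds (as Docker's container-logs endpoint does).

```typescript
// Hypothetical helpers illustrating the bounded startup window; not the project's API.

interface LogWindow {
  since: number; // Unix timestamp, seconds
  until: number; // Unix timestamp, seconds
}

// 300 s matches the 5-minute window described in the commit message.
const STARTUP_WINDOW_SECONDS = 300;

// Derive a fixed log window from the container start time (assumed ISO 8601).
function startupLogWindow(
  containerStartedAt: string,
  windowSeconds: number = STARTUP_WINDOW_SECONDS
): LogWindow {
  const startedMs = Date.parse(containerStartedAt);
  if (Number.isNaN(startedMs)) {
    throw new Error(`Unparseable container start time: ${containerStartedAt}`);
  }
  const since = Math.floor(startedMs / 1000);
  return { since, until: since + windowSeconds };
}

// Return the first log line mentioning "inference compute", or null if absent.
function findInferenceComputeLine(logText: string): string | null {
  for (const line of logText.split("\n")) {
    if (line.includes("inference compute")) {
      return line;
    }
  }
  return null;
}
```

Because the window is anchored to the start time rather than to the end of the log, its size does not depend on how long the container has been running or how many lines it has written since.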
Validated on NOMAD8 (RX 6800, container uptime ~10 min with sustained ingestion that generated 6,345 log lines):

Before:
  controllers[0]: { model: "Navi 21 ...", vram: 1 }

After (bogus AMD VRAM cleared, log probe stale due to tail:500 churn):
  controllers[0]: { model: "Navi 21 ...", vram: null }
  gpuHealth: { status: "passthrough_failed" }
  -> UI shows "N/A" and the banner from PR #208

After (bogus cleared + log probe reads startup window):
  controllers[0]: { model: "AMD Radeon RX 6800", vram: 16384 }
  gpuHealth: { status: "ok", hasRocmRuntime: true, ollamaGpuAccessible: true }
  -> UI shows "16 GB", no banner

Both branches of the fix exercise correctly: NVIDIA path unchanged (same code, just renamed identifiers), AMD path now triggers the probe and the probe reliably finds the GPU info regardless of container age.		2026-05-20 10:16:00 -07:00
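The bogus-VRAM check from fix 1 might look roughly like this. The function names, regex, and 256-MiB threshold come from the commit message; the `GpuController` shape, the signatures, the `clearBogusVram` helper, and the assumption that `vram` is in MiB are illustrative guesses, not the project's actual code.

```typescript
// Sketch only: signatures and the GpuController shape are assumptions.

interface GpuController {
  vendor: string;
  model: string;
  vram: number | null; // assumed MiB, matching vram: 16384 shown as "16 GB"
}

// Threshold from the commit message: no real discrete GPU has less than 256 MiB.
const MIN_PLAUSIBLE_DGPU_VRAM_MIB = 256;

// Pattern copied from the commit message. It is unanchored, so in this sketch a
// vendor string such as "Intel Corporation" would also match through the "ati"
// in "Corporation"; which vendor strings the real helper receives is not shown.
const DISCRETE_GPU_VENDOR = /nvidia|advanced micro devices|amd|ati/i;

function isDiscreteGpuVendor(vendor: string): boolean {
  return DISCRETE_GPU_VENDOR.test(vendor);
}

// True when a discrete-GPU vendor reports an implausibly small VRAM value,
// such as a BAR0 size that lspci parsed as VRAM.
function isBogusDgpuVram(controller: GpuController): boolean {
  return (
    controller.vram !== null &&
    isDiscreteGpuVendor(controller.vendor) &&
    controller.vram < MIN_PLAUSIBLE_DGPU_VRAM_MIB
  );
}

// Hypothetical helper: return a copy with the bogus value cleared so a later
// probe (for example, the Ollama log probe) can fill in the real figure.
function clearBogusVram(controller: GpuController): GpuController {
  return isBogusDgpuVram(controller) ? { ...controller, vram: null } : controller;
}
```

With these assumptions, a Navi 21 controller reporting `vram: 1` is cleared to `vram: null`, while one reporting `vram: 16384` passes through unchanged.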
app	fix(System): correct AMD VRAM in Graphics card + harden log probe	2026-05-20 10:16:00 -07:00
bin	feat: curated content system overhaul	2026-02-11 15:44:46 -08:00
commands	feat(Maps): regional map downloads via go-pmtiles extract (#780)	2026-05-20 10:16:00 -07:00
config	fix: cache docker list requests, aiAssistantName fetching, and ensure inertia used properly	2026-04-03 14:26:50 -07:00
constants	feat(Maps): regional map downloads via go-pmtiles extract (#780)	2026-05-20 10:16:00 -07:00
database	feat(Content): custom ZIM library sources with pre-seeded mirrors (#593)	2026-05-20 10:16:00 -07:00
docs	docs: update release notes	2026-05-20 10:16:00 -07:00
inertia	fix(Maps): render notes in marker popup when populated	2026-05-20 10:16:00 -07:00
providers	feat(GPU): auto-remediate nomad_ollama passthrough loss on admin boot (#755)	2026-05-20 10:16:00 -07:00
public	feat: switch all PNG images to WEBP (#575)	2026-04-03 14:26:50 -07:00
resources	feat(Maps): regional map downloads via go-pmtiles extract (#780)	2026-05-20 10:16:00 -07:00
start	feat(Content): custom ZIM library sources with pre-seeded mirrors (#593)	2026-05-20 10:16:00 -07:00
tests	fix(UI): improve global map banner display logic (#702)	2026-05-20 10:16:00 -07:00
types	feat(GPU): auto-remediate nomad_ollama passthrough loss on admin boot (#755)	2026-05-20 10:16:00 -07:00
util	feat: display model download progress	2026-02-06 16:22:23 -08:00
views	feat: initial commit	2025-06-29 15:51:08 -07:00
.editorconfig	feat: initial commit	2025-06-29 15:51:08 -07:00
.env.example	feat: Add Windows Docker Desktop support for local development	2026-01-19 10:29:24 -08:00
ace.js	feat: initial commit	2025-06-29 15:51:08 -07:00
adonisrc.ts	feat(GPU): auto-remediate nomad_ollama passthrough loss on admin boot (#755)	2026-05-20 10:16:00 -07:00
eslint.config.js	feat: openwebui+ollama and zim management	2025-07-09 09:08:21 -07:00
package-lock.json	build(deps): bump picomatch in /admin	2026-05-20 10:16:00 -07:00
package.json	chore(deps): pin all deps to exact versions	2026-05-20 10:16:00 -07:00
tailwind.config.ts	feat: initial commit	2025-06-29 15:51:08 -07:00
tsconfig.json	feat: initial commit	2025-06-29 15:51:08 -07:00
vite.config.ts	fix(Maps): ensure proper parsing of hostnames (#640)	2026-04-03 14:26:50 -07:00