project-nomad/admin/app/services
Chris Sherwood ba661a9da1 fix(RAG): pace continuation batches when embedding is CPU-only
Stacks on top of the multi-batch ZIM ingestion fix. After that fix,
multi-batch ZIM ingestion completes correctly, but on installs where
Ollama runs the embedding model on CPU (currently every AMD ROCm
install, since Ollama's ROCm build doesn't accelerate nomic-bert),
the resulting sustained 100% CPU load across all cores can starve
other services badly enough to take the box down. Confirmed on a
Threadripper 3960X + RX 6800 NOMAD: a wikipedia-class ZIM ingestion
saturated all 48 threads so thoroughly that sshd stopped completing
its banner exchange and the box ultimately required a power-cycle.

NVIDIA installs aren't affected — nomic-embed-text:v1.5 runs at
100% GPU on RTX 5060 (verified via `ollama ps`).

Detect placement at runtime, pace only when needed:

1. OllamaService.isEmbeddingGpuAccelerated() — queries /api/ps and
   returns true if any loaded embedding model reports size_vram > 0.
   Fails closed (returns false) if /api/ps is unreachable or no embed
   model is loaded yet — over-pacing is safer than crashing.

2. EmbedFileJob.handle() — between batches (hasMoreBatches: true
   branch), check placement and `await setTimeout(CPU_BATCH_DELAY_MS)`
   when CPU-only. CPU_BATCH_DELAY_MS = 1000 (1s) — enough to give the
   OS scheduler a window for sshd/disk-collector/etc., small enough
   that total ingestion time isn't meaningfully affected (each batch
   is ~60-90s of work).
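
A minimal sketch of the two steps above, not the code in this commit.
Assumptions: the embedding model is recognized by a name prefix
(`EMBED_MODEL_PREFIX`), the `/api/ps` call is injected as a
hypothetical `fetchPs` function so the logic can be exercised without
a running Ollama, and `hasGpuResidentEmbedModel` / `paceBetweenBatches`
are illustrative helper names. Only `size_vram`, the fail-closed
behavior, and `CPU_BATCH_DELAY_MS = 1000` come from the description
above.

```typescript
// Subset of the GET /api/ps response that this sketch reads.
interface PsModel {
  name: string
  size_vram?: number
}
interface PsResponse {
  models?: PsModel[]
}

const CPU_BATCH_DELAY_MS = 1000 // 1s, as stated in the commit message

// Assumption: embedding models are identified by name prefix.
const EMBED_MODEL_PREFIX = 'nomic-embed-text'

// True only if a loaded embedding model reports size_vram > 0.
function hasGpuResidentEmbedModel(ps: PsResponse | null): boolean {
  if (!ps || !Array.isArray(ps.models)) return false
  return ps.models.some(
    (m) =>
      typeof m.name === 'string' &&
      m.name.startsWith(EMBED_MODEL_PREFIX) &&
      (m.size_vram ?? 0) > 0
  )
}

// Fails closed: any error or missing model is treated as CPU-only.
async function isEmbeddingGpuAccelerated(
  fetchPs: () => Promise<PsResponse> // hypothetical injected /api/ps call
): Promise<boolean> {
  try {
    return hasGpuResidentEmbedModel(await fetchPs())
  } catch {
    return false
  }
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

// Called between batches; returns the delay applied, in milliseconds.
async function paceBetweenBatches(
  hasMoreBatches: boolean,
  fetchPs: () => Promise<PsResponse>
): Promise<number> {
  if (!hasMoreBatches) return 0
  if (await isEmbeddingGpuAccelerated(fetchPs)) return 0
  await sleep(CPU_BATCH_DELAY_MS)
  return CPU_BATCH_DELAY_MS
}
```

With batches of ~60-90s, a 1s pause adds roughly 1.1-1.7% to total
ingestion time on CPU-only installs.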

GPU-accelerated installs see zero behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:52:02 -07:00
benchmark_service.ts fix(AI): vendor-aware AMD HSA override + benchmark discrete-GPU detection 2026-05-05 12:11:56 -07:00
chat_service.ts fix(AI): qwen2.5 loading on every chat message (#649) 2026-04-17 11:37:44 -07:00
collection_manifest_service.ts fix: update default branch name 2026-03-01 16:08:46 -08:00
collection_update_service.ts feat(content-updates): show size, surface downloads in Active Downloads 2026-05-03 13:17:07 -07:00
container_registry_service.ts feat: support for updating services 2026-03-11 14:08:09 -07:00
countries_service.ts feat(Maps): regional map downloads via go-pmtiles extract (#780) 2026-05-03 13:47:53 -07:00
docker_service.ts fix(AI): preserve semver tag in DB on AMD Ollama updates 2026-05-11 21:08:08 -07:00
docs_service.ts docs: add Community Add-Ons page with field manuals + W3Schools packs (#753) 2026-04-20 14:57:53 -07:00
download_service.ts feat(Maps): regional map downloads via go-pmtiles extract (#780) 2026-05-03 13:47:53 -07:00
kiwix_library_service.ts fix: prevent ZIM corrupt file crash and deduplicate Ollama download logs (#741) 2026-04-17 11:54:04 -07:00
map_service.ts feat(Maps): regional map downloads via go-pmtiles extract (#780) 2026-05-03 13:47:53 -07:00
ollama_service.ts fix(RAG): pace continuation batches when embedding is CPU-only 2026-05-13 13:52:02 -07:00
queue_service.ts fix(queue): singleton QueueService to stop ioredis connection leak 2026-05-13 13:48:21 -07:00
rag_service.ts fix(RAG): add start button in kb modal and ensure restart policy exists (#700) 2026-04-27 22:26:46 -07:00
system_service.ts fix(System): correct NVIDIA VRAM in Graphics card (#835) 2026-05-11 15:49:42 -07:00
system_update_service.ts fix(security): SSRF validation for map downloads and error sanitization (CWE-918, CWE-209) (#552) 2026-04-17 14:12:02 -07:00
zim_extraction_service.ts fix(rag): repair ZIM embedding pipeline (sync filter, batch gate, DOM walk) (#745) 2026-04-20 16:23:25 -07:00
zim_service.ts fix(queue): singleton QueueService to stop ioredis connection leak 2026-05-13 13:48:21 -07:00