project-nomad/admin/app
Chris Sherwood ba661a9da1 fix(RAG): pace continuation batches when embedding is CPU-only
Stacks on top of the multi-batch ZIM ingestion fix. With that fix,
multi-batch ZIM ingestion completes correctly, but on installs where
Ollama runs the embedding model on CPU (currently every AMD ROCm
install, since Ollama's ROCm build doesn't accelerate nomic-bert),
ingestion now sustains 100% CPU load across all cores, which can
starve other services badly enough to take the box down. Confirmed
on a Threadripper 3960X + RX 6800 NOMAD: a Wikipedia-class ZIM
ingestion pegged all 48 threads so thoroughly that sshd stopped
responding at banner exchange and the box ultimately required a
power-cycle.

NVIDIA installs aren't affected — nomic-embed-text:v1.5 runs at
100% GPU on RTX 5060 (verified via `ollama ps`).

Detect placement at runtime, pace only when needed:

1. OllamaService.isEmbeddingGpuAccelerated() — queries /api/ps and
   returns true if any loaded embedding model reports size_vram > 0.
   Fails closed (returns false) if /api/ps is unreachable or no embed
   model is loaded yet — over-pacing is safer than crashing.

2. EmbedFileJob.handle() — between batches (hasMoreBatches: true
   branch), check placement and `await setTimeout(CPU_BATCH_DELAY_MS)`
   when CPU-only. CPU_BATCH_DELAY_MS = 1000 (1s) — enough to give the
   OS scheduler a window for sshd/disk-collector/etc., small enough
   that total ingestion time isn't meaningfully affected (each batch
   is ~60-90s of work).
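
As a rough illustration of step 1, the placement check could be sketched
as below. This is not the code from this commit: the name-based
`isEmbeddingModel` heuristic, the default base URL, and the timeout value
are assumptions made for the example. Only the `/api/ps` endpoint, the
`size_vram > 0` test, and the fail-closed behavior come from the
description above.

```typescript
// Fields used from Ollama's GET /api/ps response; other fields are ignored.
interface RunningModel {
  name: string
  size_vram?: number
}

interface PsResponse {
  models?: RunningModel[]
}

// Assumption: embedding models are recognised by name. The real service
// may compare against a configured model name instead.
function isEmbeddingModel(name: string): boolean {
  return name.toLowerCase().includes('embed')
}

// Pure decision: true only if a loaded embedding model has bytes in VRAM.
function hasGpuResidentEmbedding(ps: PsResponse | null | undefined): boolean {
  const models = ps?.models ?? []
  return models.some((m) => isEmbeddingModel(m.name) && (m.size_vram ?? 0) > 0)
}

// Fail-closed wrapper: an unreachable endpoint, a non-2xx status, or a
// parse error is reported as "not accelerated", so the caller over-paces.
async function isEmbeddingGpuAccelerated(
  baseUrl = 'http://localhost:11434', // assumed: Ollama's default address
  timeoutMs = 2000 // hypothetical timeout
): Promise<boolean> {
  try {
    const res = await fetch(`${baseUrl}/api/ps`, {
      signal: AbortSignal.timeout(timeoutMs),
    })
    if (!res.ok) return false
    return hasGpuResidentEmbedding((await res.json()) as PsResponse)
  } catch {
    return false
  }
}
```

Keeping the decision in a pure function separate from the HTTP call makes
the fail-closed cases (empty list, missing `size_vram`, no embedding model
loaded) easy to unit-test without a running Ollama instance.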

GPU-accelerated installs see zero behavior change.
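
The pacing from step 2 might look roughly like the sketch below. It is
illustrative only: `continuationDelayMs`, `runBatches`, and the callback
signatures are hypothetical names, and the real job may schedule a
continuation job rather than loop in-process. `CPU_BATCH_DELAY_MS` and
the `hasMoreBatches` branch are taken from the description above.

```typescript
import { setTimeout as sleep } from 'node:timers/promises'

const CPU_BATCH_DELAY_MS = 1000

// Pure helper: how long to pause before starting the next batch.
// Returns 0 when there is no next batch or when the embedding model is
// on the GPU, so GPU-accelerated installs never wait.
function continuationDelayMs(hasMoreBatches: boolean, gpuAccelerated: boolean): number {
  if (!hasMoreBatches) return 0
  return gpuAccelerated ? 0 : CPU_BATCH_DELAY_MS
}

// Hypothetical driver: runs batches until one reports no more work,
// re-checking placement between batches. Returns the number of batches run.
async function runBatches(
  embedBatch: (batchIndex: number) => Promise<{ hasMoreBatches: boolean }>,
  isGpuAccelerated: () => Promise<boolean>
): Promise<number> {
  let processed = 0
  for (;;) {
    const { hasMoreBatches } = await embedBatch(processed)
    processed += 1
    if (!hasMoreBatches) return processed
    const delayMs = continuationDelayMs(hasMoreBatches, await isGpuAccelerated())
    if (delayMs > 0) await sleep(delayMs)
  }
}
```

With batches taking roughly 60-90s each, a 1s pause adds about 1.1% to
1.7% to total ingestion time (1/90 to 1/60).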

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:52:02 -07:00
controllers fix(AI): rewrite RAG query on first follow-up (off-by-one in skip-rewrite threshold) 2026-05-12 20:34:30 -07:00
exceptions fix(Docs): documentation renderer fixes 2025-12-23 16:00:33 -08:00
jobs fix(RAG): pace continuation batches when embedding is CPU-only 2026-05-13 13:52:02 -07:00
middleware fix(API): skip compression for Server-Sent Events (#798) 2026-04-27 19:00:31 -07:00
models feat(Content): custom ZIM library sources with pre-seeded mirrors (#593) 2026-05-04 11:30:59 -07:00
services fix(RAG): pace continuation batches when embedding is CPU-only 2026-05-13 13:52:02 -07:00
utils fix(Downloads): treat missing Content-Type as octet-stream (#848) 2026-05-11 21:09:40 -07:00
validators feat(Content): custom ZIM library sources with pre-seeded mirrors (#593) 2026-05-04 11:30:59 -07:00