mirror of
https://github.com/Crosstalk-Solutions/project-nomad.git
synced 2026-05-25 13:55:05 +02:00
Stacks on top of the multi-batch ZIM ingestion fix.

After that fix, multi-batch ZIM ingestion completes correctly. But on installs where Ollama runs the embedding model on CPU (currently every AMD ROCm install, since Ollama's ROCm build doesn't accelerate nomic-bert), the now-correct sustained 100% CPU saturation across all cores can starve other services hard enough to take the box down.

Confirmed on a Threadripper 3960X + RX 6800 NOMAD: a Wikipedia-class ZIM ingestion pegged 48 threads cleanly enough that sshd lost banner-exchange responsiveness and the box ultimately required a power-cycle. NVIDIA installs aren't affected: nomic-embed-text:v1.5 runs at 100% GPU on RTX 5060 (verified via `ollama ps`).

Detect placement at runtime, pace only when needed:

1. OllamaService.isEmbeddingGpuAccelerated(): queries /api/ps and returns true if any loaded embedding model reports size_vram > 0. Fails closed (returns false) if /api/ps is unreachable or no embed model is loaded yet, because over-pacing is safer than crashing.
2. EmbedFileJob.handle(): between batches (hasMoreBatches: true branch), check placement and `await setTimeout(CPU_BATCH_DELAY_MS)` when CPU-only.

CPU_BATCH_DELAY_MS = 1000 (1s): enough to give the OS scheduler a window for sshd/disk-collector/etc., small enough that total ingestion time isn't meaningfully affected (each batch is ~60-90s of work). GPU-accelerated installs see zero behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
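The two steps described in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's actual code. The `PsResponse` shape covers only the `/api/ps` fields the message mentions. The "name contains `embed`" check is a hypothetical stand-in for however the real service identifies embedding models. The `fetchPs` callback and the `paceBetweenBatches` helper are names invented for this example.

```typescript
// Subset of the fields returned by Ollama's GET /api/ps that this sketch uses.
interface RunningModel {
  name: string
  size_vram?: number
}

interface PsResponse {
  models?: RunningModel[]
}

// Value taken from the commit message: 1 second between CPU-bound batches.
const CPU_BATCH_DELAY_MS = 1000

// ASSUMPTION: a model counts as an "embedding model" if its name contains
// "embed". The real detection rule is not stated in the commit message.
function hasGpuEmbeddingModel(ps: PsResponse): boolean {
  return (ps.models ?? []).some(
    (m) => m.name.toLowerCase().includes('embed') && (m.size_vram ?? 0) > 0
  )
}

// Fails closed: any error while querying, or no loaded embedding model,
// yields false, so the caller paces as if the install were CPU-only.
// `fetchPs` is a hypothetical injected callback that performs the HTTP request.
async function isEmbeddingGpuAccelerated(
  fetchPs: () => Promise<PsResponse>
): Promise<boolean> {
  try {
    return hasGpuEmbeddingModel(await fetchPs())
  } catch {
    return false
  }
}

// Hypothetical helper for the "between batches" step: sleeps only when more
// batches remain and the embedding model is not on the GPU.
// Resolves to true if a delay was applied, false otherwise.
async function paceBetweenBatches(
  hasMoreBatches: boolean,
  gpuAccelerated: boolean,
  delayMs: number = CPU_BATCH_DELAY_MS
): Promise<boolean> {
  if (!hasMoreBatches || gpuAccelerated) return false
  // The commit message uses the promise-based `setTimeout`; the global timer
  // is wrapped here so the sketch needs no imports.
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs))
  return true
}
```

In a job handler, `fetchPs` could be something like `() => fetch('http://localhost:11434/api/ps').then((r) => r.json())`, where port 11434 is Ollama's default and the host is an assumption. Keeping the placement check as a pure function over the parsed response makes the fail-closed behavior easy to verify without a running Ollama instance.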
| Name |
|---|
| controllers |
| exceptions |
| jobs |
| middleware |
| models |
| services |
| utils |
| validators |