`computeFileWarnings()` previously caught all errors and returned an empty
map, which the frontend rendered as "every file is healthy" — reintroducing
exactly the silent-failure mode this surface exists to expose.
Return `{ ok, warnings }`; flip `ok: false` from the catch. KB modal renders
an inline amber notice under the Stored Files header when `ok === false`,
leaving per-row warning rendering untouched. Transient failures self-heal on
the next 30s poll; no toast spam.
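
A minimal sketch of that contract, under stated assumptions: only the `ok` and `warnings` fields come from the description above, while the `FileWarning` shape and the helper names `safeFileWarnings` and `toFileListView` are hypothetical and not taken from the codebase.

```typescript
// Hypothetical shapes: only `ok` and `warnings` come from the description.
interface FileWarning {
  code: string;
  message: string;
}

interface FileWarningsResult {
  ok: boolean;
  warnings: Record<string, FileWarning[]>;
}

// Backend side: a failure is reported as ok: false instead of an empty,
// healthy-looking map.
async function safeFileWarnings(
  compute: () => Promise<Record<string, FileWarning[]>>
): Promise<FileWarningsResult> {
  try {
    return { ok: true, warnings: await compute() };
  } catch {
    return { ok: false, warnings: {} };
  }
}

// Frontend side: the header notice depends only on `ok`; per-row warnings
// are looked up the same way whether or not the notice is shown.
function toFileListView(result: FileWarningsResult, sources: string[]) {
  return {
    showUnavailableNotice: result.ok === false,
    rows: sources.map((source) => ({
      source,
      warnings: result.warnings[source] ?? [],
    })),
  };
}
```

Keeping the notice separate from the per-row lookup means a failed poll cannot be mistaken for a clean result, and the next successful poll clears the notice without extra state.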
Surfaces two silent failure modes that the prior binary
"any-chunks-in-Qdrant ⇒ embedded" check could not distinguish from
healthy ingestion:
- **Warning A — Zero-chunk file** (file_size > 100 MB, chunks = 0)
Fires on video-only / image-only ZIMs (`lrnselfreliance_en_all`,
TED talks, etc.) that the pipeline completes "successfully" with no
extractable text. The AI Assistant cannot reference these files at all.
- **Warning B — Partial-embed stall** (chunks < 50% of expected from
the ratio registry). Surfaces the simple_wiki "266 of 600,000 chunks"
case observed during NOMAD1 ingestion testing; previously such stalls
looked identical to fully completed embeds in the UI.
Both warnings render only when their condition is met (silent by
default; noisy only on real problems).
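
The two conditions can be sketched as a pure decision function. This is an illustration, not the shipped `decideWarnings`: the constant names, the input and warning shapes, the reading of "100 MB" as 100 * 1024 * 1024 bytes, and the choice to evaluate the two warnings independently are all assumptions.

```typescript
// Thresholds as named constants (the real module exports its own).
// Assumption: "100 MB" is treated as 100 * 1024 * 1024 bytes.
const ZERO_CHUNK_MIN_BYTES = 100 * 1024 * 1024;
const PARTIAL_EMBED_RATIO = 0.5;

type FileWarning =
  | { kind: "zero_chunks" }
  | { kind: "partial_embed"; chunks: number; expectedChunks: number };

interface WarningInputs {
  fileSizeBytes: number;
  chunks: number;
  // null models a registry miss; 0 models a known video-only file.
  expectedChunks: number | null;
}

function decideWarnings(inputs: WarningInputs): FileWarning[] {
  const { fileSizeBytes, chunks, expectedChunks } = inputs;
  const warnings: FileWarning[] = [];

  // Warning A: a large file that produced no chunks at all.
  if (chunks === 0 && fileSizeBytes > ZERO_CHUNK_MIN_BYTES) {
    warnings.push({ kind: "zero_chunks" });
  }

  // Warning B: strictly fewer than half the expected chunks. Skipped when
  // the registry has no estimate or expects zero chunks.
  if (
    expectedChunks !== null &&
    expectedChunks > 0 &&
    chunks < expectedChunks * PARTIAL_EMBED_RATIO
  ) {
    warnings.push({ kind: "partial_embed", chunks, expectedChunks });
  }

  return warnings;
}
```

With these inputs, the simple_wiki case (266 chunks against an expectation of 600,000) yields only the partial-embed warning, and a file sitting exactly at 50% of its expectation yields none.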
Base is `feat/kb-ratio-registry` (#891) because Warning B's "expected
chunks" estimate comes from `KbRatioRegistry.estimateChunks()`. GitHub
retargets this PR to `rc` once #891 merges and its branch is deleted.
- `app/utils/kb_warning_decision.ts` — pure `decideWarnings(inputs)`
with thresholds (`100 MB`, `0.5×`) as exported constants. 10 unit
tests cover the healthy case, both warnings, the under/at/over
boundary, the registry-miss suppression, and the video-only registry
case (`expectedChunks: 0` correctly skips Warning B).
- `RagService.computeFileWarnings()`: a single Qdrant scroll tallies
chunks per source, a filesystem walk adds files that have zero chunks,
the ratio registry supplies the expected count, and the decision
function emits the warnings.
- New endpoint `GET /api/rag/file-warnings` returns
`Record<source, FileWarning[]>` (sources with no warnings are
omitted, so the frontend can use `warnings[source] ?? []` as a clean
default).
- KB modal: warnings render inline under the file name as amber-tinted
pills. Polled every 30s alongside the existing health check.
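
The tally and the "omit empty sources" behaviour from the list above can be sketched with two small pure helpers. The names `tallyChunks` and `omitEmpty` are hypothetical, and the sketch assumes each scrolled Qdrant payload carries a source string that matches the file's name on disk.

```typescript
// Count chunks per source from the scrolled payloads, then make sure every
// file found on disk has an entry, so files with no chunks show up as 0.
function tallyChunks(
  chunkSources: string[],
  filesOnDisk: string[]
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const file of filesOnDisk) {
    counts[file] = 0;
  }
  for (const source of chunkSources) {
    counts[source] = (counts[source] ?? 0) + 1;
  }
  return counts;
}

// Drop sources whose warning list is empty, so the response only contains
// files that actually have something to report.
function omitEmpty<T>(bySource: Record<string, T[]>): Record<string, T[]> {
  const out: Record<string, T[]> = {};
  for (const source of Object.keys(bySource)) {
    const list = bySource[source] ?? [];
    if (list.length > 0) {
      out[source] = list;
    }
  }
  return out;
}
```

Seeding the tally from the filesystem walk is what lets a zero-chunk file reach the decision function at all; a scroll alone never mentions a file that produced no points.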
- Warning C — chunks skipped due to length. PR #890 (#881 fix) prevents
the silent drop at the embed boundary, so the underlying condition
shouldn't fire anymore. If we still want to surface "we truncated
N chunks to fit", that needs separate `skipped_count` tracking in
EmbedFileJob — a Phase 2 follow-up.
- Suppressing Warning B during active mid-ingestion. The user can cross-
reference the Processing Queue to know it's in-flight; suppressing
warnings while a job runs would mask real stalls where the job died
mid-batch. Will revisit when per-card status is wired through.
- Use of `kb_ingest_state.chunks_embedded` (#888) as the chunk count
source. This PR uses Qdrant scroll directly so it can land
independently of #888.
- 10 new unit tests on `decideWarnings`, all pass
- Type-check clean
- Hot-patch + browser smoke test deferred until #891 lands: the ratio
registry needs to exist in the DB for `estimateChunks()` to return
non-null estimates. Without it, only Warning A fires, which is still
useful, but Warning B stays dormant.
When a user picks a tier in TierSelectionModal, show how much additional
disk space the AI Assistant will need if the new ZIMs are indexed, plus
a policy-aware footer explaining whether they'll auto-index (Always) or
wait for opt-in (Manual). Estimates consume #891's KbRatioRegistry via a
new POST /api/rag/estimate-batch endpoint.
Backend
- New POST /api/rag/estimate-batch route + RagController.estimateBatch
- VineJS schema accepting array of {filename, sizeBytes}, capped at 500
- KbRatioRegistry.estimateBatch aggregates via the existing prefix-match
lookup, returns {totalChunks, totalBytes, hasUnknown}
- New BYTES_PER_CHUNK_ON_DISK constant (~8 KB: 3 KB vector + ~3 KB chunk
text + ~2 KB payload/index overhead). Tunable; will be replaced by
Phase 4 self-calibration once we have real measurements.
- Controller normalizes incoming filenames via path.basename so callers
that send full paths or URLs still match registry prefixes correctly.
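
A rough sketch of the aggregation described above. The registry lookup is injected as a hypothetical `ChunkEstimator` callback rather than calling the real `KbRatioRegistry`, filenames are assumed to be normalized already, and treating `totalBytes` as `totalChunks * BYTES_PER_CHUNK_ON_DISK` is an assumption about how the constant is applied.

```typescript
// Tunable on-disk cost per chunk (~8 KB), per the description above.
const BYTES_PER_CHUNK_ON_DISK = 8 * 1024;

interface BatchItem {
  filename: string;
  sizeBytes: number;
}

interface BatchEstimate {
  totalChunks: number;
  totalBytes: number;
  hasUnknown: boolean;
}

// Hypothetical stand-in for the registry's prefix-match lookup:
// returns null when no registry entry matches the filename.
type ChunkEstimator = (filename: string, sizeBytes: number) => number | null;

function estimateBatch(
  items: BatchItem[],
  estimateChunks: ChunkEstimator
): BatchEstimate {
  let totalChunks = 0;
  let hasUnknown = false;

  for (const item of items) {
    const chunks = estimateChunks(item.filename, item.sizeBytes);
    if (chunks === null) {
      // Unknown files are flagged and contribute nothing to the total.
      hasUnknown = true;
      continue;
    }
    // A known estimate of 0 (video-only ZIM) is added as-is and does not
    // set hasUnknown.
    totalChunks += chunks;
  }

  return {
    totalChunks,
    totalBytes: totalChunks * BYTES_PER_CHUNK_ON_DISK,
    hasUnknown,
  };
}
```

Distinguishing `null` from `0` is the point of `hasUnknown`: the caller can tell "this batch adds nothing" apart from "part of this batch could not be estimated".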
Frontend
- api.estimateEmbeddingBatch() client method
- TierSelectionModal: when localSelectedSlug is set, resolve the tier's
resources (incl. inherited tiers), POST to /estimate-batch, and render
a new info block with the +~X GB figure + ingest-policy copy. Also
fetches rag.defaultIngestPolicy so the same block surfaces whether
indexing will fire automatically or wait for the user.
- resourceFilename() helper extracts the basename from the resource URL
so the registry lookup hits the right prefix regardless of mirror.
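
The frontend helpers might look roughly like the sketch below. Only the name `resourceFilename` comes from the description; `formatAdditionalDisk`, `policyFooter`, the footer copy, the `"Always" | "Manual"` literal values, and the use of decimal gigabytes are placeholders chosen for illustration.

```typescript
// Last path segment of a resource URL, ignoring query string and fragment.
// Assumes the URL actually contains a file path.
function resourceFilename(url: string): string {
  const withoutQuery = url.split(/[?#]/)[0] ?? "";
  const segments = withoutQuery.split("/").filter((s) => s.length > 0);
  return segments[segments.length - 1] ?? "";
}

// Assumption: decimal gigabytes (1 GB = 1e9 bytes), one decimal place.
function formatAdditionalDisk(totalBytes: number): string {
  return `+~${(totalBytes / 1e9).toFixed(1)} GB`;
}

// Placeholder copy; the real wording lives in TierSelectionModal.
type IngestPolicy = "Always" | "Manual";

function policyFooter(policy: IngestPolicy): string {
  return policy === "Always"
    ? "New files will be indexed automatically."
    : "New files will wait until you opt in to indexing.";
}
```

Because only the basename is sent, two mirrors serving the same ZIM resolve to the same registry prefix.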
Tests
- 4 new cases in tests/unit/kb_ratio_lookup.spec.ts covering the
estimateBatch aggregator: standard sum, unknown-flagging, video-only
ZIM (0 chunks but known, hasUnknown stays false), empty input.
Stacks on feat/kb-ratio-registry (#891) and consumes the registry table
seeded by that PR. Once #891 merges to rc and its branch is deleted,
GitHub retargets this PR to rc.
Out of scope for this PR (deferred to follow-ups):
- Per-batch opt-in checkbox (RFC §1's '☑ Also index these for AI') needs
a per-batch policy override path and is a separate PR
- Guardrail modal at 50 GB / 10% free / 6 hr thresholds (RFC §7) is also
separate; this PR is informational, not gating
- Time-to-embed estimate awaits a chunks-per-second metric per host
Extends the RAG health check to verify that nomad_qdrant is online. If it
is not, every button in the KB modal is disabled until the user clicks
the start qdrant button and the container has started.
A new file, qdrant_restart_policy_provider.ts, tries to ensure that the
restart policy always exists on the nomad_qdrant container, even though
the policy should already have been set when the container was created.