project-nomad/admin/app/utils
Chris Sherwood 68e1bd5ff2 feat(KB): ratio registry for disk + time estimates (Phase 1B of RFC #883)
Foundation for the cost estimates and partial-stall detection that
Phase 2 will surface. No consumers yet: this PR adds only the table,
the seed rows, and the lookup helper, so subsequent UI work has
estimates available without a per-ZIM benchmark.

## What lands

- New table `kb_ratio_registry` (pattern, chunks_per_mb, sample_count,
  notes). Migration creates and seeds heuristic defaults from the RFC
  appendix: devdocs (1100/MB), Wikipedia variants (270/MB), iFixit
  (50/MB), Stack Exchange Q&A (200/MB), video-only ZIMs (0), plus a
  catch-all fallback at 100/MB.
- `KbRatioRegistry` model with static `lookup()` and `estimateChunks()`.
- Pure helper `kb_ratio_lookup.ts` that does longest-prefix matching: a
  specific entry (`wikipedia_en_simple_`) overrides a broader one
  (`wikipedia_en_`). 9 unit tests cover the lookup boundary cases.
- `sample_count` starts at 0 (heuristic seed). Phase 4 self-calibration
  will increment it as observed ZIMs update each row.
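
For reviewers, here is a minimal sketch of how the longest-prefix match described above could work. It is not the code in `kb_ratio_lookup.ts`. The `RatioRow` shape, the `findRow` helper, the function signatures, the example patterns other than the two Wikipedia prefixes, and the use of an empty pattern as the catch-all are all assumptions made for illustration. Only the ratios are taken from the seed list above.

```typescript
// Hypothetical row shape; the real table also carries sample_count and notes.
interface RatioRow {
  pattern: string // filename prefix; '' is assumed to act as the catch-all
  chunksPerMb: number
}

// Illustrative rows. Ratios come from the seed list; most patterns are assumed.
const rows: RatioRow[] = [
  { pattern: '', chunksPerMb: 100 },
  { pattern: 'devdocs_', chunksPerMb: 1100 },
  { pattern: 'wikipedia_en_', chunksPerMb: 270 },
  { pattern: 'wikipedia_en_simple_', chunksPerMb: 270 },
  { pattern: 'ifixit_', chunksPerMb: 50 },
]

// Return the matching row with the longest pattern, or undefined if none match.
function findRow(filename: string, table: RatioRow[]): RatioRow | undefined {
  let best: RatioRow | undefined
  for (const row of table) {
    if (!filename.startsWith(row.pattern)) continue
    if (best === undefined || row.pattern.length > best.pattern.length) {
      best = row
    }
  }
  return best
}

// Assumed signature: falls back to 0 only when the table has no catch-all row.
function findChunksPerMb(filename: string, table: RatioRow[]): number {
  return findRow(filename, table)?.chunksPerMb ?? 0
}

// Assumed signature: size in bytes, converted to MB (1024 * 1024 bytes).
function estimateChunkCount(filename: string, sizeBytes: number, table: RatioRow[]): number {
  const sizeMb = sizeBytes / (1024 * 1024)
  return Math.round(sizeMb * findChunksPerMb(filename, table))
}
```

With these rows, a filename starting with `wikipedia_en_simple_` resolves to the more specific row even though `wikipedia_en_` also matches, and a filename that matches no named prefix falls through to the catch-all ratio.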

## Not in scope

- Self-calibration on successful ingestion (Phase 4)
- UI consumers — Warning B (partial-embed stall) and the storage budget
  meter / time estimates land in Phase 2.

## Tested

- Type-check clean
- 9 unit tests pass for `findChunksPerMb` and `estimateChunkCount`
- Migration applied on NOMAD3 via hot-patch; 9 seed rows verified in DB
2026-05-16 20:23:47 -07:00
| File | Last commit | Date |
| --- | --- | --- |
| downloads.ts | fix(Downloads): treat missing Content-Type as octet-stream (#848) | 2026-05-11 21:09:40 -07:00 |
| fs.ts | fix: prevent ZIM corrupt file crash and deduplicate Ollama download logs (#741) | 2026-04-17 11:54:04 -07:00 |
| kb_ingest_decision.ts | feat(KB): per-file ingest state machine (Phase 1 of RFC #883) (#888) | 2026-05-15 22:51:06 -07:00 |
| kb_ratio_lookup.ts | feat(KB): ratio registry for disk + time estimates (Phase 1B of RFC #883) | 2026-05-16 20:23:47 -07:00 |
| misc.ts | feat(AI): chat suggestions and assistant settings | 2026-02-01 07:24:21 +00:00 |
| version.ts | feat: support for updating services | 2026-03-11 14:08:09 -07:00 |
| zim_filename.ts | fix(ZIM): preserve co-existing Wikipedia corpora on cleanup (#884) | 2026-05-15 22:29:17 -07:00 |