Commit Graph

101 Commits

Author SHA1 Message Date
Chris Sherwood
54e3b5bd3d feat(maps): add notes input to map pin placement popup
The map_markers backend has accepted a `notes` column since PR #770 and
the popup display path was wired up to render it (commit 6328256), but
the placement UI never got an input. Result: notes are stored,
displayed when present, and impossible to actually enter via the UI.

Add a notes textarea below the name input in the placement popup,
thread the value through `addMarker` and `createMapMarker`, and trim +
null-coalesce on save. Notes display in the marker popup on click is
unchanged and now actually reachable.

- admin/inertia/lib/api.ts: extend createMapMarker request type with
  optional notes
- admin/inertia/hooks/useMapMarkers.ts: addMarker accepts and forwards
  notes (response already populated notes into local state, so no
  display-side change needed)
- admin/inertia/components/maps/MapComponent.tsx: markerNotes state,
  textarea after name input, threaded into handleSaveMarker

Edit-mode for existing markers (so users can backfill notes on
already-placed pins) is intentionally out of scope here - selected-marker
popup is still read-only. That's a follow-up PR if there's demand.
2026-05-21 12:16:44 -07:00
Chris Sherwood
059cf2afbe fix(content): show selected tier on cards while downloads are in flight
Since PR #36b6d8e moved tier-installation tracking from a client-side
persistence model to a server-side derive-from-disk model, the card
display only ever updates once every file in a tier is fully on disk.
A user who picks Standard sees a blank card for the duration of the
download (often hours for large tiers like Wikibooks). Worse, if some
files finish before others, the card briefly shows a lower tier (e.g.
Essential) before promoting to the selected tier on completion, which
reads as "the system didn't accept my pick."

Backend: compute a sibling `downloadingTierSlug` by unioning installed
resource IDs with the IDs from active RunDownloadJob queue entries
(waiting + active + delayed, failed deliberately excluded), then
resolving the highest tier whose every resource is in that union. Set
only when it differs from `installedTierSlug` — no point reporting
"downloading Standard" when Standard is already fully installed.

Frontend: unify the prominent corner badge logic in CategoryCard to a
single `badgeTier` derived from selectedTier > downloadingTier >
installedTier. Spinner + "(downloading)" suffix when in flight,
checkmark for installed/selected. The pill row and lime border follow
the same source.

Verified on NOMAD3: backend correctly resolves the downloading tier
from in-flight BullMQ jobs; CategoryCard shows the spinner badge
immediately on Submit and switches to the checkmark variant when
downloads complete.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
Chris Sherwood
6e5284e563 fix(KB): TierSelectionModal hook order + register IconLibrary
Two related fixes surfaced by armandoescalante in #915 when clicking a
Content Explorer category card (e.g. Medicine) on v1.32.0-rc.6:

1. TierSelectionModal placed a useMemo for freeBytes *after* the
   `if (!category) return null` early return (introduced in PR #901's
   guardrail integration). When `category` transitioned from null to
   non-null on first open, React saw a different hook count between
   renders and crashed the entire component tree with "Rendered more
   hooks than during the previous render", blanking the modal. Moved
   the freeBytes useMemo above the early return so hook order is
   constant.

2. `IconLibrary` was used as the icon prop on the Manage Custom
   Libraries button in remote-explorer.tsx but never registered in
   the DynamicIcon allowlist at admin/inertia/lib/icons.ts. Added it
   to both the import block and the icons map so the warning stops
   firing and the icon renders.

Closes #915.
2026-05-20 10:16:00 -07:00
Chris Sherwood
ffa70a54bc feat(chat): confirm-on-switch + one-chat-model-at-a-time enforcement
Surfaces NOMAD's previously-silent model-stacking behavior and enforces a
"one chat model in VRAM at a time" invariant (the embedding model is
always exempt). Addresses Chris's NOMAD3 testing observation that
switching the dropdown in the chat header was invisibly slow on low-VRAM
hardware because the prior model was never unloaded — Ollama would
either evict it under memory pressure or load the new one on CPU after
the runner choked.

Three integration points all funnel through one new helper:

- **User changes the model dropdown** in an active chat session →
  confirm modal "Switch to {newModel}? Switching to {newModel} will
  start a new chat. Your current conversation stays available in the
  sidebar." On confirm, fire `keep_alive: 0` against the previous chat
  model, clear active session, set the new selection. Cancel snaps the
  visible dropdown back to the previous value (no popup state leaks
  into `selectedModel`).

- **User clicks a session in the sidebar** → no popup (system-initiated).
  Restore the session's stored model into the dropdown and fire
  `unloadChatModels(targetModel)` so anything that isn't the target
  gets the unload hint.

- **Chat page first mount** → page-load normalization. Anything stacked
  from a prior session gets the unload hint with the current selected
  model as the target-to-preserve. Guarded by a ref so it only fires
  once per page lifetime; gated on `selectedModel` being populated.

Backend surface is a single new helper and a single new route:

  `OllamaService.unloadAllChatModelsExcept(targetModel: string | null)`
  → queries `/api/ps`, filters out (a) the embedding model name
  (hardcoded `nomic-embed-text:v1.5` to avoid the RagService circular
  import) and (b) `targetModel`, fires `POST /api/generate` with empty
  prompt + `keep_alive: 0` in parallel against everything else.
  Returns the names that were hinted. Best-effort: network or Ollama
  errors are logged and swallowed so callers don't fail on housekeeping.

  `POST /api/ollama/unload-chat-models` → thin wrapper validating
  `{ targetModel?: string | null }`.

Why `keep_alive: 0` is safe against in-flight inference: per Ollama's
scheduler semantics, the hint sets the post-completion eviction timer
to zero — the runner is not terminated. If Session A is mid-response
on gemma when Session B fires the unload, gemma stays resident until
A's request completes, then evicts. The user-visible worst case is the
race where A's longer-running request re-extends the timer back to the
default and the unload is no-op'd; the next transition (or page reload)
gets another chance, and Ollama's own LRU catches up under memory
pressure regardless. Robust in-flight tracking deferred to a follow-up
if we see stale-state in the wild.

Base `rc`: v1.40.0 will inherit everything from rc.6 via the backmerge.
Frontend tests deferred to a follow-up PR; existing inertia tsconfig
errors are pre-existing and unrelated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
Chris Sherwood
d850cb9588 feat(KB): per-file ingest action + state indicator on Stored Files (RFC #883 §5)
Closes the Manual-mode UX dead-end: after toggling 'Auto-index new content
for AI?' to Manual, a freshly-downloaded ZIM (or any pending_decision file)
had no UI path to opt in for embedding short of the global Sync Storage /
Re-embed All bulk actions. Per RFC #883 §5, each Stored Files row now
carries a state pill and an adaptive single-button action.

State pill (left of any existing warning chips):
  - 'Indexed'    — green; row had chunks in Qdrant or state row is 'indexed'
  - 'Not Indexed' — neutral; state is pending_decision or browse_only
  - 'Failed'     — red
  - 'Stalled'    — amber
  - admin_docs collapsed row has no pill ('Managed by NOMAD' carries it)

Adaptive action button (paired with the existing Delete button per row):
  - pending_decision         → 'Index' (force=false)
  - browse_only              → 'Index' (force=true)
  - failed / stalled         → 'Retry' (force=true)
  - indexed + warning chip   → 'Re-embed' (force=true; confirm modal first)
  - indexed healthy / null   → no action button (bulk Re-embed All covers it)

Backend: GET /api/rag/files now returns
  { files: Array<{ source, state, chunksEmbedded }> }
instead of a flat string[]. State + chunk-count come from a single
KbIngestState query unioned into the existing Qdrant-derived source list
(no new round trips). New POST /api/rag/files/embed validates the source is
known, refuses if any inflight job already targets the same filePath
(prevents double-click duplicate-chunk hazard), pre-deletes Qdrant points
when force=true, then dispatches via the existing _dispatchEmbedJobsFor
helper used by reembedAll.

Per-file Re-embed (force=true on an already-indexed file) routes through a
StyledModal confirmation since it deletes existing vectors before queueing
a fresh job — same destructive-action weight as Delete's inline confirm but
heavier since it affects search until the rebuild finishes.

Folds in PR #907's blank-screen fix because my new render needs the same
generic restored: `<StyledTable<KbFileGroup>>` and `record.displayName`
(instead of the unresolved `sourceToDisplayName(record.source)` that ships
in rc.5 and ReferenceErrors on modal open). PR #907 also adds title
tooltips on the three bulk-action buttons; those tooltips are NOT included
here — let PR #907 land first or independently for that part.

Multi-select bulk-opt-in deferred per discussion: most Manual-mode users
ingest 1-2 files at a time, the existing global toggle covers the bulk
case, and checkboxes would expand scope past what rc.6 should hold. Will
file a follow-up issue for an 'Index N pending files' single-click button
once this lands.

Tests-in-PR scope was limited to keeping `kb_file_grouping.spec.ts` green
after the StoredFileInfo[] signature change (added asInfos() wrapper).
Dedicated unit tests for embedSingleFile (unknown source / inflight refused
/ force=true delete-then-dispatch) and the new state-pill rendering will
land in a follow-up PR alongside Playwright coverage of the row actions.

Verification path: NOMAD3 currently runs project-nomad-admin:integration-
rc6-preview (PRs #907 + #908 atop rc.5). After this branch is built into a
new integration tag, I'll re-run targeted Playwright UAT on the KB modal
covering: state pill rendering per state, Index click on pending_decision
opts in cleanly, Retry on failed re-dispatches successfully, Re-embed
confirmation modal copy + delete-then-dispatch on the military-medicine
partial-stall row, and Delete flow untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
Chris Sherwood
633a3c3500 fix(KB): blank-screen on panel open + tooltips on bulk-action buttons
The Stored Knowledge Base Files render crashed on first open in v1.32.0-rc.5
with `ReferenceError: sourceToDisplayName is not defined`. The table column's
render() called `sourceToDisplayName(record.source)` but the function was
extracted to `lib/kb_file_grouping.ts` in PR #892 and never imported in
KnowledgeBaseModal.tsx. The unhandled error unmounts the entire React tree,
so users see a blank screen ~20s after opening the panel.

Root cause: PR #895 (conditional warnings) rewrote the render() and used
`sourceToDisplayName(record.source)` instead of `record.displayName`, which
KbFileGroup already carries from groupAndSortKbFiles(). PR #895's review
follow-up (cbae48a) compounded this by narrowing the StyledTable generic
from `KbFileGroup` to `{source: string}`, hiding the type drift from tsc.

This restores the post-#892 pattern:
- StyledTable generic back to `KbFileGroup`
- Render uses `record.displayName` (works for both per-file rows and the
  collapsed admin-docs row; calling sourceToDisplayName on the synthetic
  `__admin_docs_group__` would have rendered that literal as the row name).

Also folds in tooltip copy on the three bulk-action buttons (Reset & Rebuild,
Re-embed All, Sync Storage) so the difference in destructiveness is visible
on hover. Uses native `title` attribute via StyledButton's prop pass-through;
no new component dependency.

Inertia tsconfig catches this regression cleanly (TS2304 + TS2339); the
pre-push hook only runs the backend tsconfig which excludes inertia/**, so
the bug shipped. Tracking the typecheck-coverage gap as a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
Jake Turner
7e768f3d09 fix(KB): guardrail bypass during estimate load + Transition sibling (PR #901 review)
- Disable TierSelectionModal Submit while the embed-estimate query is in
flight, so a fast click can't slip past the guardrail with an undefined
estimate.
- Move KbGuardrailModal out of the outer <Transition> and render it as a
Fragment sibling — Headless UI's Transition expects Transition.Child
descendants, not raw conditional siblings.
2026-05-20 10:16:00 -07:00
Chris Sherwood
cf3a924b9f feat(KB): guardrail modal at 50GB / 10%-free thresholds (RFC #883 §7)
One-time confirmation step gating bulk indexing actions that would
consume a substantial amount of disk for embedding storage. Fires only
when the user has policy=Always (i.e., the system would auto-index)
AND the estimate trips either:

  - GUARDRAIL_ABSOLUTE_BYTES = 50 GB embedding cost, OR
  - GUARDRAIL_FREE_DISK_RATIO = 10% of current free disk space

Under policy=Manual the guardrail is silent because the user has
already opted out of automatic ingestion — the files would just queue
as pending_decision either way.

Pieces
- inertia/lib/kb_guardrail.ts: pure decision helper with two constants
  and an evaluateGuardrail() that returns a verdict + reasons. No I/O
  on the helper itself so the logic is trivially testable
- inertia/components/KbGuardrailModal.tsx: confirmation dialog. Headless
  UI Transition + Dialog, amber 'large operation' header, plain-English
  estimate summary, [Cancel] / [Proceed anyway] footer. z-[60] so it
  layers above the tier modal underneath instead of replacing it
- inertia/components/TierSelectionModal.tsx integration: handleSubmit
  now evaluates the guardrail when policy=Always and embedEstimate is
  available; if it trips, we stash the verdict in state and render the
  guardrail modal as an overlay. Confirm runs finalizeSubmit (which is
  the pre-existing onSelectTier + onClose path); Cancel just closes the
  guardrail and leaves the tier modal as-is so the user can change
  their tier choice or flip the policy

The disk-free signal comes from the existing useSystemInfo hook +
getPrimaryDiskInfo helper. Passing freeBytes=0 (unknown) skips the
relative-disk check, so the modal still works on hosts whose disk
introspection failed — just relies on the absolute 50 GB threshold

Tests
- 9 cases in tests/unit/kb_guardrail.spec.ts: standard small batch (no
  trip), exact absolute threshold trips, over-absolute trips, over 10%
  free trips, both-at-once trips with two reasons, freeBytes=0 skip,
  freeBytes=0 + over-absolute trip, exact-10% boundary trips, just-
  under-both safe. All green.

Stacks on feat/kb-tier-estimate-on-disk (#897) — consumes that PR's
estimate endpoint to compute the verdict input. Auto-rebases to rc
when #897 merges.

Pairs with #894 (policy toggle) and #899 (JIT prompt): together the
three PRs cover the 'how do I avoid surprising the user with auto-
indexing they didn't ask for?' arc.

Out of scope (deferred)
- 6 hr time threshold (RFC §7): needs a per-host chunks-per-second
  metric we don't capture yet; would be a follow-up after Phase 4
  self-calibration (RFC §15) lands
- Wider integration (KbPolicyPromptBanner 'Index now' button, manual
  KB-modal sync): TierSelectionModal is the dominant bulk-decision
  surface and the right place to land this first
2026-05-20 10:16:00 -07:00
Jake Turner
9a684a5e62 fix(KB): silent maybe-later error + redundant prompt-state refetches (PR #899 review)
- KbPolicyPromptBanner: add onError toast to maybeLaterMutation so a
    failed policy save surfaces to the user instead of looking like a
    broken button (banner would otherwise reappear on next chat open
    with no explanation).
  - KbPolicyPromptBanner: set staleTime: Infinity on the prompt-state
    query. For users who already picked a policy (the vast majority),
    the result is effectively immutable per session — the mutations
    invalidate the key when it actually changes.
2026-05-20 10:16:00 -07:00
Chris Sherwood
fd153b46b8 feat(KB): first-chat JIT prompt for ingest policy (RFC #883 Phase 3 task 12)
When a user opens AI Chat with content available but no global ingest
policy yet recorded, surface a one-time banner above the chat header
asking how they want new content handled:

  - 'Index existing content' -> sets rag.defaultIngestPolicy=Always and
    triggers a sync so pending_decision files queue immediately
  - 'Maybe later' -> sets policy=Manual; existing and future content
    waits in pending_decision until the user opts in from the KB modal

After either button is clicked the banner never reappears, because both
write the policy KV (the same one #894 manages via the KB modal toggle).
There is intentionally no 'dismiss without deciding' X — that would just
re-show the banner forever.

Backend
- New GET /api/rag/policy-prompt-state returns
  {shouldPrompt, hasContent, totalFiles}
- RagService.getPolicyPromptState() reads KVStore('rag.defaultIngestPolicy')
  and counts kb_ingest_state rows; shouldPrompt is true only when policy
  is null AND scanner has seen >=1 file (avoids prompting on empty NOMADs)

Frontend
- New KbPolicyPromptBanner component (~120 LOC) handles the two-button
  decision flow with optimistic loading state, success/error toasts, and
  invalidates kbPolicyPromptState + ingestPolicy + embed-jobs + storedFiles
  on success
- Mounted in components/chat/index.tsx as the first child of the main
  content column so it sits above the chat title bar without taking space
  when shouldPrompt is false (renders nothing)
- Reads aiAssistantName from Inertia page props so banner copy matches
  the user's chosen assistant name

Stacks on feat/kb-policy-toggle (#894) because the policy KV mechanism
it writes through is introduced there. Both can land in rc.5; this PR
auto-rebases to rc once #894 merges.

Existing users on first upgrade to v1.32.0 will see this banner on first
chat visit post-upgrade — an explicit opt-in moment for content that was
already on disk. New users see it the first time they have curated
content downloaded.
2026-05-20 10:16:00 -07:00
Chris Sherwood
4e8caddcc2 fix(KB): remove redundant Refresh button from Processing Queue
useEmbedJobs already polls every 2s while jobs are active (and 30s when
idle) and auto-invalidates Stored Files when the queue drains. The
manual Refresh button was a no-op signal — it just confuses users who
click it and see no change. Per-job 'last activity Xs ago' lines remain
as the live-recency indicator.

Stacks on feat/kb-job-status-pill (#893) since the Refresh button only
exists in that branch.
2026-05-20 10:16:00 -07:00
Jake Turner
a0047c1555 fix(KB): surface file-warning compute failures instead of masking as healthy (PR #895 review)
`computeFileWarnings()` previously caught all errors and returned an empty
map, which the frontend rendered as "every file is healthy" — reintroducing
exactly the silent-failure mode this surface exists to expose.

Return `{ ok, warnings }`; flip `ok: false` from the catch. KB modal renders
an inline amber notice under the Stored Files header when `ok === false`,
leaving per-row warning rendering untouched. Transient failures self-heal on
the next 30s poll; no toast spam.
2026-05-20 10:16:00 -07:00
Chris Sherwood
563f86a22b feat(KB): conditional warnings A + B on Stored Files (RFC #883 §6)
Surfaces two silent failure modes that the prior binary
"any-chunks-in-Qdrant ⇒ embedded" check could not distinguish from
healthy ingestion:

- **Warning A — Zero-chunk file** (file_size > 100 MB, chunks = 0)
  Fires on video-only / image-only ZIMs (`lrnselfreliance_en_all`,
  TED talks, etc.) that the pipeline completes "successfully" with no
  extractable text. AI Assistant literally cannot reference these.

- **Warning B — Partial-embed stall** (chunks < 50% of expected from
  the ratio registry). Surfaces the simple_wiki "266 of 600,000 chunks"
  case observed during NOMAD1 ingestion testing — previously these
  looked identical to fully-completed embeds in the UI.

Both warnings render only when their condition is met (silent by
default; noisy only on real problems).

Base is `feat/kb-ratio-registry` (#891) because Warning B's "expected
chunks" estimate comes from `KbRatioRegistry.estimateChunks()`. GitHub
fast-forwards to `rc` once #891 merges.

- `app/utils/kb_warning_decision.ts` — pure `decideWarnings(inputs)`
  with thresholds (`100 MB`, `0.5×`) as exported constants. 10 unit
  tests cover the healthy case, both warnings, the under/at/over
  boundary, the registry-miss suppression, and the video-only registry
  case (`expectedChunks: 0` correctly skips Warning B).
- `RagService.computeFileWarnings()` — single Qdrant scroll tallies
  chunks per source, filesystem walk fills in zero-chunk files,
  ratio registry estimates the expectation, decision function emits.
- New endpoint `GET /api/rag/file-warnings` returns
  `Record<source, FileWarning[]>` (sources with no warnings are
  omitted, so the frontend can `warnings[source] ?? []` for clean
  defaults).
- KB modal: warnings render inline under the file name as amber-tinted
  pills. Polled every 30s alongside the existing health check.

- Warning C — chunks skipped due to length. PR #890 (#881 fix) prevents
  the silent drop at the embed boundary, so the underlying condition
  shouldn't fire anymore. If we still want to surface "we truncated
  N chunks to fit", that needs separate `skipped_count` tracking in
  EmbedFileJob — a Phase 2 follow-up.
- Suppressing Warning B during active mid-ingestion. The user can cross-
  reference the Processing Queue to know it's in-flight; suppressing
  warnings while a job runs would mask real stalls where the job died
  mid-batch. Will revisit when per-card status is wired through.
- Use of `kb_ingest_state.chunks_embedded` (#888) as the chunk count
  source. This PR uses Qdrant scroll directly so it can land
  independently of #888.

- 10 new unit tests on `decideWarnings`, all pass
- Type-check clean
- Hot-patch + browser smoke test deferred until #891 lands (the ratio
  registry needs to exist in the DB for `estimateChunks()` to return
  non-null estimates — without it, only Warning A fires which is still
  useful but Warning B stays dormant)
2026-05-20 10:16:00 -07:00
Chris Sherwood
e68c753e39 feat(KB): surface embedding-disk estimate in curated tier-change modal (RFC #883 §1)
When a user picks a tier in TierSelectionModal, show how much additional
disk space the AI Assistant will need if the new ZIMs are indexed, plus
a policy-aware footer explaining whether they'll auto-index (Always) or
wait for opt-in (Manual). Estimates consume #891's KbRatioRegistry via a
new POST /api/rag/estimate-batch endpoint.

Backend
- New POST /api/rag/estimate-batch route + RagController.estimateBatch
- VineJS schema accepting array of {filename, sizeBytes}, capped at 500
- KbRatioRegistry.estimateBatch aggregates via the existing prefix-match
  lookup, returns {totalChunks, totalBytes, hasUnknown}
- New BYTES_PER_CHUNK_ON_DISK constant (~8 KB: 3 KB vector + ~3 KB chunk
  text + ~2 KB payload/index overhead). Tunable; will be replaced by
  Phase 4 self-calibration once we have real measurements.
- Controller normalizes incoming filenames via path.basename so callers
  that send full paths or URLs still match registry prefixes correctly.

Frontend
- api.estimateEmbeddingBatch() client method
- TierSelectionModal: when localSelectedSlug is set, resolve the tier's
  resources (incl. inherited tiers), POST to /estimate-batch, and render
  a new info block with the +~X GB figure + ingest-policy copy. Also
  fetches rag.defaultIngestPolicy so the same block surfaces whether
  indexing will fire automatically or wait for the user.
- resourceFilename() helper extracts the basename from the resource URL
  so the registry lookup hits the right prefix regardless of mirror.

Tests
- 4 new cases in tests/unit/kb_ratio_lookup.spec.ts covering the
  estimateBatch aggregator: standard sum, unknown-flagging, video-only
  ZIM (0 chunks but known, hasUnknown stays false), empty input.

Stacks on feat/kb-ratio-registry (#891) — consumes the registry table
seeded by that PR. Once #891 merges to rc, this PR auto-rebases.

Out of scope for this PR (deferred to follow-ups):
- Per-batch opt-in checkbox (RFC §1's '☑ Also index these for AI') needs
  a per-batch policy override path and is a separate PR
- Guardrail modal at 50 GB / 10% free / 6 hr thresholds (RFC §7) is also
  separate; this PR is informational, not gating
- Time-to-embed estimate awaits a chunks-per-second metric per host
2026-05-20 10:16:00 -07:00
chriscrosstalk
8eb8809154 feat(KB): Always/Manual ingest policy toggle (RFC #883 §1/§4) (#894)
* feat(KB): per-file ingest state machine (Phase 1 of RFC #883)

Adds a persistent state machine for AI knowledge-base ingestion so the
scanner can distinguish "fully indexed", "user opted out", "failed", and
"stalled" from each other — none of which were derivable from the prior
binary "any chunks in Qdrant ⇒ embedded" check.

## What lands

- New table `kb_ingest_state` keyed by `file_path` with enum state column
  (`pending_decision | indexed | browse_only | failed | stalled`).
  Independent of `installed_resources` so it covers both curated downloads
  and manually-uploaded KB files.
- New KV key `rag.defaultIngestPolicy` (string: `Always | Manual`).
  Registered now but not consumed yet — JIT prompt + wizard step land in
  Phase 3 of the RFC.
- `EmbedFileJob.handle` writes state on terminal outcomes:
  - Success (final batch) → `indexed` + chunks count
  - `UnrecoverableError` → `failed` + error message
  - Retryable errors are left to BullMQ's existing retry path
- `scanAndSyncStorage` swaps the binary qdrant check for a state-aware
  decision tree (see `decideScanAction`). Existing installs auto-backfill
  on first scan: files with chunks in Qdrant but no state row become
  `indexed`; new files start as `pending_decision`.
- `deleteFileBySource` drops the state row last, so removed files
  disappear entirely instead of leaving an orphan that the next scan
  would re-dispatch into nothing.

## What does NOT land here

- Ratio registry (separate PR) — needed for partial-stall detection and
  cost estimates, but a separable concern.
- #880 follow-up initial-progress anchor (separate tiny PR).
- Phase 2 UI (status pill, per-card actions, conditional warnings).
- Phase 3 policy surfaces (wizard step, JIT prompt, guardrail modal).
- PR #886's bulk-action hookup — `_deletePointsBySource` / Re-embed All
  / Reset & Rebuild would also want to set state, but #886 isn't merged
  yet; that wiring goes in a follow-up once #886 lands.

## Target

This is forward work for v1.40.0 (RFC #883). Branching off `rc` because
that's the current latest base and post-GA Jake will sync rc→dev; a
retarget at PR-open time is a fast-forward if requested.

## Tests

- 9 new unit tests for `decideScanAction` covering all five states plus
  the no-row / chunks-present / chunks-missing combinations
- Type-check clean
- Smoke-tested end-to-end on NOMAD3 via hot-patch:
  - Backfill: 5 ZIMs + 2 KB uploads with existing chunks in Qdrant all
    came back `indexed` on first scan
  - Pending dispatch: a video-only ZIM with no chunks (`lrnselfreliance`)
    came back `pending_decision` and was correctly re-dispatched (Bull
    deduped to its historical `:completed` jobId — bgauger's #886 fix
    drains that)
  - Delete hook: deleting a KB upload via `DELETE /api/rag/files`
    removed both the disk file and the state row

* feat(KB): Always/Manual ingest policy toggle (RFC #883 §1/§4)

Activates the `rag.defaultIngestPolicy` KV registered in Phase 1
(#888) so users on a fresh install (or anyone who picks Manual mode)
no longer get every new ZIM auto-dispatched to the embed pipeline.

## Stacks on #888

This PR's base is `feat/kb-ingest-state-machine` (#888). The state
machine has to be in place for the decision function to be policy-aware;
GitHub will fast-forward the base to `rc` once #888 merges.

## Backend changes

- `decideScanAction` now takes a `policy: 'Always' | 'Manual'` argument
  (defaults to `Always` for backward compatibility).
- New `ScanAction` kind: `create_pending`. Manual mode records that the
  scanner has seen a new file (so the UI can surface a per-card Index
  affordance later) without dispatching an EmbedFileJob.
- `scanAndSyncStorage` reads the KV and passes it through. The scan-result
  log line now includes the active policy and a `waiting on user` count
  for Manual-mode hits.
- `rag.defaultIngestPolicy` added to `SETTINGS_KEYS` so it's reachable
  through the existing `GET/PATCH /api/system/settings` surface — no new
  endpoint.

## Frontend changes

- New section in the KB panel between "Why upload" and "Processing Queue":
  "Auto-index new content for AI? [Always | Manual]" — segmented radio
  with copy explaining the 5-10× disk multiplier. Default Always.
- `useQuery('ingestPolicy')` reads the current value; clicking the
  inactive option mutates and shows a notification confirming the new
  behavior.

## Tests

- 14 unit tests on `decideScanAction` (was 9) — split into Always-mode
  cases (preserves Phase 1's contract) and Manual-mode cases
  (`create_pending`, `pending_decision → skip`, etc.).
- Type-check clean.
- Hot-patch + browser verification deferred until #888 lands; the state
  machine smoke-tested cleanly on NOMAD3 in #888's PR, and this PR's
  decision-tree changes are exhaustively unit-tested.

## RFC open question §3 — policy-change re-trigger

Switching Manual → Always doesn't auto-dispatch existing `pending_decision`
rows immediately. The next scan re-evaluates and dispatches them under
the new policy. This matches the RFC's "treat the switch as I've-
thought-about-it" instinct for the guardrail; full guardrail
implementation lands in Phase 3 task 14.

---------

Co-authored-by: Jake Turner <52841588+jakeaturner@users.noreply.github.com>
2026-05-20 10:16:00 -07:00
Chris Sherwood
43ca584b6c feat(KB): status pill + last-activity timestamp on Processing Queue (RFC #883 §5/§10)
Each in-flight (or stuck) embedding job gets a colored health pill,
relative-activity timestamp, and chunk counter so users can tell at a
glance whether ingestion is making progress.

## Health states

- **🟢 Active** — last batch < 2 min ago
- **🟡 Slow** — last batch 2-5 min ago (CPU-paced multi-batch ingestion
  lives here naturally; not always a problem)
- **🔴 Stalled** — last batch > 5 min ago (likely real problem)
- ** Waiting** — queued, no batch started yet
- **🔴 Failed** — job recorded failed status

## What lands

- New backend util `kb_job_health.ts` with pure `computeJobHealth(input)`
  decision function. Time-based thresholds (2 min / 5 min) inlined as
  constants. 9 unit tests pin the boundaries.
- `EmbedJobWithProgress` gains `lastBatchAt`, `startedAt`, `chunks` —
  already set by `EmbedFileJob.handle` on every batch transition, just
  not previously surfaced through `listActiveJobs`.
- Frontend `kb_job_health_display.ts` maps each status to a Tailwind
  dot color, label, and aria-label so backend and UI stay in sync.
- `ActiveEmbedJobs.tsx` renders the pill, "last activity Xs ago", and
  chunk counter above each progress bar. Adds a manual Refresh button
  and "Last updated Xs ago" line — the existing 2s/30s auto-poll
  cadence in `useEmbedJobs` is left intact.
- Live tick at 5s keeps the relative timestamps current without
  re-fetching from the API.

## Not in scope

- Per-card Cancel / Retry / Un-index — separate Phase 2 PR
- Conditional warnings A/B/C — separate Phase 2 PR
- Computing throughput rate (chunks/min) — needs ratio registry consumer
  (Phase 2 follow-up); for now the pill answers the "is it stuck?"
  question directly without a rate estimate.
2026-05-20 10:16:00 -07:00
Chris Sherwood
c64ec97de4 feat(KB): group admin docs into single row in Stored Files (RFC #883 §9)
Project NOMAD's bundled docs (`/app/docs/*.md` and `README.md`) each
embed as their own KB source — currently rendering as 12+ individual
rows that swamp user-uploaded content in the Stored Files table.
Collapse them into one informational row:

> Project NOMAD documentation · 12 files · Managed by NOMAD

The admin-docs row hides the Delete button (those files would be
re-embedded on the next sync anyway, so deleting is a footgun). User
uploads and ZIMs keep their existing per-row Delete UX.

Also adds deterministic sort: ZIMs → user uploads → admin docs → other,
alphabetical within each bucket. Pure frontend change — `/api/rag/files`
response shape unchanged.

Decision logic extracted to `kb_file_grouping.ts` with 9 unit tests
covering bucket classification, sort order, count noun pluralization,
and empty-input handling.
2026-05-20 10:16:00 -07:00
Jake Turner
4c211964e0 fix(KB): add re-embed and reset & rebuild opts to fix broken embeddings (#886) 2026-05-20 10:16:00 -07:00
Chris Sherwood
f41027ca39 fix(Maps): render notes in marker popup when populated
Closes #796.

The maps API has accepted and persisted `notes` on map markers since
PR #770, but the marker popup component still rendered name only and
ignored the field. Now the popup shows a notes block beneath the name
when it's populated, with whitespace preserved and long text wrapped.

Threaded `notes` through the read path:
- `api.listMapMarkers` / `api.createMapMarker` response types
- `MapMarker` interface in `useMapMarkers` and the data.map projection
- `MapComponent`'s selectedMarker popup

The create/update UI is unchanged — users still set notes via the API
or DB directly, matching the issue's stated scope. A marker entry with
empty/whitespace-only notes renders the same as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
Kenneth Brewer
08838b1944 feat(maps): show map coordinates on mouse move (#786)
* feat: Updated the map to show the coordinates as the user moves the cursor over the map. Changed the cursor to a crosshairs to make it easier to place map markers.

* Moved the scale unit control to its own component file for easier maintenance. Enhanced the behavior of the coordinate display on the map to not display when over the on screen controls, and the navigation bar. Added a toggle to turn off the coordinate display if the user doesn't wish to see it. Intentionally left the coordinate display when over a map marker so that the coordinates of the map marker can be estimated. In the future I intend to add the coordinates of a map marker when the map marker is clicked so that behavior may change in the future.
---------

Co-authored-by: Kenneth Brewer <kennethbrewer3@protonmail.com>
2026-05-20 10:16:00 -07:00
chriscrosstalk
8c06b5ba67 fix(UI): Country Picker UX polish + auto-refresh stored files (#817)
Three UX issues from manual testing of #780 on NOMAD3.

1. Slider was unusable for multi-step zoom changes
   `setLoading(true)` fired immediately on every selection or maxzoom change,
   which disabled the slider until the request returned. Even with the 400ms
   debounce delaying the network call, the UI was locked the whole time.
   User couldn't drag through zoom levels to find the right one.

   Fix: bump debounce to 1500ms, move `setLoading(true)` inside the setTimeout
   so it only flips after the debounce expires. Slider stays interactive
   throughout the wait. Slider `disabled` now only ties to `downloading`
   (active extract dispatch), not `loading` (preflight in flight). The
   existing requestId stale-safe pattern handles concurrent changes.

2. Newly-downloaded maps didn't show in Stored Map Files until manual refresh
   `props.maps.regionFiles` is rendered server-side and passed through Inertia
   props; without a partial reload it stayed stale until the user navigated
   away and back.

   Fix: watch `useDownloads({ filetype: 'map' })` count via a ref. When the
   count drops (a download finished), trigger `router.reload({ only: ['maps'] })`
   to refresh just the maps prop. Existing pattern from elsewhere in the
   codebase.

3. Country picker didn't surface already-downloaded countries
   When a user re-opened "Choose Countries" after downloading UK, UK appeared
   unchecked with no indication it was already on disk.

   Fix: pass installed pmtiles filenames into the modal as a prop; parse with
   regex `^([a-z]{2})_[\w-]+_z\d+\.pmtiles$` to extract country codes from
   single-country extracts (matching MapService.buildRegionSlug's iso2 lowercase
   slug pattern). Render an "Installed" badge on those countries with a tooltip
   explaining they're re-selectable for redownload at a different zoom.

   Group / custom multi-country extracts don't reverse-map cleanly from
   filename and are skipped here. Could be a follow-up if useful.

Files:
  admin/inertia/components/CountryPickerModal.tsx
    - SINGLE_COUNTRY_FILENAME_RE: iso2 + flexible date + zoom
    - installedFilenames prop with default []
    - installedCountrySet derivation via useMemo
    - "Installed" badge rendering on country list rows
    - Debounce: 400ms -> 1500ms; setLoading inside setTimeout
    - Slider disabled: only on `downloading`
  admin/inertia/pages/settings/maps.tsx
    - import useEffect/useRef
    - destructure activeMapDownloads from useDownloads
    - useEffect on download count drop -> router.reload({ only: ['maps'] })
    - pass installedFilenames to CountryPickerModal

All three fixes tested end-to-end on NOMAD3.
2026-05-20 10:16:00 -07:00
0xGlitch
94059b0aaf feat(Maps): regional map downloads via go-pmtiles extract (#780)
* feat(maps): add regional map downloads via go-pmtiles extract

* address Copilot review feedback on PR #780

- auto-refresh preflight on selection/maxzoom change with 400ms debounce and
  requestId stale-safety so the confirm button no longer requires a two-step
  "Estimate Size" -> "Start Download" dance
- safeUpdateProgress helper replaces fire-and-forget updateProgress().catch()
  pattern so cancelled-job errors (code -1) can't surface as unhandled rejections
- gate world basemap source on worldBasemapReady - when ensureWorldBasemap()
  fails we already delete world.pmtiles, so emitting the source was producing
  404s on every tile request
- verify go-pmtiles binary SHA256 at image build time; upstream doesn't ship a
  checksums file so per-arch hashes are pinned as build args with a regenerate
  note when bumping PMTILES_VERSION
2026-05-20 10:16:00 -07:00
Henry Estela
2d8a02f257 fix(RAG): add start button in kb modal and ensure restart policy exists (#700)
Adds a check to RAG health to make sure nomad_qdrant is online, if not
then the user will be blocked from clicking any buttons in the KB modal
until they click the start qdrant button and let the container start

There is a new file qdrant_restart_policy_provider.ts, which tries to
ensure that the restart policy always exists for the nomad_qdrant
container even though the policy should have been there when the
container is created.
2026-05-20 10:16:00 -07:00
chriscrosstalk
6c33a96972 fix(AI): allow cancelling in-progress model downloads and ensure consistent progress UI (#701)
Adds a cancel button to in-progress Ollama model downloads and unifies
the Active Model Downloads card layout with the Active Downloads card
used for ZIMs, maps, and pmtiles (byte counts, progress bar, live speed,
status indicator).

Closes #676.
2026-04-21 14:26:28 -07:00
chriscrosstalk
a813468949 feat(maps): add imperial/metric toggle for scale bar (#641)
Defaults to metric for global audience. Persists choice in localStorage.
Segmented button styled to match MapLibre controls.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:26:50 -07:00
chriscrosstalk
0183b42d71 feat(maps): add scale bar and location markers (#636)
Add distance scale bar and user-placed location pins to the offline maps viewer.

- Scale bar (bottom-left) shows distance reference that updates with zoom level
- Click anywhere on map to place a named pin with color selection (6 colors)
- Collapsible "Saved Locations" panel lists all pins with fly-to navigation
- Full dark mode support for popups and panel via CSS overrides
- New `map_markers` table with future-proofed columns for routing (marker_type,
  route_id, route_order, notes) to avoid a migration when routes are added later
- CRUD endpoints: GET/POST /api/maps/markers, PATCH/DELETE /api/maps/markers/:id
- VineJS validation on create/update
- MapMarker Lucid model

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:26:50 -07:00
Jake Turner
cb4fa003a4 fix: cache docker list requests, aiAssistantName fetching, and ensure inertia used properly 2026-04-03 14:26:50 -07:00
Jake Turner
1e4b7aea82 fix(UI): manual import map for DynamicIcon to avoid huge bundle of Tabler icons 2026-04-03 14:26:50 -07:00
Jake Turner
a14dd688fa feat(KnowledgeBase): support up to 5 files upload of 100mb each per req 2026-04-03 14:26:50 -07:00
chriscrosstalk
bac53e28dc feat(downloads): rich progress, friendly names, cancel, and live status (#554)
* feat(downloads): rich progress, friendly names, cancel, and live status

Redesign the Active Downloads UI with four improvements:

- Rich progress: BullMQ jobs now report downloadedBytes/totalBytes instead
  of just a percentage, showing "2.3 GB / 5.1 GB" instead of "78% / 100%"
- Friendly names: dispatch title metadata from curated categories, Content
  Explorer library, Wikipedia selector, and map collections
- Cancel button: Redis-based cross-process abort signal lets users cancel
  active downloads with file cleanup. Confirmation step prevents accidents.
- Live status indicator: green pulsing dot with transfer speed for active
  downloads, orange stall warning after 60s of no data, gray dot for queued

Backward compatible with in-flight jobs that have integer-only progress.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(downloads): fix cancel, dismiss, speed, and retry bugs

- Speed indicator: only set prevBytesRef on first observation to prevent
  intermediate re-renders from inflating the calculated speed
- Cancel: throw UnrecoverableError on abort to prevent BullMQ retries
- Dismiss: remove stale BullMQ lock before job.remove() so cancelled
  jobs can actually be dismissed
- Retry: add getActiveByUrl() helper that checks job state before
  blocking re-download, auto-cleans terminal jobs
- Wikipedia: reset selection status to failed on cancel so the
  "downloading" state doesn't persist

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(downloads): improve cancellation logic and surface true BullMQ job states

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jake Turner <jturner@cosmistack.com>
2026-04-03 14:26:50 -07:00
Henry Estela
7711b5f0e8 feat: switch all PNG images to WEBP (#575)
* feat(web): Switch all png except favicon to webp format
* fix(docs): use relative path for README project logo
2026-04-03 14:26:50 -07:00
chriscrosstalk
6a0195b9fc fix(UI): constrain install activity feed height with auto-scroll (#611)
The App Installation Activity list on the Easy Setup complete page grew
unboundedly, pushing Active Downloads off-screen. Caps the list at ~8
visible items with overflow scrolling, auto-scrolling to keep the latest
activity visible.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:26:50 -07:00
Henry Estela
69c15b8b1e feat(AI): enable remote AI chat host 2026-04-03 14:26:50 -07:00
Jake Turner
bd015f4c56 fix(UI): improve version display in Settings sidebar (#547) 2026-03-25 16:30:35 -07:00
Jake Turner
c67653b87a fix(UI): use StyledButton in TierSelectionModal for consistency (#543) 2026-03-25 16:30:35 -07:00
Chris Sherwood
78c0b1d24d fix(ai): surface model download errors and prevent silent retry loops
Model downloads that fail (e.g., when Ollama is too old for a model)
were silently retrying 40 times with no UI feedback. Now errors are
broadcast via SSE and shown in the Active Model Downloads section.
Version mismatch errors use UnrecoverableError to fail immediately
instead of retrying. Stale failed jobs are cleared on retry so users
aren't permanently blocked.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 16:30:35 -07:00
Jake Turner
baeb96b863 fix(ui): support proper size override of LoadingSpinner 2026-03-20 11:46:10 -07:00
Chris Sherwood
023e3f30af fix(downloads): allow users to dismiss failed downloads
Failed download jobs persist in BullMQ forever with no way to clear
them, leaving stale error notifications in Content Explorer and Easy
Setup. Adds a dismiss button (X) on failed download cards that removes
the job from the queue via a new DELETE endpoint.

- Backend: DELETE /api/downloads/jobs/:jobId endpoint
- Frontend: X button on failed download cards with immediate refresh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 11:46:10 -07:00
Chris Sherwood
b0b8f07661 fix: improve download reliability with stall detection, failure visibility, and Wikipedia status tracking
Three bugs caused downloads to hang, disappear, or leave stuck spinners:
1. Wikipedia downloads that failed never updated the DB status from 'downloading',
   leaving the spinner stuck forever. Now the worker's failed handler marks them as failed.
2. No stall detection on streaming downloads - if data stopped flowing mid-download,
   the job hung indefinitely. Added a 5-minute stall timer that triggers retry.
3. Failed jobs were invisible to users since only waiting/active/delayed states were
   queried. Now failed jobs appear with error indicators in the download list.

Closes #364, closes #216

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 11:46:10 -07:00
Jake Turner
9b74c71f29 fix(UI): minor styling fixes for Night Ops 2026-03-20 11:46:10 -07:00
Chris Sherwood
e4fde22dd9 feat(UI): add Debug Info modal for bug reporting
Add a "Debug Info" link to the footer and settings sidebar that opens a
modal with non-sensitive system information (version, OS, hardware, GPU,
installed services, internet status, update availability). Users can copy
the formatted text and paste it into GitHub issues.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 11:46:10 -07:00
Jake Turner
9220b4b83d fix(maps): respect request protocol for reverse proxy HTTPS support 2026-03-20 11:46:10 -07:00
Chris Sherwood
b1edef27e8 feat(UI): add Night Ops dark mode with theme toggle
Add a warm charcoal dark mode ("Night Ops") using CSS variable swapping
under [data-theme="dark"]. All 23 desert palette variables are overridden
with dark-mode counterparts, and ~313 generic Tailwind classes (bg-white,
text-gray-*, border-gray-*) are replaced with semantic tokens.

Infrastructure:
- CSS variable overrides in app.css for both themes
- ThemeProvider + useTheme hook (localStorage + KV store sync)
- ThemeToggle component (moon/sun icons, "Night Ops"/"Day Ops" labels)
- FOUC prevention script in inertia_layout.edge
- Toggle placed in StyledSidebar and Footer for access on every page

Color replacements across 50 files:
- bg-white → bg-surface-primary
- bg-gray-50/100 → bg-surface-secondary
- text-gray-900/800 → text-text-primary
- text-gray-600/500 → text-text-secondary/text-text-muted
- border-gray-200/300 → border-border-subtle/border-border-default
- text-desert-white → text-white (fixes invisible text on colored bg)
- Button hover/active states use dedicated btn-green-hover/active vars

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 11:46:10 -07:00
Jake Turner
460756f581 feat(AI Assistant): improved state management and performance 2026-03-11 14:08:09 -07:00
Jake Turner
6f0fae0033 feat(AI Assistant): remember last model used 2026-03-11 14:08:09 -07:00
Jake Turner
58b106f388 feat: support for updating services 2026-03-11 14:08:09 -07:00
Jake Turner
db69428193 fix(AI): allow force refresh of models list 2026-03-11 14:08:09 -07:00
Jake Turner
dfa896e86b feat(RAG): allow deletion of files from KB 2026-03-04 20:05:14 -08:00
Jake Turner
99b96c3df7 feat(RAG): display embedding queue and improve progress tracking 2026-03-04 20:05:14 -08:00
Jake Turner
96beab7e69 feat(AI Assistant): custom name option for AI Assistant 2026-03-04 20:05:14 -08:00