Commit Graph

217 Commits

Author SHA1 Message Date
jakeaturner
736c9bd672 fix(security): canonicalize hostnames to block IPv4-mapped IPv6 IMDS bypass
Replace literal string matching with ipaddr.js parsing
so equivalent encodings of 169.254.169.254
(::ffff:169.254.169.254, ::ffff:a9fe:a9fe,fully-expanded forms)
and fd00:ec2::254 are all rejected.
2026-05-20 10:16:00 -07:00
jakeaturner
b3dac9b324 fix(security): match IPv6 SSRF patterns against unbracketed hostnames 2026-05-20 10:16:00 -07:00
jakeaturner
989a401f28 fix(AI): improve remote Ollama url validation to prevent SSRF vulnerability 2026-05-20 10:16:00 -07:00
Jake Turner
82f67debc1 fix(models): correct inverted belongsTo keys on ChatMessage.session (#921)
foreignKey/localKey were swapped on the ChatMessage → ChatSession
relation. Per Lucid's belongsTo contract, foreignKey is the column
on the child model and localKey is on the parent — so this must be
{ foreignKey: 'session_id', localKey: 'id' }, mirroring the inverse
hasMany on ChatSession.

The relation is not currently preloaded anywhere in the codebase, so
no runtime behavior changes today; this closes a latent bug that
would have broken any future preload('session') call.
2026-05-20 10:16:00 -07:00
Chris Sherwood
a5fe52f66f fix(KB): respect Manual ingest policy on post-download dispatch
RunDownloadJob's onComplete handler was unconditionally firing
EmbedFileJob.dispatch after every ZIM download, gated only by "is
Ollama installed?". The rag.defaultIngestPolicy KV setting was never
consulted, so users who explicitly set Auto-index to Manual still got
every newly-downloaded ZIM auto-embedded.

RagService.scanAndSync already handles Manual correctly by recording
pending_decision rows instead of dispatching (rag_service.ts:1587-1638
via decideScanAction). The post-download path skipped that gate.

Mirror the same check at the dispatch site: read the policy KV; if
Manual, firstOrCreate a pending_decision row in kb_ingest_state so the
per-file Index affordance from PR #909 surfaces the file the same way
scan-time-discovered Manual files do. firstOrCreate (not create) so a
re-download doesn't demote an existing indexed/failed row — the user
can explicitly re-index from the KB panel if they want fresh content.

Verified on NOMAD3: with rag.defaultIngestPolicy='Manual', every ZIM
downloaded today via Content Explorer (agriculture-essential +
computing-essential, ~62 MB across 7 files) wrote kb_ingest_state
rows with state='indexed' instead of pending_decision. Real bug,
not a hot-patch artifact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
Chris Sherwood
059cf2afbe fix(content): show selected tier on cards while downloads are in flight
Since PR #36b6d8e moved tier-installation tracking from a client-side
persistence model to a server-side derive-from-disk model, the card
display only ever updates once every file in a tier is fully on disk.
A user who picks Standard sees a blank card for the duration of the
download (often hours for large tiers like Wikibooks). Worse, if some
files finish before others, the card briefly shows a lower tier (e.g.
Essential) before promoting to the selected tier on completion, which
reads as "the system didn't accept my pick."

Backend: compute a sibling `downloadingTierSlug` by unioning installed
resource IDs with the IDs from active RunDownloadJob queue entries
(waiting + active + delayed, failed deliberately excluded), then
resolving the highest tier whose every resource is in that union. Set
only when it differs from `installedTierSlug` — no point reporting
"downloading Standard" when Standard is already fully installed.

Frontend: unify the prominent corner badge logic in CategoryCard to a
single `badgeTier` derived from selectedTier > downloadingTier >
installedTier. Spinner + "(downloading)" suffix when in flight,
checkmark for installed/selected. The pill row and lime border follow
the same source.

Verified on NOMAD3: backend correctly resolves the downloading tier
from in-flight BullMQ jobs; CategoryCard shows the spinner badge
immediately on Submit and switches to the checkmark variant when
downloads complete.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
jakeaturner
a9c48fc098 refactor(AI): single source of truth for embedding model name
Lift the hardcoded 'nomic-embed-text:v1.5' string out of both
RagService and OllamaService into a shared EMBEDDING_MODEL_NAME
constant in constants/ollama.ts. The duplicate in OllamaService
existed only to dodge a circular import with RagService; the
constants module has no service imports, so a shared constant
eliminates both the duplication and the drift risk called out
in the inline "keep in sync" comment.
2026-05-20 10:16:00 -07:00
Chris Sherwood
ffa70a54bc feat(chat): confirm-on-switch + one-chat-model-at-a-time enforcement
Surfaces NOMAD's previously-silent model-stacking behavior and enforces a
"one chat model in VRAM at a time" invariant (the embedding model is
always exempt). Addresses Chris's NOMAD3 testing observation that
switching the dropdown in the chat header was invisibly slow on low-VRAM
hardware because the prior model was never unloaded — Ollama would
either evict it under memory pressure or load the new one on CPU after
the runner choked.

Three integration points all funnel through one new helper:

- **User changes the model dropdown** in an active chat session →
  confirm modal "Switch to {newModel}? Switching to {newModel} will
  start a new chat. Your current conversation stays available in the
  sidebar." On confirm, fire `keep_alive: 0` against the previous chat
  model, clear active session, set the new selection. Cancel snaps the
  visible dropdown back to the previous value (no popup state leaks
  into `selectedModel`).

- **User clicks a session in the sidebar** → no popup (system-initiated).
  Restore the session's stored model into the dropdown and fire
  `unloadChatModels(targetModel)` so anything that isn't the target
  gets the unload hint.

- **Chat page first mount** → page-load normalization. Anything stacked
  from a prior session gets the unload hint with the current selected
  model as the target-to-preserve. Guarded by a ref so it only fires
  once per page lifetime; gated on `selectedModel` being populated.

Backend surface is a single new helper and a single new route:

  `OllamaService.unloadAllChatModelsExcept(targetModel: string | null)`
  → queries `/api/ps`, filters out (a) the embedding model name
  (hardcoded `nomic-embed-text:v1.5` to avoid the RagService circular
  import) and (b) `targetModel`, fires `POST /api/generate` with empty
  prompt + `keep_alive: 0` in parallel against everything else.
  Returns the names that were hinted. Best-effort: network or Ollama
  errors are logged and swallowed so callers don't fail on housekeeping.

  `POST /api/ollama/unload-chat-models` → thin wrapper validating
  `{ targetModel?: string | null }`.

Why `keep_alive: 0` is safe against in-flight inference: per Ollama's
scheduler semantics, the hint sets the post-completion eviction timer
to zero — the runner is not terminated. If Session A is mid-response
on gemma when Session B fires the unload, gemma stays resident until
A's request completes, then evicts. The user-visible worst case is the
race where A's longer-running request re-extends the timer back to the
default and the unload is no-op'd; the next transition (or page reload)
gets another chance, and Ollama's own LRU catches up under memory
pressure regardless. Robust in-flight tracking deferred to a follow-up
if we see stale-state in the wild.

Base `rc`: v1.40.0 will inherit everything from rc.6 via the backmerge.
Frontend tests deferred to a follow-up PR; existing inertia tsconfig
errors are pre-existing and unrelated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
jakeaturner
9356443d73 refactor(KB): typed failure codes for embedSingleFile + accurate HTTP status
Return a discriminated `EmbedSingleFileResult` from `RagService.embedSingleFile`
with `code: 'not_found' | 'inflight' | 'delete_failed' | 'dispatch_failed'` on
failure. `RagController.embedFile` now maps those codes to the correct status
instead of collapsing every failure to 409:

- not_found       → 404
- inflight        → 409
- delete_failed   → 500
- dispatch_failed → 500

The `code` is also included in the JSON body so clients can branch without
string-matching `error`.
2026-05-20 10:16:00 -07:00
Chris Sherwood
d850cb9588 feat(KB): per-file ingest action + state indicator on Stored Files (RFC #883 §5)
Closes the Manual-mode UX dead-end: after toggling 'Auto-index new content
for AI?' to Manual, a freshly-downloaded ZIM (or any pending_decision file)
had no UI path to opt in for embedding short of the global Sync Storage /
Re-embed All bulk actions. Per RFC #883 §5, each Stored Files row now
carries a state pill and an adaptive single-button action.

State pill (left of any existing warning chips):
  - 'Indexed'    — green; row had chunks in Qdrant or state row is 'indexed'
  - 'Not Indexed' — neutral; state is pending_decision or browse_only
  - 'Failed'     — red
  - 'Stalled'    — amber
  - admin_docs collapsed row has no pill ('Managed by NOMAD' carries it)

Adaptive action button (paired with the existing Delete button per row):
  - pending_decision         → 'Index' (force=false)
  - browse_only              → 'Index' (force=true)
  - failed / stalled         → 'Retry' (force=true)
  - indexed + warning chip   → 'Re-embed' (force=true; confirm modal first)
  - indexed healthy / null   → no action button (bulk Re-embed All covers it)

Backend: GET /api/rag/files now returns
  { files: Array<{ source, state, chunksEmbedded }> }
instead of a flat string[]. State + chunk-count come from a single
KbIngestState query unioned into the existing Qdrant-derived source list
(no new round trips). New POST /api/rag/files/embed validates the source is
known, refuses if any inflight job already targets the same filePath
(prevents double-click duplicate-chunk hazard), pre-deletes Qdrant points
when force=true, then dispatches via the existing _dispatchEmbedJobsFor
helper used by reembedAll.

Per-file Re-embed (force=true on an already-indexed file) routes through a
StyledModal confirmation since it deletes existing vectors before queueing
a fresh job — same destructive-action weight as Delete's inline confirm but
heavier since it affects search until the rebuild finishes.

Folds in PR #907's blank-screen fix because my new render needs the same
generic restored: `<StyledTable<KbFileGroup>>` and `record.displayName`
(instead of the unresolved `sourceToDisplayName(record.source)` that ships
in rc.5 and ReferenceErrors on modal open). PR #907 also adds title
tooltips on the three bulk-action buttons; those tooltips are NOT included
here — let PR #907 land first or independently for that part.

Multi-select bulk-opt-in deferred per discussion: most Manual-mode users
ingest 1-2 files at a time, the existing global toggle covers the bulk
case, and checkboxes would expand scope past what rc.6 should hold. Will
file a follow-up issue for an 'Index N pending files' single-click button
once this lands.

Tests-in-PR scope was limited to keeping `kb_file_grouping.spec.ts` green
after the StoredFileInfo[] signature change (added asInfos() wrapper).
Dedicated unit tests for embedSingleFile (unknown source / inflight refused
/ force=true delete-then-dispatch) and the new state-pill rendering will
land in a follow-up PR alongside Playwright coverage of the row actions.

Verification path: NOMAD3 currently runs project-nomad-admin:integration-
rc6-preview (PRs #907 + #908 atop rc.5). After this branch is built into a
new integration tag, I'll re-run targeted Playwright UAT on the KB modal
covering: state pill rendering per state, Index click on pending_decision
opts in cleanly, Retry on failed re-dispatches successfully, Re-embed
confirmation modal copy + delete-then-dispatch on the military-medicine
partial-stall row, and Delete flow untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
Chris Sherwood
fd153b46b8 feat(KB): first-chat JIT prompt for ingest policy (RFC #883 Phase 3 task 12)
When a user opens AI Chat with content available but no global ingest
policy yet recorded, surface a one-time banner above the chat header
asking how they want new content handled:

  - 'Index existing content' -> sets rag.defaultIngestPolicy=Always and
    triggers a sync so pending_decision files queue immediately
  - 'Maybe later' -> sets policy=Manual; existing and future content
    waits in pending_decision until the user opts in from the KB modal

After either button is clicked the banner never reappears, because both
write the policy KV (the same one #894 manages via the KB modal toggle).
There is intentionally no 'dismiss without deciding' X — that would just
re-show the banner forever.

Backend
- New GET /api/rag/policy-prompt-state returns
  {shouldPrompt, hasContent, totalFiles}
- RagService.getPolicyPromptState() reads KVStore('rag.defaultIngestPolicy')
  and counts kb_ingest_state rows; shouldPrompt is true only when policy
  is null AND scanner has seen >=1 file (avoids prompting on empty NOMADs)

Frontend
- New KbPolicyPromptBanner component (~120 LOC) handles the two-button
  decision flow with optimistic loading state, success/error toasts, and
  invalidates kbPolicyPromptState + ingestPolicy + embed-jobs + storedFiles
  on success
- Mounted in components/chat/index.tsx as the first child of the main
  content column so it sits above the chat title bar without taking space
  when shouldPrompt is false (renders nothing)
- Reads aiAssistantName from Inertia page props so banner copy matches
  the user's chosen assistant name

Stacks on feat/kb-policy-toggle (#894) because the policy KV mechanism
it writes through is introduced there. Both can land in rc.5; this PR
auto-rebases to rc once #894 merges.

Existing users on first upgrade to v1.32.0 will see this banner on first
chat visit post-upgrade — an explicit opt-in moment for content that was
already on disk. New users see it the first time they have curated
content downloaded.
2026-05-20 10:16:00 -07:00
chriscrosstalk
8ed0bdfd8f fix(KB): union Stored Files list with state-machine file paths (#898)
Closes the 'zero_chunks warning has no row to attach to' gap surfaced by
the 2026-05-14 integration UAT. Before this fix RagService.getStoredFiles
returned only file paths that appeared in Qdrant's payload.source — so
files with 0 embedded chunks (video-only ZIMs, browse_only opt-outs,
ingestions that failed before producing any chunks) silently disappeared
from the KB panel's Stored Files table.

The fix unions the Qdrant scroll result with the disk-backed file paths
recorded in kb_ingest_state. Effect:
- lrnselfreliance_en_all_2025-12.zim (3.97 GB video-only ZIM, 0 chunks)
  now appears in the table, picks up its zero_chunks warning chip
- Files in pending_decision under Manual policy show up so the user can
  see what's waiting for opt-in
- Files in browse_only / failed states have a row for future per-card
  Retry / Re-index actions (forthcoming, blocked on #886)

The state-machine query is wrapped in its own try/catch so a transient
DB error degrades to the Qdrant-only list rather than 500-ing the whole
panel — same defensive posture as the outer try/catch.

Stacks on feat/kb-ingest-state-machine (#888) because the union depends
on the kb_ingest_state table that PR introduces. Will rebase to rc once
#888 merges. Completes the second half of #895's warning surface; the
first half (partial_stall) already worked because those files have at
least some chunks in Qdrant.
2026-05-20 10:16:00 -07:00
Jake Turner
a0047c1555 fix(KB): surface file-warning compute failures instead of masking as healthy (PR #895 review)
`computeFileWarnings()` previously caught all errors and returned an empty
map, which the frontend rendered as "every file is healthy" — reintroducing
exactly the silent-failure mode this surface exists to expose.

Return `{ ok, warnings }`; flip `ok: false` from the catch. KB modal renders
an inline amber notice under the Stored Files header when `ok === false`,
leaving per-row warning rendering untouched. Transient failures self-heal on
the next 30s poll; no toast spam.
2026-05-20 10:16:00 -07:00
Jake Turner
102998ec96 refactor(KB): move FileWarning to shared types/rag following existing convention 2026-05-20 10:16:00 -07:00
Chris Sherwood
563f86a22b feat(KB): conditional warnings A + B on Stored Files (RFC #883 §6)
Surfaces two silent failure modes that the prior binary
"any-chunks-in-Qdrant ⇒ embedded" check could not distinguish from
healthy ingestion:

- **Warning A — Zero-chunk file** (file_size > 100 MB, chunks = 0)
  Fires on video-only / image-only ZIMs (`lrnselfreliance_en_all`,
  TED talks, etc.) that the pipeline completes "successfully" with no
  extractable text. AI Assistant literally cannot reference these.

- **Warning B — Partial-embed stall** (chunks < 50% of expected from
  the ratio registry). Surfaces the simple_wiki "266 of 600,000 chunks"
  case observed during NOMAD1 ingestion testing — previously these
  looked identical to fully-completed embeds in the UI.

Both warnings render only when their condition is met (silent by
default; noisy only on real problems).

Base is `feat/kb-ratio-registry` (#891) because Warning B's "expected
chunks" estimate comes from `KbRatioRegistry.estimateChunks()`. GitHub
fast-forwards to `rc` once #891 merges.

- `app/utils/kb_warning_decision.ts` — pure `decideWarnings(inputs)`
  with thresholds (`100 MB`, `0.5×`) as exported constants. 10 unit
  tests cover the healthy case, both warnings, the under/at/over
  boundary, the registry-miss suppression, and the video-only registry
  case (`expectedChunks: 0` correctly skips Warning B).
- `RagService.computeFileWarnings()` — single Qdrant scroll tallies
  chunks per source, filesystem walk fills in zero-chunk files,
  ratio registry estimates the expectation, decision function emits.
- New endpoint `GET /api/rag/file-warnings` returns
  `Record<source, FileWarning[]>` (sources with no warnings are
  omitted, so the frontend can `warnings[source] ?? []` for clean
  defaults).
- KB modal: warnings render inline under the file name as amber-tinted
  pills. Polled every 30s alongside the existing health check.

- Warning C — chunks skipped due to length. PR #890 (#881 fix) prevents
  the silent drop at the embed boundary, so the underlying condition
  shouldn't fire anymore. If we still want to surface "we truncated
  N chunks to fit", that needs separate `skipped_count` tracking in
  EmbedFileJob — a Phase 2 follow-up.
- Suppressing Warning B during active mid-ingestion. The user can cross-
  reference the Processing Queue to know it's in-flight; suppressing
  warnings while a job runs would mask real stalls where the job died
  mid-batch. Will revisit when per-card status is wired through.
- Use of `kb_ingest_state.chunks_embedded` (#888) as the chunk count
  source. This PR uses Qdrant scroll directly so it can land
  independently of #888.

- 10 new unit tests on `decideWarnings`, all pass
- Type-check clean
- Hot-patch + browser smoke test deferred until #891 lands (the ratio
  registry needs to exist in the DB for `estimateChunks()` to return
  non-null estimates — without it, only Warning A fires which is still
  useful but Warning B stays dormant)
2026-05-20 10:16:00 -07:00
Chris Sherwood
e68c753e39 feat(KB): surface embedding-disk estimate in curated tier-change modal (RFC #883 §1)
When a user picks a tier in TierSelectionModal, show how much additional
disk space the AI Assistant will need if the new ZIMs are indexed, plus
a policy-aware footer explaining whether they'll auto-index (Always) or
wait for opt-in (Manual). Estimates consume #891's KbRatioRegistry via a
new POST /api/rag/estimate-batch endpoint.

Backend
- New POST /api/rag/estimate-batch route + RagController.estimateBatch
- VineJS schema accepting array of {filename, sizeBytes}, capped at 500
- KbRatioRegistry.estimateBatch aggregates via the existing prefix-match
  lookup, returns {totalChunks, totalBytes, hasUnknown}
- New BYTES_PER_CHUNK_ON_DISK constant (~8 KB: 3 KB vector + ~3 KB chunk
  text + ~2 KB payload/index overhead). Tunable; will be replaced by
  Phase 4 self-calibration once we have real measurements.
- Controller normalizes incoming filenames via path.basename so callers
  that send full paths or URLs still match registry prefixes correctly.

Frontend
- api.estimateEmbeddingBatch() client method
- TierSelectionModal: when localSelectedSlug is set, resolve the tier's
  resources (incl. inherited tiers), POST to /estimate-batch, and render
  a new info block with the +~X GB figure + ingest-policy copy. Also
  fetches rag.defaultIngestPolicy so the same block surfaces whether
  indexing will fire automatically or wait for the user.
- resourceFilename() helper extracts the basename from the resource URL
  so the registry lookup hits the right prefix regardless of mirror.

Tests
- 4 new cases in tests/unit/kb_ratio_lookup.spec.ts covering the
  estimateBatch aggregator: standard sum, unknown-flagging, video-only
  ZIM (0 chunks but known, hasUnknown stays false), empty input.

Stacks on feat/kb-ratio-registry (#891) — consumes the registry table
seeded by that PR. Once #891 merges to rc, this PR auto-rebases.

Out of scope for this PR (deferred to follow-ups):
- Per-batch opt-in checkbox (RFC §1's '☑ Also index these for AI') needs
  a per-batch policy override path and is a separate PR
- Guardrail modal at 50 GB / 10% free / 6 hr thresholds (RFC §7) is also
  separate; this PR is informational, not gating
- Time-to-embed estimate awaits a chunks-per-second metric per host
2026-05-20 10:16:00 -07:00
chriscrosstalk
8eb8809154 feat(KB): Always/Manual ingest policy toggle (RFC #883 §1/§4) (#894)
* feat(KB): per-file ingest state machine (Phase 1 of RFC #883)

Adds a persistent state machine for AI knowledge-base ingestion so the
scanner can distinguish "fully indexed", "user opted out", "failed", and
"stalled" from each other — none of which were derivable from the prior
binary "any chunks in Qdrant ⇒ embedded" check.

## What lands

- New table `kb_ingest_state` keyed by `file_path` with enum state column
  (`pending_decision | indexed | browse_only | failed | stalled`).
  Independent of `installed_resources` so it covers both curated downloads
  and manually-uploaded KB files.
- New KV key `rag.defaultIngestPolicy` (string: `Always | Manual`).
  Registered now but not consumed yet — JIT prompt + wizard step land in
  Phase 3 of the RFC.
- `EmbedFileJob.handle` writes state on terminal outcomes:
  - Success (final batch) → `indexed` + chunks count
  - `UnrecoverableError` → `failed` + error message
  - Retryable errors are left to BullMQ's existing retry path
- `scanAndSyncStorage` swaps the binary qdrant check for a state-aware
  decision tree (see `decideScanAction`). Existing installs auto-backfill
  on first scan: files with chunks in Qdrant but no state row become
  `indexed`; new files start as `pending_decision`.
- `deleteFileBySource` drops the state row last, so removed files
  disappear entirely instead of leaving an orphan that the next scan
  would re-dispatch into nothing.

## What does NOT land here

- Ratio registry (separate PR) — needed for partial-stall detection and
  cost estimates, but a separable concern.
- #880 follow-up initial-progress anchor (separate tiny PR).
- Phase 2 UI (status pill, per-card actions, conditional warnings).
- Phase 3 policy surfaces (wizard step, JIT prompt, guardrail modal).
- PR #886's bulk-action hookup — `_deletePointsBySource` / Re-embed All
  / Reset & Rebuild would also want to set state, but #886 isn't merged
  yet; that wiring goes in a follow-up once #886 lands.

## Target

This is forward work for v1.40.0 (RFC #883). Branching off `rc` because
that's the current latest base and post-GA Jake will sync rc→dev; a
retarget at PR-open time is a fast-forward if requested.

## Tests

- 9 new unit tests for `decideScanAction` covering all five states plus
  the no-row / chunks-present / chunks-missing combinations
- Type-check clean
- Smoke-tested end-to-end on NOMAD3 via hot-patch:
  - Backfill: 5 ZIMs + 2 KB uploads with existing chunks in Qdrant all
    came back `indexed` on first scan
  - Pending dispatch: a video-only ZIM with no chunks (`lrnselfreliance`)
    came back `pending_decision` and was correctly re-dispatched (Bull
    deduped to its historical `:completed` jobId — bgauger's #886 fix
    drains that)
  - Delete hook: deleting a KB upload via `DELETE /api/rag/files`
    removed both the disk file and the state row

* feat(KB): Always/Manual ingest policy toggle (RFC #883 §1/§4)

Activates the `rag.defaultIngestPolicy` KV registered in Phase 1
(#888) so users on a fresh install (or anyone who picks Manual mode)
no longer get every new ZIM auto-dispatched to the embed pipeline.

## Stacks on #888

This PR's base is `feat/kb-ingest-state-machine` (#888). The state
machine has to be in place for the decision function to be policy-aware;
GitHub will fast-forward the base to `rc` once #888 merges.

## Backend changes

- `decideScanAction` now takes a `policy: 'Always' | 'Manual'` argument
  (defaults to `Always` for backward compatibility).
- New `ScanAction` kind: `create_pending`. Manual mode records that the
  scanner has seen a new file (so the UI can surface a per-card Index
  affordance later) without dispatching an EmbedFileJob.
- `scanAndSyncStorage` reads the KV and passes it through. The scan-result
  log line now includes the active policy and a `waiting on user` count
  for Manual-mode hits.
- `rag.defaultIngestPolicy` added to `SETTINGS_KEYS` so it's reachable
  through the existing `GET/PATCH /api/system/settings` surface — no new
  endpoint.

## Frontend changes

- New section in the KB panel between "Why upload" and "Processing Queue":
  "Auto-index new content for AI? [Always | Manual]" — segmented radio
  with copy explaining the 5-10× disk multiplier. Default Always.
- `useQuery('ingestPolicy')` reads the current value; clicking the
  inactive option mutates and shows a notification confirming the new
  behavior.

## Tests

- 14 unit tests on `decideScanAction` (was 9) — split into Always-mode
  cases (preserves Phase 1's contract) and Manual-mode cases
  (`create_pending`, `pending_decision → skip`, etc.).
- Type-check clean.
- Hot-patch + browser verification deferred until #888 lands; the state
  machine smoke-tested cleanly on NOMAD3 in #888's PR, and this PR's
  decision-tree changes are exhaustively unit-tested.

## RFC open question §3 — policy-change re-trigger

Switching Manual → Always doesn't auto-dispatch existing `pending_decision`
rows immediately. The next scan re-evaluates and dispatches them under
the new policy. This matches the RFC's "treat the switch as I've-
thought-about-it" instinct for the guardrail; full guardrail
implementation lands in Phase 3 task 14.

---------

Co-authored-by: Jake Turner <52841588+jakeaturner@users.noreply.github.com>
2026-05-20 10:16:00 -07:00
Chris Sherwood
43ca584b6c feat(KB): status pill + last-activity timestamp on Processing Queue (RFC #883 §5/§10)
Each in-flight (or stuck) embedding job gets a colored health pill,
relative-activity timestamp, and chunk counter so users can tell at a
glance whether ingestion is making progress.

## Health states

- **🟢 Active** — last batch < 2 min ago
- **🟡 Slow** — last batch 2-5 min ago (CPU-paced multi-batch ingestion
  lives here naturally; not always a problem)
- **🔴 Stalled** — last batch > 5 min ago (likely real problem)
- ** Waiting** — queued, no batch started yet
- **🔴 Failed** — job recorded failed status

## What lands

- New backend util `kb_job_health.ts` with pure `computeJobHealth(input)`
  decision function. Time-based thresholds (2 min / 5 min) inlined as
  constants. 9 unit tests pin the boundaries.
- `EmbedJobWithProgress` gains `lastBatchAt`, `startedAt`, `chunks` —
  already set by `EmbedFileJob.handle` on every batch transition, just
  not previously surfaced through `listActiveJobs`.
- Frontend `kb_job_health_display.ts` maps each status to a Tailwind
  dot color, label, and aria-label so backend and UI stay in sync.
- `ActiveEmbedJobs.tsx` renders the pill, "last activity Xs ago", and
  chunk counter above each progress bar. Adds a manual Refresh button
  and "Last updated Xs ago" line — the existing 2s/30s auto-poll
  cadence in `useEmbedJobs` is left intact.
- Live tick at 5s keeps the relative timestamps current without
  re-fetching from the API.

## Not in scope

- Per-card Cancel / Retry / Un-index — separate Phase 2 PR
- Conditional warnings A/B/C — separate Phase 2 PR
- Computing throughput rate (chunks/min) — needs ratio registry consumer
  (Phase 2 follow-up); for now the pill answers the "is it stuck?"
  question directly without a rate estimate.
2026-05-20 10:16:00 -07:00
Chris Sherwood
159d57b2af feat(KB): ratio registry for disk + time estimates (Phase 1B of RFC #883)
Foundation for the cost estimates and partial-stall detection that
Phase 2 will surface. No consumers yet — this PR just lays the table,
the seed rows, and the lookup helper so subsequent UI work has
estimates available without a per-ZIM benchmark.

## What lands

- New table `kb_ratio_registry` (pattern, chunks_per_mb, sample_count,
  notes). Migration creates and seeds heuristic defaults from the RFC
  appendix: devdocs (1100/MB), Wikipedia variants (270/MB), iFixit
  (50/MB), Stack Exchange Q&A (200/MB), video-only ZIMs (0), plus a
  catch-all fallback at 100/MB.
- `KbRatioRegistry` model with static `lookup()` and `estimateChunks()`.
- Pure helper `kb_ratio_lookup.ts` doing longest-prefix-match — a
  specific entry (`wikipedia_en_simple_`) overrides a broader one
  (`wikipedia_en_`). 9 unit tests covering the lookup boundary.
- `sample_count` starts at 0 (heuristic seed) and is reserved for
  Phase 4 self-calibration to increment as observed ZIMs update each row.

## Not in scope

- Self-calibration on successful ingestion (Phase 4)
- UI consumers — Warning B (partial-embed stall) and the storage budget
  meter / time estimates land in Phase 2.

## Tested

- Type-check clean
- 9 unit tests pass for `findChunksPerMb` and `estimateChunkCount`
- Migration applied on NOMAD3 via hot-patch; 9 seed rows verified in DB
2026-05-20 10:16:00 -07:00
Jake Turner
e3b758f11e fix(AI): add truncation DEBUG log 2026-05-20 10:16:00 -07:00
Chris Sherwood
2dec5bf676 fix(AI): pre-cap embed input + log fallback reason (#881)
The OpenAI-compatible /v1/embeddings fallback path can't pass
`truncate:true` / `num_ctx:8192` to the model, so any chunk that
exceeds the model's loaded context_length (often 2048 for
nomic-embed-text:v1.5) returns a 400 BadRequestError and is silently
dropped from Qdrant. Two CPU-only ingestion runs on NOMAD1 hit this
on dense technical content (medlineplus, arduino.stackexchange) even
after PR #763's num_ctx fix on the native path.

Pre-cap each input string at 4000 chars before either backend call.
That's ~1000-2000 tokens depending on density, comfortably under the
model's 2048 default. The chunker in RagService is sized for
MAX_SAFE_TOKENS=1600 (3200 chars at its conservative 2 chars/token
estimate), so well-formed inputs are never touched; this is purely a
runtime safety net for the edge cases that slip through.

Also stop swallowing the original error in the catch. The bare
`} catch {}` here has masked recurring "input length exceeds context
length" failures for months (#369, #670, #881). Capture and warn-log
the message so future investigations see why we fell back.

Same root cause as #369 and #670 which were closed without an actual
fix to the fallback path.
2026-05-20 10:16:00 -07:00
chriscrosstalk
f304d80f80 fix(RAG): anchor continuation-batch initial progress to overall-file frame (#889)
Each continuation batch of a multi-batch ZIM embed runs as a fresh
BullMQ job, so handle() ran the hardcoded `safeUpdateProgress(job, 5)`
even when the file was already 100k articles into a 600k-article ZIM.
The UI gauge briefly dropped to 5% before the per-batch onProgress
callback caught up to the true overall percentage, reading as a
backward jump every time a new batch started.

Compute initialPercent from batchOffset / totalArticles when available,
falling back to 5 for single-batch files (uploaded PDFs, txts) where
totalArticles isn't set. Capped at 99 to leave headroom for the 100%
final-batch marker.

Follow-up to PR #880 (which fixed the 0-100% scaling during a batch
but still had the initial-frame regression).
2026-05-20 10:16:00 -07:00
chriscrosstalk
743549ca74 feat(KB): per-file ingest state machine (Phase 1 of RFC #883) (#888)
Adds a persistent state machine for AI knowledge-base ingestion so the
scanner can distinguish "fully indexed", "user opted out", "failed", and
"stalled" from each other — none of which were derivable from the prior
binary "any chunks in Qdrant ⇒ embedded" check.

## What lands

- New table `kb_ingest_state` keyed by `file_path` with enum state column
  (`pending_decision | indexed | browse_only | failed | stalled`).
  Independent of `installed_resources` so it covers both curated downloads
  and manually-uploaded KB files.
- New KV key `rag.defaultIngestPolicy` (string: `Always | Manual`).
  Registered now but not consumed yet — JIT prompt + wizard step land in
  Phase 3 of the RFC.
- `EmbedFileJob.handle` writes state on terminal outcomes:
  - Success (final batch) → `indexed` + chunks count
  - `UnrecoverableError` → `failed` + error message
  - Retryable errors are left to BullMQ's existing retry path
- `scanAndSyncStorage` swaps the binary qdrant check for a state-aware
  decision tree (see `decideScanAction`). Existing installs auto-backfill
  on first scan: files with chunks in Qdrant but no state row become
  `indexed`; new files start as `pending_decision`.
- `deleteFileBySource` drops the state row last, so removed files
  disappear entirely instead of leaving an orphan that the next scan
  would re-dispatch into nothing.

## What does NOT land here

- Ratio registry (separate PR) — needed for partial-stall detection and
  cost estimates, but a separable concern.
- #880 follow-up initial-progress anchor (separate tiny PR).
- Phase 2 UI (status pill, per-card actions, conditional warnings).
- Phase 3 policy surfaces (wizard step, JIT prompt, guardrail modal).
- PR #886's bulk-action hookup — `_deletePointsBySource` / Re-embed All
  / Reset & Rebuild would also want to set state, but #886 isn't merged
  yet; that wiring goes in a follow-up once #886 lands.

## Target

This is forward work for v1.40.0 (RFC #883). Branching off `rc` because
that's the current latest base and post-GA Jake will sync rc→dev; a
retarget at PR-open time is a fast-forward if requested.

## Tests

- 9 new unit tests for `decideScanAction` covering all five states plus
  the no-row / chunks-present / chunks-missing combinations
- Type-check clean
- Smoke-tested end-to-end on NOMAD3 via hot-patch:
  - Backfill: 5 ZIMs + 2 KB uploads with existing chunks in Qdrant all
    came back `indexed` on first scan
  - Pending dispatch: a video-only ZIM with no chunks (`lrnselfreliance`)
    came back `pending_decision` and was correctly re-dispatched (Bull
    deduped to its historical `:completed` jobId — bgauger's #886 fix
    drains that)
  - Delete hook: deleting a KB upload via `DELETE /api/rag/files`
    removed both the disk file and the state row

Co-authored-by: Jake Turner <52841588+jakeaturner@users.noreply.github.com>
2026-05-20 10:16:00 -07:00
Chris Sherwood
5e2c599c3e fix(ZIM): preserve co-existing Wikipedia corpora on cleanup (#884)
onWikipediaDownloadComplete was deleting every file whose name starts
with `wikipedia_en_`, treating distinct corpora (simple, medicine,
wikivoyage, climate_change, etc.) as competing versions of the same
selection slot. Whichever wiki finished second silently wiped the
other from disk.

Match by filename stem instead — strip the trailing `_YYYY-MM(-DD).zim`
date suffix and only delete files with the same stem as the new
download. Different release dates of the same variant still get cleaned
up; distinct variants are preserved.

Extracted the predicate to `app/utils/zim_filename.ts` so the boundary
is covered by unit tests (8 cases incl. the #884 repro scenario).
2026-05-20 10:16:00 -07:00
Jake Turner
4c211964e0 fix(KB): add re-embed and reset & rebuild opts to fix broken embeddings (#886) 2026-05-20 10:16:00 -07:00
Chris Sherwood
d28eb9be59 fix(RAG): report ZIM ingestion progress in overall-file frame
Before this change, the Active Downloads / Processing Queue UI showed the
ingestion progress gauge jumping wildly during multi-batch ZIM ingestion
(e.g. 5% → 88% → 27% → 5% → 56% → 36% over ~60 seconds for cooking SE).

Each continuation batch is a separate BullMQ job, and `EmbedFileJob.handle()`
reported `job.progress` in two different reference frames depending on
where it was in the batch lifecycle:

  - During-batch (via the onProgress callback): 5% → 95% scaled across
    "% through this batch's chunks"
  - End-of-batch (just before dispatching the next): overwritten to
    `(nextOffset / totalArticles) * 100` — % through the whole file
  - Next continuation batch starts with progress = 5% explicitly, then
    climbs through the per-batch range again

`listActiveJobs()` returns the latest active BullMQ job's progress. With
GPU-accelerated ingestion completing a batch every ~4 seconds, the UI
saw the jobId rotate constantly and the gauge whipsaw between the two
reference frames.

`totalArticles` was already wired through the EmbedFileJob params shape
and used end-of-batch — but RagService never actually populated it,
so any frame-scaling that depended on it silently fell back to the
per-batch range. Two fixes together:

1. `ZIMExtractionService.extractZIMContent()` now returns
   `{ chunks: ZIMContentChunk[]; totalArticles: number }` instead of a
   raw chunks array, surfacing `archive.articleCount` to the caller.
   Single caller (rag_service) updated to destructure.

2. `RagService.processZimFile()` includes `totalArticles` in its result
   so `EmbedFileJob.dispatch()` can propagate it to the continuation
   batch (which the existing code already does via
   `totalArticles: totalArticles || result.totalArticles`).

3. `EmbedFileJob`'s onProgress callback scales the service-reported
   per-batch percent into the overall-file frame when `totalArticles`
   is known: `((batchOffset + (percent/100) * ZIM_BATCH_SIZE) /
   totalArticles) * 100`. Capped at 99% to leave room for the explicit
   100% set at file completion. Falls back to the original 5-95% range
   for single-batch files (uploaded PDFs/txts) where totalArticles is
   undefined — the gauge then represents % through the only batch,
   which is what the UI expects for one-shot files.

Validated on NOMAD8 (RX 6800, ROCm-accelerated nomic):

  - devdocs python (small, ~1500 articles): batch progressions seen
    monotonically across continuation jobIds:
    1501@30% → 1510@33% → 1514@43% → 1518@52%.
  - ifixit (huge, ~100k articles): stays near 3% for the first many
    batches at offset 0..3000 — correct, the file is enormous.
  - wikipedia_en_medicine (large, ~70k articles): stays near 0-1% for
    the first batches — also correct.
  - Brief 0-5% blip on continuation handoff (the explicit
    `safeUpdateProgress(job, 5)` at batch start, before the first
    onProgress callback fires) — visible but quickly resolves to the
    overall-frame value. No more 5% ↔ 88% chaos.
2026-05-20 10:16:00 -07:00
Chris Sherwood
662a6c45fb fix(System): validate StartedAt with fallback to tail:500 (PR review)
Jake noted that `inspect.State.StartedAt` could be missing/malformed,
which would land NaN inside `container.logs({ since, until })`. Add
defensive validation that the parsed timestamp is finite and positive
before using it, with a fallback to the previous tail:500 strategy
(plus a warn log) when it isn't. Happy path is unchanged.
2026-05-20 10:16:00 -07:00
Chris Sherwood
d2f2172b3c fix(System): correct AMD VRAM in Graphics card + harden log probe
Two related fixes to make the System Information page reliably show real
GPU info instead of misleading lspci BAR0 readings or N/A.

1. Generalize bogus-VRAM detection to AMD.

Same root cause as #835 (NVIDIA showing 32 MB), this time for AMD: lspci
parses the first PCI memory Region (BAR0, typically 1-16 MiB on Navi
cards) as `vram`. On NOMAD8 (Threadripper 3960X + Radeon RX 6800), the
System Information page showed "1 MB" instead of "16 GB". PR #850 fixed
this for NVIDIA by clearing the bogus value and re-running the Ollama
log probe; the check was vendor-gated to NVIDIA only.

`isBogusNvidiaVram` becomes `isBogusDgpuVram` with a `isDiscreteGpuVendor`
helper matching /nvidia|advanced micro devices|amd|ati/i. Same 256-MiB
threshold — no real discrete GPU has less than that, while Intel iGPUs
(which legitimately report small shared-memory VRAM via lspci) are left
untouched. The probe gate condition is similarly renamed.

2. Read Ollama logs from the startup window, not tail:N.

`getOllamaInferenceComputeFromLogs()` was reading the last 500 log lines
and grepping for the "inference compute" line. That line is written once
during Ollama's GPU discovery phase within seconds of startup. Under
active embedding workloads we measured >1000 log lines/min, which pushes
the line past any reasonable tail within minutes — at which point the
probe returns null and the UI flips to "GPU Not Accessible" even though
Ollama is happily using the GPU (size_vram > 0 in /api/ps).

Switch from `tail: 500` to `since: containerStartedAt, until:
containerStartedAt + 300s`. The 5-minute window is bounded regardless of
container uptime and always captures Ollama's GPU discovery output. The
inference-compute line is emitted in the first few seconds of startup, so
5 min is generous headroom.

Validated on NOMAD8 (RX 6800, container uptime ~10 min with sustained
ingestion that generated 6,345 log lines):

Before:
  controllers[0]: { model: "Navi 21 ...", vram: 1 }

After (bogus AMD VRAM cleared, log probe stale due to tail:500 churn):
  controllers[0]: { model: "Navi 21 ...", vram: null }
  gpuHealth: { status: "passthrough_failed" }
  -> UI shows "N/A" and the banner from PR #208

After (bogus cleared + log probe reads startup window):
  controllers[0]: { model: "AMD Radeon RX 6800", vram: 16384 }
  gpuHealth: { status: "ok", hasRocmRuntime: true, ollamaGpuAccessible: true }
  -> UI shows "16 GB", no banner

Both branches of the fix exercise correctly: NVIDIA path unchanged
(same code, just renamed identifiers), AMD path now triggers the probe
and the probe reliably finds the GPU info regardless of container age.
2026-05-20 10:16:00 -07:00
Jake Turner
501860a23b fix(DockerService): improve volume logic and documentation in forceReinstall 2026-05-20 10:16:00 -07:00
Chris Sherwood
a22c6408e6 fix(RAG): pace continuation batches when embedding is CPU-only
Stacks on top of the multi-batch ZIM ingestion fix. After that fix,
multi-batch ZIM ingestion completes correctly — but on installs where
Ollama runs the embedding model on CPU (currently every AMD ROCm
install, since Ollama's ROCm build doesn't accelerate nomic-bert),
the now-correct sustained 100% CPU saturation across all cores can
starve other services hard enough to take the box down. Confirmed
on a Threadripper 3960X + RX 6800 NOMAD: a wikipedia-class ZIM
ingestion pegged 48 threads cleanly enough that sshd lost
banner-exchange responsiveness and the box ultimately required a
power-cycle.

NVIDIA installs aren't affected — nomic-embed-text:v1.5 runs at
100% GPU on RTX 5060 (verified via `ollama ps`).

Detect placement at runtime, pace only when needed:

1. OllamaService.isEmbeddingGpuAccelerated() — queries /api/ps and
   returns true if any loaded embedding model reports size_vram > 0.
   Fails closed (returns false) if /api/ps is unreachable or no embed
   model is loaded yet — over-pacing is safer than crashing.

2. EmbedFileJob.handle() — between batches (hasMoreBatches: true
   branch), check placement and `await setTimeout(CPU_BATCH_DELAY_MS)`
   when CPU-only. CPU_BATCH_DELAY_MS = 1000 (1s) — enough to give the
   OS scheduler a window for sshd/disk-collector/etc., small enough
   that total ingestion time isn't meaningfully affected (each batch
   is ~60-90s of work).

GPU-accelerated installs see zero behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
Chris Sherwood
ba53702349 fix(queue): singleton QueueService to stop ioredis connection leak
Every static call site instantiated a fresh QueueService (24 call sites
across 8 files). QueueService.getQueue() opens a BullMQ Queue per call
when not cached, and each Queue opens two ioredis connections (one for
commands, one blocking). Because every static call constructed a new
QueueService, its internal `queues` cache was never shared, every call
opened a fresh pair, and none were ever closed.

In normal operation this leaked a few connections per API hit. During
multi-batch ZIM ingestion after PR #872 (where EmbedFileJob.handle()
dispatches the next batch every 50 articles), every batch completion
opened two new connections. On NOMAD3 at ~one batch every 4s sustained,
that's ~1800 leaked connections/hour. Redis hit its 10,000-maxclient
ceiling in ~5 hours and the admin container fell into an EPIPE flood
that required a restart to recover.

Fix: collapse QueueService to a true process-wide singleton with a
private constructor and getInstance() accessor. The existing per-queue
Map is now shared across every dispatch / status / cleanup call, so each
queue's underlying connections are opened exactly once for the lifetime
of the process. close() now clears the map so the singleton can be torn
down cleanly if a graceful-shutdown hook is ever wired up.

Validated on NOMAD3 (RTX 5060, v1.32.0-rc.4 + this patch hot-applied):
under sustained multi-batch wikipedia_en_simple_all_nopic ingestion,
connected_clients held flat at 21-22 across a 5-minute window. Pre-fix
the same scenario climbed to 10,000+ over hours.
2026-05-20 10:16:00 -07:00
Chris Sherwood
74cef75153 fix(RAG): unbreak multi-batch ZIM ingestion (jobId dedupe)
EmbedFileJob.dispatch() uses a deterministic per-file jobId
(sha256(filePath).slice(0,16)) for every batch. The parent batch's
handle() calls EmbedFileJob.dispatch({ batchOffset }) before returning,
so the parent is still in `active` state and locked when the
continuation tries to enqueue. BullMQ silently returns the locked
parent instead of creating a new job — and in newer BullMQ versions
it does so without throwing, so the existing
`catch (error.message.includes('job already exists'))` branch never
fires. After the parent completes, its entry stays in the `completed`
ZSET (held by `removeOnComplete: { count: 50 }`), continuing to trip
jobId dedupe for any subsequent re-dispatch attempts.

Result: every NOMAD install since 2026-02-08 (feat: zim content
embedding) with a multi-batch ZIM (wikipedia, cooking SE, ifixit,
lrnselfreliance, etc.) has only the first 50 articles indexed in
qdrant. The RAG feature has been silently degraded for ~3 months —
the user sees the file appear in their KB, qdrant accumulates ~50
articles' worth of vectors, and pagination quietly halts. No error
surfaces anywhere.

Fix: dispatch() skips the deterministic jobId for continuation batches
(batchOffset > 0), letting BullMQ auto-generate a unique one so each
batch stacks as an independent queue entry. Initial dispatches keep
the deterministic jobId so re-triggering an install (UI re-click,
sync rescan) remains idempotent. The existing 'job already exists'
branch is now gated on !isContinuation, since by construction
continuation batches will never hit dedupe.

Validated on NOMAD8 (RX 6800 / Threadripper 3960X, rc.3 + this patch):
devdocs_en_python (~1,500 chunks across multiple batches) correctly
paginates end-to-end. admin.log shows the expected sequence of
"Dispatched embedding job for file: X (continuation @ offset N)"
followed by "Starting embedding process for: X (batch offset: N)"
for each batch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
Chris Sherwood
43645e4bbc fix(AI): rewrite RAG query on first follow-up (off-by-one in skip-rewrite threshold)
The short-conversation skip in `rewriteQueryWithContext` used `userMessages.length <= 2`,
which short-circuits both the very first turn AND the first follow-up. The follow-up is
the moment the rewriter matters most — it's where pronouns and shorthand ("the bars",
"how long does it last?") need to be resolved against earlier turns before the embedding
search runs. With the rewriter skipped, RAG queries against the raw last message, scores
nothing above the 0.3 threshold, and no context gets injected for that turn.

The visible symptom is the assistant treating the first follow-up in any chat as a
brand-new question — e.g. "great - they threw up 2 of the bars it looks like" answered
as if it were a recipe-bars question, with no carry-forward of the prior chocolate-
poisoning context.

Threshold lowered to `< 2`: skip only when there's exactly one user message (nothing to
rewrite from). From the first follow-up onward the rewriter runs, as originally intended
before commit 96e5027.

Validated against `mistral-nemo:12b` on NOMAD3 by hot-patching the compiled controller
and replaying the dog-chocolate scenario. Post-patch response correctly threads "3
Hershey's bars" from turn 1 into turn 2's answer; pre-patch (per reporter's screenshot)
pivoted to peanut butter bar recipes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
Ben Gauger
3abf338767 fix(Downloads): treat missing Content-Type as octet-stream (#848)
download.kiwix.org (and some of its mirrors) don't always set a
Content-Type header on .zim responses. The MIME validator was reading
`headers['content-type'] || ''`, then running each allowlist entry
through `''.includes(...)` which is always false, so every download
from those hosts was rejected with `MIME type  is not allowed`.

RFC 7231 §3.1.1.5 says missing Content-Type may be treated as
application/octet-stream by the recipient, and that's already in every
binary-content allowlist we use (ZIM, PMTILES, base assets). Default
the missing case to that and the validator does the right thing.

Strict callers that don't list octet-stream still reject as before.
2026-05-20 10:16:00 -07:00
Chris Sherwood
019a5a4927 fix(AI): preserve semver tag in DB on AMD Ollama updates
Closes #855.

PR #804's AMD branch in `updateContainer()` overrode `newImage` to
`ollama/ollama:rocm` and then persisted that literal string to
`service.container_image` (line 1273). Two downstream consequences for
every AMD user who clicked Update on AI Assistant:

1. Apps page (`apps.tsx`) extracts the displayed version from
   `container_image` and rendered the literal string "rocm".
2. `ContainerRegistryService.getAvailableUpdates()` parsed `currentTag =
   "rocm"`, which isn't semver, so `parseMajorVersion` returned NaN, the
   filter didn't reject newer tags by major-version, and `isNewerVersion`
   treated any future tag as newer. Result: the same update reappeared
   on every check, forever.

Fix: separate "what we run" from "what we persist". `runtimeImage`
holds the tag passed to `docker.pull()` and `createContainer()` (still
`:rocm` for AMD), while `newImage` keeps the semver tag and is the
value written to the DB. Surgical: 3 references renamed plus 1
declaration added.

The install path (`_createContainer`) already had the right shape
(runtime-only override, no DB write of the override), so this PR only
touches `updateContainer`.

Test plan:
- `npm run typecheck` passes locally.
- Manual repro on NOMAD2 (AMD HX 370 / 890M, rc.2): before fix, DB
  shows `container_image = ollama/ollama:rocm` after triggering an
  Ollama update via Settings > Apps; Apps page shows version "rocm";
  `/api/system/services/check-updates` immediately re-reports the same
  update available. After fix, DB shows `container_image =
  ollama/ollama:<targetVersion>`; Apps page shows the semver; check-
  updates does not re-report the same update.
- nomad_ollama container itself still runs the `:rocm` image
  (verified via `docker inspect`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
Ben Gauger
6c799dd602 fix(System): correct NVIDIA VRAM in Graphics card (#835)
PR #804 added pciutils to the admin image so AMD detection could fall
back to lspci. Side effect for NVIDIA: si.graphics() now finds the card
via lspci but reads the BAR0 region size (16-32 MiB on most NVIDIA
cards) as VRAM, since nvidia-smi isn't installed in the admin image to
enrich the result. A GTX 1050 Ti showed 32 MB instead of 4096.

The nvidia-smi-via-Ollama and Ollama log probes already give the right
number, but they only ran when graphics.controllers came back empty.
Extend the trigger so they also run when the only entries are NVIDIA
controllers reporting under 256 MiB (no real dGPU has that little). If
the probes can't reach a value either (Ollama not installed, passthrough
broken), VRAM now falls back to N/A instead of the bogus 32.

Verified locally on an RTX 4070 Ti by simulating the container condition
(lspci available, nvidia-smi unreachable). Before the fix: vram 32, model
"AD104 [GeForce RTX 4070 Ti]". After: vram 12288, model "NVIDIA GeForce
RTX 4070 Ti", from the Ollama inference-compute log line. Also confirmed
the same result inside the actual admin Docker image.
2026-05-20 10:16:00 -07:00
Chris Sherwood
a2e2f7fc40 fix(AI): vendor-aware AMD HSA override + benchmark discrete-GPU detection
Closes #810.

## Bug A: HSA_OVERRIDE_GFX_VERSION=11.0.0 was unconditional

PR #804 set HSA_OVERRIDE_GFX_VERSION=11.0.0 for any AMD GPU. The inline comment
claimed this was harmless on supported discrete cards (gfx1030 RX 6800, etc.) — empirically false. With the override, Ollama crashes during GPU discovery on
gfx1030 and falls back to CPU silently. Affects every NOMAD user with an
RX 6800 or other RDNA 2 discrete card.

The correct value depends on the gfx version:
- gfx1030, gfx1100, gfx1101, gfx1102: officially supported by ROCm — no override
- gfx1031..gfx1036 (RDNA 2 variants + iGPUs like Rembrandt 680M): 10.3.0
- gfx1103, gfx1150, gfx1151 (Phoenix 780M, Strix 890M, Strix Halo): 11.0.0

### Resolution chain in `_resolveAmdHsaOverride()`

1. KV `ai.amdHsaOverride` — manual override; accepts 'none' to disable, or a
   semver-style value to force.
2. Marker file `/app/storage/.nomad-amd-gfx` — written by install_nomad.sh
   based on lspci codename. Mapped to override via `_mapGfxToHsaOverride()`.
3. Default: `11.0.0` — preserves prior behavior so existing iGPU users
   (780M / 890M, the dominant AMD population today) don't regress on upgrade.

Discrete RDNA 2 users on existing installs can opt out via
`ai.amdHsaOverride='none'` and force-reinstall AI Assistant, OR re-run
install_nomad.sh to refresh the marker file.

The helper is used in both `createContainer` (initial install) and
`updateContainer` (image update) paths, replacing the unconditional push.

## Bug B: BenchmarkService had no AMD discrete detection path

`BenchmarkService.getHardwareInfo()` had three GPU detection fallbacks:
1. `si.graphics()` — empty inside Docker for AMD
2. nvidia-smi — NVIDIA only
3. AMD APU regex from CPU model — integrated only

Result: AMD discrete cards (RX 6800, RX 7900 XTX, etc.) showed up as
"GPU: Not detected" on the leaderboard despite ROCm working. Corrupts
leaderboard data quality for that population.

Fix: after the existing fallbacks, call `SystemService.getSystemInfo()` and
read `graphics.controllers[0].model`. That path already handles AMD via the
marker file + Ollama log probe added in PR #804, so we're reusing existing
plumbing rather than duplicating detection logic.

## install_nomad.sh changes

The existing AMD detection block already runs lspci. Added a codename parse
step that maps Navi 21/22/23/24, Rembrandt, Phoenix1/Phoenix2, Strix/Strix
Point/Strix Halo, and Navi 31/32/33 to gfx versions, then writes
`/opt/project-nomad/storage/.nomad-amd-gfx`. Unknown codenames write nothing
(admin handles missing-marker case via the backward-compat default).

## Validation

Both bugs were originally surfaced and validated empirically on RX 6800 /
gfx1030 / Ubuntu 24.04 + kernel 6.17 + ollama/ollama:rocm during the #810
filing. Validation grid from that report:

| Run                                           | NOMAD Score | tok/s | GPU detected            |
|-----------------------------------------------|-------------|-------|-------------------------|
| Pre-fix (Bug A active)                        | n/a         | 0     | yes, but library=cpu    |
| HSA_OVERRIDE removed, Bug B unfixed           | 73.8        | 221.6 | "Not detected"          |
| Both fixes hot-patched (this PR's behavior)   | 73.7        | 216.0 | AMD Radeon RX 6800      |

Local checks: `npm run typecheck` clean, `npm run build` clean.
2026-05-20 10:16:00 -07:00
chriscrosstalk
62e75fdb54 feat(Content): custom ZIM library sources with pre-seeded mirrors (#593)
* feat(content): add custom ZIM library sources with pre-seeded mirrors

Users reported slow download speeds from the default Kiwix CDN. This adds
the ability to browse and download ZIM files from alternative Kiwix mirrors
or self-hosted repositories, all through the GUI.

- Add "Custom Libraries" button next to "Browse the Kiwix Library"
- Source dropdown to switch between Default (Kiwix) and custom libraries
- Browsable directory structure with breadcrumb navigation
- 5 pre-seeded official Kiwix mirrors (US, DE, DK, UK, Global CDN)
- Built-in mirrors protected from deletion
- Downloads use existing pipeline (progress, cancel, Kiwix restart)
- Source selection persists across page loads via localStorage
- Scrollable directory browser (600px max) with sticky header
- SSRF protection on all custom library URLs

Closes #576

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(content): recognize Wikipedia downloads from mirror sources

When Wikipedia is downloaded via a custom mirror instead of the default
Kiwix server, the completion callback now matches by filename instead
of exact URL. This ensures the Wikipedia selector correctly shows
"Installed" status and triggers old-version cleanup regardless of
which mirror was used.

Also handles the case where no Wikipedia selection exists yet (file
downloaded before visiting the selector), creating the record
automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ZIM): use cheerio for custom mirror directory parsing

* fix(ZIM): use URL constructor for more robust joining

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jake Turner <jturner@cosmistack.com>
2026-05-20 10:16:00 -07:00
0xGlitch
94059b0aaf feat(Maps): regional map downloads via go-pmtiles extract (#780)
* feat(maps): add regional map downloads via go-pmtiles extract

* address Copilot review feedback on PR #780

- auto-refresh preflight on selection/maxzoom change with 400ms debounce and
  requestId stale-safety so the confirm button no longer requires a two-step
  "Estimate Size" -> "Start Download" dance
- safeUpdateProgress helper replaces fire-and-forget updateProgress().catch()
  pattern so cancelled-job errors (code -1) can't surface as unhandled rejections
- gate world basemap source on worldBasemapReady - when ensureWorldBasemap()
  fails we already delete world.pmtiles, so emitting the source was producing
  404s on every tile request
- verify go-pmtiles binary SHA256 at image build time; upstream doesn't ship a
  checksums file so per-arch hashes are pinned as build args with a regenerate
  note when bumping PMTILES_VERSION
2026-05-20 10:16:00 -07:00
Chris Sherwood
299b767e63 feat(content-updates): show size, surface downloads in Active Downloads
Content Updates had three UX problems that compounded:

1. No size column, so users had to guess how big an update would be before
   clicking Update All. Upstream /api/v1/resources/check-updates doesn't
   return size, so CollectionUpdateService now enriches each update with
   a Content-Length HEAD request in parallel (5s timeout, non-fatal on
   failure — the row just renders an em-dash).

2. Small ZIM updates (1-8 MB) never appeared in Active Downloads. Two
   causes, both fixed: handleApply / handleApplyAll didn't invalidate the
   download-jobs query after dispatching, and useDownloads idled at 30s
   between polls — enough for a fast job to dispatch, download, and get
   cleaned up by removeOnComplete before the next refetch.

3. applyUpdate didn't forward title / totalBytes to RunDownloadJob, so
   any update that did briefly surface in Active Downloads had no label
   and no byte-count progress, just a filename and a percentage. It now
   passes both (matching zim_service's dispatch pattern).

Also parallelized applyAllUpdates so dispatching five updates doesn't
serialize five sequential BullMQ round-trips.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
chriscrosstalk
73e2115245 feat(AI): improved AMD GPU acceleration for Ollama via ROCm + HSA override (#804)
* feat(AI): re-enable AMD GPU acceleration for Ollama via ROCm + HSA override

Re-enables AMD GPU support that was disabled in 77f1868 pending validation
of the ROCm image and device discovery. Validation done 2026-04-28 on a
Minisforum UM890 Pro (Ryzen 9 PRO 8945HS + Radeon 780M iGPU) — Ollama
correctly offloaded all model layers to the iGPU when the container was
started with /dev/kfd + /dev/dri passthrough and HSA_OVERRIDE_GFX_VERSION=11.0.0.
On llama3.2:1b, GPU inference ran at 51.83 tok/s vs 33.16 tok/s on CPU
(same hardware, same prompt) — a 1.56x speedup confirmed by Ollama logs
showing "load_tensors: offloaded 17/17 layers to GPU".

Changes
-------

docker_service.ts
- Restore _discoverAMDDevices() (simplified — pass /dev/dri as a directory
  entry, mirroring `docker run --device /dev/dri` behavior, instead of the
  prior brittle hardcoded card0/renderD128 fallback that broke on systems
  where the AMD GPU enumerates as card1+).
- Restore the AMD branch in _createContainer():
  - Switches Ollama image to ollama/ollama:rocm
  - Mounts /dev/kfd + /dev/dri via Devices
  - Sets HSA_OVERRIDE_GFX_VERSION=11.0.0 (required for unsupported-but-RDNA3
    iGPUs like gfx1103; harmless on supported discrete cards)
  - KV opt-out via ai.amdGpuAcceleration (default on)
- Mirror the AMD branch in updateContainer():
  - Lifted GPU detection above docker.pull() so AMD updates pull :rocm
    rather than the standard :targetVersion tag (per-version ROCm tags
    aren't always published)
  - Replaces stale HSA_OVERRIDE in the inspect-captured env on update,
    so containers built before this PR pick up the current value

system_service.ts
- New getOllamaInferenceComputeFromLogs() — parses Ollama startup log line
  "msg=\"inference compute\" ... library=CUDA|ROCm ..." which Ollama emits
  for both NVIDIA and AMD. Catches silent CPU fallback (e.g. NVML death
  after update, or HSA_OVERRIDE failure) that the prior nvidia-smi exec
  probe couldn't detect.
- gpuHealth refactored to use log parsing as the primary probe for both
  vendors, with nvidia-smi exec retained as the NVIDIA-only secondary
  path for hardware enrichment when log parsing has no startup line yet.
- AMD path uses gpu.type KV value (persisted by DockerService._detectGPUType)
  + ai.amdGpuAcceleration opt-out to determine hasRocmRuntime.

types/system.ts
- GpuHealthStatus extended additively: hasRocmRuntime + optional gpuVendor.

types/kv_store.ts
- New ai.amdGpuAcceleration boolean (default-on).

settings/models.tsx, settings/system.tsx
- passthrough_failed banner copy now reads vendor from gpuHealth.gpuVendor
  ("an AMD GPU" vs "an NVIDIA GPU"). Same Fix button hits the same
  force-reinstall endpoint, which now configures AMD correctly.

install_nomad.sh
- AMD detection in verify_gpu_setup() upgraded from a strict-positive
  "ROCm not currently available" message to "ROCm acceleration will be
  configured automatically." Also tightens the lspci match to display
  controller classes (avoids false positives from AMD CPU host bridges,
  matching the same fix already in DockerService._detectGPUType).

Auto-remediation
----------------

Issue #755 proposes auto-remediation when gpuHealth.status flips to
passthrough_failed (today the user has to click "Fix: Reinstall AI
Assistant"). When that PR lands, AMD coverage falls out for free since
this PR uses the same passthrough_failed status code via the shared
gpuHealth machinery — #755's guard will need to flip from
hasNvidiaRuntime === true to (hasNvidiaRuntime || hasRocmRuntime).

Closes #124 (AMD GPU support).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(AI): detect AMD GPU presence inside admin container via marker file

The admin container doesn't have lspci installed, and AMD GPUs don't register
a Docker runtime the way NVIDIA does — so DockerService._detectGPUType() and
SystemService.gpuHealth had no way to know an AMD GPU was present.

The previous implementation fell through to lspci, which silently failed inside
the admin container, leaving gpu.type unset and gpuHealth stuck at 'no_gpu'
even on systems with an AMD GPU. (NVIDIA worked because Docker registers the
nvidia runtime, which is reachable via dockerInfo.Runtimes from any container.)

Discovered while testing the AMD acceleration patch on a Minisforum UM890 Pro:
the AMD branch in _createContainer() never fired because _detectGPUType()
returned 'none' even on a host with a working /dev/kfd.

Fix
---

install_nomad.sh writes the host-detected GPU type ('nvidia' | 'amd') to a
marker file in the storage volume the admin container already bind-mounts:

  /opt/project-nomad/storage/.nomad-gpu-type  →  /app/storage/.nomad-gpu-type

DockerService._detectGPUType() reads the marker as a secondary probe (after
the Docker runtime check) — covers AMD detection from inside the container
without requiring lspci or a /dev bind mount.

SystemService falls back to the marker file when KV gpu.type is empty so the
System page reflects AMD presence even before the user installs AI Assistant
for the first time. (Without this, the page would say 'no_gpu' until Ollama
was installed, even on hosts with an AMD GPU detected at install time.)

Verified on NOMAD6 (UM890 Pro, Ubuntu 24.04, 780M iGPU): with the marker file
in place and admin restarted, the patch's AMD branch fires correctly on Force
Reinstall AI Assistant. Resulting nomad_ollama runs ollama/ollama:rocm with
/dev/kfd + /dev/dri passthrough and HSA_OVERRIDE_GFX_VERSION=11.0.0; Ollama
logs show 'library=ROCm compute=gfx1100 ... type=iGPU'. NOMAD's in-product
benchmark on the same hardware climbed from 33.8 tok/s (CPU) to 57.3 tok/s
(GPU) — a 1.69x speedup, with TTFT dropping from 148ms to 66ms.

Migration for existing AMD installs
-----------------------------------

Users on an existing NOMAD install with an AMD GPU have no marker file (the
install script wrote it on a fresh install). Two paths get them on the GPU:
  1. Re-run install_nomad.sh — writes the marker, no other side effects
  2. Manually: echo amd | sudo tee /opt/project-nomad/storage/.nomad-gpu-type

Either then triggers AMD detection on the next AI Assistant install/reinstall.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(AI): pull ollama/ollama:rocm separately when AMD branch overrides image

The pull-if-missing logic in _createContainer ran against service.container_image
(the DB-pinned tag, e.g. ollama/ollama:0.18.2). The AMD branch then overrode
finalImage to ollama/ollama:rocm — but if that image wasn't already local, the
container creation step failed with "no such image: ollama/ollama:rocm".

Caught while validating on NOMAD2 (Ryzen AI 9 HX 370 + Radeon 890M / RDNA 3.5):
the prior end-to-end test on NOMAD6 had silently passed because the rocm image
was already pulled there from an earlier sidecar test, masking the bug.

Fix: inside the AMD branch, after setting finalImage to ollama/ollama:rocm,
run a parallel _checkImageExists + docker.pull dance for the new tag.

Also confirmed via this validation: the same HSA_OVERRIDE_GFX_VERSION=11.0.0
override works on the 890M (gfx1150 / RDNA 3.5) — Ollama logs report
'library=ROCm compute=gfx1100 description="AMD Radeon 890M Graphics"' and
inference runs at 51.68 tok/s (matching the existing X1 Pro published tile
of 51.7 tok/s on the same hardware class). RDNA 3 (780M, gfx1103) and RDNA
3.5 (890M, gfx1150) both use the same override successfully.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(Dockerfile): include pciutils for lspci gpu detection fallback

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Jake Turner <jturner@cosmistack.com>
2026-05-20 10:16:00 -07:00
Henry Estela
2d8a02f257 fix(RAG): add start button in kb modal and ensure restart policy exists (#700)
Adds a check to RAG health to make sure nomad_qdrant is online, if not
then the user will be blocked from clicking any buttons in the KB modal
until they click the start qdrant button and let the container start

There is a new file qdrant_restart_policy_provider.ts, which tries to
ensure that the restart policy always exists for the nomad_qdrant
container even though the policy should have been there when the
container is created.
2026-05-20 10:16:00 -07:00
John Scherer
132ec9c98a fix(API): accept notes, marker_type, and position on markers endpoints (#770)
The VineJS validators in createMarker and updateMarker silently
dropped fields not in their schema. The MapMarker model and DB
include notes and marker_type, and GET responses return them, but
POST and PATCH would not persist them.

updateMarker additionally did not accept latitude/longitude, so
markers could not be repositioned via the API after creation.

- Add notes and marker_type to both validators and model assignments.
- Add latitude/longitude to the update validator.
- Add coordinate range validation on both endpoints.

Closes #768
2026-05-20 10:16:00 -07:00
chriscrosstalk
7bebedcefa fix(RAG): pass num_ctx and truncate to Ollama embed call (#763)
Some Ollama installs ship nomic-embed-text:v1.5 with the embedding
model's default num_ctx=2048, which the RAG chunker (sized for ~1500
tokens of estimated content with ratio=2 chars/token) can exceed on
dense PDFs. The result is `400 the input length exceeds the context
length` from /api/embed, which then hits the OpenAI-compatible
fallback (which also errors), and surfaces as a BadRequestError.

Pass options.num_ctx=8192 (nomic-embed-text v1.5's RoPE-extrapolated
max) and truncate=true (silent truncation safety net) on every
embed call so we don't depend on the local modelfile defaults.

Reported on #756 by @NC4WD; same root cause as #369 and #670 which
were closed without an actual fix.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:16:00 -07:00
chriscrosstalk
4b21ea6c40 fix(API): skip compression for Server-Sent Events (#798)
* fix(stream): skip compression for Server-Sent Events

The global compression middleware (added in v1.31.0-rc.2) buffers
response writes to determine encoding, which collapses per-token
streaming into a single block delivered after generation completes.
This broke the AI chat streaming UX from v1.31.0-rc.2 onward — text
no longer appears progressively as the model generates it, only at
the end.

Adds a filter to compression() that returns false when the response
Content-Type is text/event-stream. Other responses still go through
the default compression filter (compressible types are still
compressed; e.g. text/html via Brotli).

Reproduced on NOMAD3 v1.31.1: before fix, all SSE chunks for a 1B
model arrive within 10ms of each other after the model finishes.
After fix, tokens arrive at ~150ms intervals as they're generated
on a 12B model, with no Content-Encoding header on the SSE response.

Verified on the same host that /home still returns
Content-Encoding: br for HTML responses.

Closes #781. Reported and bisected by @toasterking
(works in v1.31.0-rc.1, broken from v1.31.0-rc.2 onward).

* fix(stream): use any for filter params to match existing as-any pattern

The compression library types its filter as (req: Request, res: Response)
expecting Express types, but AdonisJS passes raw IncomingMessage/ServerResponse
which is why the surrounding middleware uses `as any` casts at the call site.
The IncomingMessage/ServerResponse types I added are runtime-correct but
fail tsc against the library's declared types.

Drop the typed import in favor of `any` parameters, which matches how the
existing `compress(request.request as any, response.response as any, ...)`
call resolves the same mismatch.
2026-05-20 10:16:00 -07:00
chriscrosstalk
95d0816d50 feat(content-manager): add sortable file size column (#698)
Closes #685

Content Manager now surfaces the on-disk size of each ZIM file alongside
title/summary, and lets users sort the list by Size or Title. Defaults to
Size descending so the largest files are visible first.

- ZimService.list() now stats each file and returns size_bytes
- Content Manager table adds a formatted Size column (via formatBytes)
- Sortable headers for Title and Size with asc/desc toggle
2026-05-20 10:16:00 -07:00
chriscrosstalk
216509ae0d fix(rag): repair ZIM embedding pipeline (sync filter, batch gate, DOM walk) (#745)
Three bugs in the RAG embedding pipeline, diagnosed and patched by @sbruschke
against v1.31.0 with working before/after chunk counts. All three are
root-cause contributors to #388.

1. scanAndSyncStorage queued every file under /storage/zim/ for embedding,
   including Kiwix's generated kiwix-library.xml. EmbedFileJob rejected it
   with "Unsupported file type" and the default 30-attempt retry policy
   kept it looping on every sync, flooding nomad_admin logs. Now gated on
   determineFileType(filePath) !== 'unknown'.

2. hasMoreBatches compared zimChunks.length (section-level chunk count
   under the 'structured' strategy) against ZIM_BATCH_SIZE (an article
   limit). Because articles emit multiple sections, the two are never
   equal for real archives and processing silently stopped after the
   first 50 articles. Now gated on articlesInBatch >= ZIM_BATCH_SIZE.

3. extractStructuredContent walked only direct children of <body>, so any
   ZIM that wraps content in a container div (Devdocs, Wikipedia,
   FreeCodeCamp, React docs, etc.) produced zero sections and silently
   embedded zero chunks while reporting success. Now walks the full DOM
   via $('body').find('h2, h3, h4, p, ul, ol, dl, table'), with a
   whole-body text fallback when the selector walk yields nothing.

Before/after chunk counts confirmed by @sbruschke on v1.31.0:
  devdocs_en_git   0 -> 916
  devdocs_en_react 0 -> 481
  devdocs_en_node  0 -> 423
  libretexts_en_eng 1 -> 35 (climbing)
Wikipedia resumed progressing normally through its 6M articles.

Closes #718
Closes #719
Closes #720
Closes #388

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 14:26:28 -07:00
chriscrosstalk
810a70acb7 fix(ZIM): accumulate across Kiwix pages to prevent empty Content Explorer (#746)
When many ZIMs are already installed locally, a single Kiwix catalog page
(12 items) could return 12 already-installed items, which zim_service
would fully filter out client-side. The endpoint returned items: [] with
has_more: true, and the frontend's infinite-scroll guard
(flatData.length > 0) blocked fetchNextPage — leaving the user with
"No records found" despite plenty of uninstalled ZIMs available.

Backend now accumulates across up to 5 Kiwix fetches (60 items each)
until it has enough post-filter results to return, dedupes by entry id,
advances currentStart by actual entries returned (not requested), and
returns a next_start cursor. The frontend consumes that cursor instead
of computing Kiwix offsets locally, and the flatData.length > 0 guard is
removed so the existing on-mount effect drives bounded auto-fetch when
a short page lands.

The pre-existing has_more off-by-one (compared totalResults against the
input start rather than the post-fetch position) is fixed implicitly.

Diagnosis credit: @johno10661.

Closes #731

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 14:26:28 -07:00
chriscrosstalk
6646b3480b fix(AI): stop local nomad_ollama container when remote Ollama is configured (#744)
When users set a remote Ollama URL via AI Settings, the local nomad_ollama
container continued running and competed with the remote host for port 11434
and GPU access. Now configureRemote stops the local container on set and
restores it on clear (if still present). Container and its models volume are
preserved so the local install can be re-enabled later.

Closes #662

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 14:26:28 -07:00
chriscrosstalk
056556497c docs: add Community Add-Ons page with field manuals + W3Schools packs (#753)
Introduces a dedicated page listing third-party ZIM content packs built
by the community. Launches with the two current add-ons (jrsphoto field
manuals, kennethbrewer W3Schools) and explains how to install a ZIM pack
and where to submit a new one for inclusion.

- New doc at admin/docs/community-add-ons.md
- Wired into DocsService DOC_ORDER (slot 4) and TITLE_OVERRIDES so the
  hyphen in "Add-Ons" is preserved in the sidebar
- README gets a link under Community & Resources

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 14:26:28 -07:00