Three bugs in the RAG embedding pipeline, diagnosed and patched by @sbruschke
against v1.31.0, with before/after chunk counts verifying each fix. All three
are root-cause contributors to #388.
1. scanAndSyncStorage queued every file under /storage/zim/ for embedding,
including Kiwix's generated kiwix-library.xml. EmbedFileJob rejected it
with "Unsupported file type" and the default 30-attempt retry policy
kept it looping on every sync, flooding nomad_admin logs. Now gated on
determineFileType(filePath) !== 'unknown'.
2. hasMoreBatches compared zimChunks.length (section-level chunk count
under the 'structured' strategy) against ZIM_BATCH_SIZE (an article
limit). Because articles emit multiple sections, the two are never
equal for real archives and processing silently stopped after the
first 50 articles. Now gated on articlesInBatch >= ZIM_BATCH_SIZE.
3. extractStructuredContent walked only direct children of <body>, so any
ZIM that wraps content in a container div (Devdocs, Wikipedia,
FreeCodeCamp, React docs, etc.) produced zero sections and silently
embedded zero chunks while reporting success. Now walks the full DOM
via $('body').find('h2, h3, h4, p, ul, ol, dl, table'), with a
whole-body text fallback when the selector walk yields nothing.
Before/after chunk counts confirmed by @sbruschke on v1.31.0:
devdocs_en_git 0 -> 916
devdocs_en_react 0 -> 481
devdocs_en_node 0 -> 423
libretexts_en_eng 1 -> 35 (climbing)
Wikipedia resumed progressing normally through its 6M articles.
Closes #718
Closes #719
Closes #720
Closes #388
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When many ZIMs are already installed locally, a single Kiwix catalog page
(12 items) could return 12 already-installed items, which zim_service
would fully filter out client-side. The endpoint returned items: [] with
has_more: true, and the frontend's infinite-scroll guard
(flatData.length > 0) blocked fetchNextPage — leaving the user with
"No records found" despite plenty of uninstalled ZIMs available.
Backend now accumulates across up to 5 Kiwix fetches (60 items each)
until it has enough post-filter results to return, dedupes by entry id,
advances currentStart by actual entries returned (not requested), and
returns a next_start cursor. The frontend consumes that cursor instead
of computing Kiwix offsets locally, and the flatData.length > 0 guard is
removed so the existing on-mount effect drives bounded auto-fetch when
a short page lands.
The pre-existing has_more off-by-one (compared totalResults against the
input start rather than the post-fetch position) is fixed implicitly.
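The accumulate-until-enough paging can be sketched as follows; the entry
shape, function names, and installed-check are hypothetical, while the
5-fetch / 60-item bounds and the cursor behavior come from the description:

```typescript
interface CatalogEntry { id: string; title: string }
type FetchPage = (start: number, count: number) => Promise<CatalogEntry[]>;

// Sketch of the backend fix: keep fetching Kiwix catalog pages until enough
// uninstalled entries accumulate, then hand the client an explicit cursor.
async function listUninstalled(
  fetchPage: FetchPage,
  isInstalled: (id: string) => boolean,
  start: number,
  pageSize: number,
): Promise<{ items: CatalogEntry[]; next_start: number }> {
  const seen = new Set<string>();
  const items: CatalogEntry[] = [];
  let cursor = start;
  for (let fetches = 0; fetches < 5 && items.length < pageSize; fetches++) {
    const batch = await fetchPage(cursor, 60); // up to 5 Kiwix fetches, 60 items each
    if (batch.length === 0) break;             // catalog exhausted
    cursor += batch.length;                    // advance by actual entries returned
    for (const entry of batch) {
      if (seen.has(entry.id) || isInstalled(entry.id)) continue; // dedupe + filter
      seen.add(entry.id);
      items.push(entry);
    }
  }
  return { items, next_start: cursor };        // frontend consumes this cursor
}
```

With this shape, a page of 12 fully installed items no longer yields
`items: []` with `has_more: true`; the loop simply pulls the next 60.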
Diagnosis credit: @johno10661.
Closes #731
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When users set a remote Ollama URL via AI Settings, the local nomad_ollama
container continued running and competed with the remote host for port 11434
and GPU access. Now configureRemote stops the local container on set and
restores it on clear (if still present). Container and its models volume are
preserved so the local install can be re-enabled later.
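The stop-on-set / restore-on-clear behavior reduces to a small decision; a
sketch with hypothetical names (the real configureRemote drives Docker, this
only models the choice):

```typescript
type LocalAction = 'stop-local' | 'start-local' | 'none';

// Hypothetical model of the configureRemote decision: setting a remote URL
// stops the local nomad_ollama container; clearing it restores the container
// if it is still present. The container and its models volume are never
// removed, so the local install can be re-enabled later.
function localOllamaAction(remoteUrl: string | null, localPresent: boolean): LocalAction {
  if (!localPresent) return 'none';
  return remoteUrl ? 'stop-local' : 'start-local';
}
```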
Closes #662
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Qdrant's upstream default enables anonymous telemetry to telemetry.qdrant.io,
which doesn't match NOMAD's offline-first "zero telemetry" posture. Adding
QDRANT__TELEMETRY_DISABLED=true to the container environment turns it off for
fresh installs and reinstalls.
Existing installs keep their current telemetry-enabled container until the
Qdrant service is force-reinstalled via the Knowledge Base panel or the next
container recreation — Docker bakes Env into containers at create time, so
env changes require a new container.
Closes #742
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces a dedicated page listing third-party ZIM content packs built
by the community. Launches with the two current add-ons (jrsphoto field
manuals, kennethbrewer W3Schools) and explains how to install a ZIM pack
and where to submit a new one for inclusion.
- New doc at admin/docs/community-add-ons.md
- Wired into DocsService DOC_ORDER (slot 4) and TITLE_OVERRIDES so the
hyphen in "Add-Ons" is preserved in the sidebar
- README gets a link under Community & Resources
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a cancel button to in-progress Ollama model downloads and unifies
the Active Model Downloads card layout with the Active Downloads card
used for ZIMs, maps, and pmtiles (byte counts, progress bar, live speed,
status indicator).
Closes #676
Downloads are now written to `filepath + '.tmp'` and atomically renamed
to the final path only on successful completion. Kiwix globs for `*.zim`
and ZimService filters `.endsWith('.zim')`, so `.tmp` files are invisible
to both during download. The same staging applies to `.pmtiles` map files.
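The staging flow can be sketched as below; `write` stands in for the actual
streaming download (an assumption), and the function name is hypothetical:

```typescript
import { promises as fs } from 'fs';

// Sketch of the .tmp staging described above: download into filepath + '.tmp'
// (invisible to Kiwix's *.zim glob and to ZimService's .endsWith('.zim')
// filter), then atomically rename onto the final path on success.
async function stageDownload(
  filepath: string,
  write: (dest: string) => Promise<void>,
): Promise<void> {
  const tmp = filepath + '.tmp';
  try {
    await write(tmp);               // stream the download into the staging file
    await fs.rename(tmp, filepath); // atomic on the same filesystem
  } catch (err) {
    await fs.unlink(tmp).catch(() => {}); // best-effort cleanup of the stage
    throw err;
  }
}
```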
Ref #372
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Corrupted ZIM files cause a native C++ abort (ZimFileFormatError) that
bypasses JS try/catch and kills the process. Add magic number validation
before passing files to @openzim/libzim so invalid files are skipped
gracefully. Also deduplicate Ollama download progress broadcasts — both
within a single stream (skip unchanged percentages) and across concurrent
callers (share one download promise per model).
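The magic-number check can be sketched as follows. 72173914 (0x044D495A) is
the magic value from the openZIM file format spec, stored little-endian in
the first four bytes; whether the real patch checks exactly this is an
assumption, as is the helper name:

```typescript
import { promises as fs } from 'fs';

// openZIM file format magic number, little-endian at offset 0.
const ZIM_MAGIC = 72173914;

// Cheap pre-validation so corrupted files never reach @openzim/libzim,
// whose native ZimFileFormatError abort bypasses JS try/catch.
async function looksLikeZim(path: string): Promise<boolean> {
  const fh = await fs.open(path, 'r');
  try {
    const buf = Buffer.alloc(4);
    const { bytesRead } = await fh.read(buf, 0, 4, 0);
    return bytesRead === 4 && buf.readUInt32LE(0) === ZIM_MAGIC;
  } finally {
    await fh.close();
  }
}
```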
Co-authored-by: aegisman <aegis@manicode.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the OpenAI-compatible fallback (/v1/models) is used, models are mapped
as { name: m.id, size: 0 } with no details field. Accessing
model.details.parameter_size throws `TypeError: Cannot read properties of
undefined`, which crashes the React render and causes the entire page to go
blank.
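A defensive accessor along these lines avoids the crash; the model shape is
from the description, the helper name and fallback label are hypothetical:

```typescript
// Shape per the description: the /v1/models fallback yields entries with no
// `details`, so guard the access rather than assume it exists.
interface InstalledModel {
  name: string;
  size: number;
  details?: { parameter_size?: string };
}

// Hypothetical helper; the real fix location in the React component may differ.
function parameterSizeLabel(model: InstalledModel): string {
  return model.details?.parameter_size ?? 'unknown';
}
```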
Defaults to metric for a global audience. Persists the choice in localStorage.
Segmented button styled to match MapLibre controls.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add distance scale bar and user-placed location pins to the offline maps viewer.
- Scale bar (bottom-left) shows distance reference that updates with zoom level
- Click anywhere on map to place a named pin with color selection (6 colors)
- Collapsible "Saved Locations" panel lists all pins with fly-to navigation
- Full dark mode support for popups and panel via CSS overrides
- New `map_markers` table with future-proofed columns for routing (marker_type,
route_id, route_order, notes) to avoid a migration when routes are added later
- CRUD endpoints: GET/POST /api/maps/markers, PATCH/DELETE /api/maps/markers/:id
- VineJS validation on create/update
- MapMarker Lucid model
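The future-proofed row shape can be sketched in TypeScript; the routing
columns are from the bullet list above, while lat/lng/color names are
assumptions about the coordinate and pin-color columns:

```typescript
// Hypothetical shape mirroring the map_markers columns described above.
interface MapMarker {
  id: number;
  name: string;
  lat: number;                 // assumed coordinate columns
  lng: number;
  color: string;               // one of the 6 pin colors
  marker_type: string;         // future-proofed for routing
  route_id: number | null;
  route_order: number | null;
  notes: string | null;
}

// A pin placed by clicking the map belongs to no route until routing lands.
function isRouteMember(m: MapMarker): boolean {
  return m.route_id !== null && m.route_order !== null;
}
```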
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(downloads): increase retry attempts and backoff for large file downloads
* fix download retry config and abort handling
* use abort reason to detect user-initiated cancels
New defaults:
OLLAMA_NO_CLOUD=1 - "Ollama can run in local only mode by disabling
Ollama’s cloud features. By turning off Ollama’s cloud features, you
will lose the ability to use Ollama’s cloud models and web search."
https://ollama.com/blog/web-search
https://docs.ollama.com/faq#how-do-i-disable-ollama%E2%80%99s-cloud-features
example output:
```
ollama run minimax-m2.7:cloud
Error: ollama cloud is disabled: remote model details are unavailable
```
Disabling cloud is safe: logging in to Ollama cloud requires clicking a
link, and there's no real way to do that in NOMAD outside of reading the
nomad_ollama logs.
The flag can still be turned off in settings in case a model out there
doesn't play nice, but so far that hasn't seemed necessary.
OLLAMA_FLASH_ATTENTION=1 - "Flash Attention is a feature of most modern
models that can significantly reduce memory usage as the context size
grows."
Tested with llama3.2:
```
docker logs nomad_ollama --tail 1000 2>&1 |grep --color -i flash_attn
llama_context: flash_attn = enabled
```
And with second_constantine/deepseek-coder-v2, which is based on
https://huggingface.co/lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF,
a model that specifically calls out that flash attention should be
disabled. During testing, it seems Ollama can handle this for you
automatically:
```
docker logs nomad_ollama --tail 1000 2>&1 |grep --color -i flash_attn
llama_context: flash_attn = disabled
```
Surfaces all installed AI models in a dedicated table between Settings
and Active Model Downloads, so users can quickly see what's installed
and delete models without hunting through the expandable model catalog.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>