mirror of
https://github.com/Crosstalk-Solutions/project-nomad.git
synced 2026-05-26 14:25:07 +02:00
EmbedFileJob.dispatch() uses a deterministic per-file jobId
(sha256(filePath).slice(0,16)) for every batch. The parent batch's
handle() calls EmbedFileJob.dispatch({ batchOffset }) before returning,
so the parent is still in the `active` state and locked when the
continuation tries to enqueue. BullMQ returns the locked parent
instead of creating a new job, and in newer BullMQ versions it does so
without throwing, so the existing
`catch (error.message.includes('job already exists'))` branch never
fires. After the parent completes, its entry stays in the `completed`
ZSET (retained by `removeOnComplete: { count: 50 }`), so it keeps
tripping jobId dedupe on any subsequent re-dispatch attempt.
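The collision described above can be illustrated with a toy in-memory model. This is a sketch only, not BullMQ code: `ToyQueue`, `ToyJob`, the id `'abc123'`, the path `/zims/example.zim`, and the offset of 50 are all hypothetical names and values chosen for illustration. It assumes only the behaviour described in this message: adding a job whose id already exists hands back the existing job and does not throw.

```typescript
// Toy model of jobId dedupe. NOT BullMQ: names and values are illustrative.
type JobState = 'waiting' | 'active' | 'completed'

interface ToyJob {
  id: string
  data: { filePath: string; batchOffset: number }
  state: JobState
}

class ToyQueue {
  private jobs = new Map<string, ToyJob>()
  private nextAutoId = 1

  // If jobId is omitted, a unique id is generated (mimicking auto-generated ids).
  add(data: ToyJob['data'], jobId?: string): { job: ToyJob; created: boolean } {
    const id = jobId ?? `auto-${this.nextAutoId++}`
    const existing = this.jobs.get(id)
    if (existing) {
      // Dedupe: the existing job is returned, no error is raised,
      // and the new payload is dropped.
      return { job: existing, created: false }
    }
    const job: ToyJob = { id, data, state: 'waiting' }
    this.jobs.set(id, job)
    return { job, created: true }
  }

  size(): number {
    return this.jobs.size
  }
}

const queue = new ToyQueue()

// Parent batch enqueued with a deterministic id, then picked up by a worker.
const parent = queue.add({ filePath: '/zims/example.zim', batchOffset: 0 }, 'abc123')
parent.job.state = 'active'

// The continuation reuses the same id while the parent is still active.
const continuation = queue.add({ filePath: '/zims/example.zim', batchOffset: 50 }, 'abc123')
```

In this model `continuation.created` is `false` and `continuation.job` is the parent itself, so the offset-50 payload never reaches the queue. The same holds once the parent's state is `completed`, as long as its entry is retained.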
Result: every NOMAD install since 2026-02-08 (feat: zim content
embedding) with a multi-batch ZIM (wikipedia, cooking SE, ifixit,
lrnselfreliance, etc.) has only the first 50 articles indexed in
qdrant. The RAG feature has been silently degraded for ~3 months:
the user sees the file appear in their KB, qdrant accumulates ~50
articles' worth of vectors, and pagination quietly halts. No error
surfaces anywhere.
Fix: dispatch() skips the deterministic jobId for continuation batches
(batchOffset > 0), letting BullMQ auto-generate a unique one so each
batch stacks as an independent queue entry. Initial dispatches keep
the deterministic jobId so re-triggering an install (UI re-click,
sync rescan) remains idempotent. The existing 'job already exists'
branch is now gated on !isContinuation, since by construction
continuation batches will never hit dedupe.
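A minimal sketch of the jobId selection described above, under stated assumptions: `buildJobOptions`, `deterministicJobId`, `shouldTreatAsDuplicate`, and the `EmbedJobParams` shape are hypothetical helpers written for illustration, not the actual code in EmbedFileJob. Only the sha256-prefix id, the `batchOffset > 0` test, and the `!isContinuation` gate come from this message.

```typescript
import { createHash } from 'node:crypto'

// Hypothetical parameter shape; the real job payload may carry more fields.
interface EmbedJobParams {
  filePath: string
  batchOffset?: number
}

// sha256(filePath).slice(0,16), as described in the commit message.
function deterministicJobId(filePath: string): string {
  return createHash('sha256').update(filePath).digest('hex').slice(0, 16)
}

function isContinuation(params: EmbedJobParams): boolean {
  return (params.batchOffset ?? 0) > 0
}

// Initial dispatches get the deterministic id (idempotent re-triggering).
// Continuation batches get no jobId, so the queue can generate a unique one.
function buildJobOptions(params: EmbedJobParams): { jobId?: string } {
  const options: { jobId?: string } = {}
  if (!isContinuation(params)) {
    options.jobId = deterministicJobId(params.filePath)
  }
  return options
}

// The 'job already exists' branch only applies to initial dispatches.
function shouldTreatAsDuplicate(error: Error, params: EmbedJobParams): boolean {
  return !isContinuation(params) && error.message.includes('job already exists')
}
```

Passing the returned options object through to the queue's add call would leave `jobId` unset for continuations, which is what lets each batch stack as its own queue entry.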
Validated on NOMAD8 (RX 6800 / Threadripper 3960X, rc.3 + this patch):
devdocs_en_python (~1,500 chunks across multiple batches) correctly
paginates end-to-end. admin.log shows the expected sequence of
"Dispatched embedding job for file: X (continuation @ offset N)"
followed by "Starting embedding process for: X (batch offset: N)"
for each batch.
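The log check above can be automated with a small helper. This is a sketch that assumes admin.log lines contain the two message formats quoted above verbatim; `findUnstartedContinuations` is a hypothetical name, and any prefix such as a timestamp or log level is ignored by the patterns.

```typescript
// Patterns built from the two log messages quoted in this commit message.
const DISPATCHED = /Dispatched embedding job for file: (.+) \(continuation @ offset (\d+)\)/
const STARTED = /Starting embedding process for: (.+) \(batch offset: (\d+)\)/

// Returns "file@offset" keys for continuations that were dispatched
// but never followed by a matching "Starting embedding process" line.
function findUnstartedContinuations(logLines: string[]): string[] {
  const dispatched = new Set<string>()
  const started = new Set<string>()
  for (const line of logLines) {
    const d = DISPATCHED.exec(line)
    if (d) dispatched.add(`${d[1]}@${d[2]}`)
    const s = STARTED.exec(line)
    if (s) started.add(`${s[1]}@${s[2]}`)
  }
  return Array.from(dispatched).filter((key) => !started.has(key))
}
```

An empty result means every dispatched continuation was picked up by a worker. A non-empty result points at the offsets where pagination stalled.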
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| Name |
|---|
| app |
| bin |
| commands |
| config |
| constants |
| database |
| docs |
| inertia |
| providers |
| public |
| resources |
| start |
| tests |
| types |
| util |
| views |
| .editorconfig |
| .env.example |
| ace.js |
| adonisrc.ts |
| eslint.config.js |
| package-lock.json |
| package.json |
| tailwind.config.ts |
| tsconfig.json |
| vite.config.ts |