Mirror of https://github.com/Crosstalk-Solutions/project-nomad.git (synced 2026-06-04 02:26:52 +02:00)
The OpenAI-compatible /v1/embeddings fallback path can't pass `truncate:true` / `num_ctx:8192` to the model, so any chunk that exceeds the model's loaded context_length (often 2048 for nomic-embed-text:v1.5) returns a 400 BadRequestError and is silently dropped from Qdrant. Two CPU-only ingestion runs on NOMAD1 hit this on dense technical content (medlineplus, arduino.stackexchange) even after PR #763's num_ctx fix on the native path.

Pre-cap each input string at 4000 chars before either backend call. That's ~1000-2000 tokens depending on density, comfortably under the model's 2048 default. The chunker in RagService is sized for MAX_SAFE_TOKENS=1600 (3200 chars at its conservative 2 chars/token estimate), so well-formed inputs are never touched; this is purely a runtime safety net for the edge cases that slip through.

Also stop swallowing the original error in the catch. The bare `} catch {}` here has masked recurring "input length exceeds context length" failures for months (#369, #670, #881). Capture and warn-log the message so future investigations see why we fell back.

Same root cause as #369 and #670 which were closed without an actual fix to the fallback path.

| Name | | |
|---|---|---|
| benchmark_service.ts | ||
| chat_service.ts | ||
| collection_manifest_service.ts | ||
| collection_update_service.ts | ||
| container_registry_service.ts | ||
| countries_service.ts | ||
| docker_service.ts | ||
| docs_service.ts | ||
| download_service.ts | ||
| kiwix_library_service.ts | ||
| map_service.ts | ||
| ollama_service.ts | ||
| queue_service.ts | ||
| rag_service.ts | ||
| system_service.ts | ||
| system_update_service.ts | ||
| zim_extraction_service.ts | ||
| zim_service.ts | ||
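
The pre-cap and warn-logged fallback described in the commit message above could look roughly like the sketch below. It is not the code in `ollama_service.ts`: `MAX_EMBED_INPUT_CHARS`, `capEmbeddingInputs`, `embedWithFallback`, and the `EmbedFn` type are hypothetical names introduced for illustration, and the two backends are passed in as plain functions rather than real client calls. Only the 4000-character cap, the native-then-fallback order, and the logged catch come from the commit message.

```typescript
// Hypothetical constant name; the 4000-character value is taken from the commit message.
const MAX_EMBED_INPUT_CHARS = 4000;

// Truncate each input to at most maxChars UTF-16 code units.
// Inputs already under the cap are returned unchanged.
// Note: slice() counts code units, so a cut may land inside a surrogate pair.
function capEmbeddingInputs(
  inputs: string[],
  maxChars: number = MAX_EMBED_INPUT_CHARS
): string[] {
  return inputs.map((text) =>
    text.length > maxChars ? text.slice(0, maxChars) : text
  );
}

// Hypothetical shape for an embedding backend: texts in, one vector per text out.
type EmbedFn = (inputs: string[]) => Promise<number[][]>;

// Try the native backend first; on failure, log why and use the fallback.
// Both backends receive the capped inputs.
async function embedWithFallback(
  inputs: string[],
  nativeEmbed: EmbedFn,
  fallbackEmbed: EmbedFn,
  warn: (message: string) => void = console.warn
): Promise<number[][]> {
  const capped = capEmbeddingInputs(inputs);
  try {
    return await nativeEmbed(capped);
  } catch (error) {
    // Keep the original error message instead of swallowing it.
    const message = error instanceof Error ? error.message : String(error);
    warn(`Native embedding failed, falling back to /v1/embeddings: ${message}`);
    return fallbackEmbed(capped);
  }
}
```

Capping before the `try` means both backends receive the same bounded input, so the fallback cannot be handed a longer string than the native call was.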