project-nomad/admin/app/services
Henry Estela 8b54310746
Improve context window size estimation
Fixes an issue seen with some models in LM Studio, which failed with:
"The number of tokens to keep from the initial prompt is greater than the context length (n_keep: 4705>= n_ctx: 4096)"

Fixed the chars-per-token estimate. The old value was too optimistic,
so the character cap let through more text than the token budget actually allowed.
After RAG injection, the service now estimates the system prompt's token count.
If the estimate exceeds ~3000 tokens, it requests the smallest standard context size
(8192, 16384, 32768, or 65536) that fits the prompt plus a 2048-token buffer for the
conversation and response.
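
The selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in ollama_service.ts: the function names, the chars-per-token ratio of 3, and the fallback to the largest size when nothing fits are all assumptions. Only the threshold, the buffer, and the list of sizes come from the commit message.

```typescript
// Values taken from the commit message.
const STANDARD_CONTEXT_SIZES = [8192, 16384, 32768, 65536];
const PROMPT_THRESHOLD_TOKENS = 3000;
const RESPONSE_BUFFER_TOKENS = 2048;

// Assumption: the commit does not state the corrected ratio; 3 is a placeholder.
const ASSUMED_CHARS_PER_TOKEN = 3;

// Hypothetical helper: rough token estimate from character count.
function estimateTokens(
  text: string,
  charsPerToken: number = ASSUMED_CHARS_PER_TOKEN
): number {
  return Math.ceil(text.length / charsPerToken);
}

// Hypothetical helper: returns undefined when the default context is enough,
// otherwise the smallest standard size that fits prompt + buffer.
function pickContextSize(systemPrompt: string): number | undefined {
  const promptTokens = estimateTokens(systemPrompt);
  if (promptTokens <= PROMPT_THRESHOLD_TOKENS) {
    return undefined;
  }
  const needed = promptTokens + RESPONSE_BUFFER_TOKENS;
  const fit = STANDARD_CONTEXT_SIZES.find((size) => size >= needed);
  // Assumption: fall back to the largest size if even that is too small.
  return fit ?? STANDARD_CONTEXT_SIZES[STANDARD_CONTEXT_SIZES.length - 1];
}
```

A lower chars-per-token ratio makes the estimate more conservative, so the same character cap maps to a larger token count and less RAG text is admitted.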

For Ollama, num_ctx is honoured per-request, and the model is loaded with that context
window. For LM Studio, the parameter is silently ignored, but the tighter char
estimate also reduces how much RAG text gets injected, so the prompt is less likely to
overflow.
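
As a hedged illustration of the per-request behaviour, the sketch below builds a chat request body that carries num_ctx under `options`, which is where Ollama's chat API accepts it. The type and function names are hypothetical and do not reflect how ollama_service.ts is actually structured.

```typescript
// Hypothetical types for illustration only.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface OllamaChatRequest {
  model: string;
  messages: ChatMessage[];
  stream: boolean;
  options?: { num_ctx?: number };
}

// Hypothetical helper: only sets options.num_ctx when a size was chosen,
// so requests with small prompts keep the backend's default context.
function buildChatRequest(
  model: string,
  messages: ChatMessage[],
  numCtx?: number
): OllamaChatRequest {
  const request: OllamaChatRequest = { model, messages, stream: false };
  if (numCtx !== undefined) {
    request.options = { num_ctx: numCtx };
  }
  return request;
}
```

A backend that ignores the field, as the commit describes for LM Studio, would simply serve the request with whatever context length the model was loaded with.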
2026-03-25 17:18:06 -07:00
benchmark_service.ts fix: benchmark scores clamped to 0% for below-average hardware 2026-03-25 16:30:35 -07:00
chat_service.ts feat(AI Assistant): improved state management and performance 2026-03-11 14:08:09 -07:00
collection_manifest_service.ts fix: update default branch name 2026-03-01 16:08:46 -08:00
collection_update_service.ts feat: curated content update checking 2026-02-11 21:49:46 -08:00
container_registry_service.ts feat: support for updating services 2026-03-11 14:08:09 -07:00
docker_service.ts feat(ai-chat): Add ability to use a remote ollama instance on LAN 2026-03-25 17:18:04 -07:00
docs_service.ts fix(security): path traversal and SSRF protections from pre-launch audit 2026-03-11 14:08:09 -07:00
download_service.ts fix(downloads): allow users to dismiss failed downloads 2026-03-20 11:46:10 -07:00
map_service.ts fix(maps): respect request protocol for reverse proxy HTTPS support 2026-03-20 11:46:10 -07:00
ollama_service.ts Improve context window size estimation 2026-03-25 17:18:06 -07:00
queue_service.ts feat: background job overhaul with bullmq 2025-12-06 23:59:01 -08:00
rag_service.ts fix(ai-chat): ingestion of documents with openai and add cleanup button 2026-03-25 17:18:05 -07:00
system_service.ts fix(disk): correct storage display by fixing device matching and dedup mount entries 2026-03-20 11:46:10 -07:00
system_update_service.ts fix(System): ensure nomad container image tag resolves correctly 2026-03-11 14:08:09 -07:00
zim_extraction_service.ts feat: zim content embedding 2026-02-08 13:20:10 -08:00
zim_service.ts fix(security): path traversal and SSRF protections from pre-launch audit 2026-03-11 14:08:09 -07:00