project-nomad

mirror of https://github.com/Crosstalk-Solutions/project-nomad.git synced 2026-03-28 19:49:25 +01:00


Henry Estela 8b54310746 Improve context window size estimation	2026-03-25 17:18:06 -07:00

Fixes an issue seen with some models in LM Studio, resulting in: "The number of tokens to keep from the initial prompt is greater than the context length (n_keep: 4705>= n_ctx: 4096)"

Fixed char/token estimate: the old value was too optimistic, causing the cap to allow more text than the budget allowed in actual tokens.

After RAG injection, estimates the system prompt token count. If it exceeds ~3000 tokens, requests the next standard context size (8192, 16384, 32768, or 65536) large enough to fit the prompt plus a 2048-token buffer for the conversation and response.

For Ollama, num_ctx is honoured per-request and will load the model with that context window. For LM Studio, the parameter is silently ignored, but the tighter char estimate will also reduce how much RAG text gets stuffed in, so it's less likely to overflow.
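
The sizing rule described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function and constant names are hypothetical, the chars-per-token ratio of 3 is an assumed placeholder (the commit does not state the value it uses), and falling back to the largest size when nothing fits is also an assumption. Only the ~3000-token threshold, the 2048-token buffer, and the list of standard sizes come from the message.

```typescript
// Values taken from the commit message.
const STANDARD_CONTEXT_SIZES: number[] = [8192, 16384, 32768, 65536];
const PROMPT_THRESHOLD_TOKENS = 3000;
const CONVERSATION_BUFFER_TOKENS = 2048;

// ASSUMPTION: the real ratio is not given in the commit message.
const ASSUMED_CHARS_PER_TOKEN = 3;

// Rough token estimate from character count (hypothetical helper).
function estimateTokens(
  text: string,
  charsPerToken: number = ASSUMED_CHARS_PER_TOKEN
): number {
  return Math.ceil(text.length / charsPerToken);
}

// Returns the context size to request, or undefined when the prompt is
// at or below the threshold and the default window should be left alone.
function pickContextSize(promptTokens: number): number | undefined {
  if (promptTokens <= PROMPT_THRESHOLD_TOKENS) {
    return undefined;
  }
  const needed = promptTokens + CONVERSATION_BUFFER_TOKENS;
  const fit = STANDARD_CONTEXT_SIZES.find((size) => size >= needed);
  // ASSUMPTION: cap at the largest standard size if none is big enough.
  return fit ?? STANDARD_CONTEXT_SIZES[STANDARD_CONTEXT_SIZES.length - 1];
}
```

With the 4705-token prompt from the error message, `pickContextSize(4705)` needs 4705 + 2048 = 6753 tokens and so selects 8192. For Ollama, the chosen value would be sent as the `num_ctx` option on the request; per the commit message, LM Studio ignores it.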
controllers	Improve context window size estimation	2026-03-25 17:18:06 -07:00
exceptions	fix(Docs): documentation renderer fixes	2025-12-23 16:00:33 -08:00
jobs	fix(ai-chat): ingestion of documents with openai and add cleanup button	2026-03-25 17:18:05 -07:00
middleware	feat: background job overhaul with bullmq	2025-12-06 23:59:01 -08:00
models	feat: support for updating services	2026-03-11 14:08:09 -07:00
services	Improve context window size estimation	2026-03-25 17:18:06 -07:00
utils	fix(disk): correct storage display by fixing device matching and dedup mount entries	2026-03-20 11:46:10 -07:00
validators	feat(AI Assistant): improved state management and performance	2026-03-11 14:08:09 -07:00