Commit Graph

15 Commits

Author SHA1 Message Date
Henry Estela
8b54310746
Improve context window size estimation
Fixes an issue seen with some models in LM Studio resulting in:
"The number of tokens to keep from the initial prompt is greater than the context length (n_keep: 4705>= n_ctx: 4096)"

Fixed the char/token estimate; the old value was too optimistic,
causing the cap to allow more text than the budget allowed in actual tokens.
After RAG injection, the system prompt token count is estimated.
If it exceeds ~3000 tokens, the next standard context size (8192, 16384, 32768, or 65536) is requested,
large enough to fit the prompt plus a 2048-token buffer for the conversation and response.

For Ollama, num_ctx is honoured per-request and will load the model with that context
window. For LM Studio, the parameter is silently ignored — but the tighter char
estimate will also reduce how much RAG text gets stuffed in, so it's less likely to
overflow.
2026-03-25 17:18:06 -07:00
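The sizing logic described in the commit above can be sketched roughly as follows. The chars-per-token ratio, constants, and function names here are illustrative assumptions, not the project's actual values:

```typescript
// Illustrative sketch of the context-window sizing described in the commit.
const CHARS_PER_TOKEN = 3.5;   // assumed conservative ratio; too-optimistic values overflow
const RESPONSE_BUFFER = 2048;  // tokens reserved for the conversation and response
const STANDARD_SIZES = [8192, 16384, 32768, 65536];
const DEFAULT_CTX = 4096;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// After RAG injection, pick the smallest standard context window large
// enough for the system prompt plus the response buffer; small prompts
// keep the default window.
function chooseContextSize(systemPrompt: string): number {
  const promptTokens = estimateTokens(systemPrompt);
  if (promptTokens <= 3000) return DEFAULT_CTX;
  const needed = promptTokens + RESPONSE_BUFFER;
  for (const size of STANDARD_SIZES) {
    if (size >= needed) return size;
  }
  return STANDARD_SIZES[STANDARD_SIZES.length - 1];
}
```

The chosen size would then be sent as `num_ctx` per-request, which Ollama honours but LM Studio ignores, as the commit notes.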
Henry Estela
c8ce28a84f
fix(ai-chat): ingestion of documents with openai and add cleanup button
Added a "Cleanup Failed" button to the Processing Queue in the Knowledge
Base, since documents that fail to process tend to get stuck and then
can't be cleared.

Fixed the ingestion of documents for OpenAI servers.

Updated some text in the chat and chat settings, since the user will need
to manually download models when using a non-Ollama remote GPU server.
2026-03-25 17:18:05 -07:00
Henry Estela
f98664921a
feat(ai-chat): Add support for OpenAI API
Existing Ollama API support still functions as before. The OpenAI and
Ollama APIs mostly have the same features; however, model file size is not
supported by OpenAI's API, so when a user chooses an OpenAI server the
models will just show up as the model name without the size.

`npm install openai` triggered some updates in admin/package-lock.json
such as adding many instances of "dev: true".

This further enhances the user's ability to run the LLM on a different
host.
2026-03-25 17:18:05 -07:00
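The name-without-size fallback mentioned above could look something like this hypothetical helper. Ollama's model listing reports a per-model file size while OpenAI's does not; the function name and label format are assumptions for illustration:

```typescript
// Hypothetical display helper: show "name (size)" when the backend reports
// a file size (Ollama), and degrade to the bare model name when it does not
// (OpenAI-compatible servers).
function formatModelLabel(name: string, sizeBytes?: number): string {
  if (sizeBytes === undefined) return name; // OpenAI: no size available
  const gb = sizeBytes / 1024 ** 3;
  return `${name} (${gb.toFixed(1)} GB)`;
}
```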
Chris Sherwood
78c0b1d24d fix(ai): surface model download errors and prevent silent retry loops
Model downloads that fail (e.g., when Ollama is too old for a model)
were silently retrying 40 times with no UI feedback. Now errors are
broadcast via SSE and shown in the Active Model Downloads section.
Version mismatch errors use UnrecoverableError to fail immediately
instead of retrying. Stale failed jobs are cleared on retry so users
aren't permanently blocked.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 16:30:35 -07:00
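The retry-vs-fail decision described above can be sketched as an error classifier. In BullMQ, throwing `UnrecoverableError` from a worker fails the job immediately with no further attempts; the local class below is a self-contained stand-in for that import, and the error-message pattern is an assumption, not the project's actual check:

```typescript
// Stand-in for bullmq's UnrecoverableError export, so this sketch runs alone.
class UnrecoverableError extends Error {}

// Version-mismatch errors (e.g. the model needs a newer Ollama) can never
// succeed on retry, so wrap them so the queue fails the job immediately.
// Everything else is returned unchanged and keeps normal retry behaviour.
function classifyDownloadError(err: Error): Error {
  if (/requires a newer version of ollama/i.test(err.message)) {
    return new UnrecoverableError(err.message);
  }
  return err;
}
```

The worker would throw the classified error, and the same failure would be broadcast over SSE so the Active Model Downloads UI can surface it instead of looping silently.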
Jake Turner
db69428193 fix(AI): allow force refresh of models list 2026-03-11 14:08:09 -07:00
Jake Turner
6874a2824f feat(Models): paginate available models endpoint 2026-03-03 20:51:38 -08:00
Jake Turner
98b65c421c feat(AI): thinking and response streaming 2026-02-18 21:22:53 -08:00
Jake Turner
d55ff7b466
feat: curated content update checking 2026-02-11 21:49:46 -08:00
Jake Turner
12286b9d34 feat: display model download progress 2026-02-06 16:22:23 -08:00
Jake Turner
a91c13867d fix: filter cloud models from API response 2026-02-04 17:05:20 -08:00
Jake Turner
d4cbc0c2d5 feat(AI): add fuzzy search to models list 2026-02-04 16:45:12 -08:00
Jake Turner
ab07551719 feat: auto add NOMAD docs to KB on AI install 2026-02-03 23:15:54 -08:00
Jake Turner
907982062f feat(Ollama): cleanup model download logic and improve progress tracking 2026-02-03 23:15:54 -08:00
Jake Turner
31c671bdb5 fix: service name defs and ollama ui location 2026-02-01 05:46:23 +00:00
Jake Turner
243f749090 feat: [wip] native AI chat interface 2026-01-31 20:39:49 -08:00