project-nomad

mirror of https://github.com/Crosstalk-Solutions/project-nomad.git synced 2026-05-28 15:16:49 +02:00

Chris Sherwood ffa70a54bc feat(chat): confirm-on-switch + one-chat-model-at-a-time enforcement

Surfaces NOMAD's previously-silent model-stacking behavior and enforces a "one chat model in VRAM at a time" invariant (the embedding model is always exempt). Addresses Chris's NOMAD3 testing observation that switching the dropdown in the chat header was invisibly slow on low-VRAM hardware because the prior model was never unloaded: Ollama would either evict it under memory pressure or load the new one on CPU after the runner choked.

Three integration points all funnel through one new helper:

- User changes the model dropdown in an active chat session → confirm modal "Switch to {newModel}? Switching to {newModel} will start a new chat. Your current conversation stays available in the sidebar." On confirm, fire `keep_alive: 0` against the previous chat model, clear active session, set the new selection. Cancel snaps the visible dropdown back to the previous value (no popup state leaks into `selectedModel`).
- User clicks a session in the sidebar → no popup (system-initiated). Restore the session's stored model into the dropdown and fire `unloadChatModels(targetModel)` so anything that isn't the target gets the unload hint.
- Chat page first mount → page-load normalization. Anything stacked from a prior session gets the unload hint with the current selected model as the target-to-preserve. Guarded by a ref so it only fires once per page lifetime; gated on `selectedModel` being populated.

Backend surface is a single new helper and a single new route:

- `OllamaService.unloadAllChatModelsExcept(targetModel: string | null)` → queries `/api/ps`, filters out (a) the embedding model name (hardcoded `nomic-embed-text:v1.5` to avoid the RagService circular import) and (b) `targetModel`, fires `POST /api/generate` with empty prompt + `keep_alive: 0` in parallel against everything else. Returns the names that were hinted. Best-effort: network or Ollama errors are logged and swallowed so callers don't fail on housekeeping.
- `POST /api/ollama/unload-chat-models` → thin wrapper validating `{ targetModel?: string | null }`.

Why `keep_alive: 0` is safe against in-flight inference: per Ollama's scheduler semantics, the hint sets the post-completion eviction timer to zero; the runner is not terminated. If Session A is mid-response on gemma when Session B fires the unload, gemma stays resident until A's request completes, then evicts. The user-visible worst case is the race where A's longer-running request re-extends the timer back to the default and the unload is no-op'd; the next transition (or page reload) gets another chance, and Ollama's own LRU catches up under memory pressure regardless. Robust in-flight tracking deferred to a follow-up if we see stale-state in the wild.

Base `rc`: v1.40.0 will inherit everything from rc.6 via the backmerge. Frontend tests deferred to a follow-up PR; existing inertia tsconfig errors are pre-existing and unrelated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-20 10:16:00 -07:00
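
The backend helper described in the commit message above could look roughly like the following. This is a minimal sketch reconstructed from the description only, not the code in the repository: the names `selectModelsToUnload`, `FetchLike`, the `baseUrl` default, and the injectable `fetchImpl` parameter are assumptions made for illustration. The `/api/ps` and `/api/generate` endpoints, the `keep_alive: 0` hint, and the hardcoded embedding model name come from the commit message.

```typescript
// Hypothetical sketch of the "unload everything except the target" helper.
// Only the endpoints, keep_alive: 0, and the embedding model name are taken
// from the commit message; every other name here is an assumption.
const EMBEDDING_MODEL = 'nomic-embed-text:v1.5'

type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string }
) => Promise<{ ok: boolean; json(): Promise<any> }>

// Pure filtering step: drop the embedding model and the model to preserve.
function selectModelsToUnload(loaded: string[], targetModel: string | null): string[] {
  return loaded.filter((name) => name !== EMBEDDING_MODEL && name !== targetModel)
}

async function unloadAllChatModelsExcept(
  targetModel: string | null,
  baseUrl: string = 'http://localhost:11434', // assumed default Ollama address
  fetchImpl: FetchLike = (globalThis as any).fetch
): Promise<string[]> {
  try {
    // Ask Ollama which models are currently resident.
    const res = await fetchImpl(`${baseUrl}/api/ps`)
    if (!res.ok) return []
    const body = await res.json()
    const loaded: string[] = (body.models ?? []).map((m: { name: string }) => m.name)

    const toUnload = selectModelsToUnload(loaded, targetModel)

    // Send the unload hint to each remaining model in parallel.
    // allSettled keeps one failed request from rejecting the whole batch.
    await Promise.allSettled(
      toUnload.map((model) =>
        fetchImpl(`${baseUrl}/api/generate`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ model, prompt: '', keep_alive: 0 }),
        })
      )
    )
    return toUnload
  } catch (err) {
    // Best-effort housekeeping: log and swallow so callers never fail.
    console.warn('unloadAllChatModelsExcept: unload hint failed', err)
    return []
  }
}
```

Keeping the filter in a separate pure function makes the "embedding model is always exempt" rule easy to unit-test without a running Ollama instance, and passing `fetchImpl` as a parameter lets a test substitute a stub for the network calls.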
app	feat(chat): confirm-on-switch + one-chat-model-at-a-time enforcement	2026-05-20 10:16:00 -07:00
bin	feat: curated content system overhaul	2026-02-11 15:44:46 -08:00
commands	feat(Maps): regional map downloads via go-pmtiles extract (#780)	2026-05-20 10:16:00 -07:00
config	fix: cache docker list requests, aiAssistantName fetching, and ensure inertia used properly	2026-04-03 14:26:50 -07:00
constants	feat(KB): Always/Manual ingest policy toggle (RFC #883 §1/§4) (#894)	2026-05-20 10:16:00 -07:00
database	fix(KB): align chunks_per_mb column type with TS contract	2026-05-20 10:16:00 -07:00
docs	docs: update release notes	2026-05-20 10:16:00 -07:00
inertia	feat(chat): confirm-on-switch + one-chat-model-at-a-time enforcement	2026-05-20 10:16:00 -07:00
providers	feat(GPU): auto-remediate nomad_ollama passthrough loss on admin boot (#755)	2026-05-20 10:16:00 -07:00
public	feat: switch all PNG images to WEBP (#575)	2026-04-03 14:26:50 -07:00
resources	feat(Maps): regional map downloads via go-pmtiles extract (#780)	2026-05-20 10:16:00 -07:00
start	feat(chat): confirm-on-switch + one-chat-model-at-a-time enforcement	2026-05-20 10:16:00 -07:00
tests	feat(KB): per-file ingest action + state indicator on Stored Files (RFC #883 §5)	2026-05-20 10:16:00 -07:00
types	feat(KB): per-file ingest action + state indicator on Stored Files (RFC #883 §5)	2026-05-20 10:16:00 -07:00
util	feat: display model download progress	2026-02-06 16:22:23 -08:00
views	feat: initial commit	2025-06-29 15:51:08 -07:00
.editorconfig	feat: initial commit	2025-06-29 15:51:08 -07:00
.env.example	feat: Add Windows Docker Desktop support for local development	2026-01-19 10:29:24 -08:00
ace.js	feat: initial commit	2025-06-29 15:51:08 -07:00
adonisrc.ts	feat(GPU): auto-remediate nomad_ollama passthrough loss on admin boot (#755)	2026-05-20 10:16:00 -07:00
eslint.config.js	feat: openwebui+ollama and zim management	2025-07-09 09:08:21 -07:00
package-lock.json	build(deps): bump picomatch in /admin	2026-05-20 10:16:00 -07:00
package.json	chore(deps): pin all deps to exact versions	2026-05-20 10:16:00 -07:00
tailwind.config.ts	feat: initial commit	2025-06-29 15:51:08 -07:00
tsconfig.json	feat: initial commit	2025-06-29 15:51:08 -07:00
vite.config.ts	fix(Maps): ensure proper parsing of hostnames (#640)	2026-04-03 14:26:50 -07:00