12 Commits

Author SHA1 Message Date
brian
dc7abfd41a feat(rag): add EPUB file support for Knowledge Base uploads
EPUBs are ZIP archives containing structured XHTML content with semantic
chapter/section markup, making them well-suited for RAG text extraction
and chunking.

Changes:
- Add 'epub' to determineFileType() in utils/fs.ts
- Add processEPUBFile() method in rag_service.ts that:
  - Reads container.xml to locate the OPF manifest
  - Parses the OPF spine for correct reading order
  - Extracts text from each XHTML content document using cheerio
  - Falls back to all manifest items if no spine is found
- Wire epub case into processAndEmbedFile() switch
- Add jszip dependency for ZIP archive reading (cheerio already present)
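
The container.xml -> OPF -> spine lookup described above can be sketched
roughly as follows. This is a minimal, dependency-free illustration and not
the code from this commit: the helper names (attr, tags, findOpfPath,
readingOrder) are hypothetical, the XML is matched with simple regular
expressions rather than cheerio, and the inputs are assumed to be strings
already read out of the ZIP archive (the commit uses jszip for that step).
Restricting the fallback to XHTML manifest items is also an assumption.

```typescript
// Hypothetical helpers for illustration only; the commit itself uses jszip + cheerio.

/** Read one attribute value from the source text of a single tag. */
function attr(tag: string, name: string): string | undefined {
  const m = new RegExp(`(?:^|\\s)${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)')`).exec(tag);
  return m ? (m[1] || m[2] || "") : undefined;
}

/** Find every opening tag called `name`, with or without a namespace prefix. */
function tags(xml: string, name: string): string[] {
  return xml.match(new RegExp(`<(?:[\\w.-]+:)?${name}\\b[^>]*>`, "g")) || [];
}

/** Locate the OPF package document via META-INF/container.xml. */
function findOpfPath(containerXml: string): string | undefined {
  for (const tag of tags(containerXml, "rootfile")) {
    const path = attr(tag, "full-path");
    if (path) return path;
  }
  return undefined;
}

/**
 * Return archive paths of the XHTML content documents in reading order.
 * Uses the spine when it resolves to at least one item; otherwise falls
 * back to manifest order (assumption: only XHTML/HTML items are kept).
 */
function readingOrder(opfXml: string, opfPath: string): string[] {
  // Manifest hrefs are relative to the directory that holds the OPF file.
  const slash = opfPath.lastIndexOf("/");
  const baseDir = slash >= 0 ? opfPath.slice(0, slash + 1) : "";

  // Build the manifest lookup: item id -> archive path.
  const hrefById = new Map<string, string>();
  for (const tag of tags(opfXml, "item")) {
    const id = attr(tag, "id");
    const href = attr(tag, "href");
    const mediaType = attr(tag, "media-type") || "";
    if (id && href && /x?html/.test(mediaType)) {
      hrefById.set(id, baseDir + href);
    }
  }

  // Walk the spine: each itemref points at a manifest item by id.
  const ordered: string[] = [];
  for (const tag of tags(opfXml, "itemref")) {
    const href = hrefById.get(attr(tag, "idref") || "");
    if (href) ordered.push(href);
  }

  return ordered.length > 0 ? ordered : Array.from(hrefById.values());
}
```

Each returned path would then be read from the archive and passed to the
text-extraction step. A real implementation would also need to URL-decode
hrefs and handle malformed XML, which this sketch ignores.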

Closes #253-adjacent (EPUB is a common format for Project Gutenberg
content and technical reference books)
2026-03-13 13:25:01 -04:00
Jake Turner
96e5027055 feat(AI Assistant): performance improvements and smarter RAG context usage 2026-03-11 14:08:09 -07:00
Jake Turner
dfa896e86b feat(RAG): allow deletion of files from KB 2026-03-04 20:05:14 -08:00
Jake Turner
99b96c3df7 feat(RAG): display embedding queue and improve progress tracking 2026-03-04 20:05:14 -08:00
Jake Turner
6817e2e47e fix: improve type-safety for KVStore values 2026-03-03 20:51:38 -08:00
Jake Turner
4747863702 feat(AI Assistant): allow manual scan and resync KB 2026-02-09 15:16:18 -08:00
Jake Turner
8726700a0a feat: zim content embedding 2026-02-08 13:20:10 -08:00
Jake Turner
ab07551719 feat: auto add NOMAD docs to KB on AI install 2026-02-03 23:15:54 -08:00
Jake Turner
d1f40663d3 feat(RAG): initial beta with preprocessing, embedding, semantic retrieval, and ctx passage 2026-02-01 23:59:21 +00:00
Jake Turner
31c671bdb5 fix: service name defs and ollama ui location 2026-02-01 05:46:23 +00:00
Jake Turner
243f749090 feat: [wip] native AI chat interface 2026-01-31 20:39:49 -08:00
Jake Turner
50174d2edb feat(RAG): [wip] RAG capabilities 2026-01-31 20:39:49 -08:00