project-nomad/admin/app
brian dc7abfd41a feat(rag): add EPUB file support for Knowledge Base uploads
EPUBs are ZIP archives containing structured XHTML content with semantic
chapter/section markup, making them well-suited for RAG text extraction
and chunking.

Changes:
- Add 'epub' to determineFileType() in utils/fs.ts
- Add processEPUBFile() method in rag_service.ts that:
  - Reads container.xml to locate the OPF manifest
  - Parses the OPF spine for correct reading order
  - Extracts text from each XHTML content document using cheerio
  - Falls back to all manifest items if no spine is found
- Wire epub case into processAndEmbedFile() switch
- Add jszip dependency for ZIP archive reading (cheerio already present)
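
The spine-ordering and fallback steps listed above can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the code in rag_service.ts: the `ManifestItem` shape, the names `resolveHref` and `resolveReadingOrder`, and the media-type filter are all hypothetical. ZIP reading (jszip) and XML parsing (cheerio) are left out so the sketch has no dependencies; it assumes the manifest items and spine idrefs have already been parsed out of the OPF file.

```typescript
// Hypothetical shape for one <item> in the OPF manifest (not taken from the commit).
interface ManifestItem {
  id: string;        // manifest id, referenced by <itemref idref="..."> in the spine
  href: string;      // path relative to the directory holding the OPF file
  mediaType: string; // e.g. "application/xhtml+xml"
}

// Assumption: only these media types carry extractable text.
const TEXT_MEDIA_TYPES = new Set<string>(["application/xhtml+xml", "text/html"]);

// Resolve a manifest href against the directory that contains the OPF file.
// Simplification: "../" segments are not normalized here.
function resolveHref(opfPath: string, href: string): string {
  const slash = opfPath.lastIndexOf("/");
  const baseDir = slash >= 0 ? opfPath.slice(0, slash + 1) : "";
  return baseDir + href;
}

// Return archive paths of content documents in reading order.
// If the spine is empty or none of its idrefs match a manifest item,
// fall back to every manifest item in manifest order.
function resolveReadingOrder(
  opfPath: string,
  manifest: ManifestItem[],
  spineIdrefs: string[]
): string[] {
  const byId = new Map<string, ManifestItem>(
    manifest.map((item): [string, ManifestItem] => [item.id, item])
  );

  // Keep spine order; skip idrefs that point at nothing in the manifest.
  const fromSpine = spineIdrefs
    .map((idref) => byId.get(idref))
    .filter((item): item is ManifestItem => item !== undefined);

  const source = fromSpine.length > 0 ? fromSpine : manifest;

  return source
    .filter((item) => TEXT_MEDIA_TYPES.has(item.mediaType))
    .map((item) => resolveHref(opfPath, item.href));
}
```

For an OPF at `OEBPS/content.opf` whose manifest lists `ch2` before `ch1`, a spine of `["ch1", "ch2"]` yields the chapter-1 path first, while an empty spine yields the documents in manifest order. Filtering by media type in the fallback path is a design choice of this sketch, so that stylesheets and images in the manifest are not passed to text extraction.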

Closes #253-adjacent (EPUB is a common format for Project Gutenberg
content and technical reference books)
2026-03-13 13:25:01 -04:00
controllers feat(AI Assistant): performance improvements and smarter RAG context usage 2026-03-11 14:08:09 -07:00
exceptions fix(Docs): documentation renderer fixes 2025-12-23 16:00:33 -08:00
jobs feat: support for updating services 2026-03-11 14:08:09 -07:00
middleware feat: background job overhaul with bullmq 2025-12-06 23:59:01 -08:00
models feat: support for updating services 2026-03-11 14:08:09 -07:00
services feat(rag): add EPUB file support for Knowledge Base uploads 2026-03-13 13:25:01 -04:00
utils feat(rag): add EPUB file support for Knowledge Base uploads 2026-03-13 13:25:01 -04:00
validators feat(AI Assistant): improved state management and performance 2026-03-11 14:08:09 -07:00