project-nomad/admin/app/utils
brian dc7abfd41a feat(rag): add EPUB file support for Knowledge Base uploads
EPUBs are ZIP archives containing structured XHTML content with semantic
chapter/section markup, making them well-suited for RAG text extraction
and chunking.

Changes:
- Add 'epub' to determineFileType() in utils/fs.ts
- Add processEPUBFile() method in rag_service.ts that:
  - Reads container.xml to locate the OPF manifest
  - Parses the OPF spine for correct reading order
  - Extracts text from each XHTML content document using cheerio
  - Falls back to all manifest items if no spine is found
- Wire epub case into processAndEmbedFile() switch
- Add jszip dependency for ZIP archive reading (cheerio already present)
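
The lookup order described for processEPUBFile() (container.xml, then the OPF spine, then a manifest fallback) can be sketched roughly as below. This is an illustrative, dependency-free sketch and not the code from rag_service.ts: it swaps in regular expressions for cheerio and takes XML strings instead of reading a ZIP with jszip. The helper names (parseAttrs, findTags, findOpfPath, resolveReadingOrder) are hypothetical, and limiting the fallback to HTML/XHTML manifest items is an assumption.

```typescript
// Hypothetical helpers: the real processEPUBFile() is not shown in this commit summary.

/** Read every attribute of a single XML start tag into a plain object. */
function parseAttrs(tag: string): Record<string, string> {
  const attrs: Record<string, string> = {};
  const re = /([\w:.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(tag)) !== null) {
    attrs[m[1]] = m[2] ?? m[3] ?? "";
  }
  return attrs;
}

/** Find all start tags with the given local name, ignoring any namespace prefix. */
function findTags(xml: string, name: string): Record<string, string>[] {
  const re = new RegExp(`<(?:[\\w.-]+:)?${name}\\b[^>]*>`, "g");
  return (xml.match(re) ?? []).map(parseAttrs);
}

/** Step 1: locate the OPF package document via META-INF/container.xml. */
function findOpfPath(containerXml: string): string | null {
  const rootfile = findTags(containerXml, "rootfile")[0];
  return rootfile?.["full-path"] ?? null;
}

/** Steps 2-4: resolve content document paths in reading order. */
function resolveReadingOrder(opfXml: string, opfPath: string): string[] {
  // Manifest hrefs are relative to the directory that holds the OPF file.
  const slash = opfPath.lastIndexOf("/");
  const baseDir = slash >= 0 ? opfPath.slice(0, slash + 1) : "";

  const items = findTags(opfXml, "item").filter((i) => i.id && i.href);
  const hrefById = new Map(items.map((i) => [i.id, i.href] as [string, string]));

  // The spine lists manifest ids in reading order.
  const spineHrefs = findTags(opfXml, "itemref")
    .map((ref) => hrefById.get(ref.idref ?? ""))
    .filter((h): h is string => h !== undefined);

  // Fallback when no spine is found: manifest order.
  // Assumption: only HTML/XHTML items are worth extracting text from.
  const fallback = items
    .filter((i) => /html/.test(i["media-type"] ?? ""))
    .map((i) => i.href);

  const hrefs = spineHrefs.length > 0 ? spineHrefs : fallback;
  return hrefs.map((h) => baseDir + h);
}
```

In a real implementation, each returned path would then be read from the archive and its text extracted. Regex-based tag matching is only adequate for a sketch; it does not handle comments, CDATA, or URL-encoded hrefs.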

Closes #253-adjacent (epub is a common format for Project Gutenberg
content and technical reference books)
2026-03-13 13:25:01 -04:00
downloads.ts fix(Kiwix): initial download and setup 2025-12-07 16:04:41 -08:00
fs.ts feat(rag): add EPUB file support for Knowledge Base uploads 2026-03-13 13:25:01 -04:00
misc.ts feat(AI): chat suggestions and assistant settings 2026-02-01 07:24:21 +00:00
version.ts feat: support for updating services 2026-03-11 14:08:09 -07:00