project-nomad/admin/app/services
brian dc7abfd41a feat(rag): add EPUB file support for Knowledge Base uploads
EPUBs are ZIP archives containing structured XHTML content with semantic
chapter/section markup, making them well-suited for RAG text extraction
and chunking.

Changes:
- Add 'epub' to determineFileType() in utils/fs.ts
- Add processEPUBFile() method in rag_service.ts that:
  - Reads container.xml to locate the OPF manifest
  - Parses the OPF spine for correct reading order
  - Extracts text from each XHTML content document using cheerio
  - Falls back to all manifest items if no spine is found
- Wire epub case into processAndEmbedFile() switch
- Add jszip dependency for ZIP archive reading (cheerio already present)
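
The steps listed above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the code in rag_service.ts. The commit reads the archive with jszip and parses markup with cheerio. Here the XML is passed in as strings and parsed with small regexes so the sketch runs standalone. The helper names (`attr`, `findOpfPath`, `readingOrder`, `extractText`) are hypothetical.

```typescript
// Sketch only: regex parsing stands in for jszip + cheerio, and all
// function names below are hypothetical, not taken from rag_service.ts.

interface ManifestItem {
  id: string
  href: string
}

/** Read one attribute value out of a single tag string. */
function attr(tag: string, name: string): string | undefined {
  const re = new RegExp('(?:^|\\s)' + name + '\\s*=\\s*["\']([^"\']*)["\']')
  const m = re.exec(tag)
  return m ? m[1] : undefined
}

/** Locate the OPF package file declared in META-INF/container.xml. */
function findOpfPath(containerXml: string): string | undefined {
  const rootfile = /<rootfile\b[^>]*>/.exec(containerXml)
  return rootfile ? attr(rootfile[0], 'full-path') : undefined
}

/**
 * Return content document paths in reading order.
 * Uses the spine when present; otherwise falls back to every manifest item.
 */
function readingOrder(opfXml: string, opfPath: string): string[] {
  const items: ManifestItem[] = []
  const byId: Record<string, ManifestItem> = {}

  // Collect <item id="..." href="..."> entries from the manifest.
  const itemRe = /<item\b[^>]*>/g
  let m: RegExpExecArray | null
  while ((m = itemRe.exec(opfXml)) !== null) {
    const id = attr(m[0], 'id')
    const href = attr(m[0], 'href')
    if (id && href) {
      const item = { id, href }
      items.push(item)
      byId[id] = item
    }
  }

  // Resolve <itemref idref="..."> entries from the spine, in order.
  const ordered: ManifestItem[] = []
  const refRe = /<itemref\b[^>]*>/g
  while ((m = refRe.exec(opfXml)) !== null) {
    const idref = attr(m[0], 'idref')
    if (idref && byId[idref]) ordered.push(byId[idref])
  }

  const chosen = ordered.length > 0 ? ordered : items

  // hrefs are relative to the directory that contains the OPF file.
  const slash = opfPath.lastIndexOf('/')
  const baseDir = slash >= 0 ? opfPath.slice(0, slash + 1) : ''
  return chosen.map((i) => baseDir + i.href)
}

/** Crude tag stripper standing in for cheerio; entities are not decoded. */
function extractText(xhtml: string): string {
  return xhtml
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim()
}
```

In a real implementation the strings would come from the ZIP entries (container.xml, the OPF file, then each path returned by `readingOrder`), and an HTML parser would be a safer choice than regexes for malformed markup.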

Closes #253-adjacent (epub is a common format for Project Gutenberg
content and technical reference books)
2026-03-13 13:25:01 -04:00
benchmark_service.ts fix(Benchmark): improved error reporting and fix sysbench race condition 2026-02-11 22:09:31 -08:00
chat_service.ts feat(AI Assistant): improved state management and performance 2026-03-11 14:08:09 -07:00
collection_manifest_service.ts fix: update default branch name 2026-03-01 16:08:46 -08:00
collection_update_service.ts feat: curated content update checking 2026-02-11 21:49:46 -08:00
container_registry_service.ts feat: support for updating services 2026-03-11 14:08:09 -07:00
docker_service.ts feat: support for updating services 2026-03-11 14:08:09 -07:00
docs_service.ts fix(security): path traversal and SSRF protections from pre-launch audit 2026-03-11 14:08:09 -07:00
download_service.ts fix(Downloads): sort active downloads by progress descending 2026-02-08 13:14:04 -08:00
map_service.ts fix(security): path traversal and SSRF protections from pre-launch audit 2026-03-11 14:08:09 -07:00
ollama_service.ts fix(AI): allow force refresh of models list 2026-03-11 14:08:09 -07:00
queue_service.ts feat: background job overhaul with bullmq 2025-12-06 23:59:01 -08:00
rag_service.ts feat(rag): add EPUB file support for Knowledge Base uploads 2026-03-13 13:25:01 -04:00
system_service.ts feat: support for updating services 2026-03-11 14:08:09 -07:00
system_update_service.ts fix(System): ensure nomad container image tag resolves correctly 2026-03-11 14:08:09 -07:00
zim_extraction_service.ts feat: zim content embedding 2026-02-08 13:20:10 -08:00
zim_service.ts fix(security): path traversal and SSRF protections from pre-launch audit 2026-03-11 14:08:09 -07:00