mirror of https://github.com/Crosstalk-Solutions/project-nomad.git (synced 2026-05-25 22:05:07 +02:00)
* fix(stream): skip compression for Server-Sent Events

The global compression middleware (added in v1.31.0-rc.2) buffers response writes to determine encoding, which collapses per-token streaming into a single block delivered after generation completes. This broke the AI chat streaming UX from v1.31.0-rc.2 onward: text no longer appears progressively as the model generates it, only at the end.

Adds a filter to compression() that returns false when the response Content-Type is text/event-stream. Other responses still go through the default compression filter (compressible types are still compressed; e.g. text/html via Brotli).

Reproduced on NOMAD3 v1.31.1: before fix, all SSE chunks for a 1B model arrive within 10ms of each other after the model finishes. After fix, tokens arrive at ~150ms intervals as they're generated on a 12B model, with no Content-Encoding header on the SSE response. Verified on the same host that /home still returns Content-Encoding: br for HTML responses.

Closes #781. Reported and bisected by @toasterking (works in v1.31.0-rc.1, broken from v1.31.0-rc.2 onward).

* fix(stream): use any for filter params to match existing as-any pattern

The compression library types its filter as (req: Request, res: Response) expecting Express types, but AdonisJS passes raw IncomingMessage/ServerResponse, which is why the surrounding middleware uses `as any` casts at the call site. The IncomingMessage/ServerResponse types I added are runtime-correct but fail tsc against the library's declared types. Drop the typed import in favor of `any` parameters, which matches how the existing `compress(request.request as any, response.response as any, ...)` call resolves the same mismatch.
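The filter decision described in the commit message can be sketched as a standalone predicate. This is a minimal illustration, not the repository's code: `isEventStream`, `shouldCompress`, and `HeaderValue` are hypothetical names. It assumes the header value may be a string, a number, an array, or undefined, and it matches case-insensitively, whereas the committed filter only inspects string values.

```typescript
// Hypothetical helper (not part of the repository): decides whether a
// Content-Type header value denotes a Server-Sent Events response.
type HeaderValue = string | number | string[] | undefined

function isEventStream(contentType: HeaderValue): boolean {
  // Normalize to an array so single and multi-valued headers share one path.
  const values: HeaderValue[] = Array.isArray(contentType) ? contentType : [contentType]
  return values.some(
    (value) => typeof value === 'string' && value.toLowerCase().includes('text/event-stream')
  )
}

// Assumed shape of the filter: never compress SSE, otherwise defer to a
// fallback decision (in the real middleware, the library's default filter).
function shouldCompress(contentType: HeaderValue, fallback: () => boolean): boolean {
  if (isEventStream(contentType)) return false
  return fallback()
}
```

Keeping the check as a pure function of the header value makes it easy to unit-test without a running HTTP server.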
36 lines
1.2 KiB
TypeScript
import env from '#start/env'
import type { HttpContext } from '@adonisjs/core/http'
import type { NextFn } from '@adonisjs/core/types/http'
import compression from 'compression'

// Skip compression for Server-Sent Events. The compression library buffers
// response writes to determine encoding, which collapses per-token streaming
// into a single block delivered after generation completes (regression in
// v1.31.0-rc.2, reported in #781 by @toasterking).
const compress = env.get('DISABLE_COMPRESSION')
  ? null
  : compression({
      filter: (req: any, res: any) => {
        const contentType = res.getHeader('Content-Type')
        if (typeof contentType === 'string' && contentType.includes('text/event-stream')) {
          return false
        }
        return compression.filter(req, res)
      },
    })

export default class CompressionMiddleware {
  async handle({ request, response }: HttpContext, next: NextFn) {
    if (!compress) return await next()

    await new Promise<void>((resolve, reject) => {
      compress(request.request as any, response.response as any, (err?: any) => {
        if (err) reject(err)
        else resolve()
      })
    })

    await next()
  }
}
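The `handle` method wraps a callback-style (Connect-style) middleware in a Promise so that it can be awaited before calling `next()`. A generic version of that pattern might look like the sketch below. `ConnectMiddleware` and `runMiddleware` are hypothetical names introduced for illustration only; they are not part of AdonisJS or the compression library.

```typescript
// Connect-style middleware: calls next() on success or next(err) on failure.
type ConnectMiddleware<Req, Res> = (
  req: Req,
  res: Res,
  next: (err?: unknown) => void
) => void

// Hypothetical helper: resolves when the middleware calls next(),
// rejects when it calls next(err) with a truthy error.
function runMiddleware<Req, Res>(
  mw: ConnectMiddleware<Req, Res>,
  req: Req,
  res: Res
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    mw(req, res, (err?: unknown) => {
      if (err) reject(err)
      else resolve()
    })
  })
}
```

With such a helper, the body of `handle` could in principle reduce to awaiting `runMiddleware(compress, request.request, response.response)` followed by `await next()`, though the `as any` casts discussed in the commit message would still be needed at the call site.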