Deep Research: NDJSON streaming so the connection survives long runs
Previously the endpoint returned a single JSON object at the end. Apache + PHP-FPM buffers the entire body until PHP exits, so a 160s azure_full run caused the browser to drop the fetch as "Failed to fetch" while the server was still synthesising; the response then arrived at a dead socket.

Switch to application/x-ndjson with one event per line. The endpoint emits 'progress', 'start', 'step' (running/complete/warning/error), 'subq', and a closing 'final' event carrying the full result payload. Output buffering is explicitly disabled so each line flushes through Apache as soon as the agent emits it.

DbnDeepResearchAgent::run() now accepts an optional ?callable $emit and fires step:running before each step and step:complete after it, plus a subq event per sub-question retrieval round.

JS reads response.body as a stream, splits on newlines, updates the trace panel live, and renders the final result when the final event arrives. Status pill shows live progress detail (e.g. "Synthesising with Azure gpt-4o — this is the slowest step…").

Engine row in the form now shows expected duration per engine (~15-45s mini, ~60-180s full, ~30-90s GPU) so users know what they're in for before clicking Run.
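The client-side reading loop described in this message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the names createNdjsonParser and readNdjsonStream are hypothetical, and the sketch assumes each line is one complete JSON object. The key detail is that a network chunk can end in the middle of a line, so the trailing partial line is held back until the rest arrives.

```javascript
// Hypothetical helper (not from the commit): incrementally split NDJSON text
// into parsed events, tolerating chunks that end in the middle of a line.
function createNdjsonParser(onEvent) {
  let buffer = "";

  const handleLine = (line) => {
    const trimmed = line.trim(); // also strips a trailing \r
    if (trimmed === "") return;  // ignore blank lines
    onEvent(JSON.parse(trimmed));
  };

  return {
    // Feed decoded text; every complete line is dispatched immediately.
    push(text) {
      buffer += text;
      const lines = buffer.split("\n");
      buffer = lines.pop(); // last piece may be an incomplete line
      lines.forEach(handleLine);
    },
    // Call once at end of stream to dispatch a final unterminated line.
    flush() {
      const rest = buffer;
      buffer = "";
      handleLine(rest);
    },
  };
}

// Hypothetical wrapper: read a fetch() Response body chunk by chunk.
async function readNdjsonStream(response, onEvent) {
  const parser = createNdjsonParser(onEvent);
  const reader = response.body.getReader();
  const decoder = new TextDecoder("utf-8");

  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters split across chunks intact.
    parser.push(decoder.decode(value, { stream: true }));
  }
  parser.push(decoder.decode()); // emit any bytes the decoder held back
  parser.flush();
}
```

A caller would pass the fetch response and a callback that inspects each event, for example updating the trace panel on step events and rendering the result on the final event. The field used to tell events apart is not shown in this commit page, so the sketch leaves dispatch to the callback.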
+4
-4
@@ -16,11 +16,11 @@ require_once __DIR__ . '/includes/layout.php';
 
 <div class="control-row" id="drEngineControl">
 <span class="control-label">Engine</span>
-<label><input type="radio" name="drEngine" value="azure_mini" checked> Azure gpt-4o-mini ★ <small class="control-hint">(fast)</small></label>
-<label><input type="radio" name="drEngine" value="azure_full"> Azure gpt-4o <small class="control-hint">(best)</small></label>
-<label><input type="radio" name="drEngine" value="gpu"> GPU (cuttlefish) <small class="control-hint">(local)</small></label>
+<label><input type="radio" name="drEngine" value="azure_mini" checked> Azure gpt-4o-mini ★ <small class="control-hint">(~15-45s)</small></label>
+<label><input type="radio" name="drEngine" value="azure_full"> Azure gpt-4o <small class="control-hint">(best · ~60-180s)</small></label>
+<label><input type="radio" name="drEngine" value="gpu"> GPU (cuttlefish) <small class="control-hint">(local · ~30-90s)</small></label>
 </div>
-<p class="upload-hint">Azure engines use your BNL Azure credits. GPU runs qwen2.5:14b via LiteLLM on cuttlefish.</p>
+<p class="upload-hint">Azure mini is the default and finishes fastest. Azure full is the most thorough but can take 1-3 minutes. GPU keeps everything inside the BNL fleet. Live progress shown in the right-hand reasoning panel.</p>
 
 <div class="dr-slice-section">
 <p class="control-label">Corpus slices</p>