dobetternorge-tools/deep-research.php
daveadmin a1a7f442a7 Deep Research: NDJSON streaming so the connection survives long runs
Previously the endpoint returned a single JSON object at the end. Apache +
PHP-FPM buffers the entire body until PHP exits, so during a 160s azure_full
run the browser gave up on the request and reported "Failed to fetch" while
the server was still synthesising; the response then arrived at a dead socket.

Switch to application/x-ndjson with one event per line. The endpoint emits
'progress', 'start', 'step' (running/complete/warning/error), 'subq', and a
final 'final' event carrying the full result payload. Output buffering is
explicitly disabled so each line flushes through Apache as soon as the
agent emits it.
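
A minimal sketch of the emit side described above, not the endpoint's actual code. The helper names (dr_ndjson_line, dr_stream_begin, dr_emit) and the event field names are hypothetical; only the content type and the one-event-per-line framing come from this commit message. Buffering in the web server itself may need separate configuration.

```php
<?php
/**
 * Encode one event as a single NDJSON line: a JSON object followed by "\n".
 * Hypothetical helper; the 'type' key and field names are assumptions.
 */
function dr_ndjson_line(string $type, array $data = []): string
{
    $flags = JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES;
    $json = json_encode(['type' => $type] + $data, $flags);
    if ($json === false) {
        throw new RuntimeException('Event is not JSON-encodable: ' . json_last_error_msg());
    }
    // json_encode escapes newlines inside strings, so the only raw "\n" is the terminator.
    return $json . "\n";
}

/** Send the streaming header and close any PHP-level output buffers. */
function dr_stream_begin(): void
{
    if (!headers_sent()) {
        header('Content-Type: application/x-ndjson; charset=utf-8');
    }
    while (ob_get_level() > 0) {
        if (!@ob_end_flush()) {
            break; // a buffer that cannot be removed; stop rather than loop forever
        }
    }
}

/** Write one event and push it towards the client immediately. */
function dr_emit(string $type, array $data = []): void
{
    echo dr_ndjson_line($type, $data);
    flush();
}
```

An endpoint built this way would call dr_stream_begin() once, then dr_emit('step', [...]) as work progresses, and finish with dr_emit('final', [...]).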

DbnDeepResearchAgent::run() now accepts an optional ?callable $emit and
fires step:running before each step + step:complete after, plus a subq
event per sub-question retrieval round.
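
The optional-callback pattern above can be sketched as follows. This is an illustration under stated assumptions, not DbnDeepResearchAgent itself: the class name SketchResearchAgent, the step list passed to the constructor, and the event payload fields are all hypothetical. A 'subq' event would go through the same callback.

```php
<?php
/** Hypothetical stand-in for an agent whose run() accepts an optional emitter. */
final class SketchResearchAgent
{
    /** @var array<string, callable> step name => work taking the question */
    private $steps;

    public function __construct(array $steps)
    {
        $this->steps = $steps;
    }

    /**
     * Run every step in order. When $emit is null the agent stays silent,
     * so callers that do not stream can ignore the parameter.
     */
    public function run(string $question, ?callable $emit = null): array
    {
        // Fall back to a no-op so the loop below never has to null-check.
        $emit = $emit ?? static function (string $type, array $data): void {
        };

        $results = [];
        foreach ($this->steps as $name => $work) {
            $emit('step', ['name' => $name, 'status' => 'running']);
            try {
                $results[$name] = $work($question);
            } catch (Throwable $e) {
                $emit('step', ['name' => $name, 'status' => 'error', 'detail' => $e->getMessage()]);
                throw $e;
            }
            $emit('step', ['name' => $name, 'status' => 'complete']);
        }
        return $results;
    }
}
```

Passing a closure that writes each event as an NDJSON line turns the same run() into a streaming producer; passing nothing keeps the old behaviour of returning the result at the end.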

JS reads response.body as a stream, splits on newlines, updates the
trace panel live, and renders the final result when the final event
arrives. Status pill shows live progress detail (e.g. "Synthesising with
Azure gpt-4o — this is the slowest step…").

Engine row in the form now shows expected duration per engine
(~15-45s mini, ~60-180s full, ~30-90s GPU) so users know what they're in
for before clicking Run.
2026-05-15 10:47:35 +02:00

<?php
declare(strict_types=1);
$toolName = 'deep-research';
$toolTitle = 'Deep Research';
$toolKind = 'Agent + Rank/Rerank RAG';
$toolBadge = 'family-legal';
$extraScripts = ['assets/js/deep-research.js'];
require_once __DIR__ . '/includes/layout.php';
?>
<form id="deepResearchForm" class="tool-form deep-research" enctype="multipart/form-data">
<div class="lang-switcher" id="drLangSwitcher" role="group" aria-label="UI language">
<button type="button" class="lang-btn is-active" data-lang="en">&#127468;&#127463; EN</button>
<button type="button" class="lang-btn" data-lang="no">&#127475;&#127476; NO</button>
</div>
<div class="control-row" id="drEngineControl">
<span class="control-label">Engine</span>
<label><input type="radio" name="drEngine" value="azure_mini" checked> Azure gpt-4o-mini &#9733; <small class="control-hint">(~15-45s)</small></label>
<label><input type="radio" name="drEngine" value="azure_full"> Azure gpt-4o <small class="control-hint">(best · ~60-180s)</small></label>
<label><input type="radio" name="drEngine" value="gpu"> GPU (cuttlefish) <small class="control-hint">(local · ~30-90s)</small></label>
</div>
<p class="upload-hint">Azure mini is the default and finishes fastest. Azure full is the most thorough but can take 1-3 minutes. GPU keeps everything inside the BNL fleet. Live progress is shown in the right-hand reasoning panel.</p>
<div class="dr-slice-section">
<p class="control-label">Corpus slices</p>
<p class="upload-hint">Select which slices of the Do Better Norge legal corpus the agent searches. Toggle Broader Legal on when the question reaches beyond family law.</p>
<div class="dr-slice-grid">
<button type="button" class="dr-slice is-on" data-slice="family_core" aria-pressed="true">
<div class="dr-slice__head">
<span class="dr-slice__title">Family Law Core</span>
<span class="dr-slice__badge">on</span>
</div>
<p class="dr-slice__tagline">Barneloven, custody, samvær, mediation</p>
</button>
<button type="button" class="dr-slice is-on" data-slice="child_welfare" aria-pressed="true">
<div class="dr-slice__head">
<span class="dr-slice__title">Child Welfare</span>
<span class="dr-slice__badge">on</span>
</div>
<p class="dr-slice__tagline">Barnevern, omsorgsovertakelse, foster care</p>
</button>
<button type="button" class="dr-slice is-on" data-slice="echr_hague" aria-pressed="true">
<div class="dr-slice__head">
<span class="dr-slice__title">ECHR and Hague</span>
<span class="dr-slice__badge">on</span>
</div>
<p class="dr-slice__tagline">Article 8, EMD, HCCH, cross-border family</p>
</button>
<button type="button" class="dr-slice" data-slice="broader_legal" aria-pressed="false">
<div class="dr-slice__head">
<span class="dr-slice__title">Broader Legal Support</span>
<span class="dr-slice__badge">off</span>
</div>
<p class="dr-slice__tagline">Arbeidsmiljøloven, NOUer, statutes, government background</p>
</button>
</div>
</div>
<details class="advanced-panel" id="drAdvanced">
<summary class="advanced-toggle">Advanced controls</summary>
<div class="dr-control-grid">
<div class="dr-control-card">
<label>Sub-questions <span id="drSubQValue" class="dr-control-value">4</span></label>
<input type="range" id="drSubQ" min="3" max="5" step="1" value="4">
<small>How many angles the agent expands the question into before retrieval.</small>
</div>
<div class="dr-control-card">
<label>Chunks / sub-Q <span id="drChunkLimitValue" class="dr-control-value">6</span></label>
<input type="range" id="drChunkLimit" min="4" max="10" step="1" value="6">
<small>How many corpus chunks the hybrid retriever pulls per sub-question.</small>
</div>
<div class="dr-control-card">
<label>Similarity floor <span id="drSimValue" class="dr-control-value">0.30</span></label>
<input type="range" id="drSim" min="0.20" max="0.60" step="0.05" value="0.30">
<small>Minimum cosine similarity for uploaded-doc chunks to count as a match.</small>
</div>
<div class="dr-control-card">
<label>Sources kept <span id="drTopKValue" class="dr-control-value">12</span></label>
<input type="range" id="drTopK" min="8" max="14" step="1" value="12">
<small>Top sources kept after dedupe + rerank to feed synthesis.</small>
</div>
<div class="dr-control-card">
<label>Temperature <span id="drTempValue" class="dr-control-value">0.15</span></label>
<input type="range" id="drTemp" min="0.05" max="0.40" step="0.05" value="0.15">
<small>Synthesis creativity. Keep low for grounded legal briefs.</small>
</div>
</div>
</details>
<div class="upload-zone" id="drUploadZone" role="region" aria-label="File upload">
<input type="file" id="drUploadInput" multiple accept=".pdf,.docx,.txt" aria-label="Choose files">
<div id="drUploadPrompt" class="upload-prompt">
<span class="upload-icon" aria-hidden="true">&#8679;</span>
<p>Drop up to 5 case files here, or <label for="drUploadInput" class="upload-browse">browse</label></p>
<p class="upload-hint"><strong>PDF</strong>, <strong>DOCX</strong>, <strong>TXT</strong> &mdash; chunked + embedded in memory only, never stored.</p>
</div>
<div id="drUploadFileInfo" class="upload-file is-hidden">
<ul id="drUploadFileList" class="upload-file-list"></ul>
<button type="button" id="drUploadClear" class="upload-clear">&times; Clear</button>
</div>
</div>
<label class="input-label" for="drInput">Question or pasted text</label>
<textarea id="drInput" name="drInput" rows="8" placeholder="Describe the legal question, paste case notes, or both. The agent will research the corpus from 3-5 angles."></textarea>
<div class="form-footer">
<p id="drStatus" class="form-status" role="status" aria-live="polite"></p>
<button id="drRunButton" type="submit">Run deep research</button>
</div>
</form>
<section id="drResults" class="results deep-research-results" aria-live="polite">
<div class="empty-state">
<h3>Ready</h3>
<p>Pick slices, drop a case file or paste a question, then run. The agent will expand the question, retrieve from the corpus + your upload, rerank, and synthesise a cited brief.</p>
</div>
</section>
<!-- Source modal -->
<div id="drSourceModal" class="dr-source-modal is-hidden" role="dialog" aria-modal="true" aria-labelledby="drSourceModalTitle">
<div class="dr-source-modal__dialog">
<header class="dr-source-modal__head">
<div>
<p class="eyebrow" id="drSourceModalEyebrow">Source</p>
<h3 id="drSourceModalTitle"></h3>
</div>
<button type="button" id="drSourceModalClose" class="upload-clear" aria-label="Close">&times;</button>
</header>
<div class="dr-source-modal__body">
<aside class="dr-source-modal__meta" id="drSourceModalMeta"></aside>
<div class="dr-source-modal__text" id="drSourceModalText"></div>
</div>
</div>
</div>
<!-- Hidden stubs so tools.js element refs don't crash on this page -->
<div class="is-hidden" id="languageControl" aria-hidden="true"><input type="radio" name="language" value="en" checked></div>
<div class="is-hidden" id="redactionControl" aria-hidden="true"></div>
<div class="is-hidden" id="audioZone" aria-hidden="true">
<input type="file" id="audioInput" style="display:none">
<div id="audioPrompt"></div>
<div id="audioFileInfo"><ol id="audioQueueList"></ol><button type="button" id="audioClear"></button></div>
</div>
<div class="is-hidden" id="diarizeControl" aria-hidden="true">
<input type="checkbox" id="diarizeCheck">
<input type="number" id="numSpeakersInput">
</div>
<div class="is-hidden" id="transcribeLangControl" aria-hidden="true"><input type="radio" name="transcribeLang" value="no" checked></div>
<div class="is-hidden" id="vocabControl" aria-hidden="true">
<div id="vocabPresets"></div>
<textarea id="initPromptInput"></textarea>
</div>
<div class="is-hidden" id="aliasSection" aria-hidden="true">
<button type="button" id="addAliasRow"></button>
<div id="aliasRows"></div>
</div>
<div class="is-hidden" id="exemptSection" aria-hidden="true">
<button type="button" id="addExemptRow"></button>
<div id="exemptRows"></div>
</div>
<?php require_once __DIR__ . '/includes/layout_footer.php'; ?>