fix(tools): parse-harden Do Better Legal ask against leaky fine-tune output

The dbn-legal-agent-v3 fine-tune (Track 1 / family) emits a labelled-prose
template — duplicate `answer:` prefixes, markdown-escaped underscores (`\_`),
and a trailing raw JSON blob — rather than the strict JSON the Azure/gpt-4o
path produces via response_format. decodeJsonObject() returned null on that
invalid JSON, so ask() dumped the entire raw blob into `answer`.

Fix at the parse layer (no upstream response_format change, to avoid fighting
the fine-tune's training):
- dbnToolsRepairJsonText(): strip fences, drop only invalid `\_`/`\*` escapes,
  then balanced-brace scan collecting every top-level {...} (longest first) to
  recover an appended JSON object. Shared by both gateways' decodeJsonObject(),
  so all JSON tools benefit.
- dbnToolsParseLabeledFields(): parse labelled-prose into real fields when no
  JSON decodes, tolerating escaped key names and collapsing duplicate prefixes.
- ask() null-fallback now builds clean structured fields from the parsed prose
  instead of dumping raw; what_remains_uncertain becomes a proper list.
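
A rough illustration of the failure mode and of the escape fix. The sample
reply below is invented for this sketch (it is not real fine-tune output), and
the snippet uses only plain PHP, not the helpers added in this commit:

```php
<?php
// Hypothetical reply in the shape described above: duplicated `answer:` prefix,
// markdown-escaped underscores, and a JSON object appended after the prose.
$reply = <<<'TXT'
answer: answer: The deadline is three weeks.
what\_remains\_uncertain: Whether the notice was served.
{"answer":"The deadline is three weeks.","what\_remains\_uncertain":["Whether the notice was served."]}
TXT;

// The reply as a whole is not JSON, so a plain decode yields null.
$whole = json_decode($reply, true);

// The appended object on its own also fails: `\_` is not a valid JSON escape.
$blob = substr($reply, (int)strpos($reply, '{'));
$rawBlob = json_decode($blob, true);

// Dropping the backslash before `_` or `*` leaves valid JSON escapes untouched
// and makes the object decodable.
$clean = (string)preg_replace('/\\\\([_*])/', '$1', $blob);
$data = json_decode($clean, true);

var_dump($whole === null, $rawBlob === null);
echo $data['what_remains_uncertain'][0], PHP_EOL;
```

Taking the substring from the first `{` is enough here only because the sample
prose contains no braces; the balanced-brace scan in dbnToolsRepairJsonText()
exists for replies where that assumption does not hold.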

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 17:36:35 +02:00
parent 7fcd317205
commit c84ed2ed78
4 changed files with 136 additions and 44 deletions
@@ -1461,6 +1461,117 @@ function dbnToolsExtractCleanAnswer(string $text): string
    return trim($text);
}

/**
 * Robustly extract a JSON object from a model reply, tolerating the artifacts the
 * fine-tuned models leak: ```fences```, markdown-escaped underscores/asterisks
 * (`\_`, `\*`, which are never valid JSON escapes), and prose wrapped around a real
 * JSON blob. Returns the decoded array, or null if nothing parses. Shared by both
 * gateways' decodeJsonObject(), so every JSON tool benefits.
 */
function dbnToolsRepairJsonText(string $content): ?array
{
    $content = trim($content);
    $content = (string)preg_replace('/^```(?:json)?\s*\n?/i', '', $content);
    $content = (string)preg_replace('/\n?```\s*$/', '', $content);
    // Drop only invalid markdown escapes; leave legitimate \n \" \\ \/ \t intact.
    $content = (string)preg_replace('/\\\\([_*])/', '$1', $content);
    $content = trim($content);
    $decoded = json_decode($content, true);
    if (is_array($decoded)) {
        return $decoded;
    }
    // Collect every balanced top-level {...} block (ignoring braces inside JSON
    // strings), then try the longest first; this handles "prose then appended JSON".
    $candidates = [];
    $depth = 0;
    $start = -1;
    $inStr = false;
    $escaped = false;
    $len = strlen($content);
    for ($i = 0; $i < $len; $i++) {
        $ch = $content[$i];
        if ($inStr) {
            if ($escaped) {
                $escaped = false;
            } elseif ($ch === '\\') {
                $escaped = true;
            } elseif ($ch === '"') {
                $inStr = false;
            }
            continue;
        }
        if ($ch === '"') {
            $inStr = true;
        } elseif ($ch === '{') {
            if ($depth === 0) {
                $start = $i;
            }
            $depth++;
        } elseif ($ch === '}') {
            if ($depth > 0) {
                $depth--;
                if ($depth === 0 && $start >= 0) {
                    $candidates[] = substr($content, $start, $i - $start + 1);
                    $start = -1;
                }
            }
        }
    }
    usort($candidates, static fn(string $a, string $b): int => strlen($b) <=> strlen($a));
    foreach ($candidates as $candidate) {
        $decoded = json_decode($candidate, true);
        if (is_array($decoded)) {
            return $decoded;
        }
    }
    return null;
}

/**
 * Parse a labelled-prose reply (`answer: ...`, `what_we_found: ...`) into an assoc
 * array keyed by $keys, for fine-tunes that ignore the JSON contract. Tolerates
 * markdown-escaped key names (`what\_we\_found`). Each value runs until the next
 * known key label or a trailing { JSON blob (discarded). Returns only found keys.
 */
function dbnToolsParseLabeledFields(string $text, array $keys): array
{
    $text = (string)preg_replace('/\\\\([_*])/', '$1', trim($text));
    if ($text === '' || empty($keys)) {
        return [];
    }
    // Find each "key:" label position (line start, case-insensitive).
    $labels = [];
    foreach ($keys as $key) {
        if (preg_match('/^\s*' . preg_quote($key, '/') . '\s*:/im', $text, $m, PREG_OFFSET_CAPTURE)) {
            $labelStart = $m[0][1];
            $valueStart = $labelStart + strlen($m[0][0]);
            $labels[] = ['key' => $key, 'start' => $labelStart, 'value_start' => $valueStart];
        }
    }
    if (!$labels) {
        return [];
    }
    usort($labels, static fn(array $a, array $b): int => $a['start'] <=> $b['start']);
    $out = [];
    $count = count($labels);
    for ($i = 0; $i < $count; $i++) {
        $end = ($i + 1 < $count) ? $labels[$i + 1]['start'] : strlen($text);
        $value = substr($text, $labels[$i]['value_start'], $end - $labels[$i]['value_start']);
        // Drop a trailing appended JSON blob from the last field's value.
        $brace = strpos($value, '{');
        if ($brace !== false && $i + 1 === $count) {
            $value = substr($value, 0, $brace);
        }
        // Collapse a duplicated "key:" prefix the model sometimes repeats inside the value.
        $value = (string)preg_replace('/^\s*' . preg_quote($labels[$i]['key'], '/') . '\s*:\s*/i', '', trim($value));
        $out[$labels[$i]['key']] = trim($value);
    }
    return $out;
}

function dbnToolsInferCheckSeverity(string $text): string
{
    if (preg_match('/ugyldig|§\s*41|kontradiksjon|klar nødvendighet|strand lobben|biologiske bånd/i', $text)) {