fix(tools): parse-harden Do Better Legal ask against leaky fine-tune output

The dbn-legal-agent-v3 fine-tune (Track 1 / family) emits a labelled-prose
template, with duplicate `answer:` prefixes, markdown-escaped underscores
(`\_`), and a trailing raw JSON blob, rather than the strict JSON the
Azure/gpt-4o path produces via response_format. decodeJsonObject() returned
null on that invalid JSON, so ask() dumped the entire raw blob into `answer`.

Fix at the parse layer (no upstream response_format change, to avoid fighting
the fine-tune's training):
- dbnToolsRepairJsonText(): strip fences, drop only the invalid `\_`/`\*`
  escapes, then run a balanced-brace scan that collects every top-level {...}
  span (longest first) to recover an appended JSON object. Shared by both
  gateways' decodeJsonObject(), so all JSON tools benefit.
- dbnToolsParseLabeledFields(): parse labelled prose into real fields when no
  JSON decodes, tolerating escaped key names and collapsing duplicate prefixes.
- ask() null-fallback now builds clean structured fields from the parsed prose
  instead of dumping the raw text; what_remains_uncertain becomes a proper list.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
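
The labelled-prose fallback described in the message is not part of the hunk shown here. A minimal sketch of how such a parser might work is given below, purely as an illustration: `sketchParseLabeledFields()`, `sketchToList()`, the `$keys` parameter, and the rule of stopping at the first line that starts with `{` are all assumptions, not the project's actual `dbnToolsParseLabeledFields()`.

```php
<?php
/**
 * Hypothetical stand-in for dbnToolsParseLabeledFields(); not the project's code.
 * $keys lists the labels to look for (an assumed parameter).
 */
function sketchParseLabeledFields(string $text, array $keys): array
{
    // Undo Markdown escapes so "what\_remains\_uncertain:" matches its plain key.
    $text = str_replace(['\\_', '\\*'], ['_', '*'], $text);

    $fields = [];
    $current = null;
    foreach (preg_split('/\R/', $text) ?: [] as $line) {
        $line = trim($line);
        // Assumption: a line starting with "{" begins the trailing raw JSON blob.
        if ($line !== '' && $line[0] === '{') {
            break;
        }
        foreach ($keys as $key) {
            // Match "key:" repeated one or more times ("answer: answer: ...").
            $pattern = '/^(?:' . preg_quote($key, '/') . '\s*:\s*)+/i';
            if (preg_match($pattern, $line, $m) === 1) {
                $current = $key;
                $fields[$key] = trim(substr($line, strlen($m[0])));
                continue 2; // next input line
            }
        }
        // Any other non-empty line continues the most recent field.
        if ($current !== null && $line !== '') {
            $fields[$current] = trim($fields[$current] . "\n" . $line);
        }
    }
    return $fields;
}

/** Turn a bulleted or numbered multi-line value into a plain list of strings. */
function sketchToList(string $value): array
{
    $items = [];
    foreach (preg_split('/\R/', $value) ?: [] as $line) {
        $item = trim((string)preg_replace('/^\s*(?:[-*]|\d+[.)])\s+/', '', $line));
        if ($item !== '') {
            $items[] = $item;
        }
    }
    return $items;
}
```

Under these assumptions, a fallback in ask() could call `sketchParseLabeledFields()` with the expected field names and pass the `what_remains_uncertain` value through `sketchToList()` so that it reaches the caller as an array rather than a block of text.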
@@ -148,26 +148,7 @@ final class DbnAzureOpenAiGateway
 
     public function decodeJsonObject(string $content): ?array
     {
-        $content = trim($content);
-        $content = (string)preg_replace('/^```(?:json)?\s*\n?/i', '', $content);
-        $content = (string)preg_replace('/\n?```\s*$/', '', $content);
-        $content = trim($content);
-
-        $decoded = json_decode($content, true);
-        if (is_array($decoded)) {
-            return $decoded;
-        }
-
-        $start = strpos($content, '{');
-        $end = strrpos($content, '}');
-        if ($start !== false && $end !== false && $end > $start) {
-            $candidate = substr($content, $start, $end - $start + 1);
-            $decoded = json_decode($candidate, true);
-            if (is_array($decoded)) {
-                return $decoded;
-            }
-        }
-        return null;
+        return dbnToolsRepairJsonText($content);
     }
 
     private function postJson(string $url, array $payload, int $timeout): array
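
The fallback removed in this hunk sliced from the first `{` to the last `}`, so any brace in the surrounding prose, or a second object, made the candidate undecodable. The body of `dbnToolsRepairJsonText()` is not shown in this hunk; the following is only a rough sketch of the steps the commit message lists (fence stripping, escape cleanup, balanced-brace scan, longest candidate first). The name `sketchRepairJsonText()` is hypothetical, and the escape handling is deliberately simplified.

```php
<?php
/**
 * Hypothetical stand-in for dbnToolsRepairJsonText(); not the project's code.
 * Returns the decoded object, or null when nothing decodes.
 */
function sketchRepairJsonText(string $content): ?array
{
    // 1. Strip a surrounding Markdown code fence, as the old method did.
    $content = trim($content);
    $content = (string)preg_replace('/^`{3}(?:json)?\s*\n?/i', '', $content);
    $content = (string)preg_replace('/\n?`{3}\s*$/', '', $content);

    // 2. Drop the backslash from \_ and \* (Markdown escapes, invalid in JSON).
    //    Simplification: an escaped backslash followed by _ is not special-cased.
    $content = str_replace(['\\_', '\\*'], ['_', '*'], trim($content));

    $decoded = json_decode($content, true);
    if (is_array($decoded)) {
        return $decoded;
    }

    // 3. Collect every balanced top-level {...} span, ignoring braces that
    //    appear inside double-quoted strings within a span.
    $candidates = [];
    $depth = 0;
    $start = 0;
    $inString = false;
    $escaped = false;
    $length = strlen($content);
    for ($i = 0; $i < $length; $i++) {
        $ch = $content[$i];
        if ($inString) {
            if ($escaped) {
                $escaped = false;
            } elseif ($ch === '\\') {
                $escaped = true;
            } elseif ($ch === '"') {
                $inString = false;
            }
            continue;
        }
        if ($ch === '"' && $depth > 0) {
            $inString = true;
        } elseif ($ch === '{') {
            if ($depth === 0) {
                $start = $i;
            }
            $depth++;
        } elseif ($ch === '}' && $depth > 0) {
            $depth--;
            if ($depth === 0) {
                $candidates[] = substr($content, $start, $i - $start + 1);
            }
        }
    }

    // 4. Try the longest candidate first; return the first one that decodes.
    usort($candidates, static fn(string $a, string $b): int => strlen($b) <=> strlen($a));
    foreach ($candidates as $candidate) {
        $decoded = json_decode($candidate, true);
        if (is_array($decoded)) {
            return $decoded;
        }
    }
    return null;
}
```

For an input such as `Note {draft}. {"answer":"ok"}`, the first-to-last-brace slice fails to decode, whereas this scan yields two candidates and decodes the second one.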