The dbn-legal-agent-v3 fine-tune (Track 1 / family) emits a labelled-prose
template — duplicate `answer:` prefixes, markdown-escaped underscores (`\_`),
and a trailing raw JSON blob — rather than the strict JSON the Azure/gpt-4o
path produces via response_format. decodeJsonObject() returned null on that
invalid JSON, so ask() dumped the entire raw blob into `answer`.
Fix at the parse layer (no upstream response_format change, to avoid fighting
the fine-tune's training):
- dbnToolsRepairJsonText(): strip fences, drop only the invalid `\_`/`\*` escapes,
then run a balanced-brace scan that collects every top-level {...} and tries
the longest first, to recover an appended JSON object. Shared by both gateways'
decodeJsonObject(), so all JSON tools benefit.
- dbnToolsParseLabeledFields(): parse labelled-prose into real fields when no
JSON decodes, tolerating escaped key names and collapsing duplicate prefixes.
- ask() null-fallback now builds clean structured fields from the parsed prose
instead of dumping raw; what_remains_uncertain becomes a proper list.
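
The repair step described above can be sketched as follows. This is a minimal Python illustration of the technique, not the PHP in dbnToolsRepairJsonText(); the function name `repair_json_text` and the exact fence and escape regexes are assumptions.

```python
import json
import re


def repair_json_text(raw: str):
    """Return a decodable top-level JSON object found in raw, or None."""
    # Strip markdown fence markers such as ```json and ```.
    text = re.sub(r"```[a-zA-Z]*", "", raw)
    # Drop only the escapes JSON does not allow: \_ and \* become _ and *.
    text = re.sub(r"\\([_*])", r"\1", text)

    # Balanced-brace scan: collect every top-level {...} span, ignoring
    # braces that appear inside string literals of a candidate object.
    candidates = []
    depth, start, in_string, escaped = 0, None, False, False
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"' and depth > 0:
            in_string = True
        elif ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                candidates.append(text[start:i + 1])

    # Try the longest candidate first; skip anything that is not an object.
    for candidate in sorted(candidates, key=len, reverse=True):
        try:
            decoded = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(decoded, dict):
            return decoded
    return None
```

Returning None when nothing decodes leaves room for the labelled-prose parser to take over, which matches the fallback order described in this change.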
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
De-family-ify shared JSON tools (persona-aware routing + neutral base
prompt), make the verification review pick its engine per track
(family/child-welfare -> dbn-legal-agent-v3, others -> gpt-4o interim),
and route product-name strings through dbnToolsProductName(). Rebrand the
MCP/tools surface (mcp.php + i18n mcp_* strings) to Do Better Legal.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Generalize the family-locked legal tools into caveauAI persona profiles
(client 57 chat profiles, resolved in-process via the chat_profiles bridge).
Each tool accepts an optional `profile` slug that scopes the corpus package(s),
search method, system prompt and synthesis model; omitting it falls back to the
family-legal package so existing behaviour is unchanged.
- dbnToolsResolvePersona / dbnToolsListPersonas / dbnToolsBootChatProfiles in
bootstrap.php; new api/personas.php + dbn.list_personas MCP tool.
- LegalTools search/ask/corpusContextForSummarize and the BvjAnalyzer /
LegalAnalysis / translate paths take the persona's packages + prompt + model.
- Persona <select> on ask/search/summarize (populated from api/personas.php).
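
A rough sketch of the fallback rule, in Python for illustration only. The `Persona` fields mirror what a profile is said to scope; the package names, prompts, the second persona, and the choice to treat an unknown slug like an omitted one are all assumptions, not the behaviour of dbnToolsResolvePersona.

```python
from dataclasses import dataclass

DEFAULT_SLUG = "family-legal"


@dataclass(frozen=True)
class Persona:
    slug: str
    packages: tuple      # corpus package(s) to search
    search_method: str
    system_prompt: str
    model: str           # synthesis model


# Hypothetical registry; the real profiles come from the chat_profiles bridge.
PERSONAS = {
    "family-legal": Persona(
        "family-legal", ("family_law",), "hybrid",
        "You answer family-law questions.", "dbn-legal-agent-v3",
    ),
    "child-welfare": Persona(
        "child-welfare", ("child_welfare",), "hybrid",
        "You answer child-welfare questions.", "dbn-legal-agent-v3",
    ),
}


def resolve_persona(profile=None, personas=PERSONAS):
    """Return the persona for a slug, falling back to the family-legal default.

    Assumption: an unknown slug is treated the same as an omitted one.
    """
    if profile and profile in personas:
        return personas[profile]
    return personas[DEFAULT_SLUG]
```

Calling `resolve_persona()` with no argument yields the family-legal persona, so callers that never pass `profile` see the same packages, prompt and model as before.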
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three fixes:
1. bootstrap.php dbnToolsRunLegalCheck(): prepend first 350 chars of synthesis text
to the v3 user message so it validates actual content, not just general law.
2. BvjAnalyzerAgent: fix engine guard — was skipping check for claude_sonnet/haiku;
now skips only when dbn_legal_v3 is the synthesis model (it already IS the check).
3. LegalAnalysisAgent: add post-synthesis dbnToolsRunLegalCheck() call after Pass 3;
add 'legal_check' key to runFullAnalysis() return.
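
Fixes 1 and 2 can be illustrated with a short Python sketch. The message layout and the helper names are assumptions; only the 350-character excerpt and the "skip only when dbn_legal_v3 is the synthesis model" rule come from this change.

```python
EXCERPT_CHARS = 350
CHECK_ENGINE = "dbn_legal_v3"


def should_run_legal_check(synthesis_model: str) -> bool:
    # Skip only when the check model already produced the synthesis;
    # every other engine (claude_sonnet, haiku, ...) still gets checked.
    return synthesis_model != CHECK_ENGINE


def build_check_message(question: str, synthesis_text: str) -> str:
    # Prepend the start of the synthesis so the check sees actual content.
    # The "Draft excerpt" / "Question" labels are hypothetical.
    excerpt = synthesis_text[:EXCERPT_CHARS]
    return f"Draft excerpt:\n{excerpt}\n\nQuestion:\n{question}"
```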
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents saved via save-from-tool or case-upload store content directly
in client_documents.content without being chunked into client_chunks.
dbnToolsFetchDocChunks now falls back to client_documents.content for
any requested doc_ids that returned no rows from client_chunks.
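
The fallback can be sketched like this, using Python and sqlite3 purely for illustration. The column names (`doc_id`, `chunk_index`, `id`, `content`) and the return shape are assumptions; only the two table names and the fallback order come from this change.

```python
import sqlite3


def fetch_doc_chunks(conn: sqlite3.Connection, doc_ids: list) -> dict:
    """Map each doc id to its text pieces, preferring client_chunks."""
    if not doc_ids:
        return {}
    result = {}

    # First try the chunk table for every requested document.
    marks = ",".join("?" * len(doc_ids))
    rows = conn.execute(
        f"SELECT doc_id, content FROM client_chunks "
        f"WHERE doc_id IN ({marks}) ORDER BY doc_id, chunk_index",
        doc_ids,
    )
    for doc_id, content in rows:
        result.setdefault(doc_id, []).append(content)

    # Fall back to the unchunked document body for ids with no chunk rows.
    missing = [d for d in doc_ids if d not in result]
    if missing:
        marks = ",".join("?" * len(missing))
        rows = conn.execute(
            f"SELECT id, content FROM client_documents "
            f"WHERE id IN ({marks}) AND content IS NOT NULL AND content != ''",
            missing,
        )
        for doc_id, content in rows:
            result[doc_id] = [content]
    return result
```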
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add "Select from My Docs" button to all text tool forms; free-tier
users see an upgrade modal, paid (CaveauAI) users get a searchable
multi-select modal backed by /api/dashboard/documents.php
- Add "Select from My Audio" picker on Transcribe with single-select
and a "Save to My Audio" button for persisting uploaded clips
- New PHP helpers in bootstrap.php: dbnToolsFetchDocChunks,
dbnToolsClientIdFromSession, dbnToolsInjectDocContent
- timeline, ask, redact APIs prepend selected document content
(fetched from client_chunks SQL) before the textarea text
- api/dashboard/audio-upload.php stores audio files on server and
creates a client_documents row with source_type='audio'
- api/transcribe.php falls back to stored audio via audio_doc_id POST
field when no file is uploaded
- api/dashboard/documents.php supports ?source_type= filter
- tools.js: doc_ids added to JSON payload; stored-audio transcribe path
- New assets/css/doc-picker.css, assets/js/doc-picker.js
- SQL migration: scripts/sql/audio_docs_column.sql
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Full private corpus dashboard for tools.dobetternorge.no users — each SSO
account gets an auto-provisioned CaveauAI tenant (clients row, corpus) on
first visit. Includes upload (file/paste/URL), RAG chat with SSE streaming
and citation chips, document CRUD, FalkorDB graph relations tab, and
improved save-from-tool flow with tag/preview support.
- dashboard/{index,documents,document,upload,chat,settings}.php
- api/dashboard/{corpus-init,documents,upload,ingest-status,chat-stream,
save-from-tool,graph}.php
- includes/{CorpusProvision,layout_dashboard,layout_dashboard_footer}.php
- assets/css/dashboard.css assets/js/corpus-save.js (routing upgrade)
- includes/{bootstrap,layout}.php extended for dashboard provisioning
Migration 141 (clients.dbn_sso_uid + import_method enum) applied on chloe.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
With timeout=45s plus fallback timeout=30s, worst-case silence was 75s,
exceeding the 60s H2 idle stream limit in the browser. Remove the
fallback: if dbn-legal-agent-v3 times out or fails, return empty
immediately. Legal check is non-critical (wrapped in try/catch in
generate()); the draft is still correct without it.
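
The timeout arithmetic and the new fail-fast behaviour, as a small Python sketch. `call_model` is a hypothetical stand-in for the LiteLLM call; the numbers are the ones quoted above.

```python
IDLE_LIMIT_S = 60  # H2 idle stream limit quoted above


def worst_case_silence(timeouts_s) -> int:
    # Sequential attempts are silent for the sum of their timeouts.
    return sum(timeouts_s)


def fits_idle_window(timeouts_s, idle_limit_s=IDLE_LIMIT_S) -> bool:
    return worst_case_silence(timeouts_s) < idle_limit_s


def run_legal_check(call_model, timeout_s=45) -> list:
    """Single attempt, no fallback: any timeout or failure yields []."""
    try:
        return call_model(timeout=timeout_s)
    except Exception:
        # Non-critical step: the draft stands without the check.
        return []
```

With the fallback, `fits_idle_window([45, 30])` is false (75s); with the single attempt, `fits_idle_window([45])` is true.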
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The dbnToolsRunLegalCheck call blocks for 30-120s with no output,
causing the H2 idle stream timeout (~60s) to drop the connection.
Fix: emit 'Verifying legal authorities...' progress event just before
the legal check to reset the idle timer. Also reduce legal check
timeouts from 120s/60s to 45s/30s so the call completes within the
new 60s window even if LiteLLM is slow.
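
A minimal sketch of the ordering, assuming Server-Sent Events framing. The event names, payload keys and the `run_legal_check` callable are hypothetical; only the progress message text and its position just before the blocking call come from this change.

```python
import json


def sse_event(event: str, data: dict) -> str:
    # One SSE frame: an event name, a JSON data line, and a blank line.
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def generate_stream(draft: str, run_legal_check):
    yield sse_event("draft", {"text": draft})
    # Emit output just before the blocking call so the idle timer restarts.
    yield sse_event("progress", {"message": "Verifying legal authorities..."})
    try:
        findings = run_legal_check(draft)
    except Exception:
        findings = []  # the check is non-critical
    yield sse_event("done", {"legal_check": findings})
```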
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cuttlefish lost; Gemma3 QLoRA approach abandoned. dbn-legal-agent-v2 in
LiteLLM now aliases to qwen2.5:14b on Colin. dbnToolsRunLegalCheck() adds
explicit system prompt since there is no longer a Modelfile-embedded one.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace dbn-legal-agent with dbn-legal-agent-v2 in bootstrap.php
(dbnToolsRunLegalCheck), DeepResearchAgent.php (interpretSeed,
expandQueries, synthesis fallback, deploy label), BvjAnalyzerAgent.php
(check_model label) — 8 locations total
- Add dbn-legal-agent-v2 legal threshold check to KorrespondAgent:
called after selfCheck() in both generate() and refine(); result
surfaced as legal_check[] in the API response
- Render legal_check card in korrespond.js using existing bvj-red-flag
styles; shows only when non-empty
- Add .korr-legal-check CSS block in tools.css
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Probe testing revealed the fine-tune loops when asked to check a brief
directly (tool-planning architecture conflict) but answers focused legal
Q&A reliably in ~55s. New step 6b asks one targeted question per document
type (akuttvedtak → § 4-25 klar nødvendighet, adopsjon → Strand Lobben,
undersøkelse → fvl § 17/§ 41) and merges the finding into
procedural_red_flags with check_model provenance. Silent on timeout/error.
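
Step 6b might look roughly like this in Python. The topic strings are taken from the list above; the question wording, the shape of a red-flag entry, and the `ask_model` / `check_model` parameters are assumptions.

```python
TARGETED_TOPICS = {
    "akuttvedtak": "§ 4-25 klar nødvendighet",
    "adopsjon": "Strand Lobben",
    "undersøkelse": "fvl § 17/§ 41",
}


def run_targeted_check(analysis: dict, doc_type: str, ask_model, check_model: str) -> dict:
    """Ask one focused question for the document type and merge the answer."""
    topic = TARGETED_TOPICS.get(doc_type)
    if topic is None:
        return analysis  # no targeted question for this document type
    try:
        # Hypothetical prompt wording.
        finding = ask_model(f"Does the document satisfy: {topic}?")
    except Exception:
        return analysis  # silent on timeout or error
    if finding:
        flags = analysis.setdefault("procedural_red_flags", [])
        flags.append({"finding": finding, "check_model": check_model})
    return analysis
```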
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Public landing page at / for unauthenticated users (EN/NO/UK/PL)
- Authenticated / shows Case Workbench dashboard with manifesto strip,
stats, and launched-tool grid (Transcribe, Timeline, BVJ, Advocate,
Deep Research, Corpus)
- Added includes/i18n.php with full 4-language translation layer
- Extended layout.php to Case Workbench shell with tool rail, lang switcher
- AI output language normalization extended to en/no/uk/pl in PHP agents
- SSO token validation in bootstrap.php / index.php (dobetternorge.no bridge)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AiGateway uses getenv('LITELLM_MASTER_KEY') plus stream-context HTTP, which was
failing in the chloe virtualhost process. New dbnToolsLiteLLMEmbedBatch()
helper mirrors dbnToolsCallGpuLlm — hardcoded URL + key, cURL-first, same
pattern already proven for LLM calls. Removes AiGateway dependency from
DeepResearchAgent entirely.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
New surface at /deep-research.php where the user pastes a question or
uploads PDF/DOCX/TXT case files, and an LLM-orchestrated agent researches
the Do Better Norge legal corpus from 3-5 angles, with hybrid retrieval,
cross-encoder rerank, and synthesis that emits an inline-[n]-cited
markdown brief plus a numbered sources panel.
Uploaded documents are chunked + embedded in memory only (nomic-embed-text
via LiteLLM) and searched alongside the shared corpus during the same
request — never persisted to disk, DB, or Qdrant.
Reuses ClientRagPipeline::searchAll (hybrid + rerank), dbnV6 slice
helpers, and the existing extract.php text-extraction logic via a new
dbnToolsExtractUploadedFile() helper. Also adds dbnToolsCallGpuLlm()
helper in bootstrap.php — fixes a latent bug where LegalTools.php
was already calling that name with no definition.
Search.php is unchanged.
- bootstrap.php: dbnToolsValidateSsoToken(), SSO session check in dbnToolsIsAuthenticated()
- index.php: SSO handler at top, Do Better Norge member panel in login card
- .env: DBN_SSO_SECRET placeholder
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pass 1: deterministic regex with Nordic/European/ECHR/Global packs
covering fødselsnummer, Swedish personnummer, Danish CPR, Finnish henkilötunnus,
UK NI, French INSEE, IBAN, EU phones, ECHR application numbers, DOB,
and national ID label patterns.
Pass 2: LLM semantic scan (Azure OpenAI) finds names, orgs, places
and identifying descriptions missed by regex. Runs on pre-redacted
text so no raw PII reaches the LLM.
Adds region selector (Nordic/European/ECHR/Global) to the Redact UI.
Falls back gracefully when Azure is not yet configured.
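
The two-pass order can be sketched as below, in Python for illustration. The two regexes are simplified stand-ins for the real packs (no checksum validation, no spaced IBANs), and `semantic_scan` is a hypothetical callable representing the LLM pass.

```python
import re

# Simplified stand-in for the Nordic pack.
NORDIC_PACK = [
    ("FNR", re.compile(r"\b\d{6}\s?\d{5}\b")),                  # 11-digit fødselsnummer
    ("IBAN", re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")),  # unspaced IBAN
]


def regex_pass(text: str, pack=NORDIC_PACK) -> str:
    # Pass 1: deterministic replacement of structured identifiers.
    for label, pattern in pack:
        text = pattern.sub(f"[{label}]", text)
    return text


def redact(text: str, semantic_scan=None) -> str:
    pre_redacted = regex_pass(text)
    if semantic_scan is None:
        # Graceful fallback when no LLM is configured: regex-only result.
        return pre_redacted
    # Pass 2: the scan only ever sees the pre-redacted text.
    for span in semantic_scan(pre_redacted):
        pre_redacted = pre_redacted.replace(span, "[REDACTED]")
    return pre_redacted
```

Passing the pre-redacted text to the second pass is what keeps raw identifiers away from the LLM; names and descriptions remain visible to it by design, since finding them is its job.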
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>