Commit Graph

16 Commits

Author SHA1 Message Date
daveadmin 8a11001bff Add AWS Bedrock three-tier gateway routing (LiteLLM via Colin)
Routes AI tools across three tiers based on task complexity:
- Azure GPT-4o-mini always: redact, translate, timeline-basic, search-legal (mechanical tasks)
- Claude Haiku 4.5 (Bedrock): ask, summarize, timeline-deep, citations (Norwegian nuance)
- Claude Sonnet 4.6 (Bedrock): korrespond, legal-analysis, deep-research, barnevernet-analyze,
  discrepancy-find, advocate (public-facing legal output)

No AWS credentials in app — credentials live in LiteLLM on Colin (same as nova-lite).
Rollback: DBN_BEDROCK_ENABLED=false in .env, no code push needed.

Includes extended thinking support for Pro deep-research via chatWithThinking().
Claude Opus 4.7 constant added for future premium tier (needs litellm_config.yaml entry).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 15:22:48 +02:00
daveadmin ba9cddf9a1 Add monetization spine + Build Your Own Case (Min Sak)
- Stripe: StripeClient.php, checkout/portal/webhook endpoints, idempotent event handling
- FreeTier: tier-aware credits (free/light/pro/pro_plus), bonus_balance, hourly caps per tier
- pricing.php + billing.php: 4-tier cards, 3 topups, Customer Portal, balance breakdown
- Min Sak: CaseStore.php, AzureDocIntelligence.php, AzureSearchAdmin.php — per-user hybrid RAG
- api/case/: upload, list, delete, ingest-callback (HMAC-auth'd from n8n)
- award-survey-credits: inter-site HMAC endpoint for dobetternorge.no survey bonus
- dashboard.php: tier badge, balance breakdown card, Min Sak CTA, survey CTA
- KorrespondAgent + all 3 other agents: use_my_case toggle wired to dbnToolsCaseContext()
- bootstrap.php: dbnToolsCaseContext(), dbnToolsIntersiteSecret(), dbnToolsCurrentTier()

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 20:52:54 +02:00
daveadmin 0e167bf464 Integrate dbn-legal-agent-v2: upgrade all v1 refs + add Korrespond legal-check
- Replace dbn-legal-agent with dbn-legal-agent-v2 in bootstrap.php
  (dbnToolsRunLegalCheck), DeepResearchAgent.php (interpretSeed,
  expandQueries, synthesis fallback, deploy label), BvjAnalyzerAgent.php
  (check_model label) — 8 locations total
- Add dbn-legal-agent-v2 legal threshold check to KorrespondAgent:
  called after selfCheck() in both generate() and refine(); result
  surfaced as legal_check[] in the API response
- Render legal_check card in korrespond.js using existing bvj-red-flag
  styles; shows only when non-empty
- Add .korr-legal-check CSS block in tools.css

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 23:59:07 +02:00
daveadmin 04555a96b1 Add Citation Explorer tool and graph-expansion badges to Advocate results
- citations.php + assets/js/citations.js: new tool page for browsing the
  FalkorDB citation graph by title/ID, with autocomplete, action pills
  (cites/cited_by/implements/chain), hop-by-hop navigation, and exploration trail
- advocate.js: tag graph-expanded source cards with 'via citation graph' badge
- DeepResearchAgent: propagate _graph_expanded flag through normalizeCorpusChunk
  and top_sources serialization so it reaches the frontend
- tools.css: add .dr-source-tag--graph variant (green pill)
- i18n.php: register 'citations' tool in all 4 languages with CIT icon

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 22:30:04 +02:00
daveadmin ffcf887428 feat(timeline): add live filter, actor chips, group headers, copy button, source toggle, count badge
- Live search/filter bar: filters events by keyword across event, actor, source_excerpt, date
- Actor filter chips: click to filter by actor, multi-select, teal active state
- Year/month group headers when sorted chronologically (── 2023 ──, Mar 2024 ──)
- Per-event copy button (hover-revealed 📋): copies "date · actor · event" to clipboard
- "Hide/show sources" toggle: collapses all source excerpts without re-rendering
- Count badge: "23 events · 3 actors · 2022–2025" above the list
- applyTimelineFilters() unifies sort + actor + text filters in one re-render pass
- CSV export now includes end_date column
- Reset all filter state on each new run

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 15:46:59 +02:00
daveadmin c4362738c1 feat(transcribe): GPT cleanup pass + advanced options i18n
Adds optional post-transcription cleanup via GPT-4o/GPT-4o-mini to fix
mishearing errors, punctuation, and domain terms. Speaker role labelling
now accepts a deployment param. Adds i18n strings for advanced options
panel (task, VAD filter, Whisper model, AI cleanup) in all four languages.
Updates BvjAnalyzerAgent and DeepResearchAgent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 07:23:01 +02:00
daveadmin a3d46f9756 feat: Legal Tools v1 — multilingual landing, dashboard, SSO bridge
- Public landing page at / for unauthenticated users (EN/NO/UK/PL)
- Authenticated / shows Case Workbench dashboard with manifesto strip,
  stats, and launched-tool grid (Transcribe, Timeline, BVJ, Advocate,
  Deep Research, Corpus)
- Added includes/i18n.php with full 4-language translation layer
- Extended layout.php to Case Workbench shell with tool rail, lang switcher
- AI output language normalization extended to en/no/uk/pl in PHP agents
- SSO token validation in bootstrap.php / index.php (dobetternorge.no bridge)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 22:53:27 +02:00
daveadmin 343b19d0b4 Add sub-question branching + document summary modals
- Source modal now shows LLM-generated document summary (lazy-gen + cached
  in documents.summary) instead of raw chunk text; toggle reveals matched
  chunk; "View all chunks" button fetches every chunk of the document via
  new api/document-chunks.php endpoint
- Each sub-question card gets a "Branch ↓" button that pre-fills the query
  with that sub-question and shows a context panel with the prior brief
  summary; prior_context + branch_notes are injected into interpretSeed()
  and synthesise() so the LLM knows where the research is coming from
- Upload document summaries generated at synthesis time and attached to
  upload sources alongside corpus summaries
- DB: documents.summary TEXT column added to bnl_corpus on chloe

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 19:44:27 +02:00
daveadmin 0ff4eb6d31 Add dbn-legal-agent to deep-research and advocate pipelines
- interpretSeed: uses dbn-legal-agent for Norwegian/advocate queries
- expandQueries: uses dbn-legal-agent for Norwegian sub-question generation
- synthesise: adds dbn_legal engine option (dbn-legal-agent via LiteLLM GPU)
- advocate.php: adds Norwegian specialist radio button in engine selector

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 19:12:19 +02:00
daveadmin 7bccd8c010 Expand corpus slices to 8: split ECHR/Hague, add Norwegian Courts, Bufdir, DBN Resources
- Replace combined echr_hague slice with echr (Art.8+9, HUDOC, NIM) and hague (INCADAT,
  cross-border abduction) as separate toggles; echr defaults ON, hague defaults OFF
- Add norwegian_courts slice: Domstol (src 5,26) + Rettspraksis.no (src 33, 482 docs)
- Add bufdir_guidance slice: Barneombudet (19), Bufdir (20), Statsforvalteren (31)
- Add dbn_resources slice: DBN website pages (flashcards, resource directory), defaults OFF
- Replace isWebsiteChunk() with slice-aware shouldExcludeChunk(): always strips EU AI Act
  chunks (EUR-Lex source 7 leaks through when Qdrant runs unconstrained) and DBN website
  pages unless dbn_resources slice is explicitly ON
- Update SLICE_DEFS in advocate.js and deep-research.js to match all 8 slices
- Backward compat: echr_hague key in incoming requests fans out to echr+hague

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 16:01:05 +02:00
daveadmin 640778454f Add Case Advocate tab — partisan brief grounded in Norwegian law
New /advocate.php tab: user selects who they represent (biological
father, mother, foster carer, CWS, etc.) and the agent takes their
side entirely. Adversarial sub-questions target supporting Lovdata
statutes + ECHR precedents; synthesis returns client_strengths[] and
opposing_weaknesses[] alongside the advocate brief.

- DeepResearchAgent: add advocateRole param to run(), interpretSeed(),
  expandQueries(), synthesise(). Neutral path unchanged (empty string).
- api/deep-research.php: extract + validate advocate_role from payload;
  telemetry logs tool='advocate' vs 'deep_research'.
- advocate.php: new page with role dropdown (presets + custom), same
  corpus slices/engine/controls/upload zone as deep research.
- assets/js/advocate.js: page-scoped JS; renders advocate banner,
  client strengths card (teal), advocate brief, opposing weaknesses
  card (amber), sub-Q cards, sources, uncertainty, next step.
- assets/css/tools.css: append .adv-* rules (~120 lines).
- includes/layout.php: add Advocate nav tab between Deep research and
  Summarize.
- index.php: add Advocate cap-card tile.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 12:26:05 +02:00
daveadmin 785de04f05 fix: batch embed 5 chunks at a time with flush between; fix hydrateSourceUrls SQL
Embed timeout: bnl_corpus Ollama embeds ~49 chunks sequentially in CPU mode,
easily exceeding the 60s cURL timeout. Now truncates upload text to
MAX_UPLOAD_CHARS before chunking (~21 chunks max) and embeds in batches of 5
with a progress flush between batches to keep the stream alive.

SQL error: bnl_corpus.documents lacks the temporal columns added in migration
136 (valid_from, valid_until, etc.). dbnV6QueryDocumentMeta uses IFNULL which
doesn't protect against missing columns. Replaced with a direct query using
only the columns confirmed to exist on this instance.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 11:42:38 +02:00
daveadmin 3196c33ebb fix: replace AiGateway.embedBatch with direct LiteLLM cURL for upload indexing
AiGateway uses getenv(LITELLM_MASTER_KEY) + stream_context HTTP which was
failing on the chloe virtualhost process. New dbnToolsLiteLLMEmbedBatch()
helper mirrors dbnToolsCallGpuLlm — hardcoded URL + key, cURL-first, same
pattern already proven for LLM calls. Removes AiGateway dependency from
DeepResearchAgent entirely.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 11:30:25 +02:00
daveadmin e130db8119 Deep Research v2: exclude marketing site, deep-link sources, per-agent reports
Three user-flagged issues after the first real run with a 920KB sakkyndig PDF:

1. dobetternorge.no marketing-website chunks leaked into the retrieval pool.
   ClientRagPipeline::searchAll defaults include_beta_website=true; we now
   pass false for both website flags, AND defensively drop any returned
   chunk whose source_name contains "website" or title contains
   "dobetternorge.no" before it can pollute synthesis.

2. Brief returned was "just a paragraph". Bumped synthesis max_tokens
   2200→3200, raised timeout 120→180s, and rewrote the prompt to require
   400-900 words with min 4 paragraphs when source_count>=3, covering EACH
   sub-question in its own paragraph. Now also passes authority + jurisdiction
   into the sources block so the model can pinpoint statutes correctly.

3. No way to see what each "sub-question agent" researched or click through
   to the source articles. Restructured the results panel so per-sub-question
   report cards now render ABOVE the synthesised brief. Each report shows the
   question, the rationale, and the top 3 retrieved sources for that sub-Q
   with title→deep link + 1-line excerpt. Brief follows. Consolidated
   numbered sources list at the bottom, with titles as deep links too.

Deep-link construction: source_url is hydrated via dbnV6QueryDocumentMeta
in a single batched call after retrieval. For Lovdata sources with a
section_title containing §<n>, the link is path-anchored to that section
(/§43). For other hosts (HUDOC, Regjeringen, Bufdir, etc.) we link to the
document root URL.

Telemetry: trace_metadata now carries retrieval_counts {raw_corpus,
filtered_website, post_filter_corpus, raw_upload, after_dedupe, after_topk}
so future regressions are diagnosable from the metadata.jsonl log alone.
The completion status pill surfaces the corpus/website/upload split.
2026-05-15 11:12:13 +02:00
daveadmin a1a7f442a7 Deep Research: NDJSON streaming so the connection survives long runs
Previously the endpoint returned a single JSON object at the end. Apache+
PHP-FPM buffers the entire body until PHP exits, so a 160s azure_full run
caused the browser to drop the fetch as "Failed to fetch" while the server
was still synthesising — the response then arrived to a dead socket.

Switch to application/x-ndjson with one event per line. The endpoint emits
'progress', 'start', 'step' (running/complete/warning/error), 'subq', and a
final 'final' event carrying the full result payload. Output buffering is
explicitly disabled so each line flushes through Apache as soon as the
agent emits it.

DbnDeepResearchAgent::run() now accepts an optional ?callable $emit and
fires step:running before each step + step:complete after, plus a subq
event per sub-question retrieval round.

JS reads response.body as a stream, splits on newlines, updates the
trace panel live, and renders the final result when the final event
arrives. Status pill shows live progress detail (e.g. "Synthesising with
Azure gpt-4o — this is the slowest step…").

Engine row in the form now shows expected duration per engine
(~15-45s mini, ~60-180s full, ~30-90s GPU) so users know what they're in
for before clicking Run.
2026-05-15 10:47:35 +02:00
daveadmin 4cbe0a4ac4 Add Deep Research tool — agent + rank/rerank RAG
New surface at /deep-research.php where the user pastes a question or
uploads PDF/DOCX/TXT case files and a LLM-orchestrated agent researches
the Do Better Norge legal corpus from 3-5 angles, with hybrid retrieval,
cross-encoder rerank, and synthesis that emits an inline-[n]-cited
markdown brief plus a numbered sources panel.

Uploaded documents are chunked + embedded in memory only (nomic-embed-text
via LiteLLM) and searched alongside the shared corpus during the same
request — never persisted to disk, DB, or Qdrant.

Reuses ClientRagPipeline::searchAll (hybrid + rerank), dbnV6 slice
helpers, and the existing extract.php text-extraction logic via a new
dbnToolsExtractUploadedFile() helper. Also adds dbnToolsCallGpuLlm()
helper in bootstrap.php — fixes a latent bug where LegalTools.php
was already calling that name with no definition.

Search.php is unchanged.
2026-05-15 10:30:47 +02:00