dobetternorge-tools

Author	SHA1	Message	Date
daveadmin	3f7d4eef13	feat(tools): add letter length + summary depth controls; harden korrespond §-discipline - Summarize: new depth param (brief/standard/detailed) with depth-aware prompt instructions and coverage mandate; wired through API + JS - Korrespond: new letter length param (concise/standard/detailed) injected as Lengde: instruction in draft pass; wired through API + JS - Korrespond draft prompt: add §-discipline rule (cite only directly relevant §§) plus Opphevet guard (aligned with dobetterlegal-tools) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-04 13:44:02 +02:00
daveadmin	8b99ceec3b	feat(rag): add doc-summary pre-filtering to DbnLegalToolsService::search Before chunk retrieval, embed the query against bnl_doc_summaries Qdrant collection to identify the most semantically relevant documents. The resulting document IDs are passed as shared_doc_ids to searchAll(), narrowing the shared-corpus chunk search to those documents only. Applied to the 'shared' and 'both' scope paths (not 'private', which has no shared corpus). Non-fatal: on any error preFilterDocIds stays empty and search falls back to current unfiltered chunk retrieval. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-03 10:15:57 +02:00
daveadmin	c84ed2ed78	fix(tools): parse-harden Do Better Legal ask against leaky fine-tune output The dbn-legal-agent-v3 fine-tune (Track 1 / family) emits a labelled-prose template — duplicate `answer:` prefixes, markdown-escaped underscores (`\_`), and a trailing raw JSON blob — rather than the strict JSON the Azure/gpt-4o path produces via response_format. decodeJsonObject() returned null on that invalid JSON, so ask() dumped the entire raw blob into `answer`. Fix at the parse layer (no upstream response_format change, to avoid fighting the fine-tune's training): - dbnToolsRepairJsonText(): strip fences, drop only invalid `\_`/`\*` escapes, then balanced-brace scan collecting every top-level {...} (longest first) to recover an appended JSON object. Shared by both gateways' decodeJsonObject(), so all JSON tools benefit. - dbnToolsParseLabeledFields(): parse labelled-prose into real fields when no JSON decodes, tolerating escaped key names and collapsing duplicate prefixes. - ask() null-fallback now builds clean structured fields from the parsed prose instead of dumping raw; what_remains_uncertain becomes a proper list. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-02 17:36:35 +02:00
daveadmin	7fcd317205	feat(tools): reposition as Do Better Legal two-track Norwegian-law MCP De-family-ify shared JSON tools (persona-aware routing + neutral base prompt), make the verification review pick its engine per track (family/child-welfare -> dbn-legal-agent-v3, others -> gpt-4o interim), and route product-name strings through dbnToolsProductName(). Rebrand the MCP/tools surface (mcp.php + i18n mcp_* strings) to Do Better Legal. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-02 07:45:17 +02:00
daveadmin	662fbf7d6d	feat(tools): persona-driven multi-domain corpus + model routing Generalize the family-locked legal tools into caveauAI persona profiles (client 57 chat profiles, resolved in-process via the chat_profiles bridge). Each tool accepts an optional `profile` slug that scopes the corpus package(s), search method, system prompt and synthesis model; omitting it falls back to the family-legal package so existing behaviour is unchanged. - dbnToolsResolvePersona / dbnToolsListPersonas / dbnToolsBootChatProfiles in bootstrap.php; new api/personas.php + dbn.list_personas MCP tool. - LegalTools search/ask/corpusContextForSummarize and the BvjAnalyzer / LegalAnalysis / translate paths take the persona's packages + prompt + model. - Persona <select> on ask/search/summarize (populated from api/personas.php). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-01 20:49:58 +02:00
daveadmin	5a0ef89dca	feat(mcp): expose corpus_search, korrespond_refine, extract_text tools Restores the 3 tools (manifest + invoke arms + invokeExtract helper), the citation-atom RAG lever in LegalTools/corpus-search, and the catalog icons. These were live on prod via rsync but uncommitted, so a git-pull deploy reverted the manifest from 22 to 19 tools. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-01 16:45:41 +02:00
daveadmin	234ab7278b	Timeline: group same-date/actor events, clean badges, Bedrock routing - renderTimeline(): group consecutive same-date+actor events into one card with a bullet list; single events keep their current layout - Date format: YYYY-MM-DD → "1 Jun 2023" (3-letter month, international) - Time shown in header when available - Remove date_type badge; confidence badge replaced by amber ⚠ flag on low-confidence events only (high/medium border colour still shows) - LegalTools.php: resolve azure_full/azure_mini to Bedrock Sonnet/Haiku when DbnBedrockGateway is active; claude_sonnet/claude_haiku also handled - timeline.php + api/timeline.php: engine labels updated (Claude Haiku/Sonnet); claude_haiku + claude_sonnet added to valid engine list - i18n engine labels updated in all 4 languages Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 23:21:35 +02:00
daveadmin	8a11001bff	Add AWS Bedrock three-tier gateway routing (LiteLLM via Colin) Routes AI tools across three tiers based on task complexity: - Azure GPT-4o-mini always: redact, translate, timeline-basic, search-legal (mechanical tasks) - Claude Haiku 4.5 (Bedrock): ask, summarize, timeline-deep, citations (Norwegian nuance) - Claude Sonnet 4.6 (Bedrock): korrespond, legal-analysis, deep-research, barnevernet-analyze, discrepancy-find, advocate (public-facing legal output) No AWS credentials in app — credentials live in LiteLLM on Colin (same as nova-lite). Rollback: DBN_BEDROCK_ENABLED=false in .env, no code push needed. Includes extended thinking support for Pro deep-research via chatWithThinking(). Claude Opus 4.7 constant added for future premium tier (needs litellm_config.yaml entry). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 15:22:48 +02:00
daveadmin	17ad54cf36	Add chunked timeline routing	2026-05-25 12:34:41 +02:00
daveadmin	3ad8f4843c	Harden timeline quick extraction	2026-05-25 11:14:21 +02:00
daveadmin	983c423740	Fix nova-lite JSON: drop response_format, strip markdown fences nova-lite ignores json_object constraint and returns {} empty; without it, it wraps output in ```json fences. Strip fences before decodeJsonObject. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 10:51:24 +02:00
daveadmin	f00d3d68e5	Add Quick mode (nova-lite/Bedrock) as 3rd tier for timeline tool Timeline now offers Quick/Standard/Deep: nova_lite routes to Amazon Bedrock nova-lite via LiteLLM (1 credit, ~2s faster), azure_mini stays gpt-4o-mini (1 credit), azure_full stays gpt-4o (2 credits, Pro only). ToolModels tier rules: free→nova_lite only, plus→quick/standard, pro→all three. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 10:26:07 +02:00
daveadmin	d47024ed67	timeline: remove GPU, add SSE status updates, DOCX export, single-file, engine-aware credits - Remove GPU/cuttlefish engine from timeline.php, api/timeline.php, LegalTools.php, tools.js (all 4 langs) - Add engine-aware credit cost: gpt-4o-mini=1 credit, gpt-4o=2 credits (matches redact pattern) - Remove multiple attribute from file input (single document only) - New api/timeline-stream.php: SSE endpoint emitting status events + final result - New api/timeline-download.php: DOCX export of timeline events - LegalTools::timeline() gains ?callable $onProgress for live status updates - tools.js: spinner on run, SSE streaming fetch, Export to Word button - Save to My Docs was already wired (showSaveResultButton at line 1136) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 09:32:28 +02:00
daveadmin	56cd87dd7b	redact: UX overhaul — engine simplification, credits, spinner, save-to-docs, badges - Remove GPU/regex engine options; keep only azure_mini (1 credit) and azure_full (2 credits) - Variable credit cost: engine-aware pre-check and charge in api/redact.php; PricingCatalog base = 1 - Fix ATTORNEY not preserved when keepOfficials=true: add to LLM prompt, generic-tag, pseudonym regexes - Replace Azure credits hint with per-engine credit cost text (all 4 languages) - Single-file upload only (was: up to 5); simplify status messages - Clear previous redaction output and show pulsing spinner when a new run starts - Add "Save to My Docs" button in redact output panel (corpus-save.js path) - corpus-save.js: capture source_doc_ids from button dataset, pass in POST payload - api/save-to-corpus.php: accept source_doc_ids, store first as source_url=corpus-doc:{id} - doc-picker.js: show "✂ Redacted" badge for documents saved from the redact tool - CSS: .redact-working spinner, doc-item__badge--redact pill styles Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 08:18:51 +02:00
daveadmin	b21bfb2f1d	Add NOK pricing catalog, credit ledger, success-based charging, and tier-gated model routing - PricingCatalog.php: single source of truth for plans (free/plus/pro), top-ups, Stripe price env keys, tool costs (0–6 credits), STT variable billing, feature limits - FreeTier.php: monthly-first credit deduction, ledger (user_tool_credit_ledger), STT reservation/settle/release, monthly reset, trial logic - StripeClient.php: canonical SKUs (plus/pro/topup_100/300/1000), legacy aliases kept - stripe-checkout.php: subscription vs payment mode, trial gating, catalog metadata - stripe-webhook.php: idempotent via stripe_events, handles subscription lifecycle + invoice.paid renewal + one-time topup credit grants - All API tools: success-based credit deduction (check before, charge after) - transcribe.php: file-size heuristic reservation, settle from actual provider duration - ask.php + LegalTools.php: ToolModels engine resolution — Pro gets gpt-4o - KorrespondAgent.php + korrespond.php: tier-gated draft deployment — Free/Plus gets gpt-4o-mini, Pro gets gpt-4o - pricing.php: NOK-only, plan cards, top-up packs, Organisation contact card, tool cost table, separate monthly/prepaid balance display - 003_pricing_credit_catalog.sql: ledger and STT reservation tables Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-24 13:42:27 +02:00
daveadmin	e768662efe	Add Summarize Document tool — engine selector, file upload, optional corpus enrichment - summarize.php: full custom inline form (replaces tool_form.php wrapper) with lang switcher, azure_mini/azure_full/gpu engine selector, 8 corpus-slice toggles (all off by default), doc picker, file upload zone, and textarea - api/summarize.php: rewritten to streaming NDJSON (matches barnevernet pattern); accepts JSON payload with text, language, engine, slices[], doc_ids[] - includes/LegalTools.php: adds corpusContextForSummarize() (keyword search via ClientRagPipeline) and summarizeWithContext() (engine-aware LLM call with optional corpus prepend); returns structured JSON matching existing summarize format - assets/js/summarize.js: self-contained IIFE handling file upload via api/extract.php, slice toggles, NDJSON stream reader, result rendering, and trace panel update - includes/i18n.php: adds 'summarize' to nav in all 4 languages (EN/NO/UK/PL), inserted after 'redact' in the tool order with icon 'SZ' Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-23 23:25:40 +02:00
daveadmin	b014638f39	feat(corpus): add save-to-corpus + private corpus search scope - POST /api/save-to-corpus.php — saves tool output text to user's default CaveauAI corpus via ClientRagPipeline - api/case/upload.php — dual-writes uploaded PDFs to CaveauAI client_documents (best-effort) - assets/js/corpus-save.js — shared <dialog> handler for .js-save-corpus buttons on all tool pages - includes/layout_footer.php — injects corpus-save.js + shared save dialog markup - korrespond/deep-research/barnevernet/discrepancy JS — save-to-corpus buttons on output sections - api/search.php + LegalTools::search() — corpus_scope param ('shared'|'private'|'both'), merges personal CaveauAI corpus with shared legal library when 'both' - includes/tool_form.php + assets/js/tools.js — corpus scope radio toggle shown on search tab - api/user-docs.php — add POST upload method for non-SSO authenticated users Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 17:50:32 +02:00
daveadmin	28932297b3	Add user context notes field to timeline tool Adds an optional textarea below the main text input where users can provide clarifications to guide the LLM — e.g. year anchors, actor aliases, or focus instructions. Notes are injected into the prompt as a clearly delimited block and translated across all four UI languages (en/no/uk/pl). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-19 12:36:37 +02:00
daveadmin	59b39ff85b	feat(redact): tag highlighting, inventory panel, before/after toggle, gpt-4o upgrade - CSS: colour-coded [TAG] spans by entity type (person=pink, org=blue, place=green, date=amber, id=purple) - Inventory panel: collapsible list showing tag → original text mappings with occurrence counts, sourced from new redaction_map API response key - Before/after toggle: Redacted / Original view-switch buttons wired to lastOriginalText captured at submission time - One-click gpt-4o upgrade button when mini or GPU engine was used - Backend: redaction_map built from applied LLM entities (tag → originals + occurrence count via substr_count on final text) - renderResults now calls setupRedactViewToggle() after DOM is written Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-18 08:22:41 +02:00
daveadmin	e32ee60e78	feat(timeline): tighten prompt for accuracy — year inference, month names, actor normalization, confidence calibration - Add 4-step year inference rule for DD.MM. entries (scan backward/forward for anchor year) - Add Norwegian month-name formats (18. september, den 18. september 2025, etc.) with month lookup table - Add $relativeInstruction to tell LLM upfront when relative dates are excluded (not just PHP-filtered post-hoc) - Define confidence calibration criteria explicitly (high/medium/low) - Improve source_excerpt guidance: most diagnostic phrase, not just any verbatim phrase - Add actor normalization for Norwegian institutions (Barnevernstjenesten, Fylkesnemnda, Statsforvalteren, etc.) - Add deduplication rule for events appearing across multiple documents - Add end_date field for date_type=period events - Improve what_we_found schema hint to require count/range/actors/gaps - Increase max_tokens to 8000 for azure_full (gpt-4o) to avoid truncation on large documents - Tighten system prompt with Norwegian CPS legal chain context Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-18 07:11:31 +02:00
daveadmin	13572e9dfb	feat: extract and display event times on timeline (kl. HH:MM etc.) Prompt now instructs the model to extract time of day (HH:MM) when present in Norwegian formats: kl. 14:30, kl 09.00, 14:30, 14.30. renderTimeline shows time as a muted inline annotation next to the date. CSV export gains a Time column after Date. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 23:03:20 +02:00
daveadmin	a3d46f9756	feat: Legal Tools v1 — multilingual landing, dashboard, SSO bridge - Public landing page at / for unauthenticated users (EN/NO/UK/PL) - Authenticated / shows Case Workbench dashboard with manifesto strip, stats, and launched-tool grid (Transcribe, Timeline, BVJ, Advocate, Deep Research, Corpus) - Added includes/i18n.php with full 4-language translation layer - Extended layout.php to Case Workbench shell with tool rail, lang switcher - AI output language normalization extended to en/no/uk/pl in PHP agents - SSO token validation in bootstrap.php / index.php (dobetternorge.no bridge) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 22:53:27 +02:00
daveadmin	55e11cb649	Azure: route azure_mini engine to gpt-4o-mini explicitly The .env default DBN_AZURE_OPENAI_CHAT_DEPLOYMENT is gpt-4o, so the azure_mini branch (which just called ->chat() without withDeployment) was silently hitting gpt-4o too. Both UI engine options resolved to the same model, and timed out together on long Norwegian documents. Fix: explicitly route azure_mini → gpt-4o-mini in both timeline and redact paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 09:38:55 +02:00
daveadmin	85c3cee719	Azure: raise chat timeout 45s → 90s default; timeline uses 120s Timeline was using no explicit timeout, falling back to the gateway's 45s default, which timed out on long Norwegian legal documents. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 02:09:02 +02:00
daveadmin	f183678f35	Redact: catch soft dates (years, month+year, ranges, prepositions) Adds Nordic-pack regex patterns for: - DD.MM.YYYY / DD/MM/YYYY / YYYY-MM-DD - Year ranges (2011/2012, 2018-2019) - Month + year (Norwegian + English, with optional day) - Year preceded by temporal preposition (i 2015, fra 2019, rundt 2018) Also renames the entity toggle from "Dates of birth" to "Dates" (broader scope) in all four languages, and expands the LLM prompt so soft date references in free text are caught even when regex misses them. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 01:58:35 +02:00
daveadmin	cdd0fb970b	fix(timeline): explicit Norwegian date format recognition in prompt Add DD.MM.YY, D.M., diary-line format instructions so the model doesn't skip short Norwegian dates like 18.09.25 or 6.1. Two-digit years always treated as 20YY. Lines starting with date+colon are always events. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 01:10:16 +02:00
daveadmin	7690ed17ee	feat(timeline): full form UI with engine selection and advanced settings Add 4-language switcher (EN/NO/UK/PL), engine choice (Azure mini/full, GPU/cuttlefish), and expandable Advanced panel (Focus, Confidence filter, Date types) to timeline.php. Wire new params through api/timeline.php and LegalTools::timeline() with engine routing, focus-aware prompt injection, and confidence/date-type post-filters. Add TIMELINE_I18N to tools.js with improved renderTimeline() confidence colour-coding and new CSS classes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 00:59:12 +02:00
daveadmin	8c12d5e778	Redact tool: rich UI, multilingual, engine choice, output formats - Custom inline form (EN/NO/UK/PL lang switcher) replacing generic stub - Engine selector: Azure gpt-4o-mini (default), gpt-4o, GPU cuttlefish, regex-only - Entity type toggles: names, organisations, places, dates of birth - Output formats: contextual role tags, generic [PERSON], Norwegian pseudonyms - Keep officials mode: judges/experts kept as [JUDGE: Andersen] format - Exempt names list: specific names excluded from redaction - Hint paragraphs explaining each option in all four languages - Backend: engine routing, callGpuLlm(), applyGenericTags(), applyPseudonymization() - AzureOpenAiGateway: withDeployment() clone pattern for per-call model override Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 00:20:16 +02:00
daveadmin	bddafea049	Timeline: document upload, upgraded prompt, CSV export, date_type badge	2026-05-13 08:10:40 +02:00
daveadmin	634a4fa154	Raise MAX_PASTE_CHARS to 128K and redaction max_tokens to 8000	2026-05-13 07:41:41 +02:00
daveadmin	95685862ab	Redact: multi-doc upload, contextual person naming, aliases - Extract limit raised from 32K to 128K chars per file (long legal docs now fit) - Redact API body/text limits raised (400KB / 128K chars) to match - Upload zone accepts multiple files (up to 5); extracted text concatenated with doc separator and combined before redaction; shows per-file char counts - LLM redact pass now infers contextual person roles (FATHER, MOTHER, CHILD, ATTORNEY, JUDGE, etc.) instead of generic [PERSON] for all names; same individual gets consistent tag throughout the document - Tag validation widened to allow any [A-Za-z0-9_- ] pattern (not just the five hardcoded tags), supporting contextual and alias tags - Alias UI added to Redact mode: user maps real names to bracketed aliases (e.g. "David Jr" -> [Junior]); aliases injected into LLM system prompt as override instructions; max 20 aliases, 100 chars each - max_tokens raised from 2000 to 4000; timeout from 60s to 90s for larger docs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 07:17:02 +02:00
daveadmin	3c8d7ebc34	feat: pass temporal_mode and as_of_date through DBN search API Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 18:45:54 +02:00
daveadmin	1f4f01bda3	Add public showcase landing, doc summary cards, and chunk toggle - index.php: public showcase landing page (hero, how-it-works, capabilities, evidence mock, login form) visible to unauthenticated visitors; full OG/SEO meta; app shell hidden behind auth as before - tools.css: showcase section styles (gradient hero, step cards, capability grid, CTA button, evidence mock, footer) - LegalTools.php: sourceFromChunk() batch-fetches doc_summaries from RAG DB for non-private chunks; excerpt shows doc summary when available, falls back to raw chunk text; chunk_text field always carries the raw excerpt - tools.js: renderEvidenceItem() shows doc summary as card body; adds a collapsible "View chunk" toggle when summary differs from raw chunk text Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 08:37:36 +02:00
daveadmin	62dbb8d900	Gate tools login with Caveau access	2026-05-08 17:12:38 +02:00
daveadmin	9b22947eb2	Two-pass PII redaction with multi-country pattern packs Pass 1: deterministic regex with Nordic/European/ECHR/Global packs covering fødselsnummer, Swedish personnummer, Danish/Finnish CPR, UK NI, French INSEE, IBAN, EU phones, ECHR application numbers, DOB, and national ID label patterns. Pass 2: LLM semantic scan (Azure OpenAI) finds names, orgs, places and identifying descriptions missed by regex. Runs on pre-redacted text so no raw PII reaches the LLM. Adds region selector (Nordic/European/ECHR/Global) to the Redact UI. Falls back gracefully when Azure is not yet configured. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-07 01:27:52 +02:00
daveadmin	2d8d1c7409	Initial release: Do Better Norge Legal Tools Hub Five MVP tools (Ask, Search, Summarize, Timeline, Redact) with email+password auth, Azure OpenAI gateway, evidence trail panel, and process-and-forget privacy default. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-07 00:01:07 +02:00

36 Commits
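
Note on commit c84ed2ed78: the parse-layer repair it describes (strip fences, drop the invalid `\_`/`\*` escapes, then scan for balanced top-level `{...}` spans and try the longest first) can be illustrated with a short sketch. This is not the repository's code, which is PHP and is not shown in this log. The Python below is an assumption-laden illustration, and every function name in it (`strip_fences`, `drop_invalid_escapes`, `top_level_objects`, `recover_json_object`) is hypothetical.

```python
import json
import re
from typing import List, Optional


def strip_fences(text: str) -> str:
    """Remove Markdown code-fence markers such as ```json and ```."""
    return re.sub(r"```[a-zA-Z]*", "", text).strip()


def drop_invalid_escapes(text: str) -> str:
    """Drop the Markdown escapes \\_ and \\*, which are not valid JSON escapes.

    Simplification: a literal backslash followed by _ or * inside a JSON
    string would also be rewritten; a production version would need more care.
    """
    return text.replace("\\_", "_").replace("\\*", "*")


def top_level_objects(text: str) -> List[str]:
    """Collect every balanced top-level {...} span, ignoring braces in strings."""
    spans = []
    depth = 0
    start = 0
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"' and depth > 0:
            # Only track strings inside an object; quotes in prose are ignored.
            in_string = True
        elif ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                spans.append(text[start:i + 1])
    return spans


def recover_json_object(raw: str) -> Optional[dict]:
    """Return the first decodable JSON object, trying the longest span first."""
    cleaned = drop_invalid_escapes(strip_fences(raw))
    candidates = [cleaned] + sorted(top_level_objects(cleaned), key=len, reverse=True)
    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None
```

Given labelled prose followed by a trailing JSON blob, `recover_json_object` returns the trailing object as a dict, and returns `None` when nothing decodes. That `None` is the point at which the commit says `ask()` falls back to parsing the labelled prose instead.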