Commit Graph

44 Commits

Author SHA1 Message Date
daveadmin f2fbb69e0a feat: lightweight header/footer — IBM Plex Sans, slimmed nav badge, compact footer
Drops Roboto + IBM Plex Mono from Google Fonts, replaces with IBM Plex
Sans (matching dobetternorge.no). Nav badge loses bordered pill, becomes
plain uppercase label with slash separator. Footer cut from 3-column
text-wall (~300 words) to compact 2-column layout (~50 words) — logo +
tagline + privacy note on left, 5 links in 2 columns on right.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 07:10:46 +02:00
daveadmin f0b7d343a3 feat: unified landing page with auth-aware gate + /dashboard.php
Removes the logged-in vs logged-out page bifurcation. index.php now
always renders the public landing (tools overview, hero, trust section)
with auth-conditional nav/hero CTAs and a two-column member/register
gate shown only to unauthenticated visitors. Authenticated workbench
extracted to new dashboard.php. Adds 8 new i18n keys across all 4
languages and new CSS for auth-nav, hero CTA, two-column gate, and
register buttons.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 06:45:44 +02:00
daveadmin 93b28b8783 feat: rebuild preview pages with real API content and full i18n
- preview.php: localized pitch + features for en/no (uk/pl fall back to en)
- Sample outputs now match actual API response format: streaming pipeline
  steps, confidence fields, entity counts, corpus slice names, speaker roles
- i18n.php: add 10 preview-specific keys across all 4 languages (en/no/uk/pl)
- Transcribe: shows 3-engine cascade + real speaker roles (saksbehandler/dommer/advokat)
- Timeline: shows date_type, confidence, what_remains_uncertain, next_practical_step
- Redact: shows two-pass pipeline (regex Nordic pack + LLM NER) + contextual tags
- Barnevernet: shows 7-step streaming trace + procedural flag severity levels
- Advocate: shows partisan brief with advocate_role + citation confidence
- Deep Research: shows corpus slices + sub-questions + contradiction-aware synthesis
- Corpus: shows real Qdrant + Azure AI Search config, hybrid search result

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-17 14:39:12 +02:00
daveadmin 849a7cf434 feat: add public tool preview pages with realistic samples
Each landing card now links to preview.php?tool=SLUG — a dedicated
public page with an expanded pitch, 4 capability bullets, and a
realistic Norwegian-language sample input+output for all 7 tools.

- preview.php — new public page (no auth required), switch-driven content
- includes/tool-svgs.php — extracted $toolSvgs into shared include
- index.php — require tool-svgs.php, card href → preview.php?tool=SLUG
- assets/css/tools.css — lt-preview-* component styles appended

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-17 13:08:46 +02:00
daveadmin c350750b7e fix: serve logo locally to fix broken image on nav/footer
External URL was unreachable from tools subdomain (CSP or cross-origin block),
causing a grey placeholder rectangle. Logo now served from assets/images/ and
brightness/invert filter removed — logo is white-on-transparent, displays
correctly on dark nav and footer without filtering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-17 12:41:11 +02:00
daveadmin 38683cffc0 feat: rebrand landing page to match dobetternorge.no
- Add sticky navy nav with logo-header.webp, Legal Tools badge, lang switcher, red CTA
- Replace showcase-hero with full-bleed dark hero (Crimson Pro, IBM Plex Mono, stat pills)
- Redesign tool cards: 3-col grid, 178px illustrated SVG art per card (7 unique illustrations)
- Add lt-trust 3-col strip and lt-access navy gate panel
- Rebuild footer with 3-col navy layout matching main site
- Add Crimson Pro / Roboto / IBM Plex Mono Google Fonts via <link> + @import
- CSS: new lt-* variables, all new landing component styles appended to tools.css

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-17 12:33:31 +02:00
daveadmin 8b77acb828 feat: free-tier credit system + Syttende Mai access for Google users
- FreeTier.php: credit check/deduct/reset engine with hourly rate limit
- bootstrap.php: dbnmDb() singleton, dbnToolsIsFreeTier(), credit gate helpers
- index.php: store tier=free|approved in session from SSO JWT
- All 7 API endpoints: credit gate (402/429) + X-Credits-Remaining header
- layout.php: credit meta tag, JS balance var, Syttende Mai banner (05-17 only)
- tools.js: credit badge in topbar, 402 modal, 429 toast, dbnUpdateCredits()
- barnevernet.js + deep-research.js: wire 402/429 handling for NDJSON streams
- tools.css: styles for credit badge, no-credits modal, rate-limit toast

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-16 21:05:08 +02:00
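The credit gate described in 8b77acb828 can be pictured roughly as below. This is a minimal Python sketch rather than the repository's PHP: the `FreeTierGate` class, its default limits, and the in-memory storage are assumptions made for illustration. Only the 402/429 status codes and the `X-Credits-Remaining` header come from the commit message.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FreeTierGate:
    """Hypothetical in-memory gate; the real FreeTier.php is not shown in the log."""
    credits: int = 10        # assumed starting balance
    hourly_limit: int = 5    # assumed hourly request cap
    _calls: list = field(default_factory=list)

    def check_and_deduct(self, now=None):
        """Return (http_status, headers) for one tool request."""
        now = time.time() if now is None else now
        # Keep only the requests made within the last hour.
        self._calls = [t for t in self._calls if now - t < 3600]
        if self.credits <= 0:
            # Out of credits: 402 Payment Required.
            return 402, {"X-Credits-Remaining": "0"}
        if len(self._calls) >= self.hourly_limit:
            # Credits left, but too many requests this hour: 429.
            return 429, {"X-Credits-Remaining": str(self.credits)}
        self.credits -= 1
        self._calls.append(now)
        return 200, {"X-Credits-Remaining": str(self.credits)}
```

Checking the balance before the rate limit is a design choice of this sketch; the commit does not say which check runs first.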
daveadmin 568314c554 fix: wire GCP Speech client into tools transcribe (was using unreachable ai-portal path)
Copies GcpSpeechClient into the tools repo so it's deployed with the code;
removes the broken dbnToolsAiPortalRoot() path that resolved to a nonexistent
/home/dobetternorge/ai-portal directory. Also restarted the CPU Whisper
service which had a stuck CLOSE_WAIT socket causing silent fetch failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-16 13:43:28 +02:00
daveadmin c6a9cc9199 feat: add site footer with privacy statement, CaveauAI attribution, and AI disclaimer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 23:24:49 +02:00
daveadmin 13572e9dfb feat: extract and display event times on timeline (kl. HH:MM etc.)
Prompt now instructs the model to extract time of day (HH:MM) when
present in Norwegian formats: kl. 14:30, kl 09.00, 14:30, 14.30.
renderTimeline shows time as a muted inline annotation next to the date.
CSV export gains a Time column after Date.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 23:03:20 +02:00
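Commit 13572e9dfb handles time extraction through the model prompt, not through code. Purely to illustrate the four formats it lists (kl. 14:30, kl 09.00, 14:30, 14.30), here is a hedged regex sketch in Python; `TIME_RE` and `extract_time` are hypothetical names and this is a swapped-in technique, not the repository's approach.

```python
import re

# Optional "kl." / "kl" prefix, hour 0-23, ":" or "." separator, minute 00-59.
# The lookarounds stop the pattern from matching inside dates such as 18.09.25.
TIME_RE = re.compile(
    r"(?<!\d)(?<!\d[.:])(?:kl\.?\s*)?([01]?\d|2[0-3])[:.]([0-5]\d)(?![:.]?\d)",
    re.IGNORECASE,
)


def extract_time(text):
    """Return the first time of day found in text as 'HH:MM', or None."""
    m = TIME_RE.search(text)
    if m is None:
        return None
    return f"{int(m.group(1)):02d}:{m.group(2)}"
```

A bare `12.05` is ambiguous between a time and a day.month date, which is one reason a prompt-based approach with surrounding context may be preferable to a regex here.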
daveadmin c5c90d92f3 feat: add Redact tool to launched nav and dashboard (all 4 languages)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 22:58:20 +02:00
daveadmin a3d46f9756 feat: Legal Tools v1 — multilingual landing, dashboard, SSO bridge
- Public landing page at / for unauthenticated users (EN/NO/UK/PL)
- Authenticated / shows Case Workbench dashboard with manifesto strip,
  stats, and launched-tool grid (Transcribe, Timeline, BVJ, Advocate,
  Deep Research, Corpus)
- Added includes/i18n.php with full 4-language translation layer
- Extended layout.php to Case Workbench shell with tool rail, lang switcher
- AI output language normalization extended to en/no/uk/pl in PHP agents
- SSO token validation in bootstrap.php / index.php (dobetternorge.no bridge)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 22:53:27 +02:00
daveadmin ba6c197f1b refactor: remove dbn_legal engine from BVJ Analyzer
dbn-legal-agent is not suitable for structured RAG synthesis:
- Fine-tune contamination appends feedback loops after JSON output
- 7-min latency vs 45s for gpt-4o-mini
- 8B base gives weaker instruction-following on complex JSON contracts
- No improvement in citation accuracy (RAG provides the legal content)

dbn-legal-agent kept for open-ended freeform Norwegian legal Q&A
where citation structure isn't required. BVJ synthesis now uses
azure_mini|azure_full|gpu only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 22:13:16 +02:00
daveadmin 7e0fce4167 fix: rein in dbn-legal-agent feedback-loop contamination (stop seqs + JSON extract + system prompt)
2026-05-15 22:05:49 +02:00
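The "JSON extract" part of 7e0fce4167 is not spelled out in the log. One common way to recover a JSON object when a model appends extra text after it is sketched below in Python, using the standard library's `json.JSONDecoder.raw_decode`; the function name `extract_first_json` is hypothetical and the repository's PHP may work differently.

```python
import json


def extract_first_json(raw):
    """Return the first decodable JSON object in raw, ignoring trailing text."""
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one value starting at `start` and
            # tolerates any text that follows it.
            obj, _end = decoder.raw_decode(raw, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = raw.find("{", start + 1)
    return None
```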
daveadmin 6161ceea75 fix: pass $emit into synthesiseBvj so dbn-legal-agent keepalives fire
2026-05-15 21:51:16 +02:00
daveadmin bc52690472 fix: BVJ party extraction robustness + dbn-legal-agent streaming
Party extraction: wider excerpt (12k chars), cleaner prompt, fallback for
root-level array responses, log raw response on unexpected structure.

dbn-legal-agent synthesis: replace blocking curl (200s timeout) with an
SSE streaming approach (CURLOPT_WRITEFUNCTION). PHP now emits keepalive
progress events every 15 s during generation, preventing browser network
errors on slow ~6 t/s cuttlefish inference. Timeout extended to 660 s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 21:35:18 +02:00
daveadmin 9b8cb9c6dc fix: raise file upload limit from 4 MB to 8 MB
PHP constant and all JS client-side guards updated. Server PHP ini is 64M.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 20:57:25 +02:00
daveadmin 43cf5b8ce4 feat: Barnevernet Analyzer — document analysis + partisan RAG brief
7-step agent pipeline: document classification, party extraction, timeline
extraction, corpus RAG (child_welfare/echr/family_core/bufdir_guidance),
and synthesis using the user's chosen engine (including dbn-legal-agent).
Progressive NDJSON streaming renders doc_meta, parties, and timeline cards
before the final advocacy brief and procedural red flags arrive.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 20:49:46 +02:00
daveadmin 343b19d0b4 Add sub-question branching + document summary modals
- Source modal now shows LLM-generated document summary (lazy-gen + cached
  in documents.summary) instead of raw chunk text; toggle reveals matched
  chunk; "View all chunks" button fetches every chunk of the document via
  new api/document-chunks.php endpoint
- Each sub-question card gets a "Branch ↓" button that pre-fills the query
  with that sub-question and shows a context panel with the prior brief
  summary; prior_context + branch_notes are injected into interpretSeed()
  and synthesise() so the LLM knows where the research is coming from
- Upload document summaries generated at synthesis time and attached to
  upload sources alongside corpus summaries
- DB: documents.summary TEXT column added to bnl_corpus on chloe

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 19:44:27 +02:00
daveadmin 0ff4eb6d31 Add dbn-legal-agent to deep-research and advocate pipelines
- interpretSeed: uses dbn-legal-agent for Norwegian/advocate queries
- expandQueries: uses dbn-legal-agent for Norwegian sub-question generation
- synthesise: adds dbn_legal engine option (dbn-legal-agent via LiteLLM GPU)
- advocate.php: adds Norwegian specialist radio button in engine selector

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 19:12:19 +02:00
daveadmin 7bccd8c010 Expand corpus slices to 8: split ECHR/Hague, add Norwegian Courts, Bufdir, DBN Resources
- Replace combined echr_hague slice with echr (Art.8+9, HUDOC, NIM) and hague (INCADAT,
  cross-border abduction) as separate toggles; echr defaults ON, hague defaults OFF
- Add norwegian_courts slice: Domstol (src 5,26) + Rettspraksis.no (src 33, 482 docs)
- Add bufdir_guidance slice: Barneombudet (19), Bufdir (20), Statsforvalteren (31)
- Add dbn_resources slice: DBN website pages (flashcards, resource directory), defaults OFF
- Replace isWebsiteChunk() with slice-aware shouldExcludeChunk(): always strips EU AI Act
  chunks (EUR-Lex source 7 leaks through when Qdrant runs unconstrained) and DBN website
  pages unless dbn_resources slice is explicitly ON
- Update SLICE_DEFS in advocate.js and deep-research.js to match all 8 slices
- Backward compat: echr_hague key in incoming requests fans out to echr+hague

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 16:01:05 +02:00
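The backward-compatibility rule in 7bccd8c010 (a legacy `echr_hague` key fans out to `echr` + `hague`) could look roughly like this. It is a Python sketch with a hypothetical `normalise_slices` helper; only the three defaults stated in the commit message are modelled, and letting explicit keys override the legacy key is an assumption.

```python
# Defaults named in the commit message; other slices' defaults are not stated there.
DEFAULTS = {"echr": True, "hague": False, "dbn_resources": False}


def normalise_slices(requested):
    """Merge requested slice toggles over defaults, expanding the legacy key."""
    slices = dict(DEFAULTS)
    legacy = requested.get("echr_hague")
    if legacy is not None:
        # Old clients send one combined toggle; apply it to both new slices.
        slices["echr"] = bool(legacy)
        slices["hague"] = bool(legacy)
    for key, enabled in requested.items():
        if key != "echr_hague":
            # Assumption: explicit new-style keys win over the legacy key.
            slices[key] = bool(enabled)
    return slices
```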
daveadmin 640778454f Add Case Advocate tab — partisan brief grounded in Norwegian law
New /advocate.php tab: user selects who they represent (biological
father, mother, foster carer, CWS, etc.) and the agent takes their
side entirely. Adversarial sub-questions target supporting Lovdata
statutes + ECHR precedents; synthesis returns client_strengths[] and
opposing_weaknesses[] alongside the advocate brief.

- DeepResearchAgent: add advocateRole param to run(), interpretSeed(),
  expandQueries(), synthesise(). Neutral path unchanged (empty string).
- api/deep-research.php: extract + validate advocate_role from payload;
  telemetry logs tool='advocate' vs 'deep_research'.
- advocate.php: new page with role dropdown (presets + custom), same
  corpus slices/engine/controls/upload zone as deep research.
- assets/js/advocate.js: page-scoped JS; renders advocate banner,
  client strengths card (teal), advocate brief, opposing weaknesses
  card (amber), sub-Q cards, sources, uncertainty, next step.
- assets/css/tools.css: append .adv-* rules (~120 lines).
- includes/layout.php: add Advocate nav tab between Deep research and
  Summarize.
- index.php: add Advocate cap-card tile.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 12:26:05 +02:00
daveadmin 785de04f05 fix: batch embed 5 chunks at a time with flush between; fix hydrateSourceUrls SQL
Embed timeout: bnl_corpus Ollama embeds ~49 chunks sequentially in CPU mode,
easily exceeding the 60s cURL timeout. Now truncates upload text to
MAX_UPLOAD_CHARS before chunking (~21 chunks max) and embeds in batches of 5
with a progress flush between batches to keep the stream alive.

SQL error: bnl_corpus.documents lacks the temporal columns added in migration
136 (valid_from, valid_until, etc.). dbnV6QueryDocumentMeta uses IFNULL which
doesn't protect against missing columns. Replaced with a direct query using
only the columns confirmed to exist on this instance.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 11:42:38 +02:00
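The batching fix in 785de04f05 (embed 5 chunks at a time, flush progress between batches) follows a simple pattern, sketched here in Python. `embed_in_batches`, `embed_batch`, and `on_progress` are hypothetical names; the real code is PHP and calls an embedding backend that is stubbed out here.

```python
def embed_in_batches(chunks, embed_batch, on_progress, batch_size=5):
    """Embed chunks in small batches, reporting progress after each batch.

    embed_batch: callable taking a list of texts, returning a list of vectors.
    on_progress: callable(done, total), e.g. one that writes and flushes a
                 progress line so the client connection stays alive.
    """
    vectors = []
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        vectors.extend(embed_batch(batch))
        on_progress(len(vectors), len(chunks))
    return vectors
```

With 12 chunks and the default batch size, the backend is called three times (5, 5, and 2 chunks) and progress is reported after each call.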
daveadmin d2f9831472 feat: Corpus Intelligence page + timeline background events
Adds /corpus.php — a data transparency page showing what powers the
legal tools: 9 coverage categories with live doc counts, a full
sources table pulled from the corpus DB, the AI stack (LLMs, Whisper,
Qdrant, Azure AI Search, embeddings, chunking), and a pipeline flow
diagram. Stats are live via a new /api/corpus-stats.php endpoint
(queries dobetter_rag + bnl_admin). The reasoning sidebar is repurposed
as a Corpus health panel on this page.

Also ships the in-progress timeline background events toggle:
API and UI wired together via include_background param.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 11:31:24 +02:00
daveadmin 3196c33ebb fix: replace AiGateway.embedBatch with direct LiteLLM cURL for upload indexing
AiGateway uses getenv(LITELLM_MASTER_KEY) + stream_context HTTP which was
failing on the chloe virtualhost process. New dbnToolsLiteLLMEmbedBatch()
helper mirrors dbnToolsCallGpuLlm — hardcoded URL + key, cURL-first, same
pattern already proven for LLM calls. Removes AiGateway dependency from
DeepResearchAgent entirely.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 11:30:25 +02:00
daveadmin e130db8119 Deep Research v2: exclude marketing site, deep-link sources, per-agent reports
Three user-flagged issues after the first real run with a 920KB sakkyndig PDF:

1. dobetternorge.no marketing-website chunks leaked into the retrieval pool.
   ClientRagPipeline::searchAll defaults include_beta_website=true; we now
   pass false for both website flags, AND defensively drop any returned
   chunk whose source_name contains "website" or title contains
   "dobetternorge.no" before it can pollute synthesis.

2. Brief returned was "just a paragraph". Bumped synthesis max_tokens
   2200→3200, raised timeout 120→180s, and rewrote the prompt to require
   400-900 words with min 4 paragraphs when source_count>=3, covering EACH
   sub-question in its own paragraph. Now also passes authority + jurisdiction
   into the sources block so the model can pinpoint statutes correctly.

3. No way to see what each "sub-question agent" researched or click through
   to the source articles. Restructured the results panel so per-sub-question
   report cards now render ABOVE the synthesised brief. Each report shows the
   question, the rationale, and the top 3 retrieved sources for that sub-Q
   with title→deep link + 1-line excerpt. Brief follows. Consolidated
   numbered sources list at the bottom, with titles as deep links too.

Deep-link construction: source_url is hydrated via dbnV6QueryDocumentMeta
in a single batched call after retrieval. For Lovdata sources with a
section_title containing §<n>, the link is path-anchored to that section
(/§43). For other hosts (HUDOC, Regjeringen, Bufdir, etc.) we link to the
document root URL.

Telemetry: trace_metadata now carries retrieval_counts {raw_corpus,
filtered_website, post_filter_corpus, raw_upload, after_dedupe, after_topk}
so future regressions are diagnosable from the metadata.jsonl log alone.
The completion status pill surfaces the corpus/website/upload split.
2026-05-15 11:12:13 +02:00
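Two rules from e130db8119 lend themselves to a short sketch: the defensive website-chunk filter and the Lovdata section deep link. The Python below is illustrative only. The function names, the case-insensitive matching, the `lovdata.no` host check, and the placeholder URL in the usage note are assumptions; the commit only states the "website" / "dobetternorge.no" substrings and the `/§<n>` path anchor.

```python
import re


def is_marketing_chunk(chunk):
    """True if a retrieved chunk looks like marketing-website content."""
    source = (chunk.get("source_name") or "").lower()
    title = (chunk.get("title") or "").lower()
    return "website" in source or "dobetternorge.no" in title


def deep_link(source_url, section_title):
    """Anchor Lovdata links to a section; other hosts keep the document root."""
    m = re.search(r"§\s*(\d+)", section_title or "")
    if "lovdata.no" in source_url and m:
        return f"{source_url.rstrip('/')}/§{m.group(1)}"
    return source_url
```

For example, `deep_link("https://lovdata.no/example-act", "§ 43 Samvær")` yields a link ending in `/§43`, while a HUDOC URL is returned unchanged.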
daveadmin a1a7f442a7 Deep Research: NDJSON streaming so the connection survives long runs
Previously the endpoint returned a single JSON object at the end. Apache+
PHP-FPM buffers the entire body until PHP exits, so a 160s azure_full run
caused the browser to drop the fetch as "Failed to fetch" while the server
was still synthesising — the response then arrived to a dead socket.

Switch to application/x-ndjson with one event per line. The endpoint emits
'progress', 'start', 'step' (running/complete/warning/error), 'subq', and a
final 'final' event carrying the full result payload. Output buffering is
explicitly disabled so each line flushes through Apache as soon as the
agent emits it.

DbnDeepResearchAgent::run() now accepts an optional ?callable $emit and
fires step:running before each step + step:complete after, plus a subq
event per sub-question retrieval round.

JS reads response.body as a stream, splits on newlines, updates the
trace panel live, and renders the final result when the final event
arrives. Status pill shows live progress detail (e.g. "Synthesising with
Azure gpt-4o — this is the slowest step…").

Engine row in the form now shows expected duration per engine
(~15-45s mini, ~60-180s full, ~30-90s GPU) so users know what they're in
for before clicking Run.
2026-05-15 10:47:35 +02:00
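The client-side half of a1a7f442a7 (read the body as a stream, split on newlines, handle one event per line) is implemented in JavaScript in the repository. The same buffering idea is sketched below in Python; `iter_ndjson` is a hypothetical name, and the `type` field used in the usage note is an assumed event shape, since the log does not show the payloads.

```python
import json


def iter_ndjson(byte_chunks):
    """Yield one decoded event per complete line, buffering partial lines.

    byte_chunks: any iterable of bytes objects as they arrive off the wire.
    """
    buffer = b""
    for chunk in byte_chunks:
        buffer += chunk
        # A network chunk may hold several lines, or end mid-line.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)
    # Tolerate a final event that arrives without a trailing newline.
    if buffer.strip():
        yield json.loads(buffer)
```

A consumer would loop over the events, update the trace panel on `step` events, and render the result when the `final` event arrives. Buffering bytes instead of decoded text avoids breaking multi-byte UTF-8 characters that straddle two chunks.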
daveadmin 4cbe0a4ac4 Add Deep Research tool — agent + rank/rerank RAG
New surface at /deep-research.php where the user pastes a question or
uploads PDF/DOCX/TXT case files and an LLM-orchestrated agent researches
the Do Better Norge legal corpus from 3-5 angles, with hybrid retrieval,
cross-encoder rerank, and synthesis that emits an inline-[n]-cited
markdown brief plus a numbered sources panel.

Uploaded documents are chunked + embedded in memory only (nomic-embed-text
via LiteLLM) and searched alongside the shared corpus during the same
request — never persisted to disk, DB, or Qdrant.

Reuses ClientRagPipeline::searchAll (hybrid + rerank), dbnV6 slice
helpers, and the existing extract.php text-extraction logic via a new
dbnToolsExtractUploadedFile() helper. Also adds dbnToolsCallGpuLlm()
helper in bootstrap.php — fixes a latent bug where LegalTools.php
was already calling that name with no definition.

Search.php is unchanged.
2026-05-15 10:30:47 +02:00
daveadmin 55e11cb649 Azure: route azure_mini engine to gpt-4o-mini explicitly
The .env default DBN_AZURE_OPENAI_CHAT_DEPLOYMENT is gpt-4o, so the
azure_mini branch (which just called ->chat() without withDeployment)
was silently hitting gpt-4o too. Both UI engine options resolved to
the same model, and timed out together on long Norwegian documents.

Fix: explicitly route azure_mini → gpt-4o-mini in both timeline and
redact paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 09:38:55 +02:00
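The bug fixed in 55e11cb649 was an engine option silently inheriting the default deployment. A small Python sketch of explicit routing, combined with the clone-style `withDeployment()` override mentioned in 8c12d5e778, is shown below. The `Gateway` class and `gateway_for` helper are hypothetical, and mapping `azure_full` to `gpt-4o` is inferred from the commit text, not confirmed by code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Gateway:
    """Hypothetical stand-in for the Azure gateway; default mirrors the .env default."""
    deployment: str = "gpt-4o"

    def with_deployment(self, name):
        # Return a modified copy so the shared gateway keeps its default.
        return replace(self, deployment=name)


# Every engine names its deployment explicitly; nothing falls back to the default.
ENGINE_DEPLOYMENTS = {"azure_mini": "gpt-4o-mini", "azure_full": "gpt-4o"}


def gateway_for(engine, base):
    """Pick the deployment for an engine, failing loudly on unknown engines."""
    if engine not in ENGINE_DEPLOYMENTS:
        raise ValueError(f"unknown engine: {engine}")
    return base.with_deployment(ENGINE_DEPLOYMENTS[engine])
```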
daveadmin 85c3cee719 Azure: raise chat timeout 45s → 90s default; timeline uses 120s
Timeline was using no explicit timeout, falling back to the gateway's
45s default, which timed out on long Norwegian legal documents.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 02:09:02 +02:00
daveadmin f183678f35 Redact: catch soft dates (years, month+year, ranges, prepositions)
Adds Nordic-pack regex patterns for:
- DD.MM.YYYY / DD/MM/YYYY / YYYY-MM-DD
- Year ranges (2011/2012, 2018-2019)
- Month + year (Norwegian + English, with optional day)
- Year preceded by temporal preposition (i 2015, fra 2019, rundt 2018)

Also renames the entity toggle from "Dates of birth" to "Dates" (broader
scope) in all four languages, and expands the LLM prompt so soft date
references in free text are caught even when regex misses them.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 01:58:35 +02:00
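A few of the soft-date patterns listed in f183678f35 can be approximated as follows. This Python sketch is not the repository's Nordic pack: the exact regexes, the `[DATE]` tag, and the `redact_dates` helper are assumptions, the preposition list is limited to the three examples in the commit, and the month + year patterns are left out.

```python
import re

# (pattern, replacement) pairs, applied in order.
SOFT_DATE_RULES = [
    # DD.MM.YYYY and DD/MM/YYYY
    (re.compile(r"\b\d{1,2}[./]\d{1,2}[./]\d{4}\b"), "[DATE]"),
    # YYYY-MM-DD
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "[DATE]"),
    # Year ranges such as 2011/2012 or 2018-2019
    (re.compile(r"\b(?:19|20)\d{2}[/-](?:19|20)\d{2}\b"), "[DATE]"),
    # Year after a temporal preposition; the preposition itself is kept.
    (re.compile(r"\b(i|fra|rundt)\s+(?:19|20)\d{2}\b", re.IGNORECASE), r"\1 [DATE]"),
]


def redact_dates(text):
    """Replace the date forms above with a [DATE] tag."""
    for pattern, replacement in SOFT_DATE_RULES:
        text = pattern.sub(replacement, text)
    return text
```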
daveadmin cdd0fb970b fix(timeline): explicit Norwegian date format recognition in prompt
Add DD.MM.YY, D.M., diary-line format instructions so the model doesn't
skip short Norwegian dates like 18.09.25 or 6.1. Two-digit years always
treated as 20YY. Lines starting with date+colon are always events.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 01:10:16 +02:00
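Commit cdd0fb970b encodes its date rules in the model prompt. To make the "two-digit years are 20YY" rule concrete, here is a hedged Python sketch of the same normalisation done in code. `parse_short_date` is a hypothetical helper, and using a caller-supplied `default_year` for year-less dates like `6.1.` is an assumption; the commit does not say how the year is chosen in that case.

```python
import re
from datetime import date

# D.M., DD.MM., DD.MM.YY or DD.MM.YYYY, with an optional trailing dot.
SHORT_DATE_RE = re.compile(r"^(\d{1,2})\.(\d{1,2})(?:\.(\d{4}|\d{2})?)?$")


def parse_short_date(token, default_year):
    """Parse a short Norwegian date token; return a date or None."""
    m = SHORT_DATE_RE.match(token.strip())
    if m is None:
        return None
    day, month, year = int(m.group(1)), int(m.group(2)), m.group(3)
    if year is None:
        full_year = default_year          # assumption: caller supplies the year
    elif len(year) == 2:
        full_year = 2000 + int(year)      # two-digit years are always 20YY
    else:
        full_year = int(year)
    try:
        return date(full_year, month, day)
    except ValueError:
        # Impossible calendar dates such as 31.02. are rejected.
        return None
```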
daveadmin 7690ed17ee feat(timeline): full form UI with engine selection and advanced settings
Add 4-language switcher (EN/NO/UK/PL), engine choice (Azure mini/full,
GPU/cuttlefish), and expandable Advanced panel (Focus, Confidence filter,
Date types) to timeline.php. Wire new params through api/timeline.php and
LegalTools::timeline() with engine routing, focus-aware prompt injection,
and confidence/date-type post-filters. Add TIMELINE_I18N to tools.js with
improved renderTimeline() confidence colour-coding and new CSS classes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 00:59:12 +02:00
daveadmin 8c12d5e778 Redact tool: rich UI, multilingual, engine choice, output formats
- Custom inline form (EN/NO/UK/PL lang switcher) replacing generic stub
- Engine selector: Azure gpt-4o-mini (default), gpt-4o, GPU cuttlefish, regex-only
- Entity type toggles: names, organisations, places, dates of birth
- Output formats: contextual role tags, generic [PERSON], Norwegian pseudonyms
- Keep officials mode: judges/experts kept as [JUDGE: Andersen] format
- Exempt names list: specific names excluded from redaction
- Hint paragraphs explaining each option in all four languages
- Backend: engine routing, callGpuLlm(), applyGenericTags(), applyPseudonymization()
- AzureOpenAiGateway: withDeployment() clone pattern for per-call model override

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 00:20:16 +02:00
daveadmin df31674f2e SSO integration: validate dobetternorge.no signed tokens, update landing page
- bootstrap.php: dbnToolsValidateSsoToken(), SSO session check in dbnToolsIsAuthenticated()
- index.php: SSO handler at top, Do Better Norge member panel in login card
- .env: DBN_SSO_SECRET placeholder

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 18:47:05 +02:00
daveadmin eaff2a4d86 Per-tool pages + multi-engine transcribe with expert controls
- Split monolithic index.php into per-tool pages (ask, search, summarize,
  timeline, redact, transcribe), each with its own URL and bookmarkable state
- Shared shell: includes/layout.php + layout_footer.php; shared form:
  includes/tool_form.php used by all text-tool pages
- index.php now redirects authenticated users to ask.php; unauthenticated
  users see the login gate only
- transcribe.php: engine selector (GPU/OpenAI/Azure), model size (small/
  medium/large-v3), diarize, language, expert settings (beam, VAD, task,
  initial prompt)
- api/transcribe.php: engine routing — GPU (cuttlefish), OpenAI BYOK,
  Azure AI Speech; passes model/beam/task/vad/prompt to Whisper server
- tools.js: data-active-tool body attr drives setTool() on load; <a> nav
  tabs skip click listeners; null guards on form/passcodeForm; engine radio
  toggle shows/hides BYOK key inputs and model selector; RTF shown in status
- tools.css: styles for BYOK inputs, expert settings panel, prompt textarea

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 22:14:20 +02:00
daveadmin bddafea049 Timeline: document upload, upgraded prompt, CSV export, date_type badge
2026-05-13 08:10:40 +02:00
daveadmin 634a4fa154 Raise MAX_PASTE_CHARS to 128K and redaction max_tokens to 8000
2026-05-13 07:41:41 +02:00
daveadmin 95685862ab Redact: multi-doc upload, contextual person naming, aliases
- Extract limit raised from 32K to 128K chars per file (long legal docs now fit)
- Redact API body/text limits raised (400KB / 128K chars) to match
- Upload zone accepts multiple files (up to 5); extracted text concatenated with
  doc separator and combined before redaction; shows per-file char counts
- LLM redact pass now infers contextual person roles (FATHER, MOTHER, CHILD,
  ATTORNEY, JUDGE, etc.) instead of generic [PERSON] for all names; same
  individual gets consistent tag throughout the document
- Tag validation widened to allow any [A-Za-z0-9_- ] pattern (not just the
  five hardcoded tags), supporting contextual and alias tags
- Alias UI added to Redact mode: user maps real names to bracketed aliases
  (e.g. "David Jr" -> [Junior]); aliases injected into LLM system prompt as
  override instructions; max 20 aliases, 100 chars each
- max_tokens raised from 2000 to 4000; timeout from 60s to 90s for larger docs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 07:17:02 +02:00
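The alias limits and widened tag validation in 95685862ab might be enforced along these lines. This is a Python sketch with a hypothetical `validate_aliases` helper; applying the 100-character cap to both the real name and the tag is an assumption, since the commit only says "100 chars each".

```python
import re

# Bracketed tag made of letters, digits, underscore, hyphen, or space.
TAG_RE = re.compile(r"^\[[A-Za-z0-9_\- ]+\]$")
MAX_ALIASES = 20
MAX_ALIAS_CHARS = 100


def validate_aliases(aliases):
    """Validate a {real_name: "[Tag]"} mapping; raise ValueError if invalid."""
    if len(aliases) > MAX_ALIASES:
        raise ValueError("too many aliases")
    for name, tag in aliases.items():
        if len(name) > MAX_ALIAS_CHARS or len(tag) > MAX_ALIAS_CHARS:
            raise ValueError("alias too long")
        if not TAG_RE.match(tag):
            raise ValueError(f"invalid tag: {tag!r}")
    return aliases
```

Restricting tags to a small character class keeps user-supplied aliases from smuggling markup or prompt text into the instructions sent to the model.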
daveadmin 3c8d7ebc34 feat: pass temporal_mode and as_of_date through DBN search API
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 18:45:54 +02:00
daveadmin 1f4f01bda3 Add public showcase landing, doc summary cards, and chunk toggle
- index.php: public showcase landing page (hero, how-it-works, capabilities,
  evidence mock, login form) visible to unauthenticated visitors; full OG/SEO
  meta; app shell hidden behind auth as before
- tools.css: showcase section styles (gradient hero, step cards, capability
  grid, CTA button, evidence mock, footer)
- LegalTools.php: sourceFromChunk() batch-fetches doc_summaries from RAG DB
  for non-private chunks; excerpt shows doc summary when available, falls back
  to raw chunk text; chunk_text field always carries the raw excerpt
- tools.js: renderEvidenceItem() shows doc summary as card body; adds a
  collapsible "View chunk" toggle when summary differs from raw chunk text

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 08:37:36 +02:00
daveadmin 62dbb8d900 Gate tools login with Caveau access
2026-05-08 17:12:38 +02:00
daveadmin 9b22947eb2 Two-pass PII redaction with multi-country pattern packs
Pass 1: deterministic regex with Nordic/European/ECHR/Global packs
covering fødselsnummer, Swedish personnummer, Danish/Finnish CPR,
UK NI, French INSEE, IBAN, EU phones, ECHR application numbers, DOB,
and national ID label patterns.

Pass 2: LLM semantic scan (Azure OpenAI) finds names, orgs, places
and identifying descriptions missed by regex. Runs on pre-redacted
text so no raw PII reaches the LLM.

Adds region selector (Nordic/European/ECHR/Global) to the Redact UI.
Falls back gracefully when Azure is not yet configured.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 01:27:52 +02:00
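The two-pass design in 9b22947eb2 hinges on ordering: the regex pass runs first, so the LLM only ever sees pre-redacted text, and the tool still works when no LLM is configured. A minimal Python sketch of that flow follows. The `redact` function, the `[NATIONAL_ID]` tag, and the single shape-only fødselsnummer pattern (11 digits, no checksum validation) are assumptions standing in for the full pattern packs.

```python
import re

# Shape-only match for an 11-digit fødselsnummer, optionally written as 6+5 digits.
FNR_RE = re.compile(r"\b\d{6}\s?\d{5}\b")


def redact(text, llm_pass=None):
    """Pass 1: deterministic regex. Pass 2: optional semantic scan.

    llm_pass: callable taking pre-redacted text and returning redacted text,
              or None when no LLM backend is configured.
    """
    pre_redacted = FNR_RE.sub("[NATIONAL_ID]", text)
    if llm_pass is None:
        # Graceful fallback: regex-only output.
        return pre_redacted
    # The LLM receives only the pre-redacted text, never the raw identifiers.
    return llm_pass(pre_redacted)
```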
daveadmin 2d8d1c7409 Initial release: Do Better Norge Legal Tools Hub
Five MVP tools (Ask, Search, Summarize, Timeline, Redact) with
email+password auth, Azure OpenAI gateway, evidence trail panel,
and process-and-forget privacy default.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 00:01:07 +02:00