Commit Graph

60 Commits

Author SHA1 Message Date
daveadmin d178fbf295 Fix double file picker on audioZone click
Guard against INPUT clicks bubbling up to zone handler,
which caused the file picker to open twice.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 20:28:24 +02:00
daveadmin 59644414f5 Fix transcribe form blocked by required textarea validation
The hidden textarea still had required=true, so browser-native form
validation silently blocked submit when no audio was the only input.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 19:19:30 +02:00
daveadmin aa2d64b599 Add transcribe progress indicator — elapsed timer and progressive trace messages
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 19:12:09 +02:00
daveadmin d425c99e8e Transcribe: audio-to-text tool with diarization and speaker role labelling
New sixth tool in the hub. Accepts MP3/WAV/OGG/M4A/FLAC/WEBM up to 200 MB,
proxies to Whisper on cuttlefish GPU. Optional speaker separation with LLM
role labelling (dommer, advokat, forelder, sakkyndig, etc. via GPT-4o-mini).
Client-side TXT / SRT / VTT download from segment data.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 18:43:22 +02:00
daveadmin bddafea049 Timeline: document upload, upgraded prompt, CSV export, date_type badge 2026-05-13 08:10:40 +02:00
daveadmin 95685862ab Redact: multi-doc upload, contextual person naming, aliases
- Extract limit raised from 32K to 128K chars per file (long legal docs now fit)
- Redact API body/text limits raised (400KB / 128K chars) to match
- Upload zone accepts multiple files (up to 5); extracted text concatenated with
  doc separator and combined before redaction; shows per-file char counts
- LLM redact pass now infers contextual person roles (FATHER, MOTHER, CHILD,
  ATTORNEY, JUDGE, etc.) instead of generic [PERSON] for all names; same
  individual gets consistent tag throughout the document
- Tag validation widened to allow any [A-Za-z0-9_- ] pattern (not just the
  five hardcoded tags), supporting contextual and alias tags
- Alias UI added to Redact mode: user maps real names to bracketed aliases
  (e.g. "David Jr" -> [Junior]); aliases injected into LLM system prompt as
  override instructions; max 20 aliases, 100 chars each
- max_tokens raised from 2000 to 4000; timeout from 60s to 90s for larger docs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 07:17:02 +02:00
daveadmin bbe5307c03 Add document upload to Redact tool
api/extract.php — new endpoint accepting .pdf/.docx/.txt up to 4 MB;
pdftotext for PDFs, ZipArchive+DOMXPath for DOCX, mb_convert_encoding
for TXT; truncates to 32 000 chars to stay within redact limit.

index.php — drop/browse upload zone above the textarea, visible only
in Redact mode.

tools.js — setupUpload(), handleFileUpload(), resetUpload(); drag-and-drop
and file picker both call the extract endpoint then populate the textarea.

tools.css — upload zone, drag-over, file-info, clear button styles.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 06:52:14 +02:00
daveadmin 1f4f01bda3 Add public showcase landing, doc summary cards, and chunk toggle
- index.php: public showcase landing page (hero, how-it-works, capabilities,
  evidence mock, login form) visible to unauthenticated visitors; full OG/SEO
  meta; app shell hidden behind auth as before
- tools.css: showcase section styles (gradient hero, step cards, capability
  grid, CTA button, evidence mock, footer)
- LegalTools.php: sourceFromChunk() batch-fetches doc_summaries from RAG DB
  for non-private chunks; excerpt shows doc summary when available, falls back
  to raw chunk text; chunk_text field always carries the raw excerpt
- tools.js: renderEvidenceItem() shows doc summary as card body; adds a
  collapsible "View chunk" toggle when summary differs from raw chunk text

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 08:37:36 +02:00
daveadmin 9b22947eb2 Two-pass PII redaction with multi-country pattern packs
Pass 1: deterministic regex with Nordic/European/ECHR/Global packs
covering fødselsnummer, Swedish personnummer, Danish/Finnish CPR,
UK NI, French INSEE, IBAN, EU phones, ECHR application numbers, DOB,
and national ID label patterns.

Pass 2: LLM semantic scan (Azure OpenAI) finds names, orgs, places
and identifying descriptions missed by regex. Runs on pre-redacted
text so no raw PII reaches the LLM.

Adds region selector (Nordic/European/ECHR/Global) to the Redact UI.
Falls back gracefully when Azure is not yet configured.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 01:27:52 +02:00
daveadmin 2d8d1c7409 Initial release: Do Better Norge Legal Tools Hub
Five MVP tools (Ask, Search, Summarize, Timeline, Redact) with
email+password auth, Azure OpenAI gateway, evidence trail panel,
and process-and-forget privacy default.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 00:01:07 +02:00