- Remove GPU/regex engine options; keep only azure_mini (1 credit) and azure_full (2 credits)
- Variable credit cost: engine-aware pre-check and charge in api/redact.php; PricingCatalog base = 1
- Fix ATTORNEY not preserved when keepOfficials=true: add to LLM prompt, generic-tag, pseudonym regexes
- Replace Azure credits hint with per-engine credit cost text (all 4 languages)
- Single-file upload only (was: up to 5); simplify status messages
- Clear previous redaction output and show pulsing spinner when a new run starts
- Add "Save to My Docs" button in redact output panel (corpus-save.js path)
- corpus-save.js: capture source_doc_ids from button dataset, pass in POST payload
- api/save-to-corpus.php: accept source_doc_ids, store first as source_url=corpus-doc:{id}
- doc-picker.js: show "✂ Redacted" badge for documents saved from the redact tool
- CSS: .redact-working spinner, doc-item__badge--redact pill styles
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dbnToolsString was called with required=true (default), so it aborted
before dbnToolsInjectDocContent could inject content from the doc picker.
Now passes required=false and validates after injection so doc-only
submissions work in timeline, redact, ask, and summarize.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add "Select from My Docs" button to all text tool forms; free-tier
users see an upgrade modal, paid (CaveauAI) users get a searchable
multi-select modal backed by /api/dashboard/documents.php
- Add "Select from My Audio" picker on Transcribe with single-select
and a "Save to My Audio" button for persisting uploaded clips
- New PHP helpers in bootstrap.php: dbnToolsFetchDocChunks,
dbnToolsClientIdFromSession, dbnToolsInjectDocContent
- timeline, ask, redact APIs prepend selected document content
(fetched from client_chunks SQL) before the textarea text
- api/dashboard/audio-upload.php stores audio files on server and
creates a client_documents row with source_type='audio'
- api/transcribe.php falls back to stored audio via audio_doc_id POST
field when no file is uploaded
- api/dashboard/documents.php supports ?source_type= filter
- tools.js: doc_ids added to JSON payload; stored-audio transcribe path
- New assets/css/doc-picker.css, assets/js/doc-picker.js
- SQL migration: scripts/sql/audio_docs_column.sql
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Custom inline form (EN/NO/UK/PL lang switcher) replacing generic stub
- Engine selector: Azure gpt-4o-mini (default), gpt-4o, GPU cuttlefish, regex-only
- Entity type toggles: names, organisations, places, dates of birth
- Output formats: contextual role tags, generic [PERSON], Norwegian pseudonyms
- Keep officials mode: judges/experts kept as [JUDGE: Andersen] format
- Exempt names list: specific names excluded from redaction
- Hint paragraphs explaining each option in all four languages
- Backend: engine routing, callGpuLlm(), applyGenericTags(), applyPseudonymization()
- AzureOpenAiGateway: withDeployment() clone pattern for per-call model override
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Extract limit raised from 32K to 128K chars per file (long legal docs now fit)
- Redact API body/text limits raised (400KB / 128K chars) to match
- Upload zone accepts multiple files (up to 5); extracted text concatenated with
doc separator and combined before redaction; shows per-file char counts
- LLM redact pass now infers contextual person roles (FATHER, MOTHER, CHILD,
ATTORNEY, JUDGE, etc.) instead of generic [PERSON] for all names; same
individual gets consistent tag throughout the document
- Tag validation widened to allow any [A-Za-z0-9_- ] pattern (not just the
five hardcoded tags), supporting contextual and alias tags
- Alias UI added to Redact mode: user maps real names to bracketed aliases
(e.g. "David Jr" -> [Junior]); aliases injected into LLM system prompt as
override instructions; max 20 aliases, 100 chars each
- max_tokens raised from 2000 to 4000; timeout from 60s to 90s for larger docs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pass 1: deterministic regex with Nordic/European/ECHR/Global packs
covering fødselsnummer, Swedish personnummer, Danish/Finnish CPR,
UK NI, French INSEE, IBAN, EU phones, ECHR application numbers, DOB,
and national ID label patterns.
Pass 2: LLM semantic scan (Azure OpenAI) finds names, orgs, places
and identifying descriptions missed by regex. Runs on pre-redacted
text so no raw PII reaches the LLM.
Adds region selector (Nordic/European/ECHR/Global) to the Redact UI.
Falls back gracefully when Azure is not yet configured.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>