Legal Tools
Open Timeline → Sign in

Technical Showcase · How the AI reads time

How Timeline knows when things happened.

A full walkthrough of the 3-pass extraction pipeline, Norwegian date format recognition, event classification schema, multi-engine architecture, and the fine-tuned dbn-legal-agent model.

12+ date formats
5 event types
3 pipeline passes
3 engine options

Architecture

Three passes. Each with a distinct job.

The pipeline is intentionally sequential — Pass 1 is rule-based and near-instant; Pass 2 is the LLM extraction; Pass 3 post-processes and scores the output.

Pass 1 · PHP / regex

Detect & normalise known formats

A deterministic pattern-matching pass runs before any LLM call. It scans the full input for dates matching 12+ Norwegian formats and normalises them to ISO 8601:

  • dd.mm.yyyyYYYY-MM-DD
  • d. månedsnavn yyyy → resolved calendar date
  • Diary-format lines (starting with a date + colon) → auto-tagged as events
  • Two-digit years → always interpreted as 20YY

Normalised anchors are injected into the LLM prompt to reduce hallucinated or misread dates.

Pass 2 · gpt-4o-mini / gpt-4o / dbn-legal-agent

Extract, classify & score

The LLM reads the full document alongside the pre-pass anchors. For every temporal reference it returns a structured JSON event object:

  • date — resolved ISO date, or verbatim string if unresolvable
  • date_typeabsolute | relative | recurring | conditional | period
  • confidencehigh | medium | low
  • actor — attributed entity (from source text, not inferred)
  • description — one-sentence event summary
  • source_excerpt — verbatim text fragment (max 200 chars)

The prompt explicitly instructs the model not to invent dates or actors not present in the source. Temperature is set to 0.1 for deterministic output.

Pass 3 · PHP post-processor

Filter, sort & assemble

PHP applies all active filters before returning the result:

  • Focus filter — strips events not matching the requested focus mode (deadlines / hearings / CPS)
  • Confidence filter — removes LOW-confidence events if requested
  • Background filter — strips background/narrative events if unchecked
  • Date-type filter — strips relative/recurring events if unchecked

The post-processor then assembles the what_remains_uncertain list and the next_practical_step recommendation.

Date recognition

12+ Norwegian date formats, all recognised.

Norwegian legal documents use a wide variety of date notations. The Pass 1 pre-pass recognises all of these deterministically; the LLM handles the rest in Pass 2.

Format Example Notes
dd.mm.yyyy 30.07.2015 Standard Norwegian numeric
dd.mm.yy 09.04.25 Two-digit year → always 20YY
d. månedsnavn yyyy 3. mars 2024 Written month in bokmål/nynorsk
d. månedsnavn 15. januar Year inferred by proximity scanning
yyyy-mm-dd 2024-03-12 ISO 8601
månedsnavn yyyy mars 2024 Month + year only
yyyy 2024 Year-only reference
Season + year høsten 2023 Seasonal reference → Q3/Q4
Diary-format line 18.09.2025: Møte avholdt Date + colon → auto-tagged as event
Relative reference tre uker etter vedtaket Anchored to nearest resolved event
Recurring pattern hver mandag Classified as recurring
Period / range fra mars til juni 2024 Yields start_date + end_date

Classification schema

Five event types. Three confidence levels.

date_type values

date_type Definition Example
absolute A specific, resolvable calendar date 30.07.2015 → 2015-07-30
relative A date expressed relative to another event tre uker etter vedtaket
recurring A pattern that repeats on a schedule each Monday, every 6 months
conditional A date contingent on a condition being met if no response within 14 days
period A date range or duration with start and end fra mars til juni 2024

confidence levels

confidence Meaning Visual in timeline
high Date is explicitly and unambiguously stated in the source text Green badge
medium Date is inferred, approximate, or stated with slight ambiguity Amber badge
low Date is implied, undated, or extracted from a degraded/ambiguous passage Grey badge

Actor attribution rules

Rule Example
Named entity in the same sentence “Trude [saksbehandler] ringte 14. mars” → actor: Trude
Role label without a name “Barnevernet fattet vedtak” → actor: Barnevernet
No clear attribution in sentence actor: [unattributed]
Document-level default If no per-event actor, defaults to the document sender/issuing body

Engines

Three engines, one structured output.

All engines return the same JSON schema — the post-processor handles all three identically. Engine choice affects speed, quality, and privacy only.

Engine Model Latency Best for
Azure gpt-4o-mini ★ gpt-4o-mini (Azure West Europe) ~15 s Default. Fast, cost-efficient, handles most legal documents well.
Azure gpt-4o gpt-4o (Azure West Europe) ~45 s Complex documents, overlapping events, poor-quality or dense source text.
GPU / cuttlefish dbn-legal-agent via LiteLLM proxy ~25 s Maximum privacy. Entirely local. Fine-tuned on Norwegian legal corpus.

Fine-tuned model

dbn-legal-agent: trained on Norwegian legal text.

QLoRA fine-tune

dbn-legal-agent

A QLoRA (Quantized Low-Rank Adaptation) fine-tune trained on Norwegian child-welfare and administrative law text — case notes, court decisions, Barnevernet correspondence, Fylkesnemnda decisions, and Statsforvalter rulings. The model has internalised the temporal patterns of Norwegian legal proceedings: the procedural sequence of an omsorgsovertakelse, the typical timeline of a tiltaksplan review cycle, what akutt means as a temporal signal, how Fylkesnemnda milestones are ordered.

In the Timeline GPU engine, dbn-legal-agent runs as the primary extraction model via the LiteLLM proxy on cuttlefish. The structured JSON output schema is identical to the Azure engines — the same post-processing pipeline applies regardless of which engine produced the extraction. No Azure API calls are made when the GPU engine is selected.

QLoRA Norwegian legal corpus case notes court decisions Barnevernet Fylkesnemnda LiteLLM proxy

Privacy & security

Your documents never leave your session.

Privacy by design

  • All uploaded files are extracted to text in memory using PHP's in-process file handlers. The raw binary is never written to disk on the server.
  • Session context (pasted text, uploaded content, extracted timeline events) is scoped to your authenticated session and discarded when the session ends.
  • Azure OpenAI (gpt-4o, gpt-4o-mini) is configured on the West Europe region. Data processed via Azure OpenAI is not used for model training under the default enterprise agreement.
  • The GPU/cuttlefish engine processes entirely locally — no data leaves your network. The LiteLLM proxy on cuttlefish receives your document text and returns structured JSON; nothing is forwarded to an external API.
  • Telemetry logged: tool name, engine, focus mode, event count, latency. No document text, case references, actor names, or extracted events are logged.

See it work on your case.

Free for Do Better Norge members. All engines available to every member.

Open Timeline → Sign in to use Timeline → Register free User guide