daveadmin e130db8119 Deep Research v2: exclude marketing site, deep-link sources, per-agent reports
Three user-flagged issues after the first real run with a 920 KB sakkyndig (court-appointed expert) report PDF:

1. dobetternorge.no marketing-website chunks leaked into the retrieval pool.
   ClientRagPipeline::searchAll defaults include_beta_website=true; we now
   pass false for both website flags, AND defensively drop any returned
   chunk whose source_name contains "website" or title contains
   "dobetternorge.no" before it can pollute synthesis.

2. Brief returned was "just a paragraph". Bumped synthesis max_tokens
   2200→3200, raised timeout 120→180s, and rewrote the prompt to require
   400-900 words with min 4 paragraphs when source_count>=3, covering EACH
   sub-question in its own paragraph. Now also passes authority + jurisdiction
   into the sources block so the model can pinpoint statutes correctly.

3. No way to see what each "sub-question agent" researched or click through
   to the source articles. Restructured the results panel so per-sub-question
   report cards now render ABOVE the synthesised brief. Each report shows the
   question, the rationale, and the top 3 retrieved sources for that sub-Q
   with title→deep link + 1-line excerpt. The brief follows, then a
   consolidated numbered sources list at the bottom, also with deep-linked titles.
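
A minimal sketch of the defensive filter described in item 1, not the code from this commit. The helper name `dropWebsiteChunks`, the array shape of a chunk, and the case-insensitive matching are assumptions for illustration; only the two match rules (source_name contains "website", title contains "dobetternorge.no") come from the text above.

```php
<?php
/**
 * Hypothetical helper: drop marketing-website chunks from a retrieval pool.
 * Assumes each chunk is an array with optional 'source_name' and 'title' keys.
 * Returns the surviving chunks plus the number that were filtered out.
 */
function dropWebsiteChunks(array $chunks): array
{
    $kept = [];
    $filtered = 0;

    foreach ($chunks as $chunk) {
        $sourceName = (string) ($chunk['source_name'] ?? '');
        $title = (string) ($chunk['title'] ?? '');

        // Case-insensitive matching is an assumption of this sketch.
        $isWebsite = stripos($sourceName, 'website') !== false
            || stripos($title, 'dobetternorge.no') !== false;

        if ($isWebsite) {
            $filtered++;
            continue;
        }
        $kept[] = $chunk;
    }

    return ['kept' => $kept, 'filtered_website' => $filtered];
}
```

Returning the dropped count alongside the kept chunks makes it easy to feed a filtered_website counter into telemetry.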

Deep-link construction: source_url is hydrated via dbnV6QueryDocumentMeta
in a single batched call after retrieval. For Lovdata sources with a
section_title containing §<n>, the link is path-anchored to that section
(/§43). For other hosts (HUDOC, Regjeringen, Bufdir, etc.) we link to the
document root URL.
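
The link rule above could be sketched roughly as follows. This is an illustration under stated assumptions, not the committed implementation: `buildDeepLink` is a hypothetical name, the host check and the regular expression are guesses at how "Lovdata source" and "§<n>" are detected, and the URLs in the usage note are placeholders. It needs PHP 8 for `str_ends_with`.

```php
<?php
/**
 * Hypothetical helper: turn a hydrated source_url into a deep link.
 * Lovdata URLs whose section title mentions "§<n>" get "/§<n>" appended;
 * every other host keeps its document root URL unchanged.
 */
function buildDeepLink(string $sourceUrl, ?string $sectionTitle): string
{
    // parse_url() returns null or false when no host is found; both cast to ''.
    $host = strtolower((string) parse_url($sourceUrl, PHP_URL_HOST));
    $isLovdata = $host === 'lovdata.no' || str_ends_with($host, '.lovdata.no');

    // The /u flag makes the pattern treat the multibyte "§" as one character.
    if ($isLovdata
        && $sectionTitle !== null
        && preg_match('/§\s*(\d+)/u', $sectionTitle, $match) === 1
    ) {
        return rtrim($sourceUrl, '/') . '/§' . $match[1];
    }

    return $sourceUrl;
}
```

For example, `buildDeepLink('https://lovdata.no/lov/example', '§ 43 Samvær')` yields `https://lovdata.no/lov/example/§43`, while a HUDOC URL is returned as-is.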

Telemetry: trace_metadata now carries retrieval_counts {raw_corpus,
filtered_website, post_filter_corpus, raw_upload, after_dedupe, after_topk}
so future regressions are diagnosable from the metadata.jsonl log alone.
The completion status pill surfaces the corpus/website/upload split.
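
As a rough illustration of the retrieval_counts shape, the sketch below builds the six counters and serialises them as one JSON line. The function names are hypothetical, and deriving post_filter_corpus as raw_corpus minus filtered_website is an assumption; the real record in metadata.jsonl may carry more fields.

```php
<?php
/**
 * Hypothetical helper: assemble the retrieval_counts block.
 * post_filter_corpus is derived here as raw_corpus - filtered_website,
 * which is an assumption of this sketch.
 */
function buildRetrievalCounts(
    int $rawCorpus,
    int $filteredWebsite,
    int $rawUpload,
    int $afterDedupe,
    int $afterTopk
): array {
    return [
        'raw_corpus' => $rawCorpus,
        'filtered_website' => $filteredWebsite,
        'post_filter_corpus' => $rawCorpus - $filteredWebsite,
        'raw_upload' => $rawUpload,
        'after_dedupe' => $afterDedupe,
        'after_topk' => $afterTopk,
    ];
}

/** Serialise the counts as a single JSON Lines record (one object per line). */
function traceLine(array $retrievalCounts): string
{
    $record = ['trace_metadata' => ['retrieval_counts' => $retrievalCounts]];

    return json_encode($record, JSON_THROW_ON_ERROR) . "\n";
}
```

With every pipeline stage counted, a sudden non-zero filtered_website or a collapse between after_dedupe and after_topk shows up in the log without re-running the query.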
2026-05-15 11:12:13 +02:00
2026-05-08 17:12:38 +02:00

Do Better Norge Legal Tools Hub

MVP docroot for tools.dobetternorge.no.

Required environment

  • CaveauAI client access for the client and package named by DBN_CAVEAU_CLIENT_SLUG and DBN_CAVEAU_PACKAGE_SLUG
  • DBN_AZURE_OPENAI_ENDPOINT
  • DBN_AZURE_OPENAI_API_KEY
  • DBN_AZURE_OPENAI_API_VERSION
  • DBN_AZURE_OPENAI_CHAT_DEPLOYMENT
  • DBN_AZURE_OPENAI_EMBEDDING_DEPLOYMENT

Optional:

  • DBN_AI_PORTAL_ROOT (defaults to sibling ai-portal)
  • DBN_CAVEAU_CLIENT_SLUG (defaults to dobetter)
  • DBN_CAVEAU_PACKAGE_SLUG (defaults to family-legal)
  • DBN_TOOLS_SUPPORT_DIR
  • DBN_TOOLS_METADATA_LOG
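
A minimal sketch of how these variables might be resolved at startup, assuming plain `getenv()` lookups. The function names `dbnEnv` and `dbnLoadConfig` are hypothetical and not taken from this repository; only the variable names and the two slug defaults come from the lists above.

```php
<?php
/** Hypothetical helper: read an environment variable, treating '' as unset. */
function dbnEnv(string $name, ?string $default = null): ?string
{
    $value = getenv($name);

    return ($value === false || $value === '') ? $default : $value;
}

/**
 * Hypothetical loader: fail fast on missing required variables and
 * apply the documented defaults for the two Caveau slugs.
 */
function dbnLoadConfig(): array
{
    $required = [
        'DBN_AZURE_OPENAI_ENDPOINT',
        'DBN_AZURE_OPENAI_API_KEY',
        'DBN_AZURE_OPENAI_API_VERSION',
        'DBN_AZURE_OPENAI_CHAT_DEPLOYMENT',
        'DBN_AZURE_OPENAI_EMBEDDING_DEPLOYMENT',
    ];

    $config = [];
    $missing = [];
    foreach ($required as $name) {
        $config[$name] = dbnEnv($name);
        if ($config[$name] === null) {
            $missing[] = $name;
        }
    }
    if ($missing !== []) {
        throw new RuntimeException('Missing required environment variables: ' . implode(', ', $missing));
    }

    $config['DBN_CAVEAU_CLIENT_SLUG'] = dbnEnv('DBN_CAVEAU_CLIENT_SLUG', 'dobetter');
    $config['DBN_CAVEAU_PACKAGE_SLUG'] = dbnEnv('DBN_CAVEAU_PACKAGE_SLUG', 'family-legal');

    return $config;
}
```

Collecting all missing names before throwing reports every gap in one pass instead of one per restart.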

Authentication

The login form authenticates against Caveau client_users for the configured client slug. The client must be active, the user must be active, and the client must have an active subscription to the configured corpus package.
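
The three conditions can be pictured as a single gate. The sketch below is illustrative only: the function name, the array shapes, and the `status` and `package_slug` field names are assumptions, not the Caveau schema.

```php
<?php
/**
 * Hypothetical login gate: returns null when login may proceed,
 * otherwise a short reason code. Field names are assumed for illustration.
 */
function loginDenialReason(array $client, array $user, array $subscriptions, string $packageSlug): ?string
{
    if (($client['status'] ?? '') !== 'active') {
        return 'client_inactive';
    }
    if (($user['status'] ?? '') !== 'active') {
        return 'user_inactive';
    }

    // The client needs at least one active subscription to the configured package.
    foreach ($subscriptions as $subscription) {
        if (($subscription['package_slug'] ?? '') === $packageSlug
            && ($subscription['status'] ?? '') === 'active'
        ) {
            return null;
        }
    }

    return 'no_active_subscription';
}
```

Password verification is deliberately left out; this only shows the account and subscription checks.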

Use scripts/setup-caveau-access.php for repeatable local/production setup of the Do Better Norge Caveau owner account, family-legal subscription, and white-label domain mappings. Pass the account password through DBN_SETUP_PASSWORD at runtime only; do not commit it.

The APIs process pasted text in memory and write only metadata such as tool name, latency, language, source count, chunk count, deployment, and anonymous session id.
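
One way to enforce a metadata-only log is an allowlist applied just before writing. This is a sketch under assumptions: the function name and the exact key names are hypothetical, chosen to mirror the fields listed above.

```php
<?php
/**
 * Hypothetical helper: keep only allowlisted metadata keys from an event,
 * so pasted text or other payload fields never reach the log.
 */
function metadataRecord(array $event): array
{
    // Key names are assumptions that mirror the README's list of logged fields.
    $allowed = [
        'tool',
        'latency_ms',
        'language',
        'source_count',
        'chunk_count',
        'deployment',
        'session_id',
    ];

    return array_intersect_key($event, array_flip($allowed));
}
```

An allowlist fails closed: a new field added to the event is dropped from the log until someone explicitly permits it.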

Readme 33 MiB
Languages
PHP 77.3%
JavaScript 13.6%
CSS 9.1%