feat(transcribe): Norwegian defaults, vocabulary presets, multi-file court day queue

- Default language → nb (Bokmål); auto-detect demoted with warning note
- Default model → large-v3; VAD filter on by default
- Vocabulary prompt promoted to main form with 4 preset buttons
  (Barnerett/CPS, Rettssak/tingrett, Generell norsk, Egendefinert)
- Multi-file upload queue: drop/select multiple clips, numbered list UI
- Sequential queue processing with cumulative time_offset per clip
- Backend shifts segment timestamps so SRT/VTT covers full court day
- Merged transcript + segments across all clips for single download
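
A minimal sketch of how the cumulative time_offset per clip could be derived, assuming each clip's offset is the summed duration of the clips queued before it. The helper name `cumulativeOffsets` and the idea of passing durations in seconds are illustrative assumptions, not taken from this commit; the actual queue logic lives in the front end and may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of this commit): given clip durations in
 * seconds, in queue order, return the time_offset to send with each clip.
 */
function cumulativeOffsets(array $durations): array
{
    $offsets = [];
    $elapsed = 0.0;
    foreach ($durations as $duration) {
        // The offset of a clip is the total length of all clips before it.
        $offsets[] = round($elapsed, 3);
        // Negative durations are treated as zero, mirroring the max(0.0, ...) guard on time_offset.
        $elapsed += max(0.0, (float)$duration);
    }
    return $offsets;
}

// Three clips of 30 min, 15 min 50.5 s and 20 min.
$offsets = cumulativeOffsets([1800.0, 950.5, 1200.0]);
// $offsets is [0.0, 1800.0, 2750.5]
```

The first clip always gets offset 0, so the backend leaves its segments untouched.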

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 22:20:11 +02:00
parent df31674f2e
commit 26f4e2231b
4 changed files with 356 additions and 138 deletions
+11
@@ -53,6 +53,7 @@ if ($engine === 'openai' && $file['size'] > 25 * 1024 * 1024) {
    dbnToolsError('OpenAI Whisper API has a 25 MB file limit. Use the GPU engine for larger files.', 413, 'openai_file_too_large');
}
$timeOffset = max(0.0, (float)($_POST['time_offset'] ?? 0));
$t0 = microtime(true);
// ── Route to engine ───────────────────────────────────────────────────────────
@@ -79,6 +80,16 @@ if ($engine === 'openai') {
$latencyMs = (int)round((microtime(true) - $t0) * 1000);
// ── Shift segment timestamps for multi-clip sessions ─────────────────────────
if ($timeOffset > 0.0 && !empty($result['segments'])) {
    foreach ($result['segments'] as &$seg) {
        $seg['start'] = round(($seg['start'] ?? 0) + $timeOffset, 3);
        $seg['end'] = round(($seg['end'] ?? 0) + $timeOffset, 3);
    }
    unset($seg);
}
// ── Speaker role labelling (GPU + diarize only) ───────────────────────────────
$segments = $result['segments'] ?? [];
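
A standalone sketch of the timestamp shift in the hunk above, followed by one way the shifted values could be rendered as SRT timestamps. The function names `shiftSegments` and `srtTimestamp` are hypothetical, and the SRT formatting is an assumption about how the export might work; neither appears in this diff.

```php
<?php
declare(strict_types=1);

/** Hypothetical wrapper around the shift loop shown in the diff. */
function shiftSegments(array $segments, float $timeOffset): array
{
    if ($timeOffset <= 0.0) {
        return $segments; // first clip, nothing to shift
    }
    foreach ($segments as &$seg) {
        $seg['start'] = round(($seg['start'] ?? 0) + $timeOffset, 3);
        $seg['end']   = round(($seg['end'] ?? 0) + $timeOffset, 3);
    }
    unset($seg); // break the reference so later writes cannot alias the last element
    return $segments;
}

/** Assumed SRT formatter: seconds to HH:MM:SS,mmm. */
function srtTimestamp(float $seconds): string
{
    $ms = (int)round($seconds * 1000);
    $h  = intdiv($ms, 3600000);
    $m  = intdiv($ms % 3600000, 60000);
    $s  = intdiv($ms % 60000, 1000);
    return sprintf('%02d:%02d:%02d,%03d', $h, $m, $s, $ms % 1000);
}

// A segment 1.25 s into the second clip, where the first clip ran 30 minutes.
$shifted = shiftSegments(
    [['start' => 1.25, 'end' => 3.5, 'text' => 'Retten er satt.']],
    1800.0
);
$stamp = srtTimestamp($shifted[0]['start']);
// $stamp is "00:30:01,250"
```

Rounding to three decimals keeps the shifted values at millisecond precision, which matches the resolution SRT and VTT timestamps can express.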