feat(transcribe): Norwegian defaults, vocabulary presets, multi-file court day queue

- Default language → nb (Bokmål); auto-detect demoted with warning note
- Default model → large-v3; VAD filter on by default
- Vocabulary prompt promoted to main form with 4 preset buttons (Barnerett/CPS, Rettssak/tingrett, Generell norsk, Egendefinert)
- Multi-file upload queue: drop/select multiple clips, numbered list UI
- Sequential queue processing with cumulative time_offset per clip
- Backend shifts segment timestamps so SRT/VTT covers full court day
- Merged transcript + segments across all clips for single download

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
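
The cumulative time_offset idea from the message can be sketched as below. This is an illustration under assumptions, not the code in this commit: `shiftSegments` and `mergeClips` are hypothetical helper names, and the per-clip `duration` field is assumed (the commit does not show how clip length is obtained). Only the rounding to 3 decimals and the `start`/`end` keys mirror the backend hunk.

```php
<?php
// Hypothetical helper: shift every segment by $offset seconds,
// rounding to milliseconds as the backend hunk does.
function shiftSegments(array $segments, float $offset): array
{
    foreach ($segments as &$seg) {
        $seg['start'] = round(($seg['start'] ?? 0) + $offset, 3);
        $seg['end']   = round(($seg['end']   ?? 0) + $offset, 3);
    }
    unset($seg); // break the reference left over from the loop
    return $segments;
}

// Hypothetical helper: merge clips (in upload order) into one timeline.
// Each clip is assumed to look like ['duration' => float, 'segments' => [...]].
function mergeClips(array $clips): array
{
    $merged  = [];
    $elapsed = 0.0; // cumulative offset: total duration of all earlier clips
    foreach ($clips as $clip) {
        $merged = array_merge($merged, shiftSegments($clip['segments'] ?? [], $elapsed));
        $elapsed += max(0.0, (float)($clip['duration'] ?? 0));
    }
    return $merged;
}

// Usage: a 600 s clip followed by a second clip.
$clips = [
    ['duration' => 600.0, 'segments' => [['start' => 0.0, 'end' => 2.0,  'text' => 'a']]],
    ['duration' => 300.0, 'segments' => [['start' => 1.5, 'end' => 4.25, 'text' => 'b']]],
];
$merged = mergeClips($clips);
// The second clip's segment now starts at 601.5 s and ends at 604.25 s.
```

In the commit itself the offset arrives per request as `time_offset` and the shift happens server-side; the sketch folds both steps into one place only to show the arithmetic.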
2026-05-14 22:20:11 +02:00
parent df31674f2e
commit 26f4e2231b
4 changed files with 356 additions and 138 deletions
@@ -53,6 +53,7 @@ if ($engine === 'openai' && $file['size'] > 25 * 1024 * 1024) {
    dbnToolsError('OpenAI Whisper API has a 25 MB file limit. Use the GPU engine for larger files.', 413, 'openai_file_too_large');
 }

+$timeOffset = max(0.0, (float)($_POST['time_offset'] ?? 0));
 $t0 = microtime(true);

 // ── Route to engine ───────────────────────────────────────────────────────────
@@ -79,6 +80,16 @@ if ($engine === 'openai') {

 $latencyMs = (int)round((microtime(true) - $t0) * 1000);

+// ── Shift segment timestamps for multi-clip sessions ─────────────────────────
+
+if ($timeOffset > 0.0 && !empty($result['segments'])) {
+    foreach ($result['segments'] as &$seg) {
+        $seg['start'] = round(($seg['start'] ?? 0) + $timeOffset, 3);
+        $seg['end']   = round(($seg['end']   ?? 0) + $timeOffset, 3);
+    }
+    unset($seg);
+}
+
 // ── Speaker role labelling (GPU + diarize only) ───────────────────────────────

 $segments    = $result['segments']    ?? [];