feat(transcribe): GPT cleanup pass + advanced options i18n

Adds an optional post-transcription cleanup pass via GPT-4o or GPT-4o-mini
that fixes misheard words, punctuation, and domain terms. Speaker role
labelling now accepts a deployment param. Adds i18n strings for the
advanced options panel (task, VAD filter, Whisper model, AI cleanup) in
all four languages. Updates BvjAnalyzerAgent and DeepResearchAgent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 07:23:01 +02:00
parent e32ee60e78
commit c4362738c1
5 changed files with 345 additions and 112 deletions
@@ -48,6 +48,43 @@ require_once __DIR__ . '/includes/layout.php';
<p class="upload-hint" data-i18n="vocabHint">Helps Whisper recognise technical terms. Not included in the transcript.</p>
</div>
<details id="advancedOptions" class="expert-field">
<summary data-i18n="advancedOptions">Advanced options</summary>
<div class="control-row" id="taskControl">
<span class="control-label" data-i18n="task">Task</span>
<label><input type="radio" name="task" value="transcribe" checked> <span data-i18n="taskTranscribe">Transcribe</span></label>
<label><input type="radio" name="task" value="translate"> <span data-i18n="taskTranslate">Translate to English</span></label>
</div>
<div class="control-row">
<span class="control-label" data-i18n="vadFilter">VAD filter</span>
<label><input type="checkbox" id="vadFilterCheck" name="vad_filter"> <span data-i18n="vadFilterLabel">Remove silence / noise</span></label>
<small class="control-hint" data-i18n="vadFilterHint">Improves accuracy on recordings with long pauses.</small>
</div>
<div class="control-row" id="whisperModelControl">
<span class="control-label" data-i18n="whisperModel">Whisper model</span>
<select id="whisperModelSelect" name="whisper_model">
<option value="large-v3" selected>large-v3 (best)</option>
<option value="large-v2">large-v2</option>
<option value="medium">medium (faster)</option>
<option value="small">small</option>
<option value="base">base</option>
<option value="tiny">tiny</option>
</select>
<small class="control-hint" data-i18n="whisperModelHint">Used when Azure/GCP is unavailable. large-v3 is the default.</small>
</div>
<div class="control-row" id="postModelControl">
<span class="control-label" data-i18n="postModel">AI cleanup</span>
<label><input type="radio" name="post_model" value="" checked> <span data-i18n="postModelNone">None</span></label>
<label><input type="radio" name="post_model" value="gpt-4o-mini"> <span data-i18n="postModelMini">GPT-4o Mini</span></label>
<label><input type="radio" name="post_model" value="gpt-4o"> <span data-i18n="postModelFull">GPT-4o</span></label>
<small class="control-hint" data-i18n="postModelHint">Fixes errors, punctuation, and domain terms after transcription.</small>
</div>
</details>
<div class="upload-zone" id="audioZone" role="region" aria-label="Audio upload" data-i18n-aria="uploadAria">
<input type="file" id="audioInput" accept="audio/*,video/mp4,video/webm" multiple aria-label="Choose audio files">
<div id="audioPrompt" class="upload-prompt">