Beta

TypeWhisper for Windows is currently in public beta. This feature list reflects the current beta build; expect iteration and refinement before 1.0.

Features

A comprehensive overview of TypeWhisper's capabilities on Windows.

On-Device Transcription

All local processing runs on your CPU using ONNX Runtime with int8 quantization - no GPU required. Two engines are available:

  • Parakeet TDT 0.6B - Fast general transcription supporting 25+ languages. ~670 MB download.
  • Canary 180M Flash - Multilingual model with built-in translation between English, German, French, and Spanish. ~200 MB download.
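As a rough illustration of what int8 quantization means, the sketch below maps floating-point weights to 8-bit integers and back. It assumes symmetric per-tensor quantization with a single scale factor; the scheme the bundled models actually use is not stated here, and the function names are hypothetical.

```python
def quantize_int8(values):
    """Symmetric int8 quantization of a non-empty list of floats (assumed scheme)."""
    # One scale for the whole tensor; fall back to 1.0 if every value is zero.
    scale = max(abs(v) for v in values) / 127 or 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from int8 codes."""
    return [q * scale for q in quantized]
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half the scale per value.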

Cloud Transcription (Optional)

For higher accuracy or faster processing, you can connect a cloud provider. Your voice data stays on your PC unless you explicitly enable one. API keys are encrypted at rest using the Windows Data Protection API (DPAPI).

Provider   Model                     Notes
Groq       whisper-large-v3          Fast cloud transcription, supports translation
Groq       whisper-large-v3-turbo    Fastest, no translation
OpenAI     gpt-4o-transcribe         Highest accuracy
OpenAI     gpt-4o-mini-transcribe    Lower cost, good quality
OpenAI     whisper-1                 Classic, supports translation

Configure cloud providers in Settings or during the Welcome Wizard.

System-Wide Dictation

Use a global hotkey to start and stop recording from any app. Transcribed text is automatically pasted into the active text field. The default hotkey is Ctrl+Shift+F9 - you can change it in Settings > Hotkey. You can configure three independent hotkeys, one for each recording mode:

  • Hybrid - Short press toggles recording on/off, long press activates push-to-talk. Best of both worlds.
  • Toggle - Press the hotkey once to start recording, press again to stop. Good for longer dictation where you want hands-free recording.
  • Push-to-Talk - Hold the hotkey to record, release to stop and transcribe. Ideal for quick messages or when you want precise control over recording duration.
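The Hybrid mode above can be sketched as a small state machine that decides on key release whether the press was short (toggle) or long (push-to-talk). This is an illustrative sketch only: the 0.4-second threshold, the class name, and the method names are assumptions, not TypeWhisper's actual implementation.

```python
from dataclasses import dataclass

LONG_PRESS_SECONDS = 0.4  # assumed threshold; the real value is not documented


@dataclass
class HybridHotkey:
    """Tracks recording state for one hybrid hotkey (hypothetical helper)."""
    recording: bool = False
    _pressed_at: float = 0.0
    _was_recording: bool = False

    def on_press(self, now: float) -> None:
        # Remember the state before the press, then record while the key is down.
        self._pressed_at = now
        self._was_recording = self.recording
        self.recording = True

    def on_release(self, now: float) -> None:
        held = now - self._pressed_at
        if held >= LONG_PRESS_SECONDS:
            # Long press: behave like push-to-talk and stop on release.
            self.recording = False
        else:
            # Short press: toggle relative to the state before the press.
            self.recording = not self._was_recording
```

A short tap therefore starts recording and a second tap stops it, while holding the key records only for as long as it is held.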

Live Partial Results

Silero VAD detects speech segments during recording and transcribes them in real time. A floating overlay shows partial transcription results before you stop recording, so you get immediate feedback as you speak.

File Transcription

Transcribe audio and video files directly within the app. Drag and drop files onto the TypeWhisper window, or use the file picker to select them.

  • Supported formats - WAV, MP3, M4A, AAC, OGG, FLAC, WMA, MP4, MKV, AVI, MOV, WebM
  • Batch processing - Queue multiple files and transcribe them sequentially
  • Export - Save results as TXT, SRT, or WebVTT subtitles with accurate timestamps
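To show what the SRT export involves, here is a minimal sketch that turns timed segments into SRT text. The segment shape (start seconds, end seconds, text) and the function names are assumptions for illustration; the app's internal data model is not described here.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms_total = round(seconds * 1000)
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

WebVTT is structurally similar: it begins with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.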

Translation

TypeWhisper supports three translation methods:

  • Canary on-device - Translation between English, German, French, and Spanish using the Canary 180M Flash model. Fully offline.
  • Marian on-device - Local ONNX translation model supporting 20 target languages: EN, DE, FR, ES, IT, NL, PL, SV, DA, FI, CS, RU, UK, HU, JA, ZH, AR, HI, VI, ID. No internet required.
  • Cloud LLM - Groq (Llama 3.3 70B) or OpenAI (GPT-4o-mini) for any language pair. Requires an API key.

Set translation options globally in Settings, or configure them per app using Profiles.

Dictionary

Custom term corrections applied automatically after transcription. Fix names, jargon, or recurring misrecognitions. Supports regex patterns for advanced replacements.

Thirteen built-in term packs are included: Web Dev, .NET/C#, DevOps, Data & AI, Design, Game Dev, Mobile, Security, Databases, Medical, Legal, Finance, and Music Production.
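As an illustration of regex-based corrections, the sketch below applies a list of pattern/replacement rules to a transcript. The rules, the case-insensitive matching, and the function name are hypothetical examples; the regex dialect and rule format TypeWhisper itself uses may differ.

```python
import re

# Hypothetical example rules: (regex pattern, replacement text).
RULES = [
    (r"\btype whisper\b", "TypeWhisper"),
    (r"\bc sharp\b", "C#"),
    (r"\bkubernetes\b", "Kubernetes"),
]


def apply_dictionary(text: str, rules=RULES) -> str:
    """Apply each correction rule in order, ignoring case when matching."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text
```

The `\b` word boundaries keep a rule from firing inside a longer word, which is usually what you want for names and jargon.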

Snippets

Text shortcuts that expand automatically. Define a trigger word and its replacement text. Supports dynamic placeholders:

  • {date}, {time}, {datetime} - Current date/time (custom formats supported, e.g. {date:dd.MM.yyyy})
  • {clipboard} - Current clipboard content
  • {day}, {year} - Current day name or year
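Placeholder expansion of this kind can be sketched as a single regex pass over the snippet text. The code below is an assumption-laden illustration: the default formats, the supported format tokens (only `yyyy`, `MM`, `dd`, `HH`, `mm`, `ss`), and the function names are all hypothetical, and the clipboard content is passed in as a plain argument rather than read from the system.

```python
import re
from datetime import datetime

# Small subset of date-format tokens mapped to strftime codes (assumed subset).
_TOKENS = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H", "mm": "%M", "ss": "%S"}
# Assumed defaults; the app's real default formats are not documented here.
_DEFAULTS = {"date": "yyyy-MM-dd", "time": "HH:mm", "datetime": "yyyy-MM-dd HH:mm"}


def _format(now: datetime, fmt: str) -> str:
    pattern = fmt.replace("%", "%%")  # keep literal percent signs intact
    pattern = re.sub("yyyy|MM|dd|HH|mm|ss", lambda m: _TOKENS[m.group()], pattern)
    return now.strftime(pattern)


def expand(template: str, now: datetime, clipboard: str = "") -> str:
    """Replace {name} and {name:format} placeholders in a snippet."""
    def repl(match):
        name, fmt = match.group(1), match.group(2)
        if name in _DEFAULTS:
            return _format(now, fmt or _DEFAULTS[name])
        if name == "clipboard":
            return clipboard
        if name == "day":
            return now.strftime("%A")  # day name in the current locale
        if name == "year":
            return str(now.year)
        return match.group(0)  # leave unknown placeholders untouched

    return re.sub(r"\{(\w+)(?::([^}]*))?\}", repl, template)
```

For example, `expand("Sent {date:dd.MM.yyyy}", datetime(2024, 3, 5))` yields `Sent 05.03.2024`.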

Whisper Mode

Boost microphone gain for quiet speech or noisy environments. When enabled, TypeWhisper amplifies the microphone input so you can speak softly and still get accurate transcriptions - useful in shared offices, libraries, or late-night sessions. Toggle it per profile or globally in Settings.
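Conceptually, a gain boost multiplies each audio sample by a constant factor and clips the result to the valid range. The sketch below assumes samples normalized to [-1.0, 1.0] and a hypothetical `gain_db` parameter; the gain Whisper Mode actually applies, and any extra processing it does, are not documented here.

```python
def apply_gain(samples, gain_db: float):
    """Amplify normalized samples by gain_db decibels, clipping to [-1.0, 1.0]."""
    factor = 10 ** (gain_db / 20)  # convert decibels to a linear amplitude factor
    return [max(-1.0, min(1.0, s * factor)) for s in samples]
```

A gain of about 6 dB roughly doubles the amplitude, so already-loud input will clip; that trade-off is why such a boost suits quiet speech.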

Audio & Recording

  • Audio Ducking - Automatically reduces system volume while recording to minimize background noise from other applications.
  • Media Pause - Automatically pauses media playback (music, videos) during recording and resumes when done.
  • Audio Normalization - Automatic gain control for consistent input levels, regardless of how close you are to the microphone.
  • Silence Detection - Automatically stops recording after a configurable silence period, so you don't have to press the hotkey again.
  • Sound Feedback - Audio cues for recording start and stop, so you know when TypeWhisper is listening.
  • Non-blocking Pipeline - Multiple recordings can be queued while transcription runs in the background. Start your next recording before the previous one finishes processing.
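The Silence Detection behavior in the list above can be illustrated with a simple energy-based check: recording stops once enough consecutive audio frames fall below a loudness threshold. This sketch deliberately uses a plain RMS threshold, not the Silero VAD model the app uses for speech segments, and the threshold, frame count, and function names are assumed values for illustration.

```python
import math


def rms(frame) -> float:
    """Root-mean-square level of one frame of normalized samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def frames_until_auto_stop(frames, threshold=0.01, silence_frames=3):
    """Return the index of the frame at which recording would stop, or None.

    Recording stops after `silence_frames` consecutive frames whose RMS
    level is below `threshold`.
    """
    quiet_run = 0
    for index, frame in enumerate(frames):
        quiet_run = quiet_run + 1 if rms(frame) < threshold else 0
        if quiet_run >= silence_frames:
            return index
    return None
```

With fixed-length frames, the configurable silence period corresponds to `silence_frames` multiplied by the frame duration.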

Dashboard & History

  • Dashboard - Usage statistics showing total words, recording duration, and number of transcriptions with an activity chart.
  • Transcription History - All transcriptions are saved locally with timestamps, the app they were dictated into, and which engine/model was used. Search and browse your history. Edit transcriptions inline and see correction detection that highlights differences between the original and edited text.
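Correction detection of this kind can be approximated with a word-level diff between the original and edited text. The sketch below uses Python's standard `difflib`; it illustrates the general idea and is not a description of how TypeWhisper computes its highlights. The function name is hypothetical.

```python
import difflib


def word_diff(original: str, edited: str):
    """Return (original_words, edited_words) pairs for each changed span."""
    a, b = original.split(), edited.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Insertions have an empty original side; deletions an empty edited side.
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes
```

Pairs like these could also serve as candidate dictionary entries, since each one records a recurring misrecognition and its fix.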