Qwen3 ASR
Bundled by TypeWhisper
About
Qwen3 ASR is TypeWhisper’s bundled local speech-to-text provider for Qwen3-ASR on Apple Silicon. Version 1.1.0 adds a refreshed macOS settings surface: optional HuggingFace token storage, quick model picks, and ten MLX model variants, ranging from compact quantized 0.6B downloads to the larger 1.7B BF16 model.
Transcription runs on-device after the selected model has been downloaded. TypeWhisper does not send audio to a cloud transcription API for this provider, and no API key is required to transcribe.
The published release is the source of truth for the public version. The plugin release workflow can patch the bundle manifest during release, so a local checkout or registry cache may briefly show an older version while the release finishes.
Sources: Qwen3Plugin v1.1.0 release, TypeWhisper plugin registry, Qwen3 plugin source
Requirements & Privacy
| Requirement | Details | Why it matters |
|---|---|---|
| Platform | macOS 14.0 or newer on Apple Silicon (arm64). | The macOS plugin uses Apple’s MLX stack and ships only for Apple Silicon. |
| Download source | Models are downloaded from Hugging Face into TypeWhisper’s plugin data directory. | The first load can take time and needs enough disk space for the selected model. |
| Network use | Network access is used for model downloads and optional HuggingFace token validation. | Audio transcription itself is local after the model is available. |
| Credentials | No API key is needed. A HuggingFace token is optional. | The token only helps with Hugging Face download rate limits; it is not used as transcription auth. |
| Model memory | The smallest 0.6B variants target 8 GB Macs. The recommended 1.7B 6-bit model targets 16 GB+ Macs. The 1.7B 8-bit and BF16 options target 32 GB+ Macs. | Pick the model that fits your RAM budget before optimizing for quality. |
Sources: Qwen3 plugin manifest, Qwen3 plugin source
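The platform requirement in the table can be expressed as a small check. This is a minimal sketch, not the plugin's own code: the function name `meets_platform_requirements` is hypothetical, and it only encodes the two conditions stated above (arm64 and macOS 14.0 or newer).

```python
import platform


def meets_platform_requirements(machine: str, macos_version: str) -> bool:
    """Return True for Apple Silicon (arm64) running macOS 14.0 or newer.

    Hypothetical helper: it mirrors the Platform row of the table, nothing more.
    """
    if machine != "arm64" or not macos_version:
        return False
    try:
        major = int(macos_version.split(".")[0])
    except ValueError:
        return False
    return major >= 14


# platform.mac_ver() returns an empty version string on non-macOS systems,
# so this prints False there.
print(meets_platform_requirements(platform.machine(), platform.mac_ver()[0]))
```

RAM is left out of the check on purpose: the memory guidance depends on which model you choose, not on the plugin itself.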
HuggingFace Token
The HuggingFace token field is optional. It stores a trimmed token in the macOS Keychain through TypeWhisper’s plugin secret helper and validates the token before saving it.
| Field | What it controls | Example |
|---|---|---|
| HuggingFace Token | Adds a token that MLX/Hugging Face can use for model downloads. | Paste an hf_... token when you hit download rate limits. |
| Show/Hide token | Reveals or masks the token while editing. | Use the eye button to inspect a pasted token before saving. |
| Save | Validates the token and stores it only when Hugging Face accepts it. | Wait for Valid HuggingFace Token before retrying a large model download. |
| Remove | Clears the stored token from TypeWhisper. | Remove it when switching Hugging Face accounts or before handing the Mac to another user. |
| Validating token… | Shows that TypeWhisper is checking the token before storing it. | If validation stays pending, check network access and retry. |
| Invalid HuggingFace Token | The token did not validate. | Create a new token at huggingface.co/settings/tokens and save again. |
The token is not required for normal use. Qwen3 ASR still runs locally and does not need a paid cloud speech-to-text account.
Sources: Hugging Face user access tokens, Qwen3 plugin source
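The trim-then-validate flow described in this section could look roughly like the sketch below. It is an illustration under stated assumptions, not the plugin's implementation: the source does not say which endpoint TypeWhisper calls, so the Hugging Face `whoami-v2` endpoint is used here only as a plausible validation target, and all function names are hypothetical. Keychain storage is not shown.

```python
import urllib.error
import urllib.request
from typing import Optional

# Assumption: the plugin's real validation endpoint is not documented here.
WHOAMI_URL = "https://huggingface.co/api/whoami-v2"


def normalize_token(raw: str) -> Optional[str]:
    """Trim surrounding whitespace; treat an empty field as 'no token'."""
    token = raw.strip()
    return token or None


def build_validation_request(token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request for the token."""
    return urllib.request.Request(
        WHOAMI_URL, headers={"Authorization": f"Bearer {token}"}
    )


def is_token_accepted(token: str, timeout: float = 10.0) -> bool:
    """Send the request; an HTTP error status means the token was rejected.

    Other network failures (urllib.error.URLError) propagate so the caller
    can report them and retry instead of marking the token invalid.
    """
    try:
        with urllib.request.urlopen(
            build_validation_request(token), timeout=timeout
        ) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False
```

In this sketch a token would be stored only after `is_token_accepted` returns `True`, which matches the Save behavior in the table.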
Model Selection
The Qwen3 ASR settings view exposes quick picks plus the full model list. Use the quick picks first unless you have a specific reason to compare quantization levels.
| Quick pick | Model | When to use it |
|---|---|---|
| Fast / 8 GB | Qwen3 0.6B (6-bit) | Fast choice for 8 GB Macs and casual dictation. |
| Best default | Qwen3 1.7B (6-bit) | Recommended default for most 16 GB+ Macs. |
| Quality / 32 GB | Qwen3 1.7B (8-bit) | Higher-quality choice for Macs with 32 GB+ RAM. |
| Model | ID | Hugging Face repository | Download | RAM | Guidance |
|---|---|---|---|---|---|
| Qwen3 0.6B (4-bit) | qwen3-asr-0.6b-4bit | mlx-community/Qwen3-ASR-0.6B-4bit | ~0.7 GB | 8 GB+ | Smallest download; use when memory matters more than quality. |
| Qwen3 0.6B (5-bit) | qwen3-asr-0.6b-5bit | mlx-community/Qwen3-ASR-0.6B-5bit | ~0.8 GB | 8 GB+ | Slightly more precision than 4-bit with a similar footprint. |
| Qwen3 0.6B (6-bit) | qwen3-asr-0.6b-6bit | mlx-community/Qwen3-ASR-0.6B-6bit | ~0.9 GB | 8 GB+ | Fast / 8 GB. Fast pick for 8 GB Macs and casual dictation. |
| Qwen3 0.6B (8-bit) | qwen3-asr-0.6b-8bit | mlx-community/Qwen3-ASR-0.6B-8bit | ~1.0 GB | 16 GB+ | Higher-precision small model when 0.6B quality matters. |
| Qwen3 0.6B (BF16) | qwen3-asr-0.6b-bf16 | mlx-community/Qwen3-ASR-0.6B-bf16 | ~1.6 GB | 16 GB+ | Unquantized small model; useful for comparison or validation. |
| Qwen3 1.7B (4-bit) | qwen3-asr-1.7b-4bit | mlx-community/Qwen3-ASR-1.7B-4bit | ~1.6 GB | 16 GB+ | Smallest 1.7B option; use when 6-bit is too heavy. |
| Qwen3 1.7B (5-bit) | qwen3-asr-1.7b-5bit | mlx-community/Qwen3-ASR-1.7B-5bit | ~1.8 GB | 16 GB+ | Middle ground if the default 6-bit model is tight on memory. |
| Qwen3 1.7B (6-bit) | qwen3-asr-1.7b-6bit | mlx-community/Qwen3-ASR-1.7B-6bit | ~2.0 GB | 16 GB+ | Recommended. Best default for most 16 GB+ Macs. |
| Qwen3 1.7B (8-bit) | qwen3-asr-1.7b-8bit | mlx-community/Qwen3-ASR-1.7B-8bit | ~2.5 GB | 32 GB+ | 32 GB quality. Higher-quality pick for 32 GB+ Macs. |
| Qwen3 1.7B (BF16) | qwen3-asr-1.7b-bf16 | mlx-community/Qwen3-ASR-1.7B-bf16 | ~4.1 GB | 32 GB+ | Largest unquantized model; use for max-fidelity validation. |
Click Download & Load to download and activate a model. The loaded model becomes the active transcription model. Click Unload to deactivate the model and remove that model’s downloaded files.
Sources: Qwen3 model catalog in source, MLX community models on Hugging Face
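For scripting or comparison, the catalog and quick picks above can be captured as plain data. The sketch below copies the IDs, repositories, sizes, and RAM tiers from the tables; the `ModelVariant` type and the helper functions are hypothetical and are not part of the plugin.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelVariant:
    model_id: str       # "ID" column
    repo: str           # "Hugging Face repository" column
    download_gb: float  # approximate download size
    min_ram_gb: int     # "RAM" column


def _variant(size: str, quant: str, download_gb: float, min_ram_gb: int) -> ModelVariant:
    # IDs are all lowercase; repository names keep the uppercase "B" in the size.
    return ModelVariant(
        model_id=f"qwen3-asr-{size.lower()}-{quant}",
        repo=f"mlx-community/Qwen3-ASR-{size}-{quant}",
        download_gb=download_gb,
        min_ram_gb=min_ram_gb,
    )


CATALOG: List[ModelVariant] = [
    _variant("0.6B", "4bit", 0.7, 8),
    _variant("0.6B", "5bit", 0.8, 8),
    _variant("0.6B", "6bit", 0.9, 8),
    _variant("0.6B", "8bit", 1.0, 16),
    _variant("0.6B", "bf16", 1.6, 16),
    _variant("1.7B", "4bit", 1.6, 16),
    _variant("1.7B", "5bit", 1.8, 16),
    _variant("1.7B", "6bit", 2.0, 16),
    _variant("1.7B", "8bit", 2.5, 32),
    _variant("1.7B", "bf16", 4.1, 32),
]

# Quick picks keyed by the RAM tier they target.
QUICK_PICKS = {
    8: "qwen3-asr-0.6b-6bit",
    16: "qwen3-asr-1.7b-6bit",
    32: "qwen3-asr-1.7b-8bit",
}


def quick_pick_for_ram(ram_gb: int) -> Optional[str]:
    """Return the quick pick for the highest RAM tier that fits, if any."""
    tiers = [tier for tier in QUICK_PICKS if tier <= ram_gb]
    return QUICK_PICKS[max(tiers)] if tiers else None


def models_that_fit(ram_gb: int) -> List[ModelVariant]:
    """List every catalog entry whose RAM tier is within the given budget."""
    return [m for m in CATALOG if m.min_ram_gb <= ram_gb]
```

Note that `quick_pick_for_ram` returns the 8-bit quality pick on a 32 GB Mac; the 1.7B 6-bit model remains the recommended default for most 16 GB+ machines.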
Transcription Behavior
Qwen3 ASR appears in TypeWhisper as Qwen3 ASR (MLX) after a model is loaded. It supports local speech recognition, but not translation.
| Setting or input | What TypeWhisper sends | Notes |
|---|---|---|
| Profile language | A Qwen language name when the profile language is supported. | de becomes German; fr becomes French. Empty or unsupported languages leave Qwen3 in auto-detect mode. |
| Auto-detect | No forced language hint. | TypeWhisper does not fall back to English when the profile has no supported Qwen3 language. |
| Dictionary terms / prompt context | An anti-hallucination instruction plus normalized technical terms. | Terms such as TypeWhisper or MLX are passed as context bias when available. |
| Translate to English | Not supported. | Use another transcription provider when the workflow requires translation. |
| Long or looped output | TypeWhisper retries likely looped output with shorter chunks and no context bias. | This guard is intended to prefer a cleaner transcript when Qwen3 repeats words or phrases. |
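The looped-output guard in the last row can be illustrated with a simple heuristic. This sketch is an assumption throughout: the source does not describe how TypeWhisper detects loops, so the back-to-back n-gram check, the thresholds, the halved chunk length, and the `transcribe` callable signature are all hypothetical.

```python
from typing import Callable, Optional


def looks_looped(text: str, max_ngram: int = 4, min_repeats: int = 4) -> bool:
    """Heuristic: True if any phrase of 1..max_ngram words occurs
    at least `min_repeats` times back to back."""
    words = text.lower().split()
    for n in range(1, max_ngram + 1):
        for start in range(len(words) - n * min_repeats + 1):
            unit = words[start:start + n]
            if all(
                words[start + k * n:start + (k + 1) * n] == unit
                for k in range(1, min_repeats)
            ):
                return True
    return False


def transcribe_with_guard(
    transcribe: Callable[[bytes, float, Optional[str]], str],
    audio: bytes,
    chunk_seconds: float = 30.0,
    context: Optional[str] = None,
) -> str:
    """Run once; on likely looped output, retry with shorter chunks
    and without context bias, and return the retry result."""
    first = transcribe(audio, chunk_seconds, context)
    if not looks_looped(first):
        return first
    return transcribe(audio, chunk_seconds / 2, None)
```

The retry drops the context because, per the table, the guard retries without context bias; halving the chunk length is just one way to model "shorter chunks".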
Supported language codes are zh, en, yue, ar, de, fr, es, pt, id, it, ko, ru, th, vi, ja, tr, hi, ms, nl, sv, da, fi, pl, cs, fil, tl, fa, el, ro, hu, and mk. fil and tl both map to Filipino.
Sources: Qwen3 language handling, TypeWhisper plugin SDK dictionary terms
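The language handling in this section amounts to a lookup with no English fallback. In the sketch below, only the `de`, `fr`, `fil`, and `tl` mappings are confirmed by the text above; the remaining names are the standard English names for each code and are an assumption about what the plugin passes to Qwen3. The function name `language_hint` is hypothetical.

```python
from typing import Optional

# Code -> language name. Only de, fr, fil, and tl are confirmed by the
# documentation; the other names are assumed from the standard code meanings.
QWEN_LANGUAGE_NAMES = {
    "zh": "Chinese", "en": "English", "yue": "Cantonese", "ar": "Arabic",
    "de": "German", "fr": "French", "es": "Spanish", "pt": "Portuguese",
    "id": "Indonesian", "it": "Italian", "ko": "Korean", "ru": "Russian",
    "th": "Thai", "vi": "Vietnamese", "ja": "Japanese", "tr": "Turkish",
    "hi": "Hindi", "ms": "Malay", "nl": "Dutch", "sv": "Swedish",
    "da": "Danish", "fi": "Finnish", "pl": "Polish", "cs": "Czech",
    "fil": "Filipino", "tl": "Filipino", "fa": "Persian", "el": "Greek",
    "ro": "Romanian", "hu": "Hungarian", "mk": "Macedonian",
}


def language_hint(profile_language: Optional[str]) -> Optional[str]:
    """Return a language name for supported codes, else None (auto-detect).

    There is deliberately no fallback to English: an empty or unsupported
    profile language produces no hint at all.
    """
    if not profile_language:
        return None
    return QWEN_LANGUAGE_NAMES.get(profile_language.strip().lower())
```

This sketch matches codes exactly; whether the plugin also accepts region-qualified codes such as `de-DE` is not stated in the source.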
Setup Examples
Fast local transcription on an 8 GB Mac
- Open TypeWhisper Settings > Plugins.
- Configure Qwen3 ASR.
- Leave HuggingFace Token empty unless downloads are rate-limited.
- Click Download & Load for Qwen3 0.6B (6-bit).
- Select Qwen3 ASR (MLX) as the transcription engine in Settings or in a profile.
Recommended setup for most 16 GB+ Macs
- Open TypeWhisper Settings > Plugins.
- Configure Qwen3 ASR.
- Click Download & Load for Qwen3 1.7B (6-bit).
- Set the profile language when you want a language hint, or leave it unset for auto-detect.
- Add dictionary terms for product names, people, and domain vocabulary that should be biased into the transcript.
Higher-quality setup on a 32 GB+ Mac
- Open TypeWhisper Settings > Plugins.
- Configure Qwen3 ASR.
- Click Download & Load for Qwen3 1.7B (8-bit).
- Use Qwen3 1.7B (BF16) only when you specifically want an unquantized validation or comparison model.
Switch or remove a model
- Open Qwen3 ASR in the plugin settings.
- Click Unload on the active model if you want to remove its downloaded files.
- Click Download & Load on the next model.
- Reopen the transcription profile and confirm Qwen3 ASR (MLX) is still selected.
Notes
- This page documents the macOS MLX implementation of Qwen3 ASR.
- A HuggingFace token increases download rate limits; it does not turn Qwen3 ASR into a cloud provider.
- Transcription runs locally after the selected model is downloaded.
- The provider supports dictionary terms and prompt context with a 10,000-character budget.
- Translation is not supported by this provider.
- The public 1.1.0 release requires TypeWhisper host version 1.2.1 or newer.
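To illustrate the 10,000-character budget for dictionary terms and prompt context, here is a minimal sketch of fitting normalized terms into a fixed budget. The instruction wording, the `Terms:` layout, and the function names are hypothetical; the source states only that terms are normalized and that the budget is 10,000 characters.

```python
from typing import Iterable, List

MAX_CONTEXT_CHARS = 10_000  # budget stated in the notes above


def _render(instruction: str, terms: List[str]) -> str:
    # Hypothetical layout: the instruction followed by a comma-separated list.
    if not terms:
        return instruction
    return f"{instruction} Terms: {', '.join(terms)}"


def build_context(
    terms: Iterable[str], instruction: str, budget: int = MAX_CONTEXT_CHARS
) -> str:
    """Collapse whitespace, drop empty and case-insensitive duplicate terms,
    and stop adding terms once the rendered context would exceed the budget."""
    kept: List[str] = []
    seen = set()
    for term in terms:
        cleaned = " ".join(term.split())
        key = cleaned.casefold()
        if not cleaned or key in seen:
            continue
        if len(_render(instruction, kept + [cleaned])) > budget:
            break
        kept.append(cleaned)
        seen.add(key)
    return _render(instruction, kept)
```

Because terms are added in order until the budget is reached, listing the most important names first keeps them in the context when the dictionary is large.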