Migrating from SFSpeechRecognizer to SpeechAnalyzer: On-Device Transcription in iOS 26
- Author: Ehsan Azish · 3NSOFTS
- Updated: June 2026
- Read time: 16 min
- Level: Intermediate → Senior
- Platform: iOS 26+, Swift concurrency, audio basics
Implementation Notes
- What broke: Callback-era speech recognition patterns do not map cleanly to iOS 26.
- What to do: Move from delegate callbacks to async transcription flows with explicit permission states.
SFSpeechRecognizer served for nearly a decade, but it was always a delegate-and-callback API bolted onto a pre-concurrency world, with on-device support that felt secondary to its cloud path. iOS 26 introduces SpeechAnalyzer and SpeechTranscriber — a modern, async-native, on-device-first transcription stack. This guide is the practical migration path, written from shipping a production dictation feature on it.
What actually changes
This isn't a one-line API swap. The model is different:
| Concern | SFSpeechRecognizer | SpeechAnalyzer / SpeechTranscriber |
|---|---|---|
| Concurrency | Delegates + completion handlers | async/await, AsyncSequence results |
| On-device | Opt-in, secondary | First-class, default |
| Results | Recognition-result callbacks | Streamed result objects over a sequence |
| Lifecycle | Recognizer + request + task | Analyzer + transcriber module + input stream |
The mental shift: instead of registering a delegate and reacting to callbacks, you feed an audio stream into an analyzer and consume a results sequence. It fits Swift concurrency the way the old API never could.
The old shape (for reference)
```swift
import Speech

// SFSpeechRecognizer: callback-driven, what you're migrating away from.
let recognizer = SFSpeechRecognizer()
let request = SFSpeechAudioBufferRecognitionRequest()
request.requiresOnDeviceRecognition = true

let task = recognizer?.recognitionTask(with: request) { result, error in
    if let result {
        let text = result.bestTranscription.formattedString
        // hop back to main, update UI...
    }
}
// feed buffers via request.append(buffer) from an AVAudioEngine tap
```
The pain points you're leaving behind: manual main-thread hops, requiresOnDeviceRecognition as a flag rather than the default, and no natural async surface.
The new shape
SpeechAnalyzer coordinates one or more analysis modules; SpeechTranscriber is the transcription module. You configure the transcriber, attach it to an analyzer, feed audio, and consume results as an async sequence.
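```swift
import Speech

@available(iOS 26, *)
final class TranscriptionService {
    private var analyzer: SpeechAnalyzer?
    private var transcriber: SpeechTranscriber?

    func start(locale: Locale) async throws -> AsyncStream<String> {
        // Configure the transcriber module (on-device by default).
        // Initializer labels and option names may differ in your SDK; confirm them.
        let transcriber = SpeechTranscriber(
            locale: locale,
            transcriptionOptions: [],
            reportingOptions: [.volatileResults], // also report in-progress text
            attributeOptions: []
        )
        self.transcriber = transcriber

        // Attach it to an analyzer.
        let analyzer = SpeechAnalyzer(modules: [transcriber])
        self.analyzer = analyzer

        // Consume results as they stream in. Nothing arrives until audio is
        // fed to the analyzer (see "Feeding audio").
        return AsyncStream { continuation in
            let task = Task {
                do {
                    // The results sequence can throw, so iterate with `try`.
                    for try await result in transcriber.results {
                        // `text` is attributed text; flatten it to a String.
                        continuation.yield(String(result.text.characters)) // partial + final updates
                    }
                } catch {
                    // AsyncStream cannot carry errors; log or surface them elsewhere.
                }
                continuation.finish()
            }
            // Stop consuming if the caller drops the stream.
            continuation.onTermination = { _ in task.cancel() }
        }
    }
}
```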
API-surface caveat. Apple refined SpeechAnalyzer/SpeechTranscriber initializers and the results-element shape across iOS 26 point releases. Treat the snippet as the architecture (the analyzer coordinates modules; you consume a results sequence) and confirm exact initializer signatures and the result element's property names against the SDK you target. Keep configuration in one place so signature changes are a localized edit.
Feeding audio
You still capture audio (typically via AVAudioEngine), but instead of request.append(buffer) you feed buffers into the analyzer's input. The flow:
- Install a tap on the engine's input node.
- Convert/forward buffers to the analyzer's input stream.
- Consume the transcriber's results sequence on a Task.
- On stop, finalize the analyzer so it flushes the last partial into a final result.
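The four steps above can be sketched without the Speech framework, to show only the stream plumbing. Everything here is a hypothetical stand-in: `AudioChunk` replaces a converted audio buffer, `ToyAnalyzer` replaces the analyzer and transcriber pair, and `runCapture` plays the role of the engine tap. In a real app the input stream's element would be the SDK's analyzer input type, and stopping would call the analyzer's finalize method.

```swift
/// Hypothetical stand-in for a converted audio buffer.
struct AudioChunk: Sendable {
    let samples: [Float]
}

/// Hypothetical stand-in for the analyzer + transcriber pair.
/// It reports a running sample count, then one final summary once input closes.
struct ToyAnalyzer: Sendable {
    func results(for input: AsyncStream<AudioChunk>) -> AsyncStream<String> {
        AsyncStream { continuation in
            let task = Task {
                var total = 0
                for await chunk in input {
                    total += chunk.samples.count
                    continuation.yield("partial: \(total) samples")
                }
                // Input closed: flush the last partial into a final result.
                continuation.yield("final: \(total) samples")
                continuation.finish()
            }
            continuation.onTermination = { _ in task.cancel() }
        }
    }
}

/// Simulates a capture session: each element of `tapBuffers` is one tap callback.
func runCapture(tapBuffers: [[Float]]) async -> [String] {
    let (input, feeder) = AsyncStream<AudioChunk>.makeStream()

    // Step 3: consume the results sequence on its own Task.
    let results = ToyAnalyzer().results(for: input)
    let consumer = Task { () -> [String] in
        var seen: [String] = []
        for await text in results { seen.append(text) }
        return seen
    }

    // Steps 1-2: every tap callback forwards its buffer into the input stream.
    for buffer in tapBuffers {
        feeder.yield(AudioChunk(samples: buffer))
    }

    // Step 4: on stop, close the input so the analyzer can finalize.
    feeder.finish()
    return await consumer.value
}
```

The design point is that the tap callback only yields into a continuation and never touches UI state, so the audio thread stays cheap and all result handling lives on the consuming task.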
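The key behavioral detail: results arrive as a stream of increasingly final hypotheses, the same conceptual model as before, but delivered as an AsyncSequence rather than repeated callbacks. Update your UI on each yield; the last one before finish is your final transcript. One caveat to verify against your SDK: if the transcriber finalizes text one segment at a time instead of re-sending the whole transcript, append each finalized segment as it arrives so that the accumulated text at finish is your final transcript.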
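A minimal sketch of turning a hypothesis stream into display text, assuming each update is either provisional or final for its segment. `Hypothesis`, `TranscriptAccumulator`, and `renderAll` are hypothetical names for illustration, not Speech framework types; map your SDK's result element onto this shape once you have confirmed its property names.

```swift
/// Hypothetical result shape: some text plus whether it can still change.
struct Hypothesis: Sendable {
    let text: String
    let isFinal: Bool
}

/// Keeps finalized text stable and treats the newest provisional text
/// as a tail that the next update may replace.
struct TranscriptAccumulator {
    private(set) var finalized = ""
    private(set) var volatileTail = ""

    /// What the UI should show right now.
    var display: String { finalized + volatileTail }

    mutating func apply(_ update: Hypothesis) {
        if update.isFinal {
            // A final result supersedes the provisional tail for that segment.
            finalized += update.text
            volatileTail = ""
        } else {
            volatileTail = update.text
        }
    }
}

/// Returns the display string after each update, in order.
func renderAll(_ updates: [Hypothesis]) -> [String] {
    var accumulator = TranscriptAccumulator()
    return updates.map { update in
        accumulator.apply(update)
        return accumulator.display
    }
}
```

With this shape the value shown after the last update is the full transcript, even when finalization happens segment by segment.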
Permissions and availability
- You still request speech-recognition authorization (the SFSpeechRecognizer.requestAuthorization flow or its equivalent) and microphone permission. Don't skip the pre-permission UX: App Store review can reject flows that request mic/speech access without context.
- On-device transcription has language/locale availability that varies by device and may require an asset download for some locales. Check availability for the requested Locale and fall back gracefully (or prompt the download) rather than assuming every locale works offline immediately.
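One way to keep these checks explicit is a single preflight decision made before any audio is captured. The sketch below is plain Swift with hypothetical names (`Permission`, `Preflight`, `preflight`, `languageTag`). It assumes you have already read the microphone and speech permission states and obtained the supported and installed locale lists from the SDK; how those lists are fetched is not shown. The language fallback compares identifier prefixes, which is a deliberate simplification.

```swift
import Foundation

/// Hypothetical permission state, mapped from the system APIs elsewhere.
enum Permission: Sendable {
    case notDetermined, denied, granted
}

/// Hypothetical outcome of the preflight check.
enum Preflight: Equatable, Sendable {
    case showPrePermission                  // explain why, then trigger the system prompt
    case blocked                            // user denied; point them to Settings
    case unsupportedLocale(fallback: Locale?)
    case needsAssetDownload(Locale)
    case ready(Locale)
}

/// Language part of an identifier such as "en_US" or "en-US" (simplified).
func languageTag(_ locale: Locale) -> String {
    String(locale.identifier.prefix { $0 != "_" && $0 != "-" })
}

func preflight(
    microphone: Permission,
    speech: Permission,
    requested: Locale,
    supported: [Locale],
    installed: [Locale]
) -> Preflight {
    // Permissions first: a denial blocks everything else.
    if microphone == .denied || speech == .denied { return .blocked }
    if microphone == .notDetermined || speech == .notDetermined {
        return .showPrePermission
    }

    // Locale support next: offer a same-language fallback if one exists.
    guard supported.contains(where: { $0.identifier == requested.identifier }) else {
        let fallback = supported.first { languageTag($0) == languageTag(requested) }
        return .unsupportedLocale(fallback: fallback)
    }

    // Supported but not yet on the device: the caller should prompt a download.
    let isInstalled = installed.contains { $0.identifier == requested.identifier }
    return isInstalled ? .ready(requested) : .needsAssetDownload(requested)
}
```

Returning one enum keeps the UI layer to a single switch statement, and each case maps to a distinct screen or prompt.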
Migration strategy for a shipped app
Don't rip-and-replace. Stage it:
- Gate the new path behind #available(iOS 26, *) and keep SFSpeechRecognizer for older OS versions. You'll support both for at least one release cycle.
- Wrap transcription behind your own protocol (Transcribing) with two implementations, old and new, so the call sites don't change.
- Migrate UI to consume an AsyncStream<String> regardless of backend; that's the shape both can produce, and it future-proofs the call site.
- Validate locale coverage on the new API against what your users actually use before flipping the default.
```swift
protocol Transcribing {
    func start(locale: Locale) async throws -> AsyncStream<String>
    func stop() async
}

// LegacyTranscriber: SFSpeechRecognizer (iOS < 26)
// ModernTranscriber: SpeechAnalyzer (iOS 26+)
```
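A rough sketch of how the two backends might sit behind that protocol. To stay self-contained it repeats the protocol and uses a hypothetical `CallbackRecognizer` in place of SFSpeechRecognizer, and `ModernTranscriber` is a placeholder rather than a real SpeechAnalyzer wrapper. `makeTranscriber` is likewise an illustrative name. The part worth copying is the bridge from a callback into an AsyncStream and the single availability check.

```swift
import Foundation

// Repeated from above so the sketch compiles on its own.
protocol Transcribing {
    func start(locale: Locale) async throws -> AsyncStream<String>
    func stop() async
}

/// Hypothetical callback-style recognizer standing in for the legacy API.
struct CallbackRecognizer {
    func recognize(onText: (String) -> Void, onDone: () -> Void) {
        for text in ["on", "on device"] { onText(text) }
        onDone()
    }
}

/// Bridges callbacks into the stream shape the call site expects.
struct LegacyTranscriber: Transcribing {
    func start(locale: Locale) async throws -> AsyncStream<String> {
        AsyncStream { continuation in
            CallbackRecognizer().recognize(
                onText: { text in continuation.yield(text) },
                onDone: { continuation.finish() }
            )
        }
    }

    func stop() async {}
}

/// Placeholder for the SpeechAnalyzer-backed implementation.
struct ModernTranscriber: Transcribing {
    func start(locale: Locale) async throws -> AsyncStream<String> {
        AsyncStream { continuation in
            continuation.yield("modern backend placeholder")
            continuation.finish()
        }
    }

    func stop() async {}
}

/// The only place that knows which backend is in use.
func makeTranscriber() -> any Transcribing {
    if #available(iOS 26, *) {
        return ModernTranscriber()
    }
    return LegacyTranscriber()
}
```

Call sites then depend only on `any Transcribing`, so flipping the default backend later is a one-line change inside the factory.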
Production checklist
- Adopt the analyzer + module + results-stream model, not a one-to-one callback port.
- Confirm initializer/result signatures against your SDK; isolate them.
- Consume results as an AsyncSequence; the last pre-finish value is final.
- Keep both backends behind a protocol for at least one release cycle.
- Verify per-locale on-device availability; some locales need asset downloads.
- Keep the pre-permission UX — review rejects context-free mic/speech prompts.
Why this matters for shipped apps
SpeechAnalyzer is the API Apple will build on going forward, and its async-native, on-device-default design genuinely simplifies a dictation pipeline — but only if you adopt its streaming model rather than forcing the old callback shape onto it. Get the architecture right (protocol-wrapped, stream-consuming, locale-validated) and you can support old and new OS versions from one clean call site through the transition.
We've shipped production on-device dictation on this stack. If you're moving a speech feature to SpeechAnalyzer and want the migration architected cleanly, that's the kind of work we do at 3NSOFTS.