
Foundation Models GenerationError: Complete Handling Reference for Production iOS Apps

Author: Ehsan Azish · 3NSOFTS
Updated: June 2026
Read time: 14 min
Level: Intermediate
Platform: iOS 26+, Foundation Models, async/await

Implementation Notes

~/ What broke: Opaque Foundation Models errors reached users as generic failures.
~/ What to do: Map Foundation Models failures to deterministic UI states and useful recovery paths.


When a Foundation Models generation fails, it throws a LanguageModelSession.GenerationError. If you handle it only with a generic catch { }, you'll see opaque messages like:

The operation couldn't be completed.
(FoundationModels.LanguageModelSession.GenerationError error 4.)

That tells the user nothing and tells you nothing. This is the reference that maps every case to its meaning, its recoverability, and the action you should take — so your error handling is exhaustive instead of a single catch-all.

The cases

GenerationError is an enum. Each case carries a Context with a debugDescription. Here are the cases this reference covers and how to treat each one.

`exceededContextWindowSize`

What it means: The session transcript (instructions + all prior turns + this prompt) exceeded the model's context window.
Recoverable: Yes, but not in the catch block; you must start a new session.
Action: Summarize-and-carry, or hard reset. See the dedicated guide.
User copy: None. Recover invisibly.
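The summarize-and-carry step could be sketched roughly as follows, assuming the app keeps its own record of turns. `Turn`, `carryOverInstructions`, and the character budget are hypothetical names and values chosen for illustration, and this version carries truncated recent turns rather than a model-written summary. The returned string would seed the instructions of a fresh session.

```swift
// Hypothetical record of one exchange, kept by the app alongside the session.
struct Turn {
    let prompt: String
    let response: String
}

// Builds instructions for a fresh session: the original instructions plus a
// compact recap of the most recent turns. Truncation stands in for a real summary.
func carryOverInstructions(
    base: String,
    turns: [Turn],
    keepLast: Int = 2,
    maxCharsPerField: Int = 120
) -> String {
    let recent = turns.suffix(max(0, keepLast))
    guard !recent.isEmpty else { return base }

    let recap = recent.map { turn in
        let question = String(turn.prompt.prefix(maxCharsPerField))
        let answer = String(turn.response.prefix(maxCharsPerField))
        return "User: \(question)\nAssistant: \(answer)"
    }.joined(separator: "\n")

    return "\(base)\n\nEarlier conversation (condensed):\n\(recap)"
}
```

A hard reset is the same call with an empty `turns` array, which returns the base instructions unchanged.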

`guardrailViolation`

What it means: Content was flagged by safety guardrails, OR model assets are missing and the failure is misreported as a guardrail. Inspect debugDescription.
Recoverable: Depends. Asset failures need onboarding/availability handling; genuine rejections need a fallback.
Action: Classify, then route. See the guardrail guide.
User copy: For genuine rejections, silently fall back. Never say "your request was unsafe."
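A minimal sketch of the classify-then-route step, working on the debugDescription string. The marker substrings below are assumptions made up for illustration, not documented framework output, so check them against the descriptions you actually observe. `GuardrailRoute` and `assumedAssetMarkers` are hypothetical names.

```swift
import Foundation

// Hypothetical routes for a guardrailViolation, matching the two outcomes described above.
enum GuardrailRoute: Equatable {
    case assetIssue       // model assets missing; route to availability handling
    case safetyFallback   // genuine rejection; fall back silently
}

// ASSUMPTION: these substrings are illustrative only. Replace them with
// whatever your own logged debugDescription values actually contain.
let assumedAssetMarkers = ["asset", "not found", "unavailable"]

func classifyGuardrail(debugDescription: String) -> GuardrailRoute {
    let text = debugDescription.lowercased()
    let looksLikeAssetProblem = assumedAssetMarkers.contains { text.contains($0) }
    return looksLikeAssetProblem ? .assetIssue : .safetyFallback
}
```

Defaulting to the safety fallback keeps the failure mode quiet: an unrecognized description degrades to the deterministic path instead of showing setup UI to a user whose model is fine.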

`assetsUnavailable`

What it means: Required model assets aren't available: model not downloaded, mid-download, or evicted.
Recoverable: Not immediately. The model needs to become available.
Action: Route to your availability-gated UI. Check SystemLanguageModel.default.availability for the specific reason (device not eligible, model not ready, AI features off).
User copy: "On-device intelligence is still setting up", or hide the feature entirely.
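The availability-gated routing might look like the sketch below. To keep it compilable without the SDK, a local enum stands in for the three reasons named above; a real app would read them from SystemLanguageModel.default.availability. `AssumedUnavailableReason`, `FeatureGate`, and the opt-in copy are hypothetical.

```swift
// Stand-in for the reasons named above; a real app would read them from
// SystemLanguageModel.default.availability rather than defining its own.
enum AssumedUnavailableReason {
    case deviceNotEligible
    case modelNotReady
    case aiFeaturesOff
}

// Hypothetical UI states for an availability-gated feature.
enum FeatureGate: Equatable {
    case hidden                       // never offer the feature on this device
    case settingUp(message: String)   // temporary; worth re-checking later
    case needsOptIn(message: String)  // the user can turn the feature on
}

func gate(for reason: AssumedUnavailableReason) -> FeatureGate {
    switch reason {
    case .deviceNotEligible:
        return .hidden
    case .modelNotReady:
        return .settingUp(message: "On-device intelligence is still setting up")
    case .aiFeaturesOff:
        // Illustrative copy, not taken from the article.
        return .needsOptIn(message: "Turn on Apple Intelligence in Settings to use this feature")
    }
}
```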

`unsupportedGuide`

What it means: A @Guide constraint you used isn't supported (e.g. a regex or constraint type the model can't honor).
Recoverable: Only by fixing your schema. This is a developer error, not a runtime condition.
Action: Simplify the offending @Guide. Catch in development; it should never reach production.
User copy: None. Fix it before shipping.

`unsupportedLanguageOrLocale`

What it means: The prompt's language/locale isn't supported by the on-device model.
Recoverable: No, for that input.
Action: Detect unsupported locales up front and fall back to a non-AI path or a server model where you have one.
User copy: Degrade silently; don't expose locale limits.

`decodingFailure`

What it means: The model produced output that couldn't be decoded into your @Generable type.
Recoverable: Often, with a retry, because generation is stochastic.
Action: Retry once with the same or a slightly simplified schema. If it persists, your @Generable type is likely too complex; flatten it.
User copy: None. Retry transparently, then fall back.
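"Retry once, then fall back" can be captured in a small generic helper. This is a sketch under assumptions: `retryOnce` and its parameter names are hypothetical, and the `shouldRetry` predicate is where a real caller would check for the decodingFailure case.

```swift
// Runs `operation`. If it throws an error that `shouldRetry` accepts, runs it
// exactly one more time. Any other failure, or a second failure, yields `fallback`.
func retryOnce<T>(
    shouldRetry: (Error) -> Bool,
    fallback: T,
    operation: () async throws -> T
) async -> T {
    do {
        return try await operation()
    } catch {
        guard shouldRetry(error) else { return fallback }
        // Second and final attempt; its error is discarded in favor of the fallback.
        return (try? await operation()) ?? fallback
    }
}
```

Because the helper never loops, a @Generable type that fails persistently costs at most two generations before the deterministic path takes over.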

`rateLimited`

What it means: Too many requests in too short a window.
Recoverable: Yes, with backoff.
Action: Exponential backoff and retry. Also a signal to debounce user-driven generation (e.g. don't fire on every keystroke).
User copy: None. Back off silently.
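For the backoff schedule, a capped exponential delay is one common shape. The function name, base delay, and cap below are assumptions chosen for illustration; tune them to your own traffic.

```swift
import Foundation

// Delay before retry number `attempt` (0-based): base * 2^attempt, never above `cap`.
func backoffDelay(
    attempt: Int,
    base: TimeInterval = 0.5,
    cap: TimeInterval = 8
) -> TimeInterval {
    let exponent = Double(max(0, attempt))
    return min(cap, base * pow(2, exponent))
}
```

With these defaults, attempts 0 through 4 wait 0.5, 1, 2, 4, and 8 seconds, and every later attempt stays at the 8-second cap. Pair it with a small maximum attempt count so the backoff stays bounded.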

A complete handler

Exhaustive switching beats a catch-all. This routes every case to the right behavior:

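import FoundationModels

// Assumes `session` is a LanguageModelSession owned by the enclosing type, and that
// GenerationOutcome, recoverContextWindow(), and classifyGuardrail(_:) are defined elsewhere.
func generate(_ prompt: String) async -> GenerationOutcome {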
    do {
        let answer = try await session.respond(to: prompt)
        return .success(answer.content)
    } catch let error as LanguageModelSession.GenerationError {
        switch error {
        case .exceededContextWindowSize:
            await recoverContextWindow()        // new session, then caller retries
            return .retryable

        case .guardrailViolation(let ctx):
            return classifyGuardrail(ctx)        // .assetIssue or .safetyFallback

        case .assetsUnavailable:
            return .modelUnavailable

        case .decodingFailure:
            return .retryable                    // single retry upstream

        case .rateLimited:
            return .backoff

        case .unsupportedLanguageOrLocale,
             .unsupportedGuide:
            return .fallback                     // deterministic path

        @unknown default:
            // Future-proofing: Apple has added cases across point releases.
            return .fallback
        }
    } catch {
        return .fallback
    }
}

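The @unknown default is not optional hygiene here: Apple has added new GenerationError cases across iOS 26 point releases. With it, a case your build doesn't know about degrades to your deterministic fallback at runtime, and when you rebuild against a newer SDK the compiler warns that the switch no longer covers every known case. A plain default would also absorb new cases, but without that warning, so a case that deserves its own handling can be misrouted unnoticed.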
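The handler returns a GenerationOutcome, which the article does not define. One possible shape, together with a mapping to deterministic UI states, is sketched here. The case list is inferred from the handler and its comments; `ViewState` and `viewState(for:)` are hypothetical.

```swift
// Hypothetical definition of the outcome type returned by generate(_:).
enum GenerationOutcome: Equatable {
    case success(String)
    case retryable
    case backoff
    case modelUnavailable
    case assetIssue
    case safetyFallback
    case fallback
}

// Hypothetical UI states; every outcome maps to exactly one of them.
enum ViewState: Equatable {
    case showing(String)       // generated text is on screen
    case working               // retry or backoff in progress, no error shown
    case setupNotice           // model not available yet
    case deterministicResult   // non-AI path rendered instead
}

func viewState(for outcome: GenerationOutcome) -> ViewState {
    switch outcome {
    case .success(let text):
        return .showing(text)
    case .retryable, .backoff:
        return .working
    case .modelUnavailable, .assetIssue:
        return .setupNotice
    case .safetyFallback, .fallback:
        return .deterministicResult
    }
}
```

Since this enum belongs to the app, its switch needs no default: adding an outcome later becomes a compile error until the UI mapping handles it.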

Decoding "error N" messages

The bare GenerationError error 4 comes from a generic localizedDescription on an error you didn't pattern-match. The fix is simply to always switch on the typed enum rather than printing localizedDescription. Once you match the cases above, you never see a numeric code again — you see exceededContextWindowSize, guardrailViolation, and so on, which are actionable.

If you must log the raw error for telemetry, log the case name and context.debugDescription, not the localized string:

private func log(_ error: LanguageModelSession.GenerationError) {
    // Stable, greppable telemetry — not the user-facing localized string.
    logger.error("GenerationError: \(String(describing: error))")
}
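If you want the bare case name as a separate telemetry field, reflection is one way to get it without a second switch. This is a sketch that assumes the error is a Swift enum; `caseName(of:)` is a hypothetical helper, and the context's debugDescription would still be logged alongside it.

```swift
// Returns the enum case name without its payload, e.g. "rateLimited".
func caseName(of error: Error) -> String {
    let mirror = Mirror(reflecting: error)
    // An enum case with an associated value is reflected as a single child
    // whose label is the case name.
    if mirror.displayStyle == .enum, let label = mirror.children.first?.label {
        return label
    }
    // Cases without a payload have no children; their description is the bare name.
    return String(describing: error)
}
```

The result is short and stable enough to use as a metric dimension, whereas the full description includes the payload text and varies from one failure to the next.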

Production checklist

Switch on the typed enum, never print localizedDescription to the user.
Include @unknown default — new cases ship in point releases.
Map each case to one of: recover, retry, backoff, fallback, model-unavailable.
Retryable cases (decodingFailure, rateLimited) get one retry or a bounded backoff, not an open-ended loop.
Log the case name + debugDescription for telemetry; never the numeric code.

Why this matters for shipped apps

The difference between a robust AI feature and a fragile one is entirely in this switch statement. Most tutorials show the happy path and a single catch { }. Production traffic exercises every branch — long sessions, unsupported locales, devices mid-download, the occasional decode miss. Handle them exhaustively once and the feature simply works for everyone; handle them lazily and you ship "error 4" to real users.

Want this error-handling layer designed correctly before launch? It's part of every on-device AI integration we do at 3NSOFTS.
