Power Automate error handling patterns
How to make flows robust to failure — Run After, Try/Catch, retries, logging, and alerting.
Power Automate flows that work in development but break in production usually fail because of inadequate error handling. Network timeouts, API rate limits, missing data, permissions changes — production reality is full of edge cases that dev environments don't surface. Robust flows handle them deliberately.
The Run After mechanism
Every action in a flow has a Run After setting that controls whether it executes, based on the outcome of the action before it. More than one outcome can be selected at once:
- Is successful (default) — run only if previous succeeded.
- Has failed — run only if previous failed.
- Is skipped — run only if previous was skipped.
- Has timed out — run only if previous timed out.
Setting Run After to "has failed" (and usually "has timed out" as well) on a logging action gives you a "catch" branch:
[Action that might fail]
├── (default) → continue with main flow
└── (run after failure) → log to SharePoint, notify Teams, exit gracefully
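Flows are configured in the designer rather than written as code, so the following Python is only a model of the rule, not the engine itself. It assumes the shape used by the `runAfter` property in exported flow definitions (a map from predecessor name to allowed statuses); the action name `Get_items` and the helper `should_run` are made up for illustration.

```python
# Simplified model of Run After evaluation. This is a sketch, not the real engine.
SUCCEEDED, FAILED, SKIPPED, TIMED_OUT = "Succeeded", "Failed", "Skipped", "TimedOut"


def should_run(run_after: dict, outcomes: dict) -> bool:
    """Return True when every listed predecessor ended in an allowed status."""
    return all(outcomes.get(name) in allowed for name, allowed in run_after.items())


# Hypothetical action name. The main path keeps the default setting;
# the logging action runs after a failure or a timeout.
continue_main_flow = {"Get_items": [SUCCEEDED]}
log_failure = {"Get_items": [FAILED, TIMED_OUT]}

print(should_run(continue_main_flow, {"Get_items": FAILED}))  # False
print(should_run(log_failure, {"Get_items": FAILED}))         # True
```

The two branches are mutually exclusive for any single outcome of `Get_items`, which is what makes the failure branch behave like a catch.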
Scopes as try/catch blocks
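A Scope action groups several actions and is marked as failed if any action inside it fails. Run After can be configured on the scope as a whole, which gives you try/catch:
Scope: Try
├── Action 1
├── Action 2
└── Action 3
Scope: Catch (run after Try has failed or has timed out)
├── Log failure
└── Notify owner
Scope: Finally (run after Catch is successful, has failed, is skipped, or has timed out)
└── Cleanup
Finally is tied to every outcome of Catch because Catch is skipped whenever Try succeeds, and Finally still has to run in that case.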
The pattern matches familiar try/catch/finally from imperative languages. Use it liberally; most production flows benefit.
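For readers who think in code, here is the imperative analogue of the three scopes as a minimal Python sketch. The action functions and the `events` list are hypothetical stand-ins; nothing here calls Power Automate.

```python
def run_flow(actions, events):
    """Run actions in order, recording what each 'scope' did."""
    try:
        for action in actions:              # Scope: Try
            action()
            events.append(f"ran {action.__name__}")
    except Exception as exc:                # Scope: Catch
        events.append(f"log failure: {exc}")
        events.append("notify owner")
    finally:                                # Scope: Finally
        events.append("cleanup")
    return events


def action_1():
    pass


def action_2():
    raise RuntimeError("connector timed out")  # simulated failure


def action_3():
    pass


print(run_flow([action_1, action_2, action_3], []))
# ['ran action_1', 'log failure: connector timed out', 'notify owner', 'cleanup']
```

As in the flow version, the failure in the second action skips the third, the catch block records it, and cleanup runs either way.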
Retry policies
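Most connector actions retry automatically on transient failures such as network blips and throttling responses. The retry policy is configurable per action, in the action's settings:
- Default: the platform's built-in policy, an exponential backoff with a small number of retries.
- None: no retries; the action fails on the first error.
- Exponential interval: the wait grows after each attempt, with a configurable retry count, initial interval, and minimum and maximum interval.
- Fixed interval: the same wait between every attempt, with a configurable retry count and interval.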
For idempotent actions (reading data, creating records that won't duplicate), enable generous retries. For non-idempotent actions (sending email, posting to Teams), be careful: if the first attempt succeeded but its response was lost, a retry sends the message a second time.
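To get a feel for how the policies differ, the sketch below computes the wait before each retry. It is a deliberately simplified, deterministic model: the real service may randomise intervals, and the function name, parameters, and doubling rule are assumptions made for illustration.

```python
def retry_delays(policy, count, interval, max_interval=3600):
    """Return the wait in seconds before each retry attempt (simplified model)."""
    if policy == "none":
        return []
    if policy == "fixed":
        return [interval] * count
    if policy == "exponential":
        # Double the wait on each attempt, capped at max_interval.
        return [min(interval * 2 ** attempt, max_interval) for attempt in range(count)]
    raise ValueError(f"unknown policy: {policy}")


print(retry_delays("fixed", 3, 10))           # [10, 10, 10]
print(retry_delays("exponential", 4, 5))      # [5, 10, 20, 40]
print(retry_delays("exponential", 4, 5, 15))  # [5, 10, 15, 15]
```

Four exponential retries starting at 5 seconds give the downstream service 75 seconds to recover, against 20 seconds for four fixed 5-second retries, which is why exponential backoff suits throttling better.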
Throttling and rate limits
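Microsoft 365 connectors enforce request limits and throttle aggressively at scale. A flow that runs once an hour is fine; a flow pushing 10,000 records through a connector in a tight loop will hit those limits and start receiving HTTP 429 (Too Many Requests) responses.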
Mitigations:
- Turn on concurrency control for Apply to each and set the degree of parallelism (1 to 50) low enough to stay under the connector's limits.
- Add delays between iterations.
- Batch operations where the connector supports it.
- Cache data so you're not re-reading the same source repeatedly.
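The first three mitigations can be sketched together in Python: a bounded worker pool stands in for the concurrency setting, records are processed in batches, and an optional pause separates batches. All names here (`process_all`, `fake_connector_call`, `PeakCounter`) are hypothetical, and the sleep simulates a connector call rather than making one.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split items into consecutive batches of at most `size`."""
    return [items[i:i + size] for i in range(0, len(items), size)]


class PeakCounter:
    """Tracks how many calls are in flight at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)

    def __exit__(self, *exc):
        with self._lock:
            self.active -= 1


def process_all(records, handler, max_parallel=5, batch_size=20, delay=0.0):
    """Process records batch by batch with at most `max_parallel` in flight."""
    results = []
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        for batch in chunked(records, batch_size):
            results.extend(pool.map(handler, batch))  # results keep input order
            time.sleep(delay)  # pause between batches
    return results


counter = PeakCounter()


def fake_connector_call(record):
    # Stand-in for a connector action; the sleep simulates network latency.
    with counter:
        time.sleep(0.01)
        return record * 2


out = process_all(list(range(50)), fake_connector_call, max_parallel=5)
print(len(out), counter.peak <= 5)  # 50 True
```

Lowering `max_parallel` trades throughput for fewer throttled calls, which is the same trade-off the concurrency setting makes in a flow.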
Idempotency
Design flows so re-running them produces the same result as running once. For actions that change state:
- Check whether the work is already done before doing it: "is this Planner task already created? If so, update; if not, create."
- Use idempotency keys: attach a unique identifier, derived from the source record, to each operation so that a retry is recognised and not applied twice.
This makes recovery from partial failures easy: just rerun the flow.
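A minimal sketch of both ideas, using an in-memory dictionary in place of a real system such as Planner. The `TaskStore` class, the key format, and the sample record are all hypothetical.

```python
class TaskStore:
    """In-memory stand-in for a target system that holds tasks."""

    def __init__(self):
        self.tasks = {}

    def upsert(self, key, title):
        """Create the task if the key is new, otherwise update it in place."""
        if key in self.tasks:
            self.tasks[key]["title"] = title
            return "updated"
        self.tasks[key] = {"title": title}
        return "created"


def idempotency_key(source, record_id):
    # Deterministic: the same source record always yields the same key.
    return f"{source}:{record_id}"


store = TaskStore()
key = idempotency_key("orders", 1042)
print(store.upsert(key, "Review order 1042"))  # created
print(store.upsert(key, "Review order 1042"))  # updated
print(len(store.tasks))                        # 1
```

Because the key comes from the source record rather than from the run, rerunning the whole flow after a partial failure touches the same task instead of creating a second one.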
Logging
Production flows benefit from a structured logging pattern:
- Send all errors to a central SharePoint list or Dataverse table — date, flow name, error message, affected record.
- Notify the owner for critical failures via Teams or email.
- Dashboard the log in Power BI — visibility into flow health.
Power Automate's built-in run history is per-flow; centralising errors makes them easier to spot across many flows.
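One way to keep log entries consistent across flows is to agree on a fixed record shape. The sketch below mirrors the four fields listed above; the class name, field names, and sample values are assumptions, and in a flow the equivalent would be the columns of the SharePoint list or Dataverse table.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class FlowErrorLog:
    timestamp: str        # ISO 8601, UTC
    flow_name: str
    error_message: str
    affected_record: str


def build_log_entry(flow_name, error_message, affected_record, now=None):
    """Build one log row; `now` can be injected to make tests deterministic."""
    now = now or datetime.now(timezone.utc)
    return FlowErrorLog(now.isoformat(), flow_name, error_message, affected_record)


entry = build_log_entry(
    "Invoice approval",
    "429 Too Many Requests",
    "INV-0042",
    now=datetime(2024, 1, 15, 2, 0, tzinfo=timezone.utc),
)
print(json.dumps(asdict(entry), indent=2))
```

Storing the timestamp in UTC and keeping the affected record as its own field makes the log straightforward to filter and chart later.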
Alerting
Critical flows need active alerts on failure, not just logs. Patterns:
- Teams notification to a dedicated channel.
- PagerDuty / Opsgenie for incident response.
- Email to the owner.
Configure these in the catch scope.
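Not every failure deserves a page at 2am, so it can help to decide the routing up front. The sketch below is one hypothetical mapping from severity to channels; the severity names and the routing table are assumptions, and in a flow this logic would typically be a Switch or Condition inside the catch scope.

```python
def alert_channels(severity):
    """Pick notification targets for a failure of the given severity."""
    routes = {
        "critical": ["pagerduty", "teams", "email"],
        "high": ["teams", "email"],
        "low": ["email"],
    }
    # Unknown severities fall back to the loudest route rather than being dropped.
    return routes.get(severity, routes["critical"])


print(alert_channels("high"))     # ['teams', 'email']
print(alert_channels("unknown"))  # ['pagerduty', 'teams', 'email']
```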
Monitoring as code
For mature deployments, the Center of Excellence (CoE) Starter Kit provides cross-flow monitoring dashboards. For high-volume flows, integration with Logic Apps and Application Insights gives deeper telemetry.
When to graduate to a real platform
Power Automate excels at low-to-medium-volume orchestration with rich connectors. For very high-volume or latency-sensitive scenarios, or ones that need complex error handling, consider:
- Azure Logic Apps — same engine, more enterprise features.
- Azure Durable Functions — for orchestrations requiring code.
- Custom services — for the highest-volume real-time scenarios.
For most Microsoft 365 automation, Power Automate plus good error-handling patterns is enough. The investment in robust patterns pays back the first time a critical flow fails at 2am.