What AI Gets Wrong About Power Platform (And Why That Is the Point)
AI made three Power Automate architecture mistakes in 10 minutes. After correction, it delivered 14 production-ready flows. Here is the real pattern.
Three architectural mistakes in the first 10 minutes. That is how AI-assisted Power Platform development actually starts. Not with perfect output, but with wrong defaults that reveal exactly where human judgment matters most in Power Automate solutions.
And yet, by the end of that same session, 14 production-ready notification flows were packaged in a Dataverse solution ZIP, imported, and working. Every flow followed the corrected architecture. Every FetchXML query used the same patterns. Every email came from the shared mailbox instead of the flow owner. Consistent output across 14 flows in a way no human team could match under the same time pressure.
Both of those things are true. The AI got the architecture wrong AND executed the corrected architecture better than I could have alone. That tension is the entire story of AI-assisted development right now.
The Three Mistakes
Mistake 1: Real-Time Triggers for Everything
The AI proposed real-time triggers for all 14 notification flows. On the surface, that is reasonable. Real-time is the most common Power Automate trigger pattern. Someone creates a record, a flow fires, an email goes out. Simple.
Wrong.
When Sarah Chen (our product owner) opens a review cycle, the bulk-creation flow generates dozens or hundreds of evaluation records simultaneously. Each mrd_personnelevaluation creation fires downstream flows. A supervisor responsible for 15 direct reports would receive 15 individual emails within seconds. Not a summary. Not a digest. Fifteen separate emails, each with a deep link to a different evaluation form.
The correct pattern is daily digest. One email per person per day, sent at 8:00 AM Eastern, listing all new items since the last run. A single email with a table of assignments instead of an inbox flood.
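The grouping step at the heart of the digest pattern can be sketched in a few lines. This is an illustrative Python sketch, not the actual flow logic; the record shape (`recipient`, `title`, `link` keys) is a made-up stand-in for the real Dataverse schema.

```python
from collections import defaultdict

def build_digests(new_items):
    """Group pending notification items by recipient so each person
    gets one digest email with a table of rows, instead of one email
    per record. Keys here are hypothetical, not the real schema."""
    digests = defaultdict(list)
    for item in new_items:
        digests[item["recipient"]].append(item)
    return dict(digests)

items = [
    {"recipient": "supervisor@example.com", "title": "Evaluation A",
     "link": "https://example.com/form/1"},
    {"recipient": "supervisor@example.com", "title": "Evaluation B",
     "link": "https://example.com/form/2"},
]

# One recipient, one email, two table rows -- not two emails.
digests = build_digests(items)
```

Fifteen evaluation records for one supervisor collapse into a single entry with fifteen rows, which is exactly the inbox-flood fix described above.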
The only exception is rejection. Rejection is rare (one at a time), urgent (the author needs to revise immediately), and always a deliberate human action. Rejection notifications are real-time. The other 12 flows run on schedule.
Mistake 2: Embedding Email Actions in Business Flows
The AI proposed adding email steps directly into the existing business logic flows. The signing step advancement flow already knows when a step becomes “Awaiting.” Why not add an email action right there?
Because a notification failure must never break the signing chain.
If the email connector throws a throttling error, or the shared mailbox is temporarily unavailable, or the HTML template has a rendering issue, none of that should prevent the signing step from advancing. The business process is the primary concern. Notifications are secondary. They need to be independently deployable, independently disable-able, and independently testable.
I enforced strict separation. The 14 notification flows share zero logic with the 10 business logic flows. You can disable every notification flow in production without affecting a single signing chain. You can redeploy notifications without touching business logic. You can hand notification development to a different team entirely.
Mistake 3: Generic Naming
The AI proposed descriptive but generic names. Something like “Send Email When Form Assigned” or “Notify Signer of Ready Step.”
That works for 3 flows. It falls apart at 14. And it completely breaks when you plan for future channels.
I established the NTF-EMAIL-## convention. Every notification flow has a channel type prefix (NTF-EMAIL, with NTF-INAPP and NTF-TEAMS reserved for future channels), a sequence number, and a human-readable suffix. NTF-EMAIL-01 - Form Assigned. NTF-EMAIL-05 - Rejection to Author. NTF-EMAIL-12 - Supervisor Escalation.
When in-app notifications ship later, NTF-INAPP-01 through NTF-INAPP-14 slot right into the same architecture. No renaming. No reorganization. The tag-based architecture handles the rest.
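The convention is mechanical enough to generate and validate programmatically. A minimal sketch, assuming only the channel prefixes and name shape described above:

```python
import re

# Channel prefixes from the convention above; INAPP and TEAMS are
# reserved for future channels.
CHANNELS = {"EMAIL", "INAPP", "TEAMS"}
NAME_RE = re.compile(r"^NTF-(EMAIL|INAPP|TEAMS)-(\d{2}) - (.+)$")

def flow_name(channel, seq, suffix):
    """Build an NTF-<CHANNEL>-## flow name, e.g. 'NTF-EMAIL-01 - Form Assigned'."""
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel: {channel}")
    return f"NTF-{channel}-{seq:02d} - {suffix}"

assert NAME_RE.match(flow_name("EMAIL", 1, "Form Assigned"))
assert NAME_RE.match(flow_name("INAPP", 14, "Supervisor Escalation"))
```

A validator like `NAME_RE` can run as a pre-import quality gate, catching any flow that drifts from the convention before it reaches the solution ZIP.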
Why Does AI Get Power Platform Architecture Wrong?
This is not an AI quality problem. It is a training data distribution problem. AI models optimize for the most common pattern in their training data, which means they default to whatever Microsoft Learn tutorials and community posts repeat most often. That works for simple automations but fails for enterprise-scale solutions with domain-specific constraints.
And the most common Power Automate pattern IS real-time triggers. Most flows fire on record creation or update. Most notification examples in Microsoft Learn show an email action embedded in the same flow that does the work. Most tutorials use descriptive names without a tagging system.
The AI gave me the statistically most likely answer. That answer was wrong because my domain has constraints that the most common pattern does not account for:
- Bulk operations. Most Power Automate solutions create records one at a time. Meridian creates hundreds at once during cycle opening.
- Security boundaries. Most tutorials treat notifications as part of the business flow. Meridian requires notification failures to be completely isolated from business logic.
- Future channels. Most examples solve for today’s notification method. Meridian needs a naming system that accommodates channels that do not exist yet.
AI has no way to know these constraints from a prompt alone. Microsoft’s own Copilot documentation is explicit about this: “All changes done by copilot should be reviewed in the designer.” That is not a disclaimer. That is an accurate description of how AI-assisted development works. The output needs human review. Every time.
The Correction Pattern
Here is the part that surprised me.
When I corrected the first mistake (real-time to daily digest), I did not just say “change it.” I explained why. Bulk creation. Email flooding. The 8:00 AM digest pattern. The rejection exception.
The AI did not just fix that one flow. It incorporated the reasoning into every subsequent notification flow. All 12 scheduled flows got the daily digest pattern. Both rejection flows got real-time triggers. I corrected once. The AI applied the correction 14 times.
Same with the separation principle. I explained why notifications must be independent flows. From that point forward, the AI never proposed embedding an email action in a business flow again. The correction became the new baseline.
This is fundamentally different from fixing things manually. When a junior developer makes the same mistake on flow 1, they might make it again on flow 7. They might forget by flow 12. Human consistency degrades across repetitive work. AI consistency does not. Once corrected, the correction holds across all subsequent output.
The pattern is simple: human corrects with reasoning, AI incorporates the reasoning into everything that follows. Not “do it differently.” Rather, “do it differently because X.” The “because” is what makes the correction stick.
Where AI Excels
After the three corrections landed and the spec was updated, the AI executed the corrected patterns without a single deviation. Here is what it produced:
14 consistent flow definitions. Every flow followed the same structural pattern. Variable initialization chains at the top level (a Power Automate platform constraint). FetchXML queries instead of OData $filter for temporal conditions and linked entity joins. Sequential Apply-to-each loops with concurrency set to 1 for predictable email body construction. SharedMailboxSendEmailV2 from the shared mailbox, not SendEmailV2 from the flow owner.
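For context, here is the kind of query where FetchXML earns its place over OData `$filter`: a temporal condition plus a linked-entity join in one statement. This is an illustrative fragment only; the entity name comes from the article, but the `mrd_name` attribute and the specific conditions are assumptions, not the production queries.

```xml
<fetch>
  <entity name="mrd_personnelevaluation">
    <attribute name="mrd_name" />
    <filter>
      <!-- temporal condition: records created since the last digest run -->
      <condition attribute="createdon" operator="last-x-hours" value="24" />
      <condition attribute="statuscode" operator="eq" value="1" />
    </filter>
    <!-- linked-entity join to resolve the owner's email address -->
    <link-entity name="systemuser" from="systemuserid" to="ownerid">
      <attribute name="internalemailaddress" />
    </link-entity>
  </entity>
</fetch>
```

Operators like `last-x-hours` and joins that project columns from the linked entity are awkward or impossible to express in a single OData `$filter`, which is why the spec standardized on FetchXML for these cases.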
Format conversion. Power Automate’s editor format and solution export format are different. The export format wraps the definition in a properties object, simplifies connection references, and adds a schemaVersion field. Microsoft does not document this transformation. The AI handled it across all 14 files without a single structural error.
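Since the transformation is undocumented, any sketch of it is necessarily an approximation. The following Python sketch mirrors only what the article describes (wrap the definition in a `properties` object, add a `schemaVersion` field); the exact key names and the connection-reference simplification are assumptions, not the documented export format:

```python
def to_solution_format(editor_definition, schema_version="1.0.0.0"):
    """Approximate editor-format -> solution-export-format transform.
    Only the 'properties' wrapper and 'schemaVersion' field are taken
    from the article; everything else here is an assumption. The real
    export also simplifies connection references (omitted here)."""
    return {
        "schemaVersion": schema_version,
        "properties": {
            "definition": editor_definition,
        },
    }

editor = {"triggers": {}, "actions": {}}
exported = to_solution_format(editor)
```

The point is less the exact shape than the fact that it had to be applied identically to 14 files, which is precisely the kind of mechanical transformation AI does not get tired of.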
Deterministic GUID generation. UUID v5 with the project namespace, ensuring repeatable builds. Import the solution twice and it updates existing flows rather than creating duplicates. Try doing that manually across 14 flows without a copy-paste error.
XML manifest updates. customizations.xml workflow entries. solution.xml version bumps and RootComponents. The kind of tedious, error-prone editing that humans get wrong through fatigue on the third or fourth flow. The AI got it right on all 14.
ZIP packaging. Node.js archiver with forward-slash path separators. PowerShell’s Compress-Archive creates backslashes, which causes silent import failures in Dataverse. A bug that takes hours to diagnose the first time you hit it. The AI knew to avoid it. The full packaging pipeline is documented in Building Dataverse Solution ZIPs Programmatically.
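The forward-slash requirement is easy to demonstrate outside Node.js as well. This Python sketch (using `zipfile`, which writes `/` separators as the ZIP spec requires) shows the guard the pipeline needs; the entry names are illustrative, not the real solution contents:

```python
import io
import zipfile

def package_solution(files):
    """Package a mapping of entry name -> bytes into a ZIP.
    Entry names must use '/' separators; backslash names (as produced
    by PowerShell's Compress-Archive) cause silent Dataverse import
    failures, so they are rejected up front."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            if "\\" in name:
                raise ValueError(f"backslash path separator: {name!r}")
            zf.writestr(name, data)
    return buf.getvalue()

blob = package_solution({
    "solution.xml": b"<ImportExportXml/>",
    "Workflows/NTF-EMAIL-01.json": b"{}",
})
names = zipfile.ZipFile(io.BytesIO(blob)).namelist()
```

Turning the silent failure into a loud `ValueError` at build time is the cheap insurance here: the bug that "takes hours to diagnose the first time" becomes a one-line error message instead.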
This is the work that AI transforms. Not the architectural decisions. The execution of those decisions at scale, with perfect consistency, across a volume of repetitive work that would take a human team hours and introduce errors through sheer fatigue.
Parallel Agent Execution
The tag system did more than organize flows for humans. It enabled batching for AI agents.
Four agents ran in parallel, each responsible for 3-4 flows grouped by functional area:
- Agent 1: Author-facing scheduled flows (Form Assigned, Author Reminder, Author Past Due)
- Agent 2: Signer-facing scheduled flows (Ready for Signature, Ready for Acknowledgment, Signer Heads-Up)
- Agent 3: Event-driven and completion flows (Rejection to Author, Rejection to Previous Signers, Evaluation Complete)
- Agent 4: Supervisor escalation and cycle broadcast flows
Each agent read the same spec. Each produced flows following the same patterns. The output was consistent not because the agents coordinated with each other, but because they all read the same document.
This is the scalability argument for spec-first development. Without a spec, each agent would make different assumptions about FetchXML structure, variable naming, email formatting, and error handling. With a spec, four independent agents produce output that looks like one person wrote it.
The Honest Assessment
Here are the real numbers.
| Task | With AI | Without AI (Manual) |
|---|---|---|
| Requirements reconciliation | ~15 min | ~45 min |
| Architecture decisions (3 correction rounds) | ~30 min | ~30 min |
| Spec documentation | ~20 min | ~2 hr |
| Flow JSON generation (14 flows) | ~45 min | ~8-10 hr |
| Format conversion (14 files) | ~10 min | ~2-3 hr |
| GUID generation + XML manifests | ~5 min | ~1 hr |
| ZIP packaging | ~5 min | ~30 min |
| Total | ~2.5 hr | ~15-17 hr |
The 6-7x multiplier is real. But it comes with conditions.
Correction time is front-loaded. The first 30 minutes were spent fixing wrong architectural assumptions. If I had accepted the AI defaults, the system would have shipped with email floods and fragile coupling between notification and business logic. An inexperienced developer who takes the AI output at face value ships a broken system.
The spec is the bridge. AI cannot learn your domain in one chat session. But it can read a spec. The documentation-first approach is what enabled parallel agent execution. Without a spec, each agent makes different assumptions. With a spec, four agents produce consistent output. The spec writing time is not overhead. It is the interface between human judgment and AI execution.
Consistency is the actual value. Writing one flow takes roughly the same time with or without AI. Writing 14 flows with identical patterns, consistent naming, matching FetchXML structures, and uniform error handling is where AI provides compounding returns.
The developer must know what “right” looks like. AI proposed real-time triggers because that is the most common Power Automate pattern. I knew it was wrong because of domain-specific constraints. The multiplier only works when the human has enough experience to catch the wrong defaults.
Neither could have done the other’s job efficiently. I could not have written 14 consistent flow JSONs, converted formats, generated deterministic GUIDs, and packaged a solution ZIP in 45 minutes. The AI could not have known that real-time triggers would flood inboxes during bulk creation. Both roles are essential. That is the honest assessment.
Once AI generates the flows, you still need quality gates before anything reaches production. The AI-powered flow review patterns we use catch structural issues that even well-corrected AI output can introduce.
Spec-Driven Power Platform Series
This article is part of a series on building Power Automate solutions with specs, governance, and AI:
- Tag-Based Flow Architecture - How 3-letter prefixes make 24 flows manageable
- Spec-First Development - Why specs should exist before the designer opens
- Notification Architecture - Notifications that cannot break business logic
- FetchXML in Power Automate - When OData $filter is not enough
- Building Solution ZIPs - The undocumented packaging guide
- What AI Gets Wrong (this article) - And why human correction is the point
- 14 Flows in 10 Minutes - The full story
AZ365.ai - Azure and AI insights for architects building on Microsoft. Follow Alex on LinkedIn for architecture deep dives.