AI Builder document processing

How AI Builder extracts structured data from PDFs and images at scale, and where it fits with newer Copilot patterns.

AI Builder document processing is the part of the Power Platform that extracts structured data from semi-structured documents — invoices, receipts, applications, business cards, ID documents. Inside a Power Automate flow or Power App, you upload a PDF or image and AI Builder returns the extracted fields as structured data.

The model types

AI Builder ships several document-processing model types:

  • Prebuilt invoice processing — recognises standard invoice fields (vendor, invoice number, date, line items, total).
  • Prebuilt receipt processing — for retail receipts.
  • Prebuilt business-card reader — extracts contact details.
  • Prebuilt ID reader — extracts data from common ID documents.
  • Prebuilt contract recognition — identifies parties, terms, dates from contracts.
  • Custom document processing — train your own model on your specific document type (your custom invoice format, your specific application form, your industry-specific document).

Custom models need only a small sample set: AI Builder asks for at least five example documents per layout. The model learns where each labelled field sits, then applies that to new documents at scale.

Typical flow patterns

The most common automation pattern:

  1. Trigger — an email arrives in a shared mailbox with a PDF attachment, or a file is uploaded to a SharePoint folder.
  2. AI Builder action — invoke the document processing model, extract fields.
  3. Decision — does the data look right? High confidence → continue. Low confidence → route to a human reviewer.
  4. Write to system — write the extracted data into Dataverse, a SharePoint list, a SQL database, or an ERP via a custom connector.
  5. File the document — archive the original PDF in a structured location.
  6. Notify — let the requester know it's been processed.

This pattern replaces a lot of "human reads PDF, types into system" work that consumes back-office time in many organisations.
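The steps after the trigger can be sketched in a few lines of Python. This is only an illustration of the control flow, not how Power Automate or AI Builder is implemented: process_document, fake_extract, the Field and Result types, and the 0.8 threshold are all hypothetical names and values, and the extraction call is a stub that returns fixed data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Field:
    value: str
    confidence: float  # 0.0 to 1.0


@dataclass
class Result:
    document: str
    status: str  # "processed" or "needs_review"
    log: List[str] = field(default_factory=list)


def process_document(
    document: str,
    extract: Callable[[str], Dict[str, Field]],
    min_confidence: float = 0.8,  # hypothetical threshold, not a product default
) -> Result:
    """Run steps 2 to 6 for one document that a trigger has delivered."""
    result = Result(document=document, status="processed")

    # Step 2: call the extraction model (stubbed by the caller here).
    fields = extract(document)
    result.log.append(f"extracted {len(fields)} fields")

    # Step 3: judge the document by its weakest field.
    # An empty extraction counts as confidence 0.0 and goes to review.
    weakest = min((f.confidence for f in fields.values()), default=0.0)
    if weakest < min_confidence:
        result.status = "needs_review"
        result.log.append("routed to human reviewer")
        return result

    # Steps 4 to 6: stand-ins for the system write, archive and notification.
    result.log.append("wrote record")
    result.log.append("archived original")
    result.log.append("notified requester")
    return result


def fake_extract(document: str) -> Dict[str, Field]:
    """Hypothetical stand-in for the AI Builder action; returns fixed values."""
    return {
        "vendor": Field("Contoso", 0.97),
        "total": Field("1250.00", 0.91),
    }


outcome = process_document("invoice-001.pdf", fake_extract)
print(outcome.status, outcome.log)
```

In a real flow each log line would be a connector action. The useful point of the sketch is that the review branch returns before anything is written to the target system.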

Confidence scores and human-in-the-loop

A critical feature: every extracted field carries a confidence score. Production flows should branch on it: high-confidence results are processed automatically, medium-confidence results are flagged for review, and low-confidence results always go to a human reviewer.
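The three-way branch can be expressed as a small routing function. This is a minimal sketch: the 0.90 and 0.60 thresholds are assumptions chosen for illustration, the function names are hypothetical, and scores are assumed to be in the range 0 to 1. The document is judged by its weakest field, which is one reasonable policy among several.

```python
from typing import Dict, List

# Hypothetical thresholds; tune them against your own reviewed documents.
HIGH = 0.90
LOW = 0.60


def route(confidences: Dict[str, float]) -> str:
    """Return 'auto', 'flag' or 'review' for one document's field scores."""
    if not confidences:
        return "review"  # nothing was extracted, so a person must look
    weakest = min(confidences.values())
    if weakest >= HIGH:
        return "auto"
    if weakest >= LOW:
        return "flag"
    return "review"


def fields_to_check(confidences: Dict[str, float], threshold: float = HIGH) -> List[str]:
    """List the fields a reviewer should look at first, in name order."""
    return sorted(name for name, score in confidences.items() if score < threshold)


scores = {"vendor": 0.72, "invoice_number": 0.98, "total": 0.95}
print(route(scores))            # flag
print(fields_to_check(scores))  # ['vendor']
```

Passing the list from fields_to_check to the reviewer keeps the review step short, since only the doubtful fields need attention.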

The Approvals action in Power Automate creates a natural human-in-the-loop step: present the document and extracted fields to a reviewer; they correct any errors and approve; the flow continues. Over time, the corrected data can train the next iteration of the model.

Where AI Builder fits today

In some scenarios, AI Builder document processing has been overshadowed by Microsoft 365 Copilot, Copilot Studio, and Microsoft Syntex:

  • For conversational extraction ("read this document and tell me the total"), Copilot is often easier.
  • For SharePoint-resident documents that need classification and field extraction at scale, Syntex is now usually the more natural choice.
  • For flow-driven, programmatic extraction in Power Automate, AI Builder remains the standard tool.

Licensing

AI Builder is billed in AI Builder credits, a consumption unit drawn down with each model invocation. Credits can be purchased as monthly add-ons or via pay-as-you-go. Each model type has its own credit cost per invocation; invoice processing, for example, costs more than business card reading.
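A rough capacity estimate shows how per-invocation costs add up across model types. Every number below is a placeholder: the credit rates and the pack size are assumptions made up for the arithmetic, not Microsoft's published figures, and the function names are hypothetical. Substitute the current rates from the licensing documentation before relying on the result.

```python
import math
from typing import Dict

# Hypothetical credit costs per invocation, for illustration only.
CREDITS_PER_INVOCATION: Dict[str, int] = {
    "invoice": 30,
    "business_card": 5,
}

# Assumed size of one monthly add-on pack, in credits.
PACK_SIZE = 1_000_000


def monthly_credits(volumes: Dict[str, int]) -> int:
    """Sum credits across model types for the given monthly call counts."""
    return sum(CREDITS_PER_INVOCATION[model] * count for model, count in volumes.items())


def packs_needed(credits: int, pack_size: int = PACK_SIZE) -> int:
    """Round up to whole packs, since a partial pack cannot be bought."""
    return math.ceil(credits / pack_size)


volumes = {"invoice": 40_000, "business_card": 2_000}
total = monthly_credits(volumes)
print(total, packs_needed(total))  # 1210000 2
```

With these placeholder rates, 40,000 invoices and 2,000 business cards come to 1,210,000 credits, which rounds up to two packs.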

For organisations with high-volume document workflows — accounts payable, claims processing, customer onboarding — AI Builder remains a practical first step, often paired with Power Automate desktop flows for the parts that touch legacy UIs.