Power BI semantic model design — Solving Microsoft 365

A Power BI semantic model (formerly called a dataset) is the reusable data layer that reports and dashboards build on. Designed well, a semantic model serves many reports with consistent semantics and fast queries. Designed badly, every report is slow and inconsistent, and analysts hate it.

Star schema as the foundation

The single most important design principle: star schema. Centre fact tables (transactional / event data) surrounded by dimension tables (descriptive lookup data):

Fact table: Sales (one row per sale, with foreign keys to dimensions plus measures like quantity and revenue).
Dimension tables: Date, Product, Customer, Region, SalesRep.

Avoid flat denormalised tables with everything joined together. Avoid snowflake schemas with dimensions normalised into multiple tables. Star schema beats both for Power BI's storage and query engine.

Modelling principles

One fact table per business process — Sales fact, Inventory fact, Orders fact. Don't try to combine.
Many fact tables can share dimensions — Date dim shared across Sales, Inventory, Orders.
Dimensions before facts — design the dimensions first; facts hang off them.
Surrogate keys — integer keys in dimensions; foreign keys in facts. Better performance than natural keys.
Slowly changing dimensions (SCDs) — when dimension values change (customer moves), decide whether to overwrite (Type 1) or version (Type 2).

DAX measures vs calculated columns

Two ways to add calculations:

Calculated columns — computed once per row, stored in the model. Use for row-level attributes that don't change with filter context.
Measures — computed at query time based on current filter context. Use for aggregations and most analysis.

Prefer measures over calculated columns in almost all cases. Measures are recalculated efficiently; calculated columns bloat the model and don't respond to user filters.

Naming conventions

Tables — Customer, Product, Sales (singular, not Customers).
Columns — Customer Name, Product Category, Sales Amount (proper case with spaces, user-facing).
Measures — Total Sales, YTD Sales, Sales Growth % (descriptive, in their respective measure tables).
Hidden columns — keys, internal columns hidden from users.

Good naming is the difference between a usable model and a confusing one.

Measure tables

A common pattern: create a measures table (an empty placeholder table) and put all measures there. The model browser groups them visibly, separated from columns. For complex models with hundreds of measures, multiple measure tables organised by domain make navigation easier.

Storage modes

Three modes per table:

Import — data loaded into Power BI's in-memory engine. Fast queries, requires refresh.
DirectQuery — queries pushed to the source server live. Always-current but slower per query.
Direct Lake (Fabric only) — reads OneLake Parquet files directly into memory. Import-like performance with live freshness.

Choose per table. Composite models mix modes — fact table in Direct Lake (large, current), dimensions in Import (small, fast).

Performance tuning

For models that feel slow:

Reduce model size — remove unused columns; columns are the main memory consumer.
Disable auto date/time — adds hidden date tables that bloat the model.
Use proper datatypes — integer keys, not strings; date types, not datetime if time isn't needed.
Aggregation tables — for large fact tables, pre-aggregated tables answer common queries faster.
Incremental refresh — large fact tables refresh only the recent partition, not the whole table.

Documentation

For semantic models consumed by many people:

Model description in the workspace.
Column descriptions for each column the user sees.
Measure descriptions explaining what each measure means and how it's calculated.
Synonyms for Q&A natural-language queries.

Good documentation lets users explore the model independently rather than asking the modeller every time.

Reuse via dataflows

For shared transformations across many models, use dataflows:

Define source queries and transformations once in a dataflow.
Many semantic models consume the dataflow as their source.
Refresh the dataflow once; all downstream models benefit.

For mature deployments, dataflows feed certified semantic models which feed many reports. Layered architecture; reuse at every layer.

Certification

For enterprise-wide models, promote them to Promoted or Certified status in the workspace. Certified models appear in the user's quick-access list and signal "this is the right model to use" — reducing model sprawl.

Operational considerations

Version control — Power BI Desktop files (.pbix) are binary; treat as source artefacts. Microsoft Fabric Git integration changes this for the better.
Testing — validate measure values against source data before publishing.
Capacity planning — large models in Premium / Fabric capacities consume memory; track utilisation.
Refresh failures — set up alerts for refresh failures.

For organisations doing serious Power BI at scale, semantic model design is the foundational craft. Time invested here pays back across every downstream report.