On using Microsoft Flow as a pre-ETL step for Power BI

Photo by WeRoad on Unsplash

Photo by WeRoad on Unsplash

A topic I’ve presented a few times to an Power BI crew is the concept that Microsoft Flow makes a great pre-ETL step for Power BI.

I talked to many attendees at Difinity Power BI Conference Auckland about this. It’s probably a time to write this up to summarize my thoughts.

There’s just something nice about having a product called Flow gathering data into your lakes.

Power BI

Power BI - Power Query and DAX is extremely good at crunching numbers.

Flow has variables, branching and loop logic, but they are slow for number crunching. Loop limitations like 5k rows or total number of action steps are also very limiting.

Don’t use Flow to crunch numbers.

Flow

Power BI connects to about 70 sources - Flow connects to about 250 connections.

Flow’s HTTP action and Custom Connector framework is better at gathering data. Automatic paging, more authentication options, retry policy, async HTTP request.

Flow can gather data on Trigger, copy them all to a simple location (e.g. SharePoint library or Azure Blog Storage) for Power BI to start refresh. So data processing can be done in a secure way.

Flow can call Power BI API to start a refresh. Turning the ETL into a trigger-push system. We get the best of both worlds - Flow to gather data, and Power BI to crunch them.

Power BI Scheduled Refresh with Flow

  1. Schedule Refresh requires Flow to call Power BI API with a HTTP Request. But until very recently – this requires a delegate permission (so Flow has to call Power BI as a user) to perform delegate permission call requires a Custom Connector – which is what this blog post from Konstantinos goes into detail about creating.

    https://medium.com/@Konstantinos_Ioannou/refresh-powerbi-dataset-with-microsoft-flow-73836c727c33

  2. In February, Power BI has started releasing application permission to call Power BI API – this work is still in preview and requires quite a lot of set up, it also requires the administrator to approve the permission. App-only read/write everything is a pretty high level permission, non-admin can't grant this.

    https://powerbi.microsoft.com/en-us/blog/use-power-bi-api-with-service-principal-preview/

    pros: easier to call with Service Identity, and HTTP request
    cons: may be impossible to get your admin to approve a new AAD APP permission

  3. The April roadmap says Refresh is an action coming to Flow’s Power BI Connector – this will use delegate permissions and would be much simpler to use. So my thinking is that wait for this to drop after April

    https://docs.microsoft.com/en-us/business-applications-release-notes/April19/microsoft-flow/improved-power-bi-connector