Serverless Parallelism in Microsoft Flow and SharePoint
This is a short post about running things in parallel. There are two angles to this:
"Sometimes you fan out to 50 parallel branches and want them to run as quickly as possible; other times you want everything in single file, with no one skipping the queue."
Plan
- Parallelism Settings in For Each (AKA: when to fan-out to 50)
- Parallelism Settings in SharePoint Trigger (AKA: and when to queue up in single file)
For Each
In 2017, Microsoft demoed parallelism settings in Logic Apps - support for this landed in Flow's UI recently.
With this setting, the iterations of the loop run at the same time, in batches of up to 50. A great example is copying a lot of files from one SharePoint library to another.
This flow, by default, runs one element at a time. It takes 45 seconds for 23 files.
Running For Each in parallel, the copying takes 6 seconds.
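To make the difference concrete outside of Flow, here is a minimal Python sketch of the same idea. It is not how Flow is implemented: `copy_file` is a hypothetical stand-in that just sleeps, and `degree` plays the role of the For Each degree of parallelism (capped at 50 in the post).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def copy_file(name: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for a 'Copy file' action; sleeps to mimic a network call."""
    time.sleep(delay)
    return f"copied {name}"


def run_sequential(files, action=copy_file):
    """Default For Each behaviour: one element at a time."""
    return [action(f) for f in files]


def run_parallel(files, action=copy_file, degree=50):
    """For Each with parallelism on: up to `degree` elements in flight at once."""
    with ThreadPoolExecutor(max_workers=degree) as pool:
        # pool.map returns results in input order, even though the work overlaps
        return list(pool.map(action, files))


if __name__ == "__main__":
    files = [f"file{i}.docx" for i in range(23)]
    for runner in (run_sequential, run_parallel):
        start = time.perf_counter()
        runner(files)
        print(runner.__name__, round(time.perf_counter() - start, 2), "seconds")
```

The parallel version finishes in roughly the time of the slowest single copy, which is the same effect as the drop from 45 seconds to 6 seconds above.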
Advanced use cases of Parallel For Each:
- In SharePoint Site Provisioning - split a large PnP Template into many small ones, and run them in parallel. Since PnP Provisioning is additive, most of the actions can finish on their own.
- The HTTP action that calls an Azure Function has an automatic retry policy by default. So if the Azure Function fails, the call is retried (the default is 4 times with a 20-second delay).
See also:
http://www.vrdmn.com/2018/01/site-designs-flow-azure-functions-and.html
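The retry behaviour mentioned above for the HTTP action can be modelled roughly like this. It is a sketch only: the count of 4 and the 20-second delay are taken from this post, the fixed interval is an assumption, and `call_with_retry` is a made-up helper rather than anything Flow exposes.

```python
import time


def call_with_retry(action, retries=4, delay_seconds=20, sleep=time.sleep):
    """Call action(); if it raises, wait and try again, up to `retries` extra attempts.

    `sleep` is injectable so the behaviour can be checked without really waiting.
    """
    attempt = 0
    while True:
        try:
            return action()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last failure
            attempt += 1
            sleep(delay_seconds)
```

In a parallel For Each, each branch gets its own retries, so one slow or failing call does not hold up the others.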
Sometimes, instead of running many items at once in parallel (fan-out), we want to make sure only one item runs at a time. This brings us to the second part of this post.
Parallelism Setting in SharePoint Trigger
This one is trickier, and it's not always clear _why_ you need to do this. But sometimes, you need to stop parallelism, and handle things one at a time.
Setting parallelism to 1
There's a problem: Split On and Concurrency Control are mutually exclusive.
So we need to turn off Split On. This means the trigger will now return an array of SharePoint list items rather than a single item (because the trigger works like a delta query).
To test this, I quickly entered about 10 list items in a SharePoint list, setting off Flow runs.
The result of the concurrency/parallelism control here is that only one Flow run executes at a time. Each run receives a batch of items that we'll need to handle individually, but the runs do not overlap.
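As a rough model of what this looks like, the sketch below uses a lock to stand in for the trigger's concurrency limit of 1. Each simulated run receives a batch (an array of items) and loops over it. `run_flows` and the batch contents are illustrative assumptions, not Flow APIs.

```python
import threading
import time


def run_flows(batches):
    """Start one simulated Flow run per batch; a lock lets only one run proceed at a time."""
    lock = threading.Lock()  # models the trigger's concurrency control set to 1
    processed = []

    def flow_run(run_id, batch):
        with lock:  # other runs wait here until the current one finishes
            for item in batch:  # Split On is off, so we loop over the array ourselves
                time.sleep(0.001)  # pretend to do some work on the item
                processed.append((run_id, item))

    threads = [
        threading.Thread(target=flow_run, args=(run_id, batch))
        for run_id, batch in enumerate(batches)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed


if __name__ == "__main__":
    for run_id, item in run_flows([[1, 2, 3], [4, 5], [6, 7, 8, 9]]):
        print(f"run {run_id} handled item {item}")
```

The order in which the runs start is not guaranteed, but the items of one run are never interleaved with another's.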
Advanced use case of the parallelism setting on the SharePoint Trigger:
If you are generating sequential, incrementing numbers on your SharePoint list, this is very useful for preventing two Flow runs from executing at the same time.
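Here is why overlapping runs are a problem for sequential numbers, as a small deterministic Python sketch. It assumes the next number is computed as "highest existing number plus one"; the function names are hypothetical.

```python
def next_number(existing):
    """Read the highest number currently in the list and add one."""
    return max(existing, default=0) + 1


def overlapping_runs(existing):
    """Two runs at once: both read the list before either writes its number back."""
    first = next_number(existing)
    second = next_number(existing)  # sees the same list, so gets the same number
    return existing + [first, second]


def serialized_runs(existing):
    """Concurrency set to 1: the second run only starts after the first has written."""
    after_first = existing + [next_number(existing)]
    return after_first + [next_number(after_first)]


if __name__ == "__main__":
    print(overlapping_runs([1, 2, 3]))  # [1, 2, 3, 4, 4] - duplicate number
    print(serialized_runs([1, 2, 3]))   # [1, 2, 3, 4, 5]
```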
Summary
- Parallel Settings on For Each
- Parallel Settings on SharePoint Trigger
- And a small note about Split On and handling an array of items on a Created/Updated event