Updating AzCopy in Azure Pipeline

You know how the saying goes - if it ain’t broke don’t fix it. Well, something broke in my Azure Pipeline for Flow Studio App a few days ago, and it took a bit of time to figure it out, so it makes sense to write it down. I’m pretty sure I’ll forget again.

The error is related to AzFileCopy

  • AADSTS7000222: The provided client secret keys for app '***' are expired. Visit the Azure portal to create new keys for your app: https://aka.ms/NewClientSecret, or consider using certificate credentials for added security: https://aka.ms/certCreds.

  • There was an error with the service principal used for the deployment.

I set up my pipelines years ago and don’t remember what was in them. But there were a few issues:

  • I want to switch to Workload Identity Federation thing in Azure Pipelines, it looks like that means I won’t have to keep remembering my keys

  • I was using AzureFileCopy@3 which is not the latest version, latest version is v6. It also looks like v3 didn’t support the new credentials.

Steps

  • Click the convert button in Azure Pipelines

  • Fix AzCopy arguments

  • Fix a permission issue

Click the convert button

It created this identity. Hmm no secrets.

Fixing AzCopy arguments

I was using these AzCopy arguments: /S /Y /SetContentType

/S is --recursive=true 
/Y is --overwrite=true 
/SetContentType is apparently a default behaviour now so I didn’t have to set that
--as-subdir=false

This is a new one I needed, because otherwise it was creating the “Drop” folder in Azure Pipelines as the rootfolder in Azure Blob.

Fixing Permissions

For a few hours I was struggling with AzCopy not working with the new credentials, and I don’t understand. I think because previously the credentials impersonated a person. Whereas now I need to grant a certain role to this new identity.

Go to Subscription (or Resource Group, or Storage)’s IAM settings.
Add Role Assignment
Find Storage Blob Data Contributor
Add the service accounts created by Azure Pipelines.

It should look something like this at the end.

and success



Running Serverless Apollo GraphQL on AzureFunctions with cheap Azure Blob Table database(s)

leonardo-ramos-CJ4mbwSK3EY-unsplash.jpg

Merry Christmas holidays and happy new years.

This is a bit of a holiday reading, I wanted to start by give you the sense of what I’ve been tinkering as an experiment, and it looks increasingly like this is going into production before 2019 ends.



Plan

  • Have

    • I have a bunch of Azure storage Tables (about 70k rows of cheap database - at $1 / month storage) - they are spread out across several containers but for the basic purposes - I think of them as independent tables that lack proper relationships defined between them.
      I think that’s appropriate to describe most no-sql databases.

    • Azure Functions based CRUD REST APIs wraps around the Azure Blob Table - problem - no good caching or relationship mechanism. I wasn’t a big fan of keep rolling out CRUD REST endpoints, feeling that I should try find a different way.

  • Idea

    • Run a GraphQL server on Azure Functions

  • Learn:

    • If you want to run GraphQL on Azure Functions - there’s a dotnet way and there’s a NodeJS way

    • The dotnet version

      • https://www.tpeczek.com/2019/05/serverless-graphql-with-azure-functions.html

      • https://medium.com/@seekdavidlee/eklee-azure-functions-graphql-a-serverless-graphql-implementation-on-azure-484611c3680b

      • I’ve personally made a stack decision to stick to NodeJS/TypeScript/JavaScript, so I won’t be exploring the dotnet server.

    • The NodeJS version

      • A quick look around will send you to Apollo Server which has done 3 videos with Microsoft Azure/Channel9 on getting started, running on AppService, and running on Azure Functions (on consumption).

      • Part 1 https://www.youtube.com/watch?v=7R33hGFV4f0

      • Part 2 https://www.youtube.com/watch?v=Mt4bBpOdwyE

      • Part 3 https://www.youtube.com/watch?v=unUeFApHeT0

  • Write

    • Steps after Apollo + AzureFunctions 3-part videos.

    • Resolving GraphQL entities to Azure Blob Table

    • Resolving Relationships

    • Securing with Azure AD

    • Deploying to AzureFunctions

    • Reroute via Azure Function proxy

  • Future

    • Switch to Apollo RESTDataSource

    • Apollo Client with Angular and Observable bindings

    • Apollo Server with Redis cache



Learn

After the three videos, particularly the first and the third - to understand GraphQL server, and hosting it on AzureFunctions, you’ll end up with a working graphql server running on localhost showing you books array from memory. Getting here after watching three videos was extremely fast ~30minutes. Very surprising how much shortcut we took to get here this fast.

const typeDefs = gql`
    type Book {
        title: String
        author: String
    }
    type Query {
        books: [Book]
    }
`;

const resolvers = {
    Query: {
        books: () => books,
    },
};

let books = [
    {
        title: "A",
        author: "Mark B"
    },
    {
        title: "C",
        author: "John D"
    }
];

const playgroundSettings: any = {
    "schema.polling.enable": false
};
const server = new ApolloServer({ 
    typeDefs, 
    resolvers, 
    playground: {
        settings: playgroundSettings
    },
    debug: false 
});
export default server.createHandler();


A really fancy thing Apollo did was to put the graphql endpoint on POST, and run the playground test environment on GET. Pointing a browser to the same endpoint will show the playground.

Here is the playground running the sample books query. Debug works wonderfully running in localhost.

By default, the playground will do continuous polling on the endpoint. That’s going to keep the Function awake and incur a lot of costs. It might also keep bumping into any local debug breakpoints. So let’s turn that off. I also set debug to false.


Write - Resolving Azure Table

Next, we need to extend from here. My original AzureFunctions has REST endpoints that calls Azure storage Tables via the Azure.Storage SDK. So bring those in.

import { ApolloServer, gql } from "apollo-server-azure-functions"; 
import * as azure from "azure-storage";

And start to switch out the code

const typeDefs = gql`
    type Flow {
        RowKey: String
        name: String
        displayName: String
        critical: Boolean
        environmentName: String
        state: String
    }
    type Query {
        flows: [Flow]
    }
`;

const resolvers = {
    Query: {
        //books: () => books,
        flows: (parent,args,context) => azureTableQuery('flows'),
    }
};

const tableService = azure.createTableService();
const partition = "xxxx-partition-key";

const azureTableQuery = (table) => {
    let pRows = new Promise((resolve, reject)=>{

        let query = new azure.TableQuery().where('PartitionKey eq ?', partition);
        tableService.queryEntities(table, query, null, (error, result, response)=>{
            if(error){
                reject(error);
            }
            else {
                resolve(response.body['value']);
            }
        });
    });
    return pRows;
}

Notes: Azure.Storage does old style callbacks, so I’m wrapping them in a Promise that I can resolve. When I resolve - I want the raw JSON from the service, not the ‘tweaked’ JSON in result (it does weird things to your JSON entities, that’s a blog for another time).

A really nice thing about Apollo Server resolve function is that it knows how to handle promises naturally. So this tweak basically is me saying - hey, go fetch from Azure Storage Table from the ‘flows’ container.

To run this locally, we’ll need to store an Azure storage key in local.settings.json

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "",
    "FUNCTIONS_WORKER_RUNTIME": "node",
    "MyStore": "DefaultEndpointsProtocol=https;AccountName=mystore;AccountKey=FOOxxx"
  }
}

We can reference this like so:

const tableService = azure.createTableService(process.env["MyStore"]);

Write - Resolving relationships

const azureTableRetrieve = (table, key) => {
    return new Promise((resolve, reject)=>{
        tableService.retrieveEntity(table, partition, key, null,
        (error, result, response)=>{
            if(error){
                reject(error);
            }
            else {
                let entity: any = response.body;
                resolve(entity);
            }
        });
    });
}

const typeDefs = gql`
    type Environment {
        RowKey: String
        name: String
        displayName: String
        isDefault: Boolean
    }
    type Flow {
        RowKey: String
        id: String
        name: String
        displayName: String
        critical: Boolean

        environmentName: String
        ref_environment: Environment
        state: String
    }
}`;

const resolvers = {
    Query: {
        environment: (parent,{name},context) => azureTableRetrieve('environments', name),
        flows: (parent,args,context) => azureTableQuery('flows'),
        flow: (parent,{name},context) => azureBlobTableRetrieve('flows', name)
    },
    Flow: {
        ref_environment: (parent,args,context) =>
          azureTableRetrieve('environments', parent.environmentName),
    }
};

Notes:

I call “ref_” because I’m terrible at making property names. Lots of fields like “environmentName” or “environment” are already taken. It got too hard picking new names, I decided to call all entity references ref_

Deploy to Azure Functions

Need to deploy two parts:

Friends don’t let friends right click to deploy unless another friend put it in the right click menu. In that case, please don’t make me choose my friends.

Also, right click and upload local settings file to send our MyStore setting into Azure functions.

That’s two right clicks and deployment done!

Connect Azure Function proxy

My existing API routes are already covered with Azure Functions proxy. So I added a new route that takes graphql over to this separate Azure Functions. I also bury the functionkey in the route so users can’t see it.

Future

There’s a bunch more ideas for the future.

  • Explore federated authentication and user level security.

  • Figure out why Azure Storage SDK next generation doesn’t do Tables

  • Switch from using SDK to Azure Storage REST and use Apollo’s excellent RESTDataSource implementation instead (it already implemented HTTP caching, pagination, top, etc)

  • implement mutations, especially e-tag handling.

  • implement server side Redis cache

  • Implement Apollo Client to fetch data for my existing Angular front-end

  • Implement client side cache and observable pattern

Let me know if you find this interesting and want to follow along. I wrote this mostly as documentation for what I’m doing and the incremental changes I’m making as I go.

From Office 365 to Azure Event Grid, the events must Flow

Photo by Archana More on Unsplash

In this blog post, we capture all the events across an Office 365 Tenant from multiple event sources, gather them, and send them through an Azure Event Grid.

We then listen, filter and handle our events in a central, unified way.

The events must Flow.


This is also the full write up of this microblog posted to Twitter #FlowNinja earlier this month.


Plan

  1. What benefit do we get from this?

  2. Listen to every event across an Office 365 Tenant

  3. Construct a uniform event message

  4. Send them into a Serverless Event Solution - Azure Event Grid

  5. Filter and catch our events

We want to build 3 Flows

flow-event-grid-1.jpg

What benefit do we get from this?

First, we see the increasing availability of event hooks - we have subscriptions, delta queries, or webhooks, across various different products in Office 365. Some products like SharePoint is getting a SocketIO webhook. There will be more events, and our event handling design must evolve.

Second, we see the cost-effective solutions to handle these ever increasing flood of events in the form of Serverless compute. This is true with Azure Functions, Microsoft Flow or Azure Logic Apps.

We end up with a lot of individual event sources and a lot of individual event receivers. This is a common event handling problem. As number of events we handle increases, the worse the event management problem becomes.

Consider you have a “handle a document uploaded to a library” event - a very typical SharePoint Workflow. Now consider this library is cloned to a hundred project sites.

If we clone the event handler a hundred times, we have a problem.


If you have already done this with Flow, then try https://FlowStudio.app to help you manage them.
Ooh inline product placement!


If you are a developer, then consider this scenario.
Consider a typical event handling in the browser. A decade ago we used jQuery like this:

// 2008
$('button').click(handler);

// 2018
$(global).on('click', 'button', handler);

And gradually we find that unacceptable, because we have buttons, events everywhere, and managing individual event hooks was tedious and error prone. Eventually, we moved to a global handler model, and we filter just prior to event being raised.

The headache-less way to handle events is to set the hooks all at the global root level, and then filter by the event source and event type.

That is the exact reason we need Azure Event Grid.

  • Centrally manage our events

  • Decouple the event source from event handlers

  • Stay sane, with a hard problem

Listen to every event across an Office 365 Tenant

I had previously wrote about listening to Office 365 Management API via the HTTP action and app-only permissions. What I did not realize was that the Office 365 Management API also have fantastic webhooks.

I read about the webhooks from Kent Weare’s post, where he uses this event to get a trigger when Flows are created in the tenant.

https://flow.microsoft.com/en-us/blog/automate-flow-governance/

These are grouped into several categories: AzureAD, Exchange, SharePoint and General (other).

This is the subscriber. One picture of 4 blocks. I’m subscribing to three webhooks at once.



The Office 365 Management API is a fantastic general event source. The downside is that it’s not instant - event handler is called between 10-20minutes after the actual event. So it is great as an audit webhook, or for scheduling files or search or to signal for a bot to re-scan a document. But it’s not an instant webhook.

Of course, if our goal is to send events into an Event Grid - we can work with multiple event sources at the same time. We can add Microsoft Graph events or subscribe to SharePoint list webhooks directly.

This is the top of the Listener

Construct a uniform event message

The event grid has a event JSON structure, it also supports a CloudEvent structure.

When I built my implementation, the Event Grid connector is still in preview and I had troubles publishing a Cloud Event structure. I assume this wouldn’t be a problem anymore as the connector evolves.

This is the final loop design - all done.

Remember, we are running Serverless so abuse/utilize every opportunity to use as many Azure Servers as you can - if you can fan-out to parallelism you must.

Don’t talk to each individual HTTP action one at a time. Do (up to 50) all at the same time.


Send them into a Serverless Event solution - Azure Event Grid

The Azure Event Grid is a serverless event processing pipeline. It decouples our event source(s) from our event handlers.

Here is our first handler.

This catches every event on the Event Grid - in Event Grid, we see we have our first webhook attached - it appears as a LogicApps webhook.

Here are three examples of what it caught:

  • Flow created Event

  • Site Collection created Event

  • File uploaded Event

flow-event-grid-3.jpg

Filter and catch our events

We see the very specific webhook now registered on the event grid - and the filters are also listed

PPTX filter only runs when the file I’ve uploaded is a PowerPoint file.


Summary

I have been talking a lot about Serverless and how our tools and design must evolve. Having a unique Office 365 to Event Grid solution is something I talked about as far back as 2017. I’m glad a year later I’ve finally got a great prototype going.

  • Office 365 Management API is a great webhook source that catches all sorts of events. The downside is that it is an audit webhook, so the delay may not be acceptable to your needs.

  • Using Azure Event Grid to perform filtering and subscription gives us the unique ability to see EVERYTHING that’s going on in our tenant. That has tremendous value.

  • Because event source and event handling is now decoupled - we can add new event sources to push to the same Azure Event Grid. We can do this from Microsoft Graph, we can do this from SharePoint, or we can do this from a whole myriad of triggers available in Microsoft Flow

  • We can write Azure Functions to trigger off the Event Grid, and it would be visible as well.

  • I was reading and appreciating sending events to the new Azure SignalR service from Azure Functions - that would be pretty amazing to convert an event grid message into a websocket event.
    https://twitter.com/nthonyChu/status/1044427579460145152

The possibilities are endless. Our tools and our design must evolve.

Gaps between PowerBI streaming tiles and SharePoint

So I spend an evening playing (I actually have a lot of fun exploring these things) and figuring out how the pieces of SharePoint, PowerBI and Flow are supposed to work together.

In my head - they already connect.  But I have never seen anyone blog them.  So I decided to give it a stab.

Turns out there are some gaps.

The Idea

The Idea is simple.  We can create a PowerBI that uses SharePoint List as a datasource.  But instead of configuring scheduled refresh, we want to use the PowerBI Rest Dataset to push data in a streaming way.  And since Microsoft Flow has an action to do this, as well as the triggers to listen to SharePoint List.  We can get SP List-push-to PowerBI without needing schedule refresh.  That is a crazy fun idea.

The Reality

The reality is that there are several gaps.  These are probably solvable, but I just want to list them first, and we'll tackle them in the future.

Gap 1.  SharePoint List dataset != Push-enabled REST dataset

PowerBI makes a distinction between what's a REST/Pushable Dataset vs normal datasets like external lists.  In fact, Flow can not connect to a non-REST dataset.

So we need to create a REST dataset in PowerBI Service (it is not a feature of PowerBI Desktop), and then use the REST dataset as a live connection in a PowerBI Report.

Gap 2.  The only way to create a PowerBI REST dataset is via the REST API. 

There is no UI.  Ouch.  That pretty much makes this a developer task.  OK that's fine, we create a REST dataset via REST endpoint and a JSON schema (double ouch).  

Now we can build our PowerBI report, connect the REST dataset from PowerBI Service.  We save and publish this report to PowerBI Service and then insert the PowerBI Report in an SPFx webpart (PRO license needed for embed) into a SharePoint modern page.

This part is actually really seamless.  Don't worry, we have more gaps.

Gap 3. PowerBI report does not livestream REST dataset results.  

So I'm staring at my PowerBI visual in a SharePoint modern page.  In a separate window, I update the source SharePoint List.  In yet another separate window, I can see the Flow ran and push the new list item into the streaming dataset.

Excellent.  Except, the SPFx PowerBI Report Visual isn't updating.  It doesn't update.  I waited 15mins for it to do nothing!

If I F5, then I immediately see the new value.  But it doesn't do live streaming refresh :-(

It turns out, to see live stream results we need a PowerBI Dashboard or PowerBI Tile (streaming tile).

PowerBI Dashboard can only be created in the PowerBI Service.  We take an existing report and pin the visual.  This asks us to add the visual as a tile in a PowerBI Dashboard.

Gap 4. SPFx PowerBI report webpart does not show PowerBI dashboard embed.

So I create a PowerBI Dashboard and I go back to the SPFx PowerBI Preview webpart.  Only to find it doesn't do dashboard embed.

It only does Report embed.  So we will need to build our own SPFx that let us do dashboard embed.  This requires a embed token from MSGraph - but we should be able to piggyback the graph-util helper in SPFx to do our token exchange.

There's potentially one more issue.

Gap 5.  Does embed PowerBI Dashboard or Tile actually connect to streaming datasets?

I don't know the answer to this yet.

Gap 6. PowerBI REST Dataset endpoint can only add rows, not update them

The REST API lets us add rows to a REST dataset easily, or clear the table.  But there's no way to update an existing row.

The use case for the streaming REST dataset is like a ongoing stock ticker or temperature meter.  You don't update a record that's streamed past.  You only care about new records.

Flow only has an action to add row to PowerBI Dataset.

Gap 7. Flow does not have an action to Clear the dataset

The REST API lets us clear the dataset table, so technically, I could clear the table each time and repopulate it with the entire list again.

But unfortunately, Flow only has one PowerBI Action - add row to a REST dataset.  It does not have a clear rows action.

More work to do, more exploration to be had

Parts of the puzzle works really well.  Flow pushes data freely from SharePoint list changes into the streaming dataset.  If the dashboard or tile is shown on a webpage by itself, we immediately see it update like magic.

But if we want the dashboard/tile embedded within a SharePoint modern page.  There's still work to be done.

Ultimately, if we want live streaming data capability, it might be easier to use PowerBI LiveQuery against Azure SQL, and have Flow push data into that, instead of PowerBI REST streaming Dataset.

 

 

 

Working with SharePoint WebHooks with JavaScript using an Azure Function

This blog post covers additional explanation on how to subscribe to SharePoint WebHooks and have it running with only JavaScript in Azure Functions.  The entire code is in one single JavaScript file.

The SharePoint team delivered on the promise to ship SharePoint WebHooks, and made an reference example in C#.  Be sure to watch the PnP webcast as well.

What is Azure Functions?

Azure Functions, in the simplest sense, is a Azure WebJob that lets you run a JavaScript function in a file (it can do C# too), and it will run it for you when you trigger it.  Because it is a Serverless platform - you don't pay for the WebJob unless your function is running.  This makes it really economical (you also get like a million free runs per month...)

Azure Function comes with a super user-friendly UI and you can paste or upload your JavaScript directly in the browser.  You don't need any tools installed.

Running in the browser

 

Of course, this is server-JavaScript.  So think NodeJS.  We can talk to lots of REST APIs (Graph, SPO) but there will be no browser DOM.

What can you do with this?

You can trigger it on a timer.   Hey that sounds like a SharePoint Timer Job.

You can trigger it on a web request.  This is super useful if you have a client-side UX and you need a button to do some high level elevated permission action.  You call the Azure Function to do it for you, using an App-Only permission elevation.

Why is SharePoint WebHooks important?

A SharePoint WebHook is a REST endpoint that you can attach a remote End Point to.  Right now, there is only a List endpoint.  So any updates to the list items will trigger an event, and SharePoint will call your function.

Hey.  Wait that sounds like a SharePoint Remote Event Receiver.  You are right!

Design

So the idea of our function is this - when triggered:

It will Auth and then talk to SharePoint REST and do one of the following.

  • Check if it has a request parameter "subs" - then it will list the current subscriptions on our target list
  • Check if it has a request parameter "sub" - it will try to attach itself to the target list
    SharePoint will immediately call the function with a validationtoken parameter so…
  • Check if the request has a validationtoken parameter - it will immediately reply with that token as text/plain.
  • Skipping all three conditions, it will run the default action
    The default action is that it will add an list item in a different destination list.  Because the function is running on its own App-Only permission, it can update a list that the original user doesn't have access to.

 

GET Subscriptions

options = {
    method: 'GET',
    uri: "https://johnliu365.sharepoint.com/_api/web/lists/getbytitle('subscribe-this')/subscriptions",
    headers: headers
};
request(options, function (error, res, body) {
    context.log(error);
    context.log(body);
    context.res = { body: body || '' };
    context.done();
});

GET subscriptions.  Array result is [] empty by default.  It has subscriptions after you attach hooks successfully.

 

POST Subscription (to add itself)

options = {
    method: 'POST',
    uri: "https://johnliu365.sharepoint.com/_api/web/lists/getbytitle('subscribe-this')/subscriptions",
    body: JSON.stringify({
        "resource": "https://johnliu365.sharepoint.com/_api/web/lists/getbytitle('subscribe-this')",
        "notificationUrl": "https://johnno-funks.azurewebsites.net/api/poke-spo2?code=q8sq9wxm62asd-YOURTRIGGERCODE",
        "expirationDateTime": "2017-01-01T16:17:57+00:00",
        "clientState": "jonnofunks"
    }),
    headers: headers
};
request(options, function (error, res, body) {
    context.log(error);
    context.log(body);
    context.res = { body: body || '' };
    context.done();
});

Sending POST subscription without handling validation token.  The request fails.

 

Handle Validation Token

// if validationtoken is specified in query
// immediately return token as text/plain
context.log(req.query);
context.log(req.query.validationtoken);
context.res = { "content-type": "text/plain", body: req.query.validationtoken };
context.done();

The function is called twice.  Second time by SharePoint to validate the subscription.

Result Video

The Poked list item was created by "SharePoint App" not "John Liu" the user.

Notes

This demo builds on top of the code from Azure Functions, JS and App-Only Updates to SharePoint Online that covers authentication with certificate, and running AppOnly permissions. 

Additionally, I've moved ClientID and Certifate ThumbPrint to Azure Function App Settings.  This means they are no longer part of the code.

App Settings

var clientId = process.env['MyClientId'];
var thumbprint = process.env['MyThumbPrint'];
Function App Settings > Configure App Settings

Function App Settings > Configure App Settings

 

Source Code

I'm making an effort to put all my demo source code on GitHub going forward.  This is part of the "Upgrade Your JS" demos.

https://github.com/johnnliu/demo-upgrade-your-js/tree/master/azure-function-web-hook

Let me know what you think about SharePoint WebHooks, Azure Functions and JavaScript that will rule everything ;-)

If you spot a bug or want to update the code - send me a Pull Request.