Decode InfoPath attachments with a bit of JS AzureFunctions

Serge, April and me were discussing a problem with pulling out InfoPath Attachment from InfoPath form XML and writing them into a SharePoint document library.

This is a problem I tried to tackle before, but came to realization that I would need an AzureFunction. The main reason is that the InfoPath attachment is a base 64 byte array but the byte array has a variable length header that includes the attachment file name. Flow doesn’t have amazing byte manipulation or left-shift abilities. So we need to write an AzureFunction to help.

As I brood over the problem I also thought it might be easier to handle the byte array with JavaScript. So I gave it a go.

This blog is my version of the answer.

The original decoder code in C#

There is a pretty old MSDN article on the C# code

private void DecodeAttachment(BinaryReader theReader)
{
  //Position the reader to get the file size.
  byte[] headerData = new byte[FIXED_HEADER];
  headerData = theReader.ReadBytes(headerData.Length);

  fileSize = (int)theReader.ReadUInt32();
  attachmentNameLength = (int)theReader.ReadUInt32() * 2;

  byte[] fileNameBytes = theReader.ReadBytes(attachmentNameLength);
  //InfoPath uses UTF8 encoding.
  Encoding enc = Encoding.Unicode;
  attachmentName = enc.GetString(fileNameBytes, 0, attachmentNameLength - 2);
  decodedAttachment = theReader.ReadBytes(fileSize);
}

The updated code in JS AzureFunctions

module.exports = function (context, req) {
    context.log('JavaScript HTTP trigger function processed a request.');
    if (req.body && req.body.file) {
        // https://support.microsoft.com/en-us/help/892730/how-to-encode-and-decode-a-file-attachment-programmatically-by-using-v
        var buffer = Buffer.from(req.body.file, 'base64')
        //var header = buffer.slice(0, 16);  // unused header
        var fileSize = buffer.readUInt32LE(16);  // test is 5923 bytes
        var fileNameLength = buffer.readUInt32LE(20);  // test is 13 chars
        // article lies - it's utf16 now
        var fileName = buffer.toString('utf16le', 24, (fileNameLength-1)*4 -1);  
        var binary = buffer.slice(24 + fileNameLength * 2);
        context.res = {
            // status: 200, /* Defaults to 200 */            
            body: {
                fileName: fileName,
                fileNameLength: fileNameLength,
                fileSize: fileSize,
                fileContent: binary.toString('base64')
            }
        };
    }
    else {
        context.res = {
            status: 400,
            body: "Please pass a base64 file in the request body"
        };
    }
    context.done();
};


The InfoPath form


The Microsoft Flow that coordinates the work


Results

  • Need Azure Function here

  • JavaScript buffer is pretty good at doing byte decoding, easy to read too

  • Debugging and tweaking the byte offset is quite a bit of trial and error, was not expecting that. May be that MSDN article is too old, it is from 2003.

  • You may think - John 2018 is not the right year, or decade to be writing about InfoPath. But hear me out. As companies move their form technology forward, they will need to consider how to migrate the data and attachments in their current InfoPath forms somewhere - having this blog post as a reference is important for that eventual migration. Good luck!