Decode InfoPath attachments with a bit of JS AzureFunctions
Serge, April and I were discussing a problem: pulling InfoPath attachments out of InfoPath form XML and writing them into a SharePoint document library.
This is a problem I have tried to tackle before, but I came to the realization that I would need an Azure Function. The main reason is that an InfoPath attachment is a base64-encoded byte array, and that byte array has a variable-length header that includes the attachment file name. Flow doesn't have great byte manipulation or bit-shift abilities, so we need to write an Azure Function to help.
As I brooded over the problem, I also thought it might be easier to handle the byte array with JavaScript. So I gave it a go.
This blog post is my version of the answer.
The original decoder code in C#
There is a pretty old MSDN article with the C# code:
private void DecodeAttachment(BinaryReader theReader)
{
    // Position the reader to get the file size.
    byte[] headerData = new byte[FIXED_HEADER];
    headerData = theReader.ReadBytes(headerData.Length);

    fileSize = (int)theReader.ReadUInt32();
    attachmentNameLength = (int)theReader.ReadUInt32() * 2;

    byte[] fileNameBytes = theReader.ReadBytes(attachmentNameLength);
    // InfoPath uses UTF8 encoding.
    Encoding enc = Encoding.Unicode;
    attachmentName = enc.GetString(fileNameBytes, 0, attachmentNameLength - 2);
    decodedAttachment = theReader.ReadBytes(fileSize);
}
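To make the layout that this C# reads a bit more concrete, here is a rough sketch of the inverse operation in Node.js: it builds a buffer with the same shape, which is handy for producing test input. The name `encodeAttachment` is made up for illustration, and the 16 fixed header bytes are zero-filled placeholders here (an assumption: the decoder above skips them, so their real contents do not matter for this sketch).

```javascript
// Sketch only: build an InfoPath-style attachment payload for testing a decoder.
// Assumption: the 16 fixed header bytes are zero-filled placeholders.
const FIXED_HEADER = 16;

// encodeAttachment is a hypothetical helper name, not part of any library.
function encodeAttachment(fileName, fileBytes) {
  // The name is stored as UTF-16LE followed by a two-byte null terminator.
  const nameBytes = Buffer.from(fileName + '\0', 'utf16le');

  const header = Buffer.alloc(FIXED_HEADER + 8);
  // Offset 16: file size in bytes (unsigned 32-bit, little-endian).
  header.writeUInt32LE(fileBytes.length, FIXED_HEADER);
  // Offset 20: name length in UTF-16 characters, terminator included.
  header.writeUInt32LE(nameBytes.length / 2, FIXED_HEADER + 4);

  // Header, then the name, then the raw file content; returned as base64.
  return Buffer.concat([header, nameBytes, fileBytes]).toString('base64');
}

const sample = encodeAttachment('hello.txt', Buffer.from('hi there'));
console.log(sample);
```

The decoder walks the same fields in the same order: skip 16 bytes, read the file size, read the name length, read the name, then read the file content.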
The updated code in JS AzureFunctions
module.exports = function (context, req) {
    context.log('JavaScript HTTP trigger function processed a request.');

    if (req.body && req.body.file) {
        // https://support.microsoft.com/en-us/help/892730/how-to-encode-and-decode-a-file-attachment-programmatically-by-using-v
        var buffer = Buffer.from(req.body.file, 'base64');
        // var header = buffer.slice(0, 16); // unused header
        var fileSize = buffer.readUInt32LE(16); // test is 5923 bytes
        var fileNameLength = buffer.readUInt32LE(20); // test is 13 chars, including the null terminator
        // article lies - the file name is UTF-16LE, not UTF8
        var nameStart = 24;
        var dataStart = nameStart + fileNameLength * 2; // 2 bytes per character
        // toString takes an absolute end offset; stop before the 2-byte terminator
        var fileName = buffer.toString('utf16le', nameStart, dataStart - 2);
        // read exactly fileSize bytes, as the C# ReadBytes(fileSize) does
        var binary = buffer.slice(dataStart, dataStart + fileSize);

        context.res = {
            // status: 200, /* Defaults to 200 */
            body: {
                fileName: fileName,
                fileNameLength: fileNameLength,
                fileSize: fileSize,
                fileContent: binary.toString('base64')
            }
        };
    }
    else {
        context.res = {
            status: 400,
            body: "Please pass a base64 file in the request body"
        };
    }
    context.done();
};
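For testing outside Azure, the parsing can be pulled into a plain function. The sketch below is one way to do that, not the deployed code: `decodeInfoPathAttachment` is a made-up name, the bounds checks are my addition, and the end of the file name is computed as the start of the file data minus the two terminator bytes. The test payload is hand-built with a zeroed 16-byte header, which is an assumption that only works because the header is skipped.

```javascript
// Hypothetical helper: the same header walk as the function above, minus the Azure wrapper.
function decodeInfoPathAttachment(base64) {
  const HEADER = 16; // fixed header, skipped
  const buffer = Buffer.from(base64, 'base64');
  if (buffer.length < HEADER + 8) {
    throw new RangeError('Buffer too short to hold an attachment header');
  }

  const fileSize = buffer.readUInt32LE(HEADER);
  const fileNameLength = buffer.readUInt32LE(HEADER + 4); // UTF-16 chars, terminator included
  const nameStart = HEADER + 8;
  const dataStart = nameStart + fileNameLength * 2;
  if (fileNameLength < 1 || dataStart + fileSize > buffer.length) {
    throw new RangeError('Header lengths do not fit the buffer');
  }

  return {
    // Stop two bytes early to drop the null terminator.
    fileName: buffer.toString('utf16le', nameStart, dataStart - 2),
    fileContent: buffer.subarray(dataStart, dataStart + fileSize),
  };
}

// Hand-built payload: zeroed header, file name "a.txt", two-byte file "OK".
const demoName = Buffer.from('a.txt\0', 'utf16le');
const demoHead = Buffer.alloc(24);
demoHead.writeUInt32LE(2, 16);
demoHead.writeUInt32LE(demoName.length / 2, 20);
const demoPayload = Buffer.concat([demoHead, demoName, Buffer.from('OK')]).toString('base64');

const demoResult = decodeInfoPathAttachment(demoPayload);
console.log(demoResult.fileName, demoResult.fileContent.toString());
```

Throwing a `RangeError` on a short or inconsistent buffer is a design choice for the sketch; inside the HTTP function it would probably map to a 400 response instead.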
The InfoPath form
The Microsoft Flow that coordinates the work
Results
An Azure Function is needed here, because Flow can't do the byte manipulation on its own.
The JavaScript Buffer is pretty good at byte decoding, and easy to read too.
Debugging and tweaking the byte offsets took quite a bit of trial and error, which I was not expecting. Maybe that MSDN article is too old; it is from 2003.
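One thing that may cut down on the trial and error is dumping the first bytes of the decoded attachment next to their offsets, so the fields at 16, 20 and 24 can be checked by eye. This is only a sketch: `dumpHeader` is a made-up helper, and the probe buffer below is fabricated test data using the sizes from my test file (5923 bytes, 13 characters), not a real attachment.

```javascript
// Hypothetical debugging helper: hex-dump the start of a base64 attachment,
// eight bytes per line, each line prefixed with its decimal byte offset.
function dumpHeader(base64, byteCount = 48) {
  const buffer = Buffer.from(base64, 'base64');
  const limit = Math.min(byteCount, buffer.length);
  const lines = [];
  for (let offset = 0; offset < limit; offset += 8) {
    const chunk = buffer.subarray(offset, Math.min(offset + 8, limit));
    const hex = chunk.toString('hex').match(/../g).join(' ');
    lines.push(String(offset).padStart(4, ' ') + ': ' + hex);
  }
  return lines.join('\n');
}

// Fabricated probe: zeroed header, fileSize 5923, fileNameLength 13, first name char "t".
const probe = Buffer.alloc(26);
probe.writeUInt32LE(5923, 16);
probe.writeUInt32LE(13, 20);
probe.write('t', 24, 'utf16le');
console.log(dumpHeader(probe.toString('base64')));
```

In the dump, the line for offset 16 shows the two little-endian 32-bit fields side by side (`23 17 00 00` is 5923, `0d 00 00 00` is 13), and the alternating zero bytes from offset 24 onward are a quick visual hint that the name is UTF-16LE.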
You may think: John, 2018 is not the right year, or decade, to be writing about InfoPath. But hear me out. As companies move their form technology forward, they will need to consider how to migrate the data and attachments in their current InfoPath forms somewhere else. Having this blog post as a reference is important for that eventual migration. Good luck!