ContentType Workflows not being updated in the Site Collection


This is a problem we've been seeing for a long time on our development and production servers.  The error is hard to describe directly, but here is our setup and the series of symptoms we see:

  • We have reusable content type workflows defined for an InfoPath Form content type.
  • The Workflow Association is created at the root content type level and pushed down through the site collection, for every Forms Library that uses the content type.
  • Sometimes, we discover that an older version of the workflow is running, after we've updated the workflow. 
  • Next, we'll find that for some reason, for one specific Forms Library, the Workflow Association is still pointing to the old workflow template, so even when we manually start a new instance of a workflow, it is still running the older version.
  • In our specific case, the error was happening with Nintex Workflows, but this is not a Nintex Workflow problem - the problem is in the underlying SharePoint platform and how it is creating the workflow associations.


The Chase


I begin writing scripts to try to catch the form libraries that have this behaviour.  I work out how to read the Workflow Association records.  For every workflow association, there is an InternalName field that looks like this:

  • [workflow name] <Cfg.{guid}>

When the workflow is updated, the current workflow association is renamed to:

  • [workflow name] (Previous Version:{date}) <Cfg.{guid}>

Additionally, each Workflow Association has a BaseId that points to a BaseTemplate object, and the Template object has an indexer property:

  • template["DeclarativeConfiguration"] - which gives back the same Cfg.{guid}

So a program (or script) can be written that looks through all the form libraries in our site collection and reports on content type workflow associations where the Cfg.{guid} on the Workflow Association doesn't match the one on the template.
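
The comparison itself can be sketched in a few lines of Python.  This is only an illustration of the matching logic, not the script we actually ran: it assumes the association data has already been exported from SharePoint into plain records, and the field names (library, internal_name, template_cfg) and sample values are hypothetical.

```python
import re

# Matches the trailing "<Cfg.{guid}>" marker at the end of an InternalName.
# The pattern is deliberately loose because the exact guid format is assumed.
CFG_PATTERN = re.compile(r"<(Cfg\.[^<>]+)>\s*$")


def extract_cfg(internal_name):
    """Return the 'Cfg.{guid}' token from an InternalName, or None if absent."""
    match = CFG_PATTERN.search(internal_name)
    return match.group(1) if match else None


def find_mismatched_associations(records):
    """Return records whose association Cfg differs from the template Cfg.

    Each record is a dict with hypothetical keys:
      library       - name of the Forms Library
      internal_name - InternalName of the Workflow Association
      template_cfg  - value of template["DeclarativeConfiguration"]
    """
    mismatched = []
    for record in records:
        assoc_cfg = extract_cfg(record["internal_name"])
        template_cfg = record["template_cfg"].strip()
        # A missing Cfg marker is reported too, since it cannot be verified.
        if assoc_cfg is None or assoc_cfg.lower() != template_cfg.lower():
            mismatched.append(record)
    return mismatched


if __name__ == "__main__":
    sample = [
        {"library": "Forms A", "internal_name": "Approval <Cfg.1111>", "template_cfg": "Cfg.1111"},
        {"library": "Forms B", "internal_name": "Approval <Cfg.0000>", "template_cfg": "Cfg.1111"},
    ]
    for record in find_mismatched_associations(sample):
        print(record["library"])
```

With the sample data, only the second library is reported, because its association still carries the older Cfg value.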




Each Workflow Associations collection has a special method that attempts to update its Workflow Associations to the latest version.  The method, UpdateAssociationsToLatestVersion, detects the Cfg difference, creates a new Association to the latest template, renames the old Workflow Association to Previous Version, and sets the old Workflow Association to keep running its existing instances without starting new ones.


The Error


I see a few scenarios.  The Workflow Association for a Form Library seems to break when that library has a workflow instance running at the time of the update.

I also begin to watch SharePoint's Workflow Association table and notice that when a Workflow Association is broken, it seems to have a strange StatusFieldName of "ows_".  This field is for the Workflow Status column, but it shouldn't have that bare name.
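
That symptom is easy to check for once the rows are exported.  The following is a minimal Python sketch, assuming each row is a plain record with a hypothetical status_field_name key; the only fact taken from our observations is that a broken association shows the bare "ows_" value.

```python
BARE_PREFIX = "ows_"  # the value we observed on broken associations


def has_broken_status_field(status_field_name):
    """True when the status field name is just the bare 'ows_' prefix."""
    return (status_field_name or "").strip() == BARE_PREFIX


def find_broken_rows(rows):
    """Return the rows whose hypothetical 'status_field_name' looks broken."""
    return [row for row in rows if has_broken_status_field(row.get("status_field_name"))]


if __name__ == "__main__":
    sample = [
        {"library": "Forms A", "status_field_name": "ows_Approval"},  # made-up healthy name
        {"library": "Forms B", "status_field_name": "ows_"},
    ]
    for row in find_broken_rows(sample):
        print(row["library"])
```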



Deep into the logs, I also see this exception buried:

Column 'ows_' does not exist. It may have been deleted by another user.


The Fix


A quick search turns up Stefan Gossner's blog post from 2011, which describes the problem probably better than I could.  Essentially, when you have a lot of Lookup columns in your List View (remembering that each Workflow Status column is actually also a Lookup), it is possible for the List View Lookup Threshold throttling to break the code that creates new Workflow Associations.  When this exception happens, the existing Workflow Association is left broken and must be manually removed and recreated.  While it is in the broken state, it will not update to the latest version.



The Aftermath


After we bumped the List View Lookup Threshold from 8 to 50, we are now able to update our Workflow Templates, and the correct Workflow Associations are being created and pushed down.  We still have a bunch of broken associations that require attention and fixing, but at least we can identify them and correct them manually.

Very, very annoying problem, but it feels good to finally find the solution to something that has plagued us for months.