We need a better way of dealing with flow errors. An error in production costs time to troubleshoot & resolve, and we want this to be better automated.
These have been issues for many years & versions, and we need some guidance on when they will be addressed.
For each flow error, here's what we'd like:
Can the error occurrence (or the detection thereof - with flow) be fully logged? e.g., the function markProblemAssignment() marks the problem, but does not log it. There are useful things to be pulled from the Clipboard at this time.
Can the error be displayed in a notification dashboard visible to admins? These have long been buried on the Word Admin menu (under Process & Rules > Tools).
Assignment mismatch - the mismatch is buried on the Clipboard. We should display to the user where the work is pointing to; where the assignment is pointing to - and potentially allow the application (or the user) to decide which should take precedence.
Flow Not At Task - Let's have a button to fix it. If we follow good design, and have each status correspond to a unique task step (not that anything validates that), we could restart the flow and restore it to that point.
Lock Lost - If it's a soft lock, tell the user that the lock had expired (this should never happen with Continued support for Extending Locks?); if someone else made an update since the form was updated, provide those details.
PegaSaveDetect is a zip produced from a rule-admin-product, so it is installable from the import landing page (not the update manager).
The zip provides trigger rules and some checking activities. The trigger launches the activities whenever your work object is saved, or your assignment objects are saved or deleted.
The checking activities will produce an instance of class Log-SaveDetect (nothing is written to the pega log file) whenever any of the following situations are detected:
Work object or assignment is written when the pxFlow property data on the work object does not properly refer to the assignment
Assignment is deleted while pxFlow still refers to it
Work object is written with an older pxUpdateDateTime than the previous one written
The pyTemporaryObject property value is asserted
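The four checks above can be sketched roughly as follows. This is an illustrative Python model only, not the actual PegaSaveDetect activities: the WorkObject class and both function names are hypothetical, and the real tool runs as trigger rules inside PRPC.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Set


@dataclass
class WorkObject:
    """Hypothetical stand-in for a work object; not a Pega API."""
    key: str
    flow_assignment_keys: Set[str]   # assignments that pxFlow refers to
    update_time: datetime            # models pxUpdateDateTime
    temporary: bool = False          # models pyTemporaryObject


def check_work_save(work: WorkObject,
                    existing_assignment_keys: Set[str],
                    previous_update_time: Optional[datetime] = None) -> List[str]:
    """Return a description of each problem found when a work object is saved."""
    problems = []
    # pxFlow should only refer to assignments that actually exist.
    missing = work.flow_assignment_keys - existing_assignment_keys
    if missing:
        problems.append(f"pxFlow refers to missing assignment(s): {sorted(missing)}")
    # A save carrying an older timestamp suggests a stale copy overwrote a newer one.
    if previous_update_time is not None and work.update_time < previous_update_time:
        problems.append("pxUpdateDateTime is older than the previously saved value")
    if work.temporary:
        problems.append("pyTemporaryObject is set")
    return problems


def check_assignment_delete(assignment_key: str, work: WorkObject) -> List[str]:
    """Return a problem description if the deleted assignment is still referenced."""
    if assignment_key in work.flow_assignment_keys:
        return [f"assignment {assignment_key} deleted while pxFlow still refers to it"]
    return []
```

In this sketch each check returns plain descriptions, so a caller could write them to whatever log store it uses, in the same spirit as the Log-SaveDetect instances.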
The instances in Log-SaveDetect can be viewed as xml from the standard class explorer, or tabulated with a supplied listview called ShowEntries.
Each entry contains:
Time stamp of infraction
description of what infraction was detected
All stack traces of triggered events up to and including the stack trace of the event that was detected as an infraction
contents of the deferred list
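As a rough illustration of the entry layout just listed, here is a minimal Python sketch. The SaveDetectEntry class, its field names, and the summarize helper are all hypothetical; they model the four items above and are not the actual Log-SaveDetect property names.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple


@dataclass
class SaveDetectEntry:
    """Hypothetical model of one logged entry; field names are illustrative."""
    timestamp: datetime                                      # when the problem was detected
    description: str                                         # what was detected
    stack_traces: List[str] = field(default_factory=list)    # traces up to the detected event
    deferred_list: List[str] = field(default_factory=list)   # contents of the deferred list


def summarize(entries: List[SaveDetectEntry]) -> List[Tuple[str, int]]:
    """Count entries per description, most frequent first (ties sorted by name)."""
    counts = {}
    for entry in entries:
        counts[entry.description] = counts.get(entry.description, 0) + 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

Grouping by description like this is one way to see which kind of problem dominates when many entries have accumulated.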
There is a different PegaSaveDetect zip for each prpc version. As of today, this tool has been given out when customers have an issue such as flow-removed, assignment-mismatch, or flow-not-at-task errors, or when assignments mysteriously disappear, or remain after the object has been resolved. But the tool isn't available on pdn yet.
Jon, to be clear, PegaSaveDetect is a diagnostic tool that GCS created to try and identify the root cause of Flow Removed and Flow Not At Task problems. It is something that we may suggest to put into production to track down the source of a specific problem, but not generally something we would recommend installing and leaving because it can be noisy if your application has a large number of places where things get out of and then back into sync. Often we find that customers have to fix a bunch of design defects just to clear a path for it to be useful in production. If, after that exercise, you leave it running in production and someone moves some less than clean code in, you can fill up your table with Log-SaveDetect entries and not even realize it. It's also not the easiest tool to understand and analyze, again because it wasn't built by developers, but by support engineers and only to the point that it meets our needs.
The developers are aware of the tool and I believe the major stumbling block to getting it included in the product is the performance hit inherent in its design (the triggers fire on every save, committed save, and delete of an assignment and work object). The performance impact is usually worth it if we're trying to track down an intermittent, production only, flow removed problem, but not generally something I'd recommend for day to day operation. Certainly, hearing that you would like having something like this baked into the product is helpful for me when I next speak about it to the product owners.
All that said, putting PSD onto your dev system and running it periodically to ensure that your application doesn't have bad saves/deletes is not a bad idea at all.
Yes, it would be helpful. The rules do include System-Settings to enable/disable the trigger activities (though note that the triggers still are in the declarative network... is there a better way to put a System-Setting reference directly in the trigger rule, so it wouldn't be added in the first place?) Speaking of, the package we got does have a bit of a mix-up with regard to the cached system settings, see Don't make me cache system-settings...
re: "you can fill up your table with Log-SaveDetect entries and not even realize it..."
This raises the point that it would be a useful sysadmin tool to be able to monitor the growth of tables (or class instances).
While those are all good points, the key thing in my statement is that PSD is a diagnostic tool my team built. We are a support team and not a development team. It is not an officially released tool. I would reiterate that I wouldn't recommend putting it into production unless you have a specific issue you are trying to diagnose. Even then, you should probably be working directly with GCS. For lower environments where you are concerned about making sure your custom implementation doesn't get your work object and assignments out of sync, it's not a bad tool to have in your tool chest.