Question
BNY Mellon
US
Last activity: 29 Oct 2015 7:37 EDT
Overhaul of Flow Error Management needed
We need a better way of dealing with flow errors. An error in production costs time to troubleshoot & resolve, and we want this to be better automated.
These have been issues for many years & versions, and we need some guidance on when they will be addressed.
For each flow error, here's what we'd like:
- Can the error occurrence (or the detection thereof - with flow) be fully logged? e.g., the function markProblemAssignment() marks the problem, but does not log it. There are useful things to be pulled from the Clipboard at this time.
- Can the error be displayed in a notification dashboard visible to admins? These have long been buried on the Word Admin menu (under Process & Rules > Tools).
- Can the error message to the user follow a standard, useful format, e.g., Standard for User Error Messages?
- Additionally for these error types:
- Assignment mismatch - the mismatch is buried on the Clipboard. We should display to the user where the work is pointing to; where the assignment is pointing to - and potentially allow the application (or the user) to decide which should take precedence.
- Flow Not At Task - Let's have a button to fix it. If we follow good design, and each status corresponds to a unique task step (not that anything validates that), we could restart the flow and restore it to that point.
- Lock Lost - If it's a soft lock, tell the user that the lock had expired (this should never happen with Continued support for Extending Locks?); if someone else made an update since the form was updated, provide those details.
Message was edited by: Vidyaranjan Av
The PegaSaveDetect diagnostic patch should be available from the Support team for this. It would be good to include it in the product.
BNY Mellon
US
ok, I'll bite. What is the PegaSaveDetect diagnostic patch?
Pegasystems Inc.
IN
From what I've learned:
- As Naresh said, the PegaSaveDetect diagnostic patch is available for all PRPC versions (5x, 6x & 7x).
- It detects & logs work processing errors such as assignments/flows being deleted unexpectedly.
- Once installed, all the errors are captured as instances of class Log-SaveDetect or Log-PegaSaveDetect.
- There are TWO DSS settings, for Save & Delete operations, which are enabled by logging in as PegaSaveDetectOperator.
- Once the patch is installed, there is little or no way to remove the patch itself.
- However, the logged instances can be deleted.
Hi Eric Osman, please correct me if any of this is wrong. Thank you!
psahukaru
Pegasystems
US
Assignment-mismatch and flow-not-at-task should never happen.
/Eric
BNY Mellon
US
Of course not. But should it happen, as we're seeing on occasion, we want to be able to capture the data properly.
Pegasystems
US
Yes, that is basically right.
PegaSaveDetect is a zip produced from a Rule-Admin-Product, and so it is installed from the import landing page (not Update Manager).
The zip provides trigger rules and some checking activities. The trigger launches the activities whenever your work object is saved, or your assignment objects are saved or deleted.
The checking activities will produce an instance of class Log-SaveDetect (nothing is written to the pega log file) whenever any of the following situations are detected:
- Work object or assignment is written when the pxFlow property data on the work object does not properly refer to the assignment
- Assignment is deleted while pxFlow still refers to it
- Work object is written with an older pxUpdateDateTime than the one previously written
- The pyTemporaryObject property value is asserted
The instances in Log-SaveDetect can be viewed as XML from the standard class explorer, or tabulated with a supplied list view called ShowEntries.
Each entry contains:
- Time stamp of the infraction
- Description of what infraction was detected
- All stack traces of triggered events up to and including the stack trace of the event that was detected as an infraction
- Contents of the deferred list
There is a different PegaSaveDetect zip for each PRPC version. As of today, this tool has been given out when customers have an issue such as flow-removed, assignment-mismatch, or flow-not-at-task errors, or when assignments mysteriously disappear, or remain after the object has been resolved. But the tool isn't available on the PDN yet.
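To make the four detection conditions and the entry contents above a bit more concrete, here is a rough Python sketch. It is not the actual PegaSaveDetect code (that is delivered as trigger rules and activities); it only mirrors the checks as described in this post. The dictionary layout, the `assignment_key` field, and the names `LogEntry`, `check_work_save` and `check_assignment_delete` are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class LogEntry:
    """Hypothetical stand-in for one Log-SaveDetect instance."""
    timestamp: datetime
    description: str
    stack_traces: list = field(default_factory=list)   # traces of triggered events
    deferred_list: list = field(default_factory=list)  # contents of the deferred list


def check_work_save(work, previous_update, assignment_keys, now):
    """Return one LogEntry per infraction found when a work object is saved."""
    problems = []
    # pxFlow data on the work object should refer to an existing assignment.
    for name, flow in work.get("pxFlow", {}).items():
        if flow.get("assignment_key") not in assignment_keys:  # assumed field name
            problems.append(f"pxFlow({name}) does not refer to a known assignment")
    # The work object should not be written with an older pxUpdateDateTime.
    updated = work.get("pxUpdateDateTime")
    if previous_update is not None and updated is not None and updated < previous_update:
        problems.append("pxUpdateDateTime is older than the previously written value")
    # pyTemporaryObject should not be asserted on a saved work object.
    if work.get("pyTemporaryObject"):
        problems.append("pyTemporaryObject is asserted")
    return [LogEntry(now, p) for p in problems]


def check_assignment_delete(work, deleted_key, now):
    """Return a LogEntry for each pxFlow entry still referring to a deleted assignment."""
    return [
        LogEntry(now, f"assignment deleted while pxFlow({name}) still refers to it")
        for name, flow in work.get("pxFlow", {}).items()
        if flow.get("assignment_key") == deleted_key
    ]
```

In the real tool the stack traces and deferred list would be captured at trigger time; the sketch leaves them empty because nothing in this thread says how they are collected.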
/Eric
BNY Mellon
US
Awesome - this sounds super useful, and I imagine this should be rolled into the product one day. Requested it from GCS.
Pegasystems Inc.
US
Jon, to be clear, PegaSaveDetect is a diagnostic tool that GCS created to try and identify the root cause of Flow Removed and Flow Not At Task problems. It is something that we may suggest to put into production to track down the source of a specific problem, but not generally something we would recommend installing and leaving because it can be noisy if your application has a large number of places where things get out of and then back into sync. Often we find that customers have to fix a bunch of design defects just to clear a path for it to be useful in production. If, after that exercise, you leave it running in production and someone moves some less than clean code in, you can fill up your table with Log-SaveDetect entries and not even realize it. It's also not the easiest tool to understand and analyze, again because it wasn't built by developers, but by support engineers and only to the point that it meets our needs.
The developers are aware of the tool and I believe the major stumbling block to getting it included in the product is the performance hit inherent in its design (the triggers fire every save, committed save, and delete of an assignment and work object). The performance impact is usually worth it if we're trying to track down an intermittent, production only, flow removed problem, but not generally something I'd recommend for day to day operation. Certainly, hearing that you would like having something like this baked into the product is helpful for me when I next speak about it to the product owners.
All that said, putting PSD onto your dev system and running it periodically to ensure that your application doesn't have bad saves/deletes is not a bad idea at all.
BNY Mellon
US
Yes, it would be helpful. The rules do include System-Settings to enable/disable the trigger activities (though note that the triggers still are in the declarative network... is there a better way to put a System-Setting reference directly in the trigger rule, so it wouldn't be added in the first place?) Speaking of, the package we got does have a bit of a mix-up with regard to the cached system settings, see Don't make me cache system-settings...
re: "you can fill up your table with Log-SaveDetect entries and not even realize it..."
This raises the point that it would be a useful sysadmin tool to be able to monitor the growth of tables (or class instances).
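As a rough illustration of what such monitoring could look like, here is a small Python sketch that compares two snapshots of instance counts per class and flags anything growing unusually fast. It is only an assumption about how one might do it: the function name `growth_alerts`, the thresholds, and the sample counts are all made up, and how the counts are actually collected (a report, a database query) is left out.

```python
def growth_alerts(previous, current, max_ratio=2.0, min_rows=1000):
    """Flag classes whose instance count grew by more than max_ratio.

    previous/current map class name -> instance count from two snapshots.
    Classes below min_rows are ignored so tiny tables do not raise alerts.
    """
    alerts = []
    for name, count in sorted(current.items()):
        before = previous.get(name, 0)
        if count >= min_rows and count > before * max_ratio:
            alerts.append((name, before, count))
    return alerts


# Made-up snapshot data, purely for illustration.
yesterday = {"Log-SaveDetect": 500, "Work-": 10000}
today = {"Log-SaveDetect": 5000, "Work-": 10500}
for name, before, after in growth_alerts(yesterday, today):
    print(f"{name} grew from {before} to {after} instances")
```

A check like this, run on a schedule, would catch the "table fills up without anyone noticing" case mentioned above.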
Pegasystems Inc.
US
Jon,
While those are all good points, the key thing in my statement is that PSD is a diagnostic tool my team built. We are a support team and not a development team. It is not an officially released tool. I would reiterate that I wouldn't recommend putting it into production unless you have a specific issue you are trying to diagnose. Even then, you should probably be working directly with GCS. For lower environments where you are concerned about making sure your custom implementation doesn't get your work object and assignments out of sync, it's not a bad tool to have in your tool chest.
Thanks,
Mike
BNY Mellon
US
Harrumph. I still don't know what the strategy is to clean this up fully.
Dealing with a situation now with "Flow Not At Task" -- basically, consider a process model where A is followed by B in the flow stages.
The Assignment has been updated to flow A, but the workpage.pxFlow(<B>) was not properly updated, while workpage.pxFlow(<A>) is still present and filled out.
From what I see, the .pxFlow() pages are added within the flow-generated Java -- that's not very modular.
We need a function/activity (to be run by privilege, of course), which would copy the current assignment page to the corresponding pxFlow() to patch things up in these situations.
BNY Mellon
US
And, for FixBrokenAssignments to run, it needs pxFlow(<FlowName_Subscript>).pyFlowType=FlowName to be set.
So we ended up writing our own activity, as described above, to fix this.
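For anyone wanting the general shape of that patch-up, here is a minimal Python sketch of the idea, not our actual activity. It copies the current assignment's details into the matching pxFlow() entry and sets pyFlowType, as FixBrokenAssignments expects. Only pxFlow and pyFlowType come from this thread; the dictionary layout, the `flow_subscript`, `flow_name`, `task` and `key` fields, and the function name are hypothetical.

```python
import copy


def patch_flow_from_assignment(work, assignment):
    """Return a copy of the work page whose pxFlow entry matches the assignment.

    The assignment is treated as the source of truth. The original work page
    is left untouched so the caller can compare before saving.
    """
    patched = copy.deepcopy(work)
    flows = patched.setdefault("pxFlow", {})
    entry = flows.setdefault(assignment["flow_subscript"], {})
    entry["pyFlowType"] = assignment["flow_name"]  # needed by FixBrokenAssignments
    entry["task"] = assignment["task"]             # hypothetical field
    entry["assignment_key"] = assignment["key"]    # hypothetical field
    return patched


# Made-up example: the work page only knows about flow A, the assignment is in B.
work = {"pxFlow": {"A": {"pyFlowType": "FlowA", "task": "Assignment1"}}}
assignment = {"flow_subscript": "B", "flow_name": "FlowB",
              "task": "Assignment2", "key": "ASSIGN-WORKLIST W-1!FLOWB"}
fixed = patch_flow_from_assignment(work, assignment)
print(sorted(fixed["pxFlow"]))
```

Something like this should sit behind a privilege, and it is worth logging the before and after pages so the repair itself can be audited.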