Posted: 19 Feb 2018 15:03 EST Last activity: 5 Oct 2018 2:17 EDT
Tracing an activity and recovering from failure
I'm trying to understand how Pega Tracer implements a lock when tracing a particular activity. One of my developers hit a trace error: when a trace fails because the Tracer session was not closed or terminated properly (the session timed out or got stuck for some reason), Tracer doesn't open again on that rule. Even switching to another node doesn't allow the user to start Tracer on the rule in question. Additionally, terminating the requestor via System->Operations-> Requestor Management does not seem to clear it. We've experimented with Activity rules to see this behavior. The error message is formatted as:
"Tracer session error: This rule <rule key> is being traced by operator <operator ID> from requestor <requestor ID> - Please restart tracer."
I was able to reproduce the error by opening two sessions on different nodes and trying to trace the same activity from both. The second trace attempt fails with the error message. I assume this has something to do with the architecture of how the tracer "traces" and sends the data back to a user. If I navigate to System->Operations-> Requestor Management and execute a Terminate Tracer on the first requestor from the second requestor's session, I am able to pick up and trace with the second session.
(Note: I did not replicate the exact scenario my developer was seeing, where the tracer itself fails. This may affect whether Terminate Tracer can be selected, or whether it is offered as an option at all. I'm unclear which options they would see in that scenario.)
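To make the two-session reproduction concrete, here is a toy model of the behavior I observed. It is only a sketch under my own assumptions and not how Pega actually implements Tracer: the class TraceRegistry, its methods, and the idea of a single shared map keyed by rule are all hypothetical, and the operator and requestor IDs are made up. It shows one tracer per rule, a second attempt being rejected with the message format quoted above, and Terminate Tracer on the first requestor freeing the rule.

```python
class TraceRegistry:
    """Hypothetical model: at most one tracer session per rule, visible to all nodes."""

    def __init__(self):
        # rule key -> (operator ID, requestor ID) of the session currently tracing it
        self._watches = {}

    def holder(self, rule_key):
        """Return the (operator, requestor) tracing the rule, or None."""
        return self._watches.get(rule_key)

    def start_trace(self, rule_key, operator_id, requestor_id):
        """Register a trace, or fail if another requestor already traces the rule."""
        current = self._watches.get(rule_key)
        if current is not None and current != (operator_id, requestor_id):
            op, req = current
            raise RuntimeError(
                f"Tracer session error: This rule {rule_key} is being traced by "
                f"operator {op} from requestor {req} - Please restart tracer."
            )
        self._watches[rule_key] = (operator_id, requestor_id)

    def terminate_tracer(self, requestor_id):
        """Model of Terminate Tracer: drop every watch held by that requestor."""
        self._watches = {
            key: value
            for key, value in self._watches.items()
            if value[1] != requestor_id
        }


if __name__ == "__main__":
    registry = TraceRegistry()
    rule = "MYCLASS MYACTIVITY"  # made-up rule key

    registry.start_trace(rule, "dev1", "REQ-A")      # first session, node 1
    try:
        registry.start_trace(rule, "dev2", "REQ-B")  # second session, node 2
    except RuntimeError as err:
        print(err)

    registry.terminate_tracer("REQ-A")               # Terminate Tracer on first requestor
    registry.start_trace(rule, "dev2", "REQ-B")      # second session now succeeds
    print(registry.holder(rule))
```

In this model, an entry that is never removed (for example, because the session died before cleanup ran) would block every later trace on the rule until something clears the map. Whether anything like that happens inside Pega is exactly what I am asking.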
I can't see a lock in pr_sys_locks, so where is the lock being held, if at all: in the session, in memory, or in the database?
Is there any information to confirm the tracer architecture and why you can only trace an activity from one requestor at a time?
Does it make a difference when you terminate the requestor FIRST (and are then unable to Terminate Tracer, as it is no longer an option)? Does this somehow leave you waiting for the system to release the trace lock on its own?
If terminating the trace AND terminating the requestor both fail to release the lock, is the only other option to restart the node?
Where is the lock being held and can it be viewed using any Pega or database viewing tools?
I believe your assumption here is correct. I am not the highest authority on this topic, but my earlier response was influenced by those who work in this area more often than I do, and they did not mention any way of viewing the locks.
What you see here is what you get. None of the Pega resources that replied to the post identified a better way to deal with these issues than what is documented here. Using System->Operations-> Requestor Management will occasionally fix the issue (see earlier in the post), and restarting certainly fixes things. The Pega resources that replied on the post mention an SMA option (Logging and Tracing -> Remote Tracing; then Clear Rule Watches). We never had an opportunity to try this as a possible solution to the problem we had, as it never returned and we were unable to duplicate it. Additionally, as we're all being told that SMA is being phased out, I don't know whether this option will be replaced or added to the functionality they are making available in the Developer portal itself.