HI all, I am writing this post to understand more about debugging queue processing. I have a queue processor which is fed regularly by a job scheduler. When I see the queue processors landing page on the admin studio, I find that the number of queue items (Ready, Failed and processed) shown. I have few questions related to my implementation and queue processors in general. 1. When I trace it I don't see the next queue item being traced. I have to freshly queue an item from my user session. My operator ID and ASYNC requestor share the same access group, so not sure what is that I am missing here additionally. 2. When I open the data flow instance, I see that there are warnings shown stating stale threads are found. What needs to be done about this? (I don't see any where this is documented, please advise!) 3.When I am trying to stop the queue processor from the admin studio, I get the following error. "Queue processor, <>couldn't reach a stable state. Please check after some time.". Why is this being caused? what are the things I need to check? 4. How can I delete the items that are queued for a queue processor? 5. How can I see how many concurrent processes are running on the cluster? The statistics provided on the dataflow does not indicate this info. Also I don't see any documentation which talks about what does the details that are shown in the dataflow. (I have provided 4 threads on my Queue processor). We are on PEGA 8.2.1 version using PEGA cloud instance.
Could you please let us know the state of stream node under Configure-> Decisioning ->Infrastructure ->Services -> Stream?
Regarding stale threads warning :
The data flow has a mechanism that is executed frequently for maintenance which is called Pulse. During the Pulse, one of the operations executed is a Stale Thread detection in which the system identifies threads that are potentially stuck, either in as an infinite loop or waiting for a response from an external service such as a database call. By default, if the thread has not updated its metrics (performed multiple times during Partition execution) for over 30 minutes, it is considered stale and a warning with the stack trace of this thread is added to the data flow run.
Thanks for your update. In some case if the activity is getting hung, dataflow throws huge number of stale thread exception – in this case we need to verify the activity. In one of my case server restart by deleting extractmarkerfile from temp folder cleared the stale thread warning from dataflow run , hope the same would help here.
We can delete the items which are in broken queue and not sure if it is ready to process state.