How can this happen? The agent is set up to recur every 120 seconds.
After a while the condition in the first step changed, and the work items were processed one by one again without waiting for the 120-second recurrence interval.
This brought us another issue: an OutOfMemoryError!
I can imagine this happened because thousands of items were scheduled and the agent was running them without waiting for the configured recurrence interval, maybe even in the same thread!?
If the requestor really runs all items within the same thread, which is not how I thought it would work, the pages would not be cleaned up properly by the engine, and that would surely cause an OutOfMemoryError after a while. The agent processed about 1500 work items in 13 hours before running out of memory.
So, when this agent started to run again after the node was restarted, we tried to switch it off in the SMA, but the agent did not stop at all and kept picking up the next scheduled work items one by one, again without waiting the 120 seconds configured in the agent schedule.
Only when I interrupted the requestor in the SMA did the agent processing stop, and ALL scheduled items in the queue (SYSTEM-QUEUE-DEFAULTENTRY) for that agent name were moved into the Broken-Queue.
I found the reason why the agent picked up so many items in one single session.
Unfortunately, I had missed this information, which I have now found in the Pega help:
For agents of the Standard queue mode, the maximum number of items from the agent queue that are processed at one time by the agent before it goes back to sleep for its interval. If this field is blank, the default value is 50. To specify that the agent is to process all entries in its queue before sleeping, enter 0.
This value is the maximum number of items to process. If Max Records is 30 and there are only 20 entries in the agent queue when the agent wakes, the agent processes the 20 entries and sleeps; it does not wait for an additional 10 entries.
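To make the quoted rule concrete, here is a minimal Python sketch of how I read it. It is not Pega code; the function name `items_processed_per_wake` and its parameters are made up purely for illustration, and the default of 50 is taken from the help text quoted above.

```python
from typing import Optional

# Default used when the Max Records field is blank, per the help text quoted above.
DEFAULT_MAX_RECORDS = 50


def items_processed_per_wake(queue_length: int, max_records: Optional[int] = None) -> int:
    """Return how many queue entries one agent wake-up would process.

    Hypothetical helper that only mirrors the documented rule:
    - blank field (None) -> fall back to the default of 50
    - 0                  -> process every entry in the queue before sleeping
    - otherwise          -> process at most max_records entries
    """
    if max_records is None:
        max_records = DEFAULT_MAX_RECORDS
    if max_records == 0:
        return queue_length
    # The agent never waits for more entries; it takes what is there, up to the cap.
    return min(queue_length, max_records)


# The example from the help text: Max Records 30, only 20 entries queued.
print(items_processed_per_wake(20, 30))    # 20
# Blank field with a large backlog: capped at the default.
print(items_processed_per_wake(1500))      # 50
# Explicit 0: the whole backlog in one session.
print(items_processed_per_wake(1500, 0))   # 1500
```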
As we left the Max Records field empty for the standard agent in the agent rule, up to 50 scheduled items were processed in one single session, because there were thousands of items in the queue.
But I still don't understand why all 1500 scheduled items were moved into the broken queue when I interrupted the requestor, and not only the 50 that the default Max Records value would explain.
Is there an option to change the default Max Records value?
Here is the answer to my last question: why all 1500 scheduled items in the queue were moved into the broken queue when I interrupted the current batch requestor for the running agent.
The help text above is for the older Pega version 7.2.
But our customer is using 7.2.2, and it seems the default Max Records value was changed in that version.
Here is the Pega help for Max Records in Pega 7.2.2:
For agents of the Standard queue mode, specify the maximum number of items from the agent queue that are processed at one time by the agent before it goes back to sleep for its interval. If this field is blank, the default value is 0, and the agent processes all entries in its queue before sleeping.
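The practical difference between the two documented defaults can be shown with a small Python sketch. It is not Pega code: the function `wake_ups_to_drain` and its parameters are hypothetical, and it assumes that no new items arrive while the queue is being drained.

```python
from typing import Optional


def wake_ups_to_drain(queue_length: int, max_records: Optional[int], blank_default: int) -> int:
    """Count the agent wake-ups needed to empty a queue of queue_length items.

    Hypothetical helper for illustration only. blank_default is the value the
    help text documents for an empty Max Records field (50 in 7.2, 0 in 7.2.2).
    A value of 0 means "process all entries before sleeping".
    """
    effective = blank_default if max_records is None else max_records
    if queue_length == 0:
        return 0
    if effective == 0:
        # Everything is handled in a single session.
        return 1
    # Ceiling division: the last wake-up may handle a partial batch.
    return (queue_length + effective - 1) // effective


# Blank Max Records field with 1500 queued items:
print(wake_ups_to_drain(1500, None, blank_default=50))  # 30 (7.2 help text)
print(wake_ups_to_drain(1500, None, blank_default=0))   # 1  (7.2.2 help text)
```

With the 7.2.2 default, the whole backlog belongs to one session, which fits what we saw: the 120-second interval never came into play, and interrupting that one requestor affected all 1500 items instead of only 50.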