A key challenge you will face in Splunk is grouping alerts/exceptions by correlation, so you can assess the unique issues you face and the frequency/cost of each issue. There is logic in AES to uniquely assess and persist the correlation string for each alert type.
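To make the grouping problem concrete, here is a minimal sketch of the idea in Python. The alert records, field names, and hashing choice are all assumptions for illustration, not AES or PDC internals: each alert is reduced to a correlation string and hashed, so counting by that hash yields the unique issues and their frequencies.

```python
from collections import Counter
import hashlib

# Hypothetical alert records, as you might extract them from Pega
# alert-log lines indexed in Splunk (field names are assumptions).
alerts = [
    {"alert_type": "PEGA0005", "detail": "SELECT ... FROM pr_data"},
    {"alert_type": "PEGA0005", "detail": "SELECT ... FROM pr_data"},
    {"alert_type": "PEGA0001", "detail": "MyCase.Step3"},
]

def correlation_id(alert):
    # Build a correlation string from the fields that identify the issue,
    # then hash it so related alerts share one stable ID.
    key = f'{alert["alert_type"]}|{alert["detail"]}'
    return hashlib.sha256(key.encode()).hexdigest()[:16]

counts = Counter(correlation_id(a) for a in alerts)
# counts maps each unique issue to how often it occurred
```

The hard part, which this sketch glosses over, is choosing per-alert-type fields for the correlation string so that alerts with the same root cause collapse together; that per-type logic is exactly what AES/PDC already maintain.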
Broader question - why do you want/need to recreate AES in Splunk rather than simply using AES as-is -- or, even simpler, just use the PDC service? It frees you from paying for AES infrastructure, gets new features and improved advice frequently, and has a product manager (yours truly) who is happy to get feedback and make enhancements based on user input.
I logged into the Anthem tenant and only see partial data (alerts, but no health status) from one node. How many nodes should have been sending to PDC/AES? Are you using the standard Pega 7 logging configuration, or has it been customized in any way?
On the monitored node, enable debug logging for the classes httpclient.wire.header and httpclient.wire.content to see what it is sending to PDC and whether it gets a proper HTTP response code. It is not uncommon to need a network admin's assistance to open outgoing HTTPS to pdc-external.pegacloud.com.
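If you manage logging through the Log4j configuration file rather than the Logging Level Settings landing page, the change would look roughly like this (a sketch against a Log4j 2 style config; adjust to match your actual prlog4j2 file and appender names):

```xml
<!-- Inside the <Loggers> section: trace the HTTP traffic to PDC -->
<Logger name="httpclient.wire.header" level="debug"/>
<Logger name="httpclient.wire.content" level="debug"/>
```

Remember to revert these after troubleshooting; wire-level logging is verbose and will log request/response bodies.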
Well, there is no real 'query list'; we run lots of queries depending on the report. The key point is that PDC has logic to recognize each alert type and define the proper way to correlate related alerts into a case. We use that to create a correlation string and hash it into a correlation ID. That way, when agents look at newly received alerts, they either match the correlation ID of an existing case or a new case gets created. We've also done work to ensure that the correlation relates to the root cause -- primarily with regard to assessing whether a query comes from a report, an RDB- method, or an Obj- method.
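The match-or-create step described above can be sketched as follows. This is a simplified illustration, not PDC's implementation: the field names, the in-memory case store, and the use of MD5 are all assumptions, and the "source" field stands in for the report/RDB-/Obj- distinction mentioned above.

```python
import hashlib

cases = {}  # correlation ID -> case record (PDC would persist these)

def correlation_string(alert):
    # Assumption for this sketch: a query alert correlates on its
    # origin (a report, an RDB- method, or an Obj- method) plus the
    # statement, so related alerts collapse to one root cause.
    return f'{alert["type"]}|{alert["source"]}|{alert["statement"]}'

def ingest(alert):
    cid = hashlib.md5(correlation_string(alert).encode()).hexdigest()
    case = cases.get(cid)
    if case is None:
        # No case matches this correlation ID: open a new one.
        case = {"id": cid, "count": 0}
        cases[cid] = case
    # Either way, the alert is attached to the case it matched.
    case["count"] += 1
    return case

ingest({"type": "PEGA0005", "source": "RDB-List", "statement": "SELECT ..."})
ingest({"type": "PEGA0005", "source": "RDB-List", "statement": "SELECT ..."})
# both alerts land on the same case, since they share a correlation string
```

Hashing the correlation string gives a fixed-width key to index on, so matching a new alert against thousands of open cases is a single lookup rather than a field-by-field comparison.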
We're now going to start adding logic to recognize certain exceptions and give prescriptive advice on how to fix them.