Posted: 12 Jul 2018 15:02 EDT Last activity: 2 Aug 2018 13:00 EDT
Pega Adaptive Snapshot
I would like to understand the internal architecture of adaptive models. Pega is currently uploading all the responses into adaptive models, and it creates a new snapshot record in Data-Decision-ADM-ModelSnapshot every single day. What is the purpose of this class? Why doesn't it create a new entry after a single run, or at least after every single update? Is there any reason for this?
We have adaptive models that are trained every day on web responses, using a delayed learning architecture. As I understand it, Pega doesn't store the individual responses in the adaptive models; it calculates the outcomes using ML and adds one record as a snapshot entry. What happens if the ADMSnapshot agent is down on one particular day? How can we recover the past responses? How do we know which responses it has processed up to that time? Is there any way to do disaster recovery? Thanks in advance.
So reporting - taking snapshots to enable the model reports you view in the Analytics Center - is separate from the process of training the models with responses.
With the latter, from 7.3 onwards, we store all responses that arrive and then, when we've received a model rule-specific number of them, an ADM Service node will apply all those responses to the relevant model, increasing its learning.
When this is complete, the entire new state of the model is saved to the SQL database in data.pr_data_adm_factory. We also build a smaller scoring model, which is distributed around the cluster and used to make decisions (i.e. when a model rule is executed in a Make Decision strategy).
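To make the flow above concrete, here is a toy sketch of buffered learning: responses are stored, applied as a batch once a rule-specific threshold is reached, and the full state is then persisted alongside a smaller scoring artefact. This is not Pega code; `ToyModel`, `update_threshold`, `factory_store` and the smoothed success rate are all hypothetical names and simplifications chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for an adaptive model; not a Pega class."""
    update_threshold: int                               # assumed rule-specific batch size
    positives: int = 0
    negatives: int = 0
    pending: list = field(default_factory=list)         # stored, not yet applied
    factory_store: dict = field(default_factory=dict)   # stands in for the SQL factory table

    def record_response(self, positive: bool) -> bool:
        """Store a response; apply the whole batch once the threshold is reached."""
        self.pending.append(positive)
        if len(self.pending) < self.update_threshold:
            return False
        self._apply_batch()
        return True

    def _apply_batch(self) -> None:
        for outcome in self.pending:
            if outcome:
                self.positives += 1
            else:
                self.negatives += 1
        self.pending.clear()
        # Persist the entire new state after the update completes.
        self.factory_store["state"] = {
            "positives": self.positives,
            "negatives": self.negatives,
        }

    def scoring_model(self) -> float:
        """Small derived value used at decision time (here a smoothed success rate)."""
        return (self.positives + 1) / (self.positives + self.negatives + 2)
```

In this sketch nothing is written to `factory_store` until the threshold is hit, which mirrors the point that the saved state only changes when a batch of responses has been applied.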
Reporting does not interfere with the learning process above. The snapshot agent will look at the factory for each model and create a snapshot of the information: the model state at that moment in time. This is ONLY used in the reports of the Analytics Center, not for learning or any data recovery.
It is creating a new snapshot record in the Data-Decision-ADM-ModelSnapshot for every single day. What is the purpose of this class?
The class is the data model underlying each row of the table on the Model Overview page of the Analytics Center (the first page you see in the Monitoring tab when you open an Adaptive Model there). It stores model-level information so users can see how well their model is performing.
Why doesn't it create a new entry after a single run, or at least after every single update? Is there any reason for this?
Both of these are great ideas and are already on our product roadmap. Having the snapshot agent run at arbitrary intervals is a leftover from releases prior to 7.3, when models were updated in memory on a separate server as each response arrived, meaning there was no 'best' time to take snapshots.
What happens if the ADMSnapshot agent is down on one particular day?
The training and use of your models to make decisions will not be affected. If you are storing historical report data (configured via 'Edit Settings' on the ADM Service page), then you will be missing report data for that particular day. If you've chosen just to store the most recent reporting snapshot, your reports will not be updated with that day - you'll just continue to see the previous day's info.
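The two reporting modes described above can be illustrated with a small sketch. It only models the reporting store, not learning, and `take_snapshot` and `keep_history` are hypothetical names rather than Pega settings or APIs.

```python
from datetime import date


def take_snapshot(store: dict, day: date, model_state: dict, keep_history: bool) -> None:
    """Copy the current model state into the reporting store (hypothetical helper)."""
    if not keep_history:
        store.clear()              # only the most recent snapshot is retained
    store[day] = dict(model_state)


# The agent runs on 10 July, is down on 11 July, and runs again on 12 July.
history = {}
take_snapshot(history, date(2018, 7, 10), {"responses": 100}, keep_history=True)
take_snapshot(history, date(2018, 7, 12), {"responses": 300}, keep_history=True)

latest_only = {}
take_snapshot(latest_only, date(2018, 7, 10), {"responses": 100}, keep_history=False)
take_snapshot(latest_only, date(2018, 7, 12), {"responses": 300}, keep_history=False)
```

With history enabled, the store has entries for 10 and 12 July and a gap for 11 July. With only the latest snapshot kept, the report would simply have shown 10 July until the agent ran again.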
How can we recover the past responses?
Responses used for learning will only be lost if there is no update of a model within 48 hours, and they cannot be recovered. If you're referring to the scenario above, where the snapshot agent misses a day, this will not result in responses being lost. However, you cannot 'roll back' the state of your model factory and take a snapshot of the missed day for reporting purposes, so there is no way to 'backfill' the reporting data you've missed.
How do we know which responses it has processed till that time?
The model instances under a rule are usually differentiated by their combination of Issue, Group, Name, Direction and Channel values. From 7.3, you can see how many positive and negative responses each of these model instances has processed on both the Adaptive Model management page within Designer Studio and in the model report in the Analytics Center.
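As a rough illustration of what those per-instance counts represent, the sketch below tallies positive and negative responses by the five context values. `ModelKey`, `count_responses` and the sample values are hypothetical; the real counts are maintained by the ADM service, not computed like this by the user.

```python
from collections import Counter, namedtuple

# The five values that usually distinguish model instances under one rule.
ModelKey = namedtuple("ModelKey", ["issue", "group", "name", "direction", "channel"])


def count_responses(responses):
    """Tally (key, positive) pairs into per-instance positive/negative counts."""
    counts = {}
    for key, positive in responses:
        tally = counts.setdefault(key, Counter())
        tally["positive" if positive else "negative"] += 1
    return counts


# Made-up sample data for illustration only.
web_offer = ModelKey("Sales", "Cards", "GoldCard", "Inbound", "Web")
email_offer = ModelKey("Sales", "Cards", "GoldCard", "Outbound", "Email")
counts = count_responses([
    (web_offer, True),
    (web_offer, False),
    (web_offer, False),
    (email_offer, True),
])
```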
Is there any way to do a disaster recovery?
Responses and scoring models will be replicated across up to 3 of your DDS nodes by default, so multiple of these nodes would have to have their storage destroyed to lose all replicas of that data. However, the factories in the SQL database are the ultimate source of truth for Adaptive Models, and are not replicated, so these should be backed up as with any other data in the Pega DB.
The worst case would be if you were unable to bring up an ADM Service node for more than 48 hours: any unprocessed responses older than that would be lost when a service node does come up. If this is deemed unacceptable, you could back up your DDS data and replay the responses in another way at a later date, although this is currently unsupported OOTB.
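A minimal sketch of the 48-hour window, assuming each stored response carries a timestamp and that anything older than the window is dropped when a service node starts. The function name, the tuple layout and the exact boundary handling are assumptions for illustration, not Pega's actual mechanism.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(hours=48)   # window quoted in the answer above


def split_on_startup(responses, now):
    """Split (timestamp, payload) tuples into (processable, lost) lists."""
    cutoff = now - RETENTION
    processable = [r for r in responses if r[0] >= cutoff]
    lost = [r for r in responses if r[0] < cutoff]
    return processable, lost


# A service node comes back at noon on 15 July after a long outage.
now = datetime(2018, 7, 15, 12, 0)
stored = [
    (datetime(2018, 7, 12, 9, 0), "click"),      # older than 48 hours
    (datetime(2018, 7, 14, 8, 0), "no-click"),   # inside the window
    (datetime(2018, 7, 15, 11, 0), "click"),     # inside the window
]
processable, lost = split_on_startup(stored, now)
```

Backing up the stored responses before the window expires is what would make a later replay possible, subject to the caveat that replay is not supported out of the box.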
This is a great response. We are creating a new simulation environment for champion/challenger efforts and further analytic investigations, running multiple "what if" scenarios on it.
As part of this effort we need to sync the ADM learnings of the simulation environment with the production environment on an ad hoc basis, before we run our "what if" scenarios. Even though we cannot sync with the EXACT state of ADM learnings (due to in-memory operations), we want to bring it as close as possible to the production learnings.
After looking at all the existing ADM tables, the following appear to be all of the ADM tables in our database (we are using Pega 7.3). Out of all these tables, would it be okay if we just transferred the data of the "PR_DATA_ADM_FACTORY" table from production to simulation?
Or do you suggest we copy the data from all of the following tables (or any other tables that we did not mention here)?