Creating a workflow template for Predictive Analytics in PEGA
I went through the tutorial on 'Building Predictive Models' and tried to create a predictive model in my Prediction/Decision Studio. However, some of the options and features in the default workflow confuse me. For example, I tried solving the standard 'Loan Prediction Problem', and at the end I saw a chart that looks like k-fold cross-validation, whereas I was expecting a confusion matrix.
Can somebody tell me, or point me to, how to create a customized template/workflow for prediction cases?
Thanks for your question. It sounds like you have a good data science background.
In a nutshell, we don't actually expose the underlying pipeline, because it is meant to be a 'model factory' rather than a 'model laboratory'. It therefore follows a best-practice pipeline, similar to other tools on the market that offer a high degree of automation.
The pipeline is roughly as follows:
Specify the goals (binary scoring, numerical regression), define the population and outcomes, and specify holdout validation schemes (not n-fold).
Automatically prepare all predictors (numerical binning, symbolic grouping) and measure the univariate performance of each.
Perform subset feature selection: group the predictors based on cross-correlation and sort them by performance within each group.
Automatically build regression and decision tree models, with the option to build additional models, including genetic programming.
Evaluate the models, first at the score level (lift charts, discrimination/ROC). Then calibrate the scores by binning them into classes and calculating the 'true' probabilities.
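To make the evaluation steps more concrete, and to show where a confusion matrix would fit in, here is a minimal sketch in plain Python. It is not the product's internal code: the function names, the toy scores and labels, and the 0.5 threshold are all made up for illustration. It shows a single holdout split, a ROC/AUC measure at the score level, calibration by binning scores into classes, and a confusion matrix derived from the same scores.

```python
import random

def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle once and split into (train, holdout); no n-fold rotation."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_holdout
    return shuffled[:cut], shuffled[cut:]

def auc(scores, labels):
    """ROC AUC: chance that a random positive outscores a random negative.
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def calibrate_by_binning(scores, labels, n_bins=4):
    """Sort by score, cut into roughly equal-sized classes, and return
    (lowest score, highest score, observed positive rate) for each class."""
    ranked = sorted(zip(scores, labels))
    n = len(ranked)
    classes = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        rate = sum(y for _, y in chunk) / len(chunk)
        classes.append((chunk[0][0], chunk[-1][0], rate))
    return classes

def confusion_matrix(scores, labels, threshold):
    """Count true/false positives/negatives when score >= threshold means 'positive'."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for s, y in zip(scores, labels):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            counts["tp"] += 1
        elif predicted == 1 and y == 0:
            counts["fp"] += 1
        elif predicted == 0 and y == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts

# Toy holdout scores and outcomes (invented for this example).
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

print("AUC:", auc(scores, labels))
print("Calibrated classes:", calibrate_by_binning(scores, labels, n_bins=4))
print("Confusion matrix at 0.5:", confusion_matrix(scores, labels, 0.5))
```

Note that the confusion matrix depends on the chosen threshold, while the score-level measures (lift, ROC) do not, which is one reason an automated pipeline might report the latter first.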