Can we restart the BIX process if a failure happens, resuming from the point where it initially failed?
We're using BIX to write data into a DB table for reporting purposes in our application. On the extract rule's filter criteria we've enabled only the "Use last updated time as start" checkbox; there are no other filters on the extract rule.
We've configured the extract rule in a Job Scheduler through the pxExtractDataWithArgs OOTB activity. It runs every night at 12 AM.
Occasionally the BIX process fails mid-run. Say the criteria returns 5,000 records and the system manages to insert 2,500 of them (50% of the whole) before BIX fails. Is there a way to restart BIX from where it failed, i.e., from record 2,501? Currently, whenever there is a failure, we troubleshoot the issue and re-run the extract, which starts inserting from the first record again. This duplicates the first 50% of the records, because they were already inserted into the table during the initial, failed execution.
Because of this duplication, and because we cannot restart from the failure point, we have manual work to eliminate the duplicate records from the target DB table.
I'd like to know if there is any way to avoid this, and whether there is an option to restart.
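For context, the manual clean-up we do today amounts to keeping one copy of each duplicated row. A minimal sketch of that step, using Python's sqlite3 as a stand-in for the MSSQL target table (the table and column names here are invented for illustration):

```python
import sqlite3

# Hypothetical target table where a failed run followed by a full re-run
# has left duplicate rows for the same pzInsKey.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (pzInsKey TEXT)")
conn.executemany(
    "INSERT INTO target VALUES (?)",
    [("O-1",), ("O-2",), ("O-1",), ("O-2",), ("O-3",)],  # O-1, O-2 duplicated
)

# Keep the first physical row per pzInsKey; delete the re-inserted copies.
conn.execute("""
    DELETE FROM target
    WHERE rowid NOT IN (SELECT MIN(rowid) FROM target GROUP BY pzInsKey)
""")
remaining = sorted(r[0] for r in conn.execute("SELECT pzInsKey FROM target"))
print(remaining)  # ['O-1', 'O-2', 'O-3']
```

On SQL Server there is no rowid, so the usual equivalent is a CTE with ROW_NUMBER() OVER (PARTITION BY the key column) and deleting rows where the row number is greater than 1; the idea is the same.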
Version details:
- Pega: 8.3.0
- BIX: 8.3
- DB (vendor type): MSSQL
- Server: hosted on Tomcat (on-premises)
For BIX this is the expected behaviour: if there is an issue, the next run will extract from the last successful extract date/time. However, I am unsure why it would create duplicate records; ideally it should overwrite the existing record rather than create another one.
In our instance, whenever a failure occurred (server issue, data issue, etc.) and we re-ran the extract, it did re-insert all the records that had already been inserted in the first, failed run, even though none of those cases had been updated in the portal.
Enable Info logs and run the extract. The logs will contain the pzInsKeys that were successfully inserted; alternatively, query the target DB table ordered by the InsKey column descending — the highest value is the last record inserted. Then update your extract filter to "pzInsKey greater than" that last inserted InsKey, check the "Skip standard filters" option, and uncheck "Use last updated time as start". When you now run the extract, it will start extracting from where it failed, which also makes it easier to figure out the issue.
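The resume logic described above can be sketched as follows, again using sqlite3 as a stand-in for the real work and target tables (table names and the ORG-WORK key format are invented for illustration; real pzInsKey values are longer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (pzInsKey TEXT)")  # stands in for the Pega work table
conn.execute("CREATE TABLE target (pzInsKey TEXT)")  # stands in for the BIX target table

all_keys = [f"ORG-WORK O-{n}" for n in range(11, 16)]  # O-11 .. O-15
conn.executemany("INSERT INTO source VALUES (?)", [(k,) for k in all_keys])
# Simulate a failure after the first three rows reached the target table.
conn.executemany("INSERT INTO target VALUES (?)", [(k,) for k in all_keys[:3]])

# Highest key in the target = last record inserted before the failure.
last_key = conn.execute(
    "SELECT pzInsKey FROM target ORDER BY pzInsKey DESC LIMIT 1"
).fetchone()[0]

# Re-run filter "pzInsKey > last_key": only rows after the failure point,
# so nothing already in the target is inserted again.
remaining = [row[0] for row in conn.execute(
    "SELECT pzInsKey FROM source WHERE pzInsKey > ? ORDER BY pzInsKey",
    (last_key,),
)]
print(last_key)   # ORG-WORK O-13
print(remaining)  # ['ORG-WORK O-14', 'ORG-WORK O-15']
```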
Generally, on every successful run BIX updates the pylastupdatedate column in the pr_extract_time table. If a failure occurs in the middle, there may have been an issue with the node during the extraction. If there is a data issue, the extraction will still report success, but the ins keys that had problems will fail and be written to the logs.
Yes — by looking at the BIX logs you should be able to determine the last pzInsKey processed in that batch. Take that key and add a "pzInsKey is greater than" condition to the extract rule's filter criteria, un-checking "Use last updated time as start" and checking "Skip standard filters". This restarts the process where it failed and eliminates duplicate inserts for the same work items.
The problem I see here: say 10 work items (O-11 to O-20) match the extract criteria, and the process fails after 5 successful records, at O-16. From the logs we take O-15 as the last successful record and add a "greater than O-15" filter. But what if the batch also includes a work item created last week, say O-5, which a user has just updated and resolved? Its pzInsKey is out of sequence, yet it still matches the BIX criteria because of the last-updated-time filter. With the "greater than O-15" filter, there is a real possibility that we miss records like that.
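A small sketch of this gap, with hypothetical case data (key format and field names invented for illustration): the key-based resume filter drops an old case that was just updated, even though the original "last updated time" criteria would have picked it up.

```python
from datetime import datetime

now = datetime(2023, 1, 10, 9, 0)   # arbitrary "today" for the example
cases = [  # cases updated since the last successful extract
    {"pzInsKey": "ORG-WORK O-05", "pxUpdateDateTime": now},  # created last week, resolved today
    {"pzInsKey": "ORG-WORK O-16", "pxUpdateDateTime": now},
    {"pzInsKey": "ORG-WORK O-17", "pxUpdateDateTime": now},
]

last_key = "ORG-WORK O-15"  # last key inserted before the failure

# Resume filter: pzInsKey > last_key. O-05 is silently dropped even though
# its recent update means it belongs in this extract.
resumed = [c["pzInsKey"] for c in cases if c["pzInsKey"] > last_key]
print(resumed)  # ['ORG-WORK O-16', 'ORG-WORK O-17']
```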
But this needs manual intervention in the target environment, and in a PRD environment we ideally shouldn't be allowed to make such changes. Is there a way to automate this without manual intervention — for example, some kind of re-queue mechanism?