Posted: 15 Apr 2016 14:05 EDT Last activity: 5 May 2016 17:35 EDT
Multi-tenant creation failure?
PRPC is reporting that a tenant failed to be created; however, SOAP UI shows the tenant was created successfully, and I can log in to the created tenant and perform the expected functions.
In Pega Academy, a tenant is created using PRPC 7.1.6. After the tenant is created, the second function is to asynchronously import a course RAP so that a student can log in and begin the course.
1) SOAP UI shows that the creation of P_RES_Fund7Ex12AM_611 was successful and it provided me with a tenant URL. I logged in and saw that the operator IDs necessary for the student to log in and start the course were present. (Only admin logins exist at the shared layer - the course operator IDs did not come from there.)
2) The Tenant Reserve database table shows the P_RES_Fund7Ex12AM_611 instance ID but shows the Status as 'Failed'. It also shows the request ID as 'null'. The request ID is the ID for the request to import the course RAP file. (The request ID for an import on a different instance is "Success".)
Where does the data that populates the Status column come from? Why is the request ID 'null' when the RAP file was clearly imported? I have checked the logs (attached), and they say that the RAP file was imported. In the logs I also see a request ID that is apparently part of the tenant creation process. However, I cannot conclusively prove that this CREATETENANT-1088 work object was for P_RES_Fund7Ex12AM_611. Querying the DB table with a request ID of CREATETENANT-1088 returns no results.
Our MT servers have a capacity of approximately 300 tenants. On paex12 we are seeing 260 tenants in a failed status; however, I don't believe that all of those are actually "Failed", and this one doesn't appear to be. I need to resolve this so that the server isn't overloaded with "Failed" tenants.
Did you run the SOAP service createtenant asynchronously? If so, try increasing the services/maxRequestorChildren setting (default 10); with the default, anything beyond 10 is ignored by the engine. I also saw some thread dumps in the log, which indicate that your system was not performing well. Consider adjusting agent/threadpoolsize as well (the default is only 5), since the import process uses batch requestors.
"PegaRULES-Batch-5" Id=58 in RUNNABLE (running in native)
This is similar to an issue resolved with HFix-27035 on a 7.1.7 system. The import process extracts the RAP file to a temp directory. When the Pega tracer or third-party tools like DynaTrace are running, the temp directory is prone to being cleaned up prematurely. The import process tracks whether the import is still "in progress" through a flag, and that flag can get set to false during serialization. The temp file cleanup can then occur before the batch requestor even kicks in to complete the import. The hotfix in 7.1.7 changed this so that whether an import is in progress is determined by whether the temp location still exists. The 7.1.7 issue was seen during the import process in Designer Studio. The logs show the H requestor doing the RAP explosion and the B requestor trying to access the files.
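To make the difference between the two checks concrete, here is a minimal Python sketch. It is not Pega's code and not the contents of HFix-27035; `ImportJob`, its methods, and the JSON serialization are all hypothetical stand-ins chosen only to show how an in-memory flag can be lost in a serialization round trip while a check on the temp directory is not.

```python
import json
import os
import shutil
import tempfile


class ImportJob:
    """Hypothetical stand-in for an import job handed to a batch requestor."""

    def __init__(self, temp_dir, in_progress=False):
        self.temp_dir = temp_dir
        # In-memory flag. Assumption for illustration: it is not part of
        # the serialized form, so it falls back to False after a round trip.
        self.in_progress = in_progress

    def serialize(self):
        # Only the temp directory path is written out; the flag is dropped.
        return json.dumps({"temp_dir": self.temp_dir})

    @classmethod
    def deserialize(cls, payload):
        return cls(**json.loads(payload))

    def in_progress_by_flag(self):
        # Flag-based check: unreliable once the job has been serialized.
        return self.in_progress

    def in_progress_by_dir(self):
        # Directory-based check: in progress while the extracted files exist.
        return os.path.isdir(self.temp_dir)


def demo():
    temp_dir = tempfile.mkdtemp(prefix="rap_import_")
    try:
        job = ImportJob(temp_dir, in_progress=True)
        restored = ImportJob.deserialize(job.serialize())
        while_extracted = (
            restored.in_progress_by_flag(),
            restored.in_progress_by_dir(),
        )
    finally:
        shutil.rmtree(temp_dir, ignore_errors=True)
    after_cleanup = restored.in_progress_by_dir()
    return while_extracted, after_cleanup
```

In this sketch the restored job reports "not in progress" by flag even though its files are still on disk, which is the window in which a cleanup could remove them. The directory check only flips once the files are actually gone.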
2016-01-11 15:46:40,279 [.PRPCWorkManager : 3] [ STANDARD] [ ] [ PORTAL:02.66.70] (internal.archive.InstancesFile) ERROR vpadm5 - Unable to read from the file.
java.io.FileNotFoundException: /amex/wspemea/PegaTempDir/RuleMgmt/HACC366CDB62456F879BE026BCB028778/TestId/rules/instances_00000.bin (No such file or directory)
The above issue, seen in 7.1.6, is also one where the batch requestor tries to operate on files that are gone. However, this scenario occurs in an MT environment, and the B requestor does everything.
2016-04-21 15:25:06,935 [ PegaRULES-Batch-4] [ STANDARD] [RES_Fund7Ex14AP_2030] [ PegaRULES:07.10] (tRAPs.Data_Admin_Tenant.Action) ERROR Administrator@pega.com - file://web:/StaticContent/global/ServiceExport/MultiTenant/TenantRAPs/Fund7 doesn't exist or is not a directory.
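For anyone trying to tie log entries like the ones above to a specific tenant, a small Python sketch along these lines may help. The field names (`thread`, `pega_thread`, `tenant`, `app`, and so on) are my own labels for the bracketed columns, not names taken from the thread, and the suffix match assumes the log may truncate long tenant names on the left, which is only a guess based on the sample line.

```python
import re

# Assumed layout: timestamp [thread] [pega thread] [tenant] [app] (logger) LEVEL user - message
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\[(?P<thread>[^\]]*)\]\s+"
    r"\[(?P<pega_thread>[^\]]*)\]\s+"
    r"\[(?P<tenant>[^\]]*)\]\s+"
    r"\[(?P<app>[^\]]*)\]\s+"
    r"\((?P<logger>[^)]*)\)\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<user>\S+)\s+-\s+"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return {name: value.strip() for name, value in match.groupdict().items()}


def errors_for_tenant(lines, tenant_name):
    """Collect ERROR entries whose tenant column matches tenant_name.

    The logged value is compared as a suffix of tenant_name, on the
    assumption that the log column may cut off leading characters.
    """
    found = []
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry["level"] != "ERROR":
            continue
        if entry["tenant"] and tenant_name.endswith(entry["tenant"]):
            found.append(entry)
    return found
```

Feeding the full log file through `errors_for_tenant(open(path), "P_RES_Fund7Ex12AM_611")` would list the batch-requestor errors recorded for that tenant, if any. Stack trace lines such as the `java.io.FileNotFoundException` above do not match the pattern and are skipped.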
During debugging of this issue via SOAP call (through SR-A23511), the issue itself disappeared. The agent that normally performs the import was restarted, and again the issue was not seen. This application will be observed for one week and then revisited to make sure the issue has not recurred. It is possible that a Pega tracer was running and caused the initially reported issue. We can easily verify this next week by starting the Pega tracer and observing the behavior once the agent runs. If warranted, the hotfix can be ported back to 7.1.6.