Public Safety Canada
Posted: 4 years ago
Last activity: 3 years 9 months ago

High-Availability with Single-Sign-On process

I have set up a two-node test environment (PRPC 7.1.8) with a load balancer (sticky sessions) and storage shared between the two nodes. The nodes are set up for single sign-on using a Custom Authentication Service (Kerberos via Tomcat). Currently, when Quiesce is used on one of the nodes, the user experience seems to be seamless: the user can basically keep clicking and continue working on the available node.

The problem I'm having is during an unexpected node crash. I can't seem to get the same or a similar user experience. The user can't continue to work and has to open a new session to the new server (e.g. close and re-open the browser). When the user tries to refresh in the same browser, I can see the session being created, but all I get back from the server is an HTTP 500 error. I verified that the problem is not on the container side, as I can see in the catalina log that the user is correctly authenticated on the working node.

My question is: what is the correct behavior during an unexpected node crash? What should the custom authentication service be doing?

