Our application runs on Tomcat and Postgres database. A failover testing was performed on Postgres db. Post failover, while working on the application we got the following error:
ERROR: cannot execute INSERT in a read-only transaction Call getNextException to see other errors in the batch. Problem #2, SQLState 25006, Error code 0: org.postgresql.util.PSQLException: ERROR: cannot execute INSERT in a read-only transaction Problem #3, SQLState 25006, Error code 0: org.postgresql.util.PSQLException: ERROR: cannot execute INSERT in a read-only transaction
We are using the following configuration for the Pega DB datasource:
The error behavior is not consitent. Sometimes after 5-10 minutes, Pega is able to resolve the change in IP configuration at the database end and works fine. Other times, it is not able to determine the change endlessly and keep on hitting the secondary database instead of the primary.
A separate Java application has not faced these issues even though they are also using exactly same data source settings as well.
Has anybody else faced this issue? Any directions into which we should investigate further?.
***Edited by Moderator Marissa to update SR Details***
When a DB failover is performed, the primary instance stops responding to packets as soon as the failover is initiated. All the connections are therefore will be stuck in ESTABLISHED state and all the query executions for the running transactions in the primary DB at the time of failover fails when the system tries to execute a query on those transactions again.
This is because the running transactions does not get transferred to the secondary DB as it is and the some data gets lost leaving that transaction in an incomplete state. If a system tries to execute any query on the objects where those transactions were operating, it will throw an error as you see. Once the keep alive time has expired and such inconsistent transactions are removed by the system, the errors are not observed anymore.
If you have heard about Oracle Golden Gate, then you will know that its architecture allows you to transfer continuous real time running transactions smoothly over to a secondary databases but you have to pay for it.
Since Postgres is open source, it doesn't allow such functionalities and you will have to deal with it on your own. Try modifying the sysctl configuration with the following properties and setting them to aggressive values until you find an optimal value. Below values are default and try with a value lesser than these.
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 60
Coming to the point where the other Java application doesn't see these issues the question you will want to ask yourself-
Is Pega and the java application are similar in the number of interactions being fired from the application to the DB at a given time? Pega is huge and DB interactions will be much high in number.
I understand your point on the running transaction. However, I also face issue while logging into a new session. Pega doesn't allow me to login (I am assuming that Pega also performs inserts as part of the basic login process as well).
I also assume that logging into a new session should be a new transaction in DB and hence the above situation should not happen.