Posted: 27 Jun 2016 20:09 EDT Last activity: 4 Oct 2018 13:54 EDT
Upgrading from Lucene to Elastic search indexing
We are currently on Pega 7.1.7 with 2 batch nodes and 4 user nodes. We have disabled Elastic Search indexing and are running the legacy Lucene indexing to a shared mount.
We have customised the search so that all nodes can search the indexes from the shared mount.
Now we are upgrading to Pega 7.1.9 and would like to update to Elastic search indexing.
We have a few concerns that need to be addressed:
We have around 2 million work objects, and our system needs to run 12 hours a day, 7 days a week, with an outage of only 8 hours, so we are looking for options on how to re-index.
Option 1: Run the re-indexing through one of the batch nodes. Please confirm whether it is acceptable for the re-indexing to run during business hours when users are on the system. We could also increase the maxnumworkers setting to 10 for this run.
Option 2: Initiate the re-indexing as a batch from a local desktop connected to the production database, with an increased number of threads. We are not sure whether this can run during business hours when users are on the system. Please confirm.
Please let us know if there is a better way of doing this.
Note that a large number of threads may not give you much benefit, because everything still has to be read from the DB. You might see a significant time reduction with up to 3 or 4 threads, but beyond that the improvement flattens out.
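To illustrate why the gain flattens out, here is a minimal sketch using Amdahl's law as a stand-in model. It is not how Pega measures or schedules indexing; the function name and the `serial_fraction` value of 0.25 (the share of the work assumed to be bound by sequential DB reads) are assumptions chosen only for illustration.

```python
def estimated_speedup(threads: int, serial_fraction: float) -> float:
    """Amdahl's-law estimate of speedup for a job with a fixed serial share.

    serial_fraction is the (assumed) part of the run that cannot be
    parallelised, e.g. time spent waiting on database reads.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


if __name__ == "__main__":
    # Hypothetical serial share; measure your own system before relying on it.
    assumed_serial_fraction = 0.25
    for n in (1, 2, 4, 8, 10):
        print(n, round(estimated_speedup(n, assumed_serial_fraction), 2))
```

Under this assumed serial share, going from 1 to 4 threads roughly doubles throughput, while going from 4 to 10 adds comparatively little, which matches the pattern described above.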
As Swati mentioned, it is advisable to run this outside business hours. Since we read a lot of data from the database to index, there will be some impact on users.
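A quick way to judge whether the job fits into the off-hours window is to work out the indexing rate it would need. The sketch below uses the figures from the question (about 2 million work objects, an 8-hour outage); the function names and the sample rate of 100 objects per second are hypothetical, not measured Pega numbers.

```python
def required_rate(total_objects: int, window_hours: float) -> float:
    """Objects per second needed to index everything within the window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return total_objects / (window_hours * 3600)


def hours_needed(total_objects: int, objects_per_second: float) -> float:
    """Hours a full re-index would take at a given sustained rate."""
    if objects_per_second <= 0:
        raise ValueError("objects_per_second must be positive")
    return total_objects / objects_per_second / 3600


if __name__ == "__main__":
    work_objects = 2_000_000   # from the question
    outage_hours = 8           # from the question
    print(f"required rate: {required_rate(work_objects, outage_hours):.1f} objects/s")
    # 100 objects/s is an illustrative guess; time a sample run to get a real figure.
    print(f"at 100 objects/s: {hours_needed(work_objects, 100):.1f} hours")
```

With these inputs the re-index would need to sustain roughly 69 objects per second to finish inside the 8-hour window. Timing a re-index of a small sample first would show whether that is realistic on your hardware.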
If there are 9 nodes and 2 of them are index host nodes, how will the other 7 nodes communicate with them? Which IP do they use: is it pyClusterAddress or something else?
Note that Elastic Search uses the port range 9300-9399 to communicate between the different nodes. All nodes communicate with the index host nodes identified on the search landing page. The actual IP address and port number used by each node for this communication are listed in the pyIndexerAddress column of the pr_sys_statusnodes table. Note that the pyClusterAddress column is used by Hazelcast, not by Elastic Search.
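If you want to verify connectivity between nodes, a small check like the following may help. It is only a sketch: it assumes the pyIndexerAddress value has the form `host:port`, which you should confirm against your own pr_sys_statusnodes rows, and all function names here are hypothetical helpers rather than anything Pega ships.

```python
import socket
from typing import Tuple

# Port range quoted in the reply above (9300 to 9399 inclusive).
ES_TRANSPORT_PORTS = range(9300, 9400)


def parse_indexer_address(address: str) -> Tuple[str, int]:
    """Split an assumed 'host:port' string into its two parts."""
    host, _, port = address.strip().rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {address!r}")
    return host, int(port)


def in_transport_range(address: str) -> bool:
    """True if the port in the address falls inside 9300-9399."""
    _, port = parse_indexer_address(address)
    return port in ES_TRANSPORT_PORTS


def is_reachable(address: str, timeout: float = 2.0) -> bool:
    """Try a plain TCP connect to the address; False on any socket error."""
    host, port = parse_indexer_address(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder value; substitute an address copied from pr_sys_statusnodes.
    sample = "10.0.0.5:9300"
    print(sample, "in range:", in_transport_range(sample))
    print(sample, "reachable:", is_reachable(sample))
```

Run it from each non-host node against the addresses of the index host nodes. A failed connect usually points to a firewall rule that does not cover the whole port range.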
Rajiv, one question: in all versions shipped after 7.1.7, does Pega enable Elastic Search by default (and disable or bypass Lucene search)?
For a fresh installation, Elastic Search is enabled by default. For an upgrade, any Lucene indices present prior to the upgrade will continue to be used until the user re-indexes from the search landing page. Once all enabled indices on the search landing page have been re-indexed, the platform automatically switches over to Elastic Search. Please refer to the upgrade guide for details.