Posted: 19 May 2016 18:47 EDT Last activity: 8 Aug 2016 1:58 EDT
Does Elastic Search support high volumes of production data?
We are working on a call center application that is being upgraded (to V7.1.9). Our application has a huge volume of data (terabytes) and 50-60 billion records in production, so we are wary of enabling ES for work objects because of unknown performance issues. Can someone tell me whether ES has any volume restriction, or whether it supports data volumes like ours? What precautions should we take when enabling ES in our prod environment?
Note: We disabled searching of work items in prod in the current V62 environment (Lucene) due to some performance issues seen earlier.
In version 6.2, the Pega platform used Apache Lucene 3.0.0, which did have memory issues. These were later fixed in Apache Lucene 3.5.0, which was used in version 6.3 of the Pega platform. You can find the details of the performance issue here - https://issues.apache.org/jira/browse/LUCENE-2205
That said, Elastic Search, introduced in version 7.1.7 of the Pega platform, uses Apache Lucene 4.6 underneath, so performance is much better.
As I understand, the data is huge. How often does this data change? How does your database cope with this volume? Do you have an archive / purge strategy? Do you need to search on all this data all the time?
How often does this data change? Once the cases are resolved, the data does not change at all. For assignment errors, support team usually resolves these manually.
How does your database cope with this volume? Oracle MA, Oracle RAC architecture. DB utilization under the current load (without Elastic Search functionality) is about 40-50%. Keep in mind that DB utilization is much higher during peak season (October to the end of January).
Do you have an archive / purge strategy? All cases older than 1 year are moved to the Archived DB and should not be needed in the indexes.
Do you need to search on all this data all the time? As per the use case: “Configure the ‘Quick Search’ feature to have the following pre-set search features to match on: Case-ID, Member-Name, Member ID, and User Name (case search only; other OOTB searches will be removed).”
Our current Archive and Search Wizard does not remove search index entries for instances that are purged from the database, so the developer / administrator has to do this themselves.
If you are going to search using "Case-ID, Member-Name, Member ID and User Name", then these searches can be done via database queries as well. Elastic Search is useful when you need the power of full text search, where you are not sure which field in the instance contains the string you are searching for.
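As a rough illustration of the database-query alternative, here is a minimal sketch of exact-match lookups on known fields. It uses an in-memory SQLite table purely for demonstration; the table name `work_items`, its columns, and the sample rows are hypothetical and do not reflect the actual Pega work table layout or the Oracle setup described in this thread.

```python
import sqlite3

# Hypothetical schema for illustration only; not the real Pega work table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work_items ("
    "case_id TEXT PRIMARY KEY, member_id TEXT, member_name TEXT, user_name TEXT)"
)
# An index on each searchable column keeps exact-match lookups cheap.
conn.execute("CREATE INDEX idx_member_id ON work_items (member_id)")
conn.executemany(
    "INSERT INTO work_items VALUES (?, ?, ?, ?)",
    [
        ("S-1001", "M-42", "Jane Doe", "agent01"),
        ("S-1002", "M-42", "Jane Doe", "agent02"),
        ("S-1003", "M-77", "John Roe", "agent01"),
    ],
)

SEARCHABLE_FIELDS = {"case_id", "member_id", "member_name", "user_name"}


def find_cases(field, value):
    """Return case IDs whose given field exactly matches the value."""
    # Column names cannot be bound as parameters, so validate against a whitelist.
    if field not in SEARCHABLE_FIELDS:
        raise ValueError(f"unsupported search field: {field}")
    rows = conn.execute(
        f"SELECT case_id FROM work_items WHERE {field} = ? ORDER BY case_id",
        (value,),
    )
    return [row[0] for row in rows]
```

For example, `find_cases("member_id", "M-42")` returns the two cases for that member. Because the caller always knows which field is being matched, no full text index is needed for this kind of search.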
Bigger question - what is the use case and business value for full text search in this call center?
Most high volume call centers have very clearly defined access paths to old calls / cases - search by ref #/caseID, search by account / member / subscriber / provider, search by last name + date
Keep in mind that enabling full text search adds five database transactions every time a work object / case is saved -- write to the ftsindexer queue, select from ftsindexer queue, update ftsindexer queue, read work, delete from ftsindexer queue. Regardless of index scalability, DB load is going to increase. If the system is truly very large, that may be an issue.
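To get a feel for the added load, here is a back-of-the-envelope sketch. The factor of five comes from the description above; the daily save volume and the length of the busy window are assumptions you would replace with your own numbers.

```python
# Five extra DB operations per save, as listed above: queue insert, queue select,
# queue update, work read, queue delete.
FTS_DB_OPS_PER_SAVE = 5


def extra_db_ops_per_day(saves_per_day, ops_per_save=FTS_DB_OPS_PER_SAVE):
    """Additional DB operations per day caused by full text indexing."""
    return saves_per_day * ops_per_save


def extra_db_ops_per_second(saves_per_day, busy_hours=10):
    """Average extra DB operations per second over an assumed busy window."""
    return extra_db_ops_per_day(saves_per_day) / (busy_hours * 3600)
```

With a hypothetical 500,000 saves per day, `extra_db_ops_per_day(500_000)` gives 2,500,000 additional operations, or roughly 69 per second averaged over a 10-hour window. Peak rates will be higher than the average.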
We do have several customer service / customer support systems that run with text search enabled (gcssupport is not built on customer service/cpm rules but is used for that role) but they are 'high touch' and not 'high volume'. How would you characterize your application?
1. What is the use case and business value for full text search in this call center?
Business was told that this is OOTB capability at the time (by Pega consultants).
6. We do have several customer service / customer support systems that run with text search enabled (gcssupport is not built on customer service/cpm rules but is used for that role) but they are 'high touch' and not 'high volume'. How would you characterize your application?
What's the nature of this customer's product / service?
What are the customer's data retention requirements?
If you processed 145k calls a day, with two service intents per call, that would be 435k work items per day. Assuming 260 work days per year and 2 years retention I get 226 million work items to search. Where did you get 25.8 million?
What are the logical ways in which one would be searching hundreds of millions of cases?
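The volume estimate in the questions above can be reproduced with a short calculation. Note that 435k per day is three times 145k, so this sketch assumes three work items per call (for example, one interaction plus two service intents); that reading is an assumption, not something stated in the thread.

```python
def searchable_work_items(items_per_day, work_days_per_year, retention_years):
    """Number of work items that remain searchable under a retention period."""
    return items_per_day * work_days_per_year * retention_years


calls_per_day = 145_000
items_per_call = 3  # assumption: 435k / 145k, e.g. one interaction plus two intents
items_per_day = calls_per_day * items_per_call

two_year_total = searchable_work_items(items_per_day, 260, 2)
one_year_total = searchable_work_items(items_per_day, 260, 1)
```

Here `two_year_total` is 226,200,000, matching the roughly 226 million quoted. With the one-year archive policy mentioned earlier in the thread, `one_year_total` is 113,100,000.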
What's the nature of this customer's product / service? – To resolve customer-specific issues and concerns using this app.
What are the customer's data retention requirements? – Not sure on this.
If you processed 145k calls a day, with two service intents per call, that would be 435k work items per day. Assuming 260 work days per year and 2 years retention I get 226 million work items to search. Where did you get 25.8 million? – 25.8M transactions include all the User hits between app server and User sessions; it does not mean we are creating that many intent Work objects (S-Cases).
What are the logical ways in which one would be searching hundreds of millions of cases? – Work ID, Member/Cust. ID mainly
With the move to Pega 7, the business is asking for some search features in the new app, but for now this feature is disabled in V62 to avoid performance issues while searching the data (I- or S- cases). By the way, our team contacted GCS about this issue earlier, and based on that we have kept this feature disabled until now. I don't remember the SR# for that earlier contact with GCS.
We had an internal discussion on enabling the ES feature in our prod environment. Some interesting points came up that we would like to share with you.
1) Pega enhanced Elastic Search in PRPC 7.x using a third-party driver, Hazelcast, which supports the Hazelcast cache mechanism.
2) But the implementation of the Hazelcast cache mechanism in PRPC had a bug where it created hung threads by scanning JVM nodes on different hosts as part of the bootstrap process, etc. It even drove up CPU utilization and network traffic.
3) One of our team members mentioned that a hotfix, HFIX-21219, was provided to resolve some performance issues, but we are not sure whether the issues mentioned above are still present in V7.1.9.
Based on the above points, can anyone advise us on what needs to be done?
I am not sure I understand point 1 and 2. Elastic search is meant to provide distributed search and failover. Hazelcast is meant to provide distributed cache / remote execution facility in the product. While both are distributed technologies, the purpose is completely different.
This issue with Elasticsearch was fixed in 7.1.8 and backported to 7.1.7 via HFIX-21219. So 7.1.9 does have the fix.
Sorry for the late reply. As per your last reply, we are planning to enable ES in our prod env, but I would like your recommendation. Can we go ahead and enable it in prod given our current situation? If yes, we will need Pega support for any issues or concerns after enabling it. Please let me know.
If you need the full text search capabilities on the cases that get created, then Elastic Search is the only way forward within the Pega 7 platform. As mentioned, if full text search is not needed, database queries can also work well.
Given the volumes you discussed, the following should be kept in mind before enabling it:
Turn off indexing on classes which don't need to be searched. This can be done by checking the option "Exclude this class from search" in the advanced tab of the class definition
Have a purge / archive mechanism which can clean up the instances in the Elastic Search index as well
Have ample resources for the index host nodes:
disk space should be up to 3 times the size of the initial index
enough memory for Elastic Search (you should use up to half of the available RAM for your heap, with a minimum of 8 GB)
Inter-node connectivity over the port range 9300-9399 should be free of network blips
At least 2 nodes should be configured as index host nodes to provide failover
If you are using Websphere and IBM Java, please read the platform support guide on the minimum Websphere and IBM SDK versions
Enable attachment indexing on the search landing page, only if necessary
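The disk and memory guidance in the list above can be turned into a quick sizing check. This is only a sketch: the function name and inputs are made up for illustration, and it reads the "minimum of 8 GB" as applying to the heap, which is one interpretation of the advice.

```python
def index_host_sizing(initial_index_gb, ram_gb, min_heap_gb=8):
    """Rough sizing for one index host node, based on the rules of thumb above."""
    disk_gb = 3 * initial_index_gb   # up to 3x the initial index size
    max_heap_gb = ram_gb / 2         # use up to half of available RAM for heap
    return {
        "disk_gb": disk_gb,
        "max_heap_gb": max_heap_gb,
        # Assumption: the 8 GB minimum refers to the heap, not total RAM.
        "meets_min_heap": max_heap_gb >= min_heap_gb,
    }
```

For a hypothetical 200 GB initial index on a 32 GB host, this suggests planning for up to 600 GB of disk and a heap of up to 16 GB. A 12 GB host would cap the heap at 6 GB and fail the minimum check.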
You should be able to reach out to GCS for any issues you face, or use the Pega Product Support community to seek help with any questions you might have.