About pr_sys_workIndexer_queue table & Lucene Search

Question

BrahmeswaraRao

Member since 2016

37 posts

Posted: Oct 14, 2015

Last activity: Oct 20, 2015

Posted: 14 Oct 2015 15:46 EDT
Last activity: 20 Oct 2015 2:22 EDT

Closed

Our use case is to replace manual search with Lucene search.

I'm fairly new to the search functionality, so I have gone through the articles on PDN, and I have a couple of questions about Lucene search and the PR_SYS_WorkIndexer_Queue table.

PR_SYS_WorkIndexer:

1) Why does an index file need to be created in an external/internal directory, and how does it help in searching for objects?

2) How are entries added to the queue table?

Does it happen when a work object is created? I could see it happening during a "ReIndex" operation as well.

3) Why is a queue table required for the work index, and does the "SystemWorkIndexer" activity create the index files in the directory?

Lucene Search:

As per my understanding, a single node should host the search index, and a web service call fetches the index records when a work object is searched for from a different node, but how does the system consume the internal web service for search?

Please help me understand the search functionality.

Thanks in advance,

Brahmesh.

System Administration

Posted: 15 Oct 2015 5:49 EDT

nistr replied to BrahmeswaraRao

Hi Brahmeswara,

Search and indexing are vast topics, but if you are familiar with database technology, I will try to draw an analogy between the two.

When you write an SQL query to retrieve data from a database, it works well if the number of records is small, but as the number of records increases, performance drops. The simplest way to overcome this is to create an index on the column used in the WHERE clause of the SQL statement. Indexes consume more space in the database, but they speed up retrieval, so your queries run faster.
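To make the analogy concrete, here is a small sketch using Python's built-in sqlite3 module. The table `work`, the column `status`, and the index name `idx_work_status` are made up for illustration and have nothing to do with the Pega schema. It only shows that the same query switches from a full scan to an index lookup once the index exists.

```python
import sqlite3

def query_plan(conn, sql):
    """Return SQLite's query-plan detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
# Hypothetical table, not a Pega table.
conn.execute("CREATE TABLE work (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO work (status) VALUES (?)",
    [("Open",), ("Resolved",)] * 500,
)

sql = "SELECT id FROM work WHERE status = 'Open'"

plan_before = query_plan(conn, sql)  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_work_status ON work (status)")
plan_after = query_plan(conn, sql)   # the planner can now use the index

print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between SQLite versions, so the sketch prints it rather than assuming a particular string.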

So if databases already have this feature, why do I need full text search?

Lucene is one of the libraries that provide full-text search. In the Pega platform, we don't expose every property as a column in the database, so we can't write performant SQL statements when they have to refer to values in the storage stream. Also, because the data in the storage stream is hierarchical, it is not easy for an RDBMS to retrieve it efficiently using SQL. Full-text search engines instead build inverted indexes, which map each term to the documents that contain it. You can read more about inverted indexes and full-text search on the Lucene website: http://lucene.apache.org
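If it helps, the idea of an inverted index can be sketched in a few lines of Python. This is not how Lucene is implemented, just a toy illustration; the function names and the sample keys and texts are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(documents):
    """Map each term to the set of document keys that contain it."""
    index = defaultdict(set)
    for key, text in documents.items():
        for term in tokenize(text):
            index[term].add(key)
    return index

def search(index, query):
    """Return the keys of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical instance keys and text, for illustration only.
documents = {
    "WORK W-1": "Customer reported a billing error on invoice",
    "WORK W-2": "Password reset requested by customer",
}
index = build_inverted_index(documents)
matches = search(index, "customer billing")
print(matches)
```

The lookup touches only the entries for the query terms, never the documents themselves, which is why it stays fast as the number of documents grows.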

So how does Pega use Lucene?

The Pega platform takes the data stored in the stream and indexes its content so that the search control can retrieve any instance in which the search string appears anywhere in the document. The search landing page shows the details of the existing indexes, and you can re-index from there. The full-text search index is maintained on the file system. Since the file system is specific to each node, only one node maintains the index. With Elasticsearch, from Pega 7.1.7 onwards, we can provide failover as well.

Why do we need the pr_sys_workindexer_queue?

The data in the database is not static. It keeps changing as instances are created, updated, and deleted, so the Lucene index files also need to be kept up to date with these changes. As instances change in the Pega platform, we record the pzInsKey of each changed instance in the pr_sys_workindexer_queue table. The SystemWorkIndexer agent subsequently picks up the entry, gets the latest changes, and updates the index files.
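The queue-then-index pattern described above can be sketched roughly as follows. This is only a conceptual illustration in Python, not Pega code: `IndexQueue`, `process_queue`, the dictionaries standing in for the database and the index, and the sample keys are all hypothetical.

```python
from collections import deque

class IndexQueue:
    """In-memory stand-in for a queue table of changed instance keys."""

    def __init__(self):
        self._entries = deque()

    def enqueue(self, ins_key):
        """Record that the instance with this key has changed."""
        self._entries.append(ins_key)

    def drain(self):
        """Yield queued keys in arrival order, removing each one."""
        while self._entries:
            yield self._entries.popleft()

def process_queue(queue, database, index):
    """Bring the index up to date for every queued key."""
    processed = 0
    for ins_key in queue.drain():
        record = database.get(ins_key)
        if record is None:
            index.pop(ins_key, None)  # instance was deleted: drop it
        else:
            index[ins_key] = record   # created or updated: index latest state
        processed += 1
    return processed

# Hypothetical data: one instance is created and updated, another is
# created and then deleted before the indexer runs.
database, index, queue = {}, {}, IndexQueue()
database["W-1"] = "first draft"
queue.enqueue("W-1")
database["W-1"] = "updated text"
queue.enqueue("W-1")
database["W-2"] = "temporary"
queue.enqueue("W-2")
del database["W-2"]
queue.enqueue("W-2")

count = process_queue(queue, database, index)
print(count, index)
```

Because the queue holds only keys and the indexer reads the current state when it runs, the index ends up reflecting the latest version of each instance no matter how many times it changed in between.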

What does search do?

When you search for specific text, the Lucene index is looked up and the records that contain this text are returned. Since the index is hosted on one node, we use SOAP to connect to the search node if the node initiating the search is not the search node. This is internal to the Pega platform, so a Pega developer building an application on the platform need not worry about it.

Hope this answers your questions.

-Rajiv

Posted: 19 Oct 2015 8:23 EDT

BrahmeswaraRao

replied to nistr

Thanks a ton, Rajiv.

But I have a small question. As you mentioned, the search control can retrieve the details of any instance.

Would it also retrieve details that are present in the BLOB?

Posted: 20 Oct 2015 2:22 EDT

nistr replied to BrahmeswaraRao

Would it retrieve the details which are present in BLOB as well?

As I mentioned above, we create an index on the file system for the data in the BLOB. Search looks into this index to see whether the search text is present in any of the records. It doesn't query the database, and thus doesn't check the BLOB in the literal sense, but it can tell whether the search text is present in a record even when the property was part of the BLOB in the DB. That said, you cannot return the stored (exact) value of an arbitrary property as part of the search results. We store the exact values of only a few top-level properties in the index so that results can be displayed when someone does a search. Any other property can be read from the DB by opening the instance using the properties returned.
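The difference between "searchable" and "stored" can be sketched like this. Again, this is a simplified Python illustration and not the actual Pega or Lucene implementation; the property names (`label`, `status`, `notes`), the key `W-1`, and the helper functions are hypothetical.

```python
def index_record(index, stored, ins_key, record, stored_fields):
    """Make every property searchable, but keep exact values for only a few."""
    for value in record.values():
        for term in str(value).lower().split():
            index.setdefault(term, set()).add(ins_key)
    stored[ins_key] = {f: record[f] for f in stored_fields if f in record}

def search(index, stored, term):
    """Return (key, stored fields) pairs for records containing the term."""
    keys = sorted(index.get(term.lower(), set()))
    return [(key, stored[key]) for key in keys]

# Hypothetical record; "notes" plays the role of a BLOB-only property.
database = {
    "W-1": {
        "label": "Billing dispute",
        "status": "Open",
        "notes": "customer mentioned refund",
    },
}

index, stored = {}, {}
for key, record in database.items():
    index_record(index, stored, key, record, stored_fields=("label", "status"))

hits = search(index, stored, "refund")  # the term appears only in "notes"
print(hits)

# The hit carries the key and the stored fields, but not "notes" itself.
# To read "notes", open the full record from the database using the key.
full_record = database[hits[0][0]]
print(full_record["notes"])
```

So the record is found through text that lives only in the BLOB-style property, yet the search result itself exposes just the key and the few stored fields.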
