Posted: 14 Jun 2016 11:57 EDT Last activity: 23 Jun 2016 14:55 EDT
MS SQL Clustered Index
I've been having issues with deadlocks on the pr_sys_locks table in 7.1.9. running on MS SQL Server. Unlike Oracle, SQL Server will pick a victim for the deadlock and kill the query which leads to application problems unless you have logic to retry. After troubleshooting the issue we discovered that the pr_sys_locks table did not have a clustered index which lead the query optimizer to choose a poor query plan. After adding a clustered index on pzInsKey the deadlocks stopped. SQL Server best practice states that in most cases you should have one clustered index on each table. Most if not all of the OOTB shipped Pega tables do not have a clustered index. I'm about to add some clustered indexes on a few important tables to try fixing some query alerts. I'm inclined to add them to more tables but I don't want to blindly make changes that could have a negative impact.
Why is PRPC not following best practice? What's the benefit of shipping without clustered indexes?
Is this a known issue?
What are other customers experiencing/doing?
Message was edited by: Marissa Rogers - added category
Examining the tables that were created while installing PRPC 7.1.9 there are over 180 PRPC tables without a clustered index. To say PRPC supports Clustered indexes is a little misleading since this is a database construct and PRPC supports many DB vendors. I'm trying to determine if PRPC was shipped this way by design or by mistake. We are in the process of troubleshooting alerts and will consider adding clustered indexes to improve performance on a query by query basis but it would be a big help to get a broader understanding of what other customers have done since the OOTB PRPC tables don't appear to be follow best practice.
We found Pega customers experienced severe performance impacts due to the cluster index's maintenance overhead. It caused locking/blocking issues. Furthermore even with cluster index, the key lookup for non-index column is still not cheap compared to covering index. That's why we ship non-clustered index by default for Pega 7.
Hi Jie. How are you? Thanks for the response. We were getting lots of deadlocks on pr_sys_locks until we added a clustered index on pzinskey. Without this the query plan was using
Index Seek (nonClustered) on the primary key (pzinskey) followed by a RID Lookup (Heap). Essentially the PK index was used to find the right row and then a RID Lookup was used to read the blob. Although this is better than a full table scan there are many articles on the web that say this is still slower than a Clustered Index Seek(Clustered). A covering index in this case didn't help because we needed to read the blob and that was done via RID Lookup(Heap).
Is there another way to resolve this sort of issue and avoid RID Lookup(Heap)? At the moment I'm happy with our current solution for the lock table but I'd like to know about alternatives in case I need them for other tables where the overhead of a clustered index is not appropriate.
For table without blobs like pr_sys_locks, a non-clustered covering index or include-column index is cheaper and better than maintaining a clustered index. Since 7.1.8, this pr_sys_locks table can be periodically defragmented to improve performance. https://pdn.pega.com/release-note/ability-periodically-defragment-prsyslocks-table. As to tables with blob, we found it's cheaper to maintain non-clustered index. SQL Server supports non-clustered include blob column index. You may consider that for some heavy blob involved queries.