Posted: 22 Sep 2020 21:53 EDT Last activity: 19 Oct 2020 11:59 EDT
Case ID generation mechanism
In this post, I explain how the case ID generation mechanism works and how it can be customized.
First of all, a case ID consists of three parts: [prefix]-[integer]-[suffix].
You can use a prefix, a suffix, or both along with the sequence #. The suffix is blank by default, and in my experience most customers don't use it, but you can set one if you prefer. The prefix is added to your application rule automatically when you define a case type (by default, it is the initial letter of the case type).
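To make the three-part format concrete, here is a minimal Python sketch. It is not Pega code: format_case_id is a hypothetical helper, and stripping and rejoining the hyphens is my assumption so that a blank suffix does not leave a trailing hyphen.

```python
def format_case_id(sequence: int, prefix: str = "", suffix: str = "") -> str:
    """Join the non-empty parts of a case ID with hyphens."""
    # Assumption: a part may or may not carry its own hyphen (e.g. "W" or "W-"),
    # so strip hyphens first and rejoin only the parts that are not blank.
    parts = [prefix.strip("-"), str(sequence), suffix.strip("-")]
    return "-".join(part for part in parts if part)


print(format_case_id(1, prefix="W"))                  # W-1
print(format_case_id(42, prefix="P-", suffix="EU"))   # P-42-EU
print(format_case_id(7))                              # 7
```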
1. How to customize it
There are three approaches to changing it. Try them in the order (1), (2), (3): if what you want is not possible with one approach, move on to the next. FYI, if you configure all three, the precedence is (3) > (1) > (2).
(1) Application rule
You can update the prefix in the application rule. Only the prefix can be changed here; the suffix can't be configured. You can only enter a static string value (expressions are not allowed).
(2) pyDefault data transform
You can set .pyWorkIDPrefix (prefix) and .pyWorkIDSuffix (suffix) in the pyDefault data transform. You can use an expression to set a dynamic value.
(3) Work-.GenerateID
You can override Work-.GenerateID for greater flexibility. The prefix, suffix, or sequence # (*) can be customized.
* In my experience, very few customers want to customize the sequence # generation logic. Although it is technically possible, I would not recommend it, as it is risky and may cause unexpected issues. If you really need to do this, make sure the ID is always unique and that there is no performance problem in a multi-node environment.
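The precedence between the three approaches (Work-.GenerateID override, then application rule, then pyDefault) can be pictured with a small Python sketch. It only models the ordering described in this post; resolve_prefix and its parameters are hypothetical names, and this is not how Pega evaluates the rules internally.

```python
from typing import Optional


def resolve_prefix(application_rule: Optional[str],
                   py_default: Optional[str],
                   generate_id_override: Optional[str]) -> Optional[str]:
    """Return the prefix from the highest-precedence source that sets one."""
    # Order taken from the post: GenerateID override > application rule > pyDefault.
    for candidate in (generate_id_override, application_rule, py_default):
        if candidate is not None:
            return candidate
    return None


print(resolve_prefix("W-", "X-", None))     # W-
print(resolve_prefix("W-", "X-", "CASE-"))  # CASE-
print(resolve_prefix(None, "X-", None))     # X-
```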
2. Sequence # generation mechanism
The mechanism changed in Pega 8.3. Let me explain the old and new mechanisms and why the change was made.
(1) Old mechanism (Pega 8.2 and earlier)
The latest ID is maintained in a database table (PC_DATA_UNIQUEID) per case type. Every time a case is created, the system calls Work-.GenerateID, which queries and updates the value in the table. The ID is incremented by one and returned to the app node.
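The old flow can be sketched roughly as follows, using an in-memory SQLite database as a stand-in. Only the table name comes from this post; the column names (prefix, last_id), keying the row by prefix, and the function next_id_old are my assumptions for illustration. The point is that every single case creation costs a database round trip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the real PC_DATA_UNIQUEID columns are not shown in this post.
conn.execute(
    "CREATE TABLE pc_data_uniqueid (prefix TEXT PRIMARY KEY, last_id INTEGER NOT NULL)"
)


def next_id_old(conn: sqlite3.Connection, prefix: str) -> int:
    """Increment and read the shared counter row on every call."""
    with conn:  # one transaction per generated ID
        conn.execute(
            "INSERT OR IGNORE INTO pc_data_uniqueid (prefix, last_id) VALUES (?, 0)",
            (prefix,),
        )
        conn.execute(
            "UPDATE pc_data_uniqueid SET last_id = last_id + 1 WHERE prefix = ?",
            (prefix,),
        )
        row = conn.execute(
            "SELECT last_id FROM pc_data_uniqueid WHERE prefix = ?", (prefix,)
        ).fetchone()
    return row[0]


print([next_id_old(conn, "W-") for _ in range(3)])  # [1, 2, 3]
```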
(2) New mechanism (Pega 8.3 and later)
The latest ID is maintained in the app node. The database table (PC_DATA_UNIQUEID) is still used, but it only tracks the range of IDs handed out in chunks (the chunk size is called the "batch size"). For example, when the very first case is created, the app node queries the database. Since there is no entry in the table yet, the system updates it to 1000 (the default batch size), and the app node is assigned the range 1-1000. Hence the first case's sequence ID becomes 1. When the next case is created on the same app node, it does not hit the database anymore: 2 is assigned immediately, since the latest ID is maintained in the app node (not the database). This continues until the app node either exhausts its assigned range (1-1000) or restarts.
Note that this process happens per app node. For example, if app node #2 comes up, it is assigned the next chunk (1001-2000). Hence, even if the latest ID on app node #1 is in the middle of its range, app node #2 will start with 1001 regardless.
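The batch allocation described above can be simulated with a short Python sketch. This is my own simplified model, not Pega's implementation: SharedCounter stands in for the database row, AppNode for an app node, and all names are hypothetical.

```python
class SharedCounter:
    """Stands in for the database row: remembers only the last reserved ID."""

    def __init__(self, batch_size: int = 1000):
        self.batch_size = batch_size
        self.last_reserved = 0
        self.round_trips = 0  # how many times a node had to ask the "database"

    def reserve_batch(self):
        self.round_trips += 1
        start = self.last_reserved + 1
        self.last_reserved += self.batch_size
        return start, self.last_reserved


class AppNode:
    """Hands out IDs from its own range and asks for a new one only when needed."""

    def __init__(self, counter: SharedCounter):
        self.counter = counter
        self.next_value = None
        self.batch_end = None

    def next_id(self) -> int:
        if self.next_value is None or self.next_value > self.batch_end:
            self.next_value, self.batch_end = self.counter.reserve_batch()
        value = self.next_value
        self.next_value += 1
        return value


db = SharedCounter()
node1, node2 = AppNode(db), AppNode(db)
ids = [node1.next_id(), node1.next_id(), node2.next_id(), node1.next_id()]
print(ids)             # [1, 2, 1001, 3]
print(db.round_trips)  # 2
```

Four cases were created with only two round trips to the shared counter, and node #2 started at 1001 even though node #1 had only used 1 and 2.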
* Why it was changed
The old mechanism relied heavily on the database, and performance was bad because communication between the app node and the database is costly. In fact, about half of the case creation time was spent on this ID generation, so bottlenecks were sometimes reported in high-load environments. More importantly, in a multi-node environment, cases can be created on several nodes at the same time, and this could cause contention because the row is shared by all nodes. The new mechanism reduces the number of round trips between the app node and the database and improves performance. ID generation now takes less than 5% of the case creation time.
* Business impact
As a side effect of this new mechanism, the sequence ID now jumps around between nodes and every time you restart the system. Prior to 8.3, case IDs were pretty much sequential: 1, 2, 3, 4, 5, etc. Now they go like 1, 1001, 2, 2001, 3001, 3, 4, 3002, etc. This is all caused by technical reasons. However, for some customers or types of business, the sequence # is important. So I recommend discussing the trade-off with the business people: if the sequence # is more important than performance, you can change the batch size; if the sequence # does not bother you, you can keep the default.
* Batch size update
The default value is 1000, but you can change it with a Dynamic System Setting. For example, if you set it to 1, the system will behave like the old version. Note that performance gets slower in that case.
Dynamic System Settings: idGenerator/defaultBatchSize
Owning ruleset: Pega-RULES
If you want the batch size to differ per case type, you can specialize the setting by inserting the case type's prefix in the middle of the key (e.g. idGenerator/P-/defaultBatchSize).
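As an illustration of how the specialized key relates to the default one, here is a small Python sketch. The two key formats come from this post; the settings dictionary, the function batch_size_for, and the "specific key first, then default" lookup order are my assumptions, not documented Pega behavior.

```python
DEFAULT_BATCH_SIZE = 1000  # default value mentioned in this post


def batch_size_for(prefix: str, settings: dict) -> int:
    """Look up the batch size for a case type prefix (hypothetical lookup order)."""
    # Assumption: a prefix-specific setting wins over the general one.
    specific = settings.get(f"idGenerator/{prefix}/defaultBatchSize")
    if specific is not None:
        return int(specific)
    return int(settings.get("idGenerator/defaultBatchSize", DEFAULT_BATCH_SIZE))


settings = {
    "idGenerator/defaultBatchSize": "500",
    "idGenerator/P-/defaultBatchSize": "1",
}
print(batch_size_for("P-", settings))  # 1
print(batch_size_for("W-", settings))  # 500
print(batch_size_for("W-", {}))        # 1000
```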
In the new mechanism, as I explained, the next batch is allocated either when the app node exhausts its assigned range or when the system restarts. In your example, whichever node starts first will take 2001 - 3000 and the other node will take 3001 - 4000 (*).
* Technically speaking, allocation is not triggered by the restart itself. It is triggered by the first case creation after the restart. For example, assume that you have one app node and someone has created the first case (W-1). If you then restart the system 10 times in a row (without creating any cases), will the next case ID be W-10001? Nope, it will be W-1001.
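The restart example above can be checked with a tiny simulation. Again, this is only a sketch under my own assumptions (the Node class, the table dictionary standing in for the database row, and a batch size of 1000), not Pega's actual code.

```python
BATCH_SIZE = 1000
table = {"last_reserved": 0}  # stands in for the database row


class Node:
    def __init__(self):
        self.next_value = None
        self.batch_end = None

    def restart(self):
        # A restart only forgets the in-memory range; it reserves nothing.
        self.next_value = None
        self.batch_end = None

    def create_case(self, prefix: str = "W-") -> str:
        # A new range is reserved lazily, on the first case creation that needs it.
        if self.next_value is None or self.next_value > self.batch_end:
            self.next_value = table["last_reserved"] + 1
            table["last_reserved"] += BATCH_SIZE
            self.batch_end = table["last_reserved"]
        case_id = f"{prefix}{self.next_value}"
        self.next_value += 1
        return case_id


node = Node()
first = node.create_case()
for _ in range(10):
    node.restart()
second = node.create_case()
print(first, second)  # W-1 W-1001
```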