Discussion

Pegasystems Inc.
JP
Last activity: 24 May 2022 6:08 EDT
Case ID generation mechanism
Hi,
In this post, I am sharing case ID generation mechanism and how it can be customized.
First of all, case ID consists of three parts - [prefix]-[integer]-[suffix] as shown below.
No | Example | Prefix | Sequence # | Suffix |
---|---|---|---|---|
1 | C-100 | C- | 100 | |
2 | 15378-DR | 15378 | -DR | |
3 | MORT-763-K4 | MORT- | 763 | -K4 |
You can use either prefix, or suffix, or both along with sequence #. "Suffix" is blank by default and most of customers don't use it in my experience but you can use it if it's preferred. "Prefix" is added to your application rule automatically (the initial letter of the case type by default) when you define a case type.
1. How to customize it
There are three approaches to change it. Try it in the order of 1, 2, 3. If what you're trying is not possible in the approach, try next one. FYI, if you update all of them, precedence is 3 > 1 > 2.
No | Approach | Comments |
1 | Application rule | You can update prefix in the application rule. Only prefix is changeable and suffix can't be configured. You can only enter static string value (expression is not allowed). |
2 | pyDefault data transform | You can set .pyWorkIDPrefix (prefix), and .pyWorkIDSuffix (suffix) in the pyDefault data transform. You can use expression to set dynamic value. |
3 | Work-.GenerateID activity | You can override Work-.GenerateID for greater flexibility. Prefix, suffix, or sequence # (*) can be customized. |
* In my experience, very few customer wants to customize sequence # generation logic. Although it is technically possible, I would not recommend it as it is risky and may cause unexpected issues. If you really need to do this, make sure the ID is always unique and also there is no performance problem in a multi-node environment.
2. Sequence # generation mechanism
There has been a change in 8.3. Let me explain old / new mechanism and why it was changed. Be noted, this new mechanism was introduced only in PostgreSQL and it is not available for other database at moment. However, we are planning to extend it to Oracle and Microsoft SQL Server from 8.7.
(1) Old mechanism (~ Pega 8.2)
The latest ID is maintained in the database table (PC_DATA_UNIQUEID) per case type. Every time case gets created, system calls Work-.GenerateID and it queries and updates the value in the table. The ID is incremented by one and returned to app node.
(2) New mechanism (Pega 8.3 ~)
The latest ID is maintained in the app node. Database table (PC_DATA_UNIQUEID) is still used, but it only holds the chunk of scope (called "batch size"). For example, when the very first case is created, app node queries the batch size in the database. Since there is no entry in the table, system updates it to 1000 (by default). App node is assigned with 1-1000 scope. Hence the 1st case's sequence ID becomes 1. When next case is created on the same app node, it won't hit the database anymore and 2 is assigned immediately since the latest ID is maintained at app node (not database). This continues until either the app node exhausts the assigned scope (1-1000), or restarts. Be noted, this process happens per app node. For example, if app node #2 comes up, it is assigned with the next chunk (1001-2000). Hence, even if the latest ID in the app node #1 is in the middle of scope, app node #2 will start with 1001 regardless.
* Why it was changed
The old mechanism largely relied on database and performance was bad. Communication between app node and database is costly. Actually, the half of case creation process time was this ID generation. So, bottleneck issues were sometimes reported in a high load environment. More importantly, in a multi-node environment, the case generation can happen at the same time between nodes, and it could cause contention as the row is shared for all nodes. New mechanism reduced the number of communication between app node and database and increased the performance. Now ID generation takes only less than 5% of its case creation process time.
* Business impact
As a side effect of this new mechanism, now the sequence ID jumps around between nodes or every time you restart the system. Prior to 8.3, the case was pretty much sequential - 1, 2, 3, 4, 5...etc. Now, it goes like 1, 1001, 2, 2001, 3001, 3, 4, 3002... etc. This is all caused by technical reasons. However, for some customers or business type, sequence # is important. So I would recommend you to consider with business people the balance - if, the sequence # is more important than performance, you can change the batch size. Or if you don't get bothered by sequence #, you can keep the default.
* Batch size update
The default value is 1000, but you can change it by Dynamic System Settings. For example, if you update it to 1, system will behave like old version. Be noted performance gets slower in that case.
- Dynamic System Settings: idGenerator/defaultBatchSize
- Owning ruleset: Pega-RULES
- If you want the batch size to differ per case type, you can create another Dynamic System Settings (ex. idGenerator/P-/batchSize). Be noted, this is case sensitive and if you type "BatchSize" instead of "batchSize", it won't take effect.
- If you plan to update batch size, I would advise you to do so before you create a case. This is because, if you update Dynamic System Settings after case creation, pyLastReservedID is updated to 1000 anyways, and when the app node is restarted or user connects to other node, system will create 1001 regardless of the value in Dynamic System Settings. This is one step behind, and customer may feel the change is not reflected immediately. Usually this should be okay because the Dynamic System Settings is included in the R-A-P and imported to production environment before the very first case is created. But if you care sequence ID even in Dev, I would suggest to do so before defining a case type to avoid unnecessary confusion.
Thanks,