Posted: 18 Jan 2021 13:27 EST Last activity: 28 Jan 2021 1:02 EST
Data Page Scope Example
Can anyone please give me a specific example/scenario for each scope (thread, requestor, and node)?
1. Please use your own words instead of copying and pasting things like the following. This information is easy to find; if it answered my question, I wouldn't have posted here. There are few resources out there for Pega, and the Pega team really needs to expand the community/collaboration board by giving real answers in their own words, or at least by trying to explain things in different ways.
If we specify a page in node level, then it can be accessed by all the requestors in the particular node.
If we specify a page in Requestor level, then it can be accessed by all the threads opened by the requestor.
If we specify a page in Thread level, then it can be accessed only by that particular thread.
2. Please do not just give me a link due to the same reason -- if I could find the answer by myself, I wouldn't have asked here. I have already read the following post, so please please please do NOT post a link and say "this link will help with your question" (No! they won't help!). If you find a resource that's really helpful, at least add some of your own words regarding my question, please.
A Requestor-scoped Data Page is useful when the results of one trip to a data source will be reusable for the duration of the user's session, regardless of how many work items that user works on.
An example is gathering portal display preferences for a user. A Requestor-scoped Data Page assembles this data once per user session after the user has logged in, allowing these preferences to be referenced as needed from this Data Page for any case over the remainder of that user session.
A Thread-scoped Data Page then covers the scenarios where a result set is gathered that must be dedicated to the context of the current case, and would be inappropriate to share to other cases within the same user session.
For example, a Data Page that looks up the Net Worth of the Customer identified in the current Service Case would not be information appropriate to make available to other Service Cases worked on by the same user. So when each Service Case needs its own copy of a Data Page's result set, a Thread-scoped data page is the common approach.
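To make the three scopes concrete, here is a minimal Python sketch that models scope as nothing more than "which parts of the context go into the cache key". This is not Pega code or a Pega API; the `Context` class, `cache_key`, and the data page names (`D_CustomerNetWorth`, `D_PortalPreferences`, `D_CountryList`) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    node: str       # the server handling the request
    requestor: str  # one user session on that node
    thread: str     # one open case inside that session


def cache_key(scope: str, page: str, ctx: Context) -> tuple:
    """Identify which cached copy of `page` a given context would read."""
    if scope == "node":
        return (page, ctx.node)
    if scope == "requestor":
        return (page, ctx.node, ctx.requestor)
    if scope == "thread":
        return (page, ctx.node, ctx.requestor, ctx.thread)
    raise ValueError(f"unknown scope: {scope}")


# Alice has two cases open and Bob has one, all on the same node.
CONTEXTS = [
    Context("node-1", "alice", "case-A"),
    Context("node-1", "alice", "case-B"),
    Context("node-1", "bob", "case-C"),
]


def copies_in_memory(scope: str, page: str) -> int:
    """Count the distinct copies needed to serve every context above."""
    return len({cache_key(scope, page, ctx) for ctx in CONTEXTS})


print(copies_in_memory("thread", "D_CustomerNetWorth"))      # 3: one per case
print(copies_in_memory("requestor", "D_PortalPreferences"))  # 2: one per session
print(copies_in_memory("node", "D_CountryList"))             # 1: shared by all
```

The wider the scope, the fewer parts of the context end up in the key, so more users and cases read the same copy.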
People often confuse the appropriateness of sharing result sets across Threads and Requestors with the volatility of those result sets (how rapidly the source data changes). Whilst Node-scoped data pages are definitely good for data that hardly ever changes, highly volatile source data can still be managed using a Requestor-scoped Data Page by configuring its Load Management to consider the result set stale after (say) 5 minutes.
This strikes a balance between the number of (potentially slow) trips made to the data source and the freshness of this data, all whilst minimizing the amount of memory needed to hold the cached data. Consider this in scenarios where it is appropriate for volatile result sets to be sharable across all Threads for a Requestor (cases for a session).
Implementing this as a Thread-scoped Data Page, even with the same small "refresh interval", in applications with a large number of concurrent users can still result in a large number of copies of (virtually) the same data held in memory at once, retrieved using a larger number of data source lookups. This impacts the performance of your Pega application. Caching this data at Requestor level, if appropriate, will consume less memory and make fewer calls to the database or APIs.
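A rough back-of-the-envelope version of that comparison, written as a small Python sketch. The figures (500 concurrent users, 3 open cases each, a 5 minute refresh) are made-up assumptions, and the load count is a worst case that assumes every copy is referenced again each time it goes stale.

```python
import math


def holders(scope: str, users: int, cases_per_user: int) -> int:
    """Number of separate cached copies on one node for a given scope."""
    return {
        "thread": users * cases_per_user,
        "requestor": users,
        "node": 1,
    }[scope]


def worst_case_loads_per_hour(scope, users, cases_per_user, refresh_minutes):
    """Upper bound on data source trips if every copy reloads whenever stale."""
    reloads_per_copy = math.ceil(60 / refresh_minutes)
    return holders(scope, users, cases_per_user) * reloads_per_copy


for scope in ("thread", "requestor"):
    print(scope, holders(scope, 500, 3), worst_case_loads_per_hour(scope, 500, 3, 5))
# thread 1500 18000
# requestor 500 6000
```

Under these assumptions, moving from Thread to Requestor scope divides both the copies in memory and the worst-case lookups by the number of cases each user has open.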
Thank you so much for explaining in detail! It starts to make sense to me now. From what I understand after reading your answer, I should ask myself a couple of questions when I need to decide which scope to use. Question 1: will different users retrieve the same or unique information (if it’s the same, then use Node scope); Question 2: will the same user retrieve the same info even when they open multiple cases (if yes, use Requestor scope; otherwise it’s Thread). Is that correct?
I appreciate your two examples. Here is one that I came up with: customers can view their own personal information (name, address, phone number…) after they log in. Since customers are not allowed to view other customers’ info, the data page should never be Node-scoped. Also, customers’ info (in most cases) has low volatility, which means no matter how many cases they open, they are going to see the same personal info. Therefore, it’s a Requestor scope. Would you agree?
Remember though that volatility isn't the primary driver for Data Page scoping, as volatility can be controlled via the Data Page's "Load Management" configuration: Different users accessing the same information is a good candidate for Node-scoped, even if that data is volatile enough to be considered stale every few minutes.
You can establish some interesting "middle ground" scenarios by parameterising your Data Pages as well, which gives you even more flexibility in your scoping decisions. Parameterized Data Page results are cached independently of each other, and can allow a Data Page to be scoped at Requestor or Node level (resulting in fewer copies of the same data in memory), when without parameters it may have to be Thread level.
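As an illustration of parameterised results being cached independently, here is a toy Python model (not Pega code). It treats one Requestor-scoped page as a dictionary of instances keyed by parameter values; `ParameterizedPage`, `lookup_net_worth`, and the customer IDs and figures are hypothetical.

```python
class ParameterizedPage:
    """Toy cache: one instance per distinct set of parameter values."""

    def __init__(self, loader):
        self._loader = loader
        self._instances = {}
        self.load_count = 0

    def get(self, **params):
        key = tuple(sorted(params.items()))
        if key not in self._instances:
            self.load_count += 1
            self._instances[key] = self._loader(**params)
        return self._instances[key]


def lookup_net_worth(customer_id):
    # Stand-in for a slow trip to the data source; the figures are made up.
    return {"C-100": 250_000, "C-200": 90_000}[customer_id]


# One Requestor-scoped page shared by all of this user's cases.
d_net_worth = ParameterizedPage(lookup_net_worth)

case_1 = d_net_worth.get(customer_id="C-100")  # loads
case_2 = d_net_worth.get(customer_id="C-200")  # loads a separate instance
case_3 = d_net_worth.get(customer_id="C-100")  # reuses the instance from case_1

print(d_net_worth.load_count)  # 2
```

Because the customer ID is part of the key, each case still gets the right customer's data, yet two cases for the same customer share one copy instead of each holding their own at Thread level.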
Thank you again for your reply. As you mentioned load management, I noticed that Node must have an access group (this doesn't apply to Thread or Requestor). I assumed that only users who belong to that access group have the right to call the Node-scoped data page. To test my hypothesis, I created a Node-scoped data page to verify login credentials. I wanted to see what would happen if someone who didn't belong to the Administrator access group tried to run the data page. So I created a role and put that user under a different access group (the User access group). I thought he wouldn't be able to run the data page, but it worked just fine. Nothing changed when I switched to another access group.
I'm thinking maybe it's because I set configuration wrong, but what should I expect to see if it was right?
If a person with a different access group can run the data page, what's the point of specifying an access group in a data page with Node scope?
Posted: 24 Jan 2021 18:10 EST Updated: 26 Jan 2021 14:50 EST
The Node-scoped Data Page's Access Group does not govern who can or cannot access it. Access is determined by rule resolution based on your Access Group, just as it is for Requestor- and Thread-scoped Data Pages. That is, if your ruleset stack - determined by your Access Group - includes the Node-scoped Data Page, you can access it.
The reason why a Node-scoped Data Page has its own Access Group is to assure the behavior configured in the Node-scoped Data Page performs in a consistent and standardized way for all Requestors. That is, a Node-scoped Data Page - by definition, shared by multiple Requestors - should return the same results regardless of who triggered its load sequence.
Bob and Charlie are using different Access Groups, which yield different ruleset lists.
So long as both Bob's and Charlie's ruleset stacks have the ruleset that the Node-scoped Data Page is in, they can each access it.
Bob and Charlie can each access the data already cached on this Data Page if the data is deemed fresh.
Bob and Charlie can also trigger the load sequence if the Data Page holds no data, or the data held is 'stale'.
As Bob's and Charlie's ruleset stacks are different, rule resolution for each of them may result in running different versions of the rules referenced by the Data Page, depending on whether the load sequence is triggered by Bob or Charlie. If different rules run depending on who triggers the load, the results risk being different.
So for Node-scoped Data Pages, the Data Page's Access Group removes this variability. Regardless of whether Bob or Charlie triggers the load sequence, the Node-scoped Data Page always uses the ruleset stack (and role-based access control) of the Access Group specified in its configuration. The Data Page always loads using the same rules, regardless of who triggered it, and serves up a result set that is not influenced by the user who loaded it.
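The Bob and Charlie scenario can be sketched in a few lines of Python. This is a deliberately simplified model of rule resolution, not how Pega implements it; the ruleset names, access group names, and the two versions of the load rule are all hypothetical.

```python
# Hypothetical rulesets, each supplying its own version of the same load rule.
RULE_IN_RULESET = {
    "MyApp": lambda: "rates from the standard source",
    "MyAppPilot": lambda: "rates from the pilot source",
}

# Hypothetical access groups and their ruleset stacks, highest priority first.
RULESET_STACKS = {
    "MyApp:Users": ["MyApp"],
    "MyApp:PilotUsers": ["MyAppPilot", "MyApp"],
    "MyApp:NodePages": ["MyApp"],
}


def resolve_load_rule(access_group):
    """Pick the first version of the rule found in the access group's stack."""
    for ruleset in RULESET_STACKS[access_group]:
        if ruleset in RULE_IN_RULESET:
            return RULE_IN_RULESET[ruleset]
    raise LookupError(f"load rule not visible to {access_group}")


def load_requestor_scoped(triggering_access_group):
    # Loads in the context of whoever asked for the page.
    return resolve_load_rule(triggering_access_group)()


def load_node_scoped(triggering_access_group, page_access_group="MyApp:NodePages"):
    # The triggering user's stack is ignored; the page's own access group is used.
    return resolve_load_rule(page_access_group)()


bob, charlie = "MyApp:Users", "MyApp:PilotUsers"
print(load_requestor_scoped(bob) == load_requestor_scoped(charlie))  # False
print(load_node_scoped(bob) == load_node_scoped(charlie))            # True
```

In this model the Access Group on the Node-scoped page is not a gate on who may read it; it only fixes which rules run when the page loads.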
Requestor- and Thread-scoped Data Pages don't do this. By definition, neither can be referenced outside of the current Requestor. It is therefore appropriate that the Requestor's ruleset list (and security context) drive how those Data Pages are loaded for use by that Requestor.
In fact, patterns like Dynamic Class Referencing actually exploit the fact that the Requestor's ruleset stack determines what versions of rules run when a Requestor-scoped Data Page loads. These Data Pages deliberately give different results, depending on the Requestor's current Access Group. However, for Node-scoped Data Pages, a consistent result set for all users is expected.
Closing the loop, this may be one of the factors of your design decision around how to scope a Data Page:
If I need all users to see the same results, regardless of their Access Group: use Node-scope
If a user needs to see different results based on something specified in their Access Group: use Requestor- or Thread-scope.
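Pulling the questions from this thread together, the scoping decision can be written as a rule of thumb in Python. This is only a sketch of the reasoning discussed here, not an official Pega guideline; the function and parameter names are hypothetical. Volatility is deliberately not an input, since it is handled separately through the Data Page's Load Management settings.

```python
def suggest_scope(same_result_for_all_users: bool,
                  same_result_across_one_users_cases: bool) -> str:
    """Suggest a Data Page scope from who can safely share one result set."""
    if same_result_for_all_users:
        return "node"
    if same_result_across_one_users_cases:
        return "requestor"
    return "thread"


# Dropdown values from a configuration list: identical for everyone.
print(suggest_scope(True, True))    # node
# Portal display preferences: per user, but the same in every case.
print(suggest_scope(False, True))   # requestor
# Net worth of the customer on the current Service Case: per case.
print(suggest_scope(False, False))  # thread
```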
@ChensuZ5 I think Bramm's comments have explained this in quite some detail. Just to add a very simple starting point for thinking about how you want to standardize it:
Thread: Case specific details
Requestor: Session-based information should be Requestor-scoped. Taking your example: exchange rates change continuously during trading hours, and if you need to show the latest rate as of the last 30 minutes, Requestor scope makes more sense, with a matching refresh strategy.
Node: Data that changes least and is consistent for everyone belongs here, for example dropdown values pulled from a configuration list.
I propose the exchange rates would be even better Node-scoped with a 30 minute refresh interval.
If you have a large number of users concurrently using the system, you have a higher likelihood of duplicate data pages across user sessions when this is requestor scoped, even though the data need not be considered unique for each user.
The question "What was USD-to-GBP some time in the last 30 minutes?" could be the same answer for all users, whilst only holding one copy of D_ExchangeRate["USD", "GBP"] in memory on that node.
The additional benefit is that - for common exchange rates - most users would get a result set immediately, without hitting the data source. So, not only are you reducing memory footprint, you are also reducing CPU and IO as Pega does less rule execution and fewer API calls, as well as maximizing user responsiveness.
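Here is a minimal Python sketch of that idea: one shared, parameterised cache per node whose entries are treated as stale after 30 minutes. It is a model of the behaviour, not Pega code; `NodeScopedRates`, the `fetch` callable, and the rate of 0.73 are hypothetical, and the clock is injected so the example does not depend on real time.

```python
import time


class NodeScopedRates:
    """Toy node-level cache of exchange rates with a maximum age."""

    def __init__(self, fetch, max_age_seconds=30 * 60, clock=time.monotonic):
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._cache = {}  # (base, quote) -> (loaded_at, rate)
        self.fetch_count = 0

    def get(self, base, quote):
        key = (base, quote)
        now = self._clock()
        entry = self._cache.get(key)
        if entry is None or now - entry[0] >= self._max_age:
            # Nothing cached yet, or the cached rate is stale: reload it.
            self.fetch_count += 1
            entry = (now, self._fetch(base, quote))
            self._cache[key] = entry
        return entry[1]


# Demo with a fake clock so the behaviour does not depend on real time.
fake_now = [0.0]
rates = NodeScopedRates(
    fetch=lambda base, quote: 0.73,  # made-up rate standing in for an API call
    clock=lambda: fake_now[0],
)

rates.get("USD", "GBP")   # first user: goes to the data source
fake_now[0] = 10 * 60
rates.get("USD", "GBP")   # second user, 10 minutes later: served from memory
fake_now[0] = 31 * 60
rates.get("USD", "GBP")   # 31 minutes after the load: stale, so reloaded

print(rates.fetch_count)  # 2
```

Three requests from any mix of users cost two trips to the data source here, and only one copy of the USD-to-GBP result is held at a time.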