How to configure multinode Simulation?

Question

RocíoR03

Member since 2014

2 posts

M2C

Posted: Dec 29, 2015

Last activity: Jan 14, 2016

Closed

Hi,

In the project I'm working on, we are trying to run large-scale what-if simulations. We have almost 1,000 propositions and 1 million customers.

If we run a single-node simulation for 100,000 customers to see what would happen if some changes went live, the full simulation takes close to 10 hours.

We tried to configure our pre-prod environment to run multi-node simulations, following the instructions in the Pega training course "Decisioning Simulations for System Architects 7.1" and in the "DSM Reference Guide 7.1.7". Something must be missing or wrong, though, because the run time has increased dramatically (the same simulation now looks like it would take 2 or 3 days).

Our pre-prod environment has 2 servers with 2 nodes each.

First, I created a new "ProcessBatchJob" agent in my application (see the image below) and verified in SMA that it was running:

After that, I was able to modify the Topology settings to match the first image (note: I also tried 2 threads per node and saw no meaningful performance difference).

Then, following the training course instructions, I added a new numeric column named "PartitionKey" to the customer database table and filled it with random values between 1 and 10. I then created the matching property in the customer Pega class and re-mapped the class to the database table.
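To make the partitioning step concrete, here is a minimal sketch of assigning each customer a key between 1 and 10 and checking how evenly the customers are spread across the keys. It is only an illustration: the helper names `assign_partition_key` and `partition_sizes`, the sample customer IDs, and the use of a CRC32 hash in place of random values are all assumptions, not something taken from Pega or from the training course.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 10  # the post uses key values between 1 and 10


def assign_partition_key(customer_id: str, partitions: int = NUM_PARTITIONS) -> int:
    """Map a customer ID to a stable key in 1..partitions (hypothetical helper)."""
    # CRC32 is used here instead of random values so the key is repeatable.
    return zlib.crc32(customer_id.encode("utf-8")) % partitions + 1


def partition_sizes(customer_ids):
    """Count how many customers land in each partition key."""
    return Counter(assign_partition_key(cid) for cid in customer_ids)


# Made-up customer IDs, only to show the shape of the check.
sample_ids = [f"CUST-{n:07d}" for n in range(10_000)]
sizes = partition_sizes(sample_ids)
for key in sorted(sizes):
    print(key, sizes[key])
```

If one key holds far more customers than the others, the node that picks up that key has more work to do, so a quick count per key is worth running against the real table.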

Lastly, I modified the Input Definition, setting the customer's PartitionKey property in the Partition Key field under Distributed Runs, and also modified the Report Definition to add "Partition Key Parameter" and

Do you have any idea what might be happening? Let me know if you need more information to clarify anything.

Regards,

Rocío

System Administration

Posted: 31 Dec 2015 12:31 EST
Updated: 31 Dec 2015 12:35 EST

ManuVarghese

PEGA

replied to RocíoR03

Hi Rocío,

The setup looks fine to me. Can you please send the Batch Progress screenshot from Simulation History?

Run the simulation for a smaller data set of 5,000 records. The default batch size is 250 records, and with 4 nodes that is 1,000 records per fetch, so we should get stats for 5 fetches. Then inspect the simulation history; refer to the screenshot below.
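The fetch count above is simple arithmetic, sketched here so it can be repeated for other data set sizes. The function name `expected_fetches` is made up, and the model (every node takes exactly one 250-record batch per fetch round) is an assumption based on the reply, not confirmed Pega behavior.

```python
import math


def expected_fetches(total_records: int, batch_size: int = 250, nodes: int = 4) -> int:
    """Fetch rounds needed if each node takes one batch per round (assumed model)."""
    records_per_round = batch_size * nodes
    return math.ceil(total_records / records_per_round)


print(expected_fetches(5000))  # 5000 / (250 * 4) = 5
```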

It shows the speed at each stage:

1. IH Read Speed - speed of fetching the records from the IH fact responses.

2. Read Speed - speed of fetching the records from the customer table. This also includes the time to fetch "Additional Embedded pages" from associated tables, if that is enabled in the Input Definition.

3. Execution Speed - speed of executing the strategy.

4. Write Speed - speed of writing the strategy output to the output table defined in the Output Definition.

Try comparing these statistics between the single-node and multi-node setups to see where the slowness occurs.
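One way to do that comparison, once the four speeds have been copied out of the two Batch Progress screens, is sketched below. The numbers are placeholders and are not taken from the screenshots in this thread; the function names and the dictionary layout are also assumptions made for the example.

```python
STAGES = ("IH read", "Read", "Execution", "Write")


def slowest_stage(speeds):
    """Return the stage with the lowest records-per-second figure."""
    return min(STAGES, key=lambda stage: speeds[stage])


def regressions(single, multi):
    """List stages that are slower in the multi-node run, worst ratio first."""
    ratios = {s: multi[s] / single[s] for s in STAGES if single[s] > 0}
    slower = {s: r for s, r in ratios.items() if r < 1.0}
    return sorted(slower.items(), key=lambda item: item[1])


# Placeholder speeds in records/second, NOT the values from the screenshots.
single_node = {"IH read": 400.0, "Read": 300.0, "Execution": 2.0, "Write": 500.0}
multi_node = {"IH read": 40.0, "Read": 150.0, "Execution": 2.0, "Write": 450.0}

print("Slowest multi-node stage:", slowest_stage(multi_node))
for stage, ratio in regressions(single_node, multi_node):
    print(f"{stage}: multi-node runs at {ratio:.0%} of single-node speed")
```

The slowest stage overall and the stage that regressed the most are not necessarily the same one, which is why the sketch reports both.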

You can switch between single-node and multi-node runs by using an Input Definition and a Report Definition that don't use the partition key.

Thanks

Manu

Posted: 13 Jan 2016 8:08 EST

RocíoR03

M2C

replied to ManuVarghese

Hi Manu,

Apologies, I couldn't repeat the test and reply to you sooner.

Below is the screenshot with the multi-node test results (5,000 records):

And here are the test results for a single node:

At the moment I can't tell whether something is wrong or whether these are appropriate values.

Let me add some new data we have gathered. We ran a simulation for 100,000 records once again and saw that it seems to process 250 records roughly every 15 minutes. That is, we click "Re-execute" and, no matter how many times we refresh the landing page or the output database table, nothing happens for 15 minutes. Then 250 records appear on the landing page and 250 results in the database table, and it takes another 15 minutes until new data appears. Does my explanation make sense? I'm not sure I'm describing what is happening properly.
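For scale, the observed cadence can be turned into a rough total run time. This is only a back-of-the-envelope sketch: it assumes the pace of 250 records every 15 minutes holds for the whole run and that batches complete one after another, and the function name `estimated_hours` is made up for the example.

```python
def estimated_hours(total_records: int, records_per_batch: int, minutes_per_batch: float) -> float:
    """Total run time in hours if batches finish one after another at a fixed pace."""
    batches = -(-total_records // records_per_batch)  # ceiling division
    return batches * minutes_per_batch / 60


hours = estimated_hours(100_000, 250, 15)
print(f"{hours:.0f} hours, about {hours / 24:.1f} days")  # 100 hours, about 4.2 days
```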

Here is another screenshot from this 100,000-record execution, which we stopped after approximately two and a half hours.

In the customer database table we have a unique index on the customer ID property (the PK) and no other indexes.

Posted: 14 Jan 2016 5:26 EST

ManuVarghese

PEGA

replied to RocíoR03

Hi Rocío,

Before starting the simulation, the process performs a delete for that particular Simulation Work ID to clear the existing output. For 100,000 records this may take some time. If the table was created by the Output Definition, it will have an index on the pyWorkID column; please cross-check that.

If only one simulation uses the underlying output table, you can truncate the table manually before starting a simulation run. This eliminates the time taken by the delete operation.
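The index check can be illustrated with a small, self-contained sketch. It uses an in-memory SQLite database purely as a stand-in; the real output table lives in the application database, and the table name `simulation_output`, the index name `idx_output_workid`, and the sample rows are all hypothetical. The point is only to show a delete by pyWorkID and a check that the query plan uses the index.

```python
import sqlite3

# In-memory SQLite stands in for the real database; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE simulation_output (pyWorkID TEXT, customer_id TEXT, result TEXT)")
conn.execute("CREATE INDEX idx_output_workid ON simulation_output (pyWorkID)")

# Two made-up simulation runs sharing one output table, 500 rows each.
rows = [(f"SIM-{n % 2 + 1}", f"CUST-{n}", "offer") for n in range(1000)]
conn.executemany("INSERT INTO simulation_output VALUES (?, ?, ?)", rows)


def uses_index(sql, params=()):
    """Report whether the query plan for this statement mentions the pyWorkID index."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return any("idx_output_workid" in row[-1] for row in plan)


delete_sql = "DELETE FROM simulation_output WHERE pyWorkID = ?"
print("Delete uses index:", uses_index(delete_sql, ("SIM-1",)))

# Clear the previous output of one run, as the simulation does before it starts.
conn.execute(delete_sql, ("SIM-1",))
remaining = conn.execute("SELECT COUNT(*) FROM simulation_output").fetchone()[0]
print("Rows left for other runs:", remaining)
```

On a production database the equivalent check would be done with that database's own tools, for example by looking at the execution plan of the delete statement.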

However, this delete is a one-time operation at the beginning of the run. From the statistics, the read speed for both IH and customer data (which also includes fetching additional embedded pages) dropped drastically compared with the 5,000-record run. Have you checked and compared the AWR reports (assuming this is Oracle)? Was the database performing poorly when this test was run?

On a general note, in all of the tests (single-node and multi-node) I noticed that the execution speed is slow, at 1 record per second, which brings the average speed down. The execution speed is the speed at which the strategy is executed. You will need to profile the strategy by running it in an Interaction rule and see which component of the strategy takes the most time.

Thanks

Manu
