Data Flow should run on different nodes

Question

AndreG66

Member since 2013

34 posts

Vodafone GmbH

Posted: 22 Feb 2019 8:25 EST
Last activity: 24 Apr 2019 11:10 EDT

Closed

Hi,

I have a cluster of two nodes, and both are declared to run data flows. But if I start a data flow with DataFlow-Execute, it is started on only one node. How can I distribute the data flow so that it runs on more than one node?

Background: I want to test the Kafka interface and check whether two nodes would process the same Kafka message twice, which is not wanted.

Thanks, Andre

Decision Management

Posted: 28 Feb 2019 6:05 EST

NigelPeach replied to AndreG66

Hi, the data flow will distribute the work automatically, but only to the extent that it can split the work into partitions. What are the partition key values set to in your source data set?
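
To illustrate the point about partitions limiting distribution, here is a minimal sketch in Python. It assumes a simple round-robin assignment of partitions to nodes; `assign_partitions` and the node names are hypothetical and are not Pega's actual assignment logic. It only shows why a source with a single partition can keep just one node busy.

```python
def assign_partitions(partitions, nodes):
    """Assign partitions to nodes round-robin (hypothetical helper, not a Pega API)."""
    assignment = {node: [] for node in nodes}
    for i, partition in enumerate(partitions):
        assignment[nodes[i % len(nodes)]].append(partition)
    return assignment


nodes = ["node1", "node2"]

# One partition: only one node receives work, the other stays idle.
one = assign_partitions([0], nodes)

# Four partitions: the work can be split across both nodes.
four = assign_partitions([0, 1, 2, 3], nodes)

print(one)   # {'node1': [0], 'node2': []}
print(four)  # {'node1': [0, 2], 'node2': [1, 3]}
```

If this model applies, the number of partitions in the source would be the upper bound on how many nodes can share the run.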

Posted: 28 Feb 2019 6:39 EST

AndreG66

Vodafone GmbH

replied to NigelPeach

Hi Nigel,

I'm currently prototyping, so I defined a simple integer as the partitioning key. What I have figured out in the meantime is that the partitioning key is only relevant when Pega sends Kafka messages. I also did a test in a cluster: I defined two data flow nodes and configured a real-time data flow, but this flow runs only on the first node and not on the second. My main goal is to prevent duplicate messages that could arise when the data flow is started on several nodes.

I also did an additional test: I implemented two data flows with the same Kafka data set listening to the same topic, and here each message is processed by both data flows. That makes sense, as it is a publish/subscribe and not a queue-based message exchange pattern. However, for a single data flow distributed across different nodes, I expect each message to be processed only once. With the Kafka producer tool I was able to set the partitioning key, and in Pega I can see that value in pzPartition.

... and thanks for the help!

Andre
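
The behaviour described in this post matches Kafka's consumer group model: within one consumer group each partition is read by exactly one member, while separate groups each receive every message. The sketch below simulates that in plain Python. It assumes that one data flow running on two nodes behaves like one consumer group with two members and that two separate data flows behave like two groups; that mapping to Pega is an assumption, and `partition_for` and `deliver` are hypothetical helpers. CRC32 stands in for Kafka's real key hashing purely to keep the example deterministic.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed partition count of the topic


def partition_for(key: str) -> int:
    # Stand-in for the producer's key hashing; not Kafka's actual partitioner.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def deliver(messages, groups):
    """Simulate delivery. groups maps a group id to its list of consumers.

    Returns {(group_id, consumer): [values received]}.
    """
    received = defaultdict(list)
    for group_id, consumers in groups.items():
        for key, value in messages:
            partition = partition_for(key)
            # Each partition has exactly one owner inside a group.
            owner = consumers[partition % len(consumers)]
            received[(group_id, owner)].append(value)
    return received


def total(received):
    return sum(len(values) for values in received.values())


messages = [(str(k), f"msg-{k}") for k in range(6)]

# One data flow on two nodes, modelled as one group with two members.
same_group = deliver(messages, {"flow-A": ["node1", "node2"]})

# Two separate data flows, modelled as two independent groups.
two_groups = deliver(messages, {"flow-A": ["node1"], "flow-B": ["node1"]})

print(total(same_group))  # 6: each message is processed once
print(total(two_groups))  # 12: each message is processed by both flows
```

Under these assumptions, duplicates come from having two groups, not from having two nodes inside one group.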

Posted: 4 Mar 2019 2:46 EST

NigelPeach replied to AndreG66

Andre, hi, thanks for the context. I was originally answering from a pure data flow/data set perspective. The Kafka data set is outside my experience, but I'll check with others as to how the partitioning should work.

Posted: 18 Mar 2019 3:25 EDT

AndreG66

Vodafone GmbH

replied to AndreG66

Hi,

Any updates? I still need help.

Andre

Posted: 24 Apr 2019 10:45 EDT

KevinZheng_GCS

PEGA

replied to AndreG66

Are you running an external Kafka? And is your use case to read messages from this external Kafka? Which Pega Platform version are you running?

Posted: 24 Apr 2019 11:10 EDT

AndreG66

Vodafone GmbH

replied to KevinZheng_GCS

Hi Kevin,

Currently I'm just prototyping, but it should be an external Kafka, and we are using Pega Platform 7.3 with Pega Marketing 7.22.
