Amit Bhabhe (AmitB441)

AmitB441 Member since 2019 1 post
Posted: 22 Nov 2019 8:38 EST
Last activity: 26 Feb 2020 14:06 EST

Using a real-time data flow with a Kafka data set to read only new records

We are exploring how Pega works with a Kafka data set in a real-time data flow. How does it keep track of which records have already been read and processed from the data set? Here is an example with our observations.

1] Configure a real-time data flow with a Kafka data set as its source.

2] Set the read option to 'Only read new records'.

3] Using a Kafka producer, post a few messages to the topic that is configured in the Kafka data set. Say you have posted 3 messages.

4] Review the component statistics (data flow run statistics): they show 3 successfully processed records.

5] Stop the data flow and post another message (the 4th message).

6] Start the data flow and post another message (the 5th message).

7] Review the component statistics (data flow run statistics) again: only the 5th message has been processed, which means the 4th message was lost or never processed. Is this expected behaviour? What is the definition of a 'new record'? Is it anything posted after the data flow is started or restarted, or everything posted since the last processed record?
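
To make the two possible definitions of 'new record' concrete, here is a minimal Python sketch that replays the steps above against an in-memory stand-in for a single-partition topic. It is not Pega's implementation and does not use a real Kafka client; `TopicLog`, `Reader`, `run_scenario` and the policy names `latest_on_start` and `resume_committed` are all hypothetical names made up for illustration. In plain Kafka terms, the first policy roughly corresponds to a consumer that starts at the end of the log on every start (for example `auto.offset.reset=latest` with no committed offsets for its group), and the second to a consumer that resumes from its committed offset.

```python
from dataclasses import dataclass, field


@dataclass
class TopicLog:
    """In-memory stand-in for a single-partition topic (hypothetical)."""
    messages: list = field(default_factory=list)

    def post(self, message):
        self.messages.append(message)

    @property
    def end_offset(self):
        # Offset that the next posted message will receive.
        return len(self.messages)


class Reader:
    """Hypothetical reader illustrating two start-up policies."""

    def __init__(self, log, policy):
        self.log = log
        self.policy = policy      # "latest_on_start" or "resume_committed"
        self.committed = None     # offset of the next unread message, if any
        self.position = None
        self.running = False
        self.processed = []

    def start(self):
        if self.policy == "resume_committed" and self.committed is not None:
            # Continue from the last processed record.
            self.position = self.committed
        else:
            # Skip everything that is already in the log at start time.
            self.position = self.log.end_offset
        self.running = True

    def stop(self):
        self.committed = self.position
        self.running = False

    def poll(self):
        if not self.running:
            return
        while self.position < self.log.end_offset:
            self.processed.append(self.log.messages[self.position])
            self.position += 1


def run_scenario(policy):
    """Replay steps 1] to 7] and return the messages that were processed."""
    log = TopicLog()
    reader = Reader(log, policy)
    reader.start()                    # steps 1] and 2]
    for message in ("m1", "m2", "m3"):
        log.post(message)             # step 3]
    reader.poll()                     # step 4]
    reader.stop()
    log.post("m4")                    # step 5]
    reader.start()
    log.post("m5")                    # step 6]
    reader.poll()                     # step 7]
    return reader.processed


if __name__ == "__main__":
    for policy in ("latest_on_start", "resume_committed"):
        print(policy, run_scenario(policy))
```

Under `latest_on_start` the 4th message is skipped, which matches what we observed; under `resume_committed` all five messages are processed. The question is which of these two behaviours 'Only read new records' is meant to provide.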

We have raised a support request with Pega about messages being lost in the scenario above. The GCS team suggested raising it on the support community, hence this post.

Data Integration