PRPC does not currently take advantage of consumer groups. It doesn't need them because it keeps track of its position in the topic via the database rather than relying on Kafka. I believe, though, that you can set a consumer group for all PRPC connections to Kafka via dynamic system settings (DSS).
I have seen odd behavior when stopping and starting the Kafka connector: it pulls messages from the beginning of the Kafka topic instead of picking up where it left off. I am trying to understand whether this is because a new consumer group is assigned automatically or because there is no consumer group at all. It would be helpful to know how the Pega Kafka connector manages the connection on stop and start.
Just making sure we are talking about the same thing: you mean the Kafka dataset (connectors in PRPC are something different)? At least currently, if you want to browse continuously from a Kafka dataset, you are meant to use it as part of a real-time dataflow. The real-time dataflow is responsible for keeping the read position for all Kafka partitions of the topic.
If the real-time dataflow is paused/resumed, the Kafka dataset will resume reading from where it stopped.
If the dataflow is stopped/restarted, the Kafka dataset will start reading based on the configured initial position (either the earliest message or only new messages).
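The pause/resume and stop/restart behavior described above can be modeled with a small sketch. This is only an illustration, not Pega's implementation: `DataflowRun`, its methods, and the `"earliest"`/`"latest"` values are hypothetical names, and the topic is reduced to a single partition held in a Python list.

```python
class DataflowRun:
    """Toy model of a real-time dataflow run that owns the read position.

    Hypothetical class for illustration only; not a Pega or Kafka API.
    """

    def __init__(self, topic, initial_position="earliest"):
        if initial_position not in ("earliest", "latest"):
            raise ValueError("initial_position must be 'earliest' or 'latest'")
        self.topic = topic                  # list standing in for one partition
        self.initial_position = initial_position
        self.offset = None                  # position exists only while the run is started

    def start(self):
        # A fresh start always derives the position from the configured initial position.
        self.offset = 0 if self.initial_position == "earliest" else len(self.topic)

    def pause(self):
        # Pausing keeps the current position untouched.
        pass

    def resume(self):
        # Resuming continues from the position kept during the pause.
        pass

    def stop(self):
        # Stopping discards the position together with the run.
        self.offset = None

    def poll(self):
        if self.offset is None:
            raise RuntimeError("run is not started")
        batch = self.topic[self.offset:]
        self.offset = len(self.topic)
        return batch


topic = ["m0", "m1", "m2"]
run = DataflowRun(topic, initial_position="earliest")

run.start()
first = run.poll()            # all three messages

run.pause()
topic.append("m3")            # a message arrives while paused
run.resume()
after_resume = run.poll()     # only the message that arrived during the pause

run.stop()
run.start()
after_restart = run.poll()    # reads from the beginning again

latest_run = DataflowRun(topic, initial_position="latest")
latest_run.start()
latest_first = latest_run.poll()  # nothing until new messages arrive
```

In this model, the only way to continue from the last consumed message is to pause and resume the run rather than stop and restart it.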
That clears it up a little. We have the Kafka dataset as part of a real-time dataflow to do some windowing.
But the behavior is very strange: it reads either from the beginning or only new messages. Most Kafka clients resume from where they left off on stop and start. I think this is because consumer groups are not implemented. Does a Kafka dataset get a consumer group assigned automatically, or is it never assigned one? A lot of the implementation details seem to be a black box.
You can set a global consumer group for PRPC via DSS, but this is mostly used for authorization purposes. Again, PRPC does keep the position, but only in the context of the dataflow run, instead of using the one stored in ZooKeeper as part of the consumer group. This allows you to run multiple dataflows attached to the same Kafka topic without having to mess with Kafka consumer groups.
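To make the distinction concrete, here is a rough sketch contrasting a position stored per consumer group with a position kept per run. It is a simplified model, not Kafka's or Pega's actual code: `Topic`, `read_with_group`, and `RunScopedReader` are hypothetical names, the topic has a single partition, and the group offset is committed immediately after each read.

```python
class Topic:
    """Toy single-partition topic with group offsets stored alongside it."""

    def __init__(self):
        self.messages = []
        self.group_offsets = {}   # committed offset per group id


def read_with_group(topic, group_id):
    """Read from the group's committed offset, then commit the new position."""
    start = topic.group_offsets.get(group_id, 0)
    batch = topic.messages[start:]
    topic.group_offsets[group_id] = len(topic.messages)
    return batch


class RunScopedReader:
    """Reader whose position lives in the run itself, not with the topic."""

    def __init__(self, topic):
        self.topic = topic
        self.offset = 0           # assumes an 'earliest' initial position

    def read(self):
        batch = self.topic.messages[self.offset:]
        self.offset = len(self.topic.messages)
        return batch


topic = Topic()
topic.messages += ["a", "b"]

# Two readers sharing one group id share one position:
# the second reader finds nothing left to consume.
group_first = read_with_group(topic, "shared-group")
group_second = read_with_group(topic, "shared-group")

# Two run-scoped readers each keep their own position:
# both see every message without any group configuration.
run_one = RunScopedReader(topic).read()
run_two = RunScopedReader(topic).read()
```

The trade-off in this model is that a group-scoped position survives a reader restart, while a run-scoped position lets several independent runs read the same topic in full.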
We are using the Kafka dataset, not the PRPC connector. I am trying to understand how the Kafka dataset works under the hood: whether it sets up a default consumer group or has none at all. Any Kafka client should, on stop and start, continue from the last message it consumed. With the Kafka dataset, it looks like on stop and start we either read from the beginning or read only new messages.
As you mentioned in the post, can you please share how to set the "global consumer group" via DSS, and what DSS name and value should be set?
We have set up a Kafka listener and dataset that are SSL-enabled. Test connectivity works fine, but all the messages we receive via the dataset are failing, and when we open the XML it is blank.