We are trying to use a Data Flow to read data from a Kafka topic. In the Data Flow we have a Data Set that reads records from the Kafka topic as the source, and an Activity as the destination. For now, the Activity only writes a log message. When we run the Data Flow directly, we don't see anything under successful records for the destination. Moreover, nothing gets written to the logs, so the Activity is not being triggered, and it does not appear in the Tracer either.
We have exactly the same use case. My question is slightly different though.
We can successfully read messages off an external Kafka topic via a Kafka Data Set used as the source of a real-time Data Flow. This is all working. However, I can't see where the JSON payload is made available to the Data Transform configured in the Data Flow. See screenshot below. My question is how we can reference the JSON payload read by the Kafka Data Set in the Data Transform, and in the Activity for that matter.
I have a similar POC on 8.2.1 and also note that there is never a result count under "Successful records" for the Destination configuration, but the destinations do run.
One of the Destination configurations in my Data Flow is similarly an Activity that executes a Log-Message, and I see the messages appearing in the PegaRULES log. Some things to be mindful of about this Activity:
Its Applies To class needs to match (or be a superclass of) the class of the data your Data Flow carries at that point in the flow. The rule form will be prescriptive about this, and you would presumably see log messages calling out a "rule not found" issue if an inappropriate class were involved.
With the default logging configuration, your Log-Message method needs to use InfoForced or Error as its LoggingLevel; Info and Debug are not logged by default. Use InfoForced if you just want to prove what the data looks like during this phase of development, but never let InfoForced Log-Message statements execute in Production, particularly in real-time processing Data Flows that will hit the step a very large number of times.
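To see why Info and Debug messages silently disappear, it can help to think of Log-Message as a thin wrapper over an ordinary threshold-based logger. The sketch below is a plain Java/SLF4J illustration of that filtering, not Pega's actual implementation; the logger name and messages are assumptions.

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LogLevelDemo {
    // Hypothetical logger name; Pega activities log under their own categories.
    private static final Logger LOG = LoggerFactory.getLogger("MyDataFlowActivity");

    public static void main(String[] args) {
        // With a typical default threshold of INFO or WARN, lower-level calls
        // are filtered out before anything reaches the log file.
        LOG.debug("Dropped unless the logger is explicitly set to DEBUG");
        LOG.info("May also be dropped, depending on the configured threshold");
        LOG.error("Recorded under the default configuration");

        // Cheap guard so an expensive message is only built when it will be written.
        if (LOG.isDebugEnabled()) {
            LOG.debug("Expensive diagnostic: {}", buildDiagnostics());
        }
    }

    private static String buildDiagnostics() {
        return "clipboard page snapshot"; // placeholder
    }
}
```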
Note that if you are running a multi-node setup, your real-time processing Data Flow will likely run on a different node from the one you are logged into as a developer. Your log messages - once written - will go to the log file on that other node, which may also explain why you are not seeing log entries.
If getting access to the other node's logs is problematic, consider less node-specific ways to record diagnostics about what your Data Flow is doing:
Write the data item to a database table using a Database Table Data Set or an Activity with Obj-Save/WriteNow (again, not in Production)
Write the data item to a separate Kafka topic using a different Kafka Data Set from the one used as your source (a plain-Java consumer for checking such a topic is sketched after this list)
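If you take the separate-topic route, you can confirm records are actually flowing without touching the Pega node at all by tailing the diagnostic topic with a plain kafka-clients consumer. The sketch below is only an illustration; the broker address, topic name, and consumer group are assumptions to replace with your own values.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DiagnosticTopicTail {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");      // assumed broker address
        props.put("group.id", "dataflow-diagnostic-check");    // assumed consumer group
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Assumed diagnostic topic name; use whatever your Kafka Data Set writes to.
            consumer.subscribe(Collections.singletonList("dataflow-diagnostics"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Each value should be the JSON page the Data Flow emitted.
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
            }
        }
    }
}
```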
Even running the Data Flow from its rule form's Actions menu appears to set up a temporary Data Flow run which then executes as if it were a real-time processing run - on the RealTime node. Even on a single-node setup (I am using Personal Edition, which is single-node), rule-tracing the Activity your Data Flow calls shows no events. I suspect this is because the real-time Data Flow creates a new Requestor when it starts up (similar to an Agent), and rule tracing only monitors the Requestors that were active when the trace started - the same issue we used to have with tracing Agents before Admin Studio.
Ensuring your logging Activity writes to a log you can actually get at, or recording your diagnostics somewhere else you can readily reach, is a better way to troubleshoot your Data Flow than the Tracer.