Does Data flow pick up records inserted in source after DF is started?

Question

ABHINANDAN

Member since 2011

32 posts

Areteans Technology Solutions

Posted: 11 Jun 2018 12:42 EDT
Last activity: 13 Jun 2018 4:58 EDT

Closed

Does Data flow pick up records inserted in the source after DF is started?

To be more precise: say the source has 10 records when the DF is started, and 5 more records are inserted into the source while the DF is running. Will those 5 be picked up in the same DF execution?

We tried to simulate this and found that the new records were not picked up, but we are not sure we simulated it correctly. So I wanted to know how the DF works internally and how it will behave in this scenario.

The reason I am concerned is that in our project we run a DF on a source where records can be inserted concurrently through another channel. If the DF picks up new records, it could take considerable time to finish.

Regards

Abhi

Decision Management

Posted: 12 Jun 2018 12:19 EDT

Raju Botu

Aaseya IT Solutions

replied to ABHINANDAN

From my experience with Data flows, since your data source is not a stream, it might not pick up newly inserted records.

To test this, we need to make sure the data flow has not completed before the new records are committed to the DB. I will test this scenario and update the post again tomorrow.

Also, at the start of the DF there is a lot of work to decide on the number of partitions, the distribution of data across the nodes, and so on, so it clearly looks at the content available at that moment.
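
To illustrate the point about planning at start-up, here is a toy Python model. It is not Pega's implementation: the functions `plan_partitions` and `run`, and the idea that the partition list is fixed once from a snapshot, are assumptions made purely for illustration.

```python
def plan_partitions(rows):
    """Snapshot the distinct partition keys present when the run starts."""
    return sorted({row["partition"] for row in rows})


def run(source, planned, insert_mid_run):
    """Read only the planned partitions; new rows arrive after planning."""
    source.extend(insert_mid_run)  # rows committed after the plan was made
    return [row["id"] for p in planned for row in source if row["partition"] == p]


source = [
    {"id": 1, "partition": "A"},
    {"id": 2, "partition": "B"},
]
planned = plan_partitions(source)
late_rows = [{"id": 3, "partition": "A"}, {"id": 4, "partition": "C"}]
processed = run(source, planned, late_rows)

print(planned)    # ['A', 'B']
print(processed)  # [1, 3, 2]
```

In this model, a late row in an already planned partition (id 3) can still be read, while a late row in a partition that did not exist at planning time (id 4) is never visited.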

Posted: 12 Jun 2018 14:28 EDT

ABHINANDAN

Areteans Technology Solutions

replied to ABHINANDAN

Thanks Raju, do update us about your findings.

As I mentioned in my post, I tried simulating the same thing you describe: when I started the DF, there were 4 committed records and 4 uncommitted records. In the middle of the DF execution I committed the remaining 4 and found that the DF didn't pick them up.

However, I intend to retry the experiment after reducing the batch size to 2 (the default is 1000). It's possible that when the DF started it found 4 records and, with a batch size of 1000, fetched all 4 in the first retrieve and hence never did a second fetch.

If I decrease the batch size to 2, it will have to do more than one fetch, and I would like to see how it behaves then.
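
The hypothesis above can be sketched with a toy reader in Python. This is only a model of the experiment, not how the DF actually reads a DB source: the function `run_data_flow`, offset-based fetching, and the "stop on a short batch" rule are all assumptions.

```python
def run_data_flow(source, batch_size, commit_after_first_fetch):
    """Read `source` in batches; commit extra rows right after the first fetch."""
    processed, offset, committed_late = [], 0, False
    while True:
        batch = source[offset:offset + batch_size]
        if not batch:
            break
        processed.extend(batch)
        offset += len(batch)
        if not committed_late:
            # Simulates the other channel committing rows mid-run.
            source.extend(commit_after_first_fetch)
            committed_late = True
        if len(batch) < batch_size:
            break  # assumed rule: a short batch means the source is exhausted
    return processed


default_run = run_data_flow([1, 2, 3, 4], 1000, [5, 6, 7, 8])
small_run = run_data_flow([1, 2, 3, 4], 2, [5, 6, 7, 8])

print(default_run)  # [1, 2, 3, 4]
print(small_run)    # [1, 2, 3, 4, 5, 6, 7, 8]
```

Under these assumptions, the first run matches what I observed (one fetch, late rows missed), and the second shows why a batch size of 2 could give a different result.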

Regards

Abhi

Uday Parki

Posted: 13 Jun 2018 4:58 EDT

pereh replied to ABHINANDAN

Hi Abhi,

I'd be careful about basing your solution on whether the DF will pick up those records or not; there are too many factors that could influence whether they are processed. From the comments here I'm guessing you have a DB source, and you are on the right track with one of these conditions: the batch size will also influence whether a new record is processed.

If all records have been read, the DB source signals that it is done; if you insert new records only after that, they will not be processed.

There are more factors that could influence that:

If your source class has keys, the records are processed in key order. That means a new record inserted before the record currently being processed won't be processed, but one inserted after it will.
If you have a partitioned source and you insert a record into a partition that didn't exist before, it will never be processed.

And that's only for DB; different sources could behave differently depending on their nature.
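
The key-order effect can be sketched in Python as keyset paging, where each fetch asks for keys greater than the last one seen. This is a simplified model under assumptions, not the actual DB source code: `read_in_key_order` and the integer keys are hypothetical.

```python
import bisect


def read_in_key_order(keys, batch_size, inserts_after_first_batch):
    """Keyset paging: each fetch returns keys greater than the last key seen."""
    keys = sorted(keys)
    processed, last_key, inserted = [], None, False
    while True:
        start = 0 if last_key is None else bisect.bisect_right(keys, last_key)
        batch = keys[start:start + batch_size]
        if not batch:
            break
        processed.extend(batch)
        last_key = batch[-1]
        if not inserted:
            # Rows inserted by another channel while the run is in progress.
            for key in inserts_after_first_batch:
                bisect.insort(keys, key)
            inserted = True
    return processed


result = read_in_key_order([10, 20, 30, 40], 2, [15, 35])
print(result)  # [10, 20, 30, 35, 40]
```

Key 15 lands behind the read position (already past 20) and is skipped, while key 35 lands ahead of it and is processed in the same run.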
