Support Center

Question

Suganyak1359

Member since 2017

6 posts

Sensiple Software Solutions Pvt Ltd

Posted: 6 Sep 2019 2:16 EDT
Last activity: 10 Sep 2019 10:57 EDT

Closed

Data capture

Hi,

I am trying to capture the value of the 'Balance Due' field in this PDF. Can someone suggest how to get the amount from the Balance Due field? I have attached my PDF for reference.

Thanks.

Robotic Process Automation

Posted: 6 Sep 2019 9:45 EDT

ThomasSasnett

PEGA

replied to Suganyak1359

Here you go. The key to working with PDFs is to look at the Developer Tools available in the PDFViewer. You can highlight the various parts you can extract and then determine how to navigate the document from there.

I hard-coded this to use the attached PDF, but you could replace that with a selection if you like.

Posted: 9 Sep 2019 6:20 EDT

chakri5122

Accenture

replied to ThomasSasnett

Hi Thomas,
What is the purpose of the PDFViewer in the Windows form?

Posted: 9 Sep 2019 9:02 EDT

ThomasSasnett

PEGA

replied to chakri5122

It serves two purposes:

1. It allows you to see the PDF to validate what you are pulling.

2. It allows you to enable the developer options and highlight the words, segments, and lines. This helps with development because you can see how the file is being parsed.

Posted: 9 Sep 2019 6:41 EDT

Suganyak1359

Sensiple Software Solutions Pvt Ltd

replied to Suganyak1359

Hi,

Thanks for the response! I looked at the attachment; it is working great.

I couldn't find System.IO.Path#GetDirectoryName in my Robotics Studio.

Also, can you please explain the PDF_P_LoadFiles.os automation flow, and the use of the RuntimeHost and System.IO.Path blocks in that flow?

Is it possible to capture the same fields through OCR techniques?

Posted: 9 Sep 2019 9:12 EDT

ThomasSasnett

PEGA

replied to Suganyak1359

1. To add any static .NET method (such as System.IO.Path.GetDirectoryName), right-click an area of the Toolbox and select Choose Items. Then select "Pega Robotics Static Members", and next select "From Global Assembly Cache". Finally, select the assembly that contains the method you wish to use. Most of the interesting ones are in either mscorlib or System; in this case, select mscorlib (it is near the bottom of the "m" entries, so if you press the letter "n" and scroll up, you can find it more quickly) and locate the Path node, since GetDirectoryName belongs to System.IO.Path. Simply check the box next to whichever method you like.

2. I wanted to attach the PDF to the example for ease of distribution. In practice, you'd load the PDF from another path. The logic there is just used to locate the file on-disk in the extract directory of the solution, since I have attached the file to the deployment.

3. Not directly. OCR simply turns whatever file you have into a PDF so that you can use the PDF Connector component on it. I guess you could get the entire text of the file and parse it yourself, though.
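
The last suggestion, getting the entire text and parsing it yourself, could look roughly like the sketch below. It is plain Python rather than a Pega Robotics automation, and it assumes the text has already been extracted. The sample text, the label layout, and the `extract_balance_due` helper are hypothetical, since the attached PDF is not reproduced here.

```python
import re
from decimal import Decimal
from typing import Optional

# Assumed layout: the label "Balance Due", an optional colon, an optional
# dollar sign, then an amount with optional thousands separators.
_BALANCE_DUE = re.compile(
    r"Balance\s+Due\s*:?\s*\$?\s*(\d[\d,]*(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_balance_due(text: str) -> Optional[Decimal]:
    """Return the first amount that follows 'Balance Due', or None if absent."""
    match = _BALANCE_DUE.search(text)
    if match is None:
        return None
    # Drop thousands separators before converting to an exact decimal.
    return Decimal(match.group(1).replace(",", ""))


# Hypothetical text, standing in for whatever the PDF extraction returns.
sample_text = """Invoice 0001
Subtotal      1,200.00
Balance Due   $1,250.50
"""

print(extract_balance_due(sample_text))  # prints 1250.50
```

A regular expression is brittle if the label and amount are not adjacent in the extracted text, which is one reason to inspect the words, segments, and lines in the viewer first.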

Posted: 10 Sep 2019 8:54 EDT

Suganyak1359

Sensiple Software Solutions Pvt Ltd

replied to ThomasSasnett

Hi,

I see you have uploaded the sample PDF in the solution. I could neither attach my sample PDF to the solution nor locate my PDF folder through automation.

Can you tell me how to open PDF files located in my folder and load them in the PDF viewer in Robotics Studio? I tried using the Path.GetDirectoryName method, but an error occurs.

Thanks.
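
For anyone following along: System.IO.Path.GetDirectoryName only returns the folder portion of a path string; it does not open or load a file. The error mentioned above is not described, so its cause is unknown. As a rough illustration of the path operations involved, here is a minimal Python sketch (not Pega Robotics code). The path `C:\Invoices\2019\sample.pdf` and the `split_pdf_path` helper are hypothetical.

```python
import ntpath  # Windows path rules; works on any OS for this illustration
from typing import Tuple


def split_pdf_path(pdf_path: str) -> Tuple[str, str]:
    """Split a Windows path into (folder, file name)."""
    # Analogous to System.IO.Path.GetDirectoryName: the parent folder.
    folder = ntpath.dirname(pdf_path)
    # Analogous to System.IO.Path.GetFileName: the last path component.
    file_name = ntpath.basename(pdf_path)
    return folder, file_name


# Hypothetical location of a PDF on disk.
pdf_path = r"C:\Invoices\2019\sample.pdf"
folder, file_name = split_pdf_path(pdf_path)

# Analogous to System.IO.Path.Combine: rebuild the full path to the file.
rebuilt = ntpath.join(folder, file_name)

print(folder)
print(file_name)
print(rebuilt == pdf_path)
```

The point of the sketch is that the directory name is only one piece: whatever loads the PDF still needs the full path to the file, not just the folder.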
