Support Center

Question

VigneshR8038

Member since 2016

3 posts

TCS

Posted: 26 Dec 2019 10:39 EST
Last activity: 30 Dec 2019 9:39 EST

Closed

OCR pattern extraction

Hi,

I am trying to extract particular details from an image, say a PAN card. We hit a blocker: for two similar scanned IDs, the OCR returned the information in a different order. I was unable to find a pattern for pulling specific details out of the text. Any help would be appreciated.

Robotic Process Automation

Posted: 26 Dec 2019 11:15 EST

ThomasSasnett

PEGA

replied to VigneshR8038

Please provide some more details. It is not clear from your post what you are asking for. Do you have any examples you can provide? Screenshots and specific behaviors are really helpful. With that, I might be able to offer some suggestions.

Posted: 26 Dec 2019 12:29 EST

VigneshR8038

TCS

replied to ThomasSasnett

Thanks for the quick response. We are trying to extract details from a scanned image using the DocumentOCR component. For the first sample identity card, the unique number at the bottom of the card is extracted first in the text output. For a different sample, the text comes out in the same order as it appears on the card. How do we find a common pattern to fetch the details we need from the full extracted string, given that the OCR extracts all the text in the entire image? Please suggest. Happy to provide more information if needed.

Posted: 26 Dec 2019 13:23 EST

ThomasSasnett

PEGA

replied to VigneshR8038

I still do not think I fully understand what you are asking. It sounds like you have text that you've extracted via OCR and you are asking how to parse that text. I cannot answer that without that text. If you cannot provide the text as an example, then I would suggest looking into RegEx (Regular Expressions). These are useful for extracting patterns from text. I use a tool called "Expresso" to help build and test more complex RegEx although there may be better ones or even web-based ones as well.

Posted: 29 Dec 2019 8:04 EST

VigneshR8038

TCS

replied to VigneshR8038

Thanks for the response. Below are the two scenarios we see when scanning the same type of card for two different people.

Scenario 1:
"HHcJohn DarrenXXXX1234XV:.^-Signature > .C"-Signature1dd/mm/yyyyAccount NumberABC Organization

Scenario 2:
fcRPFHRcTABC Organizationdd/mm/yyyyAccount Number*XXXX1234XMichael JamesSignatureS?;*

We mainly want to extract the important details, such as the name and the card number (XXXX1234). In both scenarios the OCR output is out of order compared to the scanned card, and we couldn't find a common pattern to extract the details. Please suggest.
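
Following the RegEx suggestion earlier in the thread, here is a minimal Python sketch of an order-independent extraction run against the two sample strings above. Everything in it is an assumption drawn from those two samples only: the ID shape (four letters, four digits, one trailing letter; drop the final `[A-Z]` if that letter is not part of the number), the name shape (two capitalized words), and the hypothetical helper `extract_fields`. Real cards will likely need looser patterns and more label text to filter out.

```python
import re

# Assumed masked ID shape from the two samples, e.g. "XXXX1234X".
ID_PATTERN = re.compile(r"[A-Z]{4}\d{4}[A-Z]")

# Assumed name shape: two capitalized words separated by one space.
NAME_PATTERN = re.compile(r"[A-Z][a-z]+ [A-Z][a-z]+")

# Fixed label text printed on every card that also looks like a name.
KNOWN_LABELS = {"Account Number"}


def extract_fields(ocr_text):
    """Return (name, card_id) found anywhere in the text; None for a miss."""
    id_match = ID_PATTERN.search(ocr_text)
    # Search the whole string, so the position of each field does not matter.
    names = [n for n in NAME_PATTERN.findall(ocr_text) if n not in KNOWN_LABELS]
    name = names[0] if names else None
    card_id = id_match.group(0) if id_match else None
    return name, card_id


SCENARIO_1 = ('"HHcJohn DarrenXXXX1234XV:.^-Signature > .C"-Signature1'
              'dd/mm/yyyyAccount NumberABC Organization')
SCENARIO_2 = ('fcRPFHRcTABC Organizationdd/mm/yyyyAccount Number*'
              'XXXX1234XMichael JamesSignatureS?;*')

print(extract_fields(SCENARIO_1))  # ('John Darren', 'XXXX1234X')
print(extract_fields(SCENARIO_2))  # ('Michael James', 'XXXX1234X')
```

Because each pattern is searched across the whole string rather than at a fixed offset, the same call works whether the number comes before or after the name. The same two patterns should carry over to whatever RegEx support the automation tool offers.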

Posted: 30 Dec 2019 9:39 EST

jeffbadger

PEGA

replied to VigneshR8038

Have you tried using the DocumentOCR component with the new PdfConnector functionality? When setting up a document type you can add both scenarios to the document type and then test to see which is present. In my experience, when the OCR reads the document in a different order it is usually either not the same document or there is some rotation on the document that has caused lines to skew. The DocumentOCR component handles rotation pretty well but that is not an exact science.
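
The "test to see which is present" step can be pictured in plain Python. This is only an illustration of the idea, not the DocumentOCR or PdfConnector API; the helper `detect_layout` and the two marker strings are assumptions based on the sample text posted earlier in the thread.

```python
def detect_layout(ocr_text, label="Account Number", org="Organization"):
    """Name the layout by which of two fixed strings appears first."""
    label_pos = ocr_text.find(label)
    org_pos = ocr_text.find(org)
    if label_pos == -1 or org_pos == -1:
        # A marker is missing, so this may not be the same document type.
        return "unknown"
    return "label_first" if label_pos < org_pos else "org_first"


print(detect_layout("dd/mm/yyyyAccount NumberABC Organization"))   # label_first
print(detect_layout("ABC Organizationdd/mm/yyyyAccount Number*"))  # org_first
```

Once the layout is known, a layout-specific parsing rule can be applied, and an "unknown" result is a hint to check for a different document or a rotated scan.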
