Posted: 9 Nov 2016 2:05 EST Last activity: 10 Nov 2016 9:30 EST
Need to read data from DOC and PDF files
I have a requirement where different types of files (doc, docx, pdf) arrive on a filesystem path. Based on a filename input, I pull the file from the server, read its contents, and copy them to a property whose control is a rich text editor. I am able to read the file contents; however, the formatting, alignment, images, and tables all come through as plain text, so the data is displayed without any formatting or alignment. I read the file from Java code because we have to pick only one specific file, which cannot be achieved with a file listener.
Please suggest an approach, or let me know whether I need to modify my code, copied below.
Since this isn't something PRPC offers OOTB, and you are already using the third-party library PDFBox (which does ship with PRPC), I think you also need to check more general forums for advice on this.
The same goes for the other file types you mention (Word): you can use the Apache POI library for these, which also ships with PRPC (in this case I believe it is actually a repackaged version of the library).
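As a rough starting point, choosing the library per file type could look something like the sketch below. It uses only the JDK, and everything in it is an assumption for illustration: the class name DocumentTypeRouter, the Kind enum, and the base directory argument are hypothetical and are not PRPC or library APIs. It only locates the requested file and decides which parser family applies; the actual conversion would still be done with POI (HWPF for .doc, XWPF for .docx) or PDFBox.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper: decides which parser family a requested file needs.
final class DocumentTypeRouter {

    // Illustrative categories, not part of PRPC, POI, or PDFBox.
    enum Kind { WORD_BINARY, WORD_OOXML, PDF, UNSUPPORTED }

    // Classify by extension, case-insensitively.
    static Kind classify(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Kind.UNSUPPORTED; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "doc":  return Kind.WORD_BINARY; // Apache POI HWPF
            case "docx": return Kind.WORD_OOXML;  // Apache POI XWPF
            case "pdf":  return Kind.PDF;         // PDFBox
            default:     return Kind.UNSUPPORTED;
        }
    }

    // Resolve the requested name under an assumed base directory and
    // reject names that would escape it (for example "../other.pdf").
    static Path resolve(String baseDir, String fileName) {
        Path base = Paths.get(baseDir).toAbsolutePath().normalize();
        Path target = base.resolve(fileName).normalize();
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("File is outside the base directory: " + fileName);
        }
        return target;
    }
}
```

Keep in mind that plain text extraction drops layout by design, so to keep tables, alignment, and images in a rich text editor the content would most likely have to be converted to HTML rather than read as text.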
Sorry I couldn't provide any more specific information here. Maybe somebody else who has done something similar with PRPC could help out?