8 Replies Latest reply: May 15, 2014 10:09 AM by elginreigner _

    File Import mashing columns together

    guyporter _

      I regularly import pdf files into Monarch for export to Excel. The information generally takes the form of columns of aged debt numbers.

      Often I find that some columns are mashed together and out of alignment. Sometimes this is a whole column and sometimes just part of a page that is out. Obviously I vary the traps and painting to try to get the right data in the right place as well as varying the spacing on the input option to try to help this.

      I have a report that has done something I have never seen before and has actually moved some of the digits out of order!!

      [      140.45      

                   0. 00

                51.58 1 

              0.00      

              0.00      

            107.27      

           ,445.25 1    

              0.00      

          3,836.38      

      ]

      It is not clear from this example, but the number 51.581 should in fact read 151.58. It has taken the 1 from the front and put it at the back. It has, however, not done this on all the fields: ,445.251 should be 1,445.25 but 107.27 is correct.

      If you can offer any solution to this problem then I would be delighted to hear it. The user will have to otherwise print the report off and do the calculations manually. The report looks ok in the original PDF and I have no idea why Monarch has done this.

      Thanks

      Guy

        • File Import mashing columns together
          Grant Perkins

          Guy,

           

          There are a number of adjustments that you can make to help Monarch interpret PDF files  - more in V10 than V9 Monarch as new issues of 'standards' were identified amongst the many PDF writing engines out in the wild.

           

          The thing is that a PDF is, to all intents and purposes, a 'graphics' document that contains a graphics block and some positional information. In the case of 'text' in the document, it contains font and size information too.

           

          It is quite likely that this information could be anywhere (more or less) in the file or a sub-section of the file - say a line number of text - and is re-interpreted as the file is output. This is (or certainly was) the way compression worked to keep file sizes down when storage capacity was small and expensive.

           

          So it seems likely that what you have found is a file written in a way that appears OK within a dedicated PDF display program but that offers some challenges to Monarch's internal PDF interpretation systems within V9. (I assume V9, having checked your profile.)
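
          One way to picture this is the sketch below. It assumes the amount is stored as two text fragments that are written to the file out of visual order. The fragment list, the coordinates and both function names are hypothetical, and this is not a description of how Monarch actually reads a PDF.

```python
from typing import List, Tuple

# (x position on the page, text drawn at that position) -- hypothetical data
Fragment = Tuple[float, str]


def join_in_file_order(fragments: List[Fragment]) -> str:
    """Concatenate fragments in the order they are stored in the file."""
    return "".join(text for _, text in fragments)


def join_by_position(fragments: List[Fragment]) -> str:
    """Concatenate fragments left to right, as a viewer would display them."""
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return "".join(text for _, text in ordered)


# Assumed layout: "51.58" is stored first, but the leading "1" sits further left.
fragments = [(412.0, "51.58"), (406.0, "1")]

print(join_in_file_order(fragments))  # 51.581
print(join_by_position(fragments))    # 151.58
```

          A viewer that honours the positions shows 151.58, while a reader that trusts the storage order would produce 51.581, which matches the symptom in the original post.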

           

          What happens to the report if you open it using the Adobe PDF reader and export it as text from there? How does it look? It should work, of course, but I have seen examples where the reader failed to format the lines correctly.

           

          If that works for the specific report I would suggest using the reader to get the text file and then Monarch from there to get your file for Excel.

           

          It might be worth making a note of where the file originated and which PDF Writer software and version produced it. That way any future anomalous files can be compared to see if a common theme appears.
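
          A rough way to collect that information is sketched below. It assumes the file's document information dictionary is stored uncompressed and holds `/Producer` as a simple literal string; encrypted files, compressed object streams, hex or UTF-16 strings and nested parentheses are not handled. The function name `find_producer` is hypothetical.

```python
import re
from typing import Optional

# Matches "/Producer (some text)", allowing backslash-escaped characters inside.
PRODUCER = re.compile(rb"/Producer\s*\(((?:\\.|[^\\)])*)\)")


def find_producer(pdf_bytes: bytes) -> Optional[str]:
    """Return the raw /Producer string found in the PDF bytes, or None."""
    match = PRODUCER.search(pdf_bytes)
    if match is None:
        return None
    # latin-1 maps every byte to a character, so decoding never fails.
    return match.group(1).decode("latin-1")


# Minimal stand-in for the relevant part of a PDF file (not a complete PDF).
sample = b"1 0 obj\n<< /Producer (GPL Ghostscript 8.15) >>\nendobj\n"
print(find_producer(sample))  # GPL Ghostscript 8.15
```

          For a real report, the bytes would come from reading the file in binary mode, for example `open(path, "rb").read()`. Most PDF viewers also show the same value in their document properties dialog.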

           

          Many have found success opening the file for edit in a PDF aware editor (Adobe Acrobat if you have it but there are others) and then re-writing as a new file. The theory is that the edit will 'refresh' the internal structure of the file and so offer a better chance for Monarch to deal with it.

           

          HTH.

           

           

          Grant

            • File Import mashing columns together
              guyporter _

              Hi Grant, thanks for your suggestion. I am aware of both options, and in this case I have tried both resaving as a PDF (usually I use the free CutePDF program) and converting it to text; neither works in this instance.

              The file was written with GPL Ghostscript 8.15, so maybe this information will be useful in a future version.

              I have seen plenty of mashed columns but never the reversal of figures. The report looks normal in PDF view.

              Thanks for your advice anyway.

              Any other suggestions out there?

                • File Import mashing columns together
                  Data Kruncher

                  Check out Gareth's comments in [URL="http://www.monarchforums.com/showthread.php?t=3377"]post #10 of this thread[/URL] as I suspect that they relate to your situation.

                  • File Import mashing columns together
                    Grant Perkins

                     

                    Guy,

                     

                    Are you saying that the Adobe Reader text export fails to get it right as well?

                     

                    I think in that case you will be struggling to find a result at all let alone one that might be considered consistent and robust enough for regular use.

                     

                    Can the source be persuaded to use a different PDF writer? One wonders how correct the original reports might be .....

                     

                     

                     

                    Grant

                      • File Import mashing columns together
                        guyporter _

                        It seems this report is going to be a dud. If the problem persists we will have to do that report manually or ask the client to update their version of the PDF writer and see if we get a better result.

                        The Adobe writer option did not work either.

                        Thanks for your help so far.
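
                        A possible clean-up after export, sketched under assumptions: if the damage always follows the pattern in the sample from the original post (a stray single digit after the amount, as in `51.58 1`, or a space after the decimal point, as in `0. 00`), a small Python function could put the digits back. The function name `repair_amount` and both patterns are hypothetical and inferred from that one sample. The function cannot tell a displaced digit from a genuinely separate value, so any result would need checking against the PDF.

```python
import re

# Assumed pattern 1: an amount followed by one stray digit, e.g. "51.58 1".
DISPLACED = re.compile(r"^\s*(,?\d[\d,]*\.\d{2})\s+(\d)\s*$")
# Assumed pattern 2: an amount split after the decimal point, e.g. "0. 00".
SPLIT = re.compile(r"^\s*(\d[\d,]*\.)\s+(\d{2})\s*$")


def repair_amount(field: str) -> str:
    """Return the field with a displaced leading digit or split decimals rejoined."""
    match = DISPLACED.match(field)
    if match:
        # Move the trailing digit back to the front of the amount.
        return match.group(2) + match.group(1)
    match = SPLIT.match(field)
    if match:
        # Close the gap between the decimal point and the decimals.
        return match.group(1) + match.group(2)
    # Anything else is left as it was, apart from surrounding spaces.
    return field.strip()


for raw in ["   51.58 1 ", "  ,445.25 1  ", "  0. 00", "  107.27  "]:
    print(repr(raw), "->", repair_amount(raw))
```

                        With the four sample fields this yields 151.58, 1,445.25, 0.00 and 107.27. Fields that do not match either pattern pass through unchanged, so correct values such as 107.27 are not touched.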

                          • File Import mashing columns together
                            Grant Perkins

                            Guy,

                             

                            I seem to recall from some time ago a very similar sort of problem (might even be identical) with a file produced from the same source though which particular release I am not sure now.

                             

                            8.15 seems to date back to 2005 or so and a quick dip into Google suggests that 'issues' with one release or another of the code are not unheard of.

                             

                            I think a number of systems developers, especially 'bespoke' systems developers, used a 'one off' selection of GPL Ghostscript as a quick, easy and free way to implement PDF documents when the format started to become ubiquitous. So long as it worked to produce a PDF all was well. How it worked behind the scenes was not their problem.

                             

                            It does seem to leave you a bit stuck though. Have you any way to try 10.5 to see if things are improved? It seems unlikely but it would not take much to try it.