8 Replies Latest reply: May 15, 2014 10:09 AM by Olly Bond

    Problems with PDF Files

    Walter K.

      Hello,

      I was wondering if there are inherent problems working with PDF files?

      We get many PDF files from our clients that I use in Monarch Pro v10.5. Most work just fine, while others seem to change every time the client submits a PDF file.

       

      When you look at the PDF files in Acrobat Reader, they look exactly the same.  But when I open one in Monarch, the layout/format has changed. Some columnar data is more spread out and does not line up as before, which causes my models not to work properly.  I've widened traps and used floating traps, and nothing seems to work consistently.  Almost every time a file is submitted I have to redo the model.  I've tried making adjustments in the PDF Import Options to no avail, though I admit I'm not that fluent in that area.

       

      Is there some way Monarch can handle these aberrations for each file, or are there settings in Adobe Acrobat or other PDF creators that can be adjusted to make the output more consistent?

       

      Thanks

      Walter

        • Problems with PDF Files
          elginreigner

          Short answer, yes. There are many different versions of PDF. Visually they all appear the same, but on the data format side, they are completely different.

           

          My workaround is to open the file and save it as an XPS file (the MS version of PDF, much more consistent).

           

          Or you can adjust the settings under PDF input per model.

            • Problems with PDF Files
              Olly Bond

              Hello Walter,

               

              To add to Elgin's points, yes, you can get a PDF produced by Ghostscript which looks (in Monarch) different to one produced by Acrobat Distiller and so on, and there are a couple of hundred PDF writers out there to deal with... But also, even if the same layout report turns up each week, produced by the same version of the same PDF writer in the same version of the PDF standard, and you use the same version of Monarch to read it in using the same PDF import engine, you can have problems.

               

              This is because Monarch makes an intelligent guess about the spacing on each line based on the data it reads. If one week your report says:

               

              Sales - - - $1,000

               

              but next week it's:

               

              Sales - - - $1,000,000

               

              then the spacing has to be recalculated accordingly.
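
To make the effect concrete, here is a toy sketch in Python. It is not Monarch's actual spacing algorithm; `render_line` and the 24-character line width are assumptions made up for illustration. It only shows why a trap pinned to a fixed column breaks once the amount gets wider.

```python
def render_line(label: str, amount: str, width: int = 24) -> str:
    """Hypothetical layout: label on the left, amount right-aligned in `width` characters."""
    gap = width - len(label) - len(amount)
    return label + " " * max(gap, 1) + amount


week1 = render_line("Sales", "$1,000")
week2 = render_line("Sales", "$1,000,000")

# A fixed-position "trap" built against week 1: the amount starts at this column.
amount_start = week1.index("$")

fixed_week1 = week1[amount_start:]  # the full amount, as intended
fixed_week2 = week2[amount_start:]  # only the tail of the wider amount

print(repr(fixed_week1))
print(repr(fixed_week2))
```

In week 2 the amount starts further left, so the fixed column cuts into the middle of the number.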

               

              There are techniques when modelling to handle this - using MCRs, floating traps, string functions like trim() and conversion functions like val().
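
As a rough analogy outside Monarch, the same idea can be sketched in Python: find the amount by what it looks like rather than where it sits, then trim the label and convert the text to a number. The regular expression and the `parse_line` helper are illustrative assumptions, not Monarch syntax; the strip and float steps simply play the roles that trim() and val() play in a model.

```python
import re
from typing import Tuple

# An amount at the end of the line: optional "$", digits with thousands separators,
# optional decimal part. This pattern is an assumption about the report's format.
AMOUNT = re.compile(r"\$?\s*([\d,]+(?:\.\d+)?)\s*$")


def parse_line(line: str) -> Tuple[str, float]:
    """Split a report line into (label, numeric amount) regardless of column position."""
    match = AMOUNT.search(line)
    if match is None:
        raise ValueError(f"no amount found in {line!r}")
    label = line[:match.start()].strip()            # comparable to trim()
    value = float(match.group(1).replace(",", ""))  # comparable to val()
    return label, value


print(parse_line("Sales             $1,000"))
print(parse_line("Sales         $1,000,000"))
```

Because the match is anchored to the end of the line and to the shape of the value, it keeps working when the number of characters between the label and the amount changes from week to week.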

               

              In a recent webinar I explained how to handle variable width columns - if you contact Datawatch they should be able to give you a link to register to view the archived recording.

               

              Best wishes,

               

              Olly

              • Problems with PDF Files
                Walter K.

                To save as an XPS file, I'm guessing I need Adobe Acrobat since I don't seem to have that option with just Adobe Reader?

                 

                And what settings would I adjust in PDF Input?  Or did you mean PDF Import?

                 

                Thanks again.

                  • Problems with PDF Files
                    Olly Bond

                    Hello Walter,

                     

                    To convert a PDF to XPS, you just need to have the Microsoft XPS printer driver installed on your PC. In Vista and later, this is there by default, I think, but in XP you can download it from Microsoft's web site.

                     

                    Then just open the PDF in Adobe Acrobat Reader, and print it to XPS format. The resulting XPS file should resemble the PDF in layout, but hopefully will be slightly better behaved in Monarch.

                     

                    To handle XPS files, I think you need at least Monarch v10 Pro.

                     

                    HTH,

                     

                    Olly

                      • Problems with PDF Files
                        MacA

                        I installed it, and the lead-in before installation said the Document Writer was included, but after rebooting I still have no printer option for XPS printing when in Word.  I am on XP, of course.  Can anyone provide a solution?  I am running Monarch 10.0 and see opening XPS documents as an option, if I can just get to that point.

                         

                        Thanks

                        Max

                          • Problems with PDF Files
                            guyporter

                            I have version 10 installed and saved a troublesome PDF as an XPS file. Monarch opened it but the screen was blank. Any ideas on what I need to do to view the file?

                              • Problems with PDF Files
                                Olly Bond

                                Hello Guy,

                                 

                                If the data isn't too sensitive, then by all means email the PDF to me. If you'd prefer a more secure channel, I've a Huddle space we could use.

                                 

                                If the PDF is just an image layer, then Monarch would show a blank, so you might need some OCR. Can you select the text in Acrobat Reader and copy it into Notepad?

                                 

                                Best wishes,

                                 

                                Olly

                      • Problems with PDF Files
                        Data Kruncher

                        Hi Walter,

                         

                        Unfortunately the news isn't great on that front. The challenges faced when bringing PDF files into Monarch are reasonably well documented in this forum. The problems stem from the non-standard manner used by the many different PDF authoring programs to create "standard" PDF files.

                         

                        The first thing that I'd do is check with those supplying you with the PDF files to see if they can supply XPS files instead, which Monarch v10.5 can read without any of the difficulties that you've described and that many of us have also encountered.

                         

                        XPS files can be created by recent versions of Microsoft Office so the files shouldn't be difficult to produce. The difficulty most often seems to be in getting people to supply them. You're likely to be fighting with change, not so much with software.

                         

                        All that said, you'll likely find the best results with floating traps, or a combination of floating traps and other approaches such as calculated fields that employ functions to isolate the data that you really want.

                         

                        In general, because of the proliferation of non-standard PDF authoring software (which produces a clean product to our eyes, but often something decidedly otherwise behind the scenes), working with PDF files just isn't as easy as many hoped it would be.

                         

                        Edit: typing while others were posting. Again. At least we're all on the same page!