5 Replies Latest reply: May 15, 2014 10:10 AM by Grant Perkins

    Problems With PDF

    ghughes20 _

      I've been a Monarch user for about 10 years, but only use it occasionally.  I'm currently having problems opening PDF files from one day to the next.  The traps I've set are very specific, often with a specific character in a specific spot.  However, when I open the same report from one day to the next, the placement of this character shifts 1 - 6 spots to the left or right.  Thus, I need to reset my traps every day.  This is beyond frustrating.  I'm using Monarch 10 and have played around with the various settings in the PDF wizard with no success.  Have others faced similar problems, and how were they solved?  (I've tried to create traps that are as flexible as possible, but have run out of options.)

       

      Any help would be greatly appreciated.

       

      Thank you,

       

      Greg

        • Problems With PDF
          Olly Bond

          Hello Greg,

           

          PDFs appear to change from one report run to the next, and it's not Monarch's fault! The PDF format is very flexible in how it records characters and where to show them. Monarch has to convert this into characters and spaces, and has to make an educated guess. If a gap of ten spaces usually contains a field of three characters, but one day has a length of nine characters, you will see subsequent fields nudged a bit.
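
          As a rough illustration of that nudging (a toy model only, not how Monarch actually converts PDFs): the hypothetical `render_line` below places text fragments on a fixed character grid from their horizontal positions. The fragment positions, the 6-point grid width and the sample strings are all assumptions made up for the example.

```python
def render_line(fragments, char_width=6.0):
    """Place (x_position, text) fragments on a fixed-width character grid.

    Toy model: x positions are in points and char_width is the assumed
    width of one grid column.
    """
    line = ""
    for x, text in sorted(fragments):
        col = round(x / char_width)
        # A fragment cannot overlap the previous one, so if the ideal column
        # is already occupied it is pushed right, leaving one blank.
        if line and col < len(line) + 1:
            col = len(line) + 1
        line = line.ljust(col) + text
    return line


# Same report layout on two days; only the middle field's length differs.
day1 = [(0, "ACME"), (36, "abc"), (72, "X")]
day2 = [(0, "ACME"), (36, "illililil"), (72, "X")]

print(render_line(day1))  # "X" lands in column 12
print(render_line(day2))  # "X" is pushed to column 16
```

          The "X" sits at the same place on the printed page both days, but a narrow nine-character value no longer fits in the grid columns available, so everything after it moves right.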

           

          You can minimise this effect in Monarch by changing the scaling options at input - in extreme cases put the scaling down to 0,1 to inject the maximum amount of blank space in between the data. You can also counteract this effect by using the floating trap feature in your Monarch templates.

           

          Trapping for a specific character might best be achieved with ßxß if there are blank spaces either side of it. Sometimes, if the trapping can't cover all the bases you need, you might have to take every line of the file into the table using an empty template, trim the spaces a little in the table window and export the table as a fixed width text file for a second attack in Monarch.
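
          For anyone doing the "second attack" outside Monarch, the same idea can be sketched in Python. This is not Monarch trap syntax; `find_marker` and `split_fields` are hypothetical helper names, and the sample lines are invented. The first looks for a character with a blank on each side wherever it falls on the line, and the second splits a line on runs of two or more blanks.

```python
import re


def find_marker(line, marker="x"):
    """Return the column of `marker` when it has a blank on each side, else None."""
    pattern = r"(?<=\s)" + re.escape(marker) + r"(?=\s)"
    match = re.search(pattern, line)
    return match.start() if match else None


def split_fields(line):
    """Split on runs of two or more blanks so single spaces inside a field survive."""
    return re.split(r"\s{2,}", line.strip())


# The marker is found whichever column it drifts to.
print(find_marker("Total   x   12.50"))
print(find_marker("Total      x 12.50"))
# An "x" inside a word is ignored.
print(find_marker("Taxes 12.50"))
print(split_fields("  Total   x   12.50 "))
```

          Matching on the surrounding blanks rather than on a fixed column is what makes the check tolerant of the day-to-day drift.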

           

          HTH

           

          Olly

            • Problems With PDF
              Grant Perkins

              Greg,

               

              Olly summarises the problem nicely.

               

              It may be worth seeing how the Adobe PDF Reader deals with your incoming files when you use the 'convert to text' (or whatever it is called) option on them. For that feature it is performing a similar function to Monarch and usually has similar issues with the format of the resulting output.

               

              If you have the opportunity, there are a number of other PDF editors out there, and in theory they would have the same sort of problems. However, if you find one that does not, you may be able to utilise it in your workflow, and I am sure that the Datawatch people would be interested to have a look at it as well.

               

              There are a number of posts in the forum relating to problem PDFs, all with many comments that, together, help to explain just how variable the PDF concept can be - with its many independent 'writer' programs, some of which are likely to be quite old when embedded in legacy systems.

               

              HTH.

               

               

              Grant

                • Problems With PDF
                  ghughes20 _

                  Firstly, thank you all for the prompt suggestions.  I'm exploring many of them.

                   

                  So I'm trying to use the floating trap.  Once I've designed my trap and set my fields I get the following error message...

                   

                  "Floating trap must match the sample line exactly to enable field position calculations.  Use of the 'Shift to match sample' function on the trap's context menu may eliminate the error".

                   

                  I'm at a loss as to what this means.  The line I'm trying to trap is a page header and is centered in the middle of the page.  (As such, the PDF converter doesn't place the header in the same spot from one day to the next.)  So, I've designed a floating trap to find the line, but now I can't save the trap.

                   

                  Thanks again,

                   

                  Greg

                    • Problems With PDF
                      ghughes20 _

                      Never mind.  User error.

                        • Problems With PDF
                          Grant Perkins

                          Greg,

                           

                          FWIW the operational requirements of the floating trap feature can be a little esoteric to get one's head around - especially with a 'moving target' like PDF output.

                           

                          I think the original intention was to be able to split up and extract from computer log style 'reports' - generically these tend to be quite consistent in presentation other than the length of the data 'fields' which vary by content.

                           

                          That consistency is, more often than not, less evident in PDF extracts, so the extent to which the 'float' can be applied may be limited to, for example, line identification and perhaps 2 'fields'. A 'log file', by contrast, can be assumed for practical purposes to be reporting every line (except page headers, if they exist) and to be showing a definable number of fields, either for the entire line or for the first x sections of the line.

                           

                          I look forward to hearing about how you got on.

                           

                           

                          Grant