4 Replies Latest reply: May 15, 2014 10:13 AM by Grant Perkins

    PDF Report columns are all of a sudden off screw when viewed in Datapump Model....

    Dave Brinston

      One of our processes is no longer working properly.  A report is created from our Payroll system and saved as a .pdf file, which gets dropped in a folder monitored by Datapump.  We have been using this process for 5+ years and it always worked fine.  Now we have noticed that Datapump can't extract the data from the PDF file properly, because the columns in the PDF do not appear to be positioned where the model expects them.  The catch is that when I open the report in any PDF reader the columns look fine, and when compared to previous PDF reports created for this process (i.e. ones that worked properly in the past), the columns look exactly the way they should.


      So it seems to me that Datapump's PDF reader or viewer may be having trouble reading the PDF file.  Again, this process worked every month before this one, for the last 5 years or more.  We are using version 10.5.


      Any suggestions?  




        • PDF Report columns are all of a sudden off screw when viewed in Datapump Model....
          Olly Bond

          Hello Dave,


          You can't control PDFs - the amount of   spacing   between     the words   can seem random. But you can cope with it in Monarch. Can you copy and paste from the report window here between CODE tags so we can give you some pointers?


          Best wishes,



            • PDF Report columns are all of a sudden off screw when viewed in Datapump Model....
              elginreigner _

              PDFs are evil. Was there an upgrade of any kind to the system that generates the PDF files? I see this all the time with our clients: the PDF engine that is used, or the report itself, gets altered. The front-end/visual side of the PDF looks the same, but on the back-end/data side that Monarch sees, the data/columns have shifted. Unfortunately, if this is the case, your only answer is to adjust the existing model or create a new one.


              I personally usually create new models, allowing me to save the old model in case it was a fluke or it changes back. As Olly stated, if you supply a sample from the PDF, we can help with trapping.

                • PDF Report columns are all of a sudden off screw when viewed in Datapump Model....
                  Dave Brinston

                  Thanks for your response guys.  I was afraid I would have to recreate the model, but I kind of expected that to be the solution.





                    • PDF Report columns are all of a sudden off screw when viewed in Datapump Model....
                      Grant Perkins

                      If you have luck on your side, the reports are not too complex, and any changes wrought by the PDF writer program leave the format of the extracted data somewhat similar to what has gone previously ... then you may, just possibly, be able to come up with a model that is quite generic in its ability to deal with whatever those reports can throw at it.


                      I think if you identify a one-off change that has made a new model mapping necessary (and know that it is likely to stay that way for a while) then a revised model is the way to go. If you need to work with both old and new reports you may need some way of identifying which version of the model is likely needed for the Datapump process.
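As a rough illustration of identifying which model version a report needs, here is a minimal Python sketch that runs outside Monarch on text already extracted from the PDF. It assumes the old and new layouts can be told apart by the column at which a known header label starts. The label "Gross Pay", the offsets, and the model names are all hypothetical; this is not a Datapump feature, just one way the check could be done before a file is routed.

```python
def find_marker_column(report_text, marker):
    """Return the zero-based column where marker first appears, or None."""
    for line in report_text.splitlines():
        col = line.find(marker)
        if col != -1:
            return col
    return None


def pick_model(report_text, marker, models_by_column, tolerance=0):
    """Pick the model whose expected marker column matches the report."""
    col = find_marker_column(report_text, marker)
    if col is None:
        return None  # marker not found: layout is unknown
    for expected, model in models_by_column.items():
        if abs(col - expected) <= tolerance:
            return model
    return None


# Hypothetical mapping: column of the "Gross Pay" header -> model name.
MODELS = {20: "payroll_old_model", 26: "payroll_new_model"}

# Hypothetical extracted text for each layout.
old_layout = "Employee ID".ljust(20) + "Gross Pay\n"
new_layout = "Employee ID".ljust(26) + "Gross Pay\n"

print(pick_model(old_layout, "Gross Pay", MODELS))
print(pick_model(new_layout, "Gross Pay", MODELS))
```

A result of None means the layout matched neither known version, which is probably the point at which someone should look at the file by hand.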


                      If the input pdfs become somewhat random when extracted then seeking a generic solution might be appropriate rather than trying to identify the correct model for the source pdf creation routine.


                      As an example, I seem to recall, some years back, looking at a single output file that seemed to have multiple output formats from what were ostensibly the same data records with just minor variations in the fields reported.


                      The data seemed to be all over the place.


                      However, after staring at it all for a while I suddenly noticed that in fact the first 5 or 6 columns of the presented table were pretty much identical and only the last 1 to 3 columns (it varied as to what was infilled) might be different.


                      The apparent randomness was in fact simply down to the sections of the report being centred on the page, thus shifting positions across the line section by section.


                      Extracting to a text file as a single field per line allowed the entire report to be easily left justified. Suddenly most fields fell into consistent positions, and the remaining variable content was easily managed into its correct fields using a generic approach.
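The left-justify idea can be sketched in a few lines of Python, purely as an illustration of the principle rather than how it was done in Monarch. The sample rows are hypothetical, and the sketch assumes fields are separated by two or more spaces.

```python
import re


def left_justify(text):
    """Remove leading whitespace from every line, leaving the rest untouched."""
    return "\n".join(line.lstrip() for line in text.splitlines())


def split_columns(line):
    """Split a line into fields on runs of two or more whitespace characters."""
    return re.split(r"\s{2,}", line.strip())


# Hypothetical sample: two sections of the same report, centred differently,
# with an extra trailing field in the second row.
sample = (
    "        1001  Smith  40.0  800.00\n"
    "   1002  Jones  37.5  750.00  Bonus\n"
)

justified = left_justify(sample)
rows = [split_columns(line) for line in justified.splitlines()]

for row in rows:
    print(row)
```

Once every line starts at the left edge, the leading fields line up and only the trailing, variable fields need special handling.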


                      I mention this only in case something similar might be the cause of your extracted format inconsistencies. If so there may be a fairly straightforward approach to dealing with them.


                      Grant Perkins