6 Replies Latest reply: May 15, 2014 10:10 AM by jbvinny _

    Trap Entire Row

    jbvinny _

      I am working with a difficult PDF and have decided that my best approach is to trap entire rows' worth of data and parse it out after the fact. Whenever I try to do this I get an error that says the width can only be something like 250 characters... Can someone explain to me how to trap an entire row?

        • Trap Entire Row
          Grant Perkins

          I can't say I have come across this one before.

           

          Are you hitting the field width limit (254 characters as I recall), or is there a character limit to the width of the PDF file that can be processed? (Even so, a report that is around 250 characters wide is not very usual, or at least it hasn't been historically.)

           

          Or, another option: has the PDF extraction become so wide in attempting to get the data to play nicely that it has resulted in overly wide 'fields'?

           

          If the latter, and depending on what options you have for parsing the lines once extracted, it may be possible to compress the line width by removing excess spaces using the PDF extraction settings and so get things under the character limit.

           

          If the limit relates to using a single field, there may be options for utilising more than one field as well. However, without assessing the PDF input you are working with, I would only be speculating.

           

          It may be worth looking at the PDF file with 'fresh eyes' too. What sort of problems does it present? Some are indeed very difficult to work with but I have come across a few that looked impossible at first sight but turned out to be not so bad once I realised what was going on with them.

           

          Is the report confidential or can a representative sample be shared with others for investigation?

           

          Grant

            • Trap Entire Row
              Olly Bond

              Hello jbvinny,

               

              Yes, I've seen this one a few times. Monarch will let you trap the entire row, but only in chunks of 254 characters. So highlight one field starting in column 1, another in 255, another in 509, etc. Call these , etc. They'll all be of type Character.

               

              Then, in the table, create a Calculated field of type Memo called as simply ... Now you can parse out the data that you need.
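
              Outside Monarch, the chunk-and-rejoin idea can be sketched in a few lines of Python. This is only an illustration, not Monarch syntax: the function names are made up, and the 254-character width is the limit quoted above.

```python
# Illustrative sketch only: mimics trapping a wide row in 254-character
# chunks and concatenating them again. None of these names come from Monarch.
FIELD_WIDTH = 254  # per-field character limit quoted in the thread


def chunk_start_columns(line_length: int, width: int = FIELD_WIDTH) -> list[int]:
    """Return the 1-based starting column of each chunk (1, 255, 509, ...)."""
    return [offset + 1 for offset in range(0, line_length, width)]


def split_into_chunks(line: str, width: int = FIELD_WIDTH) -> list[str]:
    """Cut one report line into consecutive fixed-width pieces."""
    return [line[offset:offset + width] for offset in range(0, len(line), width)]


def rejoin(chunks: list[str]) -> str:
    """Concatenate the pieces back into the full row (the Memo field's role)."""
    return "".join(chunks)


line = "x" * 600                      # a made-up 600-character row
chunks = split_into_chunks(line)      # three pieces: 254 + 254 + 92 characters
full_row = rejoin(chunks)             # identical to the original line
```

              A 600-character row therefore needs three Character fields, starting in columns 1, 255 and 509.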

               

              Hope this helps,

               

              Olly

                • Trap Entire Row
                  jbvinny _

                  Grant,

                   

                  I was hitting the field width limit. I worry that compressing the data in the PDF extractor would make it difficult to parse afterwards, because the values are not separated by a delimiter. I have tried several different options with this report, and trapping all the data and then parsing it out in the table view or Excel seems like my best shot. Basically, I have about 300 one-page PDFs that I need to pull information from. The problem is that each PDF is formatted slightly differently. Columns aren't aligned the same way, so I tried using a floating trap, only to find that some data isn't even spaced the same vertically, so my 2-line trap would need to be 3 lines in some cases. What I ended up doing was combining all the files into a PDF binder and then saving that as text. While still ugly, this data looked more manageable if I could trap an entire line. However, I kept getting the error that my field was too wide, which I had never run into before (or at least don't remember).

                   

                   

                  Olly,

                   

                  That should work. Once I combine the fields in the table view I can parse out each value using a space as the delimiter. Thanks!
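
                  As a rough sketch of that parsing step (in Python rather than in the table view or Excel), splitting on runs of whitespace might look like the following. The sample row and the helper names are invented for illustration.

```python
# Illustrative sketch: split a recombined row on whitespace.
# The sample row and function names are hypothetical.
def parse_row(row: str) -> list[str]:
    """Split a trapped row into values, treating any run of spaces as one delimiter."""
    return row.split()


def to_int(text: str) -> int:
    """Convert a value such as '1,200' to an integer by dropping thousands separators."""
    return int(text.replace(",", ""))


row = "ACME    1,200   350     17"           # made-up row with uneven spacing
values = parse_row(row)                      # label followed by three numbers
amounts = [to_int(v) for v in values[1:]]    # numeric columns only
```

                  One caveat: any value that itself contains a space would be split into several pieces, so those fields need separate handling.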

                   

                   

                  Separate question: have either of you had any success working with a table that is within a PDF? Something like:

                   

                  Miami New York Chicago Dallas

                  Net Income 5,452 4,364 6,000 4,800

                  Revenue 7,000 6,000 8,000 5,500

                  FTE's 200 100 250 175

                   

                   

                  Trapping this data would be relatively simple using a multi-region approach, but for some reason I always struggle with the headers (Miami, New York, etc.)... Any thoughts? I'm sure I am just being dense and missing something. :cool:

                    • Trap Entire Row
                      jbvinny _

                      Wow my attempt at providing you a table failed miserably. Hopefully it still makes some sense.

                        • Trap Entire Row
                          Grant Perkins

                          "Wow my attempt at providing you a table failed miserably. Hopefully it still makes some sense."

                           

                          Hmm. My attempt to put your table between CODE tags didn't give good results either.

                           

                          Basically, if the cities present as column headers, and so would fit into a multi-column region definition for the columns, then it should be possible to make that into an append template. However, there are a couple of situations where, with some reports, other data can literally get in the way and mess things up. Assuming the table really appears as a nice clean table, well aligned, then the append approach should be viable if you can define a trap to ID the line exclusively.
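
                          To illustrate the general idea of carrying the header values onto the detail rows, here is a loose Python analogy using the sample table from earlier in the thread. It is not how a Monarch append template is defined. It assumes the list of cities is already known (splitting the header line on spaces would break "New York" in two) and that every detail line ends with exactly one number per city.

```python
# Loose analogy to attaching header values to detail rows.
# Assumptions: the city list is known in advance, and each detail line
# ends with one numeric value per city. All names here are hypothetical.
CITIES = ["Miami", "New York", "Chicago", "Dallas"]

TABLE_LINES = [
    "Net Income 5,452 4,364 6,000 4,800",
    "Revenue 7,000 6,000 8,000 5,500",
    "FTE's 200 100 250 175",
]


def parse_detail_line(line: str, n_columns: int) -> tuple[str, list[int]]:
    """Treat the last n_columns tokens as values and everything before them as the row label."""
    tokens = line.split()
    label = " ".join(tokens[:-n_columns])
    values = [int(token.replace(",", "")) for token in tokens[-n_columns:]]
    return label, values


def attach_headers(cities: list[str], lines: list[str]) -> list[dict]:
    """Pair every value with its column header, giving one record per cell."""
    records = []
    for line in lines:
        label, values = parse_detail_line(line, len(cities))
        for city, value in zip(cities, values):
            records.append({"city": city, "measure": label, "value": value})
    return records


records = attach_headers(CITIES, TABLE_LINES)  # 3 rows x 4 cities = 12 records
```

                          Taking the values from the right-hand end of the line is what lets a label such as "Net Income" keep its space.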

                           

                           

                          Grant

                            • Trap Entire Row
                              jbvinny _

                              Unfortunately, I find that the headers often don't line up with the data below and therefore cross over from one multi-column region to another... This seems to happen with PDFs often. Any ideas there?