2 Replies Latest reply: May 15, 2014 9:54 AM by Tom Whiteside

    HTML File

    amy

      The following is the output from a PDF file converted to HTML.  I am unable to convert this to table format because of the multi-line fields.

       

      When the file is HTML, it looks like the following (Os are used for blank spaces):

         A     B        C          

      1. X-XOXXXXOOXXXXXXXXXXXXXXXX

      OOOOOXXXXOOXXXXXXXXXXXXXXXX

      OOOOOOOOOOOXXXXXXXXXXXXXX

       

       

      2. X-XOXXXOXXXXXXXXXXXXXXXX

      OOOOOOOOOXXXXXXXXXX 

                                  

       

      Fields 1C and 2C could be different widths and have differing heights.  Is there a way to use Monarch to recognize the wrapped text and the differing field widths?

       

      If the file exists as a text file, then each record consists of several lines, with no way to distinguish one field from the next (most fields are full-text fields).

       

      Ex.

       

      1.XXXXX

      XXXXXXXXXXXXXXXXX

      XXXXXXXXXXX

      XXXXXXX

       

      2.XXXXXXXXXXX

      XXXXXXXXXXXXXXXXXX

      XXXXXXX

      XXX

       

      I have read the directions and can't figure it out. Help, please!

       

      [ October 28, 2003, 03:35 PM: Message edited by: amy ]

        • HTML File
          Grant Perkins

          Amy,

           

          To clarify, is the sample still in the HTML file or have you extracted from the HTML to a table but ended up with multiple rows where you wanted just one?

           

          Also I can't quite work out what you are explaining with the examples and the requirement for 'differing heights'. Probably just me being dumb but clarification would be useful if possible.

           

          If there is a possibility that you could provide a sample of the file you need to work with (HTML?) I would be happy to try to work out a way forward.

           

          Let me know and I will send you a Private Message to provide my email address to which the file can be sent.

           

          Grant

           


          • HTML File
            Tom Whiteside

            Amy,

             

            Using your second example (the text file format), estimate the maximum field width before the forced word wrap. In your case, it looks to be record 2, line 2, with 18 characters.  Use a multiple-line field trap with the Advanced Field Properties setting End Field On Blank field values: 1.

             

            Unless I'm terribly off-base, this should give you records 1 and 2 as single-line fields, each with four "pieces," separated by 3 spaces.  Now, if this is indeed your situation, you could concatenate the four pieces with something like the following:

             

            New_Field=RTrim(LSplit(,4," ",1))+RTrim(LSplit(,4," ",2))+RTrim(LSplit(,4," ",3))+RTrim(LSplit(,4," ",4))
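
For anyone who wants to check the same idea outside Monarch, here is a rough Python sketch of the split-and-concatenate step. It only loosely mirrors the expression above: the function name `join_pieces`, the three-space separator, and the limit of four pieces are assumptions for illustration, not Monarch behavior.

```python
def join_pieces(field: str, separator: str = "   ", max_parts: int = 4) -> str:
    """Split a captured field on the separator and glue the trimmed pieces together.

    Assumption: the pieces are separated by exactly three spaces, as
    described in the post; adjust `separator` if your capture differs.
    """
    # str.split with maxsplit = max_parts - 1 yields at most max_parts pieces.
    pieces = field.split(separator, max_parts - 1)
    # Trim trailing spaces from each piece, then concatenate with no gap.
    return "".join(piece.rstrip() for piece in pieces)


if __name__ == "__main__":
    print(join_pieces("AAA   BBB   CCC   DDD"))
```

Note that the pieces are joined with nothing between them, as in the expression above. If the wrapped lines break between words, you would probably want to join with a single space instead.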

             

            This seems to work okay for Monarch 7.  If you are using Monarch 6.01, or if your 3 spaces do not disappear, then take a look at the topic "next door" to this one, namely, Hopefully Simple Question:

            http://mails.datawatch.com/cgi-bin/ultimatebb.cgi?ubb=get_topic;f=1;t=000327#000005

             

            Please advise if I'm not grasping your situation.  The way I'm reading your problem is that you (1) need to capture multiple-line record fields, and (2) strip out any remaining spaces and concatenate the pieces into one field per record.
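
As a cross-check outside Monarch, the two steps can be sketched in Python. This is only an illustration under stated assumptions: that each record starts with a number followed by a period (as in the sample), that a record ends at a blank line or at the next numbered line, and that the wrapped lines should be joined with a single space. The names `read_records` and `RECORD_START` are made up for this example.

```python
import re

# Assumption: a record begins with digits followed by a period, e.g. "1." or "2.".
RECORD_START = re.compile(r"^\d+\.")


def read_records(lines, joiner=" "):
    """Group wrapped lines into one string per record."""
    records = []
    current = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the record collected so far.
            if current:
                records.append(joiner.join(current))
                current = []
            continue
        if RECORD_START.match(line) and current:
            # A new numbered line also closes the previous record.
            records.append(joiner.join(current))
            current = []
        current.append(line)
    if current:
        records.append(joiner.join(current))
    return records


if __name__ == "__main__":
    sample = [
        "1.XXXXX", "XXXXXXXXXXXXXXXXX", "XXXXXXXXXXX", "XXXXXXX",
        "",
        "2.XXXXXXXXXXX", "XXXXXXXXXXXXXXXXXX", "XXXXXXX", "XXX",
    ]
    for record in read_records(sample):
        print(record)
```

Passing `joiner=""` concatenates the pieces with nothing between them, which is closer to stripping out the remaining spaces entirely.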

             

            Hope this helps....

             

            [ May 19, 2006, 12:26 PM: Message edited by: Todd Niemi ]