    Perplexed with HTML FILE DATA RETRIEVAL V8

    Jenni _

      OK, I am a brand new user and am having issues with HTML. I am trying to define the detail template using the Advanced option "Start Field On String: Anywhere in previous line", putting COL=XX to flag where the data should be pulled from, but it is not working. What am I doing wrong? Here is an example of the data...

      SOFTWARE NAME - Using Supplier Listing                                                          

      <TABLE=000 NEST=00 ROW=000 COL=00 ID=0001>                                                          

      Using Supplier Listing                                                                               

      <TABLE=000 NEST=01 ROW=001 COL=01 ID=0002>                                                          

      Supplier Name                                                                               

      <TABLE=000 NEST=01 ROW=001 COL=02 ID=0003>                                                          

      Action #                                                                               

      <TABLE=000 NEST=01 ROW=001 COL=03 ID=0004>                                                          

      SAP #                                                                               

      <TABLE=000 NEST=01 ROW=001 COL=04 ID=0005>                                                          

      Additonal SAP Numbers for this supplier                                                             

      <TABLE=000 NEST=01 ROW=001 COL=05 ID=0006>                                                          

      Supplier Address                                                                               

      <TABLE=000 NEST=01 ROW=002 COL=01 ID=0007>                                                          

      SUPPLIER NAME                                                         

      <TABLE=000 NEST=01 ROW=002 COL=02 ID=0008>                                                          

      HREF="main.ASP?WCI=Main&WCE=ViewAction&WCU=s%3d6KAHGYAUZW58DVW4I2AR72UDKLJNB10F%7c*%7er%3d7836" 7836

      <TABLE=000 NEST=01 ROW=002 COL=03 ID=0009>                                                          

      VENDOR NUMBER                                                                               

      <TABLE=000 NEST=01 ROW=002 COL=04 ID=0010>                                                          

      ALTERNATE VENDOR NUMBER                                                                               

      <TABLE=000 NEST=01 ROW=002 COL=05 ID=0011>                                                          


      CITY, STATE ZIPCODE                                                                     

      Thanks for your help!

          Grant Perkins



          When you say you are using COL=XX I assume you mean that for the field you want you are specifying a value for XX, i.e. "COL=01", because you want to extract data from specific columns only?


          In other words the "COL=XX" is being used as some sort of filter as well as a field start position indicator?


          I am not sure the HTML analysis always works consistently when used that way, but it might work on your input files.


          So are you getting absolutely nothing (no fields shaded as you define the template) or are you not getting what you want to get?


          If you are not getting, or not always getting, what you expect, there may be other reasons. The number of lines in the template sample, for example, can introduce some challenges, and the results always seem to be less easy to visualise when working with HTML output.


          Hope this helps in some way. Let us know the answers so we can think of more things to consider.




            Jenni _

            Hi Grant,

            Yes, I am using COL=01 through COL=05 and it is not working consistently; I am not getting what I want. I am getting something, but not all the fields line up with the right data, and there are blanks and junk. What is the best method for getting the data out of an HTML file with this type of format? Can you do a model for me using the data I gave you so I can get an idea of how best to do this? I don't understand the floating trap method that was suggested prior to your posting to me.

            Thanks Grant.. appreciate your help!

              Grant Perkins



              HTML files are pretty strange things to deal with and not something I look at often, but I think there may be some basic generic rules and then some 'rules' specific to the file in question. One of the more interesting things about HTML files is that the way they are presented as 'text', for your purposes, can be very variable, because they are used to present screen-based versions of variable data. There are many ways a programmer can choose to make that happen.


              To get the results you want you need to consider, as a minimum, whether COL alone gives you enough information to be sure about what sort of field you are dealing with and how to define it. The chances are that ROW is also a factor, in which case the pseudo-filtering using COL alone cannot be relied on to work. It may for some outputs, but could fail badly for others.
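
              To illustrate why COL alone is ambiguous, here is a minimal sketch in Python, run outside the tool, that reads marker lines of the kind shown in the sample and groups the text that follows each one by ROW as well as COL. It is only an illustration of the idea, not a feature of the software or a replacement for a model. The names MARKER and group_cells are made up for this example, and the sample is a shortened copy of the lines posted above.

```python
import re
from collections import defaultdict

# Matches marker lines such as <TABLE=000 NEST=01 ROW=002 COL=03 ID=0009>
MARKER = re.compile(r"<TABLE=(\d+) NEST=(\d+) ROW=(\d+) COL=(\d+) ID=(\d+)>")


def group_cells(lines):
    """Group the text after each marker by (table, nest, row), then by column.

    Hypothetical helper: text before the first marker is ignored, and
    several text lines after one marker are joined with a space.
    """
    rows = defaultdict(dict)
    current = None  # (row_key, col) of the most recent marker
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines carry no data
        match = MARKER.fullmatch(line)
        if match:
            table, nest, row, col, _cell_id = (int(g) for g in match.groups())
            current = ((table, nest, row), col)
            rows[current[0]][col] = ""  # an empty cell stays as ""
        elif current is not None:
            row_key, col = current
            rows[row_key][col] = (rows[row_key][col] + " " + line).strip()
    return dict(rows)


# Shortened copy of the sample from the first post.
sample = """
SOFTWARE NAME - Using Supplier Listing
<TABLE=000 NEST=00 ROW=000 COL=00 ID=0001>
Using Supplier Listing
<TABLE=000 NEST=01 ROW=001 COL=01 ID=0002>
Supplier Name
<TABLE=000 NEST=01 ROW=001 COL=03 ID=0004>
SAP #
<TABLE=000 NEST=01 ROW=002 COL=01 ID=0007>
SUPPLIER NAME
<TABLE=000 NEST=01 ROW=002 COL=03 ID=0009>
VENDOR NUMBER
"""

grouped = group_cells(sample.splitlines())

# Filtering on COL=01 alone mixes the heading row with the data row.
col_one_only = [cells[1] for cells in grouped.values() if 1 in cells]
print(col_one_only)  # ['Supplier Name', 'SUPPLIER NAME']

# Using ROW as well lets the headings (ROW=001) label the data (ROW=002).
headers = grouped[(0, 1, 1)]
data = grouped[(0, 1, 2)]
record = {headers.get(col, f"COL={col}"): text for col, text in data.items()}
print(record)  # {'Supplier Name': 'SUPPLIER NAME', 'SAP #': 'VENDOR NUMBER'}
```

              The first print shows the problem with a COL-only trap: heading text and data text both turn up under COL=01. Keying on ROW as well keeps them apart, which is the same distinction a template would need to make.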


              I could create a model for the sample you posted, but I need to know what you want to get out of the lines and whether a line will consist of a single field in a single row, OR MIGHT be empty, OR MIGHT have text which could be presented as more than one row.


              Is the information you presented in the rows below the HTML code lines actual sample text, or an indication of the sort of text that is there? Assuming the latter, how do you know which field in the table you want to add the text to? Is it only based on the COL number?


              If you have an original file that you can share with me (with non-confidential data, if that is an issue) and a precise definition of what you need from it, I will send you a private message with my email address so you can forward the file. I am much happier working with a real sample and information that specifies exactly what is required; it tends to produce better results more quickly!