3 Replies Latest reply: May 15, 2014 9:52 AM by Grant Perkins

    Filestrip and search

    waterzap _

      I am currently using Monarch V7 Pro and trying to get some data out of HTML files. I have two problems:

       

      Firstly, I noticed that when I began importing HTML files into Monarch, they would not show up as HTML, i.e. the tags were not grayed out.

      After a bit of experimenting I noticed that each page was in this form:

       

      <h t m l>

      <h e a d></h e a d>

      <b o d y></b o d y>

      </h t m l>

      <h t m l>

      <h e a d></h e a d>

      <b o d y></b o d y>

      </h t m l>

      <h t m l>

      <h e a d></h e a d>

      <b o d y></b o d y>

      </h t m l>

       

      Only the middle html group has the data that I want.

      When I deleted the top and bottom groups, the page showed up fine in Monarch.

       

      Is there a way to do this automatically?

       

      I was thinking about Filestrip, but it cuts the files off at a fixed position. In this case I want to cut at the HTML tags at the top and bottom, and their position might differ each time.
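One way to do the trimming before the files reach Monarch is a small script that cuts on the tags rather than on a position. The following is only a minimal sketch in Python, not a Monarch or Filestrip feature. It assumes the real files contain ordinary `<html>` ... `</html>` tags (the spaces in the tags above are taken to be forum escaping) and that the wanted block is always the second of the three; `keep_html_block` is a hypothetical helper name.

```python
import re

# Assumption: the real files use ordinary <html> ... </html> tags, in any letter case.
# DOTALL lets "." span line breaks; ".*?" stops at the first closing tag.
HTML_BLOCK = re.compile(r"<html\b.*?</html\s*>", re.IGNORECASE | re.DOTALL)


def keep_html_block(text, index=1):
    """Return the index-th (0-based) <html>...</html> block found in text."""
    blocks = HTML_BLOCK.findall(text)
    if index >= len(blocks):
        raise ValueError(
            f"expected at least {index + 1} html blocks, found {len(blocks)}"
        )
    return blocks[index]


# Made-up sample with three concatenated pages, as described in the post.
sample = (
    "<html><head></head><body>cover</body></html>\n"
    "<HTML><head></head><body>Revenue 1,234</body></HTML>\n"
    "<html><head></head><body>footer</body></html>\n"
)

middle = keep_html_block(sample)
print(middle)
```

To process a folder, the same function could be applied to the text of each file and the result written to a new file, which is then opened in Monarch. Because the cut is made on the tags, it does not matter where in the file they fall.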

       

      Secondly, the reports are not all that standard (I am looking at hundreds of different financial statements).

       

      So I was thinking that the best way to get the figures out of the report would be to tell Monarch to look for a word, then look to the right and take the first number it sees. If it finds letters first, it should keep searching.

       

      Is this at all possible with Monarch?
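The search described above ("find a word, then take the first number to its right, skipping any letters") can be sketched outside Monarch as follows. This is an illustrative Python sketch only, not something Monarch is known to provide. The function name `first_number_after`, the sample lines, and the assumed number format (comma thousands separators, a decimal point, negatives written with a minus sign or in parentheses) are all assumptions.

```python
import re

# Assumed number format: 1,234 or 1,234.50 or -12.5 or (2,000) for negatives.
NUMBER = re.compile(r"\(?-?\d[\d,]*(?:\.\d+)?\)?")


def first_number_after(line, label):
    """Return the first number to the right of label in line, or None."""
    pos = line.lower().find(label.lower())
    if pos < 0:
        return None  # label not on this line
    # Search only the text after the label; letters in between are skipped.
    match = NUMBER.search(line, pos + len(label))
    if match is None:
        return None
    token = match.group()
    negative = token.startswith(("(", "-"))
    digits = token.strip("()-").replace(",", "")
    value = float(digits)
    return -value if negative else value


# Made-up report lines for illustration.
print(first_number_after("Total revenue   USD   1,234.50", "total revenue"))
print(first_number_after("Net loss (2,000)", "net loss"))
print(first_number_after("Cash at bank", "revenue"))
```

One caveat with taking the very first number: a note reference such as "(note 4)" between the label and the figure would be picked up instead of the figure, so real statements may need an extra rule, for example a minimum number of digits.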


          • Filestrip and search
            Grant Perkins

            That Monarch is not recognising the html format seems a bit strange, unless there is something specific about the format or tags that the Monarch parser does not understand.

             

            Of course if that is the case then perhaps removing that bit of the code might make the rest appear as it should.

             

            Do you have 7.01 installed? Not sure it makes any difference for this requirement but I have not yet found an html file that Monarch failed to display as expected. So it is worth considering if any of the software components, Monarch or Windows, may need updating.

             

            Is there any chance you could supply a sample file (with non-sensitive data) that I could try on my own installation?

             

             

            Grant

             

             

