9 Replies Latest reply: Jun 15, 2016 2:10 PM by Grant Perkins

    Need help capturing paragraphs in a text

    Tingyu Du

      I'm working with a text file in the format shown here: http://i.stack.imgur.com/2lmbY.png. The file contains 500 documents appended like this, and I want to capture each paragraph starting from "Questions and Answers". I followed the steps for creating multiple-line fields, but the paragraphs stop displaying immediately after the second one. Here is the table I have created, which shows no more than 2 paragraphs: http://i.stack.imgur.com/gCKNr.jpg Could someone show me how to capture each paragraph?

       

      I'm also trying to capture the first line of each paragraph starting from "Questions and Answers", but I'm not sure how to do this.

       

      Thanks a lot for any help. I'm using Monarch Pro V9.

        • Re: Need help capturing paragraphs in a text
          Olly Bond

          Hello Tingyu,

           

          If your F2 is a character field, please change it to Memo so it can be 32,767 characters long and not 254.

           

          Also, in the report, edit the F2 field and in the Advanced tab set it to end on "none of the above".

           

          Best wishes,

           

          Olly

           

          Olly Bond

          MONARCH EXPERTS

          www.monarchexperts.com

          olly@monarchexperts.com

            • Re: Need help capturing paragraphs in a text
              Tingyu Du

              Hi, thanks for your response. I tried the Memo field and set the Advanced tab to end on "none of the above", but the field still ends on the second page. Each document's transcript is at least 10 pages long, so two pages are not enough to capture the information. That is why I tried to capture each paragraph instead, and ran into the problems described above. Are there other ways to capture the entire text after "Questions and Answers" for each document?

                • Re: Need help capturing paragraphs in a text
                  Olly Bond

                  Hi Tingyu,

                   

                  Many many years ago, at the London College of Printing, I was taught to cast off lengths of text to see how many pages they would need. Your maximum field size in Monarch v9 is 32,767 bytes - that's as big as the Access database engine under the hood supports. So with each line of your document being about 80 characters wide, you can cope with about 400 lines of text. That's probably no more than 8 pages of 50 lines each. So if you need 10+ pages of text in one field, you're possibly going to need another approach.
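The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the 32,767-byte limit, the 80-character line width and the 50-line page are the figures quoted in this post, and the one-byte-per-character assumption (ignoring line terminators) is mine.

```python
# Back-of-the-envelope "cast off" for one Monarch v9 memo field.
MAX_FIELD_BYTES = 32_767   # field limit quoted in the post
CHARS_PER_LINE = 80        # approximate line width quoted in the post
LINES_PER_PAGE = 50        # page length assumed in the post


def field_capacity(max_bytes=MAX_FIELD_BYTES,
                   chars_per_line=CHARS_PER_LINE,
                   lines_per_page=LINES_PER_PAGE):
    """Return (whole lines, pages) that fit in one field.

    Assumes one byte per character and ignores line terminators.
    """
    lines = max_bytes // chars_per_line
    pages = lines / lines_per_page
    return lines, pages


lines, pages = field_capacity()
print(f"{lines} lines, about {pages:.1f} pages")
```

Under those assumptions the field holds 409 full lines, a little over 8 pages, which agrees with the rough figure of 400 lines and 8 pages.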

                   

                  Best wishes,

                   

                  Olly

                   


                    • Re: Need help capturing paragraphs in a text
                      Tingyu Du

                      Hi Olly, It's good to hear that Monarch can cope with about 400 lines of text, but I'm still having trouble with the limit of the Memo field and the "None of the above" option. I read in the Help topics the following:

                       

                      • None of the Above: Select to terminate the field when Monarch encounters another template, including another instance of this template. The field will also be terminated if a HTML markup line is encountered.

                      Monarch will also terminate a multiple line field after it extends two pages. The field will be terminated on the second page where the page break character (character code 12) is encountered. This prevents a field from continuing without end if the selected End Field On action is not appropriate to end the field.

                      The None Of The Above behavior is always enforced by Monarch, even when the None Of The Above option is not selected. Select this option only when none of the other options would apply.

                       

                      It says that Monarch will terminate a multiple line field after it extends two pages, and I still can't capture text longer than two pages. Do you know how to solve this problem?
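To see how many pages a section spans, one option is to count the page break characters (character code 12) outside Monarch. The Python below is a rough sketch, not a Monarch feature: the function name `pages_after_marker` and the sample text are invented for illustration, and it assumes each page is separated by exactly one form feed.

```python
FORM_FEED = "\x0c"                 # character code 12, the page break named in the Help text
MARKER = "Questions and Answers"   # section heading from the original question


def pages_after_marker(document: str, marker: str = MARKER) -> int:
    """Count how many pages the text from `marker` onward touches.

    Returns 0 if the marker is not present.
    """
    start = document.find(marker)
    if start == -1:
        return 0
    section = document[start:]
    # Every form feed in the section starts one more page.
    return section.count(FORM_FEED) + 1


# Hypothetical three-page document, invented for illustration only.
sample = "Intro\nQuestions and Answers\nQ1 ...\x0cA1 ...\x0cQ2 ...\n"
print(pages_after_marker(sample))
```

A section that touches more than two pages by this count would be cut short by the two-page rule quoted above if it were captured as a single multiple line field.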

                • Re: Need help capturing paragraphs in a text
                  Olly Bond

                  Hi Tingyu,

                   

                  I think I've found a way to capture each paragraph as a separate memo field: trap with a two-line template on the blank line, start the field on line 2, and end it on blank field values. Unless one paragraph is more than two pages long, this should grab all the data.
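For comparison, the same idea (a blank line starts a new paragraph, and the paragraph runs until the next blank line) can be sketched outside Monarch in Python. This is a different tool and not the template itself. The function name `capture_paragraphs` and the sample text are hypothetical, and the sketch assumes one document at a time, paragraphs separated by blank lines, and page breaks that can simply be dropped.

```python
import re

MARKER = "Questions and Answers"  # section heading from the original question


def capture_paragraphs(document: str, marker: str = MARKER):
    """Return one record per paragraph found after `marker`.

    Each record holds the paragraph's first line and its full text.
    Returns an empty list if the marker is not present.
    """
    start = document.find(marker)
    if start == -1:
        return []
    body = document[start + len(marker):]
    # Drop form feeds so a page break does not split a paragraph.
    body = body.replace("\x0c", "")
    paragraphs = []
    # One or more blank lines separate paragraphs.
    for block in re.split(r"\n\s*\n", body):
        block = block.strip()
        if block:
            paragraphs.append({"first_line": block.splitlines()[0],
                               "text": block})
    return paragraphs


# Hypothetical sample, invented for illustration only.
sample = (
    "Header\nQuestions and Answers\n\n"
    "Operator: First question.\nPlease go ahead.\n\n"
    "Analyst: Thanks.\x0c\nCan you comment?\n"
)
for p in capture_paragraphs(sample):
    print(p["first_line"])
```

Each record keeps the first line separately, which also covers the request to capture the first line of every paragraph. For a file with many appended documents, the function would need to be applied to each document in turn.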

                   

                  It might not grab it into the shape you want though - once you have got the data, what are you trying to do with it?

                   

                  Best wishes,

                   

                  Olly