9 Replies. Latest reply: May 15, 2014 9:56 AM by Grant Perkins

    Data Template Question from a Newbie

    subrama6 _

      Hi,

       

      Got a quick question.  I've played around with this for a while, and now that I've reached the end of my expertise, I thought I'd reach out to the Datawatch community.  I have a file that I need to apply a data template to.  Basically, the file is in rows of alternating formats.

       

      Format 1

      Format 2

      Format 1

      Format 2

      Format 1

      Format 2

      ...etc

       

      A Format 1 immediately followed by a Format 2 constitutes a single record.  I need to be able to pull data from both lines when I build my table.  The problem I'm having is as follows.  There are instances when there are two Format 1 lines in a row or two Format 2 lines in a row.  I'm only interested in consecutive Format 1/Format 2 pairs.  When I define my data template as containing both of those lines, random page breaks (this is a flat dump to a text file with no page breaks of its own) mess everything up, because sometimes a Format 1 line appears immediately before a page break with the corresponding Format 2 line following it.  That pair doesn't get caught.  As an alternative, I tried setting up the Format 2 line as a regular template, and the Format 1 line as a page header.  This works, and correctly ignores the first Format 1 if there are two Format 1's in a row, i.e.

       

      Format 1 <ignore in table view>

      Format 1

      Format 2

       

      However, it fails when I have two Format 2's in a row, i.e.

       

      Format 1

      Format 2

      Format 2 <ignore in table view>

       

      In this case, it incorrectly posts both Format 2 lines to the table, instead of ignoring the second as I want.  Does anyone have any suggestions as to how I might set up my data template(s) to deal with such a frustrating file format?

       

      Thanks in advance

       

      [ August 11, 2004, 03:05 PM: Message edited by: subrama6 ]

        • Data Template Question from a Newbie
          Grant Perkins

          Hi,

           

          Your description initially suggests the idea of creating a 2 line detail template based on either the first or the second line as the trap line, and filtering the results to eliminate records without the line pairs. However, that may not always work reliably.

           

          Is there anything in the lines that would help to identify the fields you require as paired lines?

           

          Indeed, is there anything in the report to prevent an isolated format 1 line being followed by an isolated format 2 line?

           

          One approach might be to trap for format 1 lines but define the WHOLE WIDTH OF THE LINE as the field, and then set 'End field on' to "None of the above" (V7) or "Minimum Action" (V6).

           

          That should give you a single very large field for the twin lines and a shorter one for the single lines. The data in the long field could then be sliced and diced using some other techniques.

           

          However it would be useful to see a sample of the 'report' first if possible in the hope that other more direct ideas can be identified.

           

          Any chance?

           

          Grant

          • Data Template Question from a Newbie
            subrama6 _

            Thanks very much for responding.  Unfortunately, for work-related nondisclosure reasons, I am unable to give you an actual report.  However, what follows is dummy data in the format of the report, which will hopefully be of some assistance.  Because of the forum formatting, this will likely look strange, but within each format the fields are generally all of the same length, and right justified when they are not:

            ---

             

            SOMETHING        25    100.00 WHATEVER              10.00

            ABC   01           123456                           100.00

            SOMETHING        50     20.00 WHATEVER              10.00

            DEF   04           789012                            45.00

            SOMETHING         5      5.00 WHATEVER              10.00

            SOMETHING         5      5.00 WHATEVER              12.00

            HIJ   02           345678                            23.00

            SOMETHING        75     50.00 WHATEVER              10.00

            KLM   05           901234                            75.50

            SOMETHING        20     25.00 WHATEVER              10.00

            NOP   54           567890                            87.00

            SOMETHING        15    250.00 WHATEVER              10.00

            QRS   75           987654                           345.50

            QRZ   75           987654                           345.50

            SOMETHING        25     75.00 WHATEVER              10.00

            TUV   34           321098                            13.00

            ---

            As clarification, please note the two instances of the anomalies I previously discussed within this data set.  The first is the two "SOMETHING" lines above the "HIJ" line.  In this instance, the first "SOMETHING" line should be ignored, and the second should be paired with the "HIJ" line.  The second anomaly is the "QRZ" line immediately following the "QRS" line.  In this case, the "QRZ" line should be ignored.  I have tried a two line template trapping on the second line, and it does work, but because this is a long flat text file, it gets messed up when a page break comes between a format 1/format 2 pair.  As a complete newcomer to Monarch, I'm not sure I completely understand your second suggestion, but hopefully this sample report will help to clarify the situation.  Again, I appreciate your help.

            • Data Template Question from a Newbie
              subrama6 _

              One more thing - to address your question of whether there is anything in the report to prevent an isolated Format 2 line from following an isolated Format 1 line, that is an extremely good point that I failed to take into consideration.  The answer is that there is not.  As that is the case, I'd like to consider the template simply assuming that a Format 1 line followed immediately by a Format 2 line does constitute a valid pair.  I will see if I can deal with the isolated issue you raised from the end of the program generating this report.

              Thanks again

              • Data Template Question from a Newbie
                Grant Perkins

                Hi,

                 

                Don't worry too much about my second suggestion - it's the sort of thing one has to consider when a report just does not respond to the more usual approaches. I think you will be OK here.

                 

                If trapping on the second line works (except for the page breaks), I think you might find that trapping on the first line and filtering out any results which have no second line fields would also work. However, it seems you have a solution with the exception of the page break problem. Is that correct?

                 

                Is the page break a visible separation 'printed' as part of the file? A row of dashes, for example?

                 

                If so, it should be possible (but I have seen exceptions!) to make a template for the page break - including any blank lines either side of a physical marker - and set it to be a Page Header.

                 

                Page header areas are 'invisible' to the part of the process dealing with details and appends, and so the orphaned lines should be processed correctly.

                 

                You may also want to check the information in the Monarch Help related to 'Input Options' for report translation, particularly the section on "Ignore Form Feed Characters".

                 

                I think that should get you to where you need to be if I have guessed correctly.

                 

                I have not tried working with the sample but can do so if necessary - are the format 1 lines indented as they appear to be when viewed in edit mode through the forum window? (The forum displays often adjust the posted layouts and sometimes it is difficult to be sure that what can be extracted is how it was meant to be!)

                 

                 

                Grant

                • Data Template Question from a Newbie
                  subrama6 _

                  Grant,

                   

                  The format 1 lines are indeed indented a little.  Hopefully that helps us out a bit :)  It is correct that I am somewhat close with two different solutions.  Setting both lines as a single template, but trapping on the 6 digit number in the second line, works except for page breaks.  These page breaks are not in the original file, as it is just plain text created by Visual Basic.  In Monarch, the page break appears as a line of dashes ("----") across the page.  Alternatively, I get close if I define the second line as a detail template and the first line as a page header template.  Though this solves the problem of two Format 1 lines in a row, it doesn't work with two Format 2 lines in a row.  I tried ignoring Form Feed Characters and Unused Print Control Characters in the Input options as you suggested, but it still ignored a pair I have split over a page break.  Perhaps I should change the page break options so that a break does not happen every 256 lines, but there's no way to save that in the .mod file I'm creating.  Anyway, thanks so much for your help.  I fear I'm not being as clear as possible, so please let me know if you have any further questions and I'd be glad to do my best to clarify further.

                  Thanks

                  • Data Template Question from a Newbie
                    subrama6 _

                    Just tried one more thing - if I set the detail template to be both lines but trap on something in the first line, as opposed to the second as I had been doing earlier, it properly spans page breaks.  However, it incorrectly pulls data when I have two format 1 lines in a row... but it's close :)

                    • Data Template Question from a Newbie
                      Grant Perkins

                      Just coming at this from a different angle for a moment ...

                       

                      I wonder if the format of the 'report' - really a file dump I suspect - is as it is simply because it has been wrapped in order to print on a specific size of paper.

                       

                      If so, and provided the total for each record would be less than 1000 characters on a single row, it might be worth looking at the possibility of changing back to single row format before using Monarch.

                       

                      If you can open the report (or part of it) in Word and check the print control characters you might see single <LF> and double <CRLF> instances relating to the different line combinations.

                       

                      In which case there is an opportunity to amend the file to single rows per record using the MSRP utility.
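
If Word is not to hand, the same check on line endings can be made with a short script. The following is a minimal Python sketch, not part of Monarch or MSRP; the function name `count_line_endings` and the file name in the comment are placeholders chosen for the example.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LF and bare CR characters in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF not preceded by CR
    cr = data.count(b"\r") - crlf   # CR not followed by LF
    return {"CRLF": crlf, "LF": lf, "CR": cr}


if __name__ == "__main__":
    # Small in-memory demonstration. For a real file, read it in binary mode,
    # e.g. open("report.txt", "rb").read() (hypothetical file name).
    demo = b"first half of record\nsecond half of record\r\nnext record\r\n"
    print(count_line_endings(demo))
```

If the counts show a mix of the two kinds of line ending, that would support the idea that the wrapped lines can be told apart from true record ends; if only one kind appears, the wrapping theory probably does not apply to that file.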

                       

                      Now, this might be a long shot but if it is feasible it could make life much easier!

                       

                      BTW, I have no problem with your clarity of description but I do find it much easier to seek the solutions to specific problems when in possession of the original files. Of course I also appreciate the need for confidentiality and fully respect that. What I am very aware of is how easy it can be to make wrong assumptions about something or miss an opportunity for a useful solution by not having the entire problem visible.

                       

                      It can be extremely difficult to fully describe a problem and equally difficult to fully describe a proposed solution.

                       

                      No matter - working through the process is much of the fun!

                       

                       

                      Grant

                       


                      • Data Template Question from a Newbie
                        subrama6 _

                        Grant,

                         

                        Actually, to give you a little more information, the report is formatted as such for the following reason.  There exists an initial file that has various bits of information - say 10 rows worth - for each individual account.  Quite ridiculously, and outside of my control, this file is a flat text dump in two columns.  My task was to get a couple of pieces of information from each entry.  To begin that process, I wrote VB code to split everything into one column by reading through the file once and writing characters 1-50 of the active line to a new file, then reading through the file again and adding characters 51-100 to the new file.  This was step 1.  I then amended the code so as to only get the relevant lines with respect to the information I wanted, namely the lines of format 1 and format 2.  I thought that I might be able to conduct the data extraction portion of it more easily through Monarch (since I will eventually have to go to Excel), and when writing the VB code a few weeks ago, having no knowledge of Monarch at the time, wrote it without thinking of these ramifications.  In retrospect, it certainly makes sense to have one large line for each entry, and that is an option I'm currently exploring in VB.  If it would help any, I'd be happy to mock up a file with dummy data in the appropriate format and email it to you.  I didn't notice a way to post a file to this forum, but if there is a way, I'd be happy to do it.  Thanks again
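
The two-pass column split described in this post can be sketched as follows. This is Python rather than the original VB (which is not shown in the thread), and the function name `split_two_columns`, the `width` parameter and the trimming of trailing spaces are assumptions made for the illustration.

```python
def split_two_columns(lines, width=50):
    """Turn a two-column text dump into a single column.

    Pass 1 takes characters 1-50 of every line; pass 2 takes characters
    51-100 of every line and appends them after the first pass.
    Trailing spaces are trimmed (an assumption, not stated in the thread).
    """
    stripped = [ln.rstrip("\r\n") for ln in lines]
    left = [ln[:width].rstrip() for ln in stripped]             # first pass
    right = [ln[width:2 * width].rstrip() for ln in stripped]   # second pass
    return left + right


if __name__ == "__main__":
    two_column = [
        "account A line 1".ljust(50) + "account C line 1",
        "account A line 2".ljust(50) + "account C line 2",
    ]
    for row in split_two_columns(two_column):
        print(row)
```

Keeping the split as a list of rows, rather than writing straight to a file, makes it easy to add a later step that joins each Format 1/Format 2 pair into one long line before the data goes to Monarch.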

                        • Data Template Question from a Newbie
                          Grant Perkins

                          It sounds like you have within your grasp a few other ways to control the presentation of the data.

                           

                          It may be worth checking the MSRP (very fast file manipulation) and PREP utilities to see if they can be useful in your data manipulation.

                           

                          I would be very happy to work with a mock-up file. Given what we have covered so far, a 'real' example or two of the page break scenario would be valuable.

                           

                           

                          There is no way to post a file on the forum but I will send you a Private Message with my email address.

                           

                          Grant