22 Replies Latest reply: Dec 27, 2016 12:22 PM by mikell _

    Measuring true line length in Monarch

    mikell _

      Because of CRs embedded in a report, I'm getting false carriage returns, which mess up the model.  First of all, is there any way to exclude errant CRs in the middle of a line?  Secondly, my version of Monarch (10.5) appears to compress lines that contain a lot of spaces: it seems to end the line at the last non-space character when a large number of spaces follow it.  Is there any way to force the inclusion of all spaces?  I would like to be able to calculate the true input line length, without Monarch compressing spaces. There is a checkbox in "Input Options" for "Trim leading and trailing spaces from Character and Memo fields". It is not checked; when checked, it does trim the lines as described. However, even when it is not checked, Monarch still trims some long groups of spaces, seemingly trailing spaces.

        • Re: Measuring true line length in Monarch
          Olly Bond

          Hello Mike,

           

          There used to be a DOS utility on Datawatch's site as a free download called MSRP.exe.

           

          This Monarch String Replacement Program allowed you to replace character 10 or 13, for example, with a space, or nothing at all.

           

          I think this functionality was then automatable using the version of Monarch Utility that shipped with v10.

           

          monarchu.exe report.txt /replace... was the syntax, if I recall correctly.

           

          Are your lines longer than 2000 characters?

           

          Best wishes,

           

          Olly

            • Re: Measuring true line length in Monarch
              Grant Perkins

              Mike,

               

              As Olly has mentioned - check out Monarch Utility in which you should find the tools you need to tidy up your report before processing.

               

              Also have a look at it in Notepad and WordPad. (I assume it is not a PDF file - in which case the approach will need to be different.)

               

              Unless it has been in some way altered by some other process you may find that the CRs are related to the apparent loss of trailing white space.

               

              Monarch Utility offers several useful tools for preparing a file so you should find something in there if what you are dealing with is basically a straight record dump. Sounds like it is.

               

              If it still looks tricky in terms of layout but you know the expected field sizes you should be able to map in the fields you want via Utility or, if that proves tricky, using Monarch's more advanced features.

               

              Be sure you are looking at the source data using a regular font. If you have the screen defaulted to a proportional font the chances are things will look a mess - especially if there is a lot of white space in the records, which in my experience is usually the case!

               

              HTH.

               

               

              Grant

                • Re: Measuring true line length in Monarch
                  mikell _

                  As I said, I cannot use the utility, unless built into the utility is the option to only strip out CRs when the line length is less than 132. Otherwise it would strip out all the CRs, which would then seem to mess up the input report.

                  Also, I have "looked" at the input report and validated exactly what I have said. And because these files are already being run in automated fashion, I don't want to have to insert another utility and also can't run it by hand. Again, all I want is for Monarch to pass the input lines through at exactly the length they have in the input file, which it doesn't seem to do now, even though my trim option is off.

                    • Re: Measuring true line length in Monarch
                      Olly Bond

                      Hello Mike,

                       

                      Sorry, I should have been clearer. I was assuming your lines had a CR (ASCII 13) occurring at bad places, and CR LF (ASCII 13 10) at the end of each line. So you can automate it by replacing 13 10 with a string like "Olly's fake line ending", then replacing the bad 13s with a space, then replacing the temporary string with 13 10.
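
For readers who want to see the three replacements spelled out, here is a minimal sketch in Python rather than Monarch Utility (whose exact /replace syntax is not confirmed in this thread). The function name and the placeholder bytes are hypothetical; the only assumption about the report is that genuine line ends are CR LF and stray breaks are a lone CR.

```python
# Hypothetical placeholder; it must never occur in the real report.
PLACEHOLDER = b"\x00EOL\x00"

def remove_stray_crs(data: bytes) -> bytes:
    """Replace lone CRs with a space while keeping every CR LF line ending."""
    if PLACEHOLDER in data:
        raise ValueError("placeholder already present in input")
    protected = data.replace(b"\r\n", PLACEHOLDER)  # step 1: hide the real line endings
    cleaned = protected.replace(b"\r", b" ")        # step 2: any CR left is a stray one
    return cleaned.replace(PLACEHOLDER, b"\r\n")    # step 3: restore the real line endings

sample = b"ABC\rDEF\r\nGHI\r\n"
result = remove_stray_crs(sample)
```

Replacing the stray CR with a space (rather than nothing) keeps the line at its original character count, which matters when fields are mapped by position.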

                       

                      I've got v10 here and would happily have a bash at this if you can send me the report and model.

                       

                      Best wishes,

                       

                      Olly

                       

                      Olly Bond

                      MONARCH EXPERTS

                      www.monarchexperts.com

                      olly@monarchexperts.com

                        • Re: Measuring true line length in Monarch
                          mikell _

                          Olly,

                          It would seem your proposed solution would work, if I could apply it within the model. Because of our automation, I cannot run any external app to clean up the input file. I would need to perform your strip out CR&LF then replace LF then restore CR&LF, within Monarch.

                          But, again, why does Monarch not represent the true line lengths?  Other than where a line is truncated as explained above, my line lengths are 134 characters long (determined by viewing in EditPad Pro).  Inside of Monarch the captured lines (which ignore the first column) are most commonly 132 but can range down into the fifties, mainly depending on how many empty spaces are on the right-hand side of the line. If I can do nothing else about my LF issue within Monarch, I would at least like Monarch to pass the lines through untruncated so I could set up an error-detection method using Len().

                    • Re: Measuring true line length in Monarch
                      mikell _

                      Olly,

                       

                      If I run the utility, won’t it strip the CRs from all the lines?  Most of my lines are 132 characters.  If I had some option to apply such a function only when the raw input line was shorter than 132, it could work. Perhaps I could also develop the necessary functionality, but only if the lines could be brought into Monarch at their actual length. However, Monarch 10.5 seems to truncate the true line lengths when the lines end with multiple spaces. All I’m currently trying to do is have the true length of the line imported into Monarch, so I can use Len() to determine when a line has an improper CR in it and set a flag to point out the bad data.

                        • Re: Measuring true line length in Monarch
                          Grant Perkins

                          Mike,

                           

                          It sounds very much like the "report" you have should really be thought of as a database record.

                           

                          I would suggest that whatever has "wrapped" the lines from their original length will have added the CRs and zapped the trailing spaces.

                           

                          You really have 2 options and variations on them.

                           

                          The Utility tool (or a number of text editors that are available) can undertake the file preparation task to get it back to the full original length as a single line that Monarch should then be able to deal with. Sounds like you have 2 "standard" 132 char wide "report" lines (based on paper width for old dot matrix printers) and a short line that would be the end of the record.

                           

                          In theory those tools and the required parameters can be included in a batch script in an automated process.

                           

                          You could also go back along the supply lines of the report to seek a point where maybe the records had not been wrapped.

                           

                          Alternatively you could, knowing the expected field lengths I assume, simply make a template with 3 lines of data and map the fields directly based on start point and length.

                           

                          There is a good chance that the field at the end of line one might wrap onto line 2 (if it does it will always be like that) and that the same thing might happen for line 2 to 3. If so you will need to rebuild what should have been a single field (assuming you need it) from the split ones. A calculated field or two in the Table will do that for you.

                           

                          There are some other options, maybe, but the above are the usual generic approaches to dealing with the sort of problem you are faced with.

                           

                           

                          Grant

                            • Re: Measuring true line length in Monarch
                              mikell _

                              Yes, the "report" should be thought of as a database record. That is what I'm doing with this report: using Monarch to output the print report into a delimited file and then using SSIS to load that data into a database. And I can't easily change the automated process to insert an app to deal with the errant CRs because there are hundreds of other reports running through the same process and I'm not about to mess with that automation.  I could handle the issue if only Monarch would do as it says in its Option/Input setting and handle the input lines accurately. It is trimming the trailing spaces even though "Trim leading and trailing spaces from Character and Memo fields" is not checked. I determine this by taking the complete line as a field and doing a calculation on that field as Len(name of the measured field).  If there are characters at the very end of the field, the "DetailLineLength" will be 132. If not, the calculated length will be something less, seemingly counting to the last non-space character in the line. To me, that suggests Monarch is trimming the trailing spaces.

                              Also, "Ignore unused print control characters (0 - 31)"  and "Ignore form feed character (12)" have been tried set and not set, with no difference.

                              I note that EditPad Pro and other editors that can show CR/LF or CR show all the lines as 133 characters, other than the very infrequent errant lines with a CR inserted.  Within Monarch, I am ignoring the first column.

                              If I could accurately read the length of the lines in Monarch as they exist in the input file, as the input settings claim, I would have no problem dealing with this issue.

                                • Re: Measuring true line length in Monarch
                                  Grant Perkins

                                  Mikel,

                                   

                                  In my experience with this type of file whatever has written the output from the original database (it's likely to be a "Print" command where the output device is understood by the Print program to be a dot matrix printer with a maximum print line of 132 chars - standard width greenbar type listing paper) has got to the end of the available data and simply started a new line (CR/LF).

                                   

                                  As far as Monarch is concerned, once you have established that the entire record requires 2 or 3 lines to be concatenated into a single line, where the final CR/LF appears (denoting the end of a record) does not matter at all for the output (export) that you will define and create.

                                   

                                  In terms of your automated process ... are all the other reports in need of the same re-working?

                                   

                                  If yes, what makes this one different?

                                   

                                  If no, why would the existing process be suitable for what you need to do with it in its current form?

                                   

                                  BTW, for clarification the Help provided for the Input Options includes the following (I have lifted this from V11 Help as I don't have V10 on this machine but there are no changes between the versions in respect of this feature.)

                                   

                                  Ignore Print Control Characters

                                  Select this setting to have Modeler ignore all print control characters below ASCII 32 except for:

                                  • Form Feed (ASCII 12)
                                  • Line Feed (ASCII 10)
                                  • Carriage Return (ASCII 13)
                                  • TAB (ASCII 09)
                                  • Null (ASCII 0) which is converted to a space (ASCII 32).
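
As a rough illustration of what the quoted Help text describes, the sketch below applies the same rule in Python. It is only a reading of the Help wording, not Monarch's actual implementation, and the function name is hypothetical.

```python
# Control characters the quoted Help text says are NOT ignored:
# TAB (9), Line Feed (10), Form Feed (12), Carriage Return (13).
KEPT_CONTROLS = {9, 10, 12, 13}

def filter_control_chars(data: bytes) -> bytes:
    """Drop control characters below ASCII 32, except the kept ones; map NUL to space."""
    out = bytearray()
    for byte in data:
        if byte == 0:
            out.append(32)          # Null is converted to a space
        elif byte < 32 and byte not in KEPT_CONTROLS:
            continue                # other print control characters are ignored
        else:
            out.append(byte)
    return bytes(out)

filtered = filter_control_chars(b"A\x00B\x07C\tD\r\n")
```

Under this reading a stray CR always survives the filter, which matches the behaviour reported in this thread: the setting does not remove the errant line breaks.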

                                   

                                  It would be very unhelpful if Monarch totally ignored basic line control characters.

                                   

                                  Note also that Monarch can make use of much wider lines than the 132 character "printer format" constraint from which your input file suffers. How wide depends on the version in use. You have 4000 character line width to play with in V10.5.

                                   

                                  If you are in a position to share the file, or a few records from it, we can work out the options you have for dealing with it but basically the need, however you choose to perform it, is as Olly originally suggested - remove the CRs and leave the CR/LF although maybe LF alone might work.

                                   

                                  By and large trying to grab each line as a field and then concatenate them is certainly an alternative approach but it does introduce the potential issue of trimming spaces and so may not be at all ideal for this requirement without considerable additional work to ensure accurate re-construction.

                                   

                                  HTH.

                                   

                                   

                                  Grant

                                    • Re: Measuring true line length in Monarch
                                      mikell _

                                      I don't need to concatenate lines at all. I had set up the template just fine, filling some 235 fields with data according to the locations in the report. But, occasionally an errant CR would occur in a line which would break the model. If Monarch would not trim the trailing spaces (and I have it set not to) I might use Len() to read where the lengths of the input lines weren't 132 characters and notify what section needs to be corrected.

                                      Again (and again and again) I note that trailing spaces in each line are trimmed according to where the last non-space character is, in spite of the setting. This appears to be a bug. And the very latest version (13.5) acts the same, which disturbs me greatly because it means that even if we upgrade, which we were about to do, it wouldn't correct the issue.

                                      And no, all the reports haven't seemed to have the same problem. Most of the other points haven't really applied to this case.

                                      The reports we handle with Monarch are not simple output from a database, with a print command. They are formatted with descriptive headers.

                                      And where the occasional lines are split because of an incorrectly added CR cannot be defined in the template, since they are a fairly rare occurrence. However, a .0001% occurrence can still mess things up when the raw input report is a 2 GB file.

                                      As far as the automated process, this is the first time I have seen this issue, but it obviously could happen again.

                                      I do have "Ignore Print Control characters" checked and "Trim leading and trailing spaces" NOT checked.

                                      My input file does not "suffer" from a 132 character "printer format" constraint. What I suffer from is that "Trim leading and trailing spaces" is not checked in the input options, but Monarch is still trimming the trailing spaces, very much against my wishes. No one tries to explain why this is in spite of my making a major issue of it. This also occurs in the newest version 13.5. I consider it to be a bug and it should be corrected if we are to continue to use Monarch, which is definitely not a given at this point.

                                      Once again, I am not in a position to run the input file through another app in the automated process. I must work with the file as it comes into Monarch and the problem here is that Monarch is trimming the trailing spaces, against my wishes and not in accordance with their own input settings.

                                        • Re: Measuring true line length in Monarch
                                          Olly Bond

                                          Hello Mike,

                                           

                                          I've used Utility to replace CR LF with # CR LF to force a non blank character at the right hand margin before. Have you had any luck with that approach?
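
To make the idea concrete, here is a minimal sketch of that replacement in Python rather than Monarch Utility. The function name and the choice of "#" as marker are illustrative assumptions; the marker should be a character that cannot appear in the last column of the real report.

```python
def mark_line_ends(data: bytes, marker: bytes = b"#") -> bytes:
    """Insert a visible marker before every CR LF so trailing spaces are no longer trailing."""
    return data.replace(b"\r\n", marker + b"\r\n")

marked = mark_line_ends(b"AB   \r\nC\r\n")
```

With the marker in place, the last non-space character of every complete line sits one column past the true line end, so a length measured to the last non-space character reflects the full width of the line.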

                                           

                                          Best wishes

                                           

                                          Olly Bond


                                            • Re: Measuring true line length in Monarch
                                              mikell _

                                              I am not in a position to apply any app to the data before it runs in the Monarch model.

                                               

                                              But why does no one deal with the apparent fact that the setting "Trim leading and trailing spaces" does not work as it should? This is a bug, and if the setting worked as it should, I wouldn't have a problem. And this bug exists in Monarch 13.5 as well.

                                                • Re: Measuring true line length in Monarch
                                                  Chris Porthouse

                                                  Have you submitted this issue to Datawatch support (support.datawatch.com)?  The community is a good way for self help but if you think you have found a bug/defect, reporting it to support might be a better option.  This way they can try to verify and replicate the issue and escalate for resolution.  Monarch v14 is right around the corner and it would be nice to squash any bugs that are in the product.

                                                    • Re: Measuring true line length in Monarch
                                                      Olly Bond

                                                      Hello Chris,

                                                       

                                                      That's a nice idea - though I'm not sure if a customer with v10 and possibly not on maintenance might be in the best place to get a response from support.

                                                       

                                                      But I'd certainly agree that the user should try 13.5 and see if that solves the problem. Then again, it's clear in this case that they are making a lot of use of automation.

                                                       

                                                      Best wishes,

                                                       

                                                      Olly

                                                        • Re: Measuring true line length in Monarch
                                                          mikell _

                                                          I have tried the Trim Trailing Spaces issue on Monarch 13.5 and it acts the same as in 10.5.

                                                          For that matter, since I am doing a direct comparison between 10.5 and 13.5, other than certain limitations of 10.5 I am very displeased with comparisons, such as loading time of the Table view or the raw report itself. 10.5 appears to be much, much faster loading an input file (like in the 250mb range), and loading the table view. Unless something changes, I can't conceive of moving to 13.5 or later.

                                                • Re: Measuring true line length in Monarch
                                                  mikell _

                                                  And I cannot share even a small portion of the file due to governmental restraints (and commonsense). We are a mortgage company.

                                                    • Re: Measuring true line length in Monarch
                                                      Grant Perkins

                                                      Well, we can offer a secure server and with V13 you could redact sensitive details.

                                                       

                                                      We have worked with financial organisations before doing similar investigations.

                                                       

                                                      As a matter of interest what is the source file type? Text? PDF? Something else - mainframe for example?

                                                       

                                                      Also, when you are displaying the file within Monarch are you using a fixed width font? A trivial thing that is easily overlooked sometimes.

                                                       

                                                      When you say Monarch trims trailing spaces I am puzzled. Apart from the last line, where lines have been wrapped at some point by whatever program, I have not seen a file exhibit truncated lines due to trimming of data on what is otherwise a file of fixed position data exported correctly.

                                                       

                                                      On the other hand if, for some reason, it's a PDF file then nothing would surprise me.

                                                       

                                                      If you have an errant CR or two (it would be interesting to know where they are coming from) I would have thought that they are either an artefact of the data content or something that has been introduced by another step in the production of the file. I have not seen something like that introduced by Monarch when reading a file and displaying it.

                                                       

                                                      I don't quite understand how the Trim issue is likely to affect your data mapping of the fields and their respective positions in the record. Trim is generally something applied post extraction to tidy up the appearance of fields extracted from relatively loosely formatted text files - especially large blocks of text. From what you have written that does not seem to be what you are doing with the model and file in question at the point that the problem arises.

                                                       

                                                      Please bear in mind that we are to a great extent working in the dark here without having access to an original file exhibiting the problem.

                                                       

                                                       

                                                      Grant

                                                        • Re: Measuring true line length in Monarch
                                                          mikell _

                                                          1. I cannot send data files without  explicit permission. This is only done with contractors who have signed agreements. And I will not bother to waste further time doing redaction in V13.

                                                           

                                                          2. The source file is text.

                                                           

                                                          3. The displayed font is fixed width, although that wouldn't make a difference in this case. It would just make counting spaces harder.

                                                           

                                                          4. What is hard to understand about Monarch trimming trailing spaces? If you have set the option to do that, the spaces after the last non-space character will be trimmed. What I have observed is that it still appears to do that regardless of how the input option has been set. This is not about truncating actual data, only about trimming spaces.

                                                           

                                                          5. As I have explained ad infinitum, the errant CR is caused by people copying and pasting from other sources and the CR being included into the input field. This is connected to an external site where the data is aggregated and sent back in the form of text based formatted reports. The CR is correctly interpreted by Monarch because it is in the source report. However the source report is supposed to follow a prescribed pattern which is how my Monarch model parses the text report into a table. Monarch does not introduce an errant CR. Monarch displays the lines EXACTLY as they appear in the input report. And in a non-Monarch text editor, every normal line (other than the rare line with the errant CR) will have exactly 132 characters including spaces.

                                                           

                                                          6. The "Trim issue" does not affect my data mapping. My problem is I want to test the lines to determine where an errant CR has been placed (NOT BY MONARCH) so I can notify of the issue and deal with how it breaks the model after that errant (and very occasional) CR.  Since all normal input lines are 132 characters, I use a single field that will capture EVERY line and then perform a calculation for EACH line using "Len()".  If a line shows up less than 132, I would know an errant CR has been mistakenly placed in the source file.
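
The check described in point 6 can be sketched outside Monarch as follows. This is an illustration in Python only, assuming that genuine line ends are CR LF, that every normal line is exactly 132 characters, and that a stray break is a lone CR; the function name is hypothetical.

```python
EXPECTED_LEN = 132  # normal line length stated in the post

def find_bad_lines(data: bytes, expected: int = EXPECTED_LEN):
    """Return (line_number, raw_length) for lines that hold a lone CR or have the wrong length."""
    pieces = data.split(b"\r\n")
    if pieces and pieces[-1] == b"":
        pieces.pop()                      # drop the empty piece after the final CR LF
    bad = []
    for number, line in enumerate(pieces, start=1):
        if b"\r" in line or len(line) != expected:
            bad.append((number, len(line)))
    return bad

good = b"A" * 132
broken = b"A" * 60 + b"\r" + b"A" * 72   # a stray CR pasted into the middle of a line
report = good + b"\r\n" + broken + b"\r\n" + good + b"\r\n"
flagged = find_bad_lines(report)
```

Because the raw bytes are measured before any trimming, trailing spaces count toward the length here, which is exactly what the Len() calculation inside Monarch is not delivering.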

                                                           

                                                          I will duplicate below how the table is displayed in Monarch (a few sample rows), showing only columns 107 to 118 of each line. The first row holds the column headers:

                                                           

                                                          DetailRows      DetailLineLength
                                                          355-LAST-COV    118
                                                          DL-NONRPT-FG    118
                                                          HI-PRCD-TYP     117
                                                          ESC-WVR-DT      116
                                                          ILLP            110
                                                          ARM-INIT-DT     117

                                                           

                                                           

                                                          But either Monarch's "Len()" function automatically trims the input field, ignoring trailing spaces, or Monarch trims the trailing spaces of each line on import and Len() reports the actual length of the line as imported. I cannot see which, since I do not believe I can view CRs or LFs in Monarch (I can in my other text editing utilities).
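
The sample lengths are at least consistent with "length up to the last non-space character". The Python sketch below rebuilds each sample as a 132-character line with the value starting in column 107 (an assumption based on the column range quoted above) and measures it after stripping trailing spaces. The helper names are hypothetical, and Python's rstrip merely stands in for whatever Monarch does internally.

```python
LINE_WIDTH = 132   # normal line length stated in the post
START_COL = 107    # assumed 1-based column where the sample values begin

def padded_line(value: str, start_col: int = START_COL, width: int = LINE_WIDTH) -> str:
    """Place value at start_col and pad with spaces to the full line width."""
    return (" " * (start_col - 1) + value).ljust(width)

def trimmed_length(line: str) -> int:
    """Length of the line once trailing spaces are removed."""
    return len(line.rstrip(" "))

samples = ["355-LAST-COV", "DL-NONRPT-FG", "HI-PRCD-TYP",
           "ESC-WVR-DT", "ILLP", "ARM-INIT-DT"]
lengths = [trimmed_length(padded_line(value)) for value in samples]
# lengths == [118, 118, 117, 116, 110, 117], the same values as DetailLineLength
```

This does not say which of the two explanations is right: trimming on import and trimming inside Len() would both produce these numbers.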

                                                           

                                                           

                                                           

                                                            • Re: Measuring true line length in Monarch
                                                              Grant Perkins

                                                              Mike,

                                                               

                                                              Thank you for this extensive description which clears up, for me at least, the questions I had after reading the earlier posts.

                                                               

                                                              Now I understand where the errant CRs are likely to come from.

                                                               

                                                              To find the positions of ASCII characters in a field using the ASCII number for the character, Monarch makes use of the ASC() function.

                                                               

                                                              If you combine that with the INSTR() function Monarch will tell you where in a field the ASCII character first appears, reading left to right. If you compare that with the LEN() answer it should be possible to work out which extracted lines/fields are troubled by an embedded but unwelcome formatting character (or characters).

                                                               

                                                              Add in the potential for the IF() function and the STRIP() function to be deployed in a suitable formula and removal of the errant CRs should be possible.

                                                               

                                                              Further I would guess that there are a limited number of fields in the record that could be afflicted by the problem. If so that might help to ascertain which parts of the record require specific handling to fix the extraction ready for exporting.

                                                               

                                                              It might get a bit tricky and be slightly messy but is probably deliverable. Watch out for the possibility of more than one unwanted ASCII character per field or line.

                                                               

                                                              There are some other functions that may be useful too but those mentioned above are probably the basics.

                                                               

                                                              HTH.

                                                               

                                                               

                                                              Grant

                                                                • Re: Measuring true line length in Monarch
                                                                  mikell _

                                                                  I know of and use all the functions you mentioned. None will correct this issue because once the input file comes into Monarch, with the errant CR, the model breaks (for that page).

                                                                  If I could accurately measure each line length (which Monarch doesn't seem to do since it Trims the trailing spaces, regardless of the Trim setting), I could set a flag and send a message saying there is an issue on the particular loan number.

                                                                  I am talking with Datawatch support about what I consider to be a bug.

                                                                    • Re: Measuring true line length in Monarch
                                                                      mikell _

                                                                      Concerning my issue that the input setting "Trim leading and trailing spaces from Character and Memo fields" doesn't work properly, I was informed today that the condition I observed has been verified by Datawatch Support and a ticket is being raised with the development team. (ref: Ticket ID 29838)