10 Replies Latest reply: May 15, 2014 9:59 AM by Grant Perkins RSS

    help with variable length fields

    Nupur _

      Hello,

      I need a little help with the following data:

      [font="courier"] >>> xxxxx, xxxxxx      IxxxxxxxxxxyydddddS

      1

                             xxxxxxx xxxxxxxxxxx xxxxxxxx

                          xx xx xxxxx xx. xxxxxxx, xx ddddd

      Path#: yyddddd  -S     SURGICAL PATHOLOGY REPORT       Date Received: yymmdd

       

      Name: ADAMS, xxxxxx      I      UMR#: xxxxxxxxxx       ACCT: xxxxxxxxxx

       

      Surgeon: xxxxx, E.               Referring Physician: xxxxxxxx, C.

       

      Location: 010903      Age : 62            DOB : yymmdd           Sex : F

       

      Specimen      : UTERUS, BILATERAL FALLOPIAN TUBES AND OVARIES; APPENDIX.

      ******************************************************************************

      Date Reported: 1-18-yy Report By x. y. xxxxxxxx, M. D.

       

      >>> xxxxxxxx, xxxxx       LddddddddddyydddddS

      1

                             xxxxxxx xxxxxxxxxxx xxxxxxxx

                          50 N. PERRY ST. PONTIAC, MI 48342

      Path#: yyddddd  -S     SURGICAL PATHOLOGY REPORT       Date Received: 93mmdd

       

      Name: ALDERMAN, xxxxx       L   UMR#: xxxxxxxxxx       ACCT: xxxxxxxxxx

       

      Surgeon: xxxxxx, C.L.            Referring Physician: xxxxx M

       

      Location: O           Age : 30            DOB : yymmdd           Sex : F

       

      Specimen      : COLON BIOPSY 35CM

      ******************************************************************************

      Date Reported: yy0506  Report By x.y. xxx D.O.

       

      >>> xxxxxxxx, xxxxxx      FxxxxxxxxxxyydddddS

      1

                             xxxxxxx xxxxxxxxxxx xxxxxxxx

                          xx xx xxxxx xx. xxxxxxx, xx 12345

      Path#: yyddddd  -S     SURGICAL PATHOLOGY REPORT       Date Received: yymmdd

       

      Name: xxxxxxxx, WINNIE      F   UMR#: xxxxxxxxxx       ACCT: xxxxxxxxxx

       

      Surgeon: xxxxxxx P          Referring Physician: xxxxxxx NB

       

      Location: CC0601      Age : 87            DOB : yymmdd           Sex : F

       

      Specimen      : BIOPSY OF TUMOR FOR FROZEN SECTION

      ******************************************************************************

      Date Reported: 1-12-yy Report By x.y. xxx D.O.

        /font[/quote]Each new record starts with the ">>>" string.  I've created a template with all 15 lines of the first record. I then created traps for each field based on the first record which consists of the first 15 lines since each field is preceded by the field name. The first line was made into a floating trap since there is some horizontal movement on almost every line of the template. I want to be able to specify when Monarch should stop a certain field. E.g. when I created a trap for Surgeon, it picked up the correct string for the first and second records, but there is some horizontal shift in the 3rd record, so Monarch is appending a few characters from the "Referring Physician" string.

      How do I specify the end of a field when there is horizontal shifts in the fields without creating floating traps for each line? I've actually tried that approach as well, but I'm getting extra rows in the Table view as there is some inconsistency in the data.

       

      Thanks for your help.

       

      Nupur

        • help with variable length fields
          Grant Perkins

          Can I offer one idea to address both your problem reports?     

           

          If you have any rows which are not variable map the fields in the normal way.

           

          For the rows where the data shifts horizontally consider something like extracting the entire row as a single field and then slice and dice it into the separate fields you need based on the position of the field tags (Tags being "Surgeon:", "Referring Physician:" and similar).

           

          If this appeals to you but you would like more examples of how to approach the slice and dice aspect let me know (and indicate the problem lines other than the Surgeon line if there are any) and I will offer some suggestions.

           

          HTH.

           

           

          Grant

          • help with variable length fields
            Nupur _

            Thanks a lot for your response.

             

            I would love some examples of the slice and dice approach that you mention. The only other place I'm having an identical problem is the "Date Reported" where the date and "Report by" strings show some horizontal movement.

             

            Regards,

            Nupur

            • help with variable length fields
              Grant Perkins

              OK then ...

               

              Let's assume that the only fixed point on the line is the first letter of the first TAG - I'll take the surgeon line for this example.

               

              We could create a field that covers the entire row and extract it as one (though in this case it would save some work to start the field AFTER the "Surgeon:" string.

               

              That leaves us 3 parts to the line in this example. The surgeon Name, the tag for the second field and the data for the second field.

               

              Let's do the easy bit first using on of the SPLIT functions - RSPLIT in this case.

               

              No matter whether or not we included the "Surgeon:" string if we

               

              RSPLIT(, ":", 1)

               

              the resulting calculated field will give the string of characters after the ":" that follows "Referring Physician:"

               

              Whether or not we include the "Surgeon:" string in the field

               

              RSPLIT(, ":", 2) (Or LSPLIT instead of RSPLIT)

               

              would give us everything after the "Surgeon:" string including the "Referring Physician:" tag.

               

              (If you use LSPLIT you would need to adjust the formula based on how many ":"'s are in the string.)

               

              Now you have some more options on your approach.

               

              Sticking with the SPLIT themeand assuming you did NOT include the "Surgeon:" tag you could RSPLIT on "space" and go for part 3 which should then separate the Surgeon Name from the following TAG text. Job done.

               

              Or you could use LEN() to get a figure for the length of the field after the split. Knowing you want to ignore the tag, which is always a fixed number fo characters, you could create a calculated field to get the LENgth of the field, subtract from the the number of characters in the phrase "Referring Physician" and get the remaining parts of the string using the LEFT function.

               

              There are several other approaches to this and similar problems but these look like they can probably be resolved quite readily with the sort of actions above.

               

              Other particularly useful functions for this sort of requirement are SUBSTR and INSTR - in fact any of the functions that you can see when creating or editing a calculated field in Version 8 grouped under the STRING functions in the function list.

               

              See how you get on with the SPLIT functions and we can move on to the others if we need to.

               

              Grant

              • help with variable length fields
                Bruce _

                Nupur

                 

                I was able to break up the report into several lines and it seemed to work OK. The lines that don't move are easy.

                 

                The trick I used for Referring Physician, was just to just pick up this detail in the line (leaving surgeon to another easy append), with the floating trap feature.

                 

                With the small sample, this worked fine.

                 

                Good luck

                • help with variable length fields
                  Grant Perkins

                  Like Bruce I think there is some potential for using the floating trap on the Surgeon line. Definitely potential for the first line (Date) of each record.

                   

                  There are some things to be wary of though, which is why the slice and dice option appealled more in this case.

                   

                  Firstly the concept of setting up an Append for data from the middle of a record is fine so long as the report lends itself to that concept. Remember that the start  of a template trap definition point for an append needs to be BEFORE the detail and that should any records not have the append line trap for some reason the previous append will be carried over to that record as well. Id does not look like a problem as far as this report is concerned but might be with others that look OK at first sight.

                   

                  Secondly the floating trap concept will work fine provided you can, in effect, define the start point of each field and the MAXIMUM SIZE that each field will ever be AND rely on that particular part of the trap being always present.

                   

                  In this case it looks like the trap point required would be ever present but there is plenty of potential it seems, unless the report print specification tells you otherwise, for the Surgeon Name to be a significant variable within some record in another issue of the report at some point.  If you know, for example, that it cannot exceeed 40 characters (and the same information relating to any other fields on the row can also be applied) then you can create a floating trap definition which will automatically adjust data sizes extracted according to variable width delivered. (This does not affect the defined size of the field in the table.) If you don't know the maximum sizes it is possible to be surprised from time to time!

                   

                  Settling on the slice and dice technique means these matters do not need to be considered, though the floating trap would be more elegant in many ways.

                   

                  Having said that for this type of record my personal preference would always be to keep all of the fields in the body of the record as part of the detail template if possible because the original report design seems to be consistently doing just that based on the sample. And that includes the Date Reported line as far as I can tell.

                   

                  All of which is just my justification for sticking with the slice and dice concepts as a 'trusted' solution in this particular example.       

                   

                  On the other hand why not experiment with both approaches and see how they work out across the full length report? It's a great way to get to be comfortable and productive with the many different methods that Monarch can offer.

                   

                  HTH

                   

                   

                  Grant

                  • help with variable length fields
                    Nupur _

                    Thanks for Bruce and Grant for their responses.

                     

                    The lsplit and rsplit functions worked beautifully. Even fields that were nested in the middle were extracted using nested lsplit and rsplits.

                     

                    I do have another question: When capturing a Memo field, it looked like Monarch was getting rid of all the <CRLF> characters. So the field looked like 1 long string. Is that correct? What do I do to preserve the formatting (newline placements) of the text?

                     

                    Gratefully,

                    Nupur

                    • help with variable length fields
                      Grant Perkins

                      Originally posted by Nupur:

                      I do have another question: When capturing a Memo field, it looked like Monarch was getting rid of all the <CRLF> characters. So the field looked like 1 long string. Is that correct? What do I do to preserve the formatting (newline placements) of the text?

                      /b[/quote]Ah yes, this one has come up from time to time and at some point in the life of V6 there was a release that, in error, did retain CRLF info and caused problems for a few people ... as I recall.

                       

                      I think the basic problem is that most likely the results of the Moanrch session are going to eb expoerted to something that does not, or has not in the past, recognised formatted text OR does but not when it is being imported.

                       

                      That said there were some changes made for Version 8 which go some way towards a suitable compromise (given that the recipient of the data may wish to display it in a different format to the original anyway) by offering the TEXTLINE function which allows text to be identified by its original line from the block of text extracted and presented in a calculated field.

                       

                      If the original discussions of the problems surrounding useful formatted text processing - related to what Monarch is intended to be used for - is still around (and I can find it) I will post a link.

                       

                      I can't remember if you have V8 or not but if you do check out the 'textline' function.

                       

                      The only other way I can think of to preserve formatting would be to treat each line as a field (which is pretty much what textline does ...) or even treat each line of text as a detail record, though this would really only work usefully if your ultimate objective was to print the information, perhaps as the same report but with some additional data added.

                       

                      Such a concept has its place, both in terms of an end result or as an intermediate step in a multipart analysis, but is probably not what you are trying to achieve right now.

                       

                      Does this help in any way?

                       

                      Grant

                      • help with variable length fields
                        Nupur _

                        So my understanding of TEXTLINE is that it will give you the line requested by 'n' - the line number. But that wouldn't really serve my purpose. Basically I'm extracting a certain portion of a text report that is in the form of a paragraph and want to retain the newline characters. The output of Monarch is going into a database (SQL Server which has the capability to store embedded newlines). At some later point, the database users need to manually examine the field extracted by Monarch and therefore for reading convenience it would be nice if the text formatting could be preserved.

                         

                        Thanks for your help.

                         

                        Nupur

                        • help with variable length fields
                          Nupur _

                          BTW, the version I currently have is V8.0

                          • help with variable length fields
                            Grant Perkins

                            Originally posted by Nupur:

                              At some later point, the database users need to manually examine the field extracted by Monarch and therefore for reading convenience it would be nice if the text formatting could be preserved.

                            /b[/quote]Sounds like you have a candidate for an entry in the next version wish list thread!

                             

                            If you had some way to consistently delineate the end of a line you might have a method to retain the format.

                             

                            For example if it happened that the last character of each line of text was "|" and you could make this the last character in the field width, then the spaces, Assuming a fixed woidth font, would all be retained, as far as I know, and so the format would also be retained. At least I think that ould be the case. I have used a couple of similar ideas in the past but have note tested whether V8 treats that approach differently.

                             

                            It is a bit of a long shot from directly within a report. It mightm however, be possible to do a first pass manipulation of the report or extract the text as a subset of it, add the marker, and then re-run the extract. I guess  the work involved with that will either be justified or not by the importance of the format to the people making the request and 'paying' for the development! 

                             

                            HTH.

                             

                             

                            Grant