4 Replies Latest reply: May 15, 2014 9:52 AM by Grant Perkins RSS

    Variable Everything

    Jeff C

      I have this file that contains 3 records that I am in need of extracting.  Each record begins with a 11 digit account number (658884****).  The next needed information is the date 20040927 in the first record.  Next begins a set of locations and counts.  The locations are represented by 5 digit numbers and the count is the number following the location number.  If the record has more that 10 locations then after the tenth record there is a separation segment (~SDQEA*92).  This is evident in the first record.  The count could be a single digit, tens or hundreds.  My first attempts were to edit the data file in Notepad and remove the unnecessary information and to separate each record so that each record was on a separate line.  Then save the file and bring it into Monarch.  That's about as far as I got.  Any help will be appreciated.

       

      Here is what the file looks like originally:

      ISA*00         00              01008965063             01123456789         *040930142

      1U004000000000001.0P>GSPD0089650631234567892004093014211X004010~ST*

      8520000000001~XQG20040930~LINUI65888410001CTP*UBV.47ZAQS*0062004092

      7~SDQEA92045892048461805072105073100603380603412060358064887*0

      69553069775~SDQEA92072456073065077912~LISUI65888410007~CTP*INV.9

      9ZAQS*00620040926SDQEA920458966048463540507258050738106033153*

      06034269060359006488221069559106977151~SDQEA92072451560730610507

      79151~LINUI65888410007CTP*INV.99ZAQS*00620040927~SDQEA92045895*0

      48461806033206034110648810069552069779072453073062077913~CTT3~

      SE180000000001GE11IEA10000000001~

       

      I am in need of the 11 digit account number and date to be associated with and repeated with each location number and count number.  Like this:

       

      65888410001 20040927 04589  2

      65888410001 20040927 04846 18

      65888410001 20040927 05072  1

      and so on.

       

      Any idea?

        • Variable Everything
          Grant Perkins

          Jeff,

           

          Looks like a job for MSRP as a starting point rather than an editor. On the other hand if this is a one-off task and there are only 3 records maybe editing is easier ...

           

          It looks like each account number is preceded by "UI". So if you substitute a carriage return/Line feed for "UI" (or whatever) each record will start on a new line. Eventually that should make it easy to define a template for an append to get the account number and date.

           

          You could consider doing something similar for the SDQ identifiers. If there was only ever a single set of up to 10 locations it would quite easy to then slice and dice the line, based on character positions in a known format, to get the location based information or perhaps in this case using the LSPLIT and/or RSPLIT functions.

           

          However if you have possibly many lines it is a little trickier to decide which way to go.

           

          If the maximum likely number of lines is not too great you could consider creating a multi line field and then slicing and dicing that. But I think that might be rather pointless in this case since so many locations seem possible.

           

          The variable width of the section giving the location count is a problem as well - if it was a fixed width column it would be easier.

           

          If there is an opportunity to use MSRP to break the SDQ lines out into separate location and count rows you would have a solution. Each location and count would be a detail record and the Account and date would be an append.

           

          I hope this provide some useful ideas. I would enjoy playing with this one but have no time to do so right now. May have more opportunity in a few days time if you are still working on it.

           

           

          Grant

          • Variable Everything
            Gareth Horton

            Jeff

             

            Many moons ago I wrote a routine for MSRP that reformatted EDI streams for VorteXML (one of our other products)

             

            This batch file should do the trick, just enter the command line like this:

             

            edi_conv jeff.edi * nosubsection ~

             

            Here is the code for the batch file

             

            [font="courier"]Echo Off

             

             

            Rem VorteXML Preparation script for EDI, using Datawatch's MSRP utility.

            Rem

            Rem This file may need to be edited for different types of EDI, considering

            Rem there are many separator conventions and inconsistencies in different

            Rem EDI arrangements. 

            Rem

            Rem EDI Files have been found that have 2, 3 and 4 delimiters

            Rem

            Rem This uses Datawatch's MSRP search and replace utility to

            Rem reformat EDI Streams to be more compatible for extraction using

            Rem VorteXMLs floating traps.

            Rem

            Rem The aim is to put some white space between the data separators in

            Rem the case where no data is present when defining the initial template.

            Rem This allows fields to be defined with the help of accompanying EDI

            Rem specs for the particular document. 

            Rem 

            Rem Therefore, when data appears in consecutive documents, this will be

            Rem picked up and extracted.

            Rem

            Rem Also, this routine inserts a carriage return when a section delimiter

            Rem is found, to create a one section per line file.  This circumvents

            Rem problems with the data length of the EDI file exceeding VorteXML's

            Rem 1024 character width limit for input files and creates a structure

            Rem which is possible to template.

            Rem

            Rem The MSRP utility is a command line DOS assembler program, which functions

            Rem under all versions of Windows, but only accepts 8.3 filenames.

            Rem

            Rem This utility needs to be accessible for the conversion to work.

            Rem

            Rem Please refer to the sample file edi_stream.prn, which this conversion

            Rem script will convert to edi_vx.prn.  Try out the conversion yourself.

            Rem

            Rem There is an associated profile, vx_edi.vx, which extracts the data

            Rem and generates XML

             

            Rem Check for blank entry - give information if no parameters given

             

            if "%1" == "" goto Error

            if "%2" == "" goto Error

            if "%3" == "nosubsection" goto NoSubsection

            if "%4" == "" goto Error

             

            Rem Allow long filenames and avoid damaging original file by creating copy

            copy %1 edi_tmp.prn

             

            Rem Use MSRP to search and replace consecutive data delimiters

             

            msrp "%2%2" "%2  %2" edi_tmp.prn dattmp.prn

            msrp "%2%2" "%2  %2" dattmp.prn dattmp1.prn

            msrp "%2%2" "%2  %2" dattmp1.prn dattmp2.prn

             

            del edi_tmp.prn

            del dattmp.prn

            del dattmp1.prn

             

            msrp "%3%3" "%3  %3" dattmp2.prn subtmp.prn

            msrp "%3%3" "%3  %3" subtmp.prn subtmp1.prn

            msrp "%3%3" "%3  %3" subtmp1.prn subtmp2.prn

             

            msrp "%2%3" "%2 %3" subtmp2.prn subtmp3.prn

            msrp "%3%2" "%3 %2" subtmp3.prn subtmp4.prn

             

            Rem use MSRP to create line breaks after each segment delimiter

             

            msrp "%4" "%4/13/10" subtmp4.prn edi_vx.prn

             

            Rem Clean up Work files

             

            del dattmp2.prn

            del subtmp.prn

            del subtmp1.prn

            del subtmp2.prn

            del subtmp3.prn

            del subtmp4.prn

             

            goto complete

             

            :NoSubsection

             

            echo No Subsection

             

            Rem Allow long filenames and avoid damaging original file by creating copy

            copy %1 edi_tmp.prn

             

            Rem Use MSRP to search and replace consecutive data delimiter

             

            msrp "%2%2" "%2  %2" edi_tmp.prn dattmp.prn

            msrp "%2%2" "%2  %2" dattmp.prn dattmp1.prn

            msrp "%2%2" "%2  %2" dattmp1.prn dattmp2.prn

             

            msrp "%4" "%4/13/10" dattmp2.prn edi_vx.prn

             

            del edi_tmp.prn

            del dattmp.prn

            del dattmp1.prn

            del dattmp2.prn

             

             

            :complete

             

            echo Process Complete

             

            :Error

             

            cls

             

            echo Datawatch Corporation EDI Preparation Script For VorteXML

            echo -


            echo The following parameters are required

            echo.

            echo.

            echo 1 - Filename

            echo.

            echo.

            echo.

            echo 2 - Data Delimiter

            echo.

            echo.

            echo.

            echo 3 - SubSection Delimiter

            echo.

            echo.

            echo.

            echo 4 - Section Delimiter

            echo.

            echo.

            echo.

            echo Enter all four parameters, separated by spaces.

            echo If there is no Sub Section delimiter, then enter

            echo "nosubsection" (without the quotes as the 3rd argument

            echo.

            echo e.g. edi_conv mytestfile.edi : + '

            echo.[/font][/quote]I ran your file through it, but getting rid of the carriage returns etc to return it to it's original stream format and got this:

             

            [font="courier"]ISA*00 00 01008965063 01123456789 *  0409301421U004000000000001.0P>GSPD0089650631234567892004093014211X004010~

            ST8520000000001~

            XQG20040930~

            LIN*  UI65888410001~

            CTP*  UBV.47~

            ZAQS  *  00620040927~

            SDQEA92045892048461805072105073100603380603412060358064887069553*069775~

            SDQEA92072456073065077912~

            LIS*  UI65888410007~

            CTP*  INV.99~

            ZAQS  *  00620040926~

            SDQEA92045896604846354050725805073810603315306034269060359006488221069559106977151~

            SDQEA9207245156073061050779151~

            LIN*  UI65888410007~

            CTP*  INV.99~

            ZAQS  *  00620040927~

            SDQEA92045895048461806033206034110648810069552069779072453073062077913~

            CTT*3~

            SE180000000001~

            GE11~

            IEA10000000001~[/font][/quote]Hopefully this should get you a bit further on

             

             

            Gareth

             

            Originally posted by Jeff C:

            I have this file that contains 3 records that I am in need of extracting.  Each record begins with a 11 digit account number (658884****).  The next needed information is the date 20040927 in the first record.  Next begins a set of locations and counts.  The locations are represented by 5 digit numbers and the count is the number following the location number.  If the record has more that 10 locations then after the tenth record there is a separation segment (~SDQEA*92).  This is evident in the first record.  The count could be a single digit, tens or hundreds.  My first attempts were to edit the data file in Notepad and remove the unnecessary information and to separate each record so that each record was on a separate line.  Then save the file and bring it into Monarch.  That's about as far as I got.  Any help will be appreciated.

             

            Here is what the file looks like originally:

            ISA*00         00              01008965063             01123456789         *040930142

            1U004000000000001.0P>GSPD0089650631234567892004093014211X004010~ST*

            8520000000001~XQG20040930~LINUI65888410001CTP*UBV.47ZAQS*0062004092

            7~SDQEA92045892048461805072105073100603380603412060358064887*0

            69553069775~SDQEA92072456073065077912~LISUI65888410007~CTP*INV.9

            9ZAQS*00620040926SDQEA920458966048463540507258050738106033153*

            06034269060359006488221069559106977151~SDQEA92072451560730610507

            79151~LINUI65888410007CTP*INV.99ZAQS*00620040927~SDQEA92045895*0

            48461806033206034110648810069552069779072453073062077913~CTT3~

            SE180000000001GE11IEA10000000001~

             

            I am in need of the 11 digit account number and date to be associated with and repeated with each location number and count number.  Like this:

             

            65888410001 20040927 04589  2

            65888410001 20040927 04846 18

            65888410001 20040927 05072  1

            and so on.

             

            Any idea? /b[/quote]

            • Variable Everything
              Grant Perkins

              Now that is a useful little (?) script. Nice.

               

              It occurs to me that, from the result, it should be possible further modify the lines begining with alpha characters (except the SDQ lines) to replace the * with something else (bear with me here). The SDQ lines, for all characters up to the first real location number, could then be replaced with nothing. That would leave those lines looking something like

               

              01234501235100 etc.

               

              If you then replace the remaining *'s with something like "/013/010", i.e. CRLF again you would get

               

              01234

              5

              01235

              100

              etc.

               

              That, when used as a report in Monarch, would give you a 2 line detail record with the trap for the first line being 5 numeric characters. (All other lines seem to begin with alpha and I have assumed that the counts will not reach 5 numeric. However since we know there will always be 2 lines even that should not matter.

               

              Pick up your Account number and Date as an append template and attach them to each of the detail lines and you should be able to get the result required.

               

              I may experiment over the weekend ...

               

               

                       :eek:  

               

              Grant

               

              [size="1"][ October 07, 2004, 05:58 PM: Message edited by: Grant Perkins ][/size]

              • Variable Everything
                Grant Perkins

                Well, I didn't wait for the weekend.

                 

                The process was really rather easy using a Windows based utility program similar MSRP. The only downside to this is that I don't know whether the program can be used in a batch file. If not it is not really to much of a problem I think since the same functionality can be scripted using MSRP.

                 

                I will confess that I simply 'lost' the first few lines which I guess are some sort of file identifier. Nothing there that you have mentioned as relevant to the output so zapping it seemed like a good idea.

                 

                Here is what I ended up with (starting from the result fiel posted by Gareth.)

                 

                [font="courier"]ST#852#0000000001~

                XQ#G#20040930~

                LIN#  #UI#65888410001~

                CTP#  #UBV#.47~

                ZA#QS#  #  #006#20040927~

                04589

                2

                04846

                18

                05072

                1

                05073

                10

                06033

                8

                06034

                12

                06035

                8

                06488

                7

                06955

                3

                069775~  NB this line looks wrong in original.

                07245

                6

                07306

                5

                07791

                2~

                LIS#  #UI#65888410007~

                CTP#  #INV#.99~

                ZA#QS#  #  #006#20040926~

                04589

                66

                04846

                354

                05072

                58

                05073

                81

                06033

                153

                06034

                269

                06035

                90

                06488

                221

                06955

                91

                06977

                151~

                07245

                156

                07306

                105

                07791

                51~

                LIN#  #UI#65888410007~

                CTP#  #INV#.99~

                ZA#QS#  #  #006#20040927~

                04589

                5

                04846

                18

                06033

                2

                06034

                11

                06488

                10

                06955

                2

                06977

                9

                07245

                3

                07306

                2

                07791

                3~

                  /font[/quote]I reckon that about does it, other than removing the ~ symbols.

                 

                What do you think Jeff?

                 

                Grant