7 Replies. Latest reply: May 15, 2014 9:55 AM by Grant Perkins

    XML issues

    React _

      I am trying to extract XML and output it as delimited text.  Following other documents, I have got this working with one exception.  My records are selected OK, except that when the final field for extraction is missing, the next record is ignored; the one after that is picked up OK.  Examples:

       

      OK Record

       

        <Item LineNo="10">

                     <ItemID CodeType="BuyersPartNo"></ItemID>

                     <ItemDescription>4-20-4 PlaPG SIL SPR, Cot</ItemDescription>

                     <Quantity>1</Quantity>

                     <Width>563</Width>

                     <Height>201</Height>

                     <ItemReference>C258N3/1/1/THR</ItemReference>

                     <ItemReference>258024/2/1</ItemReference>

                     <ItemDate></ItemDate>

                     <GlazingDetails>

                          <Thickness>28.0</Thickness>

                          <Leaf>

                               <LeafID CodeType="BuyersPartNo"></LeafID>

                               <LeafDescription>4mm Planitherm</LeafDescription>

                               <LeafThickness>4.0</LeafThickness>

                          </Leaf>

                          <Spacer>

                               <SpacerDescription>SIL SPR</SpacerDescription>

                               <SpacerThickness>20.0</SpacerThickness>

                          </Spacer>

                          <Leaf>

                               <LeafID CodeType="BuyersPartNo"></LeafID>

                               <LeafDescription>4mm Pt</LeafDescription>

                               <LeafThickness>4.0</LeafThickness>

                               <LeafDetail Type="Pattern">Cot</LeafDetail>

                          </Leaf>

                     </GlazingDetails>

                </Item>

       

      Not OK Record

       

                                                                 <Item LineNo="1">

                     <ItemID CodeType="BuyersPartNo"></ItemID>

                     <ItemDescription>4-16 'KTG SIL SPR</ItemDescription>

                     <Quantity>1</Quantity>

                     <Width>1069</Width>

                     <Height>2140</Height>

                     <ItemReference>T14023/1/1/F</ItemReference>

                     <ItemReference>LNE290/1/1</ItemReference>

                     <ItemDate></ItemDate>

                     <GlazingDetails>

                          <Thickness>24.0</Thickness>

                          <Leaf>

                               <LeafID CodeType="BuyersPartNo"></LeafID>

                               <LeafDescription>4mm Clear Tgh</LeafDescription>

                               <LeafThickness>4.0</LeafThickness>

                          </Leaf>

                          <Spacer>

                               <SpacerDescription>SIL SPR</SpacerDescription>

                               <SpacerThickness>16.0</SpacerThickness>

                          </Spacer>

                          <Leaf>

                               <LeafID2 CodeType="BuyersPartNo"></LeafID2>

                               <LeafDescription>4mm Pilk K Tgh</LeafDescription>

                               <LeafThickness2>4.0</LeafThickness2>

                          </Leaf>

                     </GlazingDetails>

                </Item>

       

      Notice that the last field I have set is the "Pattern" field. When it exists all is OK, but when it does not, the following record is ignored.  The field is set to a line count of 1, but from a file of 157 records I only export 89, and the missing records are all related to this problem.

       

      This seems like a bug, but I am unsure.

        • XML issues
          Grant Perkins

          Hi,

           

          Are you grabbing the entire record in a single template? If so, how many rows are you setting in your SAMPLE for the template? And what are you trapping on?

           

          The symptom sort of suggests that your sample for the template is covering the maximum number of rows you expect in a record but when there is a shorter record the sample lines overlap the first or trap line of the next record which results in the process not 'seeing' it.
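
          To picture that overlap, here is a rough toy model in Python. It is only an illustration of the suspicion above, not how Monarch works internally: the function name, the cut-down 5-line and 4-line records and the sample height of 5 are all made up for the example.

```python
def trapped_starts(lines, trap, sample_rows):
    """Toy model of a fixed-height template (an assumption, not Monarch's
    real algorithm): a matched trap line claims the next sample_rows lines,
    so a trap line that falls inside that window is never tested."""
    starts = []
    i = 0
    while i < len(lines):
        if trap in lines[i]:
            starts.append(i)
            i += sample_rows  # skip every row covered by the sample
        else:
            i += 1
    return starts


# Hypothetical cut-down records: LONG has the Pattern row, SHORT does not.
LONG = ['<Item LineNo="1">', "<Width>", "<Height>", "<LeafDetail>", "</Item>"]
SHORT = ['<Item LineNo="2">', "<Width>", "<Height>", "</Item>"]

lines = LONG + SHORT + LONG + LONG
# The record starting on line 9 sits inside the window claimed by SHORT.
print(trapped_starts(lines, "<Item LineNo", 5))  # [0, 5, 14]
```

          In this model the record straight after the short one is lost and the one after that is picked up again, which is the pattern described in the question.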

           

          Normally in this sort of situation I would be keen to suggest the 'vertical floating field' solution. This is based on defining a field with a preceding string; the field does not belong to any specific row and is populated only if the preceding string exists at some position between the top of one record and the top of the next.

           

          However, the nature of XML leads to duplication of the preceding text one might use. In other words the field tag which would be ideal for preceding text is less than ideal when it appears more than once in a record - as we have in your examples.

           

          <LeafDescription> occurs twice as does <LeafThickness>

           

          There may be other duplication in other records.

           

          So it looks to me like you need a way to differentiate the sections in the incoming xml file OR you would need to consider those sections to be DETAIL templates, identify each separately, add the rest of the record to each detail record using APPENDS and then take things from there.
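
          For anyone who wants to see the shape of the output that detail-plus-append approach produces, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. It is outside Monarch altogether and only illustrates the idea: the function name, the choice of shared fields and the "|" delimiter are assumptions, and the XML is a trimmed copy of the second sample record.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the "not OK" sample record from the question.
ITEM_XML = """<Item LineNo="1">
  <ItemDescription>4-16 'KTG SIL SPR</ItemDescription>
  <Width>1069</Width>
  <Height>2140</Height>
  <GlazingDetails>
    <Leaf><LeafDescription>4mm Clear Tgh</LeafDescription></Leaf>
    <Leaf><LeafDescription>4mm Pilk K Tgh</LeafDescription></Leaf>
  </GlazingDetails>
</Item>"""


def leaf_rows(item):
    """One row per <Leaf> (the detail), with item-level fields appended."""
    shared = [item.get("LineNo"), item.findtext("Width"), item.findtext("Height")]
    rows = []
    for leaf in item.iter("Leaf"):
        description = leaf.findtext("LeafDescription", default="")
        pattern = leaf.findtext("LeafDetail", default="")  # empty when missing
        rows.append(shared + [description, pattern])
    return rows


for row in leaf_rows(ET.fromstring(ITEM_XML)):
    print("|".join(row))
```

          Each item yields one output row per Leaf, with the item-level values repeated on every row, and a missing Pattern simply becomes an empty field.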

           

          But before I head off down that road, it would be good if you could confirm the suspicions I set out at the top of the post, since if I am wrong the solution may lie elsewhere. For example, if you could use the Monarch Utility to extend the number of lines between records before processing with Monarch, the problem would likely go away anyway!

           

          HTH.

           

           

          Grant

          • XML issues
            React _

            Grant,

             

            You are spot on with this. I have tried various different ways and have come to the same conclusion as you: when the "Pattern" field is not present, the line count takes the sample over the start of the next record, which is therefore ignored.

             

            By the way I am running 8.02 of Monarch.

             

            I am trapping on the <Item LineNo and I am sampling on a record where the "Pattern" field exists, as I require it, which, as we agree, leads to the problem.

             

            I suspect if we can extend the number of lines to the maximum before processing this could well sort things.

             

             

            I do however require the data in the duplicate field names.

            • XML issues
              Grant Perkins

              Hmm. I still see three approaches.

               

              1. Use the Monarch Utility (or the MSRP.EXE component of it for those with earlier versions of Monarch) to add some blank lines at the end of a valid record in order to extend the shortest possible record from the set to be at least as long as the template sample rows for the largest record.

               

              See below.

               

              I think this will work if the only problem you have is whether the Pattern field exists or not AND if the PATTERN field is always the last field you need from the record. If it gets more complicated than that, life could get even more interesting.

               

              2. Break the record down into component detail parts (i.e. the duplicates like LeafDescription) and then use append templates to add the other fields to each of the detail records.

               

              The obvious problem with that as a simple solution is that the duplicate tags occur more than once. ItemReference for example would be a duplicate to deal with in an append template on that basis.

               

              3. Look at the potential for using preceding strings to identify fields so that their row position within the record no longer matters too much.

               

              The problem with that is that the tags we can use for the preceding string are the same as the duplicate fields, and that still leaves us with a problem, unless we could harness the current problem for our benefit in dealing with the duplicates. It might be possible for the ItemReference duplication, for example, since the data lie on consecutive rows.

               

              All too challenging really, so I hope the file keeps things simple as per idea 1.

               

              In which case you need to substitute some strings using the Monarch Utility, MSRP or your personal favourite string replacement program.

               

               

              If you use the Monarch Utility you want the 'Prepare files for Monarch' option.

               

              Identify your input file and the location and name for the output file.

               

               

              It looks like we need to use the </Item> tag at the end of the record to throw in some line feeds - I guess you know how many are required.

               

              You need to use the ASCII codes for special characters and add some Line Feeds at the end.

               

              So if you change

              /60/47Item/62

              to

              /60/47Item/62/10/10

              it will insert 2 extra lines at the end of each record block (so far as I can tell from the sample record) and should give you the length of record you need to work with. If it is still too short, just add some more /10's to the second string.
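
              For reference, 60, 47, 62 and 10 are the decimal ASCII codes for <, /, > and Line Feed. If it helps to see the same substitution outside the Utility, here is a small Python sketch. The decode helper is only my guess at how the /NN notation reads (a slash followed by a decimal code, other text left alone) and is not the Utility's actual parser; the function names and the tiny sample text are made up.

```python
import re


def decode(spec):
    """Turn /NN decimal character codes into characters; other text is kept.
    (Assumed reading of the notation, not the Monarch Utility's own parser.)"""
    return re.sub(r"/(\d+)", lambda m: chr(int(m.group(1))), spec)


def pad_records(text, find_spec, replace_spec):
    """Replace every occurrence of the decoded find string."""
    return text.replace(decode(find_spec), decode(replace_spec))


# Hypothetical two-record input, one line per tag.
sample = "<Item>\n<Width>1</Width>\n</Item>\n<Item>\n</Item>\n"
padded = pad_records(sample, "/60/47Item/62", "/60/47Item/62/10/10")

print(repr(decode("/60/47Item/62")))  # '</Item>'
print(padded.count("\n") - sample.count("\n"))  # 4: two extra lines per record
```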

               

              I suspect you may already be familiar with this concept, but if not, a file showing all the ASCII characters is useful (ASCii.txt from the Monarch Downloads page, for example), and the instructions in the startup screen for the MSRP.EXE program (also still available from the downloads page, I think) offer a simple but complete explanation of what is required for a successful transformation.

               

              HTH.

               

              Let us know how you get on and if you think it is an acceptable solution. It should be possible to set this up as a command line in a batch file if the process requires automation.

               

               

              Grant

              • XML issues
                React _

                Grant,

                 

                Using the Monarch Utility and find and replace does seem to have done the trick. You mentioned that you can run this in a batch file; is that right, and if so, is there a switch to overwrite existing files without prompting?

                 

                The only issue I really have is that, as the detail records are being set for the duplicate field names, I am getting 2 output records per 2 input record. Although I can handle this later via coding etc., is there a way to join them back again in Monarch?

                 

                Other than that great solution - many thanks

                • XML issues
                  React _

                  Sorry, that should have read 2 output records for 1 input record.

                  • XML issues
                    React _

                    Grant,

                     

                    I have found what I needed, and in fact when you run it in command line mode it does not prompt to overwrite the original.

                     

                    Many Thanks

                    • XML issues
                      Grant Perkins

                      Originally posted by React:

                      The only issue I really have is that as the detail records are being set for the duplicate field names, I am getting 2 output records per 2 input record, although I can handle this later via coding etc, is there a way to join them back again in Monarch.

                       

                      Other than that great solution - many thanks

                      Glad you have the Utility side sorted.

                       

                      As for the multiple records, it depends what you need. If something is duplicated I would assume there is a reason in the original output, although this may or may not be important for your purposes.

                       

                      Presenting the data through a Summary would probably allow you to get what you need: you would simply end up with a single row for each record but with a count of 2 for the duplicates. Ignore the count and take the rest of the row, and that might be what you need.  (I'm being circumspect about it because I can't be 100% sure I have a grasp of all of the requirement.)
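
                      As a picture of what I mean by that, here is a small Python sketch of the same collapsing step. It is not Monarch's Summary feature, just the general idea of grouping on the shared fields and counting; the function name, field names and rows are made up for the example.

```python
from collections import Counter


def summarise(rows, key_fields):
    """Collapse rows that share the same key fields into one row plus a count."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return [dict(zip(key_fields, key), Count=n) for key, n in counts.items()]


# Hypothetical detail rows: item 1 appears twice, once per Leaf.
rows = [
    {"LineNo": "1", "Width": "1069", "Leaf": "4mm Clear Tgh"},
    {"LineNo": "1", "Width": "1069", "Leaf": "4mm Pilk K Tgh"},
    {"LineNo": "10", "Width": "563", "Leaf": "4mm Planitherm"},
]

for row in summarise(rows, ["LineNo", "Width"]):
    print(row)
```

                      Note that anything left out of the key fields (the Leaf values here) is dropped from the collapsed rows, so this only suits the case where the duplicated detail itself is not needed.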

                       

                      There may also be some mileage in looking at some of the features of filters especially under the 'Advanced' tab in the filter definition.

                       

                      HTH.

                       

                       

                      Grant