11 Replies Latest reply: May 15, 2014 10:02 AM by Grant Perkins RSS

    External Lookups

    DavidJ _

      I am trying to join three reports on 5 common fields.  One report is in a fixed-width text layout, and the other two are .csv files (created by Monarch).  I created a model to read in the text file, and then selected the other files as external lookups.  I link the five fields, and everything looks good, except that certain records show (null) for the data values from one of the .csv files, instead of the numbers that are there.  All the other rows show the common fields and the new fields from the external lookup.  The other .csv file's data shows up just fine.  Why would Monarch not read the data from one .csv, but read it from the other?

       

      All three files are extracts from the same massive report, and all the numeric data is extracted the same way (number, 2 decimal, thousands, length of 16).

       

      If I skip the text file and start with the first .csv file and link in the second, everything looks fine.  If I skip the text file and start with the second .csv file and link in the first, I see the (null) values.

       

      It shouldn't be this hard!!!

        • External Lookups
          Grant Perkins

          Hi David,

           

          That was handy timing - I was just finishing off in here ...

           

          NULL suggests that EITHER the value is being imported but is seen as an alpha value being brought into a numeric field OR the link is not working for some reason and NULL represents 'not found'. Something along those lines.

           

          What do you see if you just open the troublesome .csv file in Monarch as a database? Are the field Numeric or alpha? (I assume not date, etc., as you have not mentioned date ...)

           

           

          Grant

            • External Lookups
              DavidJ _

              Grant,

               

              I see all the data as I expect to see it.  The fields that I use to link are all character, and the fields that should show up are all numeric.

               

              The two things I see about the offending records are these:

              1. The values in one of the five linking fields are longer than 25 characters.  (I don't have a problem with the data in the records where this field has a shorter value).

              2. The offending records are created by extracting a summary.  (I don't have a problem with the data in the records where the .csv was created by extracting a table with a filter.)

               

              So even in the records where the one linking field has a long value, the data extracted from tables is present, but the data extracted from the summaries is not.  But the data in the .csv files for the linking fields all match up.

                • External Lookups
                  Grant Perkins

                  David,

                   

                  I would guess that you need to check the output field size from the summary for the 25 char wide field. You may find truncation. The output may be to be set to a narrower display width and this will constrain the width of the field in the csv file. (iirc).

                   

                  Why the width might be constrained is another matter ... we can get to thinking about that if my guess is useful.

                   

                  HTH.

                   

                   

                   

                  Grant

                    • External Lookups
                      DavidJ _

                      Grant,

                       

                      I looked at that earlier, and the field is consistently set to 50 characters.  And the .csv file shows the full contents of the field.

                       

                      Also, I changed the / to a - in the report, but I receive the same results.  So it is not data, but apparently in how the external lookup is reading in the .csv created from the summary.

                        • External Lookups
                          Grant Perkins

                          I don't recall coming across anything like that. But on a positive note it sounds like you have identified the probable cause even if not the solution.

                           

                          Random thought in case it prompts something; You don't have another field based on the same data but with the / used as a split point do you or some other indended truncation that has slipped in to the lookup unnoticed?

                           

                          It's getting late here so I'll have to pick this up in the morning. Do let us know how you get on.

                           

                          Grant

                            • External Lookups
                              DavidJ _

                              Grant,

                               

                              I believe there is something inherently wrong in the Monarch code (or some Microsoft dll), and this is why:  When I changed the sort on one of the other fields in the summary, the rows with the long data first appeared in line 8 instead of line 138, and now it matches nicely.  How Monarch interprets fields should not be based on the position of the data (row-wise).   I am thinking that if I exported to a fixed-width text file instead of a .csv, I might eliminate the problem, but I should not have to resort to that.  I will need to test that, though, since I cannot rely on longer field values to always show up near the top so that Monarch can read them properly.

                               

                              David

                                • External Lookups
                                  Grant Perkins

                                  David,

                                   

                                  I think you are right about the Fixed Width solution and setting the field sizes in the model since not setting them may result in variable size fields form run to run - not good for a lookup.

                                   

                                  I'm not really in a position to experiment and comment just at the moment but the normal expectation when analysing an input file 'on-the-fly' is to work out the structure based on the first few rows - 250 usually in Monarch's case though it is possible the visible number of rows may seem less if filtering is involved in the table.

                                   

                                  The CSV output/input potential for problems did not occur to me last night (it was rather late) but I was assuming that the model was already fixing the field widths on the other part of the link in a way that woould resolve the problem since as I read it you had some longer strings that worked at other times. I suspect I read that a little wrong.

                                   

                                  I would nearly always recommend fixing the field sizes at the maximum possible based on the source file data structure. Yep, one may only be using 30 chars of a 50 char field - but will it ALWAYS be like that ...?

                                   

                                  The only exception might be a huge field where you know the fill size can never exceed a much smaller number of characters.

                                   

                                  Check the filed widths and change the file output type and you should see the result you expect - Ithink - based on your description and update.

                                   

                                  Note my caution though - I never feel totally comfortable working 'blind'.

                                   

                                  Have a great weekend.

                                   

                                   

                                  Grant

                                    • External Lookups
                                      Bruce _

                                      Hi David

                                       

                                      It is a long shot, but CSV can get upset with embedded quotes or commas. Does your data have any of those?

                                       

                                      I found the exact same problem as you, but was never able to track what was the exact issue.

                                       

                                      Not being able to count on CSV 100% of the time I decided to export to Access and then I had complete control of the field types and sizes. Then I imported this access table back into Monarch.

                                      One minor extra step but it was worth it for me.

                                        • External Lookups
                                          DavidJ _

                                          Bruce, it is not a data content issue - it is a matter of how far Monarch looks into external lookup tables to determine max width of the fields (at least for .csv files).  Through trial and error, I have determined that the magic number is 100.  If I put a record with a longer value in my linking field in the 101st row (not counting the header row), then it won't match.  But if I move it up to the 100th row, then it does match.  I don't have a problem when I export from a filtered table, since the first 30 or so records contain all the values for that linking field.  But the summaries require me to sort the data, and that's what throws everything off.  So I may have to resort to using Access.

                                            • External Lookups
                                              Grant Perkins

                                              David,

                                               

                                              If there is no easy way to set the sizes in the model for future use creating a fixed width format summary export for the file to be linked to ought to resolve the problem without forcing you to go to Access, although Access is another valid option of course.

                                               

                                              Another approach I have used from time to time is to include a formatting line as line 1 of the sample. This could be 'forced' as part of the Monarch output somehow (how liklely to depend on other factors) or could be included by file concatenation after the summary export. (e.g. have a 1 line 'dummy format' file and then concatenate the summary export to it, perhaps as part of a batch file appended to an automated export process. The dummy line would play no part in the lookup activity of course.)

                                               

                                              How practical either approach is depends on individual circumstances. For a one-off or occasional need the dummy line may be the most pragmatic. For regular stuff but not high volumes the batch file approach shoudl work very well. For daily and volume processing Data Pump should be able to encapsulate everything.

                                               

                                               

                                              HTH.

                                               

                                               

                                              Grant

                                • External Lookups
                                  DavidJ _

                                  This may be it:  When I review the External Lookup properties for the .csv files extracted from the summaries and spin through the sample data, the link field with the lengthy data is truncated at 25 characters, even though it is all present in the .csv file.  When I review the External Lookup properties for the .csv files extracted from the filtered tables and spin through the sample data, the link field with the lengthy data is fully populated.

                                   

                                  The 26th character in one of the longer fields (and the first long field to display in the sample data), happens to be a forward slash.  That may be an issue, or it may be a red herring.

                                   

                                  I guess the next test is to replace the / with something else in the entire report and see if that makes a difference.

                                   

                                  David