7 Replies Latest reply: May 15, 2014 10:13 AM by psamson _ RSS

    How to match record with commond  word but in different position

    psamson _

      Good evening,

       

      I'm looking to a way to match record that may have same word in but in a different place. This is for a customer list.

       

      ex:

       

      Customer 1:  House of pizzeria (Colombus)

      Customer 2: Pizzeria Casa (Hatford)

      Customer 3: Pizzeria Case de Matéo (Claremont)

       

      I try to find a way that Monarch summarize data for all those three customer in one line.

       

      Thanks for your help

        • How to match record with commond  word but in different position
          Data Kruncher

          Hi Patrick,

           

          Your best route here is using a calculated field using the InStr function and likely one additional calculated field, either the user-edited or perhaps a runtime parameter would be more useful.

           

          You can build a HasSearchWord formula-based calculated field with this expression:

          If(InStr("pizzeria",RptField),"Yes","No")[/CODE]

           

          The enhancement will look like this:

          If(InStr(,RptField),"Yes","No")[/CODE]

           

          Making SearchTerm a runtime parameter adds convenience in that Monarch will ask you for a value every time that you open this model and then apply your specified value to the model.

           

          You can achieve a similar result with the user-edit field type, but it's much more work/hassle.

           

          The final steps would be to filter your data to show only the relevant names, and then use that filter in a summary definition.

           

          Shout if you have further questions.

            • How to match record with commond  word but in different position
              OddJob _

              Hi

               

              If you need to summarize the data for different type of customer at the same time, for example Pizzerias, Restaurants, Bars etc. then you could try this approach.

               

              You will first need to create a list of the different types of outlet that you need to summarize by so, create a calculated field called 'Lookup' and type in the words that you need to match:

              "PizzeriaBistroRestaurantBarHotel" /code

               

              Next you will need to check all of the words in the Customer Name to see if any of them find a match in your list.

              You can use the LSplit() function to split the name into several parts and compare each part to the lookup list.

              Example:

              LSplit(,5," ",1) /code

              This will split the Customer Name field whenever it sees a space in the name, into 5 parts. This example returns the first part (or first word) of the name.

               

              By taking the first, second, third part etc. of the name and comparing each one to the lookup list, you can see if any of them match your list and if so, return the word that matches.

               

              Create a new calculated field called 'Type and use something like this for the formula:

              Trim(

              If(Instr(LSplit(,5," ",1),lookup)>0,LSplit(,5," ",1)," ")

              +

              If(Instr(LSplit(,5," ",2),lookup)>0,LSplit(,5," ",2)," ")

              +

              If(Instr(LSplit(,5," ",3),lookup)>0,LSplit(,5," ",3)," ")

              +

              If(Instr(LSplit(,5," ",4),lookup)>0,LSplit(,5," ",4)," ")

              +

              If(Instr(LSplit(,5," ",5),lookup)>0,LSplit(,5," ",5)," ")

              )

              /code

               

              You can then summarize your data by the 'Type' field.

               

              If you have lots of different types of outlet to summarize, this might not be the best way but, it does work.

               

              Hope this is of use to you.

                • How to match record with commond  word but in different position
                  psamson _

                  Hi !

                  Is there a way to make Monarch decide to match similar name. In my customer list I can have different customer that have similitude between their name  and I'd like to summarize those customer base on the similitude in their name. I have a long list and I can't enter the search word because I do not know all of them.

                    • How to match record with commond  word but in different position
                      Olly Bond

                      Hello,

                       

                      I tried to do some "fuzzy" matching years ago - it's not straightforward. You can compare the similarity of the left hand n characters of two fields with Left().

                       

                      Hope this helps,

                       

                      Olly

                        • How to match record with commond  word but in different position
                          OddJob _

                          Hi

                           

                          As Olly suggests, there isn't a simple way to do this fully.

                           

                           

                          How much can be done would probably depend on how similar the customer names are or how you define 'similar'.

                           

                          If they all contain a location like (Hatford) (Claremont) etc. it's possible to extract that part of the name and find all of the duplicates.

                           

                          If the names begin with the same words like 'Pizzeria Casa' it's possible to extract the first word, two words, three words etc. in a calculated field.

                          Example:

                          LSplit(,5," ",1)+LSplit(,5," ",2) /code

                          Would give the first two words of the name so, anything beginning with 'Pizzeria Casa' or 'House of' or 'Casa Del' etc.

                           

                          If you then create a filter; on the Advanced Tab, under Duplicate Handling, set this to 'all duplicated rows' and select your new calculated field as 'specified key', you should just see any customers that appear to be duplicates based on the first two words of the name (if this is how you are classing them as similar).

                           

                          It doesn't have to be the first two words, it could be the second and third word or first and third word but, you have to decide which parts of the name make one customer name similar to another customer name.

                           

                          Rather than using a filter to identify duplicates, you could use the calculated field (containing parts of the name) as the input field for a Lookup calculated field. By importing all of the entries you would have a list containing the first two words of each customer name. You can then put an entry in the output field for any records that you identify as duplicates or near duplicates.

                           

                          You can then filter on the new lookup field to show all records with duplicates based on the new lookup field.

                           

                          There's a fair bit of manual editing and auditing to do but it may get you a little closer to what you want to do.

                           

                          Hope this helps.