2 Replies Latest reply: May 15, 2014 10:10 AM by MonUserCJ _ RSS

    How to select only rows that have some fields duplicated

    MonUserCJ _

      Hey all,

       

      I ran into an issue where some data contained duplicate data according to some fields (e.g. name, birth date), but not all the fields in each row. That is, two rows may have the same name and birth date listed, but a different phone number. For my purposes, these rows are duplicates.

       

      I know how to create a summary listing the duplicates based on certain fields, and I think I know how to go from the summary to the rows that contain these duplicate entries. What I'm wondering is if there is a way of selecting all the rows of these duplicates, so that I can export a delimited file to see how my program will handle these duplicates.

       

      I know how to read through the file to the duplicates in my program, but I am curious if there is a way to list rows based on the criteria of a summary. How can I do this? I'd appreciate any advice anyone can give.

       

      Thanks.

        • How to select only rows that have some fields duplicated
          Olly Bond

          Hello CJ

           

          Yes - you can do this using the Advanced tab of a Filter in the Table window. You don't need to specify a filter expression on the main tab - just leave that blank - and in the advanced tab you select the fields you want to measure duplicates on, and use check boxes to specify whether you want to keep unique rows, or only duplicated rows.

           

          Then you can export the table, filtered to show only the duplicates. Note that if you want only one row from a duplicate set, Monarch will return the first row as sorted - so by defining a sort order you could get, for example, the last row of each set, or the row with the longest phone number or any other criteria you need.

           

          Best wishes,

           

          Olly