2 Replies Latest reply: May 15, 2014 9:56 AM by DavidS _

    Address blocks

    DavidS _

      I've been using Monarch for more than 5 years (currently using Pro version 8.01), and I'm a big fan, but I've never explored your forum. It looks like a really useful tool. Now to my question:

       

      I often have to parse "free form" address blocks; these address blocks sometimes have the attention name at the beginning of the address, and sometimes at the end; the addresses have variable numbers of lines. Most of the attention names are identifiable with a prefix such as "c/o" or "Attn:".

       

      This is most problematic when the name is at the end of the block, because it tends to be interpreted as a country or a city (usually flagged with an error code).

       

      Is there a way to identify these fields?

       

      Thanks in advance for everyone's advice.

        • Address blocks
          Grant Perkins

          Hi David and welcome to the forum!

           

          We seem to be having a rush of Address Block questions these past few days and weeks. I'm wondering whether it might be useful to have a dedicated group thread for a while so that people can gather Address identification problems and solutions together for a knowledge exchange.

           

          I don't think addresses can ever be identified completely 100% of the time, partly due to natural variability in the address structures and partly due to the human factor, rounds of data conversions and so on.

           

          I tend to think that there are a couple of options for an approach.

           

          One says - "I get an xx% success level. It's good enough for me and is unlikely to be bettered even by visual scanning by humans."

           

          The other says - "Any wrong records are no good to me - I need everything 100% OK."

           

          If you have known errors that are identifiable, predictable and likely to occur fairly regularly (and be a source of a serious enough problem to require intervention) you might need to consider including some conditional handling.

           

          There could be a number of approaches to this, though none might stand out as a definitive solution for all situations.

           

          One might set up a calculated field for each of the likely problem fields and check whether the resulting entry starts with "c/o" or "Attn:", for example. If it does, then either remove it or move it to a "Contact Name" field; otherwise use the information it contains in the field being parsed. (That describes the concept, not the formula!)
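
          To make the concept concrete, here is a minimal Python sketch of the same check done outside Monarch (for example on exported data). It is not a Monarch formula; the function name `split_attention` and the choice to blank the original field are assumptions for illustration only.

```python
import re

# Only the two prefixes named in this thread; other markers are an assumption
# you would need to add yourself.
ATTENTION_PREFIX = re.compile(r"^\s*(?:c/o|attn:)\s*", re.IGNORECASE)

def split_attention(value):
    """Return (contact_name, field_value) for one parsed address field.

    If the field starts with an attention prefix, the text after the prefix
    is treated as the contact name and the field itself is left empty.
    Otherwise the field is kept unchanged (apart from trimming whitespace).
    """
    match = ATTENTION_PREFIX.match(value)
    if match:
        return value[match.end():].strip(), ""
    return "", value.strip()

# Hypothetical "Country" values as they might come out of an address block.
for country in ["Attn: Jane Doe", "c/o John Smith", "Canada"]:
    print(split_attention(country))
```

          The same test would be applied to each field that is likely to pick up the attention line by mistake, such as Country and City.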

           

          If data can end up in the wrong field and needs to be moved, and the problem is large or important enough to justify checking for such anomalies within the model, it may be possible to work out rules that store the data in the correct field. An example would be your "Attn:" line at the end of an address being picked up as a Country. Another would be a City appearing as an address line rather than as a City entry.

           

          I could envisage a few IF() function based calculated fields.

           

          There may also be some benefits in extracting some lines individually rather than as a block if the block processor is being tasked to work with erratically populated original data in multi-line fields.
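
          As a rough illustration of working line by line rather than on the whole block, the Python sketch below scans each line of a free-form address for an attention prefix, wherever it appears, and separates it from the remaining address lines. It runs outside Monarch; the name `extract_attention`, the prefix list and the sample address are assumptions, not anything taken from a real report.

```python
# Prefixes mentioned in this thread, compared in lower case.
PREFIXES = ("c/o", "attn:")

def extract_attention(block):
    """Split a multi-line address block into (contact_name, address_lines).

    The first line that starts with a known prefix becomes the contact name,
    whether it sits at the top or the bottom of the block. Every other
    non-blank line is kept, in order, as an address line.
    """
    contact = ""
    address_lines = []
    for line in block.splitlines():
        text = line.strip()
        if not text:
            continue  # skip blank lines inside the block
        lowered = text.lower()
        prefix = next((p for p in PREFIXES if lowered.startswith(p)), None)
        if prefix and not contact:
            contact = text[len(prefix):].strip()
        else:
            # Includes any second attention-style line, left for manual review.
            address_lines.append(text)
    return contact, address_lines

# Made-up example with the attention name at the end of the block.
sample = "Acme Corp\n12 High Street\nSpringfield, IL 62701\nAttn: Jane Doe"
print(extract_attention(sample))
```

          With the attention line removed first, whatever parses the remaining lines no longer sees the name as a possible country or city.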

           

          As I said back near the top, it can be very difficult to judge how far to try to correct the incoming address data.

           

          If you want to post a couple of example pages (I appreciate that may not be possible without disguising the data, and one problem with address work is that some data needs to be left as it is for things to work as intended), it may be possible to come up with some ideas that you have not already tried. No guarantee though!

           

          HTH.

           

           

          Grant

          • Address blocks
            DavidS _

            Hi Grant,

             

            Thanks for your prompt reply. You certainly have some good ideas and suggestions. My problem is that every week I receive one or two reports with this type of address data, and each time, the report is formatted differently, so I actually have to create a new model each time, and with each report and model, different situations are encountered.

             

            I agree with your comment regarding the accuracy of address blocks. I always tell the recipients of the data extracts not to expect 100% accuracy; I'm usually satisfied with 98% accuracy, although with data sets containing tens of thousands of records, the remaining 2% is still a fairly large number of records.

             

            I usually manipulate the data after exporting it to get to my 98% goal; I was just hoping that there might be a feature that I had overlooked, which would allow for extraction of the attention name.

             

            I think I'll post an entry to the suggestions thread, asking for a feature in the address blocks function that automatically extracts attention names and titles, just like it extracts postal codes and other address sections.

             

            Thanks again for your suggestions.

             

            David