Just to be really specific about what I'm asking -
The main issue I'm having is that after I create templates for email/tel/fax, I'm kind of stumped about how to create traps for lastname/address without any anchors.So basically if I try to create a trap for all lines from the first line (containing the last name) to the line before tel/fax, it selects everything rather than the lines I want.
Dr FirstName LastName
Address Line 1
Address Line 2 City
Dr FirstName LastName
Address Line2 City
Professor FirstName Initial LastName
Address Line 1
Address Line 2
Address Line 1
E-mail: /font[/quote][/quote][/b][/size][/QUOTE]Hmm, no tags at all? No format to help us?
When I re-display as above the space lines disappear - are they in the original file?
Does the Tel:/Fax line really wrap as shown above?
Which version of Monarch do you have?
I think you should be able to get at the email field at the end of the record OK and probably the line above that for the Tel/Fax numbers.
If you can identify a start placement for the name and address sections, together or individually it should be possible to extract in a number of ways and then process the results to give the break out of the names perhaps using the NSPLIT function and the address using teh Address Blocks feature from within the Table area functionality.
There may be some anomalies to deal with where the address or name does not fit the rules, but that would not be unusual for that sort of data.
However, first we have to work out how to extract the block(s) of data! Clearly some things might well make this a little tricky ...
No tags at all!!! Just pure text formatted with slight variations.
The blank line is present in the original text and separates 1 contact from another. The tel/fax line does wrap as shown but not for every contact.
I have 225 pages of this text that I would like to convert to CSV or tab-delimited and thought Monarch should be able to handle this task.
I have Monarch V8.0.
I was planning on using the LSPLIT or NPSLIT function but am unable to capture all the data in a single field. When I create a trap for all lines and view in table format, it shows each line in the original text as a separate row in table view.
Thanks a bunch!!!
Grant has asked some good question, and clearly has a solution in mind.
If you'll excuse the interruption, I've made a little progress that might be of use, provided you can use the functions I've used, depending upon your Monarch version. You might want to edit your profile, BTW. Edit: I see that you posted while I was working on my post. You should be able to use these formulas.
Having captured all of the record in one field (named Contact Data), I built the following calculated fields:
Name and Address:
[font="courier"]substr(,instr("E-mail",[Phone and Email])len("E-mail: "),len()-instr("E-mail",[Phone and Email])1)[/font][/quote]Yikes!! OK, so it didn't seem so bad when I started out... Anyone have more efficient ways to get there?
Hehe, yes, Kruncher is correct, I have an idea in mind ... but it may be marginal. This type of file, indeed anything to do with addresses, can be hugely variable in content I find. Address analysis for separate field break out can also be fraught with problems in the source data and is very unlikely to be 100% successful - unless the extraction is from a 'just cleaned' database, and even then you may need some luck! Expect to need some sort of interventions at some point and use the error flag provided in the Address Block functionality.
Anyway, those observations made, here's my idea.
If the first line of the output above the first name is not a blank line, add a blank line.
I will use a detail template and a footer template. The footer's first line is the "Tel:" line so the detail template SAMPLE needs to be resticted to the SMALLEST number of actual data lines in a record ABOVE the "Tel:" line plus include the space line above the record. In the sample record that seems to be 8 lines in total.
Trap on the blank line (hence the reason for needing at least one at the top of the file).
Paint the field for LastName in line 2 of the template and the 'full' name in line 3.
In line 4 paint a single line field wide enough to capture the widest address line. Set the advance properties of this field to "End field on" - 'None of the above'. That will gove you a single field with all your address info.
There is a problem with this that has no obvious solution based on the sample layout. Department and Organisation can be mixed up or non-existent. The only way I can think of dealing with those issues is to simply treat them as part of the address and see what comes of it using the address block feature.
A second problem is that I can't be sure what results the address block feature will provide because it mainly works based on real data - like Zip codes and things, which are not present in the sample. Data elements that can be extracted with rules - Zip Codes, and things like that - or comparison lists can provide ways of obtaining precision and are used by the address block feature. If you have a limited number of expected Organisations, for example, and need to populate an 'Organisation' field accurately in a separate database it may be possible to define a table of possible values and calculate a field that uses that table to 'correct' the data position in the resulting output. For now I will leave that subject for consideration.
Create a footer template for the shortest footer area - 2 lines would seem to be OK. If it is then you can paint fields for the first line, wide enough to cover the max width of the Tel:/Fax: field, and make it a multi-line field. Also paint one for the E-Mail field to the right of the E-mail tag. A single line field looks like it will be OK but if in doubt make it multi-line. In the Advanced Properties set the "Start field on" value to "Preceding string" of "E-mail:" .
That will save us having to calculate the E-mail address from the Tel: field extraction.
So in the table you now have 3 real fields - LastName, the 'Full name field' and the E-Mail field.
Lets get the rest.
Use the Address Block feature to extract the information you require from the address lines field. The wizard will take you through it. Be sure to set the error flag option to help with post process checking. I can't add much to that without having some real addresses to process as mentioned above. You can do some supplementary processing on the results if you can define the rules ...
To get the Tel: number from the "Tel and Fax" field as I named it;
[font="courier"]ltrim(strip((lsplit(,2,"/",1)),"Tel:"))[/font][/quote]seems to be a simple way to do it but there are a number of other possibilities. The ltrim simply removes a leading space.
To get the "Fax:" number for the 'Tel and Fax' field I used the same idea, although it is slightly crude for this purpose and I was not sure if simply controlling for the trainling "E-Mail:" string would work in all cases. Reduce it if you can to look prettier! ;
[font="courier"]strip(lsplit(,4,":",3),"abcdefghijklmnopqrstuvwyzABCDEFGHIJKLMNOPQRSTUVWXYZ-")[/font][/quote]Ltrim, Rtrim, etc. may also be required. Again there are a number of other ways to slice and dice this extracted field if needed.
My caveat is that whilst these suggestions seem to work for the sample they may not always work for the real report.
If that is the case then we need to seek an alternative approach.
Be careful using the footer. If the TEL: tag is not always present for each record you will have problems.
I can supply my 'cut and paste and edit' sample file and its model if you send me an email address via a PM.