I used the template to select certain parts of the pdf and that's fine but I want to select ALL LINES and can't see how you do that.
I'm also not sure what the relationship between a template and a model is, i.e. I'm saving a project and opening a template. Where does the model come into this?
Sorry for my naivety!
The project file holds the combination of the source file (e.g. a pdf or txt file which holds the data) and the model. The model file holds all traps and filters.
Basically that answers your first question: the project knows which file holds the data and the model tells it which lines to extract. Apparently not all lines in the data file are meeting the traps you've set in the model. Open the project in Modeller, change the traps and you should be able to fix it.
Thanks very much for your responsiveness and explanation re Monarch's project structure.
That helps a lot.
Re the trap...If I was just processing a single file...I'd be very happy to adjust the trap to get all lines.
The problem is I'm trying to automate the collection of text from a lot of disparate pdfs so...I'm after the foolproof trap that will catch every line of every pdf ever presented. Is this possible, i.e. is there an equivalent of the wildcard '*'?
Sorry if I wasn't clear and thank you for your consideration.
I'm more into trapping txt files or processing csv files in Modeller, but I guess this should work: simply don't put any traps on the trap line, but highlight the entire row and make it a memo field.
This is not really an intelligent way of extracting the data. Better said, it's data conversion rather than data extraction. There are probably better tools for this than Modeller.
Thanks I'll try that and let you know how I get on.
I realise I'm effectively using a scalpel as a cold chisel.
I was using pdftotext.exe but it's not converting accurately and Monarch's sitting there for those pdfs that won't convert automatically.
Unfortunately, after testing, the number of files that won't convert accurately is prohibitive so...Monarch's taking over as the primary rather than remedial tool. When you say there might be a more appropriate straight conversion tool, did you have something in mind?
I couldn't find a memo field per se but didn't put anything on line T and highlighted all of line S.
I saved this template as "no_trap_in_T_hilite_all_S" and...it selected everything like you said.
I then clicked the table window which let me then do...file/export/table/nameofpdf.txt and choose output dir i.e. exporting to a text file.
Mindful that my project only has a single pdf...I'm just wondering if it's possible to specify a command line that changes in a loop without needing a project for each pdf that I need to convert, e.g. something like:
for X = 1 to 10
  monarch path_to_pdf+str$(X) path_to_SAME_model_file path_to_txtfile+str$(X)
next X
Nearly there hopefully and thanks very much for your help.
You can add multiple report files to a project, and I'm sure that's the case for pdf files as well. However, you have to do this in Modeller (or Monarch). If you're still using xprj files (Monarch Project files) you could also add reports via a text editor, as xprj files are XML files.
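Since an xprj file is XML, you can look at its structure with any XML parser before editing it by hand. Here's a minimal Python sketch that just lists the element paths in a piece of XML, so you can see where the report entries live. Note that I'm not reproducing the real xprj schema: the sample XML and its tag names (Project, Reports, Report, Model) are made up purely for illustration, so run it against the text of your own project file to see the actual layout.

```python
import xml.etree.ElementTree as ET

# HYPOTHETICAL sample: these tag names are invented for illustration,
# they are NOT the real xprj schema. Feed outline() your own file's text.
SAMPLE = """<Project>
  <Reports>
    <Report>C:\\SomePath\\first.pdf</Report>
    <Report>C:\\SomePath\\second.pdf</Report>
  </Reports>
  <Model>C:\\SomePath\\model_file</Model>
</Project>"""


def outline(xml_text):
    """Return a list of (element path, text) pairs in document order."""
    root = ET.fromstring(xml_text)
    rows = []

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}"
        text = (elem.text or "").strip()
        rows.append((path, text))
        for child in elem:
            walk(child, path)

    walk(root, "")
    return rows


if __name__ == "__main__":
    for path, text in outline(SAMPLE):
        print(path, text)
```

Once you can see which elements hold the report paths in your own file, adding another report in a text editor is a matter of copying one of those entries. Keep a backup of the project file first.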
Modeller cannot be instructed to take all files from one folder. For this you'd need Datawatch Automator (the successor to DataPump). This tool allows you to use the asterisk (*) as a wildcard. So "C:\SomePath\Sales *.txt" would pick all text files in the folder C:\SomePath\ which start with Sales as well as Sales.txt.
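If you don't have Automator, the loop you describe could in principle be driven from outside Monarch by a small script. Here's a rough Python sketch. It assumes your Monarch version can be started from a command line with a report, a model and an export file, which I haven't verified. The command template below is only a placeholder (no real Monarch switches are shown), so replace it with whatever the documentation for your version gives.

```python
import glob
import os
import subprocess

# PLACEHOLDER command line: "monarch" and the argument order are assumptions,
# not documented Monarch syntax. Replace with the real switches for your version.
COMMAND_TEMPLATE = ["monarch", "{pdf}", "{model}", "{txt}"]


def build_commands(pdf_paths, model_path, out_dir, template=COMMAND_TEMPLATE):
    """Build one command per PDF, always reusing the same model file."""
    commands = []
    for pdf in sorted(pdf_paths):
        # Name the output after the PDF: some.pdf -> <out_dir>/some.txt
        base = os.path.splitext(os.path.basename(pdf))[0]
        txt = os.path.join(out_dir, base + ".txt")
        commands.append(
            [part.format(pdf=pdf, model=model_path, txt=txt) for part in template]
        )
    return commands


def run_all(pdf_glob, model_path, out_dir, dry_run=True):
    """Expand a wildcard such as 'C:/SomePath/*.pdf' and process each match."""
    commands = build_commands(glob.glob(pdf_glob), model_path, out_dir)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))  # only show what would be run
        else:
            subprocess.run(cmd, check=True)  # stop at the first failure
    return commands
```

Calling run_all("C:/SomePath/*.pdf", "C:/SomePath/model_file", "C:/Out") only prints the commands; pass dry_run=False once they look right. The same model file is reused for every pdf, so you wouldn't need a project per file.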
>You can add multiple report files to a project, I'm sure that's the case for pdf files as well.
>However, you have to do this in Modeller (or Monarch).
>If you're still using xprj files (Monarch Project files)...
>...you could also add reports via a text editor as xprj files are XML files.
Thanks! I don't quite understand this at the moment but will look into it.
>Modeller cannot be instructed to take all files from one folder.
>For this you'd need Datawatch Automator (the successor to DataPump).
Sounds like a limitation re processing a large number of pdfs
but then I suppose Modeller/Monarch is intended as a scalpel, not a chain saw <smile>.
Thank you very much indeed for your advice.
It's extremely helpful and very kind of you.