Saturday, November 20, 2021

UFO data: Practical tips for searching your digitised material using the free version of PDF-XChange Editor for Windows

I have previously posted my reasons for using the free version of PDF-XChange Editor: I find that it makes reviewing the results of searching my offline collection of PDFs 10 to 20 times faster. I thought it might be helpful to post a few practical instructions on using that free software to search large collections of digitised UFO material, with a few screenshots, to help get a few more people started with this useful free option. (Did I mention that this is free?)

Basically, as I've previously posted in more detail, this free software can be used to search an entire directory full of PDF documents (or, indeed, a directory containing sub-directories full of different collections of PDF documents). For example, I routinely use this software to search hundreds of thousands of different PDFs, including most of the published UFO books, most UFO newsletters/magazines from various countries, thousands of newspaper clippings, millions of pages of transcripts of UFO podcasts/documentaries, official UFO documents from around the world, PDF conversions of key UFO websites/forums and other material.
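Before pointing the search at a folder, it can be useful to know roughly how many PDFs that folder (and its sub-directories) actually contains. The following is only a minimal Python sketch for that purpose; it is not part of PDF-XChange Editor, and the function name `list_pdfs` is simply a label I have made up for illustration.

```python
from pathlib import Path


def list_pdfs(root):
    """Return every PDF under root, including sub-directories, sorted by path."""
    root = Path(root)
    # rglob("*") walks the whole directory tree; suffixes are compared
    # case-insensitively so that files named "SCAN.PDF" are counted too.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

Calling `len(list_pdfs(folder))` on your own collection folder then gives the number of PDF files that a search of that folder would have to cover.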

If, like me, you have a large collection of potentially relevant PDFs on your hard-drive(s), then I'd recommend at least trying out the following steps:

(1) Download and install the free version of “PDF-XChange Editor” for Windows. This is available on the Tracker Software website. (The website may try to guide you towards buying a version of this software, but the version available for free does everything I need, including the ability to search keywords in numerous different PDFs using the simple steps below).

(2) Open "PDF-XChange Editor" and click on the "Search" button at the top right of the screen (i.e. the image of the binoculars), NOT the separate "find" button, which only searches within a single PDF.


(3) Type in the keyword(s) you want to search for and click on the "Search" button under the keyword. (If you want to be adventurous, you can select various options, e.g. making the search term case sensitive or using Boolean logic in your search. You can also specify the proximity of different keywords, e.g. finding results containing the words "Ariel" and "school" but only if both words are used within the same paragraph. However, I'd recommend starting with just a simple, single keyword such as the name of a witness/researcher.)

(4) Click on the drop down box and select “browse” at the bottom. You can then select the directory of journals you want to search (e.g. a directory containing newsletters downloaded from the online archive I've helped create on the AFU's website or a directory containing all the PDF material that you have downloaded or digitised).

The search then generates a list of results such as the one below:




Importantly, you can:

(A) see the search results with a brief indication of the context of the keyword (which can help eliminate some irrelevant results without even opening the relevant document), and 

(B) click on any individual result in the search results pane and the relevant PDF will open to the spot with the relevant keyword, so you can see the fuller context very, very quickly and easily.  This is a massive time saver compared to Google searches.  By clicking down the list of results in the search results pane on the right of the screen, you can see hundreds of search results in context in the window on the left of the screen in a matter of minutes. I find this about 10 or 20 times faster than reviewing Google searches.
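To illustrate the idea behind point (A), i.e. showing each hit with a little surrounding context, here is a rough Python sketch that works on plain text. It is only an illustration of the "keyword in context" concept and makes no claim about how PDF-XChange Editor builds its results list; the function name and the `width` parameter are my own assumptions.

```python
import re


def keyword_in_context(text, keyword, width=40):
    """Return one short snippet per case-insensitive match of keyword."""
    snippets = []
    for m in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        # Take up to `width` characters on either side of the match.
        start = max(0, m.start() - width)
        end = min(len(text), m.end() + width)
        # Collapse line breaks and runs of spaces so each snippet fits on one line.
        snippets.append(" ".join(text[start:end].split()))
    return snippets
```

Scanning a list of snippets like this is what makes it possible to discard obviously irrelevant hits without opening the underlying document.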



If you want to get more adventurous, you can experiment with two further sets of options. First, ticking the "Advanced Criteria" box lets you run Boolean searches, such as searching for "Richard" or "Rick" together with "Doty". Second, the "Options" drop-down menu under the search results lets you specify the required proximity of multiple keywords: for example, whether "Richard" or "Rick" must appear next to "Doty", or whether you want to see all results where "Richard" or "Rick" appears in the same paragraph as "Doty" (which will, for example, help capture references to "Richard C Doty").
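The difference between the "next to" and "same paragraph" proximity settings can be shown with a small Python sketch on plain text. This is my own simplified illustration, not the software's actual matching rules: I am assuming that "next to" means one word immediately followed by the other, and that paragraphs are separated by blank lines. The function names are hypothetical.

```python
import re


def words(text):
    """Lower-cased word tokens, in order of appearance."""
    return re.findall(r"[a-z0-9']+", text.lower())


def same_paragraph(text, any_of, required):
    """Paragraphs containing `required` and at least one of the `any_of` words."""
    hits = []
    for para in re.split(r"\n\s*\n", text):
        tokens = set(words(para))
        if required.lower() in tokens and any(w.lower() in tokens for w in any_of):
            hits.append(para.strip())
    return hits


def adjacent(text, any_of, required):
    """True if one of the `any_of` words is immediately followed by `required`."""
    tokens = words(text)
    firsts = {w.lower() for w in any_of}
    return any(a in firsts and b == required.lower()
               for a, b in zip(tokens, tokens[1:]))
```

Under these assumptions, `adjacent("Richard C Doty", ["Richard", "Rick"], "Doty")` is false because the middle initial sits between the two names, whereas `same_paragraph` on the same text still returns a hit. That is why the looser paragraph setting helps catch name variants.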




Another option, probably worth a post of its own, is saving search results in various formats and reloading saved search results. I'll post some examples of that process separately, before moving on to other topics relating to UFO databases.


2 comments:

  1. Thanks, I just downloaded PDF-X and it works great. Just what I was looking for!

  2. Have you looked into Paperless-ngx (https://docs.paperless-ngx.com/)? I have a site up where I'm working on my own UFO dataset. It can do quite a bit of what you talk about above, at scale, for free. Might be of more use to you than me, since you can code and get it to do things post-upload with scripts.
