Sunday, November 21, 2021

UFOware - exported sample search results (Ariel School, Richard Doty, Ron Pandolfi, Jaime Shandera)

I don't think anyone else in ufology has tried anything like this yet: sharing exported results of sample searches of their offline collections of UFO material.

Many people have talked about creating UFO databases or UFO data warehouses (the latter term having been promoted by Jacques Vallee, including in his interesting presentation in Paris in 2014 and in some related discussions I had with him at that event). However, concrete progress and results in relation to many such plans have been rather sparse.

I therefore thought it might be a good idea for me to:

(1) Share a few sample search results from searches of my own UFO data warehouse ("UFOware");

(2) Share some analysis resulting from the use of UFOware. I've been a bit wary of posting such analysis. Since some unpleasantness a few years ago involving one or two paranoid individuals, I've generally only posted raw UFO material (such as UFO newsletters, official documents etc) and related resources (such as some databases, indexes and tips in relation to searching UFO data) rather than much analysis or commentary. But I don't think that it's possible to discuss the merits of different techniques for searching/analysing UFO data completely in isolation from the results of such efforts.


In relation to the former point, here are a few sample search results obtained from UFOware:


(1) Results of searches in UFOware for "Ariel" in the same paragraph as "School" (to find references to the Ariel School incident in Zimbabwe):

https://docs.google.com/spreadsheets/d/1iIkE7INaPLs8LaWaJ16X9SMgtCZvUyP2xALhzbLmaCs/edit?usp=sharing



(2) Results of searches in UFOware for "Doty" together with either "Rick" or "Richard" (to find references to Richard Doty): 

https://docs.google.com/spreadsheets/d/1AtRXb0-1LzaJQn6CcOovkm-dlmOB6vJkx9XFmM2qcGI/edit?usp=sharing


(3) Results of searches for "Pandolfi" in UFOware (to find references to Ron Pandolfi):

https://docs.google.com/spreadsheets/d/1XwXGyQDX-KjXFaZQYFGqMHqrA9K57GyMPsijkk2Chlo/edit?usp=sharing


(4) Results of searches for "Shandera" in UFOware (to find references to Bill Moore's colleague Jaime Shandera):

https://docs.google.com/spreadsheets/d/1sxu0dHLhgKQ-og2cUK-gDNvr0n-yCpix3XZFuNgMCC8/edit?usp=sharing

I'm not sure about the value of such exported results. They do allow some relevant resources to be identified, but unless the underlying books/resources are shared as well, their utility is massively reduced.

Of course, I have been seeking to share many UFO magazines/newsletters when I can get permission to do so, but getting permission to share complete books (even if they are out of print) has proven difficult.

So, unfortunately, in the sample search results it is not possible to click on each search result and see the results in context - unlike performing a search of UFOware when the underlying material is stored on a local hard-drive. 



(I've previously posted about the software used to generate these results, i.e. the free version of PDF-XChange Editor for Windows, which allows search results to be saved in various formats and to be reloaded at a later date rather than having to redo a search. Since a search of my main UFO collection now takes about 24 hours, even using a very fast 1TB SSD drive, this ability to reload searches is useful.)
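For anyone who would rather script the "save once, reload later" idea than rely on a particular program, here is a minimal Python sketch. It is not PDF-XChange Editor's own export format: the `Hit` fields, the function names and the sample file names are all hypothetical, chosen only to illustrate writing search hits to CSV and reading them back without rerunning the search.

```python
import csv
import io
from dataclasses import dataclass, asdict

# Hypothetical record for one search hit (not PDF-XChange Editor's format).
@dataclass
class Hit:
    file: str      # document in which the keyword was found
    page: int      # page number of the hit
    snippet: str   # a few words of surrounding context

FIELDS = ["file", "page", "snippet"]

def save_hits(hits, stream):
    """Write hits to an open text stream as CSV with a header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for hit in hits:
        writer.writerow(asdict(hit))

def load_hits(stream):
    """Read hits back from CSV, restoring the page number as an integer."""
    return [
        Hit(row["file"], int(row["page"]), row["snippet"])
        for row in csv.DictReader(stream)
    ]

# Round trip through an in-memory buffer (a real file opened with
# newline="" would work the same way). File names here are invented.
hits = [
    Hit("newsletter_01.pdf", 3, "sample text, with a comma"),
    Hit("book_02.pdf", 10, 'sample text with "quotes"'),
]
buffer = io.StringIO()
save_hits(hits, buffer)
buffer.seek(0)
reloaded = load_hits(buffer)
```

A CSV file of this kind also opens directly in a spreadsheet, which is roughly the form in which the sample results above are shared.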



Saturday, November 20, 2021

UFO data : Practical tips for searching your digitised material using the free version of PDF-XChange Editor for Windows

I have previously posted about why I find that using the free version of PDF-XChange Editor makes reviewing the results of searching my offline collection of PDFs 10 to 20 times faster.  I thought it might be helpful if I posted a few practical instructions on using that free software to search large collections of digitised UFO material, with a few screen shots, to help get a few more people started using this useful free option.  (Did I mention that this is free??).

Basically, as I've previously posted in more detail, this free software can be used to search an entire directory full of PDF documents (or, indeed, containing sub-directories full of different collections of PDF documents). For example, I routinely use this software to search hundreds of thousands of different PDFs, including most of the published UFO books, most UFO newsletters/magazines from various countries, thousands of newspaper clippings, millions of pages of transcripts of UFO podcasts/documentaries, official UFO documents from around the world, PDF conversions of key UFO websites/forums and other material.
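To make the "directory plus sub-directories" idea concrete, here is a minimal Python sketch of how a collection could be enumerated before searching. It is only an illustration under stated assumptions: `find_pdfs` is a hypothetical helper name, and it says nothing about how PDF-XChange Editor itself walks folders.

```python
from pathlib import Path

def find_pdfs(root):
    """Return every PDF under root, including all sub-directories.

    The extension check is case-insensitive, so "scan.PDF" is included.
    """
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
```

Calling `find_pdfs` on the top-level folder of a collection returns one sorted list covering every sub-folder, which is the set of files a whole-collection search would need to read.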

If, like me, you have a large collection of potentially relevant PDFs on your hard-drive(s), then I'd recommend at least trying out the following steps:

(1) Download and install the free version of “PDF-XChange Editor” for Windows. This is available on the Tracker Software website. (The website may try to guide you towards buying a version of this software, but the version available for free does everything I need, including the ability to search keywords in numerous different PDFs using the simple steps below).

(2) Open “PDF-XChange Editor” and click on the "Search" button on the top right of the screen (i.e. the image of the binoculars) - NOT the separate "find" button which only searches within a single PDF.


(3) Type in the keyword(s) you want to search for and click on the "Search" button under the keyword. (If you want to be adventurous, you can select various options, e.g. making the search term case sensitive or using Boolean logic in your search. You can also specify the proximity of different keywords, e.g. finding results containing the words "Ariel" and "school" but only if both words are used within the same paragraph. However, I'd recommend starting with just a simple, single keyword such as the name of a witness/researcher.)

(4) Click on the drop down box and select “browse” at the bottom. You can then select the directory of journals you want to search (e.g. a directory containing newsletters downloaded from the online archive I've helped create on the AFU's website or a directory containing all the PDF material that you have downloaded or digitised).

The search then generates a list of results such as the one below:




Importantly, you can:

(A) see the search results with a brief indication of the context of the keyword (which can help eliminate some irrelevant results without even opening the relevant document), and 

(B) click on any individual result in the search results pane and the relevant PDF will open to the spot with the relevant keyword, so you can see the fuller context very, very quickly and easily.  This is a massive time saver compared to Google searches.  By clicking down the list of results in the search results pane on the right of the screen, you can see hundreds of search results in context in the window on the left of the screen in a matter of minutes. I find this about 10 or 20 times faster than reviewing Google searches.



If you want to start getting more adventurous, you can play around with the options that appear when you click on the "Advanced Criteria" tick box (e.g. to try Boolean searches, such as searching for "Richard" or "Rick" together with "Doty") and the "Options" drop-down menu under the search results (e.g. to specify the required proximity of multiple keywords). For example, you can require that "Richard" or "Rick" appears next to "Doty", or you can ask for all results where "Richard" or "Rick" appears in the same paragraph as "Doty". The latter, looser option will help capture references such as "Richard C Doty".
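The logic behind a combined Boolean and "same paragraph" search can be sketched in a few lines of Python. This is a rough illustration only, assuming the text has already been extracted from the PDFs and that paragraphs are separated by blank lines; the function names and the sample text are invented, and this is not a description of how PDF-XChange Editor implements its options.

```python
import re

def words(text):
    """Lower-cased set of the words in a piece of text."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def paragraphs(text):
    """Split text into paragraphs on blank lines (an assumption)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def same_paragraph(text, all_of, any_of=()):
    """Return paragraphs containing every word in all_of and,
    if any_of is given, at least one word from any_of."""
    hits = []
    for para in paragraphs(text):
        found = words(para)
        has_all = all(w.lower() in found for w in all_of)
        has_any = not any_of or any(w.lower() in found for w in any_of)
        if has_all and has_any:
            hits.append(para)
    return hits

# Invented sample text, for illustration only.
sample = (
    "A note about Richard C Doty.\n\n"
    "Rick is mentioned here.\n\n"
    "Doty appears alone here.\n\n"
    "Pupils at Ariel School are mentioned here."
)

doty_hits = same_paragraph(sample, ["Doty"], any_of=["Rick", "Richard"])
ariel_hits = same_paragraph(sample, ["Ariel", "School"])
```

Only the first paragraph matches the Doty search, because it is the only one holding "Doty" together with "Rick" or "Richard". A stricter "next to" rule would miss it, since the initial "C" sits between the two names.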




One option that is probably worth a separate post is the option to save search results in various formats, and to reload saved search results. I'll post some examples of this process in a separate post, before moving on to other topics relating to UFO databases.


Friday, November 19, 2021

UFO data : The KISS principle - AI software versus smart searches

The application of Artificial Intelligence to large data sets offers the potential for considerable insight into data about UFOs - but merely understanding how to perform effective and efficient keyword searching of digitised UFO material (e.g. UFO books, articles, PhD dissertations, case files and a collection of UFO databases) is a very useful first step.

Since an item I posted on the ATS discussion forums in 2011, I've been posting online about the benefits of using free software (such as PDF-Xchange Editor, outlined below) or commercial packages (such as Adobe Acrobat) to conduct offline searches of collections of PDFs.

Simple tools relating to that recommendation have caused less excitement than when I've posted a bit about some of the work I've been doing behind the scenes on using Artificial Intelligence to assist with UFO research. For example, a UFO chat-bot that I posted online to demonstrate the use of Artificial Intelligence to suggest explanations for some UFO reports was referred to by the veteran Spanish UFO researcher Vicente-Juan Ballester-Olmos on his blog as "one of the best, most proactive and original developments in UFO research in the last decades".  But I think my encouragement of the use of simple but effective search tools (and sharing tips on their use) has actually been a more meaningful contribution to ufology.

The simplicity of basic search tools should not cause them to be disregarded. 

Indeed, simplicity has considerable advantages in terms of ease of adoption.  

I'm a firm supporter of the KISS principle [i.e. Keep It Simple, Stupid].

In fact, despite some other people discussing the possibility of using AI software to search/analyse UFO data, I haven't seen many discussions of the numerous analytical traps involved in such searches/analysis or any acknowledgement of the many practical difficulties involved in using such tools.  Issues that I've struggled with for years are generally glossed over in most references to AI tools.

I generally prefer simple tools to protracted discussions of the potential application of AI without anyone demonstrating clearly that they have managed to use such AI tools to make UFO research more efficient or effective.  



In particular, I consider it important to have - and be able to search efficiently and effectively - material on your own hard drive.  The results of such searches can be reviewed much more rapidly than, say, the results of Google searches of online material.  I'd estimate that the speed of reviewing search results is about 10 or 20 times faster with such offline search tools.  

Given that most of us involved in UFO research have limited time, an ability to speed up the reviewing of material by a factor of 10 or 20 is pretty significant.

Back in 2012, I compared various items of search and indexing software. I tested several pieces of software to make it easier and faster to find UFO material on my computer. I wanted to see which piece(s) of software were quickest and/or easiest to use to search through the UFO material on my hard-drive.

My collection of digitised material has been growing exponentially in the last few years (to include many books, journals, magazines, official documents, archives of email discussion lists, catalogues, indexes and other material), particularly since I have found a few ways to search this material more efficiently which has caused me to seek to increase my collection of digitised material.

Obviously, I'm not able to put all this material online, for numerous reasons (not the least of which are copyright issues). I have, however, been able to upload over 2 million pages of material after getting relevant permissions.  Moreover, I'm happy to share some tips and techniques which may help others to search their own collections more efficiently and effectively.

I had previously been interested in finding efficient ways of searching for UFO material online. In particular, I spent a fairly considerable amount of time seeking to develop various customised search engines (using, in particular, Google's free Google Custom Search service) to search some of the better UFO websites in a single search. I was not happy with the results of those efforts, particularly because the index used by the Google Custom Search service is more limited than the index used by the main Google search service. Because I was not happy with the results, I will refrain from posting links to the various customised search engines I made.

Because of the limitations I found with the Google Custom Search and because quite a bit of UFO material is not available online, I turned my focus to searching UFO material on my hard-drive. 

I was very pleased to find software which allowed fast searches of multiple PDF files on my hard-drive (with an ability to specify which file or folders were to be searched).  While various items of software can now perform this task, I have found the free version of the PDF-Xchange Editor to be the fastest and most useful option I have tried.

I subsequently compared that piece of software with the Copernic Desktop Search software (helpfully mentioned to me by Chris Aubeck on the EuroUFO List) and some other indexing software, including DtSearch - recommended to me by Maurizio Verga on the EuroUFO List.

I found both of these pieces of software useful, for different types of searches. However, since 2012 I have increasingly focused on the PDF-Xchange Editor for reasons I outline below.


(1) Cost
The free version of PDF-Xchange Editor does everything I want to use it for (including searching large collections of files) while Copernic Desktop Search is not free to use in relation to collections larger than 2GB. 

The cost of Copernic is not extremely high - but after being used to free searches online I'm sure this cost may deter some people.


(2) Types of files searched:
Copernic Desktop Search is not limited to searching PDF files (it also searches, for example, Microsoft Word/Excel files on my hard-drive) while PDF-XChange Editor (as its name may imply) is limited in this way.

Since (for reasons outlined below) I generally prefer using PDF-XChange Editor, I developed a fairly strong incentive to convert as much digitised material as possible from Word documents etc to PDF format. Some of you will have noticed, for example, that I've been seeking to convert the archives of various email discussion lists to PDF format to enable me to use PDF-XChange Editor to search those archives (amongst other material).

Of course, it goes without saying (which will not stop me saying it...) that both pieces of software can only search digitised information. Neither is going to help with the piles of books and documents which I haven't scanned. Again, this has given me an incentive to work to increase the amount of UFO material which is digitised. Now, as at 2021, most UFO books, UFO newsletters and many case files have been digitised.


(3) Initial set-up time:
Copernic Desktop Search can take quite a while to produce an initial index. I had to leave one of my computers alone for about 4 days for an index of its 500GB hard-drive to be compiled.

PDF-Xchange Editor does not create any index - it needs to run through each specified file/folder each time a search is performed. This means it is quicker to set up.
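The trade-off between the two approaches (a slow one-off indexing pass followed by near-instant lookups, versus no set-up but a full read of every file on every search) can be shown with a toy Python sketch. It is a simplified illustration, not the internals of either Copernic Desktop Search or PDF-XChange Editor; the function names and the tiny sample collection are invented, and the documents are assumed to be plain text already.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lower-cased list of the words in a piece of text."""
    return re.findall(r"[a-z0-9]+", text.lower())

def scan(docs, keyword):
    """No index: read every document again on every search."""
    wanted = keyword.lower()
    return sorted(name for name, text in docs.items() if wanted in tokenize(text))

def build_index(docs):
    """One slow up-front pass: map each word to the documents containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index

def lookup(index, keyword):
    """With an index, a search is a single dictionary lookup."""
    return sorted(index.get(keyword.lower(), set()))

# Invented sample collection, for illustration only.
docs = {
    "a.pdf": "A report mentioning the Ariel School",
    "b.pdf": "Nothing relevant here",
    "c.pdf": "Another school is mentioned",
}
index = build_index(docs)
```

Both `scan(docs, "School")` and `lookup(index, "School")` return the same two documents. The difference is where the time goes: `build_index` touches every word once in advance, while `scan` repeats the full read each time.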


(4) Speed of obtaining search results:

Copernic Desktop Search is MUCH faster at producing a list of search results. Results are virtually instantaneous.

A search of a sizeable collection using PDF-Xchange Editor can take quite a few minutes (or even hours when I specify a search of my entire collection of UFO material). 

As at 2021, a full search of my higher-priority UFO items (about 1 TB) takes about 24 hours to run.  (I have other hard drives holding over 40 TB of material, but 1TB is sufficient for most of the key text material).
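As a rough back-of-the-envelope check, the figures above (about 1 TB in about 24 hours) imply an average throughput that can be worked out in a few lines of Python. The sketch assumes decimal units (1 TB = 10^12 bytes), and the extrapolation to 40 TB assumes search time scales linearly with collection size, which is only an assumption.

```python
def throughput_mb_per_s(size_tb, hours):
    """Average data rate in MB/s, assuming decimal units (1 TB = 10**12 bytes)."""
    total_bytes = size_tb * 10**12
    seconds = hours * 3600
    return total_bytes / seconds / 10**6

def hours_at_same_rate(size_tb, reference_tb=1, reference_hours=24):
    """Estimated search time if duration scaled linearly with size (an assumption)."""
    return size_tb / reference_tb * reference_hours

rate = throughput_mb_per_s(1, 24)          # roughly 11.6 MB/s
hours_for_40_tb = hours_at_same_rate(40)   # 960 hours, i.e. 40 days
```

On these assumptions, a single search across 40 TB would run for about 40 days, which helps explain why keeping the key text material within about 1 TB is the practical choice.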



(5) Speed of REVIEWING search results:

I have found it MUCH easier and quicker to go through the results of searches in PDF-Xchange Editor.

The search results in PDF-XChange Editor indicate how many times the relevant keyword or phrase appears in any particular document (with a helpful snippet of surrounding words, which often allows you to eliminate many of the results) and allow you to click on each one in turn very quickly, with the relevant page being displayed almost instantly.
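The "hit count plus snippet" style of result list is easy to mimic for anyone experimenting with their own scripts. The Python sketch below is only an illustration of the idea, assuming the text of a document has already been extracted; `hits_with_context` is a hypothetical name and the sample sentence is invented.

```python
import re

def hits_with_context(text, keyword, width=30):
    """Return (count, snippets) for a case-insensitive keyword search.

    Each snippet is the matched keyword plus up to `width` characters
    on either side, with runs of whitespace collapsed to single spaces.
    """
    snippets = []
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        snippets.append(" ".join(text[start:end].split()))
    return len(snippets), snippets

# Invented sample text, for illustration only.
text = "Shandera wrote to Moore. Later, Moore and Shandera met."
count, snippets = hits_with_context(text, "shandera", width=10)
```

Here `count` is 2 and the snippets are "Shandera wrote to" and "Moore and Shandera met.". Even such short fragments are often enough to discard an irrelevant hit without opening the document.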

Trying to review the results of a search on Copernic Desktop Search is, relatively speaking, a pain in the backside. There is a preview window which displays the first relevant occurrence of a keyword/phrase within a document when you highlight that document's filename, but I've found that preview window to be relatively slow and the formatting of text in that preview window is often almost unreadable.



CONCLUSION:
I think that some of the discussions of the potential application of sexy AI tools to UFO data risk causing people to overlook some very simple to use (but nonetheless effective and efficient) search tools - e.g. PDF-XChange Editor.

Heck, I much prefer using some basic search tools (particularly those built into PDF-XChange Editor) even over indexing software such as Copernic Desktop Search. Generally, I'd rather wait a few minutes (or even hours) for PDF-Xchange Editor to produce its search results and then zip through those results very quickly and easily. I've found it possible to use such tools to review relevant search results 10 to 20 times faster than Google searches of online material.

I can start a search on PDF-XChange Editor and carry on with other tasks on my computer (or simply start a search before going to bed or before going out for a meal) and review search results when they are ready. There isn't usually any massive urgency about getting results of a search regarding UFO material, so I tend to use PDF-XChange Editor because reviewing the results of a search takes up less of my (limited...) spare time. 

To some extent, the most appropriate piece of software depends on the type of search - if there are likely to be a lot of results (e.g. for "astronomer" or "meteorologist") then I'd focus on the ease/speed of reviewing results but if I'm not sure there will be many (or any) results then I used to try a quick search using Copernic Desktop Search.  Since it is not always possible to know how many results will be found, I increasingly used only PDF-Xchange Editor.

I'll post more practical detail about using PDF-XChange Editor (including screen shots) in a separate item shortly, before posting more about various UFO databases and AI tools.  I'd prefer, at least for these initial posts, to "Keep It Simple, Stupid".