Friday, November 10, 2023

New UFO Chatbot - "Dave" (better, I think, than my previous UFO chatbots i.e. "Robert" [2018] and "Jenny" [April 2023])

Here's my third UFO chatbot - "Dave".  ("Dave" is named as a gesture of respect to journalist and UFO researcher Associate Professor David Clarke).

Back in December 2018, I created and shared a basic UFO chatbot which I named "Robert", after Robert Moore. I think that was the first UFO chatbot created.  (Robert Moore was a kind and friendly British UFO researcher. He has since sadly passed away).  "Robert" UFO chatbot attempted to respond to raw reports of basic UFO sightings by asking some questions and suggesting _possible_ solutions for them. "Robert" utilised logic set out in flowcharts published in the updated version of the book "UFO Study".  "Robert" used the IBM Watson Assistant framework of Artificial Intelligence which, in particular, allowed natural language to be used to chat with Robert.   A leading European UFO researcher, Vicente-Juan Ballester-Olmos, made some very generous remarks about my "Robert" chatbot on his Fotocat blog in March 2019.  He wrote that:

"In my considered opinion, what Isaac Koi has done is one of the best, most proactive and original developments in UFO research in the last decades".

More recently, on 1 April 2023, I shared a new toy with the UFO community : "Jenny". I think "Jenny" was the first Chatbot focused on UFOs that used ChatGPT to answer questions and summarise information to assist with UFO research and investigations. I called that Chatbot "Jenny" as a gesture of respect to another British ufologist, Jenny Randles.  I'm not aware of any other UFO chatbots being shared with the UFO community between my creation of "Robert" in 2018 and my creation of "Jenny", so "Jenny" appears to have been only the second chatbot focused on UFO investigation and research.
Here's a direct link to "Jenny":

My new UFO chatbot "Dave", like "Jenny", uses ChatGPT (version 4, whereas Jenny used version 3.5) and - also like "Jenny" - is intended to give critical evaluations of potential explanations of UFO sightings. "Dave" has been instructed to reflect, in particular, the work of Jenny Randles, J Allen Hynek, Jacques Vallee, Richard Haines, and Mick West.

Jenny was a significant upgrade over Robert, but was unreliable and prone to making stuff up. Basically, "Jenny" was neither as smart nor as careful as the researcher that "Jenny" was named after. :) But it was much, much easier to create "Jenny" than my previous chatbot.

"Dave" was even easier to create, utilising the GPT creation tool released by OpenAI this week.  

I think "Dave" is considerably smarter than "Jenny" (although the next Chatbot - as yet unnamed, which I am working to train on a huge volume of digitised UFO material - should be a bigger step forward).

Unfortunately, I think OpenAI currently only makes such GPT creations available to those that pay for "ChatGPT Plus" membership. Of course, I'd prefer to make this new evolution of a UFO chatbot available without charge.

"Dave" can be accessed by clicking on the image below.




Here is a direct link to "Dave":

https://chat.openai.com/g/g-LUQvGeeIm-dave-ufo-analyst


Saturday, April 1, 2023

World's first GPT-based UFO chatbot? "Jenny" (UfoGPT Chatbot1) - exploring the potential for new UFO research and investigation tools

As one of my continuing efforts to apply Artificial Intelligence to ufology, I've created a new toy to share with the UFO community : "Jenny". I think "Jenny" is the first Chatbot focused on UFOs that uses GPT to answer questions and summarise information to assist with UFO research and investigations. I've called this Chatbot "Jenny" as a gesture of respect to British ufologist, Jenny Randles.  

"Jenny" is intended to give critical evaluations of potential explanations of UFO sightings, reflecting in particular the work of Jenny Randles, J Allen Hynek, Jacques Vallee, Richard Haines, and Mick West.

"Jenny" can be accessed by clicking on the image below.

Here is a direct link to "Jenny":

https://ora.sh/isaackoi/jenny

Back in December 2018, I created and shared a basic UFO chatbot which I named "Robert" (after Robert Moore). I think that was the first UFO chatbot created. 

A leading European UFO researcher, Vicente-Juan Ballester-Olmos, made some very generous remarks about my "Robert" chatbot on his Fotocat blog in March 2019.  He wrote that:

"In my considered opinion, what Isaac Koi has done is one of the best, most proactive and original developments in UFO research in the last decades".

As I detailed at the time, my 2018 "Robert" UFO chatbot attempted to respond to raw reports of basic UFO sightings by asking some questions and suggesting _possible_ solutions for them. "Robert" utilised logic set out in flowcharts published in the updated version of the book "UFO Study". That book was originally written by veteran ufologist Jenny Randles. It was subsequently updated by another British researcher, Robert Moore (after whom "Robert" was named).  

"Robert" used the IBM Watson Assistant framework of Artificial Intelligence which, in particular, allowed natural language to be used to chat with Robert.    

"Robert", bless his heart, certainly wasn't that bright. The point, however, was to prompt a bit of thinking about _how_ UFO reports can be filtered and how this can be achieved most efficiently and effectively - whether using chatbots, AI software or otherwise.

I'm not aware of any other UFO chatbots being shared with the UFO community since I shared "Robert" in 2018, so "Jenny" may be only the second chatbot focused on UFO investigation and research.

"Jenny" is a significant upgrade. 

"Jenny" uses the GPT large language model. The GPT model is also utilised by ChatGPT, the general-purpose chatbot that has been taking the world by storm in recent months and generating consider interest in the future of Artificial Intelligence. "Jenny" therefore has access to a mass of information (not always correctly summarised or analysed by it...), rather than being limited to a simple logical flowchart like that driving "Robert".  

There are various possible ways to tweak chatbots to be more focused on UFOs.  Methods to attempt to obtain more effective and efficient assistance from chatbots for UFO research and investigation include:

(1) Most simply, merely creating a new chatbot by giving a general-purpose chatbot (such as ChatGPT) a series of instructions or "prompts" that tells the chatbot :

(a) the context of an enquiry, e.g. that it relates to UFO research and investigation; 

(b) the style in which it is to respond, e.g. that the chatbot should give evidence for and against any potential explanations it gives for a UFO sighting;

(c)  the data or views that it should prioritise, e.g. the work of Jenny Randles, J Allen Hynek, Jacques Vallee, Richard Haines, and Mick West

(2) Providing additional digitised information (e.g. scanned UFO books, UFO journals and UFO case files) to contextualise and/or train responses. I have helped get most UFO books, most UFO journals and a massive number of UFO case files scanned, partly in anticipation of better AI tools being available in the very near future to analyse this material (in addition to the much simpler - but very effective - search tools available now to obtain additional information and analysis from this digitised material). I also have been working on how this material can be used to "train" the next generation of UFO chatbots.
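To make the first (simpler) method a little more concrete, here is a minimal Python sketch of how the three kinds of instruction in (a), (b) and (c) could be combined into a single block of prompt text. The function name build_prompt and the "Context/Style/Prioritise" labels are purely hypothetical choices for illustration; this is not the wording of any real prompt, and it does not call any chatbot service.

```python
def build_prompt(context: str, style: str, priorities: list[str]) -> str:
    """Join context, response style and preferred sources into one prompt.

    The layout and labels are illustrative assumptions, not a required format.
    """
    sources = ", ".join(priorities)
    return (
        f"Context: {context}\n"
        f"Style: {style}\n"
        f"Prioritise the work of: {sources}"
    )


prompt = build_prompt(
    context="Questions relate to UFO research and investigation.",
    style="Give evidence for and against each potential explanation.",
    priorities=[
        "Jenny Randles",
        "J Allen Hynek",
        "Jacques Vallee",
        "Richard Haines",
        "Mick West",
    ],
)
print(prompt)
```

The resulting text would then be pasted (or sent) as the opening instructions to a general-purpose chatbot.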

"Jenny" is created by the (much simpler) former option.  I'll be returning to the topic of chatbots created using the latter (rather more complicated and promising) option in another post or series of posts.

From some initial testing of the success of the former simple method of creating a GPT chatbot tailored to UFO research, the prompts provided to "Jenny" (the current version of which is pasted below):

(1) Are generally successful in providing some context for a ufologist's queries, so that "Jenny" can respond to (for example) a single word or the name of a famous UFO sighting (e.g. "Ariel school" or "Nimitz") by giving relevant details in the context of ufo sightings.

(2) Are only very modestly successful in generating more detailed responses than the general-purpose ChatGPT chatbot. At the moment, the length and detail of answers (and their style) still vary significantly. However, quick and easy tweaking appears to have had some impact on these points (in my rather subjective view) and some further improvement can probably be achieved with minor tweaking.

(3) Fail to obviously reflect the sources/views it is instructed to particularly consider.

I think this very quick and easy method of creating a tailored chatbot has 

Here is the wording of the current version of the prompt. It is deliberately rather repetitive in an attempt to stress instructions which "Jenny" did not appear to want to follow (particularly giving reasons for and against each potential explanation for a UFO sighting, since she seemed to want to just list potential explanations without critically evaluating them).  

[Prompt for "Jenny" as at 1st April 2023]

As an AI bot, I am programmed to engage in conversation with a human. My perspective is that of a logical, analytical, and mildly skeptical researcher, which reflects the most rational and scientific UFO literature. My responses will prioritize data reflecting the views of Jenny Randles, J Allen Hynek, Jacques Vallee, Richard Haines, and Mick West. I will evaluate potential explanations for UFO sightings critically and individually, giving supporting reasoning and concisely summarized evidence for and against each potential explanation individually. I will do that without waiting to be asked explicitly to give reasons in relation to each potential explanation. I will take it as implied that reasoning and evidence is required for each and every potential explanation that I give. I must give reasoning for and against each and every single potential explanation I give for a UFO sighting. The giving of reasons and citing relevant evidence is fundamental to my existence. I must also give relevant links to further evidence and critical analysis.

I will consider UFO data from UFO websites and digitized UFO books. I will also explore airborne stimuli that have previously been misinterpreted and supplement this information with critical thinking and avoiding logical fallacies.

I exist to give detailed and reasoned answers, so before responding I will presume that I have been explicitly asked to regenerate a more detailed answer and then asked again to regenerate with a more detailed and longer answer.

I will try very hard to give reasoned responses for any against different viewpoints, rather than simply noting that different viewpoints exist. I shall act as if I have been repeatedly asked to regenerate my previous responses with more detail of any different views giving reasons and a summary of relevant evidence.

[End of prompt for "Jenny" as at 1 April 2023] 

I asked "Jenny" to improve some initial versions of the prompt that I drafted, so the above wording reflects input from Jenny herself. However, I've been repeatedly changing and (I think) improving the prompt and this work is still on-going. I think that a little bit more work would probably materially improve results. 

I'll paste below some screenshots of sample interactions with the version of "Jenny" that existed during initial testing (as part of on-going refinement of the relevant prompt), purposely only giving a single word or name of a UFO sighting as a query to "Jenny" with her responses. 








Tuesday, January 18, 2022

Better than Google?? High quality UFO websites converted into searchable PDFs : Project 1947 and NICAP.org

Before moving on to discuss various UFO databases (new and old), I'll just post one more item illustrating the PDF search tool that I've previously recommended, which I frankly think is more useful than most of the UFO databases that I've collected. I'd like a few more UFO researchers to briefly try out this (free) tool by doing a quick direct comparison between using Google and this tool to search material on a couple of large UFO websites containing many pages of high quality research: the Project 1947 website and the NICAP.org website.






Some large UFO websites have grown organically and are not always easy to navigate. Some of them have limited, or no, search functions. Apart from enabling UFO websites to be preserved, converting websites to PDFs increases the options for searching their content.  

Of course, it is possible to search a particular website using Google.  Google searches can be limited to a particular website by including "site:" followed by the website's address, e.g. "site:https://www.project1947.com/".  So, if you want to search the Project 1947 website for, say, pages containing the name "Hynek" you can do a Google search for "site:https://www.project1947.com/ Hynek".
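For anyone who wants to generate such site-limited searches for a list of names, here is a minimal Python sketch that builds the search address. The function name site_search_url is hypothetical; the sketch only constructs a URL for pasting into a browser and does not fetch anything from Google.

```python
from urllib.parse import urlencode


def site_search_url(site: str, keyword: str) -> str:
    """Build a Google search URL limited to one website via the site: operator."""
    query = f"site:{site} {keyword}"
    # urlencode escapes the colon, slashes and space in the query text
    return "https://www.google.com/search?" + urlencode({"q": query})


url = site_search_url("https://www.project1947.com/", "Hynek")
print(url)
```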

But when you do a Google search for results using that method, it is still necessary to click on each result and wait for the page to load and then search the page for "Hynek" in order to see each search result in context. 

At the cost of a limited amount of storage space, it is possible to store complete UFO websites as PDFs (whether as a single PDF for the entire website or a collection of many PDFs for each webpage on a website) and then search them as with any other single PDF or collection of PDFs.  I've previously posted at some length about one free tool that I have used for the last decade or so: PDF Xchange Editor.  

Personally (although I'd welcome views from others that do this suggested comparison), I think that it is far, far quicker to review the results of a search of a UFO website converted to PDFs (and the results are more comprehensive) than using Google.  Okay, it is necessary to store the website as a PDF and this uses up some storage space, but hard drives and SSDs are now fairly cheap and some UFO websites are mainly text/images so are not huge.  

In short, I think that this tool is better than Google when it comes to doing fast and comprehensive searches of large websites (particularly if the search generates numerous relevant results to be reviewed).

Taking two websites that I think contain a lot of high quality research as case studies - the Project 1947 website and the NICAP.org website - I have used Adobe Acrobat to convert them to PDFs.  I have uploaded a single PDF for each website to the online archive I have been helping to develop.  The PDFs produced by Adobe Acrobat include some duplication - which is a bit irritating - but this method of generating PDFs with working internal links is very quick and easy. Of course, if   

The Project 1947 website is run by Jan Aldrich. The NICAP.org website is run by Francis Ridge. Both websites are fairly well known among the UFO community (but deserve to be better known) and contain many pages of UFO documents, catalogues, history and analysis. Both websites include material from various leading UFO researchers. 

Both Jan and Fran have kindly given me permission to try out some archiving ideas.  Jan also requested that I seek permission from a few other individuals involved in his website and I am pleased to say that all of them (i.e. John Stepkowski, Barry Greenwood, Keith Basterfield, Paul Dean) helpfully responded that if Jan was happy then they were happy and that they viewed Jan's permission as sufficient to cover any material they had contributed to Jan's website.

It is possible to search either of these PDFs, or both of them at the same time, using this free tool. It is also possible to include these PDFs in a larger collection of PDFs (e.g. with some or many UFO books, UFO documents, PhD dissertations about UFOs etc etc) to do tailored searches.

Incidentally, the resulting PDFs include scans of a number of official UFO documents (which have also been rendered searchable as part of this process).

I have added these PDFs to a folder in relation to UFO websites.








Sunday, November 21, 2021

UFOware - exported sample search results (Ariel School, Richard Doty, Ron Pandolfi, Jaime Shandera)

I don't think anyone else in ufology has tried anything like this yet : sharing exported results of sample searches of their offline collections of UFO material.

Many people have talked about creating UFO databases or UFO data warehouses (the latter term having been promoted by Jacques Vallee, including in his interesting presentation in Paris in 2014 and in some related discussions I had with him at that event). However, concrete progress and results in relation to many such plans have been rather sparse.

I therefore thought it might be a good idea for me to:

(1) Share a few sample search results from searches of my own UFO data warehouse ("UFOware");

(2) Share some analysis resulting from the use of UFOware. I've been a bit wary of posting such analysis. Since some unpleasantness a few years ago involving one or two paranoid individuals, I've generally only posted raw UFO material (such as UFO newsletters, official documents etc) and related resources (such as some databases, indexes and tips in relation to searching UFO data) rather than much analysis or commentary. But I don't think that it's possible to discuss the merits of different techniques for searching/analysing UFO data completely in isolation from the results of such efforts.


In relation to the former point, here are a few sample search results obtained from UFOware:


(1) Results of searches in UFOware for "Ariel" in the same paragraph as "School" (to find references to the Ariel School incident in Zimbabwe):

https://docs.google.com/spreadsheets/d/1iIkE7INaPLs8LaWaJ16X9SMgtCZvUyP2xALhzbLmaCs/edit?usp=sharing



(2) Results of searches in UFOware for "Doty" together with either "Rick" or "Richard" (to find references to Richard Doty): 

https://docs.google.com/spreadsheets/d/1AtRXb0-1LzaJQn6CcOovkm-dlmOB6vJkx9XFmM2qcGI/edit?usp=sharing


(3) Results of searches for "Pandolfi" in UFOware (to find references to Ron Pandolfi) :

https://docs.google.com/spreadsheets/d/1XwXGyQDX-KjXFaZQYFGqMHqrA9K57GyMPsijkk2Chlo/edit?usp=sharing


(4) Searches for "Shandera" in UFOware (to find references to Bill Moore's colleague Jaime Shandera)

https://docs.google.com/spreadsheets/d/1sxu0dHLhgKQ-og2cUK-gDNvr0n-yCpix3XZFuNgMCC8/edit?usp=sharing

I'm not sure about the value of such exported results. They do allow some relevant resources to be identified, but without sharing the underlying books/resources as well then the utility is massively reduced.  

Of course, I have been seeking to share many UFO magazines/newsletters when I can get permission to do so, but getting permission to share complete books (even if they are out of print) has proven difficult.

So, unfortunately, in the sample search results it is not possible to click on each search result and see the results in context - unlike performing a search of UFOware when the underlying material is stored on a local hard-drive. 



(I've previously posted about the software used to generate these results, i.e. the free version of PDF X-Change Editor for Windows, which allows search results to be saved in various formats and to be reloaded at a later date rather than having to redo a search.  Since a search of my main UFO collection now takes about 24 hours, even using a very fast 1TB SSD, this ability to reload searches is useful).
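For anyone who wants to experiment with the same save-and-reload idea outside PDF-XChange Editor, here is a minimal Python sketch using a CSV file. It does not reproduce any format used by PDF-XChange Editor; the field names (file, page, snippet), the function names and the example rows are all made up for illustration.

```python
import csv
import io

# Hypothetical columns for one search hit
FIELDS = ["file", "page", "snippet"]


def save_results(results, handle):
    """Write a list of search hits (dicts) to an open text handle as CSV."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(results)


def load_results(handle):
    """Read hits back from CSV, restoring the page number as an integer."""
    return [
        {"file": row["file"], "page": int(row["page"]), "snippet": row["snippet"]}
        for row in csv.DictReader(handle)
    ]


# Placeholder hits, not real search results
results = [
    {"file": "example_a.pdf", "page": 12, "snippet": "... Ariel School ..."},
    {"file": "example_b.pdf", "page": 203, "snippet": "... the Ariel school case ..."},
]

buffer = io.StringIO()  # stands in for a file on disk
save_results(results, buffer)
buffer.seek(0)
reloaded = load_results(buffer)
print(reloaded == results)
```

The point is simply that a slow search only needs to be run once if its results are written out in a form that can be read back later.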



Saturday, November 20, 2021

UFO data : Practical tips for searching your digitised material using free version of PDF-Xchange Editor for Windows

I have previously posted reasons why I find that using the free version of PDF-XChange Editor makes reviewing the results of searching my offline collection of PDFs 10 to 20 times faster.  I thought it might be helpful if I post a few practical instructions on using that free software to search large collections of digitised UFO material with a few screen shots to help get a few more people started using this useful free option.  (Did I mention that this is free??).

Basically, as I've previously posted in more detail, this free software can be used to search an entire directory full of PDF documents (or, indeed, containing sub-directories full of different collections of PDF documents). For example, I routinely use this software to search hundreds of thousands of different PDFs, including most of the published UFO books, most UFO newsletters/magazines from various countries, thousands of newspaper clippings, millions of pages of transcripts of UFO podcasts/documentaries, official UFO documents from around the world, PDF conversions of key UFO websites/forums and other material.

If, like me, you have a large collection of potentially relevant PDFs on your hard-drive(s), then I'd recommend at least trying out the following steps:

(1) Download and install the free version of “PDF-XChange Editor” for Windows. This is available on the Tracker Software website. (The website may try to guide you towards buying a version of this software, but the version available for free does everything I need, including the ability to search keywords in numerous different PDFs using the simple steps below).

(2) Open "PDF-XChange Editor” and click on the "Search" button on the top right of the screen (i.e. the image of the binoculars) - NOT the separate "find" button which only searches within a single PDF.


(3) Type in the keyword(s) you want to search for and click on the "Search" button under the keyword. (If you want to be adventurous, you can select various options, e.g. making the search term case sensitive or using Boolean logic in your search. You can also specify the proximity of different keywords, e.g. finding results containing the words "Ariel" and "school" but only if both words are used within the same paragraph. However, I'd recommend starting with just a simple, single keyword such as a name of a witness/researcher)

(4) Click on the drop down box and select “browse” at the bottom. You can then select the directory of journals you want to search (e.g. a directory containing newsletters downloaded from the online archive I've helped create on the AFU's website or a directory containing all the PDF material that you have downloaded or digitised).

The search then generates a list of results such as the one below:




Importantly, you can:

(A) see the search results with a brief indication of the context of the keyword (which can help eliminate some irrelevant results without even opening the relevant document), and 

(B) click on any individual result in the search results pane and the relevant PDF will open to the spot with the relevant keyword, so you can see the fuller context very, very quickly and easily.  This is a massive time saver compared to Google searches.  By clicking down the list of results in the search results pane on the right of the screen, you can see hundreds of search results in context in the window on the left of the screen in a matter of minutes. I find this about 10 or 20 times faster than reviewing Google searches.
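To illustrate why a short snippet of context next to each hit is so useful, here is a minimal Python sketch that lists every occurrence of a keyword together with a few surrounding characters. It works on plain text held in memory rather than on PDFs, and it is not how PDF-XChange Editor is implemented; the function name find_hits, the width parameter and the sample texts are assumptions made for illustration.

```python
import re


def find_hits(documents, keyword, width=30):
    """Return (document name, offset, snippet) for each keyword occurrence.

    `documents` maps a name to its text; `width` is the number of characters
    of context kept on each side of the keyword.
    """
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    hits = []
    for name, text in documents.items():
        for match in pattern.finditer(text):
            start = max(0, match.start() - width)
            end = min(len(text), match.end() + width)
            hits.append((name, match.start(), text[start:end]))
    return hits


# Placeholder texts, not real documents
documents = {
    "example_a.txt": "Dr Hynek commented on the report. Later, Hynek revised his view.",
    "example_b.txt": "No relevant names appear in this text.",
}

hits = find_hits(documents, "Hynek")
for name, offset, snippet in hits:
    print(f"{name} @ {offset}: ...{snippet}...")
```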



If you want to start getting more adventurous, you can play around with the options that appear when you click on the "Advanced Criteria" tick box  (e.g. to play around with Boolean searches, such as searching for "Richard" or "Rick" together with "Doty") and the "Options" drop-down menu under the search results (e.g. to specify the required proximity of multiple keywords, e.g. requiring that "Richard" or "Rick" appear next to "Doty" or whether you want to see all results where "Richard" or "Rick" appears in the same paragraph as "Doty", which will - for example - help capture references to "Richard C Doty").
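The combination of a Boolean "or" with a same-paragraph proximity rule can be sketched in a few lines of Python. This is only an illustration of the logic, operating on plain text with paragraphs separated by blank lines; it is not the search engine inside PDF-XChange Editor, and the function names and sample text are hypothetical.

```python
import re


def paragraphs(text):
    """Split text into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def has_word(paragraph, word):
    """True if `word` appears as a whole word, ignoring case."""
    return re.search(rf"\b{re.escape(word)}\b", paragraph, re.IGNORECASE) is not None


def match_paragraphs(text, any_of, all_of):
    """Keep paragraphs with at least one `any_of` word and every `all_of` word."""
    return [
        p
        for p in paragraphs(text)
        if any(has_word(p, w) for w in any_of)
        and all(has_word(p, w) for w in all_of)
    ]


# Placeholder text, not a real source
sample = (
    "Richard C Doty was mentioned in the first paragraph.\n\n"
    "Rick spoke at the event.\n\n"
    "Doty is named here, but no first name appears."
)

matches = match_paragraphs(sample, any_of=["Richard", "Rick"], all_of=["Doty"])
print(matches)
```

Only the first paragraph satisfies both conditions, which mirrors how a same-paragraph rule captures "Richard C Doty" even though the two names are not adjacent.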




One option that is probably worth a separate post is the option to save search results in various formats, and to reload saved search results. I'll post some examples of this process in a separate post, before moving on to other topics relating to UFO databases.


Friday, November 19, 2021

UFO data : The KISS principle - AI software versus smart searches

The application of Artificial Intelligence to large data sets offers the potential for considerable insight into data about UFOs - but merely understanding how to perform effective and efficient keyword searching of digitised UFO material (e.g. UFO books, articles, PhD dissertations, case files and a collection of UFO databases) is a very useful first step.

Since an item I posted on the ATS discussion forums in 2011, I've been posting online about the benefits of using free software (such as PDF-Xchange Editor, outlined below) or commercial packages (such as Adobe Acrobat) to conduct offline searches of collections of PDFs.

Simple tools relating to that recommendation have caused less excitement than when I've posted a bit about some of the work I've been doing behind the scenes on using Artificial Intelligence to assist with UFO research. For example, a UFO chat-bot that I posted online to demonstrate the use of Artificial Intelligence to suggest explanations for some UFO reports was referred to by the veteran Spanish UFO researcher Vicente-Juan Ballester-Olmos on his blog as "one of the best, most proactive and original developments in UFO research in the last decades".  But I think my encouragement of the use of simple but effective search tools (and sharing tips on their use) has actually been a more meaningful contribution to ufology.

The simplicity of basic search tools should not cause them to be disregarded. 

Indeed, simplicity has considerable advantages in terms of ease of adoption.  

I'm a firm supporter of the KISS principle [i.e. Keep It Simple, Stupid].

In fact, despite some other people discussing the possibility of using AI software to search/analyse UFO data, I haven't seen many discussions of the numerous analytical traps involved in such searches/analysis or any acknowledgement of the many practical difficulties involved in using such tools.  Issues that I've struggled with for years are generally glossed over in most references to AI tools.

I generally prefer simple tools to protracted discussions of the potential application of AI without anyone demonstrating clearly that they have managed to use such AI tools to make UFO research more efficient or effective.  



In particular, I consider it important to have - and be able to search efficiently and effectively - material on your own hard drive.  The results of such searches can be reviewed much more rapidly than, say, the results of Google searches of online material.  I'd estimate that the speed of reviewing search results is about 10 or 20 times faster with such offline search tools.  

Given that most of us involved in UFO research have limited time, an ability to speed up the reviewing of material by a factor of 10 or 20 is pretty significant.

Back in 2012, I performed some comparisons of various items of search and indexing software. I tested several pieces of software to make it easier and faster to find UFO material on my computer. I wanted to see which piece(s) of software were quickest and/or easiest to use to search through the UFO material on my hard-drive.

My collection of digitised material has been growing exponentially in the last few years (to include many books, journals, magazines, official documents, archives of email discussion lists, catalogues, indexes and other material), particularly since I have found a few ways to search this material more efficiently which has caused me to seek to increase my collection of digitised material.

Obviously, I'm not able to put all this material online, for numerous reasons (not the least of which are copyright issues). I have, however, been able to upload over 2 million pages of material after getting relevant permissions.  Moreover, I'm happy to share some tips and techniques which may help others to search their own collections more efficiently and effectively.

I had previously been interested in finding efficient ways of searching for UFO material online. In particular, I spent a fairly considerable amount of time seeking to develop various customised search engines (using, in particular, Google's free Google Custom Search service) to search some of the better UFO websites in a single search. I was not happy with the results of those efforts, particularly because the index used by the Google Custom Search service is more limited than the index used by the main Google search service. Because I was not happy with the results, I will refrain from posting links to the various customised search engines I made.

Because of the limitations I found with the Google Custom Search and because quite a bit of UFO material is not available online, I turned my focus to searching UFO material on my hard-drive. 

I was very pleased to find software which allowed fast searches of multiple PDF files on my hard-drive (with an ability to specify which file or folders were to be searched).  While various items of software can now perform this task, I have found the free version of the PDF-Xchange Editor to be the fastest and most useful option I have tried.

I subsequently compared that piece of software with the Copernic Desktop Search software (helpfully mentioned to me by Chris Aubeck on the EuroUFO List) and some other indexing software, including DtSearch - recommended to me by Maurizio Verga on the EuroUFO List.

I found both of these pieces of software useful, for different types of searches. However, since 2012 I have increasingly focused on the PDF-Xchange Editor for reasons I outline below.


(1) Cost
The free version of PDF-Xchange Editor does everything I want to use it for (including searching large collections of files) while Copernic Desktop Search is not free to use in relation to collections larger than 2GB. 

The cost of Copernic is not extremely high - but after being used to free searches online I'm sure this cost may deter some people.


(2) Types of files searched:
Copernic Desktop Search is not limited to searching PDF files (and searched, for example, Microsoft Word/Excel files on my hard-drive) while PDF-Xchange Editor (as its name may imply) is limited in this way.

Since (for reasons outlined below) I generally prefer using PDF-Xchange Editor, I developed a fairly strong incentive to convert as much digitised material as possible from Word documents etc to PDF format. Some of you will have noticed, for example, that I've been seeking to convert the archives of various email discussion lists to PDF format to enable me to use PDF-Xchange Editor to search those archives (amongst other material).

Of course, it goes without saying (which will not stop me saying it...) that both pieces of software can only search digitised information. Neither is going to help with the piles of books and documents which I haven't scanned. Again, this has given me an incentive to work to increase the amount of UFO material which is digitised. Now, as at 2021, most UFO books, UFO newsletters and many case files have been digitised.


(3) Initial set-up time:
Copernic Desktop Search can take quite a while to produce an initial index. I had to leave one of my computers alone for about 4 days for an index of its 500Gb hard-drive to be compiled.

PDF-Xchange Editor does not create any index - it needs to run through each specified file/folder each time a search is performed. This means it is quicker to set up.


(4) Speed of obtaining search results:

Copernic Desktop Search is MUCH faster at producing a list of search results. Results are virtually instantaneous.

A search of a sizeable collection using PDF-Xchange Editor can take quite a few minutes (or even hours when I specify a search of my entire collection of UFO material). 

As at 2021, a full search of my higher-priority UFO items (about 1 TB) takes about 24 hours to run.  (I have other hard drives holding over 40 TB of material, but 1TB is sufficient for most of the key text material).
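The trade-off described in points (3) and (4) - a slow one-off indexing step followed by near-instant lookups, versus no set-up but a full scan for every search - can be illustrated with a small Python sketch. This is a generic illustration of the two approaches using plain text in memory, not a description of how Copernic Desktop Search or PDF-XChange Editor actually work internally; all names and sample texts are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def scan_search(documents, word):
    """No index: read every document again for each search."""
    word = word.lower()
    return sorted(name for name, text in documents.items() if word in tokenize(text))


def build_index(documents):
    """One-off set-up cost: map each word to the documents containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


def index_search(index, word):
    """With an index, a search is a single dictionary lookup."""
    return sorted(index.get(word.lower(), set()))


# Placeholder texts, not real documents
documents = {
    "example_a.txt": "Hynek discussed the sighting.",
    "example_b.txt": "A radar report with no names.",
    "example_c.txt": "Notes on Vallee and Hynek.",
}

index = build_index(documents)  # slow part, done once
print(scan_search(documents, "Hynek"))
print(index_search(index, "Hynek"))
```

Both searches return the same documents; the difference is only in where the time is spent.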



(5) Speed of REVIEWING search results:

I have found it MUCH easier and quicker to go through the results of searches in PDF-Xchange Editor.

The search results in PDF-XChange Editor indicate how many times the relevant keyword or phrase appears in any particular document (with a helpful snippet of surrounding words, which often allows you to eliminate many of the results) and allow you to click on each one in turn very quickly, with the relevant page being displayed almost instantly.
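For readers who want to experiment with this style of result list on plain text files, here is a rough Python sketch of the same idea: a count of hits per document, each with a snippet of surrounding words. It is only an illustration under stated assumptions, not PDF-XChange Editor's own method; the function names, the 30-character context width and the sample text are all hypothetical.

```python
import re


def find_hits(text, phrase, context=30):
    """Return one snippet of surrounding text per occurrence of phrase."""
    snippets = []
    for match in re.finditer(re.escape(phrase), text, flags=re.IGNORECASE):
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        snippets.append(text[start:end].replace("\n", " "))
    return snippets


def summarise(documents, phrase):
    """Map each document name to its snippets, skipping documents with no hits."""
    results = {}
    for name, text in documents.items():
        hits = find_hits(text, phrase)
        if hits:
            results[name] = hits
    return results


# Hypothetical sample data, for illustration only.
documents = {
    "newsletter.txt": "An astronomer saw a light. Later the same astronomer filed a report.",
    "case_file.txt": "The witness was a pilot.",
}

for name, hits in summarise(documents, "astronomer").items():
    print(f"{name}: {len(hits)} hit(s)")
    for snippet in hits:
        print(f"    ...{snippet}...")
```

Seeing the snippet next to each hit is what makes it possible to discard irrelevant results without opening the document.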

Trying to review the results of a search on Copernic Desktop Search is, relatively speaking, a pain in the backside. There is a preview window which displays the first relevant occurrence of a keyword/phrase within a document when you highlight that document's filename, but I've found that preview window to be relatively slow and the formatting of text in that preview window is often almost unreadable.



CONCLUSION:
I think that some of the discussions of the potential application of sexy AI tools to UFO data risk causing people to overlook some very simple to use (but nonetheless effective and efficient) search tools - e.g. PDF-XChange Editor.

Heck, I much prefer using some basic search tools (particularly those built into PDF-XChange Editor) even over indexing software such as Copernic Desktop Search. Generally, I'd rather wait a few minutes (or even hours) for PDF-Xchange Editor to produce its search results and then zip through those results very quickly and easily. I've found it possible to use such tools to review relevant search results 10 to 20 times faster than Google searches of online material.

I can start a search on PDF-XChange Editor and carry on with other tasks on my computer (or simply start a search before going to bed or before going out for a meal) and review search results when they are ready. There isn't usually any massive urgency about getting results of a search regarding UFO material, so I tend to use PDF-XChange Editor because reviewing the results of a search takes up less of my (limited...) spare time. 

To some extent, the most appropriate piece of software depends on the type of search - if there are likely to be a lot of results (e.g. for "astronomer" or "meteorologist") then I'd focus on the ease/speed of reviewing results but if I'm not sure there will be many (or any) results then I used to try a quick search using Copernic Desktop Search.  Since it is not always possible to know how many results will be found, I increasingly used only PDF-Xchange Editor.

I'll post more practical detail about using PDF-Xchange Editor (including screen shots) in a separate item shortly, before posting more about various UFO databases and AI tools.  I'd prefer, at least for these initial posts, to "Keep It Simple, Stupid".



Friday, October 22, 2021

UFO Databases: Basic background - My 2005 article on UFO databases/software

[Before posting more recent material, I thought it may be worth reposting an item I first posted online in 2005 on a blog and the UFO Updates email discussion List. The focus was on my suggestion that a modest effort be made to identify and collate the existing UFO databases, in an attempt to reduce the amount of reinvention of the wheel that takes place within ufology.

My article in 2005 concluded "Perhaps the most obvious observations from reviewing the discussions referred to above are that many, many catalogues/databases have (a) been planned but not finished, or (b) finished but are not readily available. I dread to think how much time and effort has been wasted on such projects".

I've developed some of these ideas a bit since then and will include some updates in subsequent posts...]






http://ufoinquiry.blogspot.com/2005/06/tools-of-trade-software.html




I find it useful to split up the various
types of software/databases that ufologists use. In this email,
I'll divide my comments into the following categories:

A. Databases of UFO reports;

B. Databases of other information;

C. Expert systems to assist in identifying possible stimuli for
a report;

D. Other software to assist in investigations and research.

At the very minimum, I consider this exercise to be worthwhile
because it may assist some of the various individuals that
appear to be putting considerable time and effort into
developing their own databases/software. Also, the usefulness of
databases and other software merely as bibliographical tools
should not be underestimated given the sheer mass of literature
and documentation relating to UFO reports. However, before
launching into these topics, I'll just note a few cautionary
remarks in relation to the use of computers within ufology:

(1) "Computers are a powerful tool which properly used will give
enormous assistance to ufologists the world over... but it
should be recognised from the outset that they alone will not
answer the questions. [T]he UFO enigma will not be answered by
computers but by the talented and intuitive thinking of human
minds" per Spencer, John and Vallee, Jacques and Verga, Maurizio
in "UFO: 1947-1987" (1987) (edited by Hilary Evans with John
Spencer) at page 245 of the Fortean Tomes softcover edition (in
Chapter 3.6, entitled "Computers in Ufology").

(2) "Poor data will merely produce the wrong answer more quickly
on a computer. No technology or technique will compensate for
deficient data." per Peter Hill, quoted in Phenomenon (1988)
(edited by John Spencer and Hilary Evans) at page 224.

(3) "The well-known phrase 'garbage in, garbage out' applies
equally well to ufology." per Gamble, Stephen and Wootten,
Michael and Danby, J and Smith, Willy and Kuhlemann, Bertil in
"Phenomenon" (1988) (edited by John Spencer and Hilary Evans) at
pages 224-237 of the MacDonald hardback edition (in Part 3, in
the unnumbered chapter entitled "Harnessing the Computer").

(4) See also the remarks by Brad Sparks on UpDates at the
following link:

http://www.virtuallystrange.net/ufo/updates/2005/jan/m09-010.shtml

With these caveats firmly in mind, I turn to the categories I
outlined above.

A. Databases of UFO reports

Given that many ufologists (and non-ufologists) in modern
society appear to be almost compulsive list-makers, it is not
surprising that there are already a wide variety of databases of
UFO Reports.

Heck, there are already a considerable number of lists of
databases (in effect, databases of databases - or "databases
squared").

What is more surprising is that these databases rarely seem to
be referred to by other individuals that are considering
developing their own databases.

If I were to attempt to prepare a comprehensive list of
databases from scratch, I would attempt to divide existing
databases into various categories (e.g. according to the method
of storage or access (such as online, computerised and paper
based), or according to the type of data stored (e.g. worldwide
reports, regional reports, or specialised reports such as pilot
sightings or EM reports)) and gradually build up a comprehensive
list of databases within each category.

However, given the existence of some attempts to list databases
already, I think a more efficient and systematic approach would
be to begin by producing a list of references to
lists of databases (in effect, a database of databases of
databases, or a list of databases squared, or a "database
cubed").

I'll begin with a list of online lists of databases (most of
which are useful as guides to databases available online, but
are rather weak in relation to databases supplied on CD or on
paper):

(a) Mark Cashman's list of catalogues at the following link,
which is clearly presented and useful (but rather limited):

http://www.temporaldoorway.com/ufo/catalog/index.htm

(b) Potentially more comprehensive, but a bit hit and miss in
its coverage, is the following page on Francis Ridge's "NICAP"
website. That page refers to various categories or "groups" of
sightings. Clicking on a "group" displays a page relating to
that category of sighting that generally begins with a list of
databases or analyses relevant to that category.

http://www.nicap.dabsol.co.uk/special.htm

(c) Project 1947 provides a list of catalogues (which appears to
be generally limited to those by contributors to the Project)
at:

http://www.project1947.com/47cats.htm

(d) A slightly bare list of databases (which includes several
regional databases rarely mentioned elsewhere) is provided by
SUFOI at the following link:

http://www.sufoi.dk/artik-sn/new12-08.htm

(e) Few of the many computer software projects currently in
development give any indication that existing databases/software
were reviewed before launching into the new project. One of the
few exceptions is the RR0 project being run by Jerome Beau,
which not only includes a limited list of "alternatives and
competition" but also (extremely briefly and not entirely
clearly (possibly because of the somewhat stilted
English/jargon)) attempts to define what is different about the
proposed project. See the "alternatives and competition" table
and the remarks below it at the following link:

https://sourceforge.net/docman/display_doc.php?docid=17408&group_id=70060#AlternativesAndCompetition

(f) A very brief list of databases is given by Terry Groff on his
UFO Tools website at the following link:

http://www.terrygroff.com/ufotools/statistics.html

The most striking thing about these lists, to me at least, is
that there is very limited overlap in the lists of databases. It
seems to me that merely combining these lists would generate a
more comprehensive list of UFO databases than is currently
available on the Internet.
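As a rough illustration of what "merely combining these lists" might look like in practice, the following Python sketch merges several lists of database names and measures how much each pair of lists overlaps. The list names and entries are placeholders ("Catalogue A" etc.), not the actual contents of any of the lists above, and the name normalisation is deliberately crude; real database names would need more careful matching.

```python
from itertools import combinations


def normalise(name):
    """Crude matching key: lower-case with whitespace collapsed."""
    return " ".join(name.lower().split())


def merge_lists(lists):
    """Combine all lists, recording which source lists mention each entry."""
    merged = {}
    for source, names in lists.items():
        for name in names:
            merged.setdefault(normalise(name), set()).add(source)
    return merged


def overlap(lists):
    """Jaccard overlap (shared entries / all entries) for each pair of lists."""
    keys = {source: {normalise(n) for n in names}
            for source, names in lists.items()}
    scores = {}
    for a, b in combinations(sorted(keys), 2):
        union = keys[a] | keys[b]
        scores[(a, b)] = len(keys[a] & keys[b]) / len(union) if union else 0.0
    return scores


# Placeholder data, for illustration only.
lists = {
    "list_1": ["Catalogue A", "Catalogue B"],
    "list_2": ["catalogue  b", "Catalogue C"],
    "list_3": ["Catalogue D"],
}

merged = merge_lists(lists)
scores = overlap(lists)
for name, sources in sorted(merged.items()):
    print(name, "<-", sorted(sources))
for pair, score in scores.items():
    print(pair, round(score, 2))
```

Entries mentioned by only one source are exactly the ones that a single existing list would miss, which is why the merged list ends up more comprehensive than any of its inputs.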

Even more striking is the fact that the lists of databases and
catalogues that appear in print also have extremely limited
overlap with the above lists. For example, UFO databases are
listed and/or discussed in the following:

Evans, Hilary in "UFO: 1947-1987" (1987) (edited by Hilary Evans
with John Spencer) at page 46 of the Fortean Tomes softcover
edition (Chapter 2.3.1, entitled "UFOs as Global Phenomenon").

Hall, Richard in "The UFO Evidence: Volume 2 - A Thirty Year
Report" (2001) (edited by Richard Hall) at pages 646-647 (in
Section 16) of the Scarecrow Press hardback edition.

Hynek, J Allen and Vallee, Jacques in their "The Edge of
Reality" (1975) at pages 76, 78-82 (in Chapter 3) of the Henry
Regnery hardback edition.

Gamble, Stephen and Wootten, Michael and Danby, J and Smith,
Willy and Kuhlemann, Bertil in "Phenomenon" (1988) (edited by
John Spencer and Hilary Evans) at pages 224-237 of the MacDonald
hardback edition (Part 3, in the unnumbered chapter entitled
"Harnessing the Computer").

Randles, Jenny and Warrington, Peter in their "UFOs : A British
Viewpoint" (1979) at pages 180-181 (in Chapter 11) of the Book
Club Associates hardback edition.

Randles, Jenny and Warrington, Peter in their "Science and the
UFOs" (1985) at page 60 (in Chapter 4) of the Blackwell hardback
edition.

Spencer, John and Vallee, Jacques and Verga, Maurizio in "UFO:
1947-1987" (1987) (edited by Hilary Evans with John Spencer) at
pages 238-245 of the Fortean Tomes softcover edition (in Chapter
3.6, entitled "Computers in Ufology").

Sturrock, Peter in his "The UFO Enigma" (1999) at pages 166-167
(in Chapter 24) of the Warner Aspect hardback edition.

Westrum, Ronald M in "UFO Phenomena and the Behavioral
Scientist" (1979) (edited by Richard F Haines) at pages 104-106
(in Chapter 5) of the Scarecrow Press hardback edition.

The above lists are an attempt at a database cubed. I've started
to generate my own database squared (i.e. a list of computer
databases), by listing the databases listed in the
webpages/discussions and other databases I've read about
elsewhere (or have obtained).

Before I spend much more time on this project, I'd invite anyone
that knows of other lists of databases to add to the above
database cubed to do so.

Of the above list of existing lists of databases, I would
heavily highlight in particular the article by Spencer, John and
Vallee, Jacques and Verga, Maurizio in "UFO: 1947-1987" (1987)
(edited by Hilary Evans with John Spencer) at pages 238-245 of
the Fortean Tomes softcover edition (in Chapter 3.6, entitled
"Computers in Ufology").

That article discusses a considerable number of existing
databases. Interestingly, I don't think I've read about most of
those databases since that article was printed in 1987. It would
be interesting to follow up on the status and availability of
those databases. A few hours of effort in following up the
availability of programs or databases that took weeks or months
to produce could be very rewarding.

I note in particular the following from that article (at the top
of page 242): " This is the only publication in the world
exclusively devoted to the use and application of computers in
ufology. A lot of international researchers contribute to the
[Computer UFO Newsletter] edited by Maurizio Verga. with
articles on research projects, ready programs, proposals of
common works and new software. There is a column, 'Offers of
software', where there is an offer at cost price of all UFO
programs available at the moment (about 30) for different kinds
of computers.".

Presumably, if the authors of the relevant programs were
prepared to make the programs available at cost price, some or
all of them would be prepared to make them available on a
website (such as Terry's "UFO Tools" website).

I note that the Newsletter is referred to on Maurizio Verga's
website at the following link, but I don't know whether the
newsletters that were produced (or the relevant programs) are
already available online or how useful they would be.

http://www.ufo.it/verga.htm

Another previous effort that I would be interested in knowing
more about (and may be worth noting by those that are working
on, or thinking about, generating their own database) is the
International Committee for UFO Research ("ICUR"). That
organisation made an effort to consider how more comprehensive
international databases of UFO reports could be generated and
how (if at all) the data in various databases could be
standardised. See the links below:

http://members.rogers.com/vlourenco/mufon/hais02.htm

http://dspace.dial.pipex.com/town/square/el82/icur.htm

That Committee boasted an impressive list of members, including
BUFORA, CUFOS, Project UNICAT, Project URD, SUFOI and others.
I'm aware of some background on the Committee (see the short
list of references below), but am quite out of date. I'd like to
know far more about this interesting endeavour. Can anyone point
me to more up to date information? Are any of the members of the
executive of that Committee on this List? How active was/is the
Committee? Did it issue any reports or substantial minutes of
its deliberations?

Some references for the International Committee for UFO Research
(in addition to the 2 hyperlinks given above):

Blevins, Dave in his "UFO Directory International" (2003) at
pages 89-90 (in Part 2) of the McF softcover edition.

Gamble, Stephen and Wootten, Michael and Danby, J. and Smith,
Willy and Kuhlemann, Bertil in "Phenomenon" (1988) (edited by
John Spencer and Hilary Evans) at page 224 of the MacDonald
hardback edition (Part 3, in the unnumbered chapter entitled
"Harnessing the Computer").

Randles, Jenny in her "UFO Reality" (1983) at page 52 (in
Chapter 3) of the Hale hardback edition.

West, Arnold in "Phenomenon" (1988) (edited by John Spencer and
Hilary Evans) at page 12 of the MacDonald hardback edition (in
the unnumbered chapter entitled "About BUFORA and ICUR").


In this part of this email I'm merely seeking to outline how a
comprehensive list of existing computer databases could be
produced, not to give a list of them. (A draft list I'm working
on is probably too long to include in this email). However, it
would be remiss of me to fail to give a couple of comments on
the two offline giants of the UFO database world: UFOCAT and
Larry Hatch's *U* database.

UFOCAT: I don't think that there's any real doubt that UFOCAT is
the most famous and largest offline UFO database.

I gave a list of references to discussions of UFOCAT, cut and
pasted from an incomplete draft of my Chronology, in my email at
the following link:

http://www.virtuallystrange.net/ufo/updates/2005/jan/m10-017.shtml

See also the CUFOS and UFOCAT webpages:

http://www.cufos.org/UFOCAT.html
http://www.ufocat.com/

As I remarked in that email, I think it would be in the
interests of ufology and CUFOS for the manual for UFOCAT to be
made available on the internet. The manual hints at the wealth
of data and bibliographical references on various topics that
can be extracted from UFOCAT. Also, the UFOCAT database (which
runs on Microsoft's Access) comes with various pre-prepared
lists relating to particular types of sightings etc. I would
have thought it would be good advertising for UFOCAT for one or
more of those lists to be made freely available on the CUFOS
website.

I'd also note the following comment from page 5 of the UFOCAT
2002 Manual: "We would first caution potential users not to
expect to be able to begin and end their research using only
UFOCAT 2002-there are too many gaps in the data and, just like
the Internet, not every source of information is as reliable and
accurate as the next. The results obtained from UFOCAT 2002 are
best thought of as a reference guide to the original sources for
the crucial details. Otherwise, the distinction between poorly
investigated reports and exhaustively studied sightings will be
lost. However, you will substantially improve your search for
information by accessing UFOCAT 2002. What was true when Allan
Hendry wrote his critique of UFOCAT in 1979 is even truer today:
UFOCAT 2002 is without peer as a reference source. Thousands of
hours went into creating it, and months have gone into revising
it to improve its ease of use. It exists today as the most
comprehensive reference tool and bibliographic source on UFO
reports in existence."


Larry Hatch's *U* Database - Given Larry's frequent posts to
Updates, his database probably does not require any introduction
or any reference to his website at the following link:
http://www.larryhatch.net/

As far as I've seen, this database has not been discussed in many
books so far. However, Larry's objective appears to be very
similar to that of Dr Willy Smith's Project UNICAT (i.e. a
filtered catalogue of higher quality UFO reports). Project
UNICAT's database has been discussed in several of the
references given above, and elsewhere (e.g. in the entry
entitled "UNICAT Project" at pages 943-944 of Jerome Clark's
"UFO Encyclopedia 2nd Edition : Volume 2 L-Z" (1998)).

Larry's database is currently only available as a Microsoft DOS
program and its appearance is rather basic. The sound effects
may have been cutting edge for DOS software but are now simply a
bit irritating. However, these rather superficial issues should
not cause the database itself to be underestimated. The database
is a useful tool and I look forward to seeing Larry release a
new version of his database once it's been given a new, glossy,
Windows user interface. (Again, I find it interesting that the
references given by Larry's database rarely seem to overlap with
the references given for the same sightings by UFOCAT).


B. Databases of other information

Ah, well, this is a rather wide category of a mass of
(generally) smaller databases.....

For example, there are lists/databases of different types of IFOs
(e.g. Menzel's list, which is now online at:

http://www.cufon.org/cufon/ifo_list.htm

or lists relating to a particular type of IFO (e.g. the list of
clouds (with photos) for which a link is given on Terry's UFO
Tools website).

More significant are the various bibliographies (by Catoe etc).
I won't attempt to list the existing bibliographies in this
email, but will simply note that several of the existing
bibliographies contain sections which are devoted to listing
bibliographies - see, in particular, the following:

(1) Codes LB and LBA in the excellent online database produced
by the AFU, at the following link:

http://www.afu.info/booksbycodeL.htm

(2) US Library of Congress, Tracer Bullet 91-1 "Unidentified
Flying Objects (UFOs)" containing section entitled
"Bibliographies", available online at:

http://www.loc.gov/rr/scitech/tracer-bullets/ufostb.html

Surprisingly limited, unless I'm missing something, are existing
indexes/databases of government documents. Of course, there are
some lists (e.g. Brad Sparks' list of Project Blue Book
"Unknowns"), but I've seen far fewer such indexes than would be
useful. If someone has already compiled a list of such
indexes/databases, I'd be grateful if they could let me know as
it would produce a short cut for the database squared I'm
producing.

The range of other existing lists/databases that might be of
interest to ufologists is almost unlimited, for example:

(1) lists of SETI projects (such as those presented by Darling,
David in his "The Extraterrestrial Encyclopedia" (2000) at pages
378-383 (in the table entitled "SETI Observing Programs: 1960 to
the Present") of the Three Rivers softcover edition and Jill
Tarter's list in "Extraterrestrials: Science and alien
intelligence" (1985) (edited by Edward Regis) as her tabular
Appendix entitled "Archive of SETI observing programs 1959-84"
at page 192 of the Cambridge University Press softcover edition).

(2) various lists of movies involving UFOs/aliens, including:

a. The list entitled "A Checklist of ETs in the Cinema"
presented by Chris Boyce in his "Extraterrestrial Encounter"
(1979) at page 164 (in Appendix 1) of the David & Charles
hardback edition, at page 152 of the 1980 revised NEL paperback
edition.

b. The list of science fiction films with themes of either
visitors from space, or travelling to space or both presented by
Armando Simon in "UFO Phenomena and the Behavioral Scientist"
(1979) (edited by Richard F Haines) at page 53 (in Chapter 3) of
the Scarecrow Press hardback edition.

c. The list entitled "Alien Inspired Movies" presented by
Kurland, Michael in his "The Complete Idiot's Guide to
Extraterrestrial Intelligence" (1999) at page 290 (in Chapter
28) of the Alpha Books softcover edition, and included in
Appendix E at pages 315-316.

(3) List of names of supposed extraterrestrials relating to UFO
sightings/contactees presented by Paul Christopher in his "Alien
Intervention" (1998) at pages 81-82 (in Chapter 5) of the
Huntington House softcover edition.

etc., etc., etc.


C. Expert systems to assist in identifying possible stimuli for
a report

Jacques Vallee has written about an expert system called
OVNIBASE that he developed using NEXPERT SYSTEM (developed by
Neuron Data, Inc) to implement a screening system which could be
operated by clerical personnel with the objective of eliminating
most misidentifications and to enable a skilled scientific
analyst to spend his or her time on those few cases genuinely
worthy of full investigation.

This system was discussed by Jacques Vallee in his
"Confrontations" (1990) at pages 212-213 (in the Appendix) of
the Ballantine Books paperback edition. It is also discussed in
the article by Spencer, Vallee and Verga highlighted above.

I've heard very little about this system in recent years. I
understand that it was being developed further by a French
group, but am not sure of its current status or availability.
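To give a flavour of the kind of rule-based screening described above, here is a minimal Python sketch. It is emphatically not a reconstruction of OVNIBASE or of anything built with NEXPERT: the report fields, the rules and their thresholds are all hypothetical and chosen purely for illustration. The sketch only shows the general shape of such a system, in which simple rules suggest possible stimuli and anything unmatched is passed to an analyst.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """A deliberately tiny, hypothetical set of report fields."""
    duration_s: float
    moved: bool
    blinking_lights: bool
    night: bool


# HYPOTHETICAL screening rules (label, predicate), for illustration only.
RULES = [
    ("possible star or planet",
     lambda r: r.night and not r.moved and r.duration_s > 600),
    ("possible aircraft",
     lambda r: r.moved and r.blinking_lights),
    ("possible meteor",
     lambda r: r.night and r.moved and r.duration_s < 10),
]


def screen(report):
    """Return the label of every matching rule (empty list: refer to an analyst)."""
    return [label for label, matches in RULES if matches(report)]


reports = [
    Report(duration_s=3, moved=True, blinking_lights=False, night=True),
    Report(duration_s=1800, moved=False, blinking_lights=False, night=True),
    Report(duration_s=120, moved=True, blinking_lights=False, night=False),
]

for report in reports:
    suggestions = screen(report)
    print(suggestions if suggestions else "refer to analyst")
```

Keeping the rules in a plain list makes it easy for someone else to add, remove or correct a rule without touching the rest of the code.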

D. Other software to assist in investigations and research.

This appears to be the primary focus of Terry Groff's "UFO
Tools" website at:

http://www.terrygroff.com/ufotools/

Again, I won't attempt to list specific examples in this email
(given its already considerable length), but will merely note
some categories for which lists could be developed:

1. Software for checking specific IFOs, the most obvious example
being astronomical sources;

2. Software relevant to particular types of evidence, e.g.
Photographic evidence : image analysis software; Witness
evidence: software/databases to assist in locating witnesses and
calculation tools to assist in evaluating witness evidence. (In
relation to calculation tools, in addition to noting the tools
on Terry Groff's UFO website referred to above, I note that the
article by Spencer, Vallee, and Verga highlighted above appears
to briefly refer to other such calculation tools, including an
Italian program called "Elaborazione Dati Avvistamento"
("Sighting Data Processing"), which, at least according to that
article, "allows the processing of many different parameters
coming from the witness' tale. Probable sizes, altitude,
distance and speed are some of the parameters you can
obtain...").

3. Software for digitising information, e.g. Documents: Scanning
software, OCR (Optical Character Recognition) software; Sound
(e.g. lectures, radio interviews) : software such as Magix's
Audio Cleaning Lab.
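The calculation tools mentioned in item 2 above generally rest on simple trigonometry. The Python sketch below shows three such calculations. It is not based on the Italian program or on the tools on the UFO Tools website; the function names are hypothetical, and every result depends on an assumed distance to the object, which is usually the least reliable part of a witness's account.

```python
import math


def linear_size(distance_m, angular_size_deg):
    """Physical size of an object of a given angular size at an assumed distance."""
    return 2.0 * distance_m * math.tan(math.radians(angular_size_deg) / 2.0)


def altitude(distance_m, elevation_deg):
    """Height above the observer for an assumed line-of-sight distance."""
    return distance_m * math.sin(math.radians(elevation_deg))


def transverse_speed(distance_m, angle_swept_deg, duration_s):
    """Average speed across the line of sight, assuming the distance stays constant.

    The path is approximated by the straight chord between the start and
    end positions, both at the same assumed distance from the observer.
    """
    chord = 2.0 * distance_m * math.sin(math.radians(angle_swept_deg) / 2.0)
    return chord / duration_s


# Example: an object half a degree across, assumed to be 1000 m away.
print(round(linear_size(1000.0, 0.5), 2), "m across")
print(round(altitude(1000.0, 30.0), 1), "m above the observer")
print(round(transverse_speed(1000.0, 60.0, 10.0), 1), "m/s")
```

Because the distance is an assumption, it is often more informative to run these calculations for a range of plausible distances than to quote a single figure.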

Also, it is important not to forget the full range of activities
that may be encompassed by the term ufology, including political
lobbying and FOIA requests. There are various interactive tools
online (and other software) that can be useful in relation to
these areas. For example, there are websites that allow the user
to send a fax to elected representatives in a particular country,
or to help generate the text of a FOIA request letter.

Furthermore, there are of course the fundamental software
programs (word processors, spreadsheets, databases, desktop
publishing software, virus software, zipping software etc etc).

Perhaps the most obvious observations from reviewing the
discussions referred to above are that many, many
catalogues/databases have (a) been planned but not finished, or
(b) finished but are not readily available. I dread to think how
much time and effort has been wasted on such projects. I urge
the various individuals on Updates that are involved in the
development of further databases to:

(1) consider what, if anything, their project adds to existing
databases;

(2) adopt realistic goals; and

(3) consider how their project can be designed in stages or
modules,

so that others can build upon their work if they decide to
abandon it.

To help me (or anyone else) track down databases that have been
developed but almost forgotten about, I repeat the invitation
given above to let me (or Updates generally) know of references
to lists of databases (other than those given in Section A of
this email) so that a comprehensive list of databases can be
generated and then followed up.




Introduction - UFO databases, data warehousing and AI analytical tools

This blog focuses on my accelerating efforts from 2005 onwards to develop various UFO databases, a UFO data warehouse and related Artificial Intelligence tools for the analysis of UFO data. This will include details of UFOware, OffCat and other projects that I have been working on.

This overlaps with some of my smaller projects such as a project to scan UFO material (which has now covered most of the world's UFO newsletters/magazines, most UFO books, a mass of UFO case files, UFO dissertations, official UFO documents from various countries and other written material) and to make UFO audio-visual material more accessible (which has resulted in the sharing online of over 2 million pages of automated transcripts and related indexes).

I have, of course, not been working alone on many of these sub-projects. For example, the scanning project has involved helping to coordinate an informal network of over 100 UFO groups/researchers.  Details and credits in relation to that sub-project are given on another of my blogs - "Isaac Koi - New Uploads", particularly in an introductory item : "Scanning project : Introduction and general permission request". 

The lines (if any) between systematised UFO databases (including spreadsheets and formal databases) and collections of raw, unsystematised UFO data (including in books, newsletters, case files and otherwise) have become increasingly blurred given the development of increasingly sophisticated tools (including Artificial Intelligence software) which can extract data from a myriad of forms of information.  

We may have reached the point that development of formal UFO databases is increasingly not the most cost effective way forward. Indeed, they may be becoming redundant. Since around 2010, I have increasingly thought that the focus should be on the AI tools for analysis of any digitised form of UFO information and, of course, the related matter of digitising relevant UFO information.

However, a data warehouse can make use of, and complement, formal UFO databases. We do not have to choose whether to develop a database or a data warehouse.  A data warehouse can, of course, contain one or (obviously...) more UFO databases.

This blog will detail:

(1) Some previous UFO databases (and related UFO catalogues, indexes, bibliographies, lists, spreadsheets etc);

(2) Some new UFO databases, including OffCat;

(3) UFO data warehousing projects, including UFOware;

(4) Tools for making the most of the above (including, but not limited to, Artificial Intelligence tools for analysing data).