In the spring of 2015, I was lucky enough to receive an NSF CAREER award for a project entitled "Automated scientific discovery and the philosophical problem of natural kinds." The aim of this project is to develop a new approach to automated scientific discovery based on the theory of natural kinds -- in the sense of projectible kinds -- that I've been elaborating for a while (see this paper). More specifically, the aim over the next five years is to produce algorithms that sort dynamic causal systems into natural kinds, as well as algorithms that construct novel variables useful for finding law-like causal relations and additional kinds. These algorithms are intended to be pitted directly against the real world; from the outset they are being developed to communicate with physical systems via sensors and actuators rather than being handed data that has been preprocessed by a human.
Since the grant is a CAREER award, it funds extensive education and outreach components as well. I am excited to be offering a two-week graduate summer school in "Philosophy & Physical Computing" in July of 2016. I will also be putting on a two-day "Robot Scientist" event for middle school students that will be hosted at the Science Museum of Western Virginia.
My group of student researchers and I have already gotten some promising prototypes of the classifier algorithm -- an algorithm that finds kinds -- to work. And I've given the project a new name. I've begun calling the entire collection of automated discovery algorithms under development "EUGENE", largely in honor of Eugene Wigner, whose ideas were influential in shaping the theory of natural kinds being implemented (hence the title of this post).
In the next few posts, I'll explain the basic algorithm for kind discovery and why one might expect it to uncover useful categories. For now, in order to give a little more of an overview of the project, I'll provide the summary from my grant proposal:
CAREER: Automated scientific discovery and the philosophical problem of natural kinds
In the course of everyday research, scientists are confronted with a recurring problem: out of all the empirical quantities related to some phenomenon of interest, to which should we pay attention if we are to successfully discover the regularities or laws behind the phenomenon? For most ways of carving up the observable world with a choice of theoretical variables, no tractable patterns present themselves. It is only a special few that are 'projectible', that allow us to accurately generalize from a few particular facts to a great many not in evidence. And yet in the course of their work, scientists efficiently choose variables that support generalization. This presents a puzzle, the epistemic version of the philosophical problem of 'natural kinds': how we can know in advance which choices of variables are projectible. This project will clarify and test a new approach to solving this puzzle -- the Dynamical Kinds Theory (DKT) of natural kinds -- by constructing a series of computer algorithms that automatically carry out a process of variable choice in the service of autonomous scientific discovery. The inductive success of these algorithms when applied to genuine problems in current scientific settings will serve as tangible validation of the philosophical theory.
This project connects the philosophical problem of natural kinds with computational problems of automated discovery in artificial intelligence. It tests the DKT by deriving discovery algorithms from that theory's normative content, and then applying these algorithms to real-world phenomena. If the algorithms succeed, this would imply that the DKT captures at least an important subclass of the projectible kinds. More dramatically, these discovery algorithms have the potential to produce more than one equally effective but inconsistent classification of phenomena into kinds. The existence of such alternatives plays a central role in debates over scientific realism.
The automated discovery algorithms produced will be leveraged to introduce a generation of graduate students in philosophy and science to the deep connections between physical computing and philosophical epistemology. A recurring summer school will train graduate students in basic programming and formal epistemology, with hands-on development of automated discovery systems. Each summer school will culminate in a two-day outreach event at which the graduate students will assist a diverse group of area secondary school children in building their own 'robot scientist'. Students and teachers completing the summer school or outreach programs will leave with their own mini-computers configured for developing their own approaches to discovery. Outside of philosophy, the application of the discovery algorithms to open problems in areas of ecology, evolution, metagenomics, metabolomics, and systems biology has the potential to suggest previously unconceived theories of the fundamental ontology in these fields. In particular, the algorithms will be applied to agent-based models of evolutionary dynamics to search for population-level laws, and to publicly available long-term ecological data to search for stable dynamical kinds outside the standard set of ecological categories.