Episode 16 - John Beverly - Ontology in the Real World, via Virus Infectious Disease Ontology


Sahana: Hi everyone! Welcome to In Limbo Conversations! Today, we have with us John Beverly, a doctoral research fellow at Northwestern University. His areas of research include logic, social epistemology, ethics and applied ontology. For more details about his work, please check here. So, in this conversation, we're going to be talking to John about something that's really exciting to me: ontology, its applications and its current situation. I think his work is one of the best examples of how metaphysics at large, and ontology more specifically, can contribute in public crisis situations like the current one. Thank you so much for joining me, John!


John: Oh thank you! I'm very happy to be here.


Sahana: So, as background for our audience, I thought we could begin by talking about the motivation for, and the necessity of, an infectious disease ontology. In the preprint titled "The Infectious Disease Ontology in the Age of Covid-19", which you worked on with many others- I think Shane Babcock, Lindsay Cowell and Barry Smith- you mentioned that when we are dealing with a public health emergency like Covid-19, there is a need to share data across many disciplines and data systems, that such sharing often gets hindered by various issues, and that these issues could be adequately addressed by ontology. So, could you tell us, in the context of Covid-19 specifically, what kind of data needs to be shared across disciplines and data systems, and what issues you have noticed in this process of sharing? And also, if you had to put it simply, how could we define ontology, and how can ontology address these issues?


John: These are great questions absolutely! So, I'll start with the last one and work my way back, because I think the last one might be easier to address surprisingly.

So, in traditional analytic metaphysics, ontology is approached as providing, or thinking about, a recipe list of the ingredients that you would put into the world- like, what is there- and that's what ontology is, at least with a lowercase 'o'. In Applied Ontology, we're engaged in a similar project, but instead of thinking about things like tropes or universals, we're thinking about things like viruses and medicine or details of Mount Madison. What we do is bring the methodologies from traditional metaphysics, like conceptual analysis, into the scientific disciplines. We help organize information that researchers themselves, while good at organizing their own area of expertise, maybe aren't paying very close attention to- the fine-grained details- and so they miss some things. This is important because what you don't want to happen is somebody working in geography and somebody working in virology trying to coordinate data that they've been acquiring about the spread of an infection, for epidemiology or something, but nevertheless using the word 'river' two different ways, or using the word 'virus' two different ways, right? That very likely happens when people aren't talking to each other or using a shared vocabulary. In fact, this was, in the 80s, the motivation for the field of Applied Ontology. It turns out that a lot of researchers were gathering a lot of data because we had new methods for acquiring things, and we could put them in spreadsheets and databases and stuff- but everybody was inventing their own languages and they weren't talking to each other.
Now, for the most part, they were using similar enough languages across these databases to at least approximate some kind of communication. But the problem isn't really just about getting enough of a shared vocabulary, because there's so much data that we usually end up using computers to help, and computers are dumb, in the sense that you have to be very careful about what you say. One little misstep in how you translate a term over here, and when you use an automated computing system to do the translation over there, it's just going to miss that entirely. So ontologies are structured vocabularies- logically well-structured vocabularies- and all they do is come in and make sure that, across a wide range of disciplines, people are using the same language.


Sahana: Okay so it looks like a taxonomy in that way.


John: Right, yeah, absolutely! It's a taxonomy like you might be exposed to in a biology class, except- and here's one of the benefits of it- we include a lot of formal relations that have logical properties among parts of the hierarchy. So, for instance, instead of just saying “This is a..is like this” or “This is a subclass of this”, which is what you would do in a normal taxonomy, in addition we say things like “..and these things have parts”. So parthood is a relationship that relates things in the taxonomy to other things, in other parts of the taxonomy, in different ways.
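
The difference between a plain taxonomy and a taxonomy with an extra formal relation can be sketched in a few lines of Python. This is only an illustrative sketch: the terms, the dictionary names `IS_A` and `PART_OF`, and the helper `closure` are assumptions made up for this example and are not taken from IDO or BFO.

```python
# Toy vocabulary: every term below is an illustrative placeholder,
# not an actual entry from IDO or BFO.

# Subclass ("is a") edges: what a normal taxonomy would record.
IS_A = {
    "coronavirus": "virus",
    "virus": "pathogen",
    "pathogen": "entity",
}

# Parthood edges: an additional formal relation linking terms
# that sit in different branches of the taxonomy.
PART_OF = {
    "spike protein": "viral envelope",
    "viral envelope": "virus particle",
}


def closure(term, relation):
    """Follow a relation upward and return every term reached, nearest first.

    Treating the relation as transitive is the 'logical property' here:
    if A is part of B and B is part of C, then A is part of C.
    """
    reached = []
    while term in relation:
        term = relation[term]
        reached.append(term)
    return reached


if __name__ == "__main__":
    print(closure("coronavirus", IS_A))
    print(closure("spike protein", PART_OF))
```

Because both relations are treated as transitive, the same helper answers "what kinds of thing is this?" and "what is this a part of?", which is the sort of inference an automated reasoner can draw once the relations are stated explicitly.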


Sahana: Okay so- coming to Infectious Disease Ontology that is IDO. So, in IDO, IDO-CORE is a disease and pathogen neutral ontology, which covers just those entities and relations which are relevant to infectious diseases generally. It is extended by disease and pathogen specific ontology modules. Could you please share how it is that your team generates the ontology- maybe, if possible, it would be really cool if we could get a glimpse of the ontology for our audience to see.


John: Yeah, absolutely! This problem I was gesturing at a minute ago, about researchers talking past each other, is loosely speaking called the problem of data silos. They have data stores or data lakes and they're not inter-translatable, and so ontologies come in and help address that translation issue. When ontologies came on the scene, people were essentially on the same page: “Oh yeah- look, we need a common language that we can use to translate all this stuff into.” But then what happened is a lot of researchers thought to themselves, “Okay, I'll just make up an ontology”, which ended up regenerating the same problem, right?

A bunch of ontology silos. Because of that, in the late 80s, a group of ontologists, a consortium, got together and said “Look, we're going to have certain ground principles and we're going to have one set group of ontologies, and other ontologies are going to extend from those, so we have a common core.” That's the OBO Foundry, and at the top of the OBO Foundry is the Basic Formal Ontology (BFO), which is developed by Barry Smith and has recently become an ISO standard. It's standardized across the world. It's very impressive, and I actually have a little picture here that I want to show you- the hierarchy of it- so you can see that this is the top-level ontology that all ontologies will have to use when they're going to extend or build to a more specific domain.





John: So at the tippy top, this is entity- everything is an entity, at least insofar as anything is relevant to scientific investigation- that's the idea. Then in BFO, you divide things into two classes: there are two major subclasses of entity. There are continuants, which are things like you and me, right here, right now, and then, on the other hand, you have occurrents, things that occur in time- they have temporal components. They are flowing. Ontologically speaking, we need both- we don't just want to talk about material entities, we also want to talk about what they do and how they do it. In fact, it's probably largely about things that are done. This is not a reductive ontology either. You see this in analytic metaphysics- “I am gonna start with material things and define everything else”- we're not really interested in that. We want to give the researchers everything they need, and we take how they speak about the world as a cue. So, if they talk about processes, we're not trying to tell them they're wrong- we're just trying to give them the right resources.


Sahana: In a certain sense, it is more descriptive.


John: Absolutely right. It is largely descriptive, although later, when we talk about some issues, that normative element is going to come up in surprising ways. But we'll talk about that in a minute.


Sahana: Do you guys collect the data first and then make this- is it based on empirical data you get from different places?


John: It's sort of like that. So, Barry and a whole huge team of people developed this, and I worked on it myself, but when I came to work on it, it was mostly set like this and I was building the logic underpinning it. What Barry did is he thought really hard- like a philosopher does- and he had a bunch of logicians and scientists working with him, and they gathered the data. So he had the philosophers doing the conceptual analysis, working, right, alongside researchers from various biomedical domains, so they could figure out exactly what was needed. They basically listen to what scientists are saying and help them by using the tools of metaphysics to clear up confusions. And Barry, he loves to tell stories about the sorts of confusions he would notice, like type-token confusions, use-mention confusions- just things that researchers aren't thinking about, because they're busy. They've got a lot of other stuff to think about- they've just got to code all this data. If they're thinking about phenotypes of some species of animal, and they're running experiments and want to get that info out, they're not really thinking about types and tokens or uses and mentions. And then when they write papers, you know, they're susceptible to making these sorts of mistakes. But again, if you want to have computers help you, you've got to be real careful. So we come in and try to do that careful work for the scientists, so that they can keep doing what they're doing and they have this really well-defined, logically well-defined background against which they can build.


Sahana: Okay. That sounds really interesting. I wanted to ask you about the Coronavirus Infectious Disease Ontology, which is CIDO, and its specific extension that focuses on Covid-19, which is called IDO Covid-19. In the paper, you mentioned three broad ways in which IDO and IDO Covid-19 can be significantly helpful: first, IDO Covid-19 could help with information-driven efforts to deal with the ongoing pandemic; second, such ontologies can accelerate data discovery in the early stages of future pandemics; and third, they also promote reproducibility of infectious disease research. So, could you tell us a bit more about these benefits, and also what you feel are the limitations in the development of ontologies like IDO right now, and how your team intends to address them?


John: Absolutely. So, just so we have the right picture on the table: IDO-CORE, the Infectious Disease Ontology, is a really specific subset, or more restricted subset, of the BFO ontology. You start with the top-level stuff and then you get narrower and narrower as you get closer and closer to a target domain like Covid-19, which is a pretty specific thing you want to talk about. In the intermediary steps, from BFO you have the Infectious Disease Ontology covering them all, and then you have the Virus Infectious Disease Ontology, which we built, starting over the summer, as a bridge between the Infectious Disease Ontology and the Coronavirus Infectious Disease Ontology, which I came to work on at the start of the pandemic with some researchers at the University of Michigan, Oliver and his team. I've been working with them, but also building around them, so we have a straight stream from BFO all the way down to an extension of Oliver's ontology, which is the IDO Covid-19 ontology. That's where we get to the really fine-grained level of coordinating data for or about Covid-19. I just want to make sure we have that structure in place. So when we get there, the data that we're thinking about and the stuff that we're annotating is things like texts from the National Library of Science about clinical research, like trials that are going on right now. What we do is we think about the results of those trials and we tag them and annotate them with the ontology terms, and then we can run automated reasoners over what we tag and we can generate new information.


One of the things that Oliver actually showed us how to do, and what we've been doing, is this: you can take ontologies that talk about chemistry, you can take ontologies that talk about drugs, and you can take ontologies that cover, say, SARS-CoV-1 or MERS or some other coronavirus. You can look at those profiles, put all that information together and spit out a range of options for drug treatments. What you get is drug treatment options for SARS-CoV-2, based on what we know about drugs that worked, or seemed to be effective, for the other coronaviruses. And that's a lot of information to pull together. Of course, if a team got together and sat and thought about it really hard, they could do it. But what Oliver and his team did was throw all this info into their processing units and tag a bunch of articles with their ontologies, and it spat out hypotheses within a couple of days. So it was pretty impressive, and they generated- they have some preprints on this- a range of treatment options for SARS-CoV-2 infections that people weren't thinking about. These were early days too. People just hadn't been considering them, and some of them have since been taken up. Like, researchers got to them eventually, but Oliver and his team were like, "I told you." Yeah, so it was useful in that respect of coordinating a whole bunch of data, so you can generate hypotheses and generate options for treatment just based on what we already know- which is a lot. Again, Barry- I always like pointing to Barry- Barry likes to say "We probably have a cure for cancer. We just don't know it." Because we've got all this data and we just haven't coordinated it all yet. But once we do, it's there.
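
The general shape of that workflow, tagging articles with ontology terms and then letting a program combine the tags, can be sketched as follows. This is a deliberately tiny illustration and not the method Oliver's team used: the drug names (`drug_A`, `drug_B`, `drug_C`), article identifiers, annotations and the function `candidate_treatments` are all hypothetical placeholders and do not correspond to real drugs or findings.

```python
# Hypothetical placeholder data: none of these names refer to
# real drugs, real articles or real experimental results.

# A small virus taxonomy: each virus points to its parent class.
VIRUS_IS_A = {
    "SARS-CoV-1": "coronavirus",
    "MERS-CoV": "coronavirus",
    "SARS-CoV-2": "coronavirus",
}

# Articles tagged with (article id, drug term, virus term), meaning
# "this article reports the drug as seemingly effective against the virus".
ANNOTATIONS = [
    ("article-1", "drug_A", "SARS-CoV-1"),
    ("article-2", "drug_B", "MERS-CoV"),
    ("article-3", "drug_A", "MERS-CoV"),
    ("article-4", "drug_C", "influenza virus"),
]


def candidate_treatments(target, virus_is_a, annotations):
    """Suggest drugs to look at for `target`, based on related viruses.

    A drug becomes a candidate if some article tags it as effective
    against a virus that shares a parent class with the target.
    Returns a dict mapping each drug to its supporting article ids.
    """
    parent = virus_is_a[target]
    siblings = {v for v, p in virus_is_a.items() if p == parent and v != target}
    candidates = {}
    for article, drug, virus in annotations:
        if virus in siblings:
            candidates.setdefault(drug, []).append(article)
    return candidates


if __name__ == "__main__":
    print(candidate_treatments("SARS-CoV-2", VIRUS_IS_A, ANNOTATIONS))
```

The output is only a list of hypotheses with the articles that support them. The point of the sketch is that a shared vocabulary is what lets the program see that `SARS-CoV-1` and `MERS-CoV` are relevant to a question about `SARS-CoV-2` at all.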


Sahana: I was thinking- suppose we come to realize that COVID-19 is not only a respiratory infection but a multi-systemic one. How does that change the ontology in the formal sense? What are the changes you would have to make?


John: So hopefully, when we build the ontology, and as we've been building the ontology, we try to make sure that we only assert things that have reached a good level of consensus, because we want to generate good hypotheses. If we find out it's like a multi-system issue, that's actually going to be something that's compatible with the way we're understanding SARS-CoV-2 infections because what we're doing is, we're focusing on like the structure of the virus, the structure of viral replication, the various pathways that a virus particle might be involved in, while it's generating pathogenesis in somebody. Pathogenesis - that's the transition from like infection to disease and that can take many different forms. In fact one of the things that's really good about ontology is when we have all this information, not only can we generate treatment options by pulling a bunch of data together, we can also generate a bunch of pathways, like mechanisms of pathogenesis and so we can be consistent with several of them and maybe several of them are on display in various infections. And that makes sense, right? Because like SARS-CoV-2 infection has a wide range of symptoms, right? It has a wide range of manifestations. People are dying of Acute Respiratory Disorder very frequently. Some people are dying of pneumonia. Some people are dying of all sorts of things. And the details are going to matter, based on environmental constraints and demographic information. And again, our model is going to be consistent with explaining all that stuff. Like all those different ways through pathogenesis, all the way to disease. So the short answer is -hopefully we won't have to change much. And we'll let the researchers tell us which way we should be going. But we'll be compatible with each of those explanations. I think that's a strength because as you see, these infections are crazy! 
It's hard to tell because there are so many different symptoms and people are having so many different complications, and it's such a new pathogen, it's really hard to say.


Sahana: Like a global laboratory for the formal, like applied ontology right now, in a certain sense. You're actually looking at something evolve continuously.


John: Absolutely.


Sahana: What problems have you faced in developing these ontologies, and how do you guys address them? You had mentioned that there were geographers, and there are people from different areas, and you're talking to them. So what kinds of issues come up? And how do you generally deal with the big data? Are these institutional decisions that are made, or do they also depend on the philosophical commitments you have? How does it really work?


John: I won't speak for all the ontologists, but I try not to let my philosophical commitments get in the way. I was an engineer by training, so I kind of came into philosophy to do something else. That engineering training really informs how I engage with scientists. And we all do engage with scientists. So, the consensus that arises is approached by philosophers who come in and do applied ontology, saying "Look, we are not experts in biology, we are not experts in virology, that's not what we do- but nevertheless we can help you. Let me show you: you might want to code a human being as a social security number or something. But that's silly, because people are not divisible by numbers, right? So let me help you. Or you might want to think of a country as a type of thing- I'm sorry, the United States as a type of thing. But it's a token of a type, the type being country." And this is a common thing that's confused. So that's what we do. And then we'll engage in email conversations or, at this point, we'll almost always have Zoom calls or something. And we will argue- argue the same way philosophers have argued forever- with scientists who are really interested in getting the stuff right but haven't been thinking about it.
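
The two confusions mentioned here, treating a person as if they were their identifier and treating a particular country as if it were a type, have a rough analogue in everyday programming. The sketch below is only an illustration under stated assumptions: the classes `Country` and `Person`, the helper `is_type`, and the identifier string are invented for this example.

```python
class Country:
    """A type: the kind of thing that particular countries are."""

    def __init__(self, name):
        self.name = name


class Person:
    """A person *has* an identifier; a person is not the identifier itself."""

    def __init__(self, name, id_number):
        self.name = name
        self.id_number = id_number  # a placeholder string, not a real number


# A token (instance) of the type Country, not a type in its own right.
united_states = Country("United States")

# The identifier is an attribute attached to the person.
alice = Person("Alice", "ID-0001")


def is_type(thing):
    """Return True if `thing` is a class (a type) rather than an instance."""
    return isinstance(thing, type)


if __name__ == "__main__":
    print(is_type(Country), is_type(united_states))
    print(isinstance(united_states, Country))
    print(isinstance(alice, str))
```

Modelling the United States as a subclass of `Country` rather than an instance would be the programming version of the type-token mistake, and representing Alice directly by the string `"ID-0001"` would be the version of coding a human being as a number.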

Lots of ontologists who are philosophers do bring their philosophical convictions. But at the same time, lots of scientists come in and have their own philosophical or scientific convictions. Say, about what a virus is. Is it alive? Is it not? Right? This is something that comes up in almost every talk I give about this, and every time, I say to the scientists and the philosophers, "That's a very interesting question! It's also not one we need to address." We don't have to decide that. We just need to model it. And then everybody's usually like "Okay. That's right. Let's move on." So we have the (philosophical) convictions, but this is a practical problem, or a series of practical problems, that we're trying to solve. And while those are good questions to ask, it is like, "We'll ask them later, all right, once we get the models there."




Sahana: That makes sense. That reminds me of the instrumentalist position on doing metaphysics, rather than the speculative kind, which is taking over: working in scientific metaphysics, corroborating with and collaborating with different scientists, rather than doing an armchair, in-the-ivory-tower kind of philosophy.

That's all the questions I wanted to ask you, and thank you so much for talking to me John! It has been a wonderful conversation. You've been so patient. Thank you.


John: This is wonderful. I really appreciate this and I am so pleased with all the work that you and your team have been doing. And thank you so much for adding us to the bibliography and for having me on for this interview. This is fantastic.

