This post updates our April 2020 blog post, which discussed the goal of creating open and accessible COVID-19 data models. I’m pleased to report that Galois’s Agile Metamodel Inference using Domain-specific Ontological Languages (AMIDOL) project began its third phase in May 2020. The entire AMIDOL project has been a great success. The project’s funding currently amounts to $2.7 million as part of DARPA’s Automating Scientific Knowledge Extraction (ASKE) program, which was designed to encourage the automation of data models.
AMIDOL aims to be a platform that helps domain experts, scientists, and programmers create more accurate and updateable models in real time. Policy makers at the local, state, and federal levels all rely on scientific models to help them manage imminent or existing emergencies. However, it is difficult to make these models actionable (that is, reproducible and modifiable in real time) because the models are based on pure mathematics, and the underlying data is not made available to other agencies or domain experts. When scientists recode models with their own data sets, it is a lengthy and painstaking process.
AMIDOL is making great progress in addressing these challenges. Our current work focuses on the COVID-19 pandemic, where our primary aim is to establish model credibility. We’ve had promising success resolving differences between scientific theory and its implementation, and in making scientific models actionable.
Doctor, Doctor, Give Me the News
AMIDOL uses an innovative three-phase approach. The first phase involves creating a set of domain-specific languages (DSLs). In the case of COVID-19, we aim to enable scientists to generate code directly from diagrams representing susceptible patients, infected patients, and recovered patients. We’re also aiming to help scientists create equations that describe the pandemic.
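To make the idea concrete, here is the kind of equation set such a diagram stands for. This is the standard textbook SIR (susceptible, infected, recovered) model, used purely as an illustration and not taken from AMIDOL itself. The symbols are assumptions for this sketch: \beta is the transmission rate, \gamma is the recovery rate, and N = S + I + R is the total population.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N} \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
```

Each arrow in the diagram corresponds to one flow term: the arrow from susceptible to infected is \beta S I / N, and the arrow from infected to recovered is \gamma I. The three derivatives sum to zero, so the total population stays constant.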
Current modeling practices require scientists to write programs in general-purpose languages. This often introduces errors into the model because scientists are trying to replicate a model from their notes. Each time the scientist crafts code by hand, it can lead to errors, because the coding required to make an equation replicable is not usually a scientist’s core skill.
As we noted in April, one of our aims was to help “clean” models that have errors in them. When scientists import a model into AMIDOL, errors can be identified and often automatically repaired in AMIDOL’s Intermediate Representation (IR).
If scientists built a model within AMIDOL, they wouldn’t need to hand-code at all. They could simply use existing mathematical or formal diagram representations of the system they wish to model.
AMIDOL takes these representations and automatically synthesizes code on the back end. This ability to factor in (or factor out) data in the IR gives a scientist the capabilities of a software engineer without needing to actually be a software engineer. Scientists could rapidly synthesize models and quickly verify whether the data makes sense.
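The sketch below illustrates the general idea of going from a diagram to running code. It is not AMIDOL’s IR or code generator: it assumes a textbook SIR model, describes the diagram as a plain list of arrows, and uses a simple forward-Euler loop. The names `make_sir` and `simulate`, and the parameter values, are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
# One arrow of the diagram: (source compartment, target compartment, rate function).
Transition = Tuple[str, str, Callable[[State], float]]


def make_sir(beta: float, gamma: float) -> List[Transition]:
    """Describe the SIR diagram as data: two arrows, S -> I and I -> R."""
    return [
        ("S", "I", lambda s: beta * s["S"] * s["I"] / sum(s.values())),
        ("I", "R", lambda s: gamma * s["I"]),
    ]


def simulate(transitions: List[Transition], initial: State,
             days: int, dt: float = 0.1) -> List[State]:
    """Integrate the flows with forward Euler, recording the state once per day."""
    state = dict(initial)
    history = [dict(state)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            # Evaluate every flow on the same snapshot before updating any compartment.
            flows = [(src, dst, rate(state) * dt) for src, dst, rate in transitions]
            for src, dst, amount in flows:
                state[src] -= amount
                state[dst] += amount
        history.append(dict(state))
    return history


# Illustrative parameters only; they are not fitted to any real outbreak.
history = simulate(make_sir(beta=0.3, gamma=0.1),
                   {"S": 990.0, "I": 10.0, "R": 0.0}, days=160)
peak_day = max(range(len(history)), key=lambda d: history[d]["I"])
print(f"Infections peak on day {peak_day} "
      f"with {history[peak_day]['I']:.0f} people infected")
```

Because the model is data rather than hand-written arithmetic, the simulator never has to change when the diagram does, which is the property the paragraph above describes.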
Shake It Up With Modified Data Inputs
AMIDOL is designed to let scientists change inputs, assign particular rewards to a model for measuring data, and extract data that could be used elsewhere. This gets around the need to build an entirely new model every time the inputs change.
For instance, if scientists want to track how many people infected with COVID-19 also require hospitalization, they can modify the representation of what they’re tracking over a period of time.
AMIDOL would also allow a scientist to track how many people recover from COVID-19 or how many people are asymptomatic spreaders of the virus.
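As a rough sketch of what changing the tracked quantities could look like, the example below extends a textbook SIR model with a hospitalized compartment by adding two arrows, and accumulates a "reward" over time (here, hospital bed-days). This is an illustration under stated assumptions, not AMIDOL’s interface: the `run` function, the reading of a reward as a quantity integrated over the simulation, and every rate constant are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Transition = Tuple[str, str, Callable[[State], float]]

BETA, GAMMA = 0.3, 0.1             # illustrative transmission and recovery rates
HOSP_RATE, DISCHARGE = 0.02, 0.07  # illustrative hospitalization and discharge rates

base_model: List[Transition] = [
    ("S", "I", lambda s: BETA * s["S"] * s["I"] / sum(s.values())),
    ("I", "R", lambda s: GAMMA * s["I"]),
]

# Tracking hospitalization means adding arrows, not rewriting the simulator.
hospital_model: List[Transition] = base_model + [
    ("I", "H", lambda s: HOSP_RATE * s["I"]),
    ("H", "R", lambda s: DISCHARGE * s["H"]),
]


def run(transitions: List[Transition], initial: State, days: int,
        reward: Callable[[State], float], dt: float = 0.1) -> Tuple[State, float]:
    """Forward-Euler simulation that also integrates a reward over time."""
    state = dict(initial)
    total_reward = 0.0
    for _ in range(round(days / dt)):
        # Evaluate every flow on the same snapshot before updating any compartment.
        flows = [(src, dst, rate(state) * dt) for src, dst, rate in transitions]
        for src, dst, amount in flows:
            state[src] -= amount
            state[dst] += amount
        total_reward += reward(state) * dt
    return state, total_reward


start = {"S": 990.0, "I": 10.0, "H": 0.0, "R": 0.0}
final, bed_days = run(hospital_model, start, days=160, reward=lambda s: s["H"])
print(f"Recovered: {final['R']:.0f}, hospital bed-days: {bed_days:.0f}")
```

Swapping the `reward` argument, for example to `lambda s: s["I"]` for person-days of infection, changes what is measured without touching the model or the simulation loop.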
In addition to COVID-19, every autumn brings the onset of multiple respiratory infections. With AMIDOL, scientists could show the effects of several infections at once. A new diagram drawn for a research paper could be used by AMIDOL to synthesize new code that automatically reflects these changes.
Modeling Has a Fever, and Our Proposed Cure is More AMIDOL
AMIDOL’s next phase is being designed as AMIDOL as a Service (AaaS), a proposed offering that could be used by government institutions such as the Centers for Disease Control and Prevention (CDC) and the National Institutes of Health (NIH).
As we noted in April, resource allocation (e.g., how much Personal Protective Equipment is needed for hospitals in a city) is one challenge AMIDOL aims to address. An AaaS framework could allow agencies and local governments to make data-driven decisions about resource allocation.
Other ideas could include the following:
- Helping scientists create models for potential COVID-19 vaccines.
- Determining how well social distancing is working during outbreaks in a particular city or state.
- Modeling which states are testing sufficiently for COVID-19.
- Tracking the effectiveness of social distancing.
AMIDOL’s goal is for gathered data to be deployed so that states can share data and models. Currently, it can take years to transfer data between localities.
The Cure We’re Thinking Of
The COVID-19 crisis has provided a potential opportunity to make policy-based gains within governmental organizations. In particular, AMIDOL could help policymakers respond to crises in real time, making knowledge more actionable in the near term.
If (or when) another pandemic strikes, AMIDOL could help domain experts, data scientists, and programmers create more accurate and updateable models in real time, with far less ambiguity. Scientists could leverage existing scientific knowledge and not be hampered by a single implementation. Every research paper could match its model, essentially ensuring reproducible science that other domain experts can use.
For more information about AMIDOL, please see our project page: https://galois.com/project/amidol/
Acknowledgment of Support and Disclaimer: This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Agreement No. HR00111990005.