In the year-long battle against the coronavirus, information has been one of the most powerful weapons that public health officials have had. Now, a team at the University of Alberta hopes to use machine learning to offer decision-makers something invaluable: a glimpse into the future that predicts the results of their decisions before they make them.

"If we had a crystal ball, it would make things much easier," says Russ Greiner, principal investigator at the Alberta Machine Intelligence Institute and professor of Computing Science at the University of Alberta.

"If we knew with certainty that opening restaurants would mean that 300 additional people will die, that would make decisions easier."

Greiner is part of a multidisciplinary team that uses machine learning (a subfield of artificial intelligence) to develop more accurate COVID models that can help forecast the effects of policy decisions, like reopening restaurants or instituting lockdowns.

"Asking, 'how many hospital beds will Edmonton need,' and just saying, 'it's a lot,' isn't good enough. We want more quantitative forecasts of what will happen," he says. "Will a specific hospital need two or 52 ICU beds two weeks from now? How many ventilators? That information would be critical."

Mechanistic vs. machine learning

Many of the approaches currently being used to predict COVID's spread are mechanistic models. That means analysts first estimate a few specific quantities - such as how infectious the virus is and how long people remain contagious - and then use them to calculate mathematically how the coronavirus will spread.
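
As an illustration only, the idea can be sketched with the classic SIR (susceptible, infected, recovered) compartment model, a standard textbook example of a mechanistic model. The article does not say which mechanistic models analysts actually use, so the function name, the daily time step and every number below are assumptions chosen for the sketch, not real estimates.

```python
def simulate_sir(population, initial_infected, beta, infectious_days, days):
    """Discrete-time SIR model with a one-day time step.

    beta            -- assumed average number of new infections one
                       infected person causes per day in a fully
                       susceptible population (how infectious the virus is)
    infectious_days -- assumed average time a person stays contagious
    """
    gamma = 1.0 / infectious_days  # fraction of infected who recover each day
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical inputs, not fitted to any real outbreak.
history = simulate_sir(
    population=1_000_000,
    initial_infected=100,
    beta=0.3,
    infectious_days=7,
    days=60,
)
peak_day = max(range(len(history)), key=lambda day: history[day][1])
print(f"Infections peak on day {peak_day} of the simulated window")
```

Everything the sketch predicts follows from the two estimated quantities, which is why a change the model does not represent, such as weather or a new policy, can throw off its longer-range output.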

That approach has proven effective at the high level or for short-range predictions, says Pouria Ramazi, now an assistant professor at Brock University in Ontario and another of the principal investigators on the team. But many variables can change the way a virus spreads through a particular community - ranging from weather, to policy decisions, to public adherence. It's nearly impossible for a mechanistic model to account for every possible factor, and those factors can be especially important for long-range forecasting.

"Making accurate predictions, if you want to focus on a longer range like two months, can be very difficult," he says.

Instead, the University of Alberta team's model takes a "machine learning" (ML) approach, one that relies on finding patterns within the pandemic. The researchers take massive amounts of data, gathered from the United States, Canada and other parts of the world, and use it to "train" their algorithm to find connections.

The statistical method doesn't seek to understand how the virus works. Instead, it's sniffing out patterns: how much does weather affect how the virus spreads, how much does population density matter, do long weekends cause a spike in cases?

The team's focus is on predicting two target variables: the number of cases and the number of deaths. Their model can predict up to 10 weeks into the future.

Once training has begun, the researchers can use their model to make specific predictions (e.g., based only on data available on Oct. 1, forecast how many cases will be reported in Alberta on Oct. 29), compare each forecast to the number actually recorded, and use the difference to fine-tune the model. By adjusting its parameters to produce predictions that are closer to the known case numbers, the machine learning model teaches itself which variables and connections are important and which are not.
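
That forecast-compare-adjust loop can be sketched in a few lines. This is a deliberately tiny stand-in, not the team's model: it assumes a single parameter (a growth factor applied to the latest known case count), uses plain gradient descent on squared error, and runs on a made-up weekly case series. The function names are hypothetical.

```python
def make_backtest_pairs(series, horizon):
    """Pair the value known at each cutoff with the value recorded
    `horizon` steps later (e.g. cases on Oct. 1 vs. cases on Oct. 29)."""
    return [(series[t], series[t + horizon])
            for t in range(len(series) - horizon)]


def fit_growth_factor(pairs, steps=200):
    """Adjust one parameter `w` so that the forecast w * known_value
    lands closer to the number that was actually recorded."""
    mean_sq = sum(x * x for x, _ in pairs) / len(pairs)
    step_size = 0.1 / mean_sq  # scaled so the updates stay stable
    w = 1.0                    # initial guess: no change over the horizon
    for _ in range(steps):
        # Gradient of the mean squared forecast error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in pairs) / len(pairs)
        w -= step_size * grad
    return w


# Made-up weekly case counts growing 10 per cent a week (illustration only).
weekly_cases = [100 * 1.1 ** week for week in range(30)]
pairs = make_backtest_pairs(weekly_cases, horizon=4)
growth = fit_growth_factor(pairs)
forecast = growth * weekly_cases[-1]
print(f"Fitted four-week growth factor: {growth:.4f}")
print(f"Forecast four weeks past the last data point: {forecast:.0f}")
```

A real model would have many inputs, such as weather or population density, and one adjustable weight or more per input, but the loop is the same: forecast from past data only, measure the miss, nudge the parameters.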

It can even find connections that the researchers never considered, yielding some surprising results.

"We were surprised to see how important meat-processing plants were ... places with a lot of meat plants were strongly correlated with the number of COVID cases," Ramazi says.

Greiner points out that this statistical ML method isn't a replacement for mechanistic models; both have their benefits and disadvantages.

From infestations to infections

The story of the team's COVID model began in an unlikely place - among sickly pine trees along Alberta's western border.

Before the pandemic struck, Greiner was working with one of his colleagues, Mark Lewis, Senior Canada Research Chair in Mathematical Biology at the U of A. They were using machine learning to predict the spread of the mountain pine beetle, an invasive species causing severe damage to trees in the Rocky Mountains and northern Alberta.

When COVID struck last year, Lewis and Greiner wondered if the work they were doing could be used as a foundation to track the virus. They expanded their multidisciplinary team by including biochemist David Wishart and mathematical biologist Hao Wang, as well as Ramazi. The team applied for and received $220,545 in funding from the Pfizer Alberta Collaboration in Health, a health innovation fund administered by Alberta Innovates, to explore the pandemic modelling idea.

A team of about 20 people (most from the University of Alberta) is now actively working on the project. While Greiner hopes that their work will help battle COVID-19, he also sees it as preparing for the future.

"The tools we're building now: I hope we don't need them again. But we know this isn't going to be the last time ... let's not be caught flat-footed. Maybe some of the ideas we develop in this collaboration will help."