AI tool for forecasting infectious disease risk

Written by Lucy Welsh (Digital Editor)

The first AI tool to use large language modeling to predict infectious disease risk has been developed.

Researchers at Johns Hopkins University (MD, USA) and Duke University (NC, USA) have developed a novel AI tool for predicting infectious disease risk. This tool, which outperforms existing state-of-the-art forecasting methods, could transform how public health officials predict, track and manage infectious disease outbreaks.

The COVID-19 pandemic highlighted how difficult it is to predict infectious disease risk, largely because so many interacting factors contribute to it. During the pandemic, the Johns Hopkins COVID-19 dashboard was developed to track healthcare data and was relied upon worldwide. When conditions were stable, forecasting models performed well; however, once new variants, policy changes and other complex factors emerged, disease spread became much harder to predict.

“A pressing challenge in disease prediction is trying to figure out what drives surges in infections and hospitalizations, and to build these new information streams into the modeling,” commented co-corresponding author Lauren Gardner (Johns Hopkins).

Now, for the first time, researchers are using large language modeling, the type of AI behind ChatGPT, to predict the spread of infectious diseases. Using AI-human cooperative prompt design and time-series representation learning, the researchers developed PandemicLLM, a framework that encodes multi-modal data for large language models.


PandemicLLM reframes real-time prediction of disease spread as a text-based reasoning problem, which allows it to integrate complex, non-numerical information as it becomes available. The model has been trained on four types of data: state-level spatial data, epidemiological time-series data, textual health policies and genomic surveillance data. Given this information, the model can infer how these elements interact and how they affect disease behavior.
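To make the idea of a "text-based reasoning problem" more concrete, the sketch below shows one way the four data types could be serialized into a single prompt for a language model. It is an illustration only and not the researchers' implementation: the `StateSnapshot` class, the `describe_trend` and `build_prompt` functions, the 5% trend threshold and all example values are assumptions made for this example. It also simplifies one step, passing the time series as plain numbers rather than through the representation learning the researchers describe.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StateSnapshot:
    """Hypothetical container for the four data types named in the article."""
    state: str                          # state-level spatial data
    population: int
    weekly_hospitalizations: List[int]  # epidemiological time series
    policy_text: str                    # textual health policy
    dominant_variant: str               # genomic surveillance data
    variant_share: float                # fraction of sequenced samples


def describe_trend(series: List[int]) -> str:
    """Summarize the most recent week-over-week change in words."""
    if len(series) < 2:
        return "unknown (fewer than two weeks of data)"
    previous, latest = series[-2], series[-1]
    if previous == 0:
        return "rising" if latest > 0 else "flat"
    change = (latest - previous) / previous
    # The 5% band is an arbitrary threshold chosen for this illustration.
    if change > 0.05:
        return f"rising ({change:+.0%} week over week)"
    if change < -0.05:
        return f"falling ({change:+.0%} week over week)"
    return f"stable ({change:+.0%} week over week)"


def build_prompt(s: StateSnapshot) -> str:
    """Turn numeric and textual inputs into one text prompt."""
    series = ", ".join(str(v) for v in s.weekly_hospitalizations)
    return "\n".join([
        f"State: {s.state} (population {s.population:,}).",
        f"Weekly hospitalizations, oldest to newest: {series}.",
        f"Recent trend: {describe_trend(s.weekly_hospitalizations)}.",
        f"Current policy: {s.policy_text}",
        f"Dominant variant: {s.dominant_variant} "
        f"({s.variant_share:.0%} of sequenced samples).",
        "Question: will hospitalizations next week rise, stay stable, or fall?",
    ])


if __name__ == "__main__":
    # All values below are made up for demonstration purposes.
    example = StateSnapshot(
        state="Maryland",
        population=6_000_000,
        weekly_hospitalizations=[100, 120, 150],
        policy_text="Indoor mask mandate lifted.",
        dominant_variant="BA.2",
        variant_share=0.62,
    )
    print(build_prompt(example))
```

Framing the output as a choice between rising, stable and falling is likewise a design assumption of this sketch; it simply shows how a numerical forecasting task can be posed as a question that a language model answers in text.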

“Traditionally, we use the past to predict the future. But that doesn’t give the model sufficient information to understand and predict what’s happening. Instead, this framework uses new types of real-time information,” said co-corresponding author Hao (Frank) Yang (Johns Hopkins).

To test their model, the researchers applied it retrospectively to the COVID-19 pandemic. The model was fed the four types of data and tested across all US states over a 19-month period, where it outperformed existing forecasting models. PandemicLLM can also be adapted to forecast other infectious diseases, such as bird flu, respiratory syncytial virus (RSV) and mpox (formerly known as monkeypox).

The researchers are now investigating the ability of large language models to replicate how individuals make decisions about health, with the hope of helping public health officials design safer and more effective policies.

“We know from COVID-19 that we need better tools so that we can inform more effective policies. There will be another pandemic, and these types of frameworks will be crucial for supporting public health response,” concluded Gardner.