Can public health experts tell that an infectious disease outbreak is imminent simply by looking at what people are searching for on Wikipedia? Yes, at least in some cases.
Researchers from Los Alamos National Laboratory were able to make extremely accurate forecasts about the spread of dengue fever in Brazil and flu in the U.S., Japan, Poland and Thailand by examining three years' worth of Wikipedia search data. They also came up with moderately successful predictions of tuberculosis outbreaks in Thailand and China, and of dengue fever's spread in Thailand.
However, their efforts to anticipate cases of cholera, Ebola, HIV and plague by extrapolating from search data left much to be desired, according to a report published Thursday in the journal PLOS Computational Biology. But the researchers believe their general approach could still work if they use more sophisticated statistics and a more inclusive data set.
Accurate data on the spread of infectious diseases can be culled from a variety of sources. Government agencies typically get it from patient interviews and laboratory test results. Other data sources include calls to 911 lines, emergency room admissions and absences from work or school.
The problem with these methods is that they can be time-consuming and costly. By the time the numbers are crunched, an outbreak may be in full swing.
If you want to stop an outbreak before it starts -- and if you want to save lives and money, you certainly do -- what you need is a forecast that is both accurate and timely. And so the Los Alamos researchers turned to the treasure trove that is Wikipedia.
In addition to hosting about 30 million articles on topics ranging from quantum foam to the First English Civil War to Kim Kardashian, Wikipedia collects data on the approximately 850 million search requests it gets each day. In previous studies, researchers have used this publicly available data to predict ticket sales for new movies and the movement of stock prices.
When it comes to health, people have found correlations between interest in certain health topics on Wikipedia and sales of medications. Others have linked searches for flu-related topics by American Wikipedia users to actual flu spread in the U.S.
Five members of LANL's Defense Systems and Analysis Division thought they could do more. Their goal was to get a read on current and future trends not just for flu in the U.S. but for several diseases in several countries. Ideally, they hoped to come up with a model that could be trained with data from a place where it's available and then applied to another place where it wasn't.
The researchers decided to focus on seven diseases (cholera, dengue fever, Ebola, HIV/AIDS, influenza, plague and tuberculosis) in nine countries (Brazil, China, Haiti, Japan, Norway, Poland, Thailand, Uganda and the U.S.). They mixed and matched to get models for 14 location-disease contexts.
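To make the train-in-one-place, apply-in-another idea concrete, here is a minimal sketch in Python. It is not the Los Alamos team's code, and the article does not say which statistical model they used; a one-variable least-squares line is assumed here purely for illustration. The function names and every number below are invented, and the traffic figures are assumed to be expressed as a share of each place's total requests so that the two places are comparable.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y ~ slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to new traffic values."""
    slope, intercept = model
    return [slope * xi + intercept for xi in x]


def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by `predicted`."""
    a_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - a_bar) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical place A: official case counts exist, so the model is trained here.
# Traffic is the weekly share of requests for disease-related articles (invented).
traffic_a = [1.0, 2.0, 3.0, 4.0, 5.0]
cases_a = [12.0, 19.0, 31.0, 42.0, 48.0]
model = fit_line(traffic_a, cases_a)
fit_quality = r_squared(cases_a, predict(model, traffic_a))

# Hypothetical place B: no official counts, only traffic. Reuse the model from A.
traffic_b = [2.0, 4.0]
forecast_b = predict(model, traffic_b)

print("slope, intercept:", model)
print("R^2 in place A:", round(fit_quality, 3))
print("estimated cases in place B:", forecast_b)
```

Whether a model carries over like this depends on how similar the two places are in language, internet use and reporting habits, which is the kind of question the researchers set out to test.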
Link:
Scientists use Wikipedia search data to forecast spread of flu