Humans are innately social creatures, and one need look no further than Twitter to see how potent the urge to share information is. According to the social networking site, as of the end of June, Twitter users from all corners of the earth were sending 200 million tweets per day.
Now social scientists and computer scientists are figuring out how to mine tweets for intelligence of all sorts. And it hasn't escaped Shots' notice that Twitter chatter isn't confined to politics or celebrity trash-talking. People also tweet about their health — a lot.
In 2009, two computer scientists at the Center for Language and Speech Processing at Johns Hopkins University got access to a trove of tweets, some 2 billion posted between May 2009 and October 2010, and decided to find a way to analyze them.
Mark Dredze, a professor at Hopkins, and his grad student, Michael Paul, are mainly interested in language, but they felt that the most interesting conversations on Twitter might be about health. So they began to parse the 1.5 million messages in the data set that touched on health. The senders' identities were masked.
Twitter is a "very noisy," linguistically complex place, says Paul. One of the greatest challenges turned out to be training the computer to differentiate the medical relevance of phrases like "I gots da flu" and "Got a case of Bieber Fever. Love his new song."
But once their computerized model learned how to read Twitter, they discovered real discussions of over a dozen ailments, including allergies, obesity and insomnia. And they found remarkable correlations between actual flu rates and flu discussion on Twitter, as well as between the arrival of allergy season and allergy complaints on Twitter.
Google Flu Trends has done similar work analyzing search-engine queries to track flu emergence.
Dredze and Paul will present their findings at a conference next week sponsored by the Association for the Advancement of Artificial Intelligence.
Beyond disease surveillance, they gleaned a lot about public perceptions of illnesses, medications and other health issues. Many of the perceptions about health issues expressed on Twitter turned out to be wrong, like the notion that antibiotics could be used to treat the flu. (Flu is caused by a virus, and viruses aren't susceptible to antibiotics.)
Paul and Dredze say they were surprised to hear that public health experts found this the most intriguing element of their work. "They were interested in using Twitter to detect misinformation, not information," says Paul. "They saw it as a social media tool to mine people's perceptions of health."
Analyzing Twitter to better understand what the public really believes about specific health issues could be very useful in the future. "It could help officials decide what strategies are effective and what are not," says Dredze.
That might eventually require access to real-time tweets, which, of course, are growing in number every day, making the process of digging through them potentially more difficult. In the meantime, Paul says he and Dredze want to run the same experiments with 2011 data to see if they can get stronger evidence that Twitter health chatter is statistically significant and relevant.
So the next time you whine about your headache on Twitter, remember that someone you don't expect might be watching.