Google Flu Trends Gets It Wrong Three Years Running
wabrandsma writes with this story from New Scientist: "Google may be a master at data wrangling, but one of its products has been making bogus data-driven predictions. A study has found that Google's much-hyped flu tracker has consistently overestimated flu cases in the US for years. It's a failure that highlights the danger of relying on big data technologies.
Evan Selinger, a technology ethicist at Rochester Institute of Technology in New York, says Google Flu's failures hint at a larger problem with the algorithmic approach taken by technology companies to deliver services we all want to use. The problem is with the assumption that either the data that is gathered about us, or the algorithms used to process it, are neutral. Google Flu Trends has been discussed on Slashdot before: When Google Got Flu Wrong."
Error, Error, Excuses (Score:2)
Big Data Fail (Score:2, Insightful)
Not surprising. Most analysis on huge data sets is incorrect; that's why the NSA thing is scary! They get it wrong and you end up with a missile through your window! Oops...
Re: (Score:2)
No such thing any more. If it's a male, he's a terrorist and a legitimate target nowadays. If it's a female, she's a terrorist's family member and a future suicide bomber. Win/win.
Re: (Score:3)
An excellent example is Li's copula, [wired.com] widely credited for triggering the 08 financial crisis.
Re: (Score:3)
Re: (Score:2)
The failure was the morons who didn't stop to think, 'If we're all making so much money, who is losing it?'
And then someone did. And that's when all hell broke loose: someone realized that they were about to default on a bond, which uncovered the whole mountain of failure across the industry, plus a few others that it took out along the way!
Re: (Score:2)
Exactly. The same as abuse of big data. My point.
I don't think it takes into account (Score:2)
action put into place after showing the data.
You can see a trend and make a forecast. Then take action to slow the trend based on the forecast and then the prediction will be wrong.
Re:I don't think it takes into account (Score:5, Insightful)
You can see a trend and make a forecast.
Agreed. Very similar to a weather forecast, but without the hundred odd years of daily data to study and manufacture predictive models on.
It is, however, necessary and noble research... they'll just need more flu seasons under their belt to tweak the variables.
Re: (Score:3)
Climate is basically the long-term statistics of weather - meaning a hundred-year trend in t
Re:I don't think it takes into account (Score:4, Insightful)
Exactly, the correct comparison should be "technical analysis" in stock markets, which can be applied to any stock you like with the same level of (un)success.
Without an underlying theory of how things work, which also needs to be somewhat correct, trying to predict future trends simply by using past data is just dumb curve fitting - with a curve of enough degrees of freedom, you can fit any data, but that doesn't mean its prediction would be any better than a random guess.
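To make the curve-fitting point concrete, here is a minimal sketch in Python (standard library only). The weekly case counts are made up for illustration and have nothing to do with Google's actual data or method, and `lagrange_predict` is a hypothetical helper name. It runs a degree-5 polynomial exactly through six noisy points that hover between 9 and 12, then asks that polynomial for week 6.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate at x the unique polynomial passing through all (xs, ys) points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


# Made-up weekly case counts (illustrative assumption, not real flu data).
weeks = [0, 1, 2, 3, 4, 5]
cases = [10, 12, 9, 11, 10, 12]

# Worst miss on the data the curve was fitted to.
in_sample_error = max(
    abs(lagrange_predict(weeks, cases, w) - c) for w, c in zip(weeks, cases)
)

# Forecast for the next week: fitted curve vs. a plain average.
curve_forecast = lagrange_predict(weeks, cases, 6)
mean_forecast = sum(cases) / len(cases)

print(f"worst in-sample error: {in_sample_error}")    # 0.0
print(f"polynomial forecast:   {curve_forecast:.1f}")  # 69.0
print(f"plain-average forecast: {mean_forecast:.1f}")  # 10.7
```

The curve reproduces every past point with zero error, yet its week-6 forecast is about 69 while the plain average says roughly 10.7. A perfect fit to the past, and a useless prediction.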
Re: (Score:2)
...which hasn't stopped anyone from using it - rationality is for the weak. We're wired for "eureka" moments - the curve fits so well, it MUST be right!
OTOH, technical analysis is also not a very good model of this, because the economy is not a good model of anything in the real world due to an exceptionally strong positive feedback loop between the model and the modeled. A successful technical analysis "method" (meaning it worked for someone, that's statistically probable no matter how stupid the method is) ma
Re: (Score:2)
It would seem to me that you underestimate how complex weather systems are.
A farmer who says "when this happened last, the weather did this next" is more likely to be right than the guy who tries to model atmospheric pressure changes in a chaotic system.
Sure, some people rely entirely on models, but good weather forecasting often involves both. How do we know what a strong north-west pressure system will do? The best answer is "what did it do last time", not "let's model it to death."
PS, this is also how me
Re: (Score:2)
Re:Big Models (Score:5, Informative)
Yes, it has warmed over the last 15 years, you moron.
Your statement has been shown false many, many times. Please stop.
Re: Big Models (Score:4, Informative)
He's not a moron, he's probably a Republican; they have figured out that if you constantly state lies as facts then many people will believe them. It is the second best thing in politics after money, which is why Republicans are currently having so much success at ruining America.
Re: (Score:2)
I'll respond to you and hope that your friends below manage to find it.
Re: (Score:2)
Sadly the only thing it proves is your (and Judith Curry's) lack of education on the subject of statistics [skepticalscience.com]. Note I am actually being generous here by assuming that Curry doesn't understand her mistakes.
Re: (Score:3)
Well, except for the warming climate https://www2.ucar.edu/climate/... [ucar.edu]
Re: (Score:2)
You'll notice in the graph at your link that the temperature trend is flat since about 2000. No warming. Thanks for proving my point.
Big picture here ...
http://www.ncdc.noaa.gov/sotc/... [noaa.gov]
Re: (Score:3)
You don't really get the scientific model, do you? You know, the one where you don't pick an outlier as a base, and then try to "prove" that a trend is occurring by picking another outlier point. The technical term for that kind of "research" would be nit-picking, and is generally frowned upon by real researchers. You know, the kind of people who actually know up from down, contrary to you.
Or maybe you just can't wrap your head around this whole thing called climate. I'll help you, climate is not weather. If
Re: (Score:3)
"The technical term for that kind of "research" would be nit-picking, and is generally frowned upon by real researchers. You know, the kind of people who actually know up from down, contrary to you."
Actually the term is cherry-picking. Nit-picking is focusing on trivial details.
Re: (Score:1)
http://www.ncdc.noaa.gov/sotc/... [noaa.gov]
Global Highlights
The combined average temperature over global land and ocean surfaces for January was the warmest since 2007 and the fourth warmest on record at 12.7°C (54.8°F), or 0.65°C (1.17°F) above the 20th century average of 12.0°C (53.6°F). The margin of error associated with this temperature is ± 0.08°C (± 0.14°F).
The global land temperature was the highest since 2007 and the fourth highest on record for January, at 1.17°C (2.11°F) above the 20th century average of 2.8°C (37.0°F). The margin of error is ± 0.18°C (± 0.32°F).
For the ocean, the January global sea surface temperature was 0.46°C (0.83°F) above the 20th century average of 15.8°C (60.5°F), the highest since 2010 and seventh highest on record for January. The margin of error is ± 0.04°C (± 0.07°F).
If I choose not to believe it, it cannot be true!
Tweak the Algorithms (Score:4, Funny)
Learn from nature! Google needs a genetic algorithm that modifies itself every flu season.
The fittest algorithm will survive to infect thousands.
The problem is the question, not the answer (Score:5, Interesting)
With big data, when you actively look for patterns you always find them; this is how hedge funds have been operating for years. The purpose of the technology is not to make predictions, but rather to confirm existing trends and possibly identify new ones.
The proper way to utilize big data in this case would be:
1) to assist the CDC in confirming or refuting trends observed in the field
2) to offer additional correlations (such as: are people living closer to highways more sensitive to specific strains of flu)
3) to provide long-term indicators facilitating the assessment of medication and other flu containment factors
Big data is not a magic eight ball but it's not a piece of shit either.
Google Flu got it wrong? (Score:1)
...well, so pretty much have all the FUD-spreaders in the CDC, government, and NGOs who've all been telling us that "any moment" we could get a "deadly flu" since the (ha ha ha) SARS "epidemic".
All I've ever gotten is the "Cry Wolf" heebie jeebies.
Re: (Score:2)
Isn't it one of those perpetual beta things? (Score:2)
Re: (Score:1)
Calm down, everyone (Score:3)
but one of its products has been making bogus data-driven predictions. A study has found that Google's much-hyped flu tracker has consistently overestimated flu cases in the US for years.
Bogus? Are you sure they weren't just... wrong?
It's a prediction.
Re: (Score:2)
Not bogus at all - I'm sure they really did make those predictions.
When a prediction changes behavior... (Score:3)
In addition to "all of the above", the other contribution is that of the philosophical equivalent of Heisenberg: the predictions of outbreaks may have increased vaccination usage in the areas involved, which of course will have an effect of downplaying the outbreaks in those areas.
Not saying I have any evidence for that (and I will wager it unlikely, considering the number of people who vaccinate is still far lower than it should be), but a correlation study may be interesting to see.
If the point of knowledge of a possible outcome is to act to deter it, then shouldn't the actions that attempt to deter it be taken into account?
Re: (Score:2)
It should, but only after Google News picks up reporting on it. Then the modelers can say how much impact reports of the prediction had.
Next year, no one may report on it other than in mockery, and you can't predict reporting that doesn't happen, so they can't start off with reporting taken into account.
Influenza Vaccine (Score:1)
Huh, Flu, who knew? (Score:1)
So does this mean that.... (Score:2)
To be completely fair (Score:1)
Wrong measure (Score:2)
The headline is that the prediction was overestimating three times in the past three years. So what?
Google's Flu Trends plots don't have uncertainties on them, so they'll never be exactly right. So they either have to be overestimates or underestimates. In any three years, you are going to get at least *two* under- or overestimates. So post-hoc, saying "ZOMG! There's three overestimates in three years!! #EPICFAIL LOL!" isn't very meaningful.
Until Big Data People understand statistical uncertainty and are hap
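A quick sanity check of the counting argument above, as a sketch in Python. It assumes each year's estimate is independently and equally likely to land over or under the true figure. That coin-flip model is an assumption made purely for illustration, not something known about Flu Trends, and `share` is a hypothetical helper name.

```python
from itertools import product

# All 8 equally likely over/under patterns for three years (coin-flip assumption).
outcomes = list(product(["over", "under"], repeat=3))


def share(predicate):
    """Fraction of the equally likely outcomes that satisfy the predicate."""
    return sum(1 for o in outcomes if predicate(o)) / len(outcomes)


# At least two of the three years miss in the same direction (always true).
p_two_same = share(lambda o: max(o.count("over"), o.count("under")) >= 2)
# All three years miss in the same direction, whichever it is.
p_all_same = share(lambda o: len(set(o)) == 1)
# All three years are overestimates specifically.
p_all_over = share(lambda o: o.count("over") == 3)

print(p_two_same)  # 1.0
print(p_all_same)  # 0.25
print(p_all_over)  # 0.125
```

Under that coin-flip assumption, three overestimates in a row turn up one time in eight by luck alone, so the streak by itself says little without error bars on the estimates.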
Maybe they based it on how many claim to be sick? (Score:2)
Put Nate Silver on this (Score:2)
"Flu virus predicted to take US Congress in 2014 with 96.34% certainty."