The discovery of an Excel spreadsheet error in an important economic study seems to have shocked the data science community. First, we report a number of commentaries from various economic and non-economic bloggers. Our view of the whole fiasco is added at the bottom.
Fernando Perez (h/t: CTB, Lex) -
Victoria Stodden (h/t: CTB, Lex) -
Reinhart and Rogoff -
Bruce Krasting at Zerohedge -
Acting Man -
Philip Pilkington -
The community reaction to the Excel error, and especially the responses from ‘data scientists’, shows what kind of garbage is being sold as science these days. The ‘data science’ approach is to feed a bunch of numbers to a supposedly sophisticated computer program, and then trust the program to give us insight into a natural process (or, here, an economic system). Even after it was shown that minor changes to the Excel spreadsheet could dramatically alter Reinhart and Rogoff’s conclusions regarding social austerity, nobody asked whether the entire study was meaningful in the first place. Does it really make sense to take some numbers from 20 or so small countries spanning 200 years and use them to tell the people of a different country to change their behavior? Is money borrowed to build the Golden Gate Bridge equivalent to money borrowed to bail out Goldman Sachs? Are the ‘GDP’ numbers reported by various governments comparable enough that we can talk about their means or medians? For all we know, US GDP went up by 500 billion yesterday because the government decided to do the calculation differently.
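To see how fragile a spreadsheet-style average can be, here is a minimal Python sketch. The numbers are entirely made up for illustration and are not taken from Reinhart and Rogoff's data; the only point is that an averaging range that stops a couple of rows short can flip the sign of both the mean and the median.

```python
from statistics import mean, median

# Hypothetical growth rates (percent), one per country.
# These values are invented for illustration only.
growth = [-2.0, -1.5, 0.5, 1.0, -0.5, 3.0, 2.5]

# The intended calculation uses every row.
full_mean = mean(growth)
full_median = median(growth)

# A range that stops two rows short, as a mis-dragged
# spreadsheet selection might.
truncated = growth[:5]
truncated_mean = mean(truncated)
truncated_median = median(truncated)

print(f"all rows:    mean={full_mean:.2f}, median={full_median:.2f}")
print(f"first five:  mean={truncated_mean:.2f}, median={truncated_median:.2f}")
```

With these invented numbers the full sample averages out positive while the truncated one averages out negative. Note that fixing the range does nothing to answer whether the rows were comparable in the first place.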
We saw a similar failure to ask critical questions during the US housing bubble, when every ‘data scientist’ agreed, based on extremely sophisticated models, that the banks would not fail. The models did not allow for house prices ever going down, because in the benchmark data they used (1940-2007), house prices never fell in aggregate. As it turned out, two comedians asked more meaningful questions than the thousands of ‘data scientists’ hired by investment banks and rating agencies.