Reinhart and Rogoff-Gate Reaction Shows Failure of 'Data Science'

The discovery of an Excel spreadsheet error in an influential economic study seems to have shocked the data science community. First, we list a number of commentaries from various economic and non-economic bloggers. Our view of the whole fiasco follows at the bottom.

Fernando Perez (h/t: CTB, Lex) -

“Literate computing” and computational reproducibility: IPython in the age of data-driven journalism

Victoria Stodden (h/t: CTB, Lex) -

What the Reinhart & Rogoff Debacle Really Shows: Verifying Empirical Results Needs to be Routine

Reinhart and Rogoff -

Reinhart-Rogoff Response to Critique

Bruce Krasting at Zerohedge -

“Econogate” and Japan

Mish -

Excel Spreadsheets, Krugman, and a Question of Logic

Acting Man -

The Reinhart Rogoff Study Controversy

Nakedcapitalism -

Twisted Tale of Bad Math and Hubris: Global Austerity Based on a Spreadsheet Error

Philip Pilkington -

Philip Pilkington: Defrocking Reinhart and Rogoff Controversy Ignores Fundamental Issues in the Use and Abuse of Statistical Studies

----------

The community reaction to the Excel error, and especially the responses from ‘data scientists’, shows what kind of garbage is being sold as science these days. The ‘data science’ approach is to feed a bunch of numbers to a supposedly sophisticated computer program, and then trust the program to give us insight into a natural process (or, here, an economic system). Even after it was shown that minor changes to the Excel spreadsheet could dramatically alter Reinhart and Rogoff’s conclusions regarding social austerity, nobody asked whether the entire study was meaningful in the first place. Does it really make sense to take some numbers from 20 or so small countries spanning over 200 years to tell the people of a different country to change their behavior? Is money borrowed to build the Golden Gate Bridge equivalent to money borrowed to bail out Goldman Sachs? Are the ‘GDP’ numbers reported by various governments comparable, so that we can talk about their means or medians? For all we know, US GDP went up by 500 billion yesterday because the government decided to do the calculation differently.

US GDP Will Be Revised Higher By $500 Billion Following Addition Of “Intangibles” To Economy

We saw a similar failure to ask critical questions during the US housing bubble, when every ‘data scientist’ agreed, based on extremely sophisticated models, that the banks would not fail. The models did not allow for house prices ever going down, because in the benchmark data they used (1940-2007), house prices never fell in aggregate. As it turned out, two comedians asked more meaningful questions than the thousands of ‘data scientists’ hired by investment banks and rating agencies.
