Almas Heshmati, a professor of economics at Jönköping University in Sweden, used Excel’s autofill function to mend the data for one of his studies. He had marked anywhere from two to four observations before or after the missing values and dragged the selected cells down or up, depending on the case. The program then filled in the blanks. If the new numbers turned negative, Heshmati replaced them with the last positive value Excel had spit out.
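To make the procedure concrete, here is a minimal Python sketch of what that kind of fill amounts to. It assumes that dragging a selection of several numbers makes Excel continue them along a fitted straight line, which is how its linear fill generally behaves. The function names are hypothetical, the numbers are invented for illustration and are not from the study, and falling back to the last observed value when the very first filled number is negative is an assumption the article does not cover.

```python
def linear_trend_fill(known, n_missing):
    """Extend `known` by `n_missing` values along a least-squares line.

    Assumption: this approximates what a spreadsheet fill handle does
    when several numeric cells are selected and dragged.
    """
    n = len(known)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(known) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, known)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    # Positions n, n+1, ... are the "blank" cells being filled.
    return [intercept + slope * (n + i) for i in range(n_missing)]


def replace_negatives(filled, last_observed):
    """Swap each negative filled value for the most recent positive one.

    Assumption: if no positive filled value exists yet, fall back to the
    last real observation.
    """
    result = []
    last_positive = last_observed
    for value in filled:
        if value < 0:
            result.append(last_positive)
        else:
            result.append(value)
            if value > 0:
                last_positive = value
    return result


# Three made-up observations on a downward trend, three blanks to fill.
observed = [10.0, 7.0, 4.0]
filled = linear_trend_fill(observed, 3)
patched = replace_negatives(filled, observed[-1])
print(filled)   # [1.0, -2.0, -5.0]
print(patched)  # [1.0, 1.0, 1.0]
```

Note what happens in the toy run: three "observations" in the output are one extrapolated number repeated, and none of them were measured.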
But Heshmati’s data also showed that in several instances where there were no observations to use for the autofill operation, the professor had taken the values from an adjacent country in the spreadsheet. New Zealand’s data had been copied from the Netherlands, for example, and the United States’ data from the United Kingdom.
Replacing missing observations with substitute values – an operation known in statistics as imputation – is a common but controversial technique in economics that allows certain types of analyses to be carried out on incomplete data. Researchers have established methods for the practice; each comes with its own drawbacks that affect how the results are interpreted.
There is no evidence that Excel’s autofill function is among these methods, especially not when applied in a haphazard way without clear justification.
From the immortal Journal of Irreproducible Results, “The Data Enrichment Method”: “. . .its principal shortcoming is that before the enrichment process can be started, some data must be collected. It is quite true that a great deal is done with very little information, but this should not blind one to the fact that the method still embodies the ‘raw-data flaw’. The ultimate objective, complete freedom from the inconvenience and embarrassment of experimental results, still lies unattained before us.”