The Contest to identify trend-driven time series has now closed. Some related remarks follow.

The Contest closed at the end of 30 November 2016, after running for over a year. No winning entry was received. There were 34 entries. Almost all entries were submitted by professional researchers in fields such as climatology, computer science, econometrics, and physics.

After the Contest was announced, the Contest time series (1000 series) were analyzed by the statistician Andrew Gelman. Gelman’s analysis is described in a post on his blog. The main part of the analysis is valid. The analysis, though, does not consider aspects of time series. That is, the analysis effectively shows what can be done by a person who has studied some statistics, but who has no training in time series. The analysis concludes that such a person should expect to correctly identify 854 ± 10 of the 1000 series. (Note that identifying 900 series is required to win the Contest.)

Simply put, correctly identifying up to roughly 865 series can reasonably be done without using specialist techniques from the study of time series. Despite that, every entry to the Contest correctly identified fewer than 865 series. Thus, none of the contestants demonstrated any skill with time series. That occurred even though some of the contestants have substantial professional experience analyzing time series. Such contestants, seemingly, based their analyses on assumptions that were invalid (e.g. the assumption that the process that generated the time series was ARIMA).

Some people correctly surmised that generation of the 1000 series involved an integrated process, and then questioned that. For remarks on this, see the essay “Integrated climatic time series”.

Lovejoy et al. [2016]
After the Contest was announced, I emailed over a hundred global-warming researchers, to tell those researchers about the Contest, and to offer a waiver on the $10 entry fee. One of those researchers is a professor at McGill University, in Canada: Shaun Lovejoy. Lovejoy has published many research papers that deal with the statistical analysis of climatic time series. Those papers present a statistical method that, Lovejoy claims, gives accurate and reliable conclusions about time series.

After I emailed him, Lovejoy applied his statistical method to the Contest time series. Lovejoy then submitted an entry to the Contest. Later on, Lovejoy and some of his colleagues published a peer-reviewed research paper about the calculations that had been done for the Contest. A reference for the paper is below.

Lovejoy S., del Rio Amador L., Hébert R., de Lima I. (2016),
“Giant natural fluctuation models and anthropogenic warming”,
Geophysical Research Letters, 43,
doi: 10.1002/2016GL070428.

Lovejoy et al. assert that, using their statistical method, they can correctly identify 893 ± 9 of the 1000 series—almost enough to confidently win the Contest.

Lovejoy’s Contest entry, however, correctly identifies only 860 of the 1000 series. Thus, the entry falsifies the assertion of Lovejoy et al. about identifying 893 ± 9 series. Worse, the entry is not really better than what can be easily obtained without any time series analysis. Hence, the statistical method of Lovejoy has negligible value here.
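
The comparison can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: it assumes that each “±” denotes one standard deviation, which the text does not state, and the function name `deviations_from` is hypothetical. It measures how far the entry's score of 860 lies from the 893 ± 9 asserted by Lovejoy et al., and from the 854 ± 10 in Gelman's analysis.

```python
# Figures quoted in the text. Reading "±" as one standard deviation
# is an assumption made for this illustration.
ENTRY_SCORE = 860      # series correctly identified by Lovejoy's entry
WIN_THRESHOLD = 900    # series required to win the Contest
CLAIMED = (893, 9)     # Lovejoy et al.: (mean, spread)
BASELINE = (854, 10)   # Gelman's analysis: (mean, spread)


def deviations_from(score, estimate):
    """Return how many spreads `score` lies from the estimate's mean."""
    mean, spread = estimate
    return (score - mean) / spread


z_claimed = deviations_from(ENTRY_SCORE, CLAIMED)
z_baseline = deviations_from(ENTRY_SCORE, BASELINE)
shortfall = WIN_THRESHOLD - ENTRY_SCORE

print(f"vs. claimed 893 ± 9:   {z_claimed:+.2f} spreads")
print(f"vs. baseline 854 ± 10: {z_baseline:+.2f} spreads")
print(f"short of the winning threshold by {shortfall} series")
```

Under that assumption, the entry falls well over three spreads below the asserted figure, yet within one spread of the baseline that uses no time series analysis.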

The paper of Lovejoy et al. has other technical problems as well. One such problem will be described here.

The 1000 time series used in the Contest were generated via a statistical model. Lovejoy et al. criticize that model. The criticism is that the model implies a typical temperature change of 4 °C every 6400 years—which is too large to be realistic.

Another model was used by the IPCC, for its statistical analyses. The IPCC model implies a temperature change of about 41 °C every 6400 years. Thus, the IPCC model is far more unrealistic than the Contest model, according to the test advocated by Lovejoy et al. Hence, if the test advocated by Lovejoy et al. were adopted, then the statistical analyses done by the IPCC would be untenable. Demonstrating that the IPCC statistical analyses are untenable, though, was the purpose of the Contest. Ergo, if the test advocated by Lovejoy et al. were adopted, then the purpose of the Contest would be fulfilled.

Moreover, the Contest model was never asserted to be realistic. Indeed, it is a common aphorism in statistics that “all models are wrong”. In other words, when we consider any statistical model, we will find something wrong with the model. Thus, when considering a model, the question is not whether the model is wrong—because the model is certain to be wrong. Rather, the question is whether the model is useful, for a particular application. This is a fundamental issue that is taught to undergraduates in statistics.

An additional problem is that some of the claims made in the paper of Lovejoy et al. are not only false, they also, I suspect, might libel me. I found out about those claims prior to the paper’s publication. The journal in which the paper was to appear is published by the American Geophysical Union. Hence, I sent an email, protesting the paper’s false claims, to AGU’s President (Margaret Leinen), General Secretary (Louise Pellerin), and CEO (Christine McEntee). Six days later, AGU published the paper anyway. Ergo, AGU published a paper that it demonstrably knew contained material falsehoods. Such publication was presumably done in part because the paper supports AGU’s alarmist position on global warming.

Global temperatures have increased since 1880. Many people have claimed that the increase can be shown, statistically, to be more than just random noise. Such claims are wrong, as the Contest has effectively demonstrated. From the perspective of statistics, the increase in temperatures might well be random natural variation.

See also—“Statistical Analyses of Surface Temperatures in the IPCC Fifth Assessment Report”.

Acknowledgements. I thank Andrew Montford, for advice in developing the rules of the Contest. I am also grateful to Mike Haseler, who presented strong evidence for a flaw in the PRNG that was originally used to generate the series (for details, see his blog post “The Doug Keenan Challenge”); he also appears to be the only person who appreciated a particular aspect of the generated series—namely that the polynomially decaying autocorrelations are accurate.

Douglas J. Keenan