As a follow-up to Richard's excellent post from last week, I decided to re-calculate the daily anomalies based upon the "new" NCDC formula that he and I discussed via email for generating daily standard deviations. The following charts for Fairbanks and Anchorage show the results of that reanalysis.
Here is how to interpret the charts - from bottom to top.
1) The red line shows the annual percentage of days on which the daily temperature anomaly was more than three standard deviations either above or below the mean; e.g., +3.2, -3.5, etc. Since it uses percentages, 2013 can be compared to other years on equal footing. Also note that the categories are non-overlapping.
2) The dark blue line shows the percentage of days where the daily temperature anomaly was between 2 and 3 standard deviations either above or below the mean; e.g., +2.1, +2.7, -2.4, etc. The dashed line is the expected value based upon the normal distribution.
3) The solid burgundy line shows the percentage of days where the daily temperature anomaly was between 1 and 2 standard deviations either above or below the mean; e.g., +1.5, +1.2, -1.6, etc. The dashed line is the expected value based upon the normal distribution.
4) The solid green line shows the percentage of days where the daily temperature anomaly was between 0 and 1 standard deviations either above or below the mean; e.g., +0.2, -0.9, -0.7, etc. The dashed line is the expected value based upon the normal distribution.
5) The solid orange line at the top shows the Chi-Squared goodness-of-fit value. It is a squared, weighted measure of the difference between the actual and expected values and is the gold standard for categorized (grouped) data. A value of zero indicates that the distribution of anomalies exactly fits a normal distribution. Any Chi-Squared value less than 6.0 (with 2 degrees of freedom) indicates that the year's temperatures approximated a normal distribution at the 95 percent confidence level. The larger the value, the more anomalous (less normal) the distribution is. It can also be thought of as a measure of distribution extremes. By this metric, Fairbanks in 2013 (through August 7th) has had the most extreme temperatures of any year on record (post-1930) by a large margin. In fact, if the rest of 2013 is exactly normally distributed, it will still be the most extreme year on record. For Anchorage, 2013 is also in 1st place by a wide margin.
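For readers who want to reproduce the dashed "expected" lines and the Chi-Squared value, here is a minimal Python sketch. It is not the spreadsheet behind the charts; the function names and the day counts are made up for illustration, and it assumes the four non-overlapping categories described above (0 to 1, 1 to 2, 2 to 3, and more than 3 standard deviations, warm and cold combined).

```python
import math

def normal_cdf(z):
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_fractions():
    """Expected share of days with |anomaly| in 0-1, 1-2, 2-3, and 3+ SD."""
    edges = [0.0, 1.0, 2.0, 3.0, math.inf]
    # Factor of 2 because each category counts both warm and cold anomalies.
    return [2.0 * (normal_cdf(hi) - normal_cdf(lo))
            for lo, hi in zip(edges, edges[1:])]

def chi_squared(observed_counts):
    """Sum of (observed - expected)^2 / expected over the four categories."""
    total_days = sum(observed_counts)
    expected_counts = [f * total_days for f in expected_fractions()]
    return sum((o - e) ** 2 / e
               for o, e in zip(observed_counts, expected_counts))

# 95% critical value for 2 degrees of freedom; for 2 df the chi-squared
# CDF is 1 - exp(-x / 2), so the threshold has a closed form (about 5.99).
CRITICAL_95_2DF = -2.0 * math.log(0.05)

# Made-up day counts for illustration only (not station data).
hypothetical_counts = [120, 70, 25, 4]
statistic = chi_squared(hypothetical_counts)
print([round(100 * f, 1) for f in expected_fractions()])
print(round(statistic, 1), statistic > CRITICAL_95_2DF)
```

The first printed list gives the expected percentages for the four categories, and the second line shows whether the made-up year exceeds the roughly 6.0 threshold mentioned above.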
The graphic below shows the calculation for Fairbanks in 2013.
It is worth noting that if additional categories are used to distinguish positive and negative anomalies, 2013 is still easily in first place overall for Fairbanks, but not by as much. If the rest of the year were "normal," 2013 would rank only 10th. 1993 is a good year to look at. That year, 11% of days were between -1 and -2 SD and 17% of days were between +1 and +2 SD. When you look at those as separate categories, they each differ fairly significantly from the expected value of 13.6%. However, when they are combined, the total (28%) matches almost perfectly with the expected value (27.2%). Of course, that knife cuts both ways.
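The 1993 example can be sketched in a few lines of Python. This is only an illustration: it uses the rounded 11% and 17% figures quoted above, assumes a 365-day year, and the helper name chi_term is hypothetical. It compares the Chi-Squared contribution of the 1 to 2 SD band when the cold and warm sides are kept separate versus combined.

```python
import math

def normal_cdf(z):
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chi_term(observed_pct, expected_pct, total_days=365):
    """One category's (observed - expected)^2 / expected, in day counts."""
    observed = observed_pct / 100.0 * total_days
    expected = expected_pct / 100.0 * total_days
    return (observed - expected) ** 2 / expected

# Expected share of days between 1 and 2 SD on ONE side of the mean.
one_sided = 100.0 * (normal_cdf(2.0) - normal_cdf(1.0))  # about 13.6%

# Cold side (11%) and warm side (17%) treated as separate categories.
separate = chi_term(11.0, one_sided) + chi_term(17.0, one_sided)

# Both sides merged into a single 1 to 2 SD category (28% vs. about 27.2%).
combined = chi_term(11.0 + 17.0, 2.0 * one_sided)

print(round(one_sided, 1), round(separate, 2), round(combined, 2))
```

The separate-category contribution comes out far larger than the combined one, which is why splitting by sign changes the rankings.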
Here is a good explanation of Chi-Squared: http://www.stat.yale.edu/Courses/1997-98/101/chigf.htm
Here is a good site to manually enter numbers and check the Chi-Squared value: http://www.quantpsy.org/chisq/chisq.htm