For some time I've been meaning to take a look at the long-term performance of the National Weather Service temperature forecasts for Fairbanks, with one question in particular in mind: do the forecasts show enough variance at the end of the short-term forecast period, i.e. 5-7 days in the future?
The question is motivated by the idea that sometimes the computer models indicate a pronounced temperature anomaly from about a week in advance, but the early NWS forecasts for the same time show only a small departure from normal. A recent example was seen in the early October cold spell, when the ECMWF and GFS deterministic forecasts of September 29 both showed a notable cold anomaly in place by October 5, but the NWS forecast for the high temperature on October 5 was 38 °F, only 3.6 °F below normal. In this case, as time went on and the forecast became more certain, the forecast dropped and the observed high temperature was 31 °F. However, there are many cases when the computer forecasts are badly wrong from 7 days out, and so it is entirely justifiable for the official forecast to show only a small anomaly at longer lead times. Indeed, it would be most undesirable for the raw model forecast to be reflected in the official outlook, because the numbers would often swing wildly from day to day. The question is, does the NWS have the right balance?
It's possible to answer this question using a history of NWS forecasts that I have collected for Fairbanks airport since November 2011. First, here is the basic "skill" of the forecasts for lead times of 1-6 days, i.e. the forecasts for "tomorrow" through "6 days from now". Averaged over all seasons, the average errors of the high and low temperature forecasts are similar, rising from just over 4 °F at day 1 to nearly 8 °F at day 6. Not surprisingly, the errors are much larger in winter, but it is interesting to see that the winter low temperature forecasts improve more significantly at shorter lead times, whereas the winter high temperature forecast error remains over 7 °F even for "tomorrow".
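For readers who want to reproduce this kind of skill summary, here is a minimal sketch. It assumes that the "average error" is the mean absolute error, and that the forecast history is available as simple (lead time, forecast, observed) tuples. The function name and the sample numbers are made up for illustration and are not taken from the actual Fairbanks data set.

```python
from collections import defaultdict


def mae_by_lead_time(records):
    """Return {lead_days: mean absolute error in deg F}.

    records: iterable of (lead_days, forecast_f, observed_f) tuples.
    This layout is an assumption made for the sketch.
    """
    abs_errors = defaultdict(list)
    for lead_days, forecast_f, observed_f in records:
        abs_errors[lead_days].append(abs(forecast_f - observed_f))
    # Average the absolute errors separately for each lead time.
    return {lead: sum(errs) / len(errs)
            for lead, errs in sorted(abs_errors.items())}


# Made-up sample values, purely illustrative.
sample = [
    (1, 30.0, 33.0),
    (1, 12.0, 10.0),
    (6, 38.0, 31.0),
    (6, 20.0, 29.0),
]
result = mae_by_lead_time(sample)
print(result)  # {1: 2.5, 6: 8.0}
```

Splitting the records by season before calling the function would give the seasonal breakdown described above.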
Here's a similarly-formatted chart showing the bias of the forecasts, i.e. the mean difference between the forecast and the observed temperatures. Negative values indicate that the forecasts were too cold on average. We see that the winter high temperature forecasts have been several degrees too cold on average in the past 3 years, even at shorter lead times, but the bias is much smaller for the low temperatures. It would be interesting to investigate this further in search of a possible explanation.
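The bias calculation is simple enough to sketch in a few lines. This is only an illustration of the definition given above (mean of forecast minus observed); the function name and the sample values are hypothetical.

```python
def mean_bias(forecasts, observations):
    """Mean of (forecast - observed); negative means forecasts ran too cold."""
    if len(forecasts) != len(observations):
        raise ValueError("forecasts and observations must have the same length")
    diffs = [f - o for f, o in zip(forecasts, observations)]
    return sum(diffs) / len(diffs)


# Made-up winter high temperatures in deg F, purely illustrative.
forecast_highs = [10.0, 5.0, -2.0]
observed_highs = [14.0, 6.0, 1.0]
bias = mean_bias(forecast_highs, observed_highs)
print(round(bias, 2))  # -2.67, i.e. the forecasts were too cold on average
```

Unlike the mean absolute error, the bias lets errors of opposite sign cancel, so a set of forecasts can have a near-zero bias and still have a large average error.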
Let's now consider the scaling of the temperature forecasts. I've examined this by calculating the mean absolute error (MAE) that would result if the NWS forecast anomaly (departure from normal) were multiplied by values ranging from 0 to 2. A multiplier of 1 leaves the NWS forecast unchanged. On the low end of this range, the forecasts would deviate very little from climatology, and at a multiplier of exactly 0 the forecast would just show normal values each day; on the high end, the forecasts would show greater deviations from normal than they currently do. The chart below shows the results of this experiment for day 7 temperature forecasts from all seasons of the year.
The data from the last 3 years show that (on average through the year) the high temperature forecasts are perfectly scaled at day 7, i.e. there is no way to improve the MAE by arbitrarily reducing or increasing the forecast anomaly. We conclude that the NWS shows just the right amount of variance on average in the day 7 high temperature forecasts; this is not to say that we can't improve on any given forecast using additional information, but we can't reduce the error by simply adjusting the departure from normal across the board.
The day 7 low temperature forecasts are not quite optimally scaled, according to these results, as the NWS shows marginally too much variance. In other words, the forecasts would be very slightly better if they showed smaller departures from normal.
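The scaling experiment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure used for the charts: the function names, the step size of 0.1 between multipliers, and the sample numbers are all hypothetical. The sample data are deliberately constructed so that the forecast anomalies are exactly half the observed anomalies, which makes the best multiplier easy to verify by hand.

```python
def scaled_mae(forecasts, observations, normals, factor):
    """MAE after multiplying each forecast anomaly (forecast - normal) by factor."""
    total = 0.0
    for f, o, n in zip(forecasts, observations, normals):
        scaled_forecast = n + factor * (f - n)
        total += abs(scaled_forecast - o)
    return total / len(forecasts)


def scan_factors(forecasts, observations, normals, steps=20, max_factor=2.0):
    """Evaluate the MAE for evenly spaced multipliers from 0 to max_factor."""
    factors = [max_factor * i / steps for i in range(steps + 1)]
    return [(k, scaled_mae(forecasts, observations, normals, k)) for k in factors]


# Made-up values in deg F, purely illustrative.
normals = [40.0, 40.0, 40.0, 40.0]
forecasts = [36.0, 44.0, 38.0, 42.0]     # anomalies of -4, +4, -2, +2
observations = [32.0, 48.0, 36.0, 44.0]  # anomalies of -8, +8, -4, +4

results = scan_factors(forecasts, observations, normals)
best_factor, best_mae = min(results, key=lambda pair: pair[1])

print(scaled_mae(forecasts, observations, normals, 0.0))  # 6.0 (climatology only)
print(scaled_mae(forecasts, observations, normals, 1.0))  # 3.0 (forecast as issued)
print(best_factor, best_mae)                              # 2.0 0.0
```

With real data the curve of MAE against multiplier would not reach zero; the point of interest is where its minimum falls. A minimum at 1 corresponds to the "perfectly scaled" result for the day 7 high temperatures, and a minimum slightly below 1 corresponds to the result for the low temperatures.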
There is one other aspect of the problem that interests me, and that is whether we can show that the forecast variance is too small specifically when the computer models show a large anomaly (as opposed to an anomaly of any size) and/or when the computer models agree with each other. I'll return to this idea in a subsequent post.