Saturday, February 15, 2020

Alaska Climate Divisions

Last week a UAF news article highlighted the value of the Alaska climate division analysis that was developed a few years ago by Peter Bieniek and others from UAF and other universities, along with NOAA collaborators such as Rick Thoman.  NOAA has long used so-called climate divisions in the lower 48 to keep track of climate variations in climatically similar regions, but nothing comparable was available for Alaska until this work by Peter et al.

The journal article describing the new climate division work was published way back in 2012, but as the UAF news article explains, it took a few years for NOAA to adopt the divisions for "official" monitoring.

I'm a big fan of the Alaska climate divisions, but one of the potential shortcomings is the relative scarcity of ground-truth station data; only 42 sites (including some Canadian) were used to determine 13 climatically similar regions, and some divisions had far more sites than others.  The Northeast Interior division, for example, contains only one station (Fort Yukon), and the North Slope division has only one non-coastal site (Umiat).  Such is the world of historical Alaska climate analysis.

For reference, here are the Alaska climate divisions:

After reading the UAF news piece, I started wondering if modern reanalysis data would produce similar climate divisions to the Bieniek results.  To address this, I used monthly mean temperature data from the ERA5-Land reanalysis, now available from 1981 through most of 2019.  ERA5-Land is a higher-resolution version of ERA5 (9 km vs 31 km grid spacing) that models only surface variables such as 2 m temperature, 10 m wind, humidity, snow cover, and so on; it does not deal with oceans or the atmosphere aloft.  I'm hopeful that ERA5-Land may be an improvement over ERA5 for Alaska in winter (see this post from a few weeks ago), although I haven't investigated this yet.

Regardless of the possible deficiencies of ERA5-Land, it's interesting to see what the climate division analysis produces.  I ran cluster analysis on the standardized gridded monthly mean temperature anomalies from 1981-2018, and the following maps show the results for 3 to 10 clusters, based on two alternative methods.  Bieniek et al. tested these two methods and a third, but they focused on results from Ward's method (right column below).
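As a rough illustration of the preprocessing step, here is a minimal sketch of how standardized monthly anomalies might be computed before clustering.  The array layout, the function name, and the synthetic data are all assumptions for illustration, not the code used for the maps; real ERA5-Land data would first have to be read from file.

```python
import numpy as np

def standardized_anomalies(temps):
    """Convert monthly mean temperatures to standardized anomalies.

    temps: array of shape (n_years, 12, n_points), a hypothetical layout.
    Each calendar month at each grid point is centered on its own
    multi-year mean and scaled by its own standard deviation.
    Returns an array of shape (n_years * 12, n_points).
    """
    clim_mean = temps.mean(axis=0, keepdims=True)
    clim_std = temps.std(axis=0, ddof=1, keepdims=True)
    anoms = (temps - clim_mean) / clim_std
    return anoms.reshape(-1, temps.shape[2])

# Synthetic stand-in for gridded data: 38 years (1981-2018), 50 grid points.
rng = np.random.default_rng(0)
n_years, n_points = 38, 50
seasonal_cycle = 10.0 * np.cos(2 * np.pi * np.arange(12) / 12)
temps = seasonal_cycle[None, :, None] + rng.normal(0.0, 3.0, (n_years, 12, n_points))

anoms = standardized_anomalies(temps)
# Clustering treats each grid point as one sample whose features are
# its anomaly time series, so transpose to (n_points, n_months).
features = anoms.T
print(features.shape)
```

Standardizing by calendar month removes the seasonal cycle and puts every grid point on the same scale, so the clustering responds to how the points vary together rather than to how cold or variable each one is.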

K-means method (left column) | Ward's method (right column)

There are a number of interesting aspects to the results.  First, the K-means cluster boundaries tend to jump around somewhat, because the method starts from a random initial guess each time and iterates to a solution.  For this reason it is also not 100% reproducible, i.e. you can get different results when you run it again (unless the random seed is fixed).  In contrast, a hierarchical method like Ward's is reproducible, and the boundaries don't move around as the clusters are progressively sub-divided.
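The contrast between the two methods can be sketched on synthetic data.  This is only an illustration under stated assumptions: the small `kmeans` function is a bare-bones Lloyd's algorithm written for this example (not necessarily the variant used for the maps), the `is_nested` helper and the synthetic `features` array are hypothetical, and Ward's method comes from SciPy's `linkage` and `fcluster`.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def kmeans(X, k, seed, n_iter=100):
    """Plain Lloyd's algorithm; the seed controls the random starting centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist2.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def is_nested(fine, coarse):
    """True if every cluster in `fine` sits inside a single cluster of `coarse`."""
    return all(len(np.unique(coarse[fine == c])) == 1 for c in np.unique(fine))

# Synthetic features: 60 "grid points" x 120 "months", built from 4 groups.
rng = np.random.default_rng(42)
group_means = rng.normal(0.0, 1.0, size=(4, 120))
features = np.vstack([m + rng.normal(0.0, 0.8, size=(15, 120)) for m in group_means])

# Ward's method: build the tree once, then cut it at different cluster counts.
tree = linkage(features, method="ward")
ward = {k: fcluster(tree, t=k, criterion="maxclust") for k in range(3, 8)}
print(all(is_nested(ward[k + 1], ward[k]) for k in range(3, 7)))

# K-means: same seed repeats exactly; different seeds are not guaranteed to.
print(np.array_equal(kmeans(features, 6, seed=1), kmeans(features, 6, seed=1)))
print(np.array_equal(kmeans(features, 6, seed=1), kmeans(features, 6, seed=2)))
```

Because every Ward solution is a cut through the same tree, going from k to k+1 clusters only splits one existing cluster, which is why those boundaries stay put.  K-means solves a fresh problem for each k and each starting guess.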

Despite the differences in the results, certain features are similar: the North Slope division emerges quickly and remains very well-defined throughout; a Panhandle division emerges at k=6 for both methods; and the clusters are really quite similar for k=5,6,7, and 9.
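The comparison above is visual, but for anyone wanting to put a number on "quite similar", one option is a pair-counting score such as the Rand index.  The sketch below is an assumption about how that could be done, not something computed for this post; the function name and the toy label lists are hypothetical.

```python
import numpy as np

def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two partitions agree
    (both put the pair together, or both keep it apart)."""
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    upper = np.triu_indices(len(a), k=1)  # each unordered pair once
    return float(np.mean(same_a[upper] == same_b[upper]))

# Hypothetical labels for six grid points from two methods.
method_1 = [0, 0, 1, 1, 2, 2]
method_2 = [2, 2, 0, 0, 1, 1]   # same grouping, different label numbers
method_3 = [0, 0, 0, 1, 1, 1]   # a genuinely different grouping

print(rand_index(method_1, method_2))
print(rand_index(method_1, method_3))
```

A pair-based score is convenient here because cluster numbers are arbitrary: cluster 3 from K-means need not correspond to cluster 3 from Ward's method, and the score ignores the numbering entirely.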

Perhaps most interesting, in my view, is the absence of some of the distinctions that are found in the Bieniek results.  For example, even if we go all the way up to 15 clusters (see below), there is no sub-division within the Panhandle, whereas Bieniek has three Panhandle divisions and another for the Northeast Gulf.  Similarly, the ERA5-Land clusters give no separation between Aleutians and Northwest Gulf (e.g. Kodiak Island).  As the number of clusters increases, the sub-dividing mostly takes place in the interior and eventually on the North Slope.

On the other hand, the ERA5-Land clusters quickly break apart the West Coast region, rather than keeping it together as Bieniek does.

I mention these differences out of curiosity, not to suggest that the Bieniek divisions are wrong.  It's very likely that ERA5-Land has certain deficiencies that would hamper the assessment of climate similarity - for instance, the reanalysis may be wholly inadequate in the very complex terrain of the Panhandle.  More investigation would be needed to see how well ERA5-Land reproduces climate in the vicinity of the stations used by Bieniek et al.

Lastly, it's not clear to me whether there is an optimal number of clusters based on the ERA5-Land analysis.  Traditionally one looks at how the within-cluster variance changes with the number of clusters and seeks a threshold beyond which (i.e. for smaller numbers of clusters) the variance starts to increase more quickly; but the results from ERA5-Land show no obvious stopping point.  Bieniek also found that using gridded data made it impossible to tell where to stop.
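For readers unfamiliar with this kind of check, here is a minimal sketch of tabulating within-cluster variance against the number of clusters.  It uses a within-cluster sum of squares on synthetic data with Ward's method from SciPy; the `within_cluster_ss` helper and the data are assumptions for illustration, and the exact statistic used in the ERA5-Land analysis or by Bieniek et al. may differ.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def within_cluster_ss(X, labels):
    """Sum of squared distances from each sample to its cluster mean."""
    total = 0.0
    for c in np.unique(labels):
        members = X[labels == c]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return total

# Synthetic features: 60 "grid points" x 120 "months", built from 4 groups.
rng = np.random.default_rng(7)
group_means = rng.normal(0.0, 1.0, size=(4, 120))
features = np.vstack([m + rng.normal(0.0, 0.8, size=(15, 120)) for m in group_means])

tree = linkage(features, method="ward")
ks = list(range(2, 11))
wcss = [
    within_cluster_ss(features, fcluster(tree, t=k, criterion="maxclust"))
    for k in ks
]

for k, w in zip(ks, wcss):
    print(f"k={k:2d}  within-cluster SS={w:10.1f}")
```

The synthetic data here are deliberately built from four well-separated groups, so a bend near k=4 would be expected.  Smoothly varying gridded fields have no such built-in grouping, which is consistent with the lack of an obvious stopping point.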

I like the look of the K-means solution with 9 divisions, but that's purely a personal preference.  I'd be glad to hear any comments from readers.
