One of the more exciting developments in the world of weather and climate science this year has been the release of new data sets by the European Union's Copernicus Climate Change Service. The program is funding the free and open distribution of vast quantities of data through the Climate Data Store, so there is almost unlimited scope for new research as well as commercial development using the data.
The data set that I'm most excited about is the latest generation of reanalysis from the European Centre for Medium-Range Weather Forecasts (ECMWF), which is well known for having the most accurate global weather forecast model in the short-to-medium-range time frame (out to two weeks in the future). I've often used reanalysis data from NOAA on this blog, and indeed NOAA's global reanalysis from 1948 to the present is heavily used worldwide and is extremely valuable. However, the NOAA reanalysis relies on a model that is now very out of date. Happily, the new ECMWF reanalysis - using the ECMWF's top-notch modeling capability - is now coming online via the Copernicus program; the data are currently available back to 2000, but next year we'll see the product extended back to 1950.
Here's an article about ECMWF's new ERA5 reanalysis:
Back in 2015 I did a brief comparison of NOAA's reanalysis data with real observations from Fairbanks; here's one of the figures, showing the very poor correlation of reanalysis to actual temperature and precipitation in summer.
The chart below is a similar figure using ERA5 data for the nearest grid point to Fairbanks, which happens to be located just to the south across the Tanana River (the grid spacing is about 20 km). The performance is impressive. Now admittedly the correlations ought to be very high for temperature, because the ECMWF model uses surface observations to refine its gridded estimates of evolving weather conditions hour by hour. However, precipitation is predicted by the model over short time intervals, so the model does not "know" how much precipitation occurred in reality; and neither ground-truth data nor radar estimates are used to improve the estimates. Given that the ERA5 precipitation data are purely a (short-range) forecast, I think it's very impressive that the monthly correlations are as high as ~0.8 in May through July, when hard-to-predict showers and thunderstorms produce most of the rain.
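For anyone who wants to try a comparison like this, here is a minimal sketch of the two steps involved: picking the grid point nearest a station, and correlating reanalysis against observations separately for each calendar month. It is not the code behind the chart. The function names, the record layout, and the 0.25-degree grid in the usage note are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_grid_point(lat, lon, grid_lats, grid_lons):
    """Return the (lat, lon) node of a regular grid closest to the target."""
    return min(
        ((glat, glon) for glat in grid_lats for glon in grid_lons),
        key=lambda node: haversine_km(lat, lon, node[0], node[1]),
    )


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Assumes both sequences have non-zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_by_month(records):
    """Correlate reanalysis with observations for each calendar month.

    records: iterable of (year, month, reanalysis_value, observed_value),
    a hypothetical layout with one monthly value per year.
    """
    groups = defaultdict(list)
    for _year, month, model, observed in records:
        groups[month].append((model, observed))
    return {
        month: pearson([m for m, _ in pairs], [o for _, o in pairs])
        for month, pairs in sorted(groups.items())
    }
```

With Fairbanks at roughly 64.84N, 147.72W and an assumed 0.25-degree grid, nearest_grid_point(64.84, -147.72, [64.5, 64.75, 65.0], [-148.0, -147.75, -147.5]) picks the node at 64.75N, 147.75W, a little south of town. Each calendar month gets its own correlation across the years, which is how a monthly value such as ~0.8 for June would be obtained.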
Here's a look at correlations of daily rather than monthly temperatures through the year. Daily low temperatures are generally more difficult to get right than high temperatures, because the warmest conditions of the day tend to be more closely tied to the more homogeneous, well-predicted temperatures of the free atmosphere above.
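A daily comparison needs daily highs and lows, which have to be derived from the hourly reanalysis values first. The sketch below shows one way that might be done. It assumes the timestamps have already been shifted to the station's observing day, and the function names and data layout are hypothetical rather than anything taken from the ERA5 tools.

```python
from collections import defaultdict
from datetime import datetime


def daily_extremes(hourly):
    """Reduce hourly temperatures to daily lows and highs.

    hourly: iterable of (datetime, temperature) pairs.
    Returns {date: (low, high)}. Timestamps are assumed to be expressed
    in whatever day definition the station observations use.
    """
    by_day = defaultdict(list)
    for stamp, temperature in hourly:
        by_day[stamp.date()].append(temperature)
    return {day: (min(temps), max(temps)) for day, temps in sorted(by_day.items())}


def paired_by_month(model_daily, observed_daily, index):
    """Pair model and observed daily values, grouped by calendar month.

    index 0 selects the daily low, index 1 the daily high.
    Days missing from the observations are skipped.
    """
    groups = defaultdict(list)
    for day, values in model_daily.items():
        if day in observed_daily:
            groups[day.month].append((values[index], observed_daily[day][index]))
    return dict(groups)
```

Each month's list of pairs can then be passed to any Pearson correlation routine, once for the lows and once for the highs, to build a month-by-month picture like the one in the chart.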
Finally, the wind speed estimates from the model are not as impressive; apparently the low-level wind regimes near Fairbanks are a challenge even for the world's best global modeling system.
In due course I will be acquiring a larger volume of the ERA5 data and will have a chance to do a more extensive analysis; and it would be fun to set up an online map catalog of ERA5 data for the Alaska domain. If anyone has an interest in helping out with such a project, let me know - perhaps there could be a collaboration.