Consolidation forecast archive

U.S. temperature and precipitation probability of exceedance forecasts

Forecast description

These forecasts are produced by a linear regression technique called 'Ensemble Regression'. Ensemble regression treats all members of a forecast ensemble as potential solutions to the problem. In a single-model ensemble, all members are considered equally likely to be the 'best'. In a multi-model ensemble, the members from the more skillful models are assumed to be more likely to occur. The ensemble regression procedure assumes that the conditional error distribution for the best member (that is, the expected errors between the closest solution and the observation) is about the same regardless of which model produced it. The procedure derives a least squares solution for the entire ensemble set by minimizing the expected errors between the best member and the observation.

File names and variable IDs

File names

There are two types of archive files: hindcasts and operational forecasts. A hindcast file contains a 'clean' set of retrospective forecasts. These forecasts are generally made in research mode from historical data. The “operational forecast” files contain the data issued in real time, and may include variations in procedure or even errors, if those forecasts were officially 'issued'. Hindcast datasets are identified by a data identifier (ID). Operational forecast datasets are identified by the element followed by a “.operational” suffix in the dataset name.

File format

Files for the 102 climate divisions are in ASCII format, laid out as a simple spreadsheet. Data are grouped by the order in which the forecasts were issued. Forecasts are typically for 102 forecast divisions based on the NCDC climate division data, and are for three-month seasons.

Column 1 = Year and month of the center month of the three-month season, in YYYYMM format.
Column 2 = Year that the forecast was issued.
Column 3 = Month that the forecast was issued. A forecast is typically issued around the 2nd week of the month, so a forecast labeled 1982 1 would have been issued around mid-January 1982.
Column 4 = Lead time, in months, between the latest data used for this forecast and the START of the valid time. A 1 in this column indicates a 1-month lead time. A forecast issued in January 1982 would typically be based on data through the end of December 1981, so a 1-month lead refers to the 3-month period that starts on February 1, 1982 (Jan + 1 = Feb) and extends through the end of April (FMA). The period is labeled by its center month (M = March 1982), hence Column 1 for this example would read 198203.
Column 5 = Forecast division for which this data is valid (see the CPC website).
Columns 6-18 = Probability of exceedance (PoE) values. Column 6 gives the value expected to be exceeded 98% of the time; the following columns give the values for progressively lower exceedance probabilities.
Column 6 = 98% PoE
Column 7 = 95% PoE
Column 8 = 90% PoE
Column 9 = 80% PoE
Column 10 = 70% PoE
Column 11 = 60% PoE
Column 12 = 50% PoE
Column 13 = 40% PoE
Column 14 = 30% PoE
Column 15 = 20% PoE
Column 16 = 10% PoE
Column 17 = 5% PoE
Column 18 = 2% PoE
Column 19 = Expected value (mean) of the distribution of observations expected for this forecast.
Column 20 = P(N+A), the probability that the observation will be in the Normal or Above normal class. Here 'Normal' refers to the middle third of the climatological distribution (not necessarily near the expected value). Below normal is the lower third (0-33.3%) of the climatological distribution of observations, near normal is the middle third (33.3-66.7%), and above normal is the upper third (66.7-100%). Climatology is always defined by the observations of the last 3 complete decades (e.g. 1961-90, 1971-2000).
Column 21 = P(A), the probability of Above normal.
Column 22 = Effective skill of the relationship.
The value in this column is defined as:

R = SQRT(1 - Vf/Vb)

where Vf is the forecast error variance, the expected value of (Forecast - Observation)^2, and Vb is the climatological variance of the observations, the expected value of (Observation - Observation mean)^2. For positive values, this produces a skill estimate that is similar to the correlation coefficient between the forecast and the observations. Negative values signify that the models are predicting a greater variance in the expected observations than the climatological variance.

Column 23 = Forecast ID. Because models may change over time, each forecast is given an ID to help identify how it was made.

Forecast ID format: CCCVVV
CCC = decimal equivalent of the binary model inclusion field
VVV = a version number

Key to CCC: each of the four current input model forecast tools is given a position in a binary field.

ECCA = Ensemble CCA
CFS = CFS model ensembles
CCA = Canonical Correlation Analysis (Barnston)
SMLR = Screening Multiple Linear Regression

Binary                    Decimal
ECCA  CFS  CCA  SMLR      CCC
0     0    0    1         001
0     1    0    0         004
0     1    1    1         007
1     1    1    1         015
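
The column layout, the lead-time convention and the forecast ID key described above can be sketched in Python as follows. This is a minimal illustration, not part of the archive: the names Record, parse_record, center_month and decode_forecast_id are hypothetical, and the code assumes that columns are separated by whitespace and that the forecast ID may be written without leading zeros. Neither assumption is stated in this description, so check them against an actual file.

```python
from typing import NamedTuple

# Bit weights for the CCC field, from the key above (ECCA is the high bit).
MODEL_BITS = (("ECCA", 8), ("CFS", 4), ("CCA", 2), ("SMLR", 1))

# Exceedance probabilities (percent) for columns 6-18, in file order.
POE_LEVELS = (98, 95, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5, 2)


class Record(NamedTuple):
    center: str               # column 1, YYYYMM of the center month
    issue_year: int           # column 2
    issue_month: int          # column 3
    lead: int                 # column 4, in months
    division: str             # column 5, kept as text
    poe: dict                 # columns 6-18, keyed by PoE percent
    mean: float               # column 19
    p_normal_or_above: float  # column 20
    p_above: float            # column 21
    skill: float              # column 22
    forecast_id: str          # column 23, zero-padded to CCCVVV


def parse_record(line: str) -> Record:
    """Split one data line into named fields.

    Assumption: the 23 columns are separated by whitespace.
    """
    f = line.split()
    if len(f) != 23:
        raise ValueError(f"expected 23 columns, got {len(f)}")
    poe = {level: float(v) for level, v in zip(POE_LEVELS, f[5:18])}
    return Record(f[0], int(f[1]), int(f[2]), int(f[3]), f[4], poe,
                  float(f[18]), float(f[19]), float(f[20]), float(f[21]),
                  f[22].zfill(6))


def center_month(issue_year: int, issue_month: int, lead: int) -> str:
    """Return the YYYYMM label of the season's center month.

    The season starts `lead` months after the issue month, and the
    center month is one month after the start.
    """
    months = issue_year * 12 + (issue_month - 1) + lead + 1
    return f"{months // 12:04d}{months % 12 + 1:02d}"


def decode_forecast_id(forecast_id: str):
    """Split CCCVVV into (list of contributing models, version number)."""
    fid = forecast_id.zfill(6)  # assumption: leading zeros may be dropped
    ccc, vvv = int(fid[:3]), int(fid[3:])
    models = [name for name, bit in MODEL_BITS if ccc & bit]
    return models, vvv
```

For the example in the Column 4 description, center_month(1982, 1, 1) gives "198203". With a CCC of 015 and a hypothetical version number of 001, decode_forecast_id("015001") gives (["ECCA", "CFS", "CCA", "SMLR"], 1).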