
Uncertainty index and stock volatility prediction: evidence from international markets


This study investigates the predictive power of a fixed uncertainty index (UI) for realized variances (volatility) in international stock markets from a high-frequency perspective. We construct a composite UI based on the scaled principal component analysis (s-PCA) method and demonstrate that it exhibits significant in- and out-of-sample predictability for realized variances in global stock markets. This predictive power is stronger than that of two commonly employed competing methods, namely PCA and partial least squares (PLS). The result is robust to several checks. Further, we explain that s-PCA outperforms other dimension-reduction methods because it effectively increases the impact of strong predictors and decreases that of weak factors. The implications of this research are significant for investors who allocate assets globally.

Introduction and literature review

Since the uncertainty and unpredictability of the economic policy and investment environment increase over time, predicting financial market movement is very challenging for scholars and practitioners. Thus, the measurement of the uncertainty in the financial market has attracted enormous attention, e.g., Jurado et al. (2015), Baker et al. (2016) and Huang and Luk (2020). Economic agents typically define uncertainty as the conditional volatility of a disturbance, which is generally unpredictable (Jurado et al. 2015).

In recent years, an increasing number of studies have focused on the linkage between uncertainty and financial market dynamics. For example, several studies have associated the uncertainty therein with stock returns and volatility (Pastor and Veronesi 2012; Li et al. 2020; Megaritis et al. 2021), commodity prices and volatility (Karabulut et al. 2020; Guo et al. 2022), corporate credit spreads (Kaviani et al. 2020), leverage levels (Khan et al. 2020), financial stability (Phan et al. 2021), etc.

Volatility is a well-known indicator for measuring asset price risk. It features a wide range of applications in the fields of finance and economics, such as risk management, asset pricing, and hedging strategies (Chkili 2021; Gong et al. 2022). Moreover, volatility exerts a significant predictive power on potential output growth (Vu 2015). Consequently, an elucidation of the determinants of volatility is quite relevant for investors and policymakers. Volatility is conventionally measured with daily or lower-frequency data [the standard deviation of asset returns, Generalized AutoRegressive Conditional Heteroskedasticity (GARCH)-type model, and so on (Zhang et al. 2021)]. The appearance of the realized volatility (RV), as proposed by Andersen et al. (2001), shortens the distance between the estimated and real volatilities and has been widely adopted in the literature. Compared with the low-frequency one, RV contains richer market information.

Here, we employed five-minute sampling data to construct RV, which reduces market microstructure noise, and focused on the high-frequency relationship between the uncertainty index (UI) and realized variance (volatility) in global stock markets. Unlike many studies that investigated a single extant uncertainty indicator (Liu and Zhang 2015; Megaritis et al. 2021), we explored uncertainty at the equity market, investor, and economic policy levels. Thereafter, we constructed a composite UI based on the scaled principal component analysis (s-PCA) method that was introduced by Huang et al. (2021). Additionally, two well-known methods, PCA and partial least squares (PLS), were employed as competing models.

Fig. 1 Time dynamics of global economic policy uncertainty index

The motivations were derived from several aspects. Firstly, owing to the increasing trend of international investment, it is necessary to develop a relatively fixed and internationalized risk indicator that monitors market risk dynamics. In particular, the interactions among global economic entities have intensified through the increased liberalization of international trade (Tsai 2017), and an increasing number of investors allocate their assets to global markets. Figure 1 shows that the global economic policy uncertainty (EPU) index of Baker et al. (2016) points to a fluctuating and uncertain international investment environment. Under this condition, monitoring the stock price risk in each market through different indicators might not be an ideal choice because it requires time to separately respond to each market; moreover, it is expensive to simultaneously monitor the stock price risk in each market. Therefore, a relatively fixed indicator that can comprehensively predict the risk of international investment is necessary and convenient for investors to rapidly reach their next investment decisions.

Secondly, only a few studies in the literature focused on the high-frequency relationship between uncertainty and stock volatility. Recent studies offered sufficient evidence confirming that low-frequency uncertainty measures can explain potential financial market volatility. For example, the EPU exerts a significant predictive power on stock volatility (Liu and Zhang 2015; Li et al. 2020), forex volatility (Christou et al. 2018), and European Union allowance futures volatility (Liu et al. 2021). Moreover, Megaritis et al. (2021) argued that the macroeconomic uncertainty sufficiently predicts the U.S. stock volatility. However, the foregoing mainly focused on low-frequency monthly data, even though it is crucial to consider the high-frequency (microcosmic) relationship between uncertainty and volatility. For one thing, many uncertain events, such as the China–US trade war (2018–2019), which was announced by then President Donald Trump on Twitter on August 23, 2019, and the COVID-19 pandemic, which began with the lockdown of Wuhan on January 23, 2020, occur instantaneously. These unexpected events can significantly influence the financial market. A low-frequency investigation cannot readily elucidate this real-time dynamic and random change. For another, compared with the low-frequency volatility, a high-frequency-data-based RV comprises richer trading information and can consistently estimate the true integrated volatility (Andersen et al. 2001). Thus, elucidating the determinants of volatility from the microcosmic perspective is crucial for market participants, particularly short-term investors, to accurately detect financial risks.

Thirdly, many studies in the literature have investigated the predictability of a single UI in a single market (see references in the previous paragraph). It is therefore of interest to determine whether there is a relatively fixed composite uncertainty indicator that affects international stock markets. This motivation is twofold. First, we anticipate that a composite index can reflect market uncertainty (MU) more comprehensively by capturing uncertainty from different perspectives, such as economic policies and investor behaviors. Compared with a single indicator, a composite index constructed via a dimension-reduction method could exhibit more robust and stronger performance in prediction tasks (Neely et al. 2014; Gong et al. 2022). Moreover, a robust composite index is required since this study focuses on international stock market forecasting. Second, we anticipate that a relatively fixed index could influence numerous markets, since many studies have documented strong links among international financial markets, such as volatility co-movement (Cipollini et al. 2015), volatility spillovers (Diebold and Yilmaz 2009), and contagion (Chiang and Wang 2011). Numerous findings have demonstrated significant volatility spillover effects from the U.S. market to other markets, such as the Pacific-Basin (Ng 2000) and European markets (Baele 2005). Thus, the U.S.-market-based composite UI could potentially impact other markets.

Finally, applying the dimension-reduction technique to the extraction of relevant information from different types of factors has received enormous attention, thus inspiring this study. For example, PCA is generally employed to predict stock volatility (Zhang et al. 2020) and risk premium (Neely et al. 2014). Huang et al. (2015) and Gong et al. (2022) exploited PLS to construct an aligned sentiment index, thereby significantly improving the returns and volatility forecasting, respectively. In a recent study by Huang et al. (2021), an s-PCA method, which demonstrated remarkable predictive performance in macroeconomic forecasting, was developed. Based on this work, Guo et al. (2022) and Yan et al. (2022) confirmed that the s-PCA-based PU index exhibits more powerful predictability on crude oil volatility compared with other competing methods. Moreover, s-PCA is also employed to extract predictive information from macro variables (Huang et al. 2020), technical indicators (He et al. 2021), liquidity indicators (Liao et al. 2021), and investor-attention indicators (Chen et al. 2022). They reported that the s-PCA method improves market returns forecasting. However, it is largely unknown if the s-PCA method is also effective for the prediction of stock volatility, which is fundamentally different from the forecasting of returns (Zhang et al. 2021). Moreover, the application scenarios of the method could be further expanded. Dissimilar to their studies, we applied the s-PCA method to construct a global-level composite uncertainty indicator, which is very beneficial to market participants, as discussed above. Finally and significantly, although Guo et al. (2022) and Yan et al. (2022) argued that the s-PCA method outperforms other competing models, the valid evidence to demonstrate why the s-PCA method is better is still rare, and we will attempt to fill this gap.

Fundamentally, we analyzed the channel running from uncertainty in the financial environment to uncertainty in stock prices (Goodell et al. 2020). One theoretical basis derives from increased uncertainty about future discount rates, cash flows (dividends), and capital structures. For example, Pastor and Veronesi (2012) revealed that a change in policy or a new policy exerts uncertain impacts on profitability, which will increase the discount rates. Moreover, Megaritis et al. (2021) observed that a significant percentage of stock market fluctuations cannot be explained by fundamentals but only by latent macroeconomic uncertainties. The unexplained component is driven by the uncertainty surrounding future dividend yields. Furthermore, Khan et al. (2020) reported that listed firms decrease their leverage when uncertainty increases, thus affecting a firm's capital structure.

The shocks due to extreme events, such as financial crises and epidemic diseases, account for another channel that explains the predictability of uncertainty on volatility. Naturally, such extreme events occur randomly and intangibly because of the challenge of pre-identifying the factor that generates them. This uncertain factor easily results in irrational trading and contributes to market fluctuations. Academically, numerous studies, e.g., Choudhry (2010) and Wang et al. (2020b), have demonstrated that extreme events can significantly produce violent fluctuations in the stock market. The occurrences of extreme shocks will force market participants to focus more on the financial market dynamics, particularly large asset price fluctuations, and these shocks trigger herding activity and could spread the crisis to neighboring markets (Chiang and Zheng 2010).

To investigate the impacts of uncertainty indices on stock volatilities in 23 relevant international markets, the empirical design is as follows. The well-known Heterogeneous AutoRegressive-RV (HAR-RV) model (Corsi 2009) was employed as the benchmark model. Next, we employed the PCA, PLS, and s-PCA models to construct the composite uncertainty indices based on a news-based equity market uncertainty (EMU) index (Baker et al. 2019), investor uncertainty indices measured by market liquidity (Uygur and Taş 2014), the implied volatility index (VIX) of the Chicago Board Options Exchange (CBOE) (Deeney et al. 2015), and EPUs from the U.S., U.K., and China (Baker et al. 2016; Huang and Luk 2020). The benchmark model was extended by adding these uncertainty indices, followed by an investigation of the in-sample and out-of-sample performances. Additionally, several robustness checks were performed, and they supported the result that s-PCA is superior to PCA and PLS. Finally, we discussed why s-PCA outperforms PCA and PLS.

By investigating the predictive power of the proposed composite UI on stock volatilities, this study contributed to the literature in the following aspects. First, a global composite UI based on s-PCA was proposed. This approach is more comprehensive than that adopted by Yan et al. (2022) and Guo et al. (2022), who developed a composite index employing the s-PCA method on policy-related indices only. The composite index positively affects stock volatility, indicating that a higher uncertainty in the financial environment would increase the price uncertainty, and this is consistent with the theoretical basis and extant studies (Liu and Zhang 2015; Li et al. 2020). Moreover, it exerts significant in- and out-of-sample predictive power on stock volatility in the 23 markets, and it exhibited a better and more robust out-of-sample performance than the PCA and PLS methods in most stock markets. Furthermore, this index benefits investors in making decisions, because it is constructed mainly based on the U.S. market data and is relatively fixed.

Second, we observed that VIX is a powerful volatility predictor in most stock markets, which agrees with the results reported by Wang et al. (2020a), Liang et al. (2020), and Megaritis et al. (2021). Additionally, we provided new evidence that the change in VIX (DVIX) exerts a greater short-term predictive power on stock volatility than VIX itself in most markets. Conversely, VIX outperforms DVIX in long-term forecasting. Thus, our results indicated that international investors must focus on different indicator forms (the level or its change) for different investment horizons (short-term or long-term). Moreover, high-frequency EPUs exhibit weak predictability on stock volatility, disagreeing with much extant evidence from monthly frequencies, e.g., Liu and Zhang (2015) and Li et al. (2020). This indicates that it is not rational to apply daily EPUs to the identification of market risk movement, which should be a warning to market participants.

Finally, this study empirically answered the question of why s-PCA outperforms PCA and PLS via time-varying loadings. We demonstrated that the main contributors of the PCA, PLS, and s-PCA factors are markedly different. More specifically, the loadings of the PCA factors exhibited generally equal relevance. Thus, its predictability would be reduced in the presence of both strong and weak predictors. Further, the PLS method can effectively identify the main predictors but cannot reasonably assign weights. In contrast, s-PCA is a superior method because it can effectively extract relevant predictive information and suppress weak factors by placing a higher (lower) weight on the powerful (weak) predictors, thus ensuring a better prediction performance.

The remainder of this paper is organized as follows: “Measurement” section presents the measurements; “Methodology” section introduces the methodologies; and “Empirical analysis” section reports the empirical results, including the in-sample, out-of-sample, longer forecast horizon, and robustness analyses. “Predictability analyses” section further analyzes the difference in the predictability methods from the microcosmic perspective. Finally, our conclusions are reported in “Conclusion” section.


Measurement

This section introduces the measurement methods, including RV and UIs, employed in this study. We describe the uncertainty measures from three aspects: MU, investor uncertainty, and EPU.

Realized variance

The utilization of high-frequency data to model volatility is a well-known and widely accepted approach because it provides a good proxy for real volatility. Realized variance (RV), defined by Andersen et al. (2001) as the sum of the squared intraday log-returns, is a simple, efficient, and consistent estimator of volatility. To limit the influence of microstructure noise, sampling every five minutes is a common choice. Accordingly, RV on trading day t is given by the following:

$$\begin{aligned} R V_{t}=\sum _{j=1}^{M_{t}} r_{t, j}^{2}, \end{aligned}$$

where \(r_{t, j}=\log \left( p_{t, j}\right) -\log \left( p_{t, j-1}\right)\) is the logarithmic returns from time, \(j-1\) to j; \(p_{t,j}\) refers to the closing price on the jth five-minute point in the trading periods; and \(M_t\) denotes the number of five-minute intervals in the tth trading period.
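The definition above can be sketched in a few lines. This is a minimal illustration rather than the authors' code; the function name `realized_variance` is hypothetical, and the input is assumed to be the ordered five-minute closing prices of a single trading day.

```python
import numpy as np

def realized_variance(prices):
    """Realized variance for one trading day.

    `prices` is assumed to hold the five-minute closing prices p_{t,0..M_t}
    of a single day, in time order (an assumption of this sketch).
    """
    log_p = np.log(np.asarray(prices, dtype=float))
    r = np.diff(log_p)            # r_{t,j} = log p_{t,j} - log p_{t,j-1}
    return float(np.sum(r ** 2))  # RV_t = sum_j r_{t,j}^2
```

Applying the function day by day to an intraday price panel would yield the daily RV series used in the later regressions.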

Uncertainty variable

Two aspects are generally considered when selecting the uncertainty measures. One involves focusing on the high-frequency relationship, and the other involves exploring a relatively fixed UI that exerts a significant predictive power on international stock markets. Thus, the following uncertainty measures were employed. They are mainly derived from the American market since it is the biggest and most developed capital market worldwide.

Equity market uncertainty

In the era of big data, the media are the main source of information for the public. Their audience includes different types of participants, such as retail and institutional investors, managers, and policymakers. Thus, we cannot ignore media information that is related to MU. Accordingly, we employed the newspaper-based equity market uncertainty index (EMU), which was proposed by Baker et al. (2019), to capture the uncertainty reported by the media. EMU was constructed from the scaled frequency counts of newspaper articles that contain terms from each of the following three sets: economic, economy, and financial; stock market, equity, equities, etc.; and volatility, volatile, risk, etc.

Investor uncertainty

We postulated that investor psychology, which dominates investors’ behaviors, can be viewed as a source of uncertainty in the financial market for two reasons. First, investor psychology is unpredictable because it changes with the information that is available to the investor. Thus, investor psychology can reflect uncertain information from the market via investors. Second, investor sentiment and attention are good measures for capturing investors’ cognitive biases (Baker and Wurgler 2006; Da et al. 2011). Investor sentiment is regarded as the general propensity to speculate, i.e., to be optimistic or pessimistic about markets. Put differently, investor sentiment comprises future expectations. Investor attention is defined as a scarce cognitive resource. Extreme events are expected to increase investors’ attention via Internet activities, e.g., the search volume on Google. Thus, investor psychology is a plausible source of uncertainty in the financial market.

Considering the availability of high-frequency data, the first employed investor uncertainty was the CBOE volatility index (VIX) because it is a proxy of investor sentiment (Deeney et al. 2015), which is also employed as an uncertainty measure (Wang et al. 2020a; Megaritis et al. 2021). Considering that VIX is a popular and powerful factor that affects the financial market, we further focused on the changes therein, indicated by DVIX, to capture the change in investor uncertainty. Another measure is the change in the trading volume (VOL) of the National Association of Securities Dealers Automated Quotations (NASDAQ) composite index. This measure is regarded as an information flow (Zhang et al. 2021), and is a good proxy of market liquidity, which adequately reflects investor sentiment (Baker and Wurgler 2006; Uygur and Taş 2014).

Economic policy uncertainty

Aldy and Viscusi (2014) reported that environmental risks might comprise the most relevant policy-related applications of the economics of risk and uncertainty. The linkage between EPU and economic activities has been widely proven, e.g., Liu and Zhang (2015); Li et al. (2020). However, the studies focused on low-frequency analysis; the microcosmic evidence is lacking. We selected EPUs from the U.S. (USEPU), U.K. (UKEPU), and China (CNEPU) since they constitute powerful and influential countries globally. Another reason is the availability of high-frequency data. The newspaper-based USEPU and UKEPU indexes were proposed by Baker et al. (2016) who measure uncertainty by calculating the number of keywords in leading newspapers, such as economic or economy; uncertain or uncertainty. Although Baker et al. (2016) also introduced CNEPU, we employed the measure proposed by Huang and Luk (2020) because it is based on more comprehensive materials, including ten influential newspapers in mainland China.


Methodology

Dimension reduction methods

A single UI could be of limited use in predicting the stock volatility in international markets; thus, a composite index is required because it can capture uncertainty from a more comprehensive perspective. Moreover, including all the UIs in a “kitchen sink” model easily leads to in-sample over-fitting and poor out-of-sample performance (Huang et al. 2015, 2021). To address this, this study introduced three types of dimension-reduction methods to construct composite indexes.

Assume that there are N uncertainty indicators, \(u_{i,t}\) for \(i=1, \cdots , N\), that are relevant but imperfect predictors of the target variable (RV). They are collected in \(U_t=\left( u_{1, t}, u_{2, t},\cdots , u_{N, t}\right) ^{\prime }\) for \(t=1, \cdots , T\), where T refers to the number of observations. The set \(U= \{\mathrm {EMU, DVIX, VOL, USEPU, UKEPU, CNEPU}\}\) is used in the following analyses, and the definition of each \(u_{i,t}\) is presented in Table 1. Notably, we employed DVIX here, rather than VIX, to ensure the stationarity of the time series and thereby avoid incorrect statistical inferences. Following convention, we standardized each predictor in set U before constructing the composite uncertainty indicators.

PCA and s-PCA techniques

The oldest and most commonly employed approach for combining predictors into a lower-dimensional linear space is PCA, which preserves the covariance structure among the predictors (Gu et al. 2020). Mathematically, the PCA model extracts diffusion indexes as linear combinations of the predictors, i.e., set U in this study, via the following equation:

$$\begin{aligned} u_{i,t}=\mu _i+\lambda _{i}^{\prime } F_{t}^{\mathrm {P C A}}+\epsilon _{i, t}, \quad i=1, 2, \cdots , N, \quad t=1, 2, \ldots , T, \end{aligned}$$

where \(F_{t}^{\mathrm {P C A}}\) is the K-dimensional vector (\(K \ll N\)) of PCA diffusion indexes extracted from \(U_t=\left( u_{1, t}, u_{2, t}, \cdots , u_{N, t}\right) ^{\prime }\); \(\lambda _{i}\) is the K-dimensional vector of loadings to be estimated; and \(\epsilon _{i, t}\) is the idiosyncratic noise term.
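As a rough illustration of the factor extraction in Eq. (2), the sketch below standardizes a T-by-N predictor panel and obtains the first K principal components from a singular value decomposition. The helper names `standardize` and `pca_factors` are hypothetical, and using the SVD of the centered panel is one common way to compute PCA, not necessarily the routine used in the study.

```python
import numpy as np

def standardize(U):
    """Give each predictor (column) zero mean and unit sample variance."""
    U = np.asarray(U, dtype=float)
    return (U - U.mean(axis=0)) / U.std(axis=0, ddof=1)

def pca_factors(U, k=1):
    """Return the first k principal-component factors (T x k) and all
    eigenvalues of the sample covariance matrix of U."""
    Z = np.asarray(U, dtype=float)
    Z = Z - Z.mean(axis=0)
    # Z = A S V'; the rows of vt are the loading vectors
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    factors = Z @ vt[:k].T
    eigenvalues = s ** 2 / (Z.shape[0] - 1)
    return factors, eigenvalues
```

The sign of each component is not identified, so in practice a factor may need to be flipped so that it loads positively on the predictors of interest.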

Although PCA is a well-known dimension-reduction technique that has been widely employed in the literature, it is limited because it ignores the ultimate forecasting target. An improved target-driven dimension-reduction method is the s-PCA method that was recently proposed by Huang et al. (2021); it scales each predictor variable by its predictive slope on the to-be-predicted target. This method is implemented in the following two steps: first, we generated a panel of scaled predictors, \(\left( {\hat{\theta }}_{1} u_{1, t},{\hat{\theta }}_{2} u_{2, t}, \ldots , {\hat{\theta }}_{N} u_{N, t}\right)\), in which the coefficient, \({\hat{\theta }}_{i}\), was the estimated slope from regressing the target variable on the ith uncertainty predictor, \(u_{i, t}\), as follows:

$$\begin{aligned} \log (RV_{t})=\theta _{i, 0}+\theta _{i} u_{i, t}+\epsilon _{i, t}, \quad i=1, 2, \cdots , N. \end{aligned}$$

Second, similar to Eq. (2), we applied PCA to \(\left( {\hat{\theta }}_{1} u_{1, t},{\hat{\theta }}_{2} u_{2, t}, \ldots , {\hat{\theta }}_{N} u_{N, t}\right)\) to extract the factors and forecast the target variable. Huang et al. (2021) argued that s-PCA exhibits several advantages over PCA: (i) when the factors are strong, s-PCA can distinguish between the target-relevant and target-irrelevant latent factors, whereas PCA cannot; (ii) when all the factors are weak, s-PCA can extract the signals from a large amount of noise, whereas PCA fails to do so and thus produces biased forecasts.
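The two steps can be sketched as follows. This is a minimal illustration under stated assumptions: `y` stands for the target series (the logarithmic RV), `U` for the standardized T-by-N predictor panel, and the function names are hypothetical. It returns only the first component, which corresponds to the s-FPCA case.

```python
import numpy as np

def univariate_slopes(y, U):
    """Step 1: OLS slope theta_i from regressing y on a constant and u_i,
    computed separately for every column of U."""
    y = np.asarray(y, dtype=float)
    U = np.asarray(U, dtype=float)
    yc = y - y.mean()
    Uc = U - U.mean(axis=0)
    return (Uc * yc[:, None]).sum(axis=0) / (Uc ** 2).sum(axis=0)

def spca_first_factor(y, U):
    """Step 2: apply PCA to the scaled panel (theta_i * u_i) and return the
    first principal component together with the slopes."""
    U = np.asarray(U, dtype=float)
    theta = univariate_slopes(y, U)
    scaled = U * theta                 # column i is multiplied by theta_i
    Z = scaled - scaled.mean(axis=0)
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ vt[0], theta
```

Because predictors with small slopes are shrunk toward zero before the decomposition, they contribute little variance to the scaled panel and therefore receive little weight in the extracted factor.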

Subsequently, we investigated two cases involving the use of s-PCA: in the first case, we employed the first principal component to measure a composite UI, denoted by s-FPCA. In the other case, we employed a weighted s-PCA, following Gong et al. (2022), defined as follows:

$$\begin{aligned} \mathrm {UI}^\mathrm {s-PCA}=\sum _{i=1}^{M}\left( \mathrm {PC}_{i}^{\mathrm {s-PCA}} \cdot {eigen}_{i}\right) / \sum _{j=1}^{M} {eigen}_{j}, \end{aligned}$$

where \(\mathrm {PC}_{i}^{\mathrm {s-PCA}}\) is the ith principal component, \({eigen}_{i}\) is its eigenvalue, and M is the total number of principal components. Compared with s-FPCA, the weighted s-PCA index (s-PCA) comprises more predictive information that could be useful since it is screened by the target variable.
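The eigenvalue weighting in Eq. (4) might be computed as in the sketch below. The function name `eigen_weighted_index` is hypothetical, and `panel` is assumed to be the already scaled T-by-N predictor panel from the first s-PCA step.

```python
import numpy as np

def eigen_weighted_index(panel):
    """Combine all M principal components of `panel`, weighting component i
    by eigen_i / sum_j eigen_j."""
    Z = np.asarray(panel, dtype=float)
    Z = Z - Z.mean(axis=0)
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    pcs = Z @ vt.T                      # column i holds PC_i
    eigen = s ** 2 / (Z.shape[0] - 1)   # eigenvalues of the sample covariance
    weights = eigen / eigen.sum()
    return pcs @ weights, weights
```

One caveat of this sketch: the sign of each principal component is arbitrary, so the weighted sum depends on the sign convention of the decomposition routine; a fixed normalization would have to be chosen before the index is used in a regression.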

PLS technique

Another supervised learning technique is the partial least squares (PLS) method, which can separate the irrelevant component from the proxy variables and extract the predictive information for the forecasting task (Huang et al. 2015). Following Huang et al. (2015) and Gong et al. (2022), PLS can be implemented via the following two steps:

In the first step, we ran N time-series regressions, where N is the number of basic uncertainty proxies. More specifically, each uncertainty predictor variable, \(u_{i, t}\), was regressed on a constant and the logarithmic RV. Namely,

$$\begin{aligned} u_{i, t}=\phi _{i, 0}+\phi _{i} \log (RV_{t})+\epsilon _{i, t}, \quad t=1, 2, \ldots , T, \end{aligned}$$

where the loading \(\phi _{i}\) captures the sensitivity of each \(u_i\) to the uncertainty measure that was instrumented by RV.

In the second step, T cross-sectional regressions were run. For each time period, t, we regressed \(u_{i, t}\) on a constant and the estimated coefficient, \({\hat{\phi }}_{i}\), from Eq. (5) and obtained the following:

$$\begin{aligned} u_{i, t}=\psi _{t}+\mathrm {UI}_{t}^{\mathrm {PLS}} \hat{\phi }_{i}+\varepsilon _{i, t}, \quad i=1, 2, \cdots , N, \end{aligned}$$

where the slope of this regression, \(\mathrm {UI}_{t}^{\mathrm {PLS}}\), is the estimated PLS uncertainty index.
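A compact sketch of the two-pass procedure is given below, assuming `log_rv` holds the logarithmic RV series and `U` the T-by-N panel of uncertainty proxies; the function name `pls_index` is hypothetical. Both passes are simple regressions with an intercept, so the slopes can be written in closed form.

```python
import numpy as np

def pls_index(log_rv, U):
    """Two-pass PLS index.

    Pass 1 (time series): for each proxy i, the slope phi_i of u_i on a
    constant and log RV.
    Pass 2 (cross section): for each t, the slope of u_{i,t} on a constant
    and phi_i; the slope is UI_t^PLS.
    """
    y = np.asarray(log_rv, dtype=float)
    U = np.asarray(U, dtype=float)
    yc = y - y.mean()
    Uc = U - U.mean(axis=0)
    phi = (Uc * yc[:, None]).sum(axis=0) / (yc ** 2).sum()    # length N
    phic = phi - phi.mean()
    Ut = U - U.mean(axis=1, keepdims=True)   # demean across proxies at each t
    ui = (Ut @ phic) / (phic @ phic)                           # length T
    return ui, phi
```

In an exact one-factor setting, where every proxy equals its loading times the target, this procedure recovers the target series itself, which is a convenient sanity check.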

Notably, we employed contemporaneous regressions in the target-related equations, Eqs. (3) and (5), which differs from the application to return prediction in Huang et al. (2015) and Huang et al. (2021). This is because volatility, unlike asset returns, is highly autocorrelated, so information that explains contemporaneous volatility should also carry predictive power for one-step-ahead volatility. Moreover, the volatility model below considers the historical information on the volatility. Thus, focusing on the contemporaneous target variable can prevent the overlap of information between the volatility and uncertainty indicators.

This study investigated whether there was a fixed uncertainty indicator that significantly impacted stock volatility in international markets. Thus, the target variables in Eqs. (3) and (5) were set as the logarithmic RVs of the Dow Jones Industrial Average (DJIA) stock index. This is because the U.S. market is the biggest and most developed capital market. Moreover, the well-known literature on volatility spillovers documents shocks from the U.S. to other markets, such as the European equity (Baele 2005) and Pacific-Basin (Ng 2000) markets. Therefore, we assumed that the composite uncertainty indicator, which is driven by the volatility of the U.S. stock market, might effectively predict other equity markets.

Predictive regression model and its extension

To investigate whether UI is an effective factor for predicting stock volatility, we first set the HAR-RV model that was proposed by Corsi (2009) as the benchmark model. This model is based on the heterogeneous market hypothesis, where the heterogeneity derives from the differences in time horizons, i.e., the different types of market participants, such as high- and low-frequency traders, exert different impacts on future volatility. The HAR-RV model is formulated as follows:

$$\begin{aligned} R V_{t+h}=\alpha _0+\alpha ^{(d)} R V_{t}+\alpha ^{(w)} R V_{t}^{(5)}+\alpha ^{(m)} R V_{t}^{(22)}+\epsilon _{t+h}, \end{aligned}$$

where \(R V_{t}^{(m)}=\sum _{n=1}^{m} R V_{t-n+1} / m\) is the average RV over the most recent m days, and h denotes the forecast horizon.
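To make the daily, weekly, and monthly components concrete, the sketch below builds the HAR-RV design matrix from a one-dimensional series. The function name `har_design` is hypothetical; whether `rv` holds raw or logarithmic RV is left to the caller (the estimation in this study uses the logarithmic RV).

```python
import numpy as np

def har_design(rv, h=1):
    """Build HAR-RV regressors and the h-step-ahead targets.

    Each row is [1, RV_t, RV_t^(5), RV_t^(22)], where RV_t^(m) is the mean
    of the m most recent observations up to and including day t.
    """
    rv = np.asarray(rv, dtype=float)
    rows, targets = [], []
    for t in range(21, len(rv) - h):    # 22 observations are needed at t
        rows.append([1.0,
                     rv[t],
                     rv[t - 4:t + 1].mean(),
                     rv[t - 21:t + 1].mean()])
        targets.append(rv[t + h])
    return np.array(rows), np.array(targets)
```

The coefficients of Eq. (7) could then be estimated by ordinary least squares, for example with `np.linalg.lstsq(X, y, rcond=None)`.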

Afterward, following Liang et al. (2020) and Zhang et al. (2021), among others, we incorporated UI into the HAR-RV model. Accordingly, the HAR-RV-UI model was specified as follows:

$$\begin{aligned} R V_{t+h}=\alpha _0+\alpha ^{(d)} R V_{t}+\alpha ^{(w)} R V_{t}^{(5)}+\alpha ^{(m)} R V_{t}^{(22)}+\beta \mathrm {UI}_{t}+\epsilon _{t+h}, \end{aligned}$$

where the key variable UI \(\in\){EMU, VIX, DVIX, VOL, USEPU, UKEPU, CNEPU, PCA, PLS, s-FPCA, s-PCA}. In the following, we focused on the coefficient, \(\beta\), since its significance reflects the predictability of UI.
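Since the analysis centres on the coefficient \(\beta\), a minimal sketch of how it and its t-statistic might be obtained is shown below. The function name `har_ui_fit` is hypothetical, and the classical homoskedastic OLS standard errors used here are an assumption of the sketch; the study may rely on a different (for example, heteroskedasticity- and autocorrelation-robust) estimator.

```python
import numpy as np

def har_ui_fit(rv, ui, h=1):
    """OLS fit of the HAR-RV-UI model; returns coefficients and t-statistics.

    Coefficient order: [alpha_0, alpha_d, alpha_w, alpha_m, beta].
    Standard errors are the classical OLS ones (an assumption).
    """
    rv = np.asarray(rv, dtype=float)
    ui = np.asarray(ui, dtype=float)
    t_idx = np.arange(21, len(rv) - h)
    X = np.column_stack([
        np.ones(len(t_idx)),
        rv[t_idx],
        [rv[t - 4:t + 1].mean() for t in t_idx],
        [rv[t - 21:t + 1].mean() for t in t_idx],
        ui[t_idx],
    ])
    y = rv[t_idx + h]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef, coef / se
```

A t-statistic on the last coefficient that is large in absolute value would indicate that the uncertainty index adds predictive information beyond the lagged volatility terms.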

Regarding the estimation of the parameters of the predictive regression models (7) and (8), we employed the logarithmic RV so that its distribution is closer to Gaussian, following Paye (2012), Gong et al. (2022), and others. This helps avoid misleading statistical inference in the ordinary least squares (OLS) estimation. Notably, we employed only the information available up to time t to predict the target variable at time \(t+h\), to avoid look-ahead bias in the out-of-sample analysis. More specifically, when employing the composite UI to predict RV, we recomputed PCA, PLS, s-FPCA, and s-PCA recursively with only the in-sample data, so that no out-of-sample information was used to predict the out-of-sample RV.

Forecast combination

Although this study mainly focused on the relationship between UI and stock volatility, we also compared the predictive performances of the dimension-reduction methods and forecast-combination methods, since the latter are widely employed as competing models, e.g., Guo et al. (2022) and Yan et al. (2022). The forecast combinations employed the predictive information from each predictor (set U) and combined the resulting forecasts to obtain the final prediction. Following Timmermann (2006) and Weiss et al. (2018), this method can be described mathematically as follows. First, we ran the HAR–RV–UI model (8) on each uncertainty indicator \(u_i\) (\(\in U\)) to obtain the individual forecasts

$$\begin{aligned} {\widehat{RV}}_{n,t+1}={\hat{\alpha }}_{0,n}+{\hat{\alpha }}_n^{(d)} R V_{t}+{\hat{\alpha }}_n^{(w)} R V_{t}^{(5)}+{\hat{\alpha }}_n^{(m)} R V_{t}^{(22)}+{\hat{\beta }}_n \mathrm {UI}_{n,t}, \end{aligned}$$

where, \({\hat{\alpha }}_{0,n}\), \({\hat{\alpha }}_n^{(d)}\), \({\hat{\alpha }}_n^{(w)}\), \({\hat{\alpha }}_n^{(m)}\), and \({\hat{\beta }}_n\) are the estimated coefficients from model (8) of the nth uncertainty indicator employing the information up to time \(t-1\), and n=1, 2, \(\cdots\), N. Thereafter, the final prediction was obtained by combining the individual forecasts based on some weight schemes, as follows:

$$\begin{aligned} {\widehat{RV}}_{t\mid t-1}^C=\sum _{n=1}^{N} \omega _{n, t-1} {\widehat{RV}}_{n, t\mid t-1}, \end{aligned}$$

where C is the combination style determined by the weights, \(\omega _{n, t-1}\), given at time \(t-1\).

Three types of classical forecast combinations were employed as the competing models. The first simple method is the mean combination (MC) obtained by averaging all the individual forecasts as follows:

$$\begin{aligned} {\widehat{RV}}_{t \mid t-1}^{MC}=\frac{1}{N} ({\widehat{RV}}_{t \mid t-1, 1}+ {\widehat{RV}}_{t \mid t-1, 2}+\cdots + {\widehat{RV}}_{t \mid t-1, N}), \end{aligned}$$

i.e., \(\omega _{n, t-1}=1 / N\).

The second simple-weighted method is the median combination (MEDC) obtained from the median values of the individual forecasts, as exhibited below:

$$\begin{aligned} {\widehat{RV}}_{t \mid t-1}^{MEDC}=Median \{{\widehat{RV}}_{t \mid t-1, 1}, {\widehat{RV}}_{t \mid t-1, 2}, \cdots , {\widehat{RV}}_{t \mid t-1, N} \}. \end{aligned}$$

The winsorized mean (WMC) is the final combination, which handles outliers in a softer manner: it caps them at a certain level rather than removing them. It is specified as follows:

$$\begin{aligned} {\widehat{RV}}_{t \mid t-1}^{WMC}=\frac{1}{N}\left[ \lambda N {\widetilde{RV}}_{t \mid t-1,\lambda N+1}+\sum _{n=\lambda N+1}^{N-\lambda N}{\widetilde{RV}}_{t \mid t-1,n} +\lambda N {\widetilde{RV}}_{t \mid t-1,N-\lambda N}\right] , \end{aligned}$$

where \(\lambda\) is the trim factor, i.e., the top and bottom 100\(\cdot\) \(\lambda\)% of the forecasts are winsorized, which takes the value of 0.1 in the empirical analysis, and \({\widetilde{RV}}_{i}\) is the ith order statistic of \(\{{\widehat{RV}}_n\}_{n=1}^{N}\) sorted in increasing order. This measure replaces the \(\lambda N\) smallest and the \(\lambda N\) largest forecasts with the \(\left( \lambda N+1\right)\)th smallest and \(\left( \lambda N+1\right)\)th largest forecasts, respectively.
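As an illustration, the three combination rules can be sketched in a few lines of Python. This is a minimal sketch and not the authors' code: the function names and the forecast values are hypothetical, and rounding \(\lambda N\) down to an integer is our assumption for the case in which it is not a whole number.

```python
import math
import statistics

def mean_combination(forecasts):
    """MC: equal weights 1/N on every individual forecast."""
    return statistics.fmean(forecasts)

def median_combination(forecasts):
    """MEDC: median of the individual forecasts."""
    return statistics.median(forecasts)

def winsorized_mean_combination(forecasts, lam=0.1):
    """WMC: cap the k smallest and k largest forecasts, then average."""
    ordered = sorted(forecasts)
    n = len(ordered)
    # Assumption: lam * N is rounded down when it is not an integer.
    k = math.floor(lam * n + 1e-12)
    if k > 0:
        ordered[:k] = [ordered[k]] * k              # bottom k -> (k+1)th smallest
        ordered[n - k:] = [ordered[n - k - 1]] * k  # top k -> (k+1)th largest
    return statistics.fmean(ordered)

# Hypothetical one-day forecasts from N = 10 single-indicator models.
forecasts = [0.80, 1.00, 1.10, 0.90, 1.20, 1.00, 0.95, 1.05, 5.00, 1.15]
mc = mean_combination(forecasts)
medc = median_combination(forecasts)
wmc = winsorized_mean_combination(forecasts, lam=0.1)
```

In this toy example the single outlying forecast (5.00) pulls the mean combination upward, whereas the winsorized mean caps it at the second-largest forecast before averaging.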

Out-of-sample regression mechanism and evaluation criteria

Out-of-sample predictability could change with time since many extreme events, such as the sub-prime crisis in 2008 and the COVID-19 pandemic in 2020, occurred during our sampling periods. Following Catania and Proietti (2020), we addressed this by employing a rolling window regression method, which is a common technique for evaluating stability and prediction accuracy in time-series forecasting. More specifically, we split the full sample, T, into initial training data (in-sample) with a fixed window length, W, and test data (out-of-sample) with \(T-W\) observations. At each step, the fixed-length window drops the oldest observation and adds the newest one. In the empirical analysis, we employed a four-year window, i.e., \(W=1000\), to conduct the investigations. As robustness checks, \(W=2000\) and 3000 were also considered.
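The rolling scheme can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' implementation: the helper names `har_design` and `rolling_forecasts` are hypothetical, the weekly and monthly HAR components are taken as 5-day and 22-day averages of log RV, the series is simulated, and a shorter window than \(W=1000\) is used only to keep the example small.

```python
import numpy as np

def har_design(log_rv, ui=None):
    """Build HAR regressors dated t and the target log RV dated t+1."""
    log_rv = np.asarray(log_rv, dtype=float)
    rows, target = [], []
    for t in range(21, log_rv.size - 1):
        row = [1.0,
               log_rv[t],                     # daily component
               log_rv[t - 4:t + 1].mean(),    # weekly component (5 days)
               log_rv[t - 21:t + 1].mean()]   # monthly component (22 days)
        if ui is not None:
            row.append(ui[t])                 # uncertainty index dated t
        rows.append(row)
        target.append(log_rv[t + 1])
    return np.array(rows), np.array(target)

def rolling_forecasts(X, y, window):
    """Refit OLS on the latest `window` rows, then forecast the next target."""
    preds = []
    for end in range(window, len(y)):
        # Training rows end at index end-1, so no future target is used.
        beta, *_ = np.linalg.lstsq(X[end - window:end], y[end - window:end], rcond=None)
        preds.append(X[end] @ beta)
    return np.array(preds), y[window:]

# Simulated persistent series standing in for log RV (not real data).
rng = np.random.default_rng(0)
log_rv = np.zeros(600)
for t in range(1, 600):
    log_rv[t] = 0.9 * log_rv[t - 1] + rng.normal(scale=0.3)

X, y = har_design(log_rv)
preds, actual = rolling_forecasts(X, y, window=250)
```

Passing an index series through the `ui` argument adds the fifth regressor of the UI-augmented model; in a faithful replication the composite index itself would also have to be recomputed inside each window.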

To assess the out-of-sample performance of the UI model relative to the benchmark model, we followed Huang et al. (2015) and Neely et al. (2014) and employed the out-of-sample \(R^2\) (\(R_{\mathrm {OS}}^{2}\)), which is given by the following:

$$\begin{aligned} R_{\mathrm {OS}}^{2}=1-\frac{\sum _{t=1}^{T_{\mathrm {OS}}}I_{t}^{\mathrm {C}}\left( RV_{r,t}-RV_{f,t}^{\mathrm {U}}\right) ^{2}}{\sum _{t=1}^{T_{\mathrm {OS}}}I_{t}^{\mathrm {C}}\left( RV_{r,t}-RV_{f,t}^{\mathrm {B}}\right) ^{2}}, \quad \mathrm {C}=\mathrm {Full}, \text{ Expansions }, \text{ Contractions }, \end{aligned}$$

where \(RV_{r,t}\) refers to the actual RV, \(RV_{f,t}^{\mathrm {B}}\) and \(RV_{f,t}^{\mathrm {U}}\) are the predicted values from the benchmark (7) and UI (8) models, respectively, \(T_{\mathrm {OS}}\) denotes the out-of-sample size, and \(I_{t}^{\mathrm {C}}\) is an indicator function whose value is 1 if day t belongs to the periods of C and 0 otherwise. Computing \(R_{\mathrm {OS}}^{2}\) separately during economic expansions and contractions clarifies whether UI exerts a significant out-of-sample predictive power over the different economic periods.
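A small numerical illustration of this statistic is given below. It is a sketch only: the function name `r2_os`, the argument names, and the four data points are hypothetical, and the period indicator is passed as a list of booleans.

```python
def r2_os(actual, bench, ui, in_period=None):
    """Out-of-sample R^2 of the UI model relative to the benchmark.

    in_period plays the role of the indicator I_t^C; None means the full sample.
    """
    if in_period is None:
        in_period = [True] * len(actual)
    sse_ui = sum((a - u) ** 2
                 for a, u, keep in zip(actual, ui, in_period) if keep)
    sse_bench = sum((a - b) ** 2
                    for a, b, keep in zip(actual, bench, in_period) if keep)
    return 1.0 - sse_ui / sse_bench

# Hypothetical values for four out-of-sample days.
actual = [1.0, 2.0, 3.0, 4.0]
bench = [1.5, 2.5, 2.5, 3.5]          # benchmark forecasts
ui = [1.1, 2.1, 2.9, 3.5]             # UI-model forecasts
contraction = [False, False, False, True]
expansion = [not c for c in contraction]

r2_full = r2_os(actual, bench, ui)
r2_exp = r2_os(actual, bench, ui, expansion)
r2_con = r2_os(actual, bench, ui, contraction)
```

Here the UI model improves on the benchmark only on the three expansion days, so the statistic is positive for the full sample and for expansions but zero for the single contraction day.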

We expected \(R_{\mathrm {OS}}^{2}\) to be significantly positive from a statistical perspective, i.e., the mean square prediction error (MSPE) of the competing model is expected to be less than that of the benchmark model, indicating that UI can improve the out-of-sample predictive performance. We employed the approximately normal test for equal predictive accuracy developed by Clark and West (2007). The null hypothesis states that the MSPE of the benchmark model is less than or equal to that of the competing model, and the alternative hypothesis states that it is larger, corresponding to \(H_0\): \(R_{\mathrm {OS}}^{2} \le 0\) against \(H_A\): \(R_{\mathrm {OS}}^{2} > 0\). To implement the test, we regressed the time series \({\hat{f}}_{t}\), formulated by

$$\begin{aligned} {\hat{f}}_{t}=\left( RV_{r,t}-RV_{f, t}^{\mathrm {B}}\right) ^{2}-\left[ \left( RV_{r,t}-RV_{f, t}^{\mathrm {U}}\right) ^{2}-\left( RV_{f, t}^{\mathrm {B}}-RV_{f, t}^{\mathrm {U}}\right) ^{2}\right] , \end{aligned}$$

on a constant and calculated the t statistic corresponding to the constant coefficient. Thereafter, the t statistic from a one-tailed (right) test was employed for the statistical decision.
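The test can be sketched as below. This is a minimal illustration with hypothetical names and data: regressing \({\hat{f}}_{t}\) on a constant by OLS yields its sample mean, so the t statistic is the mean divided by its standard error. The plain OLS standard error used here is our simplification; for multi-step horizons a heteroskedasticity- and autocorrelation-consistent standard error would normally replace it.

```python
import math
import statistics

def clark_west(actual, bench, ui):
    """One-sided Clark-West (2007) test via a regression on a constant."""
    f = [(a - b) ** 2 - ((a - u) ** 2 - (b - u) ** 2)
         for a, b, u in zip(actual, bench, ui)]
    n = len(f)
    # OLS on a constant: the coefficient is mean(f), its s.e. is sd(f)/sqrt(n).
    t_stat = statistics.fmean(f) / (statistics.stdev(f) / math.sqrt(n))
    # Right-tailed p-value from the standard normal distribution.
    p_value = 0.5 * math.erfc(t_stat / math.sqrt(2.0))
    return t_stat, p_value

# Hypothetical three-day example, far too short for real inference.
t_stat, p_value = clark_west(actual=[1.0, 2.0, 3.0],
                             bench=[0.0, 0.0, 0.0],
                             ui=[1.0, 2.0, 2.0])
```

A large positive t statistic leads to rejecting \(H_0\) in favor of the UI model.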

Empirical analysis

This section discusses the predictive power of UIs for RVs of international stock markets based on in- and out-of-sample analyses. Moreover, we investigated their predictive power over longer horizons. Finally, several robustness checks were designed to analyze the performances of the uncertainty indicators under different conditions.

Data and statistical analyses

The information regarding the single uncertainty variables, including their abbreviations, definitions, periods, and data sources, is presented in Table 1. We focused on 23 stock markets globally, namely the U.S., Australia, Belgium, Brazil, Canada, China, Denmark, Euro Area, Finland, France, Germany, Hong Kong, India, Italy, Japan, Mexico, Norway, Pakistan, South Korea, Spain, Sweden, Switzerland, and the U.K., covering five continents as well as developed and developing markets. Notably, these markets have been the main focus of the literature. We obtained the high-frequency RV data of the stock indexes from the realized library.Footnote 2

Table 1 Definition of uncertainty variables

Table 2 presents the descriptive statistics of the RVs. Most stock indexes covered the period between January 1, 2001, and August 31, 2021, while some covered a shorter interval owing to data availability. The autocorrelation coefficients (\(\rho\)) revealed that the RVs were highly persistent, supporting the use of the HAR-RV model. Moreover, the Jarque and Bera (1987) statistic (JB-stat) rejected the null hypothesis of normality for all the time series. Thus, it was necessary to take the logarithmic transformation in the empirical analysis to avoid misleading statistical inferences. The augmented Dickey–Fuller (ADF) statistic (see Cheung and Lai 1995) indicated that all the time series were stationary, which is a prerequisite for the regression analyses that follow. Finally, the difference in the number of observations (Obs.) indicated that each market had a different number of trading days.

Table 2 Descriptive statistics of realized variances

Figure 2 shows the time dynamics of the uncertainty indicators and RVs. The shaded areas highlight the National Bureau of Economic Research (NBER)-dated economic recession periods.Footnote 3 Evidently, RVs increased during the economic contractions, particularly during the 2008 sub-prime crisis and the COVID-19 pandemic, which is consistent with the trends of EMU and VIX. However, it was challenging to determine whether there was a relationship between EPUs and the economic cycle since EPUs fluctuated frequently and irregularly. Moreover, VOL exhibited a relatively subdued tendency. Finally, we noted that several stock indexes fluctuated acutely in ways that the economic cycle could not capture. For example, the Chinese stock market (SSEC) fluctuated greatly and frequently before the 2008 sub-prime crisis and experienced a further shock during the well-known 2015–2016 Chinese stock market turbulence. Additionally, the Pakistani stock market (KSE) exhibited continuous fluctuations over time. These findings indicate that these stock markets were unstable, which could pose challenges to the prediction task.

Fig. 2
figure 2

Time series of realized variances and uncertainty indices

In-sample analysis

Table 3 reports the in-sample results of the one-step-ahead forecasts (\(h=1\)). For the single UIs, we observed that EMU, VIX, DVIX, and VOL significantly impacted RVs in most stock markets. More specifically, EMU and VIX performed poorly only in the Chinese market (SSEC), and VOL could not predict stock volatility in the American (DJI) and Pakistani (KSE) markets. Interestingly, the change in VIX (DVIX) performed well in all the stock markets. Moreover, DVIX delivered a better predictive performance than VIX according to the magnitude of the adjusted \(R^2\), indicating that the changes in VIX captured the market dynamics better than its level did. The positive coefficients further indicated that volatility increases with uncertainty. This result is consistent with earlier findings on the relationship between uncertainty and volatility, e.g., Li et al. (2020) and Megaritis et al. (2021). Overall, the results indicate that the uncertainty information about the U.S. market could effectively impact the stock volatility in many international stock markets.

However, the predictive abilities of the EPU indexes were weak. In terms of the number of significant results, each EPU exerted a significant impact on at most four markets. In terms of significance levels, most of the results were either statistically insignificant or significant only at a weak level (10% or 5%). These findings indicated that EPUs were not strong predictors of stock volatility. This contrasts with Li et al. (2020) and Liu and Zhang (2015), who observed a significant relationship between EPU and stock volatility. A possible reason is that we utilized high-frequency data, whereas they utilized monthly data.

The composite UIs demonstrated a robust and significant predictive power in all stock markets except for the PLS index in the Chinese market (SSEC). This result was expected since the composite indices were derived from many single uncertainty indicators exhibiting significant predictabilities on RV in international stock markets. Moreover, the highest adjusted \(R^2\)s often appeared for the s-PCA index, indicating that this composite uncertainty indicator exerted the best in-sample predictability. Notably, the composite UIs exhibited predictability very close to that of DVIX, which was the best volatility predictor among the single uncertainty indicators. Thus, the predictive ability of the composite UIs might mainly derive from DVIX.

Table 3 In-sample results for one-step-ahead forecasts (\(h=1\))

Out-of-sample analysis

Table 4 presents the out-of-sample results. The bold font highlights the significantly positive \(R_{\mathrm {OS}}^{2}\)s, and the underlined font highlights the highest one.Footnote 4 We observed that EMU, VIX, DVIX, and VOL exhibited insignificant out-of-sample predictive abilities in only a few stock markets. More specifically, EMU exhibited poor ability in forecasting RVs in Italy (FTMIB), Canada (GSPTSE), Pakistan (KSE), and China (SSEC). VIX and DVIX performed poorly only in SSEC and KSE, respectively. Additionally, VOL could not effectively predict the stock volatilities in Brazil (BVSP), America (DJI), Pakistan (KSE), and China (SSEC). The poor performances in China and Pakistan were to be expected because the volatilities of both markets fluctuated greatly and frequently (see Figure 2). Moreover, compared with VIX, DVIX exerted stronger predictive ability in most markets based on its greater \(R_{\mathrm {OS}}^{2}\)s. Thus, DVIX is a better indicator than VIX for identifying the potential movement of stock volatility. This finding supplements the extant literature investigating the short-term impact of VIX on stock volatility, e.g., Wang et al. (2020a) and Liang et al. (2020). However, most EPUs performed poorly and even had negative \(R_{\mathrm {OS}}^{2}\) values in most cases, indicating that the high-frequency relation between EPU and stock volatility was not significant.

The composite UIs exhibited significant predictability on RVs in all the markets except for the s-FPCA index in KSE. Thus, compared with the single uncertainty indicators, the composite indices delivered more robust prediction results. Moreover, the s-PCA methods performed better than PCA and PLS according to the magnitude of \(R_{\mathrm {OS}}^{2}\), indicating that the s-PCA method captured more predictive information from the single uncertainty indicators while incorporating less noise. Although the composite indexes occasionally exhibited the highest \(R_{\mathrm {OS}}^{2}\) (the underlined ones), their prediction accuracy was inferior to that of DVIX in some cases, implying that the predictability was mainly derived from DVIX.

Table 4 Out-of-sample results for one-step-ahead forecast

Comparison with the forecast combination models

We compared the prediction accuracy of the dimension-reduction methods and the forecast combination methods based on the model confidence set (MCS) test of Hansen et al. (2011). The results based on the Tmax statistic, which were evaluated by MSPE and the mean absolute error (MAE), are presented in Table 5.Footnote 5 We set the confidence level to 90%, so that a model was excluded from the MCS if its p-value was <0.1. The p-values were obtained based on 10,000 block bootstraps. The results demonstrated that the maximum p-value generally appeared for the s-(F)PCA model, indicating that, from the statistical perspective, the s-(F)PCA model exhibited better prediction accuracy than the competing models across the evaluation criteria and stock markets (except for KSE).

Table 5 MCS test based on T-MAX statistics for competing models

Longer forecast horizon analyses

To determine whether the predictability of UIs was persistent, we further investigated the out-of-sample performance over longer forecast horizons. More specifically, we set the horizon h to 3, 6, and 12, and Table 6 presents the corresponding results. To conserve space, we only reported \(R_{\mathrm {OS}}^{2}\), where the bold font indicates that the value was significantly positive according to the test of Clark and West (2007), and the underlined font denotes the highest value in the corresponding row. Overall, most UIs exerted a significant predictive power over longer horizons, although their impacts decreased as the forecast horizon increased (except in several particular cases). This result indicated the persistence of their predictive abilities. Interestingly, VIX performed better over the longer prediction horizons, where it attained many of the highest \(R_{\mathrm {OS}}^{2}\)s (the underlined ones). Thus, over longer horizons, VIX was more effective for forecasting stock volatility than the other uncertainty indicators.

Table 6 Out-of-sample predictability for longer horizons

Robustness analyses

Robustness check for different window lengths

Table 7 presents the out-of-sample results when the length of the rolling window (W) was set to 2000 and 3000. We observed that changing the window length had only a weak impact on the results reported above. VIX and DVIX remained the most significant single uncertainty indicators for international stock markets. In particular, DVIX exerted a significant predictive power on RVs of all the markets, including KSE, where it performed poorly when W=1000. Moreover, PLS could not predict the stock volatility in Finland (OMXHPI) and Sweden (OMXSPI) when W=3000, indicating that its predictive power was unstable in several cases. Finally, among the composite indexes, s-PCA exhibited more robust and outstanding predictability. Overall, the results were robust when the window lengths were changed in the rolling regression framework.

Table 7 Robustness check for different windows

Robustness check for the business cycle

The predictability of stock volatility has been shown to change over time; for example, Paye (2012) observed that the predictive performance changed across subperiods. This subsection presents a robustness check to identify whether the out-of-sample predictability changed over the business cycle. Table 8 presents the out-of-sample results during the NBER-dated U.S. economic expansions and contractions.

Regarding the single UIs, we observed that DVIX exhibited robust predictive ability during both economic expansions and recessions in most markets, the exceptions being KSE and Mexico (MXX). Moreover, VIX exhibited poor performance during economic recessions in many countries, including Belgium (BFX), America (DJI), the U.K. (FTSE), Spain (IBEX), Japan (N225), Denmark (OMXC20), Sweden (OMXSPI), Norway (OSEAX), China (SSEC), and Switzerland (SSMI). This indicated that VIX was not a robust predictor in many markets, a result that the extant literature, e.g., Wang et al. (2020a) and Liang et al. (2020), did not report. Further, this result highlights that DVIX was superior to VIX regarding robustness. Moreover, EMU and VOL exerted robust explanatory powers on potential RVs during expansions and recessions in most stock markets, indicating that they were relatively significant predictors of international stock market volatilities. Finally, EPUs again performed poorly in both periods.

Regarding the composite UIs, unlike VIX, PCA exhibited a weak predictive ability during the economic contractions in a few countries. This result is consistent with that of Gong et al. (2022), who observed that investor sentiment predicted stock volatility better under economic expansion conditions than under recession ones. This might be related to the increases in uncertainty during an economic recession, which result in poor predictive performance when employing an unsupervised learning method such as PCA. Moreover, PLS and s-PCA were the only robust indexes that exerted a significant predictive power in both expansions and recessions based on the positive \(R_{\mathrm {OS}}^{2}\). Interestingly, PLS exhibited a better out-of-sample performance during recessions than during expansions, indicating that the PLS method could capture more predictive information during economic recessions.

Table 8 Robustness check for the business cycle

Robustness check employing realized semi-variances as the response variable

Although RV, which has attracted enormous attention in the literature, is a popular measure for identifying market risks, the realized semi-variance, which captures the impacts of negative returns (downside risk), could be more relevant to investors. This measure was developed by Barndorff-Nielsen et al. (2010) and defined by the following equation:

$$\begin{aligned} R S_{t}=\sum _{j=1}^{M_{t}}I_{r_{t,j}<0}\cdot r_{t, j}^{2}, \end{aligned}$$

where \(I_{r_{t,j}<0}\) is an indicator function that takes the value of unity if \(r_{t,j}<0\) and zero otherwise. We replaced (log)RV with (log)RS in the regression models (7) and (8). Table 9 reports whether UIs impacted the realized semi-variance in global stock markets. The findings were consistent with those for RV. More specifically, VIX, DVIX, and s-PCA were the main significant and powerful predictors of stock downside risks in international markets. Moreover, some UIs exerted a notably higher predictive power on the Australian stock market, as evidenced by the large \(R_{\mathrm {OS}}^{2}\)s (27.25% and 22.99% for DVIX and s-PCA, respectively).
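For concreteness, the realized semi-variance of a single day can be computed from its intraday returns as in the following sketch. The function names and the four returns are hypothetical, and the realized variance is included only for comparison, taken here as the sum of all squared intraday returns.

```python
import math

def realized_variance(intraday_returns):
    """Sum of squared intraday returns within one day."""
    return sum(r * r for r in intraday_returns)

def realized_semivariance(intraday_returns):
    """Sum of squared negative intraday returns (downside part only)."""
    return sum(r * r for r in intraday_returns if r < 0)

# Hypothetical intraday returns for one trading day.
returns = [0.01, -0.02, 0.005, -0.01]
rv = realized_variance(returns)
rs = realized_semivariance(returns)
log_rs = math.log(rs)  # the regressions use the logarithm of the measure
```

In this example two of the four returns are negative, and they account for most of the day's realized variance.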

Table 9 Robustness check for using realized semi-variance as dependent variable

Predictability analyses

The empirical results revealed significant differences among the uncertainty indicators regarding predictability. This section further analyzes the reasons through two schemes. In the first one, we compared the prediction errors of all the models, and in the second, we investigated why the composite indexes delivered different out-of-sample performances by analyzing the loadings of the dimension-reduction methods.

Comparison of the prediction error

We conducted the analyses along two dimensions. The first is the time dimension, in which we compared which uncertainty measure delivered better-fitted values over more periods. For example, if DVIX produced a smaller prediction error in more periods than the other indexes, it was considered more likely to achieve high prediction accuracy. The second is the stability dimension, in which we focused on the volatility of the prediction errors. If the residuals of a model fluctuated wildly, its predictions were considered unstable, and many extreme predicted values (colossal prediction errors) could significantly affect the prediction accuracy. Thus, we preferred more stable prediction results, i.e., those exhibiting fewer extreme predicted values.

Owing to the outstanding out-of-sample performance of DVIX, we set it as the benchmark and compared the prediction errors between it and the other UIs (u) over time. We first discussed the time dimension. To do this, we defined the following:

$$\begin{aligned} D_{t}^{u}= {\left\{ \begin{array}{ll} 1, &{} \text{ if } \left| R V_{f, t}^{\mathrm {DVIX}}-R V_{r, t}\right| \le \left| R V_{f, t}^{u}-R V_{r, t}\right| , \\ 0, &{} \text{ otherwise, } \end{array}\right. } \quad t=1, 2, \cdots , T_{\mathrm {OS}}. \end{aligned}$$

Next, we defined a “superior probability”, as follows:

$$\begin{aligned} p_{sup}=\frac{\sum _{t=1}^{T_{\mathrm {OS}}} D_{t}^{u}}{T_{\mathrm {OS}}}. \end{aligned}$$

The condition \(\left| R V_{f, t}^{\mathrm {DVIX}}-R V_{r, t}\right| \le \left| R V_{f, t}^{u}-R V_{r, t}\right|\) checks whether the residual error derived from the HAR-RV-DVIX model was no larger than that derived from the HAR-RV-u model on day t, where \(u\in \mathrm {UI}\setminus \{\mathrm {DVIX}\}\). Thus, Eq. (18) measures the probability that DVIX produces a residual error no larger than that of another UI.
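The superior probability amounts to a simple win rate over the out-of-sample days, as the following sketch shows. The function name and the four-day data are hypothetical.

```python
def superior_probability(actual, f_dvix, f_u):
    """Share of days on which the DVIX forecast error is no larger than that of u."""
    wins = [abs(d - a) <= abs(u - a)
            for a, d, u in zip(actual, f_dvix, f_u)]
    return sum(wins) / len(wins)

# Hypothetical actual values and forecasts for four out-of-sample days.
actual = [1.0, 2.0, 3.0, 4.0]
f_dvix = [1.1, 2.5, 3.0, 3.0]   # forecasts from the DVIX model
f_u = [1.2, 2.1, 3.3, 4.0]      # forecasts from a competing indicator u
p_sup = superior_probability(actual, f_dvix, f_u)
```

Here DVIX is at least as accurate on the first and third days only, so \(p_{sup}\) equals one half.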

Table 10 presents the superior probability, \(p_{sup}\), in each market, where the bold font highlights probabilities <50%. DVIX outperformed the other uncertainty indicators in predicting RVs during more than half of the out-of-sample periods. This pattern held in most markets for every indicator except the s-(F)PCA indexes. Notably, the out-of-sample size was between 1994 and 4176, so 1% in \(p_{sup}\) corresponded to 20–42 observations. Thus, DVIX exhibited better performance than the others except for the s-(F)PCA indexes since it had smaller prediction errors in more periods.

Table 10 Comparison of prediction errors between the DVIX and other uncertainty indicators based on time dimension

We noted that the predicted value of DVIX was closer to the actual value more often than those of the other UIs, although the superiority was modest since the superior probabilities were close to 50%. Thus, we further analyzed the (absolute) prediction error sequence to investigate the impacts of the extreme values (the stability dimension). Table 11 presents the 99%, 95%, and 90% quantiles of the prediction error sequences of the UIs after subtracting those of DVIX. Positive (negative) values denote that the prediction error of DVIX at the quantile was smaller (larger) than that of the UI. We highlighted the negative ones in bold font. The results demonstrated that most UIs exhibited higher extreme prediction errors than DVIX, indicating that DVIX delivered better prediction results since its prediction errors were more stable (exhibiting fewer extreme values). Finally, compared with DVIX, the s-PCA-based index exhibited an advantage in the time dimension and a disadvantage in the stability dimension. This could account for why the two exhibited their prediction advantages in different markets.
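The stability comparison can be sketched as follows. We assume here that each entry is the quantile of the absolute errors of indicator u minus the same quantile for DVIX; the text leaves the exact construction open, so this reading, the function name, and the artificial data are all assumptions.

```python
import statistics

def tail_quantile_gaps(actual, f_dvix, f_u, levels=(99, 95, 90)):
    """Upper quantiles of |error| for u minus those for DVIX.

    Positive gaps mean DVIX has the smaller extreme errors.
    """
    abs_dvix = [abs(a - d) for a, d in zip(actual, f_dvix)]
    abs_u = [abs(a - u) for a, u in zip(actual, f_u)]
    # n=100 returns the 1st..99th percentiles; index p-1 is the p-th percentile.
    q_dvix = statistics.quantiles(abs_dvix, n=100, method="inclusive")
    q_u = statistics.quantiles(abs_u, n=100, method="inclusive")
    return {p: q_u[p - 1] - q_dvix[p - 1] for p in levels}

# Artificial example: the errors of u are exactly twice those of DVIX.
actual = [0.0] * 200
f_dvix = [i / 100 for i in range(200)]
f_u = [2 * x for x in f_dvix]
gaps = tail_quantile_gaps(actual, f_dvix, f_u)
```

Because the competing errors are uniformly larger in this example, all three gaps are positive and grow toward the extreme tail.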

Table 11 Comparison of prediction errors between the DVIX and other uncertainty indicators based on stability dimension

Comparison of the composite UIs

The empirical results demonstrated that the PCA-based and PLS-based composite UIs delivered lower prediction accuracies than the s-PCA-based ones. This subsection discusses the loadings of these dimension-reduction methods to explain the result. Put differently, we analyzed the main contributors to these composite indexes. Unlike He et al. (2021) and Neely et al. (2014), who employed static analysis to discuss the loadings, we employed dynamic analysis, which enabled us to observe the changes in the weights over time and avoided conclusions that hold only for a particular sample. Based on the one-step-ahead rolling scheme (W=1000), we recomputed the loadings at each step. Thus, the length of each series of loadings corresponded to the out-of-sample size.

Time-varying loadings of the PCA factors

Figure 3 displays the loadings of the PCA factors over time. First, we observed that each loading changed over time, indicating that the contribution of each predictor to the PCA factor was time-varying. Thus, the time-varying analysis was more suitable than the static analysis. Moreover, we observed that the single UIs exhibited loadings of similar magnitude, indicating that each predictor played a roughly equal role in the PCA component for all or part of the sample. Notably, EPUs exhibited a limited explanatory power on RVs, so weighting them on a par with the stronger predictors likely impaired the predictability of the PCA index.

Fig. 3
figure 3

Time-varying loadings of the PCA factors (\(W=1000\))

Time-varying loadings of the PLS factors

Figure 4 shows that the loadings of the PLS factors were more stable over time than those of the PCA method, except for EMU. The figure shows that EMU exhibited the largest weight, followed by VOL, DVIX, and the other predictors, indicating that EMU was the main contributor to the PLS-based UI even though its weight varied over time. Recall from the in- and out-of-sample results (Tables 3 and 4) that EMU, VOL, and DVIX exerted a significant predictive power on stock volatility in most markets. Thus, PLS performed better than PCA since it could identify and extract the significant predictors and reduce the impacts of the insignificant predictors (EPUs).

Fig. 4
figure 4

Time-varying loadings of the PLS factors (\(W=1000\))

Time-varying loadings of the s-PCA factors

Figure 5 shows the time-varying loadings of the s-PCA factors. Interestingly, the figure shows that the s-PCA-based index was mainly constructed from DVIX and VOL since they exhibited significantly higher weights than the other predictors. VOL dominated the other predictors before 2009, while DVIX became the main contributor afterward. For the other predictors (EPUs and EMU), we observed that their weights approached zero over time, indicating that their contributions to the s-PCA-based UIs were limited. Recall that DVIX delivered more outstanding in- and out-of-sample performances than VOL and the other predictors in volatility forecasting. Although both PLS and s-PCA are supervised learning techniques, s-PCA could further differentiate the relative importance of the strong predictors. Put differently, s-PCA could distinguish the better predictor from the worse one among the strong predictors, DVIX and VOL, and weight them accordingly, whereas PLS could identify the powerful predictors but could not assign them reasonable weights. Thus, s-PCA is a more effective dimension-reduction method in the presence of strong and weak predictors.
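The contrast between the PCA and s-PCA loadings can be illustrated with simulated data. The sketch below is an assumption-laden illustration rather than the construction used in the paper: it scales each standardized predictor by its univariate OLS slope on the next-period target before extracting the first principal component, which is our reading of the scaled PCA idea of Huang et al. (2021), and all function names and data are hypothetical.

```python
import numpy as np

def standardize(X):
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def first_pc_weights(Z):
    """Unit-length weights of the first principal component of Z."""
    Zc = Z - Z.mean(axis=0)
    _, _, vt = np.linalg.svd(Zc, full_matrices=False)
    return vt[0]

def pca_loadings(X):
    """Plain PCA: the target plays no role in the weights."""
    return first_pc_weights(standardize(X))

def spca_loadings(X, y_next):
    """Effective weight of each standardized predictor in the scaled factor."""
    Xs = standardize(X)
    yc = y_next - y_next.mean()
    slopes = Xs.T @ yc / (Xs ** 2).sum(axis=0)  # univariate predictive slopes
    w = first_pc_weights(Xs * slopes)           # PCA on the scaled predictors
    return slopes * w                           # factor = Xs @ (slopes * w)

# Simulated data: one strong predictor and three correlated weak ones.
rng = np.random.default_rng(1)
T = 500
strong = rng.normal(size=T)
common = rng.normal(size=T)
weak = common[:, None] + 0.3 * rng.normal(size=(T, 3))
X = np.column_stack([strong, weak])
y_next = strong + 0.5 * rng.normal(size=T)      # target depends on `strong` only

w_pca = pca_loadings(X)
w_spca = spca_loadings(X, y_next)
```

In this setting plain PCA loads mainly on the three correlated but uninformative predictors, whereas the scaled version concentrates its weight on the strong predictor. The signs of the weights are arbitrary, so only their absolute values should be compared.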

Fig. 5
figure 5

Time-varying loadings of the Scaled-PCA factors (\(W=1000\))

Index performance during the financial crises

To further observe the differences among the composite UIs intuitively, we plotted their time series. Since we employed daily data collected over a long period, we displayed the time series around two well-known crises, namely the 2008 subprime crisis (January 1, 2007, to December 31, 2009) and the 2020 COVID-19 pandemic (January 1, 2020, to the end of the year). For comparison, we added the time dynamics of the U.S. market RV as a reference. Figure 6 shows that the s-PCA-based index (red line) fluctuated synchronously and consistently with the RV of DJIA (blue line), such as on March 3, 2007, November 3, 2008, and August 2, 2019. The PLS-based index (cyan line) behaved similarly to the s-PCA-based index only in periods of great fluctuations, such as September 2008 and March 2020. Otherwise, it exhibited only small swings over time, unlike RV and the s-PCA-based index, which fluctuated frequently. However, the PCA-based index (orange line) fluctuated continually over time, resembling a random walk process. Although it was challenging to visually capture the relationship between it and RV, we observed no significant difference in the behavior of the PCA-based index between crisis and non-crisis periods.

Fig. 6
figure 6

Comparison of uncertainty indices before and after crises

In summary, the loading and graphical analyses revealed that the s-PCA method outperformed PCA and PLS for two reasons. First, the s-PCA method identified strong predictors and could further place a reasonable weight on each predictor. Second, compared with the PLS method, s-PCA could mitigate over-fitting and avoid incorporating much noise because it transforms many predictors into orthogonal components (Huang et al. 2021), thus reducing the number of variables.

Conclusions


An uncertainty index is beneficial to investors making decisions and to policymakers monitoring market risks. Although enormous efforts have been invested in constructing such indexes, methods for building one that has a relatively fixed composition and exerts significant impacts on international stock volatilities are still rare, and this study has filled that research gap. We constructed a composite uncertainty index based on the s-PCA method and investigated the high-frequency relationship between the proposed index and stock volatilities in global markets. The proposed index comprehensively captured the uncertainties from the equity-market, investor, and economic-policy levels. More crucially, its relatively fixed composition made it practical and user-friendly.

The empirical analyses of 23 international stock market volatilities revealed that the proposed index exhibited excellent in- and out-of-sample predictability, and its performance was better and more robust than those of competing models, including the widely employed PCA and PLS methods. This superiority has a clear rationale. One reason is that the proposed method retained the advantage of the PCA method, which avoids adding much noise to the prediction task and reduces the risk of overfitting. The other reason is that the proposed index could not only identify relevant predictors but also make the best use of them by placing more weight on more informative predictors, which the PLS method could not.

Our results have the following practical implications: (i) We provided fixed and valuable indicators for investors and policymakers with keen interests in the international stock markets. These indicators can effectively reflect market risk dynamics. (ii) We established that the high-frequency relationship between EPU and stock volatility is insignificant, which serves as a warning to short-term investors when allocating their wealth. (iii) We discussed the differences among popular dimension-reduction methods in dealing with both strong and weak factors, which offers a good reference to scholars and practitioners employing econometric models to investigate market movements.

Availability of data and materials

The data that support the findings of this study were derived from public-domain resources and are available on public websites. All data are also available from the authors upon reasonable request.


  1. Realized variance is the square of realized volatility, so the two have the same economic meaning. Although we focus on realized variance in this study, we use the terms stock volatility and realized variance interchangeably.

  2. See

  3. We split economic expansions and recessions following the NBER, see

  4. Note that the \(R_{\mathrm{OS}}^{2}\) is not very large in some cases but is statistically significant. This is common with high-frequency data such as those used in this study. Similarly, He et al. (2021) report statistically significant \(R_{\mathrm{OS}}^{2}\) values of 18.38%, 14.53%, and 0.55% at the monthly, weekly, and daily frequencies, respectively.

  5. The results based on the TR statistic are consistent with those based on the TMAX statistic. They are not reported owing to space limitations but are available from the authors. For details of the MCS test, see Hansen et al. (2011) and Zhao et al. (2021).


  • Aldy JE, Viscusi WK (2014) Chapter 10—Environmental risk and uncertainty. In: Handbook of the economics of risk and uncertainty, 1st edn. North-Holland, Kidlington, pp 601–649

  • Andersen TG, Bollerslev T, Diebold FX, Ebens H (2001) The distribution of realized stock return volatility. J Financ Econ 61(1):43–76

  • Baele L (2005) Volatility spillover effects in European equity markets. J Financ Quant Anal 40(2):373–401

  • Baker M, Wurgler J (2006) Investor sentiment and the cross-section of stock returns. J Financ 61(4):1645–1680

  • Baker SR, Bloom N, Davis SJ (2016) Measuring economic policy uncertainty. Q J Econ 131(4):1593–1636

  • Baker SR, Bloom N, Davis SJ, Kost K (2019) Policy news and stock market volatility. National Bureau of Economic Research Working Paper (25720)

  • Barndorff-Nielsen OE, Kinnebrock S, Shephard N (2010) Measuring downside risk: realised semivariance. Oxford University Press, Oxford, pp 117–136

  • Catania L, Proietti T (2020) Forecasting volatility with time-varying leverage and volatility of volatility effects. Int J Forecast 36(4):1301–1317

  • Chen J, Tang G, Yao J, Zhou G (2022) Investor attention and stock returns. J Financ Quant Anal 57(2):455–484

  • Cheung Y-W, Lai KS (1995) Lag order and critical values of the augmented Dickey–Fuller test. J Bus Econ Stat 13(3):277–280

  • Chiang M-H, Wang L-M (2011) Volatility contagion: a range-based volatility approach. J Econom 165(2):175–189

  • Chiang TC, Zheng D (2010) An empirical analysis of herd behavior in global stock markets. J Bank Finance 34(8):1911–1921

  • Chkili W (2021) Modeling bitcoin price volatility: long memory vs Markov switching. Eurasian Econ Rev 11(3):433–448

  • Choudhry T (2010) World War II events and the Dow Jones industrial index. J Bank Finance 34(5):1022–1031

  • Christou C, Gupta R, Hassapis C, Suleman T (2018) The role of economic uncertainty in forecasting exchange rate returns and realized volatility: evidence from quantile predictive regressions. J Forecast 37(7):705–719

  • Cipollini A, Cascio IL, Muzzioli S (2015) Volatility co-movements: a time-scale decomposition analysis. J Empir Financ 34:34–44

  • Clark TE, West KD (2007) Approximately normal tests for equal predictive accuracy in nested models. J Econom 138(1):291–311

  • Corsi F (2009) A simple approximate long-memory model of realized volatility. J Financ Econom 7(2):174–196

  • Da Z, Engelberg J, Gao P (2011) In search of attention. J Financ 66(5):1461–1499

  • Deeney P, Cummins M, Dowling M, Bermingham A (2015) Sentiment in oil markets. Int Rev Financ Anal 39:179–185

  • Diebold FX, Yilmaz K (2009) Measuring financial asset return and volatility spillovers, with application to global equity markets. Econ J 119(534):158–171

  • Gong X, Zhang W, Wang J, Wang C (2022) Investor sentiment and stock volatility: new evidence. Int Rev Financ Anal

  • Goodell JW, McGee RJ, McGroarty F (2020) Election uncertainty, economic policy uncertainty and financial market uncertainty: a prediction market analysis. J Bank Finance 110:105684

  • Gu S, Kelly B, Xiu D (2020) Empirical asset pricing via machine learning. Rev Financ Stud 33(5):2223–2273

  • Guo Y, He F, Liang C, Ma F (2022) Oil price volatility predictability: new evidence from a scaled PCA approach. Energy Econ 105:105714

  • Hansen PR, Lunde A, Nason JM (2011) The model confidence set. Econometrica 79(2):453–497

  • He M, Zhang Y, Wen D, Wang Y (2021) Forecasting crude oil prices: a scaled PCA approach. Energy Econ 97:105189

  • Huang D, Jiang F, Tu J, Zhou G (2015) Investor sentiment aligned: a powerful predictor of stock returns. Rev Financ Stud 28(3):791–837

  • Huang D, Jiang F, Li K, Tong G, Zhou G (2020) Are bond returns predictable with real-time macro data? Available at SSRN, 3107612

  • Huang D, Jiang F, Li K, Tong G, Zhou G (2021) Scaled PCA: a new approach to dimension reduction. Manag Sci 68(3):1678–1695

  • Huang Y, Luk P (2020) Measuring economic policy uncertainty in China. China Econ Rev 59:1–18

  • Jarque CM, Bera AK (1987) A test for normality of observations and regression residuals. Int Stat Rev 55(2):163–172

  • Jurado K, Ludvigson SC, Ng S (2015) Measuring uncertainty. Am Econ Rev 105(3):1177–1216

  • Karabulut G, Bilgin MH, Doker AC (2020) The relationship between commodity prices and world trade uncertainty. Econ Anal Policy 66:276–281

  • Kaviani MS, Kryzanowski L, Maleki H, Savor P (2020) Policy uncertainty and corporate credit spreads. J Financ Econ 138(3):838–865

  • Khan MA, Qin X, Jebran K (2020) Uncertainty and leverage nexus: does trade credit matter? Eurasian Bus Rev 10:355–389

  • Li T, Ma F, Zhang X, Zhang Y (2020) Economic policy uncertainty and the Chinese stock market volatility: novel evidence. Econ Model 87:24–33

  • Liang C, Wei Y, Zhang Y (2020) Is implied volatility more informative for forecasting realized volatility: an international perspective. J Forecast 39(8):1253–1276

  • Liao C, Luo Q, Tang G (2021) Aggregate liquidity premium and cross-sectional returns: evidence from China. Econ Model 104:105645

  • Liu J, Zhang Z, Yan L, Wen F (2021) Forecasting the volatility of EUA futures with economic policy uncertainty using the GARCH-MIDAS model. Financ Innov 7(1):1–19

  • Liu L, Zhang T (2015) Economic policy uncertainty and stock market volatility. Financ Res Lett 15:99–105

  • Megaritis A, Vlastakis N, Triantafyllou A (2021) Stock market volatility and jumps in times of uncertainty. J Int Money Finance 113:102355

  • Neely CJ, Rapach DE, Tu J, Zhou G (2014) Forecasting the equity risk premium: the role of technical indicators. Manag Sci 60(7):1772–1791

  • Newey WK, West KD (1987) A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55(3):703–708

  • Ng A (2000) Volatility spillover effects from Japan and the US to the Pacific-Basin. J Int Money Financ 19(2):207–233

  • Pastor L, Veronesi P (2012) Uncertainty about government policy and stock prices. J Financ 67(4):1219–1264

  • Paye BS (2012) ‘Déjà vol’: predictive regressions for aggregate stock market volatility using macroeconomic variables. J Financ Econ 106(3):527–546

  • Phan DHB, Iyke BN, Sharma SS, Affandi Y (2021) Economic policy uncertainty and financial stability—is there a relation? Econ Model 94:1018–1029

  • Timmermann A (2006) Chapter 4 Forecast combinations. In: Handbook of economic forecasting, 1, pp 135–196

  • Tsai I-C (2017) The source of global stock market risk: a viewpoint of economic policy uncertainty. Econ Model 60:122–131

  • Uygur U, Taş O (2014) The impacts of investor sentiment on returns and conditional volatility of international stock markets. Qual Quant 48(3):1165–1179

  • Vu NT (2015) Stock market volatility and international business cycle dynamics: evidence from OECD economies. J Int Money Financ 50:1–15

  • Wang J, Lu X, He F, Ma F (2020) Which popular predictor is more useful to forecast international stock markets during the coronavirus pandemic: VIX vs EPU? Int Rev Financ Anal 72:101596

  • Wang L, Ma F, Liu J, Yang L (2020) Forecasting stock price volatility: new evidence from the GARCH-MIDAS model. Int J Forecast 36(2):684–694

  • Weiss CE, Raviv E, Roetzer G (2018) Forecast combinations in R using the ForecastComb Package. R Journal 10(2):262–281

  • Yan X, Bai J, Li X, Chen Z (2022) Can dimensional reduction technology make better use of the information of uncertainty indices when predicting volatility of Chinese crude oil futures? Resour Policy 75:102521

  • Zhang W, Yan K, Shen D (2021) Can the Baidu Index predict realized volatility in the Chinese stock market? Financ Innov 7(1):1–31

  • Zhang W, Gong X, Wang C, Ye X (2021) Predicting stock market volatility based on textual sentiment: a nonlinear analysis. J Forecast 40(8):1479–1500

  • Zhang Y, Ma F, Liao Y (2020) Forecasting global equity market volatilities. Int J Forecast 36(4):1454–1475

  • Zhao Y, Zhang W, Gong X, Wang C (2021) A novel method for online real-time forecasting of crude oil price. Appl Energy 303(1):117588


The authors have no potential conflict of interest to declare. We thank the editor (Gang Kou), the assistant editor (Yingjie Zhang), and three anonymous reviewers, whose comments greatly improved the quality of this paper. The authors alone are responsible for the content of this article. We thank Xin Ye from Wuhan University for valuable suggestions, and the seminar participants at the Financial Service Innovation and Risk Management Research Base of Guangzhou. This work was supported by the National Natural Science Foundation of China (Grant No. 71720107002, U1901223 [Joint Foundation with Guangdong Province], 71771091, 71901124), the Foundation for Key Program of the Ministry of Science and Technology of China (Grant No. 2020AAA0108404), and the Fundamental Research Funds for the Central Universities (No. ZKXM202127).

Author information


XG: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Writing—original draft, Writing—Review & Editing. WZ: Conceptualization, Funding acquisition, Resources, Supervision, Writing—Review & Editing. WX: Funding acquisition, Data curation, Writing—Review & Editing. ZL: Funding acquisition, Methodology, Writing—Review & Editing. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Weiguo Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Gong, X., Zhang, W., Xu, W. et al. Uncertainty index and stock volatility prediction: evidence from international markets. Financ Innov 8, 57 (2022).


  • Uncertainty index
  • High-frequency data
  • Realized variance
  • Scaled-PCA

JEL Classifications

  • C22
  • G15
  • G17
  • G41