Discrete time stochastic volatility models commonly assume that the financial (log) return, at time t, is given by
$$\begin{aligned} r_t = e^{h_t/2}\epsilon _t, \end{aligned}$$
(1)
where
$$\begin{aligned} h_t = \mu + \phi (h_{t-1}-\mu ) + \sigma \eta _t \end{aligned}$$
(2)
is the log variance. Here \(\sigma >0\), \(|\phi |<1\), and \(\mu \in {\mathbb{R}}\) are parameters, and \(\{\epsilon _t\}\) and \(\{\eta _t\}\) are sequences of independent and identically distributed (iid) random variables representing the innovations for \(r_t\) and \(h_t\), respectively. We do not, in general, assume that for a given t, \(\epsilon _t\) and \(\eta _t\) are independent of each other. Note that the log variance is modeled by an AR(1) process. The assumption that \(|\phi |<1\) ensures that this process is weakly stationary, see Ruppert and Matteson (2015). Under general conditions on the distributions of the innovations, this model can be seen as a discretization of a continuous time SV model where the log variance is modeled by a process of Ornstein–Uhlenbeck type, see Taylor (1994) or Barndorff-Nielsen and Shephard (2001). We are interested in evaluating the ES for this model.
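As a concrete illustration, the model (1)–(2) can be simulated as follows. This is a minimal sketch: the choice of standard normal innovations with \(\epsilon_t\) independent of \(\eta_t\), and the particular parameter values, are assumptions made for the example only; the model allows other distributions and dependence between the two innovations.

```python
import numpy as np

def simulate_sv(T, mu=-1.0, phi=0.95, sigma=0.2, seed=0):
    """Simulate T returns from the discrete-time SV model (1)-(2).

    Illustrative assumptions: eps_t and eta_t are iid standard normal
    and independent of each other.
    """
    rng = np.random.default_rng(seed)
    h = np.empty(T)
    # Initialize the log variance from its stationary law N(mu, sigma^2/(1-phi^2))
    h[0] = mu + sigma * rng.standard_normal() / np.sqrt(1.0 - phi**2)
    for t in range(1, T):
        # AR(1) recursion (2): h_t = mu + phi*(h_{t-1} - mu) + sigma*eta_t
        h[t] = mu + phi * (h[t - 1] - mu) + sigma * rng.standard_normal()
    # Return equation (1): r_t = e^{h_t/2} * eps_t
    r = np.exp(h / 2.0) * rng.standard_normal(T)
    return r, h
```

The stationary initialization avoids a burn-in period; starting instead from \(h_0=\mu\) and discarding early observations would serve the same purpose.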
We begin by establishing some notation. Let \({\mathcal{F}}_{t-1}\) denote the information set available at time \(t-1\). For simplicity, we sometimes write \(\text{P}_{t-1}\) to denote the conditional probability \(\text{P}_{t-1}(\cdot ) = \text{P}(\cdot \mid {\mathcal{F}}_{t-1})\) and \(\text{E}_{t-1}\) to denote the conditional expectation \(\text{E}_{t-1}(\cdot ) =\text{E}(\cdot \mid {\mathcal{F}}_{t-1})\). For \(\tau \in (0,1)\), the \(\tau\)th VaR at time t, denoted by \(\text{VaR}_{\tau }(t)\), is the smallest number for which \(\text{P}_{t-1}\left\{ r_{t}\le \text{VaR}_{\tau }(t)\right\} \ge \tau\). Note that \(\text{VaR}_\tau (t)\) is the \(\tau\)th conditional (given \({\mathcal{F}}_{t-1}\)) quantile of \(r_{t}\). For this reason, we sometimes write \(Q_\tau (r_t\mid {\mathcal{F}}_{t-1})\) for \(\text{VaR}_{\tau }(t)\). The \(\tau\)th ES at time t, denoted by \(\text{ES}_{\tau }(t)\), is defined by
$$\begin{aligned}\text{ES}_{\tau }(t) = \frac{1}{\tau } \int _0^\tau \text{VaR}_{s}(t) \text{d}s, \end{aligned}$$
when the integral exists, and is undefined otherwise. The parameter \(\tau\) is typically chosen to be a small number such as 0.01, 0.025, or 0.05. Throughout, we assume

1.
that the distribution of \(r_t\) is continuous, and

2.
that it satisfies
$$\begin{aligned} \text{E}_{t-1}(|r_t|)<\infty . \end{aligned}$$
(3)
The second assumption ensures that \(\text{ES}_{\tau }(t)\) is well defined, while the first allows us to use the more explicit formula
$$\begin{aligned} \text{ES}_{\tau }(t) =\text{E}_{t-1}\left[ r_{t} \mid r_{t} \le \text{VaR}_{\tau }(t)\right] =\frac{1}{\tau } \text{E}_{t-1}\left[ r_{t} 1\{r_{t} \le \text{VaR}_{\tau }(t)\} \right] . \end{aligned}$$
Here and throughout, we write \(1\{\cdot \}\) to denote the indicator function.
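The equality of the two expressions above can be checked numerically for any continuous distribution. As an illustrative example (not part of the model), take \(r_t\) standard normal: then \(\text{E}[Z\,1\{Z \le z\}] = -\varphi (z)\), where \(\varphi\) is the standard normal density, so the \(\tau\)th ES has the closed form \(-\varphi (z_\tau )/\tau\) with \(z_\tau\) the \(\tau\)th quantile. A minimal sketch:

```python
import numpy as np
from statistics import NormalDist

tau = 0.05
nd = NormalDist()
var_tau = nd.inv_cdf(tau)            # the tau-th quantile, i.e. VaR_tau
es_exact = -nd.pdf(var_tau) / tau    # closed form: -phi(z_tau)/tau

rng = np.random.default_rng(1)
r = rng.standard_normal(10**6)
# Conditional-mean form: E[r | r <= VaR_tau]
es_cond = r[r <= var_tau].mean()
# Truncated-expectation form: (1/tau) E[r 1{r <= VaR_tau}]
es_trunc = (r * (r <= var_tau)).mean() / tau
```

Both Monte Carlo estimates agree with the closed form up to sampling error, as the two formulas coincide for continuous distributions.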
Using the fact that the innovations are independent over time, together with basic properties of quantiles and expectations, we can write
$$\begin{aligned} \text{ES}_{\tau }(t) = e^{ \{ \mu (1-\phi )+\phi h_{t-1} \} /2} M(\tau ,\sigma ), \end{aligned}$$
(4)
where
$$\begin{aligned} M(\tau ,\sigma ) = \frac{1}{\tau }\text{E}\left[ e^{\sigma Y/2}Z 1\{e^{\sigma Y/2}Z<a\}\right] , \end{aligned}$$
(5)
\(a=Q_\tau \left( e^{\sigma Y/2}Z\right)\) is the \(\tau\)th (unconditional) quantile of the random variable \(e^{\sigma Y/2}Z\), and the joint distribution of (Y, Z) is the same as the joint distribution of \((\eta _t,\epsilon _t)\). The difficulty in evaluating M is that we must work with the distribution of \(X=e^{\sigma Y/2}Z\), which can be complicated even when the distributions of Y and Z are fairly simple. Little is known about the distribution of X even in the case where Y and Z are both standard normal random variables, see Yang (2008) and the references therein. For this reason, we develop Monte Carlo methods to approximate \(M(\tau ,\sigma )\).
We begin by approximating \(a=Q_\tau (e^{\sigma Y/2}Z)\). Toward this end, fix some large integer \(N_1\) and simulate an iid sequence of bivariate random variables \(\{(Y_i,Z_i)\}_{i=1}^{N_1}\) from the joint distribution of \((\eta _t,\epsilon _t)\). Next, for \(i=1,2,\ldots ,N_1\), set \(X_i = e^{\sigma Y_i/2}Z_i\). Now sort these from smallest to largest to get \(X_{(1)}\le X_{(2)}\le \cdots \le X_{(N_1)}\). Finally, approximate \(a=Q_\tau (e^{\sigma Y/2}Z)\) by
$$\begin{aligned} {\widehat{a}} = X_{(\lfloor \tau N_1\rfloor )}, \end{aligned}$$
(6)
where \(\lfloor \cdot \rfloor\) is the floor function. One can also use a smooth approximation using kernel estimators, see e.g. Sheather and Marron (1990). However, we did not find much of an improvement when using these. Next, fix another large integer \(N_2\) and simulate a new iid sequence \(\{(Y_i,Z_i)\}_{i=1}^{N_2}\) from the joint distribution of \((\eta _t,\epsilon _t)\) and approximate \(M(\tau ,\sigma )\) by
$$\begin{aligned} \widehat{M}_1(\tau ,\sigma ) = \frac{1}{N_2\tau } \sum _{i=1}^{N_2} e^{\sigma Y_i/2}Z_i1\{e^{\sigma Y_i/2}Z_i\le {\widehat{a}}\} . \end{aligned}$$
(7)
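The two steps above, the order-statistic estimate (6) and the truncated mean (7), can be sketched as follows. For illustration we take Y and Z to be independent standard normals; in general they must be drawn from the joint distribution of \((\eta_t, \epsilon_t)\).

```python
import numpy as np

def estimate_M(tau, sigma, N1=10**5, N2=10**5, seed=0):
    """Monte Carlo approximation of M(tau, sigma) via (6)-(7).

    Illustrative assumption: (Y, Z) are independent standard normals.
    """
    rng = np.random.default_rng(seed)

    # Step 1, eq. (6): estimate a = Q_tau(e^{sigma Y/2} Z) by the
    # floor(tau*N1)-th order statistic of a simulated sample.
    X = np.exp(sigma * rng.standard_normal(N1) / 2) * rng.standard_normal(N1)
    a_hat = np.sort(X)[int(np.floor(tau * N1)) - 1]  # 1-based order statistic

    # Step 2, eq. (7): fresh sample, truncated mean below a_hat.
    X2 = np.exp(sigma * rng.standard_normal(N2) / 2) * rng.standard_normal(N2)
    return (X2 * (X2 <= a_hat)).sum() / (N2 * tau)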
We note that, in principle, one can use the same dataset to evaluate both \(a\) and \(M(\tau ,\sigma )\), although for smaller sample sizes this may introduce bias. Either way, the difficulty with this approach is that approximately \((1-\tau )100\%\) of the simulated values will not satisfy the condition in the indicator function in (7) and will thus be thrown out. As such, very few values will actually be used in the sum. For this reason, we may need \(N_2\) to be an extremely large number to get a reasonable approximation. One could try to implement an importance sampling or related modification, but the fact that we are working with the product of two random variables makes it difficult to use such an approach. Instead, we use the specific structure of this problem to implement an approach that works better in several important situations.