Tail risk is a classic topic in stressed portfolio optimization, which must treat unprecedented risks for which the traditional mean–variance approach may fail to perform well. This study proposes an innovative semiparametric method consisting of two modeling components: nonparametric estimation for each marginal distribution of the portfolio and the copula method for their joint distribution. We then focus on the optimal weights of the stressed portfolio and its optimal scale beyond the Gaussian restriction. Empirical studies include statistical estimation for the semiparametric method, risk measure minimization for optimal weights, and value measure maximization for the optimal scale to enlarge the investment. From the outputs of short-term and long-term data analysis, optimal stressed portfolios demonstrate the advantages of model flexibility to account for tail risk over the traditional mean–variance method.

Introduction

Several historical episodes, such as the financial crisis and COVID-19, have posed new challenges for investment management in the form of unknown and unprecedented tail risks. A large body of econometric literature explores the validation of various financial models and risk measures, such as value-at-risk (VaR) and conditional value-at-risk (CVaR), for risk management (Jorion 2007). We extend the use of these risk measures (Artzner et al. 1999) for portfolio optimization using a novel semiparametric modeling method under stressed scenarios. The scaling effect of stressed portfolios is also a concern. Risk-sensitive value measures (Miyahara 2010) are adopted to determine the optimal scale for a given portfolio strategy.

The proposed semiparametric modeling method is constructive and consists of two estimation procedures: the nonparametric kernel method for marginal distributions and a parametric copula method for their joint distribution. This semiparametric method builds up a more complex dependence between portfolio constituents than traditional Gaussian models, and this dependence can be used to explore tail risks.

From both experimental and theoretical perspectives, we find that the proposed optimal stressed portfolio and the semiparametric method perform better than Markowitz’s mean–variance method (Markowitz 1952). From an experimental perspective, our implementation of the stressed portfolio optimization relies on a rolling window approach and checks its robustness. In addition, from a theoretical perspective, the risk-sensitive value measure (RSVM) is equipped with more properties for general heavy-tail distribution than Markowitz’s mean–variance model, thus making mean–variance a special case in the risk-sensitive value measure.

The remainder of this paper is organized as follows: “Literature review” section provides a literature review, particularly on the nonparametric kernel method and the parametric copula method. “The semiparametric method” section generates non-Gaussian distributed portfolios using the proposed semiparametric method in two parts. First, we construct the marginal distribution of each constituent asset by nonparametric estimation, using cross-validation to obtain the optimal bandwidth of a kernel function together with its perturbation analysis. Second, we estimate the parameters of copula functions by full maximum likelihood estimation (MLE). “Stressed portfolio optimization and its scaling effect” section solves the optimal weights of the portfolio using the semiparametric method by minimizing risk measures, such as VaR and CVaR. The scaling effect is then optimized by maximizing the risk-sensitive value measures. “Empirical studies and data analysis” section presents the data set, intensive empirical results, and a comparison between the stressed portfolio and the traditional mean–variance method. We conclude the paper in “Conclusion” section.

Literature review

There are two major directions for tail risk estimation: modeling the return distribution and capturing the volatility process. For the former direction, various techniques are employed for modeling the entire return distribution or just the tail areas, including known parametric distributions, kernel density approximation, and extreme value theory (Tsay 2010). The latter direction mostly relies on discrete-time volatility models, such as the exponentially weighted moving average (EWMA) model and the generalized autoregressive conditional heteroskedasticity (GARCH) model, to capture the volatility process. See Jondeau et al. (2007) for further details.

Traditional modeling methods in financial management often rely on the Gaussian distribution by virtue of closed-form solutions for mean–variance analysis (Fu et al. 2021), the optimal risk measure, and so on. There are other risk measures such as the Entropic Value-at-Risk (Mills et al. 2017). However, stylized facts such as heavy tails and asymmetry in empirical distributions expose additional risk when these initial assumptions are violated. In contrast, we relax the Gaussian assumption using a semiparametric method, which renders flexible distributions to describe more details and properties for unknown tail risks.

Distinct from previous studies on financial modeling, the aim of this study is to build up the joint distribution of portfolios in high dimensions without assumptions on the distribution of each underlying asset. This innovative construction of a joint distribution is based on nonparametric estimation (Robinson 1983) and the copula method (Cherubini et al. 2004, 2011). Nonparametric estimation with a kernel function is adopted to estimate the probability density function of each underlying asset, and the parametric copula method is used to describe a joint distribution between the assets of the portfolio. Among nonparametric estimations, several studies explore the optimal kernel functions and bandwidth for the estimation of Robinson (1983). There is no universally accepted approach to selecting the optimal kernel function, and this choice has little influence on the estimation results. We therefore concentrate on the selection of the optimal bandwidth using cross-validation theory (Horová et al. 2012). A bias estimation for the perturbed optimal bandwidth is derived. Regarding the parametric copula method, there are two primitive families of copula functions: elliptical and Archimedean copulas (Nelsen 1999). The multivariate copula method builds up the dependence among portfolio constituents.

Notably, the proposed semiparametric modeling method is static, in comparison to dynamic multivariate models such as the GARCH DCC model (Engle 2002) in discrete time or the stochastic volatility matrix model in continuous time (Mancino et al. 2017; Han 2018). The static model can be quite complex in its structure, whereas the dynamic model has better prediction capability. The stressed portfolio optimization problem under the static model is the focus of this study. Owing to the complexity of financial modeling, computational schemes such as optimization solvers and Monte Carlo estimators based on simulations are utilized. There are several techniques for solving portfolio optimization models (Esfahanipour and Khodaee 2021), such as particle swarm optimization (PSO). Motivated by Babazadeh and Esfahanipour (2019), we use the genetic algorithm (GA) in MATLAB’s package to solve the risk measure minimization problems. In contrast, the dynamic counterpart requires solving high-dimensional nonlinear HJB-type partial differential equations (Fleming and Soner 2006) in continuous time.

A nonparametric estimation utilizes kernel functions to smooth the discrete raw data into a continuous distribution. The degree of smoothness depends only weakly on the choice of kernel function and mainly on the bandwidth of the kernel.

Kernel function

There are many choices of kernel functions, such as the Gaussian kernel, exponential kernel, and Cauchy kernel. However, the Gaussian kernel,

$$K\left(t\right)=\frac{1}{\sqrt{2\pi }}\mathrm{exp}\left(-\frac{{t}^{2}}{2}\right),$$

is commonly used in practice because the choice of kernel influences the asymptotics of the estimation less significantly than the bandwidth does (Horová et al. 2012).

Definition 2.1

Suppose that there are \(n\) observed values (or returns) denoted by the vector \(X=({X}_{1},\dots ,{X}_{n})\). The kernel estimator (Rosenblatt-Parzen) \(\widehat{f}\) at point \(x\in R\) is defined as:

$$\widehat{f}\left(x,h\right)=\frac{1}{n}\sum_{i=1}^{n}{K}_{h}\left(x-{X}_{i}\right),$$

where \({K}_{h}\left(t\right)=\frac{1}{h}K\left(\frac{t}{h}\right), h>0.\) The positive number \(h\) is a smoothing parameter called the bandwidth of the kernel function.
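Definition 2.1 can be illustrated with a few lines of code. The following is a minimal sketch, assuming the standard Gaussian kernel; the function names, the return values, and the bandwidth are made up for illustration and are not taken from the study.

```python
import math

def gaussian_kernel(t):
    """Standard Gaussian kernel K(t)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def kde(x, sample, h):
    """Rosenblatt-Parzen estimate (1/n) * sum_i K_h(x - X_i), with K_h(t) = K(t/h) / h."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)

# Illustrative daily returns (hypothetical numbers, not data from the study).
returns = [-0.021, -0.004, 0.001, 0.003, 0.007, 0.012, -0.009, 0.005]
density_at_zero = kde(0.0, returns, h=0.01)
```

A smaller `h` makes the estimate follow individual observations more closely, while a larger `h` gives a smoother curve.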

Joint distribution: copula method

The copula method (Nelsen 1999, Cherubini et al. 2004) provides a useful tool for describing the dependence between variables. Two families of copula functions are often considered: elliptical and Archimedean copulas. Unlike the nonparametric kernel function, the copula method is parametric and contributes to the joint distribution of the portfolio from its multiple marginal distributions (Bouyé et al. 2000; Cambanis et al. 1981; Cherubini and Luciano 2001).

Definition 2.2

An m-dimensional copula is a distribution function on \({[0,1]}^{m}\) with standard uniform marginal distributions.

$$C\left({\varvec{u}}\right)=C\left({u}_{1},{u}_{2},\dots ,{u}_{m}\right),$$

(2.3)

where \(C\) is called a copula function.

The copula function \(C\) is a mapping of the form \(C:{[0,1]}^{m}\to [0,1].\) There are two major types of elliptical copula families: Gaussian and Student’s t copulas. Both are associated with a class of elliptical distributions.

The multivariate dispersion copula

The m-dimensional normal or Gaussian copula is derived from the m-dimensional Gaussian distribution. A draw from the Gaussian copula is generated from a set of correlated normally distributed variates \({v}_{1},{v}_{2},\dots ,{v}_{m}\), obtained using Cholesky’s decomposition, which are then transformed to uniform variables \({u}_{1}=\Phi \left({v}_{1}\right), {u}_{2}=\Phi \left({v}_{2}\right),\dots ,{u}_{m}=\Phi \left({v}_{m}\right)\), where \(\Phi\) is the cumulative standard normal distribution function; therefore, the vector \(({u}_{1},{u}_{2},\dots ,{u}_{m})\) is a draw from the Gaussian copula.
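The sampling recipe just described can be sketched as follows. This is an illustration only, written with the Python standard library; the correlation matrix, the sample size, and the function names are assumptions chosen for the example.

```python
import math
import random
from statistics import NormalDist

def cholesky(R):
    """Lower-triangular L with L L^T = R for a symmetric positive definite R."""
    m = len(R)
    L = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(R[i][i] - s)
            else:
                L[i][j] = (R[i][j] - s) / L[j][j]
    return L

def sample_gaussian_copula(R, n, seed=0):
    """Draw n vectors (u_1, ..., u_m) from the Gaussian copula with correlation R."""
    rng = random.Random(seed)
    L = cholesky(R)
    m = len(R)
    phi = NormalDist().cdf
    draws = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(m)]       # independent normals
        v = [sum(L[i][k] * z[k] for k in range(i + 1))    # correlated normals
             for i in range(m)]
        draws.append([phi(vi) for vi in v])               # uniforms u_j = Phi(v_j)
    return draws

R = [[1.0, 0.6], [0.6, 1.0]]   # illustrative correlation matrix (assumption)
u = sample_gaussian_copula(R, n=5000)
```

Each coordinate of a draw is uniform on the unit interval, while the dependence between coordinates is inherited from `R`.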

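The marginal distribution of each variable is standard normal, and the joint normal distribution can be defined as

$$C\left({u}_{1},{u}_{2},\dots ,{u}_{m};R\right)={\Phi }_{m}\left({\Phi }^{-1}\left({u}_{1}\right),{\Phi }^{-1}\left({u}_{2}\right),\dots ,{\Phi }^{-1}\left({u}_{m}\right);R\right),$$

(2.4)

where \(R\) is the m-dimensional covariance matrix, and \({\Phi }_{m}\) is the cumulative multivariate normal distribution function in dimension \(m\).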

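For the multivariate Gaussian copula (MGC), let \(R\) be a symmetric, positive definite matrix with \(\mathrm{diag}\left(R\right)={(1,1,\dots ,1)}^{T},\) and the corresponding density function of (2.4) is,

$$\frac{1}{{\left(2\pi \right)}^{m/2}{\left|R\right|}^{1/2}}\mathrm{exp}\left(-\frac{1}{2}{x}^{T}{R}^{-1}x\right)=c\left(\Phi \left({x}_{1}\right),\dots ,\Phi \left({x}_{m}\right)\right)\prod_{j=1}^{m}\frac{1}{\sqrt{2\pi }}\mathrm{exp}\left(-\frac{1}{2}{x}_{j}^{2}\right),$$

where \(R\) is the covariance matrix of vector \(X,\) and \(\left|R\right|\) is the determinant of \(R.\) Let \({u}_{j}=\Phi \left({x}_{j}\right);\) therefore, \({x}_{j}={\Phi }^{-1}\left({u}_{j}\right).\) This copula density function can be rewritten as given below:

$$c\left({u}_{1},{u}_{2},\dots ,{u}_{m}\right)=\frac{1}{{\left|R\right|}^{1/2}}\mathrm{exp}\left(-\frac{1}{2}{\varsigma }^{T}\left({R}^{-1}-I\right)\varsigma \right),$$

where \(\varsigma ={({\Phi }^{-1}\left({u}_{1}\right),\dots ,{\Phi }^{-1}\left({u}_{m}\right))}^{T}\) and \(I\) is the identity matrix.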

Let \({\varvec{\mu}}={({\mu }_{1},{\mu }_{2},\dots ,{\mu }_{m})}^{T}\) be a position parameter, \({\varvec{\upsigma}}={({\upsigma }_{1},{\upsigma }_{2},\dots ,{\upsigma }_{m})}^{T}\) be a dispersion parameter, and \(R\) be a correlation matrix. The multivariate dispersion copula (MDC) density is as given below:

$$f\left(x;{\varvec{\mu}},{\varvec{\upsigma}},R\right)=\frac{1}{{\left|R\right|}^{1/2}}\mathrm{exp}\left(-\frac{1}{2}{\varsigma }^{T}\left({R}^{-1}-I\right)\varsigma \right)\prod_{j=1}^{m}{f}_{j}\left({x}_{j};{\mu }_{j},{\upsigma }_{j}\right),$$

where \({\varsigma }_{j}={\Phi }^{-1}\left({F}_{j}\left({x}_{j};{\mu }_{j},{\upsigma }_{j}\right)\right)\), and \({f}_{j}\left({x}_{j};{\mu }_{j},{\upsigma }_{j}\right)=\frac{\partial {F}_{j}\left({x}_{j};{\mu }_{j},{\upsigma }_{j}\right)}{\partial {x}_{j}}\) for every set of c.d.f. \({F}_{j}\left({x}_{j};{\mu }_{j},{\upsigma }_{j}\right)\).

The multivariate student’s t copula

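Similarly, the m-dimensional Student’s t-copula is derived from the m-dimensional Student’s t-distribution. Student’s t copulas are models with a heavier tail than Gaussian copulas. Let \({\mathbf{T}}_{m}\)(\({\epsilon }_{1},\dots ,{\epsilon }_{m};{\varvec{R}},v\)) denote the joint Student’s t distribution and \(\mathbf{T}(x)\) the univariate Student’s t distribution. The Student’s t copula is defined as,

$$C\left({u}_{1},\dots ,{u}_{m};{\varvec{R}},v\right)={\mathbf{T}}_{m}\left({t}_{v}^{-1}\left({u}_{1}\right),\dots ,{t}_{v}^{-1}\left({u}_{m}\right);{\varvec{R}},v\right),$$

where \({t}_{v}^{-1}\) is the inverse of the univariate cumulative distribution function of Student’s t with \(v\) degrees of freedom. Using the standard representation, the copula density for the multivariate Student’s t copula (Cherubini et al. 2004) is:

$$c\left({u}_{1},\dots ,{u}_{m};{\varvec{R}},v\right)={\left|{\varvec{R}}\right|}^{-\frac{1}{2}}\frac{\Gamma \left(\frac{v+m}{2}\right)}{\Gamma \left(\frac{v}{2}\right)}{\left[\frac{\Gamma \left(\frac{v}{2}\right)}{\Gamma \left(\frac{v+1}{2}\right)}\right]}^{m}\frac{{\left(1+\frac{1}{v}{\varsigma }^{T}{{\varvec{R}}}^{-1}\varsigma \right)}^{-\frac{v+m}{2}}}{\prod_{j=1}^{m}{\left(1+\frac{{\varsigma }_{j}^{2}}{v}\right)}^{-\frac{v+1}{2}}},$$

where \({\varsigma }_{j}={t}_{v}^{-1}\left({u}_{j}\right).\)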

The Archimedean copula

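In contrast to elliptical copulas, Archimedean copulas make it easy to deduce parameterized multivariate distributions from the same class of marginal distributions. Given a function \(\phi (x)\) as the generator of the Archimedean copula function, the formula of Archimedean copulas induces a copula by

$$C\left({u}_{1},{u}_{2},\dots ,{u}_{m}\right)={\phi }^{-1}\left(\phi \left({u}_{1}\right)+\phi \left({u}_{2}\right)+\dots +\phi \left({u}_{m}\right)\right).$$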
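As an illustration of how a generator induces a copula, the sketch below uses the Clayton generator \(\phi (t)=({t}^{-\theta }-1)/\theta\). Choosing Clayton, and the parameter value, are assumptions made for the example; the function names are hypothetical.

```python
def clayton_generator(t, theta):
    """Clayton generator phi(t) = (t**(-theta) - 1) / theta, for theta > 0."""
    return (t ** (-theta) - 1.0) / theta

def clayton_generator_inverse(s, theta):
    """Inverse generator phi^{-1}(s) = (1 + theta * s)**(-1 / theta)."""
    return (1.0 + theta * s) ** (-1.0 / theta)

def archimedean_copula(u, theta):
    """C(u_1, ..., u_m) = phi^{-1}(phi(u_1) + ... + phi(u_m)) with the Clayton generator."""
    if any(ui <= 0.0 or ui > 1.0 for ui in u):
        raise ValueError("each u_i must lie in (0, 1]")
    total = sum(clayton_generator(ui, theta) for ui in u)
    return clayton_generator_inverse(total, theta)

value = archimedean_copula([0.3, 0.7], theta=2.0)
```

Because \(\phi (1)=0\), setting every argument but one to 1 returns the remaining argument, which is the uniform-margin property of a copula.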

Three well-known Archimedean copulas are illustrated below with the following density functions (Table 1).

Although the Archimedean copula requires only one parameter in the estimation, the partial distribution function is not easy to calculate in high dimensions for the joint density function. Thus, we choose the MGC to build up the joint distribution in “Stressed portfolio optimization and its scaling effect” section for ease of computation.

The semiparametric method

The semiparametric method combines the nonparametric kernel and the parametric copula methods to describe the marginal distribution of each underlying asset and the joint distribution of the portfolio, respectively. Details about the formulation of each nonparametric and parametric method are discussed in the previous section. We focus on the estimation procedures described below, including a bias estimation for the optimal bandwidth.

Optimal bandwidth choice

As mentioned in “Kernel function” section, the choice of bandwidth is not only pivotal as it determines the smoothness of the estimation but also plays a significant role in the weight function on a kernel. In addition, bandwidth choice is a crucial problem in kernel smoothing because no universally accepted approach exists to this problem yet.

One approach of cross-validation theory aims to minimize the mean square error (MSE) between the estimated and true densities. Thus, an appropriate \(h\) should determine the degree of smoothness and influence on the MSE between the kernel estimated density \({f}_{\widehat{p}}\left(x\right)\) and its true density \({f}_{p}\left(x\right)\).
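A data-driven bandwidth choice can be sketched as follows. The example uses least-squares cross-validation for a Gaussian kernel, which targets the integrated squared error; this is a swapped-in, commonly used criterion and is only assumed to be in the spirit of the cross-validation approach cited above. The synthetic sample and the candidate grid are illustrative.

```python
import math
import random

SQRT_2PI = math.sqrt(2.0 * math.pi)

def lscv_score(sample, h):
    """Least-squares cross-validation criterion for a Gaussian-kernel estimate.

    Score = integral of the squared estimate - (2/n) * sum_i f_{-i}(X_i),
    where f_{-i} is the estimate built without observation i.
    """
    n = len(sample)
    integral_sq = 0.0
    leave_one_out = 0.0
    for i, xi in enumerate(sample):
        for j, xj in enumerate(sample):
            d = (xi - xj) / h
            # Convolution of two Gaussian kernels is a N(0, 2) density.
            integral_sq += math.exp(-0.25 * d * d) / (2.0 * math.sqrt(math.pi))
            if i != j:
                leave_one_out += math.exp(-0.5 * d * d) / SQRT_2PI
    integral_sq /= n * n * h
    leave_one_out /= (n - 1) * h
    return integral_sq - 2.0 * leave_one_out / n

def select_bandwidth(sample, candidates):
    """Return the candidate bandwidth with the smallest cross-validation score."""
    return min(candidates, key=lambda h: lscv_score(sample, h))

# Synthetic returns and a candidate grid (assumptions for illustration).
rng = random.Random(7)
sample = [rng.gauss(0.0, 0.01) for _ in range(150)]
candidates = [0.001 * k for k in range(1, 16)]
h_cv = select_bandwidth(sample, candidates)
```

A grid search keeps the sketch simple; a continuous optimizer could be used instead when a finer bandwidth is needed.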

Definition 3.1

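The variance, bias, and MSE of the estimator are defined as

$${\mathrm{Var}}_{p}\left(x\right)={E}_{p}\left[{\left({f}_{\widehat{p}}\left(x\right)-{E}_{p}\left[{f}_{\widehat{p}}\left(x\right)\right]\right)}^{2}\right],\quad {\mathrm{Bias}}_{p}\left(x\right)={E}_{p}\left[{f}_{\widehat{p}}\left(x\right)\right]-{f}_{p}\left(x\right),$$

$${\mathrm{MSE}}_{p}\left(x\right)={E}_{p}\left[{\left({f}_{\widehat{p}}\left(x\right)-{f}_{p}\left(x\right)\right)}^{2}\right]={\mathrm{Var}}_{p}\left(x\right)+{\mathrm{Bias}}_{p}^{2}\left(x\right).$$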

Let the density function \({f}_{\widehat{p}}(X)\) have a bounded second derivative \({f}_{\widehat{p}}^{\prime\prime}\left(X\right)\), which leads to the Taylor expansion,

where \(\mathrm{T}\left({f}_{\widehat{p}}(x)\right)=\int \frac{{{f}_{\widehat{p}}^{\prime\prime}}^{2}\left(x\right)}{{f}_{\widehat{p}}\left(x\right)}dx,\)\({\mathrm{s}}_{p}\left({f}_{\widehat{p}}(x)\right)={E}_{p}\left[{f}_{\widehat{p}}^{2}\left(x\right)\right].\)

The optimal bandwidth is defined from the truncated \({\mathrm{MSE}}_{p}\left(x\right)\) taking only the first leading order term as,

Bias estimation for the perturbed optimal bandwidth

Here, we provide a perturbation analysis and show that the error of the Gaussian kernel function deviating from the optimal bandwidth is uniformly bounded.

Lemma 3.2

Given the Gaussian kernel function \({K}_{{h}_{opt}}\left(t\right)=\frac{1}{{h}_{opt}}K\left(\frac{t}{{h}_{opt}}\right)\) with the optimal bandwidth choice \({h}_{opt}>0,\) for any estimation error \(\upvarepsilon >0\), there exists a constant \(M,\) independent of \(t,\) such that \(\left|{K}_{{h}_{opt}}\left(t\right)- {K}_{{h}_{opt}+\upvarepsilon }\left(t\right)\right|<M\upvarepsilon\), for \(t\in R\).

This means that the bias between the optimal kernel and its perturbed density is uniformly bounded.

Proof

Use Taylor expansion and the uniformly bounded property for the normal density.

The first term on the right-hand side is bounded by \({M}_{1}\upvarepsilon\) regardless of the variable \(t\) for some constant \({M}_{1}\) independent of \(t\). Because the Gaussian kernel function is a normal density function, by the mean-value theorem, the second term on the right-hand side is bounded above by \({M}_{2}\upvarepsilon\) for some constant \({M}_{2}\) independent of \(t.\) Therefore, \(\left|{K}_{{h}_{opt}}\left(t\right)- {K}_{{h}_{opt}+\upvarepsilon }\left(t\right)\right|\le \left({M}_{1}+{M}_{2}\right)\upvarepsilon\) for an arbitrary \(t\) is obtained. □
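The two terms in the proof can be made explicit. The following is a sketch that assumes the usual add-and-subtract decomposition, writing \(h={h}_{opt}\) and \(K\) for the standard normal density; the explicit constants are illustrative and are not claimed to be those of the original derivation.

$${K}_{h}\left(t\right)-{K}_{h+\upvarepsilon }\left(t\right)=\left(\frac{1}{h}-\frac{1}{h+\upvarepsilon }\right)K\left(\frac{t}{h}\right)+\frac{1}{h+\upvarepsilon }\left[K\left(\frac{t}{h}\right)-K\left(\frac{t}{h+\upvarepsilon }\right)\right].$$

Because \(K\le 1/\sqrt{2\pi }\), the first term satisfies

$$\left(\frac{1}{h}-\frac{1}{h+\upvarepsilon }\right)K\left(\frac{t}{h}\right)=\frac{\upvarepsilon }{h\left(h+\upvarepsilon \right)}K\left(\frac{t}{h}\right)\le \frac{\upvarepsilon }{\sqrt{2\pi }{h}^{2}}={M}_{1}\upvarepsilon .$$

For the second term, the mean-value theorem in the bandwidth variable gives, for some \(s\in \left(h,h+\upvarepsilon \right)\),

$$\left|K\left(\frac{t}{h}\right)-K\left(\frac{t}{h+\upvarepsilon }\right)\right|=\upvarepsilon \frac{1}{s}{\left(\frac{t}{s}\right)}^{2}K\left(\frac{t}{s}\right)\le \frac{2}{e\sqrt{2\pi }}\frac{\upvarepsilon }{h},$$

since \(\underset{u}{\mathrm{sup}}\,{u}^{2}K\left(u\right)=2/\left(e\sqrt{2\pi }\right)\). Multiplying by \(1/\left(h+\upvarepsilon \right)\le 1/h\) bounds the second term by \({M}_{2}\upvarepsilon\) with \({M}_{2}=2/\left(e\sqrt{2\pi }{h}^{2}\right)\). Neither constant depends on \(t\).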

The joint distribution of portfolio

The semiparametric estimation has nonparametric and parametric components. The kernel method provides the marginal distribution of each asset by nonparametric estimation, and the copula method, a common parametric technique, builds up the joint distribution from the marginal distributions. After combining these two components, the joint distribution of the portfolio is obtained.

Definition 3.3

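The joint distribution of assets in our portfolio is as given below:

$$f\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right)=c\left({F}_{1}\left({x}_{1}\right),{F}_{2}\left({x}_{2}\right),\dots ,{F}_{n}\left({x}_{n}\right)\right)\prod_{i=1}^{n}{f}_{i}\left({x}_{i}\right),$$

where \(c\) is the copula density, estimated using parametric methods and evaluated at the marginal distribution functions \({F}_{i}\left({x}_{i}\right)\), and \({f}_{i}\left({x}_{i}\right)\) is the marginal density of the \(i\)th asset, estimated using nonparametric methods.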

Once the joint distribution for the multivariate \(\left({X}_{1},\dots ,{X}_{n}\right)\) is estimated, its portfolio \(P\) with different weights \(({w}_{1},\dots ,{w}_{n})\) is defined by,

$$P=\sum_{i=1}^{n}{w}_{i}{X}_{i},$$

(3.7)

where \({w}_{i}\) and \({X}_{i}\) are the weight and value of the \(i\)th asset, respectively. The weights sum to one, \(\sum_{i=1}^{n}{w}_{i}=1\). Constraining a weight \({w}_{i}\) to be nonnegative means that short selling of the corresponding asset is not allowed.

Parameter estimation

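Maximum likelihood estimation (MLE) is employed to estimate the model parameters. Based on the joint density function,

$$f\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right)=c\left({F}_{1}\left({x}_{1}\right),{F}_{2}\left({x}_{2}\right),\dots ,{F}_{n}\left({x}_{n}\right);\theta \right)\prod_{i=1}^{n}{f}_{i}\left({x}_{i}\right),$$

where \(c\left({u}_{1},{u}_{2},\dots ,{u}_{n};\theta \right)=\frac{{\partial }^{n}C({u}_{1},{u}_{2},\dots ,{u}_{n};\theta )}{\partial {u}_{1}\partial {u}_{2}\dots \partial {u}_{n}}\) is the density of the \(n\)-dimensional copula \(C({u}_{1},{u}_{2},\dots ,{u}_{n};\theta )\), evaluated at \({u}_{i}={F}_{i}\left({x}_{i}\right)\). The log-likelihood function for \(N\) observations is defined as follows:

$$L={L}_{C}+\sum_{i=1}^{n}{L}_{i},$$

where \({L}_{C}=\sum_{j=1}^{N}\mathrm{log}\,c({F}_{1}^{\left(j\right)},{F}_{2}^{\left(j\right)},\dots ,{F}_{n}^{\left(j\right)})\) is the log-likelihood contribution of the copula \(C\), which carries the dependence between assets, and the remaining terms \({L}_{i}=\sum_{j=1}^{N}\mathrm{log}\,{f}_{i}({x}_{i}^{(j)}), i=1,2,\dots ,n,\) are the contributions of the marginal densities, which involve no parameters to estimate because the marginals come from the nonparametric kernel method. Here \(\mathrm{log}\) denotes the natural logarithm. Thus, only the parameters in \({L}_{C}\) need to be estimated. Let \(\theta\) denote the parameter set of copula \(C\). It can be estimated by the following full MLE:

$$\widehat{\theta }=\underset{\theta }{\mathrm{arg\,max}}\;\sum_{j=1}^{N}\mathrm{log}\,c\left({F}_{1}^{\left(j\right)},{F}_{2}^{\left(j\right)},\dots ,{F}_{n}^{\left(j\right)};\theta \right).$$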
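The estimation of the copula parameter from the copula part of the log-likelihood can be sketched for two assets. The example below assumes a bivariate Gaussian copula with a single correlation parameter, kernel-smoothed marginal distribution functions with a rule-of-thumb bandwidth, and a grid search in place of a numerical optimizer; all of these choices, the function names, and the synthetic data are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist, stdev

STD_NORMAL = NormalDist()

def kernel_cdf(x, sample, h):
    """Kernel-smoothed c.d.f.: (1/n) * sum_i Phi((x - X_i) / h)."""
    return sum(STD_NORMAL.cdf((x - xi) / h) for xi in sample) / len(sample)

def normal_scores(sample):
    """Map one asset's observations to Phi^{-1}(F(x)) using its kernel c.d.f."""
    h = 1.06 * stdev(sample) * len(sample) ** (-0.2)   # rule-of-thumb bandwidth (assumption)
    return [STD_NORMAL.inv_cdf(kernel_cdf(x, sample, h)) for x in sample]

def copula_log_likelihood(rho, z1, z2):
    """Copula log-likelihood of the bivariate Gaussian copula with correlation rho."""
    one_minus = 1.0 - rho * rho
    total = 0.0
    for a, b in zip(z1, z2):
        quad = (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / one_minus
        total += -0.5 * math.log(one_minus) - 0.5 * quad
    return total

def fit_gaussian_copula(x1, x2):
    """Grid-search maximizer of the copula log-likelihood over rho in (-0.99, 0.99)."""
    z1, z2 = normal_scores(x1), normal_scores(x2)
    grid = [k / 100.0 for k in range(-99, 100)]
    return max(grid, key=lambda rho: copula_log_likelihood(rho, z1, z2))

# Synthetic two-asset returns with true correlation 0.6 (illustration only).
rng = random.Random(42)
x1, x2 = [], []
for _ in range(400):
    a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x1.append(0.01 * a)
    x2.append(0.01 * (0.6 * a + 0.8 * b))
rho_hat = fit_gaussian_copula(x1, x2)
```

The marginal terms never enter the search, which mirrors the observation that only the copula part of the log-likelihood carries parameters.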

Stressed portfolio optimization and its scaling effect

This section introduces the methodology for stressed portfolio optimization, which includes specific procedures for constructing an optimal portfolio under tail risk and its scaling effect. We extend the use of risk measures (Artzner et al. 1999) for portfolio optimization using the previously mentioned semiparametric method. The optimal scales of such stressed portfolios are studied by maximizing risk-sensitive value measures (Miyahara 2010).

Risk measure minimization for stressed portfolio

As a regulatory standard or internal control for financial institutions, risk measures provide extreme information about potential value losses. Owing to its simplicity and clarity in risk management, VaR is the most conventional measure to estimate the loss of asset value, given a certain confidence level; therefore, an adequate capital amount is gauged to prevent negative impacts.

Definition 4.1

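\({VaR}_{\alpha }\) is defined as a quantile in statistics:

$${VaR}_{\alpha }\left(X\right)=\mathrm{inf}\left\{x\in R:P\left(X\le x\right)\ge \alpha \right\},$$

(4.1)

and \({CVaR}_{\alpha }\) is the expected value of \(X\) in the tail beyond this quantile:

$${CVaR}_{\alpha }\left(X\right)=E\left[X\mid X\ge {VaR}_{\alpha }\left(X\right)\right],$$

where \(\alpha\) is the confidence level, the variable \(X\) represents the loss value or its return, and \({VaR}_{\alpha }(X)\) is defined above.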

Note that the values of both \({VaR}_{\mathrm{\alpha }}\) and \({CVaR}_{\mathrm{\alpha }}\) depend on the variable \(X\). This means that they are not constant, even though the value of \(\alpha\) is given. When the variable \(X\) is a portfolio, such as \(P\) defined in Eq. (3.7), the minimization of nonlinear risk measures such as \({VaR}_{\mathrm{\alpha }}\) and \({CVaR}_{\mathrm{\alpha }}\) over the feasible set of portfolio weights, possibly in high dimensions, must be solved numerically. Discussions on data analysis and computational schemes are presented in “Statistical estimation for semiparametric method” section.
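To see how both measures depend on the portfolio, they can be evaluated empirically from a sample of losses. The sketch below assumes the standard definitions, with VaR as the empirical \(\alpha\)-quantile of the loss and CVaR as the average loss at or beyond that quantile; the function names are hypothetical.

```python
import math

def value_at_risk(losses, alpha):
    """Empirical alpha-quantile of the losses: smallest x with P(X <= x) >= alpha."""
    ordered = sorted(losses)
    k = max(math.ceil(alpha * len(ordered)) - 1, 0)
    return ordered[k]

def conditional_value_at_risk(losses, alpha):
    """Average of the losses at or beyond the empirical VaR."""
    var = value_at_risk(losses, alpha)
    tail = [x for x in losses if x >= var]
    return sum(tail) / len(tail)

def portfolio_losses(weights, asset_returns):
    """Loss of the weighted portfolio in each period: -(w_1 X_1 + ... + w_n X_n)."""
    return [-sum(w * r for w, r in zip(weights, row)) for row in asset_returns]
```

Changing `weights` changes the loss sample, and with it both risk measures, which is why the minimization over weights has to be carried out numerically.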

Value measure maximization for the scaling effect

The evaluation of a risk-sensitive portfolio is essential for finance. This section aims to revisit the optimal scale using the risk-sensitive value measures proposed by Miyahara (2010) and discuss some computational issues given stressed portfolios.

Definition 4.3

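Let \(\mathcal{X}\) be a linear space of portfolio returns; the risk-sensitive value measure on \(\mathcal{X}\) is then the following functional defined for \(X\in \mathcal{X}\):

$${U}^{(\alpha )}\left(X\right)=-\frac{1}{\alpha }\mathrm{log}\,E\left[{e}^{-\alpha X}\right],\quad \alpha >0.$$

(4.2)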

When the distribution of \(X\) is non-Gaussian, the mean–variance model corresponds to the first two leading terms of the risk-sensitive value measure. This can be easily deduced by substituting the Taylor expansion
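The expansion can be written out explicitly. The sketch below assumes the exponential-utility form \({U}^{(\alpha )}\left(X\right)=-\frac{1}{\alpha }\mathrm{log}\,E\left[{e}^{-\alpha X}\right]\) of the risk-sensitive value measure (Miyahara 2010) and that the moment generating function of \(X\) is finite near the origin.

$${U}^{(\alpha )}\left(X\right)=-\frac{1}{\alpha }\sum_{k\ge 1}{\kappa }_{k}\frac{{\left(-\alpha \right)}^{k}}{k!}=E\left[X\right]-\frac{\alpha }{2}\mathrm{Var}\left[X\right]+\frac{{\alpha }^{2}}{6}E\left[{\left(X-E\left[X\right]\right)}^{3}\right]-\dots ,$$

where \({\kappa }_{k}\) is the \(k\)th cumulant of \(X\). For Gaussian \(X\), all cumulants beyond the second vanish, so the measure reduces exactly to the mean–variance form \(E\left[X\right]-\frac{\alpha }{2}\mathrm{Var}\left[X\right]\). Applying the same two terms to \(\lambda X\) gives \(\lambda E\left[X\right]-\frac{\alpha }{2}{\lambda }^{2}\mathrm{Var}\left[X\right]\), which is maximized at \(\lambda =E\left[X\right]/\left(\alpha \mathrm{Var}\left[X\right]\right)\).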

As \({U}^{(\alpha )}\left(\lambda X\right)\) is a concave function of \(\lambda\) (Miyahara 2010), the optimal scale of the portfolio can be obtained by maximizing this scaled value measure:

$${\lambda }_{opt}=\underset{\lambda >0}{\mathrm{arg\,max}}\;{U}^{(\alpha )}\left(\lambda X\right),$$

such that \({\lambda }_{opt}=\frac{{C}_{X}}{\alpha }\), where \({C}_{X}\) is a solution of \(E\left(X{e}^{-{C}_{X}X}\right)=0.\)

Because our portfolio variable \(X\) has a complex structure from the proposed semiparametric method, we adopt the following Monte Carlo estimator to solve the optimal scale as an approximation:

$${\widehat{U}}^{(\alpha )}\left(\lambda X\right)=-\frac{1}{\alpha }\mathrm{log}\left(\frac{1}{n}\sum_{i=1}^{n}{e}^{-\alpha \lambda {X}^{\left(i\right)}}\right),$$

(4.4)

where \(\lambda\) is the scale of the portfolio, \(\alpha\) is the risk aversion, \(n\) is the sample size, and the \({X}^{\left(i\right)}\) are random samples from historical simulations.

We comment on the strict concavity of the approximate estimator in (4.4). This can be inherently derived from the concavity of the utility function defined in Eq. (4.2) by taking the random variable \(X\) as discrete and uniformly distributed on the set of fixed outcomes \(\left\{{X}^{\left(1\right)}, {X}^{\left(2\right)}, \dots ,{X}^{\left(n\right)} \right\}.\) Since the graph of the risk-sensitive value measure over the scale is concave, the peak of this graph is identified as the optimal scale for its associated portfolio.
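The search for the peak of the concave curve can be sketched as follows. The example assumes the sample-average form of the estimator, \(-\frac{1}{\alpha }\mathrm{log}\left(\frac{1}{n}\sum_{i}{e}^{-\alpha \lambda {X}^{(i)}}\right)\), and uses a golden-section search on a bracket chosen for illustration; the sample values, the bracket, and the function names are assumptions.

```python
import math

def rsvm_estimate(scale, alpha, samples):
    """Monte Carlo value of the scaled portfolio: -(1/alpha) * log(mean(exp(-alpha*scale*X_i)))."""
    exponents = [-alpha * scale * x for x in samples]
    shift = max(exponents)                      # log-sum-exp shift for numerical stability
    mean_exp = sum(math.exp(e - shift) for e in exponents) / len(samples)
    return -(shift + math.log(mean_exp)) / alpha

def optimal_scale(alpha, samples, lo=0.0, hi=1000.0, tol=1e-6):
    """Golden-section search for the maximizer of the concave map scale -> rsvm_estimate."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if rsvm_estimate(c, alpha, samples) < rsvm_estimate(d, alpha, samples):
            a = c
        else:
            b = d
    return 0.5 * (a + b)

# Hypothetical portfolio returns (not data from the study).
samples = [0.03, -0.02, 0.01, 0.02, -0.01, 0.015]
scale_neutral = optimal_scale(0.5, samples)
```

At the returned scale, the sample version of the first-order condition \(E\left(X{e}^{-\alpha \lambda X}\right)=0\) holds approximately, and a larger risk aversion yields a smaller optimal scale.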

For investors with different levels of sensitivity to the same risk, we use different values of aversion to calculate the optimal scale. The risk-seeker, risk-neutral, and risk-averter investors correspond to aversion values in the ranges \(0<\alpha <0.5\), \(\alpha =0.5\), and \(0.5<\alpha <1\), respectively.

Empirical studies and data analysis

According to the framework depicted in the “Literature review” and “Stressed portfolio optimization and its scaling effect” sections, we designed the following experiments for stressed portfolio optimization using the semiparametric method. First, we build the marginal distribution for each constituent of the portfolio, given daily data from 2016 to 2020. We then describe the joint distribution of a portfolio with a Gaussian copula, which explains the dependence between these constituents. Second, we solve for the optimal weights from risk measure minimization using the genetic algorithm (GA) within MATLAB’s package. Finally, the optimal scale based on the stressed VaR portfolio is solved numerically using an approximated Monte Carlo estimator. Intensive and heavy computation, which includes modeling by semiparametric estimation and portfolio optimization under tail risk, is executed on a server cluster equipped with four Intel Xeon 5220R CPUs. Each CPU is 2.2 GHz with 24 cores.

Statistical estimation for semiparametric method

To implement our methodology on real data, we construct a diversified portfolio with five ETFs: Vanguard S&P 500 ETF (VOO), iShares 20+ Year Treasury Bond ETF (TLT), iShares iBoxx Investment Grade Corporate Bond ETF (LQD), iShares Gold Trust ETF (IAU), and Vanguard Real Estate Index Fund ETF Shares (VNQ). Daily price data spanning from 2016 to 2020 were retrieved from the Bloomberg database. Daily returns were calculated as the difference between two consecutive log prices.

Our implementation of the optimization models relies on a rolling-window approach. Specifically, at the beginning of each month, we use the return data of the previous three months to calculate the input parameters needed to determine the portfolio weights. Using these weights, we calculate portfolio returns over the next month. The following month, new portfolio weights are determined using updates of the parameter estimates.
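The rolling-window scheme can be sketched as follows: returns are log-price differences, each holding month is paired with the three preceding months, and the window moves forward one month at a time. The month-label representation and the function names are assumptions made for the illustration.

```python
import math

def log_returns(prices):
    """Daily log returns: difference between consecutive log prices."""
    return [math.log(p1) - math.log(p0) for p0, p1 in zip(prices, prices[1:])]

def rolling_windows(month_labels, estimation_months=3):
    """Yield (estimation_indices, holding_indices) pairs.

    month_labels[i] is the month (for example "2016-04") of observation i,
    in chronological order. Each holding month is paired with the preceding
    `estimation_months` months.
    """
    months = sorted(set(month_labels))
    for k in range(estimation_months, len(months)):
        est = set(months[k - estimation_months:k])
        estimation = [i for i, m in enumerate(month_labels) if m in est]
        holding = [i for i, m in enumerate(month_labels) if m == months[k]]
        yield estimation, holding
```

For each pair, the model is estimated on the estimation indices, the weights are fixed, and the portfolio return is recorded over the holding indices.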

The model parameters of the optimal bandwidth for the kernel function and the correlation matrix required in “Literature review” section for our portfolio are time-invariant in each estimate window (three months). The relevant parameters and estimation results are available upon request.

Optimal weights for risk measure: stressed portfolio optimization

Following the semiparametric model, applications for portfolio optimization under tail risk are presented. Tables 2 and 3 record the empirical results of in-sample fit for a quarterly time span (three months), which is useful for training models. Tables 4 and 5 record the empirical results of out-of-sample fit for a monthly time span, which is useful for testing models.

According to Eq. (4.1), portfolio VaR is a function of the weight vector \({\varvec{w}}\) defined by

$$g\left({\varvec{w}}\right)={VaR}_{\alpha }\left(\sum_{i=1}^{n}{w}_{i}{X}_{i}\right),$$

where \(g\) denotes the function of the weight vector \({\varvec{w}}\), and the optimal weight \(\widehat{{\varvec{w}}}\) attains the minimum value of \(g\left({\varvec{w}}\right).\) Table 2 records the in-sample fit for the optimal weight vector \(\widehat{{\varvec{w}}}\), the performance of each stressed portfolio, and its VaR value for five consecutive years from 2016 to 2020. These performance results, including volatility, return, Sharpe ratio, and VaR, are calculated quarterly.
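The weight search can be illustrated with a toy genetic algorithm. This is a sketch written with the Python standard library and is not the MATLAB package used in the study; the long-only constraint, the population size, the mutation scale, and the synthetic return scenarios are all assumptions for illustration.

```python
import math
import random

def empirical_var(weights, scenarios, alpha):
    """Empirical alpha-quantile of the portfolio loss -(w . x) over the return scenarios."""
    losses = sorted(-sum(w * x for w, x in zip(weights, row)) for row in scenarios)
    return losses[max(math.ceil(alpha * len(losses)) - 1, 0)]

def normalize(raw):
    """Map to long-only weights summing to one (clip negatives, then rescale)."""
    clipped = [max(v, 0.0) for v in raw]
    total = sum(clipped)
    n = len(raw)
    return [v / total for v in clipped] if total > 0 else [1.0 / n] * n

def minimize_var_ga(scenarios, alpha=0.95, pop_size=30, generations=40, seed=0):
    """Toy GA: tournament selection, blend crossover, Gaussian mutation, elitism."""
    rng = random.Random(seed)
    n = len(scenarios[0])

    def fitness(w):
        return empirical_var(w, scenarios, alpha)

    population = [[1.0 / n] * n]                 # start from equal weights
    population += [normalize([rng.random() for _ in range(n)])
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        children = [min(population, key=fitness)]            # elitism keeps the incumbent
        while len(children) < pop_size:
            p1 = min(rng.sample(population, 3), key=fitness)  # tournament of size 3
            p2 = min(rng.sample(population, 3), key=fitness)
            mix = rng.random()
            child = [mix * a + (1.0 - mix) * b for a, b in zip(p1, p2)]
            child = [c + rng.gauss(0.0, 0.05) for c in child]  # mutation
            children.append(normalize(child))
        population = children
    return min(population, key=fitness)

# Synthetic daily returns for three hypothetical assets (illustration only).
rng = random.Random(1)
scenarios = [[rng.gauss(0.0005, 0.01), rng.gauss(0.0002, 0.004), rng.gauss(0.0003, 0.007)]
             for _ in range(250)]
w_hat = minimize_var_ga(scenarios)
```

Because the incumbent is always carried into the next generation, the returned weights are never worse than the equal-weight starting point in terms of empirical VaR.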

According to Table 2, although Markowitz’s model and semiparametric method have different objective functions for weight estimation, the two methods have comparable results for the Sharpe ratio. The in-sample results show that the semiparametric method always has a lower VaR than Markowitz’s model.

Similarly, the portfolio CVaR is a function of the weight vector \({\varvec{w}}\) defined by the following equation:

$$k\left({\varvec{w}}\right)={CVaR}_{\alpha }\left(\sum_{i=1}^{n}{w}_{i}{X}_{i}\right),$$

where \(k\) is a function of the weight vector \({\varvec{w}}\), and the optimal weight \(\widehat{{\varvec{w}}}\) attains the minimum value of \(k\left({\varvec{w}}\right)\). The optimal weight, the performance of each stressed portfolio, and its CVaR value are listed in Table 3.

Tables 2 and 3 demonstrate the in-sample tests of the dataset and the performance measure of the optimal stressed portfolio on a long-term quarterly basis. According to Tables 2 and 3, although Markowitz’s model and semiparametric method have different objective functions for weight estimation, the two methods have comparable results in terms of the Sharpe ratio. The empirical results of the in-sample show that the semiparametric method always has lower VaR and CVaR than Markowitz’s model.

We conduct out-of-sample tests on a short-term monthly basis by using the same set of five ETFs (VOO-equity, TLT-government bond, LQD-corporate bond, IAU-gold, and VNQ-real estate) and compare the performance of portfolios generated from the semiparametric method and Markowitz method from 2016 to 2020, as demonstrated in Table 4.

The results of return, volatility, Sharpe ratio, and risk measures were calculated monthly. As can be seen from Fig. 1, compared to S&P 500, our semiparametric method provides better results in terms of portfolio returns during those five years.

Note that Markowitz’s mean–variance model is profit-oriented. It selects the portfolio with the highest Sharpe ratio from the efficient frontier of the five ETF assets. In contrast, the semiparametric method is risk-oriented: its objective is to minimize the VaR or CVaR function. Table 5 summarizes Table 4. Compared with Markowitz’s mean–variance method, our semiparametric method reduces the average volatility of the portfolio over those five years and also decreases the average return in the same period, but it increases the average Sharpe ratio of the portfolio. Our proposed method mitigates not only the overall risk but also the tail risk because it has a lower portfolio VaR in those five years.

Similarly, the coherent risk measure CVaR is used to compare the results of the semiparametric method and Markowitz’s method within the same test period from 2016 to 2020. Figure 2 depicts the portfolio value of the semiparametric model with CVaR and S&P 500.

As shown in Table 7, which summarizes Table 6, our semiparametric method reduces the average volatility of the portfolio over the five years, although it also decreases the average return in the same period. However, the semiparametric method increases the average Sharpe ratio of the portfolio. Our semiparametric method consistently offers better risk management than the Markowitz model in terms of both overall risk and tail risk because it has a lower portfolio CVaR.

In addition, we verify the robustness of the semiparametric method in several sensitivity checks. First, we extensively vary the dataset to examine whether our findings are robust with respect to the indices used to represent the asset classes. For example, we add other ETFs or use alternative indices to our portfolio. This procedure often leads to changes in sample size. However, we find that the variation in the dataset does not alter any of our conclusions. Second, we examine whether the performance of our method improves when shorter and longer time series of historical returns are used for parametrization, and we base the estimation method on a rolling-window approach with 2 months and 4 months of historical data available in estimation. We do not observe a consistent improvement in additional tests. Third, we repeat our analysis by utilizing other performance measures. Specifically, we employ the Sortino ratio, which does not change the qualitative nature of our results.

Optimal scale for value measure: scaling effect

As mentioned above, we can obtain a stressed portfolio using the semiparametric method with the optimal weights by minimizing the VaR of the portfolio. To further understand the scaling effect of the portfolio, we compare the mean–variance model and risk-sensitive value measure with different risk aversion, denoted by \(\alpha\) from zero to one. We assume that there are three types of investors: risk-averter (\(0.5<\alpha <1\)), risk-seeker (\(0<\alpha <0.5\)), and risk-neutral (\(\alpha =0.5\)). We discuss the optimal scale of the portfolio during the five years with three types of investors, and the results are shown in Figs. 3, 4, 5, 6 and 7.

Although the curves of the mean–variance (MV) model and the risk-sensitive value measure (RSVM) are both similar in shape to a downward-opening parabola, the MV curve has a particularly strong concavity. In theory, the MV is a special case of an RSVM. MV has a closed-form optimal portfolio scale shown in Eq. (4.3), while the optimal scale of the risk-sensitive value measure must be calculated by the Monte Carlo estimator. The numerical comparisons are listed in Tables 8 and 9.

The empirical results show a negative correlation between the degree of risk aversion and the optimal scale in the value measure. Risk-seeking investors correspond to larger scales, while risk-averters correspond to smaller scales. In addition, the mean–variance model and the risk-sensitive value measure coincide only for portfolios with a Gaussian distribution, but most portfolios are non-Gaussian in practice. If investors use a mean–variance model to determine the optimal scale, the result may not be the true optimal scale because the mean–variance model does not fit a non-Gaussian distribution. Thus, the risk-sensitive value measure is pivotal in the stressed portfolio optimization.

Conclusion

We propose an innovative semiparametric method for financial modeling and discuss its application to portfolio optimization under tail risk with the scaling effect. The method combines a nonparametric method, which estimates the marginal distributions, with a copula method, which estimates the dependence among the assets in a portfolio. Stressed portfolios are obtained by minimizing risk measures, and their optimal scales are obtained by maximizing risk-sensitive value measures. Through intensive empirical data analysis, we observe that the mean–variance Markowitz method may lead to biased selection, whereas the semiparametric method improves the efficiency of risk management with less risk exposure.

Availability of data and materials

The data used in this study are publicly available and were obtained from Bloomberg, a financial data provider.

Abbreviations

MV: Mean–variance

VaR: Value-at-risk

CVaR: Conditional value-at-risk

RSVM: Risk-sensitive value measure

References

Artzner P, Delbaen F, Eber J-M, Heath D (1999) Coherent measures of risk. Math Finance 9(3):203–228

Babazadeh H, Esfahanipour A (2019) A novel multi period mean-VaR portfolio optimization model considering practical constraints and transaction cost. J Comput Appl Math 361:313–342

Bouyé E, Durrleman V, Nikeghbali A, Riboulet G, Roncalli T (2000) Copulas for finance—a reading guide and some applications. Groupe de Recherche Opérationelle, Crédit Lyonnais, working paper

Cambanis S, Huang S, Simons G (1981) On the theory of elliptically contoured distributions. J Multivar Anal 11:368–385

Engle R (2002) Dynamic conditional correlation: a simple class of multivariate generalized autoregressive conditional heteroskedasticity models. J Bus Econ Stat 20(3):339–350

Esfahanipour A, Khodaee P (2021) A constrained portfolio selection model solved by particle swarm optimization under different risk measures. In: Mercangöz BA (ed) Applying particle swarm optimization: new solutions and cases for optimized portfolios. Springer, Cham, pp 133–153

Fu C-C, Han C-H, Wang K (2021) A novel semi-static method for the index tracking problem. In: Lee CF (ed) Handbook of investment analysis, portfolio management and financial derivatives (accepted)

Han C-H (2018) Systemic risk estimation under dynamic volatility matrix models. Adv Financ Plan Forecast 9:79–107

Authors’ contributions

All authors wrote, corrected and agreed to the published version of the manuscript. All authors read and approved the final manuscript.

Authors’ information

Chuan-Hsiang Han is the Professor of Quantitative Finance and Mathematics at National Tsing-Hua University. His fields of research are Applied Probability, Financial Mathematics, Monte Carlo methods, and Fintech. Han received a PhD in Mathematics from North Carolina State University. Before joining Tsing-Hua University he worked at University of Minnesota and Ford Motor Company. He is an editorial Member of Advances in Financial Planning and Forecasting and an associate editor of Journal of the Chinese Statistical Association.

Kun Wang is a PhD candidate of Quantitative Finance at National Tsing-Hua University. His fields of research are time series and volatility analysis. His recent work has focused on portfolio optimization.

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.