 Research
 Open access
 Published:
Robust estimation of timedependent precision matrix with application to the cryptocurrency market
Financial Innovation volume 8, Article number: 47 (2022)
Abstract
Most financial signals show time dependency that, combined with noisy and extreme events, poses serious problems in the parameter estimations of statistical models. Moreover, when addressing asset pricing, portfolio selection, and investment strategies, accurate estimates of the relationship among assets are as necessary as are delicate in a timedependent context. In this regard, fundamental tools that increasingly attract research interests are precision matrix and graphical models, which are able to obtain insights into the joint evolution of financial quantities. In this paper, we present a robust divergence estimator for a timevarying precision matrix that can manage both the extreme events and timedependency that affect financial time series. Furthermore, we provide an algorithm to handle parameter estimations that uses the “maximization–minimization” approach. We apply the methodology to synthetic data to test its performances. Then, we consider the cryptocurrency market as a real data application, given its remarkable suitability for the proposed method because of its volatile and unregulated nature.
Introduction
Most financial signals show time dependency, which, when combined with the presence of noisy and extreme events, can cause severe difficulties in the estimation of statistical models, particularly for the asset allocation problem in which relationships among financial quantities play a fundamental role. In this work, we model the joint dynamics of financial signals, handling both the timedependency of the time series and the presence of extreme events—two features characterizing most financial time series and, in particular, the cryptocurrency (CC) market. The first CC—Bitcoin—was developed in 2009. At the end of 2015, the number of available CCs was approximately five Hundred. In mid2021, more than six thousand CCs existed. These numbers, together with the growing literature regarding these new currencies, are witnesses to how the interest in CCs increased during recent years. Several statistical aspects related to CCs were studied, as in Bariviera et al. (2017), Cunha and Da Silva (2020), Cont (2001) and Hu et al. (2019), which analyzed stylized characteristics of the CC market. Alternatively, in Ciaian and Rajcaniova (2018) and Brandvold et al. (2015), the authors investigated the joint dynamics of CCs. Moreover, several authors investigated CCs’ volatility using GARCH models and their extensions, including Caporale and Zekokh (2019), Dyhrberg (2016), and Katsiampa (2017). Robustness issues regarding GARCH modeling of CCs’ volatility were addressed by Charles and Darné (2019). Beyond multivariate GARCH models, Bouri et al. (2017) used a dynamic conditional correlation model to describe the timevarying conditional correlation between Bitcoin and other financial assets, whereas Corbet et al. (2018) studied the timevarying relationships between some CCs and other financial assets using a pairwise directional measure of volatility spillovers. In BazánPalomino (2020, 2021), the authors investigated the correlations between Bitcoin and its forks, where the terms fork refers to a new version of the blockchain obtained by introducing changes in the rules of the original blockchain. The authors used several volatility models—both univariate and bivariate—and showed that Bitcoin and their forks are dynamically correlated. Chaim and Laurini (2019) modeled the returns and volatility dynamics of CCs using a multivariate stochastic volatility model with discontinuous jumps, whereas Fry (2018) developed bubble models for CCs to handle the presence of heavy tails. To conclude this overview of CCs’ modeling literature, it is worth citing Catania et al. (2019), who proposed several models to improve CCs’ forecasting performances.
Notwithstanding the large number of potential applications of CCs, such as near realtime micropayments that are necessary to the development of the Internet of Things, currently, they are mainly used as speculative investment instruments. Indeed, many online exchange websites offer the opportunity to sell and buy all available CCs and to create investment portfolios to manage the related financial risk. However, given their nature, these currencies are far from being similar to the traditional ones (Baek and Elbeck 2015; Huang et al. 2019)—an aspect that immediately emerges by inspecting the data. Several researchers analyzed the opportunity to invest in CCs: Briere et al. (2015) analyzed the returns of traditional asset portfolios in which bitcoin was inserted. Klein et al. (2018) investigated the performance of a portfolio containing both traditional assets and several CCs and tested the robustness of their results considering the CC index CRIX. Chuen et al. (2017) analyzed the investing opportunities of CCs by studying the top ten CCs together with the CRIX index and other indexes related to traditional assets. In all of these cases, of paramount importance is to estimate the degree of dependency between the various CCs.
In this work, we study the dependencies among many CCs in an attempt to provide an instrument to investigate the behavior of the entire cryptocurrency market. To this end, we study the precision matrix, namely, the inverse of the correlation matrix, whose elements—when dealing with Gaussian distributions—represent the dependence between two variables conditionally on the remaining ones (Lauritzen 1996); that is, if the element i, j of the precision matrix is zero, then the variables i and j of the multivariate distribution are conditionally independent. We do not study the correlation matrix because it may not provide meaningful information in multivariate settings on the dependence between the variables. Indeed, two variables might show significant correlation because of the influence of other variables. Models based on the precision matrix are called graphical models, which means that the precision matrix allows for a graph to be built whose nodes represent the variables—the considered financial time series—and the edges represent the conditional dependence; see Lauritzen (1996) for more details. In highdimensional settings, the opportunity to build a graph is worthwhile because it allows for interactions to be visualized, clusters to be created, and in general, the exploitation of the entire apparatus of graph theory, which can reveal the hidden topological properties of the financial network. These models are frequently used in the statistical and machine learning literature for certain reasons. However, despite their theoretical relevance, their practical applicability is limited by the restrictive assumptions of independent and identically distributed (i.i.d.) Gaussian observations that are often principally unrealistic because of the presence of a nonGaussian or a contaminated Gaussian process—or the timedependent nature of the process. The body of literature contains studies addressing the distortion arising from a Gaussian assumption in the case of heavy tails: Lafferty et al. (2012) proposed two nonparametric approaches, one based on a transformation of the data and the second based on a kernel density estimation; Finegold and Drton (2014) proposed in a Bayesian setting Dirichlet tdistributions to handle heavy tails and investigated the computational costs derived from the Gibbs sampler algorithm tailored for such distributions; Vogel and Tyler (2014) proposed graphical Mestimators tailored for elliptical graphical models; and Hirose et al. (2017) proposed a robust estimator that minimizes the \(\gamma\)divergence between the observed and theoretical distribution. Moving on the nonstationary issue, Zhou et al. (2010) investigated timedependent graphical models by using a temporal kernel to address a precision matrix whose structure evolves smoothly over time. To the best of our knowledge, no studies simultaneously handled extreme events and timedependency: our goal in this work is to fill this gap by developing an estimation method for multivariate time series that handle timedependent parameters and that is robust against the presence of extreme events. In particular, we propose a robustification based on \(\gamma\)divergence of the timevarying approach introduced in Zhou et al. (2010). Using a kernel estimator together with an Mtype estimator was introduced by Cai and OuldSaïd (2003), who proposed a robust version of a local linear regression smoother for stationary time series. They also extended the asymptotic theory developed by Fan and Jiang (2000) to the case of noni.i.d. random variables, in which the authors studied local robust techniques in the i.i.d. case. Our approach is similar to that proposed in Cai and OuldSaïd (2003) because it merges an Mtype estimator with a kernel estimator but differs in the model considered.
The remainder of this paper is organized as follows. “Model” section describes the statistical model considered in this work. “Estimation methodology” section addresses the methodological contribution of the paper and introduces details regarding timevarying and robust estimations through \(\gamma\)divergences that are the starting points of our estimator. “Test case” section addresses simulation experiments constructed to test the performance of the proposed methodology. “Cryptocurrency market application” section describes the real data CCs’ application, and “Discussion” and “Conclusions” sections provide discussions and the conclusions, respectively.
Model
The simplest model accounting for financial price dynamics is given by the geometric Brownian motion \(\mathrm {d}p_t=\mu p_t\mathrm {d}t+\sigma p_t\mathrm {d}W_t\), where \(\mu\) and \(\sigma\) are the percentages of the drift and volatility, respectively, and \(\mathrm {d}W_t\) is a simple Brownian motion with independent increments normally distributed. The logarithm of the price follows a simple Brownian motion. As is usually done, we work with the log return of the price, \(r_{t}=\text{ log }\frac{p_{t}}{p_{t1}}\).
We assume that return dynamics are described by a contaminated timedependent Gaussian process. Formally, we assume that the observed log returns \(\varvec{r}_{t}\) are distributed according to the following probability: distribution
where \(f\left( \varvec{r}_t, t\right) = \mathsf {N}_{d}\left( \varvec{0}, \varvec{\varTheta }^{1}_t\right)\) is a zero mean multivariate Gaussian distribution with a timedependent precision matrix \(\varvec{\varTheta }_t\) (that is, the inverse of the correlation matrix), \(s\left( \varvec{r}_t, t\right)\) is a contaminating probability distribution accounting for the shock components, and \(\epsilon\) is the ratio of contamination that will never be considered as infinitesimal—meaning that we are interested in the case of a significant contamination of extreme events. Worth noting is that Eq. (1) is just a mathematical formulation to describe a contaminated model in which, in addition to the selected statistical process, extreme events, outliers, or some other possible events are present that are not described by the model \(f(\cdot )\). Therefore, the contamination ratio \(\epsilon\) should not be considered a model parameter to be estimated, and the only parameters of interest are those of \(f\left( \varvec{r}_t, t\right) = \mathsf {N}_{d}\left( \varvec{0}, \varvec{\varTheta }^{1}_t\right)\).
Moreover, the distribution \(f\left( \varvec{r}_t, t\right)\), which is assumed to be Gaussian to keep the presentation of the method as clear as possible, can be replaced by a general conditional probability distribution with Gaussian noise, leaving the formalism unchanged.
The choice of this model is motivated by the interest in the conditional correlations among CCs, which are represented by the elements of the precision matrix when addressing Gaussian distributions.
Estimation methodology
This section describes the main contribution of this study, namely, the presentation of a local robust estimator of the precision matrix described in “Robust timevarying estimation” section. The proposed estimator is constructed using two estimators: the local estimator of the timevarying precision matrix introduced in Zhou et al. (2010) and briefly recalled in “Timevarying estimation” section, and the robust estimator based on \(\gamma\)divergence introduced in Fujisawa and Eguchi (2008) and briefly recalled in “Robust estimation” section.
Timevarying estimation
In Zhou et al. (2010), the authors are interested in the local estimation of the precision matrix of the model \(f\left( \varvec{r}, t\right) = \mathsf {N}_{d}\left( \varvec{0}, \varvec{\varTheta }^{1}_t\right)\). To this end, they proposed a kernel estimator defined as follows.
Let \(\left\{ \varvec{r}_{t}\right\} _{t=1}^{T}\) be a sample of T observations of \(\varvec{r}_{t}\), as previously defined. The timevarying precision matrix \(\varvec{\varTheta }_{t}\) in the noni.i.d. case is estimated using the modified Lassopenalized maximum likelihood estimator obtained by introducing a kernel over the time domain,
where \(\left( \widehat{\varvec{S}}\left( t\right) \right) _{i,j} = \sum _{u=1}^{T}w_{ut}\varvec{r}_{i,u}\varvec{r}_{j,u}^{\prime }\), is an empirical covariance matrix weighted with a symmetric timedependent nonnegative kernel \(w_{ut}=\frac{K\left( \vert ut\vert /h\right) }{\sum _{u=1}^{T}K\left( \vert ut\vert /h\right) }\) where \(K\!(\cdot )\) is a Gaussian kernel, h is the bandwidth, and \(\lambda \Vert \varvec{\varTheta }\Vert _{1}\) is the Lasso penalty added to local likelihood of inducing sparsity in the precision matrix. Noteworthy is that the bandwidth parameter h acts on the kernel by modifying its shape; that is, it is the standard deviation of the Gaussian kernel. In particular, low values of h provide high weights only to points close to the reference time t, whereas high values of h translate into assigning relevant weights to points far from the reference time t. In other words, the bandwidth parameter h determines the estimation window size.
The estimator defined in Eq. (2) has good asymptotic properties if the precision matrix evolves smoothly over time. At varying t, it provides a chain of graphs that unveils how the multivariate structure of the process evolves over time—exactly the objective of this study. Unfortunately, the maximumlikelihood approach suffers because of its nonrobust nature (Huber and Ronchetti 2011), the presence of contamination from extreme random events. This issue is addressed in the next subsection.
Robust estimation
Many robust approaches are developed in the statistical literature. Some are based on the modification of the minimization function to obtain a finite influence function [see Huber and Ronchetti (2011)]; others are based on modifying the Kullbach–Leibler divergence, whose minimization corresponds to the maximumlikelihood approach to obtain a divergence that provides less weight to extreme events or outliers; thus, it is robust against them [see Basu et al. (1998) and Fujisawa and Eguchi (2008)]. In this study, we followed the divergencebased approach; in particular, we consider the one introduced in Fujisawa and Eguchi (2008) based on \(\gamma\)divergence. To introduce this methodology, we use the model defined in Eq. (1): in the i.i.d case, namely,
Now, let \(\left\{ \varvec{r}_{t}\right\} _{t=1}^T\) be an i.i.d. sample drawn from g. In this framework, a nonrobust approach chooses the estimated parameters \(\hat{\varvec{\theta }}\) such that \(f_{\hat{\varvec{\theta }}}\) is close to g and, thus, is biased by outliers. Instead, a robust approach chooses the estimated parameters \(\hat{\varvec{\theta }}\) such that \(f_{\hat{\varvec{\theta }}}\) is close to f. The choice of the divergence depends on the specific problem. In this paper, we consider the \(\gamma\)divergence defined as
where \(\gamma >0\) is constant. Clearly, \(f\left( \varvec{r}\right) ^{\gamma }\) weights the observations—in particular, small values of \(\gamma\) emphasized extreme events, whereas large values of \(\gamma\) emphasized the underlying distribution. The second term in Eq. (3) is needed to have an unbiased estimator. The \(\gamma\)divergence is empirically estimated by
Thus, the robust estimator is obtained as solution to the minimization problem defined as
where we changed the notation from f to \(f_{\varvec{\theta }}\) to highlight the dependence over the set of parameters \(\varvec{\theta }\) and this estimator exhibited good asymptotic properties based on the theory of Mestimators (DasGupta 2008) and shows a small bias even in the case of heavy contamination assuming that \(f\left( \varvec{r}\right)\) is sufficiently small when \(\varvec{r}\) is an outlier; that is, the following condition must be respected
Usually, \(\gamma < 1\) is considered; for further details see Fujisawa and Eguchi (2008). However, this estimator does not handle the timevarying nature of the problem considered.
Robust timevarying estimation
In this section, we present the main methodological contribution of this paper. That is, we propose an estimator that can manage the timevarying nature of the problem and the presence of extreme events contaminating the underlying distribution. To this end, we modify the estimated \(\gamma\)divergence defined in Eq. (4) to assign greater importance to observations close to the period considered. We apply the same methodology as in Zhou et al. (2010) that introduces a kernel over the time domain, as detailed in “Timevarying estimation” section. Here, we consider the complete model defined in Eq. (1), which we report here for the reader’s convenience
Using the same notation introduced in the previous sections, we define the local estimated \(\gamma\)divergence as
This estimator considers both extreme events through the parameter \(\gamma\) and the time dependency through the kernel \(w_{ut}=\frac{K\left( \vert ut\vert /h\right) }{\sum _{u=1}^{T}K\left( \vert ut\vert /h\right) }\) with bandwidth h. Indeed, the parameter \(\gamma\) weighs observations according to the underlying distribution, as detailed in “Robust estimation” section, whereas the kernel \(w_{ut}\) provides significant weight only to observations at a temporal distance less than the bandwidth h from the reference time t.
Thus, the local robust estimator at time t for \(0\le t\le T\) is obtained as a solution to the following minimization problem:
The minimization problem defined in Eq. (7) can be solved through a maximization–minimization (MM) algorithm following the procedure detailed in Hirose et al. (2017). We customize our estimator in “Appendix 2”.
In highdimensional settings—when the number of features is higher than the number of observations—sparsity can be introduced into the precision matrix structure by adding a penalty to the local estimated \(\gamma\)divergence in Eq. (6).
Worth noting is that the use of a kernel estimator with an Mtype estimator was introduced by Cai and OuldSaïd (2003), and the authors proposed a robust version of a local linear regression smoother for stationary time series. They also extended the asymptotic theory developed by Fan and Jiang (2000) to the case of noni.i.d. random variables, for which the authors studied local robust techniques in the i.i.d. case. We are working on adapting the asymptotic theory developed in Cai and OuldSaïd (2003) to our estimator; however, the technicalities and mathematical proofs, which are beyond the scope of this work, will be published in a separate paper.
Test case
In this section, we investigate the performance of the proposed estimator using a synthetic stochastic model with known parameters. In particular, in a time interval \(t\in \left[ 0:T\right]\), we consider a distribution \(g\left( \varvec{r}, t\right)\) obtained as a mixture of an underlying distribution \(f\left( \varvec{r}, t\right)\) given by a 2dimensional zeromean timedependent Gaussian random process and a contaminating distribution, \(s\left( \varvec{r}\right)\), given by a simple twodimensional Gaussian process with a large mean mimicking extreme events, as follows:
The covariance matrices \(\varvec{\varSigma }_{t}\) and \(\varvec{D}\) are defined as
where \(\sigma _{1}^{2} =0.197\) and \(\sigma _{2}^{2} =0.363\) are taken as the sample variances of the Bitcoin and Ethereum time series, respectively, detailed in “Data” section. We consider various contamination rates \(\epsilon\), and the time dependency in the underlying process is the result of a correlation coefficient \(\rho _t\), which varies in \(\left( 1, 1\right)\) in the considered time interval \(\left[ 0:T\right]\), following the sigmoid function
where \(t_0=T/2\) and \(a=T/10\).
At fixed time \(t^*=50, 150, 250, \dots , 950\), following the algorithm of “Data” section, the values of the estimated correlation \(\hat{\rho }\left( t^*\right)\) associated with different values of the contamination rate \(\epsilon =0.1, 0.2, \text{ and }\ 0.4\) and obtained for several robustness parameter values \(\gamma =0, 0.01, 0.02, 0.05, 0.1, \text{ and }\ 0.2\) and several bandwidth values \(h=100, 200, \text{ and }\ 400\) are shown in Fig. 1. In each row of the figure, the bandwidth is fixed, whereas the contamination rate in each column is fixed, as labeled in the title of each panel. The results show that the performance of the estimator changes according to both the contamination rate and the \(\gamma\) parameter. Specifically, for low values of \(\gamma\) (\(\gamma =0, 0.01, 0.02\)), the estimate \(\hat{\rho }\left( t^{*}\right)\) is not able to catch the dynamic of the correlation coefficient, \(\rho _{t}\), of the underlying distribution, \(f\left( \varvec{r}, t\right)\), as expected. For values greater than \(\gamma =0.05\), the robust estimator privileged the underlying distribution, and the estimate \(\hat{\rho }\left( t^{*}\right)\) identifies the true correlation dynamics. Clearly, increasing the contamination rate requires an increase in the value of \(\gamma\) to obtain a good estimate of \(\rho _{t}\) . Worth noting is that the values of \(\gamma\) chosen in these simulations point out (see the first row of Fig. 1) that when the contamination is small, higher values of \(\gamma\) do not improve the estimate, whereas higher values of \(\gamma\) are needed to improve the estimate. We could have chosen more values of \(\gamma\); however, to understand the method and the readability of the related plots, we found that the proposed values were a good choice. Bandwidth h acts as a smoothing parameter and softens the variation in the true correlation dynamics. Truly remarkable is that a standard nonrobust estimator (the case of small \(\gamma\)) is absolutely unable to grasp the underlying process—even in this simple example—and is completely diverted from the contamination of extreme events.
Cryptocurrency market application
Recently, the precision matrix has attracted increasing interest in the financial context, given its various and useful applications, from portfolio optimization, in which the closedform solution to the global minimum variance portfolio involves the precision matrix (Torri et al. 2019), to systemic risk analysis, in which the partial correlations are more informative than independent correlations when analyzing the financial system as a whole. Many measures based on the precision matrix have been proposed to assess both systemic risk and portfolio strategy performances; see, for instance, Senneret et al. (2016) and Torri et al. (2019).
In this real data application, we show how the estimation of CCs’ precision matrix (together with other interesting financial quantities) changes according to the choice of the estimator, that is, when considering a robust approach against a nonrobust one or assuming a timevarying approach instead of a timeinvariant one. The existence and extent of these variations must necessarily be considered by those decision makers who rely on financial quantity estimates associated with different time horizons.
Data
For this study, we consider the daily close value, \(p_{t}\), of 12 CCs, namely, Bitcoin (BTC), Ethereum (ETH), XRP (XRP), Litecoin (LTC), Stellar (XLM), Monero (XMR), Dash (DASH), Ethereum Classic (ETC), NEM (XEM), Zcash (ZEC), Dogecoin (DOGE), and Waves (WAVES), from 01/02/2017 to 26/07/2021, downloaded from CoinMarketCap (https://coinmarketcap.com/it/). The dynamics of CCs’ log returns and histograms (Figs. 3, 4) make it evident that their distribution cannot be considered Gaussian because of the presence of extreme events. Such a feature is confirmed by the sample statistics in Table 1, in which we report robust statistics, namely median, interquartile range, measures of skewness, and excess kurtosis based on quantiles [see Kim and White (2004)], minimum, and maximum. We also performed the JarqueBera test for nonnormality but did not include the pvalues, all of which were zero. Some CCs, such as XRP, XEM, and DOGE, exhibit very high kurtosis, confirming the presence of extreme events.
The dynamics of \(\vert r_{t}\mu \vert\) are reported in Fig. 2, where \(\mu\) is the mean of \(r_{t}\), and show that the variances of the CCs’ log return series cannot be considered constant over time, whereas (investigated but not shown) the autocorrelations and crosscorrelations for different lags in the CCs’ log return series can be considered—except for lag 0—zero with good approximation.
We consider the model discussed in Eq. (1) to describe the evolution of the selected 12 CCs. First, we estimate the precision matrix for several values of the robustness parameter, namely, for \(\gamma =0,0.01, 0.02, 0.05 \ \text{ and }\ 0.1\), and several bandwidth values, parameter, namely for \(h=20, 50, 100 \ \text{ and }\ \infty\). We attempt different values of \(\gamma\), and choose those that provide significantly different estimates. Therefore, we do not consider values \(\gamma > 0.1\) because the estimates do not change significantly.
We stress that the estimation windows around the reference time are larger than the bandwidth; however, significant weights are given only to observations inside the bandwidth.
Results
Precision matrices are used to construct Gaussian graphical models, which are statistical models describing relationships among variables in the form of graphs. Therefore, the estimated precision matrices are presented in “Conditional correlation graphs” section in the form of a graph. In particular, we show the five local estimates obtained considering a neighbor of the following dates: June 30, 2017, January 16, 2018, September 8, 2019, May 15, 2020, and April 30, 2021. The first date corresponds to a period in which investors started to observe the CCs market with more interest, the second date is immediately after the huge increase in Bitcoin—that is when the interest toward CCs was very high, and the third date corresponds to a period in which the interest toward this market was still high but not characterized by dramatic events. The last two dates correspond to the outbreak of the pandemic and to the second increase in the value of Bitcoin in April 2021, when its value reached 53,000 dollars. Figures 5, 6, 7, 8 and 9 shows that, in general, the smaller the value of the bandwidth of the kernel, the higher the interconnection between cryptocurrencies. This phenomenon results from the significantly not constant dependencies among the cryptocurrencies that, not being regulated by any institution and having no fundamentals from which to extract a price (Baek and Elbeck 2015; Huang et al. 2019) have wild variations that depend only on the whims of the market. With a large bandwidth, the timevarying dependency is mediated, resulting in a smaller interconnection. This evidence is the first of the importance of using a timevarying approach. Moreover, especially in Figs. 6 and 9, it is possible to observe high interconnection between cryptocurrency when the \(\gamma\) parameter is high. These two figures are both associated with periods in which CCs are characterized by high growth and high volatility, and in the underlying process, CCs turn out to be conditionally dependent; however, the dependencies are hidden by extreme noisy events. Instead, Figs. 7 and 8 show how extreme events can distort the amplitude of the dependence between cryptocurrencies.
What we have just said can be further detailed by observing Fig. 10 in “Conditional correlations dynamics” section, where we reported some conditional correlation dynamics, that is, the evolution of some elements of the precision matrix. Indeed, it turns out that, for a small value of h (first column), conditional correlations show a significant differences at varying \(\gamma\), meaning that extreme events hide the underlying conditional correlation dynamics. Worth noting is that discrepancies among different estimates can be significant—more than one hundred percent. For high values of h (second column), the large bandwidth averaged the genuine timedependent dynamics of the process, similar to what was observed from the synthetic stochastic model in “Timevarying estimation” section; however, differences in the estimation of the conditional correlation at varying \(\gamma\) are still evident. Although the analysis is carried out on the 12 CCs, here we report only three examples of conditional correlations between CC pairs as an example of the type of analysis and the results that can be achieved using the proposed methodology.
In Fig. 11 in “Quantitative measures” section, we consider a few indicators to quantify the discrepancies among the estimated precision matrices. The first indicator is the normalized Frobenius norm of the differences between the estimated precision matrix obtained for \(\gamma =0.1\), for which we expect more accurate estimates and the one obtained for \(\gamma =0,0.01, 0.02, 0.05\), that is
The computation of \(\varDelta F\left( t,\gamma \right)\) is reported in the first row of Fig. 11, which shows that, even if this measure is an average quantity, the relative differences associated with the different values of \(\gamma\) are remarkable for both small and large bandwidth values, touching one hundred percent for the smallest bandwidth.
Of course, the discrepancies among different estimates affect the computation of financial quantities. To measure such an impact, we also consider the following generally used quantities [see Billio et al. (2012) for more details]:

the sum of the elements of the precision matrix, that is
$$\begin{aligned} s\left( t, \gamma \right) =\sum _{i,j}\left {\varvec{\hat{\varTheta }}\left( t, \gamma \right) _{i,j}}\right , \end{aligned}$$gives a measure of interconnectedness of the entire system;

let \(\lambda _{k}\left( t,\gamma \right)\) for \(k=1,\dots ,d\) be the eigenvalues of the correlation matrix \(\hat{\varvec{\varSigma }}\left( t,\gamma \right)\) and \(\nu \in \left\{ 1,\dots , d\right\}\), then
$$\begin{aligned} w\left( t, \gamma \right) =\frac{\sum _{k=1}^{\nu }{\lambda _{k}\left( t,\gamma \right) }}{\sum _{k=1}^{d}{\lambda _{k}\left( t,\gamma \right) }} \end{aligned}$$is a measure of interconnectedness. Indeed, the numerator is the risk associated with the first \(\nu\) principal components, whereas the denominator is the total risk of the system.
As previously done, we considered the relative differences of these measures using as reference the estimated precision matrix obtained for \(\gamma =0.1\), for which we expect more accurate estimates, that is \(\varDelta S\left( t,\gamma \right) = \frac{s\left( t, 0.1\right) s\left( t, \gamma \right) }{s\left( t, 0.1\right) }\) and \(\varDelta W\left( t, \gamma \right) = \frac{w\left( t, 0.1\right)  w\left( t,\gamma \right) }{w\left( t, 0.1\right) }\) for \(\gamma =0,0.01, 0.02, 0.05\), to better highlight the effect of different estimations on the financial quantities of interest. We report the results in the second and third rows of Fig. 11, which show that both measures present significant differences by varying parameter \(\gamma\). Such differences can also be observed when varying the bandwidth parameter, even if, as previously noted, at a small bandwidth (\(h=20\)), the differences are greater than those for the large bandwidth (\(h=100\)) because of the usual average effect that emerges in the last case.
Considering the three quantities reported in Fig. 11, the absolute value gradually decreases from \(\varDelta F\) (first row), \(\varDelta S\) (second row), and \(\varDelta W\) (third row). This result is expected. In fact, \(\varDelta F\), measuring the sum of the differences between the elements of the precision matrices, is an accurate measure of the discrepancies between the estimates obtained at varying \(\gamma\), whereas \(\varDelta S\), measuring the differences between the overall connections in the CC market (obtained at varying \(\gamma\)) can be considered a difference between composite variables that usually fluctuate less than the individual variables of which it is composed. Lastly, \(\varDelta W\), measuring the relative differences in the percentages of variance explained by the first \(\nu\) eigenvalues of the correlation matrix, is an extensive quantity least affected by different estimates obtained at varying \(\gamma\).
Discussion
The results shown in the previous section highlighted that time dependency is a relevant feature of the cryptocurrency market, that is, nonconstant conditional dependencies among different cryptocurrencies, and different levels of contamination given by extreme events over time. We found further confirmation of our findings in recently published studies. For instance, Nie (2020) analyzed the correlation dynamics using a dimensionality reduction method. His analysis shows that the correlation dynamics experience drastic changes in relation to periods characterized by large market fluctuations. This finding is consistent with our observations. Indeed, the dynamics of conditional correlations show drastic changes when considering a small bandwidth and, thus, when having a shortterm outlook. The changes in the dynamics of the conditional correlations decrease when a wider bandwidth is considered and, thus, when having a longterm outlook. Nie (2022) used a network method to identify critical events in the correlation dynamics of the CCs. He found that the network structure is easily broken near critical events, namely, periods characterized by large market fluctuations. After these events, the network structure returns to stability. We observed something similar; indeed, the graphs we obtained using conditional correlations change their structure in correspondence of large market fluctuations. These changes are reduced when considering a wider bandwidth, that is, from a longterm perspective. In Shams (2020), the authors investigated the structure of CC returns and found persistence among CCs’ sharing similar features. The author introduced a connectivity measure that captures strong exchangespecific commonalities in CCs’ investors’ demand, which spills over to other exchanges. This spillover could explain the drastic changes in the conditional correlations we observe when having a shortterm outlook. Indeed, the author suggests that demand from users and developers might correlate with that of investors and speculators, translating into an amplified effect on prices and explaining that CCs are prone to wild price movements and bubbles. In Guo et al. (2021), the authors investigated the market segmentation problem and information propagation mechanism in the CCs market by constructing a timevarying network of CCs that combines return crosspredictability and technological similarities. The network based on return crosspredictability is obtained using an adaptive Lasso regression technique, which is very similar to our approach in which the elements of the precision matrix can also be obtained as coefficients of a linear regression model. However, the manner in which they addressed the time dependency is different from ours because they considered rolling windows. All of these results unveil the complexity of the cryptocurrency market and any type of investment strategy, such as high frequency trading or longterm hedging techniques, must be carefully planned using the right quantitative tools. Of course, other indicators could be considered to test the interconnectedness of the CCs market or to assess the performances of a portfolio containing CCs; however, it is out of the scope of this paper, which has the aim of providing a flexible and robust estimation methodology able to manage different levels of data contamination and timevarying parameters at different time scales.
Finally, some considerations on the choice of the parameters \(\gamma\) and h are in order. Figures 10 and 11 provide insights into the influence of h and \(\gamma\) on parameter estimations. In general, the choice of h depends on the type of analysis that is intended to be carried out, such as longterm portfolio management that needs a large estimation window, whereas high frequency analysis (e.g., to grasp timevarying correlations) needs a short estimation window. In the CCs market, characterized by highly nonstationary signals, parameter h acts as a smoother over the time domain; namely, a higher h results in smoother dynamics of the parameter estimates. In other words, small values of parameter h, which identifies the estimation window size, favor a precise local estimation of the model parameters at the expense of the smoothness of their dynamics. Instead, parameter \(\gamma\) provides significant differences in parameter estimations when the data are highly nonGaussian. Indeed, Fig. 11 and, in particular, the plots showing the dynamics of the \(\varDelta S\left( t,\gamma \right)\) measure for different \(\gamma\) values indicate that such dynamics remarkably diverge at a few points corresponding to the first significant increase in Bitcoin in December 2017, when its value reached 16,000 dollars, to the outbreak of the Covid19 pandemic and to the second significant increase in Bitcoin in April 2021, when the value reached 53,000 dollars. Therefore, h and \(\gamma\) must be independently set according to the need for local estimations and different data contamination levels. To conclude, we highlight the fact that, for what was just discussed, no unique optimal choice of the hyperparameters exists, and it is strictly related to the analysis of interest.
Conclusions
The main motivation for this work is the introduction of a local and robust estimator of precision matrices able to handle both extreme events and timevarying parameters that typically affect financial time series. In particular, the proposed estimator, obtained by using a kernel over the time domain and replacing the Kullbach–Leibler divergence with a robust divergence, namely, the \(\gamma\) divergence, is applied to estimate timevarying precision matrices of the log return of a representative subset of the cryptocurrency market. The proposed method is remarkably suited for the considered application because the cryptocurrency market is characterized by a peculiar timedependent and very turbulent nature because the major movements on it are the result of speculators who are free to act in a market not regulated by any authority. Moreover, the application is particularly relevant for both the increasing interest in the cryptocurrency market—witnessed by the growing literature and investment in CCs—and the attention that the precision matrix gained in the financial context, given its various and useful applications, from portfolio optimization to systemic risk analysis.
When the financial time series show cyclic or trend components, a simple Gaussian process cannot be considered a reliable model. In such cases, the model considered in this work can be easily extended by considering regression terms, and the proposed estimator could be used with few modifications to the algorithm.
Moreover, in highdimensional settings—when the number of features is higher than the number of observations—sparsity can be introduced in the precision matrix’s structure by adding a penalty to the local estimated \(\gamma\)divergence.
In addition, clustering algorithms can be used to classify timevarying precision matrices to detect turbulent periods and minimize investor risk (Kou et al. 2014, 2021a, b; Li et al. 2021).
Finally, the use of a kernel estimator together with an Mtype estimator has been introduced by Cai and OuldSaïd (2003), in which the authors proposed a robust version of a local linear regression smoother for stationary time series. They also extended to the case of noni.i.d. random variables of the asymptotic theory developed by Fan and Jiang (2000), in which the authors studied local robust techniques in the i.i.d. case. We are working on adapting the asymptotic theory developed in Cai and OuldSaïd (2003) to our estimator. The technicalities and mathematical proofs, which are out of the scope of this paper, will be published in a separate paper.
Availability of data and materials
The data is public because it comes from a data base company called CoinMarketCap.
Abbreviations
 CC:

Cryptocurrency
References
Baek C, Elbeck M (2015) Bitcoin as an investment or speculative vehicle? A first look. Appl Econ Lett 22:30–34
Bariviera AF, Basgall MJ, Hasperué W, Naiouf M (2017) Some stylized facts of the bitcoin market. Physica A 484:82–90
Basu A, Harris IR, Hjort NL, Jones MC (1998) Robust and efficient estimation by minimising a density power divergence. Biometrika 85:549–559
BazánPalomino W (2020) Bitcoin and its offspring: a volatility risk approach. In: Advanced studies of financial technologies and cryptocurrency markets, pp 233–256
BazánPalomino W (2021) How are Bitcoin forks related to Bitcoin? Financ Res Lett 40:101723
Billio M, Getmansky M, Lo AW, Pelizzon L (2012) Econometric measures of connectedness and systemic risk in the finance and insurance sectors. J Financ Econ 104:535–559
Bouri E, Molnár P, Azzi G, Roubaud D, Hagfors LI (2017) On the hedge and safehaven properties of Bitcoin: Is it really more than a diversifier? Financ Res Lett 20:192–198
Brandvold M, Molnár P, Vagstad K, Valstad OCA (2015) Price discovery on bitcoin exchanges. J Int Financ Markets Inst Money 36:18–35
Briere M, Oosterlinck K, Szafarz A (2015) Virtual currency, tangible return: portfolio diversification with bitcoin. J Asset Manag 16:365–373
Cai Z, OuldSaïd E (2003) Local Mestimator for nonparametric time series. Stat Prob Lett 65:433–449
Caporale GM, Zekokh T (2019) Modelling volatility of cryptocurrencies using Markovswitching Garch models. Res Int Bus Financ 48:143–155
Catania L, Grassi S, Ravazzolo F (2019) Forecasting cryptocurrencies under model and parameter instability. Int J Forecast 35:485–501
Chaim P, Laurini MP (2019) Nonlinear dependence on cryptocurrency markets. N Am J Econ Finance 48:32–47
Charles A, Darné O (2019) Volatility estimation for Bitcoin replication and robustness. Int Econ 157:23–32
Chuen DLK, Guo L, Wang Y (2017) Cryptocurrency: a new investment opportunities. J Altern Invest 20:16–40
Ciaian P, Rajcaniova M (2018) Virtual relationships: short and longrun evidence from Bitcoin and Altcoin markets. J Int Financ Markets Inst Money 52:173–195
Cont R (2001) Empirical properties of asset returns: stylized facts and statistical issues. Quant Finance 1:223–236
Corbet S, Meegan A, Larkin C, Lucey B, Yarovaya L (2018) Exploring the dynamic relationships between cryptocurrencies and other financial assets. Econ Lett 165:28–34
Cunha CR, Da Silva R (2020) Relevant stylized facts about Bitcoin: fluctuations, firstreturn probability, and natural phenomena. Physica A 550
Das Gupta A (2008) Asymptotic theory of statistics and probability, Springer
Dyhrberg AH (2016) Bitcoin, gold, and dollar: a GARCH volatility analysis. Financ Res Lett 16:85–92
Fan J, Jiang J (2000) Variable bandwidth and onestep local Mestimator. Sci China Ser A Math 43:65–81
Finegold M, Drton M (2014) Rejoinder: robust Bayesian graphical modeling using Dirichlet \(t\)distributions. Bayesian Anal 9:591–596
Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics (Oxford, Engl) 9:432–441
Fry J (2018) Booms, busts, and heavy tails: the story of Bitcoin and cryptocurrency markets? Econ Lett 171:225–229
Fujisawa H, Eguchi S (2008) Robust parameter estimation with a small bias against heavy contamination. J Multivar Anal 99:2053–2081
Guo L, Härdle W K, Tao Y (2021) A timevarying network for cryptocurrencies. arXiv preprint arXiv:2108.11921
Hirose K, Fujisawa H, Sese J (2017) Robust sparse Gaussian graphical modeling. J Multivar Anal 161:172–190
Hu AS, Parlour CA, Rajan U (2019) Cryptocurrencies: stylized facts on a new investible instrument. Financ Manage 48:1049–1068
Huang JZ, Huang W, Ni J (2019) Predicting Bitcoin returns using highdimensional technical indicators. J Finance Data Sci 5:140–155
Huber P, Ronchetti E (2011) Robust statistics. Wiley Series probability and statistics, Wiley
Katsiampa P (2017) Volatility estimation for Bitcoin: a comparison of Garch models. Econ Lett 158:3–6
Kim TH, White H (2004) On a more robust estimation of skewness and kurtosis. Financ Res Lett 1:56–73
Klein T, Thu HP, Walther T (2018) Bitcoin is not a new golda comparison of volatility, correlation, and portfolio performance. Int Rev Financ Anal 59:105–116
Kou G, Peng Y, Wang G (2014) Evaluation of clustering algorithms for financial risk analysis using MCDM methods. Inf Sci 275:1–12
Kou G, Xu Y, Peng Y, Shen F, Chen Y, Chang K, Kou S (2021a) Bankruptcy prediction for SMEs using transactional data and twostage multi objective feature selection. Decis Support Syst 140:113429
Kou ÖG, Akdeniz O, Dinçer H (2021b) Fintech investments in European banks: a hybrid IT2 fuzzy multidimensional decisionmaking approach. Financ Innov 7:1–28
Lafferty J, Liu H, Wasserman L (2012) Sparse nonparametric graphical models. Stat Sci 27:519–537
Lauritzen SL (1996) Graphical models, vol 17, Clarendon Press
Li T, Kou G, Peng Y, Philip SY (2021) An integrated cluster detection, optimization and interpretation approach for financial data. IEEE Trans Cybern
Nie CX (2020) Correlation dynamics in the cryptocurrency market based on dimensionality reduction analysis, Physica A Stat Mech Appl 554
Nie CX (2022) Analysis of critical events in the correlation dynamics of the cryptocurrency market. Physica A 586:126462
Shams A (2020) Structure of cryptocurrency returns, Fisher College of business working paper, 202003, pp 11
Senneret M, Malevergne Y, Abry P, Perrin G, Jaffres L (2016) Covariance versus precision matrix estimation for efficient asset allocation. IEEE J Sel Top Signal Process 10:982–993
Torri G, Giacometti R, Paterlini S (2019) Sparse precision matrices for minimum variance portfolios. CMS 16:375–400
Vogel D, Tyler DE (2014) Robust estimators for nondecomposable elliptical graphical models. Biometrika 101:865–882
Zhou S, Lafferty J, Wasserman L (2010) Timevarying undirected graphs. Mach Learn 80:295–319
Acknowledgements
Not applicable.
Author information
Authors and Affiliations
Contributions
PS: designed the work, performed the data analysis and contributed in the writing of the manuscript; DV: designed the work, contributed in data analysis and contributed in the writing of the manuscript; MB: helped in the writing in a preliminary version of the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1: Figures
Cryptocurrencies log return
Conditional correlation graphs
Conditional correlations dynamics
See Fig. 10.
Quantitative measures
See Fig. 11.
Appendix 2: Algorithm
Given the model in Eq. (1), the parameters being estimated through minimisation problem in Eq. (7) correspond to elements in the precision matrix \(\varvec{\varTheta }_{t}\), for \(0\le t\le T\). Therefore, let \(\varvec{\theta }_t=\mathrm{vec}\left( \varvec{\varTheta }_t\right) ^\prime\) be the vector of parameters at time t, and \(\widehat{\varvec{\theta }}_t^{(j)}\) be an estimate of \(\varvec{\theta }_{t}\) at the jth iteration. Let us denote
so that \(z_u^{(j)} x_u^{(j)}=w_{ut}f\left( \varvec{r}_u,\varvec{\theta }_t\right) ^\gamma\). Then, using the Jensen’s inequality to the convex function \(y=\log \left( x\right)\), we get
Moreover
Therefore by Eqs. (8) and (9) it holds
that are the two properties characterising a majorisation function, therefore allowing to use an MM–algorithm in order to solve the minimisation problem in Eq. (7).
At iteration \(j+1\) the following step have to be implemented:

Majorize–Step: compute
$$\tilde{\ell }_{{\gamma ,t}} \left( {\varvec{r},\varvec{\theta }_{t} \varvec{\hat{\theta }}_{t}^{{(j)}} } \right) = \tilde{\ell }_{{1,t}} \left( {\varvec{r},\varvec{\theta }_{t} \varvec{\hat{\theta }}_{t}^{{(j)}} } \right) + \ell _{2} \left( {\varvec{\theta }_{t} } \right).$$ 
Minimisation–Step: solve
$$\varvec{\hat{\theta }}_{t}^{{(j + 1)}} = \arg \min _{\varvec{\theta }} \tilde{\ell }_{{\gamma ,t}} \left( {\varvec{r},\varvec{\theta }_{t} \varvec{\hat{\theta }}_{t}^{{(j)}} } \right).$$
In order to solve the minimisation step we use the results in the “Appendix 2” of Hirose et al. (2017) regarding the function \(\ell _{2}\left( \varvec{\theta }_{t}\right)\). In particular, it holds
and
where \(\left( {\varvec2{\hat{S}}_{t}^{{(j)}} } \right)_{{i,j}} = \sum\limits_{{s = 1}}^{T} {z_{s}^{{(j)}} } \varvec{r}_{{i,s}} \varvec{r}_{{j,s}}^{\prime }\) and c is a constant.
Therefore the following formula is obtained
To find the update \(\varvec{\theta }_t^{(j+1)}\) we need to solve the following minimisation problem
whose solution is carried out similarly to maximum–likelihood method.
It is interesting to notice that, if we want to introduce sparsity in the precision matrix, the detailed algorithm would work in the same way and the minimisation step would require to solve the following
that can be solved using the graphical lasso algorithm, see Friedman et al. (2008).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Stolfi, P., Bernardi, M. & Vergni, D. Robust estimation of timedependent precision matrix with application to the cryptocurrency market. Financ Innov 8, 47 (2022). https://doi.org/10.1186/s40854022003554
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s40854022003554