
Online risk-based portfolio allocation on subsets of crypto assets applying a prototype-based clustering algorithm


Mean-variance portfolio optimization models are sensitive to uncertainty in risk-return estimates, which may result in poor out-of-sample performance. In particular, the estimates may suffer when the number of assets considered is high and the return time series are not sufficiently long. This is precisely the case in the cryptocurrency market, where there are hundreds of crypto assets that have been traded for only a few years. We propose enhancing the mean-variance (MV) model with a pre-selection stage that uses a prototype-based clustering algorithm to reduce the number of crypto assets considered at each investment period. In the pre-selection stage, we run a prototype-based clustering algorithm where the assets are described by variables representing the profit-risk duality. The prototypes of the clustering partition are automatically examined and the one that best suits our risk-aversion preference is selected. We then run the MV portfolio optimization with the crypto assets of the selected cluster. The proposed approach is tested for a period of 17 months in the whole cryptocurrency market and in two selections of the cryptocurrencies with the highest market capitalization (175 and 250 cryptos). We compare the results against three methods applied to the whole market: classic MV, risk parity, and hierarchical risk parity methods. We also compare our results with those from investing in the market index CCI30. The simulation results generally favor our proposal in terms of profit and risk-profit financial indicators. This result reaffirms the convenience of using machine learning methods to guide financial investments in complex and highly volatile environments such as the cryptocurrency market.


Blockchain technology is one of the most disruptive technologies of the last 30 years, with applications in many different domains where transaction processing takes place. As a result, blockchain has changed our view of contracts, logistics, and shipping, and has sparked academic research (Zhou et al. 2021). Finance (Zhao et al. 2016; Xu et al. 2019) is one of the fields where the impact of blockchain has been most important, in particular through the use of cryptocurrencies as trading assets, which in turn has raised considerable interest from academia (Fang et al. 2022).

Since the appearance of Bitcoin in 2008, the cryptocurrency market capitalization has grown to $1676bn in February 2022, where Bitcoin and Ethereum represent 41.8% and 18.1% of the market, respectively (Footnote 1).

While the market is only in its infancy, some studies highlight a Compound Annual Growth Rate (CAGR) higher than 21% for the next 5 years (Footnote 2), which makes it extremely attractive for investors. At the same time, its volatility is also extremely high, and the number of crypto assets traded is also very high, at an estimated total of around 10,000 as of February 2022 (Footnote 3). Most of those crypto assets have been traded for just a few years or even less.

While the market is attractive to investors, the aforementioned characteristics render well-known portfolio optimization models, such as the Mean-Variance (MV) methods (Markowitz 1952a, 1959), unreliable. In particular, the high number of potential crypto assets and their short time in the market may hinder the estimation of the covariance matrix.

Some studies use a predefined criterion for reducing the number of considered crypto assets, such as focusing on those with higher capitalization. However, such a strategy may leave aside potentially interesting assets for the investor. Thus, we propose the use of a clustering method to partition the cryptocurrency space and the automatic selection of the cluster that best suits the risk-aversion preference of the investor. In particular, we propose the use of a prototype-based clustering algorithm, such as K-Means or K-Medoids, as the prototypes will be used as representative elements of the clusters. The usefulness of such methods for characterizing the crypto-asset market in a meaningful way has previously been demonstrated in Lorenzo and Arroyo (2022). Following this work, we use the bivariate representation of the average and standard deviation of the returns together with a partitional clustering algorithm as a preliminary step before the portfolio optimization. After clustering, we select the cluster that best fits one of the predefined risk-aversion profiles, using the prototype as an adequate summary of the cluster. By doing so, our method reduces the number of crypto assets that will be considered by the portfolio optimization model, focusing only on those that best suit the investor strategy. Furthermore, our approach acknowledges the changing nature of the cryptocurrency market, as we repeat the process (clustering analysis, selection of the cluster, and portfolio optimization) at each investment time, so the resulting number of clusters and their composition may change completely. In this way, our approach can be considered an online portfolio selection methodology, where decisions are made sequentially incorporating the new information, and it goes beyond the static or cross-sectional use of clustering methods in other portfolio selection approaches.

We compare the results of our clustered MV model with those of the standard MV model applied to the whole market. In addition, we also compare it with those from more sophisticated methods such as the Risk Parity (Qian 2016), and the Hierarchical Risk Parity (HRP) methods (de Prado 2016). Furthermore, we also compare it against a buy-and-hold strategy of the CCI30 cryptocurrency index that represents the overall market behavior.

The simulation entails a test period of 17 months. We perform three experiments, each with a different set of cryptocurrencies: the whole cryptocurrency market with data available (over 500 cryptocurrencies), and two selections of the cryptocurrencies with the highest market capitalization (175 and 250 cryptocurrencies, respectively). For each experiment and each method, we repeat the simulation 1500 times using different investment paths; each path represents the simulation days on which an investor considers entering the market. If an investment is made on a given day, the position is held for the next 30 days. The different methods are compared using standard profit and risk-profit financial indicators.

The rest of this paper is organized as follows. A literature review is presented in “Literature review” section, in which we support our investigation on well-stated portfolio allocation models and different approaches for clustering of financial markets with more details for the cryptocurrency domain. The “Data and methods” section includes data processing, the methods followed, and the simulation carried out. Finally, we discuss our results and present concluding remarks in the “Results” and “Conclusions” sections, respectively.

Literature review

Portfolio selection

An investment portfolio is a basket of tradable assets, and portfolio optimization models are concerned with finding the best combination of assets according to given objectives. There are two main schools of principles and theories for portfolio selection: (i) Markowitz’s Mean Variance models and (ii) Capital Growth Theory (Kelly jr 1956; Breiman 1960; Thorp 1975; Finkelstein and Whitley 1981). This research focuses on the first type of model. Modern Portfolio Theory (MPT), also referred to as Mean-Variance Optimization models (MVO), was first posited in the 1950s (Markowitz 1952a; Sharpe 1964; Lintner 1965) and considers the diversification of assets as the most effective way to obtain low risk-reward ratios maximizing the expected utility of the returns. Diversification of capital helps to neutralize idiosyncratic risks. In this way, MVO links with the theory of rational behavior under uncertainty (Markowitz 1952b) and is included in what are known as risk-based models. The portfolio allocation theory framework has been extensively developed in very different works (Steinbach 2001; Kolm et al. 2014) that propose solutions to existing constraints on the different models, mostly considering practical applicability to the markets. The models now include transaction costs, tax effects, estimation errors on the risk and return forecast, and inter-temporal effects as edging conditions, and the inclusion of specific features required by financial planners. An exhaustive taxonomy of MVO methods can be found in Kalayci et al. (2019), all of which are focused on reducing risk while increasing diversification. One of the weaknesses of MVO models is that it is necessary to provide an estimation of the expected returns and covariances of all the securities in the investment universe; more details on criticism of MVO can be found in Michaud (1989) and Leland (1999). We use the acronyms MPT, MV, and MVO to refer to the same portfolio allocation model.

The Risk Parity Portfolio (RPP) (Qian 2016; Roncalli 2013), also known as the Equally weighted Risk Contribution (ERC) portfolio, belongs, together with MVO, to the risk-based models. It is an approach to portfolio management that focuses on risk allocation rather than capital allocation. While the MVO methods minimize the variance, RPP models try to constrain each asset to contribute equally to the portfolio’s overall volatility (Maillard et al. 2010). In other words, RPP balances the risk so that the risk contribution of every asset is equal, and in this sense it is also considered a risk-based model.

Merging classical portfolio optimization models and hierarchical methods from unsupervised learning techniques detailed in the next subsection, the Hierarchical Risk Parity (HRP) model introduced by de Prado (2016) addresses the problems of traditional risk-based portfolios to compute a portfolio on an ill-generated or even a singular covariance matrix by conducting the optimization process by a top-down recursive bisection using graph theory and machine learning techniques. Based on the same idea, Raffinot (2017) proposed Hierarchical Clustering Based Asset Allocation (HCAA) that allocates capital within and across clusters of assets in multiple hierarchical models. The Hierarchical Equal Risk Contribution Portfolio (HERC) (Raffinot 2018) merges HRP and HCAA. Several variations to this approach have also been proposed (Lohre et al. 2020; Molyboga 2020). In particular, we use the HRP model as a benchmarking method to compare our proposal, as explained below. Moreover, our approach also tackles the problem of the covariance matrix. However, it focuses only on one cluster and in doing so, reduces the number of crypto assets and makes the computation and the inversion process of the covariance matrix easier.

An exhaustive study of the latest risk-based portfolio optimization strategies applied to the 30 cryptos with the highest market capitalization can be found in Burggraf (2019). An important strand of research is focused on the effects of adding cryptocurrencies to traditional asset portfolios (Eisl et al. 2015; Chuen et al. 2017; Petukhina et al. 2021; Culjak et al. 2022). Other works, like our proposal, are devoted exclusively to crypto markets. For example, Liu (2019) analyzes the investibility of the top-10 major cryptocurrencies, demonstrating the benefits of diversification for different portfolio optimization models such as ERC, MV, RPP, maximum Sharpe ratio, and maximum utility.

We test the performance of our proposal in cryptocurrency markets using MV, RPP, and HRP models as benchmarks, which are among the most relevant in the portfolio allocation literature.

Clustering techniques in portfolio selection

The application of unsupervised models, and in particular clustering techniques, to find groups of assets characterized by their financial behavior arises with the seminal paper of Mantegna (1999), which applies a hierarchical Minimum Spanning Tree (MST) that takes the linkage between stocks from the New York Stock Exchange market into account. While Mantegna’s methodology has been extensively applied, it has some drawbacks. A criticism of the initial Mantegna approach was related to the employed distance, based on a simple static correlation among the returns’ time series. Alternative approaches have been developed considering auto-correlation structure (Piccolo 1990), distances based on GARCH parameters (Otranto 2008), frequency domain features, higher moments of time series, and so on. Different alternatives have been proposed. For example, Onnela et al. (2003) investigated the distribution and dynamics of correlation coefficients. Bonanno et al. (2004) compared the return and volatility networks considering different time horizons. Tumminello et al. (2006) proposed Planar Maximally Filtered Graphs (PMFG) instead of MST. Brida and Risso (2009) combined symbolic time series analysis with MST. Musmeci et al. (2014) applied a new linkage method known as a Directed Bubble Hierarchical Tree (DBHT) to financial markets. From our viewpoint, Mantegna’s approach is a powerful methodology to determine the structures of the market in the context of a cross-sectional analysis. However, we consider it difficult to adapt it to streaming data and online portfolios due to different constraints, for instance, the sampling frequency (Bonanno et al. 2001) (e.g., intra-day, daily, weekly), the length of the rolling window T (Onnela et al. 2003), and the number of assets N under study (Borysov et al. 2014). We later tackle the same challenge when N approaches the T value in portfolio optimization risk-based models. Marti et al. (2017) present an exhaustive review of hierarchical clustering in financial markets.

Another important strand of clustering in finance applies partitional prototype-based clustering to financial markets. D’Urso et al. (2013) and D’Urso et al. (2016) used a model-based approach with different variations of fuzzy clusters and different distance metrics (autoregressive, Caiado). Iorio et al. (2018) proposed a clustering based on the computation of the spline coefficients of the time series and directly measured the performance within MV, Equally Weighted (EW), and ERC portfolio allocation models. Similarly, D’Urso et al. (2020) proposed a fuzzy clustering method based on cepstral representation, using the daily Sharpe ratio as a clustering variable. Soleymani and Vasighi (2020) adapted K-means to cluster NYSE stocks based on Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) measures. Gubu et al. (2020) present a robust portfolio selection using the KAMILA algorithm on a combination of continuous and categorical variables with a robust covariance estimation. Cerqueti et al. (2021) propose clustering time series according to their estimated conditional moments via Autocorrelation-based fuzzy C-means (A-FCM algorithm); the proposal is enhanced in Cerqueti et al. (2022), in which they compute an optimal weight for each moment. Both proposals are tested directly on different time series as empirical experiments.

Regarding the partitional clustering family, Nanda et al. (2010) applied K-means, fuzzy C-means, and Self-Organizing Maps (SOM) to returns and financial ratios from Indian stocks to classify them into different clusters and subsequently develop portfolios. The analysis considers a set of stocks with fixed-weight allocation along the investing period. However, the approach does not use out-of-sample data and the study is not replicated over time to investigate how the clusters and the results evolve. Nguyen Cong et al. (2014) proposed another precedent, which combines a stage of clustering using return and standard deviation variables and a multi-objective portfolio optimization allocating stocks from the different clusters using a genetic algorithm. The simulation is carried out using 570 stocks from the Stock Exchange of Thailand (SET) and identifies four clusters. Again, the number of clusters does not change over time. Datta and Ghosh (2015) propose an approach that groups the daily Indian market volatility by comparing Kernel K-means, SOM, and Gaussian clustering models to achieve the right volatility prediction using the clusters as predictors. Luca and Zuccolotto (2017) propose a dynamic clustering procedure for time-series returns using the time-varying tail dependency as a dissimilarity measure. The aim is to provide a criterion for portfolio selection focusing on the lower tails of the returns distributions that are sensitive to the contagion phenomena between stocks for the FTSE MIB index. Duarte and De Castro (2020) segment the assets of the Brazilian Stock Exchange (B3) into partitional clusters of correlated assets taken as initial medoids of those assets with the lowest standard deviation of the past series of prices that feed an MV model and compared the performance with the RPP model. Instead, our approach is not based on any correlation measure but on a Euclidean distance defined on the volatility-return space.

Clustering techniques applied to the cryptocurrency market

Regarding the application of clustering methods to the cryptocurrency market, Song et al. (2012) applied Mantegna’s initial ideas based on hierarchical clustering but renewed for the emerging crypto market. Similarly, Stosic et al. (2018) use clustering to characterize the cryptocurrency market using the correlations of 110 cryptocurrencies and detect hierarchical structures using the MST. Song et al. (2019) also applied MST but removed the influence of Bitcoin-Ethereum to avoid a highly correlated matrix. Lorenzo and Arroyo (2022) applied three different prototype-based clustering techniques to conduct a cross-sectional analysis of the cryptocurrency market and identify associations between the clusters and several financial and technological descriptors. Each clustering method deals with the cryptocurrencies represented in a different way: as two variables (the average and standard deviation of the daily returns), as the distribution of daily returns, or as the daily return time series.

Our approach uses clustering, but contrary to other works, it uses a sliding window, allowing for the number of clusters and their composition to change over time and automatically deciding the number of clusters by combining several validity indexes using a voting mechanism.

Online portfolio selection

We adopt an approach that fits into the Online Portfolio Selection (OLPS) models. The main characteristic of such models is that they sequentially select a portfolio over a set of assets to achieve certain targets. In OLPS, market information arrives sequentially and the allocation decision must be made immediately. An exhaustive survey can be found in Li and Hoi (2014), which is complemented in Li et al. (2016) with an open-source MATLAB library to apply different online algorithms.

There are two types of methodologies in the OLPS literature: (i) batch learning, where the model is trained from a batch of training instances, and (ii) online learning, where the model is successively trained from a single instance taking the price change \((x_{t,i}=\frac{p_{t,i}}{p_{t-1,i}})\) as an input vector. Our research is focused on the continuous-time MV model developed for multi-period (batch) portfolio selection for the control part, which is analytically resolved in Li and Ng (2000) and Dai et al. (2010). Our proposal suits the batch approach because it is based on the deterministic management of historical data for the portfolio selection, where there is no dependency on the allocation decisions between different time frames. In addition, the target in each iteration for every investing window is a mean reversion formula inspired by the Online Moving Average Reversion (OLMAR) methods (Li and Hoi 2012; Li et al. 2015; Umino et al. 2022).

Jiang and Liang (2017) propose an online portfolio approach in cryptocurrency markets. In particular, they propose a deterministic deep reinforcement learning method based on a Convolutional Neural Network (CNN) applied to a training window of the historic price changes. The weight on the allocation is based on a reward function that maximizes the portfolio value, although only for 12 cryptocurrencies. There are some similarities between this work and our approach, since both use parameter tuning based on back-testing trading, and both combine machine learning and portfolio allocation. However, we first apply a clustering technique instead of the more complex CNN. Second, we use the classical portfolio allocation model MV instead of a reward function that does not consider any aspect of the risk aversion of the investor. Third, we apply our method to 534 cryptos instead of just 12.

Another relevant reference is Khedmati and Azin (2020), who present an online selection algorithm based on the pattern-matching principle that uses K-means, K-medoids, spectral, and hierarchical clustering to select the best investing time window.

Market efficiency

Market efficiency is a key financial subject that researchers try to transpose from the classical to the cryptocurrency domain. Starting with the seminal works of Fama (1965) and Samuelson (1965) on traditional financial markets, there have been different attempts to understand the applicability of efficiency to the new markets. Kyriazis (2019) conducts a systematic survey on the predictability of cryptocurrency prices and concludes that the Efficient Market Hypothesis (EMH) is rejected, opening a door to speculation, a conclusion that we partially confirm in our investigation. Makarov and Schoar (2020) come to the same conclusion but analyze arbitrage opportunities arising from price deviations across the different exchanges. One of the major implications of the inefficiencies of the cryptocurrency markets is that they offer portfolio managers opportunities to make excess returns by out-performing the market (Palamalai et al. 2021). We find different examples of how such inefficiencies can be exploited by applying different machine learning techniques. For instance, Alessandretti et al. (2018) applied different forecasting models, achieving profits over the investing period and performing better than a baseline strategy. The parameter optimization based on the Sharpe ratio achieves the best results, which is one of the strategies that we analyze herein. Livieris et al. (2020) ensemble different learning strategies that exhibit high efficiency and reliability, mainly for low-frequency applications. Fang et al. (2021) analyze a data-driven approach with a retraining method to predict successful mid-price movements in cryptocurrency markets. The disadvantage of the learning algorithms that take advantage of the market inefficiency is that the methods are data-hungry (Marcus 2018) and the forecasting benefit decays in non-stationary time series. Finally, Sebastião and Godinho (2021) analyze the predictability of three important cryptocurrencies, Bitcoin, Ethereum, and Litecoin, using several machine learning methods and compare their profitability incorporating trading costs. The results indicate that it is possible to propose profitable trading strategies in the cryptocurrency market, even under adverse market conditions, providing an example of the market inefficiencies.

Data and methods

Dataset and preprocessing

From the Cryptocompare exchange (Footnote 4), we retrieve the daily closing price for all the cryptocurrencies traded from January 1, 2018, to May 31, 2021, for a total of 1,999,953 market observations over 1247 trading days.

For each cryptocurrency, we transform the price time series into the arithmetic return time series, whose use is extended and consolidated due to its more suitable statistical properties and better comparability (Gilli et al. 2019). The arithmetic returns for the cryptocurrency i at time t are:

$$\begin{aligned} R_{t,i}= \frac{P_{t,i}-P_{t-1,i}}{P_{t-1,i}} = \frac{P_{t,i}}{P_{t-1,i}}-1, \quad t=1,\ldots ,T, \quad i=1,\ldots ,m \end{aligned}$$

where \(P_{t,i}\) is the daily price of crypto-asset i at day t, T is the length of the sampled time series, and m is the number of crypto assets.

Regarding data cleaning, we remove the observations with duplicated rows and NaN or Inf values for \(R_{t,i}\).
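The return definition in Eq. 1 and the cleaning rule above can be illustrated with a minimal sketch. It is not the authors' code: the function name `arithmetic_returns` and the choice to simply drop non-finite values (rather than, say, keep them aligned with dates) are assumptions made for illustration.

```python
import math

def arithmetic_returns(prices):
    """Arithmetic returns R_t = P_t / P_{t-1} - 1, skipping NaN or Inf results."""
    returns = []
    for prev, curr in zip(prices, prices[1:]):
        if prev == 0:
            continue  # the ratio would be infinite, so the observation is dropped
        r = curr / prev - 1.0
        if math.isfinite(r):  # drops NaN/Inf caused by missing or corrupt prices
            returns.append(r)
    return returns

# Hypothetical closing prices for one crypto asset
print(arithmetic_returns([100.0, 110.0, 99.0, 99.0]))
```

Dropping values keeps the sketch short; a production pipeline would more likely mark them as missing so that the series of different assets stay aligned by date.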

Furthermore, we filter out the cryptocurrencies with heavier tails in the return distribution, because heavy tails imply relatively frequent extreme price fluctuations and affect the consistency of the results, particularly the estimation of the returns and covariance matrix for the portfolio optimization.

Heavy-tail behavior in a return distribution is related to the finite-size effects in the number of active agents linked to the liquidity and volume of the market (Watorek et al. 2020). According to Newman (2005), a distribution has a heavy-tail behavior if the tail index is lower than 2. In our case, we discard cryptocurrencies with heavier tails, that is, those with a tail index lower than 2.3. In this way, we ensure the existence of the first two moments (expectation and covariance matrix) required for risk-based models. We apply this filter only to the first two years of data, that is, before the simulation starts. In this way, we avoid look-ahead bias. The results are reported in Table 1.
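The text does not state which tail-index estimator is used. As one common possibility, the sketch below applies the Hill estimator to the k largest positive values of a sample; the function name `hill_tail_index`, the choice of k, and the synthetic Pareto data are all assumptions made for illustration. For the left tail, the same function could be applied to the absolute values of the negative returns.

```python
import math
import random

def hill_tail_index(values, k):
    """Hill estimate of the tail index from the k largest positive values."""
    tail = sorted((v for v in values if v > 0), reverse=True)
    if k < 1 or k >= len(tail):
        raise ValueError("k must satisfy 1 <= k < number of positive values")
    threshold = tail[k]  # the (k+1)-th largest value
    mean_log_excess = sum(math.log(v / threshold) for v in tail[:k]) / k
    return 1.0 / mean_log_excess

# Synthetic check on Pareto data with a known tail index of 3 (not market data)
rng = random.Random(1)
sample = [rng.paretovariate(3.0) for _ in range(20000)]
print(hill_tail_index(sample, k=2000))
```

The estimate obtained for each tail of each asset would then be compared with the 2.3 threshold mentioned above. The result depends on k, so in practice the estimate is usually inspected over a range of k values.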

Table 1 Heavy-tail and KSS inference tests of the 2018–19 time series of the selected 534 crypto assets, ordered by market cap; AlphaN and AlphaP are the scaling parameters (\(\alpha\)) for the left and right tails of the distribution; Sd.N and Sd.P are the standard deviations of \(\alpha\); KssStat and KssPvalue are the KSS non-stationarity statistic and the corresponding p-value

Stationarity is another important property of the underlying process in a return time series, especially if we are interested in forecasting or pricing. Random walk theory allows us to test the weak form of the EMH (Samuelson 1965; Fama 1965) within a series of asset returns. The key principle of EMH is that asset prices reflect all information, making it impossible for investors to derive benefits through trying to predict their behavior. The weak form of efficiency can be expressed as an autoregressive random walk model of stock returns:

$$\begin{aligned} R_t = \rho R_{t-1} + e_t, \quad t=1,2,\ldots ,T, \quad \text{and} \quad e_t \sim N(0,\sigma ^2) \end{aligned}$$

From Eq. 2, the stock return series, \(R_t\), is considered a random walk only if \(\rho = 1\), whereas if \(|\rho | < 1\), the series is a stationary and predictable process, which violates the weak-form EMH. We apply the Kapetanios, Shin, and Snell (KSS) nonlinear unit root test (Kapetanios et al. 2003), which is more robust when there are market frictions (i.e., transaction costs), as proposed in Apopo and Phiri (2021) for cryptocurrency markets. We use the test implementation in the R package by Guris (2021). The null hypothesis is that the raw time series of log returns is of the random-walk type, against the alternative of a stationary process. The results for the 2018–19 window are reported in Table 1: the null hypothesis cannot be rejected (p-value higher than 0.01) for approximately 50% of the cryptos. These results are aligned with others in the cryptocurrency market (Kyriazis 2019).
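To give an idea of what the KSS statistic measures, the sketch below computes the t-statistic of \(\delta\) in the auxiliary regression \(\Delta y_t = \delta y_{t-1}^3 + \text{error}\), which is the core of the test in Kapetanios et al. (2003). It is a simplified illustration, not the R implementation used in the paper: it applies no demeaning or detrending, adds no lagged differences, and provides no critical values or p-values. The function name `kss_statistic` is hypothetical.

```python
import math
import random

def kss_statistic(y):
    """t-statistic of delta in: y_t - y_{t-1} = delta * y_{t-1}**3 + error."""
    dy = [b - a for a, b in zip(y, y[1:])]
    x = [a ** 3 for a in y[:-1]]
    sxx = sum(v * v for v in x)
    delta = sum(u * v for u, v in zip(dy, x)) / sxx  # OLS slope without intercept
    resid = [u - delta * v for u, v in zip(dy, x)]
    s2 = sum(e * e for e in resid) / (len(dy) - 1)   # one estimated coefficient
    return delta / math.sqrt(s2 / sxx)

# Synthetic series: stationary noise versus a random walk built from the same shocks
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]
walk, level = [], 0.0
for shock in noise:
    level += shock
    walk.append(level)
print(kss_statistic(noise), kss_statistic(walk))
```

A strongly negative statistic is evidence against the unit-root null; the stationary series should therefore yield a far more negative value than the random walk.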

The resulting number of eligible cryptocurrencies for portfolios is 534. Additionally, we also consider two smaller sets with 250 and 175 cryptocurrencies with the highest market capitalization among the eligible ones.


In this section, we describe our methodology. We perform a Monte Carlo experiment, repeating each simulation 1500 times to better assess the outcome of the different investment methods considered. The simulation period is 17 months, from January 2020 to May 2021. In each simulation, we run a sequence of investments known as a Random Investment Path (RIP), which consists of a sequence of days \(t_w\) on which entering the market is considered.

The RIPs are the same for all the investment methods under analysis: the ones proposed and those used for benchmarking. For each investment at time \(t_w\), we consider a 2-year estimation window (730 days) from \(t_{w-729}\) to \(t_{w}\), which is used to estimate the portfolio. If an investment is made, the position is always held for 30 days, that is, from \(t_{w+1}\) to \(t_{w+30}\).

The investment path is created as follows. For each of the 17 months of the simulation period, there is a 50% probability of investing in that month. If the month is selected, we randomly select the day \(t_w\) when the investment will be made. Since the holding period of an investment is 30 days, we ensure that there is no overlap between the holding period of \(t_w\) and the next investment day \(t_w+1\).
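The path construction described above can be sketched as follows. The text does not say how the non-overlap condition is enforced, so this sketch makes an assumption: the investment day is drawn only among the days of the month that fall after the previous 30-day holding period, and the month is skipped if no such day exists. The function name `random_investment_path` and its parameters are hypothetical, and the sketch does not check that the last holding period ends within the dataset.

```python
import calendar
import random
from datetime import date, timedelta

HOLDING_DAYS = 30

def random_investment_path(rng, start_year=2020, start_month=1,
                           n_months=17, p_invest=0.5):
    """Draw one Random Investment Path as a list of investment days t_w."""
    path = []
    last_held = date.min  # last day of the most recent holding period
    for k in range(n_months):
        year = start_year + (start_month - 1 + k) // 12
        month = (start_month - 1 + k) % 12 + 1
        if rng.random() >= p_invest:
            continue  # no investment this month (probability 1 - p_invest)
        n_days = calendar.monthrange(year, month)[1]
        # Assumption: only days after the previous holding period are eligible
        candidates = [date(year, month, d) for d in range(1, n_days + 1)
                      if date(year, month, d) > last_held]
        if not candidates:
            continue
        t_w = rng.choice(candidates)
        path.append(t_w)
        last_held = t_w + timedelta(days=HOLDING_DAYS)  # held from t_w+1 to t_w+30
    return path

print(random_investment_path(random.Random(42)))
```

Generating 1500 such paths with different seeds, and reusing the same paths for every method, would reproduce the structure of the Monte Carlo experiment.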

Fig. 1 Flowchart of the algorithm for portfolio allocation

The flowchart in Fig. 1 summarizes the investment process, which consists of the following steps:

  1. Data selection. As explained, for each investment time \(t_w\), we use a two-year sliding window for the estimation of the optimal portfolio, and 30 days as the holding period for the investment.

  2. Market segmentation. At this stage, we use a prototype-based clustering algorithm to segment the market. In particular, we use a k-medoids algorithm and a quality index to automatically set the k value. We repeat the process 50 times to remove the uncertainty of the initialization of the algorithm. Each partition is denoted as \(P_n\) in the chart.

  3. Prototype selection strategy. Given the prototypes of the 50 segmentations of the previous step, we apply a heuristic to select the most suitable cluster for later portfolio optimization. In particular, we consider four different strategies, three of which are related to risk measures and another based on the well-known Sharpe ratio. If no prototype matches the strategy requirements, then no cluster is selected and no investment will be made at \(t_w\). For a given strategy and an investment time \(t_w\), we select a cluster i of the partition \(P_n\), denoted as \(C_{w,i}^{P_n}\).

  4. Portfolio allocation. The MVO method is applied to the cryptocurrencies that belong to the cluster \(C_{w,i}^{P_n}\) selected for each strategy, once the risk filter has been applied to remove extremely volatile cryptocurrencies. The portfolio optimization may produce no result, in which case no investment is made. This can happen because the covariance matrix is not symmetric positive definite, and hence not invertible, or because the optimization problem is ill-conditioned and no solution is found. It should be noted that the quadratic function minimized in our model, as defined in Eq. 8, is convex and reaches a global optimum if and only if the covariance matrix is positive semi-definite.

     Before applying the portfolio optimization method, we apply a risk filter to remove the cryptocurrencies with extreme volatility in the last month.

  5. Performance assessment. This last step is carried out once the investment path is executed. We then measure the performance of the MV model with each of the four proposed strategies and of the benchmarks. The performance is measured using profit, risk, and profit-risk indicators. As benchmarks, we use well-known portfolio optimization models over all the cryptocurrencies, that is, with no market segmentation and prototype selection. In particular, we use the mean-variance (MV), risk parity (RPP), and hierarchical risk parity (HRP) models. In addition, we also apply the random investment paths to the market index CCI30 to compare the strategies against the main market trend.
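The portfolio allocation step notes that the optimization can fail when the covariance matrix is not positive definite. The sketch below illustrates that check with a Cholesky factorization and, when it succeeds, computes the closed-form global minimum-variance weights \(w = \Sigma ^{-1}{\textbf {1}} / ({\textbf {1}}^\top \Sigma ^{-1}{\textbf {1}})\). This closed form is a stand-in chosen for brevity: it allows negative weights and is not the paper's optimization problem (Eq. 8). The function names are hypothetical.

```python
import math

def cholesky(a):
    """Lower-triangular L with a = L L^T; ValueError if a is not positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def min_variance_weights(cov):
    """Minimum-variance weights, or None when the covariance matrix is unusable."""
    try:
        low = cholesky(cov)
    except ValueError:
        return None  # mirrors the 'no investment is made' outcome
    n = len(cov)
    z = [0.0] * n
    for i in range(n):  # forward substitution: low @ z = vector of ones
        z[i] = (1.0 - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: low^T @ x = z
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    total = sum(x)
    return [v / total for v in x]

print(min_variance_weights([[0.04, 0.0], [0.0, 0.01]]))  # two uncorrelated assets
print(min_variance_weights([[1.0, 1.0], [1.0, 1.0]]))    # singular matrix
```

Restricting the optimization to a single cluster reduces the dimension of the covariance matrix, which makes a failure of this check less likely than on the whole market.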

Sampling strategy and data selection

Our dataset ranges from January 1, 2018, to May 31, 2021. For each investment at time \(t_w\), we consider a 2-year estimation window (730 days) from \(t_{w-729}\) to \(t_{w}\); these data are used to estimate the portfolios. The investing period is 30 days, that is, from \(t_{w+1}\) to \(t_{w+30}\). At this stage, we apply the so-called risk filter to remove from consideration the cryptocurrencies with extreme volatility in the last month of the estimation window. In particular, we remove those with \(\sigma >1\) in the last 30 daily returns. We also remove cryptocurrencies with missing data or that were no longer traded during the estimation period.
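The data selection rules above can be sketched as a single filter. This is an illustration under stated assumptions: returns are held in a dictionary keyed by a hypothetical asset name, each list ends at \(t_w\), missing observations are encoded as NaN, and \(\sigma\) is computed as the sample standard deviation (the text does not say whether the sample or population version is used).

```python
import math
import statistics

def eligible_assets(returns_by_asset, window=730, last=30, max_sigma=1.0):
    """Keep assets with a complete estimation window and calm last-month returns."""
    kept = []
    for name, series in returns_by_asset.items():
        est = series[-window:]  # estimation window ending at t_w
        if len(est) < window or not all(math.isfinite(r) for r in est):
            continue  # missing data, or not traded during the whole window
        if statistics.stdev(est[-last:]) > max_sigma:
            continue  # risk filter: extreme volatility in the last `last` returns
        kept.append(name)
    return kept

# Tiny example with a shortened window (5 days) and last-month span (3 days)
data = {
    "calm": [0.01, -0.02, 0.0, 0.03, -0.01],
    "wild": [0.0, 0.0, 3.0, -0.9, 4.0],
    "short": [0.01, 0.02],
    "gappy": [0.01, float("nan"), 0.0, 0.02, 0.01],
}
print(eligible_assets(data, window=5, last=3))
```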

Market segmentation

First, we describe each cryptocurrency in our dataset using the standard deviation and the average (\(\sigma , \mu\)) of the daily returns (Eq. 1) computed along the estimation window. This representation is used later for the automatic selection of the cluster according to the investment strategy, and is also consistent with the MV portfolio optimization. In addition, it succinctly summarizes the profitability and volatility of each asset and has been successfully used for clustering cryptocurrencies (Lorenzo and Arroyo 2022). While more sophisticated representations are considered in the previous reference, the (\(\sigma , \mu\)) variables make a faster computation possible, which is crucial due to the intensive calculations of the simulations.

For market segmentation, we use a partitional prototype-based clustering algorithm. We need the algorithm to produce a partition of disjoint subsets of assets (cryptocurrencies in our case) and we need a prototype representing each subset for some of the investment strategies explained below.

In clustering, the prototype represents the cluster elements optimally, that is, it typically minimizes the total distance between all the cluster objects and itself. The process is usually (Henning et al. 2016) formalized as the minimization of \(S({\mathcal {D}}, m_1, ..., m_K)\) by choice of the prototypes \(m_1,...,m_K\), as follows:

$$\begin{aligned}&S({\mathcal {D}}, m_1, ..., m_K)=\sum _{i=1}^{n}d(x_i,m_{c(i)}), \ \text {where} \\&c(i) = \underset{j\in \{1,...,K\}}{\arg \min }\ \ d(x_i , m_j), \quad i=1,...,n \\ \end{aligned}$$

where n in Eq. 3 is the total number of objects in the space \({\mathcal {D}}\), d is the dissimilarity measure function, and K is the number of clusters. The prototypes \(m_1,...,m_K\) may be required to be objects in \({\mathcal {D}}\). d may be the given distance between the observation \(x_i\) and the cluster centroids or prototypes \(m_j\).
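The objective in Eq. 3 can be made concrete with a short Python sketch. The helper names `assign` and `total_dissimilarity` and the (\(\sigma , \mu\)) values are hypothetical, and the Euclidean distance is used as the dissimilarity d.

```python
import math

def assign(points, prototypes):
    """c(i): index of the nearest prototype for every object x_i."""
    return [min(range(len(prototypes)), key=lambda j: math.dist(x, prototypes[j]))
            for x in points]

def total_dissimilarity(points, prototypes):
    """S(D, m_1, ..., m_K): sum of distances from each object to its nearest prototype."""
    return sum(min(math.dist(x, m) for m in prototypes) for x in points)

# Hypothetical (sigma, mu) descriptions of five assets.
points = [(0.05, 0.001), (0.06, 0.002), (0.07, 0.001), (0.30, 0.010), (0.32, 0.012)]

good = [points[1], points[3]]   # one prototype per visible group
bad = [points[0], points[1]]    # both prototypes in the low-volatility group

labels = assign(points, good)
print(labels, total_dissimilarity(points, good), total_dissimilarity(points, bad))
```

A prototype set that covers both groups yields a much smaller value of S, which is what the minimization rewards.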

We use the Euclidean Distance (ED), written in squared form as \(d^2({{\textbf {x}}}_i, {{\textbf {x}}}_j) = \Vert {{\textbf {x}}}_i - {{\textbf {x}}}_j \Vert ^2\), where \(\Vert \cdot \Vert\) is the Euclidean norm in \({\mathbb {R}}^n\), because it is both simple and meaningful when the resulting space meets the appropriate mathematical properties. Furthermore, it has also been used in similar applications for prototype-based clustering of cryptocurrencies represented by the average return and volatility (Lorenzo and Arroyo 2022; Mattera et al. 2021; Nguyen Cong et al. 2014), as in our case.

In some clustering algorithms, for example, in the K-means, the prototype is the mean of the objects. However, we want the prototype to be an observed object, so we use a K-medoids or Partition Around Medoids (PAM) algorithm.

Owing to the intensive use of the clustering algorithm in our simulations, we use a computationally-efficient version of the PAM algorithm called CLARA (Clustering LARge Applications). The difference is that CLARA uses only random samples of the dataset (instead of the entire dataset) to compute the medoids. However, it is important to note that the resulting partition includes all the elements of the dataset.

The CLARA algorithm belongs to the family of prototype-based clustering algorithms (Kaufman 1986; Kaufman and Rousseeuw 1990). It is a PAM algorithm adapted to large datasets. Readers interested in an in-depth analysis of the PAM-CLARA/CLARANS algorithm can refer to the work by Schubert and Rousseeuw (2019).
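The sampling idea behind CLARA can be sketched as follows. This is a simplified illustration, not the R implementation used in the paper: the medoids on each sample are found with a plain alternating k-medoids loop rather than PAM's build and swap phases, and the names `clara`, `k_medoids`, `n_samples` and the synthetic data are assumptions.

```python
import math
import random

def nearest(x, medoids):
    return min(range(len(medoids)), key=lambda j: math.dist(x, medoids[j]))

def cost(points, medoids):
    return sum(math.dist(x, medoids[nearest(x, medoids)]) for x in points)

def k_medoids(sample, k, rng, max_iter=50):
    """Alternating k-medoids on a small sample (a stand-in for PAM)."""
    medoids = rng.sample(sample, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for x in sample:
            clusters[nearest(x, medoids)].append(x)
        # New medoid: the observed member with the smallest total distance to its cluster.
        new = [min(c, key=lambda cand: sum(math.dist(cand, y) for y in c)) if c else m
               for c, m in zip(clusters, medoids)]
        if new == medoids:
            break
        medoids = new
    return medoids

def clara(points, k, sampsize, n_samples=5, seed=0):
    rng = random.Random(seed)
    best = None
    for _ in range(n_samples):
        sample = rng.sample(points, min(sampsize, len(points)))
        medoids = k_medoids(sample, k, rng)
        c = cost(points, medoids)          # medoids are scored on the full dataset
        if best is None or c < best[0]:
            best = (c, medoids)
    medoids = best[1]
    # The final partition covers every element, not only the sampled ones.
    return medoids, [nearest(x, medoids) for x in points]

# Synthetic (sigma, mu) points in two groups (illustrative values only).
rng = random.Random(1)
low = [(rng.uniform(0.02, 0.08), rng.uniform(0.000, 0.004)) for _ in range(100)]
high = [(rng.uniform(0.40, 0.60), rng.uniform(0.010, 0.020)) for _ in range(100)]
points = low + high
medoids, labels = clara(points, k=2, sampsize=50)
print(medoids)
```

Because the medoids are observed points, each one can be read directly as the (\(\sigma , \mu\)) profile of a real cryptocurrency.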

In the CLARA algorithm, we have to determine three parameters: the size of the random sample sampsize, the minimum cluster cardinality, and the number of clusters K.

In our case, sampsize varies in each execution from 50 to 100, so 50 partitions are generated. These are the aforementioned executions of the clustering algorithm. In each execution, the resampling changes, which slightly changes the outcome of the clustering.

For each sampsize iteration, we save all the clusters with a cardinality higher than 10 ensuring that the cluster is sufficiently large to run the portfolio allocation algorithms efficiently.

There has been extensive research on cluster detection and evaluation (Kou et al. 2014; Li et al. 2021). In this case, we rely on an automatic process that consists of computing different Cluster Validity Indices (CVIs) for crisp partitions (Arbelaitz et al. 2013), including Silhouette, Dunn, COP, Davies-Bouldin, Calinski-Harabasz, or the score function, and then applying the majority rule to select the number of clusters K that is best according to the largest number of CVIs.
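The majority rule can be sketched in a few lines of Python. The per-index choices of K below are made-up values, and the tie-breaking rule (prefer the smallest K) is an assumption, since the text does not say how ties are resolved.

```python
from collections import Counter

def majority_k(best_k_by_cvi):
    """Pick the K preferred by the largest number of validity indices."""
    votes = Counter(best_k_by_cvi.values())
    top = max(votes.values())
    # Assumption: ties are broken in favour of the smallest K.
    return min(k for k, v in votes.items() if v == top)

# Hypothetical best K reported by each CVI for one partition run.
best_k_by_cvi = {
    "Silhouette": 4,
    "Dunn": 3,
    "COP": 4,
    "Davies-Bouldin": 4,
    "Calinski-Harabasz": 5,
    "Score function": 3,
}
chosen_k = majority_k(best_k_by_cvi)
print(chosen_k)
```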

The outcomes of clustering algorithms for different runs are groups or clusters of cryptocurrencies, each represented by a prototype. In the next subsection, we explain how we select the cluster that is used to optimize our portfolio.

Prototype selection strategies

Our method proposes clustering as a way to partition the cryptocurrency market according to the financial behavior of the cryptocurrencies. Once we have the partitions, we need a criterion to select the portion of the market that most interests us. At this point, the medoids or prototypes of the clusters will be the inputs for a simple heuristic algorithm of cluster selection according to the different investing strategies. Only clusters with cardinality equal to or higher than 10 are considered interesting for the investor. These cardinality criteria are applied to ensure that the portfolio allocation algorithms work more efficiently. The strategies that are described next are summarized in Table 2.

Table 2 Strategies for prototype selection

Strategy 1: Sharpe ratio. The Sharpe ratio is the average return in excess of the risk-free rate per unit of volatility (total risk). The ratio relates the risk of the investment to its return over that of an investment with zero risk:

$$\begin{aligned} SR_c =\frac{r_P - r_f}{\sigma _P}, \end{aligned}$$

where \(r_P\) in Eq. 4 is the portfolio return, \(r_f\) is the risk-free rate and \(\sigma _P\) is the portfolio risk (the standard deviation or volatility of the portfolio). As the \(r_f\) reference, we consider the daily rate of the annualized 90-day T-Bill obtained from Federal Reserve Economic Data (FRED), hosted by the Federal Reserve Bank of St. Louis. The greater the value of the Sharpe ratio, the more attractive the risk-adjusted return of the portfolio.

In this strategy, we compute the annualized Sharpe ratio of the MV portfolios considering the cryptocurrencies during the estimation window. This strategy is represented with the Sharpe ratio (SR) label in the following tables and charts.
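A minimal Python sketch of Eq. 4 follows. The annualization by \(\sqrt{365}\) is an assumption (crypto markets trade every day; the text does not state the scaling used), and the function names and return values are illustrative.

```python
import math
import statistics

def sharpe_ratio(returns, risk_free=0.0):
    """Eq. 4 on a return series: mean excess return over the sample volatility."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(returns)

def annualized_sharpe(daily_returns, daily_risk_free=0.0, periods_per_year=365):
    # Assumption: square-root-of-time scaling with 365 trading days per year.
    return sharpe_ratio(daily_returns, daily_risk_free) * math.sqrt(periods_per_year)

daily = [0.02, -0.01, 0.03, 0.00]       # illustrative daily portfolio returns
sr = sharpe_ratio(daily)
sr_ann = annualized_sharpe(daily, daily_risk_free=0.0001)
print(sr, sr_ann)
```

Under Strategy 1, a quantity of this kind would be computed per candidate cluster, and the cluster with the highest value would be retained.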

Strategy 2: Prototype selection by a risk-aversion criterion. First, we compute the volatility (\(\sigma\)) of all cryptocurrencies traded in the estimation window and the quartiles of its distribution, which serve as a reference for the volatility of the period and allow us to classify the cluster prototypes. Accordingly, we consider three different risk-aversion profiles for investors following (Goetzmann et al. 2014). In particular, the profiles are:

  1. 1.

    Strategy 2a represents the utility function of a risk-averse investor as it chooses the prototypes whose volatility is within the \(1^{st}\) quartile. Represented by the Low-Risk (LR) label in the following tables and charts.

  2. 2.

    Strategy 2b represents the utility function of a risk-neutral investor as it chooses the prototypes that are between the \(2^{nd}\) and the \(3^{rd}\) quartile of the volatility distribution. Represented by the Mean-Risk (MR) label.

  3. 3.

    Strategy 2c represents the utility function of a risk-seeking investor as it chooses the prototype over the \(3^{rd}\) quartile of the volatility distribution. Represented by the High-Risk (HR) label.

Since the prototype is described by two variables of average return and volatility, if more than one prototype is selected, we choose the one with the highest average return.
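The selection rule of Strategy 2 can be sketched as follows. The quartile convention (the default of Python's `statistics.quantiles`), the reading of "within the \(1^{st}\) quartile" as \(\sigma \le Q_1\), and all numeric values are assumptions for illustration.

```python
import statistics

def select_prototype(prototypes, market_vols, profile):
    """prototypes: list of (sigma, mu) medoids; profile: 'LR', 'MR' or 'HR'."""
    q1, q2, q3 = statistics.quantiles(market_vols, n=4)
    bands = {
        "LR": lambda s: s <= q1,         # risk-averse investor
        "MR": lambda s: q2 <= s <= q3,   # risk-neutral investor
        "HR": lambda s: s > q3,          # risk-seeking investor
    }
    eligible = [p for p in prototypes if bands[profile](p[0])]
    if not eligible:
        return None                      # no suitable cluster: no investment
    # Tie-break: among eligible prototypes, take the highest average return.
    return max(eligible, key=lambda p: p[1])

# Hypothetical market volatilities and cluster prototypes.
market_vols = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
prototypes = [(0.15, 0.002), (0.20, 0.004), (0.60, 0.010), (0.85, 0.030)]

chosen = {p: select_prototype(prototypes, market_vols, p) for p in ("LR", "MR", "HR")}
print(chosen)
```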

Portfolio allocation

In our proposal, once we select the most suitable partition according to our strategy, we run portfolio allocation using the well-known MV optimization. One of its drawbacks is its tendency to maximize the effects of errors in the input assumptions on return and volatility estimations. This means that small changes in the expected returns or the computed covariance matrix can produce very different results.

Mean-variance model assumptions

We summarize some of its main assumptions below for a single-period MV model (Steinbach 2001):

  1. 1.

    The existence of the first two moments, the expectation (\(\bar{{{\textbf {r}}}}\)) and the covariance matrix (\(\Sigma\)), where the apostrophe (\('\)) denotes the transpose:

    $$\begin{aligned} \bar{{{\textbf {r}}}}:={\textbf{E}}({{\textbf {r}}}); \quad \Sigma :={\textbf{E}}[({{\textbf {r}}}-\bar{{{\textbf {r}}}})({{\textbf {r}}}-\bar{{{\textbf {r}}}})']={\textbf{E}}[{{\textbf {r}}}{{\textbf {r}}}']-\bar{{{\textbf {r}}}}\bar{{{\textbf {r}}}}' \end{aligned}$$
  2. 2.

    The returns (r) are assumed to follow a normal distribution

  3. 3.

    The investors have MV preferences and thus ignore skewness

where r is a return vector. We also consider the following definitions:

  • Definition 1.1 (reward). The reward (\(\gamma\)) of a portfolio is the mean of its returns (r) weighted by the corresponding weights (w)

    $$\begin{aligned} \gamma ({{\textbf {w}}}) := {\textbf{E}}({{\textbf {r}}}'{{\textbf {w}}})=\bar{{{\textbf {r}}}}' {{\textbf {w}}} \end{aligned}$$
  • Definition 1.2 (risk). The risk (R) of a portfolio is the variance of the returns

    $$\begin{aligned}{}&R({{\textbf {w}}}):=\sigma ^2({{\textbf {r}}}'{{\textbf {w}}})\\&={\textbf{E}}[({{\textbf {r}}}'{{\textbf {w}}}-{\textbf{E}}({{\textbf {r}}}'{{\textbf {w}}}))^2]\\&={\textbf{E}}[{{\textbf {w}}}'({{\textbf {r}}}-\bar{{{\textbf {r}}}})({{\textbf {r}}}-\bar{{{\textbf {r}}}})'{{\textbf {w}}}]\\&={{\textbf {w}}}' \Sigma {{\textbf {w}}}\end{aligned}$$
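The two definitions can be checked numerically. The sketch below (helper names and return values are illustrative) computes \(\bar{{{\textbf {r}}}}'{{\textbf {w}}}\) and \({{\textbf {w}}}'\Sigma {{\textbf {w}}}\) from a sample covariance matrix and compares them with the mean and variance of the weighted return series itself.

```python
import statistics

def reward(w, mean_returns):
    """gamma(w) = r_bar' w"""
    return sum(wi * ri for wi, ri in zip(w, mean_returns))

def risk(w, cov):
    """R(w) = w' Sigma w"""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))

def covariance_matrix(series):
    """Sample covariance matrix of a list of equally long return series."""
    means = [statistics.mean(s) for s in series]
    t = len(series[0])
    return [[sum((a[k] - ma) * (b[k] - mb) for k in range(t)) / (t - 1)
             for b, mb in zip(series, means)]
            for a, ma in zip(series, means)]

# Two illustrative assets observed over four periods.
returns = [[0.01, -0.02, 0.03, 0.00], [0.02, 0.01, -0.01, 0.04]]
w = [0.4, 0.6]

gamma = reward(w, [statistics.mean(s) for s in returns])
R = risk(w, covariance_matrix(returns))
portfolio = [sum(wi * s[k] for wi, s in zip(w, returns)) for k in range(4)]
print(gamma, statistics.mean(portfolio), R, statistics.variance(portfolio))
```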

MV models require a risk measure, computed as a covariance matrix (see Eq. 7), that must be inverted at some point in the optimization process. This requires certain properties of the matrix; otherwise, it may not be invertible and the solution obtained may carry too much error. The presence of noise in the series (Pafka and Kondor 2003) and the requirements of the covariance estimator itself force us to take care when optimizing the portfolio (Ledoit and Wolf 2003). In general, the sample covariance matrix is considered suitable for applications where its inverse is not required (Gatheral 2008), which is a problem for risk-based portfolio selection because the noise is amplified when the matrix is inverted (Ledoit and Wolf 2004). Moreover, since the return matrix itself contains noise, the sample covariance matrix may not estimate the true covariance matrix well. An understanding of the estimation error of the covariance is therefore important if we want to ensure better out-of-sample performance of the optimization model. The problem arises because the covariance matrix is calculated over a finite window of length T, the number of observations in each time series, which inevitably introduces noise (measurement error) in the estimator itself; this effect is greater as T approaches the value of N, the number of time series. This must be taken into consideration by applying appropriate estimators (we use the cov.trob function in the R MASS package (Venables and Ripley 2002)), given the narrowness of the time window (T) and the high number of cryptocurrencies (N). From matrix theory, the condition number of a matrix A provides a measure of the sensitivity of the solution x of the system \(Ax=b\) to perturbations in b. In many situations with time series, the matrix is ill-conditioned because \(N>T\). Even when \(T > N\), the eigenstructure tends to be systematically distorted unless \(T \gg N\), resulting in a numerically ill-conditioned estimator for \(\Sigma\). From a different perspective, a strong correlation between some time series corresponds to a rank deficiency as well as to non-unique solutions in MV optimization. Hence, for the stability of the solution of a risk-based model, we need invertible and well-conditioned covariance matrices.
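The link between strong correlation and ill-conditioning can be illustrated with the smallest possible case, a \(2 \times 2\) covariance matrix, whose eigenvalues have a closed form. The function name and the variances used are assumptions chosen for the example.

```python
import math

def condition_number_2x2(var_a, var_b, rho):
    """Ratio of largest to smallest eigenvalue of a 2x2 covariance matrix."""
    cov_ab = rho * math.sqrt(var_a * var_b)
    half_trace = (var_a + var_b) / 2
    radius = math.sqrt(((var_a - var_b) / 2) ** 2 + cov_ab ** 2)
    lam_max, lam_min = half_trace + radius, half_trace - radius
    return lam_max / lam_min   # undefined when rho = +/-1 (singular matrix)

# Two assets with equal variance and increasing correlation.
for rho in (0.0, 0.9, 0.999):
    print(rho, condition_number_2x2(0.04, 0.04, rho))
```

With equal variances the condition number is \((1+\rho )/(1-\rho )\), so it grows without bound as the two series become perfectly correlated and the matrix approaches rank deficiency.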

We expect to enhance the performance of the portfolios through a better estimation of the covariance matrices, achieved by reducing the cryptocurrency space to the partitions selected according to investor goals.

Mean-variance (MV) model

The optimization goal for the MV model is to determine the best trade-off between return and risk, subject to a set of constraints, assuming that the investor knows the value of the expected return vector \(\mu\) and covariance matrix \(\Sigma\). Rational investors always pursue the lowest risk under a specific expected return or the highest return under a particular risk. The risk measure developed by Markowitz is an asset-weighted covariance matrix, \({{\textbf {w}}}' \Sigma {{\textbf {w}}}\), where \(\Sigma\) in Eq. 8 is the covariance matrix and w is the portfolio weights vector. The optimization solution is obtained by setting a target portfolio return \({\bar{r}}\) that discounts transaction costs, in line with the model proposed by Wang et al. (2014) but considering, in our case, a fixed amount per portfolio, and by allowing only long positions under fully-invested conditions with minimum and maximum holding sizes \(\omega \in [0.001, 0.5]\), such that:

$$\begin{aligned} \begin{aligned} \underset{{{\textbf {w}}}}{\text {minimize}}\quad&{{\textbf {w}}}'\Sigma {{\textbf {w}}} \quad \text {(covariance risk)}\\ \text {subject to} \quad&{{\textbf {w}}}' {\hat{\mu }} \ge {\bar{\mu }}_{min} + \gamma \quad \text {(target return)}\\ \quad&{{\textbf {w}}}' {{\textbf {1}}} + x_f = 1 \quad \text {(full investment)}\\ \quad&{{\textbf {w}}}_{min} \le {{\textbf {w}}}\le {{\textbf {w}}}_{max} \quad \text {(holding sizes for long-only positions)} \end{aligned} \end{aligned}$$

where \({\hat{\mu }}\) is the estimated mean return vector of the cryptocurrencies computed on historical return values and \({\bar{\mu }}_{min}\) is the target required return. After portfolio optimization, we invest the remaining budget not allocated to cryptos due to portfolio weight constraints in a risk-free security (\(x_f\)).

The vectors \({{\textbf {w}}}_{min}\) and \({{\textbf {w}}}_{max}\) in Eq. 8 are the lower and upper bounds of the holding positions, where \({{\textbf {w}}}=[\omega _0 , \omega _1, ..., \omega _N]'\). Under a basic constraint, the weights for allocated assets in the portfolio model Eq. (8) lie between 0 and 1 (long positions), and they sum up to 1 (fully invested portfolio). The term \(\gamma\) in Eq. 8 is the Transaction Cost (TC) as defined in Eq. (9).

In the presence of market friction, there are transaction costs paid by the investor to trade on the market. For the MV model, transaction costs are computed as follows,

$$\begin{aligned} TC({{\textbf {w}}})=\sum _{i=0}^{N}C_i(w_i) \equiv \gamma , \end{aligned}$$

where N is the number of cryptocurrencies allocated into the portfolio and \(C_i\) is the cost (C) of trading asset i, expressed in basis points (\(1 \, bps = 1 / 100 \% = 1 / 10000\)). In our case, for the computation of Eq. 9, we simply consider 5 bps per portfolio, which turns Eq. 9 into a constant \(\gamma\).

The following is an explanation of the key terms in Eq. 8:

  • The targeted portfolio return (\({\bar{\mu }}_{min}\)) is computed based on the MAR of the last 30 days of historical returns for the \(H \times n\) matrix \({{\textbf {R}}}\), where n is the variable number of crypto assets allocated into the portfolio during the holding period H. The assumption is that the market will behave during the holding period at least as well as in the last 30 days of the estimation window

    $$\begin{aligned} {\bar{\mu }}_{min} = {\mathbb {E}}({{\textbf {R}}}_w)_{30d}= {{\textbf {w}}}' {\mathbb {E}}({{\textbf {R}}})_{30d} = ({{\textbf {w}}}' \mu )_{30d} \end{aligned}$$
  • The portfolio variance (\(\sigma _w^2\)) for the objective function

    $$\begin{aligned} \sigma _w^2 = Var({{\textbf {R}}}_w)=\sum _{i,j}Cov({{\textbf {r}}}_i,{{\textbf {r}}}_j)w_i w_j = {{\textbf {w}}}' \Sigma {{\textbf {w}}} \end{aligned}$$

A Quadratic Program (QP) is an optimization problem whose objective is to minimize or maximize a quadratic function subject to a finite set of linear equality and inequality constraints. QP models are applied to solve many problems, including most Markowitz mean-variance models (Cornuejols and Tütüncü 2006), where \(\Sigma\) is part of the objective function as in Eq. 8. The R package selected to solve the quadratic programming problem is quadprog, which implements the dual method of Goldfarb and Idnani (1983).
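To make the structure of Eq. 8 tangible, the following Python sketch solves a two-asset instance by brute-force grid search. This is deliberately not the Goldfarb and Idnani dual method used by quadprog; it only enumerates feasible weights. The function name, the grid resolution, and all input values are assumptions.

```python
def solve_mv_grid(mu, cov, target, step_thousandths=2):
    """Minimize w' Sigma w for two assets over a grid of weights in [0.001, 0.5]."""
    grid = [k / 1000 for k in range(1, 501, step_thousandths)]
    best = None
    for w1 in grid:
        for w2 in grid:
            # Target-return constraint: w' mu_hat >= mu_min + gamma.
            if w1 * mu[0] + w2 * mu[1] < target:
                continue
            var = w1 * w1 * cov[0][0] + 2 * w1 * w2 * cov[0][1] + w2 * w2 * cov[1][1]
            if best is None or var < best[0]:
                best = (var, w1, w2)
    if best is None:
        return None                      # infeasible: no investment is made
    var, w1, w2 = best
    # Full-investment constraint: the remainder goes to the risk-free asset x_f.
    return {"w": (w1, w2), "x_f": 1 - w1 - w2, "variance": var}

mu_hat = (0.002, 0.004)                  # illustrative estimated mean returns
sigma = [[0.0025, 0.0005], [0.0005, 0.0100]]
mu_min, gamma = 0.001, 0.0005            # gamma: 5 bps transaction cost

solution = solve_mv_grid(mu_hat, sigma, mu_min + gamma)
infeasible = solve_mv_grid(mu_hat, sigma, 0.01)
print(solution, infeasible)
```

When the target return cannot be reached within the holding bounds, the function returns `None`, which mirrors the case in which no investment is made for that period.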

The optimal solution to Eq. 8 is a weight vector \({{\textbf {w}}}\) that produces the optimal portfolio financial return (\(r_p\)) at time t when applied to the allocated cryptocurrencies:

$$\begin{aligned} r_{p_t}=\sum _{i=1}^{n}w_i \cdot R_{t_i}={{\textbf {w}}}' {{\textbf {r}}}_t \end{aligned}$$

where r is an \(n \times 1\) vector of crypto-asset returns for the n crypto assets allocated to the portfolio, and \(w_i\) is the weight of crypto-asset i with return \(r_i\). In our case, considering the aforementioned maximum and minimum holding sizes, we only consider those cryptocurrencies with weights strictly higher than 0.001 and lower than 0.5, and allocate the remainder to a risk-free asset until 100% of the investment is complete. In this way, we aim to obtain portfolios with a reasonable cardinality.

The cumulative portfolio return (Rp) at the end of the holding period of H days, applying Eq. 12, is

$$Rp= \prod _{t=1}^{H} (1 + r_{p_t}) - 1$$

We evaluate the performance of the investment methods using different return indicators based on Eq. 13 throughout the investing period. As profit indicators, we use the arithmetic cumulative return (\(r_A=\sum _{t=1}^{T}Rp\)) and the geometric compounding return (\(r_G = ( \prod _{t=1}^{T}(1+Rp) ) - 1\)); the arithmetic average return per period (e.g., one year) (\({\bar{r}}_A=\frac{1}{T}\sum _{t=1}^{T}Rp\)) and the geometric average return (\({\bar{r}}_G = ( \prod _{t=1}^{T}(1+Rp) )^{\frac{1}{T}} - 1\)); and the compound annual return or annualized return (\(r^{ann}_G = ( \prod _{t=1}^{T}(1+Rp) )^{\frac{n}{T}} - 1\)) and the annualized arithmetic average return (\(r^{ann}_A=\frac{n}{T}\sum _{t=1}^{T}Rp\)). In all cases, Rp is computed based on Eq. 13, T is the number of periods under analysis and n is the number of periods within the year (monthly \(n=12\)). Arithmetic ratios reflect additive relationships and geometric ones reflect compounding relationships. Compounding rates apply to investors that reallocate the funds obtained after one investment period into the next one.
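The return indicators above can be computed with a few lines of Python. The function names, the dictionary keys, and the three monthly returns are illustrative assumptions.

```python
import math

def cumulative_return(daily_portfolio_returns):
    """Eq. 13: compounded return over one holding period."""
    return math.prod(1 + r for r in daily_portfolio_returns) - 1

def return_indicators(period_returns, periods_per_year=12):
    """Arithmetic and geometric indicators over T holding-period returns Rp."""
    T = len(period_returns)
    total = sum(period_returns)
    growth = math.prod(1 + r for r in period_returns)
    return {
        "r_A": total,                                   # arithmetic cumulative
        "r_G": growth - 1,                              # geometric compounding
        "mean_A": total / T,                            # arithmetic average
        "mean_G": growth ** (1 / T) - 1,                # geometric average
        "ann_G": growth ** (periods_per_year / T) - 1,  # compound annual return
        "ann_A": periods_per_year / T * total,          # annualized arithmetic
    }

monthly_rp = [0.10, -0.05, 0.20]     # illustrative monthly Rp values
stats = return_indicators(monthly_rp)
print(stats)
```

For these three months the arithmetic cumulative return is 0.25 while the compounded one is 0.254, which shows the difference between the additive and compounding views.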

Benchmarking of different portfolio allocation models

We benchmark the proposed investing strategies against other portfolio allocation models that are well established in the financial literature, applied to the whole market.

  • Mean-Variance (MV) (Markowitz 1952a, 1959).

  • Hierarchical Risk Parity (HRP) (De Prado 2016).

  • Risk Parity Portfolio (RPP) (Roncalli 2013).

In addition, we also compare the performance of the methods against investing in the market index CCI30, a weighted market cap index launched on January 1, 2017. This price index is a weighted average of the 30 largest cryptocurrencies by market capitalization, and it is a good representative of the market’s overall growth and daily and long-term movement. We briefly present the HRP and the RPP models that are risk-based portfolio methods below.

Risk Parity Portfolio (RPP). A key concept in the RPP model is the Marginal Risk Contribution (MRC), defined as follows:

$$\begin{aligned} MRC_i = \frac{\partial \sigma _P}{\partial \omega _i} = \frac{\omega _i \sigma _i^2 + \sum _{j \ne i} \omega _j \sigma _{i,j}}{\sigma _P} \end{aligned}$$

where \(\omega _i\) in Eq. 14 is the weight of the cryptocurrency i, \(\sigma _P\) is the portfolio volatility, and \(\sigma _{i,j}\) is the covariance between crypto i and j. The Total Risk Contribution (TRC) of the ith cryptocurrency to portfolio risk is

$$\begin{aligned} TRC_i = \sigma _{P,i} = \omega _i\frac{\partial \sigma _P}{\partial \omega _i} \end{aligned}$$

Hence, the portfolio risk is computed as follows:

$$\begin{aligned} \sigma _P = \sum _{i=1}^{N}\sigma _{P,i} = \sum _{i=1}^{N} \omega _i \frac{\partial \sigma _P}{\partial \omega _i} = \sum _{i=1}^{N}TRC_i \end{aligned}$$
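The decomposition in Eqs. 14 to 16 can be verified numerically. In the sketch below the function names, the covariance matrix, and the weights are illustrative; the code is not taken from the riskParityPortfolio package.

```python
import math

def portfolio_volatility(w, cov):
    n = len(w)
    return math.sqrt(sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n)))

def marginal_risk_contributions(w, cov):
    """MRC_i = (Sigma w)_i / sigma_P, which expands to Eq. 14."""
    sigma_p = portfolio_volatility(w, cov)
    n = len(w)
    return [sum(cov[i][j] * w[j] for j in range(n)) / sigma_p for i in range(n)]

def total_risk_contributions(w, cov):
    """TRC_i = w_i * MRC_i (Eq. 15)."""
    return [wi * m for wi, m in zip(w, marginal_risk_contributions(w, cov))]

cov = [[0.04, 0.006], [0.006, 0.09]]   # illustrative covariance matrix
w = [0.6, 0.4]

trc = total_risk_contributions(w, cov)
sigma_p = portfolio_volatility(w, cov)
print(trc, sum(trc), sigma_p)
```

The total risk contributions add up to the portfolio volatility, as stated in Eq. 16. A risk parity portfolio is one whose weights make these contributions equal across assets.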

For RPP model implementation, we apply the R package riskParityPortfolio (Cardoso and Palomar 2021; Feng and Palomar 2015).

Hierarchical Risk Parity (HRP). This model merges hierarchical clustering and portfolio allocation procedures and is based on three main steps:

  • Step 1: It determines the hierarchical relationships between the assets using a recursive cluster formation scheme, the Hierarchical Tree Clustering algorithm. Specifically, the algorithm calculates tree clusters based on the \(T \times N\) matrix of asset returns, where T represents the number of samples for a given time frame and N is the number of assets. The correlation-distance matrix D, where \(\rho (i,j)\) is the correlation between time series i and j, is as follows

    $$\begin{aligned} D(i,j) = \sqrt{0.5 \times (1-\rho (i,j))} \end{aligned}$$

    and D (Eq. 17) is then transformed into \({\hat{D}}\) by taking the ED between all its columns in a pair-wise manner, as follows

    $$\begin{aligned} {\hat{D}}(i,j)=\sqrt{\sum _{k=1}^{N}(D(k,i)-D(k,j))^2} \end{aligned}$$
  • Step 2: Quasi-Diagonalization, which is a seriation algorithm that rearranges the data to show the inherent clusters more clearly. The algorithm rearranges the rows and columns of the covariance matrix of assets so that similar investments are placed together and dissimilar investments are placed apart.

  • Step 3: Recursive bisection is a top-down approach to split portfolio weights between sub-groups obtained by recursively bisecting the rearranged covariance matrix from the second step based on inverse proportion to their aggregated variances.
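The two distance matrices of Step 1 can be sketched in Python as follows. Only Eqs. 17 and 18 are covered; the tree clustering, quasi-diagonalization, and recursive bisection steps are omitted. The function names and the correlation matrix are illustrative assumptions.

```python
import math

def correlation_distance(corr):
    """Eq. 17: D(i, j) = sqrt(0.5 * (1 - rho(i, j)))."""
    n = len(corr)
    return [[math.sqrt(0.5 * (1 - corr[i][j])) for j in range(n)] for i in range(n)]

def column_distance(D):
    """Eq. 18: Euclidean distance between columns i and j of D."""
    n = len(D)
    return [[math.sqrt(sum((D[k][i] - D[k][j]) ** 2 for k in range(n)))
             for j in range(n)] for i in range(n)]

# Illustrative correlation matrix for three assets.
corr = [[1.0, 0.8, -0.2],
        [0.8, 1.0, 0.1],
        [-0.2, 0.1, 1.0]]

D = correlation_distance(corr)
D_hat = column_distance(D)
print(D_hat[0][1], D_hat[0][2])
```

Assets 0 and 1, which are strongly correlated, end up closer in \({\hat{D}}\) than assets 0 and 2, so a hierarchical clustering on \({\hat{D}}\) would merge them first.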

For HRP model implementation, we apply the R functions available in

Performance assessment

In addition to the different return indicators, we apply several portfolio evaluation measures to assess different aspects of the financial performance of the investment, namely:

  • Value-at-risk (VaR) measures the worst expected loss over a given interval under normal market conditions at a given confidence level (the lower the better).

  • Conditional VaR (CVaR), also known as Expected Shortfall or Expected Tail Loss (ETL), is the expected loss beyond the VaR and therefore takes the shape of the tail into account (the lower the better).

  • Maximum drawdown (\(D_{max}\) or MDD) is the greatest percentage fall from peak to valley in the return series (the lower the better). Drawdowns are measured as a percentage of the maximum cumulative return.

  • Annualized Sharpe ratio (\(SR_{ann}\)) is the reward-to-variability ratio already defined by Eq. 4 where, for benchmarking between strategies, \(\sigma _P\) is taken as the standard deviation of the annualized series (the higher the better).

  • Calmar ratio (CAL) is the annualized return over the absolute value of the maximum drawdown of an investment. It is a Sharpe-type measure that uses maximum drawdown rather than standard deviation to reflect the investor’s risk:

    $$\begin{aligned} CR = \frac{r_p - r_T}{D_{max}} \end{aligned}$$

    where \(r_T\) is the minimum target return that we consider equal to zero.

  • Omega ratio (OME) is a weighted risk-return ratio for a given level of expected return set to zero in our case, which helps us to identify the chances of winning in comparison to losing (the higher the better):

    $$\begin{aligned} \Omega = \frac{\frac{1}{n}\sum _{i=1}^{i=n}max(r_i-r_T,0)}{\frac{1}{n}\sum _{i=1}^{i=n}max(r_T-r_i,0)}. \end{aligned}$$
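The measures above can be sketched in Python with empirical (historical) estimators. The choice of estimator, the sign convention (VaR and ETL reported as negative returns), the function names, and the sample returns are assumptions; the paper does not state which estimators it uses.

```python
def tail(returns, alpha):
    """Worst (1 - alpha) share of the returns, at least one observation."""
    ordered = sorted(returns)
    size = max(1, round((1 - alpha) * len(ordered)))
    return ordered[:size]

def value_at_risk(returns, alpha=0.95):
    return tail(returns, alpha)[-1]          # empirical quantile (a negative return)

def expected_tail_loss(returns, alpha=0.95):
    worst = tail(returns, alpha)
    return sum(worst) / len(worst)           # mean of the returns in the tail

def max_drawdown(returns):
    wealth, peak, mdd = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1 + r
        peak = max(peak, wealth)
        mdd = max(mdd, (peak - wealth) / peak)
    return mdd

def calmar_ratio(annual_return, returns, target=0.0):
    # Undefined when there is no drawdown at all.
    return (annual_return - target) / max_drawdown(returns)

def omega_ratio(returns, target=0.0):
    gains = sum(max(r - target, 0.0) for r in returns)
    losses = sum(max(target - r, 0.0) for r in returns)
    return gains / losses                    # undefined when there are no losses

sample = [0.10, -0.20, 0.05, -0.10, 0.15]    # illustrative period returns
var_60 = value_at_risk(sample, alpha=0.6)
etl_60 = expected_tail_loss(sample, alpha=0.6)
mdd = max_drawdown(sample)
print(var_60, etl_60, mdd, calmar_ratio(0.30, sample), omega_ratio(sample))
```

The ETL is never better than the VaR at the same confidence level, because it averages the observations at and beyond the VaR threshold.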


We analyze the performance of the different strategies based on some descriptive statistics and from a financial perspective applied to different cryptocurrency spaces, the full market with 534 cryptocurrencies, and the top 250 and 175 ones according to market capitalization. Additionally, we apply different visual representations to highlight some of our findings. We complement our analysis with an exhaustive study of the outcomes of MC simulations.

Performance of monthly portfolios

Fig. 2 Time series of the daily and cumulative returns for each method for market size 175. Mean-variance (MV), Hierarchical Risk Parity (HRP), Risk Parity Portfolio (RPP), Index CCI30 (IdX), Sharpe Ratio strategy (SR), Low-Risk strategy (LR), Mean-Risk strategy (MR) and High-Risk strategy (HR)

Fig. 3 Time series of the daily and cumulative returns for each method for market size 250

Fig. 4 Time series of the daily and cumulative returns for each method for the whole market (534 cryptos)

Table 3 Descriptive statistics of the cumulative portfolio returns (Rp) for each strategy
Table 4 Financial ratios of daily returns (Rp) for each method

In this section, we compare the investment strategies and benchmark methods by aggregating the daily results of all the investments at day \(t_w\) during the simulation window for each method or strategy. The descriptive analysis is a standard way to illustrate the performance of asset allocation models, where each method is compared with the others. In Table 3, we compute the basic descriptive statistics of the portfolio returns (\(Rp_t\)) obtained every day for each strategy and method, broken down by market size. Figures 2, 3, and 4 represent the portfolio cumulative return (\(r_A\)). In the same way, Table 4 presents the most important ratios for each model, also broken down by market size. From these, we can draw the following main conclusions:

  • Considering the Mean Return values in Table 3, the SR (0.315, 0.300, and 0.931 values) and MR (0.304, 0.339, and 1.296) strategies outperform the others in terms of average values independently of the considered market size. However, as stated in the financial literature, median return value is considered a more representative indicator of the performance of the model when there is a heavy-tail effect on the return distribution. For median values, SR and MR perform slightly worse than RPP for the smaller market size (0.295).

  • There are differences in the result of the benchmark approaches (MV, HRP, and RPP) depending on the market size. The impact on the performance of the models indicates that the advantage of the clustered MV model increases as the market size becomes larger. This supports our hypothesis that covariance misspecification is more likely as market size increases.

  • In terms of the financial ratios in Table 4, we observe a similar result. As we increase the market size, the SR and MR strategies more clearly outperform all of the benchmarks in terms of CumRet, AnnRet, and AnnSR; for instance, the SR strategy outperforms the others in terms of the Sharpe Ratio measure (4.089).

  • The HR strategy exhibits no investments for the lower market sizes (175 and 250), as shown by the zero values in the HR columns of Table 3 and the flat red line in Figs. 2 and 3. This means that the HR strategy for the cluster prototype allocation has not found a suitable cluster in the highest risk quartiles of the market, or that the cardinality of the cluster is lower than 10 (any cluster with fewer than 10 cryptos is not considered by any strategy). HR centroids are only selected when we consider the whole market size.

  • Examining the figures of the Cumulative Returns, the MV strategy outperforms the others in the first half of the investing period for market size 175, as seen in Fig. 2, and only at the very beginning for market size 250, as shown in Fig. 3. However, in these cases, the superiority of the MV is due to one or two very profitable periods that occur on consecutive investment days. In contrast, the MR and SR strategies have more sustained slopes, which suggests better financial behavior. In addition, the MR and SR strategies are better when considering the whole market in Fig. 4.

  • In general, the MV strategy outperforms the others in the risk indicators (MDD, ETL, and VaR) for all market sizes. However, SR and MV present a better trade-off between risk and returns as we can see in the combined indicators (AnnSR, CAL, and OME) for the market size 250 and the whole market.

  • The flat lines on the Cumulative Returns curves for the MV, RPP, and HRP models for the whole market indicate that the allocation models are not working properly when the market size is large. The optimization algorithms probably do not converge to a feasible solution due to covariance misspecification, which causes a zero portfolio return for that holding period.

Financial comparison of the simulations

We now compare the aggregated results of the random investment paths for all the methods considered and the benchmarks in the three market sizes (175, 250, and the whole market).

In Figs. 5, 6, and 7, we present the aggregation graphically by means of the Cumulative Distribution Function of the Annualized Sharpe Ratio. In these figures, we present the distributions of the 1500 values that correspond to the realization of the simulation.

Fig. 5 Cumulative distribution of annualized Sharpe ratio (\(\textsf {SR}_{ann}\)) of the simulations (1500 random investing paths) for 175 higher marketcap cryptos

Fig. 6 Cumulative distribution of annualized Sharpe ratio (\(\textsf {SR}_{ann}\)) of the simulations (1500 random investing paths) for 250 higher marketcap cryptos

Fig. 7 Cumulative distribution of annualized Sharpe ratio (\(\textsf {SR}_{ann}\)) of the simulations (1500 random investing paths) for the whole market

For the smaller market size (175 cryptos), Table 5 and Fig. 5 reveal that, according to the central tendency measures, the RPP method outperforms the rest in terms of annualized return and annualized Sharpe ratio, followed by the SR and MR strategies.

Table 5 Annualized returns (\(r^{ann}_G\)) of the simulations (1500 random investing paths) for the different strategies and market sizes

In terms of drawdown, the classical MV obtains the lowest median value with 0.043, followed by RPP (see Table 6). However, the other methods obtain similar drawdown values between 0.219 and 0.301.

Table 6 Maximum Drawdown (\(D_{max}\)) of the simulations (1500 random investing paths) for the different strategies

In terms of ETL, MV again clearly outperforms the others with an ETL value of -0.039 in Table 7. LR and IdX are the riskiest models (-0.274 and -0.270), with RPP again performing better than the others but worse than the MV model.

Table 7 Conditional VaR (CVaR or ETL) of the simulations (1500 random investing paths) for the different strategies

Regarding the Sharpe ratio, a measure that balances risk and returns, RPP clearly outperforms the other models with a Sharpe ratio value of 2.238 in Table 8. MR and SR models keep a good trade-off between risk and returns with values of 1.637 and 1.601, respectively, which is better than the others except for RPP.

Table 8 Annualized Sharpe ratio (\(SR_{ann}\)) of the simulations (1500 random investing paths) for the different strategies

When the market size is increased from 175 to 250 cryptos, we observe a clear impact on the Annualized Return performance of MR that is improved up to 3.815 (median)), while that of the MV is reduced up to 0.826 (see in Table 5). For the 250 market size, MR outperform the other models, followed by SR and the RPP. We observe no relevant differences in terms of Drawdown when the market size increases from 175 to 250. Similarly, we find no relevant differences for ETL when comparing 175 and 250 market sizes (see Table 7). However, for the Sharpe ratio indicator, MV and MR models clearly improve with values 1.449 of and 1.847, respectively, in Table 8 and Fig. 6 when the market size is greater. The RPP method outperforms the others with a value of 2.062 followed by MR according to the statistics presented.

If we investigate the financial performance for the whole market size, we can see that the annualized returns of the MR and SR strategies increase dramatically, while those of the benchmarking approaches are slightly reduced. The proposed clustered methods seem to profit from those cryptocurrencies with smaller market capitalization, while the benchmarks seem to struggle to compute the estimators correctly when the market size increases. For annualized returns, MR outperforms the other models, reaching a value of 88.360 in Table 5, followed by SR with 40.747. In terms of drawdown, MV outperforms the others with the lowest value, followed by MR, SR and HR. At this point, and referring to drawdown, we have to take into account that for many iterations of MV and RPP the models are not able to converge on the corresponding training window; in those cases the outcome of the model is zero, which explains the median value of zero. The worst drawdown performance corresponds to IdX, HRP, and LR, with median values of 0.281, 0.207, and 0.172, respectively. The riskiest model according to ETL is the IdX benchmark and the least risky is the MV model, with a value of 0.0 in Table 7. In terms of Sharpe ratio, the SR strategy again outperforms the others with a median value of 2.469 in Table 8, followed by MR with a value of 2.078. The worst Sharpe ratio performance corresponds to LR and RPP, with values of 0.680 and 0.979, respectively.
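The effect of non-converging optimizations on the reported medians can be illustrated with a small sketch. It assumes that a path on which the optimizer fails is recorded with a metric of zero, as the paragraph above describes. The `None` marker and the function name are hypothetical.

```python
import statistics


def median_with_failures(path_metrics):
    """Median of per-path metrics, counting non-converged paths (None) as 0.0."""
    filled = [0.0 if m is None else m for m in path_metrics]
    return statistics.median(filled)
```

If more than half of the paths fail, the median collapses to zero regardless of how the converged paths behave. For instance, `median_with_failures([None, None, None, 0.21, 0.30])` is 0.0, whereas the median of the two converged paths alone would be 0.255. A zero median therefore reflects missing portfolios rather than genuinely riskless ones.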

In general, independently of the method or strategy, we observe a change in the performance ratios when we compare the 175/250 market sizes with the whole market in Tables 5, 6, 7, and 8. The portfolio models behave differently depending on the market size, and we see that the standard methods misbehave for the larger market size, where we find room for our proposal. If we consider only the simulation results for \(SR_{ann}\) for the whole market, represented in Fig. 7, we find that the SR strategy outperforms the others, followed by MR.

In terms of the centroid strategies, we observe that the MR and SR strategies outperform the others (LR, HR). Considering returns exclusively, MR outperforms SR for similar maximum drawdown and CVaR. However, in terms of the annualized Sharpe ratio, the SR strategy (median value 2.469) beats all the other strategies and models.

For the HR strategy, the simulations confirm the same results as for the monthly portfolios: there is a lack of centroids for the lower market sizes and therefore no simulations either. In general, when centroids exist, HR performs better than LR.


We propose a methodology that, combined with the assumptions required for a good performance of MV models, allows us to extend the use of portfolio optimization models to the cryptocurrency market regardless of its size and volatility. Its usefulness extends to any portfolio management model that requires a covariance-based measure of market risk, and it is particularly suitable for managing streaming market data flows.

Our methodology proposes a clustering stage to reduce the dimensionality problem. It reveals how the performance of the model can be improved by reducing the number of assets considered and focusing on those that best fit the investor criteria. The results are similar to those obtained by an appropriate feature selection in prediction problems (Kou et al. 2021). Clustering reduces the space where the optimization models work, creating more accurate estimations of the different factors and mitigating possible errors. This methodology can be applied to other financial markets and other portfolio optimization models. It is especially indicated when both a large number of assets and a long time window are considered.
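The pre-selection idea can be sketched as a rule that inspects the cluster prototypes and returns the one matching an investor profile. The rules below are assumptions made for illustration and are not the paper's exact criteria: SR takes the centroid with the highest return-to-volatility ratio, while LR, MR, and HR take a centroid from the lowest quartile, the interquartile band, and the highest quartile of asset volatility, respectively. The function name, the tuple layout, and the tie-break are hypothetical.

```python
import statistics


def select_prototype(centroids, asset_vols, strategy):
    """Pick one cluster prototype for a given investor profile.

    centroids  : list of (mean_return, volatility) tuples, one per cluster
    asset_vols : volatilities of all assets, used to set the quartile bands
    strategy   : "SR", "LR", "MR" or "HR" (the rules below are assumptions)
    Returns the chosen centroid, or None when no centroid qualifies.
    """
    q1, _, q3 = statistics.quantiles(asset_vols, n=4)
    if strategy == "SR":
        # Highest return-to-risk ratio among centroids with positive volatility.
        candidates = [c for c in centroids if c[1] > 0]
        return max(candidates, key=lambda c: c[0] / c[1], default=None)
    bands = {
        "LR": lambda vol: vol <= q1,       # risk-averse: lowest quartile
        "MR": lambda vol: q1 < vol <= q3,  # risk-neutral: interquartile band
        "HR": lambda vol: vol > q3,        # risk-seeking: highest quartile
    }
    if strategy not in bands:
        raise ValueError(f"unknown strategy: {strategy}")
    candidates = [c for c in centroids if bands[strategy](c[1])]
    # Tie-break inside a band by mean return (an arbitrary choice for this sketch).
    return max(candidates, key=lambda c: c[0], default=None)
```

Returning `None` mirrors the situation in which no centroid falls in the requested volatility band, so no portfolio can be built for that period. When a centroid is returned, the assets of its cluster would then be passed to the MV optimizer.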

Based on the results, we draw the following conclusions:

  • First, we present a smart way to use the prototypes from clusters to automate the selection of the most suitable partitioning of the market. The proposed methodology works dynamically with streaming price data of cryptos that, in our case, change on a daily basis, although it can be easily adapted to other data periodicities. In this way, market partitioning and portfolio generation run concurrently and autonomously once we set the criteria for cluster pre-selection based on the risk-aversion profiles of the investors.

  • The range of performance values when applied to cryptocurrency portfolios exceeds any comparison with traditional markets and this becomes more evident as the size of the market itself grows. Cumulative returns, risk, and drawdowns are significantly higher in cryptocurrency markets. For example, the steep upward trend of the cumulative yield curves means it is not comparable to any growth in traditional financial markets.

  • We demonstrate that the performance of the standard Mean-Variance (MV) model applied to the whole market with no partitions is not very different in magnitude from the results of other research (Petukhina et al. 2021; Liu 2019; Culjak et al. 2022), which makes us confident regarding the results.

    In general,

  • Sharpe ratio (Strategy 1) and Mean-Risk (Strategy 2b) outperform all the other strategies in terms of cumulative returns, annualized returns, and annualized Sharpe ratio as the market size increases. At this point, we have to discard High-Risk (Strategy 2c) for standard investors because, most of the time, there are no centroids available in the higher range of volatility, which makes this strategy poorly suited to an investor who frequently needs to take a position in the market.

  • We observe that strategies based on extreme-value centroids (LR and HR) underperform the others over the out-of-sample window. In other words, centroids located in the interquartile range of the distribution for the estimation windows behave much better over the holding periods. The explanation is aligned with the 2nd and 3rd MV assumptions, which means that, in general, risk-based models will perform better when we choose cryptos located in the mean region, closer to the center of a normal distribution.

  • One of the drawbacks of the proposed strategies, and their most evident weakness, is the higher drawdown and risk compared with the classical MV; in this respect they are comparable to the HRP model. We consider that the SR and MR strategies, independently of market size, offer a good trade-off between returns, risk, and drawdown.

As we demonstrate, the results are sensitive to the number of cryptocurrencies considered. The smaller the space of cryptocurrencies with more restrictive thresholds during data pre-processing, the higher the chances to exclude the cryptocurrencies with explosive behavior, and the financial performance indicators that are obtained in these cases are more similar to those of traditional markets.

Finally, we have to consider that not all 534 cryptocurrencies considered in this research can be directly traded on the market. Depending on the selected exchange, some can be traded while others must be found on other exchanges; for instance, we can trade up to 124 crypto assets (we use the terms crypto asset and cryptocurrency interchangeably) on Coinbase (Footnote 6) or 380 crypto assets on Binance (Footnote 7). In other cases, some cryptocurrencies could require some days to reach a consensus (Footnote 8) before being incorporated into the portfolio, which introduces additional market frictions not considered in our model. Liquidity pools offer a solution to such frictions within decentralized finance (DeFi), facilitating the conversion of assets into cash and vice versa through smart contracts. The counterpart is an increase in transaction complexity and associated risk, together with the high turnover of the investors providing that liquidity (Footnote 9). In addition, there are different liquidity approaches depending, for instance, on whether exchanges are centralized or decentralized, with different spread mechanisms that impact the performance of the trading models. Further work could consider adding a liquidity criterion to the Prototype Selection Strategy stage in Fig. 1. For instance, this could be based on a spread measure together with the risk criteria that we have already used in the cryptocurrency pre-selection stage, to improve the performance of the Mean-Variance portfolios under real-life crypto market conditions.
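As an illustration of what a spread-based liquidity criterion might look like, the sketch below filters assets by their quoted relative bid-ask spread before they reach the pre-selection stage. This is purely hypothetical: the paper proposes the idea as further work without specifying a measure, and the function names, the quote format, and the threshold are all assumptions.

```python
def relative_spread(bid, ask):
    """Quoted bid-ask spread as a fraction of the mid price."""
    if bid <= 0 or ask < bid:
        raise ValueError("expected 0 < bid <= ask")
    mid = (bid + ask) / 2.0
    return (ask - bid) / mid


def filter_by_spread(quotes, max_spread):
    """Keep the symbols whose relative spread does not exceed max_spread.

    quotes: dict mapping symbol -> (bid, ask)
    """
    return [
        symbol
        for symbol, (bid, ask) in quotes.items()
        if relative_spread(bid, ask) <= max_spread
    ]
```

With a 1% threshold, a hypothetical asset quoted at 99.9/100.1 would be kept while one quoted at 95/105 would be dropped. Such a filter could run alongside the risk criteria so that illiquid assets never enter the clusters passed to the optimizer.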

Data availability

The datasets generated and/or analysed during the current study are available in the OSF repository,













Autocorrelation-based fuzzy C-means


Compound annual growth rate


Calmar ratio


Clustering large applications


Convolutional neural network


Conditional value at risk


Cluster Validity Index


Directed bubble hierarchical tree


Euclidean distance


Efficient market hypothesis


Equally-weighted risk contribution


Expected shortfall




Generalized autoregressive conditional heteroskedasticity


Hierarchical clustering based asset allocation


High risk strategy, risk-seeking investor strategy


Hierarchical risk parity


Low risk strategy, risk-averse investor strategy


Moving average reversion


Maximum drawdown


Modern portfolio theory


Mean risk strategy, risk-neutral investor strategy


Minimum spanning tree




Mean-variance optimization


Online portfolio selection model


Online portfolio selection model average reversion


Partition around medoids


Planar maximally filtered graph


Quadratic program


Risk parity portfolio


Random investment path


Self-organizing maps


Sharpe ratio or Sharpe ratio strategy


Value at risk



We are very grateful for the selfless efforts of the huge community of R developers on which we relied to develop our research.


This work was supported by the European Union’s H2020 Coordination and Support Actions CA19130 under Grant Agreement Period 2.

Author information

Authors and Affiliations



The initial idea was conceived by LL. The experiments were designed by both authors. The database searches, statistical analysis, and software design were performed by LL. The work was drafted by LL and revised critically by JA. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Luis Lorenzo.

Ethics declarations

Competing interest

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Lorenzo, L., Arroyo, J. Online risk-based portfolio allocation on subsets of crypto assets applying a prototype-based clustering algorithm. Financ Innov 9, 25 (2023).
