Take Bitcoin into your portfolio: a novel ensemble portfolio optimization framework for broad commodity assets
Financial Innovation volume 7, Article number: 63 (2021)
Abstract
The emergence and growing popularity of Bitcoin have attracted the attention of the financial world. However, few empirical studies have considered the inclusion of this newly emerged commodity asset in the global commodity market. It is of great importance for investors and policymakers to take advantage of this asset and its potential benefits by incorporating it as a part of the broad commodity trading portfolio. In this study, we propose a novel ensemble portfolio optimization (NEPO) framework for broad commodity assets, which integrates a hybrid variational mode decomposition-bidirectional long short-term memory deep learning model for forecasting future returns and a reinforcement learning-based model for optimizing the asset weight allocation. Our empirical results indicate that the NEPO framework could effectively improve the prediction accuracy and trend prediction ability across various commodity assets from different sectors. In addition, it could effectively incorporate Bitcoin into the asset pool and achieve better financial performance compared to traditional asset allocation strategies, commodity funds, and indices.
Introduction
It is important to consider the distributions and classes of different assets in investor portfolios in maximizing the asset returns for a given level of volatility (Konno and Yamazaki 1991). Based on the belief that distinctive factors influence the price and risk movements of different assets and that correlation among global financial products is relatively low, investors construct their portfolios by diversifying investment in various assets. Traditionally, investors have mainly focused on stocks. In the past few decades, broad commodity assets have become an increasingly popular investment option, in line with the diversification premium for global commodities (Perold 1984; Bessler and Wolff 2015). Global commodities, such as crude oil, precious metals, and agriculture products, share common drivers for price and volatility movements compared to other assets such as stocks. As a result, broad commodity investment is usually regarded as a natural hedging and diversification strategy, which benefits from different seasonal cycles and supply and demand factors (Geman and Ohana 2008).
Most investments in broad commodity assets typically focus on traditional products, such as crude oil, agricultural products, and precious metals. However, recent years have seen the emergence of a new type of commodity asset, cryptocurrency, which has gained the attention of investors. Bitcoin, the most popular cryptocurrency, has seen a sharp rise in its price from almost zero in 2009 to approximately $60,000 in 2021. Due to its extreme price disturbances observed during the latter half of 2019, Bitcoin has been considered a threat to the stability of the world financial system; however, its unique economic properties have made it an attractive and potentially high-return investment option (das Neves 2020; Jiang et al. 2021). As there is a finite number of Bitcoins in circulation, the ever-decreasing supply of the asset available for buying and selling has driven a growing number of institutional investors to embrace this cryptocurrency as an investment option. For example, Bitcoin’s price resurgence in 2021 was partly fueled by the Wall Street billionaires who publicly supported and invested in the asset.^{Footnote 1} However, due to its high volatility, most investors are hesitant to invest solely in this asset. Instead, many trading firms seek to incorporate it into their portfolio along with other traditional commodities to hedge against its potential volatility risks (Liu and Tsyvinski 2018).
Bitcoin’s commodity properties have been investigated by some studies. For example, using a conditional correlation model, Bouri et al. (2017) suggest that Bitcoin can serve as a safe haven for other major commodities in the global commodity market system. Selmi et al. (2018) use quantile regression to investigate the economic characteristics of Bitcoin, indicating that it is both a hedge and safe haven for oil price movements. As a result, Bitcoin is considered the “new gold” for its safe haven properties, which are similar to those of gold and allow it to serve as a potential hedge or safe haven asset for financial portfolio optimization (Selmi et al. 2018; Symitsi and Chalvatzis 2019). However, few studies investigate the effects and usefulness of a cryptocurrency in portfolio investments. Therefore, this study attempts to close this gap by considering Bitcoin and its diversification properties in the development of a broad commodity portfolio optimization system based on deep learning and reinforcement learning.
The literature has sought to improve portfolio performance using various optimization methods and models. Conventional portfolio optimization models, such as the mean–variance, risk parity, and Black-Litterman models, utilize the historical returns and variances of financial assets to derive the maximized Sharpe ratio or efficient frontier of the portfolio (Kou et al. 2021). However, a potential problem with such approaches lies in the discrepancy between historical and future prices, which may lead to estimation errors, eventually generating non-optimal proportions of the target portfolio (Guastaroba et al. 2009; Tola et al. 2008). To address this issue, algorithmic optimization approaches based on data-driven techniques have been introduced for financial time series data prediction and portfolio decision-making in recent studies (Branke et al. 2009; Lwin et al. 2014). Although improvements have been made, the current algorithmic methods for portfolio optimization face two primary challenges: improving the directional accuracy (DA) of multi-asset return predictions and implementing multi-objective portfolio optimizations.
With the development of computer technologies, deep learning prediction models have been introduced to improve the financial time series data forecasting accuracy. Compared to traditional techniques based on econometrics and machine learning, which are unable to perform well on forecasting multivariate time series data due to noise disturbances (Altan et al. 2019; Galankashi et al. 2020; Jalali and Heidari 2020), deep learning techniques are observed to be more effective. For example, Atsalakis et al. (2019) introduce a novel neuro-fuzzy technique with artificial neural networks (ANN) to forecast the market trends in cryptocurrency prices, which shows an improvement in prediction accuracy compared to traditional prediction methods. Dutta et al. (2020) employ a gated recurrent unit model to predict the price movements of Bitcoin and achieve a better forecasting performance. Long et al. (2019) propose a multi-filter neural network for stock price prediction. Further, Li et al. (2019) develop a crude oil price prediction system based on convolutional neural networks. Among all the deep learning techniques, the recurrent neural network (RNN) models have displayed a superior performance over others in terms of time series prediction accuracy (Duan et al. 2016). Such a superior performance may be attributed to the recurrent feedback layer of RNN models, which allows them to effectively use internal memories to process input data sequentially and produce more precise forecasts (Cao et al. 2012; Anbazhagan and Kumarappan 2012).
However, as most commodity market prices are volatile and nonstationary, the forecasting performances may be negatively affected because of high volatilities. In recent years, a hybrid forecasting approach known as “decomposition and ensemble” has been proposed to improve the prediction accuracy of nonstationary time series with high complexity and irregularity. The decomposition and ensemble approach is based on the principle of “divide and conquer,” which integrates signal decomposition technology, such as empirical mode decomposition, ensemble empirical mode decomposition, or variational mode decomposition (VMD), with machine learning and deep learning models (Yang et al. 2019; Wang et al. 2018). In this approach, the original prediction task is divided into subtasks to simplify the modeling difficulty (Li et al. 2021). Compared to other models, this approach is not bound by strict assumptions such as that of linearity and stationarity, which are imposed on econometric models.
Numerous studies have employed this hybrid forecasting approach and demonstrated its effectiveness in improving the time series forecasting performance (Yu et al. 2015; Wen et al. 2017). Despite the improved prediction performance, there might be a potential problem with the decomposition and ensemble approach. Estimation errors generated while forecasting the individual submodes tend to accumulate during the aggregation process. This accumulation may cause significant discrepancies between the actual and predicted values, which could negatively affect the prediction performance (Zhu et al. 2019). Therefore, we propose a new hybrid deep learning-based forecasting approach in our portfolio optimization framework to mitigate this problem. Unlike previous studies, our forecasting approach eliminates the aggregation step by directly generating the final prediction results using all the intrinsic modes as inputs simultaneously, which can potentially reduce the aggregation errors.
The literature has adopted many portfolio models, such as the cardinality constrained model (Zha et al. 2020), fuzzy selection model (Yue 2019), and Powell approaches (Powell 1964), to address the second challenge of implementing multi-objective portfolio optimizations. Among all the portfolio optimization models, reinforcement learning models are considered to be the most appropriate for financial portfolio optimization. For example, Jangmin et al. (2006) introduce a reinforcement learning framework for asset allocation optimization by using meta policy as a reinforcement learning strategy to optimize stock portfolios, which is designed to incorporate the information obtained from the ratio of the stock fund and stock recommendations. Jeong and Kim (2019) use a Deep Q-Network-based reinforcement learning model to improve the prediction and trading performance of stock markets. Q-values are utilized to analyze which portfolio action strategies are beneficial for improving profits in a confused market. Eilers et al. (2014) develop a novel integrated robust artificial neural network reinforcement learning (ANNRL) model to filter the seasonality of financial assets, where the Sharpe ratio is introduced to act as network rewards in the reinforcement learning process.
However, despite contributing toward improving the market returns and their associated risks, previous reinforcement learning-based portfolio optimization frameworks have two limitations. First, previous models typically consider a discrete action space, implying that there are only a fixed number of portfolio weight allocations from which the model can choose. However, the portfolio weight allocation is a continuous action space in reality, as each asset can be potentially given any weight between 0 and 100%. Therefore, although the portfolio performances have improved, the previously implemented models may have ignored the possibility of other allocations that do not exist in the predetermined action space. To address this limitation and improve the model’s consistency, we develop a portfolio optimization framework based on a deep deterministic policy gradient model (Lillicrap et al. 2015) that can effectively consider the continuous characteristics of the weight allocation action space. Second, the previously employed portfolio optimization models typically allocate asset weights to optimize the portfolio performance directly based on the forecasted trends of the assets without considering the potential prediction errors. Hence, these models assume that the forecasted value is entirely accurate, which is unrealistic in practice. Therefore, our portfolio optimization framework attempts to address this issue by considering the prediction errors that may arise in the forecasting process, which can potentially allocate weights more effectively to improve the portfolio performance.
Based on previous studies, we propose a novel ensemble portfolio optimization (NEPO) framework for broad commodity assets. First, a non-recursive decomposition approach through VMD is utilized to decompose the daily closing price data of the selected commodity assets into distinctive intrinsic modes in order to extract the additional hidden information and patterns in time series data. Second, the decomposed intrinsic modes of each asset are fed into a bidirectional long short-term memory (BiLSTM) deep learning model to forecast the daily closing price and return of the asset. Compared with the typical unidirectional deep learning model, the proposed prediction model can extract a two-way sequential relationship in time series data, making it more consistent with reality (Ullah et al. 2017). Additionally, unlike other decomposition and ensemble forecasting approaches, our proposed price prediction model eliminates the aggregation step by generating the forecasting results directly through the simultaneous input of all the extracted intrinsic modes into the deep learning model. Third, the predicted returns of the asset as well as the estimation errors are included in a reinforcement learning-based optimizer to allocate optimal weights for the commodity assets in the portfolio. Several prediction models, such as machine learning and deep learning models, are introduced as benchmarks to assess the forecasting performance of our decomposition-based bidirectional deep learning model. The empirical results suggest that the proposed VMD-BiLSTM model can effectively improve the prediction accuracy and trend prediction ability across various commodity assets.
We compare the performance of our portfolio optimizer to that of other commodity funds, indices, and asset allocation strategies in terms of annualized return and Sharpe ratio. The empirical results indicate that our ensemble portfolio optimization framework can generate higher returns and a better Sharpe ratio than the others. The results also indicate that by including Bitcoin in the commodity portfolio, asset managers can achieve higher returns without being exposed to significant financial risks. We find that it is possible to take advantage of the returns generated from Bitcoin while reducing the investment risks caused by its extreme volatility. Overall, employing the proposed ensemble portfolio optimization framework and including Bitcoin in a traditional commodity portfolio can generate better fund performance for asset managers and higher portfolio profits for commodity investors.
Our study’s primary contributions are as follows. First, we extend the broad commodity asset pool for potential diversification premiums by incorporating Bitcoin into the investment portfolio and utilizing its economic and investment properties. To the best of our knowledge, Bitcoin has not been considered a portfolio component in portfolio optimization problems; we aim to fill this literature gap through this analysis. Second, the proposed ensemble portfolio optimization framework allows the asset weights to be allocated in a continuous action space while considering the prediction errors generated in the optimization process. Compared to the portfolio optimization models in previous studies, our proposed model is more practical and consistent with reality.
Methodological framework
Our NEPO framework comprises three main components: an effective decomposition technique, VMD, is used to decompose the original time series of all the commodities and extract the inner patterns of the data; the extracted inner factors of the different commodities are then incorporated into the BiLSTM neural networks to predict their five-day returns; finally, reinforcement learning is applied to optimize and rebalance the portfolio weights based on the predicted returns and forecasting evaluation metrics. The detailed framework is shown in Fig. 1.
VMD
VMD decomposes the original complex and nonstationary time series data into normally distributed stationary volatility data, thereby generating economic implications. This non-recursive signal decomposition technique was proposed by Dragomiretskiy and Zosso (2014). It decomposes the original input signal \(f\left( t \right)\) into a series of quasi-orthogonal band-limited discrete subsignals \(u_{k}\) through Wiener filtering and Hilbert transform (Wang and Markert 2015). The decomposed subsignals \(u_{k}\) are mostly centered tightly around their respective center frequency \(\omega_{k}\) (Liu et al. 2016). The optimization procedure is as follows (Zhang et al. 2017):

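Step 1: Calculate the Hilbert transform of each mode \(u_{k}\) and transform it into its respective one-sided frequency spectrum.

Step 2: The frequency spectrum of each mode \(u_{k}\) is shifted to a narrow frequency baseband by multiplying it by an exponential function tuned to the corresponding estimated center frequency.

Step 3: Obtain the bandwidth of each mode \(u_{k}\) by conducting the \(H^{1}\) Gaussian smoothness on the demodulated signal.
The iterative minimization process seeks to minimize the sum of the bandwidths of all the modes, which can be expressed in the following form:
\[
\min_{\{ u_{k} \},\{ \omega_{k} \}} \left\{ \sum_{k = 1}^{K} \left\| \partial_{t} \left[ \left( \delta \left( t \right) + \frac{j}{\pi t} \right) \otimes u_{k} \left( t \right) \right] e^{ - j\omega_{k} t} \right\|_{2}^{2} \right\} \quad {\text{s.t.}} \quad \sum_{k = 1}^{K} u_{k} \left( t \right) = f\left( t \right)
\]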
where \(K\) denotes the number of decomposed modes, \(\{ u_{k} \}\) and \(\{ \omega_{k} \}\) are the decomposed modes and their respective center frequencies, \(\delta \left( t \right)\) denotes the Dirac delta function, \(\otimes\) represents the convolution operator, and \(f\left( t \right)\) denotes the original input signal.
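For finite convergence and constraint enforcement, a quadratic penalty factor \(\alpha\) and Lagrangian multiplier \(\lambda\) are introduced to obtain the optimal solution of the constrained optimization problem provided in Eq. (2). The augmented Lagrangian function \(L\) can be obtained as follows:
\[
L\left( \{ u_{k} \},\{ \omega_{k} \},\lambda \right) = \alpha \sum_{k = 1}^{K} \left\| \partial_{t} \left[ \left( \delta \left( t \right) + \frac{j}{\pi t} \right) \otimes u_{k} \left( t \right) \right] e^{ - j\omega_{k} t} \right\|_{2}^{2} + \left\| f\left( t \right) - \sum_{k = 1}^{K} u_{k} \left( t \right) \right\|_{2}^{2} + \left\langle \lambda \left( t \right), f\left( t \right) - \sum_{k = 1}^{K} u_{k} \left( t \right) \right\rangle
\]
The optimal solution is obtained using the alternating direction method of multipliers (Hestenes 1969), while the original input signal \(f\left( t \right)\) is decomposed into \(K\) subsignal modes.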
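For readers who want to see the iteration itself, the standard frequency-domain updates of this scheme, as given by Dragomiretskiy and Zosso (2014), can be sketched as follows. The hats (Fourier transforms), the iteration index \(n\), and the dual step size \(\tau\) are notation introduced here for illustration only. Each mode update uses the most recent estimates of the other modes.
\[
\begin{aligned}
\hat{u}_{k}^{n + 1} \left( \omega \right) &= \frac{\hat{f}\left( \omega \right) - \sum_{i \ne k} \hat{u}_{i} \left( \omega \right) + \hat{\lambda }^{n} \left( \omega \right)/2}{1 + 2\alpha \left( \omega - \omega_{k}^{n} \right)^{2}} \\
\omega_{k}^{n + 1} &= \frac{\int_{0}^{\infty } \omega \left| \hat{u}_{k}^{n + 1} \left( \omega \right) \right|^{2} d\omega }{\int_{0}^{\infty } \left| \hat{u}_{k}^{n + 1} \left( \omega \right) \right|^{2} d\omega } \\
\hat{\lambda }^{n + 1} \left( \omega \right) &= \hat{\lambda }^{n} \left( \omega \right) + \tau \left( \hat{f}\left( \omega \right) - \sum_{k = 1}^{K} \hat{u}_{k}^{n + 1} \left( \omega \right) \right)
\end{aligned}
\]
The first line is a Wiener filter centered at \(\omega_{k}\), the second places the new center frequency at the center of gravity of the mode's power spectrum, and the third is the dual ascent step that enforces the reconstruction constraint.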
BiLSTM neural networks
The bidirectional RNN was proposed by Schuster and Paliwal (1997). It utilizes both forward and backward information in the data. As illustrated in Fig. 2, the bidirectional RNN structure contains two unidirectional hidden layers, where one layer processes information from the forward direction and the other from the backward direction. The forward and backward unidirectional layers are concatenated to one output layer, such that the neural networks can extract bidirectional sequential relationships in the time series data. Compared to traditional unidirectional neural networks, it can preserve information from both the past and future.
In our prediction model, we replace the traditional RNN cells with LSTM cells, considering their ability to learn long-term dependencies (Zhang et al. 2018). At each time step \(t\), an LSTM cell consists of an input gate \(i_{t}\), a forget gate \(f_{t}\), an output gate \(o_{t}\), and a memory cell block \(C_{t}\). \(f_{t}\) and \(i_{t}\) are defined as follows:
\[
f_{t} = \sigma \left( W_{f} \cdot \left[ h_{t - 1}, x_{t} \right] + b_{f} \right)
\]
\[
i_{t} = \sigma \left( W_{i} \cdot \left[ h_{t - 1}, x_{t} \right] + b_{i} \right)
\]
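A \(\tanh\) layer is utilized to generate a new candidate memory cell block \(\widetilde{{C_{t} }}\). The existing memory cell block \(C_{t}\) is then updated, while the output gate \(o_{t}\) and hidden state \(h_{t}\) are generated:
\[
\widetilde{{C_{t} }} = \tanh \left( W_{C} \cdot \left[ h_{t - 1}, x_{t} \right] + b_{C} \right)
\]
\[
C_{t} = f_{t} * C_{t - 1} + i_{t} * \widetilde{{C_{t} }}
\]
\[
o_{t} = \sigma \left( W_{o} \cdot \left[ h_{t - 1}, x_{t} \right] + b_{o} \right)
\]
\[
h_{t} = o_{t} * \tanh \left( C_{t} \right)
\]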
where \(x_{t}\) denotes the input at time \(t\), \(\sigma\) represents the sigmoid function, and \(*\) is the elementwise multiplication. \(W\) and \(b\) are the respective weight matrices and bias vectors.
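The gate arithmetic and the bidirectional concatenation described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this study: the function names, the stacked weight layout, and the random parameters are assumptions made for illustration. The sizes follow the text (a 14-step window, 12 input features for one price series plus 11 modes, and 11 hidden units).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W maps [h_prev, x_t] to four stacked gate pre-activations
    (forget, input, candidate, output); this layout is an illustrative choice."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f_t = sigmoid(z[0:hidden])                    # forget gate
    i_t = sigmoid(z[hidden:2 * hidden])           # input gate
    c_tilde = np.tanh(z[2 * hidden:3 * hidden])   # candidate memory
    o_t = sigmoid(z[3 * hidden:4 * hidden])       # output gate
    c_t = f_t * c_prev + i_t * c_tilde            # elementwise memory update
    h_t = o_t * np.tanh(c_t)                      # hidden state
    return h_t, c_t

def run_lstm(xs, W, b, hidden):
    """Run one unidirectional layer over a sequence and return all hidden states."""
    h, c = np.zeros(hidden), np.zeros(hidden)
    states = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, W, b)
        states.append(h)
    return np.stack(states)

def bilstm(xs, W_fwd, b_fwd, W_bwd, b_bwd, hidden):
    """Concatenate a forward pass with a time-reversed backward pass."""
    fwd = run_lstm(xs, W_fwd, b_fwd, hidden)
    bwd = run_lstm(xs[::-1], W_bwd, b_bwd, hidden)[::-1]  # realign to forward time
    return np.concatenate([fwd, bwd], axis=1)

rng = np.random.default_rng(0)
T, D, H = 14, 12, 11  # window length, input features, hidden units
xs = rng.normal(size=(T, D))
shapes = [(4 * H, H + D), (4 * H,), (4 * H, H + D), (4 * H,)]
params = [rng.normal(scale=0.1, size=s) for s in shapes]
out = bilstm(xs, *params, hidden=H)  # shape (T, 2 * H)
```

Each row of `out` holds the forward and backward hidden states for one time step, which is the representation a final fully connected layer would reduce to a single forecast.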
Reinforcement learning
This study uses the predicted returns generated from the BiLSTM neural networks and integrates them into a reinforcement learning model to optimize and rebalance the weights of the portfolio. The set of agent states \(S\) represents the previous weight allocation, and the set of agent actions \(A_{t}\) denotes the possible set of portfolio allocations. The probability of the reinforcement learning model selecting an action (a portfolio weight allocation) \(a\) in state \(s\) can be expressed as follows:
The state spaces contain all the possible allocation of portfolio weights, while the actions are the set of possible allocations from state spaces. At each time step \(t\), the state \(s_{t}\) and action \(a_{t}\) can be expressed as follows:
where \(w_{i,t} \left( {i = 1, \ldots n;\;\;t = 1, \ldots T} \right)\) denotes the allocated portfolio weight for commodity \(i\) at time \(t.\)
We further define the reward function at time \(t\), which is denoted as \(r_{t}\), as the difference between the reward for the newly allocated portfolio weights and the previous portfolio weights:
Although a set of portfolio management targets and indicators, such as the Omega and Sortino ratios, are available for portfolio optimization, the Sharpe ratio is the most widely utilized indicator and serves as the baseline of portfolio ratio improvements in academia and industry (Farinelli et al. 2008; Kapsos et al. 2014). \(Q_{t}\) and \(\widetilde{{Q_{t} }}\) in Eq. (12) denote the weighted Sharpe ratio (Sharpe 1994) of the newly allocated portfolio weights and the portfolio weights from the previous period, respectively. \(\widetilde{{RMSE_{t} }}\) and \(RMSE_{t}\) represent the weighted root mean squared error (RMSE) of the prediction models for the new portfolio weights and the weights from the previous period, respectively. The reinforcement learning model is trained to find a set of portfolio weight allocations that will maximize the expected return:
After every period (five days), new commodity prices are included in the prediction model to generate the predicted returns for the next period. Based on the new prediction values, portfolio performance, and weights from the previous period, the reinforcement learning model can optimize and readjust the portfolio weights for the next period. This model is designed to self-adjust and find optimized portfolio allocations with the least human participation. In particular, this self-learning procedure can effectively find a balance between maximizing the portfolio returns and minimizing the risks of the portfolio generated from the errors in the forecasting model.
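The mechanics of one rebalancing step in a continuous action space can be sketched as below. This is only an illustration under stated assumptions: the text does not say how the actor output is mapped to weights, so the softmax used here is an assumed choice that yields long-only weights between 0 and 100% summing to one, and `fee_rate` is a hypothetical proportional cost on turnover. The reward function of this study is not reproduced.

```python
import numpy as np

def to_weights(raw_action):
    """Map an unconstrained actor output to portfolio weights via softmax
    (an assumed mapping; weights are positive and sum to 1)."""
    z = raw_action - np.max(raw_action)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def rebalance_return(w_prev, w_new, asset_returns, fee_rate=0.0):
    """Portfolio return over one five-day period after moving from w_prev to w_new.
    fee_rate is a hypothetical proportional transaction cost on turnover."""
    turnover = np.abs(w_new - w_prev).sum()
    return float(w_new @ asset_returns - fee_rate * turnover)

# Five assets: SPY, wheat, WTI, gold, Bitcoin (illustrative raw actor output).
raw = np.array([0.2, -1.0, 0.5, 0.0, 1.3])
w_prev = np.full(5, 0.2)
w_new = to_weights(raw)
```

Because any point on the weight simplex can be reached, such a mapping avoids restricting the agent to a predetermined menu of allocations.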
Empirical study
Our analysis consists of two parts. First, the VMD-BiLSTM models predict the five-day prices and returns for the chosen commodities based on their respective historical time series data. Second, the reinforcement learning model considers the prediction results and allocates the optimal portfolio weights for each predicted period accordingly.
Description of the dataset
We select five major commodity markets to construct our commodity portfolio, which consists of stocks, agriculture, energy, precious metal, and the newly emerged cryptocurrency commodities. In the portfolio, each market is represented by its leading commodity, which includes the S&P 500 stock index, wheat, WTI crude oil, gold index, and Bitcoin.
The data are obtained from Yahoo Finance, from which we download the daily closing price of the S&P 500 index, wheat, WTI crude oil, gold, and Bitcoin from January 2, 2013 to February 21, 2020, obtaining 1797 observations. As Bitcoin is traded continuously throughout the day, its opening price generally refers to the price at 12:01 AM UTC and the closing price to that at 11:59 PM UTC on any given day. A graphical representation of the data for each commodity is provided in Fig. 3.
The common descriptive statistics for the commodity time series data are presented in Table 1. The daily closing prices of all the commodities are non-normal and positively skewed (right skewed). In addition, the augmented Dickey-Fuller test indicates that the time series data for the stock (SPY), crude oil (WTI), gold, and Bitcoin are all nonstationary and have a unit root. The null hypothesis of a unit root in the augmented Dickey-Fuller test for wheat is rejected at the 10% level of statistical significance, which means that the series is stationary.
Moreover, this study conducts a correlation analysis among the commodities. The Pearson correlation coefficients are shown in Fig. 4. The results suggest that the linear relationships among the commodities are moderately weak. In particular, Bitcoin is negatively correlated with all the other commodities. In addition, there exists a strong positive correlation between the stock commodity (SPY) and the energy commodity (WTI).
We divide the data into two sets: a training set and a testing set with a split ratio of 8:2, which means that the preceding 80% of the data are used to train the prediction model, while the remainder are used to evaluate the model. We use a sliding input of 14 days in the prediction process, which means that the model considers the historical data from \(t - 13\) to \(t\) to forecast the five-day-ahead closing price at \(t + 5\). Therefore, our training set consists of 1421 observations from January 30, 2013 to September 19, 2018, while the testing set consists of 356 observations from September 20, 2018 to February 21, 2020.
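The windowing and chronological split described here can be sketched as follows. The function names and array layout are hypothetical. The feature matrix is assumed to hold one column per input series (for instance the closing price plus its decomposed modes), and the split is applied to the windowed samples in time order, without shuffling.

```python
import numpy as np

def make_windows(features, close, lookback=14, horizon=5):
    """Build supervised samples from a (T, F) feature array and a (T,) price array.
    Sample at time t uses rows t-13..t as input and close[t+5] as target."""
    X, y = [], []
    for t in range(lookback - 1, len(close) - horizon):
        X.append(features[t - lookback + 1 : t + 1])
        y.append(close[t + horizon])
    return np.array(X), np.array(y)

def chrono_split(X, y, train_frac=0.8):
    """Split samples in time order: the first 80% train, the remainder test."""
    cut = int(len(X) * train_frac)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])

# Toy data: 100 days, 12 identical feature columns equal to the day index.
close = np.arange(100.0)
features = np.tile(close[:, None], (1, 12))
X, y = make_windows(features, close)
(train_X, train_y), (test_X, test_y) = chrono_split(X, y)
```

With the toy index data, the first sample covers days 0 to 13 and targets day 18, which makes the five-day offset easy to check by eye.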
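To eliminate the differences in the variable dimension and increase model forecasting reliability, we normalize the data to the range [0,1] as shown below:
\[
x_{t}^{\prime } = \frac{x_{t} - \min {\kern 1pt} x_{t} }{\max {\kern 1pt} x_{t} - \min {\kern 1pt} x_{t} }
\]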
where \(x_{t}\) denotes the true value of the time series at time \(t\), while \(\max {\kern 1pt} x_{t} \;{\text{and}}\;{\text{min}}{\kern 1pt} x_{t}\) are the maximum and the minimum true values of the time series, respectively.
After the normalized closing prices are predicted, they are converted to predicted returns as follows:
where \(\widehat{{r_{t + 5} }}\) and \(\widehat{{p_{t + 5} }}\) denote the predicted returns and predicted closing price at time \(t + 5\), respectively. Here, \(p_{t}\) represents the actual closing price at time \(t\).
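A small sketch of the scaling and the price-to-return conversion may help. The helper names are hypothetical, and the return is assumed to be a simple (arithmetic) return relative to the actual price at time \(t\); a log return would be an equally plausible reading of the text. Which span the minimum and maximum are taken from is left to the caller.

```python
import numpy as np

def minmax_fit(x):
    """Return the (min, max) pair used for scaling."""
    return float(np.min(x)), float(np.max(x))

def minmax_scale(x, lo, hi):
    """Scale values to [0, 1] given the fitted extremes."""
    return (np.asarray(x, dtype=float) - lo) / (hi - lo)

def minmax_invert(x_scaled, lo, hi):
    """Map scaled predictions back to price units."""
    return np.asarray(x_scaled, dtype=float) * (hi - lo) + lo

def predicted_return(p_hat_future, p_now):
    """Simple return implied by a predicted price (assumed definition)."""
    return (p_hat_future - p_now) / p_now

prices = np.array([100.0, 110.0, 120.0])
lo, hi = minmax_fit(prices)
scaled = minmax_scale(prices, lo, hi)    # 0.0, 0.5, 1.0
restored = minmax_invert(scaled, lo, hi)
r_hat = predicted_return(126.0, 120.0)   # 5% predicted five-day return
```

Note that a prediction on the normalized scale has to be inverted to price units before the return is computed.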
The constructed prediction model used in this study consists of five layers: an input layer, two hidden layers in the forward and backward direction, an output layer, and a fully connected layer. The dimensions of the input layer, hidden layers, and output layer are the same as that of the input data. The dimension of the fully connected layer is set to one to represent the single final predicted output. We use the Adam optimizer with the learning rate (LR) set to 0.01 and \(\tanh\) as the activation function. We adopt a rolling forecast process where the rolling window is set to 90 days. The rolling process is illustrated in Fig. 5.
Evaluation measures
To assess the accuracy of our forecasting models, we adopt the mean square error (MSE) as the loss function. It is calculated as follows:
\[
{\text{MSE}} = \frac{1}{N}\sum_{t = 1}^{N} \left( x_{t} - \tilde{x}_{t} \right)^{2}
\]
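The RMSE and mean absolute error (MAE) are selected as the evaluation measures. They are calculated as follows:
\[
{\text{RMSE}} = \sqrt {\frac{1}{N}\sum_{t = 1}^{N} \left( x_{t} - \tilde{x}_{t} \right)^{2} }
\]
\[
{\text{MAE}} = \frac{1}{N}\sum_{t = 1}^{N} \left| x_{t} - \tilde{x}_{t} \right|
\]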
where \(\tilde{x}_{t}\) and \(x_{t}\) \(\left( {t = 1, 2, \ldots , N} \right)\) are the predicted and actual true values at time \(t\), while \(N\) represents the total number of data points in the testing set.
In addition, we introduce DA as the metric to assess the market trend predictive ability of the model:
where
For the forecasting model, lower MSE, RMSE, and MAE values and larger DA values indicate that the model has higher predictive accuracy and a stronger ability to predict the market trend. For benchmarking purposes, we compare the performance of our decomposition-based VMD-BiLSTM model against four benchmark models: the BiLSTM, unidirectional LSTM, support vector regression, and linear regression models.
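The evaluation measures can be computed as in the sketch below. RMSE and MAE follow their usual definitions. The directional accuracy shown is one common formulation (the share of cases in which the predicted move from the current price has the same sign as the realized move, counting a zero product as a hit) and is an assumption that may differ in detail from the definition used in this study.

```python
import numpy as np

def rmse(actual, pred):
    return float(np.sqrt(np.mean((actual - pred) ** 2)))

def mae(actual, pred):
    return float(np.mean(np.abs(actual - pred)))

def directional_accuracy(actual_now, actual_future, pred_future):
    """Share of forecasts whose direction matches the realized direction
    (assumed formulation; ties counted as correct)."""
    real_move = actual_future - actual_now
    pred_move = pred_future - actual_now
    return float(np.mean(real_move * pred_move >= 0))

now = np.array([10.0, 10.0, 10.0, 10.0])
future = np.array([11.0, 9.0, 12.0, 8.0])
pred = np.array([10.5, 9.5, 9.0, 8.5])

da = directional_accuracy(now, future, pred)  # three of four directions correct
err_rmse = rmse(future, pred)
err_mae = mae(future, pred)
```

In the toy data the third forecast points down while the price rises, so it counts as a miss for DA and also dominates the RMSE.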
To assess the performance of the portfolio constructed by our reinforcement learning model, we compare the average five-day returns of our portfolio and the overall Sharpe ratio against those of other portfolios and the reported financial performance of similar commodity indices and funds. The Sharpe ratio is defined as follows:
\[
{\text{Sharpe ratio}} = \frac{r_{p} - r_{f} }{\sigma_{p} }
\]
where \(r_{p}\) denotes the annualized return of the portfolio, \(\sigma_{p}\) is the annualized volatility of the portfolio, and \(r_{f}\) represents the nominal risk-free rate. As suggested in previous studies (Fabozzi et al. 2007; Ackerman et al. 2013), we set the nominal risk-free rate \(r_{f} = 2\)%.
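As a worked illustration, the sketch below computes an annualized Sharpe ratio from a series of five-day portfolio returns. The annualization convention is an assumption: returns are scaled arithmetically by 252/5 periods per year and volatility by the square root of that factor, which may not match the convention used in this study. The 2% risk-free rate follows the text.

```python
import numpy as np

def annualized_sharpe(period_returns, periods_per_year=252 / 5, risk_free=0.02):
    """Sharpe ratio from per-period returns.
    Annualization by 252/5 five-day periods per year is an assumed convention."""
    r = np.asarray(period_returns, dtype=float)
    r_p = r.mean() * periods_per_year                     # annualized return
    sigma_p = r.std(ddof=1) * np.sqrt(periods_per_year)   # annualized volatility
    return float((r_p - risk_free) / sigma_p)

five_day_returns = [0.01, -0.005, 0.02, 0.0]  # illustrative values only
sharpe = annualized_sharpe(five_day_returns)
```

Subtracting the risk-free rate after annualization keeps \(r_{p}\), \(\sigma_{p}\), and \(r_{f}\) on the same yearly scale.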
We consider several other portfolios, including the equal-weighted portfolio (the five chosen commodities are allocated equal weights throughout the trading period) and non-Bitcoin portfolio (only SPY, wheat, WTI, and gold are considered), for comparing performance. We also obtain the financial performance data from similar exchange-traded funds (ETFs) for comparison, which include the broad commodity ETF and Bitcoin ETF. The financial data for these ETFs for the trading period ranging from September 14, 2018 to February 21, 2020 are all downloaded from Yahoo Finance.
Return prediction results
First, we decompose the original historical price time series for all the commodities in our portfolio via VMD. According to the literature, VMD can effectively help neural networks to capture the tendency and cyclicity of time series data. For our analysis, the historical data for each selected commodity are decomposed into their respective subseries modes as shown in Fig. 6.
As Fig. 6 shows, the historical daily closing price data for each commodity are decomposed into 11 subseries modes, labeled M1 to M11. For each commodity, the decomposed modes display different cyclicity and fluctuation patterns. The subseries modes range from low frequency to high frequency. Specifically, the M1 modes have the lowest frequency, reflecting the long-term trends of the time series. M2 to M5 are the medium-frequency modes, which show the periodicity of the price fluctuation. Lastly, M6 to M11 are the high-frequency modes, which represent the short-term fluctuations in the data.
By decomposing the historical price time series, we can extract the inner factors and patterns in each commodity. In general, these inner factors may contain hidden information that can influence the price fluctuation of the commodity (Wang et al. 2014), which cannot be captured with the original data. Consequently, this ability to extract hidden fluctuations and patterns can improve the forecasting ability of the prediction models.
In the prediction step, the historical daily closing prices of each commodity and the decomposed subseries modes are included in the prediction model to generate its five-day-ahead predicted closing price, which is further converted into its predicted return. This procedure is repeated for all five commodities in our portfolio, with the same model settings to ensure consistency.
Observing the prediction performances of all the commodities displayed in Table 2, we can conclude that our VMD-BiLSTM model generates the most reliable forecasting results among the models compared.
After analyzing each commodity, we find that the VMD-BiLSTM model displays drastic improvements in RMSE and MAE relative to the benchmark models. The VMD-BiLSTM model also obtains the highest DA of all the models. This superior performance indicates that our prediction model can effectively capture and forecast the movement trend for all the selected commodity markets. Moreover, the high prediction accuracies achieved by our VMD-BiLSTM model across all the commodities indicate that it is not overfitted to a particular dataset. Thus, it could be generalized to other commodity markets.
Although our VMD-BiLSTM model can significantly improve the forecasting performance, its predictive accuracy differs for each commodity. Specifically, the model displays the highest prediction accuracy in the gold commodity market by achieving the lowest RMSE and MAE and a relatively high DA of 93%. In contrast, it obtains the highest RMSE and MAE and the lowest DA of 85.4% in the Bitcoin commodity market. This difference in prediction accuracies could be attributed to the fact that the selected commodities have varied volatilities. For example, gold is regarded as an investment safe haven due to its relatively low volatility (Baur and McDermott 2010); in contrast, Bitcoin is known to be a volatile asset as its prices can fluctuate significantly. As discrepancies in prediction accuracy exist among different commodities, we must consider them when building portfolio allocation strategies based on the predicted values.
Robustness tests
To further verify the robustness of our prediction, we test the prediction model using different combinations of neural network hyperparameters. The hyperparameter sets contain two components: the dimension of the hidden layers, chosen from [11, 22, 33], and the learning rate (LR), chosen from [0.001, 0.01, 0.1]. The results for all the datasets and commodities are presented in Table 3.
The results in Table 3 indicate that different hyperparameter combinations yield different prediction accuracies; bold font marks the results obtained with 11-dimensional hidden layers and an LR of 0.01 in each market. With this setting, our prediction model obtains the best prediction results. Further, the consistently superior results across all the commodities indicate that our model is not overfitted to a particular dataset.
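The hyperparameter sweep can be enumerated as in the following sketch. The two value lists are those stated above; the `select_best` helper and the `evaluate` callback (which would train the model and return a validation error) are hypothetical placeholders, not the authors' code.

```python
from itertools import product
from typing import Callable, Dict, List

HIDDEN_DIMS = [11, 22, 33]          # hidden-layer dimensions tested
LEARNING_RATES = [0.001, 0.01, 0.1]  # learning rates tested

# All 3 x 3 = 9 combinations of the two hyperparameters.
grid: List[Dict[str, float]] = [
    {"hidden_dim": h, "learning_rate": lr}
    for h, lr in product(HIDDEN_DIMS, LEARNING_RATES)
]


def select_best(
    configs: List[Dict[str, float]],
    evaluate: Callable[[Dict[str, float]], float],
) -> Dict[str, float]:
    """Return the configuration with the lowest error.

    `evaluate` is a hypothetical callback that trains a model with the
    given configuration and returns an error measure such as RMSE.
    """
    return min(configs, key=evaluate)


print(len(grid), "configurations")
```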
To further verify the robustness of our prediction model, we use the same prediction model with the best hyperparameter settings on four different datasets that include the first 95%, 90%, 85%, and 80% of the original time series, which are denoted as “Set 95,” “Set 90,” “Set 85,” and “Set 80,” respectively. For each dataset, the split ratio is set to 8:2. The test results for all the datasets and commodities are presented in Table 4.
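The construction of the four truncated datasets can be sketched as follows. We assume here that the 8:2 split is chronological (the earlier 80% of each truncated series is used for training and the remaining 20% for testing); the function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def truncate_and_split(
    series: Sequence[float], keep_pct: int, train_pct: int = 80
) -> Tuple[List[float], List[float]]:
    """Keep the first keep_pct% of the series, then split it chronologically.

    Integer percentages avoid floating-point rounding in the index math.
    """
    n_keep = len(series) * keep_pct // 100
    kept = list(series[:n_keep])
    n_train = len(kept) * train_pct // 100
    return kept[:n_train], kept[n_train:]


def build_robustness_sets(
    series: Sequence[float],
) -> Dict[str, Tuple[List[float], List[float]]]:
    """Build "Set 95", "Set 90", "Set 85", and "Set 80" with an 8:2 split."""
    return {
        f"Set {pct}": truncate_and_split(series, pct)
        for pct in (95, 90, 85, 80)
    }


# Hypothetical series of 1,000 observations, for illustration only.
sets = build_robustness_sets([float(i) for i in range(1000)])
for name, (train, test) in sets.items():
    print(name, len(train), len(test))
```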
The results in Table 4 indicate that our VMD-BiLSTM prediction model displays a consistent and good performance across all the commodities in different datasets in terms of RMSE, MAE, and DA. This indicates that our prediction model can consistently predict the future prices of different commodities across various market conditions, implying that our prediction model is robust and generalizable across different commodity markets.
Portfolio optimization results
After obtaining the prediction results from the VMD-BiLSTM model, we use them as the input to construct our commodity portfolios. In this analysis, we apply a deep deterministic policy gradient (DDPG) reinforcement learning model to optimize asset allocation automatically every five days. After obtaining the allocation weights in each portfolio, we calculate the actual annualized returns, volatility, and Sharpe ratio of the portfolios using real-time commodity prices. To further verify the practicality of our strategy in the real world, we consider the transaction fees of each commodity, which are collected from Yahoo Finance. To evaluate the performance of each selected portfolio, we divide the entire trading interval into quarters (three months). In our trading policy and simulation, we ensure that our assets are sufficiently large to cover the trading volumes for all commodities. Accordingly, the initial investment capital for each trading strategy is set at $10,000. The investment asset and indicator comparisons are illustrated in Fig. 7 and Table 5, respectively.
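The three performance indicators can be computed from a series of daily portfolio returns as in the sketch below. The annualization convention (252 trading days, arithmetic scaling of the mean, square-root-of-time scaling of the standard deviation) and the zero risk-free rate are assumptions made for illustration, not settings confirmed by the text.

```python
import math
import statistics
from typing import Sequence, Tuple

TRADING_DAYS = 252  # assumption: trading days per year


def annualized_metrics(
    daily_returns: Sequence[float], risk_free_rate: float = 0.0
) -> Tuple[float, float, float]:
    """Return (annualized return, annualized volatility, Sharpe ratio)."""
    mean_daily = statistics.fmean(daily_returns)
    vol_daily = statistics.stdev(daily_returns)  # sample standard deviation
    ann_return = mean_daily * TRADING_DAYS
    ann_vol = vol_daily * math.sqrt(TRADING_DAYS)
    if ann_vol == 0:
        raise ValueError("volatility is zero; Sharpe ratio is undefined")
    sharpe = (ann_return - risk_free_rate) / ann_vol
    return ann_return, ann_vol, sharpe


# Hypothetical daily returns, for illustration only.
ret, vol, sharpe = annualized_metrics([0.01, -0.01, 0.02, 0.0])
print(f"return={ret:.2%} volatility={vol:.2%} Sharpe={sharpe:.2f}")
```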
As Bitcoin is a relatively new commodity asset, it has not been considered a portfolio component in portfolio optimization problems by previous studies. To investigate its diversification properties and effects in a portfolio, we construct two portfolios using our NEPO framework: the extended broad commodity asset (EBCA) portfolio and the traditional broad commodity asset (TBCA) portfolio. The EBCA portfolio contains all the selected commodity assets (SPY, wheat, WTI, gold, and Bitcoin). In contrast, the TBCA portfolio includes only the traditional commodity assets (SPY, wheat, WTI, and gold). To evaluate the performance of our portfolios, we compare the results with those of portfolios constructed using other strategies, similar indices, and funds: the extended equal-weighted portfolio, the traditional equal-weighted portfolio without Bitcoin, the Dow Jones Commodity Index (DJCI), the top field broad commodity ETF (FTGC), and the top field Bitcoin ETF (GBTC), which consists only of digital assets.
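The two equal-weighted benchmark portfolios can be expressed as in the sketch below. The asset lists are those given above; the helper names and the sample period returns are hypothetical.

```python
from typing import Dict, List

EBCA_ASSETS = ["SPY", "wheat", "WTI", "gold", "Bitcoin"]
TBCA_ASSETS = ["SPY", "wheat", "WTI", "gold"]


def equal_weights(assets: List[str]) -> Dict[str, float]:
    """Assign the same weight to every asset so that weights sum to one."""
    return {asset: 1.0 / len(assets) for asset in assets}


def portfolio_return(weights: Dict[str, float], asset_returns: Dict[str, float]) -> float:
    """Weighted sum of the asset returns over one holding period."""
    return sum(w * asset_returns[asset] for asset, w in weights.items())


# Hypothetical one-period returns, for illustration only.
sample_returns = {"SPY": 0.01, "wheat": -0.02, "WTI": 0.03, "gold": 0.00, "Bitcoin": 0.08}
print(portfolio_return(equal_weights(EBCA_ASSETS), sample_returns))
print(portfolio_return(equal_weights(TBCA_ASSETS), sample_returns))
```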
The results in Table 5 and Fig. 7 show that our portfolios constructed using the reinforcement learning model outperform the other portfolios, indices, and funds in terms of financial performance for all the trading periods in the analysis. First, our constructed EBCA and TBCA portfolios are unique in that they maintain consistently positive returns throughout all the trading intervals. In certain intervals, such as 12/2018–02/2019, 06/2019–08/2019, 09/2019–11/2019, and 12/2019–02/2020, the other indices and funds experienced negative returns because most of the commodities in their portfolios declined in price. In comparison, our reinforcement learning model uses the predicted returns to optimally allocate weights to maximize the Sharpe ratio of the portfolios.
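To illustrate what a continuous weight allocation looks like in practice, the sketch below maps an unconstrained action vector to long-only portfolio weights and charges a proportional fee on turnover at each rebalance. Both the softmax mapping and the turnover-based fee are assumptions chosen for illustration; they are not taken from the reinforcement learning model used in this study.

```python
import math
from typing import List, Sequence


def softmax_weights(scores: Sequence[float]) -> List[float]:
    """Map raw action scores to non-negative weights that sum to one."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rebalance_cost(
    old_weights: Sequence[float], new_weights: Sequence[float], fee_rate: float
) -> float:
    """Proportional transaction cost, as a fraction of portfolio value."""
    turnover = sum(abs(n - o) for o, n in zip(old_weights, new_weights))
    return fee_rate * turnover


# Hypothetical action scores for five assets, for illustration only.
previous = [0.2, 0.2, 0.2, 0.2, 0.2]
current = softmax_weights([0.1, -0.3, 0.0, 0.2, 0.5])
print(current, rebalance_cost(previous, current, fee_rate=0.001))
```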
Second, in comparison with the DJCI index and the FTGC fund, our traditional commodity TBCA portfolio records a higher average volatility of 23.09%. This higher level of risk is consistent with the standard diversification argument (Imbs and Wacziarg 2003; Guesmi et al. 2019): as the selected indices and funds often consist of a large number of commodities, the risks of their portfolios are generally more diversified. Nevertheless, despite the higher volatility, our TBCA portfolio is sufficiently diverse for individual investors as it contains four commodity assets across different sectors. Further, our portfolio yields significantly higher returns.
Third, the performance of our extended EBCA portfolio indicates that Bitcoin can yield better results when it is treated as a part of the portfolio rather than as a standalone investment. Bitcoin is more volatile than other traditional commodities (Garcia et al. 2014; Yu et al. 2019). As a standalone investment, although it presents attractive returns in certain periods, its extreme volatility makes it a risky asset. For example, the GBTC ETF obtains a return of 113.91% in 03/2019–05/2019, while its volatility reaches 103.04%. As a result, this “high risk, high reward” characteristic of Bitcoin exposes investors to significant risks. Compared with the GBTC fund, our EBCA portfolio increases the average returns by 78%, from 19.48% to 34.67%, and significantly reduces the portfolio risk from 90.12% to 24.27%. By incorporating Bitcoin as part of our constructed portfolio, we can take advantage of the attractive returns of Bitcoin while limiting the exposure to the risks of the asset, which may be a viable strategy for individual investors.
The results in Table 5 also show that our EBCA portfolio outperforms the traditional commodity TBCA portfolio in four of six intervals by achieving better Sharpe ratios. Overall, the TBCA portfolio obtains average financial returns of 20.08% with an average volatility of 23.09%. With the inclusion of Bitcoin as part of the portfolio, the average volatility of the portfolio throughout all the quarters increases by 5.1% in relative terms, to 24.27%. Despite the slight increase in portfolio risk, the average returns of the portfolio jump to 34.67%, which results in a higher average Sharpe ratio. As Bitcoin is a highly volatile asset, an increase in portfolio risk is expected. However, the results indicate that the additional returns an investor can gain significantly outweigh the additional risks.
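The percentage changes quoted in this comparison are relative changes and can be reproduced from the reported averages, as the short check below shows. The input figures are the ones reported above (GBTC average return of 19.48%, EBCA average return of 34.67%, TBCA average volatility of 23.09%, and EBCA average volatility of 24.27%); the helper name is hypothetical.

```python
def relative_change(old: float, new: float) -> float:
    """Relative change from old to new, as a fraction of old."""
    return (new - old) / old


# Average return: GBTC (19.48%) versus EBCA (34.67%).
return_gain = relative_change(19.48, 34.67)
# Average volatility: TBCA (23.09%) versus EBCA (24.27%).
volatility_increase = relative_change(23.09, 24.27)

print(f"return gain: {return_gain:.0%}")                  # about 78%
print(f"volatility increase: {volatility_increase:.1%}")  # about 5.1%
```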
Looking at the asset allocation comparison between the portfolios in Table 6 and Fig. 8, our EBCA portfolio assigns Bitcoin the largest weight in the portfolio. At the same time, it keeps this weight within a reasonable range so that the risks are diversified across the traditional commodities. Our findings indicate that much higher returns can be achieved without being exposed to significant financial risks by including Bitcoin in the commodity portfolio.
Conclusion
Bitcoin has attracted significant attention from investors and policymakers in the global commodity market. Taking advantage of this asset and its potential benefits by incorporating it into the broad commodity trading portfolio is of great importance to investors and policymakers. In this paper, we propose a NEPO framework for broad commodity assets, which integrates a deep learning-based model for future returns forecast and a reinforcement learning-based model for optimizing the asset weight allocation.
In terms of forecasting future prices and returns of the broad commodity assets, the empirical results suggest that our proposed VMD-BiLSTM prediction model can effectively and consistently improve the prediction accuracy and the trend prediction ability across commodity assets from different sectors, including stocks, agriculture, energy, precious metals, and cryptocurrency. In terms of portfolio performance, the broad commodity portfolio constructed using our reinforcement learning-based optimizer achieves significantly higher returns and a better Sharpe ratio than other commodity funds, indices, and asset allocation strategies. In addition, by incorporating Bitcoin into the asset pool, our portfolio optimization framework can increase the financial performance of the broad commodity portfolio by taking advantage of its high returns while effectively reducing its inherent risks.
This study adds to the literature through multiple channels. First, our broad commodity portfolio optimization framework serves as an early attempt to incorporate Bitcoin in the asset pool. Further, it could be effectively used to increase the diversification premiums of the portfolio without greater exposure to investment risks. Second, our VMD-BiLSTM forecasting approach differs from other hybrid forecasting approaches applicable in financial time series analysis. It directly generates the forecasting results by simultaneously using all the extracted intrinsic modes as prediction model inputs. By eliminating the aggregation step, our proposed model can effectively avoid the estimation errors that tend to accumulate in current ensemble prediction approaches. Finally, our proposed NEPO framework contributes to the artificial intelligence-based portfolio optimization literature by broadening the optimizer’s weight allocation decisions from a discrete to a continuous action space and by considering the asset forecasting errors in the weight allocation process. Thus, it improves the practicality of the framework as well as its consistency with reality. By proposing the NEPO framework, our study supports a promising trend in improving portfolio allocation decision-making for broad commodity assets.
Although the results are promising, our study also faces certain limitations. For instance, our framework only uses structured data (asset prices) as input. Future studies can incorporate unstructured data such as news reports and social media sentiments to further improve the predictive ability of the framework. Moreover, for simplicity, we do not consider associated costs such as inflation and other management costs. Considering and calculating these associated costs in future studies can further improve the model’s practicality.
Availability of data and materials
Data used in this paper are downloaded from Yahoo Finance via the link https://finance.yahoo.com/.
References
Ackerman F, Stanton EA, Bueno R (2013) Epstein-Zin utility in DICE: is risk aversion irrelevant to climate policy? Environ Resour Econ 56(1):73–84
Altan A, Karasu S, Bekiros S (2019) Digital currency forecasting with chaotic meta-heuristic bio-inspired signal processing techniques. Chaos Soliton Fract 126:325–336
Anbazhagan S, Kumarappan N (2012) Day-ahead deregulated electricity market price forecasting using recurrent neural network. IEEE Syst J 7(4):866–872
Atsalakis GS, Atsalaki IG, Pasiouras F, Zopounidis C (2019) Bitcoin price forecasting with neuro-fuzzy techniques. Eur J Oper Res 276(2):770–780
Baur DG, McDermott TK (2010) Is gold a safe haven? International evidence. J Bank Financ 34(8):1886–1898
Bessler W, Wolff D (2015) Do commodities add value in multi-asset portfolios? An out-of-sample analysis for different investment strategies. J Bank Financ 60:1–20
Bouri E, Molnár P, Azzi G, Roubaud D, Hagfors LI (2017) On the hedge and safe haven properties of Bitcoin: is it really more than a diversifier? Financ Res Lett 20:192–198
Branke J, Scheckenbach B, Stein M, Deb K, Schmeck H (2009) Portfolio optimization with an envelope-based multi-objective evolutionary algorithm. Eur J Oper Res 199(3):684–693
Cao Q, Ewing BT, Thompson MA (2012) Forecasting wind speed with recurrent neural networks. Eur J Oper Res 221(1):148–154
das Neves RH (2020) Bitcoin pricing: impact of attractiveness variables. Financ Innov 6(1):1–18
Dragomiretskiy K, Zosso D (2014) Variational mode decomposition. IEEE Trans Signal Process 62(3):531–544
Duan Y, Chen X, Houthooft R, Schulman J, Abbeel P (2016) Benchmarking deep reinforcement learning for continuous control. In: International conference on machine learning
Dutta A, Kumar S, Basu M (2020) A gated recurrent unit approach to bitcoin price prediction. J Risk Financ Manag 13(2):23
Eilers D, Dunis CL, von Mettenheim HJ, Breitner MH (2014) Intelligent trading of seasonal effects: a decision support algorithm based on reinforcement learning. Decis Support Syst 64:100–108
Fabozzi FJ, Cheng X, Chen RR (2007) Exploring the components of credit risk in credit default swaps. Financ Res Lett 4(1):10–18
Farinelli S, Ferreira M, Rossello D, Thoeny M, Tibiletti L (2008) Beyond Sharpe ratio: optimal asset allocation using different performance ratios. J Bank Financ 32(10):2057–2063
Galankashi MR, Rafiei FM, Ghezelbash M (2020) Portfolio selection: a fuzzy-ANP approach. Financ Innov 6(1):1–34
Garcia D, Tessone CJ, Mavrodiev P, Perony N (2014) The digital traces of bubbles: feedback cycles between socioeconomic signals in the Bitcoin economy. J R Soc Interface 11(99):1–8
Geman H, Ohana S (2008) Time-consistency in managing a commodity portfolio: a dynamic risk measure approach. J Bank Financ 32(10):1991–2005
Guastaroba G, Mansini R, Speranza MG (2009) On the effectiveness of scenario generation techniques in single-period portfolio optimization. Eur J Oper Res 192(2):500–511
Guesmi K, Saadi S, Abid I, Ftiti Z (2019) Portfolio diversification with virtual currency: evidence from bitcoin. Int Rev Financ Anal 63:431–437
Hestenes MR (1969) Multiplier and gradient methods. J Optim Theory Appl 4(5):303–320
Imbs J, Wacziarg R (2003) Stages of diversification. Am Econ Rev 93(1):63–86
Jalali MFM, Heidari H (2020) Predicting changes in Bitcoin price using grey system theory. Financ Innov 6(1):1–12
Jangmin O, Lee J, Lee JW, Zhang BT (2006) Adaptive stock trading with dynamic asset allocation using reinforcement learning. Inf Sci 176(15):2121–2147
Jeong G, Kim HY (2019) Improving financial trading decisions using deep Q-learning: predicting the number of shares, action strategies, and transfer learning. Expert Syst Appl 117:125–138
Jiang S, Li Y, Wang S, Zhao L (2021) Blockchain competition: the tradeoff between platform stability and efficiency. Eur J Oper Res
Kapsos M, Christofides N, Rustem B (2014) Worst-case robust Omega ratio. Eur J Oper Res 234(2):499–507
Konno H, Yamazaki H (1991) Mean-absolute deviation portfolio optimization model and its applications to Tokyo stock market. Manag Sci 37(5):519–531
Kou G, Akdeniz ÖO, Dinçer H, Yüksel S (2021) Fintech investments in European banks: a hybrid IT2 fuzzy multidimensional decision-making approach. Financ Innov 7(1):1–28
Li X, Shang W, Wang S (2019) Text-based crude oil price forecasting: a deep learning approach. Int J Forecast 35(4):1548–1560
Li Y, Jiang S, Li X, Wang S (2021) The role of news sentiment in oil futures returns and volatility forecasting: data-decomposition based deep learning approach. Energy Econ 95:105140
Lillicrap TP, Hunt JJ, Pritzel A (2015) Continuous control with deep reinforcement learning. Comput Sci 6:187
Liu W, Cao S, Chen Y (2016) Applications of variational mode decomposition in seismic time-frequency analysis. Geophy 81(5):365–378
Liu Y, Tsyvinski A (2018) Risks and returns of cryptocurrency. Rev Financ Stud (Forthcoming)
Long W, Lu Z, Cui L (2019) Deep learning-based feature engineering for stock price movement prediction. Knowl Based Syst 164:163–173
Lwin K, Qu R, Kendall G (2014) A learning-guided multi-objective evolutionary algorithm for constrained portfolio optimization. Appl Soft Comput 24:757–772
Perold AF (1984) Large-scale portfolio optimization. Manag Sci 30(10):1143–1160
Powell MJ (1964) An efficient method for finding the minimum of a function of several variables without calculating derivatives. Comput J 7(2):155–162
Schuster M, Paliwal KK (1997) Bidirectional recurrent neural networks. IEEE Trans Signal Process 45(11):2673–2681
Selmi R, Mensi W, Hammoudeh S, Bouoiyour J (2018) Is Bitcoin a hedge, a safe haven or a diversifier for oil price movements? A comparison with gold. Energy Econ 74:787–801
Sharpe WF (1994) The sharpe ratio. J Portf Manag 21(1):49–58
Symitsi E, Chalvatzis KJ (2019) The economic value of Bitcoin: a portfolio analysis of currencies, gold, oil and stocks. Res Int Bus Financ 48:97–110
Tola V, Lillo F, Gallegati M, Mantegna RN (2008) Cluster analysis for portfolio optimization. J Econ Dyn Control 32(1):235–258
Ullah A, Ahmad J, Muhammad K, Sajjad M, Baik SW (2017) Action recognition in video sequences using deep bidirectional LSTM with CNN features. IEEE Access 6:1155–1166
Wang L, Liu Z, Miao Q, Zhang X (2018) Time–frequency analysis based on ensemble local mean decomposition and fast kurtogram for rotating machinery fault diagnosis. Mech Syst Signal Proc 103:60–75
Wang S, Hu A, Wu Z, Liu Y, Bai X (2014) Multiscale combined model based on run-length-judgment method and its application in oil price forecasting. Math Probl Eng 1–9
Wang Y, Markert R (2015) Detecting rub-impact fault of rotor system based on variational mode decomposition. Mech Mach Sci 1955–1963
Wen F, Yang X, Gong X, Lai KK (2017) Multiscale volatility feature analysis and prediction of gold price. Int J Inf Technol Decis Mak 16(01):205–223
Yang Q, Liu Y, Chen T, Tong Y (2019) Federated machine learning: concept and applications. ACM Trans Intell Syst Technol 10(2):1–19
Yu JH, Kang J, Park S (2019) Information availability and return volatility in the bitcoin market: analyzing differences of user opinion and interest. Inf Process Manag 56(3):721–732
Yu L, Wang Z, Tang L (2015) A decomposition–ensemble model with data-characteristic-driven reconstruction for crude oil price forecasting. Appl Energy 156:251–267
Yue W, Wang Y, Xuan H (2019) Fuzzy multi-objective portfolio model based on semivariance–semiabsolute deviation risk measures. Soft Comput 23(17):8159–8179
Xu M, Chen X, Kou G (2019) A systematic review of blockchain. Financ Innov 5(1):1–14
Zha Q, Kou G, Zhang H, Liang H, Chen X, Li C, Dong Y (2020) Opinion dynamics in finance and business: a literature review and research opportunities. Financ Innov 6(1):1–22
Zhang J, Zhu Y, Zhang X, Ye M, Yang J (2018) Developing a long short-term memory (LSTM) based model for predicting water table depth in agricultural areas. J Hydrol 561:918–929
Zhang C, Zhou J, Li C, Fu W, Peng T (2017) A compound structure of ELM based on feature selection and parameter optimization using hybrid backtracking search algorithm for wind speed forecasting. Energy Conv Manag 143:360–376
Zhu Q, Zhang F, Liu S, Wu Y, Wang L (2019) A hybrid VMD–BiGRU model for rubber futures time series forecasting. Appl Soft Comput 84:105739
Acknowledgements
The authors would like to express their sincere appreciation to the anonymous reviewers and editors for their constructive comments and suggestions.
Funding
This research work was supported by the National Natural Science Foundation of China under Grants No. 71801213 and No. 71988101, and the National Center for Mathematics and Interdisciplinary Sciences, CAS.
Author information
Contributions
YL conceived of the presented idea, contributed to the methodologies, and analyzed the data. SJ initiated the subject with YL and contributed to the review of literature and interpretation of the results. YW modified the manuscript and contributed to the discussion of the empirical results. SW suggested the topic and supervised throughout the manuscript preparation. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Li, Y., Jiang, S., Wei, Y. et al. Take Bitcoin into your portfolio: a novel ensemble portfolio optimization framework for broad commodity assets. Financ Innov 7, 63 (2021). https://doi.org/10.1186/s40854-021-00281-x
DOI: https://doi.org/10.1186/s40854-021-00281-x
Keywords
 Portfolio optimization
 Bitcoin
 Deep learning
 Reinforcement learning
 Variational mode decomposition