Table 1 List of studies on machine learning applied to cryptocurrency prices (in chronological, then alphabetical, order)

From: Forecasting and trading cryptocurrencies with machine learning under changing market conditions

Article

Dependent variable

Frequency

Sample period

Models

Type (classification/regression)

Trading strategies (positions/trading costs)

Input set

Main findings

Madan et al. (2015)

Bitcoin prices in USD from Coinbase

10-s, 10-min

5 years since the inception of Bitcoin

Binomial logistic regressions (BLR) and random forest (RF)

Classification

–

Prices and 16 blockchain features

10-min data give a better sensitivity and specificity ratio than the 10-s data

Kim et al. (2016)

Bitcoin, ethereum and ripple prices

Daily

Bitcoin: Dec-2013 to Feb-2016; Ethereum: Aug-2015 to Feb-2016; Ripple: Sep-2015 to Jan-2016

Averaged one-dependence estimators (AODE)

Classification

Long/no trading costs

Trading information, and comments and replies posted in online communities

Comments and replies are good predictors of Bitcoin prices

Żbikowski (2016)

Bitcoin prices in USD from Bitstamp

15-min

Jan-2015 to Feb-2015

Exponential moving average (EMA), box support vector machine (SVM) and volume weighted SVM (VW-SVM)

Classification

Long and short/trading costs of 0.2%

10 technical analysis indicators

VW-SVM is the best model in terms of average return and maximum drawdown

Jiang and Liang (2017)

Prices in USD of the 12 most traded cryptocurrencies at Poloniex

30-min

Jun-2015 to Aug-2016

Convolutional neural networks (CNN) with deep reinforcement learning

Regression

Long and short/trading costs of 0.25%

Returns

Mixed results between CNN portfolio and Online Newton Step and Passive Aggressive Mean Reversion portfolios

Jang and Lee (2018)

Bitcoin price index in USD

Daily

Sep-2011 to Aug-2017

Bayesian neural networks (BNN), linear regression and support vector regressions (SVM)

Regression

–

26 blockchain features, trading information, exchange rates and macroeconomic variables

The BNN is the best prediction model

McNally et al. (2018)

Bitcoin prices in USD from CoinDesk

Daily

Aug-2013 to July-2016

Bayesian recurrent neural network (RNN) and long short-term memory (LSTM)

Classification and regression

–

OHLC prices, difficulty, and hash rate of blockchain

The best time lengths are 100 days for the LSTM and 20 days for the RNN

Nakano et al. (2018)

Bitcoin returns in USD from Poloniex

15-min

July-2016 to Jan-2018

Artificial neural networks (ANN)

Classification

Long, and long and short/transaction costs of 0.025%, 0.05% and 0.1%

Returns and 4 technical analysis indicators

Higher performance of the ANN strategy, except in the last month of data. Results are highly sensitive to the model specification and input data

Vo and Yost-Bremm (2018)

Bitcoin prices in USD, CNY, JPY, EUR from 6 online exchanges

1-min

Jan-2012 to Oct-2017

Random forests (RF) and a deep learning model

Classification

Long and short/no trading costs

5 technical analysis indicators

RF is the best model for a frequency of 15-min

Alessandretti et al. (2019)

Price indexes of 1681 cryptocurrencies in USD

Daily

Nov-2015 to Apr-2018

Ensemble of regression trees built by XGBoost and long short-term memory network

Regression

Long/transaction costs of 0.1%, 0.2%, 0.5% and 1%

Price, market capitalization, market share, rank, volume, and age

All strategies produce a significant profit (expressed in bitcoin) even with transaction fees up to 0.2%

Atsalakis et al. (2019)

Bitcoin, ethereum, litecoin and ripple returns

Daily

Sep-2011 to Oct-2017

PATSOS, a hybrid neuro-fuzzy model

Classification and regression

Long and short/no transaction costs

Returns and prices

PATSOS outperforms other competing methods and produces a return significantly higher than the Buy-and-Hold (B&H) strategy

Catania et al. (2019)

Bitcoin, ethereum, litecoin and ripple returns in USD

Daily

Aug-2015 to Dec-2017

Linear univariate and multivariate regression models, and selections and combinations of those models

Regression

–

Returns and several exogenous financial variables

Statistically significant improvements in forecasting returns when using combinations of univariate models

de Souza et al. (2019)

Bitcoin prices in USD

Daily

May-2012 to May-2017

Artificial neural network (ANN) and support vector machine (SVM)

Classification

Long and short/5 USD

OHLC prices

SVM provides conservative returns on a risk-adjusted basis, and ANN generates abnormal profits during short-run bull trends

Han et al. (2019)

Bitcoin returns in USD

Daily

April-2013 to Mar-2018

NARX Neural Network

Regression

–

Returns

NARX is effective in predicting the tendency but not the jumps

Huang et al. (2019)

Bitcoin returns in USD

Daily

Jan-2012 to Dec-2017

Trees

Classification

Long and short/no trading costs

124 technical indicators computed from the OHLC prices

Lower volatility, higher win-to-loss ratio and information ratio than those of every simple cut-off strategy or the B&H strategy

Ji et al. (2019b)

Bitcoin returns in USD from Bitstamp

Daily

Nov-2011 to Dec-2018

Deep Neural Network (DNN), Long Short Term Memory (LSTM), Convolutional Neural Network (CNN), Deep Residual Network (ResNet), combination of CNNs and RNNs (CRNN) and their combinations

Classification and regression

Long/no transaction costs

Prices and 17 blockchain features

Performances of the prediction models were comparable; LSTM is the best prediction model and DNN models are the best classification models; classification models were more effective for trading

Lahmiri and Bekiros (2019)

Bitcoin, digital cash and ripple prices in USD

Daily

Bitcoin: July-2010 to Oct-2018

Digital Cash: Feb-2010 to Oct-2018

Ripple: Jan-2015 to Oct-2018

Long Short Term Memory (LSTM) and Generalized Regression Neural Networks (GRNN)

Regression

–

Prices

Predictability of LSTM is significantly higher than that of GRNN

Mallqui and Fernandes (2019)

Bitcoin prices in USD

Daily

Apr-2013 to Apr-2017

Artificial neural networks (ANN), support vector machine (SVM) and ensembles

Classification and regression

–

OHLC prices, Blockchain information and several exogenous financial variables

Ensemble of recurrent neural networks and a Tree classifier is the best classification model, while SVM is the best regression model

Shintate and Pichl (2019)

Bitcoin returns in CNY and USD from OkCoin

1-min

Jun-2013 to Mar-2017

Random sampling method (RSM)

Classification

Long and short/No transaction costs

OHLC prices

The proposed RSM outperforms several alternatives, but the profit rates do not exceed those of the B&H strategy

Smuts (2019)

Bitcoin and ethereum prices in USD

1-h

Dec-2017 to Jun-2018

Long short term memory recurrent neural network (LSTM)

Classification

–

Prices, volumes, Google trends, and Telegram chat groups dedicated to bitcoin and ethereum trading

Telegram data is a better predictor of bitcoin, while Google Trends is a better predictor of ethereum, especially over a one-week period

Borges and Neves (2020)

Prices from Binance of the 100 cryptocurrency pairs with the highest traded volume in USD

1-min

For each pair, from the beginning of trading at Binance until Oct-2018

Logistic regression, random forest, support vector machine, gradient tree boosting, and an ensemble of these models

Classification

Long/transaction costs of 0.1%

Returns, resampled returns, and 11 technical indicators

The ensemble, formed as an unweighted average of the four trading signals from the four models, gives the best results after resampling the data

Chen et al. (2020b)

Bitcoin price index and trading prices from Binance in USD

5-min and daily

July-2017 to Jan-2018 for 5-min and Feb-2017 to Feb-2019 for daily

Logistic Regression (LR), Linear Discriminant Analysis (LDA), Random Forest (RF), XGBoost (XGB), Support Vector Machine (SVM), and Long Short-Term Memory (LSTM)

Classification

–

5-min: OHLC prices and trading volume. Daily: 4 Blockchain features, 8 marketing and trading variables, Google trend search volume index, Baidu media search volume, and gold spot price

For 5-min data, machine learning models achieved better accuracy than LR and LDA, with LSTM achieving the best result (67% accuracy). For daily data, LR and LDA are better, with an average accuracy of 65%

Chu et al. (2020)

Bitcoin, ethereum, dash, litecoin, MaidSafeCoin, monero and ripple from CryptoCompare in USD

Hourly

Feb-2017 to Aug-2017

Exponential Moving Averages (EMA) for time series and cross-sectional portfolios

Classification and regression

Long and short/No transaction costs

Trading prices

Momentum trading does not beat the passive trading strategies

Sun et al. (2020)

42 cryptocurrencies

Daily

Jan-2018 to Jun-2018

LightGBM, support vector machines (SVM) and random forests (RF)

Classification

–

Trading data and macroeconomic variables

LightGBM outperforms SVM and RF, and the accuracy is higher for 2-week predictions
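
The "Trading strategies (positions/trading costs)" column combines two things: which positions a strategy may take (long only, or long and short) and the cost charged per trade. The sketch below illustrates one common way such entries could translate into net returns. It assumes a proportional cost charged on every change of position and a strategy that starts flat. The function names and the sample numbers are hypothetical and are not taken from any of the studies listed.

```python
def strategy_returns(asset_returns, positions, cost_rate):
    """Net per-period returns of a position series under proportional costs.

    positions[t] is the position held over period t:
    1 = long, 0 = out of the market, -1 = short.
    Assumption: a cost of cost_rate * |change in position| is charged
    whenever the position changes, and the strategy starts flat.
    """
    net = []
    prev = 0
    for r, pos in zip(asset_returns, positions):
        turnover = abs(pos - prev)  # 1 to open or close, 2 to flip long/short
        net.append(pos * r - cost_rate * turnover)
        prev = pos
    return net


def cumulative_return(returns):
    """Compound a list of simple per-period returns."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


# Illustrative inputs only (not data from any study in the table).
asset_returns = [0.02, -0.01, 0.03, -0.02]
long_only = [1, 0, 1, 0]      # "Long": in the market or out of it
long_short = [1, -1, 1, -1]   # "Long and short": flips sign instead of exiting

for label, positions in [("long", long_only), ("long and short", long_short)]:
    for cost in (0.0, 0.002):
        net = strategy_returns(asset_returns, positions, cost)
        print(label, cost, round(cumulative_return(net), 6))
```

Under these assumptions a long-and-short strategy pays twice the cost of a long-only one each time it reverses, which is one reason results reported with "no trading costs" are not directly comparable with those reported net of costs.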