On the robust drivers of cryptocurrency liquidity: the case of Bitcoin

Abstract

This study aims to identify the factors that robustly contribute to Bitcoin liquidity, employing a rich range of potential determinants that represent unique characteristics of the cryptocurrency industry, investor attention, macroeconomic fundamentals, and global stress and uncertainty. To construct liquidity metrics, we compile 60-min high-frequency data on the low, high, opening, and closing exchange rates of Bitcoin against the US dollar. Our empirical investigation is based on the extreme bounds analysis (EBA), which can resolve model uncertainty issues. The results of Leamer’s version of the EBA suggest that the realized volatility of Bitcoin is the sole variable relevant to explaining liquidity. With Sala-i-Martin’s variant of the EBA, however, four more variables (viz., Bitcoin’s negative returns, trading volume, hash rates, and Google search volume) are also labeled as robust determinants. Accordingly, our evidence confirms that Bitcoin-specific factors and developments, rather than global macroeconomic and financial variables, matter for explaining its liquidity. The findings are largely insensitive to our proxy of liquidity and to the estimation method used.

Introduction

Over the past decade, cryptocurrencies have witnessed startling growth in market value, general acceptance, institutional client bases, and public attention. The evolution of Bitcoin as an impeccable payment facility and novel investment avenue has piqued the interest of many market participants, such as investors, regulators, e-commerce managers, and policymakers. When traditional financial markets are in turmoil due to sudden exogenous shocks (e.g., the outbreak of pandemic diseases, cross-border financial turbulence, and geo-economic and geo-political tensions), an increasing number of investors are inclined to embrace Bitcoin as a unique diversifier and/or safe haven alternative. For example, Kumar and Padakandla (2022) establish the safe-haven characteristics of Bitcoin for NASDAQ and EURO STOXX in the short- and long-term horizons during the COVID-19 pandemic. Ustaoglu (2022) finds that Bitcoin and Ethereum are effective safe haven assets against most emerging market stock returns during the global health crisis. Given its crucial implications for the global financial system, the cryptocurrency sphere has been examined by a large number of studies to better understand its key attributes. Specifically, several aspects of cryptocurrency market microstructure have come under rigorous investigation in the literature, including, for instance, price discovery (e.g., Brauneis and Mestel 2018; Dimpfl and Peter 2021), market efficiency (Kristoufek and Vosvrda 2019; Manahov and Urquhart 2021; Noda 2021), information asymmetry (e.g., Chen 2019; Feng et al. 2018), trading patterns (e.g., Hasso et al. 2019; Petukhina et al. 2021), transaction costs (e.g., Dyhrberg et al. 2018; Easley et al. 2019; Kim 2017), and intraday trading activity (e.g., Eross et al. 2019; Pelster et al. 2019). Based on a sample of the largest cryptocurrencies, Gkillas et al. (2018) conclude that extreme correlation can be linked to cryptocurrency market trajectories rather than prevailing price swings. Extreme correlation tends to increase in market downturns for most cryptocurrency pairs, as opposed to during bull markets.

An immensely relevant property of the cryptocurrency market microstructure is liquidity, which Manahov (2021) defines as the ease and speed with which a given cryptocurrency can be converted into other peers or fiat money. Liquidity is a vital precondition for cryptocurrencies to effectively take up their role, whether as an unorthodox means of payment, an investment asset, or a safe haven commodity. Typically, a lack of liquidity in the conventional and cryptocurrency markets elevates traders’ transaction costs, gives rise to informational inefficiency, and makes it possible to manipulate prices. Bitcoin is notorious for its erratic price behavior, with traders attempting to determine the reasons underlying these gyrations, which can eventually impact its liquidity. Indeed, a thorough examination of how price and liquidity levels change over time is necessary to understand the variables influencing the rapidly growing cryptocurrency markets and their growing integration into the global financial system. Unlike the trading mechanisms of standard securities, cryptocurrency peers are mostly order-driven, implying that a trade counterparty may not be available immediately. Hence, the cryptocurrency market’s liquidity supply is endogenous, meaning that traders mostly provide it through order placement. Under harsh financial conditions, investors are not obliged to secure the liquidity necessary for the smooth functioning of the cryptocurrency market (Cheng et al. 2021). As Zhang and Li (2021) highlight, adequate knowledge of cryptocurrency pricing mechanisms requires investigating how liquidity is reflected in prices. Dimpfl and Mäckle (2020) point out that the proper functioning of Bitcoin entails that exchange platforms offer diverse trading opportunities to the extent that orders are swiftly fulfilled without a significant price impact. 
Smales (2019) indicates that when assessing the viability of Bitcoin as a safe haven alternative, it is of great importance to inspect not only its correlation with other assets but also its own liquidity and price discovery aspects. Brauneis et al. (2021a) maintain that the speed and smoothness with which people and businesses can exchange their Bitcoin holdings for fiat money are critical for widespread adoption. Consequently, liquidity has a prominent influence on the appeal of cryptocurrency.

An equally important, yet sparsely addressed, issue in the literature concerns the factors that explain cryptocurrency liquidity. Because of its practical implications for financial stability authorities and the supply and demand sides of the market, the question of what drives the liquidity of equities, bonds, and fiat currencies has been examined extensively (e.g., Chordia et al. 2001; Cumming et al. 2011; Karnaukh et al. 2015; Manganelli and Wolswijk 2009; Todorov 2020). By contrast, comparatively less effort has been devoted to identifying the potential drivers of cryptocurrency liquidity. Zhang and Gregoriou (2020) assert that owing to the apparent peculiarities of the cryptocurrency market, determining the dynamics that affect its liquidity has become an urgent topic. Yue et al. (2021) indicate that although much of the Bitcoin literature focuses on price efficiency and its determinants, the liquidity component and its drivers have received little attention.

A crucial issue related to the identification of the determinants of a given phenomenon (e.g., Bitcoin liquidity) is model uncertainty, which arises when a researcher does not possess background knowledge of the true variables that must be included in the regression specification (e.g., Chatfield 1995; Avramo 2002). A given covariate could prove statistically significant within a specific group of conditioning variables but might not do so in the context of a set of competing models. In other words, one may find that \({x}_{1}\) is statistically related to \(y\) in the presence of \({x}_{2}\) and \({x}_{3}\), but that this relationship disappears once the right-hand side of the regression model is augmented with \({x}_{4}\). Hence, in the absence of an explicit theoretical foundation, researchers may be tempted to experiment with multiple combinations of regressors, each generating different conclusions. Durham (2000) and Cremers (2002) point out that there are instances in which scholarly works present a wide range of factors explaining a particular phenomenon, but there is very little agreement across these works on what the most relevant factors are. In these circumstances, a researcher may perform a data-fitting exercise and publish only the best-fitting finding that matches their prior beliefs while not reporting other findings that do not. To do so, they may resort to cherry-picking covariates and sample periods that yield statistically significant findings, thus raising concerns regarding model uncertainty. Chatfield (1995) and Hafner-Burton (2005) argue that these practices of model shopping and p-hacking can be found in sub-model selection studies that involve discrete data analysis, time-series analysis, generalized linear modeling, and ANOVA. Managing model uncertainty is an important aspect of statistical modeling and predictive analytics. This requires careful consideration of the sources of uncertainty and the development of appropriate techniques to quantify and mitigate the impact of uncertainty on model predictions.

Against this backdrop, we undertake an empirical inquiry into the sturdy determinants of Bitcoin liquidity while considering the issue of model uncertainty. We evaluate the robustness of a broad collection of candidate factors widely recognized in the literature as key explanatory variables of liquidity. The main question addressed in this study is as follows:

  • In the presence of various competing models, which factors contribute robustly to Bitcoin liquidity?

We add to the existing body of research in three ways. First, our primary contribution to the literature is identifying robust drivers of Bitcoin liquidity while addressing the issue of model uncertainty. Although some research efforts (e.g., Brauneis et al. 2021b; Choi 2021; Corbet et al. 2022; Dimpfl and Mäckle 2020; Eross et al. 2019; Fink and Johann 2014; Marshall et al. 2019; Yao et al. 2021; Yue et al. 2021) have been devoted to pinning down the factors that contribute to the liquidity of cryptocurrencies, they fail to consider the issue of variable selection. This study is directly related to this burgeoning strand of research but distinguishes itself by handling the model uncertainty inherent in variable selection for regression models. For this purpose, we perform an extreme bounds analysis (EBA), which was first proposed by Leamer (1983, 1985) and Levine and Renelt (1992) and subsequently enhanced by Sala-i-Martin (1997). The chief advantage of the EBA is its ability to address model uncertainty by establishing the robustness or fragility of the parameter in question for any possible alteration in the conditioning set of information. In practice, this approach allows us to conduct a systematic sensitivity analysis to determine whether a candidate covariate is correlated with the dependent variable, regardless of the subset of conditioning variables incorporated in the regression model. Numerous studies in the fields of politics, economics, and finance (e.g., Ahmed 2022a; Gassebner et al. 2013; Hartwig and Sturm 2014; Kim et al. 2019; Moosa and Cardak 2006; Sturm and Williams 2010) implement the EBA to account for possible uncertainty related to the model structure. However, to the best of our knowledge, no study has adopted a global sensitivity approach to uncover robust drivers of Bitcoin liquidity. The analysis presented in this study attempts to bridge this gap in the literature. 
It is worth mentioning that our work is similar in spirit to Ahmed’s (2022b), with some methodological and sampling differences. In terms of methodology, Ahmed (2022b) deploys LASSO-based algorithms to uncover the factors that contribute to the liquidity of Bitcoin, whereas our study relies on the EBA. Both techniques can be used for variable selection but differ in their underlying assumptions and methods. The LASSO method estimates a sparse linear regression model by adding a penalty term on the regression coefficients. By contrast, the EBA approach tests the robustness of regression results to changes in the set of predictors included in the model. In addition, the LASSO method assumes that the true regression coefficients are sparse in the sense that only a few predictors are truly important for the outcome variable (Tibshirani 1996). The EBA, on the other hand, does not depend on any assumptions about the sparsity of the true regression coefficients. Nevertheless, it requires that the set of predictors not be too large; otherwise, the computational burden becomes unwieldy. Regarding sampling, our results are based on a longer sample period than that used in Ahmed (2022b).

Second, we evaluate the explanatory power of 18 candidate factors corresponding to significant actors in the global economic and financial scene. The selected factors are cryptocurrency-specific attributes (Bitcoin’s signed returns and volatility, trading volume, transaction fees, hash rate, number of bitcoins mined, number of transactions, and total market capitalization), public attention (Google search volume), macroeconomic and financial factors (benchmark stock indices of the US and Europe, spot exchange rates of EUR/USD, term spread, and gold markets), and global uncertainty and stress (US economic policy uncertainty, fear, and stress indicators).

Finally, our evidence contributes to a common understanding and identification of the factors affecting Bitcoin liquidity. Since such factors could be a source of liquidity-related concerns, the results may benefit central banks and international bodies responsible for maintaining financial stability.

The remainder of this paper is organized as follows: Sect. "Literature review and hypotheses development" provides a concise review of prior research and presents key hypotheses. Sect. "Methodology" outlines the EBA methodology. Sect. "Data description" describes the data and liquidity proxies, and Sect. "Empirical evidence" discusses the results. Sect. "Additional analyses" presents robustness checks. Finally, concluding remarks are offered in Sect. "Conclusion".

Literature review and hypotheses development

Related research

Due to the multidimensional nature of market liquidity, an all-encompassing definition and measurement of this concept continues to be challenging. Many authors (e.g., Bernstein 1987; Le and Gregoriou 2020; Hasbrouck and Schwartz 1988; Naik and Reddy 2021) maintain that a theoretically coherent and unequivocally accepted definition of liquidity remains an open question. As Vayanos and Wang (2011) elaborate, the lack of liquidity can be traced back to the existence of market imperfections, such as search frictions, asymmetric information, funding constraints, imperfect competition, participation costs, and transaction costs. Sarr and Lybek (2002) highlight the main aspects of market liquidity, including depth (i.e., trade orders are large in number), breadth (i.e., large-volume orders have a negligible pricing impact), tightness (i.e., low transaction fees and spreads), resiliency (i.e., the promptness with which order imbalances are resolved), and immediacy (i.e., the speed of order execution). Nonetheless, the relative importance of these characteristics tends to differ depending on overall market circumstances. For example, in times of economic stability, liquidity might be more suggestive of reasonable transaction costs. In contrast, in periods of economic downturn, liquidity might be more indicative of immediacy and resiliency.

Unlike traditional financial markets, fast-developing cryptocurrency counterparts have experienced frequent price climbs, dramatic collapses, and bubbles and bursts in recent years. In such circumstances, efficient conversion between virtual currencies and their fiat peers has become a priority. Motivated by the practical implications of cryptocurrency liquidity, numerous studies have explored liquidity dynamics in different market states and across trading venues. For instance, by comparing the average liquidity levels of Bitcoin, Ethereum, Litecoin, and Ripple, Brauneis et al. (2021b) establish that Bitcoin (Ripple) is the most (least) liquid cryptocurrency. Loi (2018) and Marshall et al. (2019) demonstrate that liquidity varies across trading venues and currency pairs. Manahov (2021) finds that traders amplify the demand for cryptocurrency liquidity during extreme price movements. Takaishi and Adachi (2020) demonstrate that as the liquidity of the Bitcoin market rises over time, so does its efficiency. Smales (2019) finds that compared to traditional assets, Bitcoin is less liquid, excessively volatile, and more expensive in transaction fees. Sensoy (2019) and Al-Yahyaee et al. (2020) demonstrate that higher levels of liquidity (volatility) tend to boost (decrease) cryptocurrency market efficiency. Zaremba et al. (2021) find that large (small and medium) cryptocurrency markets exhibit substantial short-term return momentum (reversal) effects driven by high (low) levels of liquidity. In a similar spirit, Begušić and Kostanjčar (2019) provide evidence of mean reversion (momentum effects) in illiquid (highly liquid) cryptocurrencies. In an event study, Yue et al. (2021) establish that cryptocurrency liquidity tends to rise (decline) following announcements of good (bad) global news. Additionally, the effect of good news on liquidity lasts longer than does that of bad news. 
Zhang and Li (2021) find that cryptocurrencies with higher (lower) liquidity show smaller (larger) future returns. Shi (2017) shows that the launch of bitcoin futures trading reduces (increases) the intraday volatility (liquidity) of the spot market. Zhang and Gregoriou (2020) establish that, in the aftermath of China’s official announcement to ban initial coin offerings (ICOs) on September 4, 2017, major crypto markets temporarily suffered negative abnormal returns and lower levels of liquidity. Dong et al. (2022) demonstrate that funding liquidity, measured by the federal funds rate, is positively correlated with the illiquidity of cryptocurrencies, suggesting that a contractionary monetary policy can reduce cryptocurrency liquidity. Moreover, the results reveal several non-fundamental stock-market anomalies (e.g., turnover ratio, trading volume, idiosyncratic volatility, and dollar volume volatility) in the cryptocurrency market. Trimborn et al. (2020) develop a liquidity-bounded risk–return optimization (LIBRO) approach that seeks to optimize the risk-return balance of a portfolio consisting of financial assets (i.e., stocks, bonds, and commodities) and cryptocurrencies subject to liquidity constraints. Based on the LIBRO method, the results show that adding cryptocurrencies other than Bitcoin to a portfolio with traditional investment assets remarkably enhances the risk-return trade-off.

Although a particularly pertinent topic, the factors that explain the liquidity of cryptocurrencies remain much less investigated than those of mainstream financial assets. Only a few studies have endeavored to identify the sources of variation in the liquidity of Bitcoin and other virtual currencies. Fink and Johann (2014) indicate that investor activity is a primary factor in determining Bitcoin liquidity, as opposed to Google search queries, trading volumes, and average mining costs. Using various LASSO-based methods, Ahmed (2022b) finds that cryptocurrency hacking incidents, trading volume, realized volatility of Bitcoin prices, Google search volume, and Ethereum liquidity are the most important determinants of Bitcoin liquidity. Dimpfl and Mäckle (2020) investigate the liquidity factors in Kraken, a globally prominent trading venue. They document that trading platform-specific features (i.e., trading activity, microstructure noise, and volatility) and blockchain attributes (i.e., overall transaction volumes, hash rates, and miners’ compensation) matter in explaining crypto-platform liquidity. Corbet et al. (2022) demonstrate a tremendous increase in cryptocurrency market liquidity following the WHO’s official declaration of a worldwide pandemic in 2020. Their results indicate a substantial interaction between return volatility and cryptocurrency liquidity during the COVID-19 pandemic. Scharnowski (2021) establishes that Bitcoin liquidity has improved immensely over time. Moreover, liquidity is highly associated with Bitcoin-specific factors but is only marginally affected by US economic and financial influences. Based on intraday trade and quote data from three US cryptocurrency exchanges (i.e., Gdax, Gemini, and Kraken), Dyhrberg et al. 
(2018) find that average quoted spreads (i.e., the price of liquidity provision) and average effective spreads (i.e., the execution cost when traders demand liquidity) are inversely related to the mean volatility and number of trades. The results of Choi (2021) indicate that active investor attention, as measured by the number of Bitcoin tweets, triggers a real-time increase in liquidity for approximately 60 min. Ghabri et al. (2021) document weak time-varying correlations between Bitcoin liquidity innovations and their counterparts in mainstream financial markets, suggesting that investors may alleviate liquidity risks by including Bitcoin in their conventional asset portfolios. Leirvik (2022) demonstrates a positive association between the idiosyncratic volatility of cryptocurrency liquidity and expected returns, implying that investors demand a risk premium to hold cryptocurrencies whose liquidity levels fluctuate substantially over time.

Hypotheses development

A perusal of the literature discussed above reveals that much less attention has been devoted to uncovering the factors that contribute to the liquidity of cryptocurrencies. Interestingly, the sporadic research on this topic seems to overlook the issue of model uncertainty despite its practical implications for portfolio construction, risk management, and decision-making processes. The dynamics of cryptocurrency markets are distinct from those of traditional financial markets, implying that cryptocurrency liquidity formation may differ from that of other asset markets. Several factors may impact the liquidity of cryptocurrencies, and their identification is vital for making informed decisions and fostering the long-term sustainability of this fast-growing market. Investors and market participants must consider these factors when trading Bitcoin. Extending this incipient stream of research, we examine the robustness of a broad collection of crypto-industry-specific and external variables in determining the liquidity of Bitcoin. The analysis relies on the EBA methodology, which can address the issue of model specification uncertainty. As elaborated in Sect. "Candidate determinants", the proposed explanatory factors comprise 18 variables representing the Bitcoin industry’s primary features and global economic and financial systems. Our candidate determinants reflect four broad dimensions: cryptomarket characteristics, public attention, macroeconomic and financial development, and global uncertainty and stress. Accordingly, considering the above discussion of related literature and our research question, we test the following hypotheses:

H 1

Bitcoin liquidity is robustly affected by crypto market characteristics.

H 2

Bitcoin liquidity is robustly affected by public attention.

H 3

Bitcoin liquidity is robustly affected by macroeconomic and financial developments.

H 4

Bitcoin liquidity is robustly affected by global uncertainty and stress.

Methodology

As indicated in the Introduction, Leamer (1983, 1985) and Levine and Renelt (1992) set forth the first version of the EBA, which has been criticized for being extremely restrictive concerning the binary criterion of robustness/fragility. Subsequently, Sala-i-Martin (1997) proposed a less stringent method. Succinct descriptions of both variants are provided in this section.

Leamer’s variant of EBA

The objective of the EBA is to verify whether the relationship between a particular response variable \(Y\), and a covariate of interest \(X\), is robust to any changes made to the conditioning information set. The basic idea of EBA is to systematically vary the group of potential explanatory variables to determine whether the linkage between the response variable and a given factor remains robust. The robustness requirement is satisfied only if (i) the estimated coefficient of \(X\) maintains its statistical significance and (ii) the associated sign remains unaltered in the presence of diverse subsets of other regressors included in the analysis. Leamer (1985) and Levine and Renelt (1992) point out that EBA considers a real-world scenario in which one is interested in determining the robustness of a particular variable, \(Q\), in the presence of a myriad of \(U\) putative variables that have been identified in the literature as possible determinants of a phenomenon. There are four steps to carry out the EBA. First, we estimate the following form of a baseline linear regression model:

$${BL}_{i} = {\alpha }_{i} + {\beta }_{i} Q + \sum_{j=1}^{m}{\eta }_{i,j} {X}_{j} + \sum_{k=1}^{n}{\vartheta }_{i,k} {Z}_{k} + {\upsilon }_{i}$$
(1)

where \(i\) indexes the universe of regression specifications to be estimated, \(BL\) represents Bitcoin liquidity, \(\alpha\) is a constant term, \(Q\) is the focus variable in question, where \(Q\in U\), \(X\) denotes a vector of relevant regressors (i.e., free variables) that are incorporated in each regression run due to strong theoretical and/or empirical pertinence, \(Z\) is a vector of doubtful factors that may influence \(BL\) and are included in each regression but in different combinations, where \(Z\in U\), and \(\upsilon\) is the disturbance term. With a subset size of \(N\) doubtful variables drawn from the remaining \(U-1\) variables, the total count of regression models estimated for a specific focus variable is given by

$$P = \frac{(U-1)!}{N!\left(U-1-N\right)!} \quad \text{for} \quad U-1>N>1$$
(2)
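
To give a sense of scale for Eq. (2), the count can be evaluated directly. The sketch below is illustrative only: `U = 18` mirrors the 18 candidate factors mentioned in the Introduction, whereas `N = 3` doubtful variables per regression is an assumed value chosen for the example, and `n_models` is a hypothetical helper name.

```python
from itertools import combinations
from math import factorial


def n_models(U: int, N: int) -> int:
    """Number of regressions per focus variable, following Eq. (2)."""
    if not (U - 1 > N > 1):
        raise ValueError("Eq. (2) requires U - 1 > N > 1")
    return factorial(U - 1) // (factorial(N) * factorial(U - 1 - N))


U, N = 18, 3  # U taken from the text; N is an assumed illustrative value
P = n_models(U, N)

# Cross-check by enumerating every subset of N doubtful variables
# drawn from the U - 1 candidates other than the focus variable.
subsets = list(combinations(range(U - 1), N))
assert len(subsets) == P
print(P)  # 680
```

Under these assumed values, each focus variable would require 680 regressions, which illustrates why the text later cautions that the candidate set cannot grow too large.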

In the second step, we re-estimate Eq. (1) \(P\) times, where each estimation run includes a different combination of doubtful variables. Both the estimated coefficient, \({\widehat{\beta }}_{i}\), pertaining to \(Q\) and its corresponding standard deviation, \({\widehat{\sigma }}_{i}\), are extracted from each regression. Third, based on the second step, we pinpoint the maximum and minimum values of \(\widehat{\beta }\) and employ them to determine its upper and lower bounds, respectively. The upper bound is the largest value of \(\widehat{\beta }\) plus \(\tau \widehat{\sigma }\). Conversely, the lower bound is computed as the lowest value of \(\widehat{\beta }\) minus \(\tau \widehat{\sigma }\), where \(\tau\) is the critical z-value associated with a given confidence level (e.g., 1.96 and 2.58 for the 0.95 and 0.99 confidence levels, respectively). In the final step, we determine the robustness, or lack thereof, of the factor in question, \(Q\), concerning the dependent variable. According to Leamer (1985), \(Q\) is fragile if its extreme bounds are not of the same sign or if its coefficient is statistically insignificant in even one regression. Otherwise, \(Q\) is deemed robust because it withstands all possible changes in the model specifications. Leamer’s EBA variant has been criticized by several authors (e.g., McAleer et al. 1985; McAleer and Veall 1989; Granger and Uhlig 1990; Hendry and Krolzig 2004) for being predicated on an extremely difficult inferential requirement seldom encountered in practice.
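
The third and fourth steps can be sketched in a few lines, assuming the \(P\) regressions have already been run and their estimates collected. The function name `leamer_eba` and the numerical inputs are hypothetical. The bounds follow a literal reading of the description above (the extreme \(\widehat{\beta }\) plus or minus \(\tau\) times the \(\widehat{\sigma }\) of that same regression); some implementations instead take the extremes of \(\widehat{\beta }\pm \tau \widehat{\sigma }\) across all regressions.

```python
def leamer_eba(estimates, tau=1.96):
    """Leamer-style extreme bounds from a list of (beta_hat, sigma_hat) pairs.

    Returns (lower_bound, upper_bound, is_robust).
    """
    # Regressions delivering the largest and smallest coefficient estimates.
    b_max, s_max = max(estimates, key=lambda e: e[0])
    b_min, s_min = min(estimates, key=lambda e: e[0])

    upper = b_max + tau * s_max
    lower = b_min - tau * s_min

    # Robust only if both bounds share a sign AND every single
    # coefficient is individually significant at the chosen level.
    same_sign = lower > 0 or upper < 0
    always_significant = all(abs(b) > tau * s for b, s in estimates)
    return lower, upper, same_sign and always_significant


# Hypothetical estimates for one focus variable across three regressions.
stable = [(0.50, 0.10), (0.42, 0.08), (0.61, 0.12)]
lower, upper, robust = leamer_eba(stable)

# A single imprecise estimate is enough to render the variable fragile.
unstable = [(0.50, 0.10), (0.05, 0.10)]
_, _, robust_unstable = leamer_eba(unstable)
```

The second example shows how demanding the criterion is: one insignificant coefficient among all the regressions suffices to label the variable fragile.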

Sala-i-Martin’s variant of EBA

As an alternative to the stringent binary robustness criterion of Leamer’s approach, Sala-i-Martin (1997) suggests considering the entire distribution of \(\widehat{\beta }\) (i.e., \({\left\{{\widehat{\beta }}_{i}\right\}}_{i=1}^{P}\)), as opposed to only its maximum and minimum bounds. He points out that one can rely on a cumulative distribution function (CDF) to assess the robustness of the relationship between a response variable and a set of potential explanatory variables. The author proposes computing the cumulative distribution function of \({\left\{{\widehat{\beta }}_{i}\right\}}_{i=1}^{P}\) at zero (i.e., \({\text{CDF}}\left(0\right)\)), employing the mean and variance of the distribution. Sala-i-Martin (1997) emphasizes that a researcher can infer that the variable in question is robust if most of \({\left\{{\widehat{\beta }}_{i}\right\}}_{i=1}^{P}\) lie on the same side of zero. For this purpose, Eq. (1) is estimated \(P\) times, and the coefficient under consideration, \({\widehat{\beta }}_{i}\), its standard deviation, \({\widehat{\sigma }}_{i}\), and the integrated likelihood, \({\mathcal{L}}_{i}\), are derived from each regression run. The next stage is to apply a likelihood-weighting scheme, in which regression models with a better fit are assigned greater weights. Accordingly, the weight \({W}_{Qe}\) for the \(e\)th model of focus variable \(Q\) is calculated as

$${W}_{Qe} = \frac{{\mathcal{L}}_{Qe}}{\sum_{i=1}^{P}{\mathcal{L}}_{Qi}}, \quad \text{where} \quad \sum_{i=1}^{P}{W}_{Qi}= 1$$
(3)

Sala-i-Martin (1997) provides two mutually exclusive scenarios in which the \({\text{CDF}}\left(0\right)\) can be determined. In the first scenario, the \({\widehat{\beta }}_{i}\) are not assumed to follow any particular distribution, while in the second, all \({\widehat{\beta }}_{i}\) are assumed to follow a normal distribution. In the more generic case, the \({\text{CDF}}(0)\) is computed for each of the \(P\) model specifications, and the aggregate \({\text{CDF}}(0)\) of \(\widehat{\beta }\) is derived as an average of the respective values of the \({\text{CDFs}}(0)\), weighted by the integrated likelihoods. In other words, if \({\Phi }_{Qi} \left(0|{\widehat{ \beta }}_{Qi} ,{\widehat{\sigma }}_{Qi}^{2}\right)\) denotes the \(i\)th \({\text{CDF}}\left(0\right)\), then we can compute the aggregate \({\text{CDF}}(0)\) for non-normal \(\widehat{\beta }\) as follows:

$${\Phi }_{Q}\left(0\right)= \sum_{i=1}^{P}{W}_{Qi} {\Phi }_{Qi} \left(0| {\widehat{\beta }}_{Qi} ,{\widehat{\sigma }}_{Qi}^{2}\right)$$
(4)
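
Eqs. (3) and (4) can be illustrated with a minimal sketch. It assumes that each individual \({\Phi }_{Qi}\) is a normal CDF with mean \({\widehat{\beta }}_{Qi}\) and variance \({\widehat{\sigma }}_{Qi}^{2}\), as the notation suggests; the function names and input values are hypothetical.

```python
from statistics import NormalDist


def likelihood_weights(likelihoods):
    """Eq. (3): normalize integrated likelihoods so the weights sum to one."""
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]


def aggregate_cdf0(betas, sigmas, likelihoods):
    """Eq. (4): likelihood-weighted average of the individual CDF(0) values."""
    weights = likelihood_weights(likelihoods)
    return sum(
        w * NormalDist(mu=b, sigma=s).cdf(0.0)
        for w, b, s in zip(weights, betas, sigmas)
    )


# Hypothetical estimates from two regressions with equal likelihoods.
betas = [0.5, 0.4]
sigmas = [0.1, 0.2]
likelihoods = [1.0, 1.0]

weights = likelihood_weights(likelihoods)
cdf0 = aggregate_cdf0(betas, sigmas, likelihoods)

# Share of probability mass on the dominant side of zero,
# to be compared with the 95 percent threshold.
mass_one_side = max(cdf0, 1.0 - cdf0)
```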

Under the normal distribution assumption, on the other hand, the weighted means of \({\widehat{\beta }}_{i}\) and \({\widehat{\sigma }}_{i}^{2}\) are defined as follows:

$$\overline{{\widehat{\beta } }_{Q}}= \sum_{i=1}^{P}{W}_{Qi} {\widehat{\beta }}_{Qi}$$
(5)
$$\overline{{\widehat{\sigma } }_{Q}^{2}}= \sum_{i=1}^{P}{W}_{Qi} {\widehat{\sigma }}_{Qi}^{2}$$
(6)

Based on the respective values of \(\overline{{\widehat{\beta } }_{Q}}\) and \(\overline{{\widehat{\sigma } }_{Q}^{2}}\), the \({\text{CDF}}(0)\) can be computed using the Gaussian distribution, such that \(\beta \sim \mathcal{N}(\overline{{\widehat{\beta } }_{Q}}, \overline{{\widehat{\sigma } }_{Q}^{2}})\). According to Sala-i-Martin (1997), a focus variable can be accurately described as robust if at least 95 percent of its density function is on the same side of zero. This implies that the associated coefficient estimates display the same sign and are statistically significant in no less than 95 percent of the regression models, supplemented with all possible subset combinations of \(Z\) variables.
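
The normal case of Eqs. (5) and (6) can be sketched in the same spirit. The weights are taken as given (for instance, from Eq. (3)), and the function name and inputs are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def normal_case_cdf0(betas, sigmas, weights):
    """CDF(0) under normality, using the weighted means of Eqs. (5) and (6)."""
    beta_bar = sum(w * b for w, b in zip(weights, betas))        # Eq. (5)
    var_bar = sum(w * s ** 2 for w, s in zip(weights, sigmas))   # Eq. (6)
    cdf0 = NormalDist(mu=beta_bar, sigma=sqrt(var_bar)).cdf(0.0)
    return beta_bar, var_bar, cdf0


# Hypothetical estimates from two equally weighted regressions.
beta_bar, var_bar, cdf0 = normal_case_cdf0(
    betas=[0.5, 0.4], sigmas=[0.1, 0.2], weights=[0.5, 0.5]
)

# The focus variable passes the 95 percent criterion if enough of the
# density lies on one side of zero.
is_robust = max(cdf0, 1.0 - cdf0) >= 0.95
```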

Histograms can be used to demonstrate individual \({\text{CDFs}}\) as graphical EBA outputs. These graphical representations can be useful in exploring the distribution of variables in a dataset, which can inform the selection of variables for inclusion in the EBA. The bar graphs show the calculated coefficient amplitudes and dispersions over the full range of the regression runs. Additionally, by inspecting the \({\text{CDF}}\) of t-statistics, which displays the percentage of statistically significant coefficients at conventional levels, it is possible to confirm the statistical significance of \(\widehat{\beta }\).

It is worth mentioning that while the EBA is a valuable technique for appraising the robustness of empirical evidence, it is not without limitations (e.g., Granger and Uhlig 1990; Hendry and Mizon 1990; McAleer et al. 1985; McAleer and Veall 1989). First, Leamer’s (1985) variant of the EBA is excessively conservative in identifying the most robust factors. By considering only extreme combinations of variables, the EBA may fail to determine important variables that are not present in these extreme combinations but are still important in explaining the phenomenon under study. Second, the EBA assumes that all variables contribute equally to explaining the outcome variable, which is not always true. Some variables may have a larger effect than others, and the EBA does not take this into account. Moreover, EBA assumes that the regression coefficients are invariant to changes in the model’s specifications. Nevertheless, this assumption may not hold in practice, and the coefficients may vary depending on the variables included in the model. Third, extreme bound levels may result from complex models with “unreasonable” parameter estimates.

To overcome some of these drawbacks, Granger and Uhlig (1990) propose a modified version of the EBA called “Reasonable Extreme-Bounds Analysis” (REBA). REBA involves testing a wider range of variable combinations deemed “reasonable” based on prior knowledge and theory, in addition to the extreme combinations used in traditional EBA. This broader range of variable mixtures can help identify important regressors that may be missed by the EBA. Granger and Uhlig (1990) suggest adopting \({R}^{2}\) statistic as a criterion for assessing the REBA results. They indicate that within the REBA framework, extreme bounds emerge from regressions with \({R}^{2}\) values close to the maximum attainable value of \({R}^{2}\) over the entire space of the regression models. This means that models exhibiting low goodness-of-fit will be discarded. The authors argue that this relevant criterion can avoid overfitting, which occurs when a model is too complex and includes too many variables, leading to a high degree of noise and variance in estimates. By selecting the variable mixtures that produce the highest \({R}^{2}\), we could identify the most robust and parsimonious models.
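The \({R}^{2}\) screening idea can be sketched as follows. The tolerance `delta`, the cutoff rule `r2 >= (1 - delta) * r2_max`, and the two-standard-error bounds are assumptions chosen for illustration; they are not claimed to reproduce the exact rule of Granger and Uhlig (1990).

```python
def reasonable_extreme_bounds(results, delta=0.1, k=2.0):
    """Extreme bounds computed only over well-fitting regressions.

    results: list of (beta, std_err, r2) tuples, one per regression.
    delta:   hypothetical tolerance; models with r2 below
             (1 - delta) * max(r2) are discarded.
    k:       number of standard errors added to each estimate (assumed 2).
    """
    r2_max = max(r2 for _, _, r2 in results)
    kept = [(b, s) for b, s, r2 in results if r2 >= (1.0 - delta) * r2_max]
    lower = min(b - k * s for b, s in kept)
    upper = max(b + k * s for b, s in kept)
    return lower, upper, len(kept)

# Toy inputs: the third model fits poorly and would flip the sign of the bound.
toy = [(0.5, 0.1, 0.60), (0.4, 0.1, 0.58), (-0.3, 0.2, 0.20)]
lower, upper, n_kept = reasonable_extreme_bounds(toy)
print(lower, upper, n_kept)
```

In this toy case the poorly fitting specification is dropped before the bounds are taken, so it can no longer drive the lower bound below zero.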

Nonetheless, Granger and Uhlig (1990) acknowledge that when evaluating the goodness-of-fit of a linear regression model, \({R}^{2}\) can be misleading. For example, this statistic may be biased by including irrelevant variables, and a high \({R}^{2}\) value does not necessarily indicate that the model is a good fit for the data. Additionally, it offers no information regarding the functional form of the association between the outcome and predictor variables. Therefore, they recommend using \({R}^{2}\) in conjunction with other criteria, such as economic intuition and statistical significance, to evaluate the robustness of empirical findings.

Data description

The empirical investigation spans a sample period from January 12, 2015, to December 02, 2022, thus avoiding the first phase of cryptocurrency adoption, which is associated with numerous inactive trading hours. To further minimize the possibility of no-trading periods, our analysis revolves around Bitcoin, which has the largest user base among cryptocurrencies. The historical prices are obtained from https://www.Bitcoincharts.com and the trading venue of interest is Bitstamp, one of the world’s most popular and liquid platforms. The time-series observations are daily for almost all variables. Nevertheless, to construct liquidity metrics and realized volatility, we compile 60-min high-frequency data on the low, high, opening, and closing exchange rates of Bitcoin against the US dollar. Rather than adopting smaller intraday time frames (e.g., 1-min or 5-min sampling frequency), we opt for 60-min intervals for two reasons. First, higher-frequency data, such as 5-min intervals, tend to contain more noise and microstructure effects owing to the larger number of observations within a trading day in cryptocurrency markets. This makes it more challenging to identify the underlying trends and patterns in the data. A 5-min frequency might lead to an excess of information, making it more difficult to distinguish significant signals from random fluctuations (Poon and Granger 2003). When using a 60-min frequency, noise and short-term fluctuations inherent in higher-frequency data and associated with confounding market microstructure effects will likely be smoothed out, allowing for a clearer picture of the respective market liquidity and volatility behaviors. Hence, hourly frequency data may provide a sufficient level of granularity while avoiding excessive noise that could distract from the primary analysis. Goyenko et al. (2009) compare liquidity benchmarks derived from high-frequency data with various widely used low-frequency liquidity proxies. They find that monthly and annual low-frequency proxies effectively capture high-frequency measures of transaction costs. Second, many studies on cryptocurrency liquidity and volatility base their empirical examination on 60-min sampling frequency (e.g., Bouri et al. 2021; Brauneis et al. 2021a, b; Corbet et al. 2020a, b; Dyhrberg et al. 2018; Gradojevic et al. 2023; Hansen et al. 2022; Jalan et al. 2021), and we follow suit.

Similar to Gkillas et al. (2021), Corbet et al. (2020b), and Bouri et al. (2021), we carry out a data curation process as a prelude to the empirical analysis. Bitcoin prices within periods of either zero trading activity or a low frequency of transactions (less than 3000 transactions per hour) are replaced with the last price traded. Note that the thresholds for thin and infrequent trading can differ across cryptocurrency, equity, and bond markets because of variations in market characteristics, investor behavior, and regulatory factors. Since our focus of interest is Bitcoin, which can be bought at any time, we rely on daily observations for all seven days of the week, including weekends. Time series of a weekday frequency (i.e., Monday to Friday) are converted to a daily 7‐day frequency. To this end, a piecewise constant interpolation is performed to fill in gaps on bank holidays and weekends. In line with Corsetti et al. (2005) and Forbes and Rigobon (2002), we account for non-synchronous trading hours using two-day rolling averages for the time series. To establish stationarity, we transform each data series into log first differences. The inverse hyperbolic sine (IHS) transformation is used for a time series that contains negative or zero values and is defined as (Ravallion 2017):

$$\widetilde{{\mathcal{z}}_{t}} = {\text{log}}\left({\mathcal{z}}_{t} + \sqrt{ 1 + {\mathcal{z}}_{t}^{2} }\right)$$
(7)

where \({\mathcal{z}}_{t}\) is the time series with zero or negative values.
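The transformations described above can be sketched in a few lines. The helper names and the toy series are assumptions introduced for illustration; only Eq. (7), the log first difference, and the piecewise constant (last-value) fill come from the text.

```python
import math

def ihs(z):
    """Inverse hyperbolic sine as written in Eq. (7)."""
    return math.log(z + math.sqrt(1.0 + z * z))

def log_first_difference(series):
    """Log first differences of a strictly positive series."""
    return [math.log(b) - math.log(a) for a, b in zip(series, series[1:])]

def forward_fill(values):
    """Piecewise constant fill: carry the last observed value over gaps (None)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# Toy weekday series with a weekend gap (hypothetical values).
weekday_series = [101.0, 102.5, None, None, 103.0]
daily = forward_fill(weekday_series)
print(daily)
print(log_first_difference(daily))

# Eq. (7) is defined for zero and negative inputs, unlike the plain logarithm.
print(ihs(0.0), ihs(-2.5), math.asinh(-2.5))
```

Eq. (7) coincides with the standard library's `math.asinh`, which is the numerically safer choice for large negative inputs because it avoids the cancellation in \({\mathcal{z}}_{t} + \sqrt{1 + {\mathcal{z}}_{t}^{2}}\).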

Liquidity proxy

Given the heterogeneity of the factors underlying the liquidity concept, the extant literature sets forth a mixture of proxies that reflect some, but not all, of these dimensions. According to Ametefe et al. (2016), metrics of liquidity can be classified into price impact measures (e.g., Amihud 2002; Hasbrouck and Schwartz 1988; Pástor and Stambaugh 2003), time-based measures (e.g., Donald et al. 1996; Peng 2001), return-based measures (e.g., Goyenko et al. 2009; Roll 1984), trading volume-based measures (e.g., Mann and Ramanlal 1996; Rouwenhorst 1999), and transaction cost-based measures (e.g., Chordia et al. 2001; Demsetz 1968; Hamao and Hasbrouck 1995). From a data frequency perspective, liquidity metrics can be estimated using either high-frequency (i.e., intraday) trade and order book data or low-frequency (i.e., daily, weekly, or monthly) transaction-based data. Although widely used in research, Amihud’s (2002) mean-adjusted illiquidity measure has certain limitations. For instance, it captures only the price impact dimension, suffers considerable size bias, and cannot reveal the trading frequency aspect of liquidity (Cochrane 2005; Florackis et al. 2011). Brauneis et al. (2021a) show that this measure fails to reflect the time-series variability in cryptocurrency liquidity.

In the empirical analysis, we adopt the bid/ask spread estimator developed by Corwin and Schultz (2012) (CS, hereafter). This metric is derived from high- and low-price data obtained from a time series with either low or high frequency. The CS estimator is based on the idea that the midpoint of bid and ask prices can be employed as a proxy for the true underlying value of an asset. The estimator measures the volatility of this midpoint price over a certain period to estimate the bid/ask spread. More explicitly, it calculates the squared returns of the midpoint price over two consecutive periods and then takes the average of the squared returns as an estimate of the spread. The CS estimator is designed specifically for high-frequency trading data, allowing for a more accurate estimation of bid/ask spreads. Moreover, it is robust to changes in market conditions, such as changes in liquidity and trading volume. The estimator also estimates the true trading costs associated with buying or selling financial assets, which can benefit investors when making informed decisions. Brauneis et al. (2021a) document that the CS measure demonstrates a superior ability to capture time-series variations in cryptocurrency liquidity. Schestag et al. (2016) find that the CS estimator satisfactorily captures changes in transaction costs in the US over-the-counter bond markets. Karnaukh et al. (2015) show that the CS estimator performs well in spot foreign exchange markets.

Consider a series of intraday high prices \({H}_{i}\) and low prices \({L}_{i}\). The highest and lowest prices for two successive subintervals \(i, i + 1\) of 60-min length in interval \(t\) are given as \({H}_{i, i+1}= \max \left({H}_{i} , {H}_{i+1}\right)\) and \({L}_{i, i+1}= \min \left({L}_{i} , {L}_{i+1}\right)\), respectively. We then produce the following sample estimates:

$$\widehat{\gamma } = {\left[{\text{ln}}\left(\frac{{H}_{i, i+1}}{{L}_{i, i+1}}\right)\right]}^{2}$$
(8)
$$\widehat{\vartheta } = {\left[{\text{ln}}\left(\frac{{H}_{i}}{{L}_{i}}\right)\right]}^{2} + {\left[{\text{ln}}\left(\frac{{H}_{i+1}}{{L}_{i+1}}\right)\right]}^{2}$$
(9)

A closed-form high-low spread estimator can be defined as follows, with some simplifying assumptions:

$${\widehat{CS}}_{i, i+1} = \frac{2 \left({e}^{\widehat{\alpha }} - 1\right)}{1 + {e}^{\widehat{\alpha }}}$$
(10)

where,

$$\widehat{\alpha } = \frac{\sqrt{2 \widehat{\vartheta }} - \sqrt{ \widehat{\vartheta }} }{3 - 2 \sqrt{ 2}} - \sqrt{\frac{\widehat{\gamma }}{3 - 2 \sqrt{ 2}}}$$
(11)

Here, \(e\) denotes the base of the natural logarithm. As the series of \(\widehat{CS}\) may contain negative values, we follow Corwin and Schultz (2012) and set them to zero. Finally, for each interval t, an unweighted mean of all the estimates of \(\widehat{CS}\) across all successive subintervals is generated.
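Eqs. (8) to (11), together with the zero floor and the unweighted averaging step, can be sketched as follows. The function names and the toy prices are assumptions made for illustration; this is not the authors' code.

```python
import math

K = 3.0 - 2.0 * math.sqrt(2.0)  # constant appearing in Eq. (11)

def cs_pair(h1, l1, h2, l2):
    """CS spread estimate for two successive subintervals, floored at zero."""
    gamma = math.log(max(h1, h2) / min(l1, l2)) ** 2               # Eq. (8)
    theta = math.log(h1 / l1) ** 2 + math.log(h2 / l2) ** 2        # Eq. (9)
    alpha = ((math.sqrt(2.0 * theta) - math.sqrt(theta)) / K
             - math.sqrt(gamma / K))                               # Eq. (11)
    spread = 2.0 * (math.exp(alpha) - 1.0) / (1.0 + math.exp(alpha))  # Eq. (10)
    return max(spread, 0.0)  # negative estimates are set to zero

def cs_interval(highs, lows):
    """Unweighted mean of the pairwise estimates within one interval t."""
    pairs = [cs_pair(highs[i], lows[i], highs[i + 1], lows[i + 1])
             for i in range(len(highs) - 1)]
    return sum(pairs) / len(pairs)

# Sanity check: if the range is pure bid/ask bounce (high 101, low 99 in both
# subintervals), the estimate equals (101 - 99) / 100, i.e. about 0.02.
print(cs_pair(101.0, 99.0, 101.0, 99.0))

# A strongly trending pair yields a negative raw estimate, floored to zero.
print(cs_pair(100.0, 99.9, 102.0, 101.9))
```

With 24 hourly bars per day, `cs_interval` averages 23 pairwise estimates. The first sanity check is useful because in that case Eq. (11) collapses to \(\widehat{\alpha } = {\text{ln}}(H/L)\), so Eq. (10) returns the relative spread around the midpoint.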

Candidate determinants

As indicated in the Introduction, due to the nascency and peculiarity of the Bitcoin market, only a few studies (e.g., Brauneis et al. 2021b; Corbet et al. 2022; Choi 2021; Dimpfl and Mäckle 2020; Eross et al. 2019; Fink and Johann 2014; Yue et al. 2021; Zhang and Gregoriou 2020) have been undertaken to identify which variables contribute most to its liquidity dynamics. In the sequel, we assess the roles of an extensive array of cryptocurrency-specific and external influences as robust liquidity determinants in the Bitcoin market. The candidate explanatory set comprises 18 variables reflecting the primary aspects of the cryptocurrency sphere and the global economic system. We examine the potential contributions of as many candidate determinants as possible. The selected factors represent crypto-market characteristics (Bitcoin’s signed returns and volatility, trading volume, transaction fees, hash rate, number of Bitcoins mined, number of transactions, and total market capitalization), public attention (Google search volume), macroeconomic and financial factors (benchmark stock indices of the US and Europe, spot exchange rates of EUR/USD, term spread, and gold markets), and global uncertainty and stress (US economic policy uncertainty, fear, and stress indicators). The choice of these variables is motivated by three considerations. First, based on a systematic literature survey, we select factors identified in prior studies as important drivers of liquidity. Second, data availability impeded the incorporation of other potential factors into the analysis. Third, following Levine and Renelt (1992), to limit the total number of regressors, the population of doubtful variables \(U\), from which a conditioning information set is selected for each regression analysis run, is kept reasonably small. Table 1 summarizes the variables and their respective data sources.

Table 1 Candidate determinants of Bitcoin liquidity

Correlation analysis

Given the large number of variables, a preliminary examination of their dependence structures is useful. A heatmap depicting pairwise correlations between variables is shown in Fig. 1. Blank cells denote correlation coefficients that are not statistically significant. Aquamarine (pale green) cells indicate a very weak positive (negative) association, while those in dark blue (dark brown) signify a very strong positive (negative) association between the two variables. A perusal of the heatmap grid reveals that most cross-correlations are either statistically insignificant (i.e., \(p\ge 0.10\)) or weak (i.e., \(\left|{{\text{r}}}_{{\text{xy}}}\right|<0.40\)). Nonetheless, there are some pairwise cases with moderately positive or negative correlation coefficients (i.e., \(0.40 \le \left|{{\text{r}}}_{{\text{xy}}}\right|<0.80\)), including Bitcoin liquidity-Bitcoin volatility (CS-BV), absolute returns-total market capitalization (ABSR-MC), transaction fees-number of transactions (TF-NT), transaction fees-EUR/USD exchange rate (TF-EXRATE), trading volume-European stock markets (TV-ESM), VIX-US stock markets (VIX-USM), and VIX-European stock markets (VIX-ESM). As a further step toward reducing potential multicollinearity, we enforce two rules. First, for a given \(Q\) variable, we limit the number of \(Z\) variables included in each regression to five, following Sala-i-Martin (1997) and Kim et al. (2019). Second, regression models with a variance inflation factor (VIF) above a threshold of 5 are excluded from the analysis.
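To make the correlation bands and the VIF threshold concrete, the sketch below classifies a Pearson coefficient and evaluates the VIF in the special case of exactly two regressors, where the VIF reduces to \(1/(1-{r}^{2})\). The label "strong" for \(\left|{{\text{r}}}_{{\text{xy}}}\right|\ge 0.80\), the function names, and the restriction to two regressors are simplifying assumptions; with more regressors, \({r}^{2}\) must be replaced by the \({R}^{2}\) from regressing one regressor on all the others.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Correlation bands used in the text; 'strong' is an assumed label."""
    a = abs(r)
    if a < 0.40:
        return "weak"
    if a < 0.80:
        return "moderate"
    return "strong"

def vif_two_regressors(r):
    """VIF when a model contains exactly two regressors with correlation r."""
    return 1.0 / (1.0 - r * r)

for r in (0.30, 0.60, 0.80, 0.90):
    vif = vif_two_regressors(r)
    print(r, strength(r), round(vif, 3), "excluded" if vif > 5.0 else "kept")
```

In this two-regressor case the threshold of 5 is crossed only when \(\left|r\right|\) exceeds \(\sqrt{0.8}\approx 0.894\), which is consistent with most specifications surviving the screen when pairwise correlations are weak or moderate.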

Fig. 1 A heatmap representation of the correlation structure of variables

Empirical evidence

An important procedure before performing EBA is to pinpoint the free variables (\(X\)), focus variables (\(Q\)), and doubtful variables (\(Z\)). Free variables are those whose theoretical and empirical relationships with a particular response variable have been demonstrated in the literature and are therefore widely accepted. Owing to their prominence, these variables appear in all the estimated models. The focus variables are those of one’s main interest and whose robust explanatory potential is under investigation. Doubtful variables are a conditioning set of covariates that change with each regression run. To provide a comprehensive assessment, we run the EBA without setting a free variable. Next, we evaluate the robustness of each of the 18 candidate factors by treating it as a focus variable in succession while designating the others (17 variables) as doubtful ones, which means that \(Q\) and \(Z\) are interchangeable. Additionally, for a given \(Q\) variable, we limit the number of \(Z\) variables included in each regression run to five (i.e., N = 5). The ordinary least squares (OLS) estimator with heteroskedasticity-robust standard errors (White 1980) is deployed to generate the EBA regression results (see Footnote 1). In all hypothesis testing, we choose a 0.05 significance level to test the null hypothesis that the individual parameter coefficients of each model are not different from zero. Our estimation is based on a more realistic assumption that the parameter estimates, \(\widehat{\beta }\), are not normally distributed. In the same spirit as Sala-i-Martin (1997) and Hartwig and Sturm (2014), we infer that the covariate of interest is robust only if its corresponding estimated coefficient has a \({\text{CDF}}\left(0\right)\ge 0.90.\)

Panels A, B, and C in Table 2 report the EBA results. Specifically, some summary statistics for coefficient estimates, \(\widehat{\beta }\), of each focus variable are listed in Panel A, while Panels B and C show the EBA estimation output from Leamer’s and Sala-i-Martin’s versions, respectively. Three salient observations can be made from Panel A. First, the estimated coefficients associated with Bitcoin’s negative returns and volatility, transaction fees, hash rates, S&P 500 index returns, S&P Europe 350 index returns, term spread, the US economic policy uncertainty index, VIX, and the global financial stress index are, on average, positively signed, which suggests that as these explanatory variables increase, so does the CS spread (i.e., a decrease in Bitcoin liquidity). On the other hand, the corresponding coefficients for the remaining independent variables have, on average, a negative sign, implying that positive changes in those regressors tend to positively impact Bitcoin liquidity. Second, regarding the magnitude of the parameter coefficients, \(\widehat{\beta }\), negative returns and volatility of Bitcoin (the number of transactions and EUR/USD exchange rates) appear to have the largest positive (smallest negative) influence on Bitcoin price spreads. Third, for US stock market returns, term spread, and VIX, we notice that the number of model specifications passing the VIF test declines to 6062, 6175, and 5986, respectively, from a maximum of 6188. This suggests that the VIF values for some of the estimated coefficients pertaining to the three variables are greater than the threshold level of 5; therefore, these coefficients are removed from the EBA.

Table 2 Estimation results of EBA

Panel B of Table 2 presents the estimation results based on Leamer’s demanding version of EBA. Interestingly, the realized volatility of Bitcoin (BV) is the only variable that survives the restrictive criterion of robustness since its coefficients remain statistically significant and retain the same sign across all possible combinations of doubtful variables. This robust determinant has a positive sign, implying that an increase in volatility leads to a decrease in Bitcoin liquidity. The other 17 regressors are considered fragile, given that their respective coefficients change sign or lose statistical significance at least once and thus fail to withstand alterations in the conditioning information set. It should be noted that the upper and lower bounds of \(\widehat{\beta }\) for global financial stress (economic policy uncertainty and EUR/USD exchange rates) are far apart from (close to) each other, which could be indicative of less (more) precision in their respective coefficient estimates.
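Leamer’s criterion can be sketched as a check on the two extreme bounds. The two-standard-error width `k=2.0`, the function name, and the toy estimates are assumptions for illustration and do not reproduce the values in Table 2.

```python
def leamer_bounds(estimates, k=2.0):
    """Extreme bounds over all regressions for one focus variable.

    estimates: list of (beta, std_err) pairs, one per specification.
    k:         width of the band in standard errors (assumed to be 2).
    Returns (lower, upper, robust); robust means both bounds share a sign.
    """
    lower = min(b - k * s for b, s in estimates)
    upper = max(b + k * s for b, s in estimates)
    robust = lower > 0.0 or upper < 0.0
    return lower, upper, robust

# Toy case 1: every specification is significantly positive.
print(leamer_bounds([(0.5, 0.1), (0.4, 0.1)]))

# Toy case 2: a single imprecise specification is enough to make the variable fragile.
print(leamer_bounds([(0.5, 0.1), (0.4, 0.1), (0.1, 0.1)]))
```

The second case illustrates why this variant is so demanding: one specification whose band straddles zero overturns the verdict, however many others agree.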

Paralleling our findings, several studies highlight the key role of volatility in explaining cryptocurrency liquidity. For example, Scharnowski (2021) shows that the bid-ask spreads of Bitcoin are positively associated with its volatility. Based on data for four major cryptocurrencies (Bitcoin, Ethereum, Litecoin, and Ripple) traded on Bitfinex, Bitstamp, Coinbase Pro, and Kraken, Brauneis et al. (2021b) find a significant positive relationship between realized volatility and illiquidity. Comparably, Dimpfl and Mäckle (2020) indicate that higher levels of Bitcoin volatility and greater microstructure noise induce less liquidity on Kraken. Marshall et al. (2019) demonstrate that volatility and the number of trades are the primary determinants of Bitcoin spreads. Using cross-sectional data on 456 cryptocurrencies, Wei (2018) finds that low volatility and high efficiency prevail in liquid markets, wherein active traders are expected to rule out the potential for return predictability. Koutmos (2018) shows that during episodes of low liquidity uncertainty, Bitcoin’s liquidity uncertainty is positively (negatively) correlated with returns and realized range volatility (market capitalization, trading volume, and transaction fees). In contrast, realized volatility is the sole determinant in times of high liquidity uncertainty. Corbet et al. (2022) report evidence of considerable dynamic interactions between liquidity changes and conditional volatilities in 12 cryptocurrencies before and during the COVID-19 pandemic. By contrast, some studies report results that contradict our findings. For instance, Dyhrberg et al. (2018) find an inverse relationship between volatility and the quoted and effective spreads of Bitcoin. Using daily and weekly frequency data for a group of the 12 most-traded cryptocurrencies, Będowska-Sójka et al. (2020) establish that high volatility Granger causes high liquidity. Eross et al. (2019) employ GMT-timestamped tick-level data for Bitcoin to investigate its intraday dynamics across four sample periods from 2014 to 2017. Among other results, they demonstrate the absence of significant causality between liquidity, on the one hand, and returns and realized volatility, on the other.

Next, we proceed to the results from Sala-i-Martin’s version of the EBA, as shown in Panel C of Table 2. A close look at the results indicates that crypto industry-specific influences and investor attention are relevant to explaining the liquidity of Bitcoin, while conventional financial market dynamics and global macroeconomic risks are not. Several main points are worth mentioning. First, as expected, under this less restrictive approach, four additional independent variables (i.e., Bitcoin’s negative returns, trading volume, hash rates, and Google search volume index) are no longer fragile. Thus, out of a pool of 18 candidate factors, only five seem to matter systematically for Bitcoin liquidity; consequently, they are labeled as robust determinants. Second, in terms of importance, realized volatility takes the overall lead with a \({\text{CDF}}\left(0\right)=100\) percent, followed by negative returns, trading volume, and Google search queries, with \({\text{CDF}}\left(0\right)\) values ranging between 96.779 percent and 92.395 percent. The hash rate variable sits at the bottom of the robustness ranking, achieving a borderline \({\text{CDF}}\left(0\right)\) value of 90.163 percent.

Third, in terms of the sign, the estimated \({\text{CDFs}}\) for negative returns, realized volatility, and hash rates (trading volume and Google search queries) are practically located on the right- (left-) hand side of zero, as revealed by their respective \(\overline{{\widehat{\beta } }_{Q}}\), which suggests a predominantly positive (negative) relation with Bitcoin price spreads across the entire spectrum of model specifications. This finding implies that negative returns, realized volatility, and hash rates (trading volume and Google search queries) tend to have negative (positive) effects on Bitcoin liquidity. Consistent with our results, several studies demonstrate a significant association between liquidity and these robust variables in both cryptocurrency and mainstream financial markets. For instance, Dimpfl and Mäckle (2020) document that the total transaction volume, hash rates, and number of transactions are important drivers of Bitcoin liquidity. Using data on 34 cryptocurrencies, Yao et al. (2021) establish that static investor attention tends to exert a short-term positive effect on liquidity, while abnormal investor attention has a persistent negative effect. Choi (2021) reports evidence that rising levels of investor attention, proxied by the number of tweets, enhance Bitcoin liquidity. Scharnowski (2021) finds that Bitcoin spreads correlate with hash rates and lagged negative returns. Brauneis et al. (2021b) establish that Bitcoin spreads are positively (negatively) related to the average trade size (total trading volume). Turning to mainstream financial markets, the theoretical models of Brunnermeier and Pedersen (2009), Kyle and Xiong (2001), Bernardo and Welch (2004), and Morris and Shin (2004) predict that large stock price declines lead to a reduction in the supply of liquidity in equity markets. Kyle and Xiong (2001) and Hameed et al. (2010) show that lack of liquidity is positively associated with negative returns. Similarly, Chordia et al. (2001) find that bid/ask spreads tend to increase following a sharp fall in stock prices. Liu (2015) finds that stock markets become more liquid as investor sentiment increases. The results of Adachi et al. (2017) suggest a positive link between the liquidity of Japanese start-up stocks and Google search intensity, thus substantiating the “investor recognition hypothesis” of Merton (1987) and the “price pressure hypothesis” of Barber and Odean (2008). Based on data for 290 stocks from seven countries, Aouadi et al. (2018) find that Google search volume, as a proxy for information demand, positively correlates with stock market liquidity. El Ouadghiri et al. (2022) demonstrate that institutional investor attention, proxied by the number of times Bloomberg terminal users search for information about a specific stock, positively influences stock liquidity and volatility.

Fourth, perhaps more surprisingly, the remaining candidate variables (i.e., absolute returns of Bitcoin, transaction fees, the number of bitcoins, number of transactions, market capitalization, S&P 500 index, S&P Europe 350 index, exchange rates of EUR/USD, term spread, gold markets, US economic policy uncertainty, VIX, global financial stress) turn out to be fragile in the sense that their respective coefficients fail to meet the robustness threshold of \({\text{CDF}}\left(0\right)\ge 90\) percent. Once a tiny change in the conditioning information set occurs, the estimated coefficients of the independent variables flip their signs or become statistically insignificant. At a deeper glance, it appears that most of these fragile factors are outside or peripheral to the world of cryptocurrencies, suggesting that liquidity is driven almost entirely by factors inherent in the Bitcoin network. Parallel to our findings, Brauneis et al. (2021b) establish that the liquidity of Bitcoin, Ethereum, Litecoin, and Ripple is unrelated to the return and liquidity dynamics of conventional financial asset markets. Scharnowski (2021) suggests that unique addresses, TED spreads, US economic policy uncertainty, and the VIX are unimportant for explaining Bitcoin spreads. Marshall et al. (2019) find no consistent relationship between Bitcoin liquidity and the VIX or the TED spread. Quang et al. (2020) find that changes in the geopolitical uncertainty index have no significant influence on the liquidity of cryptocurrency portfolios. Based on a wavelet coherence analysis, Umar et al. (2021) document very limited co-movements between the liquidity of the NYSE composite index and the Nikkei 225 index and those of major cryptocurrencies during the COVID-19 pandemic.

As illustrative evidence, Fig. 2 depicts a graphical representation of the empirical frequency distribution of the coefficient estimates, \({\left\{{\widehat{\beta }}_{i}\right\}}_{i=1}^{P},\) for the individual focus variables. In each histogram plot, the horizontal axis indicates magnitudes of \(\widehat{\beta }\) obtained from all possible model specifications, while the vertical axis indicates the corresponding probability density. The vertical red line at zero denotes the value of \(\widehat{\beta }\) in the null-hypothesis significance test (i.e., H0: \(\widehat{\beta }=0\)). The blue curve depicts the kernel density of the focus variable, and the green curve represents a normally distributed approximation of the coefficient estimates. An examination of the two curves assists in establishing whether \({\left\{{\widehat{\beta }}_{i}\right\}}_{i=1}^{P}\) follows a normal distribution or not. If most yellow bars are located to the right (left) of zero, we infer that most coefficient estimates of a given regressor are positive (negative). A perusal of the individual histograms reveals that the coefficient estimates of negative returns (NEGR), realized volatility (BV), and hash rates (HASH) are located rightward away from the red line, whereas those of the trading volume (TV) and Google search volume (GSVI) lie almost completely on the left side of the red line. Nonetheless, for the remaining variables, the area under the density function of their respective coefficients lies on both sides of the red line, confirming their fragility. We notice that the blue curves in most histograms exhibit two or more peaks, reflecting multimodality of the distribution of the corresponding coefficient estimates. Additionally, as the kernel density curve for each variable does not closely resemble the shape of an approximate normal distribution curve, the graphs support our decision to use the generic EBA model.

Fig. 2 The empirical density of coefficient estimates on each focus variable
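When the coefficient estimates are not assumed to be normally distributed across models, as in the generic EBA model referred to above, the aggregate \({\text{CDF}}(0)\) can be sketched as a weighted average of the individual \({\text{CDFs}}\), each evaluated from one regression’s own estimate and standard error. The equal weights, the function names, and the toy numbers are assumptions for illustration only.

```python
import math

def individual_cdf0(beta, std_err):
    """Mass below zero for one regression, using its own sampling distribution."""
    return 0.5 * (1.0 + math.erf((0.0 - beta) / (std_err * math.sqrt(2.0))))

def cdf0_generic(estimates, weights=None):
    """Weighted average of individual CDF(0) values, dominant side reported.

    estimates: list of (beta, std_err) pairs. Equal weights are assumed here.
    """
    n = len(estimates)
    if weights is None:
        weights = [1.0 / n] * n
    total = sum(weights)
    below = sum((w / total) * individual_cdf0(b, s)
                for w, (b, s) in zip(weights, estimates))
    return max(below, 1.0 - below)

# Toy bimodal case: three clearly positive estimates and one slightly negative.
toy = [(0.60, 0.1), (0.55, 0.1), (-0.05, 0.1), (0.50, 0.1)]
share = cdf0_generic(toy)
print(round(share, 4), "robust" if share >= 0.90 else "fragile")
```

Because each regression contributes its own tail mass, a secondary mode near or below zero lowers the aggregate share directly, which is why multimodal histograms of the kind described for Fig. 2 argue for this formulation over a single fitted normal curve.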

Taken together, these findings support H1 and H2, which posit that Bitcoin liquidity is robustly affected by crypto-market characteristics and public attention, respectively. By contrast, no evidence substantiates H3, which states that Bitcoin liquidity is robustly associated with macroeconomic and financial developments. Likewise, our results run counter to H4, which predicts that global uncertainty and stress factors are robust determinants of Bitcoin liquidity.

Finally, it is worth emphasizing that our study aims to identify the factors contributing to the liquidity of Bitcoin, which is the most influential cryptocurrency in terms of market capitalization, widespread adoption, brand recognition, and global community. Our study also focuses on a single trading venue, Bitstamp, widely recognized for its professional reputation, high liquidity, fiat support, large geographical reach, and strong security measures. Therefore, our results pertain only to Bitcoin units traded on Bitstamp, one of the well-established and reputable cryptocurrency exchanges. Comparable to our approach, Dimpfl and Mäckle (2020) examine the liquidity determinants of Bitcoin traded on the US-based cryptocurrency exchange, Kraken. Choi (2021) investigates the impact of high-frequency Bitcoin tweets on liquidity using Bitstamp-exchange tick data. We acknowledge that our empirical evidence is not generalizable to other cryptocurrencies or exchanges. Cryptocurrency liquidity can differ significantly across various currencies and trading venues, as demonstrated by several studies (e.g., Brauneis et al. 2021b; Dyhrberg et al. 2018; Loi 2018; Marshall et al. 2019). Although Bitcoin is the most well-known and widely traded cryptocurrency, it may exhibit unique characteristics compared with other cryptocurrencies. Cryptocurrencies have varying levels of market capitalization, market structure, trading volume, and user demand, all of which bear on liquidity in one way or another. For instance, Bitcoin and Ethereum are generally considered the most liquid cryptocurrencies, owing to their widespread adoption and trading volumes. Other major cryptocurrencies (e.g., Tether and Ripple) tend to have higher liquidity than smaller or less popular altcoins (e.g., Solana and Tradecurve). Accordingly, findings for Bitcoin may not apply directly to other cryptocurrencies without appropriate justification.

Comparably, cryptocurrencies’ liquidity can differ across trading venues. Specifically, major centralized exchanges (e.g., Bitstamp, Binance, and Kraken) generally offer higher liquidity for popular cryptocurrencies. These exchanges have large user bases and high trading volumes and provide access to multiple trading pairs, contributing to their liquidity. However, smaller or less-established exchanges (e.g., BitMart, KuCoin, and CoinEx) may have lower liquidity and may experience wider bid-ask spreads, which can trigger higher trading costs and less efficient order execution. By contrast, while decentralized exchanges offer user control and security advantages, their liquidity can be lower than that of their centralized counterparts. Finally, liquidity may differ across exchanges based on geographic location. Exchanges that cater to specific regions may experience varying levels of liquidity based on local demand and trading activity. Brauneis et al. (2022) show that Bitcoin liquidity disparities between trading venues are more substantial and persistent than those in stock markets. They also find that liquidity is linked much more to blockchain activity and exchange-specific factors than to global factors.

While our evidence yields important insights into the robust determinants of the liquidity of Bitcoin traded on the Bitstamp crypto exchange, caution must be exercised in generalizing these findings to other cryptocurrencies and trading venues. Additional research and empirical evidence specific to other cryptocurrencies and exchanges of interest are necessary to support claims of generalizability.

Additional analyses

In this section, we conduct additional checks to ensure the validity of the empirical evidence. Specifically, Abdi and Ranaldo’s (2017) bid/ask spread estimator is adopted to assess the sensitivity of our results to alternative liquidity proxies. The elastic net algorithm of Zou and Hastie (2005), a variable shrinkage and selection technique, is used to verify whether our results are driven by the methodology applied in the main analysis.

An alternative liquidity proxy

Our first exercise involves re-running the EBA using Abdi and Ranaldo’s (2017) spread estimator (AR, hereafter) as an alternative proxy indicator of liquidity. This estimator is inspired by Roll’s (1984) autocovariance measure and Corwin and Schultz’s (2012) bid-ask spread estimator. The motivation behind the AR estimator is to bridge Roll’s (1984) spread illiquidity measure, which employs close-to-close prices, and Corwin and Schultz’s (2012) bid-ask spread estimator, which is built on high and low prices. It is worth highlighting that the AR estimator considers market volatility by utilizing the square root of the number of observations deployed to compute the mid-price in the denominator. Thus, as market volatility rises, the spread estimator also increases, reflecting the higher risk associated with trading in a more volatile market. In addition, one of the challenges in estimating the spread is that bid and ask prices can fluctuate rapidly in response to changes in market conditions or trading activities. This can result in a noisy spread estimate, which may not accurately reflect the true cost of trading financial assets. To address this challenge, the AR estimator captures bid-ask bounces by employing the difference between the bid and ask prices in the numerator. Abdi and Ranaldo (2017) show that the AR spread, compared to other low-frequency estimators, has the highest correlation with the effective spread computed from Trade and Quote (TAQ) data, which serves as a benchmark measure. Several recent studies have adopted the AR measure as a proxy for liquidity (e.g., Abad et al. 2023; Ahmed 2022a, b; Bianchi et al. 2022; Brauneis et al. 2021b; Choi et al. 2023; Yang et al. 2023). Let \({h}_{i}\), \({l}_{i}\), and \({c}_{i}\) represent the logarithmic transformations of high, low, and close prices during subinterval i. To calculate the “two-day corrected” AR estimator, we use data from two successive subintervals, i and i + 1. The estimator is formally described as

$${\widehat{AR}}_{t} = \frac{1}{N-1} \sum_{i=1}^{N-1}{\widehat{AR}}_{i}$$
(12)

where,

$${\widehat{AR}}_{i} = \sqrt{\max \left\{4 \left({c}_{i} - {\eta }_{i}\right) \left({c}_{i} - {\eta }_{i+1}\right), 0\right\}}$$
(13)

\({\eta }_{i}\) is the midrange of \({h}_{i}\) and \({l}_{i}\) during subinterval i, calculated as \({\eta }_{i}=\left({h}_{i}+ {l}_{i}\right)/2\), and \(N\) denotes the number of subintervals i in a trading day t. As Equation (13) shows, a negative product under the square root is replaced with zero before the subinterval spreads are averaged in Equation (12). Table 3 displays the estimation results.
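For illustration, Equations (12) and (13) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code used in the study: the function name `ar_spread` and the sample prices are hypothetical, and the inputs are assumed to be raw (not yet log-transformed) high, low, and close prices for the N subintervals of one trading day.

```python
import math


def ar_spread(highs, lows, closes):
    """Daily AR spread: mean of the N-1 pairwise estimates (Eqs. 12 and 13).

    highs, lows, closes: raw prices per subinterval (hypothetical inputs).
    """
    if not (len(highs) == len(lows) == len(closes)):
        raise ValueError("highs, lows, and closes must have equal length")
    n = len(closes)
    if n < 2:
        raise ValueError("at least two subintervals are required")

    # Log prices h_i, l_i, c_i and the midrange eta_i = (h_i + l_i) / 2.
    h = [math.log(x) for x in highs]
    l = [math.log(x) for x in lows]
    c = [math.log(x) for x in closes]
    eta = [(hi + lo) / 2.0 for hi, lo in zip(h, l)]

    pair_estimates = []
    for i in range(n - 1):
        product = 4.0 * (c[i] - eta[i]) * (c[i] - eta[i + 1])
        # Eq. (13): negative products are floored at zero before the root.
        pair_estimates.append(math.sqrt(max(product, 0.0)))

    # Eq. (12): average over the N-1 pairs of successive subintervals.
    return sum(pair_estimates) / (n - 1)


if __name__ == "__main__":
    # Hypothetical hourly BTC/USD bars, for demonstration only.
    example_highs = [30100.0, 30250.0, 30180.0, 30300.0]
    example_lows = [29950.0, 30050.0, 30020.0, 30110.0]
    example_closes = [30080.0, 30070.0, 30150.0, 30290.0]
    print(ar_spread(example_highs, example_lows, example_closes))
```

The result is expressed in log-price units, so it can be read as an approximate proportional spread.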

Table 3 Estimation results of EBA with an alternative proxy for liquidity

Overall, the results appear qualitatively similar to those reported in Table 2. Panel A of Table 3 reveals that the coefficient estimates on the absolute returns of Bitcoin, trading volume, Google search queries, EUR/USD exchange rates, term spread, gold, and the financial stress index are, on average, negatively signed. By contrast, the corresponding coefficients for the rest of the explanatory variables have, on average, a positive sign. In absolute terms, the negative returns and realized volatility of Bitcoin (the number of transactions, EUR/USD exchange rates, and US economic policy uncertainty) demonstrate the largest (smallest) effect on Bitcoin spreads. The results based on Leamer’s and Sala-i-Martin’s versions are given in Panels B and C of Table 3. As Panel B shows, Bitcoin’s realized volatility remains the sole robust driver of liquidity, whereas the remaining variables are fragile. Nevertheless, under Sala-i-Martin’s more lenient robustness criterion, Panel C shows that negative returns, trading volume, hash rates, and Google Trends also qualify as robust. Realized volatility is still in the overall lead with a \({\text{CDF}}\left(0\right)\) of 100 percent, closely followed by negative returns, trading volume, and Google Trends, with \({\text{CDF}}\left(0\right)\) values ranging from 99.414 percent down to 96.352 percent. Again, the hash rate variable ranks lowest in robustness, with a marginal \({\text{CDF}}\left(0\right)\) of 91.558 percent. In terms of the sign, the estimated \({\text{CDFs}}\) for negative returns, volatility, and hash rates (trading volume and Google Trends) are mostly located on the right-(left-) hand side of zero, as expressed by their respective \(\overline{{\widehat{\beta } }_{Q}}\), which indicates a primarily positive (negative) association with price spreads of Bitcoin across all model specifications. Thus, we conclude that our findings do not depend on using a specific proxy for liquidity.
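The two robustness criteria discussed above can be made concrete with a short Python sketch. It is illustrative only and rests on several assumptions that are not taken from the paper: the function names are hypothetical, Leamer’s bounds are formed as the coefficient plus or minus two standard errors, and \({\text{CDF}}\left(0\right)\) is computed per regression from a normal sampling distribution and then averaged with equal weights. The estimates reported in this study come from the ExtremeBounds R package, not from this code.

```python
from statistics import NormalDist


def leamer_bounds(betas, std_errors, k=2.0):
    """Extreme bounds over all regressions; robust if the bounds share a sign.

    betas, std_errors: focus-variable estimates, one pair per regression.
    k: width of the band in standard errors (assumed to be 2 here).
    """
    lower = min(b - k * s for b, s in zip(betas, std_errors))
    upper = max(b + k * s for b, s in zip(betas, std_errors))
    robust = lower > 0 or upper < 0
    return lower, upper, robust


def sala_i_martin_cdf0(betas, std_errors, weights=None):
    """Weighted share of the coefficient distribution on one side of zero.

    Returns the larger of the two sides, a value between 0.5 and 1.
    Equal weights are an assumption made for this sketch.
    """
    if weights is None:
        weights = [1.0] * len(betas)
    total_weight = sum(weights)
    standard_normal = NormalDist()
    # For an estimate distributed N(b, s^2), the mass above zero is Phi(b / s).
    mass_above_zero = sum(
        w * standard_normal.cdf(b / s)
        for b, s, w in zip(betas, std_errors, weights)
    ) / total_weight
    return max(mass_above_zero, 1.0 - mass_above_zero)


if __name__ == "__main__":
    # Hypothetical estimates from three regressions, for demonstration only.
    example_betas = [0.5, 0.4, 0.1]
    example_ses = [0.1, 0.1, 0.1]
    print(leamer_bounds(example_betas, example_ses))
    cdf0 = sala_i_martin_cdf0(example_betas, example_ses)
    print(cdf0, cdf0 >= 0.90)
```

In this hypothetical example, a single imprecise estimate is enough to push Leamer’s lower bound below zero, while the averaged \({\text{CDF}}\left(0\right)\) still clears the 90 percent threshold. This mirrors why the second criterion labels more variables as robust.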

An alternative methodology

The second exercise examines the extent to which our results are driven by the methodology used. For this purpose, we apply the elastic net (ENet) estimator of Zou and Hastie (2005), an alternative machine learning method. By enhancing the least absolute shrinkage and selection operator (LASSO) of Tibshirani (1996), this feature-selection technique can adequately handle the interpretability, predictive performance, and computational complexity issues of a particular regression model concurrently. Zou and Hastie (2005) show that the solution paths of LASSO are more likely to be unstable in the presence of high multicollinearity among the features of a given model. In this case, the LASSO pulls an arbitrary representative covariate out of each strongly correlated group. To remedy this shortcoming, Zou and Hastie (2005) propose the ENet method, which performs a “grouped selection” of highly collinear variables while retaining LASSO-type continuous shrinkage and automatic feature selection. The variable selection results are listed in Table 4.
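The contrast between LASSO and ENet under collinearity can be illustrated with a small coordinate-descent sketch in Python. It is a toy example under stated assumptions, not the estimation code of this study: the function names, the penalty parameterization (an overall strength `lam` and a mixing weight `alpha`, with `alpha = 1` giving the LASSO), and the data are all hypothetical, and the columns are assumed to be centered so that no intercept is needed.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, lam, alpha, max_sweeps=10000, tol=1e-12):
    """Minimize (1/2n)||y - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2).

    X: list of rows (centered columns assumed); y: list of responses.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(max_sweeps):
        max_change = 0.0
        for j in range(p):
            # Covariance of column j with the residual that excludes
            # column j's own current contribution.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            denom = col_sq[j] + lam * (1.0 - alpha)
            new_beta = soft_threshold(rho, lam * alpha) / denom if denom else 0.0
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


if __name__ == "__main__":
    # Columns 1 and 2 are identical (perfectly collinear); column 3 is
    # orthogonal to them and unrelated to y. Hypothetical data.
    X = [[1.0, 1.0, 1.0], [-1.0, -1.0, 1.0], [1.0, 1.0, -1.0], [-1.0, -1.0, -1.0]]
    y = [2.0, -2.0, 2.0, -2.0]
    print("LASSO:", elastic_net(X, y, lam=0.1, alpha=1.0))
    print("ENet: ", elastic_net(X, y, lam=0.1, alpha=0.5))
```

With these toy data, the LASSO run places essentially all of the weight on whichever of the two identical columns is updated first, whereas the ENet run splits the weight equally between them. Both leave the unrelated third column at zero.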

Table 4 Results of the elastic net method

In line with the EBA evidence, the ENet results support the view that cryptocurrency-specific factors influence Bitcoin liquidity more than their global economic and financial counterparts do. The ENet identifies a subset of 10 of the 18 candidate factors as the most powerful determinants of liquidity. Largely reflecting attributes underlying the cryptocurrency world, these variables include negative returns, realized volatility, trading volume, transaction fees, the number of bitcoins, the number of transactions, hash rates, Google search volume, term spread, and financial stress. Thus, relative to the EBA results, the ENet estimator nominates five more variables (i.e., transaction fees, the number of bitcoins, the number of transactions, term spread, and financial stress) as relevant to Bitcoin liquidity. By contrast, the remaining variables (absolute returns, market capitalization, US and European stock markets, exchange rates of EUR/USD, economic policy uncertainty, gold, and VIX) appear to have no material bearing on liquidity. Consistent with those reported in Table 2, the estimated coefficients of negative returns, volatility, transaction fees, the number of bitcoins, the number of transactions, and hash rates (trading volume, Google search volume, financial stress, and term spread) are positive (negative), which implies a negative (positive) impact on liquidity. Regarding the order-of-magnitude ranking, we note that realized volatility, negative returns, and trading volume (transaction fees, hash rates, and the number of transactions) are the most (least) important determinants of liquidity. Finally, the coefficient path of each variable included in the analysis is shown in Fig. 3. Bitcoin volatility, followed by negative returns, enters the penalized model early in the solution path, and their respective coefficients continue to diverge from the zero-horizontal axis even after other factors enter the model. This finding indicates the predominant role played by both variables in explaining Bitcoin liquidity. By comparison, the other variables take longer to depart from zero, possibly indicating their minor importance.

Fig. 3 Coefficient paths based on the elastic net method

Conclusion

Bitcoin has risen to prominence in practitioner and academic communities since its formal launch in 2009, gradually moving from the obscurity of technology to capturing the attention of investors, global corporations, financial institutions, and governments. Many studies have been conducted over the last decade to better understand Bitcoin’s unique characteristics, such as speculation and bubbles, market efficiency, price discovery, price jumps and volatility, trading dynamics, and interactions with mainstream financial and commodity markets. A pertinent property of Bitcoin’s market microstructure is liquidity, a crucial condition through which Bitcoin can adequately assume its role, whether as an unorthodox means of payment, an investment asset, or a safe haven commodity. This study conducts an empirical inquiry into the robust determinants of Bitcoin liquidity while considering the issue of model uncertainty. We evaluate the robustness of a broad range of candidate factors recognized in the literature as the main explanatory variables of liquidity. These factors feature crypto market attributes (Bitcoin’s signed returns and volatility, trading volume, transaction fees, hash rate, number of bitcoins mined, number of transactions, and total market capitalization), public attention (Google search volume index), macroeconomic and financial factors (benchmark stock indices of the US and Europe, exchange rates of EUR/USD, term spread, and gold markets), and global uncertainty and stress (US economic policy uncertainty, fear, and stress indicators). The liquidity of Bitcoin is proxied by Corwin and Schultz’s (2012) bid-ask spread estimator, which is constructed from high- and low-price data. An extreme bounds analysis (EBA), a large-scale sensitivity test, is deployed to address the problem of model uncertainty. The EBA searches across a universe of independent variables to determine whether a given parameter is robust or fragile in the face of a small change in the conditioning information set.

The results from Leamer’s version of the EBA suggest that the realized volatility of Bitcoin is the only variable that passes the restrictive criterion of robustness since its coefficients remain statistically significant and maintain the same sign across all possible combinations of doubtful variables. The remaining explanatory variables are deemed fragile given that their respective coefficients flip sign at least once; hence, they fail to withstand alterations in the model specifications. Nevertheless, the results of Sala-i-Martin’s variant indicate that Bitcoin’s negative returns, trading volume, hash rates, and Google search trends are also robust determinants of liquidity. The rest of the candidate variables (i.e., absolute returns of Bitcoin, transaction fees, the number of bitcoins, the number of transactions, market capitalization, S&P 500 index, S&P Europe 350 index, exchange rates of EUR/USD, term spread, gold markets, US economic policy uncertainty, VIX, and global financial stress) turn out to be fragile in the sense that their respective coefficients are unable to meet the robustness threshold of \({\text{CDF}}\left(0\right)\ge 90\) percent. The robustness checks indicate that our findings do not depend on the use of a specific liquidity proxy. Moreover, the ENet method identifies 10 out of 18 candidate factors as the most powerful drivers of liquidity. These variables are negative returns, realized volatility, trading volume, transaction fees, the number of bitcoins, the number of transactions, hash rates, Google search volume, term spread, and financial stress.

Taken together, our evidence confirms that Bitcoin liquidity has minimal exposure to the dynamics of conventional financial markets and macroeconomic influences. In this respect, two important implications are highlighted. First, although the literature offers many variables as primary determinants of Bitcoin liquidity, only a handful of these variables prove reliable and robust to changes in the composition of the doubtful-variable subset. Several studies propose models that appear to be well-specified, given the available datasets. Yet, they arrive at contradictory findings regarding the variables that should be identified as true drivers of Bitcoin liquidity, which could raise concerns regarding the validity of these results. To properly specify a model for understanding Bitcoin liquidity, many potential candidates, their linkages, and their interactions must be considered. The ambiguity in past literature regarding the most important drivers, and the structural form of their relationships with Bitcoin liquidity, underscores the relevance of accounting for model uncertainty when building models to explain liquidity dynamics. Second, Bitcoin’s realized volatility, negative returns, trading volume, hash rates, and Google search trends seem to contribute to a deeper understanding of Bitcoin liquidity owing to their respective robustness. Thus, crypto asset investors looking for useful information about Bitcoin’s future liquidity trends should keep an eye on the movements of such robust drivers. Over time, an increasing number of investors have become aware of Bitcoin’s long-term value, and paying close attention to the trajectory of these factors is essential for ensuring a stable and frictionless cryptocurrency trading environment.

Lastly, as is typical of most studies, our paper has two limitations. First, because the empirical investigation is based on data spanning a sufficiently long period, there are likely structural breaks in the time path of a series, which is a problem that this study does not address. The multiple structural breaks approach developed by Bai and Perron (1998, 2003) can be used to detect structural shifts in an underlying model. Thus, it would be interesting for future research to examine whether the factors contributing to Bitcoin liquidity are invariant across regime changes. Second, we explored the explanatory potential of only 18 variables. Nonetheless, to make the analysis more comprehensive, this set can include other unaccounted-for factors such as cryptocurrency hacking incidents, energy markets, the TED spread, and other major cryptocurrencies’ prices, volatility, and liquidity. These two limitations call for additional investigations and offer potential directions for future research.

Availability of data and materials

The datasets used in this study are available upon a reasonable request from the author.

Notes

  1. To obtain the estimation results, we use ExtremeBounds, an R package created by Hlavac (2016).

Abbreviations

EBA: Extreme bounds analysis

ENet: Elastic net method

CS: Corwin and Schultz’s (2012) bid-ask spread estimator

AR: Abdi and Ranaldo’s (2017) spread estimator

\({\hbox{ABSR}}\): Absolute returns of Bitcoin

\({\hbox{NEGR}}\): Negative returns of Bitcoin

\({\hbox{BV}}\): Bitcoin realized volatility

\({\hbox{TV}}\): Trading volume

\({\hbox{TF}}\): Transaction fees

\({\hbox{NB}}\): Number of bitcoins mined

\({\hbox{NT}}\): Number of transactions

\({\hbox{MC}}\): Total market capitalization

\({\hbox{HASH}}\): Hash rate

\({\hbox{GSVI}}\): Google search volume index

\({\hbox{USM}}\): US stock market

\({\hbox{ESM}}\): European stock market

\({\hbox{EXRATE}}\): EUR/USD exchange rate

\({\hbox{TSD}}\): Term spread

\({\hbox{GLD}}\): Gold market

\({\hbox{EPU}}\): US economic policy uncertainty

\({\hbox{VIX}}\): CBOE volatility index

\({\hbox{FSI}}\): Global financial stress index

Acknowledgements

Not applicable.

Funding

This study has received no financial support.

Author information

Contributions

The author has solely conducted this study.

Corresponding author

Correspondence to Walid M. A. Ahmed.

Ethics declarations

Competing interests

The author declares that he has no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ahmed, W.M.A. On the robust drivers of cryptocurrency liquidity: the case of Bitcoin. Financ Innov 10, 69 (2024). https://doi.org/10.1186/s40854-023-00598-9
