Tax avoidance and earnings management: a neural network approach for the largest European economies

Abstract

In this study, we investigate the relationship between tax avoidance and earnings management in the five largest European Union economies by using artificial neural network regressions. This methodology allows us to deal with nonlinearities detected in the data, which is our principal contribution to the previous literature. We analyzed Compustat data for Germany, the United Kingdom, France, Italy, and Spain for the 2006–2015 period, focusing on discretionary accruals. We considered three tax avoidance measures, two based on the effective tax rate (ETR) and one on book-tax differences (BTD). Our results indicate the presence of nonlinear patterns and a positive, statistically significant relationship between discretionary accruals and both ETR indicators, implying that when companies resort to earnings management, a larger taxable income (and thus higher ETR and less tax avoidance) would ensue. Hence, as also highlighted by the fact that discretionary accruals do not appear to affect BTD, our evidence does not suggest that companies are exploiting tax manipulation to reduce their tax payments; thus, the gap between accounting and taxation seems largely unaffected by earnings management.

Introduction

In this study, we determine whether earnings management (EM) affects tax avoidance by using artificial neural network regressions. Studies have thoroughly analyzed the determinants of tax avoidance or tax aggressiveness. Hanlon and Heitzman (2010, p. 137) report that “there is widespread interest and concern over the magnitude, determinants and consequences of corporate tax avoidance and aggressiveness. The challenge for the area is that there are no universally accepted definitions of, or constructs for, tax avoidance or tax aggressiveness; the terms mean different things to different people.” Similarly, Dyreng and Maydew (2018, pp. 3–4) stated that “tax avoidance is generally defined broadly to include any reduction in a firm’s taxes relative to pretax accounting income, though some studies investigate more specific forms of tax avoidance such as aggressive tax avoidance, tax sheltering, or tax risk. Tax avoidance is an area that attracts attention from academics, policymakers, and the business press because it addresses the fundamental policy issues of tax equity (i.e., fairness) and tax efficiency.”

In this study, we consider “tax avoidance” and “tax aggressiveness” as equivalents, which is standard practice in related research (Rego and Wilson 2012; Kim and Zhang 2016; Kubick and Masli 2016). Many empirical studies have attempted to determine how tax avoidance or tax aggressiveness can be properly assessed. As the degree of tax avoidance cannot be directly quantified, several alternative indicators have been developed in the literature, based on data from company financial statements. In this investigation, we consider three tax avoidance measures, two based on the effective tax rate (ETR) and one on book-tax differences (BTD).

Increasing interest in how much firms pay in corporate income taxes, and in how much they do not pay through their tax avoidance strategies, has motivated a large amount of literature (e.g., Shackelford and Shevlin 2001; Hanlon and Heitzman 2010; Graham et al. 2012; Lietz 2013a; Dyreng and Maydew 2018; Lopo Martinez 2017; Wilde and Wilson 2018). This literature exhibits a wide diversity of results. Despite employing identical explanatory variables of tax avoidance, authors have reported either positive or negative relationships (Fonseca-Díaz et al. 2019, pp. 227–230). This lack of agreement in the results may be attributable to the inadequacy of the methodologies employed, as previous studies have relied exclusively on linear regression models. However, while analyzing the traditional determinants of tax avoidance based on business variables, Delgado et al. (2014) and Molina Llopis and Barberá-Martí (2017) have reported the presence of nonlinear relationships. This is potentially problematic because incorrect model specification (such as that arising when nonlinear phenomena are analyzed through linear models) can lead to spurious conclusions when statistical inferences are based on the misspecified models. In turn, these conclusions may lead economic agents to make bad decisions, a risk that can be reduced by employing more suitable (nonlinear) models. To capture nonlinearities and analyze complex phenomena such as those arising in this field, introducing more flexible approaches to study tax avoidance is essential. Along this line, Kliestik et al. (2021, pp. 1465–1466) suggest that introducing alternative approaches based on nonlinear regression models or neural networks in this research area would be interesting.

Nonparametric regressions offer a classical remedy for the misspecification problem. The nonparametric toolkit includes series regressions, splines, kernel regressions, and artificial neural networks (ANNs). All these techniques provide flexible, asymptotically model-free tools for estimation and hypothesis testing (for a thorough review of the literature, see, e.g., Pagan and Ullah (1999)).

In this study, we rely on nonparametric ANN regressions to analyze the determinants of tax aggressiveness. To the best of our knowledge, this is the first study on tax aggressiveness that utilizes ANNs and their flexibility to capture nonlinearities and analyze complex phenomena in the field. Our primary contribution to the literature consists in using this ANN methodology. Moreover, in this study, we include EM as a central explanatory variable (in its traditional form as discretionary accruals, measured by following Kothari et al. (2005)) and control for the effects of the variables commonly employed in the literature (i.e., size, leverage, asset composition and profitability, gross domestic product (GDP) growth, and country and sector dummies). Additionally, our analysis focuses on the five largest European economies, each with GDP above one trillion euros: Germany, the United Kingdom, France, Italy, and Spain. No joint studies have been conducted on this matter for that group of countries. Interestingly, these five nations have almost identical ratios of corporate taxation to GDP. In the last year for which data were available in our database (2015), GDP growth was 2.0%, 2.4%, 2.5%, and 2.6% in Italy, Germany and Spain, the United Kingdom, and France, respectively. The average growth rate for that year was 2.5% in both the whole European Union (EU) and the Eurozone. Additionally, country size is considered a relevant variable according to corporate tax competition and related literature (Heimberger 2021, p. 12).

Among other possibilities, our choice of ANNs was motivated by several factors, including the relatively large number of regressors (about 30, including many dummies for qualitative variables), which made other approaches impractical (e.g., multivariate polynomials, radial basis functions, kernels, or splines). Additionally, the ANN structures that we used embed standard linear models, permitting certain inferences (e.g., linearity testing) to be conducted easily. Finally, the ANN structures we employed are relatively simple (in terms of the number of parameters to be fitted) and can be estimated using efficient nonlinear least squares (NLS) algorithms.

Although we exploited ANNs with a relatively narrow goal (as a tool of nonparametric statistics), the range of applicability of ANNs, and of artificial intelligence in general, is impressive and continuously increases with new paradigms and fields of application. These include accounting, where algorithmic data-driven accounting information systems and big data analytics are increasingly employed (e.g., Ionescu 2020a, b, 2021; Bin 2022), and all types of industrial and economic processes wherein the Internet of Things is increasingly intertwined with more classical paradigms (e.g., decision-support and artificial intelligence data-driven systems) (Li et al. 2020a, b, 2021; Brown 2021; Edwards 2021; Kou et al. 2021; Nica et al. 2021; Watkins 2021).

Our results suggest that the ANN methodology can effectively detect nonlinear relationships between tax avoidance and EM. The nonparametric ANN-based framework we utilize here can manage nonlinearities, enabling asymptotically correct estimation and hypothesis testing and reducing the risk of wrong decisions being made by economic agents when employing incorrectly specified (linear) statistical models. Hence, we believe this methodology should provide a useful addition to the previous literature (focused exclusively on linear regression models), also facilitating new, more flexible approaches to research in this field.

Particularly, the empirical results from the ANN-based analysis suggest that, in the large dataset of EU companies we have analyzed, the more extensive the EM, the heavier the tax burden and, hence, the less tax avoidance. Moreover, companies do not seem to be exploiting tax manipulation for tax reduction: accruals do not affect BTD, so the gap between accounting and taxation is not influenced by EM. In conclusion, our findings do not suggest that companies in Europe’s largest economies use EM to reduce their tax payments.

The remainder of the paper proceeds as follows. In section "Literature review and hypotheses", we review the literature that relates tax avoidance to EM, and we posit the hypotheses to be tested. In section "Methodological issues", we outline our methodology. In section "Data and results", we describe the data and our main results, and we offer our discussion in section "Discussion". Finally, in section "Conclusion", we state our conclusions and suggest further research avenues.

Literature review and hypotheses

First, we present a synthesis of the literature on tax avoidance, focusing on studies that have considered EM as one of the explanatory variables. For brevity, we then focus on the measures of tax aggressiveness and EM most heavily employed in the literature.

Tax avoidance and earnings management

The literature on tax aggressiveness is vast. Traditional business-variable determinants of tax avoidance include size, leverage, composition of assets, and profitability (Fernández-Rodríguez et al. 2021, p. 695). However, in recent years, research efforts have moved to incorporate new study variables. These include management incentives (e.g., Rego and Wilson 2012), company governance activities (e.g., Desai and Dharmapala 2006; Minnick and Noga 2010; Armstrong et al. 2012; Richardson et al. 2013; Whait et al. 2018), and ownership structure from various perspectives (e.g., Chen et al. 2010; Wu et al. 2012; Badertscher et al. 2013; McGuire et al. 2014; Steijvers and Niskanen 2014).

In this work, we focus on EM. Some previous studies have specifically analyzed whether tax avoidance is affected by EM, in some cases as a main variable and in others as a control variable. A review of those studies (see Table 1 below) clearly indicates that their results are inconclusive.

Table 1 Summary of selected studies on tax avoidance and earnings management

Table 1 indicates that most previous studies have focused on the USA, detecting a positive relationship between tax avoidance and EM. However, those studies typically relied on different measures of tax avoidance and EM, which may have affected results. For instance, Dhaliwal et al. (2004) did not examine ETR but change in ETR, Blaylock et al. (2012) only analyzed the behavior of the most aggressive companies, and Goh et al. (2013) and Kubick and Masli (2016) did not employ any ETR measures to assess tax avoidance.

Additionally, recent literature has provided evidence of an inverse relationship. First, Guenther et al. (2017) outlined a negative relationship between tax avoidance and EM, but only when ETR is used as a proxy for tax aggressiveness. Conversely, no relationship was identified when other proxies were employed. Their work focused on the USA and covered a long timeframe (1987–2011), so its study period encompassed those previously analyzed by other authors (i.e., 1986–1999, 1991–2005, 1993–2005, 2000–2010, 1999–2009, and 1994–2012) (Table 1). Only the research of Kubick and Masli (2016) utilized a sample incorporating a more recent year—2012. When comparing the conclusions of the various studies conducted for the USA, noncoincident conclusions abound even when the same study periods were analyzed.

The same appears to occur for China-focused studies. Richardson et al. (2016), for the period 2005–2010, found a positive relationship between tax avoidance and EM when ETR was employed, but not when other tax avoidance measures (e.g., BTG) were used instead. Tang et al. (2017), analyzing the years 1999–2006, detected a positive relationship between tax avoidance and EM. In contrast, Wang and Mao (2021), for a sample of companies analyzed for the period 2003–2015, reported a negative relationship.

For Europe, related research remains at a very early stage. We believe the only reference available is the work by Kałdoński and Jewartowski (2020), who analyzed a 2005–2017 sample of companies listed in Poland and concluded that tax avoidance and EM presented an inverse relationship.

Other related studies employ identical variables but have relied on different approaches. Among them are Wilson (2009, pp. 982–984) and Lisowsky (2010, pp. 1708–1709), who analyzed the determinants of tax shelters and concluded that both EM and BTD had positive effects. Fernández-Rodríguez and Martínez-Arias (2015, p. 180) studied whether EM affected the temporary differences, finding that those companies engaging in more extensive EM practices tended to apply more negative adjustments and fewer positive ones to defer taxation. Blaylock et al. (2015, pp. 164–165) analyzed the determinants of EM and concluded that BTD positively affected EM. Jackson (2015, pp. 65–66) addressed the determinants of the change in future earnings by analyzing both EM and BTD, demonstrating that neither EM nor permanent differences had any effect, whereas temporary differences had a negative effect. Tang (2015, pp. 457–458) and Sundvik (2017, pp. 37–38), who analyzed the conditioning factors of EM through book-tax conformity (BTC), concluded that BTC negatively affected EM.

Other recent studies on EM focusing on Central European countries (i.e., Hungary, the Czech Republic, Poland, and Slovakia) have also reached interesting conclusions. Particularly, Gregova et al. (2021, p. 235) found that an enterprise could improve its financial standing through EM techniques, which should enable it to raise more debt, raise its tax shield and increase share values. Kliestik et al. (2021, p. 1465) and Valaskova et al. (2021, pp. 172–174), using linear regression, concluded that EM is common practice among enterprises. However, Kliestik et al. (2021, pp. 1465–1466) highlighted the potential interest of an alternative approach based on nonlinear regression models or neural networks.

In conclusion, a review of the literature reveals a large diversity of results concerning the potential relationship between tax avoidance and EM. Interestingly, all the above studies assumed (or tested for) linear relationships. However, as commented on above, some studies focusing on traditional business-variables determinants of tax avoidance (e.g., Delgado et al. 2014; Molina Llopis and Barberá Martí 2017), while using linear regression models, have detected nonlinear relationships. These findings strongly support the interest of introducing more flexible approaches like the ANN methodology to the field of tax avoidance, with the objective of capturing potential nonlinearities and analyzing complex phenomena like those arising in the field. Therefore, we posit the following hypothesis to evaluate the relationship between tax avoidance and EM:

  • H1. The relationship between EM and tax avoidance is nonlinear.

To test this hypothesis, we will utilize the ANN methodology.

Tax avoidance measures and earnings management

Tax avoidance and tax aggressiveness do not have universally accepted definitions (Hanlon and Heitzman 2010, p. 137). Wang et al. (2020, p. 796) suggest that the most common approaches can be classified into two groups: ETR and BTD. ETR divides a measure of tax liability by a measure of pretax income, while BTD relies on the gap between taxable base and accounting result.

Most studies used one or several definitions of tax avoidance, generally some variant of ETR. The following possibilities stand out:

  • GAAP_ETR: Total tax expense scaled by pretax income.

  • CASH_ETR: Cash taxes paid scaled by pretax income.

  • CURRENT_ETR: Current tax expense scaled by pretax income.

Although less frequently employed in the literature, some measures of tax aggressiveness are based on BTD, including the following:

  • Total difference between book and taxable income–BTD (Manzon and Plesko 2002).

  • Permanent BTD (Shevlin 2002).

  • Temporary BTD: Deferred tax expense–STR (Hanlon 2005).

  • Discretionary total BTD (Desai and Dharmapala 2006, 2009).

  • DTAX: Unexplained portion of the ETR differential (GAAP_ETR minus STR) (Frank et al. 2009).

The fundamental idea contained in the above indicators is that ETR should decrease with tax avoidance, while BTD increases with practices of tax aggressiveness. Hence, ETR and BTD would move in opposite directions; for this reason, Tang et al. (2017, pp. 263–266) made BTD comparable to ETR by multiplying it by −1 so that lower BTD indicated a higher level of tax avoidance.

Generally, researchers have used several tax avoidance measures, some based on ETR and others on BTD (Monterrey-Mayoral and Sánchez-Segura 2022, p. 20). We used three measures, two of them in the sphere of ETR, namely, CASH_ETR (a cash-based indicator) and GAAP_ETR (relying on expenditures). Chen et al. (2019, p. 282) indicate that GAAP_ETR captures tax avoidance activities only through permanent BTD, while CASH_ETR captures both permanent and temporary BTD. Lower GAAP_ETR and CASH_ETR values reflect greater tax avoidance. Additionally, Blouin (2014, p. 880) indicated that GAAP_ETR does not represent taxes reported on the current period’s tax return. The third measure is BTD, as modified by Tang et al. (2017, pp. 263–266) to ensure that interpretations of the three measures remain coincident so that low values of ETR and BTD imply greater tax aggressiveness.

Remarkably, a large body of literature exists that directly centers on the search for the conditioning factors of ETR or BTD, without explicitly referring to tax avoidance or aggressiveness. Hence, several studies focusing on ETR actually aimed at studying companies’ tax burden. Conversely, those centered on BTD addressed the determinants of the gap between accounting results and taxable income. The most frequently used explanatory variables for ETR are size, economic and financial structure, and profitability, while literature on BTD highlights the role of EM as an explanatory variable.

Notably, BTD are a consequence of discrepancies between accounting norms and the tax rules, while EM arises as a byproduct of the alternatives offered by accounting norms. Studies on EM are diverse and have adopted various approaches, including taxation as one of the many areas analyzed. Specifically, various items related to the corporate income tax (CIT) have been assessed on whether they may have effects on EM and include deferred tax assets and liabilities (e.g., Bauman et al. 2001; Wang et al. 2016), income tax expense (e.g., Dhaliwal et al. 2004), deferred tax expense (e.g., Phillips et al. 2004; Noor et al. 2007; Ifada and Wulandari 2015), deferred tax provisions (e.g., Holland and Jackson 2004), and BTD (e.g., Wilson 2009; Fernández-Rodríguez and Martínez-Arias 2015).

Two types of EM (based on accounting and tax) may arise in practice, as both accounting and tax regulations typically provide opportunities. Hence, firm managers must consider both possibilities, as their decisions will affect both accounting results and tax burdens. Indeed, one of the main methods of EM relates to deferred taxation, which implies an advance or delay in the payment of CIT that generates deferred tax assets and liabilities.

Researchers have included a specific variable in their studies to attempt to detect whether EM may be an explanatory variable for ETR (e.g., Frank et al. 2009; Kim and Zhang 2016; Richardson et al. 2016; Guenther et al. 2017; Wang and Mao 2021). Moreover, just as tax avoidance is difficult to measure, the same problem applies to EM. As the extent of EM cannot be directly quantified, several proxies relying on publicly available data have been developed in the literature. Accrual models, specifically the Jones model (1991) and its subsequent modifications, are the most commonly used. In this work, we relied on the modified Jones model with return on assets (ROA) (Kothari et al. 2005), as Reguera-Alvarado et al. (2015, p. 18) indicated the superiority of the Jones model adjusted to ROA relative to the Jones model and the modified Jones model.

Methodological issues

ANN-regression model

We relied on a classical functional regression framework. The dataset comprised a sample of \(n\) pairs \(\left( {y_{i} ,{\varvec{x}}_{i} } \right)\), where \(y_{i}\) was the tax avoidance measure (GAAP_ETR, CASH_ETR, or BTD) and \({\varvec{x}}_{i}\) was an \(N\)-dimensional row vector of covariates. The relationship is as follows:

$$y_{i} = f^{*} \left( {{\varvec{x}}_{i} } \right) + \varepsilon_{i} ;\;i = 1, \ldots ,n,$$
(1)

with \(\varepsilon_{i}\) being a zero-mean random error term and \(f^{*}\) being the regression surface we wished to estimate. As approximate models for \(f^{*} ,\) we considered the following ANN structures:

$$f\left( {\tilde{\user2{x}},{\varvec{\theta}}} \right) = \user2{\alpha \tilde{x}}^{\prime } + \mathop \sum \limits_{j = 1}^{m} \;\beta_{j} F\left( {{\varvec{\gamma}}_{j} \tilde{\user2{x}}^{\prime } } \right);m = 0,1, \ldots ,$$
(2)

where \(\tilde{\user2{x}} = \left( {1,{\varvec{x}}} \right)\), \(F\left( z \right) = \left[ {1 + {\text{exp}}\left( { - z} \right)} \right]^{ - 1} ,z \in {\mathbb{R}}\) (i.e., logistic transfer functions are taken as “hidden units”), and \({\varvec{\theta}} = \left( {{\varvec{\alpha}},\beta_{1} ,{\varvec{\gamma}}_{1} , \ldots ,\beta_{m} ,{\varvec{\gamma}}_{m} } \right)\) is a row vector which collects all free parameters, given the complexity index m.
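As an illustration of structure (2), the following minimal sketch evaluates the ANN regression function for a batch of observations. The function and argument names (`ann_predict`, `alpha`, `beta`, `gamma`) are chosen here for illustration only and are not taken from the original study.

```python
import numpy as np

def logistic(z):
    """Logistic transfer function F(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def ann_predict(X, alpha, beta, gamma):
    """Evaluate f(x~, theta) = alpha x~' + sum_j beta_j F(gamma_j x~') for each row of X.

    X     : (n, N) array of covariates
    alpha : (N + 1,) linear coefficients, intercept first
    beta  : (m,) hidden-unit output weights (m = 0 gives the linear model)
    gamma : (m, N + 1) hidden-unit input weights
    """
    X_tilde = np.column_stack([np.ones(len(X)), X])  # x~ = (1, x)
    linear_part = X_tilde @ alpha
    if len(beta) == 0:
        return linear_part                           # complexity index m = 0
    hidden = logistic(X_tilde @ gamma.T)             # (n, m) hidden-unit outputs
    return linear_part + hidden @ beta
```

With `m = 0` the function reduces to a standard linear regression, which is the sense in which these structures embed the linear model.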

Classical theoretical results (e.g., Barron 1994) ensure that the above one-hidden-layer structures are universal approximators, and this model class is also a leading choice in applications of ANNs in the fields of business and economics. A large body of theoretical literature [among others, classical works by White (1990), Gallant and White (1992), Kuan and White (1994), Chen and Shen (1998), Stinchcombe and White (1998), and Chen and White (1999)] underlies the use of ANNs as nonparametric regression and inference devices.

Model fitting in the above ANN structures can be conducted using efficient NLS algorithms (here, we used the Levenberg–Marquardt algorithm instead of the computationally lighter but less efficient backpropagation). As the NLS error surface is nonconvex in ANN models, a preliminary stage (e.g., intensive random search over the parameter space) is strongly advisable for reducing the risk of local optima (e.g., Ripley 1996, pp. 158–159).
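The fitting strategy described above can be sketched as follows, assuming SciPy is available. This is only a hedged illustration: the preliminary random search is replaced here by a simple multi-start scheme, and the starting range [-2, 2], the number of starts, and all function names are assumptions rather than the settings of the original study.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.special import expit  # numerically stable logistic function

def unpack(theta, n_cov, m):
    """Split theta = (alpha, beta_1, gamma_1, ..., beta_m, gamma_m)."""
    alpha = theta[: n_cov + 1]
    rest = theta[n_cov + 1:].reshape(m, n_cov + 2)
    return alpha, rest[:, 0], rest[:, 1:]

def residuals(theta, X_tilde, y, m):
    """NLS residuals f(x~_i, theta) - y_i for the one-hidden-layer ANN."""
    alpha, beta, gamma = unpack(theta, X_tilde.shape[1] - 1, m)
    hidden = expit(X_tilde @ gamma.T)
    return X_tilde @ alpha + hidden @ beta - y

def fit_ann(X, y, m, n_starts=20, seed=0):
    """Levenberg-Marquardt fits from several random starts; keep the best one."""
    rng = np.random.default_rng(seed)
    X_tilde = np.column_stack([np.ones(len(X)), X])
    n_par = X_tilde.shape[1] + m * (X_tilde.shape[1] + 1)
    best = None
    for _ in range(n_starts):
        theta0 = rng.uniform(-2.0, 2.0, size=n_par)  # assumed starting range
        fit = least_squares(residuals, theta0, args=(X_tilde, y, m), method="lm")
        if best is None or fit.cost < best.cost:
            best = fit
    return best  # best.x holds the fitted theta, best.cost half the residual sum of squares
```

Because the error surface is nonconvex, different starts may end in different local optima; keeping the lowest-cost fit is one simple way to reduce that risk.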

Model complexity (m) can usually be determined through some data-driven device (e.g., complexity-penalization rules to prevent overfitting). These include minimization of some suitable information measure (e.g., Schwarz’s information criterion (SIC) or Akaike’s information criterion (AIC)) and cross-validation. Here we employed cross-validation; however, as detailed below, information criteria led to roughly the same model complexities in our dataset.

Testing model specification and variable significance

Many interesting hypotheses related to the model could readily be tested based on the above ANN estimators; however, we focused on linearity and variable significance.

Linearity testing can be conducted using the misspecification test outlined in Landajo et al. (2012), which extends the classical ANN-based linearity test proposed by White (1989). Linearity amounts to no ANN model being able to extract structure from the residuals of the sample linear OLS regression of y on the set of covariates. Hence, to implement the test, we consider the ANN structure

$$f\left( {\tilde{\user2{x}},{\varvec{\theta}}} \right) = \user2{\alpha \tilde{x}}^{^{\prime}} + \beta F\left( {\user2{\gamma \tilde{x}}^{^{\prime}} } \right)$$
(3)

and fit by NLS the ANN model \({\mathbf{e}} = \user2{\tilde{X}\alpha }^{\prime } + \beta F\left( {\user2{\tilde{X}\gamma }^{\prime } } \right)\), with \({\mathbf{e}}\) denoting the OLS residuals of regressing \({\mathbf{Y}} = (y_{1} , \ldots ,y_{n} )^{\prime }\) on \(\tilde{\user2{X}} = (\tilde{\user2{x}}_{1}^{\prime } ,...,\tilde{\user2{x}}_{n}^{\prime } )^{\prime }\). Then, the null \(H_{0} :\) \(\beta = 0\) for all \({\varvec{\gamma}} \in {\Gamma } \equiv \left[ { - 2,2} \right]^{1 + N}\) is tested against \(H_{1} :\) \(\beta \ne 0\) for (essentially all) \({\varvec{\gamma}} \in {\Gamma }\). (The set of regressors is assumed to be scaled to the [0,1] interval.) The test statistic is \(d^{*} = \mathop {\text{sup}}\limits_{{{\varvec{\gamma}} \in {\Gamma }^{*} }} d\left( {\varvec{\gamma}} \right)\), where \(d = d\left( {\varvec{\gamma}} \right) = nR^{2}\), \(R^{2} = \frac{{{\hat{\mathbf{e}}}^{\prime } {\hat{\mathbf{e}}}}}{{{\mathbf{e}}^{\prime } {\mathbf{e}}}},\) \({\hat{\mathbf{e}}} = \user2{\tilde{X}\hat{\alpha }}^{\prime } + \hat{\beta }F\left( {\user2{\tilde{X}\hat{\gamma }}^{\prime } } \right)\) is the vector of NLS fitted values in model (3) above, and \(\Gamma^{*}\) is a closed subset of \(\Gamma\) not containing pathological points (roughly, those points in \(\Gamma\) that produce close-to-singular Hessians must be discarded). The limiting null distribution of \(d^{*}\) is nonstandard but can be approximated by the Monte Carlo procedure proposed by Hansen (1996) or, equivalently, using algorithm A from Landajo et al. (2012, Appendix 2).
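To make the statistic concrete, the sketch below approximates \(d^{*}\) by evaluating \(d\left( {\varvec{\gamma}} \right) = nR^{2}\) over random draws of \({\varvec{\gamma}}\) in \([-2,2]^{1+N}\). For a fixed \({\varvec{\gamma}}\) the model is linear in \({\varvec{\alpha}}\) and \(\beta\), so each evaluation is an OLS fit. This is a simplified illustration under stated assumptions: the random grid stands in for the supremum, pathological points are not screened out, and the Monte Carlo approximation of the null distribution is not included. The function name and defaults are hypothetical.

```python
import numpy as np

def linearity_statistic(X, y, n_draws=500, seed=0):
    """Approximate d* = sup_gamma n * R^2 over random gamma draws in [-2, 2]^(1+N).

    X is assumed to be already scaled to the [0, 1] interval.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    X_tilde = np.column_stack([np.ones(n), X])
    # Residuals e of the linear OLS regression of y on the covariates.
    coef, *_ = np.linalg.lstsq(X_tilde, y, rcond=None)
    e = y - X_tilde @ coef
    d_star = 0.0
    for _ in range(n_draws):
        gamma = rng.uniform(-2.0, 2.0, size=X_tilde.shape[1])
        hidden = 1.0 / (1.0 + np.exp(-(X_tilde @ gamma)))
        Z = np.column_stack([X_tilde, hidden])       # regressors of model (3) for this gamma
        b, *_ = np.linalg.lstsq(Z, e, rcond=None)
        e_hat = Z @ b                                # fitted values of the residual regression
        d = n * (e_hat @ e_hat) / (e @ e)            # d(gamma) = n * R^2
        d_star = max(d_star, d)
    return d_star
```

Large values of the statistic indicate that a hidden unit can still extract structure from the linear-model residuals, which is evidence against linearity; a critical value would have to come from the simulation procedures cited above.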

The significance of each of the variables in the model can also be tested nonparametrically within the ANN framework; here, we relied on the proposal of Landajo et al. (2012). For each covariate \(x_{k}\), \(k = 1, \ldots ,N\), we aimed to test the null hypothesis that \(x_{k}\) was irrelevant in explaining \(y\) (i.e., \(H_{0} :E\left( {y{|}{\varvec{x}}} \right) = E\left( {y{|}{\varvec{x}}_{\left( k \right)} } \right)\)), versus \(H_{1} :E\left( {y{|}{\varvec{x}}} \right) \ne E\left( {y{|}{\varvec{x}}_{\left( k \right)} } \right)\), with \({\varvec{x}}_{\left( k \right)}\) denoting the covariate vector \({\varvec{x}}\) with component \(x_{k}\) excluded. A suitable test statistic for this problem is \(\hat{\delta }_{k} = \left[ {n^{ - 1} \mathop \sum \nolimits_{i = 1}^{n} \left[ {\hat{f}\left( {{\varvec{x}}_{i} } \right) - \hat{f}_{\left( k \right)} \left( {{\varvec{x}}_{i} } \right)} \right]^{2} } \right]^{\frac{1}{2}}\), with \(\hat{f}\left( {\varvec{x}} \right)\) and \(\hat{f}_{\left( k \right)} \left( {\varvec{x}} \right)\) being, respectively, consistent (ANN-regression) estimators for \(E\left( {y{|}{\varvec{x}}} \right) = f^{*} \left( {\varvec{x}} \right)\) and \(E\left( {y{|}{\varvec{x}}_{\left( k \right)} } \right) = f_{\left( k \right)}^{*} \left( {\varvec{x}} \right)\). This test directly utilizes the idea of robustness checking. Its rationale is that \(\hat{\delta }_{k}\) estimates the distance between the complete ANN (with all the regressors included) and the simplified model (that excludes regressor \(x_{k}\)). This distance tends to be very small under the null, because the complete and restricted ANN-regression models tend to coincide in their sample forecasts when regressor \(x_{k}\) is redundant. Conversely, far larger differences appear under the alternative because a relevant regressor from the complete ANN-regression model has been omitted in the simplified model. Moreover, this test can be readily bootstrapped (see algorithm B in Landajo et al. (2012, Appendix 2)).
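The statistic \(\hat{\delta }_{k}\) can be sketched in a few lines. In the hypothetical example below, `fit_predict` is any routine that fits a regression and returns in-sample predictions; a plain OLS fit is used as a stand-in purely to keep the example short, whereas the study relies on ANN-regression estimators. The bootstrap step needed for p-values is not shown.

```python
import numpy as np

def ols_fit_predict(X, y):
    """Stand-in estimator (assumption): OLS in-sample predictions."""
    Z = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return Z @ coef

def significance_statistics(X, y, fit_predict):
    """Return delta_k for every covariate k.

    delta_k is the root mean squared distance between the predictions of the
    complete model and those of the model refitted without covariate k.
    """
    full = fit_predict(X, y)
    stats = []
    for k in range(X.shape[1]):
        X_without_k = np.delete(X, k, axis=1)        # drop regressor x_k
        restricted = fit_predict(X_without_k, y)
        stats.append(np.sqrt(np.mean((full - restricted) ** 2)))
    return np.array(stats)
```

A covariate that matters produces a clearly larger value than a redundant one, whose statistic shrinks toward zero as the sample grows.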

Marginal-effect estimates

Nonparametric estimates for the marginal effect on \(y\) of variations in each covariate can be readily obtained by computing average partial derivatives of ANN-based regressions. For each covariate \(x_{k}\), \(k = 1, \ldots ,N\), we calculated the following plug-in (mean derivative) estimator:

$$\hat{D}_{k} = n^{ - 1} \mathop \sum \limits_{i = 1}^{n} \frac{{\partial \hat{f}\left( {{\varvec{x}}_{i} } \right)}}{{\partial x_{k} }},$$
(4)

with \(\hat{f}\left( {{\varvec{x}}_{i} } \right) = f\left( {{\varvec{x}}_{i} ,\hat{\user2{\theta }}} \right)\) being the fitted ANN model, evaluated at observation \({\varvec{x}}_{i}\).

The above sensitivity measure can be readily interpreted because for the linear regression model, the above estimate coincides with the sample partial regression coefficient for each covariate.
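For structure (2) with logistic hidden units, the partial derivative has a closed form, \(\partial f/\partial x_{k} = \alpha_{k} + \sum_{j} \beta_{j} F^{\prime } \left( {{\varvec{\gamma}}_{j} \tilde{\user2{x}}^{\prime } } \right)\gamma_{j,k}\) with \(F^{\prime } = F\left( {1 - F} \right)\), so estimator (4) can be sketched as below. The function name and the parameter layout (intercept stored first in `alpha` and in each row of `gamma`) are illustrative assumptions.

```python
import numpy as np

def mean_derivatives(X, alpha, beta, gamma):
    """Average partial derivatives of the fitted ANN with respect to each covariate.

    X     : (n, N) covariates
    alpha : (N + 1,) linear coefficients, intercept first
    beta  : (m,) hidden-unit output weights
    gamma : (m, N + 1) hidden-unit input weights, intercept first
    """
    X_tilde = np.column_stack([np.ones(len(X)), X])
    F = 1.0 / (1.0 + np.exp(-(X_tilde @ gamma.T)))   # (n, m) hidden-unit outputs
    dF = F * (1.0 - F)                               # derivative of the logistic function
    # Per-observation gradient: alpha_k + sum_j beta_j F'(.) gamma_{j,k}, intercept excluded.
    grads = alpha[1:] + (dF * beta) @ gamma[:, 1:]   # (n, N)
    return grads.mean(axis=0)
```

When there are no hidden units (`m = 0`), the result is simply the vector of linear coefficients, in line with the interpretation given above.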

Data and results

Our study analyzed a group of EU economies for the period 2006–2015. In EU member states, listed companies have only been obliged to prepare their annual accounts according to the International Financial Reporting Standards (IFRS) since 2005. Hence, our sample began in 2006. In our analysis, we considered the five largest economies in the EU during the study period (all of them with respective GDP over one trillion euros): Germany, the United Kingdom, France, Italy, and Spain. Furthermore, in that decade, these five member states also represented the largest tax collection in the EU.

The data source was the Compustat database, which provides financial information on listed companies. As is usual, we only considered nonfinancial firms. Our full sample included 13,151 companies; however, depending on the measure of tax avoidance employed, the number of firms in the final sample was slightly smaller. Specifically, for ETR measures, all observations with negative numerators or denominators were eliminated [the range of the variable was also limited to between zero and one, following the usual procedure in previous research (e.g., Fernández-Rodríguez et al. 2021, p. 697)].

As we noted above, the dependent variable was tax avoidance, defined in three ways:

  • GAAP_ETR: Income Taxes-Total/Pretax Income

  • CASH_ETR: Income Taxes Paid/Pretax Income

  • BTD: {[(Pretax Income × STR)–Income Taxes-Total]/Assets-Total} × (–1)
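The three definitions above, together with the screening rules for the ETR measures, can be sketched as follows. The argument names are hypothetical, STR is taken to be the statutory tax rate, and treating ETR values above one as missing (rather than, for example, capping them) is an assumption about how the range was limited.

```python
import numpy as np

def etr(numerator, pretax_income):
    """ETR with observations screened out when a term is negative or the ratio exceeds one."""
    num = np.asarray(numerator, dtype=float)
    den = np.asarray(pretax_income, dtype=float)
    valid = (num >= 0) & (den > 0) & (num <= den)    # keeps the ratio within [0, 1]
    ratio = np.full(num.shape, np.nan)
    ratio[valid] = num[valid] / den[valid]
    return ratio

def tax_avoidance_measures(tax_total, tax_paid, pretax_income, total_assets, str_rate):
    """Return (GAAP_ETR, CASH_ETR, BTD) following the definitions in the text."""
    tax_total = np.asarray(tax_total, dtype=float)
    pretax_income = np.asarray(pretax_income, dtype=float)
    total_assets = np.asarray(total_assets, dtype=float)
    gaap_etr = etr(tax_total, pretax_income)
    cash_etr = etr(tax_paid, pretax_income)
    # BTD multiplied by -1 so that lower values indicate more tax avoidance.
    btd = -((pretax_income * str_rate - tax_total) / total_assets)
    return gaap_etr, cash_etr, btd
```

For a firm with pretax income of 100, total income taxes of 25, taxes paid of 20, total assets of 1,000, and a statutory rate of 30%, the sketch gives GAAP_ETR = 0.25, CASH_ETR = 0.20, and BTD = −0.005.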

Following literature on this topic, among the explanatory variables, we took discretionary accruals (ACCRUALS) as the main determinant, defined as follows:

  • ACCRUALS, estimated from the Jones model adjusted to ROA (Kothari et al. 2005) by year and industry.

    Specifically, we estimated discretionary accruals using the following equation:

    $$TACC_{i,t} = \alpha_{0} + \alpha_{1} \left( {\frac{1}{{TA_{i,t - 1} }}} \right) + \alpha_{2} \left( {\Delta REV - \Delta AR} \right)_{i,t} + \alpha_{3} PPE_{i,t} + \alpha_{4} ROALAG_{i,t} + \varepsilon_{i,t} ,$$
    (5)

    where TACC is total accruals; TA is total assets; \(\Delta\) REV is the change in sales from year t-1 to year t; \(\Delta\) AR is the change in accounts receivable from year t-1 to year t; PPE is gross property, plant, and equipment; and ROALAG is the ratio of earnings before income tax to lagged total assets. Accruals were obtained as absolute values of the residuals from the above regression model.
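A minimal sketch of this estimation step is given below: Eq. (5) is fitted by OLS within each year-industry group, and the absolute residuals are kept as ACCRUALS. The function name, the argument names, and the way groups are labelled are hypothetical; the inputs are assumed to be already constructed as defined above, and no minimum group size or outlier treatment is applied.

```python
import numpy as np

def discretionary_accruals(tacc, inv_lag_ta, d_rev_minus_d_ar, ppe, roa_lag, groups):
    """Absolute residuals of Eq. (5), estimated separately for each group label.

    groups : array of labels identifying the year-industry cell of each observation
    """
    tacc = np.asarray(tacc, dtype=float)
    Z = np.column_stack([
        np.ones(len(tacc)),   # alpha_0
        inv_lag_ta,           # 1 / TA_{i,t-1}
        d_rev_minus_d_ar,     # (delta REV - delta AR)_{i,t}
        ppe,                  # PPE_{i,t}
        roa_lag,              # ROALAG_{i,t}
    ])
    groups = np.asarray(groups)
    accruals = np.full(len(tacc), np.nan)
    for g in np.unique(groups):
        idx = groups == g
        coef, *_ = np.linalg.lstsq(Z[idx], tacc[idx], rcond=None)  # OLS within the cell
        accruals[idx] = np.abs(tacc[idx] - Z[idx] @ coef)          # |residual| = ACCRUALS
    return accruals
```

Taking absolute values means the measure captures the magnitude of discretionary accruals, not whether earnings were managed upward or downward.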

    In principle, companies engaging in more EM will report better results and probably face higher ETRs. Simultaneously, they may attempt to reduce tax payments by investing in tax planning, with consequences for their BTD. According to Guenther et al. (2021, p. 27), “a firm that manages earnings upward without paying additional tax on the managed earnings can be considered to have avoided tax.” Therefore, we can expect a negative relationship between ACCRUALS and our tax avoidance measures. However, conducting accounting and tax manipulation practices, even within legally permissible limits, can be costly for companies. Therefore, to avoid risks, when EM practices are more extensive, tax avoidance may be lower. Prior literature tends to agree with the former view, although recent research by Kałdoński and Jewartowski (2020, p. 1) has found that companies conducting more EM practices appear to be less willing to conduct aggressive tax planning, thus supporting higher ETRs.

    Regarding the control variables, the extensive previous literature is not conclusive, as in a review of results presented in Fonseca-Díaz et al. (2019, pp. 227–230). Thus, we provide the following opinions:

  • Company size (SIZE), measured as the logarithm of total assets. Previous results have been mixed because the literature offers two competing theories on this relation: political cost theory (suggesting a positive relationship between size and ETR) and political power theory (suggesting a negative relationship between size and ETR). However, independently of the two theories, Belz et al. (2019, p. 1) found that tax planning aspects potentially affect the size–ETR relation. Thus, larger companies can devote more resources to tax planning, which should result in a negative relationship between SIZE and the three tax avoidance measures proposed.

  • Leverage (LEV), defined as the ratio of total debt to total assets. Prior literature has commonly found a negative relationship between debt and ETR. However, Molina Llopis and Barberá Martí (2017, pp. 79–80) and Vintilă et al. (2018, p. 571), in their EU-focused studies, detected a positive relationship that may be explained by limitations on the deductibility of interest in recent years. Hence, we propose that the relationship between LEV and our tax avoidance measures may be positive for the major European economies during the period analyzed.

  • Capital intensity (CAPINT), defined as the ratio of gross property, plant, and equipment to total assets. Previous investigations have reported an inverse relationship between CAPINT and ETR owing to the deductibility of depreciation, a sign we also expect in our analysis.

  • Inventory intensity (INVINT), measured as the ratio of inventories to total assets. This variable, used as a complement to CAPINT, is not as frequently considered. However, either a positive relationship with ETR or a lack of relationship has been obtained in previous studies, as stocks do not generate profit or tax-deductible expenses.

  • ROA, measured as the ratio of earnings before income tax to total assets. Most of the previous literature has found a positive relationship between ETR and ROA. However, two recent EU-focused studies (Konečná and Andrejovská 2020, pp. 116–117; Thomsen and Watrin 2018, p. 53) have detected a negative relationship between ROA and ETR. We believe this finding is consistent with the fact that the most profitable companies can devote more resources to tax planning to attempt to reduce tax burden.

    Additionally, we considered GDP growth and statutory tax rate (STR) for each country.

  • STR in the country each year. Delgado et al. (2019) found divergence between the STRs of the five countries studied; hence, the interest in considering this variable.

  • GDP growth (GROWTH). This control variable is usual in studies focused on ETR in several countries (e.g., Fonseca-Díaz et al. 2019; Zeng 2019; Fernández-Rodríguez et al. 2021).

Table 2 reports the descriptive statistics, and Table 3 includes the correlation matrix. Moreover, the number of final observations is different for the three dependent variables considered.

Table 2 Descriptive statistics
Table 3 Correlation matrix

The results in Table 2 indicate that mean GAAP_ETR was higher than mean CASH_ETR in all the countries analyzed. Therefore, in the five member states considered, deferred taxation appears to be used to postpone tax payments. However, average ETRs were generally lower than average STRs, except in Italy. Indeed, Italy, with an average STR of 32%, had a GAAP_ETR of 39.7% and a CASH_ETR of 37.9% (i.e., the large Italian companies bore a higher tax burden than the average STR). Consistent with this, average BTD in Italy was the highest, with a mean value of 0.005, indicating that the differences between accounting and tax rules are detrimental to Italian companies, as they increase their CIT payments. Conversely, Spain, the country with the largest gap between its STR and ETR averages, was the only one with a negative average BTD; hence, Spanish companies manage to reduce CIT payments owing to the differences between accounting and tax regulations. Regarding the explanatory variables by country, nothing was remarkable, and all countries presented homogeneous average values.

Table 2 also demonstrates that the average of GAAP_ETR was always higher than that of CASH_ETR, except for companies with no declared activity. Therefore, CIT expense was higher than the CIT payment, indicating that companies utilize deferred taxation to postpone tax payments. BTD had average values around zero or positive, except in the health care, utilities, and real estate sectors. ACCRUALS, the main explanatory variable, had a mean value in line with those reported for most sectors. The remainder of the explanatory variables exhibited homogeneous mean values in most of the sectors analyzed.

As for the ANN-based modeling and testing results, ANNs were fitted using NLS and employing the Levenberg–Marquardt algorithm, with the maximum number of iterations set at 300 and an intensive preliminary random search over the parameter space to reduce risk of local minima. Both the numbers of Monte Carlo replications employed in the linearity tests and of bootstrap replications in the significance tests were set at 500. We programmed and executed all calculations in MATLAB.
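The fitting procedure can be sketched in Python rather than MATLAB. This is not the authors' code: it assumes a single-hidden-layer network with a linear term plus m logistic units (so that m = 0 reduces to the linear model), uses SciPy's `least_squares` with `method="lm"` as the Levenberg–Marquardt routine, and approximates the 300-iteration cap through the function-evaluation budget `max_nfev`. The random search shown here draws inner weights at random and solves for the outer coefficients by OLS, which is one possible way to pick a starting point.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.special import expit  # numerically stable logistic function

def ann_predict(theta, X, m):
    """Linear part plus m logistic hidden units (m = 0 is the linear model)."""
    k = X.shape[1]
    out = theta[0] + X @ theta[1:1 + k]
    hidden = theta[1 + k:].reshape(m, k + 2)  # rows: [c_j, a_j0, a_j1..a_jk]
    for c, a0, *a in hidden:
        out = out + c * expit(a0 + X @ np.asarray(a))
    return out

def fit_ann(X, y, m, n_search=200, max_iter=300, seed=0):
    """NLS fit: random search for a starting point, then Levenberg-Marquardt."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    best_theta, best_sse = None, np.inf
    for _ in range(n_search):
        inner = rng.normal(size=(m, k + 1))           # [a_j0, a_j1..a_jk]
        H = expit(inner[:, 0] + X @ inner[:, 1:].T)   # hidden outputs, (n, m)
        Z = np.column_stack([np.ones(n), X, H])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)  # outer coefficients by OLS
        sse = np.sum((y - Z @ coef) ** 2)
        if sse < best_sse:
            hidden = np.column_stack([coef[1 + k:], inner])
            best_theta = np.concatenate([coef[:1 + k], hidden.ravel()])
            best_sse = sse
    # With a numerical Jacobian, "lm" counts the evaluations used to build it,
    # so the budget is scaled by the number of parameters (an approximation).
    fit = least_squares(lambda t: ann_predict(t, X, m) - y, best_theta,
                        method="lm", max_nfev=max_iter * (best_theta.size + 1))
    return fit.x
```

Because every candidate starting point nests the linear regressors, the in-sample fit of this sketch should not be worse than OLS; whether the extra terms are justified is a separate question for the penalized and cross-validated criteria.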

Table 4 reports the results of the ANN-based misspecification tests. We conducted these tests to detect potential nonlinearities in the relationships under study. Nonlinearity is a precondition for nonparametric modeling, as without this evidence, nonlinear or nonparametric modeling is pointless. Results in Table 4 clearly signal the presence of nonlinear patterns, with p-values close to zero and the null of linearity strongly rejected in all cases.

Table 4 Results of the ANN-based misspecification tests

Table 5 includes goodness-of-fit statistics for the linear and ANN models. In relative terms, the results demonstrate clear increases in the R-squared statistics of the optimal (cross-validated) ANN models against their linear counterparts: 29.10% in CASH_ETR, 39.92% in GAAP_ETR and 6.60% for BTD. In absolute terms, the R-squared statistics for all the models were low, which corroborates the highly noisy nature of the relationship under study.

Table 5 Goodness-of-fit and cross-validated statistics for linear and ANN models

Table 5 also reports the values of the complexity-penalization criteria (AIC, SIC, and two cross-validated error measures). To compute the cross-validated diagnostics, we employed fourfold cross-validation (see Note 7). These criteria allowed us to select ANN models in a way that penalizes complexity and hopefully minimizes overfitting risk (linear models may also be regarded as ANNs with \(m = 0\) nonlinear terms). The results in Table 5 demonstrate that the model-selection criteria roughly coincide: the ANN models finally selected by the complexity-penalization criteria included only a single neuron in the cases of CASH_ETR and BTD and two neurons in the GAAP_ETR case. On the one hand, this would confirm the conclusions of linearity testing (Table 4), with nonlinear patterns also being detected by the cross-validated diagnostics and information criteria. On the other hand, the small number of nonlinear terms selected by the penalization criteria suggests that only moderate departures from linearity would be operating in the relationships at hand, with a large portion of the variability observed as purely random.
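How information criteria trade fit against complexity can be shown with a small sketch. The formulas are the common Gaussian least-squares forms of AIC and SIC, which is an assumption, since the exact expressions used for Table 5 are not given here. The parameter count assumes a network with an intercept, k linear terms, and k + 2 parameters per hidden unit. The SSE values in the usage lines are hypothetical.

```python
import math

def n_params(k, m):
    """Intercept + k linear coefficients + (k + 2) parameters per hidden unit."""
    return 1 + k + m * (k + 2)

def aic(sse, n, p):
    """Akaike criterion, Gaussian least-squares form (assumed)."""
    return n * math.log(sse / n) + 2 * p

def sic(sse, n, p):
    """Schwarz criterion, Gaussian least-squares form (assumed)."""
    return n * math.log(sse / n) + p * math.log(n)

def select_m(sse_by_m, n, k, criterion=sic):
    """Return the number of hidden units m with the smallest criterion value."""
    return min(sse_by_m, key=lambda m: criterion(sse_by_m[m], n, n_params(k, m)))

# Hypothetical in-sample SSEs for m = 0, 1, 2 hidden units.
sse_by_m = {0: 120.0, 1: 100.0, 2: 99.5}
chosen = select_m(sse_by_m, n=1000, k=8)
```

In this made-up example the second neuron lowers the SSE only marginally, so the penalty term dominates and the smaller network is preferred.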

Table 6 reports the marginal-effect estimates and the outcome of the significance tests for each explanatory variable. Given that the above results indicate that ANNs outperformed linear models in this setting, and for the sake of brevity, we only comment on the results from the ANNs, with those from the linear models reported for benchmarking purposes (see Note 8). This is because all the diagnostics above strongly indicated that the relevant relationships were nonlinear; hence, a linear model would be inappropriate and may produce misleading results in this case.

Table 6 Marginal effect (mean derivative, \(\hat{D}_{k}\)) estimates and significance testing results. Linear versus optimal cross-validated ANN results
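The mean derivative \(\hat{D}_{k}\) can be illustrated as the sample average of the partial derivative of the fitted function with respect to regressor k. The sketch below assumes a network of the form \(f(x) = b_{0} + x'\beta + \sum_{j} c_{j} G(a_{0j} + x'a_{j})\) with logistic \(G\); this architecture and all argument names are assumptions, and the bootstrap significance tests are not reproduced.

```python
import numpy as np

def mean_derivatives(X, beta, c, a0, A):
    """Sample mean of df/dx_k for the assumed single-hidden-layer network.

    X: (n, k) regressors; beta: (k,) linear coefficients;
    c: (m,) hidden-unit weights; a0: (m,) hidden biases; A: (m, k) inner weights.
    """
    G = 1.0 / (1.0 + np.exp(-(a0 + X @ A.T)))  # hidden activations, (n, m)
    slope = G * (1.0 - G)                      # logistic derivative G'(z)
    grads = beta + (slope * c) @ A             # per-observation gradients, (n, k)
    return grads.mean(axis=0)
```

With no hidden units (m = 0) the function returns `beta`, so the linear model's marginal effects are its coefficients, which is what makes the linear and ANN columns comparable.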

Beginning with the main variable in the study, the results indicated a positive and statistically significant relationship between accruals and the two ETR indicators. Hence, higher levels of EM would come along with a higher tax burden, implying less tax avoidance. These results suggest that companies may increase their earnings by resorting to EM; however, this effort may be self-defeating as it would produce higher taxable income and, therefore, a higher ETR. Thus, companies do not seem to be exploiting tax manipulation to reduce tax payments, especially since accruals do not appear to affect BTD, which suggests that the gap between accounting and taxation is not influenced by EM.

We also observed a negative, statistically significant relationship between size and two of the measures analyzed–GAAP_ETR and BTD–whereas we found no significant relationship in CASH_ETR. Inverse dependence may result from the largest companies reducing their taxation through deferred taxation owing to the resources devoted to fiscal planning.

Leverage exhibited a positive, significant marginal effect in the three cases. Hence, firms would be less tax aggressive when in greater debt.

Regarding asset composition, CAPINT presented a negative, significant marginal effect in the three cases, which may be explained by the deductibility of depreciation in most countries. INVINT results were nonconclusive.

ROA also had a negative, significant relationship for the three tax avoidance measures, implying that the most profitable companies are also more tax aggressive, possibly because they may allocate more resources to tax planning to reduce taxation.

As for STR, marginal-effect estimates were positive and significant in the cases of CASH_ETR and GAAP_ETR and negative (but also significant) for BTD. Expecting that higher statutory tax rates lead to higher ETRs is reasonable. Because a higher STR increases both payment-based and expense-based CIT (the latter includes payment plus deferred taxation), differences between accounting and taxation may not contribute to reducing GAAP_ETR. The effect of STR on BTD, although statistically significant, was quite small in magnitude.

The relationship between GDP growth and tax aggressiveness was positive, given the inverse association between GROWTH and the two ETR indicators. With higher economic growth, companies are more likely to obtain improved results; however, simultaneously, they will implement (tax avoidance) strategies to alleviate taxation. Finally, for BTD, economic growth was not statistically significant.

Discussion

The results in Tables 4 and 5 above strongly indicate the presence of nonlinear features in the relationship between tax avoidance and EM. Hence, relying on more general approaches capable of properly detecting and analyzing those nonlinearities is important. In this sense, the application of the ANN regressions in this field should constitute a useful addition that fills a methodological gap in the literature. Related studies (collected in Table 1 above) have analyzed the relationship between tax avoidance and EM by employing exclusively linear regression models. However, some recent studies (e.g., Kliestik et al. 2021) have proposed the existence of nonlinear patterns that, by definition, cannot be properly captured through linear models. Moreover, recent empirical evidence (Delgado et al. 2014; Molina Llopis and Barberá Martí 2017) strongly suggests that this may be the case. Our empirical results would confirm the findings of previous studies (for the large database of EU companies we have studied), underlining the need to employ more sophisticated nonlinear methods in this field to reach more accurate conclusions in the analysis. The holistic ANN-based approach we have utilized here makes it possible to estimate nonlinear relationships and to test several statistical hypotheses of interest (e.g., linearity and variable significance) in a systematic way, similar to what is available for classical linear models. As the approach is nonparametric in nature, it is flexible and does not require prior model specification.

Considering the main variable, our results are in line with those obtained by some recent studies (Kałdoński and Jewartowski 2020; Richardson et al. 2016; Wang and Mao 2021), which found a positive relationship between EM and ETR. Certainly, firms engaging in EM can employ tax planning strategies to reduce their tax burden; however, corporate tax aggressiveness typically results in higher BTD that increases scrutiny from regulators and external monitors (Badertscher et al. 2009). Moreover, companies involved in EM may be reluctant to deviate excessively in terms of tax burden from their industry peers as they want to avoid raising the suspicion of tax authorities, regulators, or savvy investors (Armstrong et al. 2019). Such scrutiny makes hiding the real motives of managers’ actions more difficult (Hanlon et al. 2014).

Following Guenther et al. (2021), tax avoidance is influenced by EM as pretax financial accounting income is used as a benchmark for how tax payments are measured. A company that manages earnings upwards without paying additional tax on the managed earnings may be considered to have avoided tax. In summary, our findings suggest that companies in the largest European economies do not evade taxes.

As described above, most previous studies have found an inverse relationship between debt and ETR. However, our findings are in line with several studies that report the opposite sign. Feeny et al. (2006) detected a positive relationship in their analysis for Australia, where limits are imposed on allowable interest deductions to discourage excessive leveraging. In recent years, European countries have also limited the deductibility of interest, which may explain the positive effect observed. Indeed, all the large countries now have limits in force: Germany and Italy since 2008, the United Kingdom since 2009, and France and Spain since 2013. Similarly, Molina Llopis and Barberá Martí (2017) and Vintilă et al. (2018) also discovered a positive relationship, the former in a study focused on EU member states during 2004–2015, and the latter for emerging European countries in 2000–2016.

Regarding asset composition, our results are in line with those obtained in previous investigations, as can be seen in the review of literature by Fonseca-Díaz et al. (2019). Contrary to our findings, most prior studies have detected a positive relationship between ETR and ROA. However, in a recent analysis for the EU in the 2008–2016 period, Konečná and Andrejovská (2020) found that ROA was the most influential variable on ETR and had a negative effect. Moreover, they highlighted that companies that were more profitable had lower costs associated with tax administration; hence, they had more funding to invest in tax planning, which led in the end to a reduction in their ETRs. Our findings are in line with that result. Additionally, Thomsen and Watrin (2018) highlighted that, for both EU and USA samples, increases in ROA were associated with increases in tax avoidance (i.e., a lower ETR).

As for STR, our results agree with the limited evidence provided by previous literature. Particularly, Delgado et al. (2014) also found a positive, significant relationship between ETR and STR for the EU’s fifteen countries in the period 1992–2009.

Conclusion

In this paper, we have studied the relationship between tax avoidance and EM in the EU. More precisely, we analyzed the five largest economies in the EU (Germany, the United Kingdom, France, Italy, and Spain), all with GDP exceeding one trillion euros. The study period was 2006–2015, and we used data from the Compustat database. We used ANNs in our approach, exploiting their nonparametric regression and inference capabilities to more readily deal with the nonlinearities potentially present in the dataset. The use of this methodology, along with the focus of our analysis on the five largest EU economies (all of them having similar corporate tax rates as a percentage of GDP), is the main contribution of the study to tax avoidance literature.

We have considered three tax avoidance measures, two based on ETR and another based on BTD, and along with the traditional explanatory variables in the literature, we focused on discretionary accruals, as measured by Kothari et al. (2005). Results from ANN regressions clearly indicate the presence of nonlinearities in the relationships we have studied. Moreover, the results suggest that the more extensive the EM, the heavier the tax burden and the lesser the tax avoidance. In addition, companies do not seem to be exploiting tax manipulation to reduce their tax payments. This is supported by the finding that accruals do not affect BTD; hence, the gap between accounting and taxation does not appear to be influenced by EM. In conclusion, our findings indicate that companies in Europe’s largest economies do not evade taxes.

Furthermore, the ANN methodology has allowed us to provide new evidence regarding the classic explanatory variables of tax avoidance. The ANN-based nonparametric test we employed to test for the significance of each explanatory variable directly exploits robustness checking for each explanatory variable of the model. Furthermore, we employed a model-fitting and model-selection protocol that ensures that models are “optimal”. Our results indicated that most variables in the theoretical economic model were statistically significant, with empirical evidence provided by the data being weak (i.e., statistically nonsignificant) only in a few of the explanatory variables considered. First, our results indicated that the largest companies reduced their taxation through deferred taxation owing to the resources devoted to fiscal planning. Second, firms were less tax aggressive when they were in greater debt. This may be because, in recent years, the European countries have limited the deductibility of interest. Third, companies with higher ratios of property, plant, and equipment exhibited higher tax avoidance. Finally, the most profitable companies were also more tax aggressive, possibly because they could allocate more resources to tax planning to reduce their taxation.

The above findings based on the ANN methodology are relevant as they add new evidence to previously available literature. Our results are in line with the latest related research available for the USA, China, and Poland, which indicates that large companies do not use EM for CIT purposes. This may indicate that they need to maintain their reputation rather than engage in aggressive tax practices. We believe this information could be of great interest to both governments and companies. Governments could benefit by applying our findings when setting their tax policies and undertaking tax reforms and when they must adopt accounting reforms that implement the various options proposed by the IFRS. Companies could use our results to help managers in their business tax planning, given the effects that investment decisions, financing, and choice of accounting criteria can have on tax avoidance.

Finally, we must also note some limitations of this study and several future research avenues they suggest. First, as discussed above, although we have relied on the most frequently employed tax avoidance measures, the literature also includes alternative approaches that must be considered. Second, several options for estimating EM exist. In this study, we employed the Jones model with lagged ROA both because it has been heavily used in previous literature and because it has been reported to be superior (Reguera Alvarado et al. 2015). We believe that further extensions of the analyses conducted in this study, with other tax avoidance measures and different approaches to EM assessment, will deliver interesting results. Third, we have included business characteristics commonly used in previous literature as control variables. Although our choice was guided both by the common practice in the field and by the nature and limitations of the information available in databases, we believe that incorporating several qualitative factors as additional control variables into the models will result in significant, new insights, provided that this type of additional information eventually becomes publicly available to researchers. Fourth, we conducted the analysis on the largest European countries, and we caution against directly extending the results to smaller economies. Another future research goal involves exploiting this new methodology to analyze the EU as a whole and other areas like the USA and emerging economies.

Availability of data and materials

We extracted the datasets for our study from the Compustat database.

Notes

  1. Mocanu et al. (2021) offered a summary of relevant recent international studies on corporate tax avoidance.

  2. Gebhart (2017), Hanlon and Heitzman (2010) and Lietz (2013b) presented an analysis of the measures of tax avoidance employed in the literature.

  3. Other measures of tax avoidance are UTB and tax shelter activity, reviewed by Hanlon and Heitzman (2010) and Lietz (2013b), though they are less frequently used in the literature.

  4. Dechow et al. (2010) presented a summary of widely used models of accruals, and Kourdoumpalou (2017) presented several models employed in the literature to obtain the accruals. In addition, McMullin and Schonberger (2020) recently proposed new techniques, via propensity-score matching and entropy balancing, to control for common accrual determinants.

  5. Several studies cited in section "Literature review and hypotheses" (e.g. Blaylock et al. 2012, 2015; Frank et al. 2009; Jackson 2015; Sundvik 2017; Tang 2015) have used the modified Jones model with ROA (Kothari et al. 2005) in their research on tax avoidance and EM.

  6. The above sup-LM approach, originally developed by Stinchcombe and White (1998), extends Bierens’s (1990) classical specification test.

  7. The dataset was randomly divided into four subsets, and the out-of-sample predictive performance in each subset of the model fitted on the remaining three subsets was calculated and then averaged out, as stated in the Appendix.

  8. The results of the linear and ANN models were similar in the case of BTD but quite different for CASH_ETR.

Abbreviations

ACCRUALS: Discretionary accruals
AIC: Akaike’s information criterion
ANNs: Artificial neural networks
BTC: Book-tax conformity
BTD: Book-tax differences
BTG: Book-tax gap
CAPINT: Capital intensity
CASH_ETR: Cash effective tax rate (cash taxes paid over pretax income)
CIT: Corporate income tax
CRSP: Center for Research in Security Prices
CSMAR: China Stock Market & Accounting Research
CVMAE: Cross-validated mean absolute error
CVRMSE: Cross-validated root mean squared error
DTAX: Discretionary permanent book-tax differences
EM: Earnings management
ETR: Effective tax rate
EU: European Union
GAAP_ETR: GAAP effective tax rate (total tax expense over pretax income)
GDP: Gross domestic product
GROWTH: Gross domestic product growth
IFRS: International Financial Reporting Standards
INVINT: Inventory intensity
IRS: Internal Revenue Service
LEV: Leverage
MAE: Mean absolute error
NLS: Non-linear least squares
RMSE: Root mean squared error
ROA: Return on assets
SD: Standard deviation
SIC: Schwarz’s information criterion
SIZE: Size of the company
STR: Statutory tax rate
UTB: Unrecognised tax benefit

Acknowledgements

We gratefully acknowledge the comments and improvement suggestions from the Editor and five Anonymous Reviewers. Any remaining shortcomings are the responsibility of the authors.

Funding

The authors gratefully acknowledge the funding from the Spanish Ministry of Science and Innovation, project MCI-21-PID2020-115183RB-C21.

Author information

Contributions

All authors made substantial contributions to the conception of the work; analysis and interpretation of data; drafting of the work and revising it. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Roberto García-Fernández.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Variable definitions

Variable name: Variable definition and measurement procedure

Dependent variables

  • GAAP_ETR: Total tax expense divided by pretax income. Both the numerator and the denominator are required to be positive. The variable is winsorized at 0 and 1.

  • CASH_ETR: Total cash taxes paid divided by pretax income. Both the numerator and the denominator are required to be positive. The variable is winsorized at 0 and 1.

  • BTD: (Pretax income multiplied by the statutory tax rate, minus income tax expense, all scaled by total assets) multiplied by (−1), following Tang et al. (2017).

Independent variables

Main variable

  • ACCRUALS: Discretionary accruals, estimated from the Jones model adjusted to ROA (Kothari et al. 2005) by year and industry.

Control variables

  • SIZE: Natural logarithm of total assets.

  • LEV: Total debt scaled by total assets.

  • CAPINT: Book value of gross property, plant, and equipment scaled by total assets.

  • INVINT: Inventories scaled by total assets.

  • ROA: Pretax income scaled by total assets.

  • STR: Statutory tax rate.

  • GROWTH: Gross domestic product growth.

  • SECTOR: A dummy variable for each sector.

  • YEAR: A dummy variable for each year.

  • COUNTRY: A dummy variable for each country.

All continuous business variables are winsorized at the 1st and 99th percentiles of the distribution.

Artificial neural networks (ANN)

  • AIC: Akaike’s information criterion.

  • SIC: Schwarz’s information criterion.

  • RMSE: (Out-of-sample) root mean squared error: \(RMSE_{j} = \sqrt{\frac{1}{n_{j}} \sum_{i \in S_{j}} \left( y_{i} - \hat{f}_{(j)}(\mathbf{x}_{i}) \right)^{2}}\)

  • MAE: (Out-of-sample) mean absolute error: \(MAE_{j} = \frac{1}{n_{j}} \sum_{i \in S_{j}} \left| y_{i} - \hat{f}_{(j)}(\mathbf{x}_{i}) \right|\)

  • CVRMSE: (Four-fold) cross-validated root mean squared error: \(CVRMSE = \frac{1}{4} \sum_{j = 1}^{4} RMSE_{j}\)

  • CVMAE: (Four-fold) cross-validated mean absolute error: \(CVMAE = \frac{1}{4} \sum_{j = 1}^{4} MAE_{j}\)

with \(n_{j}\) being the size of subset \(S_{j}\) \((j = 1, \ldots, 4)\) and \(\hat{f}_{(j)}(\cdot)\) being the model (linear or ANN, respectively) fitted to the reduced sample with subset \(S_{j}\) excluded.
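The cross-validated diagnostics defined above can be sketched as follows. The random split into four subsets follows Note 7; the `fit` and `predict` callables, the seed, and the OLS helpers used as a stand-in model are hypothetical and only serve to make the example self-contained.

```python
import numpy as np

def cross_validate(X, y, fit, predict, n_folds=4, seed=0):
    """Return (CVRMSE, CVMAE): fold-wise out-of-sample errors, averaged."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)  # subsets S_j
    rmse, mae = [], []
    for j, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for i, f in enumerate(folds) if i != j])
        model = fit(X[train_idx], y[train_idx])         # fitted with S_j excluded
        err = y[test_idx] - predict(model, X[test_idx])
        rmse.append(np.sqrt(np.mean(err ** 2)))         # RMSE_j
        mae.append(np.mean(np.abs(err)))                # MAE_j
    return float(np.mean(rmse)), float(np.mean(mae))

def fit_ols(X, y):
    """Stand-in model: OLS with an intercept (the m = 0 case)."""
    coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(len(y)), X]), y, rcond=None)
    return coef

def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]
```

Any model that exposes a fit step and a predict step can be passed in place of the OLS helpers, so the linear and ANN specifications can be compared on the same folds by reusing the same seed.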

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Delgado, F.J., Fernández-Rodríguez, E., García-Fernández, R. et al. Tax avoidance and earnings management: a neural network approach for the largest European economies. Financ Innov 9, 19 (2023). https://doi.org/10.1186/s40854-022-00424-8

  • DOI: https://doi.org/10.1186/s40854-022-00424-8
