 Research
 Open Access
A new hybrid method with data-characteristic-driven analysis for artificial intelligence and robotics index return forecasting
Financial Innovation volume 9, Article number: 75 (2023)
Abstract
Forecasting returns for the Artificial Intelligence and Robotics Index is of great significance for financial market stability and the development of the artificial intelligence industry. To provide investors with a more reliable reference for artificial intelligence index investment, this paper selects the NASDAQ CTA Artificial Intelligence and Robotics (AIRO) Index as the research target and proposes innovative hybrid methods to forecast its returns by considering its multiple structural characteristics. Specifically, this paper uses the ensemble empirical mode decomposition (EEMD) method and the modified iterative cumulative sum of squares (ICSS) algorithm to decompose the index returns and identify the structural breakpoints. Furthermore, it combines the least-square support vector machine approach with the particle swarm optimization method (PSO-LSSVM) and the generalized autoregressive conditional heteroskedasticity (GARCH) type models to construct innovative hybrid forecasting methods. On the one hand, the empirical results indicate that the AIRO index returns have complex structural characteristics, and present time-varying and nonlinear characteristics with high complexity and mutability; on the other hand, the newly proposed hybrid forecasting method (i.e., the EEMD-PSO-LSSVM-ICSS-GARCH models), which considers these complex structural characteristics, can yield the optimal forecasting performance for the AIRO index returns.
Introduction
Artificial Intelligence and Robotics are key technologies of the Fourth Industrial Revolution, which are rapidly changing how people live and work. Since the onset of the COVID-19 pandemic, the WHO-advised social distancing has led to a more virtual existence, which may further accelerate the development of Artificial Intelligence (AI) and Robotics technologies. During the past decade, AI technologies have been hailed by many academics and practitioners as revolutionary and game-changing in the business world, a sphere in which AI and robotics activities have significantly increased (Felten et al. 2018; Furman and Seamans 2019; Gruetzemacher et al. 2021; Mikalef and Gupta 2021).
Meanwhile, AI and robotics stocks have also attracted wide investor attention, and investment in AI has grown rapidly (Bughin et al. 2017). According to the AI Index 2021 annual report, despite the pandemic, 2020 still saw a 9.3% increase in private AI investment from 2019, a higher percentage jump than in 2019 (5.7%). Furthermore, the statistical data also show that the United States remains the leading destination for private investment, with over USD 23.6 billion in funding in 2020, followed by China (USD 9.9 billion) and the United Kingdom (USD 1.9 billion) (Zhang et al. 2021a; b). Therefore, AI and robotics technology companies are exerting a growing influence on the financial market, representing an interesting investment option for portfolio diversification. The growth trend in AI investment is evidently persistent, and how to seize this investment boom has become an important question. Relevant investors must first choose a reliable index reflecting the investment opportunities associated with AI technology. Further, they need to make an accurate analysis and prediction of index returns, which could help them dynamically grasp how the industry as a whole evolves, and enable them to reasonably develop the optimal portfolio strategy (Zhang and Wang 2019; Zhang et al. 2020; Ghosh et al. 2022).
At present, the indices related to AI and robotics mainly include the Nasdaq CTA Artificial Intelligence and Robotics Index (NQROBO Index), the Global Robotics and Automation Index (ROBO Index), and the Indxx Global X Robotics & Artificial Intelligence Index (IBOTZ Index). The Nasdaq CTA Artificial Intelligence and Robotics Index (hereafter referred to as the AIRO index) is designed to track the performance of companies engaged in the AI and robotics segment of the technological, industrial, medical, and other economic sectors. Therefore, this index is the most important among the three major indices since it can comprehensively reflect the overall stock price change and the associated development of the AI industry. Based on its price data, it can be established that from December 19, 2017 to July 23, 2021, the cumulative return rate of the AIRO index reached 84.84%, and the annualized return rate was 33.92%. The movement in this index is closely tied to other financial assets (Le et al. 2021; Tiwari et al. 2021). Thus, it is essential to accurately forecast the AIRO index returns which can provide a reference for investors to select suitable index funds and investment tools, and to help them target the investment opportunities of the growing AI and robotics industries.
However, the literature on AIRO index returns forecasting is relatively scarce, and most of the research focuses on AI progress forecasting and the application of AI technology in forecasting tasks (Chang et al. 2018; Mascio et al. 2021). In particular, research on the nonlinear and time-varying characteristics of this index is scarce and needs to be supplemented. Currently, the commonly used financial time-series forecasting models can be classified into traditional econometric models and machine-learning methods; both possess advantages and disadvantages when used in forecasting. For example, the traditional econometric models are usually effective in capturing the linear and time-varying components, but they cannot fully capture nonlinear components and have several requirements for data stability (Hung 2011; Lin et al. 2011; Zhang et al. 2015). In contrast, the machine-learning methods are suitable for predicting nonstationary, nonlinear time series because of their flexible nonlinear function-fitting capabilities and less-restrictive assumptions, but their forecasting performance is easily affected by data size and parameter settings (Wang et al. 2005; Psaradellis and Sermpinis 2016). The literature further shows that single models, each characterizing a specific feature of the data, usually cannot identify all states and correlations in complex time series (Khashei and Bijari 2011). Consequently, their forecasting accuracy suffers, since they are unable to extract the inherent dynamics. Given these limitations, hybrid models gradually emerged in the financial time-series prediction literature (Zhang and Zhang 2018; Li et al. 2021; Xiao et al. 2021). Against this background, two issues arise for the AIRO index returns: which data characteristics this index exhibits, and how to design a reliable prediction method that accurately explores the intrinsic structural characteristics of the returns.
Hence, this paper focuses on the structural characteristics of the AIRO index and attempts to combine the econometric models and machine-learning methods to develop a hybrid forecasting approach, given the complexity of the data-generating process of the AIRO index. Specifically, this paper first employs the ensemble empirical mode decomposition (EEMD) method to decompose the AIRO index return series into a series of intrinsic mode functions (IMFs) and the residual term. Further, it uses a modified version of the iterative cumulative sum of squares (ICSS) algorithm to identify the structural breakpoints. Second, different models (namely, the least-square support vector machine approach with the particle swarm optimization method (PSO-LSSVM) and the generalized autoregressive conditional heteroskedasticity (GARCH) type models) are developed to forecast the IMFs and the residual term, respectively, with the sum of forecasted values for all components being the final forecasting results of the decomposition and integration models. Finally, this paper employs two methods to combine the econometric and machine-learning models, and constructs innovative hybrid forecasting models that consider the complexity of the data-generating process of the AIRO index.
The contribution of this paper involves three main aspects: (1) Previous research has primarily focused on the common comprehensive indices in the financial market, such as the S&P500 index; see Rapach and Zhou (2021) for a detailed discussion of the literature associated with international stock market forecasting. However, these indices cannot reflect and predict the development of the AI industry. This paper conducts an in-depth analysis and forecasting of the Nasdaq CTA Artificial Intelligence and Robotics Index to provide additional insights and implications for traders. This can enable them to make informed investment decisions while operating in different time horizons. (2) Previous studies have usually employed a single forecasting model; however, such a model cannot systematically capture the inherent structural characteristics of overall index returns (Rapach and Zhou 2013; Tiwari et al. 2016). Hence, this paper employs the EEMD and the modified ICSS algorithms to mine the structural features in the AIRO index returns. The AIRO index returns not only have linear and nonlinear characteristics, but also high complexity and mutability, which provide the basis and guidance for constructing the relevant measurement and mathematical models used for the AIRO index. (3) Based on the complex inherent characteristics of the AIRO index, this paper is unique in exploring appropriate forecasting models from multiple perspectives for the AIRO index returns. These empirical findings provide fresh evidence for investors and portfolio managers concerning hedging and diversification benefits in the era of the Fourth Industrial Revolution.
The empirical results imply that, given the data characteristics, the hybrid model (i.e., EEMD-PSO-LSSVM-ICSS-GARCH) can overcome the limitations of a single model and effectively depict the time-varying, nonlinear, complex, and mutable characteristics of the AIRO index returns. Thus, this model achieves superior forecasting performance for the AIRO index returns, providing investors with a reliable reference for portfolio selection and asset management. Moreover, this paper uses the forecasting results of the new hybrid model to construct different portfolio strategies, finding that it can not only improve the forecasting performance of the single models, but also increase their economic value.
The remainder of this paper is organized as follows: “Literature review” section reviews the relevant literature. “Methods” section briefly describes the models, “Data descriptions” section presents the data set, “Results and discussions” section discusses the empirical results, and “Conclusions and future work” section offers the concluding remarks.
Literature review
In the recent past, the adoption and use of AI and robotics technologies in several industries has increased significantly (Acemoglu et al. 2018; Felten et al. 2018; Furman and Seamans 2019; Graetz and Michaels 2018; Webster and Ivanov 2020). For instance, Furman and Seamans (2019) showed that the worldwide shipments of robots rose by approximately 150% between 2010 and 2016, and that the share of jobs demanding AI skills was nearly five times higher in 2016 than in 2013. Acemoglu and Restrepo (2018) indicated that while AI and robotics can help to increase productivity growth, these new technologies will render labor redundant. Webster and Ivanov (2020) indicated that AI and robotics were all-pervading in various aspects of the economy, including manufacturing, trading in financial markets, chatbots in customer relationship management, and so on. Enholm et al. (2022) discussed the impact of AI on the evolution of organizations, leading to competitive performance, and identified several implications of AI for the process and the firm.
Currently, many scholars choose the AIRO index, which can well represent the performance of technology-intensive companies in the AI and robotics fields, to depict the development of the AI industry. For instance, Tiwari et al. (2021) chose this index and employed the time-varying Markov-switching copula models to provide evidence of a time-varying Markov tail-dependence structure and dynamics between AI and carbon prices. Huynh et al. (2020) used this index to explore the role of AI and robotics stocks, green bonds, and Bitcoin in portfolio diversification, and proved that the portfolios of these assets exhibited heavy-tail dependence. Demiralay et al. (2021) investigated the interdependence between AI and robotics stocks and traditional and alternative assets. They identified weak (strong) co-movements between AI and other investments in shorter (longer) investment horizons.
However, these studies focus on the correlation between the AIRO index and other financial assets, and lack systematic research on index return forecasting. Most of the forecasting research focuses on AI progress forecasting, or the application of AI technology in the forecasting field (Xiao and Ke 2021). For instance, Chang et al. (2018) implemented the social network (SN) technique to examine a corporation’s competitive edge. They fed business relationship and financial information into an AI-based technique to construct a forecasting model. Gruetzemacher et al. (2021) described the development of a research agenda for forecasting the progress of AI. They utilized the Delphi technique to elicit and aggregate experts’ opinions on which questions and methods to prioritize.
Thus, previous research has used this index widely, indicating that it can well track the performance of technology-intensive companies active in the AI and robotics sector. Yet, research on return forecasting for this index has yielded no progress. Many scholars point out that, in the financial market, accurately predicting the return sequence of financial assets is one of the most challenging tasks. It is also a crucial aspect of investors’ ability to formulate portfolio strategies, and it is important in pricing assets and evaluating portfolio performance.
To date, many scholars have focused on stock index return forecasting, and machine-learning models as well as traditional econometric models have been widely used to do so (Zhang and Wang 2019; Zhang et al. 2020; Ghosh et al. 2022; Sebastio and Godinho 2021). For example, Giovannellia et al. (2021) extracted the information contained in a high number of macroeconomic predictors using large dimensional factor models to forecast the S&P 500 index return, and their results showed that the Generalized Dynamic Factor Model can help predict stock returns. Mascio et al. (2021) assessed the performance of three forecasting models in predicting the one-month-ahead S&P 500 Index return: the sentiment index model, a combined kitchen-sink forecasting model, and a LASSO model. The results showed that the LASSO model outperformed the other ones. Salisu and Vo (2020) used a historical average-based model to evaluate the relevance of health-news trends in predicting stock returns during the COVID-19 period. Their results revealed that the model incorporating the health-news index outperformed the benchmark model. Thus, predicting index returns in the financial market is clearly of great theoretical and practical significance. In this regard, both the traditional measurement models and the machine-learning models have attracted considerable attention.
The relevant literature indicates that single models, each characterizing a specific feature of the data, usually cannot identify all states and correlations in complex time series (Khashei and Bijari 2011). Moreover, some studies indicate that both machine-learning and traditional econometric models possess their own advantages and disadvantages in the process of forecasting (Zhang et al. 2015; Wang et al. 2005; Psaradellis and Sermpinis 2016). Hybrid models have therefore gradually drawn attention in forecasting research. For instance, Yu et al. (2008) proposed the “decomposition-integration” hybrid models, and their results showed that hybrid models always possess better forecasting ability. Bildirici and Ersin (2013) combined the multilayer perceptron model with the new Smooth Transition Autoregressive model and the generalized autoregressive conditional heteroskedasticity (GARCH) model, which introduced the fractional integration and asymmetric property (LSTAR-LST-GARCH-MLP) model. This proved that the hybrid framework can capture the volatility clustering, asymmetry, and nonlinearity characteristics of petrol prices. Rapach et al. (2010) indicated that this combination of models can improve the prediction performance by synthesizing the feature-capturing capability of individual models. Zhang et al. (2021a; b) developed an innovative ensemble deep-learning model with dynamic error correction and multi-objective ensemble pruning to address time-series forecasting tasks. The superior forecasting performance of the proposed model was verified using time-series data (i.e., PM2.5 concentration, wind speed, and electricity price).
Overall, the research on stock return forecasting is already quite extensive, and hybrid models have become widespread because they can combine the strengths of different models. What remains unsolved in the literature is return forecasting for the AIRO index. Most related research has focused on the correlation of the AI industry with other industries and the application of AI technology in the forecasting field. However, it would be beneficial to design a reliable forecasting method considering the complexity of the data-generating process of the AIRO index, which could help investors develop optimal stock investment portfolios and hedge investment risk. Therefore, building on previous research on the subject, this paper attempts to combine the econometric models and machine-learning methods to depict the complex structural characteristics of the AIRO index returns. It then constructs a hybrid forecasting approach to obtain optimal forecasting performance.
Methods
The EEMD method
To decompose the AIRO index returns series, this study selects the EEMD method (Wu and Huang 2009), which splits the complex original signal into components with different characteristics while maintaining the nonstationary and nonlinear features of the original time-series data. The main steps of the decomposition are as follows:

(1)
Add a white noise series \(o^{i}(t)\) with a given amplitude (i.e., 0.1) to the AIRO index returns series \(x(t)\) to obtain the new series \(x^{i}(t)\):
$$x^{i}(t) = x(t) + o^{i}(t)$$(1)
(2)
Decompose the time series \(x^{i}(t)\) into n IMFs \(c_{j}^{i}(t)\) (j = 1, 2, ..., n) and a residual term \(r^{i}(t)\) using the EMD method:
$$x^{i}(t) = \sum\limits_{j = 1}^{n} c_{j}^{i}(t) + r^{i}(t)$$(2)
where \(c_{j}^{i}(t)\) is the jth IMF in the ith trial.

(3)
Repeat steps (1) and (2) M times, with a different white noise series each time, and obtain the corresponding IMF components of each decomposition.

(4)
Average the corresponding IMFs over the M trials to obtain the final IMFs:
$$c_{j}(t) = \frac{1}{M}\sum\limits_{i = 1}^{M} c_{j}^{i}(t)$$(3)
Once the EEMD is complete, the original time series can be expressed as a linear combination of the IMFs and the residual term:
$$x(t) = \sum\limits_{j = 1}^{n} c_{j}(t) + r(t)$$(4)
where \(c_{j}(t)\) (t = 1, 2, …, T) is the jth IMF obtained by the EEMD method at time t, \(r(t)\) is the final residual term, and n is the number of IMFs.
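The noise-adding and ensemble-averaging steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_decompose` is a hypothetical moving-average stand-in for the EMD sifting step (real EMD extracts several IMFs from local extrema), the noise amplitude is assumed to be expressed relative to the standard deviation of the series, and every trial is assumed to return the same number of components.

```python
import random
import statistics


def toy_decompose(series, window=5):
    """Hypothetical stand-in for EMD (NOT real sifting).

    Splits the series into one fast component and a smooth residual so that
    fast + residual reproduces the input exactly.
    """
    half = window // 2
    smooth = []
    for t in range(len(series)):
        lo, hi = max(0, t - half), min(len(series), t + half + 1)
        smooth.append(sum(series[lo:hi]) / (hi - lo))
    fast = [x - s for x, s in zip(series, smooth)]
    return [fast], smooth  # (list of "IMFs", residual)


def ensemble_decompose(series, decompose, trials=100, noise_amplitude=0.1, seed=0):
    """Average the components of `trials` noise-perturbed decompositions."""
    rng = random.Random(seed)
    # Assumption: amplitude is relative to the standard deviation of the series.
    scale = noise_amplitude * statistics.pstdev(series)
    imf_sums = None
    resid_sum = [0.0] * len(series)
    for _ in range(trials):
        noisy = [x + rng.gauss(0.0, scale) for x in series]  # step (1)
        imfs, resid = decompose(noisy)                        # step (2)
        if imf_sums is None:
            imf_sums = [[0.0] * len(series) for _ in imfs]
        for j, imf in enumerate(imfs):
            for t, value in enumerate(imf):
                imf_sums[j][t] += value
        for t, value in enumerate(resid):
            resid_sum[t] += value
    mean_imfs = [[v / trials for v in total] for total in imf_sums]  # step (4)
    mean_resid = [v / trials for v in resid_sum]
    return mean_imfs, mean_resid
```

Because the added noise has zero mean, the averaged components sum back to the original series up to a small error that shrinks as the number of trials grows. In practice the placeholder would be replaced by a proper EMD routine.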
The PSO–LSSVM method
The LSSVM method
To describe the nonlinear characteristics of the AIRO index returns better, we select the LSSVM model, which is a typical machine-learning method (Suykens and Vandewalle 1999) and is particularly suitable for modeling small-size samples and nonlinear problems. The model is specified as follows.
Given a set of samples \(\{ {\mathbf{x}}_{t}, y_{t} \}_{t = 1}^{T}\), where \({\mathbf{x}}_{t}\) is the input vector and \(y_{t}\) is the output variable, the decision function can be defined as:
$$y(x) = w^{T} \Gamma (x) + c_{bias}$$(5)
where w is the weight vector, \(\Gamma (x)\) represents the nonlinear function used to map the input space to a high-dimensional feature space, and \(c_{bias}\) is the bias term.
The objective function of the LSSVM model is:
$$\min \; \frac{1}{2}w^{T} w + \frac{1}{2}c_{reg} \sum\limits_{t = 1}^{T} \sigma_{t}^{2} \quad {\text{s.t.}} \quad y_{t} = w^{T} \Gamma (x_{t}) + c_{bias} + \sigma_{t}, \quad t = 1, \ldots ,T$$(6)
where \(c_{reg}\) is the regularization constant, and \(\sigma_{t}\) denotes the training error.
Next, the final outcome of the LSSVM method based on the Kuhn-Tucker conditions can be described as:
$$y(x) = \sum\limits_{t = 1}^{T} \lambda_{t} K(x,x_{t}) + c_{bias}$$(7)
where \(\lambda_{t}\) are the Lagrange multipliers and \(K(x,x_{t})\) denotes the kernel function. We apply the radial basis function (RBF) with a width of \(\omega\) (Keerthi and Lin 2003), which can be expressed as:
$$K(x,x_{t}) = \exp \left( - \frac{\left\| x - x_{t} \right\|^{2}}{2\omega^{2}} \right)$$(8)
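A minimal sketch of LSSVM regression may help make the Kuhn-Tucker solution concrete. It assumes the standard Suykens formulation, in which the multipliers and the bias solve one linear system, and an RBF kernel of the form exp(-||a - b||^2 / (2 width^2)). The function names, the plain Gaussian-elimination solver, and the variable `lam` for the Lagrange multipliers are illustrative choices, not taken from the paper.

```python
import math


def rbf_kernel(a, b, width):
    """RBF kernel exp(-||a - b||^2 / (2 * width^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def lssvm_fit(xs, ys, width, c_reg):
    """Return (bias, lam) from [[0, 1^T], [1, K + I / c_reg]] [bias; lam] = [0; y]."""
    n = len(xs)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(xs[i], xs[j], width) for j in range(n)]
        row[i + 1] += 1.0 / c_reg  # ridge term from the squared-error penalty
        system.append(row)
    solution = solve_linear(system, [0.0] + list(ys))
    return solution[0], solution[1:]


def lssvm_predict(x, xs, lam, bias, width):
    """Kernel expansion: sum_t lam_t * K(x, x_t) + bias."""
    return sum(l * rbf_kernel(x, xt, width) for l, xt in zip(lam, xs)) + bias
```

A large `c_reg` makes the fit nearly interpolate the training targets, while a small value smooths it. This sensitivity to the kernel width and regularization constant is why the two parameters are tuned rather than fixed.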
The PSO method
The PSO method is a computational technique that uses a set of particles, each representing a potential solution to a problem in a d-dimensional search space (Eberhart and Kennedy 1995). Let \(U_{i} = (u_{i1} ,u_{i2} , \ldots ,u_{id} )\) be the current position of particle i, \(V_{i} = (v_{i1} ,v_{i2} , \ldots ,v_{id} )\) be its current velocity, \(P_{i} = (p_{i1} ,p_{i2} , \ldots ,p_{id} )\) be its previous best position, and \(P_{g} = (p_{g1} ,p_{g2} , \ldots ,p_{gd} )\) be the best position among all particles. The velocity and position of particle i are then updated as:
$$v_{i}^{k + 1} = wv_{i}^{k} + c_{1} r_{1} (p_{i}^{k} - u_{i}^{k} ) + c_{2} r_{2} (p_{g}^{k} - u_{i}^{k} )$$(9)
$$u_{i}^{k + 1} = u_{i}^{k} + v_{i}^{k + 1}$$(10)
where \(v_{i}^{k}\) and \(u_{i}^{k}\) are the current velocity and position of particle i at iteration k, respectively; \(w\) is the inertia weight; \(c_{1}\) and \(c_{2}\) are acceleration coefficients; and \(r_{1}\) and \(r_{2}\) are two independent random variables uniformly distributed on [0, 1].
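The velocity and position updates can be sketched as a small, generic minimizer. This is an illustration only: the default inertia weight and acceleration coefficients, the clipping of positions to the search bounds, and the name `pso_minimize` are assumptions rather than settings reported in the paper.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over the box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # each particle's best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # best position in the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to own best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Position update, clipped to the search box (an added assumption).
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            value = objective(pos[i])
            if value < best_val[i]:
                best_val[i], best_pos[i] = value, pos[i][:]
                if value < g_val:
                    g_val, g_pos = value, pos[i][:]
    return g_pos, g_val
```

When tuning a model, `objective` would map a candidate parameter pair, such as a kernel width and a regularization constant, to a validation error, and `bounds` would give the search range for each parameter.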
The PSO-LSSVM method
Because the parameters \(\omega\) and \(c_{reg}\) have a significant impact on forecasting accuracy, we employ the PSO method to obtain the optimal parameters (Eberhart and Kennedy 1995); the main steps of the PSO-LSSVM approach can be described as follows:
Step 1 Take the parameters (\(\omega\), \(c_{reg}\)) as swarms, and initialize a population of particles with random positions and velocities.
Step 2 Evaluate the fitness of each particle based on the following fitness function: \({\text{Fitness}} = [\frac{1}{N}\sum\limits_{i = 1}^{20} {(\hat{y}_{i} - y_{i} )^{2} } ]^{1/2}\), where \(y_{i}\) and \(\hat{y}_{i}\) represent the actual and forecast AIRO index returns, respectively.
Step 3 Update the previous and global best fitness values according to the fitness evaluation results.
Step 4 Update the velocity and position values of each particle based on Eq. (9) and Eq. (10) until the stop conditions are satisfied (i.e., the number of iterations reaches a maximum of 100, or the optimal parameters satisfy the accuracy requirement, i.e., the value of fitness is less than 0.001).
The GARCH model
To capture the time-varying character of the movements of the AIRO index returns, we employ the GARCH model proposed by Bollerslev (1986), which is the most commonly used econometric model for analyzing the volatility of returns in financial markets.^{Footnote 1} The model is defined as follows:
$$r_{t} = \mu + u_{t} , \quad u_{t} = \sqrt {h_{t} } \varepsilon_{t} , \quad h_{t} = \alpha_{0} + \alpha u_{t - 1}^{2} + \beta h_{t - 1}$$(11)
where \(r_{t}\) is the return at time t, \(\mu\) is the mean, \(u_{t}\) represents the residual series, and \(h_{t}\) is the conditional variance. For \(t = 1, \ldots ,n\), \(\varepsilon_{t}\) ~ N(0, 1), and the model should satisfy \(\alpha_{0} > 0\), \(\alpha \ge 0\), \(\beta \ge 0\), and \(\alpha + \beta < 1\).
To depict the structural changes in the AIRO index returns, this paper combines the structural breakpoints with GARCH (1,1); the variance equation is shown in Eq. (12).
$$h_{t} = \alpha_{0} + d_{1} D_{1} + \cdots + d_{n} D_{n} + \alpha u_{t - 1}^{2} + \beta h_{t - 1}$$(12)
where \(D_{1} , \ldots ,D_{n}\) are dummy variables that are determined according to the structural breakpoints identified by the modified ICSS algorithms (Ewing and Malik 2017), and \(d_{1} , \ldots ,d_{n}\) are their coefficients.
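The variance recursion with break dummies can be written out in a few lines of Python. This is a sketch under stated assumptions: each dummy is taken to equal one from its breakpoint onward and zero before it, the recursion starts from the unconditional variance, and the parameters are supplied directly rather than estimated by maximum likelihood as they would be in practice. The function and argument names are hypothetical.

```python
def garch_variance(residuals, alpha0, alpha, beta,
                   break_points=(), dummy_coefs=()):
    """Conditional variances of a GARCH(1,1) model with optional break dummies.

    h_t = alpha0 + sum_i d_i * D_i(t) + alpha * u_{t-1}^2 + beta * h_{t-1},
    where D_i(t) = 1 if t >= break_points[i] else 0 (an assumption).
    """
    if not (alpha0 > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("parameters violate the GARCH(1,1) constraints")
    if len(break_points) != len(dummy_coefs):
        raise ValueError("one coefficient is needed per breakpoint")
    # Assumption: initialize at the unconditional variance of the base model.
    h = [alpha0 / (1.0 - alpha - beta)]
    for t in range(1, len(residuals)):
        shift = sum(d for bp, d in zip(break_points, dummy_coefs) if t >= bp)
        h.append(alpha0 + shift
                 + alpha * residuals[t - 1] ** 2
                 + beta * h[t - 1])
    return h
```

With no breakpoints the function reduces to the plain GARCH(1,1) recursion. Each dummy coefficient shifts the variance level up or down from its breakpoint onward.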
The hybrid method for forecasting AIRO index returns
The hybrid method is capable of modeling both nonlinearity and time variations, which indicates that it may possess better forecasting ability in terms of the AIRO index returns. In this circumstance, we attempt to construct a hybrid model based on the decomposition and integration, and model combination methods. The procedures can be described as follows:

(1)
The EEMD method is used to decompose the original AIRO index return series to obtain the IMFs and the residual term.

(2)
We normalize the decomposed IMF components and the residual term, and appropriately select training and testing samples. Then, the single models above (i.e., the GARCH-type and PSO-LSSVM models) are each used to forecast the IMF components and the residual term.

(3)
The forecasting results of each IMF component and the residual term are superimposed to obtain the final forecasting results of the decomposition and integration models (i.e., the EEMD-GARCH-type and EEMD-PSO-LSSVM models).

(4)
The following two methods are used to obtain the hybrid predictions:

(a)
The GARCH-type models are built to predict the high-frequency IMFs with time-varying characteristics, whereas the PSO-LSSVM model predicts the low-frequency IMFs and the residual term with nonlinear characteristics. Next, the final forecasting results of the new hybrid models, i.e., the EEMD-PSO-LSSVM-GARCH(A) and the EEMD-PSO-LSSVM-ICSS-GARCH(A) models, are obtained by superimposing the above forecasts.

(b)
We combine the forecasting results of the EEMD-GARCH-type and the EEMD-PSO-LSSVM models in Step (3) using the mean combination approach, and the new hybrid models, i.e., the EEMD-PSO-LSSVM-GARCH(B) and EEMD-PSO-LSSVM-ICSS-GARCH(B),^{Footnote 2} are used to obtain the final forecasting results.
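The difference between the (A) and (B) hybrids is simply how the component forecasts are combined, which the following sketch makes explicit. The dictionary layout (component name mapped to a list of forecasts), the function names, and the component labels in the usage note are illustrative assumptions.

```python
def integrate(component_forecasts):
    """Sum the forecasts of all components, time point by time point."""
    return [sum(values) for values in zip(*component_forecasts.values())]


def hybrid_a(garch_forecasts, lssvm_forecasts, high_frequency):
    """Variant (A): GARCH-type forecasts for the high-frequency components,
    PSO-LSSVM forecasts for all other components, then superimpose."""
    chosen = {
        name: (garch_forecasts[name] if name in high_frequency
               else lssvm_forecasts[name])
        for name in garch_forecasts
    }
    return integrate(chosen)


def hybrid_b(garch_forecasts, lssvm_forecasts):
    """Variant (B): mean of the two decomposition-integration forecasts."""
    total_garch = integrate(garch_forecasts)
    total_lssvm = integrate(lssvm_forecasts)
    return [(g + s) / 2.0 for g, s in zip(total_garch, total_lssvm)]
```

For example, with components named `"IMF1"`, `"IMF6"`, and `"RES"` and `high_frequency={"IMF1"}`, variant (A) takes the first component from the GARCH-type forecasts and the other two from the PSO-LSSVM forecasts, whereas variant (B) averages the two fully integrated forecasts.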
The evaluation criteria for forecasting performance
In accordance with Hansen and Lunde (2005), we apply two widely used statistical loss functions, i.e., the Mean Square Error (MSE) and the Mean Absolute Error (MAE), to evaluate the out-of-sample forecasting performance for the AIRO index returns, which are defined as Eqs. (13)–(14):
$${\text{MSE}} = \frac{1}{T - N}\sum\limits_{t = N + 1}^{T} {(h_{t} - \hat{h}_{t} )^{2} }$$(13)
$${\text{MAE}} = \frac{1}{T - N}\sum\limits_{t = N + 1}^{T} {\left| {h_{t} - \hat{h}_{t} } \right|}$$(14)
where \(h_{t}\) represents the actual return, whereas \(\hat{h}_{t}\) represents the forecasted return; T and N represent the number of full and in-sample observations, respectively, and T − N is the number of out-of-sample observations.
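As a small worked illustration, both loss functions can be computed over the out-of-sample window only. The function name and the convention that the first `n_in_sample` entries of each list belong to the training period are assumptions made for this sketch.

```python
def out_of_sample_losses(actual, forecast, n_in_sample):
    """Return (MSE, MAE) computed over observations after the in-sample period."""
    errors = [a - f for a, f in zip(actual[n_in_sample:], forecast[n_in_sample:])]
    if not errors:
        raise ValueError("no out-of-sample observations")
    n_out = len(errors)  # corresponds to T - N
    mse = sum(e * e for e in errors) / n_out
    mae = sum(abs(e) for e in errors) / n_out
    return mse, mae
```

Because the MSE squares each error, it penalizes occasional large misses more heavily than the MAE does, so the two criteria can rank models differently.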
Meanwhile, we use the Model Confidence Set (MCS) method proposed by Hansen et al. (2011) to judge whether the models used have a statistically significant difference in forecasting performance. In particular, the range statistic is chosen in this study, i.e., \(T_{R} = \mathop {\max }\nolimits_{a,b \in M} \frac{{\left| {\overline{g}_{ab,t} } \right|}}{{\sqrt {{\text{var}} (g_{ab,t} )} }}\), where \(\overline{g}_{ab,t}\) denotes the relative performance variable of models \(a\) and \(b\). The range statistic and its corresponding p-value are obtained using a bootstrap procedure. Following Hansen et al. (2011) and Wang et al. (2016), we consider a confidence level of 90%, which means that a model with an MCS p-value larger than 0.1 will be included in the MCS.
Data descriptions
Following Huynh et al. (2020), this study chooses the daily AIRO index price data from the NASDAQ market as the research focus, with the data obtained from Bloomberg.^{Footnote 3} The AIRO index reflects the innovation level of the market and the performance of the AI industry in the era of the Fourth Industrial Revolution (Tiwari et al. 2021). The full sample ranges from 12/19/2017 to 07/26/2021, and the specific sample periods for the training and testing samples are 12/19/2017 to 10/13/2020 and 10/14/2020 to 07/26/2021, respectively. The AIRO index returns are calculated as \(r_{t} = 100 \times [\log (p_{t} ) - \log (p_{t - 1} )]\), where \(p_{t}\) indicates the AIRO index price at time t. The daily AIRO index log returns are shown in Fig. 1.
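The return definition translates directly into code. The helper below is a minimal sketch that assumes the prices arrive as a list of positive numbers in chronological order, with the natural logarithm used for log; the function name is hypothetical.

```python
import math


def log_returns(prices):
    """Percentage log returns: 100 * (log(p_t) - log(p_{t-1}))."""
    return [100.0 * (math.log(curr) - math.log(prev))
            for prev, curr in zip(prices, prices[1:])]
```

A price path of 100, 110, 99 gives returns of roughly 9.53 and -10.54. Log returns add up over time, so their sum equals 100 times the log of the ratio between the final and initial prices.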
Table 1 presents the descriptive statistics of the AIRO index returns. It can be observed that the AIRO index returns series has negative skewness and positive excess kurtosis, suggesting the presence of a leptokurtic and fat-tailed distribution. Moreover, the Jarque–Bera test results indicate that the null hypothesis of a normal distribution is rejected at the 1% significance level. The Ljung–Box Q-statistics for the squared returns also reject the null hypothesis of no autocorrelation up to the 10th order at the 1% significance level, which indicates the existence of autocorrelation in the volatility of the AIRO index returns. Table 1 also presents the results of the unit root tests. Specifically, the results of the Augmented Dickey-Fuller (ADF; Dickey and Fuller 1981) test and the Phillips–Perron (PP; Phillips and Perron 1988) test reject the null hypothesis of a unit root at the 1% significance level, indicating that the AIRO index returns are stationary over the sample period.
Results and discussions
The EEMD decomposition results
Based on the methods described above, we obtain the EEMD decomposition result for the AIRO index returns shown in Fig. 2. First, the original AIRO index returns series is decomposed into eight independent intrinsic mode functions and one residual term, referred to as subseries in the following sections, using the EEMD method. As depicted in Fig. 2, the IMFs obtained by the EEMD algorithm are irregular, which is caused by the nonlinear and noise components of the AIRO index returns. In addition, the eight IMF components and the residual term are arranged from high to low frequency, which shows the diversity of the AIRO index returns in terms of frequency and multiscale characteristics. Furthermore, it shows that the AI industry may be affected by various factors, including the strong uncertainty regarding the industrial chain and the development of AI technology. Specifically, the average period of IMF1–IMF5 is relatively short; these components form the high-frequency part of the original AIRO index returns series and reflect the impact of short-term irregular events on the AI industry. The GARCH-type models are used to forecast these subseries.
The average period of IMF6–IMF8 is relatively long, indicating the impact of major events in the field of artificial intelligence; the PSO-LSSVM model is applied to forecast these subseries. Moreover, the residual term has declined slowly since September 2019, which reflects that, under the impact of economic fundamentals (industrial structure adjustment, macro policy, and so on), the AIRO index returns have declined since September 2019. Investors can capture the AI industry development via this long-term trend and grasp the investment risks and returns, which enables them to look for new AI industry investment opportunities.
Structural breaks test in AI and robotics market
This paper uses the modified version of Inclan and Tiao’s (1994) iterated cumulative sum of squares (ICSS) algorithm to identify six structural breakpoints in the AIRO index returns series, and divides the sample period into seven intervals accordingly. The results are presented in Table 2. It is worth noting that the structural break in February 2020 was closely related to the COVID-19 pandemic and to “OpenAI Five” beating humans. Accenture invested in China and focused on an artificial intelligence layout for the first time in August 2018, causing a breakpoint in the AIRO index returns. The breakpoint in February 2018 coincided with a major breakthrough in AI-based disease diagnosis, indicating that AI can revolutionize the diagnosis and management of diseases through the analysis and classification of large amounts of data. These results confirm that structural breaks in the AIRO index returns tend to occur due to emergencies or external shocks.
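To give a sense of how such breakpoints are located, the sketch below computes the original Inclan and Tiao (1994) centered cumulative-sum-of-squares statistic for a single candidate break. It is not the modified algorithm used here: modified versions typically adjust the statistic for fat tails and conditional heteroskedasticity, and the full ICSS procedure reapplies the test iteratively on sub-samples. The input is assumed to be a demeaned series, and the function name is hypothetical.

```python
import math


def it_statistic(series):
    """Return (k_star, stat) for the Inclan-Tiao test on a demeaned series.

    D_k = C_k / C_T - k / T, with C_k the cumulative sum of squares up to k.
    k_star is the 1-based index maximizing |D_k|; stat = sqrt(T / 2) * max|D_k|.
    """
    n = len(series)
    total = sum(a * a for a in series)
    cumulative, best_k, best_abs = 0.0, 0, 0.0
    for k, a in enumerate(series, start=1):
        cumulative += a * a
        d_k = cumulative / total - k / n
        if abs(d_k) > best_abs:
            best_k, best_abs = k, abs(d_k)
    return best_k, math.sqrt(n / 2.0) * best_abs
```

Under the original test, a statistic above the asymptotic 5% critical value of 1.358 indicates a variance break at `k_star`. For a series whose amplitude triples after observation 100 of 200, the function returns a break index of 100 and a statistic of about 4.0.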
Forecasting results of AIRO index returns
We examine the forecasting performance of all competing models in order to find the optimal forecasting model for AIRO index returns based on the randomness, periodicity, and trend of this series. First, we consider single forecasting models without the data decomposition method, i.e., the traditional econometric models (GARCH-type models) and the machine-learning framework (PSO-LSSVM method). Second, we employ the GARCH-type and the PSO-LSSVM models to forecast all the subseries, respectively, and obtain the forecasts from the decomposition-integration models (i.e., the EEMD-GARCH-type and EEMD-PSO-LSSVM models). Third, we use two methods to combine the EEMD-GARCH-type and EEMD-PSO-LSSVM models and derive forecasts from the final new hybrid models (i.e., the EEMD-PSO-LSSVM-ICSS-GARCH (A) and (B) models). Finally, we calculate the loss function values and the corresponding MCS tests of the 1-day-ahead forecasting results for the daily log-returns of the AIRO index to evaluate the predictive abilities of the different models. The forecasting results are presented in Table 3. From this table, we identify the following findings.

(1)
The MSE and MAE values indicate that the forecasting performances of the GARCH and PSO-LSSVM models are not significantly different. The hybrid models that combine the PSO-LSSVM and GARCH models, i.e., the EEMD-PSO-LSSVM-GARCH (A) and (B) models, perform better than the two single models. The results show that the PSO-LSSVM model can better capture the nonlinear characteristics of the AIRO index returns, and has the advantages of nonlinear mapping, self-learning, and self-organization. In contrast, the GARCH model has the advantage of capturing the time-varying and volatility-clustering characteristics of the AIRO index returns. The results also suggest that the hybrid models can account for both the linear and nonlinear characteristics of the AIRO index returns, and combine the advantages of the PSO-LSSVM and GARCH models. This yields superior forecasting performance compared to either single model.

(2)
The models that consider structural changes can achieve even better predictive performance. As shown in Table 3, the ICSS-GARCH and EEMD-ICSS-GARCH models have lower forecasting losses than the models without structural changes. The improved predictive performance is particularly evident for the new hybrid models that consider structural breakpoints. Huynh et al. (2020) point out that, because the firms involved are participants in a not-yet-mature AI market, AI stocks may react significantly to changes in other asset markets. Therefore, incorporating structural breakpoints can better capture the response of the AI market to emergencies, leading to more accurate predictions.

(3)
Compared to the single models, the decomposition-integration models usually forecast the AIRO index returns better. Specifically, as shown in Table 3, the MSE and MAE values consistently indicate that the forecasting results of the EEMD-GARCH-type and EEMD-PSO-LSSVM models are significantly better than those of the corresponding models that do not apply the EEMD algorithm. Moreover, at a confidence level of 90%, the decomposition-integration models are included in the MCS under more criteria than the single models. This result shows that the single model is strongly affected by the characteristics of the AI industry and of the data itself, such that its predictive ability is weaker than that of the decomposition-integration models. Thus, the EEMD method can account for the periodicity, randomness, and trend characteristics of the AIRO index returns. This method effectively decomposes the original sequence into simple modes to obtain stable IMF components and a residual term, thereby improving the forecasting accuracy. Moreover, the results can help investors mine the forecasting information of the AI industry index more comprehensively and measure the investment risks more reasonably.
(4)
The final new hybrid models that incorporate the structural characteristics mentioned above consistently deliver superior forecasts. As can be seen from Table 3, the values of the two loss functions for the two final hybrid models are significantly lower than those of the other models. Additionally, the MCS test results show that, compared with the other models, the final hybrid models are included in the MCS under more criteria. This indicates that the hybrid models can account for the linear and nonlinear, complexity, and mutability characteristics of the AIRO index returns, thereby obtaining superior forecasting performance. In particular, the MSE and MAE values are significantly reduced with the hybrid models, and the final mean combination model (i.e., the EEMD-PSO-LSSVM-ICSS-GARCH (B) model) usually performs the best of all the models considered. The results show that the AI industry is affected by multiple factors. Thus, the AIRO index returns present multiple characteristics, and the final hybrid model, which combines the EEMD method and the modified ICSS algorithm, can help investors effectively capture the complexity of the industry index. Hence, they can adjust their investment strategy to adapt to the changing financial market, and obtain steady income streams under different investment risks.
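The evaluation steps described in the findings above can be sketched as follows. The block shows the two loss functions (MSE and MAE), the integration step that sums per-component forecasts (IMFs plus residual) into one return forecast, and an equal-weight average of two forecasts, which is how we read the "mean combination" behind the (B) model. Combination method (A) is not described in this section and is not sketched. All function names and numbers are hypothetical and are not taken from Table 3.

```python
def mse(actual, forecast):
    """Mean square error between realized and forecast returns."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mae(actual, forecast):
    """Mean absolute error between realized and forecast returns."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def integrate_components(component_forecasts):
    """Sum the forecasts of each decomposed component, date by date."""
    return [sum(values) for values in zip(*component_forecasts)]


def mean_combination(forecast_a, forecast_b):
    """Equal-weight average of two competing forecasts (assumed form of model B)."""
    return [(a + b) / 2.0 for a, b in zip(forecast_a, forecast_b)]


# Toy numbers (hypothetical): realized returns and two model forecasts.
actual = [0.5, -0.2, 0.1]
garch_like = [0.3, -0.1, 0.3]
lssvm_like = [0.7, -0.5, 0.1]
combined = mean_combination(garch_like, lssvm_like)

losses = {
    "garch_like": (mse(actual, garch_like), mae(actual, garch_like)),
    "lssvm_like": (mse(actual, lssvm_like), mae(actual, lssvm_like)),
    "combined": (mse(actual, combined), mae(actual, combined)),
}
```

In this toy example the averaged forecast has a lower MSE and MAE than either input because the two forecast errors partly offset each other, which is the usual motivation for forecast combination.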
Economic significance
During the past decade, global investors have paid wide attention to the stocks of AI and robotics companies to reap the potential investment benefits. In order to judge whether the new models can help AI market investors gain higher investment benefits, this paper further uses the mean–variance investment strategy to investigate the economic value of the AIRO index return forecasting models from an asset allocation perspective (Ferreira and Santa-Clara 2011; Xing and Zhang 2022). The main forms are as follows:
Assuming that a mean–variance investor optimally allocates between the AIRO index and risk-free bills based on the various return forecasts, the utility \(U_{t}\) of portfolio strategy P can be defined as follows:
where \(E_{t}(\bullet)\) and \(Var_{t}(\bullet)\) represent the conditional mean and variance of the portfolio return \(R_{t}^{P}\) at time t, respectively. \(r_{t}^{e}\) and \(\sigma_{t}^{2}\) are the AIRO index excess return and volatility on day t, respectively. \(r_{t}^{f}\) is the risk-free rate, \(\gamma\) is the investor’s coefficient of relative risk aversion, and \(\omega_{t}\) is the portfolio weight. By maximizing the objective function, we can obtain the optimal portfolio weight \(\omega_{t}^{*}\), given by
where \(\hat{r}_{t+1}^{e}\) and \(\hat{\sigma}_{t+1}^{2}\) are the out-of-sample forecasts of the excess return and volatility, respectively. Specifically, we apply the forecasting models above to obtain \(\hat{r}_{t+1}^{e}\), and use the prevailing historical average to forecast \(\hat{\sigma}_{t+1}^{2}\). We restrict the value of \(\omega_{t}^{*}\) to lie between 0 and 1.5 because of the short-sale constraint. Then, we compute the portfolio return \(R_{t+1}^{P}\) as:
The mean–variance investor who allocates assets using Eq. (17) can realize the certainty equivalent return (CER), defined in Eq. (18):
where \(\hat{\mu}_{p}\) and \(\hat{\sigma}_{p}^{2}\) are the sample mean and variance, respectively, of the portfolio return over the out-of-sample evaluation period.
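For reference, the four relations described in this subsection can be written out in the standard mean–variance form. This is a sketch consistent with the symbols defined above and with the usual setup in the asset-allocation literature cited here; the exact notation of the original displays is an assumption. The last two relations correspond to what the text calls Eqs. (17) and (18).

\[
U_{t}\left(R_{t}^{P}\right) = E_{t}\left(R_{t}^{P}\right) - \frac{\gamma}{2}\,Var_{t}\left(R_{t}^{P}\right), \qquad R_{t}^{P} = \omega_{t} r_{t}^{e} + r_{t}^{f}
\]

\[
\omega_{t}^{*} = \frac{1}{\gamma}\,\frac{\hat{r}_{t+1}^{e}}{\hat{\sigma}_{t+1}^{2}}
\]

\[
R_{t+1}^{P} = \omega_{t}^{*} r_{t+1}^{e} + r_{t+1}^{f}
\]

\[
CER = \hat{\mu}_{p} - \frac{\gamma}{2}\,\hat{\sigma}_{p}^{2}
\]

The weight follows from the first-order condition of the utility with respect to \(\omega_{t}\): because the risk-free rate is known in advance, the portfolio variance is \(\omega_{t}^{2}\sigma_{t}^{2}\), so setting \(r_{t}^{e} - \gamma\,\omega_{t}\,\sigma_{t}^{2} = 0\) and replacing the unknown quantities with their forecasts gives \(\omega_{t}^{*}\).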
Using the method above, this paper constructs different portfolio strategies under different optimal weights, which are determined according to the return forecasts above. Further, we calculate the average portfolio return (R) and the certainty equivalent return (CER). A higher CER value usually means a greater economic value of the corresponding portfolio strategy, i.e., the corresponding model is more economically significant in practical applications. The test results are shown in Table 4, from which we draw the following findings.
First, the R and CER values of the portfolio strategy are relatively higher for forecasting models that consider structural changes, and the decomposition-integration models also have better economic value. As shown in Table 4, the values for the models that incorporate the EEMD method (the EEMD-GARCH and EEMD-PSO-LSSVM models) are mostly better than those for the single models (the GARCH and PSO-LSSVM models). Similarly, the economic value of the models that incorporate structural breakpoints is largely better than that of the others, which suggests that investors can consider this factor when developing portfolio strategies to achieve better returns.
Second, the new hybrid model exhibits the best economic performance. As shown in Table 4, the R and CER values of the final hybrid models (i.e., EEMD-PSO-LSSVM-ICSS-GARCH (A) and (B)) are always the highest, which suggests that these models can capture the complex characteristics of the AIRO index simultaneously, and thus yield the best economic value.
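The economic-value calculation described above can be sketched in a few lines. The block assumes the standard mean–variance weight (forecast excess return divided by risk aversion times forecast variance), clipped to the interval from 0 to 1.5 stated in the text. The risk-aversion value of 3, the use of the sample variance with an n - 1 divisor, and all function names are assumptions for illustration, since this section does not report them.

```python
def optimal_weight(forecast_excess_return, forecast_variance,
                   gamma=3.0, lower=0.0, upper=1.5):
    """Mean-variance weight on the index, clipped to [lower, upper].

    gamma = 3.0 is an assumed risk-aversion coefficient, not a value from the paper.
    """
    raw = forecast_excess_return / (gamma * forecast_variance)
    return min(max(raw, lower), upper)


def portfolio_returns(weights, excess_returns, risk_free):
    """Realized portfolio return: weight times excess return plus risk-free rate."""
    return [w * re + rf for w, re, rf in zip(weights, excess_returns, risk_free)]


def certainty_equivalent_return(returns, gamma=3.0):
    """CER = sample mean minus (gamma / 2) times sample variance."""
    n_obs = len(returns)
    mean = sum(returns) / n_obs
    variance = sum((r - mean) ** 2 for r in returns) / (n_obs - 1)
    return mean - 0.5 * gamma * variance


# Toy example (hypothetical numbers): three forecasts, a constant variance forecast.
forecasts = [0.02, 0.10, -0.01]
weights = [optimal_weight(f, 0.01) for f in forecasts]
realized = portfolio_returns(weights, [0.015, 0.02, -0.03], [0.0001] * 3)
cer = certainty_equivalent_return(realized)
```

The second and third forecasts hit the upper and lower bounds, which illustrates how the weight restriction caps leverage and rules out short positions.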
Robustness checks
Different data frequencies
The diversity of investor behavior normally results in data exhibiting various characteristics at different frequencies. Therefore, to examine whether and how the central empirical results change over different data frequencies, and to further judge whether the hybrid model is suitable for investors with different trading horizons, this paper replaces the daily data with weekly data, while the full sample range remains unchanged. Specifically, we use weekly data to re-estimate the models, with the training and testing samples covering 12/19/2017–08/23/2020 and 08/24/2020–07/26/2021, respectively, given the full sample of 12/19/2017–07/26/2021.
As Table 5 shows, the MSE and MAE values of the EEMD-GARCH and EEMD-PSO-LSSVM models are lower than those of the GARCH and PSO-LSSVM models without the EEMD method. This shows that the EEMD method can effectively decompose the noisy AIRO index return series, providing cleaner inputs for the subsequent prediction step. Hence, the decomposition-integration forecasting models outperform the single models at a weekly frequency as well.
Furthermore, the final hybrid models also yield superior forecasting performance compared to the other models. Specifically, the MSE and MAE values of the final hybrid models (A and B models) are significantly lower. This shows that investors with both long and short trading horizons can use the new hybrid model to capture the complex industry characteristics and forecast the AIRO index returns more accurately. These findings have important implications for investors and policymakers in terms of portfolio diversification, risk management, asset allocation, and price regulation. Overall, our results are robust across high- and low-frequency data.
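As a small illustration of the frequency change, weekly log-returns can be obtained by summing the daily log-returns that fall in the same calendar week, since log-returns are additive over time. The sketch below groups observations by ISO week; the function name and sample values are hypothetical, and the paper may instead have built the weekly series directly from weekly index levels.

```python
from collections import OrderedDict
from datetime import date


def weekly_log_returns(dates, daily_log_returns):
    """Aggregate daily log-returns into weekly ones, keyed by (ISO year, ISO week)."""
    weekly = OrderedDict()
    for day, ret in zip(dates, daily_log_returns):
        iso = day.isocalendar()
        key = (iso[0], iso[1])
        weekly[key] = weekly.get(key, 0.0) + ret
    return list(weekly.items())


# Hypothetical daily values starting on the first day of the sample period.
days = [date(2017, 12, 19), date(2017, 12, 20), date(2017, 12, 21),
        date(2017, 12, 22), date(2017, 12, 26)]
daily = [0.01, -0.02, 0.005, 0.0, 0.03]
weekly = weekly_log_returns(days, daily)
```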
Different sample periods
Some uncertainties may affect the central results presented so far. To determine whether different sample periods affect our findings, we select a new sample period of 07/26/2018–07/26/2021 to re-estimate the models, and the corresponding in-sample and out-of-sample periods are 07/26/2018–11/25/2020 and 11/26/2020–07/26/2021, respectively. The results of the 1-day-ahead forecasting under this new setup are presented in Table 6. A comparison of the two loss functions reveals that the forecasting results of the decomposition-integration models are superior to those of the single models. In addition, compared with the other models, the two new final hybrid models continue to achieve better forecasting performance, and the mean combination model (B model) performs the best among all the models. In summary, the central results are also robust to different sample periods.
Different artificial intelligence index
To further examine the superiority and robustness of the final hybrid model, this paper chooses another Artificial Intelligence and Robotics index as a research object to depict the changes in that specific industry. Specifically, this paper selects the NYSE FactSet Global Robotics and Artificial Intelligence Index (NYFSRAI), which tracks equity performance in robotics and artificial intelligence. The robotics segment mainly includes companies engaged in integrated robotics applications, development, and manufacturing, as well as devices for high-speed, high-precision, and automated operation. The AI segment mainly includes companies involved in AI development, programming, and software and hardware implementation. The loss function values of each model are listed in Table 7, and the main results are discussed below.
On the one hand, the decomposition-integration models still have superior forecasting ability compared to the single models for the NYFSRAI index. As Table 7 shows, the loss function values of the EEMD-PSO-LSSVM and EEMD-GARCH models are smaller than those of the PSO-LSSVM and GARCH models. This shows that the EEMD algorithm can effectively decompose the noisy NYFSRAI index return series and obtain stable IMFs and a residual term, providing more suitable data for the subsequent forecasting process. On the other hand, the final hybrid models still have the best forecasting performance for the NYFSRAI index. Specifically, the MSE and MAE values in Table 7 show that the losses of the EEMD-PSO-LSSVM-ICSS-GARCH (A) and (B) models decrease significantly.
Different benchmarking models
To further examine the superiority of the final hybrid model in this paper, we compare its forecasting accuracy with that of some recognized benchmarking models, such as the neural network models used by Fang et al. (2020). The models are trained on daily data over the same interval as above to obtain objective comparison results. Specifically, we set the full sample range from 12/19/2017 to 07/26/2021, and the training and testing samples cover 12/19/2017 to 10/13/2020 and 10/14/2020 to 07/26/2021, respectively. The 1-day-ahead forecasting results of the different benchmarking models are listed in Table 8. A comparison with Table 3 shows that the final hybrid models outperform the benchmarking models. For example, the MSE values of the EEMD-PSO-LSSVM-ICSS-GARCH (A) and (B) models are 0.6330 and 0.6280, respectively, whereas those of the DNN, LSTM, PSO-BP, and GA-ELM models are 1.7882, 1.6924, 1.7660, and 1.6252, respectively. This further supports the robustness of the central conclusion.
Transaction cost
Transaction costs may make a difference to the performance of the portfolio (Guidolin and Pedio 2021). Therefore, to judge whether the results of the economic value test are robust, this study assumes a transaction cost of 30 basis points when trading assets. The new results are presented in Table 9. We find that the CER and R values based on the final hybrid models remain relatively higher than those of the other models. This shows that the economic significance results are robust to transaction costs.
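One simple way to apply such a cost is to deduct it in proportion to the change in the weight held in the index from one period to the next. The sketch below follows that convention with the 30 basis points stated above. The charging rule itself, the starting weight of zero, and the function name are assumptions, because this section does not specify how the cost enters the portfolio return.

```python
def net_portfolio_returns(weights, gross_returns, cost_bps=30.0, initial_weight=0.0):
    """Deduct a proportional cost on the change in the risky weight each period.

    Assumed convention: cost = rate * |w_t - w_{t-1}|, ignoring weight drift.
    """
    cost_rate = cost_bps / 10000.0  # 30 basis points = 0.003
    net, previous = [], initial_weight
    for weight, gross in zip(weights, gross_returns):
        turnover = abs(weight - previous)
        net.append(gross - cost_rate * turnover)
        previous = weight
    return net


# Hypothetical weights and gross portfolio returns over three periods.
net = net_portfolio_returns([1.0, 1.0, 0.5], [0.01, 0.02, -0.01])
```

Under this rule a strategy that trades less often keeps more of its gross return, so the ranking of models by CER can in principle change once costs are included.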
Conclusions and future work
To accelerate the development of the AI industry globally, relevant industries must examine this rapidly changing AI market and make innovative investments. Therefore, from the investor perspective, it is crucial to understand the Artificial Intelligence index. In order to mine the intrinsic structural characteristics of the AIRO index returns deeply and comprehensively, and to judge which type of model can better predict the AIRO index returns, this paper is the first to attempt to combine machine-learning techniques with traditional econometric models based on the “decomposition-integration” and “model combination” methods for forecasting the AIRO index returns. Specifically, the EEMD method and the modified ICSS algorithm are used to analyze the data characteristics, and the basic single models in this paper are the PSO-LSSVM and GARCH models. The main conclusions are as follows.
First, the EEMD decomposition and integration method significantly improves the forecasting performance of the single models for the AIRO index returns. This is mainly because the EEMD method yields more stable and simpler modes, giving full consideration to the periodicity, randomness, and trend characteristics of the AIRO index returns. Consequently, more accurate forecasting results are obtained, driven by the features of the data. In addition, the result holds for both the PSO-LSSVM and GARCH models. This further supports the applicability of the decomposition and integration method to the AIRO index returns.
Second, regardless of whether we use daily or weekly data and different sample periods, the forecasting performance of the GARCH and PSO-LSSVM models is not significantly different. Additionally, the hybrid model that combines these frameworks (i.e., the EEMD-PSO-LSSVM-GARCH model) can markedly improve on the forecasting performance of the single models. This result shows that the traditional econometric model is suitable for describing the time-varying characteristics of the AIRO index returns; the machine-learning model can better capture the nonlinear characteristics; and the hybrid model can effectively combine their advantages.
Third, the AIRO index returns exhibit complex structural characteristics. Specifically, they not only present time-varying and nonlinear characteristics, but also possess high complexity and mutability. In a context where most AI market participants are not mature, the structural change caused by an external shock plays a critical role in predicting the AIRO index returns. Additionally, the final hybrid model, which further considers structural change (i.e., the EEMD-PSO-LSSVM-ICSS-GARCH model), can comprehensively capture the complex characteristics of the AIRO index returns, and yields the best forecasting performance and economic value.
These conclusions have clear theoretical and practical implications. On the one hand, we enrich the research framework in the field of Artificial Intelligence. Previous research has focused on the correlation between the AI industry and other industries or on the application of AI technology in the forecasting field. We focus instead on the AIRO Index itself and conduct an in-depth analysis and forecasting exercise. On the other hand, based on the essential characteristics and patterns exhibited by the AIRO index returns, we propose the optimal forecasting model (i.e., the EEMD-PSO-LSSVM-ICSS-GARCH model). The final forecasting model can overcome the limitations of a single model, which further expands the relevant forecasting theory.
The conclusions in this paper also have several practical implications for both policymakers and investors interested in portfolio diversification. First, financial market participants can use the EEMD-PSO-LSSVM-ICSS-GARCH model to capture and mine more of the data characteristics of the AIRO index returns. This can help them make more accurate forecasts, which can provide an important reference for targeting investment opportunities, preventing risks, and reaping benefits in the AI industry. Second, forecasts of the AIRO index returns can help policymakers understand future changes in the AI stock market in a timely way. This can support the formulation of effective policies to maintain financial market and social stability, including financial market risk management and option pricing.
In the future, there is still much interesting work to be explored regarding the AI industry. In particular, we could further explore the factors influencing the AI index and analyze the characteristics of the AI industry in greater detail. This would enable the construction of an accurate explanatory-variable-based forecasting framework, which in turn could help investors grasp investment opportunities in the AI industry.
Availability of data and materials
The data can be obtained upon request.
Notes
The GARCH-type models can account for the conditional heteroscedasticity of financial time series and capture other characteristics, such as time variation and volatility clustering (Mohammadi and Su 2010; Salisu and Fasanya 2013), which the AIRO index return series usually exhibits.
Further information on the AIRO index is available at https://indexes.nasdaqomx.com/Index/Overview/NQROBO.
Abbreviations
AIRO: NASDAQ CTA artificial intelligence and robotics
EEMD: Ensemble empirical mode decomposition
ICSS: Iterative cumulative sum of squares
PSO-LSSVM: Least-square support vector machine approach with the particle swarm optimization
GARCH: Generalized autoregressive conditional heteroskedasticity
AI: Artificial intelligence
WHO: World Health Organization
NQROBO: Nasdaq CTA Artificial Intelligence and Robotics
ROBO: Global Robotics and Automation
IBOTZ: Indxx Global X Robotics & Artificial Intelligence
IMFs: Intrinsic mode functions
SN: Social network technique
RBF: Radial basis function
MSE: Mean square error
MAE: Mean absolute error
MCS: Model confidence set
ADF: Augmented Dickey–Fuller
PP: Phillips–Perron
AIC: Akaike information criterion
CER: Certainty equivalent return
NYFSRAI: NYSE FactSet Global Robotics and Artificial Intelligence
References
Acemoglu D, Restrepo P (2018) The race between man and machine: implications of technology for growth, factor shares, and employment. Am Econ Rev 108:1488–1542
Bildirici M, Ersin ÖÖ (2013) Forecasting oil prices: smooth transition and neural network augmented GARCH family models. J Pet Sci Eng 109:230–240
Bollerslev T (1986) Generalized autoregressive conditional heteroskedasticity. J Econom 31:307–327
Bughin J, Hazan E, Ramaswamy S, Chui M, Allas T, Dahlström P, Henke N, Trench M (2017) Artificial intelligence: the next digital frontier? MGI Report, McKinsey Global Institute
Chang TM, Hsu MF, Lin SJ (2018) Integrated news mining technique and AI-based mechanism for corporate performance forecasting. Inf Sci 424:273–286
Claeskens G, Magnus JR, Vasnev AL, Wang W (2016) The forecast combination puzzle: a simple theoretical explanation. Int J Forecast 32:754–762
Dickey DA, Fuller WA (1981) Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica 49:1057–1072
Demiralay S, Gencer HG, Bayraci S (2021) How do artificial intelligence and robotics stocks co-move with traditional and alternative assets in the age of the 4th industrial revolution? Implications and insights for the COVID-19 period. Technol Forecast Soc Chang 171:120989
Eberhart RC, Kennedy JA (1995) A new optimizer using particle swarm theory. In: Proceedings of the sixth international symposium on micro machine and human science. Nagoya Japan, pp 39–43
Enholm IM, Papagiannidis E, Mikalef P, Krogstie J (2022) Artificial intelligence and business value: a literature review. Inf Syst Front 24(5):1709–1734
Ewing BT, Malik F (2017) Modelling asymmetric volatility in oil prices under structural breaks. Energy Econ 63:227–233
Fang Y, Guan B, Wu S et al (2020) Optimal forecast combination based on ensemble empirical mode decomposition for agricultural commodity futures prices. J Forecast 39(6):877–886
Felten E, Raj M, Seamans R (2018) A method to link advances in artificial intelligence to occupational abilities. Am Econ Assoc Pap Proc 108:54–57
Ferreira MA, Santa-Clara P (2011) Forecasting stock market returns: the sum of the parts is more than the whole. J Financ Econ 100:514–537
Furman J, Seamans R (2019) AI and the economy. Innov Policy Econ 19(1):161–191
Ghosh P, Neufeld A, Sahoo JK (2022) Forecasting directional movements of stock prices for intraday trading using LSTM and random forests. Financ Res Lett 46:102280
Giovannelli A, Massacci D, Soccorsi S (2021) Forecasting stock returns with large dimensional factor models. J Empir Financ 63:252–269
Graetz G, Michaels G (2018) Robots at work. Rev Econ Stat 100:753–768
Gruetzemacher R, Dorner FE, Bernaola-Alvarez N et al (2021) Forecasting AI progress: a research agenda. Technol Forecast Soc Chang 170:120909
Guidolin M, Pedio M (2021) Forecasting commodity futures returns with stepwise regressions: Do commodity-specific factors help? Ann Oper Res 299(1):1317–1356
Hansen PR, Lunde A (2005) A forecast comparison of volatility models: Does anything beat a GARCH(1,1)? J Appl Econom 20(7):873–889
Hansen PR, Lunde A, Nason JM (2011) The model confidence set. Econometrica 79(2):453–497
Hung JC (2011) Adaptive Fuzzy-GARCH model applied to forecasting the volatility of stock markets using particle swarm optimization. Inf Sci 181(20):4673–4683
Huynh TLD, Hille E, Nasir MA (2020) Diversification in the age of the 4th industrial revolution: the role of artificial intelligence, green bonds and cryptocurrencies. Technol Forecast Soc Chang 159:120188
Inclan C, Tiao GC (1994) Use of cumulative sums of squares for retrospective detection of changes in variance. J Am Stat Assoc 89:913–923
Keerthi SS, Lin CJ (2003) Asymptotic behaviors of support vector machines with Gaussian kernel. Neural Comput 15:1667–1689
Khashei M, Bijarai M (2011) A new hybrid methodology for nonlinear time series forecasting model. Mod Simul Eng 15:1–5
Kuhn HW, Tucker AW (1950) Nonlinear programming. Proc Second Berkeley Symp Math Stat Probab 2:481–492
Le TNL, Abakah EJA, Tiwari AK (2021) Time and frequency domain connectedness and spillover among fintech, green bonds and cryptocurrencies in the age of the fourth industrial revolution. Technol Forecast Soc Chang 162:120382
Li YZ, Jiang SR, Li XR, Wang SY (2021) The role of news sentiment in oil futures returns and volatility forecasting: Data-decomposition based deep learning approach. Energy Econ 95:105140
Lin KP, Pai PF, Yang SL (2011) Forecasting concentrations of air pollution by logarithm support vector regression with immune algorithms. Appl Math Comput 217(12):5318–5327
Mascio DA, Fabozzi FJ, Zumwalt JK (2021) Market timing using combined forecasts and machine learning. J Forecast 40:1–16
Mikalef P, Gupta M (2021) Artificial intelligence capability: conceptualization, measurement calibration, and empirical study on its impact on organizational creativity and firm performance. Inf Manag 58(3):103434
Mohammadi H, Su L (2010) International evidence on crude oil price dynamics: applications of ARIMA-GARCH models. Energy Econ 32:1001–1008
Phillips PCB, Perron P (1988) Testing for a unit root in time series regression. Biometrika 75:335–346
Psaradellis I, Sermpinis G (2016) Modelling and trading the U.S. implied volatility indices. Evidence from the VIX, VXN and VXD indices. Int J Forecast 32(4):1268–1283
Rapach DE, Zhou GF (2013) Forecasting stock returns. Handb Econ Forecast 2(1):328–383
Rapach DE, Zhou GF (2021) Asset pricing: timeseries predictability. The Oxford Research Encyclopedia of Economics and Finance
Rapach DE, Strauss JK, Zhou GF (2010) Out-of-sample equity premium prediction: combination forecasts and links to the real economy. Rev Financ Stud 23:821–862
Salisu AA, Fasanya IO (2013) Modelling oil price volatility with structural breaks. Energy Policy 52:554–562
Salisu AA, Vo XV (2020) Predicting stock returns in the presence of COVID-19 pandemic: the role of health news. Int Rev Financ Anal 71:101546
Sebastião H, Godinho P (2021) Forecasting and trading cryptocurrencies with machine learning under changing market conditions. Financ Innov 7(1):3
Suykens JAK, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300
Tiwari AK, Dar AB, Bhanja N, Gupta R (2016) A historical analysis of the US stock price index using empirical mode decomposition over 1791–2015. Economics 10:1–15
Tiwari AK, Abakah EJA, Le TNL, Leyva-de la Hiz DI (2021) Markov-switching dependence between artificial intelligence and carbon price: the role of policy uncertainty in the era of the 4th industrial revolution and the effect of COVID-19 pandemic. Technol Forecast Soc Chang 163:120434
Wang SY, Yu L, Lai KK (2005) Crude oil price forecasting with TEI@I methodology. J Syst Sci Complex 18(2):145–166
Wang Y, Ma F, Wei Y, Wu C (2016) Forecasting realized volatility in a changing world: a dynamic model averaging approach. J Bank Financ 64:136–149
Webster C, Ivanov SH (2020) Robotics, artificial intelligence, and the evolving nature of work. In: George B, Paul J (eds) Digital Transformation in Business and Society: Theory and Cases. Palgrave, MacMillan
Wu Z, Huang NE (2009) Ensemble empirical mode decomposition: a noise-assisted data analysis method. Adv Adapt Data Anal 1(1):1–41
Xiao F, Ke J (2021) Pricing, management and decision-making of financial markets with artificial intelligence: introduction to the issue. Financ Innov 7:85
Xiao YJ, Wang XK, Wang JQ et al (2021) An adaptive decomposition and ensemble model for short-term air pollutant concentration forecast using ICEEMDAN-ICA. Technol Forecast Soc Chang 166:120655
Xing LM, Zhang YJ (2022) Forecasting crude oil prices with shrinkage methods: Can nonconvex penalty and Huber loss help? Energy Econ 110:106014
Yu L, Wang SY, Lai KK (2008) Forecasting crude oil price with an EMD-based neural network ensemble learning paradigm. Energy Econ 30(5):2623–2635
Zhang YJ, Wang JL (2019) Do high frequency stock market data help forecast crude oil prices? Evidence from the MIDAS models. Energy Econ 78:192–201
Zhang YJ, Zhang JL (2018) Volatility forecasting of crude oil market: a new hybrid method. J Forecast 37:781–789
Zhang JL, Zhang YJ, Zhang L (2015) A novel hybrid method for crude oil price forecasting. Energy Econ 49:649–659
Zhang YJ, Chu G, Sheng DH (2020) The role of investor attention in predicting stock prices: the long short-term memory networks perspective. Financ Res Lett 38(2):101484
Zhang D, Mishra S, Brynjolfsson E, et al (2021a) The AI index 2021 annual report
Zhang S, Chen Y, Zhang WY, Feng RJ (2021b) A novel ensemble deep learning model with dynamic error correction and multiobjective ensemble pruning for time series forecasting. Inf Sci 544:427–445
Acknowledgements
We gratefully acknowledge the financial support from National Natural Science Foundation of China (Nos. 71774051, 72243003) and National Social Science Fund of China (No. 22AZD128), and we also thank the seminar participants in Center for Resource and Environmental Management, Hunan University, China.
Contributions
YJZ: Conceptualization, methodology, formal analysis, writing—original draft, review and editing, funding acquisition. HZ: Investigation, data curation, methodology, software, writing—original draft, formal analysis. RG: Conceptualization, investigation, data curation, writing—review and editing. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhang, YJ., Zhang, H. & Gupta, R. A new hybrid method with datacharacteristicdriven analysis for artificial intelligence and robotics index return forecasting. Financ Innov 9, 75 (2023). https://doi.org/10.1186/s40854023004835
DOI: https://doi.org/10.1186/s40854023004835
Keywords
 Artificial Intelligence and Robotics index return forecasting
 PSO-LSSVM model
 GARCH model
 Decomposition and integration model
 Combination model
JEL Classification
 Q43
 G15
 E37