Predicting the daily return direction of the stock market using hybrid machine learning algorithms

Big data analytic techniques associated with machine learning algorithms are playing an increasingly important role in various application fields, including stock market investment. However, few studies have focused on forecasting daily stock market returns, especially when using powerful machine learning techniques, such as deep neural networks (DNNs), to perform the analyses. DNNs employ various deep learning algorithms based on the combination of network structure, activation function, and model parameters, with their performance depending on the format of the data representation. This paper presents a comprehensive big data analytics process to predict the daily return direction of the SPDR S&P 500 ETF (ticker symbol: SPY) based on 60 financial and economic features. DNNs and traditional artificial neural networks (ANNs) are then deployed over the entire preprocessed but untransformed dataset, along with two datasets transformed via principal component analysis (PCA), to predict the daily direction of future stock market index returns. While controlling for overfitting, a pattern for the classification accuracy of the DNNs is detected and demonstrated as the number of hidden layers increases gradually from 12 to 1000. Moreover, a set of hypothesis testing procedures is implemented on the classification, and the simulation results show that the DNNs using two PCA-represented datasets give significantly higher classification accuracy than those using the entire untransformed dataset, as well as several other hybrid machine learning algorithms. In addition, the trading strategies guided by the DNN classification process based on PCA-represented data perform slightly better than the others tested, including in a comparison against two standard benchmarks.


Introduction
Big data analytic techniques developed with machine learning algorithms are gaining more attention in various application fields, including stock market investment. This is mainly because machine learning algorithms do not require any assumptions about the data and often achieve higher accuracy than econometric and statistical models; for example, artificial neural networks (ANNs), fuzzy systems, and genetic algorithms are driven by multivariate data with no required assumptions. Many of these methodologies have been applied to forecast and analyze financial variables, for instance, see Vellido, Lisboa, & [...] to forecast the ETF daily return direction. They show that PCA-based ANN classifiers lead to significantly higher accuracy than three different PCA-based logistic regression models, including those that have successfully used fuzzy c-means clustering. Chong, Han, & Park (2017) recently examine the advantages and drawbacks of using deep learning algorithms for stock analysis and prediction, but their study focuses on intraday stock return forecasting.
In this study, the daily return direction of the SPDR S&P 500 ETF is forecasted using a deliberately designed classification mining procedure based on hybrid machine learning algorithms. This process begins by preprocessing the raw data to deal with missing values, outliers, and mismatched samples. The ANNs and DNNs, each acting as classifiers, are then used with both the entire untransformed dataset and the PCA-represented datasets to forecast the direction of future daily market returns. The remainder of this paper discusses the details of the study and is organized as follows. The data description and preprocessing are introduced next, including the transformation of the entire data set via PCA. The architectures, network topology, and learning algorithms of the newly developed DNNs, along with the previously successful benchmark ANNs, both of which are used for return direction classification, are then discussed. The forecasting procedure for three different datasets with the DNN classifiers is then described, together with the classification results and the pattern of the classification accuracy relative to the number of hidden layers. A standard benchmark is also compared with the results of the PCA-based ANN classifiers. The simulation results from trading strategies based on the DNN classifiers over the three datasets are compared to each other, and the results of the ANN-based trading strategies as compared with two benchmarks are then discussed. Finally, concluding remarks and proposed future work are provided.

Data description
The dataset utilized in this study includes the daily direction (up or down) of the closing price of the SPDR S&P 500 ETF (ticker symbol: SPY) as the output, along with 60 financial and economic factors as input features. This daily data is collected from 2518 trading days between June 1, 2003 and May 31, 2013. The 60 potential features can be divided into 10 groups, including the SPY return for the current day and the three previous days, the relative difference in percentage of the SPY return, the exponential moving averages of the SPY return, Treasury bill (T-bill) rates, certificate of deposit rates, financial and economic indicators, term and default spreads, exchange rates between the USD and four other currencies, the return of seven major world indices (other than the S&P 500), the SPY trading volume, and the return of eight large capitalization companies within the S&P 500 (which is a market cap weighted index and driven by the larger capitalization companies within the index). These features, which are a mixture of those identified by various researchers (Cao & Tay, 2001; Thawornwong & Enke, 2004; Armano, Marchesi, & Murru, 2005; Enke & Thawornwong, 2005; Niaki & Hoseinzade, 2013; and Zhong & Enke, 2017a, 2017b), are included as long as their values are released without a gap of more than five continuous trading days during the study period. The details of these 60 financial and economic factors, including their descriptions, sources, and calculation formulas, are given in Table 10 of the Appendix.

Data normalization
Given that the data used in this study cover 60 factors over 2518 trading days, there invariably exist missing values, mismatching samples, and outliers. Yet, the data quality is an important factor that can make a difference in the prediction accuracy, and therefore, preprocessing the raw data is necessary. Using the 2518 trading days during the 10-year period, the collected samples from other days are initially deleted. If there are n values for any variable or column that are continuously missing, the average of the n existing values on both sides of the missing values is used to fill in the n missing values. A simple but classical statistical principle is employed to detect the possible outliers (Navidi, 2011). The possible outliers are then adjusted using a similar method to the one used by Cao & Tay (2001). Specifically, for each of the 60 factors or columns in the data, any value beyond the interval $(Q_1 - 1.5 \cdot IQR,\ Q_3 + 1.5 \cdot IQR)$ is regarded as a possible outlier, with the factor value replaced by the closer boundary of the interval. Here, $Q_1$ and $Q_3$ are the first and third quartiles, respectively, of all the values in that column, and $IQR = Q_3 - Q_1$ is the interquartile range of those values. The symmetry of all adjusted and cleaned columns can be checked using histograms or statistical tests. For example, Figure 1 includes the histograms of factor $SPY_t$ (i.e., the SPY current daily return), before and after data preprocessing (Zhong & Enke, 2017a). It can be observed that the outliers are removed, and the symmetry is achieved after adjustments.
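The outlier adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `clip_outliers_iqr` and the sample values are hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not say how the quartiles are computed.

```python
import statistics

def clip_outliers_iqr(values, k=1.5):
    """Replace values outside (Q1 - k*IQR, Q3 + k*IQR) with the closer bound."""
    # Assumption: quartiles follow the "inclusive" convention; the paper
    # does not specify which quartile definition was used.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Values inside the interval are kept unchanged.
    return [min(max(v, lower), upper) for v in values]

# Hypothetical column with one extreme value (not data from the study).
column = [1, 2, 3, 4, 5, 100]
cleaned = clip_outliers_iqr(column)
print(cleaned)
```

In the study this adjustment would be applied column by column to each of the 60 factors.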
In this study, the ANNs and DNNs for pattern recognition are used as the classifiers. At the start of the classification mining procedure, the cleaned data are sequentially partitioned into three parts: training data (the first 70% of the data), validation data (the last 15% of the first 85% of the data), and the testing data (the last 15% of the data).
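A minimal sketch of such a sequential (non-shuffled) partition is given below. The function name `sequential_split` is hypothetical, and the rounding of the split points is an assumption, so the exact subset sizes may differ by a sample from those used in the study.

```python
def sequential_split(samples, train_pct=70, val_pct=15):
    """Split time-ordered samples into train/validation/test without shuffling."""
    n = len(samples)
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]

# Day indices stand in for the cleaned daily samples.
days = list(range(2518))
train, val, test = sequential_split(days)
print(len(train), len(val), len(test))
```

Keeping the chronological order means the testing data always lie after the training and validation data, which avoids training on information from the future.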

Data transformation using PCA
As one of the earliest multivariate techniques, PCA aims to construct a low-dimensional representation of the data while maintaining the maximal variance and covariance structure of the data (Jolliffe, 1986). To achieve this goal, a linear mapping $W$ that maximizes $W^T \mathrm{var}(X) W$, where $\mathrm{var}(X)$ is the variance-covariance matrix of the data $X$, needs to be created. Given that $W$ is formed by the principal eigenvectors of $\mathrm{var}(X)$, PCA turns out to be the eigenproblem $\mathrm{var}(X) W = \lambda W$, where $\lambda$ represents the eigenvalues of $\mathrm{var}(X)$. It is also known that applying PCA to the raw data $X$ instead of the standardized data tends to emphasize variables that have higher variances more than variables that have very low variances, especially if the units in which the variables are measured are inconsistent. In this study, not all variables are measured in the same units. Thus, PCA is applied to the standardized version of the cleaned data $X$. The specific procedure is given below. First, the linear mapping $W^*$ is searched such that
$$\mathrm{corr}(X)\, W^* = \lambda^* W^*,$$
where $\mathrm{corr}(X)$ is the correlation matrix of the data $X$. Assume that the data $X$ has the format $X = (X_1\ X_2\ \cdots\ X_M)$; then $\mathrm{corr}(X) = \rho$ is an $M \times M$ matrix, where $M$ is the dimensionality of the data, and the $ij$th element of the correlation matrix is
$$\rho_{ij} = \mathrm{corr}(X_i, X_j) = \frac{\sigma_{ij}}{\sigma_i \sigma_j},$$
where $\sigma_{ij} = \mathrm{cov}(X_i, X_j)$, $\sigma_i = \sqrt{\mathrm{var}(X_i)}$, $\sigma_j = \sqrt{\mathrm{var}(X_j)}$; and $i, j = 1, 2, \ldots, M$. Let $\lambda^*_1 \geq \lambda^*_2 \geq \cdots \geq \lambda^*_M \geq 0$ be the $M$ eigenvalues of $\mathrm{corr}(X)$, and let the vectors $e_i^T = (e_{i1}\ e_{i2}\ \cdots\ e_{iM})$ denote the eigenvectors of $\mathrm{corr}(X)$ corresponding to the eigenvalues $\lambda^*_i$, $i = 1, 2, \ldots, M$. The elements of these eigenvectors can be proven to be the coefficients of the principal components.
Secondly, the principal components of the standardized data are presented as
$$Y_i = e_i^T Z, \quad i = 1, 2, \ldots, M,$$
where $Z = (Z_1\ Z_2\ \cdots\ Z_M)^T$ and $Z_j = (X_j - \mu_j)/\sigma_j$ is the standardized $j$th variable, with $\mu_j$ the mean of $X_j$. Equivalently, each principal component can be written as
$$Y_i = \sum_{j=1}^{M} e_{ij} Z_j.$$
Using the spectral decomposition theorem, $\rho = \sum_{i=1}^{M} \lambda^*_i e_i e_i^T$, and the fact that $e_i^T e_i = \sum_{j=1}^{M} e_{ij}^2 = 1$ and the different eigenvectors are perpendicular to each other such that $e_i^T e_j = 0$ for $i \neq j$, we can prove that
$$\mathrm{var}(Y_i) = e_i^T \rho\, e_i = \lambda^*_i$$
and
$$\mathrm{cov}(Y_i, Y_j) = e_i^T \rho\, e_j = 0, \quad i \neq j.$$
That is, the variance of the $i$th (largest) principal component is equal to the $i$th largest eigenvalue, and the principal components are mutually uncorrelated. In summary, the principal components can be written as the linear combinations of all the factors with the corresponding coefficients equaling the elements of the eigenvectors. Different numbers of principal components can explain different proportions of the variance-covariance structure of the data. The eigenvalues can be used to rank the eigenvectors based on how much of the data variation is captured by each principal component.
Theoretically, the information loss due to the dimensionality reduction of the data space from M to k is insignificant if the proportion of the variation explained by the first k principal components is large enough. In practice, the chosen principal components must be those that best explain the data while simplifying the data structure as much as possible.
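The correlation-matrix form of PCA described in this section can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function name `pca_correlation` and the random demonstration data are hypothetical, and the use of the sample standard deviation (`ddof=1`) for standardization is an assumption.

```python
import numpy as np

def pca_correlation(X, k):
    """PCA on standardized data via the eigen-decomposition of corr(X).

    X has shape (n_samples, M). Returns the scores of the first k principal
    components, all eigenvalues in descending order, and the proportion of
    variation explained by the first k components.
    """
    # Standardize each column (assumption: sample standard deviation).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    rho = np.corrcoef(X, rowvar=False)
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(rho)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Each score column is a linear combination of the standardized factors
    # whose coefficients are the elements of one eigenvector.
    scores = Z @ eigvecs[:, :k]
    explained = eigvals[:k].sum() / eigvals.sum()
    return scores, eigvals, explained

# Hypothetical data: 200 samples of 5 factors, two of them correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] += X[:, 0]
scores, eigvals, explained = pca_correlation(X, k=2)
print(scores.shape, round(float(explained), 3))
```

The sample variance of each score column equals the corresponding eigenvalue, and the eigenvalues of a correlation matrix sum to M, so `explained` is the share of total variation retained by the first k components.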

Neural networks for pattern recognition
Recognized as one of the most important machine learning technologies, ANNs can be viewed as a cascading model of cell types emulating the human brain by carefully defining and designing the network architecture, including the number of network layers, the types of connections among the network layers, the numbers of neurons in each layer, the learning algorithm, the learning rate, the weights among neurons, and the various neuron activation functions. All these parameters are typically determined empirically during the learning or training phase of the neural network modeling. Thus, it is usually not easy to interpret the symbolic meaning of the trained results. However, the neural networks have high tolerance for noisy data and perform very well in recognizing the different patterns of new data during the testing stage. Also, some efficient algorithms have recently been developed to extract the classification rules from the trained neural networks. The backpropagation algorithm is well accepted as the most popular neural network learning algorithm, which is often carried out using a multilayer feed-forward neural network.

Multilayer feed-forward neural networks
Among the various types of neural networks that have been developed, the multilayer feed-forward network is most commonly used for pattern recognition, including classification, in data mining. Such a feed-forward neural network is illustrated in Fig. 2.
In Fig. 2, $X_i$, $i = 1, 2, \ldots, I$, denotes the $i$th component (neuron) of the input vector (layer) including $I$ components (neurons); $H_j$, $j = 1, 2, \ldots, J$, denotes the $j$th neuron in the hidden layer with $J$ neurons; and $O_k$, $k = 1, 2, \ldots, K$, denotes the $k$th neuron in the output layer. Each pair of neurons in two adjacent layers is connected with an empirically adjusted weight. For example, $w_{ij}$ denotes the weight between the $i$th neuron in the input layer and the $j$th neuron in the hidden layer. Given enough hidden neurons, multilayer feed-forward neural networks of linear threshold functions can closely approximate any function. The number of hidden layers is arbitrary, depending on the complexity of the neural networks. A boundary of 10 is usually used to differentiate shallow neural networks from DNNs. That is, if the feed-forward neural networks involve more than 10 hidden layers, they are considered DNNs; otherwise, they are referred to as shallow neural networks. More details on DNNs are given in the next section.
Traditional feed-forward ANNs often utilize the backpropagation learning algorithm (Rumelhart, et al., 1986) based on an iterative process where the connection weights between the layers are adjusted repeatedly in a backwards direction, from the output layer, through the hidden layers, and then to the first hidden layer, such that the difference between the predicted class and the true class measured by the mean squared error (MSE) can be minimized during the procedure. Although other sophisticated learning algorithms have been developed over the years for specific applications, the traditional backpropagation learning is still often used to train newly developed DNNs.
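To make the backward weight adjustment concrete, the following NumPy sketch trains a feed-forward network with a single hidden layer by plain gradient descent on the MSE. It is a toy illustration, not the MATLAB networks used in this study: the function name `train_mlp`, the sigmoid activations, the layer size, the learning rate, and the random data are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp(X, y, hidden=4, lr=0.5, epochs=500, seed=0):
    """Train a one-hidden-layer network with backpropagation on the MSE."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    history = []
    for _ in range(epochs):
        # Forward pass: input layer -> hidden layer -> output layer.
        H = sigmoid(X @ W1 + b1)
        O = sigmoid(H @ W2 + b2)
        err = O - y
        history.append(float(np.mean(err ** 2)))
        # Backward pass: propagate the MSE gradient from the output layer
        # back to the hidden layer (the sigmoid derivative is s * (1 - s)).
        d_out = 2.0 * err / len(X) * O * (1.0 - O)
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        # Gradient descent step on every weight and bias.
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return (W1, b1, W2, b2), history

# Hypothetical two-class data: the label depends on the first two inputs.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)
params, history = train_mlp(X, y)
print(history[0], history[-1])
```

The recorded `history` holds the training MSE per epoch; a deeper network repeats the same backward step once per additional hidden layer.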

DNNs for classification
More recently, deep learning, also known as deep structured learning, hierarchical learning, or deep machine learning, has emerged as a promising branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using a deep graph with multiple processing layers composed of numerous linear and nonlinear transformations. This concept was introduced to the machine learning community by Dechter (1986), and later to those working with ANNs (Aizenberg et al., 2000). Researchers in this area attempt to develop better representations and models for learning these representations from large-scale unlabeled data, compared to shallow learning, where the number of hidden layers is usually not greater than 10.
Since the first functional DNNs using a learning algorithm called the group method of data handling were published by Ivakhnenko (1973) and his research group, a large number of DNN architectures, such as pattern recognition networks, convolutional neural networks, recurrent neural networks, and long short-term memory, have been explored. Because more hidden layers and neurons are involved in DNNs, the computational power of DNNs is expected to be higher than that of traditional ANNs. However, DNNs, like ANNs, suffer from overfitting, which results from the estimation of a large number of parameters used to define the connections among hidden layers and neurons involved in DNNs, thereby reducing the model's generalization ability.

Forecasting daily return direction of the SPDR S&P 500 ETF
This study focuses on predicting the daily return direction of the SPDR S&P 500 ETF (ticker symbol: SPY) for the next day. The direction forecast can be either up or down. A direction forecast (up or down) is used instead of a level forecast since this study's objective is not only to develop a forecasting model with high classification accuracy, but also to develop a model that can be used successfully in a practical trading environment. Previous studies (e.g., Thawornwong & Enke, 2004) have shown that when developing forecasting/trading systems, direction forecasts (up or down) perform better in a trading environment/simulation than level forecasts (predicting the exact value of the stock or index one period forward). While level forecasts can result in models with higher reported training/testing prediction accuracy (greater than 90% in some instances), often these models are over-fitted to the data to achieve these results. Consequently, such models are more likely to suffer in a trading environment/simulation. On the other hand, since a small miss is still a miss (e.g., predicting up but being slightly down), successful direction forecasts are more likely to have a prediction accuracy closer to 60%; yet, these models still perform better at these accuracy levels when simulating real-world trading since the results from these models are more likely to be on the right side of the trade. Therefore, the following modeling focuses on making an accurate and ideally profitable direction forecast.
For the model testing, three different datasets are employed, with or without the use of a PCA transformation. Trading simulations of return versus risk for the best models are discussed later.

Use of ANN and DNN classifiers
The architecture of the DNNs considered in this study is designed as a pattern recognition network with a large number of hidden layers (i.e., more than 10 hidden layers); the architecture of the ANNs is also designed as a pattern recognition network with the number of hidden layers set to 10. The pattern recognition network used is typical of the type of multilayer feed-forward neural networks that are specifically designed for classification problems (Chiang et al., 2016; Kim & Enke, 2016; Zhong & Enke, 2017a, b). The MATLAB R2017b software is used for the modeling and testing, and the MSE and confusion matrix are used for the analysis and comparison, specifically for the evaluation of the performance of the ANN and DNN classifiers. The confusion matrix consists of four correctness percentages for the training, validation, testing, and total dataset that are provided as inputs to the classifiers. The percent of correctness indicates the fraction of samples that are correctly classified. A value of 0 means no correct classification, whereas a value of 100 indicates maximum correct classifications. Specifically, the Neural Network Toolbox in MATLAB R2017b functions in the following way. The training data are input to train the model, while the validation data are input to control the classifiers' overfitting problem almost simultaneously. That is, as each classifier is trained using the training data, the MSE obtained from classifying the validation data with the trained model decreases and continues to do so for a certain amount of time; the MSE of the validation starts to increase when the model suffers from overfitting, resulting in the need for the training phase to be terminated. Thus, the model can be best trained in the sense that the validation phase achieves its lowest MSE with the trained model. After the model is trained and selected, all training data, validation data, and testing data (untouched) are provided as inputs and classified by the trained model separately. The percentage of correctly predicted or classified daily directions corresponding to each category can be obtained and recorded.
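The validation-based stopping rule described above can be sketched independently of any toolbox. The following Python function is a hedged illustration: the name `early_stopping_epoch`, the `patience` parameter and its value, and the MSE sequence are assumptions for the example and are not settings reported in this study.

```python
def early_stopping_epoch(val_mse, patience=6):
    """Return (best_epoch, stop_epoch) for a sequence of validation MSEs.

    Training stops once the validation MSE has failed to improve for
    `patience` consecutive epochs; the model from `best_epoch` is kept.
    """
    best_epoch, best_mse, fails = 0, float("inf"), 0
    for epoch, mse in enumerate(val_mse):
        if mse < best_mse:
            best_epoch, best_mse, fails = epoch, mse, 0
        else:
            fails += 1
            if fails >= patience:
                return best_epoch, epoch
    # The validation MSE never degraded for long enough to trigger a stop.
    return best_epoch, len(val_mse) - 1

# Hypothetical validation curve: improves, then rises as overfitting sets in.
curve = [0.30, 0.27, 0.25, 0.26, 0.27, 0.28]
print(early_stopping_epoch(curve, patience=3))
```

The weights kept are those from the epoch with the lowest validation MSE, which corresponds to the "best trained" model in the sense used above.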
Table 1 shows the classification results of the traditional benchmark ANN using 12 transformed datasets. It shows that the benchmark ANN classifier achieves the highest accuracy in the testing phase over the PCA-represented dataset with 31 principal components; the PCA-represented dataset with 60 principal components gives the second best results.
Three datasets are considered for the DNN analysis. The first dataset includes the entire preprocessed but untransformed data, including 60 factors. The second and third datasets are transformed datasets using PCA, with 60 and 31 principal components, respectively (i.e., data with PCA equal to 60 and 31 are used since the benchmark ANN classifier achieves the highest accuracy levels in the testing phase when using the PCA-represented datasets with 31 and 60 principal components). The three sets of classification results (i.e., untransformed data, PCA = 60 data, and PCA = 31 data using both the benchmark ANN and DNN classifiers) are listed in Tables 2, 3 and 4, respectively. Please note that in Tables 2, 3 and 4, the first row with the number of hidden layers equal to 10 represents the performance of the traditional benchmark feed-forward ANN.

Comparison of classification results
Once again, the first row in Tables 2, 3 and 4 provides the classification results using the benchmark ANN classifier (with 10 hidden layer neurons), while the remaining rows provide the results from the various DNN classifiers (with the number of hidden layers greater than 10). In each of the three tables, it can be observed that as the number of hidden layers increases from 12 to 28, the accuracy of the classification in the testing phase typically increases, reaching the highest values of 58.6 (in Table 2), 59.9 (in Table 3), and 59.9 (in Table 4) when the number of hidden layers equals 28, 16, and 22, respectively. However, after the number of hidden layers becomes larger than 30 or 35, the accuracy of the classification for the testing data stops climbing and drops or converges to values that are close to the results using the ANN classifiers (which includes 10 hidden layers), except for one case where the transformed data with PCs = 60 and the number of hidden layers = 500 is considered. Note that the overfitting issue appears to be under control, in part since all the ANN and DNN classifiers are strictly trained with the same criteria, such that for each classifier the four correctness percentages of the classification, corresponding to the training, validation, testing, and entire data sets, cannot be significantly different from each other; that is, the absolute value of the percentage difference must be within a defined threshold, for example, 5% (Zhong & Enke, 2017a, 2017b). It is also observed that after the data are transformed via PCA, the average classification accuracy in the testing phase increases significantly. Moreover, the DNN-based classification using the transformed data with PCs = 31 achieves the highest average accuracy. To verify the phenomena in a statistical manner, a set of paired t-tests at the significance level of 0.05 is conducted and the test results are given in Table 5.
Since the P-values of the paired t-tests are much less than 0.05, we reject the null hypotheses and conclude that when using the DNN classifiers, the transformed dataset with PCs = 31 produces the highest average classification accuracy, while the DNN classifiers show the poorest performance over the entire preprocessed and untransformed dataset at the significance level of 0.05. Note that the values inside the parentheses in Tables 2, 3 and 4 represent the MSEs for each classification. In general, the higher the correctness percentage, the smaller the corresponding MSEs.
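For readers unfamiliar with the paired t-test, the statistic can be computed from the per-configuration differences between two sets of accuracies, as in the standard-library sketch below. The function name `paired_t_statistic` and the numbers are purely illustrative and are not the values in Tables 2, 3, 4 or 5.

```python
import math
import statistics

def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation of differences
    # t = mean difference divided by its standard error.
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Hypothetical testing accuracies (%) for the same network depths
# on two data representations; not results from this study.
acc_a = [58.0, 59.0, 60.0]
acc_b = [57.0, 57.0, 57.0]
t, df = paired_t_statistic(acc_a, acc_b)
print(round(t, 4), df)
```

The P-value then follows from the Student t distribution with `df` degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_rel`) can be used to obtain it directly.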

Simulation
While a higher classification accuracy for a financial forecast should lead to better trading results, this is not always the case. Therefore, in this section, a trading simulation is conducted to see if the higher prediction accuracy from the DNN classifiers indicates higher profitability among the three datasets with different representation. This study is based on predicting the direction of the SPDR S&P 500 ETF (ticker symbol: SPY) daily returns. Consequently, we modify the trading strategy for classification models defined by Enke & Thawornwong (2005) as follows.
If $UP_{t+1} = 1$, fully invest in stocks or maintain, and receive the actual stock return for the day t + 1 (i.e., $SPY_{t+1}$); if $UP_{t+1} = 0$, fully invest in one-month T-bills or maintain, and receive the actual one-month T-bill return for the day t + 1 (i.e., $T1H_{t+1}$).
Here UP denotes the SPY daily return direction as predicted by the models described earlier. In addition, the actual one-month T-bill return for the day t + 1 is derived from $T1_{t+1}$, the one-month T-bill discount rate (or risk-free rate) percentage on the secondary market for business day t + 1. The original data for T1 are obtained from the St. Louis Federal Reserve Economic Research database (https://fred.stlouisfed.org/series/TB4WK) and are exactly the "4-week" T-bill discount rate percentage on the secondary market; the data are listed on the website as "Monthly" in terms of the "Frequency" feature of the data but are a 28-day measure.
In practice, at the beginning of each trading day, the investor decides to buy the SPY portfolio or the one-month T-bill according to the forecasted direction of the SPY daily return. It is assumed for this research that the money invested in either a stock portfolio or T-bills is illiquid and detained in each asset during the entire trading day. Dividends and transaction costs are also not considered. In addition, for this study, both leveraging and short selling when investing are forbidden. The trading simulation is done for all the classification models over each testing period, including 376 samples of the three data sets considered; the first day of the 377-day testing period is excluded owing to the lack of a direction prediction for that day. The resulting mean, standard deviation (or volatility), and Sharpe ratio of the daily returns on investment generated from each forecasting model over each set of testing data are then calculated, with or without the PCA involved. The Sharpe ratio is obtained by dividing the mean daily return by the standard deviation of the daily returns. Therefore, the higher the Sharpe ratio, as a result of a higher mean daily return and/or a lower standard deviation or volatility of daily returns, the better the trading strategy. The relevant results are presented in Tables 6, 7 and 8.
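The trading rule and the Sharpe ratio as defined above can be sketched as follows. The function names and the four-day series are hypothetical, the daily T-bill returns are taken as given inputs, and the use of the sample standard deviation is an assumption, since the text does not state which estimator is used.

```python
import statistics

def strategy_returns(up_pred, spy_ret, tbill_ret):
    """Daily strategy return: the SPY return when the forecast is up (1),
    otherwise the one-month T-bill return for that day."""
    return [s if up == 1 else t
            for up, s, t in zip(up_pred, spy_ret, tbill_ret)]

def sharpe_ratio(daily_returns):
    """Mean daily return divided by the standard deviation of daily returns
    (assumption: sample standard deviation)."""
    return statistics.mean(daily_returns) / statistics.stdev(daily_returns)

# Hypothetical forecasts and returns for four trading days (not study data).
up_pred = [1, 0, 1, 0]
spy_ret = [0.010, -0.020, 0.030, -0.010]
tbill_ret = [0.0001, 0.0001, 0.0001, 0.0001]

daily = strategy_returns(up_pred, spy_ret, tbill_ret)
print(daily, sharpe_ratio(daily))
```

Consistent with the stated assumptions, this sketch ignores dividends and transaction costs and allows neither leverage nor short selling.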
As shown in Table 6, the trading strategies based on the DNN classifiers for the entire untransformed data generate higher Sharpe ratios than the trading strategy based on the ANN classifier, except for three cases where the number of hidden layers is 40, 50, or 500. In Table 7, the trading strategies from the DNN classification over the PCA-represented data with PCs = 60 result in higher Sharpe ratios than the ANN-based trading strategy, except when the number of hidden layers equals 14, 40, 45, or 50. Table 8 shows that the Sharpe ratios that are generated by the trading strategies using the DNN classification over the PCA-represented data with PCs = 31 are mostly higher than the Sharpe ratios generated by the ANN-based trading strategy, except for those cases where the number of hidden layers is 12, 24, 26, 45, 50, or 1000. The Sharpe ratios and their corresponding hidden layer numbers that are relevant to these exceptions are highlighted in Tables 6, 7 and 8.
To compare the three sets of Sharpe ratios (17 values in each set) that are obtained from the trading strategies based on the DNN classifiers for the entire untransformed data and the PCA-represented data with PCs = 60 and PCs = 31, another group of paired t-tests is performed at the significance level of 0.05. The P-values of the tests are included in Table 9.
Since the P-values are all much larger than 0.05, we fail to reject the null hypotheses; that is, there is no statistically significant difference among the mean Sharpe ratios from the three different trading strategies at the significance level of 0.05. However, with more careful observation of these P-values (and using other significance levels, e.g., 0.40), it is reasonable to conclude that in general the trading strategies guided by the DNN classification based on the PCA-represented data perform slightly better than the ones based on the entire untransformed data, although these trading strategies perform similarly.

Conclusions and suggestions for future work
A comprehensive big data analytics procedure using hybrid machine learning algorithms has been developed to forecast the daily return direction of the SPDR S&P 500 ETF (ticker symbol: SPY). Ideally, researchers look to apply the simplest set of algorithms to the least amount of data, with both the most accurate forecasting results and the highest risk-adjusted profits being desired. We have also considered this standard for this research.
The analytic process starts with data cleaning and preprocessing and concludes with an analysis of the forecasting and simulation results. The comparison of the classification and simulation results is done with statistical hypothesis tests, showing that on average, the accuracy of the DNN-based classification is significantly higher over the PCA-represented data than over the entire untransformed data set. More specifically, the DNN-based classification for the PCA-represented data set with PCs = 31 achieves the highest accuracy. It is also observed that as the number of DNN hidden layers increases, a pattern regarding the classification accuracy (as compared to the ANN classifier) emerges, with the overfitting issue remaining under control. In addition, over three data sets with different representations, the trading strategies using the DNN classifiers perform better than the ones using the ANN classifiers in most cases. Although in general there is no significant difference among the trading strategies from the DNN classification process over the entire untransformed data set and two PCA-represented data sets, the trading strategies based on the PCA-represented data perform slightly better.
In previous studies (Zhong & Enke, 2017a, 2017b), the PCA-ANN classifiers are shown to give a higher prediction accuracy for the daily return direction of the SPY ETF for the next day than the FRPCA-ANN classifiers, KPCA-ANN classifiers, and logistic regression classifiers, with or without PCA/FRPCA/KPCA involved. Also, the trading strategies based on the PCA-ANN classifiers perform better than the other strategies based on the other classifiers. Moreover, when using PCA, all classification model-based trading strategies perform better than the benchmark one-month T-bill strategy; the trading strategies from the ANN classification mining procedure perform better than the benchmark buy-and-hold strategy. Thus, when combined with the new results as illustrated in Tables 2, 3, 4, 6, 7, and 8, it can be concluded that among the machine learning techniques considered in this study series, the PCA-DNN classifiers with the proper number of hidden layers can achieve the highest classification accuracy and result in the best trading strategy performance.
With additional hidden layers and more complicated learning algorithms, DNNs are recognized as an important and advanced technology in the fields of computational intelligence and artificial intelligence. However, DNNs are still regarded as a black box with less clear theoretical confirmations of the learning algorithms that are used in common deep architectures, such as the stochastic gradient descent methodology. These DNN learning algorithms actually increase the computation time as a large number of hidden layers and neurons are included. This area of research needs to receive more attention and effort in the future.

SPYt3
The return of the SPY in day t-3. finance.yahoo.com/ (p(t-3) - p(t-4))/p(t-4)

Relative difference in percentage of the SPY return

RDP5
The 5-day relative difference in percentage of the SPY.

Fig. 1 Histogram of SPY current return (left) and histogram of adjusted SPY current return (right)

Fig. 2 Topology of a multilayer feed-forward neural network used for classification

H
BAA and T6. DE4 = BAA - T6
DE5 Default spread between BAA and T3. DE5 = BAA - T3
DE6 Default spread between BAA and T1. DE6 = BAA - T1
DE7 Default spread between CD6 and T6. DE7 = CD6 - T6
Exchange rate between USD and four other currencies (in day t)
USD_Y Relative change in the exchange rate between US dollar and Japanese yen. http://www.investing.com/currencies/usd-jpy-historical-data
USD_GBP Relative change in the exchange rate between US dollar and British pound. http://www.investing.com/currencies/gbp-usd-historical-data (then, take the opposites to the changes)
USD_CAD Relative change in the exchange rate between US dollar and Canadian dollar. http://www.

Table 1 The ANN classification results using 12 transformed datasets

Table 2 Classification results with ANN/DNN classifiers using entire untransformed data

Table 3 Classification results with ANN/DNN classifiers using transformed data with PCs = 60

Table 4 Classification results with ANN/DNN classifiers using transformed data with PCs = 31

Table 5 Comparison of classification results from DNN classifiers for three data sets

Table 6 Simulation results with ANN/DNN classifiers using entire untransformed data

Table 7 Simulation results with ANN/DNN classifiers using transformed data with PCs = 60

Table 8 Simulation results with ANN/DNN classifiers using transformed data with PCs = 31

Table 9 Comparison of simulation results from DNN classifiers for three data sets

Table 10 The 60 financial and economic features of the raw data
