Forecasting directional movement of Forex data using LSTM with technical and macroeconomic indicators

Yıldırım, Deniz Can; Toroslu, Ismail Hakkı; Fiore, Ugo

doi:10.1186/s40854-020-00220-2

Research
Open access
Published: 04 January 2021


Deniz Can Yıldırım¹,
Ismail Hakkı Toroslu ORCID: orcid.org/0000-0002-4524-8232¹ &
Ugo Fiore²

Financial Innovation volume 7, Article number: 1 (2021)


Abstract

Forex (foreign exchange) is a special financial market that entails both high risks and high profit opportunities for traders. It is also a very simple market since traders can profit by just predicting the direction of the exchange rate between two currencies. However, incorrect predictions in Forex may cause much higher losses than in other typical financial markets. The direction prediction requirement makes the problem quite different from other typical time-series forecasting problems. In this work, we used a popular deep learning tool called “long short-term memory” (LSTM), which has been shown to be very effective in many time-series forecasting problems, to make direction predictions in Forex. We utilized two different data sets—namely, macroeconomic data and technical indicator data—since in the financial world, fundamental and technical analysis are two main techniques, and they use those two data sets, respectively. Our proposed hybrid model, which combines two separate LSTMs corresponding to these two data sets, was found to be quite successful in experiments using real data.

Introduction

The foreign exchange market, known as Forex or FX, is a financial market where currencies are bought and sold simultaneously. Forex is the world’s largest financial market, with a daily trading volume of more than $5 trillion. It is a decentralized market that operates 24 hours a day, except for weekends, which makes it quite different from other financial markets.

Forex differs from other markets in several respects, and these differences can give Forex traders more profitable trading opportunities. The advantages include no commissions, no middlemen, no fixed lot size, low transaction costs, high liquidity, almost instantaneous transactions, low margins/high leverage, 24-hour operations, no insider trading, limited regulation, and online trading opportunities. Two types of techniques, fundamental analysis and technical analysis, are used to predict future values for typical financial time series, and both can be used for Forex. The former uses macroeconomic factors while the latter uses historical data to forecast the future price or the direction of the price.

The main decision in Forex involves forecasting the directional movement between two currencies. Traders can profit from transactions with correct directional prediction and lose with incorrect prediction. Therefore, identifying directional movement is the problem addressed in this study.

We chose the Euro/US dollar (EUR/USD) pair for the analysis since it is the largest traded Forex currency pair in the world, accounting for more than 80% of the total Forex volume.

In recent years, deep learning tools, such as long short-term memory (LSTM), have become popular and have been found to be effective for many time-series forecasting problems. In general, such problems focus on determining the future values of time-series data with high accuracy. However, in direction prediction problems, accuracy cannot be defined as simply the difference between actual and predicted values. Therefore, a novel rule-based decision layer needs to be added after obtaining predictions from LSTMs.

In this work, we propose a hybrid model composed of a macroeconomic LSTM model and a technical LSTM model, named after the types of data they use. We first separately investigated the effects of these data on directional movement. After that, we combined the results to significantly improve prediction accuracy. The macroeconomic LSTM model utilizes several financial factors, including interest rates, Federal Reserve (FED) funds rate, inflation rates, Standard and Poor’s (S&P) 500, and Deutscher Aktien IndeX (DAX) market indexes. Each factor has important effects on the trend of the EUR/USD currency pair. This can be interpreted as a fundamental analysis of price data. The other model is the technical LSTM model, which takes advantage of technical analysis. Technical analysis is based on technical indicators that are mathematical functions used to predict future price action. The feature set in our model uses popular technical indicators such as moving average (MA), moving average convergence divergence (MACD), rate of change (ROC), momentum, relative strength index (RSI), Bollinger bands (BB), and the commodity channel index (CCI).

The contributions of this study are as follows:

A popular deep learning tool called LSTM, which is frequently used to forecast values in time-series data, is adopted to predict direction in Forex data.
Both macroeconomic and technical indicators are used as features to make predictions.
A novel hybrid model is proposed that combines two different models with smart decision rules to increase decision accuracy by eliminating transactions with weaker confidence.
The proposed model and baseline models are tested using recent real data to demonstrate that the proposed hybrid model outperforms the others.

The rest of this paper is organized as follows. The “Related work” section examines related studies of the financial time-series prediction problem. The “Forex preliminaries” through “Technical indicators” sections provide background information about Forex, LSTM, and the technical indicators. The “The data set” section then presents the data set used in the experiments. The “LSTM-based hybrid model using macroeconomic and technical indicators” section introduces the proposed algorithm for the directional movement prediction problem and explains the preprocessing and postprocessing phases in detail. The “Experiments” section presents the results of the experiments and the classification performance of the proposed model. The “Discussion” and “Conclusion” sections discuss the experimental results and provide insight into future research directions.

Related work

Various forecasting methods have been considered in the finance domain, including machine learning approaches (e.g., support vector machines and neural networks) and new methods such as deep learning. Unfortunately, there are not many survey papers on these methods. Cavalcante et al. (2016), Bahrammirzaee (2010), and Saad and Wunsch (1998) have provided overviews of the field. The most recent of these, by Cavalcante et al. (2016), categorized the approaches used in different financial markets. Although that study mainly introduced methods proposed for the stock market, it also discussed applications for foreign exchange markets.

There has been a great deal of work on predicting future values in stock markets using various machine learning methods. We discuss some of them below.

Selvamuthu et al. (2019) used neural networks based on Levenberg–Marquardt, scaled conjugate gradient, and Bayesian regularization for stock market prediction based on tick data and 15-min-interval data for an Indian company.

Patel et al. (2015b) developed a two-stage fusion structure to predict the future values of the stock market index for 1–10, 15, and 30 days using 10 technical indicators. In the first stage, support vector machine regression (SVR) was applied to these inputs, and the results were fed into an artificial neural network (ANN). SVR and random forest (RF) models were used in the second stage. They compared the fusion model with standalone ANN, SVR, and RF models. They reported that the fusion model significantly improved upon the standalone models.

Guresen et al. (2011) explored several ANN models for predicting stock market indexes. These models include multilayer perceptron (MLP), dynamic artificial neural network (DAN2), and hybrid neural networks with generalized autoregressive conditional heteroscedasticity (GARCH). Applying mean-square error (MSE) and mean absolute deviation (MAD), their results showed that MLP performed slightly better than DAN2 and GARCH-MLP while GARCH-DAN2 had the worst results.

Weng et al. (2018) developed a financial expert system using ensemble methods (i.e., neural network regression ensemble (NNRE), support vector regression ensemble (SVRE), boosted regression tree (BRT), and random forest regression (RFR)) to predict stock prices 1 day ahead. Market prices, technical indicators, financial news, Google Trends, and the number of unique visitors to Wikipedia pages were used as inputs. They also investigated the effect of PCA on performance. They reported that ensembles with PCA performed better than those without PCA. They also noted that BRT and RFR were the best while SVRE was the worst in terms of mean absolute percentage error.

Huang et al. (2005) examined forecasting weekly stock market movement direction using SVM. They compared SVM with linear discriminant analysis, quadratic discriminant analysis, and Elman back-propagation neural networks. They also proposed a model that combined SVM with other classifiers. They used not only the NIKKEI 225 index but also macroeconomic variables as features for the model. Their direction calculation was based on the first-order difference natural logarithmic transformation, and the directions were either increasing or decreasing. SVM outperformed the other models with an accuracy of 73% while the combined model was the best, with an accuracy of 75%.

Kara et al. (2011) compared the performance of ANN and SVM for predicting the direction of stock price index movement. Ten technical indicators were used as inputs for the model. They found that ANN, with an accuracy of 75.74%, performed significantly better than SVM, which had an accuracy of 71.52%.

Patel et al. (2015a) compared the performance of four classifiers (ANN, SVM, random forest, and naive Bayes) for stock price index direction using two approaches. In the first approach, they used 10 technical indicator values as inputs with different parameter settings for the classifiers. Prediction accuracy fell within the range of 0.7331–0.8359. In the other approach, they represented the same 10 technical indicator results as directions (up and down), which were used as inputs for the classifiers. Using this approach, they enhanced accuracy by about 15% for all of the classifiers. Although their experiments concerned short-term prediction, the direction period was not explicitly explained.

Ballings et al. (2015) evaluated ensemble methods (random forest, AdaBoost, and kernel factory) against neural networks, logistic regression, SVM, and k-nearest neighbor for predicting stock price direction 1 year ahead. They used different stock market domains in their experiments. According to the median area under curve (AUC) scores, random forest showed the best performance, followed by SVM, kernel factory, and AdaBoost.

Hu et al. (2018) introduced an improved sine–cosine algorithm (ISCA) for optimizing the weights and biases of a back-propagation neural network (BPNN) to predict the directions of the opening stock prices of the S&P 500 and Dow Jones Industrial Average indices. Using Google Trends data in addition to the opening, high, low, and closing prices, as well as trading volume, in their experiments, they obtained an 86.81% hit ratio for the S&P 500 index and an 88.98% hit ratio for the Dow Jones Industrial Average index.

Gui et al. (2015) investigated SVM for predicting stock price index direction with different parameter settings. That study also compared the result for SVM with BPNN and case-based reasoning models; multiple technical indicators were used as inputs for the models. That study found that SVM outperformed the other models with an accuracy of 57.8313% while the other models had accuracies of 54.7332% and 51.9793%, respectively.

Qiu and Song (2016) developed a genetic algorithm (GA)-based optimized ANN to predict the direction of the next day’s price in the stock market index. GA was used to optimize the initial weights and bias of the model. Two types of input sets were generated using several technical indicators of the daily price of the Nikkei 225 index and fed into the model. They obtained accuracies of 60.87% for the first set and 81.27% for the second set.

Zhong and Enke (2017) investigated three dimensionality reduction techniques applied to ANN for forecasting the daily direction of the S&P 500 Index ETF (SPY). Principal component analysis (PCA), fuzzy robust principal component analysis (FRPCA), and kernel-based principal component analysis (KPCA) were used to reduce the number of features. Their experiments indicated that ANN with PCA performed slightly better than the other two techniques.

Zhong and Enke (2019) used deep neural networks and ANNs to forecast the daily return direction of the stock market. They performed experiments on both untransformed and PCA-transformed data sets to validate the model.

In addition to classical machine learning methods, researchers have recently started to use deep learning methods to predict future stock market values. LSTM has emerged as a deep learning tool for application to time-series data, such as financial data.

Zhang et al. (2017) proposed a state-frequency memory recurrent network, which is a modification of LSTM, to forecast stock prices. By decomposing the hidden states of memory cells into multiple frequency components, they could learn the trading patterns of those frequencies. They used state-frequency components to predict future price values through nonlinear regression. They used stock prices from several sectors and performed experiments to make forecasts for 1, 3, and 5 days. They compared the results with LSTM and autoregressive integrated moving average (ARIMA) in terms of mean-square error. They obtained errors of 5.57, 17.00, and 28.90 for the different steps, which outperformed the other models.

Fulfillment et al. (2016) studied stock market forecasting in six different domains using LSTM. The aim was to predict the next 3 hours using hourly historical stock data. The model was trained to classify three classes, namely increasing 0–1%, increasing above 1%, and not increasing (less than 0%). The accuracy results ranged from 49.75 to 59.5%. That study also built a stock trading simulator to test the model on real-world stock trading activity. With that simulator, the study reported a profit in all six stock domains, with an average of 6.89%.

Nelson et al. (2017) examined LSTM for predicting 15-min trends in stock prices using technical indicators. They used 175 technical indicators (computed with an external technical analysis library) and the open, close, minimum, maximum, and volume as inputs for the model. They compared their model with baselines consisting of multilayer perceptron, random forest, and pseudo-random models. The accuracy of LSTM for different stocks ranged from 53 to 55.9%. They concluded that LSTM performed significantly better than the baseline models, according to the Kruskal–Wallis test.

More recently, Fischer and Krauss (2018) applied LSTM to the stock market. They investigated many different aspects of the stock market and found that LSTM was very successful for predicting future prices for that type of time-series data. They also compared LSTM with more traditional machine learning tools to show its superior performance.

Similarly, Di Persio and Honchar (2016) applied LSTM and two other traditional neural network based machine learning tools to future price prediction. They also analyzed ensemble-based solutions by combining results obtained using different tools.

In addition to traditional exchanges, many studies have also investigated Forex. Some studies of Forex based on traditional machine learning tools are discussed below.

Galeshchuk and Mukherjee (2017) investigated the performance of a convolutional neural network (CNN) for predicting the direction of change in Forex. Using the daily closing rates of EUR/USD, GBP/USD, and USD/JPY, they compared the results of CNN with their baseline models and SVM. While the baseline models and SVM had an accuracy of around 65%, their proposed CNN model had an accuracy of about 75%.

Meanwhile, Kayal (2010) investigated the use of MLP in Forex. That work used basic technical indicators as inputs.

Ghazali et al. (2009) also investigated the use of neural networks for Forex. They proposed a higher-order neural network called a dynamic ridge polynomial neural network (DRPNN). In their experiments, DRPNN performed better than a ridge polynomial neural network (RPNN) and a pi-sigma neural network (PSNN).

To predict exchange rates, Majhi et al. (2009) proposed using new ANNs, referred to as a functional link artificial neural network (FLANN) and a cascaded functional link artificial neural network (CFLANN). They demonstrated that those new networks were more robust and had lower computational costs compared to an MLP trained with back-propagation.

In what is commonly called a mark-to-market approach, market prices are increasingly being used to calibrate models to quantify risk in several sectors. The net present value of a financial institution, for example, is an important input for estimating both bankruptcy risk (e.g., Kou et al. 2020) and the likelihood that shocks will propagate throughout the financial system (Kou et al. 2019). In such a context, stock price crashes not only dramatically damage the capital market but also have medium-term adverse effects on the financial sector as a whole (Wen et al. 2019). Credit risk is a major factor in financial shocks. Therefore, a realistic appraisal of solvency needs to be an objective for banks. At the level of the individual borrower, credit scoring is a field in which machine learning methods have been used for a long time (e.g., Shen et al. 2020; Wang et al. 2020).

Deep learning methods such as LSTM are rarely used for Forex. In one recent work, Shen et al. (2015) proposed a modified deep belief network. They were able to show that deep learning approaches outperformed traditional methods.

Even though LSTM is starting to be used in financial markets, using it in Forex for direction forecasting between two currencies, as proposed in the present work, is a novel approach.

Forex preliminaries

Forex has characteristics that are quite different from those of other financial markets (Archer 2010; Ozorhan et al. 2017). To explain Forex, we start by describing how a trade is made. Profit/loss calculations are made using the difference between the final ratio and the initial ratio of the currency pair that has been traded. If the ratio of the currency pair increases and the trader goes long, or the currency pair ratio decreases and the trader goes short, the trader will profit from that transaction when it is closed. Otherwise, the trader will not profit. For example, let us assume the EUR/USD ratio was 1.1500 when the trader started a transaction, going long with an initial amount of $10,000. When the position closes (i.e., the transaction ends) with a ratio of 1.1550, the trader will gain $10000 \times (1.1550 - 1.1500) = \$50$. When the position closes with a ratio of 1.1450, the trader will lose $10000 \times (1.1500 - 1.1450) = \$50$. These calculations assume no leverage. If the trader uses a leverage value such as 10, both the loss and the gain are multiplied by 10.
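The arithmetic in this example can be checked with a short script. The following is a minimal sketch in Python; the function name `position_pnl` and its parameters are illustrative assumptions rather than part of the study, and the formula is the simplified one used in the text (no spread, swap, or commission).

```python
def position_pnl(amount, open_rate, close_rate, side="long", leverage=1):
    """Profit or loss under the simplified formula used in the text.

    amount     -- initial amount committed to the trade
    open_rate  -- currency-pair ratio when the position is opened
    close_rate -- currency-pair ratio when the position is closed
    side       -- "long" profits when the ratio rises, "short" when it falls
    leverage   -- multiplier applied to both gains and losses
    """
    if side not in ("long", "short"):
        raise ValueError("side must be 'long' or 'short'")
    sign = 1 if side == "long" else -1
    return sign * amount * leverage * (close_rate - open_rate)


# The two unleveraged cases from the text, then the same gain with leverage 10.
gain = position_pnl(10_000, 1.1500, 1.1550)
loss = position_pnl(10_000, 1.1500, 1.1450)
levered_gain = position_pnl(10_000, 1.1500, 1.1550, leverage=10)
```

Up to floating-point rounding, `gain` is 50, `loss` is -50, and `levered_gain` is 500, which matches the figures in the example.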

Detailed definitions of commonly used concepts and terms in Forex can be found in Forex (2018), Archer (2010) and Özorhan (2017). Here, we explain only the most important ones.

Base currency, which is also called the transaction currency, is the first currency in the currency pair while quote currency is the second one in the pair. To illustrate, in the EUR/USD pair, EUR is the base currency, and USD is the quote currency.

Being long (or going long) means buying the base currency or selling the quote currency in the currency pair. Being short (or going short) means selling the base currency or buying the quote currency in the currency pair. Pip is an abbreviation for “percentage of point,” defined as the smallest amount of change occurring in the currency ratio. In general, a pip corresponds to the fourth decimal place of the currency ratio (i.e., a change of 0.0001). A pipette is a fractional pip, which corresponds to the fifth decimal place (i.e., a change of 0.00001). In other words, 1 pip equals 10 pipettes.
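As a small illustration of these units, the sketch below converts a change in the currency ratio into pips and pipettes. The constant and function names are hypothetical, and the pip size of 0.0001 is the general case described above (some pairs are quoted differently).

```python
PIP = 0.0001       # fourth decimal place of the ratio
PIPETTE = 0.00001  # fifth decimal place of the ratio


def to_pips(rate_change):
    """Express a change in the currency ratio in pips."""
    return rate_change / PIP


def to_pipettes(rate_change):
    """Express a change in the currency ratio in pipettes."""
    return rate_change / PIPETTE


# A move from 1.1500 to 1.1550 is roughly 50 pips, or 500 pipettes.
move = 1.1550 - 1.1500
move_in_pips = to_pips(move)
move_in_pipettes = to_pipettes(move)
```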

Leverage corresponds to the use of borrowed money when making transactions. A leverage of 1:100 indicates that if one opens a position with a volume of 1, the actual transaction volume will be 100. After using leverage, one can either gain or lose 100 times the amount of that volume. Margin refers to money borrowed by a trader that is supplied by a broker to make investments using leverage. In this way, one can multiply his/her gains or losses.

Bid price is the price at which the trader can sell the base currency. Ask price is the price at which the trader can buy the base currency. Spread is the difference between the ask and bid prices. A lower spread means the trader can profit from small price changes. Spread value is dependent on market volatility and liquidity. Stop loss is an order to sell a currency when it reaches a specified price. This order is used to prevent larger losses for the trader. Take profit is an order by the trader to close the open position (transaction) for a gain when the price reaches a predefined value. This order guarantees profit for the trader without having to worry about changes in the market price. Market order is an order that is performed instantly at the current price. Swap is a simultaneous buy and sell action for the currency at the same amount at a forward exchange rate. This protects traders from fluctuations in the interest rates of the base and quote currencies. If the base currency has a higher interest rate and the quote currency has a lower interest rate, then a positive swap will occur; in the reverse case, a negative swap will occur.

Fundamental analysis and technical analysis are the two techniques commonly used for predicting future prices in Forex. While the first is based on economic factors, the latter is related to price actions (Archer 2010).

Fundamental analysis focuses on the economic, social, and political factors that can cause prices to move higher, move lower, or stay the same (Archer 2010; Murphy 1999). These factors are also called macroeconomic factors. Economic data reports, interest rates, monetary policy, and international trade/investment flows are some examples (Ozorhan et al. 2017).

Technical analysis uses only the price to predict future price movements (Kritzer and Service 2012). This approach studies the effect of price movement. Technical analysis mainly uses open, high, low, close, and volume data to predict market direction or generate sell and buy signals (Archer 2010). It is based on the following three assumptions (Murphy 1999):

Market action discounts everything.
Price moves in trends.
History repeats itself.

Chart analysis and price analysis using technical indicators are the two main approaches in technical analysis. While the former is used to detect patterns in price charts, the latter is used to predict future price actions (Ozorhan et al. 2017).

Long short-term memory (LSTM)

Long short-term memory (LSTM) was proposed by Hochreiter and Schmidhuber (1997). LSTM is a recurrent neural network architecture that was designed to overcome the vanishing gradient problem found in conventional recurrent neural networks (RNNs) (Biehl 2005). Errors between layers tend to vanish or blow up, which causes oscillating weights or unacceptably long convergence times. The initial LSTM structure solves this problem by introducing the constant error carousel (CEC). In this way, the architecture ensures constant error flow between the self-connected units (Hochreiter and Schmidhuber 1997).

The memory cell of the initial LSTM structure consists of an input gate and an output gate. While the input gate decides which information should be kept or updated in the memory cell, the output gate controls which information should be output. This standard LSTM was extended with the introduction of a new feature called the forget gate (Gers et al. 2000). The forget gate is responsible for resetting a memory state that contains outdated information. Furthermore, peephole connections and full back-propagation through time (BPTT) training are final features that were added to the LSTM architecture (Gers and Schmidhuber 2000; Greff et al. 2017). With these modifications, the architecture was renamed Vanilla LSTM (Greff et al. 2017), as shown in Fig. 1.

LSTM offers an effective and scalable model for learning problems that involve sequential data (Greff et al. 2017). It has been used in many different fields, including handwriting recognition (Graves et al. 2009; Pham et al. 2014) and generation (Graves 2013), language modeling (Zaremba et al. 2014) and translation (Luong et al. 2015), acoustic modeling of speech (Zia and Zahid 2019), speech synthesis (Fan et al. 2014), protein secondary structure prediction (Sønderby and Winther 2014), audio analysis (Marchi et al. 2014), and video data analysis (Donahue et al. 2017; Greff et al. 2017).

Forward pass

One of the two main operations of LSTM, shown in Fig. 1, is called the forward pass. In the forward pass, the calculation moves forward through the block, computing the block output from the inputs and the current weights (Greff et al. 2017). The weights of LSTM can be categorized as follows:

Input weights: $W_z, W_i, W_f, W_o \in \mathbb{R}^{N \times M}$
Recurrent weights: $R_z, R_i, R_f, R_o \in \mathbb{R}^{N \times N}$
Peephole weights: $p_i, p_f, p_o \in \mathbb{R}^{N}$
Bias weights: $b_z, b_i, b_f, b_o \in \mathbb{R}^{N}$,

where z is the block input, i is the input gate, f is the forget gate, o is the output gate, N is the number of LSTM blocks, and M is the number of inputs. By introducing $x^t$ as the input vector, $y^t$ as the block output, and $c^t$ as the cell at time t, the formulation of the forward pass in Vanilla LSTM can be defined as below:

$$\begin{aligned} {{\bar{z}}^{t}}&= {W_z}{x^t} + {R_z}{y^{t-1}} + {b_z}, \end{aligned}$$

(1)

$$\begin{aligned} {z^t}&= g({\bar{z}}^{t}), \end{aligned}$$

(2)

$$\begin{aligned} {{\bar{i}}^{t}}&= {W_i}{x^t} + {R_i}{y^{t-1}} + {p_i}\odot {c^{t-1}} + {b_i}, \end{aligned}$$

(3)

$$\begin{aligned} {i^t}&= \sigma ({\bar{i}}^{t}), \end{aligned}$$

(4)

$$\begin{aligned} {{\bar{f}}^{t}}&= {W_f}{x^t} + {R_f}{y^{t-1}} + {p_f}\odot {c^{t-1}} + {b_f}, \end{aligned}$$

(5)

$$\begin{aligned} {f^t}&= \sigma ({\bar{f}}^{t}), \end{aligned}$$

(6)

$$\begin{aligned} {c^{t}}&= {z^t}\odot {i^t} + {c^{t-1}}\odot {f^t}, \end{aligned}$$

(7)

$$\begin{aligned} {{\bar{o}}^{t}}&= {W_o}{x^t} + {R_o}{y^{t-1}} + {p_o}\odot {c^t} + {b_o}, \end{aligned}$$

(8)

$$\begin{aligned} {o^t}&= \sigma ({\bar{o}}^{t}), \end{aligned}$$

(9)

$$\begin{aligned} {y^{t}}&= {h(c^t)}\odot {o^t}, \end{aligned}$$

(10)

where $\sigma $ is the logistic sigmoid function, g and h are hyperbolic tangent functions, and $\odot $ is the point-wise multiplication of the two vectors.
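Equations 1–10 can be traced step by step in code. The following is a minimal sketch of one forward step of a peephole LSTM block in plain Python, written for readability rather than speed. The helper names, the dictionary layout of the parameters, and the random initialization are assumptions made for illustration; they are not the implementation used in the study.

```python
import math
import random


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def tanh(v):
    return [math.tanh(a) for a in v]


def matvec(W, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def add(*vectors):
    """Element-wise sum of any number of equal-length vectors."""
    return [sum(vals) for vals in zip(*vectors)]


def mul(u, v):
    """Point-wise product of two vectors."""
    return [a * b for a, b in zip(u, v)]


def init_params(n_blocks, n_inputs, seed=0):
    """Random W (N x M), R (N x N), peephole p (N) and bias b (N) weights."""
    rng = random.Random(seed)
    P = {}
    for gate in "zifo":
        P["W" + gate] = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                         for _ in range(n_blocks)]
        P["R" + gate] = [[rng.uniform(-0.5, 0.5) for _ in range(n_blocks)]
                         for _ in range(n_blocks)]
        P["b" + gate] = [rng.uniform(-0.5, 0.5) for _ in range(n_blocks)]
    for gate in "ifo":  # the block input z has no peephole connection
        P["p" + gate] = [rng.uniform(-0.5, 0.5) for _ in range(n_blocks)]
    return P


def lstm_step(x, y_prev, c_prev, P):
    """One forward step; returns the block output y^t and the cell c^t."""
    z = tanh(add(matvec(P["Wz"], x), matvec(P["Rz"], y_prev), P["bz"]))  # Eqs. 1-2
    i = sigmoid(add(matvec(P["Wi"], x), matvec(P["Ri"], y_prev),
                    mul(P["pi"], c_prev), P["bi"]))                      # Eqs. 3-4
    f = sigmoid(add(matvec(P["Wf"], x), matvec(P["Rf"], y_prev),
                    mul(P["pf"], c_prev), P["bf"]))                      # Eqs. 5-6
    c = add(mul(z, i), mul(c_prev, f))                                   # Eq. 7
    o = sigmoid(add(matvec(P["Wo"], x), matvec(P["Ro"], y_prev),
                    mul(P["po"], c), P["bo"]))                           # Eqs. 8-9
    y = mul(tanh(c), o)                                                  # Eq. 10
    return y, c


def lstm_forward(xs, P):
    """Run the block over a sequence, starting from zero output and cell."""
    n = len(P["bz"])
    y, c = [0.0] * n, [0.0] * n
    outputs = []
    for x in xs:
        y, c = lstm_step(x, y, c, P)
        outputs.append(y)
    return outputs


params = init_params(n_blocks=2, n_inputs=3)
outputs = lstm_forward([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], params)
```

Note that the output gate uses the updated cell state, whereas the input and forget gates use the cell state from the previous step, exactly as in Eqs. 3, 5, and 8.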

Back-propagation through time

The other main operation is back-propagation. Back-propagation through time (BPTT) is the process of calculating the deltas of LSTM blocks and the gradient of the weights (Greff et al. 2017).

First, the deltas ($\delta $) of LSTM blocks and the inputs are calculated. In the below equations, $\Delta ^t$ is the vector of the deltas passed down from the above layer, and T is the transposition operator. Calculation of the deltas is performed as follows:

$$\begin{aligned} {{\delta }y^{t}}&= \Delta ^t + {R_z}^T{\delta }{\bar{z}}^{t+1} + {R_i}^T{\delta }{\bar{i}}^{t+1} + {R_f}^T{\delta }{\bar{f}}^{t+1} + {R_o}^T{\delta }{\bar{o}}^{t+1}, \end{aligned}$$

(11)

$$\begin{aligned} {{\delta }{\bar{o}}^{t}}&= {\delta }{y^t} \odot h(c^t) \odot \sigma '({\bar{o}}^{t}), \end{aligned}$$

(12)

$$\begin{aligned} {{\delta }{c^{t}}}&= {\delta }{y^t} \odot o^t \odot h'(c^t) + p_o \odot {\delta }{\bar{o}}^{t} + p_i \odot {\delta }{\bar{i}}^{t+1} + p_f \odot {\delta }{\bar{f}}^{t+1} + {\delta }{c^{t+1}} \odot f^{t+1}, \end{aligned}$$

(13)

$$\begin{aligned} {{\delta }{\bar{f}}^{t}}&= {\delta }{c^t} \odot c^{t-1} \odot \sigma '({\bar{f}}^{t}), \end{aligned}$$

(14)

$$\begin{aligned} {{\delta }{\bar{i}}^{t}}&= {\delta }{c^t} \odot z^t \odot \sigma '({\bar{i}}^{t}), \end{aligned}$$

(15)

$$\begin{aligned} {{\delta }{\bar{z}}^{t}}&= {\delta }{c^t} \odot i^t \odot g'({\bar{z}}^{t}), \end{aligned}$$

(16)

$$\begin{aligned} {{\delta }{x^t}}&= {W_z}^T {\delta }{\bar{z}}^{t} + {W_i}^T {\delta }{\bar{i}}^{t} + {W_f}^T {\delta }{\bar{f}}^{t} + {W_o}^T {\delta }{\bar{o}}^{t}. \end{aligned}$$

(17)

Then, the calculation of the gradient of the weights is performed. In the below formulas, $*$ can be any of $\{{\bar{z}}, {\bar{i}}, {\bar{f}}, {\bar{o}}\}$, $\langle *_1, *_2 \rangle$ corresponds to the outer product of the two vectors, and the summation limit T is the length of the sequence (the number of time steps). The calculations are as follows:

$$\begin{aligned} {{\delta }W_*}&= \sum _{t=0}^T{\langle {\delta }*^t, x^t \rangle }, \end{aligned}$$

(18)

$$\begin{aligned} {{\delta }R_*}&= \sum _{t=0}^{T-1}{\langle {\delta }*^{t+1}, y^t \rangle }, \end{aligned}$$

(19)

$$\begin{aligned} {{\delta }b_*}&= \sum _{t=0}^T{{\delta }*^t}, \end{aligned}$$

(20)

$$\begin{aligned} {{\delta }p_i}&= \sum _{t=0}^{T-1}{c^t} \odot {\delta }{\bar{i}}^{t+1}, \end{aligned}$$

(21)

$$\begin{aligned} {{\delta }p_f}&= \sum _{t=0}^{T-1}{c^t} \odot {\delta }{\bar{f}}^{t+1}, \end{aligned}$$

(22)

$$\begin{aligned} {{\delta }p_o}&= \sum _{t=0}^{T}{c^t} \odot {\delta }{\bar{o}}^t. \end{aligned}$$

(23)

Using Eqs. 11–23, all weights are updated.
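To make the weight-gradient sums concrete, the sketch below accumulates Eqs. 18–20 for a single gate, given the pre-activation deltas of that gate at every time step. It is a minimal illustration in plain Python; the function names are hypothetical, the deltas are assumed to have been computed already (Eqs. 11–16), and the peephole gradients (Eqs. 21–23) are omitted.

```python
def input_weight_gradients(deltas, inputs):
    """Accumulate dW (Eq. 18) and db (Eq. 20) for one gate.

    deltas[t] -- delta of the gate's pre-activation at step t (length N)
    inputs[t] -- input vector x^t at step t (length M)
    """
    n, m = len(deltas[0]), len(inputs[0])
    dW = [[0.0] * m for _ in range(n)]
    db = [0.0] * n
    for d, x in zip(deltas, inputs):
        for r in range(n):
            db[r] += d[r]
            for col in range(m):
                dW[r][col] += d[r] * x[col]  # outer product, summed over t
    return dW, db


def recurrent_weight_gradient(deltas, outputs):
    """Accumulate dR (Eq. 19): pairs the delta at step t+1 with y^t."""
    dR, _ = input_weight_gradients(deltas[1:], outputs[:-1])
    return dR


# Two time steps, N = 2 blocks, M = 3 inputs (toy numbers).
toy_deltas = [[1.0, 2.0], [0.5, -1.0]]
toy_inputs = [[1.0, 0.0, 2.0], [2.0, 1.0, 0.0]]
toy_outputs = [[1.0, 1.0], [9.0, 9.0]]
dW, db = input_weight_gradients(toy_deltas, toy_inputs)
dR = recurrent_weight_gradient(toy_deltas, toy_outputs)
```

The shifted index in the recurrent gradient reflects the fact that the output at step t only influences the gate pre-activations at step t+1, so the last output in the sequence contributes nothing.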

Technical indicators

A technical indicator is a time series that is obtained from mathematical formula(s) applied to another time series, which is typically a price (TIO 2018). These formulas generally use the close, open, high, low, and volume data. Technical indicators can be applied to anything that can be traded in an open market (e.g., stocks, futures, commodities, and Forex). They are empirical assistants that are widely used in practice to identify future price trends and measure volatility (Ozorhan et al. 2017). By analyzing historical data, they can help forecast the future prices.

According to their functionalities, technical indicators can be grouped into three categories: lagging, leading, and volatility. Lagging indicators, also referred to as trend indicators, follow the past price action. MA and MACD are the best examples of lagging indicators. Leading indicators, also known as momentum-based indicators, aim to predict future price trend directions and show rates of change in the price. ROC and RSI are the best-known examples of leading indicators. Volatility-based indicators measure volatility levels in the price. BB is the most widely used volatility-based indicator.

The technical indicators used in this study are described below.

Moving average (MA)

Moving average (MA) is a trend-following (or lagging) indicator that smooths prices by averaging them in a specified period. In this way, MA can help filter out noise. MA can not only identify the trend direction but also determine potential support and resistance levels (TIO 2018).
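A simple (unweighted) moving average is the most basic variant. The sketch below is a minimal Python illustration; the function name and the choice to return `None` until a full window is available are assumptions, and the study may use a different variant or period.

```python
def simple_moving_average(prices, period):
    """Average of the last `period` prices; None until a full window exists."""
    if period <= 0:
        raise ValueError("period must be positive")
    averages = []
    for k in range(len(prices)):
        if k + 1 < period:
            averages.append(None)
        else:
            window = prices[k + 1 - period:k + 1]
            averages.append(sum(window) / period)
    return averages


sma = simple_moving_average([1, 2, 3, 4, 5], period=3)
```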

Moving average convergence divergence (MACD)

Moving average convergence divergence (MACD) is a momentum oscillator developed by Gerald Appel in the late 1970s. It is a trend-following indicator that uses the short- and long-term exponential moving averages of prices (Appel 2005). MACD uses the short-term moving average to identify price changes quickly and the long-term moving average to emphasize trends (Ozorhan et al. 2017).
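The MACD line is commonly computed as the difference between a short and a long exponential moving average (EMA). The sketch below assumes the conventional periods of 12 and 26 and an EMA seeded with the first price; these choices and the function names are illustrative assumptions, not settings taken from the study.

```python
def ema(prices, period):
    """Exponential moving average with smoothing factor 2 / (period + 1)."""
    alpha = 2.0 / (period + 1)
    values = [float(prices[0])]  # seed with the first price (an assumption)
    for price in prices[1:]:
        values.append(alpha * price + (1.0 - alpha) * values[-1])
    return values


def macd_line(prices, short_period=12, long_period=26):
    """Short EMA minus long EMA at every time step."""
    short_ema = ema(prices, short_period)
    long_ema = ema(prices, long_period)
    return [s - l for s, l in zip(short_ema, long_ema)]


# In a steadily rising series the short EMA tracks the price more closely
# than the long EMA, so the MACD line ends up positive.
rising = macd_line(list(range(1, 61)))
```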

Rate of change (ROC)

Rate of change (ROC) is a momentum oscillator that describes the velocity of the price. This indicator measures the percentage change in the price by calculating the ratio between the current closing price and the closing price a specified number of periods earlier (Ozorhan et al. 2017).
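One common way to write this is as the percentage change relative to the close `period` steps back. The sketch below uses that form; the function name and the percentage scaling are assumptions, and the Appendix formula may be expressed as a plain ratio instead.

```python
def rate_of_change(closes, period):
    """Percentage change versus the close `period` steps earlier."""
    if period <= 0:
        raise ValueError("period must be positive")
    out = [None] * len(closes)
    for i in range(period, len(closes)):
        previous = closes[i - period]
        out[i] = (closes[i] - previous) / previous * 100.0
    return out


# Illustrative prices: each close is 10% above the previous one.
roc = rate_of_change([100.0, 110.0, 121.0], period=1)
```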

Momentum

Momentum measures the amount of change in the price during a specified period (Colby 2003). It is a leading indicator that either shows rises and falls in the price or remains stable when the current trend continues. Momentum is calculated based on the differences in prices for a set time interval (Murphy 1999).
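Taking the description literally, momentum can be sketched as the plain difference between the current close and the close a fixed number of periods earlier. The function name and the example period are illustrative assumptions.

```python
def momentum(closes, period):
    """Price difference versus the close `period` steps earlier."""
    if period <= 0:
        raise ValueError("period must be positive")
    out = [None] * len(closes)
    for i in range(period, len(closes)):
        out[i] = closes[i] - closes[i - period]
    return out


# Illustrative prices, not taken from the data set.
mom = momentum([1.0, 2.0, 4.0, 7.0], period=2)
```

Unlike ROC, this value is expressed in price units rather than as a percentage.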

Relative strength index (RSI)

The relative strength index (RSI) is a momentum indicator developed by J. Welles Wilder in 1978. RSI is based on the ratio between the average gain and average loss, which is called the relative strength (RS) (Ozorhan et al. 2017; Wilder 1978). RSI is an oscillator whose values range between 0 and 100. It is used to identify overbought and oversold levels in the prices.
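A minimal sketch of the calculation, assuming the usual mapping RSI = 100 - 100 / (1 + RS) and Wilder-style smoothing of the average gain and loss. The 14-period default and the handling of windows with no losses are assumptions; other implementations use simple averages or different edge-case conventions.

```python
def _rsi_from_averages(avg_gain, avg_loss):
    if avg_loss == 0:
        # No losses in the window: RS is unbounded, so RSI saturates at 100.
        # (Returning 100 for a completely flat window is a convention.)
        return 100.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def rsi(closes, period=14):
    """RSI with Wilder smoothing; the first `period` entries are None."""
    if len(closes) <= period:
        raise ValueError("need more than `period` closing prices")
    changes = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]
    # Seed the averages with a simple mean over the first window.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    out = [None] * period
    out.append(_rsi_from_averages(avg_gain, avg_loss))
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
        out.append(_rsi_from_averages(avg_gain, avg_loss))
    return out


# Illustrative series: strictly rising and strictly falling prices.
rising = rsi([float(i) for i in range(1, 20)])
falling = rsi([float(i) for i in range(20, 1, -1)])
```

A strictly rising series pins the indicator at its upper bound and a strictly falling one at its lower bound, which illustrates the 0 to 100 range.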

Bollinger bands (BB)

Bollinger bands (BB) is a volatility-based indicator developed by John Bollinger in the 1980s. It has three bands that provide relative definitions of high and low according to the base (Bollinger 2001). The middle band is the moving average over a specific period, and the upper and lower bands are placed above and below it at a distance determined by the standard deviation of the price. The distance between the bands therefore depends on the volatility of the price (Bollinger 2001; Ozturk et al. 2016).
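The three bands can be sketched as a moving average plus and minus a multiple of the rolling standard deviation. The 20-period window, the multiplier of 2, and the use of the population standard deviation are commonly used settings that are assumed here; the source does not state the parameters.

```python
from statistics import pstdev


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return a list of (lower, middle, upper) tuples; None until a full window."""
    bands = [None] * len(closes)
    for i in range(window - 1, len(closes)):
        chunk = closes[i - window + 1 : i + 1]
        middle = sum(chunk) / window
        # Band half-width scales with the volatility inside the window.
        spread = num_std * pstdev(chunk)
        bands[i] = (middle - spread, middle, middle + spread)
    return bands


# Illustrative prices with a short window so the numbers are easy to follow.
bands = bollinger_bands([1.0, 2.0, 3.0, 4.0], window=2, num_std=2.0)
```

When the price is constant inside the window, the standard deviation is zero and the three bands coincide.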

Commodity channel index (CCI)

The commodity channel index (CCI) is a momentum-based indicator developed by Donald Lambert in 1980. CCI is based on the principle that current prices should be examined based on recent past prices, not those in the distant past, to avoid confusing present patterns (Lambert 1983). This indicator can be used to highlight a new trend or warn against extreme conditions. Moreover, CCI identifies overbought and oversold conditions (Özorhan 2017).
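A sketch of the usual CCI formulation, which compares the typical price (the mean of high, low, and close) with its moving average and scales the difference by the mean absolute deviation. The 20-period window is an assumption, and 0.015 is the scaling constant conventionally attributed to Lambert; the Appendix holds the form the authors used.

```python
def commodity_channel_index(highs, lows, closes, period=20, constant=0.015):
    """CCI per bar; entries before the first full window are None."""
    typical = [(h + l + c) / 3.0 for h, l, c in zip(highs, lows, closes)]
    out = [None] * len(typical)
    for i in range(period - 1, len(typical)):
        window = typical[i - period + 1 : i + 1]
        mean_tp = sum(window) / period
        mean_dev = sum(abs(tp - mean_tp) for tp in window) / period
        if mean_dev == 0:
            # Flat window: report 0 rather than dividing by zero (a convention).
            out[i] = 0.0
        else:
            out[i] = (typical[i] - mean_tp) / (constant * mean_dev)
    return out


# Illustrative bars where high, low, and close coincide.
prices = [1.0, 2.0, 3.0]
cci = commodity_channel_index(prices, prices, prices, period=3)
```

Because only the most recent `period` bars enter each value, older prices have no influence, in line with the principle described above.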

The data set

Interest and inflation rates are two fundamental indicators of the strength of an economy. In the case of low interest rates, individuals tend to buy investment tools that strengthen the economy. In the opposite case, the economy becomes fragile. If supply does not meet demand, inflation occurs, and interest rates also increase (IRD 2018).

Germany and the US are two of the world’s most powerful economies. In such economies, the stock markets have strong relationships with their currencies. DAX is the German stock index, which has a strong relationship with the price of the EUR, while the S&P 500 is a US stock index that affects the USD. Central banks’ interest rates are also important factors determining the prices of currencies. Therefore, the interest rates determined by the European Central Bank and the Fed directly affect EUR and USD prices, respectively.

In this work, to investigate the effect of macroeconomic factors on the value of the EUR/USD currency pair, we used the factors described in Table 1, as well as the close, open, high, and low values of the EUR/USD pair, which were retrieved from EUR/USD historical data (EUR 2018). The rest of the data were obtained from various online resources, including the ECB Statistical Data Warehouse (ECB 2018; EU 2018; Germany 2018), Bureau of Labor Statistics Data (2018), Federal Reserve Economic Data (EFFR 2018), and Yahoo Finance (DAX 2018).

The data set was created with values from the period January 2013–January 2018. This 5-year period contains 1234 daily data points for days on which the markets were open. There were 613 increases and 620 decreases for the EUR/USD ratio during this period. Table 1 presents explanations for each field in the data set. Monthly inflation rates were collected from the websites of central banks, and they were repeated for all days of the corresponding month to fill the fields in our daily records.
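The two preparation steps mentioned here, repeating a monthly value on every trading day of its month and counting daily increases and decreases, can be sketched as follows. The function names and all numbers in the example are made up for illustration and are not taken from the data set. Note that 1234 daily points yield 1233 day-to-day changes, which is consistent with 613 increases plus 620 decreases.

```python
from datetime import date


def expand_monthly_to_daily(trading_days, monthly_values):
    """Attach to each trading day the value reported for its (year, month)."""
    return [monthly_values[(d.year, d.month)] for d in trading_days]


def count_directions(closes):
    """Count day-to-day increases and decreases in a closing-price series."""
    pairs = list(zip(closes, closes[1:]))
    ups = sum(1 for prev, curr in pairs if curr > prev)
    downs = sum(1 for prev, curr in pairs if curr < prev)
    return ups, downs


# Made-up values purely for illustration.
days = [date(2013, 1, 30), date(2013, 1, 31), date(2013, 2, 1)]
inflation = {(2013, 1): 1.6, (2013, 2): 2.0}
daily_inflation = expand_monthly_to_daily(days, inflation)
ups, downs = count_directions([1.30, 1.31, 1.29])
```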

Table 1 Macroeconomic data and the currency pair used in the data set
