Foreign exchange trading and management with the stochastic dual dynamic programming method

We present a novel tool for generating speculative and hedging foreign exchange (FX) trading policies. Our solution provides a schedule that determines trades in each rebalancing period based on future currency prices, net foreign account positions, and incoming (outgoing) flows from business operations. To obtain such policies, we construct a multistage stochastic programming (MSP) model and solve it using the stochastic dual dynamic programming (SDDP) numerical method, which specializes in solving high-dimensional MSP models. We construct our methodology within an open-source SDDP package, avoiding implementing the method from scratch. To measure the performance of our policies, we model FX prices as a mean-reverting stochastic process with random events that incorporate stochastic trends. We calibrate this price model on seven currency pairs, demonstrating that our trading policies not only outperform the benchmarks for each currency, but may also be close to ex-post optimal solutions. We also show how the tool can be used to generate more or less conservative strategies by adjusting the risk tolerance, and how it can be used in a variety of contexts and time scales, ranging from intraday speculative trading to monthly hedging for business operations. Finally, we examine the impact of increasing trade policy uncertainty (TPU) levels on our findings. Our findings show that the volatility of currencies from emerging economies rises in comparison to currencies from developed markets. We discover that an increase in the TPU level has no effect on the average profit obtained by our method. However, the risk exposure of the policies increases (decreases) for the group of currencies from emerging (developed) markets.

Page 2 of 38 Reus and Sepúlveda-Hurtado Financial Innovation (2023) 9:23 particularly in a competitive low-margin market. To hedge against FX risk, companies could either offset gains (losses) on foreign assets with a foreign liability (and vice versa) or use derivative contracts. Several speculators profit from FX movements. Meanwhile, financial institutions are in between, controlling the market risk in international portfolios while also employing FX-based investment strategies. Even commercial banks must deal with FX risk. Consider a bank outside the United States that provides credit (e.g., a credit card) for international purchases in US dollars (USD). When retail clients pay such credit, they typically do so in local currency (e.g., a checking account). Thus, the FX desk of such a bank is constantly purchasing USD for their clients when making these credit payments.
This work aims to build a decision support system that can assist companies in trading in the FX market for speculative or non-speculative purposes. We want to generate policies that specify how much to trade and when, so as to optimize the trade-off between expected profit and the risk incurred in obtaining it. A multistage stochastic programming (MSP) model is used to generate these policies. The advantage of using MSP is that decisions made at each point in time consider all possible future FX prices and decisions from that point to the horizon. We prefer this model to a myopic single-period approach because it enables a trading schedule to be planned with the proximity to the horizon taken into account.
MSP is a popular approach for solving asset allocation problems in discrete settings, especially for long-term horizon planning. Applications include pension funds (e.g., Duarte et al. 2017; De Oliveira et al. 2017) and asset-liability management in insurance companies (e.g., Carino et al. 1994; Consigli et al. 2018). More recently, Mulvey et al. (2019) and Kim et al. (2020) used MSP to build goal-driven portfolios. MSP is even being used to price options. For example, Antonelli et al. (2013) developed a linear stochastic programming model to price American options in incomplete markets. Haarbrücker and Kuhn (2009) used MSP to price agreements to trade electric energy at a specific time in the future, known as swing options. Finally, Kwon and Li (2016) used stochastic semidefinite programming in regime-switching models to derive European-style option prices.
The complexity of using MSP to solve our problem is that the number of possible outcomes grows exponentially with the number of stages considered. Because MSP typically involves multidimensional state spaces, computing the Bellman function in each state becomes intractable. To address the dimensionality issue, we employ stochastic dual dynamic programming (SDDP), a numerical method that approximates the Bellman function with linear cuts rather than computing its exact value in each state. The SDDP has been used successfully to solve large MSP models in various fields, for example, in energy planning (Pereira and Pinto 1991; Guigues 2014; Soares et al. 2017), supply chain management (Fhoula et al. 2013), and mining.
To the best of our knowledge, this is the first paper that uses the MSP approach to model and solve speculative FX trading. Most trading strategies are based on technical analysis, and their profitability has been studied (e.g., Abbey and Doukas 2012; Coakley et al. 2016; Zarrabi et al. 2017; Deng et al. 2020). In those approaches, trades are triggered by indicators that rely explicitly on historical patterns. Our approach, on the other hand, is forward-looking because trading decisions are based on current and future FX rate outcomes. Our model also handles FX trades made to hedge against business activities, which were not previously considered in FX trading strategies. One intriguing aspect of our methodology is that we do not need to change the model structure or the resolution method when solving problems with different time frequency and horizon settings. As demonstrated in the "Results with intraday trading" and "Results with daily trading" sections, the system can operate from an intraday, purely speculative trading context to monthly hedging by import/export businesses.
We intend to manage FX exposure by determining the appropriate amount of FX to buy (borrow) at any given time based on future outflows (inflows) of foreign denominated currency. In this sense, our model is similar to inventory management in supply chain problems (for a more in-depth account of this topic, see the survey by Andersson et al. (2010)). This provides an alternative to hedging currency risk with derivative contracts, which has been extensively researched. The advantage of derivative contracts over hedging with prior spot FX trades is that no money is required upfront. The disadvantage is that derivative contracts come with premium charges, which can be prohibitively expensive for small and medium businesses or businesses with low margins. Furthermore, some countries lack FX derivative markets. Nonetheless, as we will see in the "Methodology" section, both approaches can produce comparable hedging effectiveness. Yu et al. (2020) provided a multiperiod setting for FX, determining the optimal number of options or futures for future FX cash flows in a firm. Their solution is based on dynamic programming as well. Their own algorithm, however, is not as well documented as the SDDP, and their experiments only show the case of a single cash flow. Furthermore, the model and its solution are not intended to include speculative FX trading.
A second contribution is that our paper expands on the SDDP method's initial applications in economic and financial fields. So far, within finance, the SDDP has only been used in portfolio management applications. Kozmík and Morton (2015), and later Dupačová and Kozmík (2017), were among the first to use the SDDP and to include an algorithm that reduces the number of scenarios required for asset returns. Guigues (2017) proposed a cut selection strategy to boost the efficiency of the dual dynamic programming algorithm when applied to a portfolio problem. Valladão et al. (2019) proposed a model that can handle multiple asset selections and trading costs. Risk constraints and Markovian time-dependent returns based on a factor model and regimes are also included. Meanwhile, Guigues et al. (2020) extended the regularization method that enables the SDDP to solve non-linear problems. The algorithm can find asset allocations under market impact costs and risk aversion measures. Finally, Guigues (2021) proposed a method for solving the SDDP with an arbitrary number of stages. The paper demonstrates the economic benefit of using the method in cases where the trading timeframe is unknown in advance.
Hence, we would like to show that the tool can be applied in another context, that is, FX trading. To do so, we solve our novel MSP model using an open-source package called "sddp.jl," which Dowson and Kapelevich (2017) implemented in Julia. With this package, there is no longer any need to implement the algorithm from scratch. As a result, we encourage finance practitioners and traders to use this tool in their work. We also note that the sddp.jl package was successfully used by Reus and Prado (2021) to solve an index-based asset allocation problem.
Third, we can obtain solutions that implicitly account for risk aversion during the solution phase. Most FX strategies reduce risk by incorporating explicit constraints (e.g., bounding positions and turnovers). The disadvantage of doing so is that good solutions may be excluded when prior constraints are imposed. Previous research (e.g., Álvarez-Díez et al. 2016) has used risk measures to find hedging positions in one- and two-stage settings. However, as Homem-de-Mello and Pagnoncelli (2016) discussed, using risk measures in a multistage setting is more difficult to implement. Fortunately, the sddp.jl package allows us to change the MSP's objective function and thus reward conservative solutions based on the user's risk tolerance, without the need for prior boundaries.
Fourth, to the best of our knowledge, this is the first paper to use the SDDP to implement a mean-reverting process in a financial application. The algorithm was initially designed to work with stage-independent random variables. Following that, new research demonstrated how to incorporate Markov processes (Philpott and De Matos 2012) and autoregressive processes (Shapiro et al. 2013; Guigues 2014). The sddp.jl package can handle both types of processes, and the present study shows how to implement the Vasicek (1977) model.
Finally, we contribute to the existing literature on economic factors (information) that explain and/or influence currency movement. In this regard, Aloosh and Bekaert (2021) concluded that clustering dollar and European currencies is reasonable. They find that trading volume and knowing whether a currency is classified as a commodity currency can explain FX dynamics better than carry. Other research focuses on macroeconomic factors. In a recent article, Pham (2019) demonstrated that a decrease in Vietnam's money aggregate, as measured by the interest rate, results in a depreciation (appreciation) followed by an appreciation (depreciation) in the country's real effective exchange rate. These findings contradict the overshooting hypothesis proposed by Dornbusch (1976) and empirically supported by Sims (1992) and Eichenbaum and Evans (1995). This paper aims to investigate the impact of an increase in the trade policy uncertainty (TPU) index, as developed by Caldara et al. (2020), on the performance of the SDDP solutions. The TPU measures the frequency of occurrences of trade policy and uncertainty terms in major newspapers. Huynh et al. (2020) show that the TPU has a strong relationship with FX volatility, even when the Economic Policy Uncertainty (EPU) index is included. Our results show that an increase in the TPU level produces an increase (decrease) in the volatility of the currencies of emerging (developed) markets. This results in an increase (decrease) in the risk of the SDDP-generated policies in comparison to the risk obtained in periods with lower TPU levels.
The SDDP's application in our work aligns with the growing demand for quantitative methods in finance. As Hu et al. (2015) and Kou et al. (2019) explained, the financial market is a highly interconnected and complex network. As a result, gaining a better understanding of it necessitates the use of advanced quantitative and machine learning methods to assess systemic risk and improve financial stability. The TPU index, for example, is built using text mining methods, which are commonly used to gauge market sentiment. Incorporating risk aversion endogenously in our methodology necessitates handling and adapting risk measures, which are another key element of statistical methods. Brownlees and Engle (2017), for example, introduced SRISK, which measures a firm's capital shortfall in the event of a severe market decline. Agliardi (2018) explored the ambiguity of the value-at-risk when estimating capital requirements. Another known application is the use of machine learning techniques to classify the credit risk in retail banking. For example, Zhang et al. (2015) used a support vector machine to assess the default risk of SMEs when supply chain finance is included. Meanwhile, Li et al. (2021) applied the k-Means algorithm to financial data using a revised support vector data description model. Dastile et al. (2020) published a recent review on credit scoring using advanced statistical methods.
The remainder of this paper is structured as follows. The "SDDP method" section briefly explains the SDDP method. The "Methodology" section presents the MSP and FX price models. Then, the "Results with intraday trading" section demonstrates the model's performance and the policies derived from our model within an intraday trading setting, using seven currency pairs. The "Results with daily trading" section shows the model's performance in a daily trading setting using the same currency pairs and demonstrates the effect of the TPU level on the performance of our solutions. Finally, the "Conclusion and future development" section concludes the paper and discusses possible future extensions that could be used to improve the methodology.

SDDP method
We explain the SDDP at the practitioner level in this section, summarizing the information provided by Reus and Prado (2021). The SDDP is a numerical method for solving MSP models. To facilitate resolution, MSP models are typically structured as time-dependent subproblems. Let $x_t$ represent the decisions at each time period $t$, and $\xi_t$ represent the exogenous stochastic process. Each subproblem can be written as in Philpott and De Matos (2012).
Subproblem (1) corresponds to $t = 1$ and is a minimization with optimal value $z$; subproblem (2) corresponds to $t = 2, \ldots, T$ and has optimal value $Q_t(x_{t-1}, \xi_t)$. Matrix $B_t$ and vector $b_t$ depend on $\xi_t$, while $A_t$ and $c_t$ do not. Note that $x_t$ is obtained without knowing the values of $\xi_l$ for all $l > t$. At period $T$, we obtain a deterministic value for the border condition $E[Q_{T+1}(x_T, \xi_{T+1})]$.
The difficulty of solving an MSP model lies in the size of the state space, that is, the possible values of $\xi_t$. To find the optimal value, $Q_t$ must be evaluated in every possible state, which can be time consuming in multistage problems. As a result, numerical techniques such as the SDDP must be used to find near-optimal solutions.
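For orientation, stage subproblems of this kind are commonly written in the following standard form in the SDDP literature. This is a sketch under assumptions rather than a transcription of (1)-(2): the equality form of the constraints and the sign convention on $B_t x_{t-1}$ are assumed, and only the symbols already named in the text are used ($A_t$, $B_t$, $b_t$, $c_t$, the value function $Q_t$, and the lower bounds $l_t$ that appear in (3a)).

```latex
\begin{align*}
z = \min_{x_1 \ge l_1} \;& c_1^\top x_1 + E\left[Q_2(x_1, \xi_2)\right]
  && \text{s.t. } A_1 x_1 = b_1, \\
Q_t(x_{t-1}, \xi_t) = \min_{x_t \ge l_t} \;& c_t^\top x_t + E\left[Q_{t+1}(x_t, \xi_{t+1})\right]
  && \text{s.t. } A_t x_t = b_t - B_t x_{t-1}, \quad t = 2, \ldots, T.
\end{align*}
```

Under this assumed form, the previous decision $x_{t-1}$ enters stage $t$ only through the right-hand side, which is what allows dual variables of the constraints to provide slopes for linear cuts.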
As illustrated in Fig. 1, the main idea of the SDDP is to build an outer approximation of $E[Q_{t+1}(x_t, \xi_{t+1})]$ with linear cuts. Within the SDDP algorithm, problems (1)-(2) are approximated with linear programming (LP) models: model (3) for $t = 1$ and model (4) for $t = 2, \ldots, T$. In these models, the term $E[Q_{t+1}(x_t, \xi_{t+1})]$ is replaced by variable $\theta_{t+1}$, which is bounded by the cuts defined by constraints (3c)-(4c). Note that the dual variables of the original constraints are denoted by $\pi_t$ and depend on $\xi_t$.
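The outer-approximation idea can be illustrated on a one-dimensional toy problem. The sketch below is not the SDDP algorithm itself: it assumes a known convex function (a hypothetical stand-in for the expected cost-to-go), builds tangent cuts at a few trial points, and evaluates the resulting piecewise-linear lower bound, which plays the role of the variable θ. The names `cost_to_go`, `make_cut`, and `lower_approximation` are illustrative only.

```python
from typing import List, Tuple

Cut = Tuple[float, float]  # (intercept, slope): theta >= intercept + slope * x


def cost_to_go(x: float) -> float:
    """Hypothetical convex stand-in for the expected cost-to-go."""
    return (x - 2.0) ** 2 + 1.0


def cost_to_go_slope(x: float) -> float:
    """Derivative of the stand-in function, used as the cut slope."""
    return 2.0 * (x - 2.0)


def make_cut(x_trial: float) -> Cut:
    """Tangent line of cost_to_go at x_trial, stored as (intercept, slope)."""
    slope = cost_to_go_slope(x_trial)
    intercept = cost_to_go(x_trial) - slope * x_trial
    return intercept, slope


def lower_approximation(cuts: List[Cut], x: float) -> float:
    """Smallest theta satisfying every cut: the pointwise maximum of the cuts."""
    return max(intercept + slope * x for intercept, slope in cuts)


# Trial points play the role of decisions visited in forward passes.
trial_points = [0.0, 1.0, 3.0, 4.0]
cuts = [make_cut(x) for x in trial_points]

for x in [0.0, 0.5, 2.0, 3.5]:
    # The approximation never exceeds the true value and is exact at trial points.
    print(x, lower_approximation(cuts, x), cost_to_go(x))
```

Adding more trial points tightens the approximation only where decisions are actually visited, which is why cut-based methods avoid evaluating the value function in every state.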
The steps of the algorithm, including the estimation of the terms $g^{k,s}_{t+1}$ and $G^{k,s}_{t+1}$, are as follows. At each iteration $k$, we start with the forward pass. In this step, $N$ scenarios are sampled, and problems (3)-(4) are solved sequentially at each period $t$. The solution $x^{k,s}_{t-1}$ and the function value $Q_t(x^{k,s}_{t-1}, \xi_t)$ are saved for each scenario $s$ and are used to solve the problem in the next period. The value $z^k$ obtained from (3a) can be used as a lower bound, whereas the sample average of the costs can be used as an upper bound. The stopping criterion is met if the lower bound is within the $\alpha$-confidence interval of the upper bound.
The new cuts are generated with the backward pass step. By moving backward in time, problem (4) is solved at period $t$ by using the decisions $x^{k,s}_{t-1}$ stored in the forward pass step. Both the dual variables $\pi_t$ and the stored terms $Q_t(x^{k,s}_{t-1}, \xi_t)$ are used to estimate $g^{k,s}_t$ and $G^{k,s}_t$. Note that the cuts generated at period $t$ are used for period $t - 1$. Bandarra and Guigues (2021) recently developed and tested a cut selection method for multicut decomposition algorithms on a portfolio problem. The results show that their method outperforms other cut selection methods (e.g., De Matos et al. 2015), particularly when implementing the SDDP.
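The stopping test described for the forward pass can be sketched as follows. This is a minimal illustration, assuming a normal approximation for the sample mean of the simulated costs and a two-sided interval; the function names and the sample numbers are hypothetical, and an actual SDDP package may apply a different rule.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def upper_bound_interval(costs: Sequence[float], confidence: float) -> Tuple[float, float]:
    """Two-sided confidence interval for the expected cost (statistical upper bound)."""
    n = len(costs)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # normal quantile
    half_width = z * stdev(costs) / sqrt(n)
    center = mean(costs)
    return center - half_width, center + half_width


def has_converged(lower_bound: float, costs: Sequence[float], confidence: float = 0.95) -> bool:
    """True when the deterministic lower bound lies inside the interval."""
    low, high = upper_bound_interval(costs, confidence)
    return low <= lower_bound <= high


# Hypothetical forward-pass costs from N = 8 sampled scenarios.
sampled_costs = [101.0, 98.5, 103.2, 99.1, 100.4, 102.3, 97.8, 100.9]
print(has_converged(100.0, sampled_costs))  # lower bound close to the sample mean
print(has_converged(90.0, sampled_costs))   # lower bound still far below
```

Because the upper bound is a sample estimate, the test is statistical: a wider interval (few scenarios or volatile costs) makes the criterion easier to satisfy, so the number of sampled scenarios matters.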
The SDDP was originally designed for convex functions $Q$, an assumption that does not hold in economic problems like our trading model. In most economic problems, the objective function is determined by revenues (costs), which are calculated by multiplying a state variable (price/cost) by an endogenous variable (quantity). Downward et al. (2020) address this issue by incorporating into SDDP.jl an idea developed by Baucke et al. (2017) to bound non-convex value functions.
Another enhancement to the method was the incorporation of risk measures into the objective function, as seen in Philpott et al. (2013) and Shapiro et al. (2013). In this regard, SDDP.jl allows for the addition of risk aversion by replacing the risk-neutral expected value at each stage in (2) with expression (5), a convex combination of the expected value and a risk measure in which $\beta \in [0, 1]$ is the weight placed on the expected value. The Conditional Value-at-Risk (CVaR) is an example of a risk measure. CVaR was first proposed by Rockafellar et al. (2000) and has since been extended to a multistage context in several publications, including Reus et al. (2019).
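As an illustration of this convex combination, the sketch below evaluates β·E[Z] + (1 − β)·CVaR for a sample of costs Z. It assumes that β is the weight on the expected value (so that β = 1 is risk neutral), that larger values of Z are worse, and that CVaR is estimated as the mean of the worst (1 − α) share of the sample. The function names are illustrative and are not part of SDDP.jl.

```python
from math import ceil
from typing import Sequence


def empirical_cvar(costs: Sequence[float], alpha: float) -> float:
    """Average of the worst (1 - alpha) fraction of the sampled costs."""
    ordered = sorted(costs, reverse=True)  # largest (worst) costs first
    # round() guards against floating-point noise such as 1.9999999999999996
    tail_size = max(1, ceil(round((1.0 - alpha) * len(ordered), 9)))
    return sum(ordered[:tail_size]) / tail_size


def risk_adjusted_cost(costs: Sequence[float], beta: float, alpha: float) -> float:
    """Convex combination: beta * mean + (1 - beta) * CVaR_alpha."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    expected = sum(costs) / len(costs)
    return beta * expected + (1.0 - beta) * empirical_cvar(costs, alpha)


# Hypothetical sample of total costs with two bad outcomes.
costs = [10.0, 12.0, 9.0, 30.0, 11.0, 10.5, 9.5, 25.0, 10.0, 13.0]
print(risk_adjusted_cost(costs, beta=1.0, alpha=0.8))  # plain average
print(risk_adjusted_cost(costs, beta=0.5, alpha=0.8))  # mix of average and tail
print(risk_adjusted_cost(costs, beta=0.0, alpha=0.8))  # tail average only
```

Lowering β shifts weight toward the tail outcomes, so a policy optimized under a smaller β is penalized more heavily for the scenarios in which it buys at high prices.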

Currency trading problem (CTP)
Consider $t \in \{1, \ldots, T\}$ to be the times at which currency can be purchased or sold during a period with horizon $T$. Assume there are scheduled exogenous flows $f_t$, which are denominated in foreign currency. If $f_t$ is negative (positive), then we have an expense (income). At each time, we must decide:
• $x^{+}_t$ ($x^{-}_t$): The amount of foreign currency to be bought (sold) at time $t$.
To make the above decisions, we need to know the value of the following state variables:
• $S_t$: FX price at the end of time $t$, that is, the value in the domestic currency of one unit of foreign currency.
• $P_t$: Net position of foreign currency at the end of time $t$. If the amount is positive (negative), we have a long (short) position.
The uncertainty of prices can be included in the exogenous process $\xi_t$, which is the FX price increase (decrease) (%) at time $t$. We include the bid/ask spread with a parameter $c$ and define $S$ to be the price at which we buy and $S(1 - c)$ to be the price at which we sell. At horizon $T$, we value positions at the closing price $S_T$. The initial price and position at the beginning of the day ($S_0$, $P_0$) are known. The CTP can be written as model (6) for $t = 1, \ldots, T - 1$ and model (7) for $t = T$.
The objective function (6a) maximizes the P&L in the local currency. The function adds the market value of any open positions in the final period. Constraints (6b) are similar to the ending inventory equation in supply chain management: the position at the end of a period is the position at the end of the previous period plus new purchases (minus sales) plus exogenous flows. The difference between physical inventory and accounting inventory is that account $P_t$ can be negative. This is true, for example, of incoming flows in export businesses, which can be hedged with short positions. Equations (6c) update the exchange rate according to its stochastic process, with $\xi_t$ a function of $S_{t-1}$. Finally, the constraints in (6d) describe the nature of the variables.
Note that function (6a) does not include the income or expenses generated by the flows, which equal $\sum_{t=1}^{T} S_t f_t$. This term is the total P&L obtained using the "no-hedge" policy, that is, the solution obtained by setting $x^{\pm}_t = 0 \; \forall t$, which is feasible in the CTP.
The model described above is the generic version of a CTP model. Each practitioner can add customized requirements. For example, a common constraint might be to limit the amount held in account $P_t$, which can be added by imposing constraints of the type $lp_t \le P_t \le up_t$. Another option is to set a limit on the number of new positions taken. As we will see in the following section, depending on the dynamics of the FX prices in Eq. (6c), new stochastic and state variables may be required.
We are said to be in a purely speculative setting when there are no flows coming from business operations to be hedged, that is, when $f_t = 0 \; \forall t$. In the case where $f_t \ge 0 \; \forall t$ and no speculation is allowed, we omit variable $x^{+}_t$. This could be the case with exporting businesses that only want to hedge their future incomes. Analogously, for importing businesses looking to hedge their future expenses ($f_t \le 0 \; \forall t$), we omit variable $x^{-}_t$ if the CTP is used only for non-speculative purposes.
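To make the accounting concrete, the sketch below evaluates a given trading schedule on one price path. It is an illustration under assumptions and does not solve the CTP: purchases are paid at $S_t$, sales are received at $S_t(1 - c)$, the position follows the inventory-style balance described for (6b), and the final position is valued at $S_T$. The function name `evaluate_schedule` and the sample numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def evaluate_schedule(
    prices: Sequence[float],        # S_1..S_T
    buys: Sequence[float],          # x+_t >= 0
    sells: Sequence[float],         # x-_t >= 0
    flows: Sequence[float],         # f_t (negative = expense, positive = income)
    spread: float,                  # c
    initial_position: float = 0.0,  # P_0
) -> Tuple[float, List[float]]:
    """Return (P&L in domestic currency, positions P_1..P_T)."""
    if not (len(prices) == len(buys) == len(sells) == len(flows)):
        raise ValueError("all inputs must have one entry per period")
    pnl = 0.0
    position = initial_position
    positions = []
    for price, buy, sell, flow in zip(prices, buys, sells, flows):
        pnl += price * (1.0 - spread) * sell - price * buy  # cash from trades
        position = position + buy - sell + flow             # balance equation
        positions.append(position)
    pnl += prices[-1] * position  # open position valued at the closing price
    return pnl, positions


# Importer: demand of 100 units at T, covered by one purchase at t = 2.
hedged_pnl, hedged_pos = evaluate_schedule(
    [10.0, 9.8, 10.1], [0.0, 100.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, -100.0], 0.001
)
# Same importer without trading: the short position is valued at S_T.
no_hedge_pnl, _ = evaluate_schedule(
    [10.0, 9.8, 10.1], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, -100.0], 0.001
)
print(hedged_pnl, hedged_pos, no_hedge_pnl)
```

In the example, buying ahead of the outflow leaves a flat final position, whereas doing nothing leaves a short position that is settled at the closing price, which mirrors the "no-hedge" reference.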
The CTP also admits a complete hedge (i.e., a risk-free solution, RF hereafter), which is a natural way to completely hedge against currency volatility. The inclusion of derivative contracts is beyond the scope of this work, but forward contracts can be easily incorporated into the CTP. The price $K^s_t$ of a forward contract purchased at time $t$ and maturing at time $s$ can be divided into the spot price $S_t$ and forward points $fp^s_t$. Assume that all contracts mature at the end of the horizon $T$. Thus, the profit in (6a) can be replaced by an expression that decomposes into the original profit and the premium for using forward contracts, as in Eq. (8). Furthermore, if we consider $fp^T_t$ to be similar over time (as is the case for short-horizon settings), then the second component of Eq. (8) is nearly constant when added across the entire time span, and thus it will not change the optimal hedging policy. In other words, using forward contracts does not always provide better hedging solutions, although it does aid policy implementation.

FX price model
Empirical evidence suggests that finding a model that can fully characterize FX price dynamics is difficult, if not impossible. However, there are some facts or patterns to consider when selecting a suitable model. Several studies have found mean-reversion in real exchange rates (e.g., Jorion and Sweeney 1996; Lothian 1997; Caporale and Gil-Alana 2004). Strategies based on a moving average and the Relative Strength index, as discussed in Manahov et al. (2014) and Svoboda et al. (2020), are common in FX trading and implicitly assume mean-reversion behavior in order to work. The Geometric Brownian Motion (GBM) and its extensions are other processes commonly used when pricing currency options. Prices can be modeled as a random walk for short-term horizons. Hong et al. (2007) and Colombo and Pelagatti (2020) explained the difficulty of finding FX models that improve on such a process.
The SDDP requires that a discrete lattice represent the chosen price model. Figure 2 depicts the advantage of this structure over a scenario tree structure. The number of states that must be handled is reduced while maintaining the quality of the process representation.
Given all of the preceding considerations, we present a model that follows a Vasicek mean-reverting process and can transform into a GBM in the presence of an external random shock. Vasicek dynamics can be expressed as
$$dS_t = \kappa (S_\infty - S_t)\,dt + \sigma_v\,dW_t, \tag{9}$$
with $dW_t$ the increment of a Brownian process. $S_\infty$ is the long-run mean level and $\kappa$ calibrates the reversion speed toward $S_\infty$. To build the lattice, first notice that the solution to the Vasicek model in Eq. (9) can be written as:
$$S_t = S_0 e^{-\kappa t} + S_\infty \left(1 - e^{-\kappa t}\right) + \sigma_v \int_0^t e^{-\kappa (t-u)}\,dW_u. \tag{10}$$
From Eq. (10), we see that $S_t$ is a Gaussian process with the following mean and variance:
$$E[S_t] = S_0 e^{-\kappa t} + S_\infty \left(1 - e^{-\kappa t}\right), \qquad Var[S_t] = \frac{\sigma_v^2}{2\kappa}\left(1 - e^{-2\kappa t}\right). \tag{11}$$
With the formulas in (11), we can then construct the binomial lattice in (12).
Besides the mean-reverting structure, we add another dynamic that follows a positive or negative trend to account for specific market events that affect FX prices. To accomplish this, we can use the binomial lattice defined by Jarrow and Rudd (1983), which approximates the GBM and is given in (13). The sign of $\mu$ determines the direction of the trend.
To combine both dynamics, we add a new i.i.d. random process $\delta_t$ that can take three possible values: $\delta_t \in \{0, -1, 1\}$. If $\delta_t = 0$, then the price follows the dynamics in (12). If $\delta_t = -1$ $(1)$, then the price changes its dynamics to (13) with a negative (positive) drift. The probabilities of the values are denoted by $\pi_0$, $\pi_d$ and $\pi_u$, respectively. Since the shock that changes the dynamics is not considered common, $\pi_0 \gg \pi_d$ and $\pi_0 \gg \pi_u$. We also assume that there will be no more than one shock during the time period. That is, the process changes to a GBM only once and then remains unchanged until the horizon. To incorporate the latter assumption, we add a state variable $E_t \in \{0, -1, 1\}$ into the CTP. The day begins with price dynamics that are mean-reverting ($E_0 = 0$). As long as no shock occurs, the value of $E_t$ equals 0. If a shock occurs, $E_t$ changes to $-1$ or $1$, and then remains constant until the end of the day.
We add two constraints to the CTP. The first describes the evolution of $E_t$. The second defines the price $S_{t+\Delta t}$, which depends on two state variables ($E_t$ and $S_t$) and on $\xi_t$.
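The two building blocks of the price model can be sketched numerically. The code below computes the conditional mean and variance of a Vasicek process over a step of length `dt` (the standard closed-form results) and the up/down factors of the Jarrow and Rudd (1983) binomial approximation to a GBM. The two-point, equal-probability step used for the mean-reverting part is an assumption made for illustration (it matches the conditional mean and variance) and is not necessarily the lattice in (12); all function names and parameter values are hypothetical.

```python
from math import exp, sqrt
from typing import Tuple


def vasicek_moments(s: float, s_inf: float, kappa: float, sigma_v: float, dt: float) -> Tuple[float, float]:
    """Conditional mean and variance of the price after dt, given current price s."""
    decay = exp(-kappa * dt)
    mean = s * decay + s_inf * (1.0 - decay)
    variance = sigma_v ** 2 * (1.0 - exp(-2.0 * kappa * dt)) / (2.0 * kappa)
    return mean, variance


def mean_reverting_step(s: float, s_inf: float, kappa: float, sigma_v: float, dt: float) -> Tuple[float, float]:
    """Assumed two-point step: mean +/- one standard deviation, probability 1/2 each."""
    mean, variance = vasicek_moments(s, s_inf, kappa, sigma_v, dt)
    std = sqrt(variance)
    return mean + std, mean - std


def jarrow_rudd_factors(mu: float, sigma: float, dt: float) -> Tuple[float, float]:
    """Up and down multipliers of the Jarrow-Rudd lattice (probability 1/2 each)."""
    drift = (mu - 0.5 * sigma ** 2) * dt
    shock = sigma * sqrt(dt)
    return exp(drift + shock), exp(drift - shock)


# Hypothetical parameters: price above its long-run level, 20 steps per day.
dt = 1.0 / 20.0
up, down = mean_reverting_step(s=805.0, s_inf=800.0, kappa=5.0, sigma_v=8.0, dt=dt)
u, d = jarrow_rudd_factors(mu=-0.02, sigma=0.01, dt=dt)
print(up, down)          # both nodes are pulled toward 800
print(805.0 * u, 805.0 * d)  # nodes after a shock with a negative drift
```

The long-run variance implied by `vasicek_moments` is `sigma_v**2 / (2 * kappa)`, so a faster reversion speed narrows the lattice for a given volatility.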

Results with intraday trading
In this section, we present the CTP results in various contexts. Our research aims to demonstrate and validate the benefits of our decision support tool by referencing market data. We do not concentrate on the calibration of price dynamics or the profit value itself because each practitioner will have a unique data set when using the model.
We use Julia's sddp.jl package to implement the CTP in all settings (for more information, see Dowson and Kapelevich (2017)). As stated in the first section, the package enables us to solve the model without having to deal with the SDDP algorithm implementation. This means that the algorithm presented in the "SDDP method" section does not need to be coded for the CTP. The sddp.jl package uses the structure defined in models (1)-(2), which is the same structure used by the CTP. As a result, the model and input are added to the package in accordance with the formulations in (6)-(7). The model's solution includes a sampling procedure that can be used to define an allocation policy later on. Following some calibration, we found that sampling 10,000 scenarios was sufficient to represent future FX prices in all settings. Gurobi was the LP solver used in the SDDP procedure, and the experiments were carried out on a MacBook Pro with a Quad-Core Intel Core i5 processor and 16 GB RAM. The CPU time required to reach each solution displayed was less than 20 min.
We calibrate the price models for seven pairs using 15-min intraday prices. The first three currency pairs are from emerging economies (the Chilean peso (USDCLP), the Brazilian real (USDBRL), and the Turkish lira (USDTRY)), whereas the remaining four are from developed economies (the Australian dollar (AUDUSD), the British pound (GBPUSD), the euro (EURUSD), and the Japanese yen (USDJPY)). We only include trades made between 9 a.m. and 2 p.m., which is typically the busiest time in the market, particularly for currencies from developing economies. The estimates for each parameter are shown in Table 1. The values of $\kappa$ and $\sigma_v$ for the mean-reverting dynamics in (12) represent the mean results obtained from statistical calibrations on each day within a historical sample. Note that $S_\infty$ is not calibrated because we set $S_\infty = S_0$. To compare currency pairs, we track the long-term volatility $\sigma / \sqrt{2\kappa}$, with $\sigma := \sigma_v / S_0$. Figure 7 shows the calibration results in greater detail.

Setting 1: Non-speculative trading
We start by considering a company that needs to fulfill a demand $D$ (in foreign currency) by the end of the day, i.e., $f_T = -D$ and $f_t = 0 \; \forall t < T$. This setting is inspired by the credit payments made by an FX desk, which were discussed at the beginning of this article, but it also applies to any importing business that needs to cover a daily expense. Note that no speculation is allowed. Thus, variables $x^{-}_t$ can be removed from the CTP (which also makes the solutions insensitive to the bid/ask spread). If we want to repeat the same policy every day, we should end the day in the same position in which we started it, that is, $P_T = P_0 = 0$. Since there is no selling under this setting, we have the option to adapt the objective function to minimize costs, i.e., the expected total purchase cost $E\left[\sum_{t=1}^{T} S_t x^{+}_t\right]$.
Table 1. Estimates of parameters $\kappa$ and $\sigma_v$ defining the mean-reverting process in (12) are obtained from a sample containing 15-min intraday prices from 9 a.m. to 2 p.m. within the period January 2020 to December 2021. Estimates of parameters $\mu$ and $\sigma$ defining the GBM process of Eq. (13) are obtained from the sample of daily currency returns within the period January 2000 to December 2021. We determine the mean and volatility of returns above the 90th and below the 10th percentile of the sample, to capture the degree of the trends (drifts). $S_0$ is the spot price to be used at the beginning of every simulation. Estimates are shown on a daily scale. 1 pip = $10^{-4}$.
We build two policies, which differ in terms of risk aversion: risk neutral (RN) and risk averse (RA). Such policies can be defined by setting a specific value for $\beta$ in Eq. (5) (e.g., $\beta = 1$ for RN). As a reference, consider the RF policy, which purchases everything at the start of the day at a cost of $S_0$. Another reference is the "no-hedge" policy (NH hereafter). In this case, the total cost of NH equals $S_T D$. Besides RF and NH, we compare both policies to two benchmark policies. The first (B1) is the ex-post optimal policy, which buys at the day's lowest price. The second (B2) buys the same amount, $D/T$, every 15 min, thus buying at the average price of the day. The performance of every policy is measured using different statistics, applied to the total cost (in domestic currency) per unit of foreign currency. Since $P_T = 0$, $\sum_{t=1}^{T} x^{+}_t = D$. Thus, the total cost divided by $D$ is a weighted-average price at which every solution buys in each period, i.e., $\frac{1}{D} \sum_{t=1}^{T} S_t x^{+}_t$.
For B1, B2, RF and NH, the total costs divided by $D$ equal $\min_t S_t$, $\frac{1}{T} \sum_{t=1}^{T} S_t$, $S_0$ and $S_T$, respectively. Because $S_\infty = S_0$, the expected price under the mean-reverting dynamics stays at $S_0$ unless a shock occurs. Hence, the shock bias toward changing to a GBM with a positive or negative trend determines whether $E_0(S_t)$ is greater or less than $S_0$. That is, if $\pi_u \ge (\le)\, \pi_d$, then $E_0(S_t) \ge (\le)\, S_0$. This implies that benchmark B2 is dominated by the RF policy when $\pi_u \ge \pi_d$.
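The per-unit costs of the reference policies can be computed directly from a price path, as in the sketch below. It assumes an importer who must buy $D$ units and a path given as $S_0$ followed by $S_1, \ldots, S_T$. Allowing B1 to buy at $S_0$ as well is an assumption, chosen to be in line with the remark that B1 never buys above $S_0$. The function names and the sample path are hypothetical.

```python
from typing import Dict, Sequence


def weighted_average_price(prices: Sequence[float], purchases: Sequence[float]) -> float:
    """Total cost divided by the total amount bought (one entry per period)."""
    total = sum(purchases)
    if total <= 0:
        raise ValueError("purchases must add up to a positive amount")
    return sum(p * x for p, x in zip(prices, purchases)) / total


def reference_costs(path: Sequence[float]) -> Dict[str, float]:
    """Per-unit cost of each reference policy on a path [S_0, S_1, ..., S_T]."""
    s0, later = path[0], path[1:]
    return {
        "B1": min(path),                # ex-post optimal: lowest price of the day
        "B2": sum(later) / len(later),  # equal purchases D/T in every period
        "RF": s0,                       # buy everything at the start
        "NH": later[-1],                # buy everything at the closing price
    }


path = [800.0, 803.0, 798.0, 795.0, 801.0]  # hypothetical S_0..S_4
costs = reference_costs(path)
# A hypothetical policy that buys 40 units at t = 2 and 60 units at t = 3.
policy = weighted_average_price(path[1:], [0.0, 40.0, 60.0, 0.0])
print(costs, policy)
```

By construction, no policy can achieve a per-unit cost below B1 on the same path, which is why B1 serves as the ex-post bound in the comparisons.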
The performance of the CTP solutions and the benchmarks is shown in Table 2, with shock probabilities of $(\pi_u, \pi_d) = (1.25\%, 3.75\%)$. The performance for the case $(\pi_u, \pi_d) = (3.75\%, 1.25\%)$ is shown in Table 8. Some of the findings are as follows:

1. In terms of average costs, there is a small gap between the RF and the ex-post optimal policy (B1). For example, the average cost reduction is 0.8% for USDCLP, 1.3% for USDBRL, 1.8% for USDTRY, 1.0% for AUDUSD, 0.7% for GBPUSD, and 0.6% for EURUSD and USDJPY. The gap is even smaller for the case where $(\pi_u, \pi_d) = (3.75\%, 1.25\%)$.
2. CTP solutions can reduce average costs to below $S_0$, making them competitive with RF. When daily volumes are high (as in the case of a bank's FX desk), the difference between RN and RF produces significant long-term savings, particularly for currencies from emerging economies. Consider the results with a daily volume of one million dollars. For USDTRY, the savings obtained with RN (relative to RF) are 1 × (1 − 9.8758/10) × 250 days = 3.11 million USD a year.
3. At a higher average cost, we can reduce the risk of RN by using RA policies in each currency (see the VaR 99% and CVaR 99% measures). The option of using RA is especially desirable when $\pi_u > \pi_d$ (see Table 8). The degree of risk reduction can be tuned through the chosen value of $\beta$. Note that we could reach RF if we set $\beta = 0$.
4. Compared to B2, RN and RA both reduce the average costs as well as the risk of buying at high prices. For example, the CVaR 99% reduction made by RA, relative to B2, is 1 − 804.7/810.5 = 0.7% for USDCLP, 1.4% for USDBRL, 1.7% for USDTRY, 0.9% for AUDUSD, 0.5% for GBPUSD and EURUSD, and 0.9% for USDJPY.
5. NH allows us to quantify the consequences of not hedging at all. When we compare CTP solutions to NH, the risk reduction and average savings are significant.
6. Volatility as a risk indicator may be deceptive. In many situations, B1 can have one of the highest volatilities, even though we know it never buys above $S_0$. In such a strategy, the volatility is produced by scenarios that allow one to buy at a very low price (e.g., during a downward trend), which is the inverse of what we consider risk.
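The arithmetic behind findings 2 and 4 can be reproduced in a few lines. This is a minimal sketch using only the figures quoted in the text; the function names are hypothetical, and the USDTRY reference cost of 10 is read off the formula in finding 2.

```python
def annual_savings(daily_volume, avg_cost_policy, avg_cost_reference, trading_days=250):
    """Yearly savings, in the units of daily_volume, from paying avg_cost_policy
    instead of avg_cost_reference per unit of foreign currency bought."""
    return daily_volume * (1.0 - avg_cost_policy / avg_cost_reference) * trading_days


def relative_reduction(value_policy, value_benchmark):
    """Relative reduction of a cost or risk measure versus a benchmark."""
    return 1.0 - value_policy / value_benchmark


# USDTRY, RN versus RF, daily volume of 1 million USD (figures from finding 2).
usdtry_savings = annual_savings(1.0, 9.8758, 10.0)      # in million USD per year
# USDCLP, CVaR 99% of RA versus B2 (figures from finding 4).
usdclp_cvar_cut = relative_reduction(804.7, 810.5)

print(f"USDTRY annual savings: {usdtry_savings:.3f} million USD")
print(f"USDCLP CVaR 99% reduction: {100 * usdclp_cvar_cut:.2f}%")
```

The first number agrees with the 3.11 million USD quoted in the text up to rounding of the table entries.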
In our model, currencies are characterized by the values of the parameters seen in Table 1: $\kappa$ and $\sigma$ for the mean-reverting process, and $\mu$ and $\sigma$ for the GBM shocks. Thus, the similarity (difference) in performance between the currency pairs is also determined by the similarity (difference) in these parameters. This explains why the results for EURUSD, GBPUSD, and USDJPY are so similar. According to the sensitivity analysis based on USDCLP in Table 9, an increase (decrease) in $\sigma/\sqrt{2\kappa}$ produces an increase (decrease) in the risk of every policy, except for B1. The same table shows that increasing the GBM drift $\mu$ decreases the average costs of every policy. It also raises the risk in the benchmark B2 and NH solutions, but not always in CTP solutions. Table 10 demonstrates that the risk of each policy (except B1) is sensitive to changes in the GBM volatility $\sigma$.
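The quantity $\sigma/\sqrt{2\kappa}$ is referred to elsewhere in the paper as the long-term volatility of the mean-reverting process. The short derivation below shows where it comes from, assuming the mean-reverting dynamics take the standard Ornstein-Uhlenbeck form; the exact specification in the paper's equation (12), for instance whether it is written for prices or log-prices, is not reproduced here.

```latex
% Assumed mean-reverting dynamics (standard Ornstein-Uhlenbeck form)
\begin{align*}
  dS_t &= \kappa\,(S_\infty - S_t)\,dt + \sigma\,dW_t, \\
  \operatorname{Var}(S_t \mid S_0) &= \frac{\sigma^2}{2\kappa}\left(1 - e^{-2\kappa t}\right)
    \;\longrightarrow\; \frac{\sigma^2}{2\kappa} \quad (t \to \infty), \\
  \text{long-term volatility} &= \sqrt{\frac{\sigma^2}{2\kappa}} = \frac{\sigma}{\sqrt{2\kappa}}.
\end{align*}
```

A larger $\sigma$ or a slower mean reversion (smaller $\kappa$) therefore widens the stationary band around $S_\infty$, which matches the direction of the risk sensitivity described above.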
The sensitivity analysis discussed above helps to explain why CTP solutions generally offer greater potential savings in currencies from emerging economies, but with greater risk exposure. In those currencies, we have higher long-term volatility in the mean-reverting process and/or shocks of bigger magnitudes (i.e., higher values of $\mu$ and $\sigma$). We can also include the AUDUSD in this group because it is a commodity currency that performs similarly to the other commodity currencies (CLP and BRL).

Figure 3 depicts the RN's decisions in seven representative scenarios chosen at random for illustration purposes.⁹ One interesting finding across all 10,000 scenarios is that purchases are made sporadically throughout the day, as opposed to benchmark B2. The timing of these purchases is what distinguishes the scenarios. Clearly, a shift in FX dynamics (from mean-reverting to GBM) influences future decisions. In the first two scenarios (black), the price always follows a mean-reverting process. In these scenarios, the RN does not begin buying until 10:15, and then only if prices fall below a certain level. If this does not happen, the first purchase can be postponed (until 12:30 in scenario 2). Prices in scenarios 3 and 4 (green) change to follow a GBM with a downward trend at some point during the day. The RN policy, as expected, waits until the last period to purchase what is still required to meet demand (e.g., scenario 3). In scenario 4, the shock occurs late in the day. Thus, purchases are made in accordance with the policy implemented in scenarios 1 and 2. The price process changes to a GBM with an upward trend in scenarios 5 and 6 (blue). The RN begins buying immediately after the shock (9:30 in scenario 5). In scenario 6, the shock occurs later. As a result, purchases are made in accordance with the policy implemented in scenarios 1 and 2. Finally, scenario 7 demonstrates how the RN policy can result in an expensive solution. In this instance, the price dynamic shifts at 10:30, and prices are expected to fall thereafter. As a result, the RN waits until late in the day to purchase. However, the price keeps rising and never falls. The RA policy buys earlier than the RN policy in every scenario, even though the best time to buy is late in the day on average (e.g., the case where prices follow a GBM with a negative trend). These choices avoid situations like the one in scenario 7.¹⁰

We should point out that we also tested the CTP solutions when FX prices follow a pure random walk, which is possible with our model defined in the "FX price model" section by setting $\pi_u = 1$ (or $\pi_d = 1$) and $\mu = 0$. In other words, we remove mean-reversion and possible trends from the price process. As expected, these tests show that the average cost of every policy (except B1) equals $S_0$, because the unconditional expectation is $E(S_t) = S_0$ in a random walk. Since the RF policy costs exactly $S_0$ with certainty, it outperforms all other policies except B1. B1 saved 0.4% for USDCLP, 0.8% for USDBRL, 2.3% for USDTRY, 0.8% for AUDUSD, 0.4% for GBPUSD, 0.2% for EURUSD, and 0.4% for USDJPY. The latter gaps are generally smaller than the reductions seen in the Setting 1 results, indicating that there is little room for improvement over the RF policy results.

Setting 2: Speculative trading
With the same price dynamics, we now consider purely speculative trading within bounds, i.e., $-L \le P_t \le L$. Pure trading profits (P&L) in domestic currency are divided by the volume $L$ to determine performance. We define benchmark S1 as the strategy that buys $L$ at the day's lowest price and sells $L$ at the day's highest price. S1 is not always ex-post optimal because there may be multiple opportunities to buy low and sell high throughout the day. However, it remains highly competitive and would be ideal to implement if we possessed such predictive abilities.

Table 3 displays the performance of the CTP solutions. There are some interesting findings that may differ from those seen in the previous non-speculative setting.
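To see why S1 is not always ex-post optimal, it helps to compare it with a perfect-foresight bound on a toy path. The sketch below is illustrative only: it ignores the bid/ask spread, assumes the position starts at zero and that any open position is valued at the last price, and uses made-up prices. Both function names are hypothetical.

```python
def s1_profit(prices, L=1.0):
    """Benchmark S1: trade L at the day's lowest price and L at the day's highest.
    Whether the low comes first (long) or the high comes first (short),
    the profit is L times the daily range."""
    return L * (max(prices) - min(prices))


def foresight_bound(prices, L=1.0):
    """Perfect-foresight profit when the position can be anywhere in [-L, L]:
    hold +L before every rise and -L before every fall."""
    return L * sum(abs(b - a) for a, b in zip(prices, prices[1:]))


toy_path = [10.0, 9.0, 11.0, 8.0, 12.0]   # made-up intraday prices
print("S1 profit:", s1_profit(toy_path))
print("perfect-foresight bound:", foresight_bound(toy_path))
```

On a path with several swings the bound exceeds the daily range captured by S1, while on a monotone path the two coincide.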
1. CTP performance is the same for trends of the same magnitude. This is not hard to explain: if we have an optimal strategy with $\pi_u - \pi_d = a$, then the optimal policy with $\pi_d - \pi_u = a$ is to do the opposite trade. Thus, the performance in Table 3, which is obtained with $(\pi_u, \pi_d) = (1.25\%, 3.75\%)$, is identical to the case $(\pi_u, \pi_d) = (3.75\%, 1.25\%)$.
2. In terms of risk, RF is not the best policy. Despite the fact that the VaR and CVaR of this policy are both zero, S1 manages to generate positive profits even in the worst-case scenarios (a bad scenario for this policy would be a flat FX rate). As a result, its VaR and CVaR are negative.
3. As expected, the average profit of the RN policy is lower than the profit of S1 (the gap is 1 − 7.1/9.6 = 26% for USDCLP, 27% for USDBRL, 40% for USDTRY, 30% for AUDUSD, 29% for GBPUSD, 21% for EURUSD, and 20% for USDJPY), and in the worst-case scenarios, it results in losses. However, the RN policy still generates significant daily average profits for each currency pair. For USDCLP, for example, we can make an average profit of 0.9 cents for every USD invested. When the exposure is 1 million USD, this equates to 0.9% × 1 million = 9,000 USD per day. Looking at the CVaR 1%, we would have a daily average loss of 0.6% × 1 million = 6,000 USD in the worst outcomes.
4. We can effectively reduce extreme losses with RA policies, at the expense of reducing average profits.

Table 3 Performance results of CTP solutions (RN and RA) versus S1, when $(\pi_u, \pi_d) = (1.25\%, 3.75\%)$. The performance is measured based on speculative total profits (in domestic currency) divided by the bound $L = 1$. The numbers in parenthesis are the P&L divided by $S_0$, that is, the P&L in terms of the foreign currency (US dollar cents for USDCLP, USDBRL, USDTRY, USDJPY, pennies for GBPUSD, euro cents for EURUSD, Australian dollar cents for AUDUSD). The bid/ask spread equals 5 pips for every currency
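Finding 2 relies on a sign convention in which VaR and CVaR are measured on losses, so a policy that is profitable even in its worst scenarios has negative VaR and CVaR. The sketch below shows one common empirical estimator under that convention; it is an assumption for illustration, since the paper's exact estimator is not given in this section, and the P&L samples are made up.

```python
import numpy as np


def var_cvar(pnl, level=0.99):
    """Empirical VaR and CVaR of the loss distribution (loss = -P&L).
    VaR is the `level` quantile of losses; CVaR is the mean loss at or beyond it."""
    losses = -np.asarray(pnl, dtype=float)
    var = np.quantile(losses, level)
    cvar = losses[losses >= var].mean()
    return var, cvar


flat_pnl = np.zeros(1000)                          # RF-like: no trading, P&L always zero
always_positive_pnl = np.linspace(0.5, 10.0, 1000)  # S1-like: profitable in every scenario

print("RF-like VaR, CVaR:", var_cvar(flat_pnl))
print("S1-like VaR, CVaR:", var_cvar(always_positive_pnl))
```

With this convention the flat policy has zero VaR and CVaR, while the always-profitable one has negative values for both, as described for S1.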
The difference in performance between currency pairs is due to the FX process calibration. We can make more money in emerging-market currencies (plus the AUDUSD), but at a higher risk. The currencies with the highest values of $\sigma/\sqrt{2\kappa}$ come from this group, and according to the sensitivity analysis in Table 11, an increase in $\sigma/\sqrt{2\kappa}$ produces better trading opportunities (to buy at lower prices and sell at higher prices), but with higher risk. Currencies with the highest drift and volatility in the shocks include the BRL, TRY, and AUD. Table 11 shows that increasing these parameters results in the greatest increase in average profits when using the CTP.

Figure 4 depicts the RN solution's decisions in seven scenarios (different from those chosen for Fig. 3).¹¹ The price always follows a mean-reverting process in the first two scenarios (black). RN trades when the price reaches certain thresholds, which change over time. It buys when the price crosses a lower threshold (generally below $S_0$) and sells when the price crosses an upper threshold (generally above $S_0$). In scenarios 3 and 4 (green), the price process changes to a GBM with a positive drift. In both scenarios, there is a short net position before the change. When the shock occurs, the RN goes long aggressively, to the maximum allowed, and holds this position to the horizon, assuming that the price will rise in the future. In scenarios 5 and 6 (blue), the price changes to follow a GBM with a negative drift. In both scenarios, the policy sells immediately after the shock and maintains a short position until the horizon because the price is expected to fall. Finally, in scenario 7, the RN policy performs poorly because the price moves in the opposite direction to the expected trend. When the shock occurs, RN maintains a short position. The price is expected to fall as a result of the shock, so RN waits to buy at a lower price. The latter never occurs because the price rises later.
We should point out that we also tested the CTP performance when FX prices followed a pure random walk. The CTP solutions were unable to generate profits, whereas the S1 solution generated lower profits than shown in Table 3 for the same level of risk.

Results with daily trading
We now consider a setting where trading can be done once a day for an entire month. As in the previous experiments, the number of trading periods (20) determines the size of the CTP, so the solutions are obtained in computing times of the same order. This configuration is motivated by businesses whose activities are subject to FX uncertainty but lack the turnover and regularity of daily requirements seen on a bank's FX desk. As a result, they are not required to trade on an intraday basis.
We divided the data into two parts based on the 75th percentile of the monthly TPU from January 2000 to December 2021.¹² We calibrate the price process using daily data, using a procedure similar to that used for intraday trading. The values of $\kappa$ and $\sigma$ for the mean-reverting dynamics in (12) are the means of the statistical calibrations performed each month from January 2000 to December 2021. $S_\infty$ is equal to the spot price seen at the beginning of each month. For more details on the results of the calibration, see Fig. 8. Table 4 shows the estimates of each parameter for the two sets of data. Notably, when the TPU rises above its historical 75th percentile, currency volatility in emerging economies rises significantly, in contrast to G-10 currencies.
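The split into normal-TPU and high-TPU months can be sketched as follows. This is a minimal illustration assuming the monthly TPU index is available as a plain array; the values are made up, and whether the paper treats months exactly at the threshold as normal or high is not stated, so the strict inequality is an assumption.

```python
import numpy as np


def split_by_tpu(monthly_tpu, q=75):
    """Return the q-th percentile threshold and boolean masks for
    normal-TPU months (at or below it) and high-TPU months (above it)."""
    tpu = np.asarray(monthly_tpu, dtype=float)
    threshold = np.percentile(tpu, q)
    high = tpu > threshold
    return threshold, ~high, high


tpu_index = np.array([40, 55, 60, 62, 70, 85, 120, 300], dtype=float)  # made-up values
threshold, normal_months, high_months = split_by_tpu(tpu_index)
print("threshold:", threshold)
print("high-TPU months:", np.flatnonzero(high_months))
```

The two masks can then be used to select the months whose monthly calibrations of $\kappa$ and $\sigma$ are averaged for each regime.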
Table 4 Estimates of parameters κ and σ v defining the mean-reverting process in (12) (columns: Normal TPU, High TPU)

To illustrate the benefits of our tool, consider the following income structure at day $t$: $f_5 = 0.1D$, $f_{10} = 0.2D$, $f_{15} = 0.6D$, $f_{20} = 0.1D$, and $f_t = 0$ for every other $t$. This pattern could be a simplified version of the weekly sales seen in businesses during December, with a peak demand around Christmas time. We allow speculation, but within limits: $-D \le P_t \le D$.
We modify the benchmark S1 from Setting 2 in the previous section. S1 now buys at the lowest possible price and sells at the highest possible price every week. To compare the P&L of S1 with that of the CTP solution, we subtract from the latter the revenues generated by incoming flows, which is the P&L computed for the NH policy. Under this configuration, NH's P&L equals $D\,[0.1 S_5 + 0.2 S_{10} + 0.6 S_{15} + 0.1 S_{20}]$. We set equal shock probabilities, $\pi_u = \pi_d = 2.5\%$, and no bid/ask spread because it is irrelevant at this trading frequency.
Besides the P&L, another metric to compare the performance of the CTP with the benchmarks B1, B2, RF and NH could be the weighted-average price at which each policy sells. For the CTP solution, that is

We modify benchmarks B1 and B2 used in Setting 1 from the previous section. B1 hedges at the best (highest) monthly price, so the weighted-average selling price is $\max_t S_t$. B2 sells positions in proportion to incoming flows. Let $x_{B2,t}$ denote the amount sold under policy B2 at date $t$. Then

Applying the trades in (16), the weighted-average selling price of B2 is

For RF and NH, the weighted-average selling price is $S_0$ and $[0.1 S_5 + 0.2 S_{10} + 0.6 S_{15} + 0.1 S_{20}]$, respectively.
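The benchmark quantities stated explicitly in the text (NH, RF and B1) can be computed as below. The generic `weighted_average_price` helper, which divides the value of the sales by the total amount sold, is an assumption made for illustration, since the paper's own formulas for the CTP and B2 are not reproduced in this section; the price path is made up.

```python
FLOW_SHARES = {5: 0.1, 10: 0.2, 15: 0.6, 20: 0.1}   # f_t / D, from the income structure


def nh_pnl(prices, D=1.0):
    """P&L of NH: every incoming flow f_t is converted at the spot price S_t.
    prices[t] is the price on day t, with prices[0] = S_0."""
    return D * sum(share * prices[t] for t, share in FLOW_SHARES.items())


def weighted_average_price(amounts_sold, prices):
    """Assumed definition: value of all sales divided by the total amount sold."""
    total = sum(amounts_sold.values())
    return sum(x * prices[t] for t, x in amounts_sold.items()) / total


month = [800.0 + t for t in range(21)]     # made-up daily prices, S_0 = 800
nh_price = weighted_average_price(FLOW_SHARES, month)   # NH sells the flows as received
rf_price = month[0]                        # RF: everything at S_0
b1_price = max(month)                      # B1: best (highest) monthly price

print("NH P&L per unit D:", nh_pnl(month))
print("NH, RF, B1 selling prices:", nh_price, rf_price, b1_price)
```

With $D = 1$ the NH P&L and the NH weighted-average selling price coincide, as the expression in the text implies.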

Results under normal TPU
The performance of the CTP solutions and the benchmarks is shown in Tables 5, 6. From the first table, we can see that:

1. CTP solutions, like intraday trading in Setting 2, can generate profits from speculation. Furthermore, when comparing average profits for currencies from emerging economies (CLP, BRL, TRY), RN solutions outperform S1. Again, in worst-case scenarios, RA policies could be used to reduce losses.
2. The highest ratios between average profit and the CVaR measure are obtained by the USDTRY (0.7087/0.0852 = 8.3), followed by the USDBRL (3.6).
3. Adjusting for time differences, the profit-risk compromise is lower than the same compromise obtained in intraday trading. In the USDCLP case, for example, with a 1 million USD exposure, we can achieve a monthly average profit of 3.7% × 1 million = 37,000 USD and a monthly average loss of 2.1% × 1 million = 21,000 USD in the worst-case scenarios. We obtained an average daily profit of 9,000 USD for the same volume in the intraday setting, which equals 180,000 USD per month. Extrapolating the average loss of 6,000 USD obtained with intraday trading in the worst-case scenarios gives a monthly exposure of roughly √20 × 6,000 = 26,800 USD. Evidently, the ratio 180,000/26,800 is much higher than the 37,200/21,000 obtained when trading once a day. Something similar occurs with the other currency pairs.
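The time adjustment in finding 3 scales the average profit linearly with the number of trading days and the worst-case loss with its square root. The sketch below reproduces that comparison with the USDCLP figures quoted in the text; the square-root rule is a rough heuristic that treats daily outcomes as independent, and the 37,000 USD monthly profit is used for the once-a-day case.

```python
import math

DAYS_PER_MONTH = 20

# USDCLP figures quoted in the text, for a 1 million USD exposure.
intraday_profit_per_day = 9_000.0
intraday_worst_loss_per_day = 6_000.0
daily_trading_profit_per_month = 37_000.0
daily_trading_worst_loss_per_month = 21_000.0

# Profit adds up linearly; the worst-case loss is extrapolated with sqrt(time).
intraday_profit_per_month = DAYS_PER_MONTH * intraday_profit_per_day
intraday_loss_per_month = math.sqrt(DAYS_PER_MONTH) * intraday_worst_loss_per_day

ratio_intraday = intraday_profit_per_month / intraday_loss_per_month
ratio_daily = daily_trading_profit_per_month / daily_trading_worst_loss_per_month

print(f"extrapolated intraday loss: {intraday_loss_per_month:,.0f} USD")
print(f"profit/risk, intraday: {ratio_intraday:.2f}; once a day: {ratio_daily:.2f}")
```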
The parameters defining the FX processes in Table 4 could explain the latter results. First, from Table 11, we know that an increase (decrease) in $\mu$ produces an increase (decrease) in the average profits. The currencies with the highest values correspond to the emerging economies. There is also a difference with respect to the values of $\mu$ obtained from the intraday process. For example, $\mu$ equals 10 pips in the USDCLP based on intraday prices, which equals 20 × 10 = 200 pips when extrapolated to a daily price process. This is significantly greater than the 45-pip drift seen in Table 4. Note that the latter occurs with every currency. Second, if we extrapolate the long-term volatility $\sigma/\sqrt{2\kappa}$ of the intraday process to a daily scale, we get higher (lower) values than those in Table 4 for the currencies from emerging (developed) economies. For example, extrapolating the long-term volatility of USDTRY gives $\sqrt{20} \times \sigma/\sqrt{2\kappa} = \sqrt{20} \times 0.17\% = 0.76\%$, which is bigger than the long-term volatility of 0.48% in Table 4. As we know from Table 11, the risk of each policy is directly related to $\sigma/\sqrt{2\kappa}$.

Table 5 P&L of CTP solutions (RN and RA) versus benchmark policy S1
The numbers in parenthesis are the P&L divided by $S_0$, that is, the P&L in terms of the foreign currency (US dollar cents for USDCLP, USDBRL, USDTRY, USDJPY, pennies for GBPUSD, euro cents for EURUSD, Australian dollar cents for AUDUSD). We set $\pi_u = \pi_d = 2.5\%$ and $D = L = 1$

The first thing to notice about the performance in Table 6 is that the average selling price for all policies except B1 is slightly above $S_0$. For example, the RN is at most 0.6% above $S_0$ (in the case of USDTRY). The average savings in the intraday setting 4.1 are higher because the time span (one day) is 20 times shorter (than 1 month). One of the main reasons for this is that the CTP maximizes trading revenues rather than the weighted-average price. This is also why RA policies can have a worse AP 1% in this table. Good trading rules involve selling above the buying prices, which does not necessarily lead to selling above $S_0$. Still, the CTP solutions outperform B2 and the NH strategy in terms of average selling price. CTP solutions manage to perform better in the worst scenarios (AP 1%) too.

Table 6 Weighted-average selling price of CTP solutions (RN and RA) versus benchmark policies B1, B2 and NH. P 1% is the first percentile of that measure and AP 1% is the average below that first percentile. The numbers in parenthesis are the rates of increase (reduction) (%) relative to $S_0$. We set $\pi_u = \pi_d = 2.5\%$ and $D = L = 1$

Results under high TPU

Table 7 displays the performance of the CTP solutions and benchmarks. The average profits of the CTP solution are comparable to the average profits under a normal TPU level. As a result, we continue to earn more in emerging-market currencies. However, these profits are now lower than S1's profits (except for EURUSD and USDJPY).
In comparison to the results obtained with a normal TPU level, the risk of RN policies increases (decreases) in currencies from emerging (developed) economies. This is to be expected, given that we know from Table 4 that moving to a high-TPU period increases (decreases) their long-term volatility. We have not shown the weighted-average selling price because it is very similar to the results obtained with normal TPU. This is to be expected, as it is not a measure directly optimized by the CTP.

The effect of the TPU level on our results is consistent with the majority of Kido's (2016) findings. That paper demonstrates that the returns of high-yielding currencies, such as the BRL, have a negative correlation with the US EPU index, whereas the Japanese yen has a positive correlation over time. In other words, when EPU falls (rises), high-yield currencies appreciate (depreciate), whereas the JPY depreciates (appreciates). Our previous findings show that when TPU levels are high, high-yield currencies such as the CLP, BRL, and TRY experience increased volatility, which coincides with their depreciation, as illustrated in Fig. 5. The figure also depicts the yen's appreciation during periods of high TPU. Thus, TPU, in addition to other indicators proposed in previous research, such as the VIX proposed by Brunnermeier et al. (2008) or the EPU proposed by Kido (2016), could be a plausible indicator for detecting carry trade crashes.

The numbers in parenthesis are the P&L divided by $S_0$, that is, the P&L in terms of the foreign currency (US dollar cents for USDCLP, USDBRL, USDTRY, USDJPY, pennies for GBPUSD, euro cents for EURUSD, Australian dollar cents for AUDUSD). We set $\pi_u = \pi_d = 2.5\%$ and $D = L = 1$

Figure 6 depicts the RN's decisions in five representative scenarios. In the first two scenarios (black), the price follows the mean-reverting process for the entire month. RN generally buys when the price is below or equal to $S_0$, and sells if the price is slightly above $S_0$.
The margin may be low, but profits may be high if those opportunities continue to occur throughout the month, as in scenario 1. The price process changes to the GBM with a positive drift in scenario 3 (green). As expected, the policy buys as much as it can following the change and maintains a long position until the horizon. Because the limit for holding more long positions has been reached, the income from operations is sold at the same time it is received. In scenario 4, the price process shifts to a GBM with negative drift. Similar to scenario 3, the policy now sells as much as possible and remains short until the horizon. Because the price is expected to fall, this is the best decision in a risk-neutral setting. The received income is also sold at the time of receipt. The reason for doing so, however, is different. We must sell as soon as possible in this case because the price is expected to fall. Scenario 5 depicts the RN producing one of its worst outcomes, which could occur when the price follows a mean-reverting process for the majority of the month. The RN holds a short position in this case and waits for the price to revert (decrease) as expected. The price starts increasing instead. In this scenario, the income received is also sold at the same time it is received, because the price is above $S_0$, which compensates for the loss. Finally, the policy buys when the price changes to a GBM with a positive trend at $t = 17$, and holds a long net position from that day. However, the price does not rise as anticipated (it actually suffers a slight decrease).

Conclusion and future development
This work provides a financial engineering tool for FX trading that differs significantly from the methodologies used in current FX strategies. Our methodology is based on techniques from the field of operations research, such as MSP and SDDP, which we use to create a disciplined and forward-looking trading schedule. As demonstrated by the results, the CTP model provides competitive FX trading solutions in comparison to various benchmarks, in both non-speculative and speculative environments, and across various time frequencies and currencies. However, the results obtained with random walk prices show that CTP solutions are competitive only when there is mean-reversion and/or a trend in the shocks. The tool is adaptable in terms of incorporating specific trading rules and requirements. It also enables the user to modify the FX price dynamics, risk tolerance, and time frequency, among other things, by simply changing the values of certain parameters. Furthermore, because currencies are defined by a small set of parameters, we can develop trading policies and compute their performance for various combinations of these parameters in advance. As a result, we do not have to create the policy every time we calibrate a new FX process; all that remains is to assign the calibration results to the closest prebuilt combination.
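Assigning a new calibration to the closest prebuilt combination can be done with a simple nearest-neighbour lookup. The sketch below is hypothetical throughout: the library of prebuilt combinations, the policy labels, the scaling constants and the choice of a scaled Euclidean distance are illustrative assumptions, not part of the paper. Only the middle entry reuses the USDCLP base case (0.31%, 10 pips, 12 pips) reported in the appendix.

```python
import math

# Hypothetical library: (long-term volatility in %, GBM drift in pips, GBM volatility in pips)
PREBUILT_POLICIES = {
    (0.20, 5.0, 8.0): "policy_A",
    (0.31, 10.0, 12.0): "policy_B",   # USDCLP base case from the appendix
    (0.45, 20.0, 25.0): "policy_C",
}
# Hypothetical scales so that the three parameters are comparable in the distance.
SCALES = (0.10, 5.0, 5.0)


def closest_prebuilt(calibrated):
    """Return the prebuilt parameter combination nearest to the calibrated one,
    together with its policy label, using a scaled Euclidean distance."""
    def distance(key):
        return math.sqrt(sum(((c - k) / s) ** 2
                             for c, k, s in zip(calibrated, key, SCALES)))
    best = min(PREBUILT_POLICIES, key=distance)
    return best, PREBUILT_POLICIES[best]


new_calibration = (0.29, 11.0, 13.0)   # made-up calibration result
print(closest_prebuilt(new_calibration))
```

How fine the grid of prebuilt combinations needs to be would depend on how sensitive the policies are to each parameter, which the sensitivity tables in the appendix could help to judge.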
Clearly, there are numerous ways to expand on this research. One of them would be to allow multiple currencies in the CTP model, that is, to have incoming (outgoing) flows in different foreign currencies. Fortunately, the CTP structure would not change because we would simply add the same variables and requirements for each currency. The main challenge would be developing a multidimensional version of the lattice to describe the multivariate distribution of FX prices, which would increase the state space and could make the problem intractable in terms of CPU time. Reus and Prado (2021) used SDDP.jl to implement a multi-asset GBM process to solve an MSP problem.
Another extension would be to improve the pricing model by including a time-varying process for volatility, particularly in intraday environments. One candidate is Hansen et al.'s (2012) GARCH model with realized measures. The difficulty would arise when attempting to implement the model in the structure required by the SDDP method. Florescu and Viens (2008) may be a good reference point when including stochastic volatility in binomial trees. A third enhancement would be to allow stochastic flows in the CTP, which may be more suitable for certain applications. SDDP.jl allows for the inclusion of uncertainty in constraints in most settings. The difficulty arises when comparing the P&L across scenarios. Finally, it would be extremely useful if a practitioner could implement an automatic procedure that could take the SDDP solution and generate a ready-to-use policy, delivering a list of simplified instructions on how to trade in representative scenarios.

See Figs. 7, 8, 9, 10 and Table 8.

Sensitivity analysis on setting 1: non-speculative trading

Appendix
See Tables 9, 10.

Table 9 (Up): Sensitivity analysis of the long-term volatility $\sigma/\sqrt{2\kappa}$ in the mean-reverting intraday price process for the USDCLP. (Down): Sensitivity analysis of $\mu$ in the GBM intraday price process for the USDCLP. The rest of the parameters are kept unchanged and we set $(\pi_u, \pi_d) = (1.25\%, 3.75\%)$. The numbers in parenthesis are the rates of increase (reduction) (%) relative to $S_0 = 800$. Recall that the base case has the following values: $(\sigma/\sqrt{2\kappa}, \mu, \sigma)$ = (0.31%, 10 pips, 12 pips)

Table 10 Sensitivity analysis of $\sigma$ in the GBM intraday price process for the USDCLP. The rest of the parameters are kept unchanged and we set $(\pi_u, \pi_d) = (1.25\%, 3.75\%)$. The numbers in parenthesis are the rates of increase (reduction) (%) relative to $S_0 = 800$. Recall that the base case has the following values: $(\sigma/\sqrt{2\kappa}, \mu, \sigma)$ = (0.31%, 10 pips, 12 pips)