This section briefly describes the three basic evolutionary learning techniques used in this study, namely CRO, PSO, and GA.

### Chemical reaction optimization

CRO is a meta-heuristic proposed by Lam and Li (Lam & Li, 2010), inspired by natural chemical reactions. It mimics the properties of natural chemical reactions and loosely couples them with mathematical optimization techniques. A chemical reaction is a natural phenomenon that transforms unstable chemical substances into stable ones through intermediate reactions. A reaction starts with unstable molecules that carry excessive energy. The molecules then interact with each other through a sequence of elementary reactions and yield products with lower energy. During a chemical reaction, the energy associated with a molecule changes as its intra-molecular structure changes, and it becomes stable at one point, the equilibrium point. The termination condition is verified by performing a chemical equilibrium (inertness) test. If a newly generated reactant has a better function value, it is included and the worse reactant is excluded; otherwise, a reversible reaction is applied. The literature includes several applications of CRO to classification and financial time series prediction (Nayak et al., 2017; Nayak et al., 2015; Alatas, 2012).

The two major components of CRO are i) *molecule*, as the basic manipulated agent, and ii) *elementary chemical reactions*, as the search operators.

#### Molecule

The basic manipulated agent in CRO is the molecule, which is similar to the individual in other optimization techniques. An alteration in molecular structure yields another potential solution in the search space. A molecule carries two kinds of energy: kinetic energy (*KE*) and potential energy (*PE*). A transformation of a molecule *m* to *m*' is only possible if *PE*_{m'} ≤ *PE*_{m} + *KE*_{m}. *KE* helps a molecule shift to a higher potential state and thereby provides the ability to escape local optima, so that more favorable structures may be found in future alterations. In CRO, the interconversion between *KE* and *PE* among molecules is achieved through a few elementary chemical reactions, which play a role similar to the steps of other optimization techniques. As the algorithm evolves, the molecules reach increasingly stable energy states, which ensures convergence.
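The energy condition for a transformation can be written as a one-line check. The following is a minimal Python sketch; the function name `can_transform` and its argument names are hypothetical and chosen only for illustration.

```python
def can_transform(pe_old: float, ke_old: float, pe_new: float) -> bool:
    """Return True if a molecule with potential energy pe_old and kinetic
    energy ke_old can change into a structure with potential energy pe_new,
    i.e. if PE(m') <= PE(m) + KE(m)."""
    return pe_new <= pe_old + ke_old


# A molecule with PE = 5.0 and KE = 2.0 can move to a worse structure with
# PE = 6.5 (its kinetic energy covers the difference) but not to PE = 8.0.
print(can_transform(5.0, 2.0, 6.5))  # True
print(can_transform(5.0, 2.0, 8.0))  # False
```

The first call illustrates how *KE* lets a molecule accept a temporarily worse structure, which is the mechanism for avoiding local optima described above.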

#### Elementary chemical reaction

Some elementary chemical reactions are used as search operators in CRO. Different chemical reactions are applied as operators for the exploration as well as the exploitation of the search space. These reactions may be divided into two categories: monomolecular (one molecule takes part in the reaction) and bimolecular (two molecules take part in the reaction). Monomolecular reactions (Redox1 and Decomposition) assist in intensification, while bimolecular reactions (Synthesis, Redox2, and Displacement) can lead to diversification. Here, the chemical reactions are explained considering the binary encoding of molecules.

### Decomposition reaction

A decomposition reaction occurs when a molecule splits into two fragments on collision with the wall of the container. The products are quite different from the original reactants. Generally, we represent the decomposition of a molecule *m* into \( {m}_1^{\prime } \) and \( {m}_2^{\prime } \) as follows:

$$ \underset{m}{\underbrace{\left[0,1,1,0,1\right]}}\to \underset{m_1^{\prime }}{\underbrace{\left[1,1,1,0,1\right]}}+\underset{m_2^{\prime }}{\underbrace{\left[0,1,0,0,1\right]}}. $$

We examine every value of *m*. If *m(i)* is equal to one, its value is copied to \( {m}_1^{\prime }(i) \), and the \( {m}_2^{\prime }(i) \) value is set at random. If *m(i)* is equal to zero, its value is copied to \( {m}_2^{\prime }(i) \), and the \( {m}_1^{\prime }(i) \) value is set at random. Since \( {m}_1^{\prime } \) and \( {m}_2^{\prime } \) are different, they can be treated as two different solutions in the search space and may increase the exploration capability of CRO. Reaction \( m\to {m}_1^{\prime }+{m}_2^{\prime } \) is acceptable only if \( PE\left({m}_1^{\prime}\right)+ PE\left({m}_2^{\prime}\right)\le KE(m)+ PE(m) \), that is, only if the energy of the reactant is sufficient to cover the potential energy of both products.
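The bit-copying rule for decomposition can be sketched in Python as follows. This is only an illustrative sketch: the function name `decompose` and the optional `rng` argument are hypothetical, and the energy-based acceptance test is deliberately left out.

```python
import random


def decompose(m, rng=random):
    """Split a binary molecule m into two products m1 and m2.

    Ones in m are copied to m1 (m2 gets a random bit at that position);
    zeros in m are copied to m2 (m1 gets a random bit at that position).
    """
    m1, m2 = [], []
    for bit in m:
        if bit == 1:
            m1.append(bit)
            m2.append(rng.randint(0, 1))
        else:
            m2.append(bit)
            m1.append(rng.randint(0, 1))
    return m1, m2


m1, m2 = decompose([0, 1, 1, 0, 1])
print(m1, m2)  # the random positions differ from run to run
```

Passing a seeded `random.Random` instance as `rng` makes the random bits reproducible.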

### Redox1 reaction

In this reaction, a molecule is allowed to collide with the wall of the container. This is also called an on-wall-ineffective collision. As a result, a small change occurs in the molecular structure. A new product *m*^{′} is formed by flipping a random bit of *m* as follows:

$$ \underset{\boldsymbol{m}}{\underbrace{\left[1,0,1,1,0\right]}}\to \underset{{\boldsymbol{m}}^{\prime }}{\underbrace{\left[1,0,1,0,0\right]}}. $$

Chemical system *m* → *m*^{′} is acceptable if *PE*(*m*^{′}) ≤ *KE*(*m*) + *PE*(*m*), that is, if the energy of the reactant covers the potential energy of the product, and is otherwise rejected.
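The Redox1 operator amounts to a single random bit flip. A minimal Python sketch is given below; the function name `redox1` and the `rng` argument are hypothetical, and the acceptance test is omitted.

```python
import random


def redox1(m, rng=random):
    """Return a copy of the binary molecule m with one random bit flipped."""
    i = rng.randrange(len(m))   # position hit by the collision
    product = list(m)
    product[i] = 1 - product[i]
    return product


print(redox1([1, 0, 1, 1, 0]))  # differs from the input in exactly one bit
```

Because only one bit changes, the product stays close to the reactant, which is why this operator suits local search around a current solution.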

### Synthesis reaction

In a synthesis reaction, two molecules *m*_{1} and *m*_{2} combine to form a single product *m*^{′} that is significantly different from the original molecules. The reaction can be expressed as follows:

$$ \underset{m_1}{\underbrace{1,0,1,1,0,1}}+\underset{m_2}{\underbrace{1,1,0,1,0,1}}\to \underset{m^{\prime }}{\underbrace{1,1,0,1,0,1}}. $$

Here, the corresponding bit values of the reactants are compared. If they match, the common bit value is copied to the product. If they do not match, the bit value of either *m*_{1} or *m*_{2} is copied at random. The new chemical system *m*_{1} + *m*_{2} → *m*^{′} is acceptable if *PE*(*m*^{′}) ≤ *KE*(*m*_{1}) + *KE*(*m*_{2}) + *PE*(*m*_{1}) + *PE*(*m*_{2}), that is, if the combined energy of the reactants covers the potential energy of the product.
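The bit-wise rule for synthesis can be sketched in Python as follows. The function name `synthesize` and the `rng` argument are hypothetical, equal-length reactants are assumed, and the acceptance test is not included.

```python
import random


def synthesize(m1, m2, rng=random):
    """Combine two equal-length binary molecules into one product.

    Matching bits are copied; where the reactants disagree, the bit of
    one of them is chosen at random.
    """
    if len(m1) != len(m2):
        raise ValueError("reactants must have the same length")
    return [a if a == b else rng.choice((a, b)) for a, b in zip(m1, m2)]


print(synthesize([1, 0, 1, 1, 0, 1], [1, 1, 0, 1, 0, 1]))
```

Positions on which both reactants agree are always preserved, so only the disagreeing positions introduce randomness.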

### Redox2 reaction

In this type of reaction, two molecules *m*_{1} and *m*_{2} react with each other to produce two new products \( {m}_1^{\prime } \) and \( {m}_2^{\prime } \). This can be represented as follows:

$$ \underset{m_1}{\underbrace{1,0,1,1,0,1}}+\underset{m_2}{\underbrace{0,0,1,0,1,1}}\to \underset{m_1^{\prime }}{\underbrace{1,0,1,0,0,1}}+\underset{m_2^{\prime }}{\underbrace{0,0,1,1,1,1}}. $$

We select two random points between 1 and the length of the reactant. Then, the bit values between these points are swapped to obtain two new products. If *PE*(*m*_{1}^{′}) + *PE*(*m*_{2}^{′}) ≤ *KE*(*m*_{1}) + *KE*(*m*_{2}) + *PE*(*m*_{1}) + *PE*(*m*_{2}), chemical system *m*_{1} + *m*_{2} → *m*_{1}^{′} + *m*_{2}^{′} is accepted, and it is otherwise rejected.
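The segment swap used by Redox2 resembles a two-point crossover and can be sketched in Python as follows. The function name `redox2` and the `rng` argument are hypothetical. Treating the two random points as slice boundaries (so that the segment `[i, j)` is swapped) is an assumption, since the text does not say whether the end points are included. The acceptance test is omitted.

```python
import random


def redox2(m1, m2, rng=random):
    """Swap the segment between two random cut points of m1 and m2."""
    if len(m1) != len(m2):
        raise ValueError("reactants must have the same length")
    # Two distinct cut points with i < j, so at least one bit is exchanged.
    i, j = sorted(rng.sample(range(len(m1) + 1), 2))
    p1 = m1[:i] + m2[i:j] + m1[j:]
    p2 = m2[:i] + m1[i:j] + m2[j:]
    return p1, p2


p1, p2 = redox2([1, 0, 1, 1, 0, 1], [0, 0, 1, 0, 1, 1])
print(p1, p2)
```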

### Displacement reaction

In case of a displacement reaction, two new molecules are formed as products of the collision of two reactants. The reaction can be represented as follows:

$$ \underset{m_1}{\underbrace{1,0,1,1,0,1}}+\underset{m_2}{\underbrace{0,0,1,0,1,1}}\to \underset{m_1^{\prime }}{\underbrace{0,0,1,1,1,1}}+\underset{m_2^{\prime }}{\underbrace{1,0,1,0,0,1}}. $$

We compare the corresponding bit values of the two reactants and swap bit values between them to produce the new products. If *PE*(*m*_{1}^{′}) + *PE*(*m*_{2}^{′}) ≤ *KE*(*m*_{1}) + *KE*(*m*_{2}) + *PE*(*m*_{1}) + *PE*(*m*_{2}), chemical system *m*_{1} + *m*_{2} → *m*_{1}^{′} + *m*_{2}^{′} is accepted, and it is otherwise rejected.
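One possible reading of the displacement operator is sketched below in Python. The text does not state which positions are swapped, so this sketch assumes that each position where the reactants differ is swapped independently with probability `swap_prob`. That rule, the function name `displace`, and its parameters are all hypothetical. The acceptance test is omitted.

```python
import random


def displace(m1, m2, swap_prob=0.5, rng=random):
    """Exchange bits between two binary molecules position by position.

    Assumption: a differing position is swapped with probability swap_prob.
    """
    p1, p2 = list(m1), list(m2)
    for k in range(len(p1)):
        if p1[k] != p2[k] and rng.random() < swap_prob:
            p1[k], p2[k] = p2[k], p1[k]
    return p1, p2


p1, p2 = displace([1, 0, 1, 1, 0, 1], [0, 0, 1, 0, 1, 1])
print(p1, p2)
```

With `swap_prob=1.0` the two reactants simply exchange places, and with `swap_prob=0.0` they are returned unchanged.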

Under the reactant update step, a chemical equilibrium test is performed. If the newly generated reactants yield a better function value, the new reactant set is included and the worse reactant is excluded, similar to reversible chemical reactions. The reactants are updated according to their enthalpy (fitness value). The CRO is then terminated when the termination criterion (e.g., maximum number of iterations or threshold error value) has been met.
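The reactant update step can be illustrated with a short Python sketch. It assumes a minimization problem (lower fitness is better) and that a new reactant replaces the current worst one. The function name `update_reactants` and the use of `sum` as a toy fitness function are hypothetical.

```python
def update_reactants(reactants, candidate, fitness):
    """Replace the worst reactant by the candidate if the candidate is better.

    Assumption: fitness is minimized, so the worst reactant has the
    largest fitness value.
    """
    worst = max(range(len(reactants)), key=lambda i: fitness(reactants[i]))
    updated = list(reactants)
    if fitness(candidate) < fitness(reactants[worst]):
        updated[worst] = candidate
    return updated


pool = [[1, 1, 1], [0, 1, 0]]
print(update_reactants(pool, [0, 0, 0], fitness=sum))  # [[0, 0, 0], [0, 1, 0]]
```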

CRO is more robust and uses fewer tunable parameters than other optimization techniques (Lam & Li, 2010; Alatas, 2012). It only requires the number of initial reactants. In this work, we use binary encoding for reactants and the uniform population method for initial population generation, so that the initial reactants are evenly distributed over the feasible search space. All vectors in a space can be obtained as a linear combination of the elements of a base set, and the absence of one element from the base set removes the dimension corresponding to that element. Therefore, the initial reactants must be regular and must, taken together, contain every element of the base set. The uniform population method used to generate the initial reactant pool is defined in Algorithm 1. The overall process of the CRO algorithm is shown in Fig. 1.

Variants of many nature-inspired evolutionary algorithms have been proposed and applied to solving nonlinear problems. However, their performance varies by dataset. According to the “no free lunch” theorem, no single state-of-the-art technique can outperform all others on every problem. Hence, choosing a suitable optimization technique for a particular problem involves considerable trial and error. The efficiency of these optimization techniques depends on their tuning parameters: good convergence requires suitably fine-tuned parameters, and the search for the global optimum requires an appropriate selection of parameters, which makes the algorithms difficult to use. Hence, an optimization technique that requires fewer parameters and a small number of computations while offering a good approximation capability is preferable. CRO is one such technique, which motivated us to adopt it. The pseudo code for CRO is presented in Algorithm 2.

### Particle swarm optimization

PSO is a popular swarm-intelligence-based metaheuristic (Kennedy & Eberhart, 1995; Eberhart et al., 1996) that simulates the social behavior of flocking birds, insect swarms, and schooling fish. The search starts with a swarm of randomly initialized particles, each of which can be seen as a candidate solution in the search space. Each particle has an adaptable velocity (position change) according to which it moves through the search space, and a memory of the best position it has ever visited. In every generation, the trajectory of each particle is adjusted towards its own best location and towards the best particle of the population. PSO is simple to implement and has the ability to converge quickly to an optimal solution, and hence is popular for solving multidimensional problems. The individuals of a swarm communicate their information and adjust their positions and velocities using this group information (Babaei, 2013). In this way, the initial solution propagates through the search space and progressively moves towards the global optimum over a number of generations. The standard PSO algorithm mainly consists of three computational steps:

1. Initialize the positions and velocities of particles;
2. Update the velocity of each particle;
3. Update the position of each particle.

Considering a multidimensional problem, let the i^{th} particle at the k^{th} instant move in a D dimensional search space associated with a position P_{i} and velocity V_{i} as follows:

$$ {\mathrm{P}}_{\mathrm{i}}=\left({\mathrm{p}}_{\mathrm{i}1},{\mathrm{p}}_{\mathrm{i}2},\cdots, {\mathrm{p}}_{\mathrm{i}\mathrm{D}}\right), $$

$$ {\mathrm{V}}_{\mathrm{i}}=\left({\mathrm{v}}_{\mathrm{i}1},{\mathrm{v}}_{\mathrm{i}2},\cdots, {\mathrm{v}}_{\mathrm{i}\mathrm{D}}\right). $$

The velocity and position of the particle at the (k + 1)^{th} instant are updated as follows:

$$ {\mathrm{V}}_{\mathrm{i}}\left(\mathrm{k}+1\right)={\mathrm{w}}_{\mathrm{i}}{\mathrm{V}}_{\mathrm{i}}\left(\mathrm{k}\right)+{\mathrm{c}}_1\ast \operatorname{rand}\ast \left({\mathrm{pbest}}_{\mathrm{i}}-{\mathrm{P}}_{\mathrm{i}}\left(\mathrm{k}\right)\right)+{\mathrm{c}}_2\ast \operatorname{rand}\ast \left({\mathrm{gbest}}_{\mathrm{i}}-{\mathrm{P}}_{\mathrm{i}}\left(\mathrm{k}\right)\right), $$

$$ {\mathrm{P}}_{\mathrm{i}}\left(\mathrm{k}+1\right)={\mathrm{P}}_{\mathrm{i}}\left(\mathrm{k}\right)+{\mathrm{V}}_{\mathrm{i}}\left(\mathrm{k}+1\right), $$

where c_{1} and c_{2} are two constants called acceleration coefficients. Specifically, c_{1} is the cognitive parameter and c_{2} is the social parameter. The function rand generates a random number in the range [0, 1], and w_{i} is the inertia weight for the i^{th} particle; pbest_{i} is the best position found so far by the i^{th} particle (its local best), and gbest_{i} is the global best position used to update that particle.
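One iteration of the velocity and position updates can be sketched in plain Python as follows. This is an illustrative sketch under several assumptions: a single inertia weight `w` is shared by all particles, `rand` is drawn independently for every dimension, and the default values of `w`, `c1`, and `c2` are placeholders rather than values taken from this study. The function name `pso_step` is hypothetical.

```python
import random


def pso_step(positions, velocities, pbest, gbest,
             w=0.7, c1=1.5, c2=1.5, rng=random):
    """Apply one velocity update followed by one position update.

    positions, velocities, pbest: one list of D floats per particle.
    gbest: list of D floats (best position found by the swarm).
    """
    new_positions, new_velocities = [], []
    for p, v, pb in zip(positions, velocities, pbest):
        # V(k+1) = w*V(k) + c1*rand*(pbest - P(k)) + c2*rand*(gbest - P(k))
        v_next = [
            w * vd + c1 * rng.random() * (pbd - pd) + c2 * rng.random() * (gd - pd)
            for pd, vd, pbd, gd in zip(p, v, pb, gbest)
        ]
        # P(k+1) = P(k) + V(k+1)
        p_next = [pd + vd for pd, vd in zip(p, v_next)]
        new_positions.append(p_next)
        new_velocities.append(v_next)
    return new_positions, new_velocities


pos, vel = pso_step([[0.0, 0.0]], [[0.1, -0.1]], [[0.5, 0.5]], [1.0, 1.0])
print(pos, vel)
```

A full optimizer would also re-evaluate the fitness after each step and refresh `pbest` and `gbest`; that bookkeeping is omitted here.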

### Genetic algorithm

Genetic algorithms (GAs) are another popular metaheuristic that works on a population of candidate solutions encoded as chromosomes (Goldberg, 1989; Holland, 1975). They attempt to trace the optimal solution through a process of artificial evolution. The principle is based on biological evolutionary theory, and optimization problems are solved by working on an encoding of the parameters rather than on the parameters themselves. A GA repeatedly applies the artificial genetic operations of *evaluation*, *selection*, *crossover*, and *mutation*. Generally, the GA process consists of the following basic steps:

1. Initialization of the search node randomly;
2. Evaluation of individual fitness;
3. Application of selection operator;
4. Application of crossover operator;
5. Application of mutation operator;
6. Repetition of the above steps until convergence.
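The steps above can be sketched as a compact binary GA in Python. The concrete operators (binary tournament selection, one-point crossover, bit-flip mutation), the fixed number of generations used in place of a convergence test, the maximization of `fitness`, and all parameter values are assumptions made for illustration. The function name `genetic_algorithm` is hypothetical.

```python
import random


def genetic_algorithm(fitness, n_bits=20, pop_size=30, generations=50,
                      crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Maximize fitness over bit strings of length n_bits (n_bits >= 2)."""
    rng = random.Random(seed)
    # Step 1: random initialization of the population.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):          # Step 6: repeat
        # Step 2: evaluate every individual.
        scores = [fitness(ind) for ind in pop]

        # Step 3: binary tournament selection.
        def select():
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[a] if scores[a] >= scores[b] else pop[b]

        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            # Step 4: one-point crossover.
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_bits)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            else:
                c1, c2 = list(p1), list(p2)
            # Step 5: bit-flip mutation.
            for child in (c1, c2):
                children.append(
                    [1 - g if rng.random() < mutation_rate else g for g in child]
                )
        pop = children[:pop_size]
        best = max(pop + [best], key=fitness)   # keep the best solution seen
    return best


# Toy usage: maximize the number of ones in the chromosome.
print(genetic_algorithm(sum))
```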