
An interval constraint-based trading strategy with social sentiment for the stock market

Abstract

Developing effective strategies to earn excess returns in the stock market is a cutting-edge topic in the field of economics. At the same time, stock price forecasting that supports trading strategies is considered one of the most challenging tasks. Therefore, this study analyzes and extracts news media data, expert comments, social opinion data, and pandemic text data using natural language processing, and then combines the data with a deep learning model to forecast future stock price patterns based on historical stock prices. An interval constraint-based trading strategy is constructed. Using data from several typical stocks in the Chinese stock market during the COVID-19 period, the empirical studies and trading simulations show, first, that the sentiment composite index and the deep learning model can improve the accuracy of stock price forecasting. Second, the interval constraint-based trading strategy based on the proposed approach can effectively enhance returns and thus, can assist investors in decision-making.

Introduction

The stock market is a crucial conduit for businesses to raise funds from investors and is a financial ecosystem connecting corporations and investors (Zhong and Enke 2019). The enormous trading volume and profitability of the stock market continue to attract investors and traders keen to employ this system to maximize their profits (Adam et al. 2016). However, the stock market is characterized by extreme volatility and non-stationarity, making it prone to numerous complicated shocks and games. Consequently, there are obstacles to devising solid trading methods and making profitable investment decisions (Gu and Peng 2019). Since the turn of the 20th century, a constant stream of financial institutions and researchers have been developing stock price forecasting models. With the expansion of computer technology, an increasing number of superior models, including deep learning, seek to decrease stochasticity and identify consistent trends by collecting and evaluating historical data and useful technical indicators (Salisu and Vo 2020; Liu et al. 2020).

At the beginning of the century, Hinton et al. introduced the notion of deep learning, thereby resolving the enduring impasse surrounding the arduous training of deep neural networks (Hinton et al. 2006). Meanwhile, Bengio et al. established a robust framework for the application of deep learning in addressing language modeling challenges (Bengio et al. 2000; Khurana et al. 2023). Subsequently, deep learning and natural language processing (NLP) have gained extensive utilization across diverse domains (Lecun et al. 2015). Within the realm of finance, these techniques have been employed to facilitate financial forecasting, credit default prediction, mortgage risk estimation, and risk–return management, among other applications (Zhong and Enke 2019; Alonso Robisco and Carbó Martínez 2022; Calomiris and Mamaysky 2019; Xing et al. 2018). These technological advances along with the growth of social media have collectively driven the widespread use of unstructured data (especially news data and social media data), which has improved the predictive power of models.

Specifically, the factors that make this phenomenon noteworthy are as follows. First, the efficient market hypothesis assumes that market investors are rational and seek the greatest possible profits (Fama 1970). However, as the Dutch tulip bubble and the American Internet bubble showed, investors are not always rational (Audrino et al. 2020). According to previous studies, investor sentiment and stock returns are mutually limiting and influential. In other words, the price of stocks in the market is not only defined by the intrinsic worth of companies, but is also heavily impacted by the investing subject; that is, psychological considerations and investor behavior have a significant impact on the price decisions and movements of stocks (Bustos and Pomares-Quimbaya 2020; Liang et al. 2020). Second, social platforms or news websites, as types of digital economy presentation, are increasingly crucial avenues for consumers or investors to exchange perspectives, feelings, and knowledge. Compared to conventional data sources, these platforms’ data offer the benefits of a broad user base, high socializing, high engagement, and rapid reaction times (Audrino et al. 2020; Hong et al. 2017). The efficient use of this information and its integration into research on the stock market is a highly rewarding and challenging task.

In the COVID-19 era, it is worth considering what texts should be employed for stock price analysis. In conjunction with previous studies, the following categories of data are the focus of this study: 1. news, which is a significant way for the general public to get official or more formal information through the media (Narayan 2019); 2. comments, especially from investors and practitioners, which are a synthesis of sentiment from relative professionals; 3. social media data, which reflect the collective wisdom of the general public (Teti et al. 2019); and 4. pandemic data. COVID-19 has triggered significant stock market volatility, as is common knowledge. To prevent the spread of the disease, governments across the globe implemented stricter policy controls, including limitations on labor mobility and quarantines (Salisu and Vo 2020; He et al. 2020; Li et al. 2022). This resulted in a number of challenges in global supply chains, including reduced supply and decreased demand, which discouraged investment and reduced business and consumer confidence. In this scenario, global stock markets suffered setbacks (OECD 2020). COVID-19 has had a significant impact on equity markets, and thus, pandemic-related data should also be considered. No previous study on stock price volatility has considered all of these factors.

Therefore, this study proposes a novel forecasting and interval-constraint trading approach based on deep learning and sentiment analysis for forecasting and simulating stock price fluctuations in the COVID-19 period in conjunction with big data. First, text data from four different perspectives are collected: news media, expert comments, social opinion, and the pandemic. Second, the relevant texts are then analyzed with natural language processing techniques to provide sentiment indexes that represent a combination of official, popular, and social contexts. Third, with the support of deep learning models, we employ historical data, search engine data, and sentiment indexes to forecast stock prices, including point value forecasts and interval forecasts. Fourth, we employ a number of assessment criteria and statistical tests to test the forecasting capacity of the model. Lastly, traditional trading strategies are based on forecasts for long or short positions, which carry a lot of risk, especially when located in periods of high volatility. This study proposes combining the interval estimation algorithm to add an insurance policy to a trading strategy that can support significant improvement in trading returns under high volatility conditions.

The innovation of this study lies in three aspects. First, the data are considered comprehensively. The data form includes time-series data and textual data; the data sources include news media, stock market experts, and the general public; and the data connotation includes stock characteristics, practitioner sentiment, and pandemic information. Second, this study makes innovative use of models. We employ temporal convolutional network (TCN), an effective deep learning model, combined with interval estimation algorithms to generate a reliable forecasting framework (containing point forecasts and interval forecasts). Third, this is the first study to construct a trading strategy based on interval constraints. Interval restrictions are added to the general point forecast-based trading strategy to avoid irrational investments in high volatility periods or to generate huge returns.

The rest of the paper is structured as follows. The literature review and discussion related to this study are presented in "Literature review". The proposed methodology and related models are presented in "Methodology". "Empirical analysis" shows the experimental procedure, including the data collection, data processing, forecasting results, and trading simulations. "Conclusion" presents the conclusions. Finally, the discussion and prospects for future research are provided in "Discussion and prospects".

Literature review

This section consists of two parts: the first introduces multiple types of models applied to stock price forecasting, including statistical models and artificial intelligence models; the second introduces the application of social media data in previous stock price forecasting studies, including search indexes and text data.

Forecasting models for stock price

An intriguing topic in finance and forecasting research has been how to make more accurate stock price forecasts (Kumbure et al. 2022). Numerous models have been developed to describe the volatility and trend of various stock prices on different exchange platforms. These models may be categorized into two groups: traditional statistics and artificial intelligence (mainly machine learning algorithms) (Deng et al. 2022; Liu et al. 2021).

Traditional statistics models include the vector autoregressive model, autoregressive conditional heteroskedasticity model, and autoregressive integrated moving average (ARIMA) model, among others (Ribeiro Ramos 2003; Wang et al. 2022). Traditional statistical models, which are effective instruments for illuminating the inner workings of financial market functioning, are frequently utilized in stock market forecasting and analysis. For instance, Jiang et al. (2021) found that oil prices have a significant impact on stock returns in the short term by using a structural threshold vector autoregression model. However, when using traditional statistics models, the linearity or stationarity assumptions in statistical data should be satisfied, which is typically challenging when using high-frequency data from the stock market (Kumbure et al. 2022; Tang et al. 2022). Consequently, conventional statistical models have limitations in forecasting stock prices (Lin et al. 2021).

The abovementioned issues are not present in machine learning models based on artificial intelligence algorithms (Vuong et al. 2022). When complex structures, such as nonlinear high-frequency stock trading data, are present, machine learning algorithms can provide more accurate forecasts (Yun et al. 2022). Classical machine learning methods have been applied to stock price forecasting in many studies. Gupta et al. (2019) examined the predictability of stock returns using the quantile random forests method. They used indicators of inequality based on consumption and income to forecast stock returns, providing a new approach for predicting stock returns. Sadaei et al. (2016) and Kao et al. (2013) applied fuzzy sets and support vector regression (SVR), respectively, to forecast stock prices, and both achieved good forecasting results. For applications of other classical machine learning methods in stock price forecasting, refer to Na and Kim (2021), Zhang and Lou (2021), and Shahi et al. (2020), among others.

The forecasting of stock price using machine learning techniques is now a topic of focus with rich research. Many researchers have proposed novel machine learning methods, which can achieve more robust and more accurate forecast results. Deng et al. (2022) combined a deep learning algorithm with multivariate empirical mode decomposition, and further built a multi-input and multi-output network framework to achieve multi-step forecasting of stock prices. The empirical results show this combination method can realize better prediction results. Although the combination of a machine learning algorithm with a decomposition integration method can raise forecasting accuracy, it also makes the computation more difficult. To resolve this problem, Guo et al. (2022) employed a system clustering method and particle swarm optimization to construct a decomposition and reconstruction model, which not only reduced the complexity of the algorithm, but also obtained more accurate forecasting results. Additionally, several studies have combined different machine learning techniques to overcome the drawbacks of a single technique. For example, Ghosh et al. (2022) mixed random forest and long short-term memory network (LSTM) to achieve more accurate stock price forecasting results than the single machine learning method.

TCN is a novel type of neural network improved from the one-dimensional convolutional neural network. TCN has been shown to outperform LSTM in numerous domains, including voice processing, machine translation, and time-series forecasting, while retaining the robust feature extraction capabilities of conventional convolutional neural networks (Zhu et al. 2020; Shomron and Weiser 2019).

In such a scenario, many benchmark models of the two kinds mentioned above are used in this study; they include seasonal autoregressive integrated moving average (SARIMA), exponential smoothing (ES), SVR, extreme learning machine (ELM), back propagation neural network (BPNN), LSTM, and TCN. TCN is the main model of interest owing to its benefits of higher parallelism, stable gradients, and minimal memory requirements.

Social media and stock price forecasting

As Web 2.0 takes off, more and more investors are turning to the Internet to obtain and share real-time stock-related news (Sanford 2022). Owing to the rapid diffusion of influence through the Internet, experts’ and other influential persons’ written views on stocks may affect the decisions of others. The effects are dual (Maqsood et al. 2020; Gu and Kurov 2020). On the one hand, Internet user comments and event information can have a substantial effect on the price of a stock. On the other hand, sudden fluctuations in stock price may prompt the development and transmission of relevant information (e.g., government viewpoints), which may then impact public perceptions of prospective investment strategies (Shomron and Weiser 2019; Jin et al. 2020). Textual material (e.g., blogs, reviews, and status updates), online search queries (e.g., Google Trends), tags, and personal information are common forms of social media data. Social media data include individual views, ideas, and actions that affect stock market predictability and result in significant profits or losses (Bijl et al. 2016).

Textual data, particularly news, is a superior source of hidden information to quantitative data, because the former enables the forecasting of financial patterns with supporting evidence (Liang et al. 2020; Gu and Peng 2019). For instance, a news story about a corporation containing the terms “resignation” or “risk of default” leads the investor to anticipate a decrease in the stock price. In addition, stock market trends may be influenced by news pertaining to a variety of unforeseen events, such as terrorism, war, civil unrest, economic and political shocks, and natural disasters (Nassirtoussi et al. 2015). Similarly, Chen et al. (2014) demonstrated how information from user-generated research papers on SeekingAlpha may be utilized to anticipate earnings and stock returns. However, it is difficult for retail investors to comprehend the context of research papers completely.

Numerous social and economic effects may be forecast by Web search queries, which has attracted great interest. For example, Bijl et al. (2016) investigated the prospect of forecasting stock returns using Google Trends data and discovered that high Google search volumes are associated with negative returns. Kim et al. (2019) demonstrated that an increase in Google searches is predictive of a rise in the volatility and trading volume of the top companies listed on the Oslo Stock Exchange. Considering the national conditions of China, the Baidu index is an invaluable source for monitoring and forecasting Chinese socioeconomic activities.

According to behavioral finance theory, social network information may impact people’s financial decisions to some extent. To help investors understand the connection between social networks and stock prices, Liu et al. (2021) constructed daily social networks utilizing information obtained from EastyMoney, the largest social media site in China, about individuals and the stocks they followed. The empirical data indicate that the social network variable can greatly improve forecasting accuracy. Zhang et al. (2018) investigated characteristics relating to collective mood and perception of stock relatedness based on messages from Xueqiu, a well-known Chinese social network similar to Twitter that caters to investors, and used nonlinear models to anticipate stock price changes. However, both EastyMoney and Xueqiu are focused sites that cater to niche audiences, ignoring the hotspots and opinions from public media.

Table 1 briefly summarizes the previous literature and highlights the limitations of previous studies. First, sentiment analysis of textual data has rarely been performed jointly across multiple platforms, perspectives, and participants. Second, although TCN has been applied to stock forecasting, it has not been combined with interval estimation alongside point forecasting. Third, the double insurance trading strategy of joint point and interval forecasting has not been studied. Therefore, this study proposes a comprehensive and integrated forecasting framework and trading strategy to fill the research gaps.

Table 1 A brief list of selected studies

Methodology

Temporal convolutional network

For time-series modeling, the TCN is a novel architecture that addresses the limited parallelism of recurrent neural networks, and its effectiveness has been verified (Bai et al. 2018; Zhu et al. 2020). Figure 1 illustrates the basic framework of the TCN model applied in this study. The TCN model is based on a 1-D fully convolutional network (FCN), where each hidden layer has the same length as the input layer. The advantages of TCN can be seen in three aspects: causal convolution, dilated convolution, and residual block.

Fig. 1 The framework of the TCN model

Causal convolutions. The TCN rests on two principles. First, the convolution is causal: the current output is connected only to present and past inputs, never to future inputs. Second, the architecture can accept a time series of any length and map it to an output of the same length.

Dilated convolution. Dilated convolutions enable an exponentially large receptive field, which keeps causal convolution applicable to sequence tasks (Bai et al. 2018). For a 1-D sequence input \(X\in R^{n}\) and a filter \(f\), the dilated convolution operation on the \(s\)th element of the sequence \(X\) is defined as

$$\begin{aligned} F\left( X_{s} \right) =\left( X*_{d} f \right) \left( s \right) =\sum _{i=0}^{k-1} f\left( i \right) X_{s-d\cdot i} \end{aligned}$$
(1)

where \(d\) is the dilation factor, \(k\) is the filter size, and the index \(s-d\cdot i\) points in the direction of the past. Dilation is therefore equivalent to introducing a fixed step between every two adjacent filter taps. A dilated convolution reduces to a regular convolution at \(d=1\). A larger dilation effectively expands a ConvNet’s receptive field because it allows the top-level output to reflect a wider range of inputs. Figure 1a provides an example.
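The dilated causal convolution in Eq. (1) can be illustrated with a minimal pure-Python sketch. The function name `dilated_causal_conv`, the example filter, and the treatment of positions before the start of the sequence as zeros are assumptions made for illustration; they are not taken from the implementation used in this study.

```python
def dilated_causal_conv(x, f, d):
    """Apply Eq. (1): F(X_s) = sum_{i=0}^{k-1} f(i) * X_{s - d*i}.

    Positions before the start of the sequence are treated as zeros
    (an assumed left zero-padding, so the output keeps the input length).
    """
    k = len(f)  # filter size
    out = []
    for s in range(len(x)):
        total = 0.0
        for i in range(k):
            j = s - d * i  # the index only ever reaches into the past
            if j >= 0:
                total += f[i] * x[j]
        out.append(total)
    return out


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical input sequence
f = [1.0, 1.0]                      # hypothetical filter with k = 2

y_regular = dilated_causal_conv(x, f, d=1)  # d = 1: regular causal convolution
y_dilated = dilated_causal_conv(x, f, d=2)  # d = 2: taps are two steps apart
```

With `d=2`, each output combines the current input with the input two steps earlier, so stacking layers with growing dilation factors widens the receptive field without enlarging the filter.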

Residual block. It has been demonstrated that a residual learning framework makes network training easier and that residual blocks are useful for deep networks (Wu et al. 2021). The residual block for our baseline TCN is shown in Fig. 1b. Inside a residual block, the TCN has two layers of dilated causal convolution and non-linearity, for which the rectified linear unit (ReLU) is used. Weight normalization is applied to the convolutional filters. Additionally, spatial dropout is applied after each dilated convolution for regularization: during each training step, an entire channel is zeroed out (Poernomo and Kang 2018).

Gi-MLP

The multilayer perceptron (MLP) is a widely used neural network algorithm that can be used to find the internal relationships of high-frequency point-valued time series (PTS). Roque et al. (2007) and Sun et al. (2018) extended the MLP used for modeling PTS to traditional interval-valued time series. Furthermore, Han et al. (2012) defined the generalized random interval, which allows the addition of intervals until the data collection is complete and thus makes interval operations easier. The generalized random interval multilayer perceptron (Gi-MLP) algorithm, which is based on generalized random intervals, includes N inputs, M outputs, and K hidden layers. Each hidden layer has \(p_{j}(j=1,\ldots ,P)\) hidden nodes. For simplicity, we introduce this algorithm with one hidden layer, namely \(K=1\).

The input is N generalized random interval data, namely \(x_{i}=[x_{i}^{L},x_{i}^{R}], i=1,\ldots ,N\), where \(x_{i}^{L}\) and \(x_{i}^{R}\) represent the left and right endpoints of the interval respectively. The value of the jth hidden node is obtained from the linear combination of these N generalized random intervals and two trend interval terms. The specific form is:

$$\begin{aligned} L_{j}=\beta _{j}^{l}[1,1]+\beta _{j}^{r}[-\frac{1}{2},\frac{1}{2}]+\sum _{i=1}^{N}\alpha _{ji}x_{i},\quad j=1,\ldots ,P, \end{aligned}$$
(2)

where \(\beta _{j}^{l}, \beta _{j}^{r}, \alpha _{ji}\) are constant parameters. The coefficient \(\beta _{j}^{l}\) is the trend term representing the overall level of the interval, while the coefficient \(\beta _{j}^{r}\) is the trend term for the fluctuation of the interval radius. \(F\) is an activation function, usually chosen as the hyperbolic tangent or the sigmoid function. Applying the activation function yields the outputs of the hidden layer:

$$\begin{aligned} H_{j} = [H_{j}^{L}, H_{j}^{R}] = F(L_{j}) = [F(L_{j}^{L}), F(L_{j}^{R})], \quad j=1,\ldots ,P. \end{aligned}$$
(3)

After obtaining the interval value of the hidden layer node, the ith interval value of the output layer is constructed as the linear combination of the node value of the hidden layer and two types of trend terms. It has the following form:

$$\begin{aligned} \hat{Y_{i}}=\zeta _{i}^{l}[1,1]+\zeta _{i}^{r}[-\frac{1}{2},\frac{1}{2}]+\sum _{j=1}^{P}\theta _{ij}H_{j},\quad i=1,\ldots ,M, \end{aligned}$$
(4)

where the meaning of \(\zeta _{i}^{l}, \zeta _{i}^{r}, \theta _{ij}\) is the same as (2). Figure 2 illustrates the basic framework of the Gi-MLP applied to this study.
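The forward pass described by Eqs. (2) to (4) can be sketched as follows for a single hidden layer. This is an illustrative sketch only: the function name `gi_mlp_forward`, the example weights, and the assumption that interval arithmetic is carried out endpoint by endpoint (so that a scalar multiplies the left and right endpoints separately) are ours and are not confirmed by the text.

```python
import math


def gi_mlp_forward(x, alpha, beta_l, beta_r, theta, zeta_l, zeta_r, act=math.tanh):
    """Forward pass of a one-hidden-layer Gi-MLP following Eqs. (2)-(4).

    x      : list of N intervals (x_L, x_R)
    alpha  : P x N hidden weights; beta_l, beta_r: length-P trend coefficients
    theta  : M x P output weights; zeta_l, zeta_r: length-M trend coefficients
    Assumption: interval arithmetic is applied endpoint by endpoint.
    """
    hidden = []
    for j in range(len(alpha)):
        # Eq. (2): beta_l*[1, 1] + beta_r*[-1/2, 1/2] + sum_i alpha_ji * x_i
        left = beta_l[j] - 0.5 * beta_r[j] + sum(a * xi[0] for a, xi in zip(alpha[j], x))
        right = beta_l[j] + 0.5 * beta_r[j] + sum(a * xi[1] for a, xi in zip(alpha[j], x))
        # Eq. (3): the activation is applied to each endpoint
        hidden.append((act(left), act(right)))

    outputs = []
    for i in range(len(theta)):
        # Eq. (4): zeta_l*[1, 1] + zeta_r*[-1/2, 1/2] + sum_j theta_ij * H_j
        left = zeta_l[i] - 0.5 * zeta_r[i] + sum(t * h[0] for t, h in zip(theta[i], hidden))
        right = zeta_l[i] + 0.5 * zeta_r[i] + sum(t * h[1] for t, h in zip(theta[i], hidden))
        outputs.append((left, right))
    return outputs


x = [(1.0, 2.0), (0.0, 1.0)]  # N = 2 hypothetical input intervals
y_hat = gi_mlp_forward(
    x,
    alpha=[[0.5, 0.5]], beta_l=[0.0], beta_r=[0.0],  # P = 1 hidden node
    theta=[[1.0]], zeta_l=[0.0], zeta_r=[1.0],       # M = 1 output interval
)
```

In this sketch the \([1,1]\) term shifts both endpoints by the same amount, while the \([-\frac{1}{2},\frac{1}{2}]\) term widens or narrows the interval symmetrically.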

Fig. 2 The structure diagram of Gi-MLP

Table 2 The parameters of the forecasting models

Benchmark models and parameter setting

In this study, we used seven benchmark models, namely, SARIMA, ES, SVR, ELM, BPNN, LSTM, and TCN, for stock price forecasting. The parameter settings of these models, including the determination methods, are listed in Table 2. The NumPy, Pandas, TensorFlow, and Keras packages are used in Python 3 for model training and testing. SVR is implemented using the libsvm toolbox in MATLAB 2018b.

Furthermore, it is imperative for researchers to consider the operational efficiency of neural network models. Hence, the number of parameters of the neural network is presented in Table 2. The running time of each neural network is presented in Table 3. The recorded time values are obtained by averaging the results from 10 separate runs. It is observed that the neural networks, including the proposed model, exhibit longer running times compared to conventional machine learning models and time-series models. However, the overall time required by the neural networks remains within acceptable limits when compared to the forecasting of daily data. The central processing unit (CPU) employed in this research is the Intel(R) Core(TM) i9-10900K CPU operating at a frequency of 3.70 GHz. It is accompanied by a random access memory (RAM) capacity of 32.0 GB. Additionally, the graphics processing unit (GPU) utilized is the NVIDIA GeForce RTX 3090.

Table 3 Running time of each model

The framework

The methodology combining natural language processing and deep learning forecasting is constructed with four parts: data collection, data processing, empirical forecasting, and trading simulations, as shown in Fig. 3.

Fig. 3 The framework of the research

Fig. 4 The framework of natural language processing

Stage 1, data collection. We collect stock prices as the forecast target. We also collect transaction-related feature data, search engine data, news media reports, expert comments, public opinion, and pandemic-related text data.

Stage 2, data processing. We standardize the structured data, such as time series, to eliminate the dimension (unit); meanwhile, we extract the sentiment from the sentiment-rich text information and extract the public opinion fever and sensitivity index from the text comments containing public opinion. The framework of natural language processing used in this study is shown in Fig. 4. Detailed descriptions are given in the "Data processing" section.

Stage 3, empirical forecasting. We use three effective models to forecast the corresponding series. Finally, the results of point forecasting and interval forecasting can be obtained. We will also use error evaluation metrics and statistical tests to evaluate the results.

Stage 4, trading simulations. We will combine the results of point forecasts and interval forecasts to innovatively propose trading strategies that will guide investors’ decisions with insurance recommendations for great returns.

Empirical analysis

Data collection

In the COVID-19 era, scholars from a variety of fields have concentrated on emerging markets since they have become an essential contributor to global economic growth. Numerous researchers have examined the characteristics and performance of developing economies, such as innovation creation and international trade (Gu and Peng 2019; Kang 2018). As the world’s largest emerging economy, China has led the world in economic growth since the turn of the 21st century, and its stock market performance is notable (Shen et al. 2017). Therefore, the subject of this study is the performance of stock prices in the Chinese stock market.

This research focuses on the five stocks with the biggest market capitalization in their respective sectors among the most popular sectors of the Chinese stock market. They are Kweichou Maotai (600519) in the wine category, Hengrui Pharmaceutical (600276) in the pharmaceutical category, Zhongxing Telecom Equipment (000063) in the technology category, Shanghai Airport (600009) in the logistics category, and Industrial and Commercial Bank of China (601398) in the banking category. Stock trading data are obtained from The Wind Database, including daily high and low stock prices, opening and closing prices, trading volume, and turnover, etc. The data are captured from January 1, 2020 to August 29, 2022.

Internet data, such as search engine data and text data, are frequently employed in research. In this study, two search indexes, Baidu Index and Google Trends, are employed. Meanwhile, four types of textual data are collected for this research: news data (from Tencent News), practitioner commentary data (from the stock bar forum of Oriental Fortune), public opinion data (from Sina Public Opinion Communication), and pandemic data (from the China National Health Commission). News data, commentary data, and pandemic data are collected from January 1, 2020 to August 29, 2022; opinion text data are collected from March 1, 2021 to August 29, 2022. Public opinion data provide the most extensive information; however, because of limited accessibility, they cover a shorter period than the other data. As a result, we further incorporate these data into the model after ordinary least squares manipulation. To present a fuller picture of the overall trend of public opinion in society, we also collect data from Baidu Index and Google Trends to use as predictive features.

Table 4 provides a description of the data. The experimental process is discussed in the following subsections after the data collection is complete. We use the stock price of Hengrui Pharmaceutical as an example, because the development of pharmaceuticals was a core issue during the pandemic, and the fluctuation of the company’s stock price reflected this context. February 23, 2022 was adopted as the boundary to divide the data into the training set and a testing set; that is, the training set used data from January 1, 2020 to February 22, 2022 for model training and the testing set used data from February 23, 2022 to August 29, 2022 for empirical testing. The experimental results of other groups of data are shown in the Appendix (i.e., Table 14 and Table 15).

Table 4 The description of the data

Data processing

The structured data in this research, including stock price history data, stock trading data, and search engine data, are normalized to eliminate the dimension (unit) and accelerate gradient descent, as indicated in Eq. (5).

$$\begin{aligned} X_{\textrm{minmax}}=\frac{X-X_{\min }}{X_{\max }-X_{\min }} \end{aligned}$$
(5)

where \(X_{\max }\) represents the maximum value in the sequence and \(X_{\min }\) represents the minimum value in the sequence. X represents the data before normalization, and \(X_{\textrm{minmax}}\) represents the data after normalization.
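As a minimal sketch of Eq. (5), the normalization can be written in a few lines of Python. The function name `min_max_normalize` and the sample prices are hypothetical. Whether \(X_{\max }\) and \(X_{\min }\) are computed on the full series or on the training set only is not stated in the text, so the sketch simply uses whatever series it is given.

```python
def min_max_normalize(values):
    """Scale a sequence to [0, 1] following Eq. (5)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Eq. (5) is undefined for a constant series (division by zero)
        raise ValueError("cannot min-max normalize a constant series")
    return [(v - x_min) / (x_max - x_min) for v in values]


prices = [10.0, 12.0, 15.0, 20.0]  # hypothetical closing prices
scaled = min_max_normalize(prices)
```

The smallest value maps to 0 and the largest to 1, so features with different units become directly comparable.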

For text data, this study uses natural language processing techniques for analysis. Specifically, the Valence Aware Dictionary and sEntiment Reasoner (VADER) is used to determine the sentiment score of each day’s text. VADER is a lexicon and rule-based sentiment analysis tool specifically attuned to sentiments expressed in social media (Audrino et al. 2020; Shahi et al. 2020). As depicted in Fig. 4, the lexicon is built from well-established sentiment word-banks (e.g., Linguistic Inquiry and Word Count, LIWC; and the General Inquirer, GI), emoticons, sentiment-related acronyms and initialisms, commonly used slang with sentiment value, and ratings from independent reviewers (Hutto and Gilbert 2014). If a day has multiple texts, the sentiment score of the day is the average of the sentiment scores of all texts. Additionally, to prevent the influence of extreme values, the daily sentiment interval is determined by the 25th to 75th percentiles of the sentiment scores of all comments each day. The results of the sentiment scores of these texts are shown in Fig. 5. Meanwhile, we provide word clouds for both types of text data (showing the high-frequency words), as shown in Fig. 6, where the language is Chinese because our context is the Chinese stock market. The terms that occur most frequently are listed in Table 5.
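The daily aggregation described above (a mean score per day and a 25th to 75th percentile interval) can be sketched as follows. The per-text scores are assumed to have been produced already by a sentiment tool such as VADER; the function name `daily_sentiment`, the sample scores, and the choice of linear interpolation between order statistics for the percentiles are assumptions, since the text does not specify them.

```python
from statistics import mean, quantiles


def daily_sentiment(scores_by_day):
    """Map each day to (mean score, (25th percentile, 75th percentile))."""
    result = {}
    for day, scores in scores_by_day.items():
        if len(scores) >= 2:
            # method="inclusive" interpolates linearly between order statistics
            q25, _, q75 = quantiles(scores, n=4, method="inclusive")
        else:
            # a single text gives a degenerate interval
            q25 = q75 = scores[0]
        result[day] = (mean(scores), (q25, q75))
    return result


# Hypothetical per-text sentiment scores grouped by trading day
scores_by_day = {
    "2022-02-23": [0.1, 0.3, 0.5, 0.7, 0.9],
    "2022-02-24": [-0.4],
}
summary = daily_sentiment(scores_by_day)
```

Using the interquartile range rather than the daily minimum and maximum keeps a few extreme comments from stretching the sentiment interval.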

Fig. 5 The sentiment

Fig. 6 The word clouds

Figure 5 demonstrates that practitioner comment data show more sentiment variation and that the public influences individual sentiments. News media, however, have more extreme sentiments (positive emotions tend toward 1 and negative emotions tend toward 0) and are predominantly positive, indicating that news is always presented with a bias. In addition, Table 5 shows that the news focuses on broad issues such as scientific research, experiments, and research duration. Practitioners’ concerns tend to center on personal interests, such as stock price fluctuations and direct stock appraisal. As a result, the sentiment series differ across groups, which supports the use of text material from various viewpoints in this study.

Table 5 High frequency words list

Evaluation criteria

This study employs two error assessment criteria based on data values, mean absolute error (MAE) and root mean square error (RMSE), along with one error evaluation criterion based on error percentages, mean absolute percentage error (MAPE), to properly assess forecasting performance (Li et al. 2023; Zhao et al. 2022). They can be computed numerically as follows:

$$\begin{aligned} M A E=\, & {} \frac{1}{N} \sum _{t=1}^N\left| \hat{y}_t-y_t\right| \end{aligned}$$
(6)
$$\begin{aligned} R M S E=\, & {} \sqrt{\frac{1}{N} \sum _{t=1}^N\left( \hat{y}_t-y_t\right) ^2} \end{aligned}$$
(7)
$$\begin{aligned} M A P E=\, & {} \frac{1}{N} \sum _{t=1}^N\left| \frac{y_t-\hat{y}_t}{y_t}\right| \end{aligned}$$
(8)

where \(\hat{y}_t\) and \({y_t}\) are the forecast stock price and the actual stock price at time t, respectively, and N refers to the number of samples in the test set.
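As a small worked example, the three criteria can be computed as below, following their standard definitions (mean of absolute errors, square root of the mean squared error, and mean of absolute relative errors). The price values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%)."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical actual and forecast prices (illustration only).
y = [10.0, 20.0, 40.0]
y_hat = [11.0, 18.0, 40.0]
print(mae(y, y_hat), rmse(y, y_hat), mape(y, y_hat))
```

MAE and RMSE are expressed in price units, so they are not comparable across stocks with different price levels; MAPE is scale-free, which is why it complements the other two.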

In addition to the error evaluation criteria, this study employs the Wilcoxon signed-rank (WSR) test, the superior predictive ability (SPA) test, and the Kolmogorov-Smirnov predictive accuracy (KSPA) test to assess the model’s predictive power from a statistical perspective (Zhao and Itti 2018; Hansen 2005). The WSR test is based on the following loss differential series, with the null hypothesis that its median is zero:

$$\begin{aligned} l(t)=f\left( e_A(t)\right) -f\left( e_B(t)\right) \end{aligned}$$
(9)

where \(f(\cdot )\) represents the loss function (MSE in this study), and \(e_A(t)\) and \(e_B(t)\) represent the forecasting error series of models A and B, respectively.
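The loss differential series and the signed-rank sums on which the WSR test is built can be sketched as follows. This is an illustrative pure-Python version, assuming a squared-error loss and average ranks for ties; the function names and the error values are hypothetical. In practice, a library routine such as `scipy.stats.wilcoxon` would normally be used to obtain the p-value.

```python
def loss_differential(errors_a, errors_b):
    """l(t) = f(e_A(t)) - f(e_B(t)) with squared-error loss f(e) = e**2."""
    return [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]


def signed_rank_sums(diffs):
    """Return (W_plus, W_minus), the rank sums of positive and negative differences."""
    nonzero = [d for d in diffs if d != 0]  # zero differences are discarded
    ordered = sorted(nonzero, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of tied absolute values.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical forecasting errors of models A and B (illustration only).
e_a = [1.0, -2.0, 3.0, 0.5]
e_b = [0.5, -1.0, 1.0, 1.0]
l = loss_differential(e_a, e_b)
print(l, signed_rank_sums(l))
```

A large imbalance between the two rank sums indicates that one model's losses are systematically larger than the other's.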

SPA is a statistical test for comparing the predictive power of point forecasts. Specifically, it can determine whether the target model is more accurate than the other benchmark models. The null hypothesis states that the benchmark is the best model:

$$\begin{aligned} H_0: \min _i E\left[ L_i\right] \ge E\left[ L_{bm}\right] \end{aligned}$$
(10)

where \(L_i\) represents the MSE of the ith model, and \(L_{bm}\) is the MSE of the benchmark model. The KSPA is a statistical test for comparing the predictive accuracy of two models. Its advantage is that it tests not only whether the distributions of the two models’ forecast errors differ, but also whether one model has stochastically smaller errors. The test is not affected by any autocorrelation in the forecasting errors (Hassani and Silva 2015; Fan et al. 2022a, b).

Forecasting results

First, the performance of seven single forecasting models (i.e., SARIMA, ES, SVR, ELM, BPNN, LSTM, and TCN) is compared on the Hengrui Pharmaceutical stock price dataset, using MAE, RMSE, and MAPE as error assessment criteria, to illustrate the predictive performance of different models and to justify the use of TCN in this study. Table 6 details the forecasting accuracy and ranking of the forecasting methods.

The most notable observations from the comparison of the seven single forecasting models are as follows. First, the model suggested in this study (i.e., TCN) outperforms the other models in forecasting the stock price of Hengrui Pharmaceutical. Second, the performance of LSTM, a neural network model, is comparable to that of TCN, whereas SVR, ELM, and BPNN perform somewhat worse. Third, SARIMA and ES perform worse than the five methods discussed above, yielding somewhat unsatisfactory forecasting results on the dataset analyzed in this study. This is consistent with the findings of some previous studies on the accuracy of machine learning and statistical methods.

Building on the single-model forecasting, we combine each of the seven models with the extracted sentiment indexes for further assessment and name them SARIMA-X, ES-X, SVR-X, ELM-X, BPNN-X, LSTM-X, and TCN-X. These models incorporate stock-related characteristics, search indexes, news sentiment, and practitioner sentiment, which allows us to compare and assess the forecast accuracy and robustness of the proposed approach (with the inclusion of short-length social opinion information). Table 7 reports the results for each criterion, and Fig. 7 displays the fitted forecast curves and comparison graphs.

Fig. 7 Point forecasting results

Table 7, which reports these models’ forecasting errors, is informative in several ways. First, the models that use the sentiment index all have lower error values than the corresponding single models, with MAE values decreasing by 47.24%, 45.81%, 37.50%, 24.28%, 34.80%, 35.00%, and 27.91%, respectively. Second, all machine learning models except ELM-X beat SARIMA-X, indicating that SARIMA remains applicable and is not inferior to artificial intelligence approaches in certain forecasting scenarios. Third, the proposed approach has the lowest error value and ranks first within the experimental control group, showing that it has the highest predictive accuracy. These outcomes demonstrate the market’s sensitivity to sentiment. Incorporating sentiment from multiple groups helps predict stock prices more accurately, and more inclusive group sentiment yields better results. Furthermore, neural network models have more parameters, as shown in Tables 2 and 3. This leads to longer runtimes, so the computational efficiency of the model deserves particular attention when dealing with larger datasets.

The Wilcoxon signed-rank test and the SPA test are employed in this study to statistically assess the differences in the models’ forecasting abilities and to confirm the validity of the aforementioned results. The results with p-values are given in Table 8. In the table, the p-value in Panel A indicates whether there is a significant difference between the two compared models, and the p-value in Panel B indicates whether one of the two compared models is significantly superior. For example, in Panel A, the p-value in row 2, column 3 is 0.000, meaning that the test rejects the null hypothesis at the 99% confidence level (i.e., there is a significant difference between the SARIMA-X and SVR-X models). Additionally, in Panel B, the p-value in the second column of row 3 is 1.000, indicating that the test fails to reject the null hypothesis (i.e., in forecasting performance, the benchmark SVR-X is not outperformed by the SARIMA-X model).

Table 8 statistically compares each model’s forecasting performance for stock price data. Some striking observations are as follows. First, when the proposed approach is considered the benchmark model for Panels A and B, the p-values for the two tests are 0.000 and 1.000 in all cases, respectively, indicating that the proposed approach significantly outperforms all other comparative models at the 99.9% confidence level. Second, the machine learning models, except for ELM-X, generally outperform SARIMA-X, with TCN performing the best, which validates our choice of this model. Third, ELM-X does not perform as well as SARIMA-X, indicating that SARIMA can still achieve satisfactory results in some scenarios. Fourth, among the machine learning models, LSTM, with its multilayer neural network structure, outperforms the classical SVR and ELM, which also indicates its capability in time-series analysis.

The results of the KSPA test also support the conclusion that the proposed approach has the best predictive performance (all corresponding p-values are less than 0.01). Figure 9 shows the KSPA error distribution and the empirical cumulative distribution function (c.d.f.) in this study. The proposed model better describes the random deviation; in other words, it has better prediction performance and higher prediction accuracy.

Fig. 8 Interval forecasting results

After collecting the closing values from the point forecasting models, we use the daily low price as the left endpoint of the interval and the daily high price as the right endpoint, and combine them with the 25% and 75% quantile intervals of expert sentiment and news sentiment as inputs to the Gi-MLP model. The interval forecasting model then produces the expected daily high and low prices of the stocks. The Gi-MLP model’s training and test sets are divided in a manner consistent with the point-value forecasting model, and the rolling forecasting method is used to make one-step-ahead predictions in the forecasting window. The forecasting results are shown in Fig. 8 alongside the actual results.

Fig. 9 Distribution of errors and empirical cumulative distribution function of error (c.d.f.)

The following points can be observed from Fig. 8. First, the ordering of the daily high and low stock prices is preserved in the forecast outputs of the Gi-MLP model. If point-value regression models (e.g., SVR) are used to forecast the high and low stock values independently, the forecast low price might exceed the forecast high price; the Gi-MLP model overcomes this issue because it makes use of the interval’s entire information. Second, the range between the forecast high and low prices widens when the actual volatility is strong, so the forecasts depict the general trend of the stock and reflect changes in its volatility. Third, there is little deviation between the Gi-MLP model’s forecast high and low stock prices and the actual values, with an average MAPE of less than 6%.
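The two checks discussed above (whether any forecast interval is inverted, and how far the forecast endpoints deviate from the actual ones) can be sketched as follows. This is a hypothetical diagnostic, not the authors' evaluation code: the function name `interval_report` is an assumption, and averaging the MAPE of the two endpoints is only one plausible reading of "average MAPE". The prices are made up for illustration.

```python
def mape(actual, forecast):
    """Mean absolute percentage error as a fraction."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def interval_report(low_true, high_true, low_hat, high_hat):
    """Return (number of inverted forecast intervals, mean MAPE of both endpoints)."""
    # An interval is inverted when the forecast low exceeds the forecast high.
    inverted = sum(1 for lo, hi in zip(low_hat, high_hat) if lo > hi)
    average_mape = (mape(low_true, low_hat) + mape(high_true, high_hat)) / 2
    return inverted, average_mape


# Hypothetical prices: the second forecast interval is inverted, as can happen
# when the two endpoints are forecast independently.
low_true, high_true = [10.0, 20.0], [12.0, 25.0]
low_hat, high_hat = [10.5, 19.0], [12.0, 18.0]
print(interval_report(low_true, high_true, low_hat, high_hat))
```

A model that treats the interval as a single object should report zero inverted intervals on such a check.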

Table 6 Forecasting error and ranking of single models
Table 7 Forecasting error and ranking of models with sentiment index
Table 8 Results of the Wilcoxon signed rank test and the SPA test

Trading simulations

In the previous experiments, we examined the forecasting framework using error evaluation metrics, appropriate statistical tests, and interval estimates; nevertheless, forecast precision is not synonymous with realized return. In stock trading practice, the objective of investors is to develop a profitable strategy. Therefore, we use a long-short trading strategy to show the profitability of the proposed approach. The strategy is to go “short” when the forecast return is below zero and to go “long” when the forecast return is above zero, where going “short” and “long” are defined as selling and buying the stock, respectively, at the current price. When the forecast return is zero, we maintain our position.

We compute the relevant return estimates based on the projected closing prices from June 8, 2022 to August 29, 2022, with the expected return on day t determined as follows:

$$\begin{aligned} \hat{R_{t}} = \frac{\hat{C_{t}}-\hat{C}_{t-1}}{\hat{C}_{t-1}}, \end{aligned}$$
(11)

where \(\hat{C_{t}}\) represents the forecast closing price on day t, and \(R_{t}\) denotes the true rate of return on day t. Based on the estimated rate of return, the following investment strategy can be developed.
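As a brief illustration of Eq. (11), the forecast return series can be derived from consecutive forecast closing prices as below. The function name and the prices are hypothetical.

```python
def forecast_returns(close_hat):
    """Forecast return on day t from forecast closes on days t-1 and t (Eq. 11)."""
    return [(curr - prev) / prev for prev, curr in zip(close_hat, close_hat[1:])]


# Hypothetical forecast closing prices for three consecutive days.
print(forecast_returns([100.0, 102.0, 99.96]))
```

The series is one element shorter than the price series because the first day has no preceding forecast close.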

Scheme 1.1: Invest based on the forecast return for the following day. If the forecast return for the next day is positive, go long at the next day’s opening; if it is negative, go short at the next day’s opening. The trade yields a positive return as long as \(\hat{R}_{t+1}\) has the same sign as \(R_{t+1}\); however, if the two have opposite signs, the investment incurs a loss.

Scheme 1.2: Transaction fee is 0.5‰ in each transaction based on Scheme 1.1.

Scheme 2.1: To mitigate the loss from a wrong investment, we use the expected high and low stock values as insurance. When the expected closing price is between the high and low prices of the stock predicted using Gi-MLP, that is, when \(\hat{Y}_{t+1}^{L} \le \hat{C}_{t+1} \le \hat{Y}_{t+1}^{R}\) is satisfied, the transaction stated in Scheme 1.1 is executed the next day. Otherwise, the stock is believed to be more volatile the next day and the return forecast is deemed unreliable, so no trade is executed. Figure 10 gives an illustration of such a judgment (the marked points will not be involved in the transaction).

Fig. 10 Strategy judgment curve

Scheme 2.2: Transaction fee is 0.5‰ in each transaction based on Scheme 2.1.
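The four schemes can be sketched with a single function, as below. This is a simplified illustration under several assumptions that are not stated in the source: the return of each trade is modeled as the trade direction times the true daily return, the fee is subtracted once per trade, the cumulative return is a simple sum, and a zero forecast return is treated as "no new trade". The function name `simulate` and all numbers are hypothetical.

```python
FEE = 0.0005  # 0.5 per mille per transaction, as in Schemes 1.2 and 2.2


def simulate(r_hat, r_true, close_hat=None, low_hat=None, high_hat=None, fee=0.0):
    """Return (number of trades, cumulative simple return).

    Supplying close_hat, low_hat and high_hat switches on the interval
    constraint of Scheme 2.x; leaving them out gives Scheme 1.x.
    """
    use_interval = None not in (close_hat, low_hat, high_hat)
    trades, total = 0, 0.0
    for t, (forecast, actual) in enumerate(zip(r_hat, r_true)):
        if forecast == 0:
            continue  # assumption: zero forecast return means no new trade
        if use_interval and not (low_hat[t] <= close_hat[t] <= high_hat[t]):
            continue  # forecast close outside the forecast interval: skip the day
        direction = 1 if forecast > 0 else -1  # long if positive, short if negative
        total += direction * actual - fee
        trades += 1
    return trades, total


# Hypothetical three-day example; the day-2 forecast close lies below the interval.
r_hat, r_true = [0.01, -0.02, 0.03], [0.02, 0.01, 0.01]
close_hat = [10.1, 9.9, 10.2]
low_hat, high_hat = [10.0, 10.0, 10.0], [10.5, 10.5, 10.5]

print("Scheme 1.1", simulate(r_hat, r_true))
print("Scheme 1.2", simulate(r_hat, r_true, fee=FEE))
print("Scheme 2.1", simulate(r_hat, r_true, close_hat, low_hat, high_hat))
print("Scheme 2.2", simulate(r_hat, r_true, close_hat, low_hat, high_hat, fee=FEE))
```

In this toy example the interval constraint removes the one trade whose forecast sign was wrong, which both raises the cumulative return and lowers the total fee paid.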

Following the aforementioned investment schemes, we simulated real stock trading from June 9, 2022 to August 29, 2022, a total of 58 days; the simulation outcomes for Hengrui Pharmaceutical are displayed in Table 9. As the table shows, the trading strategy proposed in this study reduces the overall number of transactions owing to its double insurance structure (a large share of the avoided transactions are ones that may deplete returns). Therefore, this strategy is particularly suitable for periods of high volatility, such as large stock rises and falls: it guides investors to make fewer wrong transactions, helps them make rational judgments and enhance risk management, and thus helps them obtain high returns.

Meanwhile, we give the simulated trading results for different models, as shown in Table 10. The results show that our proposed approach generates the best rate of return after considering interval restrictions. The statistical precision and profitability of our strategy can be explained from two perspectives. First, a variety of variables affect the volatility of stock prices. Our strategy considers sentiments from numerous platforms and extracts information using an advanced TCN to drive price forecasting. Second, interval estimation can provide insurance cover for trading. By not depending solely on point forecast results, the frequency of trading errors is decreased, while the confidence level of the interval is examined to prevent unwanted losses.

Table 9 Simulated transaction results
Table 10 Simulated transaction results for different models

Robustness analysis

To test the robustness of the proposed method, two further experiments are conducted in this subsection. The first is stock index forecasting, a topic of great interest to financial managers and researchers. The second is a trading simulation over a longer time period. Since this study focuses on medium-term stock price fluctuations and simulated trading in the stock market during the pandemic period, the simulation period involved is relatively short, so a set of trading simulations for a longer time period is conducted. Specifically, we supplement the experiments with SSE index forecasting and a Hengrui Pharmaceutical trading simulation, covering 2017 to 2022, a total of five years. We take the first 80% of the data as the training set and the last 20% as the test set.
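The chronological split described above can be sketched as follows. The function name is hypothetical; the key point is that time-series data are split by position rather than shuffled, so that the test set always lies after the training set in time.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered sequence into (training part, test part) without shuffling."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Hypothetical series of ten daily observations.
train, test = chronological_split(list(range(10)))
print(train, test)
```

With five years of data, the last 20% corresponds to roughly one year of back-testing.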

The results of the SSE predictions are shown in Table 11. Conclusions similar to those of the previous experiments can be drawn. First, TCN has a clear performance advantage in the single prediction model comparison. Second, machine learning models overall outperform time-series econometric models, such as SARIMA and ES. Third, when sentiment-rich textual information is added, the overall prediction performance is improved in both cases. Fourth, the proposed model has the lowest prediction error value, that is, the best prediction performance. The forecast results and simulated trading results for Hengrui Pharmaceutical stock are shown in Tables 12 and 13. The proposed model still achieves satisfactory results under a year-long back-testing period, and the performance of the corresponding benchmark models is roughly similar to the previous findings.

Taken together, the two sets of experiments in this section (index forecasting and an extended back-testing period) support the conclusion drawn from the previous experiments: the forecasting methodology and simulated trading strategy proposed in this study, which combine multi-platform textual information, can be effectively used to study stock prices and thus to support financial management decisions.

Table 11 Forecasting error of SSE
Table 12 Forecasting errors of Hengrui Pharmaceutical
Table 13 Simulated transaction results for robustness analysis

Conclusion

Investor sentiment is closely related to stock price volatility. This study proposes a deep learning approach based on sentiment indexes, which includes a framework for both prediction and trading strategies. The method addresses the key problem of price prediction in the stock market by analyzing the sentiment expressions of different groups, guiding the interpretation of stock price fluctuations. Specifically, this paper selects news texts, practitioner comments, public opinion texts, and pandemic reports to depict the sentiment orientations of different groups from macro media and micro individuals’ perspectives. This sentiment information is then integrated into the prediction model to enhance the forecasting accuracy. Using the deep learning model to predict stock prices, this study proposes an innovative trading strategy based on the prediction framework and interval estimation.

In the empirical analysis, this study conducts stock price forecasting and simulated trading experiments on five stocks, with Hengrui Pharmaceutical as a representative case. The forecasting results demonstrate that the proposed approach outperforms other benchmark models, yielding more robust stock price predictions. Furthermore, the proposed prediction method continues to perform well in the robustness tests. This finding indicates that comprehensively integrating sentiment indexes from multiple groups significantly reduces prediction errors and generates more robust prediction results. The simulated trading results based on the prediction demonstrate that the proposed trading strategy with interval constraints improves the return on investment in stocks, such as Hengrui Pharmaceutical, compared to the benchmark strategy, which relies solely on point forecasting. This strategy effectively guides investors to make prudent and accurate trades, achieving greater returns on investment at lower trading costs.

The literature on forecasting stock prices and developing trading and investment models is extensive, as discussed in the Literature review section. In comparison to these studies, this paper introduces an innovative approach by utilizing interval forecasting results as constraints to reduce investment risk in point-based trading strategies. This idea has not been addressed in previous research literature. Through simulated trading, this study demonstrates that return on investment can be improved not only by enhancing forecasting accuracy but also by reducing the number of failed trades through the application of interval constraints. This interval-constrained trading strategy provides a feasible and convenient investment solution to achieve higher returns with lower transaction costs, offering new and instructive avenues for future research.

Discussion and prospects

The proposed approach makes three significant contributions. First, it considers multiple sources of information and various data types. The research dataset includes variables related to the stock market as well as the opinions of practitioners, which are valuable for both forecasting and simulated trading. Second, the study utilizes a deep learning model called TCN for forecasting and achieves outstanding results. This highlights the effectiveness and reliability of deep learning models in forecasting stock prices. Third, the study incorporates interval forecasting and applies the results to simulated trading. This approach helps practitioners to mitigate losses substantially during market fluctuations and plays a vital role in guiding decision-making among managers and practitioners.

The practical significance of this study lies in providing an approach based on deep learning and sentiment analysis to help investors accurately predict stock price fluctuations and make effective investment decisions. By capturing investor sentiment and market expectations, this method can assist investors in making informed buying and selling decisions in the stock market, controlling investment risks, and maximizing return on investment. Furthermore, by combining the forecasting results with interval estimation, investors can better assess the balance between risk and return and formulate rational asset allocation strategies. This research is of great importance to individual investors, asset management companies, and other financial institutions, demonstrating the application potential of deep learning and sentiment analysis in investment decision-making, and providing new ideas and methods for related research and practice.

This research holds managerial and societal significance as it utilizes textual data to capture the impact of social events and investor sentiment on the stock market. By analyzing news, comments, social media, and pandemic-related data, this study provides insights into the dynamics of the stock market during significant events like the COVID-19 pandemic. This understanding contributes to maintaining market stability and supporting investment decision-making. By considering a range of integrated data sources, including sentiment analysis, this research enhances our ability to assess market sentiment and investor behavior. This can enable us to formulate effective risk management strategies, thereby contributing to the overall stability and efficiency of financial markets. Furthermore, by combining interval forecasting and trading strategies, this research provides investors with a framework to navigate market fluctuations and make informed investment decisions. Ultimately, the findings of this study have practical implications for investors, financial institutions, and policymakers pursuing sustainable and profitable investment strategies.

However, the study still has certain limitations. First, the unstructured data used in this study are all text; in the era of big data, images and videos should also be taken into account in the future. Second, as this study obtained its findings while focusing on an emerging economic market, the effectiveness of the proposed approach should be evaluated further in other markets. Third, the volatility of financial markets related to stocks (e.g., gold, crude oil) may also play a role in stock price volatility and should be incorporated effectively into the features. We will investigate these promising issues in future work.

Availability of data and materials

The stock price data that supports the findings of this study are available from the WIND database (https://www.wind.com.cn). The text data are taken from https://news.qq.com/, http://guba.eastmoney.com/, https://yqt.midu.com/, and http://www.nhc.gov.cn/.

Abbreviations

ARCH: Autoregressive conditional heteroskedasticity model
ARIMA: Autoregressive integrated moving average model
CR: Cumulative rate of return
ELM: Extreme learning machine
FCN: Fully-convolutional network
GARCH: Generalized autoregressive conditional heteroscedasticity model
Gi-MLP: Generalized random interval multilayer perceptron method
LSTM: Long short-term memory network
MAE: Mean absolute error
MAPE: Mean absolute percentage error
MLP: Multilayer perceptrons
MSE: Mean square error
MS-VAR: Markov-switching vector autoregressive model
NLP: Natural language processing
RMSE: Root mean square error
SARIMA: Seasonal autoregressive integrated moving average
SPA: Superior predictive ability test
SVR: Support vector regression
TCN: Temporal convolutional network
TVAR: Threshold vector autoregression model
VAR: Vector autoregressive model
WSR: Wilcoxon signed-rank test

Acknowledgements

We thank the editors and the reviewers for their useful feedback that improved this paper.

Funding

This research work was partly supported by the National Natural Science Foundation of China under Grants No. 72171223, No. 71801213, and No. 71988101, and the National Key R&D Program of China under Grant No. 2021ZD0111204.

Author information

Contributions

ML and YW conceived of the presented idea. ML developed the integrated analysis framework and performed the experiment. ML, KY and YW collected and analyzed the data. ML and WL contributed to the interpretation of the results. SW encouraged ML, KY and WL to investigate the mechanism of sentiment analysis and supervised the findings of this work. All authors provided critical feedback and helped shape the research, analysis and manuscript. All authors read and approved the manuscript.

Corresponding author

Correspondence to Yunjie Wei.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Tables 14, 15.

Table 14 Performance of benchmark models with four Chinese stock data (without sentiment index)
Table 15 Performance of the models with four Chinese stock data (with sentiment index)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Li, M., Yang, K., Lin, W. et al. An interval constraint-based trading strategy with social sentiment for the stock market. Financ Innov 10, 56 (2024). https://doi.org/10.1186/s40854-023-00567-2

  • DOI: https://doi.org/10.1186/s40854-023-00567-2

Keywords