Article

FuturesNet: Capturing Patterns of Price Fluctuations in Domestic Futures Trading

by Qingyi Pan 1, Suyu Sun 2, Pei Yang 3 and Jingyi Zhang 4,*
1 Department of Statistics and Data Science, Tsinghua University, Beijing 100190, China
2 PBC School of Finance, Tsinghua University, Beijing 100190, China
3 Department of Computer Technology and Application, Qinghai University, Xining 810016, China
4 School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(22), 4482; https://doi.org/10.3390/electronics13224482
Submission received: 22 October 2024 / Revised: 9 November 2024 / Accepted: 10 November 2024 / Published: 15 November 2024

Abstract

Futures trading analysis plays a pivotal role in the development of macroeconomic policies and corporate strategy planning. High-frequency futures data, typically presented as time series, contain valuable historical patterns. To address challenges such as non-stationarity in modeling futures prices, we propose a novel architecture called FuturesNet, which uses an InceptionTime module to capture the short-term fluctuations between ask and bid orders, as well as a long short-term memory (LSTM) module with skip connections to capture long-term temporal dependencies. We evaluated the performance of FuturesNet on the Futures 50, 300, and 500 datasets from the domestic financial market. The comprehensive experimental results show that FuturesNet outperforms other competitive baselines in most settings. Additionally, we conducted ablation studies to interpret the behaviors of FuturesNet. Our code and the collected futures datasets are publicly released.

1. Introduction

Recently, there has been growing interest in domestic commodity futures markets [1], which generate extensive and complex data patterns as a crucial component of the global economy [2]. Fluctuations in domestic futures markets significantly influence the national economy while providing valuable insights into global markets [3]. Trends in the futures market are influenced by a range of factors, including supply and demand, global economic conditions, geopolitical factors, inflation, and uncertainty [4], among others [5]. Hence, effectively leveraging these data has become one of the most challenging problems in data mining. Since it is essential for economists and investors to accurately capture upward and downward trends to devise effective strategies, powerful machine learning models are needed to predict domestic futures trends and inform buy or sell decisions.
Unlike traditional financial time-series tasks, domestic financial trends are non-stationary and contain inherent noise (as shown in Figure 1), such as potential dark-pool issues, which presents considerable challenges for current machine learning methods. Researchers must closely monitor market dynamics and macroeconomics while considering the influence of futures price volatility on performance. In this regard, many machine learning methods have been developed to capture upward and downward trends in domestic markets. For instance, a series of Markov-like models with stochastic driving terms based on historical states have been developed, such as vector auto-regression (VAR) [6] and the auto-regressive integrated moving average (ARIMA) [7]. Symmetric and asymmetric GARCH models [8] predict futures prices based on explicit assumptions and manually designed features. Mikosch and Stărică [9] proposed a multivariate GARCH model with time-varying conditional correlation structures for financial market prediction. However, these traditional machine learning methods often overlook the non-stationary properties of futures markets, making it difficult to capture the temporal patterns of financial data.
In this regard, we need to employ more powerful data-driven deep learning methods [10,11] for futures trading based on daily financial market data. With the rapid development of super-computing, more innovative methods (i.e., artificial neural networks) have been developed to better capture futures trends. For instance, Saud and Shakya [12] compared the performance of three powerful deep learning models (i.e., recurrent neural networks [13], long short-term memory (LSTM) [14], and gated recurrent units (GRU) [15]) using bank stocks from the Nepal Stock Exchange as a case study. Their empirical results indicate that deep learning models consistently outperform traditional statistical methods. Similarly, Mehtab and Sen [16] applied a deep architecture based on CNN and LSTM modules, yielding promising results. Moreover, Jiang and Liang [17] investigated futures trading strategies using deep reinforcement learning methods, which not only predict price movements but also support more effective investment decisions. Lim et al. [18] applied transformers to predict prices in the financial market, highlighting their robustness in capturing temporal correlations. Xu et al. [19] introduced a deep model combining multiplexed attention mechanisms and linear transformers for predicting financial markets, which reduces the computational complexity of capturing long-term patterns. Olorunnimbe and Viktor [20] ensembled multiple temporal transformers with similar embeddings for long-term predictions, achieving superior performance in modeling non-stationary financial time-series datasets. Additionally, Huo et al. [21] designed an artificial bee colony-based algorithm with deep architectures to predict futures market pricing rules, providing new perspectives on domestic futures trading. Their TCM-ABC-LSTM model optimizes hyperparameters without modifying the underlying deep architecture, whereas we develop a novel architecture called FuturesNet built from several powerful modules.
In this paper, we propose a novel deep architecture called FuturesNet for high-frequency futures trading in domestic markets, which combines the advantages of deep learning models and traditional statistical methods to capture both upward and downward trends in futures data. FuturesNet incorporates the InceptionTime module to extract representative patterns that characterize short-term fluctuations, as well as an LSTM module with skip connections and an auto-regressive module to capture long-term temporal dependencies. Additionally, we conducted ablation studies to verify the effectiveness of each module in FuturesNet and to gain further insight into interpreting the behavior of the proposed architecture. Extensive empirical results verify that FuturesNet consistently outperforms other strong baselines across real-world datasets.
The main contributions of our paper are as follows:
  • We propose FuturesNet, which integrates the InceptionTime module, long short-term memory with skip connections, and a linear auto-regressive module to capture both short-term and long-term temporal dependencies in domestic futures data;
  • Extensive empirical results show that our proposed FuturesNet significantly outperforms other strong baselines in capturing domestic futures trends, and we identify some interesting patterns that may inspire future research;
  • To the best of our knowledge, we are the first to apply deep learning methods to capture domestic futures trend patterns.

2. Related Work

In this section, we briefly review various machine learning methods for financial market analysis, which can be broadly categorized into traditional statistical methods (as shown in Section 2.1) and data-driven deep learning methods (as shown in Section 2.2).

2.1. Futures Trading with Traditional Methods

One of the most popular approaches for futures trading tasks is the auto-regressive integrated moving average (ARIMA) model, commonly used to predict linear patterns in financial stock prices [22]. ARIMA decomposes financial time-series data into auto-regressive (AR) and moving-average (MA) components [23]. The AR component captures relationships between current and historical states, while the MA component captures relationships between current and previous residuals. ARIMA performs well in financial environments that exhibit stationary behavior [24]. However, a major limitation of ARIMA is its linearity assumption, which makes it less effective at capturing the nonlinearity of financial markets. Tay and Cao [25] applied support vector machines (SVMs) to financial time-series prediction, demonstrating their benefits for nonlinear problems. However, SVMs often struggle on large-scale financial datasets, making them less scalable for high-frequency trading or more complex market environments. Generalized auto-regressive conditional heteroskedasticity (GARCH) models are also commonly used to model financial volatility. For instance, Mikosch and Stărică [9] applied the GARCH model to stock market prediction, achieving better performance. However, GARCH models typically assume a specific form of conditional heteroskedasticity, so they perform poorly when handling sudden events in financial markets; moreover, they primarily focus on volatility rather than actual price movements, limiting their ability to capture trend patterns. Additionally, most traditional models struggle to capture the long-term trend patterns of futures markets.
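As a rough illustration of this classical baseline, the following sketch fits an ARIMA model with statsmodels; the synthetic random-walk series and the (2, 1, 1) order are illustrative assumptions, not settings from this paper.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Synthetic random-walk prices standing in for futures data (illustrative only).
rng = np.random.default_rng(0)
prices = 100.0 + np.cumsum(rng.normal(size=500))

# ARIMA(p, d, q): AR order 2, first-order differencing, MA order 1.
fit = ARIMA(prices, order=(2, 1, 1)).fit()
forecast = fit.forecast(steps=10)  # 10-step-ahead linear forecast
```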

2.2. Futures Trading with Deep Learning-Based Methods

In recent years, deep learning models have been widely applied to financial market data due to their ability to handle complex temporal correlations [10]. A key variant of the recurrent neural network (RNN), the long short-term memory (LSTM) network [14], is particularly well suited for time-series forecasting due to its effectiveness in capturing complex nonlinear relationships in large-scale datasets. LSTM networks outperform traditional machine learning methods by using memory cells to extract long-term patterns and dependencies from financial data [26]. Convolutional neural networks (CNNs), commonly used in computer vision, have also been applied to financial analysis by treating time-series data as one-dimensional images. CNNs can identify local fluctuations, such as price spikes, significantly enhancing a model's generalization. Mehtab and Sen [16] proposed a hybrid model combining LSTM and CNN architectures for various financial markets. Chen et al. [27] used CNNs to extract local fluctuations and combined them with LSTM networks to capture long-term dependencies, improving prediction capabilities in financial markets. Xu et al. [19] applied the Transformer architecture to predict domestic stock prices in 2022; their extensive empirical results showed that the Transformer outperformed other neural network-based methods in most settings, as its encoder–decoder architecture and multi-head attention mechanism better capture the fundamental patterns of domestic futures datasets. Meng and Khushi [28] applied deep reinforcement learning to optimize automated trading strategies in the futures market, showing strong performance in dynamic financial markets; however, deep reinforcement learning often requires large amounts of training data, making it challenging to apply in data-scarce scenarios. Although deep learning methods incur higher computational costs than traditional statistical methods, their adaptability makes them effective for modeling domestic financial markets.

3. Preliminaries

In this section, we sequentially introduce the format and pre-processing of domestic futures data. Section 3.1 introduces the format of the domestic futures datasets, while Section 3.2 introduces data pre-processing and formalizes the corresponding upward and downward futures patterns as classification problems.

3.1. Input Data Format

We study the Limit Order Book (LOB) framework for modeling the futures market [29], which consists of two sides: bid (buy) and ask (sell) orders. The sell orders include a price set $\{p^a_{\mathrm{open}}(t), p^a_{\mathrm{close}}(t), p^a_{\mathrm{high}}(t), p^a_{\mathrm{low}}(t)\}$, representing the opening, closing, highest, and lowest futures prices for selling an asset at time step $t$. The buy orders include a price set $\{p^b_{\mathrm{open}}(t), p^b_{\mathrm{close}}(t), p^b_{\mathrm{high}}(t), p^b_{\mathrm{low}}(t)\}$, representing the opening, closing, highest, and lowest futures prices for buying an asset at time step $t$. Additionally, the volume $V^b(t)$ ($V^a(t)$) of the buy (sell) orders is recorded at each time step $t$. These features are collected with a sampling interval $\Delta(t, t+1)$ of one minute. We focus on three datasets, Futures 50, 300, and 500, from the domestic financial market, covering 36 months from 2020 to 2022 and representing characteristic trends in the domestic financial market. The data include all trading activities from 4 January (i.e., after the New Year's holiday) to 30 December of each year. The daily trading hours are 4 h (corresponding to 240 points from 10:00 to 14:00) per day. At the per-minute sampling rate, the test dataset contains over $7 \times 10^4$ observations per year. After imputing missing values, we directly feed the original futures dataset into our proposed FuturesNet.
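For concreteness, the following sketch shows how one trading day of these features might be laid out; the column names and the pandas layout are illustrative assumptions, not the schema of the released files.

```python
import pandas as pd

# One trading day has 240 per-minute rows (10:00-14:00); the column names
# mirror the ten features above, not the released files' actual schema.
columns = [
    "p_open_b", "p_close_b", "p_high_b", "p_low_b",  # bid (buy) prices
    "p_open_a", "p_close_a", "p_high_a", "p_low_a",  # ask (sell) prices
    "V_b", "V_a",                                    # buy/sell volumes
]
day = pd.DataFrame(0.0, index=pd.RangeIndex(240, name="minute"), columns=columns)
```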
In the following sections, we discuss the advantages and disadvantages of various methods applied to the collected futures datasets and provide a comprehensive testing platform to evaluate the performance of each method.

3.2. Futures Trend Labeling

We first create labels representing the directions of upward and downward trends in the domestic market. It is widely acknowledged that feature engineering is crucial in developing machine learning models and conducting data analysis [30]. In this regard, our primary goal is to enhance generalization by transforming and creating meaningful feature labels from the original datasets, which can significantly influence the performance of machine learning methods applied to futures data. To minimize the impact of outliers and extract stable futures trading patterns, we calculate the average prices of bid and ask orders per minute, denoted by $p^b(t)$ and $p^a(t)$ in Equation (1):
$$p^a(t) = \frac{p^a_{\mathrm{open}}(t) + p^a_{\mathrm{close}}(t) + p^a_{\mathrm{high}}(t) + p^a_{\mathrm{low}}(t)}{4}, \qquad p^b(t) = \frac{p^b_{\mathrm{open}}(t) + p^b_{\mathrm{close}}(t) + p^b_{\mathrm{high}}(t) + p^b_{\mathrm{low}}(t)}{4}. \tag{1}$$
To accurately capture the futures price at each time step, we average $p^b(t)$ and $p^a(t)$ to obtain the mid-price $p(t)$ of bid and ask orders:
$$p(t) = \frac{p^a(t) + p^b(t)}{2}. \tag{2}$$
Owing to the high randomness and non-stationarity of financial futures, directly comparing the current $p(t)$ with $p(t+k)$ to determine price changes may introduce noise into the labels. Hence, we apply a smoothing technique [31] to effectively capture fluctuation patterns in the futures data. Specifically, $p^-(t)$ denotes the average of the previous $k$ mid-prices, while $p^+(t)$ represents the average of the next $k$ mid-prices:
$$p^+(t) = \frac{1}{k}\sum_{i=1}^{k} p(t+i), \qquad p^-(t) = \frac{1}{k}\sum_{i=0}^{k} p(t-i), \tag{3}$$
where $p(t)$ denotes the mid-price defined in Equation (2), and $k$ is the smoothing range. We calculate the percentage change in the mid-price, $l_t$, to determine the price direction, as shown in Equation (4) or Equation (5):
$$l_t = \frac{p^+(t) - p(t)}{p(t)}, \tag{4}$$
$$l_t = \frac{p^+(t) - p^-(t)}{p^-(t)}, \tag{5}$$
where both variants identify the direction of price changes at time $t$; we use them as a reference for futures price fluctuations. Labels are then assigned based on a threshold $\alpha$ on the percentage change $l_t$. If $l_t > 2\alpha$ or $l_t < -2\alpha$, we define a large increase (+2) or decrease (−2); if $\alpha < l_t < 2\alpha$ or $-2\alpha < l_t < -\alpha$, we define a moderate increase (+1) or decrease (−1); and if $-\alpha < l_t < \alpha$, we define a stable state (0). We set the threshold to $\alpha = 2 \times 10^{-4}$ based on the common transaction cost in the domestic futures market [32]. A well-chosen threshold categorizes futures price fluctuations into several levels, enhancing label quality.
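The labeling rule above can be summarized in a short function. This is a minimal sketch of Equations (3)-(5); averaging over $k+1$ past points (including $t$) and dropping the boundary steps are our assumptions.

```python
import numpy as np

def label_trends(p, k=10, alpha=2e-4):
    """Five-class trend labeling following Eqs. (3)-(5); boundary steps
    without a full past/future window are dropped."""
    p = np.asarray(p, dtype=float)
    labels = []
    for t in range(k, len(p) - k):
        p_minus = p[t - k : t + 1].mean()     # smoothed past mid-price, p^-(t)
        p_plus = p[t + 1 : t + k + 1].mean()  # smoothed future mid-price, p^+(t)
        l = (p_plus - p_minus) / p_minus      # percentage change, as in Eq. (5)
        if l > 2 * alpha:
            labels.append(2)                  # large increase
        elif l > alpha:
            labels.append(1)                  # moderate increase
        elif l < -2 * alpha:
            labels.append(-2)                 # large decrease
        elif l < -alpha:
            labels.append(-1)                 # moderate decrease
        else:
            labels.append(0)                  # stable
    return np.asarray(labels)
```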
The orange and red lines in Figure 2 represent different threshold ranges, categorizing futures trend shifts into several types. As shown in Figure 2, volatility in the futures market clearly exhibits temporal dependencies, with significant fluctuations during certain periods. This volatility may reflect the market's response to external factors (such as policy changes), motivating dynamic financial analysis.
Furthermore, to ensure numerical stability and enhance the performance of our proposed model, we normalize the raw data (train and test sets) in the pre-processing stage, scaling both to a common range for a more reliable training procedure:
$$X_i = \frac{X_i - X_{\min}}{X_{\max} - X_{\min}}. \tag{6}$$
Therefore, we transform futures trading into a five-class classification problem. The train and test sets $\{X_1, \ldots, X_N\} \subset \mathcal{X}$ represent the domestic futures dataset, where each $X_i$ contains historical records of the corresponding prices and volumes. The input $X \in \mathbb{R}^{T \times C}$ has window length $T$ and feature dimension $C$ (i.e., $C = 10$ in the collected domestic futures dataset), and $X_{i,t}$ denotes the feature vector of the $i$-th sample at time step $t$. The class label of $X_i$ is denoted by $y_i$, which belongs to the output space $\mathcal{Y} = \{-2, -1, 0, 1, 2\}$. Our goal is to learn a mapping function $f_\theta: \mathcal{X} \to \mathcal{Y}$ parameterized by learnable parameters $\theta \in \Theta$, where $\Theta$ represents the parameter space. For the hyperparameter $T$, we apply a sliding window to extract samples from the observed time series and expand the training data; a minimal sketch is given below. Each window contains a fixed number of time steps of historical data. The features of each prediction include the futures prices and the corresponding volumes of the previous $w$ trading steps, used to predict the upward and downward trends. Each row corresponds to a time series, where the orange part indicates the features used by the current training sample (i.e., data from the previous $w$ time steps), and the blue part represents the target (i.e., the trend change). The window gradually slides over the time series, continuously updating the input to generate new training samples and thereby systematically constructing features and labels (as shown in Figure 3). The window size determines the length of historical data: if the window is too short, the model may fail to capture long-term patterns; if it is too long, the training samples may include too much noise, hindering the model's ability to capture trend patterns. We fine-tune the window size to enhance performance.
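A minimal sketch of the normalization and sliding-window construction described above follows; the window length $T = 96$ echoes the setting in Section 5.5, and the epsilon guard against constant features is our addition.

```python
import numpy as np

def make_windows(X, y, T=96):
    """Min-max normalization (Eq. (6)) followed by sliding-window slicing."""
    X = np.asarray(X, dtype=float)
    X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0) + 1e-8)
    inputs = np.stack([X[i : i + T] for i in range(len(X) - T)])  # (N-T, T, C)
    targets = np.asarray(y)[T:]           # trend label at the step after each window
    return inputs, targets
```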
In Figure 4, we visualize the rising and falling trends in futures data, including both sampled and smoothed trend patterns. The smoothed trend, derived from sampling in 10 min intervals, uses the moving-average technique to clarify the overall trend by minimizing local fluctuations. The sampled trend curve reflects more frequent price changes and high-frequency properties, while the 10 min sampling interval filters out some short-term noise. The smoothed trend curve highlights cyclical fluctuations, showing that price trends are not random and exhibit periodic patterns. As shown in Figure 4, these trend changes provide critical insights to better capture trend shifts.

4. Our Method

In this section, we present FuturesNet for predicting trends in the futures price. Section 4.1 delves into the overall framework of FuturesNet, while Section 4.2 introduces the objective function and effective progressive training strategies.

4.1. Overall Framework

To effectively capture both long-term cyclical patterns and short-term fluctuations in futures trends, we propose a deep architecture called FuturesNet (as shown in Figure 5). The model is composed of three main modules: the InceptionTime module, which focuses on capturing short-term fluctuations across various features, and the long short-term memory (LSTM) and auto-regressive modules, which capture long-term temporal dependencies.

4.1.1. InceptionTime Module

The InceptionTime module in FuturesNet is designed to extract short-term fluctuations from futures data. We first apply several $1 \times 1$ convolutions as bottleneck layers, performing sliding operations with stride 1 to integrate relationships across the different dimensions of the futures data at minimal computational cost. Then, we apply filters of different sizes (i.e., $l \in \{10, 20, 40\}$) to further extract features, as shown in Figure 6. To make the model invariant to small noise perturbations, we introduce a parallel MaxPooling branch to capture temporal dependencies. The outputs of the different convolutional layers are then concatenated with the MaxPooling output, forming a more powerful feature vector. By stacking multiple InceptionTime modules, FuturesNet can capture short-term fluctuations across the different features of the datasets.
Formally, for the input $X \in \mathbb{R}^{T \times C}$ extracted by the sliding window, the corresponding $C$ features are organized as follows:
$$\left\{ p^b_{\mathrm{open}}(t), p^b_{\mathrm{close}}(t), p^b_{\mathrm{high}}(t), p^b_{\mathrm{low}}(t), p^a_{\mathrm{open}}(t), p^a_{\mathrm{close}}(t), p^a_{\mathrm{high}}(t), p^a_{\mathrm{low}}(t), V^b(t), V^a(t) \right\}. \tag{7}$$
We treat the collected futures data as a two-dimensional array and use a $W_{1 \times 1}$ convolution to integrate the short-term fluctuations across different dimensions (as shown in Equation (8)), from which the hidden states of the bottleneck layer are extracted:
$$X_{\mathrm{bottleneck}} = W_{1 \times 1} \times X, \quad H_{k_1} = W_{k_1} \times X_{\mathrm{bottleneck}}, \quad H_{k_2} = W_{k_2} \times X_{\mathrm{bottleneck}}, \quad H_{k_3} = W_{k_3} \times X_{\mathrm{bottleneck}}, \tag{8}$$
where MaxPooling, $M = \mathrm{MaxPool}(\cdot)$, is used to further extract key features from the time-series data. By selecting the maximum value within local regions, MaxPooling reduces dimensionality while retaining the most significant features. Then, we concatenate the outputs of the convolution kernels $H_{k_1}$, $H_{k_2}$, $H_{k_3}$ and the pooling output $M$ along the feature dimension to form the final feature representation $Z$:
$$Z = \mathrm{Concat}\left(H_{k_1}, H_{k_2}, H_{k_3}, M; \ \mathrm{axis} = 1\right). \tag{9}$$
As the input data often exhibit nonlinear characteristics and convolutional layers primarily capture linear patterns, it is crucial to introduce activation functions to model nonlinear relationships. Specifically, we employ $\mathrm{ReLU}(\cdot) = \max(0, \cdot)$ to enable FuturesNet to capture nonlinear features. Additionally, we incorporate a residual layer to mitigate gradient vanishing during training. The final output of the convolutional layers has dimension $d \times T$. By integrating the sliding window and convolutional layers, our InceptionTime module effectively captures the complex relationships in domestic futures data, providing powerful feature representations for subsequent predictions in FuturesNet.
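The following PyTorch sketch shows one InceptionTime block consistent with the description above; the channel width (32) and the pooling window are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """One InceptionTime block: 1x1 bottleneck, parallel convolutions with
    filter sizes {10, 20, 40}, and a parallel MaxPooling branch."""

    def __init__(self, in_channels: int, out_channels: int = 32):
        super().__init__()
        # 1x1 bottleneck convolution integrating the C input dimensions
        self.bottleneck = nn.Conv1d(in_channels, out_channels, kernel_size=1)
        # Parallel convolutions with filter sizes {10, 20, 40}, stride 1
        self.convs = nn.ModuleList(
            nn.Conv1d(out_channels, out_channels, k, padding="same")
            for k in (10, 20, 40)
        )
        # Parallel MaxPooling branch for invariance to small perturbations
        self.maxpool = nn.MaxPool1d(kernel_size=3, stride=1, padding=1)
        self.pool_conv = nn.Conv1d(in_channels, out_channels, kernel_size=1)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, C, T)
        z = self.bottleneck(x)
        branches = [conv(z) for conv in self.convs]
        branches.append(self.pool_conv(self.maxpool(x)))
        # Concatenate along the feature dimension (axis 1), as in Eq. (9)
        return self.relu(torch.cat(branches, dim=1))      # (batch, 4*out, T)
```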

4.1.2. Long Short-Term Memory Module

To capture long-term temporal dependencies in futures data, we feed the representations from the InceptionTime module into the long short-term memory (LSTM) module [14]. The LSTM module is specifically designed to capture long-term temporal dependencies in time-series data and is widely applied in temporal modeling [33,34]. Given that futures data exhibit strong temporal dynamics, including both short-term price fluctuations and long-term temporal patterns, we adopt an LSTM with gating units (as shown in Figure 7). The temporal modeling at time step $t$ is
$$h_t = \mathrm{LSTM}(h_{t-1}, x_t), \tag{10}$$
where $x_t$ represents the input at time step $t$, and $h_t$ is the hidden state of the LSTM. Our LSTM module in FuturesNet comprises a forget gate $f_t$, an input gate $i_t$, and an output gate $o_t$. These gates allow FuturesNet to balance short-term market fluctuations with long-term market trends, effectively capturing the temporal structure of futures data. In the futures market, long-term trends (e.g., seasonal trends or economic cycles) significantly affect futures prices. Our LSTM module leverages its internal state to store long-term historical information, while the forget gate selectively discards irrelevant short-term fluctuations. This subtle mechanism reduces the influence of transient market noise on predictions.
As shown in Figure 7, the input $x_t$, the previous hidden state $h_{t-1}$, and the cell state $C_t$ are passed through the forget gate for further processing. The LSTM's recurrent unit, with its nonlinear activation function $\sigma(\cdot)$, is defined as follows:
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right), \quad i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right), \quad o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right), \quad \hat{R}_t = \tanh\left(W_c \cdot [h_{t-1}, x_t] + b_r\right), \quad \sigma(d) = \frac{1}{1 + e^{-d}}, \tag{11}$$
where $f_t$, $i_t$, $o_t$, $\hat{R}_t$, and $h_t$ represent the forget gate, input gate, output gate, candidate cell state, and hidden state, respectively. $W_f$, $W_i$, $W_o$, and $W_c$ are the weights of the corresponding gates; $b_f$, $b_i$, $b_o$, and $b_c$ are their biases; and $s$ denotes the number of hidden units used for the historical states. The update process for the cell state of the LSTM is shown in Equation (12):
$$R_t = f_t \odot R_{t-1} + i_t \odot \hat{R}_t, \qquad h_t = o_t \odot \tanh(R_t), \qquad \tanh(R_t) = \frac{e^{R_t} - e^{-R_t}}{e^{R_t} + e^{-R_t}}. \tag{12}$$
Through the nonlinear tanh activation, the updated cell state $R_t$ is passed through the output gate $o_t$ to generate the new hidden state $h_t$. The output gate ensures that only useful information is carried into the new hidden state. Consequently, a well-designed LSTM can selectively capture critical temporal dependencies in domestic futures data while effectively suppressing inherent noise.
To further strengthen the ability to capture long-term temporal dependencies in futures data, we introduce periodic skip connections to exploit common cyclical patterns in the futures market. According to microeconomic analysis [35], markets typically exhibit cyclical fluctuations, such as a frequent trading cycle of $p = 15$ min:
$$\tilde{h}_t = W_0 h_t + \sum_{i=1}^{p-1} W_i h_{t-i} + b, \qquad \hat{y}_{\mathrm{DL}} = f_\theta(\tilde{h}_t), \tag{13}$$
where $b$ represents the bias. By incorporating these periodic skip connections, the LSTM can fully leverage cyclical temporal patterns, enabling FuturesNet to achieve a better trade-off between long-term temporal dependencies and short-term fluctuations and thereby enhancing the generalization of deep models in futures trading. Our empirical results show that a well-tuned hyperparameter $p$ significantly enhances the performance of FuturesNet. Finally, we apply a fully connected layer to adjust the dimension of the hidden state $\tilde{h}_t$, generating the final output of the hidden layer, $\hat{y}_{\mathrm{DL}}$.
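A compact PyTorch sketch of the LSTM module with periodic skip connections (Equation (13)) follows; implementing the skip weights $W_0, \ldots, W_{p-1}$ as a single linear map over the last $p$ hidden states, and the hidden size of 50, are our assumptions.

```python
import torch
import torch.nn as nn

class SkipLSTM(nn.Module):
    """LSTM followed by periodic skip connections over the last p hidden
    states, as a sketch of Eq. (13)."""

    def __init__(self, in_dim: int, hidden: int = 50, p: int = 15, num_classes: int = 5):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.p = p
        self.skip_proj = nn.Linear(hidden * p, hidden)  # combines h_t .. h_{t-p+1}
        self.head = nn.Linear(hidden, num_classes)      # final fully connected layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (batch, T, in_dim)
        h, _ = self.lstm(x)                                # (batch, T, hidden)
        h_skip = h[:, -self.p:, :].reshape(x.size(0), -1)  # last p hidden states
        return self.head(self.skip_proj(h_skip))           # y_hat_DL
```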

4.1.3. Auto-Regressive Module

Auto-regressive modules are widely used in time-series analysis to capture linear trends, especially when futures data exhibit significant temporal correlations [36]. Generally, any time series can be decomposed into trend components, cyclical components, and residuals [31]. We use a traditional auto-regressive model to capture the linear trend patterns, with weight coefficients $W^{\mathrm{AR}}$ and bias $b^{\mathrm{AR}}$. Specifically, our auto-regressive module is expressed as follows:
$$\hat{y}_{\mathrm{AR}} = \sum_{i=1}^{p} W_i^{\mathrm{AR}} X_{t-i} + b_i^{\mathrm{AR}}, \tag{14}$$
where $p$ is the length determined by the cyclical properties of the futures market, and all dimensions share the same set of linear parameters $\{W_i^{\mathrm{AR}}, b_i^{\mathrm{AR}}\}_{i=1}^{p}$. The overall prediction $\hat{y}$ is then obtained by combining the output of the auto-regressive module, $\hat{y}_{\mathrm{AR}}$, with the prediction of the deep learning modules, $\hat{y}_{\mathrm{DL}}$:
$$\hat{y} = \hat{y}_{\mathrm{AR}} + \hat{y}_{\mathrm{DL}}. \tag{15}$$
Although the deep learning modules described in Section 4.1.1 and Section 4.1.2 are highly effective in capturing nonlinear relationships, directly applying them to financial futures data may cause overfitting during training. By adding the carefully tuned auto-regressive component, FuturesNet achieves better performance across various futures markets. For instance, the linear auto-regressive module primarily drives predictions during stable market periods, while during periods of high volatility, the deep learning modules capture the complex, nonlinear fluctuations of financial markets.
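The auto-regressive head of Equation (14) and the combination in Equation (15) can be sketched as follows; reading the AR input as the last $p$ steps of a single price feature is our assumption.

```python
import torch
import torch.nn as nn

class ARHead(nn.Module):
    """Linear auto-regressive head, a sketch of Eq. (14); the parameters
    are shared across feature dimensions."""

    def __init__(self, p: int = 15, num_classes: int = 5):
        super().__init__()
        self.p = p
        self.linear = nn.Linear(p, num_classes)  # W^AR and b^AR

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, T, C)
        # Use the last p steps of the first feature (e.g., a price channel)
        return self.linear(x[:, -self.p:, 0])             # y_hat_AR

# The overall prediction combines both heads, as in Eq. (15):
# y_hat = ar_head(x) + deep_head(x)
```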

4.2. Objective Function

To optimize the model parameters during training, we use cross-entropy loss as the primary accuracy objective [37]. Additionally, we introduce a trading cost regularization term $\mathcal{L}_{\mathrm{cost}}$ to discourage excessive trading frequency. The overall objective function combines the cross-entropy loss and the transaction cost term:
$$\mathcal{L}(\mathcal{D}) = \sum_{y_t \in \mathcal{D}} \left[ \lambda_1 \mathcal{L}_{\mathrm{ce}}(\hat{y}_t, y_t) + \lambda_2 \mathcal{L}_{\mathrm{cost}}(\hat{y}_t, y_t) \right], \tag{16}$$
where the cross-entropy loss $\mathcal{L}_{\mathrm{ce}}$ measures prediction accuracy, and the transaction cost term $\mathcal{L}_{\mathrm{cost}}$ regularizes the trading strategy by penalizing frequent trades. This regularization is crucial for ensuring that FuturesNet minimizes trading costs while maintaining prediction accuracy, thus increasing profit margins. The weights $\lambda_1$ and $\lambda_2$ balance predictive performance against the impact of trading costs during training:
$$\mathcal{L}_{\mathrm{cost}} = |\hat{y}_t - \hat{y}_{t-1}| \times \alpha, \tag{17}$$
where the trading cost $\alpha$, typically set to $\alpha = 2 \times 10^{-4}$ based on empirical studies in microeconomics [32], serves as an extra constraint on frequent trading behaviors. In contrast, TCM-ABC-LSTM [21] used the artificial bee colony algorithm to tune hyperparameters for faster convergence, without carefully designing LSTM-based architectures for futures data; we instead combine the InceptionTime and LSTM modules to capture upward and downward trend patterns, requiring only a few manually tuned hyperparameters. As shown in Section 5.3, FuturesNet achieves superior performance compared to other competitive baselines on various domestic futures data.
Inspired by the dynamic weighting mechanism in [38], FuturesNet adjusts the weights $\lambda_1$ and $\lambda_2$ throughout training to balance accuracy and transaction costs at different stages. During the early stages of training, the weight on $\mathcal{L}_{\mathrm{cost}}$ is kept relatively small, allowing FuturesNet to focus on learning essential patterns; $\lambda_2$ then gradually increases, shifting the focus toward reducing trading frequency and minimizing costs. The dynamic weight-adjustment strategy can be expressed as
$$\lambda_1(m) = \left(1 - \frac{m}{M+1}\right) C_0, \qquad \lambda_2(m) = 1 - \lambda_1(m), \tag{18}$$
where $m$ denotes the current epoch, $M$ is the maximum number of epochs, and $C_0$ is a trade-off factor balancing the two losses. As $m$ increases, $\lambda_1$ gradually decreases and $\lambda_2$ gradually increases, thereby enhancing generalization by emphasizing cost reduction in the later stages. To optimize the parameters of FuturesNet, we use Adam [39] to update $\theta$ until convergence:
$$\theta_l = \theta_{l-1} - \gamma \nabla_\theta \mathcal{L}(\mathcal{D}), \tag{19}$$
where $\gamma$ is the learning rate controlling the step size. The empirical results show that FuturesNet converges efficiently on real-world datasets, as described in Section 5.
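Putting Equations (16)-(18) together, a sketch of the objective follows; applying the cost term to discrete class predictions and the exact decay of $\lambda_1$ are our reading of the text, not confirmed implementation details.

```python
import torch
import torch.nn.functional as F

def futures_loss(logits, targets, pred_prev, pred_curr, m, M, C0=1.0, alpha=2e-4):
    """Overall objective of Eqs. (16)-(18), as a sketch."""
    lam1 = (1.0 - m / (M + 1)) * C0            # accuracy weight, decays with epoch m
    lam2 = 1.0 - lam1                          # cost weight, grows correspondingly
    ce = F.cross_entropy(logits, targets)      # L_ce over the five classes
    turnover = (pred_curr - pred_prev).abs().float().mean()
    return lam1 * ce + lam2 * alpha * turnover # lam1*L_ce + lam2*L_cost
```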

5. Experiments

In this section, we present extensive experiments conducted to verify the effectiveness of our proposed FuturesNet for domestic futures predictions. In Section 5.1, we introduce the experimental setup and evaluation metrics. In Section 5.2, we analyze the properties of various domestic futures markets. Section 5.3 includes performance comparisons and further analysis of FuturesNet. Finally, Section 5.4 and Section 5.5 provide hyperparameter analysis and ablation studies to verify the contributions of each component within FuturesNet.

5.1. Experimental Settings

5.1.1. Evaluation Metrics

To evaluate the performance of FuturesNet, we used two widely recognized metrics: accuracy and the Sharpe ratio. Accuracy, commonly used in the literature on time-series classification and futures trading [40,41], measures the proportion of correct predictions, as shown in Equation (20):
$$\mathrm{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total predictions}} \times 100\%. \tag{20}$$
The second metric is the Sharpe ratio (S-value) [42], which helps investors assess the return of an investment relative to its risk. By computing the ratio of an investment's excess return over the risk-free rate to its standard deviation, the Sharpe ratio evaluates investment performance in terms of risk-adjusted returns. Specifically, the return rate $s_t = p_{t+1} \times y_t$ at each time step $t$ is used as follows:
$$S\text{-value} = \frac{\frac{1}{n_{\mathrm{test}}} \sum_{t=1}^{n_{\mathrm{test}}} \left( s_t y_t - |\hat{y}_t - \hat{y}_{t-1}| \times \alpha \right)}{\sqrt{\sigma^2 \times \beta}}, \tag{21}$$
where $n_{\mathrm{test}}$ is the test sample size, $\alpha = 2 \times 10^{-4}$ represents the transaction cost based on empirical findings in microeconomics [32], $\beta = 240$ corresponds to the number of trading minutes per day, and $\sigma^2$ is the variance of returns. This metric provides a comprehensive evaluation of investment performance relative to volatility, with higher S-values and accuracy reflecting more accurate predictions. Together, these metrics indicate whether a model can adapt to market volatility while capturing long-term patterns.
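The two metrics can be computed as follows; the placement of $\sigma^2$ and $\beta$ in the S-value follows our reconstruction of Equation (21) and should be treated as an assumption.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Eq. (20): percentage of correct predictions."""
    return 100.0 * np.mean(np.asarray(y_true) == np.asarray(y_pred))

def s_value(s, y, y_pred, alpha=2e-4, beta=240):
    """Eq. (21) under our reconstruction: mean cost-adjusted return scaled
    by the return variance and the 240 trading minutes per day."""
    s, y, y_pred = map(np.asarray, (s, y, y_pred))
    turnover = np.abs(np.diff(y_pred, prepend=y_pred[0]))  # |y_hat_t - y_hat_{t-1}|
    pnl = s * y - turnover * alpha                         # per-step net return
    return pnl.mean() / np.sqrt(pnl.var() * beta)
```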

5.1.2. Hyperparameter Settings

In this experiment, we split the data into training, validation, and test sets with a 60%/20%/20% ratio. The hyperparameters were tuned on the validation sets to optimize performance. Specifically, the number of hidden units in the LSTM and convolutional layers was set to 50, and the fully connected layer had 24 units. We used Adam [39] to optimize the FuturesNet parameters, with a batch size of 64 and a learning rate of 0.2. To avoid overfitting, we applied a dropout rate of 0.2 [43] and a weight decay rate of 0.001. All experiments were conducted on a computing platform with eight RTX 2080Ti GPUs to reduce computational time.
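For reference, a hypothetical optimizer setup mirroring these settings might look as follows; the stand-in model is a placeholder, not FuturesNet itself (which is sketched in Section 4).

```python
import torch
import torch.nn as nn

# Hypothetical stand-in model mirroring the listed sizes (50 hidden units,
# a 24-unit fully connected layer, dropout 0.2, five output classes).
model = nn.Sequential(
    nn.Linear(10, 50), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(50, 24), nn.ReLU(),
    nn.Linear(24, 5),
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.2, weight_decay=1e-3)
```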

5.1.3. Statistics of High-Frequency Futures Data

As described in Section 3, we evaluated FuturesNet’s performance on three datasets: Futures 50, Futures 300, and Futures 500, representing the various real-world domestic futures markets. In this section, we analyze the statistical properties of these datasets, including the means, medians, and standard deviations of key economic variables in Table 1.
For Futures 50, the mean values of the opening price, closing price, highest price, and lowest price are all around 3116 to 3118, with a standard deviation of approximately 270, indicating relatively stable price fluctuations. Futures 300 exhibits higher prices, with mean values ranging from 4425 to 4428, with a larger standard deviation of 493. Futures 500 has mean prices between 5921 and 5925, with a standard deviation of 536. These observations indicate that Futures 500 exhibits higher prices and more significant fluctuations, while Futures 50 is more stable.

5.2. Futures Data Analysis

5.2.1. Visualization of Domestic Futures Data

In this section, we visualize key features of several futures in Figure 8. It is evident that price fluctuations on the sell side are more dramatic than on the buy side across the futures datasets, driven by external factors such as market events or global policy changes. Buyer prices remain more stable in most periods, indicating that buyers prefer steady pricing to mitigate the risks associated with large price swings. Moreover, significant fluctuations are observed during specific periods (e.g., between 300 and 600 min and after 800 min), particularly in sellers' price adjustments, which are often closely related to major events or economic factors. Figure 8a shows that Futures 50 exhibits fluctuations between 200 and 500 min; Figure 8b shows that Futures 300 also exhibits fluctuations between 200 and 500 min; and Figure 8c shows that Futures 500 exhibits significant fluctuations within [600, 800] min. During volatile periods, sellers tend to adjust their prices earlier than buyers, suggesting that sellers adopt short-term trading strategies while buyers pursue longer-term investment strategies. By examining these price fluctuations, we can gain valuable insights into the behaviors of both buyers and sellers, enabling the development of more effective trading strategies.

5.2.2. Futures Price Spread

We analyzed the volatility of Futures 50, 300, and 500 by visualizing the price spread at each time step (i.e., the differences between the highest and lowest prices, as well as the opening and closing prices). This approach enables us to discover overall trends within the futures data and identify potential anomalies. To further reduce noise within the raw data, we sampled the price spread between buyers and sellers every 10 min.
As shown in Figure 9, the open–close price spread between buyers and sellers exhibits synchronous fluctuations with periodic patterns, and substantial changes occur only at specific time steps. Conversely, the price spread between the highest and lowest prices per minute in the right subfigure tends to be more volatile than the open–close price spread. For instance, significant price fluctuations occur around 20 min after market opening in Futures 50 and 300 and between 80 and 90 min in Futures 500 due to some high-frequency trading activities during these periods.
When the open–close price spread is positive (i.e., the opening price exceeds the closing price), the futures market displays a downward trend, indicating that sellers are more dominant and investors may hold a pessimistic outlook on price trends. Conversely, a negative open–close spread, where the closing price is higher than the opening price, indicates that buyers are more sensitive to market changes. These observations show that it is crucial to evaluate short-term financial risks, as large price spreads may indicate significant drawdown risks for long-term investments [44]. Our empirical results also show that futures markets are influenced by various factors, including external economic events and internal policy changes, especially under economic policy uncertainty.

5.2.3. Futures Data Trend Patterns

We analyze the trends in the changes in trading volume over the first 2000 time steps in different futures datasets. Each dataset exhibits significant trading volume peaks at different time steps, indicating large-scale activities in the financial market potentially driven by external factors, such as social policy adjustments. As shown in Figure 10, Futures 500 and 300 exhibit more frequent and higher trading volume peaks, particularly around 400 and 1600 min. In contrast, Futures 50 shows lower trading volumes and reduced volatility, likely reflecting decreased market participation and trading activities. These findings suggest that traders should focus on such abnormal events.
Then, we visualize the price trends for Futures 50, 300, and 500. Using Equation (1), we calculated the averaged opening, closing, highest, and lowest prices per minute, together with trend labels. The colors in the legend represent different magnitudes of price change, as defined in Equation (4) or Equation (5). Futures 50 exhibits significant price fluctuations only between steps 300 and 500 and steps 1250 and 1500. Its overall price movements remain relatively stable, making Futures 50 suitable for investors with lower risk tolerance; these stable periods also offer conservative investors opportunities to exit the market and avoid volatility risks. In contrast, Futures 500, with its high volatility, is most suitable for short-term traders or investors seeking quick trading opportunities from price changes, whereas long-term investors should avoid such highly volatile futures.
As shown in Figure 11, various futures exhibit significant price fluctuations or stable trends during different periods. Futures 50 is suitable for conservative investors, as it shows small price fluctuations in most periods. Futures 300 with moderate volatility is suitable for investors who are seeking moderate returns and can handle medium risk. Futures 500, with the highest volatility, is suitable for short-term traders and speculators with a high risk tolerance. During high market volatility periods, traders may find opportunities for quick profits, while stable periods provide long-term investors with better opportunities to hold positions. The empirical results offer valuable insights for investors to identify high-frequency trading strategies in certain periods.
We further analyze the distributions of futures across labels ( 2 , 1 , 0 , 1 , 2 ) using bar charts and pie charts in Figure 12a,b. Figure 12b shows that data labeled as −2 and 2 account for the largest proportions, with both exceeding 30%. The smaller fluctuations (−1 and 1) represent less than 10% across all datasets. Additionally, stable trends (0) make up a large percentage in Futures 50, indicating that certain market conditions tend to be stable. These observations highlight the strong volatility in futures data, characterized by significant price fluctuations. When designing trading strategies, it is crucial to focus on analyzing sharp movements in price to capture key market dynamics.

5.3. Main Results

In this section, we compare the performance of our proposed FuturesNet with other competitive baselines across different futures datasets (i.e., Futures 50, Futures 300, and Futures 500) over three consecutive years (2020, 2021, and 2022). As shown in Table 2 and Figure 13, FuturesNet consistently outperforms the strong baselines across all datasets and years, especially on Futures 50 and Futures 300. These results highlight the effectiveness of capturing both long-term and short-term trend dependencies for futures trend prediction. Among the other competitive baselines, LSTNet shows lower accuracy, likely due to its large number of parameters. GRU is the second-best model, showing relatively high accuracy. While CNN and Transformer perform well on Futures 50 and Futures 500 for 2021, they perform relatively poorly in other years.
Figure 14 shows that FuturesNet exhibits remarkable stability across futures datasets and years, in contrast to the Transformer and LSTNet. GRU achieves good performance, albeit with more fluctuation than FuturesNet, and remains the second-best model. Based on the above analysis, FuturesNet consistently achieves superior performance across datasets and demonstrates high reliability in risk-adjusted returns at different time steps. The GRU and CNN models perform relatively well in specific years but struggle in others (e.g., with higher MAE), suggesting that they may be sensitive to certain market conditions. LSTNet and the Transformer have difficulty fitting relatively sparse futures datasets, limiting their applicability to other settings.
To further analyze the behaviors of our proposed FuturesNet, we combined the buyer prices and volumes from Futures 50 and 300. Then, we used gradient-based CAM [45] to analyze the importance of the different features. Figure 15 shows that the feature importance evolves over time, with temporal dependencies influencing different features to varying degrees. Both buyer prices and volumes were identified as critical features, particularly during sharp rising and falling trends in futures. These observations align with established economic principles [46]. For Futures 50 and Futures 300, the buyer features were identified as highly significant during this period, showing a strong influence at adjacent time steps.

5.4. Hyperparameter Analysis

In this section, we explore whether the sample size of the training set significantly influences accuracy and the Sharpe ratio (S-value). Figure 16 shows that the performance of FuturesNet improves in all three futures datasets, with both accuracy and the S-value increasing before stabilizing. FuturesNet’s performance is limited when trained on smaller training datasets but steadily improves as the training set size grows to approximately 6 to 7 months. However, the performance curve does not follow a strict monotonic pattern, as larger datasets may introduce complexity during training. The performance curves across Futures 50, 300, and 500 datasets are similar.
Figure 17 shows the performance of FuturesNet on Futures 50 across different years (2019, 2020, and 2021). The performance curves across multiple years exhibit similar patterns, stabilizing or fluctuating slightly as the training set size increases from 3 to 7 months. The accuracy in 2020 is marginally higher than in other years, potentially due to external factors. Overall, the performance curves are consistent across different settings, with slight variations in some months.
To analyze the stability of FuturesNet, we evaluated our method using test sets of varying sizes. Specifically, we tested FuturesNet on 6 months of futures data, as shown in Figure 18. The results show that the performance curves (i.e., S-value and accuracy) decrease monotonically as the test set size increases from 1 month to 5 months. As the size of the test set grows, the difficulty in capturing patterns within price fluctuations grows, which negatively affects FuturesNet’s performance. Additionally, Figure 18 indicates that the pre-trained model performs well on shorter test datasets (1–2 months). The findings suggest that we should use around 6 months of futures data for training, which ensures sufficient data while mitigating the challenges with larger time spans. To further enhance FuturesNet’s performance, future work should focus on developing more powerful deep architectures capable of handling the complexities of futures trading scenarios and enhancing generalization across diverse datasets.

5.5. Ablation Studies

We conducted several ablation studies to evaluate the impact of the regularization terms, including the trading cost term $\mathcal{L}_{\mathrm{cost}}$, the cross-entropy loss $\mathcal{L}_{\mathrm{ce}}$, and the dynamic weighting strategy, on the overall objective in Equation (16). Specifically, we compared the following variants of FuturesNet:
  • w/o c: FuturesNet trained by minimizing only the cross-entropy loss $\mathcal{L}_{\mathrm{ce}}$ (i.e., without the trading cost term $\mathcal{L}_{\mathrm{cost}}$ in Equation (17));
  • w/ c: FuturesNet trained by minimizing both the cross-entropy loss $\mathcal{L}_{\mathrm{ce}}$ and the trading cost term $\mathcal{L}_{\mathrm{cost}}$ in Equation (17);
  • w/ dc: FuturesNet trained with the cross-entropy loss $\mathcal{L}_{\mathrm{ce}}$ and trading cost $\mathcal{L}_{\mathrm{cost}}$ in Equation (16) using the dynamic weighting mechanism $\{\lambda_1, \lambda_2\}$ in Equation (18).
As shown in Figure 19, w/ dc consistently outperforms both the w/ c and w/o c variants, highlighting the importance of incorporating trading costs as a regularization term. The results also verify that $\mathcal{L}_{\mathrm{ce}}$, the regularization term $\mathcal{L}_{\mathrm{cost}}$, and the dynamic weighting mechanism in Equation (16) all contribute significantly to predictive performance (i.e., accuracy and S-value).
In addition, we evaluated the effect of historical length on predictive performance by varying the window size (i.e., from 16 to 144). As shown in Figure 20, FuturesNet achieves the best performance with a window size of 96, while larger window sizes (i.e., 128 or 144) do not improve performance. This observation suggests that increasing the window size does not always lead to performance gains, likely because longer histories are more complex and harder to optimize. Moreover, w/ c (i.e., FuturesNet trained with $\mathcal{L}_{\mathrm{ce}}$ and $\mathcal{L}_{\mathrm{cost}}$) consistently outperformed w/o c (i.e., without the trading cost term $\mathcal{L}_{\mathrm{cost}}$). Transaction costs are a key factor in modeling real financial markets, indicating that these extra costs require explicit consideration during training. Based on these findings, we recommend that researchers experiment with different window sizes and select an appropriate value carefully to avoid overfitting.

6. Conclusions

In this paper, we proposed an efficient deep learning model called FuturesNet for domestic futures trading. By integrating an InceptionTime module, a long short-term memory module, an auto-regressive module, and progressive training strategies, FuturesNet effectively captures patterns in the domestic futures market to identify upward and downward trends. In addition, we introduced regularization terms that enhance generalization by penalizing extra transaction costs. Comprehensive empirical results on real-world datasets confirm the effectiveness of our method, and we hope that our findings contribute to future theoretical research in this area. FuturesNet has some inherent limitations. Due to the high complexity of domestic futures data, our method may be prone to overfitting, and further research should explore more effective regularization methods. Additionally, FuturesNet's performance may be constrained when applied to other financial datasets (e.g., international financial markets), requiring targeted improvements based on the characteristics of different futures datasets.
Moreover, we observe a growing interest in the field of deep learning from the community of futures trading. Through an in-depth discussion of the potential challenges of futures trading, we believe this work can facilitate further exploration in this field and its intersection with deep learning-based methods.

Author Contributions

Conceptualization, S.S.; Methodology, Q.P. and P.Y.; Formal analysis, Q.P.; Writing—original draft, Q.P.; Writing—review & editing, Q.P., S.S., P.Y. and J.Z.; Visualization, Q.P.; Supervision, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 12301381), and the National Key R&D Program of China (No. 2021YFA1001300).

Data Availability Statement

The datasets are available at https://github.com/pqy000/FuturesNet.git.

Acknowledgments

We appreciate Yalin Hu for her invaluable support in providing the laboratory environment and computational resources essential for Qingyi Pan’s work, and Xiaowen Liu for providing futures datasets. Also, we are grateful to Zhelin Han for his assistance in facilitating Qingyi Pan’s daily activities.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wang, J.; Sun, T.; Liu, B.; Cao, Y.; Wang, D. Financial markets prediction with deep learning. In Proceedings of the 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, USA, 17–20 December 2018; pp. 97–104. [Google Scholar]
  2. Nölke, A.; Ten Brink, T.; Claar, S.; May, C. Domestic structures, foreign economic policies and global economic order: Implications from the rise of large emerging economies. Eur. J. Int. Relat. 2015, 21, 538–567. [Google Scholar] [CrossRef]
  3. Tu, Z.; Song, M.; Zhang, L. Emerging impact of Chinese commodity futures market on domestic and global economy. China World Econ. 2013, 21, 79–99. [Google Scholar] [CrossRef]
  4. Pan, Q.; Yang, P.; Zhang, J. BayesTSF: Measuring Uncertainty Estimation in Industrial Time Series Forecasting from a Bayesian Perspective. In Proceedings of the International Conference on Intelligent Computing; Springer: Tianjin, China, 2024; pp. 81–93. [Google Scholar]
  5. Peck, A.E. The economic role of traditional commodity futures markets. In Futures Markets: Their Economic Role; American Enterprise Institute for Public Policy Research: Washington, DC, USA, 1985; pp. 1–81. [Google Scholar]
  6. Stock, J.H.; Watson, M.W. Vector autoregressions. J. Econ. Perspect. 2001, 15, 101–115. [Google Scholar] [CrossRef]
  7. Shumway, R.H.; Stoffer, D.S.; Shumway, R.H.; Stoffer, D.S. ARIMA models. In Time Series Analysis and Its Applications: With R Examples; Springer: Cham, Switzerland, 2017; pp. 75–163. [Google Scholar]
  8. Alberg, D.; Shalit, H.; Yosef, R. Estimating stock market volatility using asymmetric GARCH models. Appl. Financ. Econ. 2008, 18, 1201–1208. [Google Scholar] [CrossRef]
  9. Mikosch, T.; Stărică, C. Changes of structure in financial time series and the GARCH model. REVSTAT-Stat. J. 2004, 2, 41–73. [Google Scholar]
  10. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  11. Pan, Q.; Hu, W.; Chen, N. Two Birds with One Stone: Series Saliency for Accurate and Interpretable Multivariate Time Series Forecasting. In Proceedings of the IJCAI, Montreal, QC, Canada, 19–27 August 2021; pp. 2884–2891. [Google Scholar]
  12. Saud, A.S.; Shakya, S. Analysis of l2 regularization hyper parameter for stock price prediction. J. Inst. Sci. Technol. 2021, 26, 83–88. [Google Scholar] [CrossRef]
  13. Grossberg, S. Recurrent neural networks. Scholarpedia 2013, 8, 1888. [Google Scholar] [CrossRef]
  14. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  15. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555. [Google Scholar]
  16. Mehtab, S.; Sen, J. Analysis and forecasting of financial time series using CNN and LSTM-based deep learning models. In Proceedings of the Advances in Distributed Computing and Machine Learning: Proceedings of ICADCML 2021; Springer: Singapore, 2022; pp. 405–423. [Google Scholar]
17. Jiang, Z.; Liang, J. Cryptocurrency portfolio management with deep reinforcement learning. In Proceedings of the 2017 Intelligent Systems Conference (IntelliSys), London, UK, 7–8 September 2017; pp. 905–913.
18. Lim, B.; Arık, S.Ö.; Loeff, N.; Pfister, T. Temporal fusion transformers for interpretable multi-horizon time series forecasting. Int. J. Forecast. 2021, 37, 1748–1764.
19. Xu, C.; Li, J.; Feng, B.; Lu, B. A financial time-series prediction model based on multiplex attention and linear transformer structure. Appl. Sci. 2023, 13, 5175.
20. Olorunnimbe, K.; Viktor, H. Ensemble of temporal Transformers for financial time series. J. Intell. Inf. Syst. 2024, 62, 1087–1111.
21. Huo, L.; Xie, Y.; Li, J. An Innovative Deep Learning Futures Price Prediction Method with Fast and Strong Generalization and High-Accuracy Research. Appl. Sci. 2024, 14, 5602.
22. Zhang, G.P. Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 2003, 50, 159–175.
23. Tsay, R.S. Analysis of Financial Time Series; John Wiley & Sons: Hoboken, NJ, USA, 2005.
24. Fama, E.F.; French, K.R. Permanent and temporary components of stock prices. J. Political Econ. 1988, 96, 246–273.
25. Tay, F.E.; Cao, L. Application of support vector machines in financial time series forecasting. Omega 2001, 29, 309–317.
26. Nelson, D.M.; Pereira, A.C.; De Oliveira, R.A. Stock market’s price movement prediction with LSTM neural networks. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 1419–1426.
27. Chen, J.F.; Chen, W.L.; Huang, C.P.; Huang, S.H.; Chen, A.P. Financial time-series data analysis using deep convolutional neural networks. In Proceedings of the 2016 7th International Conference on Cloud Computing and Big Data (CCBD), Macau, China, 16–18 November 2016; pp. 87–92.
28. Meng, T.L.; Khushi, M. Reinforcement learning in financial markets. Data 2019, 4, 110.
29. Sirignano, J.A. Deep learning for limit order books. Quant. Financ. 2019, 19, 549–570.
30. Turner, C.R.; Fuggetta, A.; Lavazza, L.; Wolf, A.L. A conceptual basis for feature engineering. J. Syst. Softw. 1999, 49, 3–15.
31. Hamilton, J.D. Time Series Analysis; Princeton University Press: Princeton, NJ, USA, 2020.
32. Besanko, D.; Braeutigam, R. Microeconomics; John Wiley & Sons: Hoboken, NJ, USA, 2020.
33. Smagulova, K.; James, A.P. A survey on LSTM memristive neural network architectures and applications. Eur. Phys. J. Spec. Top. 2019, 228, 2313–2324.
34. Le, X.H.; Ho, H.V.; Lee, G.; Jung, S. Application of long short-term memory (LSTM) neural network for flood forecasting. Water 2019, 11, 1387.
35. Cowell, F.A. Microeconomics: Principles and Analysis; Oxford University Press: Oxford, UK, 2018.
36. Zhang, M.Y.; Russell, J.R.; Tsay, R.S. A nonlinear autoregressive conditional duration model with applications to financial transaction data. J. Econom. 2001, 104, 179–207.
37. De Boer, P.T.; Kroese, D.P.; Mannor, S.; Rubinstein, R.Y. A tutorial on the cross-entropy method. Ann. Oper. Res. 2005, 134, 19–67.
38. Pan, Q.; Guo, N.; Qingge, L.; Zhang, J.; Yang, P. PMT-IQA: Progressive multi-task learning for blind image quality assessment. In Proceedings of the Pacific Rim International Conference on Artificial Intelligence; Springer: Singapore, 2023; pp. 153–164.
39. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
40. Sun, T.; Wang, J.; Ni, J.; Cao, Y.; Liu, B. Predicting futures market movement using deep neural networks. In Proceedings of the 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA), Boca Raton, FL, USA, 16–19 December 2019; pp. 118–125.
41. Zhang, Z.; Zohren, S.; Roberts, S. DeepLOB: Deep convolutional neural networks for limit order books. IEEE Trans. Signal Process. 2019, 67, 3001–3012.
42. Sharpe, W.F. The Sharpe ratio. In Streetwise: The Best of the Journal of Portfolio Management; Princeton University Press: Princeton, NJ, USA, 1998; pp. 169–185.
43. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
44. Geboers, H.; Depaire, B.; Annaert, J. A review on drawdown risk measures and their implications for risk management. J. Econ. Surv. 2023, 37, 865–889.
45. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 618–626.
46. Admati, A.R.; Pfleiderer, P. Selling and trading on information in financial markets. Am. Econ. Rev. 1988, 78, 96–103.
Figure 1. The domestic futures dataset shows significant price fluctuations between buyers and sellers at various time intervals.
Figure 2. Visualization of the futures markets with trend labels {−2, −1, 0, 1, 2}, where the orange and red lines mark the upward and downward trends.
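For concreteness, a five-class labeling such as the one in Figure 2 can be produced by thresholding the forward return. The sketch below is only an illustration of this idea; the horizon and the thresholds t1 and t2 are assumptions, not the paper's exact labeling rule.

```python
import numpy as np

def trend_labels(close, horizon=10, t1=0.0005, t2=0.002):
    """Assign labels in {-2, -1, 0, 1, 2} by thresholding the forward return.

    `horizon`, `t1`, and `t2` are illustrative choices, not the paper's rule.
    """
    close = np.asarray(close, dtype=float)
    # Forward return over `horizon` steps; the tail has no future price.
    ret = close[horizon:] / close[:-horizon] - 1.0
    labels = np.zeros_like(ret, dtype=int)
    labels[ret > t1] = 1     # mild upward move
    labels[ret > t2] = 2     # strong upward move
    labels[ret < -t1] = -1   # mild downward move
    labels[ret < -t2] = -2   # strong downward move
    return labels
```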
Figure 3. A diagram of the sliding window. Orange marks the features used by the current training sample (i.e., historical data from the previous w steps), and blue marks the target label (i.e., the trend pattern at the current time step).
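A minimal sketch of the sliding-window construction in Figure 3, assuming the features and labels are NumPy arrays; the names and shapes here are illustrative.

```python
import numpy as np

def sliding_windows(features, labels, w):
    """Build (X, y) pairs: each X[i] stacks the previous `w` feature rows,
    and y[i] is the trend label at the step immediately after that window."""
    X = np.stack([features[i - w:i] for i in range(w, len(features))])
    y = np.asarray(labels[w:len(features)])
    return X, y  # X: (n_samples, w, n_features), y: (n_samples,)
```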
Figure 4. The upward and downward trend patterns. We sampled every 10 min to reduce noise from overly dense points and to better capture overall trend changes.
Figure 5. The architecture of FuturesNet for predicting futures trends is composed of the InceptionTime module, the long short-term memory module, and the auto-regressive module.
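A condensed PyTorch sketch of the three-branch layout in Figure 5 follows. The kernel widths, hidden sizes, and the way the branches are combined are assumptions made for illustration; the released implementation may differ.

```python
import torch
import torch.nn as nn

class FuturesNetSketch(nn.Module):
    """Illustrative reconstruction of Figure 5: parallel multi-scale
    convolutions (InceptionTime-style), an LSTM with a skip connection,
    and a linear auto-regressive branch. All sizes are assumptions."""

    def __init__(self, n_features, n_classes=5, hidden=64):
        super().__init__()
        # InceptionTime-style branch: parallel convolutions over time with
        # different kernel sizes capture short-term ask/bid fluctuations.
        self.convs = nn.ModuleList(
            nn.Conv1d(n_features, hidden, k, padding=k // 2) for k in (3, 5, 9)
        )
        self.lstm = nn.LSTM(3 * hidden, hidden, batch_first=True)
        self.skip = nn.Linear(3 * hidden, hidden)   # skip connection around the LSTM
        self.ar = nn.Linear(n_features, hidden)     # linear auto-regressive branch
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                  # x: (batch, window, n_features)
        c = x.transpose(1, 2)              # -> (batch, n_features, window)
        c = torch.cat([conv(c) for conv in self.convs], dim=1)
        c = c.transpose(1, 2)              # -> (batch, window, 3 * hidden)
        out, _ = self.lstm(c)
        h = out[:, -1] + self.skip(c[:, -1])   # LSTM output plus skip connection
        h = h + self.ar(x[:, -1])              # add the auto-regressive term
        return self.head(h)                    # logits over the five trend classes
```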
Figure 6. The multi-scale receptive fields of a multi-layer convolutional neural network.
Figure 7. LSTM memory cell structure, including forget gate, input gate, output gate, intermediate outputs, and cell states.
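For reference, the memory cell in Figure 7 follows the standard LSTM formulation, with sigmoid gates, a tanh-squashed candidate state, and the Hadamard product ⊙ (the paper's notation may differ in minor details):

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(intermediate output)}
\end{aligned}
```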
Figure 8. The price fluctuations between buyers and sellers for Futures 50, 300, and 500. (a) The prices and volumes in Futures 50, fluctuating over the interval [200, 500]; (b) the prices and volumes in Futures 300, also fluctuating over the interval [200, 500]; (c) the prices and volumes for buyers and sellers in Futures 500, fluctuating significantly over the interval [600, 800].
Figure 9. The price spreads between open and close prices and between the highest and lowest prices for various futures datasets. The left subfigure highlights the periodicity of the spread between open and close prices; the right subfigure highlights the periodicity of the spread between the highest and lowest prices.
Figure 10. The volume changes over time for Futures 50, 300, and 500.
Figure 11. Visualization of upward and downward patterns in the Futures 50, 300, and 500 datasets.
Figure 12. The proportion of trend labels across different futures. (a) The proportion of trend labels across different futures (bar chart); (b) the proportion of trend labels across different futures (pie chart).
Figure 13. A comparison between FuturesNet and other baselines across different years.
Figure 14. The averaged performance of FuturesNet and other baselines on multiple futures. (a) The averaged performance of individual futures across different years; (b) The averaged performance across different futures for each year.
Figure 15. The feature importance for Futures 50 and 300. Our results show that both the ask prices and volumes of buyers in historical datasets of the most recent fifteen minutes are crucial for futures trading, aligning with existing microeconomic principles [46].
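Attributions such as those in Figure 15 can be obtained with gradient-based localization in the spirit of Grad-CAM [45]. The following is a minimal input-gradient sketch, not the paper's exact procedure; `model` is assumed to be any classifier mapping (batch, window, n_features) tensors to class logits.

```python
import torch

def input_gradient_importance(model, x):
    """Saliency of each (time step, feature) cell for the predicted class.

    A simple input-gradient variant of gradient-based attribution [45].
    """
    x = x.clone().requires_grad_(True)
    logits = model(x)
    # Score of the predicted class, summed over the batch.
    score = logits.gather(1, logits.argmax(dim=1, keepdim=True)).sum()
    score.backward()
    return x.grad.abs()  # (batch, window, n_features) importance map
```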
Figure 16. The effect of different sample sizes on the S-value and accuracy. Deep models achieve optimal performance when the training set size reaches 6–7 months. The left subfigure shows the performance curve of accuracy, and the right subfigure shows the performance curve of the S-value.
Figure 17. FuturesNet’s performance curve for Futures 50 across different years (2019, 2020, and 2021). The left subfigure shows the performance curve of accuracy, and the right subfigure shows the performance curve of the S-value.
Figure 18. FuturesNet’s performance decreases as the test set size gradually increases.
Figure 19. The contributions of trading costs L c in Equation (17), cross-entropy loss L c e , and dynamic weighting mechanisms in Equation (18).
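A hedged sketch of how such a composite objective can be assembled is shown below. The turnover-based cost term and the linear weighting schedule are illustrative stand-ins for Equations (17) and (18), which are not reproduced here.

```python
import torch
import torch.nn.functional as F

def combined_loss(logits, targets, prev_position, position,
                  step, total_steps, cost_rate=1e-4):
    """Cross-entropy plus a trading-cost term with a dynamic weight.

    `targets` are class indices 0..4 (shifted trend labels); the cost term
    and the schedule are illustrative, not the paper's Equations (17)-(18).
    """
    l_ce = F.cross_entropy(logits, targets)
    # Penalize position turnover as a proxy for transaction costs.
    l_c = cost_rate * (position - prev_position).abs().mean()
    lam = step / total_steps  # dynamic weight grows over training
    return l_ce + lam * l_c
```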
Figure 20. The impact of different window sizes on performance in Futures 50. The experimental results indicate that a window size of 96 achieves the best performance.
Table 1. The statistical properties of our futures datasets, including the open, close, maximum, and minimum prices and the corresponding volume.
Id    Metric        Mean        Median      Std. deviation
50    Open price    3117.0      3176.7      270.3
50    Close price   3117.1      3176.5      270.3
50    Max price     3117.9      3177.8      270.4
50    Min price     3116.2      3175.4      270.2
50    Volume        154,207.2   116,577.5   144,871.1
300   Open price    4427.0      4583.5      493.2
300   Close price   4427.0      4583.6      493.2
300   Max price     4428.1      4584.7      493.4
300   Min price     4425.9      4582.1      493.0
300   Volume        652,119.9   504,496.5   555,812.6
500   Open price    5923.6      5861.6      536.0
500   Close price   5923.6      5861.7      536.0
500   Max price     5925.2      5862.9      536.2
500   Min price     5921.9      5860.2      535.9
500   Volume        606,487.7   464,245.5   521,443.8
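Summary statistics of this kind can be reproduced with a few lines of pandas; the file and column names below are assumptions about the released dataset's schema, not its actual layout.

```python
import pandas as pd

# File and column names are assumptions about the released dataset's schema.
df = pd.read_csv("futures_50.csv")  # hypothetical file name
cols = ["open", "close", "max_price", "min_price", "volume"]
stats = df[cols].agg(["mean", "median", "std"]).T
print(stats.round(1))  # mean / median / standard deviation per column
```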
Table 2. The performance comparison between FuturesNet and other competitive baselines. FuturesNet (Ours) achieves the best result in every setting.
Futures   Year   CNN    Transformer   GRU    LSTNet   Ours
50        2020   0.19   0.19          0.48   0.39     0.54
50        2021   0.31   0.33          0.32   0.29     0.41
50        2022   0.32   0.31          0.39   0.29     0.51
300       2020   0.19   0.21          0.49   0.48     0.54
300       2021   0.37   0.36          0.32   0.29     0.39
300       2022   0.31   0.29          0.35   0.32     0.42
500       2020   0.25   0.34          0.31   0.28     0.39
500       2021   0.34   0.25          0.36   0.25     0.43
500       2022   0.32   0.41          0.31   0.25     0.44