A classic prediction problem from finance is to predict the next returns (i.e. relative price variations) from a stock market.
That is, given a stock market of $N$ stocks having returns $R_t \in \mathbb{R}^N$ at time $t$, the goal is to design at each time $t$ a vector $S_{t+1} \in \mathbb{R}^N$ from the information available up to time $t$ such that the prediction overlap $\langle S_{t+1}, R_{t+1} \rangle$ is quite often positive.
To be fair, this is not an easy task.
In this challenge, we attack this problem armed with a linear factor model where one learns the factors over an exotic non-linear parameter space.
More precisely, the simplest estimators being the linear ones, a typical move is to consider a parametric model of the form
$$S_{t+1} := \sum_{\ell=1}^{F} \beta_\ell F_{t,\ell}$$
where the vectors $F_{t,\ell} \in \mathbb{R}^N$ are explanatory factors (a.k.a. features), usually designed from financial expertise, and $\beta_1, \dots, \beta_F \in \mathbb{R}$ are model parameters that can be fitted on a training data set.
But how to design the factors $F_{t,\ell}$?
Factors that are “well known” in the trading world include the 5-day (normalized) mean returns $R_t^{(5)}$ or the Momentum $M_t := R_{t-20}^{(230)}$, where $R_t^{(m)} := \frac{1}{\sqrt{m}} \sum_{k=1}^{m} R_{t+1-k}$.
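As an illustration, the two classical factors can be computed from a returns array. This is a minimal sketch, assuming the normalization factor is $1/\sqrt{m}$, that returns are stored as an array of shape (stocks, days) whose column `t` holds $R_t$, and that the function names and the synthetic data are hypothetical.

```python
import numpy as np

def normalized_mean_returns(returns, t, m):
    """R_t^{(m)}: sum of the m most recent return vectors up to day t, scaled by 1/sqrt(m).

    Assumes `returns` has shape (n_stocks, n_days) and that t - m + 1 >= 0.
    """
    window = returns[:, t - m + 1 : t + 1]  # columns R_{t+1-m}, ..., R_t
    return window.sum(axis=1) / np.sqrt(m)

def momentum(returns, t):
    """M_t := R_{t-20}^{(230)}: the 230-day factor evaluated 20 days earlier."""
    return normalized_mean_returns(returns, t - 20, 230)

# Synthetic returns for illustration only (not market data).
rng = np.random.default_rng(0)
toy_returns = rng.normal(scale=0.01, size=(4, 300))

r5 = normalized_mean_returns(toy_returns, t=299, m=5)  # one value per stock
mom = momentum(toy_returns, t=299)                     # one value per stock
```

Both factors use at most the last 250 days of history, which is why a time depth of 250 is enough to express them.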
But if you know no finance and have developed enough taste for mathematical elegance, you may aim at learning the factors themselves within the simplest class of factors,
namely linear functions of the past returns:
$$F_{t,\ell} := \sum_{k=1}^{D} A_{k\ell} R_{t+1-k}$$
for some vectors $A_\ell := (A_{k\ell}) \in \mathbb{R}^D$ and a fixed time depth parameter $D$.
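In matrix form, building all factors at time $t$ amounts to one product between the last $D$ return vectors and the matrix of coefficients. The following sketch assumes returns are stored with shape (stocks, days) and that row `k-1` of `A` holds the weights applied to $R_{t+1-k}$; the function name and the toy numbers are hypothetical.

```python
import numpy as np

def build_factors(returns, A, t):
    """Return an (n_stocks, F) array whose column l is sum_k A[k-1, l] * R_{t+1-k}."""
    D = A.shape[0]
    # Slice the last D days up to t, then reverse so that column k-1 is R_{t+1-k}.
    past = returns[:, t - D + 1 : t + 1][:, ::-1]
    return past @ A

# Toy check: with A made of the first two basis vectors, the factors are R_t and R_{t-1}.
toy_returns = np.arange(12.0).reshape(2, 6)  # 2 stocks, 6 days
A_basis = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [0.0, 0.0]])             # D = 3, F = 2
factors = build_factors(toy_returns, A_basis, t=4)
```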
Well, we need to add a condition to create enough independence between the factors, since otherwise they may be redundant.
One way to do this is to assume the vectors $A_\ell$ are orthonormal, $\langle A_k, A_\ell \rangle = \delta_{k\ell}$ for all $k, \ell$, which adds a non-linear constraint to the parameter space of our predictive model.
All in all, we thus have at hand a predictive parametric model with parameters:
- a $D \times F$ matrix $A := [A_1, \dots, A_F]$ with orthonormal columns,
- a vector $\beta := (\beta_1, \dots, \beta_F) \in \mathbb{R}^F$.
Note that it contains the two-factor model using $R_t^{(5)}$ and $M_t$, or the autoregressive model AR from time series analysis, as submodels.
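To see why these are submodels, one can write their coefficient matrices explicitly and check orthonormality. This sketch assumes the $1/\sqrt{m}$ normalization for the mean-return factors and a time depth of 250; the variable names and the AR order `p = 3` are hypothetical choices for illustration.

```python
import numpy as np

D = 250  # time depth

# Two-factor model: column 0 encodes R_t^{(5)}, column 1 encodes M_t = R_{t-20}^{(230)}.
A_two = np.zeros((D, 2))
A_two[:5, 0] = 1 / np.sqrt(5)        # equal weights on lags k = 1..5
A_two[20:250, 1] = 1 / np.sqrt(230)  # equal weights on lags k = 21..250

# Autoregressive model of order p: factor l is simply the return lagged by l days,
# so the columns are the first p standard basis vectors.
p = 3
A_ar = np.eye(D)[:, :p]

gram_two = A_two.T @ A_two  # should be the 2 x 2 identity
gram_ar = A_ar.T @ A_ar     # should be the p x p identity
```

The two columns of `A_two` have disjoint supports and unit norm, so the orthonormality constraint holds without any further adjustment.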
The goal of this challenge is to design/learn factors for stock return prediction using the exotic parameter space introduced in the context section.
Participants will be able to use a three-year data history of 50 stocks from the same stock market (training data set) to provide the model parameters $(A, \beta)$.
Then the predictive model associated with these parameters will be tested to predict the returns of 50 other stocks over the same three-year time period (testing data set).
We allow $D = 250$ days for the time depth and $F = 10$ for the number of factors.
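More precisely, we assess the quality of the predictive model with parameters $(A, \beta)$ as follows. Let $\tilde{R}_t \in \mathbb{R}^{50}$ be the returns of the 50 stocks of the testing data set over the three-year period ($t = 0, \dots, 753$) and let $\tilde{S}_t = \tilde{S}_t(A, \beta)$ be the participants' predictor for $\tilde{R}_t$. The metric to maximize compares $\tilde{S}_t$ with $\tilde{R}_t$, provided that the orthonormality condition $\langle A_i, A_j \rangle = \delta_{ij}$ holds up to a small tolerance for all $i, j$.
By construction the metric takes its values in $[-1, 1]$ and equals $-1$ as soon as there exists a pair $(i, j)$ that violates the orthonormality condition by too much.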
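A scoring rule with these properties can be sketched as follows. This is only an illustration under stated assumptions: it takes the score to be the average cosine similarity between predictor and realized returns over all days with a full history, and it checks orthonormality entrywise against a hypothetical tolerance `tol`. Neither the formula nor the tolerance value is taken from the challenge, and the function name is made up.

```python
import numpy as np

def metric_sketch(A, beta, returns, tol=1e-6):
    """Assumed scoring rule: mean cosine similarity, or -1 if A is not orthonormal.

    `returns` has shape (n_stocks, n_days); `tol` is a hypothetical tolerance.
    """
    D, F = A.shape
    if np.abs(A.T @ A - np.eye(F)).max() > tol:
        return -1.0
    scores = []
    for t in range(D, returns.shape[1]):
        past = returns[:, t - D : t][:, ::-1]  # R_{t-1}, ..., R_{t-D}
        pred = past @ A @ beta                 # predictor for day t
        truth = returns[:, t]
        # Note: this divides by zero if a prediction or return vector is all zeros.
        scores.append(pred @ truth / (np.linalg.norm(pred) * np.linalg.norm(truth)))
    return float(np.mean(scores))

# Toy case: constant returns and a model that copies yesterday's returns.
v = np.array([1.0, 2.0, 3.0])
toy_returns = np.tile(v[:, None], (1, 8))  # 3 stocks, 8 identical days
A_copy = np.array([[1.0], [0.0]])          # D = 2, F = 1
perfect = metric_sketch(A_copy, np.array([1.0]), toy_returns)
penalized = metric_sketch(2 * A_copy, np.array([1.0]), toy_returns)
```

Because each term is a cosine, the average stays in $[-1, 1]$, and the early return reproduces the penalty for non-orthonormal columns.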
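Output structure. The output expected from the participants is a vector where the model parameters $A = [A_1, \dots, A_{10}] \in \mathbb{R}^{250 \times 10}$ and $\beta \in \mathbb{R}^{10}$ are stacked as follows:
$$\text{Output} = \begin{bmatrix} A_1 \\ \vdots \\ A_{10} \\ \beta \end{bmatrix} \in \mathbb{R}^{2510}$$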
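A possible way to build such a vector with NumPy is shown below. The sketch assumes the columns of $A$ are stacked one after another, followed by $\beta$; the helper names are hypothetical, and the required file format should be checked against the challenge material.

```python
import numpy as np

def stack_output(A, beta):
    """Flatten A column by column, then append beta (assumed stacking order)."""
    # A.T.ravel() lists column A_1 first, then A_2, and so on.
    return np.concatenate([A.T.ravel(), beta])

def unstack_output(vec, D=250, F=10):
    """Inverse of stack_output: recover (A, beta) from the flat vector."""
    A = vec[: D * F].reshape(F, D).T
    beta = vec[D * F :]
    return A, beta

rng = np.random.default_rng(0)
A_demo = rng.standard_normal((250, 10))  # placeholder values, not a valid submission
beta_demo = rng.standard_normal(10)
flat = stack_output(A_demo, beta_demo)
A_back, beta_back = unstack_output(flat)
```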
The training input given to the participants, $X_{\mathrm{train}}$, is a dataframe containing the (cleaned) daily returns of 50 stocks over a time period of 754 days (three years).
Each row represents a stock and each column refers to a day. $X_{\mathrm{train}}$ should be used to find the predictive model parameters $A, \beta$.
The returns to be predicted in the training data set are provided in $Y_{\mathrm{train}}$ for convenience, but they are also contained in $X_{\mathrm{train}}$.
A possible "brute force" procedure to tackle this problem is to generate orthonormal vectors $A_1, \dots, A_{10} \in \mathbb{R}^{250}$ at random, then to fit $\beta$ on the training data set by using linear regression, to repeat this operation many times, and finally to select the best result from these attempts.
More precisely, the QRT benchmark strategy to beat is (see the notebook in the supplementary material):
Repeat many times the following.
- Sample a $250 \times 10$ random matrix $M$ with iid Gaussian $N(0,1)$ entries.
- Apply the Gram-Schmidt algorithm to the columns of $M$ to obtain a matrix $A = [A_1, \dots, A_{10}]$ with orthonormal columns (see the randomA function).
- Use the columns of $A$ to build the factors and then take $\beta$ with minimal mean square error on the training data set (with fitBeta).
- Compute the metric on the training data (metricTrain).
Return the model parameters $(A, \beta)$ that maximize this metric.
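The loop above can be sketched end to end on toy data. This is not the notebook's code: the function names, the small dimensions, the synthetic returns and the training score (assumed here to be an average cosine similarity) are all illustrative assumptions.

```python
import numpy as np

def gram_schmidt(M):
    """Orthonormalize the columns of M (modified Gram-Schmidt)."""
    Q = np.zeros(M.shape)
    for j in range(M.shape[1]):
        v = M[:, j].copy()
        for i in range(j):
            v -= (Q[:, i] @ v) * Q[:, i]  # remove the component along column i
        Q[:, j] = v / np.linalg.norm(v)
    return Q

def lagged(returns, D, t):
    """(n_stocks, D) block R_{t-1}, ..., R_{t-D}: the history used to predict day t."""
    return returns[:, t - D : t][:, ::-1]

def fit_beta(returns, A):
    """Least-squares beta, pooling all stocks and all days with a full history."""
    D = A.shape[0]
    days = range(D, returns.shape[1])
    X = np.vstack([lagged(returns, D, t) @ A for t in days])
    y = np.concatenate([returns[:, t] for t in days])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def train_metric(returns, A, beta):
    """Assumed training score: mean cosine similarity between prediction and outcome."""
    D = A.shape[0]
    scores = []
    for t in range(D, returns.shape[1]):
        pred = lagged(returns, D, t) @ A @ beta
        truth = returns[:, t]
        scores.append(pred @ truth / (np.linalg.norm(pred) * np.linalg.norm(truth)))
    return float(np.mean(scores))

def random_search(returns, D, F, n_iter, seed=0):
    """Keep the best (score, A, beta) among n_iter random orthonormal draws."""
    rng = np.random.default_rng(seed)
    best = (-np.inf, None, None)
    for _ in range(n_iter):
        A = gram_schmidt(rng.standard_normal((D, F)))
        beta = fit_beta(returns, A)
        score = train_metric(returns, A, beta)
        if score > best[0]:
            best = (score, A, beta)
    return best

# Toy run on synthetic returns with reduced dimensions (the challenge uses D=250, F=10).
toy_returns = np.random.default_rng(1).normal(scale=0.01, size=(6, 60))
best_score, best_A, best_beta = random_search(toy_returns, D=10, F=3, n_iter=20)
```

Since the score is evaluated on the same data used to fit $\beta$, the selected model is the best in-sample draw, which says little about how it behaves on the test stocks.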
Remark: The orthonormality condition for the vectors $A_1, \dots, A_F$ reads $A^\top A = I_F$ for the matrix $A := [A_1, \dots, A_F]$.
The space of matrices satisfying this condition is known as the Stiefel manifold, a generalization of the orthogonal group,
and one can show that the previous procedure generates a sample from the uniform distribution on this (compact symmetric) space.
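In practice, the same kind of sample can be drawn with a QR factorization instead of an explicit Gram-Schmidt loop. The sketch below uses `numpy.linalg.qr` and flips column signs so that the triangular factor has a positive diagonal, which matches what Gram-Schmidt produces; the function name is hypothetical.

```python
import numpy as np

def random_stiefel(D, F, rng):
    """Draw a D x F matrix with orthonormal columns from a Gaussian matrix via QR."""
    M = rng.standard_normal((D, F))
    Q, R = np.linalg.qr(M)       # reduced QR: Q is D x F, R is F x F
    signs = np.sign(np.diag(R))  # LAPACK may return negative diagonal entries in R
    return Q * signs             # flip columns so that R would have a positive diagonal

rng = np.random.default_rng(0)
A_sample = random_stiefel(250, 10, rng)
residual = np.abs(A_sample.T @ A_sample - np.eye(10)).max()  # distance to A^T A = I_F
```

Without the sign correction the columns are still orthonormal, but the distribution of the output depends on the sign convention of the underlying QR routine.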
The challenge provider
Qube Research & Technologies Group is a quantitative and systematic investment manager employing around 300 people with offices in Hong Kong, London, Mumbai, Paris and Singapore. We are a technology-driven firm implementing a scientific approach to financial investment. QRT’s market presence is global and expands across the largest liquid electronic venues. The combination of data, research, technology and trading expertise has shaped our DNA and is at the heart of our innovation and development dynamic. The firm acts as an investment manager managing open-ended funds used for the management of third-party capital.