A classic prediction problem from finance is to predict the next returns (i.e. relative price variations) from a stock market.
That is, given a stock market of $N$
stocks having returns $R_t\in\mathbb R^N$
at time $t,$
the goal is to design at each time $t$
a vector $S_{t+1}\in\mathbb R^N$
from the information available up to time $t$
such that the prediction overlap $\langle S_{t+1},R_{t+1}\rangle$
is quite often positive.
To be fair, this is not an easy task.
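In this challenge, we attack this problem armed with a linear factor model where one learns the factors over an exotic non-linear parameter space.
Such a linear factor model takes the form

$$S_{t+1} := \sum_{\ell=1}^{F} \beta_\ell\, F_{t,\ell},$$

where the vectors $F_{t,\ell}\in\mathbb R^N$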
are explicative factors (a.k.a. features), usually designed from financial expertise,
and $\beta_1,\ldots,\beta_F\in\mathbb R$
are model parameters that can be fitted on a training data set.

But how to design the factors $F_{t,\ell}$?

Factors that are "well known" in the trading world include the $5$-day (normalized) mean returns $R_t^{(5)}$
or the Momentum $M_t:= R_{t-20}^{(230)}$, where $R_t^{(m)}:=\frac{1}{\sqrt{m}}\sum_{k=1}^{m} R_{t+1-k}.$
But if you know no finance and have developed enough taste for mathematical elegance, you may aim at learning the factors themselves within the simplest class of factors,
namely linear functions of the past returns:

$$F_{t,\ell} := \sum_{k=1}^{D} A_{k\ell}\, R_{t+1-k}$$

for some vectors $A_\ell:=(A_{k\ell})\in\mathbb R^D$
and a fixed time depth parameter $D.$
Well, we need to add a condition to create enough independence between the factors, since otherwise they may be redundant.
One way to do this is to assume that the vectors $A_\ell$ are orthonormal, $\langle A_k,A_\ell\rangle = \delta_{k\ell}$ for all $k,\ell$,
which adds a non-linear constraint to the parameter space of our predictive model.
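To make the indexing concrete, here is a minimal NumPy sketch of how the factors and the predictor could be computed from a returns array. The function names `build_factors` and `predict`, the array layout (one row per stock, one column per day) and the small sizes in the demo are assumptions made for illustration only.

```python
import numpy as np

def build_factors(returns, A, t):
    """Return the N x F matrix whose column l is F_{t,l} = sum_k A[k-1, l] * R_{t+1-k}.

    returns: array of shape (N, T), one row per stock, one column per day.
    A: array of shape (D, F), assumed to have orthonormal columns.
    t: day index with t >= D - 1, so that the D most recent returns exist.
    """
    D = A.shape[0]
    # Columns t, t-1, ..., t-D+1: the most recent return comes first (k = 1).
    window = returns[:, t - D + 1 : t + 1][:, ::-1]
    return window @ A

def predict(returns, A, beta, t):
    """Predictor S_{t+1} = sum_l beta_l * F_{t,l}, a vector of length N."""
    return build_factors(returns, A, t) @ beta

# Small synthetic demo (hypothetical sizes: N=3 stocks, T=10 days, D=4, F=2).
rng = np.random.default_rng(0)
returns = rng.normal(size=(3, 10))
A, _ = np.linalg.qr(rng.normal(size=(4, 2)))  # any matrix with orthonormal columns
beta = np.array([0.5, -1.0])
S = predict(returns, A, beta, t=9)
```

With the challenge values one would use $D=250$ and $F=10$; the logic is unchanged.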

All in all, we thus have at hand a predictive parametric model with parameters:

a $D\times F$ matrix $A:=[A_1,\ldots,A_F]$ with orthonormal columns,

a vector $\beta:=(\beta_1,\ldots,\beta_F)\in\mathbb R^F.$

Note that it contains the two-factor model using $R_t^{(5)}$
and $M_t$
defined above,
or the autoregressive model AR from time series analysis, as submodels.
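As a quick check of the submodel claim, the sketch below writes the two classical factors as columns of a matrix with $D=250$ rows. The column layout is an illustration, not something prescribed by the challenge: $R_t^{(5)}$ uses lags $k=1,\ldots,5$ and $M_t=R_{t-20}^{(230)}$ uses lags $k=21,\ldots,250$, so the two columns have disjoint supports and unit norm.

```python
import numpy as np

D = 250
A_two = np.zeros((D, 2))
A_two[:5, 0] = 1 / np.sqrt(5)      # lags k = 1..5    -> R_t^{(5)}
A_two[20:, 1] = 1 / np.sqrt(230)   # lags k = 21..250 -> M_t = R_{t-20}^{(230)}

# Gram matrix of the two columns; it should be the 2 x 2 identity.
gram = A_two.T @ A_two
is_orthonormal = bool(np.all(np.abs(gram - np.eye(2)) <= 1e-6))
```

To fit the challenge format with $F=10$, these two columns could be completed with eight further orthonormal columns whose coefficients in $\beta$ are set to zero.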

Challenge goals

The goal of this challenge is to design/learn factors for stock return prediction using the exotic parameter space introduced in the context section.

Participants will be able to use a three-year data history of $50$ stocks from the same stock market (training data set) to provide the model parameters $(A,\beta)$
as outputs.
Then the predictive model associated with these parameters will be tested on its ability to predict the returns of $50$ other stocks over the same three-year time period (testing data set).

We allow $D=250$
days for the time depth and $F=10$
for the number of factors.

Metric.
More precisely, we assess the quality of the predictive model with parameters $(A,\beta)$
as follows. Let $\tilde R_t\in\mathbb R^{50}$
be the returns of the $50$
stocks of the testing data set over the three-year period ($t=0,\ldots,753$)
and let $\tilde S_{t} = \tilde S_{t}(A,\beta)$
be the participants' predictor for $\tilde R_{t}$.
The metric to maximize is defined by

$$\mathrm{Metric}(A,\beta):= \frac{1}{504}\sum_{t=250}^{753} \frac{\langle \tilde S_t,\tilde R_t\rangle}{\|\tilde S_t\|\,\|\tilde R_t\|}$$

if $|\langle A_i,A_j\rangle-\delta_{ij}|\leq 10^{-6}$
for all $i,j$,
and $\mathrm{Metric}(A,\beta):=-1$
otherwise.

By construction the metric takes its values in $[-1,1]$
and equals $-1$
as soon as there exists a pair $(i,j)$
that violates the orthonormality condition by more than the tolerance.
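The following sketch shows one way such a metric could be evaluated. It assumes the score is the average cosine similarity between the predictor and the realized returns over all days for which a full window of $D$ past returns exists (days $250$ to $753$ with the challenge values); the function names and the array layout (rows are stocks, columns are days) are hypothetical.

```python
import numpy as np

def check_orthonormality(A, tol=1e-6):
    """True if |<A_i, A_j> - delta_ij| <= tol for every pair (i, j)."""
    F = A.shape[1]
    return bool(np.all(np.abs(A.T @ A - np.eye(F)) <= tol))

def metric(returns, A, beta, tol=1e-6):
    """Average cosine similarity between predictions and returns, or -1."""
    if not check_orthonormality(A, tol):
        return -1.0
    D = A.shape[0]
    T = returns.shape[1]
    overlaps = []
    for t in range(D, T):
        window = returns[:, t - D : t][:, ::-1]  # R_{t-1}, ..., R_{t-D}
        pred = window @ A @ beta                 # predictor for R_t
        true = returns[:, t]
        overlaps.append(pred @ true / (np.linalg.norm(pred) * np.linalg.norm(true)))
    return float(np.mean(overlaps))

# Small synthetic demo (hypothetical sizes: 4 stocks, 30 days, D=5, F=2).
rng = np.random.default_rng(1)
demo_returns = rng.normal(size=(4, 30))
demo_A, _ = np.linalg.qr(rng.normal(size=(5, 2)))
demo_beta = rng.normal(size=2)
score = metric(demo_returns, demo_A, demo_beta)
```

Note that this sketch does not guard against a zero predictor (for instance $\beta=0$), which would lead to a division by zero.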

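Output structure. The output expected from the participants is a vector where the model parameters $A=[A_1,\ldots,A_{10}]\in\mathbb R^{250\times 10}$
and $\beta\in\mathbb R^{10}$
are stacked as follows:

$$\text{Output} = \begin{bmatrix} A_1 \\ \vdots \\ A_{10} \\ \beta \end{bmatrix} \in \mathbb R^{2510}$$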
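A possible way to build such a vector in NumPy is sketched below. It assumes that the columns $A_1,\ldots,A_{10}$ are written one after the other, followed by $\beta$, which gives a vector of length $250\times 10+10=2510$; the helper names are hypothetical and the exact ordering should be checked against the challenge's supplementary material.

```python
import numpy as np

def stack_parameters(A, beta):
    """Concatenate the columns of A (A_1 first, then A_2, ...) followed by beta."""
    D, F = A.shape
    if beta.shape != (F,):
        raise ValueError("beta must have one entry per column of A")
    # order="F" flattens column by column.
    return np.concatenate([A.flatten(order="F"), beta])

def unstack_parameters(vec, D=250, F=10):
    """Inverse of stack_parameters."""
    A = vec[: D * F].reshape((D, F), order="F")
    beta = vec[D * F :]
    return A, beta

# Tiny example with D=3, F=2 to make the ordering visible.
small_A = np.arange(6, dtype=float).reshape(3, 2)   # columns (0, 2, 4) and (1, 3, 5)
small_beta = np.array([10.0, 20.0])
small_vec = stack_parameters(small_A, small_beta)
```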

The training input given to the participants $X_{train}$
is a dataframe containing the (cleaned) daily returns of $50$
stocks over a time period of $754$
days (three years).
Each row represents a stock and each column refers to a day. $X_{train}$
should be used to find the predictive model parameters $A,\beta.$

The returns to be predicted in the training data set are provided in $Y_{train}$
for convenience, but they are also contained in $X_{train}$.

Benchmark description

A possible "brute force" procedure to tackle this problem is to generate orthonormal vectors $A_1,\ldots,A_{10}\in\mathbb R^{250}$
at random and then to fit $\beta$
on the training data set by using linear regression,
to repeat this operation many times, and finally to select the best result from these attempts.

More precisely, the QRT benchmark strategy to beat is (see the notebook in the supplementary material):

Repeat $N_{iter}=1000$
times the following.

Sample a $250\times 10$
matrix $M$
with iid Gaussian $N(0,1)$
entries.

Apply the Gram-Schmidt algorithm to the columns of $M$
to obtain a matrix $A=[A_1,\ldots,A_{10}]$
with orthonormal columns (see the randomA function).

Use the columns of $A$
to build the factors and then take $\beta$
with minimal mean square error on the training data set (with fitBeta).

Compute the metric on the training data (metricTrain).

Return the model parameters $(A,\beta)$
that maximize this metric.
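The steps above can be sketched as follows. The names `randomA`, `fitBeta` and `metricTrain` are those mentioned in the text, but their bodies here are assumptions and not the code of the QRT notebook; `buildDataset` and `benchmark` are hypothetical helpers, and the training metric is assumed to be an average cosine similarity. The input is taken to be a NumPy array with one row per stock and one column per day.

```python
import numpy as np

def randomA(D=250, F=10, rng=None):
    """Gram-Schmidt on the columns of a D x F matrix with iid N(0,1) entries."""
    rng = np.random.default_rng() if rng is None else rng
    M = rng.normal(size=(D, F))
    A = np.zeros((D, F))
    for j in range(F):
        # Remove the components along the previously built columns, then normalize.
        v = M[:, j] - A[:, :j] @ (A[:, :j].T @ M[:, j])
        A[:, j] = v / np.linalg.norm(v)
    return A

def buildDataset(returns, D):
    """Windows X of shape (T-D, N, D), most recent day first, and targets Y of shape (T-D, N)."""
    T = returns.shape[1]
    X = np.stack([returns[:, t - D : t][:, ::-1] for t in range(D, T)])
    Y = returns[:, D:].T
    return X, Y

def fitBeta(A, X, Y):
    """Least-squares beta for the factors built from the columns of A."""
    factors = (X @ A).reshape(-1, A.shape[1])
    beta, *_ = np.linalg.lstsq(factors, Y.reshape(-1), rcond=None)
    return beta

def metricTrain(A, beta, X, Y):
    """Average over days of the cosine similarity between prediction and returns."""
    pred = X @ A @ beta
    cos = np.sum(pred * Y, axis=1) / (np.linalg.norm(pred, axis=1) * np.linalg.norm(Y, axis=1))
    return float(np.mean(cos))

def benchmark(returns, D=250, F=10, n_iter=1000, seed=0):
    """Random search: keep the (A, beta) with the best training metric."""
    rng = np.random.default_rng(seed)
    X, Y = buildDataset(returns, D)
    best_metric, best_A, best_beta = -np.inf, None, None
    for _ in range(n_iter):
        A = randomA(D, F, rng)
        beta = fitBeta(A, X, Y)
        m = metricTrain(A, beta, X, Y)
        if m > best_metric:
            best_metric, best_A, best_beta = m, A, beta
    return best_metric, best_A, best_beta

# Small synthetic run (hypothetical sizes) so that the sketch executes quickly.
toy_returns = np.random.default_rng(42).normal(size=(5, 60))
toy_metric, toy_A, toy_beta = benchmark(toy_returns, D=10, F=3, n_iter=20)
```

On the real training data, `returns` would be the values of $X_{train}$ and the defaults $D=250$, $F=10$, $N_{iter}=1000$ would apply. Selecting the best of many random draws on the training metric can overfit, so the selected score is an optimistic estimate.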

Remark: The orthonormality condition for the vectors $A_1,\ldots,A_F$
reads $A^T A=I_F$
for the matrix $A:=[A_1,\ldots,A_F].$
The space of matrices satisfying this condition is known as the Stiefel manifold, a generalization of the orthogonal group,
and one can show that the previous procedure generates a sample from the uniform distribution on this compact homogeneous space.
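In practice, the same orthonormalization can be obtained from a QR decomposition, which is a common numerically stable substitute for an explicit Gram-Schmidt loop. The sketch below is an alternative to the procedure described above, not the benchmark's own code, and the function name is hypothetical. The sign correction makes the diagonal of the triangular factor positive, which is the convention Gram-Schmidt produces.

```python
import numpy as np

def orthonormalize_qr(M):
    """Return a matrix with orthonormal columns spanning the same nested subspaces as M."""
    Q, R = np.linalg.qr(M)  # reduced QR: Q has the same shape as M when D >= F
    # Flip column signs so that the triangular factor has a positive diagonal.
    return Q * np.sign(np.diag(R))

rng = np.random.default_rng(7)
M_gauss = rng.normal(size=(250, 10))   # iid N(0,1) entries
A_qr = orthonormalize_qr(M_gauss)

# Largest deviation from the condition A^T A = I_F.
max_dev = float(np.max(np.abs(A_qr.T @ A_qr - np.eye(10))))
```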


The challenge provider

Qube Research & Technologies Group is a quantitative and systematic investment manager employing around 300 people with offices in Hong Kong, London, Mumbai, Paris and Singapore. We are a technology driven firm implementing a scientific approach to financial investment. QRT's market presence is global and expands across the largest liquid electronic venues. The combination of data, research, technology and trading expertise has shaped our DNA and is at the heart of our innovation and development dynamic. The firm acts as an investment manager managing open-ended Funds used for management of third party capital.