Investment banking
Started on Jan. 5, 2022
Every now and then, central bankers issue a speech that is supposed to convey their analysis of the world’s financial situation. These speeches are followed closely by financial actors across the globe, and therefore have a strong influence on the evolution of financial markets and, more broadly, on the overall economy. Indeed, central bank communications may affect various key economic factors, from interest rates to monetary policy, inflation expectations, credit, debt and overall financial leverage for private and public sectors alike.
Since these speeches may impact key macroeconomic factors and move financial markets, the ability to correctly decipher and interpret “central bank lingo” has become a key area of focus for financial analysts and economic actors of all kinds. We invite the participants in this challenge to exploit embeddings of these speeches, computed with powerful new tools recently made available by Google and others: natural-language transformers. This technique transforms a speech into a vector of 768 real numbers that should capture all the key information conveyed in the speech.
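To make the shape of the input concrete, here is a minimal sketch of how a BERT-style encoder produces one 768-dimensional vector per speech. The pooling strategy and token count are assumptions for illustration; the actual encoder and preprocessing used by the organizers are not disclosed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustration (the actual encoder and pooling are not disclosed):
# a BERT-style transformer maps each token of a speech to a 768-dimensional
# vector; pooling over tokens (here a simple mean) yields one 768-dimensional
# embedding per speech.
n_tokens = 512                                    # typical BERT context length
token_vectors = rng.normal(size=(n_tokens, 768))  # stand-in for transformer outputs
speech_embedding = token_vectors.mean(axis=0)

print(speech_embedding.shape)  # (768,)
```

Whatever the pooling details, each speech in the dataset arrives as one such length-768 vector.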
Your goal is to understand how financial markets react when central bankers deliver official speeches.
We do not provide the speeches themselves - otherwise participants would quickly find out the dates and the corresponding market moves! - but a transformed version. The speeches were processed by a predefined BERT-style transformer, and this gives the input of the problem. The output is the mean price evolution of a collection of 39 different time series; these time series correspond to 13 different markets measured at 3 different time scales.
We have computed the difference between the closing prices of these 13 markets at 3 different maturities and their prices at the close of the day of the speech. We are not interested in very short-term effects (between the beginning of the speech and the close of the same day) or in leakage effects (trading occurring because of information leaked before the beginning of the speech). A few tests have indicated that if a speech has an effect on the markets, it appears within the 2 weeks following the date of the speech: we have therefore chosen lags of 1 day, 1 week and 2 weeks to measure the possible effects on the markets.
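The target construction described above can be sketched as follows. The price series and the mapping of "1 week" and "2 weeks" to 5 and 10 trading days are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical daily closing prices for the 13 markets (values invented)
n_days, n_markets = 60, 13
closes = 100 + rng.normal(0, 1, size=(n_days, n_markets)).cumsum(axis=0)

# Lags in trading days: 1 day, 1 week (~5 trading days), 2 weeks (~10),
# an assumed calendar convention for this sketch
lags = [1, 5, 10]
speech_day = 20  # index of the speech date in the price series

# 13 markets x 3 lags = 39 target values for this speech
targets = np.concatenate(
    [closes[speech_day + lag] - closes[speech_day] for lag in lags]
)
print(targets.shape)  # (39,)
```

Each speech is thus paired with 39 price differences, one per market and lag.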
As expected, at first sight it was difficult to distinguish an effect. We have therefore developed a technique to boost the response of the transformer using numerical NLP techniques. We deliver here the result of this boosting. It is not miraculous, and the small number of points in the dataset is a real handicap.
The 13 markets are the following:
The training data consists of 2000 transformed speeches and their subsequent market moves. More precisely:
The test data is as follows:
The participant must submit a CSV file of shape (415, 39) containing the predicted values of the market moves.
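A minimal sketch of producing a submission of the required shape. The filename, and whether the platform expects headers or an index column, are assumptions; check the challenge's submission instructions.

```python
import numpy as np
import pandas as pd

# Placeholder predictions: one row per test speech, one column per target
# series (13 markets x 3 lags = 39).
preds = np.zeros((415, 39))

submission = pd.DataFrame(preds)
submission.to_csv("submission.csv", index=False)  # hypothetical filename/format
```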
The metric is simply the L2 metric between the predicted market moves and the real market moves:

```math
\mathrm{loss}(\hat{y}, y) = \sqrt{\frac{1}{415 \times 39}\sum_{i=1}^{415}\sum_{m=1}^{39} |\hat{y}_{i,m} - y_{i,m}|^2}.
```
Warning: this RMSE is different from the RMSE defined by
`sklearn.metrics.mean_squared_error(a, b, squared=False)`,
which computes the RMSE of each of the 39 output dimensions over the dataset and then averages these 39 values.
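The difference between the two scores can be checked numerically (the random data below is purely illustrative; the sklearn-style score is reimplemented in NumPy to make the aggregation order explicit):

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.normal(size=(415, 39))
y_pred = rng.normal(size=(415, 39))

# Challenge metric: one global RMSE over all 415 * 39 entries
challenge_loss = np.sqrt(np.mean((y_pred - y_true) ** 2))

# sklearn-style score (mean_squared_error with squared=False and the default
# multioutput="uniform_average"): per-dimension RMSE, then the mean over the
# 39 output dimensions
per_dim_rmse = np.sqrt(np.mean((y_pred - y_true) ** 2, axis=0))
sklearn_style = per_dim_rmse.mean()

# By Jensen's inequality (sqrt is concave), the averaged per-dimension RMSE
# never exceeds the global RMSE.
assert sklearn_style <= challenge_loss + 1e-12
```

The two scores coincide only when every output dimension has the same per-dimension RMSE.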
The benchmark is implemented with the XGBoost library, used through its scikit-learn-compatible API. A good description of this gradient-boosted, tree-based algorithm is available here.
For maximum efficiency of the benchmark, we handle every output dimension independently, and for every dimension the parameters used are:


With these parameters, we obtain a loss close to 20; on the public part of the test dataset, the score is 18.39. Will you do better?
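The one-model-per-dimension setup can be sketched as follows. This uses scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost, with arbitrary small parameters and toy data; it illustrates the structure of the benchmark, not its actual configuration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 768))  # toy stand-in for the speech embeddings
Y = rng.normal(size=(100, 39))   # toy stand-in for the 39 market-move targets

# One independent boosted-tree model per output dimension, as in the benchmark.
# GradientBoostingRegressor and these parameters are placeholders; the actual
# benchmark uses XGBoost with parameters listed above.
models = []
for m in range(Y.shape[1]):
    model = GradientBoostingRegressor(n_estimators=5, max_depth=2, random_state=0)
    model.fit(X, Y[:, m])
    models.append(model)

preds = np.column_stack([model.predict(X) for model in models])
print(preds.shape)  # (100, 39)
```

Training 39 separate regressors lets each dimension pick its own splits, at the cost of ignoring any correlation between the 39 targets.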
Note:
We see that the RMSE over the test set is very high compared to the RMSE over the training set, which is close to 10. This is expected, because the training set contains only 2000 samples while the input has 768 dimensions. The information brought by the training set is therefore still too small to give good predictive power.
Files are accessible when logged in and registered to the challenge