[Paper] A Backtesting Protocol in the Era of Machine Learning
A Backtesting Protocol in the Era of Machine Learning is a very recent paper (re-)stating the danger of blindly applying machine learning in quantitative finance: essentially, there is not enough data for powerful machine learning models to roam free of structure and economic hypotheses.
The paper lists several pitfalls, such as:
- selection bias,
- not discounting discoveries for multiple testing (cf. my implementation of Lopez de Prado's deflated Sharpe ratio),
- picking the data transformations that yield the best results, even when those results are not robust to small changes in the transformations,
- relying on cross-validation, which is not as effective in quant finance,
- ignoring trading costs and fees,
- ignoring structural changes and overcrowding,
- tweaking the model once it is in production,
- heading for complex models when simple ones can do the job,
- aiming at good results instead of good science.
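
To make the multiple-testing pitfall concrete, here is a minimal sketch (not taken from the paper) that simulates strategies with no edge at all and reports the best in-sample Sharpe ratio among them. The function names, the 1% daily volatility, the one-year sample and the zero risk-free rate are all illustrative assumptions.

```python
import random
import statistics


def annualized_sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns (risk-free rate assumed to be zero)."""
    mean = statistics.fmean(returns)
    std = statistics.stdev(returns)  # sample standard deviation
    return (mean / std) * periods_per_year ** 0.5


def best_sharpe_among_noise(n_strategies, n_days=252, daily_vol=0.01, seed=0):
    """Simulate strategies with zero true edge and return the best in-sample Sharpe ratio.

    Hypothetical helper: each "strategy" is just i.i.d. Gaussian noise with mean 0.
    """
    rng = random.Random(seed)
    sharpes = []
    for _ in range(n_strategies):
        returns = [rng.gauss(0.0, daily_vol) for _ in range(n_days)]
        sharpes.append(annualized_sharpe(returns))
    return max(sharpes)


if __name__ == "__main__":
    # The more strategies are tried, the better the "best" one looks,
    # even though none of them has any true edge.
    for n in (1, 10, 100, 1000):
        print(n, round(best_sharpe_among_noise(n), 2))
```

Since every simulated strategy has a true Sharpe ratio of zero, any large value reported for the best one comes purely from selection among many trials. This is the effect that a correction such as the deflated Sharpe ratio is meant to discount.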