[Paper] A Backtesting Protocol in the Era of Machine Learning

A Backtesting Protocol in the Era of Machine Learning is a very recent paper (re-)stating the danger of blindly applying machine learning in quantitative finance: essentially, there is not enough data for powerful machine learning models to roam free of structure and economic hypotheses.

The paper lists several pitfalls, such as:

selection bias,
not discounting discoveries for multiple testing (cf. my implementation of Lopez de Prado's deflated Sharpe ratio),
picking the data transformations that yield the best results without checking that those results are robust to small changes in the transformations,
relying on cross-validation, which is not as effective in quant finance,
ignoring trading costs and fees,
ignoring structural changes and overcrowding,
tweaking the model once in production,
heading for complex models when simple ones can do the job,
aiming at good results instead of good science.
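
To make the first two pitfalls concrete, here is a minimal sketch (not taken from the paper) that backtests 200 strategies made of pure noise, keeps the best one, and compares its naive p-value with a Bonferroni-adjusted one. The simulated returns, the helper names `annualized_sharpe` and `bonferroni_p_value`, and the choice of Bonferroni as the correction are all assumptions made for illustration; the deflated Sharpe ratio mentioned above is a different, more refined adjustment.

```python
import math
import numpy as np


def annualized_sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of each column of a (days x strategies) array."""
    mean = returns.mean(axis=0)
    std = returns.std(axis=0, ddof=1)
    return np.sqrt(periods_per_year) * mean / std


def bonferroni_p_value(t_stat, n_trials):
    """One-sided normal p-value of t_stat, scaled by the number of trials (capped at 1)."""
    p_single = 0.5 * math.erfc(t_stat / math.sqrt(2.0))
    return min(1.0, n_trials * p_single)


# Hypothetical setup: 5 years of daily returns for 200 strategies with no edge at all.
rng = np.random.default_rng(0)
n_days, n_strategies = 252 * 5, 200
returns = rng.normal(loc=0.0, scale=0.01, size=(n_days, n_strategies))

sharpes = annualized_sharpe(returns)
best = int(np.argmax(sharpes))  # selection bias: we only look at the winner

# t-statistic of the mean daily return = daily Sharpe * sqrt(number of days)
t_best = float(sharpes[best]) / math.sqrt(252) * math.sqrt(n_days)

p_naive = bonferroni_p_value(t_best, n_trials=1)
p_adjusted = bonferroni_p_value(t_best, n_trials=n_strategies)

print(f"best annualized Sharpe among {n_strategies} noise strategies: {sharpes[best]:.2f}")
print(f"naive p-value: {p_naive:.4f}, Bonferroni-adjusted p-value: {p_adjusted:.4f}")
```

Since every strategy here is noise by construction, any apparent significance of the winner comes purely from having tried many candidates; the adjusted p-value accounts for the number of trials, while the naive one does not.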