Post by Date
2024
November 2024
What’s in a Name? AI Meets the Sociology of Naming (November 17, 2024)
[paper] Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models (November 2, 2024)
October 2024
June 2024
April 2024
Basic DSPy RAG tutorial on DataGrapple blog posts (April 16, 2024)
February 2024
2023
July 2023
June 2023
May 2023
[Active Reading with ChatGPT] Quantitative Portfolio Management: The Art and Science of Statistical Arbitrage (May 28, 2023)
Building a S&P 500 company classification from Wikipedia articles (guided by ChatGPT) (May 7, 2023)
April 2023
March 2023
February 2023
Performance attribution of a crypto market-neutral book on a statistical risk model (February 27, 2023)
January 2023
2022
December 2022
November 2022
Takeaways from Complex Networks 2022 in Palermo, Italy (November 13, 2022)
Getting ready for Complex Networks 2022 in Palermo, Italy (November 5, 2022)
September 2022
Hierarchical PCA x Hierarchical clustering on crypto perpetual futures (September 17, 2022)
Crypto PCA First Eigenvector (September 17, 2022)
May 2022
April 2022
February 2022
Standard readability measures (applied on Shakespeare's plays) (February 19, 2022)
Naive modelling of Matalan defaulting on its MTNLN 9.5 01/31/24 Notes (February 8, 2022)
January 2022
2021
December 2021
November 2021
[Book] Advanced Portfolio Management -- A Quant's Guide for Fundamental Investors (November 29, 2021)
[paper] Top2Vec: Distributed Representations of Topics (November 14, 2021)
with application on 2020 10-K business descriptions
October 2021
[HKML] Hong Kong Machine Learning Meetup Season 4 Episode 2 (October 27, 2021)
August 2021
Embeddings of Sectors and Industries using Graph Neural Networks (August 13, 2021)
node2vec embeddings of industries projected onto the 2d plane
April 2021
February 2021
The Swelling Effect: Think twice before averaging covariance matrices (February 13, 2021)
A few ellipsoids representing the associated covariance matrices along the geodesic path from the leftmost to the rightmost matrices.
Conditional CorrGAN: An example in Google Colab (February 5, 2021)
A few cCorrGAN-generated correlation matrices, and the confusion matrix of a SPDNet + RBN classification.
January 2021
Classification of Correlation Matrices using SPDNet with Riemannian Batch Normalization (January 22, 2021)
Illustration from "A Riemannian Network for SPD Matrix Learning" https://arxiv.org/pdf/1608.04233.pdf
[Paper] Summary of 'Interpretable Machine Learning – A Brief History, State-of-the-Art and Challenges' (January 9, 2021)
AutoGL and the Open Graph Benchmark: Datasets for Machine Learning on Graphs (January 2, 2021)
2020
December 2020
Some further assessment of the original CorrGAN model (2019) (December 18, 2020)
[Paper] Summary of 'Explaining by Removing: A Unified Framework for Model Explanation' (December 3, 2020)
Illustration from Explaining by Removing: A Unified Framework for Model Explanation
November 2020
From Fundamental To Quantamental Investing (November 11, 2020)
[HKML] Hong Kong Machine Learning Meetup Season 3 Episode 4 (November 4, 2020)
October 2020
[HKML] Hong Kong Machine Learning Meetup Season 3 Episode 3 (October 28, 2020)
[HKML] Hong Kong Machine Learning Meetup Season 3 Episode 2 (October 8, 2020)
September 2020
[HKML] Hong Kong Machine Learning Meetup Season 3 Episode 1 (September 30, 2020)
Some Thoughts on the Applications of Deep Generative Models in Finance (September 27, 2020)
Can we predict a market regime from correlation matrix features? (September 4, 2020)
August 2020
[Book] Commented summary of Probabilistic Graphical Models -- A New Way of Thinking in Financial Modelling (August 30, 2020)
Which portfolio allocation method to choose? Look at the correlation matrix! (August 17, 2020)
Portfolio construction methods and risk metrics: in- and out-of-sample comparisons on simulated data (August 15, 2020)
Extraction of features from a given correlation matrix (August 14, 2020)
Release of a few pretrained CorrGAN models (August 11, 2020)
How to combine a handful of predictors using empirical copulas and maximum likelihood (August 9, 2020)
CopulaGAN (the 3d case) - A first naive attempt (August 1, 2020)
July 2020
Clustering Marginal Distributions of Stocks Returns, and Sampling from their Wasserstein Barycenter (July 29, 2020)
Sampling from Empirical Copulas of Stocks (July 28, 2020)
Wasserstein Barycenters of Stocks Empirical Copulas (July 26, 2020)
[HKML] Hong Kong Machine Learning Meetup Season 2 Episode 8 (July 15, 2020)
[Paper + Implementation] Hierarchical PCA and Applications to Portfolio Management (July 5, 2020)
Mutual Information Is Copula Entropy (July 1, 2020)
June 2020
Measuring non-linear dependence with Optimal Transport (June 25, 2020)
[HKML] Hong Kong Machine Learning Meetup Season 2 Episode 7 (June 10, 2020)
May 2020
[Book] Commented summary of Machine Learning for Factor Investing by Guillaume Coqueret and Tony Guida (May 19, 2020)
[HKML] Hong Kong Machine Learning Meetup Season 2 Episode 6 (May 13, 2020)
April 2020
[HKML] Hong Kong Machine Learning Meetup Season 2 Episode 5 (April 29, 2020)
[Book] Commented summary of Machine Learning for Asset Managers by Marcos Lopez de Prado (April 12, 2020)
[HKML] Hong Kong Machine Learning Meetup Season 2 Episode 4 (April 8, 2020)
March 2020
[Paper + Implementation] The Hierarchical Equal Risk Contribution Portfolio (Part I) (March 22, 2020)
[Paper + Experimentation] CartoonGAN applied to Hong Kong landscapes using Streamlit (March 22, 2020)
February 2020
CorrVAE: A VAE for sampling realistic financial correlation matrices (Tentative II) (February 17, 2020)
CorrVAE: A VAE for sampling realistic financial correlation matrices (Tentative I) (February 4, 2020)
S&P 500 Sharpe vs. Correlation Matrices - Building a dataset for generating stressed/rally/normal scenarios (February 3, 2020)
January 2020
[HKML] Hong Kong Machine Learning Meetup Season 2 Episode 3 (January 22, 2020)
2019
December 2019
How to define an intrinsic Mean of Correlation Matrices in a Riemannian sense? (December 25, 2019)
Comparison of Network-based and Minimum Variance Portfolios Using CorrGAN (December 21, 2019)
Hierarchical Risk Parity - Implementation & Experiments (Part III) (December 4, 2019)
October 2019
[HKML] Hong Kong Machine Learning Meetup Season 2 Episode 2 (October 17, 2019)
TF 2.0 DCGAN for 100x100 financial correlation matrices (October 13, 2019)
September 2019
[HKML] Hong Kong Machine Learning Meetup Season 2 Episode 1 (September 25, 2019)
TF 2.0 GAN MLP for 100x100 financial correlation matrices (September 22, 2019)
[HKML] Supercharge your Marketing with Data & ML . [HKML <> IAB] . Off-Season #1 (September 5, 2019)
Permutation invariance in Neural networks (September 1, 2019)
August 2019
Using LIME to 'explain' Snorkel Labeler (August 4, 2019)
July 2019
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 12 (Season Finale) (July 17, 2019)
Stylized Facts of Financial Correlations (July 15, 2019)
CorrGAN: A GAN for sampling correlation matrices (Part II) (July 1, 2019)
June 2019
CorrGAN: A GAN for sampling correlation matrices (Part I) (June 23, 2019)
[ICML 2019] Day 5 - Workshop Time Series (June 14, 2019)
[ICML 2019] Day 4 - Interpretability, Natural Language Processing, Smarter than AI four year old kids, Unsupervised Learning (June 13, 2019)
[ICML 2019] Day 3 - Robotics, Good ol' Sparse Coding, misc. applications, Transfer, Multitask and Active Learning (June 12, 2019)
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 11 (June 12, 2019)
[ICML 2019] Day 2 - U.S. Census, Time Series, Hawkes Processes, Shapley values, Topological Data Analysis, Deep Learning & Logic, Random Matrices, Optimal Transport for Graphs (June 11, 2019)
[ICML 2019] Day 1 - Tutorials (June 11, 2019)
May 2019
Experimenting with LIME - A tool for model-agnostic explanations of Machine Learning models (May 26, 2019)
[ICML 2019] Reading list of accepted papers (May 19, 2019)
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 10 (May 15, 2019)
May the Fourth: VADER for Credit Sentiment? (May 4, 2019)
Snorkel Credit Sentiment - Part 1 (May 1, 2019)
First experiment with Snorkel Metal – Credit Sentiment on DataGrapple blogs
April 2019
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 9 (April 17, 2019)
March 2019
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 8 (March 12, 2019)
February 2019
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 7 (February 20, 2019)
January 2019
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 6 (January 23, 2019)
2018
December 2018
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 5 (December 19, 2018)
[Paper] A Backtesting Protocol in the Era of Machine Learning (December 9, 2018)
November 2018
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 4 (November 21, 2018)
October 2018
[Book] The UnRules - Man, Machines and the Quest to Master Markets (October 16, 2018)
Hierarchical Risk Parity - Implementation & Experiments (Part II) (October 15, 2018)
Network-based vs. Minimum Variance portfolios: Any deep connections? (October 10, 2018)
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 3 (October 9, 2018)
How to sample uniformly over the space of correlation matrices? The onion method (October 5, 2018)
Hierarchical Risk Parity - Implementation & Experiments (Part I) (October 2, 2018)
September 2018
[Book] Neural Network Methods for Natural Language Processing (September 20, 2018)
"Combination of Rankings" - The Stationary Distribution (September 5, 2018)
"Combination of Rankings" - The Full Coverage Case (September 2, 2018)
[Bloomberg Meetup] The Forefront of Technologies in Finance (September 1, 2018)
August 2018
A Monte Carlo study of the "Combination of Rankings" methods (August 31, 2018)
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 2 (August 21, 2018)
Combination of Rankings (August 16, 2018)
Combination of Rankings - A Proper Merging of Experts Views
July 2018
[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 1 (July 18, 2018)
[ICML 2018] Retrospectives (July 16, 2018)
[ICML 2018] Day 4 - Time-Series Analysis, NLP, and More Human-like Learning Machines (July 13, 2018)
[ICML 2018] Day 3 - Energy, GANs, Rankings, Curriculum Learning, and our paper (July 12, 2018)
[ICML 2018] Day 2 - Representation Learning, Networks and Relational Learning (July 11, 2018)
[ICML 2018] Day 1 - Tutorials (July 10, 2018)
June 2018
On the difficulty of reading numbers in different languages (June 25, 2018)
Neural Style Transfer applied to paintings (June 23, 2018)
In this short blog, we apply the fast style transfer as implemented in tensorflow/magenta.
How to compute the Planar Maximally Filtered Graph (PMFG) (June 3, 2018)
May 2018
How to detect false strategies? The Deflated Sharpe Ratio (May 30, 2018)
Deflated Sharpe Ratio
APIs for getting crypto related data (May 14, 2018)
Following my blog post Download & Play with Cryptocurrencies Historical Data in Python, I was asked several times how to get the historical data. ...
March 2018
Physiological analytics for sports (March 4, 2018)
I recently got a Garmin Forerunner 935 (advised by a good friend of mine). After using it for two months, I can say that I’m happy with it so far. It has lot...
February 2018
AQR Academic Factors (February 17, 2018)
AQR has released an implementation of the well-known academic factors in its AQR Data Library:
Quant Blogs (February 13, 2018)
Here is a tentative list of interesting blogs to keep up with quant best practices for studying financial markets. I hope that my readers will help me curate t...
2017
December 2017
Tail Dependence Coefficients (December 4, 2017)
Research material:
November 2017
Riemannian Geometry of Correlation Matrices (November 13, 2017)
Research material:
PhD defense - Some contributions to the clustering of financial time series (November 11, 2017)
Here are the slides. The PhD studies were generously funded by Hellebore Capital.
[Correlation] How to visualize dependence between two variables? (November 5, 2017)
In this blog, we provide a snippet of code to explore the dependence between two variables. We illustrate its use on visualizing the dependence between a few...
September 2017
[Clustering] How to sort a distance matrix (September 7, 2017)
Following the Ecole Polytechnique - Data Science Summer School, where I was asked several times how I produced the sorted correlation matrices disp...
[Field report] Data Science Summer School at Ecole Polytechnique (with Bengio, Russell, Bousquet, Archambeau and others) (September 2, 2017)
A short field report, with a personal viewpoint, on the Data Science Summer School (Ecole Polytechnique), Monday, Aug. 28 – Friday, Sept. 1, 2017.
August 2017
Download & Play with Cryptocurrencies Historical Data in Python (August 25, 2017)
To access the CryptoCompare public API in Python, we can use the following Python wrapper available on GitHub: cryCompare.
Reading list of NLP stuff (August 24, 2017)
General NLP:
Quick correlation study between BTC/USD and ETH/USD (August 22, 2017)
import numpy as np import scipy import pandas as pd import matplotlib.pyplot as plt %matplotlib inline import seaborn as sns import json from datetime import...
Field reports from ICML 2017 in Sydney (August 11, 2017)
My colleague, Mikolaj Binkowski, at Hellebore Capital was at the 34th International Conference on Machine Learning ICML 2017 in Sydney to represent the comp...
July 2017
Study of US Stocks Correlations, Hierarchies and Clusters (July 24, 2017)
In this small study, we use hierarchical clustering techniques to explore the structure of correlations between US stocks. To do so, we first download a data...
June 2017
Ether vs. Bitcoin -- Part 0 bis (June 28, 2017)
For the last few days, ETH has lost 1/3 of its value with -20% several days in a row. We update the initial study with up-to-date data to take into account t...
Ether vs. Bitcoin -- Part 0 (June 22, 2017)
In this introduction notebook, we simply display the distribution of the returns and see that the tails are heavy, meaning that standard quant models cannot be...
June Ethereum London Meetup at Imperial College (June 16, 2017)
This evening I attended the June Ethereum London Meetup at Imperial College (I have been there a couple of times before). Imperial College’s big amphitheater...
May 2017
Swap Data Repositories for Credit Default Swaps (May 28, 2017)
What are Swap Data Repositories?
[Clustering] How and Where should you cut a dendrogram? (May 12, 2017)