Open-source project name: emoen/Machine-Learning-for-Asset-Managers
Open-source project URL: https://github.com/emoen/Machine-Learning-for-Asset-Managers
Programming language: Python 100.0%
Introduction:

Install the library, then use it with:

>>> from Machine_Learning_for_Asset_Managers import ch2_fitKDE_find_best_bandwidth as c
>>> import numpy as np
>>> c.findOptimalBWidth(np.asarray([21,3]))
{'bandwidth': 10.0}

Machine-Learning-for-Asset-Managers

Implementation of code snippets and exercises from Machine Learning for Asset Managers (Elements in Quantitative Finance) written by Prof. Marcos López de Prado.

The project is for my own learning. If you want to use the concepts from the book, you should head over to Hudson & Thames. They have implemented these concepts and many more in mlfinlab. Edit: it seems that some of their work, such as the Jupyter notebooks, has gone behind a paywall.

For practical application see the repository: Machine-Learning-for-Asset-Managers-Oslo-Bors.

Note: In chapter 4 there is a bug in the implementation of the "Optimal Number of Clusters" (ONC) algorithm in the book
(the code comes from the paper "Detection of False Investment Strategies Using Unsupervised Learning Methods", de Prado and Lewis (2018)). The divide-and-conquer method of subspaces used by ONC can be problematic: if you embed a subspace into a space with a large eigenvalue, the larger space can distort the clusters found in the subspace. ONC does precisely that: it embeds subspaces into the space consisting of the largest eigenvalues found in the correlation matrix. An outline describing the problem more rigorously can be found here: https://math.stackexchange.com/questions/4013808/metric-on-clustering-of-correlation-matrix-using-silhouette-score/4050616#4050616

Other clustering algorithms, such as hierarchical clustering, should be investigated.

Chapter 2 Denoising and Detoning

Marcenko-Pastur theoretical probability density function, and empirical density function:
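A minimal sketch of the theoretical Marcenko-Pastur density, for readers who want to reproduce the curve. The function name `mp_pdf` and its parameters are illustrative assumptions, not the repository's API. `var` is the variance of the underlying noise and `q = T / N` is the ratio of observations to variables.

```python
import numpy as np


def mp_pdf(var, q, n_points=1000):
    """Marcenko-Pastur density for noise variance `var` and ratio q = T / N.

    Hypothetical helper for illustration; returns the eigenvalue grid and
    the density evaluated on it.
    """
    e_min = var * (1.0 - np.sqrt(1.0 / q)) ** 2  # lower edge of the support
    e_max = var * (1.0 + np.sqrt(1.0 / q)) ** 2  # upper edge of the support
    e_val = np.linspace(e_min, e_max, n_points)
    # Guard against tiny negative values caused by rounding at the edges.
    inside = np.maximum((e_max - e_val) * (e_val - e_min), 0.0)
    pdf = q / (2.0 * np.pi * var * e_val) * np.sqrt(inside)
    return e_val, pdf


e_val, pdf = mp_pdf(var=1.0, q=10.0)
# Trapezoid rule: the density should integrate to roughly one over its support.
area = float(np.sum((pdf[1:] + pdf[:-1]) * np.diff(e_val)) / 2.0)
print(round(area, 3))
```

Eigenvalues of an empirical correlation matrix that fall inside the support `[e_min, e_max]` are consistent with noise; eigenvalues above `e_max` are candidates for signal.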
Denoising a random matrix with signal using the constant residual eigenvalue method. This is done by replacing the eigenvalues attributed to noise with a constant (their average), which preserves the trace. See code snippet 2.5.
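The constant residual eigenvalue idea can be sketched as follows. This is a simplified illustration, not the repository's code: the name `denoise_constant_residual` is hypothetical, and the number of signal eigenvalues `n_facts` is assumed to be known already (in the book it comes from fitting the Marcenko-Pastur distribution).

```python
import numpy as np


def denoise_constant_residual(corr, n_facts):
    """Replace all but the `n_facts` largest eigenvalues with their average.

    Hypothetical helper; `n_facts` is assumed to be given.
    """
    e_val, e_vec = np.linalg.eigh(corr)
    order = np.argsort(e_val)[::-1]          # sort eigenvalues descending
    e_val, e_vec = e_val[order], e_vec[:, order]
    e_val_new = e_val.copy()
    e_val_new[n_facts:] = e_val_new[n_facts:].mean()  # constant noise eigenvalue
    corr_new = e_vec @ np.diag(e_val_new) @ e_vec.T
    # Rescale so the result is again a correlation matrix (unit diagonal).
    std = np.sqrt(np.diag(corr_new))
    corr_new = corr_new / np.outer(std, std)
    return np.clip(corr_new, -1.0, 1.0)


rng = np.random.default_rng(0)
x = rng.normal(size=(100, 50))               # 100 observations, 50 variables
corr = np.corrcoef(x, rowvar=False)
denoised = denoise_constant_residual(corr, n_facts=2)
# The smallest eigenvalue is lifted, so the matrix is better conditioned.
print(np.linalg.eigvalsh(corr).min(), np.linalg.eigvalsh(denoised).min())
```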
A detoned covariance matrix can be used to calculate the minimum variance portfolio. The efficient frontier is the upper portion of the minimum variance frontier, starting at the minimum variance portfolio. A denoised covariance matrix is less sensitive to change.

Note: Exercise 2.7: "Extend function fitKDE in code snippet 2.2, so that it estimates through cross-validation the optimal value of bWidth (bandwidth)". The script ch2_fitKDE_find_bandwidth.py implements this procedure and produces the (green) KDE in figure 2.3.
From code snippet 2.3, with a random matrix with signal: the histogram shows how the eigenvalues of a random matrix with signal are distributed. Then the variance of the theoretical probability density function is calculated using the
Chapter 3 Distance Metrics
Standard angular distance is better suited to long-only portfolio applications. Squared and absolute angular distances are better suited to long-short portfolios.

Chapter 4 Optimal Clustering

Use unsupervised learning to maximize intragroup similarities and minimize intergroup similarities. Consider a matrix X of shape N x F, with N objects and F features. The features are used to compute the proximity (correlation, mutual information) between the N objects in an NxN matrix. There are two types of clustering algorithms: partitional and hierarchical.
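As an illustration of the N x N proximity matrix, the sketch below turns the correlations between N objects into the three angular distances from Chapter 3 and then scores two candidate labelings with silhouette coefficients. The function names and the simulated data are assumptions for illustration only; the silhouette loop is a plain re-implementation, not the repository's ONC code.

```python
import numpy as np


def angular_distance(corr):
    """sqrt(0.5 * (1 - rho)): negatively correlated objects are far apart."""
    return np.sqrt(0.5 * (1.0 - np.clip(corr, -1.0, 1.0)))


def absolute_angular_distance(corr):
    """sqrt(1 - |rho|): strong negative correlation counts as close."""
    return np.sqrt(1.0 - np.abs(np.clip(corr, -1.0, 1.0)))


def squared_angular_distance(corr):
    """sqrt(1 - rho^2): same idea, based on the squared correlation."""
    return np.sqrt(1.0 - np.clip(corr, -1.0, 1.0) ** 2)


def silhouette_samples(dist, labels):
    """Silhouette coefficient per object; needs at least two clusters."""
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    silh = np.zeros(len(labels))
    for i, own_label in enumerate(labels):
        own = labels == own_label
        own[i] = False                       # exclude the object itself
        if not own.any():
            continue                         # singleton cluster keeps score 0
        a = dist[i, own].mean()              # mean distance within own cluster
        b = min(dist[i, labels == c].mean()  # nearest other cluster
                for c in np.unique(labels) if c != own_label)
        silh[i] = (b - a) / max(a, b)
    return silh


# Simulated data: N = 12 objects in two groups, F = 500 features each.
rng = np.random.default_rng(0)
factors = rng.normal(size=(2, 500))
X = np.vstack([factors[0] + 0.5 * rng.normal(size=(6, 500)),
               factors[1] + 0.5 * rng.normal(size=(6, 500))])
dist = angular_distance(np.corrcoef(X))      # rows are objects, so 12 x 12
good = silhouette_samples(dist, [0] * 6 + [1] * 6)
bad = silhouette_samples(dist, [0, 1] * 6)
# ONC-style quality: mean silhouette divided by its standard deviation.
print(good.mean() / good.std(), bad.mean() / bad.std())
```

The labeling that matches the true groups gets a high positive quality score, while the interleaved labeling scores below zero.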
Random block correlation matrices are generated to simulate instruments with correlation. The utility for doing this is in code snippet 4.3, and it uses the clustering algorithm Optimal Number of Clusters (ONC) defined in snippets 4.1 and 4.2, which does not need a predefined number of clusters (unlike k-means), but uses an 'elbow method' to stop adding clusters. The optimal number of clusters is achieved when there is high intra-cluster correlation and low inter-cluster correlation. The silhouette score is used to minimize within-group distance and maximize between-group distance.

Chapter 5 Financial Labels
Triple-Barrier Method involves holding a position until
Trend-scanning method: the idea is to identify trends and let them run for as long and as far as they may persist, without setting any barriers.
An alternative to the look-forward algorithm presented in the book is to look backward from the latest data-point over the window size. E.g. if the latest data-point is at index 20 and the window size is between 3 and 10 days, the look-backward algorithm will scan the window at index 17 to 20 all the way back to index 11 to 20, hence only considering the most recent information.
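A rough sketch of the look-backward variant: for each window ending at the latest point, fit a linear trend and keep the window whose slope has the largest absolute t-value. The sign of that t-value is the label. The function names are hypothetical, and the sketch assumes the window size counts observations including the end point, which may differ from the repository's convention.

```python
import numpy as np


def slope_t_value(y):
    """t-value of the OLS slope of y regressed on 0, 1, 2, ... (needs len >= 3)."""
    y = np.asarray(y, dtype=float)
    x = np.arange(len(y), dtype=float)
    x_c = x - x.mean()
    sxx = (x_c ** 2).sum()
    beta = (x_c * (y - y.mean())).sum() / sxx
    resid = (y - y.mean()) - beta * x_c
    std_err = np.sqrt((resid ** 2).sum() / (len(y) - 2) / sxx)
    return beta / std_err


def look_backward_label(prices, end, min_window=3, max_window=10):
    """Label the point at `end` using only windows that end at `end`."""
    best_t = 0.0
    for window in range(min_window, max_window + 1):
        start = end - window + 1             # assumed: window counts observations
        if start < 0:
            break
        t_val = slope_t_value(prices[start:end + 1])
        if abs(t_val) > abs(best_t):
            best_t = t_val
    return int(np.sign(best_t)), float(best_t)


rng = np.random.default_rng(1)
prices = 0.5 * np.arange(21) + rng.normal(0.0, 0.1, 21)   # noisy uptrend
print(look_backward_label(prices, end=20))
```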
Chapter 6 Feature Importance Analysis

"p-value does not measure the probability that neither the null nor the alternative hypothesis is true, or the significance of a result."
"Backtesting is not a research tool. Feature importance is." (Lopez de Prado) The Mean Decrease Impurity (MDI) algorithm deals with 3 out of 4 problems with p-values:
Figure 6.4 shows that ONC correctly recognizes that there are six relevant clusters (one cluster for each informative feature, plus one cluster of noise features), and it assigns the redundant features to the cluster that contains the informative feature from which the redundant features were derived. Given the low correlation across clusters, there is no need to replace the features with their residuals. Next, apply the clustered MDI method to the clustered data.
Clustered MDI works better than non-clustered MDI. Finally, apply the clustered MDA method to this data:
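The aggregation step behind clustered MDI can be sketched like this: for every tree, sum the importances of the features in each cluster, then report the mean and the standard error across trees. The function name, the cluster dictionary and the importance values below are made up for illustration, and the per-tree importances are assumed to be available already (for example from a fitted random forest).

```python
import numpy as np


def cluster_mean_std(tree_importances, clusters):
    """Mean and standard error of cluster-level importance across trees.

    `tree_importances` has shape (n_trees, n_features); `clusters` maps a
    cluster name to the column indices of its features. Hypothetical helper.
    """
    imp = np.asarray(tree_importances, dtype=float)
    out = {}
    for name, cols in clusters.items():
        per_tree = imp[:, cols].sum(axis=1)  # cluster importance in each tree
        mean = per_tree.mean()
        std_err = per_tree.std(ddof=1) / np.sqrt(len(per_tree))
        out[name] = (float(mean), float(std_err))
    return out


# Made-up importances for two trees and four features.
tree_importances = [[0.4, 0.3, 0.2, 0.1],
                    [0.2, 0.3, 0.3, 0.2]]
clusters = {"C_0": [0, 1], "C_1": [2, 3]}
print(cluster_mean_std(tree_importances, clusters))
```

Summing within a cluster means that importance shared between redundant features is credited to the cluster as a whole instead of being diluted across its members.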
Conclusion: C_5, which is associated with noisy features, is not important, and all other clusters have similar importance.

Chapter 7 Portfolio Construction

Convex portfolio optimization can calculate the minimum variance portfolio and the maximum Sharpe ratio portfolio.

Definition: Condition number: absolute value of the ratio between the maximum and minimum eigenvalues: A_n_n / A_m_m. The condition number says something about the instability caused by covariance structures.

Definition: trace = sum(diag(A)), the sum of the diagonal elements.

Highly correlated time-series imply a high condition number of the correlation matrix.

Markowitz's curse

The correlation matrix C is stable only when the correlation

Hierarchical risk parity (HRP) outperforms Markowitz in out-of-sample Monte-Carlo experiments, but is sub-optimal in-sample.

Code-snippet 7.1 illustrates the signal-induced instability of the correlation matrix.
Code-snippet 7.2 creates the same block diagonal matrix but with one dominant block. However, the condition number is the same.
This demonstrates that bringing down the intrablock correlation in only one of the two blocks doesn't reduce the condition number. This shows that the instability in Markowitz's solution can be traced back to the dominant blocks.
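The effect can be checked with a small NumPy sketch in the spirit of snippets 7.1 and 7.2. The helper names and the block sizes and correlations (two 2 x 2 blocks, intrablock correlation 0.5 versus 0.0) are chosen for illustration and are not copied from the repository.

```python
import numpy as np


def form_block(size, rho):
    """Correlation block with off-diagonal correlation `rho` (hypothetical helper)."""
    block = np.full((size, size), float(rho))
    np.fill_diagonal(block, 1.0)
    return block


def block_diag(*blocks):
    """Assemble square blocks into one block-diagonal matrix."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    pos = 0
    for b in blocks:
        k = b.shape[0]
        out[pos:pos + k, pos:pos + k] = b
        pos += k
    return out


def condition_number(corr):
    """Absolute ratio of the largest to the smallest eigenvalue."""
    e_val = np.linalg.eigvalsh(corr)
    return float(abs(e_val.max() / e_val.min()))


both_correlated = block_diag(form_block(2, 0.5), form_block(2, 0.5))
one_dominant = block_diag(form_block(2, 0.5), form_block(2, 0.0))
# Lowering the correlation in one block leaves the condition number unchanged.
print(condition_number(both_correlated), condition_number(one_dominant))
```

The extreme eigenvalues (1.5 and 0.5) both come from the block with correlation 0.5, so the other block has no influence on the ratio.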
The Nested Clustered Optimization Algorithm (NCO)

NCO provides a strategy for addressing the effect of Markowitz's curse on an existing mean-variance allocation method.
Chapter 8 Testing set overfitting

Backtesting is a historical simulation of how an investment strategy would have performed in the past. Backtesting suffers from selection bias under multiple testing, as researchers run millions of tests on historical data and present the best ones (overfitted). This chapter studies how to measure the effect of selection bias.

Precision and recall

Precision and recall under multiple testing

The Sharpe ratio

Sharpe Ratio = μ/σ

The 'False Strategy' theorem

A researcher may run many historical simulations and report only the best one (maximum Sharpe ratio). The distribution of the maximum Sharpe ratio is not the same as the distribution of the Sharpe ratio from a random trial. Hence selection bias under multiple testing (SBuMT).

Experimental results

A Monte Carlo experiment shows that the expected maximum Sharpe ratio increases (E[max(Sharpe ratio)] = 3.26) even when the expected Sharpe ratio is 0. So an investment strategy will seem promising even when there is no good strategy. When more than one trial takes place, the expected value of the maximum Sharpe ratio is greater than the expected value of the Sharpe ratio from a random trial (when true Sharpe Ratio=0 and variance > 0).
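The following standard-library sketch compares the False Strategy theorem's approximation of the expected maximum Sharpe ratio with a small Monte Carlo experiment. It assumes that the estimated Sharpe ratios of the trials are independent draws from a normal distribution with mean 0 and variance 1, and that 1000 trials are run; under those assumptions the approximation comes out close to the 3.26 quoted above. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329             # Euler-Mascheroni constant


def expected_max_sharpe(n_trials, variance=1.0):
    """Approximate E[max SR] over `n_trials` (> 1) trials with true SR = 0."""
    z = NormalDist()
    a = z.inv_cdf(1.0 - 1.0 / n_trials)
    b = z.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(variance) * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def simulated_max_sharpe(n_trials, n_reps=300, seed=0):
    """Average of the best of `n_trials` standard normal draws over `n_reps` runs."""
    rng = random.Random(seed)
    maxima = [max(rng.gauss(0.0, 1.0) for _ in range(n_trials))
              for _ in range(n_reps)]
    return sum(maxima) / n_reps


# Both numbers are well above zero although every trial has a true Sharpe ratio of 0.
print(expected_max_sharpe(1000), simulated_max_sharpe(1000))
```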
The Deflated Sharpe Ratio

The main conclusion from the False Strategy Theorem is that, unless $\max_k\{\widehat{SR}_k\} \gg E[\max_k\{\widehat{SR}_k\}]$, the discovered strategy is likely to be a false positive.

Type II errors under multiple testing

The interaction between type I and type II errors

Appendix A: Testing on Synthetic data

Either from resampling or Monte Carlo