Authors
Shlok Mandloi, Aryaman Jalali, and Eugene Pinsky, Boston University, USA
Abstract
This paper introduces an adaptive framework for detecting outliers in financial time-series data, focusing on Exchange-Traded Funds (ETFs). The method combines hierarchical clustering with binary tree analysis to characterize ETF-specific patterns and isolate anomalies. Daily returns for nine ETFs and the S&P 500 index were collected over 24 years using the yfinance API. Regression against the S&P 500 removed market influence, leaving residuals that highlight ETF-specific behavior. Hierarchical clustering was applied to these residuals year by year, and each dendrogram was converted into a binary tree. Outliers were identified as the ETFs added last in the clustering and appearing as root nodes in the trees. Metrics such as tree height, breadth, and cluster compactness captured temporal patterns and deviations. Experimental results show that the framework detects anomalies during major market events, such as the 2008 financial crisis and the 2020 COVID-19 crash. This scalable and interpretable approach enhances anomaly detection in financial data analysis.
Keywords
Hierarchical Clustering, Outlier Detection, Financial Time-Series, Binary Tree Analysis, Anomaly Detection.
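The pipeline summarized in the abstract (market regression, residual clustering, last-merged outlier) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are synthetic rather than downloaded, the ETF names are hypothetical, the distance is correlation distance, the linkage is average linkage, and the helper names `market_residuals` and `last_merged` are invented for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist


def market_residuals(etf_returns, market_returns):
    """Regress each ETF column on the market (OLS with intercept); return residuals."""
    X = np.column_stack([np.ones_like(market_returns), market_returns])
    coef, *_ = np.linalg.lstsq(X, etf_returns, rcond=None)
    return etf_returns - X @ coef


def last_merged(residuals, names):
    """Return the single ETF joined in the final merge, or None if the
    final merge joins two multi-member clusters (or two singletons)."""
    # Assumed choices: correlation distance and average linkage.
    Z = linkage(pdist(residuals.T, metric="correlation"), method="average")
    n = len(names)
    final_pair = (int(Z[-1, 0]), int(Z[-1, 1]))
    # In a SciPy linkage matrix, indices below n refer to original observations.
    singles = [i for i in final_pair if i < n]
    return names[singles[0]] if len(singles) == 1 else None


# Synthetic one-year sample (hypothetical data, not real ETF returns).
rng = np.random.default_rng(0)
days = 252
market = rng.normal(0.0, 0.01, days)
common = rng.normal(0.0, 0.005, days)  # shared non-market factor
names = ["ETF_A", "ETF_B", "ETF_C", "ETF_D", "ETF_X"]
betas = np.array([0.9, 1.0, 1.1, 1.2, 1.0])

returns = market[:, None] * betas + rng.normal(0.0, 0.001, (days, 5))
returns[:, :4] += common[:, None]              # four ETFs move together
returns[:, 4] += rng.normal(0.0, 0.005, days)  # ETF_X follows its own shock

residuals = market_residuals(returns, market)
outlier = last_merged(residuals, names)
print("Last-merged ETF:", outlier)
```

In this synthetic setup the four ETFs that share a non-market factor cluster tightly, so the independently driven one should be the last to join. Repeating the two calls on each calendar year of residuals would give the annual view described in the abstract; the tree-based metrics (height, breadth, compactness) are not covered by this sketch.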