Laura Palagi, Sapienza University of Rome
In recent years there has been growing attention to machine learning models that can provide explanatory insights into the decisions made by the algorithm. Thanks to their interpretability, decision trees have been intensively studied for classification tasks. More recently, due to the remarkable advances in Mixed-Integer Programming (MIP), various approaches have been proposed to formulate the Optimal Classification Tree (OCT) problem as a MIP model, starting from the seminal paper [Bertsimas D. and J. Dunn. "Optimal classification trees." Machine Learning 106.7 (2017)]. In this talk, we will overview classification trees and random forests and the optimal classification tree formulation. Drawing inspiration from Support Vector Machines [Cortes C., and V. Vapnik. "Support-vector networks." Machine Learning 20.3 (1995)], we present a novel formulation for classification tasks, named MAximum MaRGin Optimal Tree (MARGOT), which embeds maximum margin multivariate hyperplanes nested in a binary tree structure. The resulting model is a Mixed Integer Quadratic Programming problem. The model can also include feature selection constraints, which allow defining a hierarchy on the subsets of features that most affect the outcome. We present numerical experiments on standard benchmark instances and compare optimization performance and generalization capabilities.
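To illustrate the tree structure the abstract refers to, the following is a minimal sketch (not the MARGOT model itself) of a classification tree whose internal nodes are multivariate hyperplanes w·x + b routing points left or right; in MARGOT these hyperplanes would be chosen by a maximum-margin MIQP, whereas here they are fixed by hand purely for demonstration.

```python
import numpy as np

class Node:
    """A tree node: internal nodes hold a hyperplane (w, b), leaves a label."""
    def __init__(self, w=None, b=0.0, left=None, right=None, label=None):
        self.w, self.b = w, b            # hyperplane parameters (internal node)
        self.left, self.right = left, right
        self.label = label               # class label (leaf node only)

def classify(node, x):
    """Route x down the tree: left branch if w·x + b <= 0, else right."""
    while node.label is None:
        node = node.left if np.dot(node.w, x) + node.b <= 0 else node.right
    return node.label

# Hand-picked (illustrative) hyperplanes separating three regions of the plane.
tree = Node(
    w=np.array([1.0, 0.0]), b=-1.0,          # root split: x1 <= 1 vs x1 > 1
    left=Node(label=0),
    right=Node(
        w=np.array([0.0, 1.0]), b=-1.0,      # nested split: x2 <= 1 vs x2 > 1
        left=Node(label=1),
        right=Node(label=2),
    ),
)

print(classify(tree, np.array([0.0, 0.0])))  # 0
print(classify(tree, np.array([2.0, 0.0])))  # 1
print(classify(tree, np.array([2.0, 2.0])))  # 2
```

Because every internal node uses a full weight vector w rather than a single feature, the splits are oblique (multivariate); a max-margin criterion, as in SVMs, would additionally push each hyperplane away from the training points it separates.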