Classifier based on straight line segments: an overview and theoretical improvements
Abstract
The literature offers several supervised machine learning algorithms for
binary classification in everyday problems. Compared to well-known
conventional classifiers, the Straight-line Segment Classifier (SLS Classifier)
stands out for its low complexity and competitiveness. It takes advantage
of some good characteristics of Learning Vector Quantization and Nearest
Feature Line. In addition, it has lower computational complexity than Support
Vector Machines. The SLS binary classifier is based on distances between
a set of points and two sets of straight line segments. Training therefore
involves finding the optimal placement of straight line segment extremities
to achieve the minimum mean square error. In previous works, we explored
three different evolutionary algorithms as optimization methods, generating
different solutions as the initial population to increase the chances of
finding a global optimum. Additionally, we proposed a new way of estimating
the number of straight line segments by applying an unsupervised clustering
method. However, some questions remained open, such as a detailed
analysis of the parameters and base definitions of the optimization
algorithm. Furthermore, it became evident that the straight line segment
lengths can grow significantly during the training phase, negatively
impacting the classification rate. Therefore, the main goal
of this thesis is to outline the SLS Classifier baseline and propose some theoretical
improvements, namely (i) formulating an optimization approach to
provide optimal final positions for the straight line segments; (ii) proposing
a model selection approach for the SLS Classifier; and (iii) determining
the SLS Classifier performance when applied to 10 artificial and 8 public
UCI datasets. The proposed methodology showed promising
results compared to the original SLS Classifier version and other classifiers.
Moreover, this classifier can be used in research and industry for
decision-making problems owing to its straightforward interpretation and
its classification rates.
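The distance-based decision rule summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the pseudo-distance (d(x,p) + d(x,q) - d(p,q)) / 2 between a point x and a segment with extremities p and q, the inverse-distance discriminant, the sigmoid output, and all names (pseudo_distance, discriminant, predict, mean_squared_error, eps, gain) are hypothetical and should not be read as the thesis's implementation.

```python
import math

def pseudo_distance(x, p, q):
    """Assumed pseudo-distance from point x to the segment with extremities p and q.
    It is zero when x lies on the segment and grows as x moves away from it."""
    return (math.dist(x, p) + math.dist(x, q) - math.dist(p, q)) / 2.0

def discriminant(x, segments_pos, segments_neg, eps=1e-6):
    """Sum of inverse pseudo-distances to the class-1 segments minus the same
    sum for the class-0 segments (eps avoids division by zero)."""
    pos = sum(1.0 / (pseudo_distance(x, p, q) + eps) for p, q in segments_pos)
    neg = sum(1.0 / (pseudo_distance(x, p, q) + eps) for p, q in segments_neg)
    return pos - neg

def predict(x, segments_pos, segments_neg):
    """Assign class 1 when x is closer, in the sense above, to the class-1 segments."""
    return 1 if discriminant(x, segments_pos, segments_neg) >= 0 else 0

def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def mean_squared_error(points, labels, segments_pos, segments_neg, gain=1.0):
    """Mean square error between the labels and the sigmoid of the discriminant.
    Training would move the segment extremities to reduce this value."""
    errors = [
        (_sigmoid(gain * discriminant(x, segments_pos, segments_neg)) - y) ** 2
        for x, y in zip(points, labels)
    ]
    return sum(errors) / len(errors)

# Toy usage: one segment per class in the plane.
segments_pos = [((0.0, 0.0), (1.0, 0.0))]
segments_neg = [((0.0, 2.0), (1.0, 2.0))]
points = [(0.5, 0.1), (0.5, 1.9)]
labels = [1, 0]
predictions = [predict(x, segments_pos, segments_neg) for x in points]
mse = mean_squared_error(points, labels, segments_pos, segments_neg)
```

In this sketch the objective depends only on the segment extremities, which is why an optimizer (evolutionary or otherwise) can search over their positions; it also shows how unconstrained extremities could drift far apart during training.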
Subjects
Machine learning (Artificial intelligence)
Algorithms
Submitted for the degree of
Doctor of Engineering