Abstract
In many classification problems, the labels of neighboring data have inherent sequential relationships. Sequential learning algorithms take advantage of these relationships to improve generalization. In this paper, we revise the multi-scale sequential learning approach (MSSL) to apply it to the multi-class case (MMSSL). We introduce the error-correcting output codes framework into the MSSL classifiers and propose a formulation for calculating confidence maps from the margins of the base classifiers. In addition, we propose an MMSSL compression approach that reduces the number of features in the extended data set without a loss in performance. The proposed methods are tested on several databases, showing significant performance improvement compared to classical approaches.
Abbreviations
- X: Set of samples
- Y: Set of labels
- x: A sample
- y: A label
- h(x): A classifier
- \(y^{\prime}\): A prediction from a classifier
- \(y^{\prime\prime}\): A final prediction from a chain of classifiers
- \(x^{ext}\): Extended set
- J: Neighborhood relationship function
- z: Neighborhood model features
- ρ: Neighborhood
- θ: Neighborhood parameterization
- w: Number of elements in the neighborhood window
- s: Number of scales
- c: Set of different classes in a multi-class problem
- \(\hat{F}(\mathbf{x}, c)\): A prediction confidence map
- N: Number of classes in a multi-class problem
- n: Number of dichotomizers
- σ: Parameter of a Gaussian filter
- Σ: Set of scales defined by σ parameters
- b: A dichotomizer
- M: ECOC coding matrix
- \({\mathcal{Y}}\): A class codeword in the ECOC framework
- \({\mathcal{X}}\): A sample prediction codeword in the ECOC framework
- \(m_{x}\): Margin for a prediction of sample x
- β: Constant which governs the transition in a sigmoid function
- t: Number of iterations in an AdaBoost classifier
- δ: A soft distance
- α: Normalization parameter for soft distance δ
- \(g_{\sigma}\): A multidimensional isotropic Gaussian filter with zero mean and standard deviation σ
- \({\mathcal{P}}\): A set of partitions of classes
- P: A partition of groups of classes
- γ: A symbol in a partition codeword
- \(\Upgamma\): A partition codeword
- R: The mean ranking of each system configuration
- E: The total number of experiments
- k: The total number of system configurations
- \(\chi_{2}^{F}\): Friedman statistic value
Acknowledgments
This work has been supported in part by the projects TIN2009-14404-C02, IMSERSO Mediminder, and RecerCaixa 2011 Remedi.
Cite this article
Puertas, E., Escalera, S. & Pujol, O. Generalized multi-scale stacked sequential learning for multi-class classification. Pattern Anal Applic 18, 247–261 (2015). https://doi.org/10.1007/s10044-013-0333-y
DOI: https://doi.org/10.1007/s10044-013-0333-y