Generalized multi-scale stacked sequential learning for multi-class classification

Puertas, Eloi; Escalera, Sergio; Pujol, Oriol

doi:10.1007/s10044-013-0333-y

Theoretical Advances
Published: 19 April 2013

Volume 18, pages 247–261, (2015)
Pattern Analysis and Applications

Eloi Puertas^1,2,
Sergio Escalera^1,2 &
Oriol Pujol^1,2

Abstract

In many classification problems, the labels of neighboring data have inherent sequential relationships. Sequential learning algorithms take advantage of these relationships to improve generalization. In this paper, we revise the multi-scale stacked sequential learning approach (MSSL) to apply it to the multi-class case (MMSSL). We introduce the error-correcting output codes (ECOC) framework into the MSSL classifiers and propose a formulation for calculating confidence maps from the margins of the base classifiers. In addition, we propose an MMSSL compression approach that reduces the number of features in the extended data set without loss of performance. The proposed methods are tested on several databases, showing significant performance improvement compared to classical approaches.
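The extended data set mentioned in the abstract can be illustrated with a minimal sketch. It assumes a 1-D sequence, per-class confidence maps already produced by a first-stage classifier, and a neighborhood sampled at displacements proportional to each scale σ. The function names (`gaussian_kernel`, `smooth`, `extend_features`), the 3σ kernel radius, the border handling, and the sampling pattern are all illustrative assumptions, not the authors' implementation.

```python
import math

def gaussian_kernel(sigma):
    """Discrete zero-mean Gaussian kernel, normalized to sum to 1.
    The 3*sigma truncation radius is an assumption of this sketch."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve a 1-D sequence with a Gaussian, replicating border values."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

def extend_features(x, confidence, sigmas, offsets=(-1, 0, 1)):
    """Build x^ext: original features plus neighborhood features z.

    x          : list of per-sample feature lists (sequence of length T)
    confidence : confidence[c][t], confidence of class c at position t
    sigmas     : set of scales
    offsets    : sampled neighbors; displacement scales with sigma (assumption)
    """
    T = len(x)
    # One smoothed copy of every class confidence map per scale.
    smoothed = [[smooth(conf_c, s) for s in sigmas] for conf_c in confidence]
    x_ext = []
    for t in range(T):
        z = []
        for per_class in smoothed:
            for s, sm in zip(sigmas, per_class):
                for o in offsets:
                    j = min(max(t + int(round(o * s)), 0), T - 1)
                    z.append(sm[j])
        x_ext.append(list(x[t]) + z)
    return x_ext

# Toy usage: 6 samples, 2 original features, 3 classes, 2 scales, 3 offsets.
x = [[0.1, 0.2] for _ in range(6)]
confidence = [
    [0.9, 0.8, 0.7, 0.2, 0.1, 0.1],
    [0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
    [0.05, 0.15, 0.25, 0.75, 0.85, 0.85],
]
x_ext = extend_features(x, confidence, sigmas=[1.0, 2.0])
# Each extended sample has 2 + 3 classes * 2 scales * 3 offsets = 20 features.
```

In this sketch the number of added features grows as (number of classes) × (number of scales) × (neighborhood size), which is the kind of growth a compression step would aim to reduce.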

Abbreviations

X :: Set of samples
Y :: Set of labels
x :: A sample
y :: A label
h(x) :: A classifier
\(y^{\prime}\) :: A prediction from a classifier
\(y^{\prime\prime}\) :: A final prediction from a chain of classifiers
x^ext :: Extended set
J :: Neighborhood relationship function
z :: Neighborhood model features
ρ :: Neighborhood
θ :: Neighborhood parameterization
w :: Number of elements in the neighborhood window
s :: Number of scales
c :: Set of different classes in a multi-class problem
\(\hat{F}(\mathbf{x}, c)\) :: A prediction confidence map
N :: Number of classes in a multi-class problem
n :: Number of dichotomizers
σ :: Parameter of a Gaussian filter
Σ :: Set of scales defined by σ parameters
b :: A dichotomizer
M :: ECOC coding matrix
\({\mathcal{Y}}\) :: A class codeword in ECOC framework
\({\mathcal{X}}\) :: A sample prediction codeword in ECOC framework
m_x :: Margin for a prediction of sample x
β :: Constant which governs the transition in a sigmoid function
t :: Number of iterations in an AdaBoost classifier
δ :: A soft distance
α :: Normalization parameter for soft distance δ
g^σ :: A multidimensional isotropic Gaussian filter with zero mean and standard deviation σ
\({\mathcal{P}}\) :: A set of partitions of classes
P :: A partition of groups of classes
γ :: A symbol in a partition codeword
\(\Gamma\) :: A partition codeword
R :: The mean ranking for each system configuration
E :: The total number of experiments
k :: The total number of system configurations
\(\chi_{F}^{2}\) :: Friedman statistic value
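The last four symbols relate through the standard Friedman test, \(\chi_{F}^{2} = \frac{12E}{k(k+1)}\left[\sum_{j} R_{j}^{2} - \frac{k(k+1)^{2}}{4}\right]\). The sketch below computes it from the mean rankings; the function name `friedman_statistic` and the toy rankings are illustrative assumptions, and it is assumed here that the article uses this textbook form.

```python
def friedman_statistic(mean_ranks, num_experiments):
    """Friedman statistic from the mean rank R_j of each of k configurations
    over E experiments (textbook form; assumed to match the article)."""
    k = len(mean_ranks)
    E = num_experiments
    sum_sq = sum(r * r for r in mean_ranks)
    return 12.0 * E / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)

# Toy usage with hypothetical rankings of k = 3 configurations over E = 10 experiments.
no_difference = friedman_statistic([2.0, 2.0, 2.0], 10)    # all configurations tie
clear_ordering = friedman_statistic([1.0, 2.0, 3.0], 10)   # same order in every experiment
```

When every configuration has the same mean rank (k+1)/2 the statistic is zero, and it grows as the mean ranks spread apart.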

Acknowledgments

This work has been supported in part by the projects TIN2009-14404-C02, IMSERSO Mediminder, and RecerCaixa 2011 Remedi.

Author information

Authors and Affiliations

Dept. Matemàtica Aplicada i Anàlisi, Universitat de Barcelona, Gran Via 585, 08007, Barcelona, Spain
Eloi Puertas, Sergio Escalera & Oriol Pujol
Computer Vision Center, Campus UAB, Edifici O, 08193, Bellaterra, Spain
Eloi Puertas, Sergio Escalera & Oriol Pujol

Corresponding author

Correspondence to Eloi Puertas.

About this article

Cite this article

Puertas, E., Escalera, S. & Pujol, O. Generalized multi-scale stacked sequential learning for multi-class classification. Pattern Anal Applic 18, 247–261 (2015). https://doi.org/10.1007/s10044-013-0333-y

Received: 29 March 2012
Accepted: 03 April 2013
Published: 19 April 2013
Issue Date: May 2015
DOI: https://doi.org/10.1007/s10044-013-0333-y
