Elsevier

Pattern Recognition

Volume 42, Issue 11, November 2009, Pages 2372-2391

Variance reduction techniques in particle-based visual contour tracking

https://doi.org/10.1016/j.patcog.2009.04.007

Abstract

This paper presents a comparative study of three different strategies to improve the performance of particle filters in the context of visual contour tracking: the unscented particle filter, the Rao-Blackwellized particle filter, and the partitioned sampling technique. The tracking problem analyzed is the joint estimation of the global and local transformation of the outline of a given target, represented following the active shape model approach. The main contributions of the paper are the novel adaptations of the considered techniques to this generic problem, and the quantitative assessment of their performance through extensive experiments.

Introduction

Visual contour tracking is an area of research that has received much attention from the computer vision community for many years. One essential reason is that, in many application domains, the contour of an object is a very informative cue about its state or configuration. Evidence of this is the application of contour tracking in areas such as visual surveillance [1], traffic monitoring [2], medical diagnosis [3], [4] and human–machine interaction [5], [6], among others.

The tracking of contours has been posed mainly either as a minimization problem or as an inference problem. Following the first perspective, the so-called active contour methods iteratively adapt an elastic curve to image edges, while imposing some constraints on it (e.g., smoothness and compactness). The classical snakes approach [7] does so by minimizing an energy term associated with a parametric curve. Geodesic active contours [9], which in most situations generalize classical snakes [10], pose the problem from a geometric point of view. Targets are segmented using an implicit contour representation: a non-parametric surface is evolved according to image edges, and the tracked contour is the zero level set of this surface. The main advantage of this level set-based approach is that topological changes of the original curve are handled naturally. Extensions of this work, in which contours are defined in terms of the content of the region that they enclose, constitute the active region methods [11], [12], [13].

An important disadvantage of minimization-based approaches is the possibility of converging to a local minimum and losing track of the target. This drawback can be treated in a principled way by posing contour tracking as an inference problem. The goal is then to estimate the posterior density of a contour given image observations. Minimization-based approaches can be interpreted as a way of determining the maximum a posteriori estimate of this density, implicitly assuming that it is unimodal. Problems appear when this density is not unimodal; they can be avoided if the whole density is estimated. This paper studies contour tracking from this perspective.

Formally, given a parametric model of the contour to be tracked, the goal is to estimate at each instant $t$ the probability density function (PDF) of the model parameters $x_t$ (i.e., the contour state), conditioned on the observations up to $t$ (i.e., $y_{1:t} = [y_i]_{i=1}^{t}$). In many applications this PDF can reasonably be assumed Gaussian, and its parameters can be efficiently estimated by means of Kalman-based filters. However, in cluttered scenes this Gaussian assumption is usually too coarse, since the PDF in fact presents multiple modes. This happens when more than one model parameterization fits the image observations tightly, due to the presence of both the tracked shape and other distractors in the scene. In these cases, it seems reasonable to maintain more than one contour tracking hypothesis, and in that way make sure to keep track of the one that actually fits the object of interest. A principled way to do so consists in representing $p(x_t \mid y_{1:t})$ by means of a population of particles (i.e., concrete $x_t$ instances), distributed (ideally) according to this PDF. In that way, any arbitrary form of the filtering density can be properly handled, which results in a tracking performance that is more robust to clutter. Providing a proper particle-based representation of $p(x_t \mid y_{1:t})$ is the objective pursued by the so-called particle filters (PFs). Briefly, PFs are stochastic sampling methods that sequentially approximate $p(x_t \mid y_{1:t})$ by combining a particle-based representation of this density at the previous instant $t-1$ (i.e., $p(x_{t-1} \mid y_{1:t-1})$) with the newly collected observations $y_t$. For this reason they are commonly referred to as sequential Monte Carlo methods, and good reviews of their theoretical basis can be found in [14], [15], [16]. PFs were first applied to the problem of contour tracking by Isard and Blake [17], [18], in a particular form that they termed the Condensation algorithm.
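
As a rough illustration of the sequential scheme just described, the following minimal sketch runs a bootstrap-style particle filter (resample, predict, weight) on a scalar toy state. The random-walk dynamics, the Gaussian likelihood, the noise levels and all function names are assumptions chosen for illustration; they are not the contour model or the observation model used in the paper.

```python
import numpy as np


def bootstrap_pf_step(particles, weights, y, rng, q_std=0.5, r_std=1.0):
    """One resample-predict-weight step for a scalar toy state.

    Assumed toy model (not the paper's):
      x_t = x_{t-1} + N(0, q_std^2),   y_t = x_t + N(0, r_std^2).
    """
    n = particles.size
    # Resample: draw particle indices according to the previous weights.
    idx = rng.choice(n, size=n, p=weights)
    # Predict: propagate the resampled particles through the dynamics.
    predicted = particles[idx] + rng.normal(0.0, q_std, size=n)
    # Weight: evaluate the likelihood of the new observation.
    log_w = -0.5 * ((y - predicted) / r_std) ** 2
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    return predicted, w / w.sum()


def effective_sample_size(weights):
    """Common degeneracy diagnostic: 1 / sum(w_i^2), between 1 and N."""
    return 1.0 / np.sum(weights ** 2)


def run_demo(n_particles=500, n_steps=50, seed=0):
    """Track a simulated random walk and return (RMSE, final ESS)."""
    rng = np.random.default_rng(seed)
    truth = 0.0
    particles = rng.normal(0.0, 1.0, size=n_particles)
    weights = np.full(n_particles, 1.0 / n_particles)
    errors = []
    for _ in range(n_steps):
        truth += rng.normal(0.0, 0.5)       # hidden state evolves
        y = truth + rng.normal(0.0, 1.0)    # noisy observation of it
        particles, weights = bootstrap_pf_step(particles, weights, y, rng)
        estimate = np.sum(weights * particles)  # posterior mean estimate
        errors.append(estimate - truth)
    return np.sqrt(np.mean(np.square(errors))), effective_sample_size(weights)


if __name__ == "__main__":
    rmse, ess = run_demo()
    print(f"RMSE of the posterior mean: {rmse:.3f}, final ESS: {ess:.1f}")
```

In this sketch the particles are drawn from the dynamics alone, so the new observation only enters through the weights. The effective sample size is included as one common way to monitor how unevenly those weights are distributed.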

The Condensation algorithm is definitely the most popular form of PF in vision-based tracking applications. However, its computational cost (which depends on the number of particles needed to represent $p(x_t \mid y_{1:t})$ properly) increases exponentially with the number of parameters of the target model used. That is, it suffers from the curse of dimensionality. This is a serious drawback in contour tracking problems. In general, the targets being tracked present global transformations of their outline (e.g., translations, rotations, etc.), as well as simultaneous local shape deformations. Consequently, the dimension of the parametric model of the contour is rather large, which makes robust tracking costly. The good news is that, since this dependence of Condensation on the state dimensionality is well known, different generic strategies have been proposed to counteract it. In this paper we analyze the performance of three of these strategies in the context of visual contour tracking: the unscented particle filter (UPF), the Rao-Blackwellized particle filter (RBPF), and the partitioned sampling (PS) technique. We contribute their adaptation to contour tracking using active shape models (ASMs). Developed mainly by British research groups in Leeds, Oxford and Manchester [19], [20], [21], ASMs represent the outline of an object by means of a parametric model whose representational power is limited to a given space of transformations, whether generic (e.g., Euclidean or affine transformations of a basic shape) or specific (shape deformations learned from the statistical analysis of training data). Our study focuses on ASMs since they naturally exploit the a priori knowledge of the feasible shapes that a target can take. As traditionally formulated, they do not consider topological changes of the contour. However, it is shown in [22] that applying the same principles to implicit contour representations yields a parametric model that can manage such cases. ASMs have been shown to be effective in many application domains [20], and thanks to their parametric nature, their use in inference-based contour tracking is direct. Nevertheless, as shown in [23], [24], strategies also exist for using non-parametric contour representations within this framework.

As will be stated in the respective sections, two of the three techniques studied in this paper (the UPF and the PS) have previously been applied to the contour tracking problem by other authors. However, our proposals differ significantly from the ones in these previous works. On the one hand, we use a more complete contour model, accounting for global and local shape transformations. On the other, our model of the contour observation process is more rigorous and accurate, leading to a better interpretation of the evidence extracted from frames. Another contribution of this paper is an exhaustive study of the performance of the proposed algorithms. This has been done using synthetic sequences, distorted with different levels of noise. Since the parameters used to generate the sequences are known, the performance of each technique has been measured quantitatively. This has allowed us to rank the proposed algorithms in each evaluated situation, and to identify their strengths and weaknesses. The algorithms have also been tested on real sequences, in the contexts of hand and pedestrian tracking.

The remainder of this paper is organized as follows: Section 2 gives an overview of the contour model representation used (the ASM), and introduces the approach used to jointly account for the affine transformations and the local deformations of a given shape of interest. Then, Section 3 focuses on modeling the shape evolution along time, and Section 4 on how the shape model relates to observations in images. Section 5 formalizes the visual tracking of contours as a Bayesian inference problem, and presents the general solution to this problem given by the importance sampling technique, which has led to the so-called particle filtering. The main drawbacks of this approach are remarked, and three different strategies to deal with them are adapted in the following sections to the contour tracking problem: the UPF (Section 6), the RBPF (Section 7), and the PS (Section 8). A comparative study of the performance of these approaches is presented in Section 9, and final conclusions are provided in Section 10. A list of the abbreviations used in the paper is given in Table 1.

Section snippets

Contour representation

In many application domains the use of shape tracking algorithms is motivated by the need not only to localize a given target, but also to identify its specific pose or configuration. To achieve this, a generative model of the target shape variability is required. Many authors have developed representations of shape variability in many different ways. Refs. [19], [20], [21] review the major contributions in this field, and then focus on the description of a model-based approach to

Contour dynamics

In visual contour tracking applications, the usual approach to describing the expected evolution of the parameters of a given model is an expression based on discrete time series. Given a parameter $c$, an auto-regressive (AR) process of order $n$, $c_t = \sum_{k=1}^{n} \alpha_k c_{t-k} + b_0 w_t$, is used to describe its dynamics, where $\alpha_k$ are real constants, with $\alpha_k \neq 0$, and $b_0 w_t$ is a stochastic disturbance term corresponding to a Gaussian white noise process with distribution $\mathcal{N}(0, b_0 b_0^T)$.
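
To make the AR expression concrete, the sketch below simulates a scalar parameter driven by an order-n auto-regressive process with unit-variance Gaussian noise scaled by a constant. The function name `simulate_ar` and the coefficient values in the usage note are hypothetical; they are not the dynamics learned or used in the paper.

```python
import numpy as np


def simulate_ar(alphas, b0, c_init, n_steps, rng):
    """Simulate c_t = sum_k alphas[k-1] * c_{t-k} + b0 * w_t, with w_t ~ N(0, 1).

    c_init holds the first n values (oldest first), where n = len(alphas).
    Returns the initial values followed by n_steps simulated values.
    """
    alphas = np.asarray(alphas, dtype=float)
    n = alphas.size
    c = list(np.asarray(c_init, dtype=float))
    if len(c) != n:
        raise ValueError("c_init must provide exactly len(alphas) values")
    for _ in range(n_steps):
        # Most recent value first, so that past[k-1] == c_{t-k}.
        past = np.array(c[-1:-n - 1:-1])
        c.append(float(alphas @ past) + b0 * rng.normal())
    return np.array(c)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical second-order coefficients giving a damped, stable response.
    trajectory = simulate_ar([1.6, -0.64], 0.1, [0.0, 0.0], 200, rng)
    print(trajectory[-5:])
```

With `alphas=[2.0, -1.0]` and `b0=0.0`, the same function reduces to a noise-free constant-velocity extrapolation, which is a convenient sanity check of the indexing.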

For mathematical convenience,

Contour observation model

Once the shape of the target and its expected dynamical behavior have been modeled, it remains to clarify how the model relates to the available information about the target to be tracked (i.e., its measurements). This is the task accomplished by the system observation model.

In contour tracking applications, observations usually correspond to salient edges in frames. The typical procedure to extract them is based on synthesizing the contour expected to be found in a frame, and establishing several measurement

Contour tracking using importance sampling

The goal of PFs is to generate a particle set that properly represents the distribution $p(x_t \mid y_{1:t})$. This density is just the marginal of $p(x_{0:t} \mid y_{1:t})$, which we use in the following to formalize PFs more easily. Expressions derived to properly represent $p(x_{0:t} \mid y_{1:t})$ will necessarily characterize $p(x_t \mid y_{1:t})$. Hence, the task is to determine a particle set that corresponds to a random sampling of $p(x_{0:t} \mid y_{1:t})$. Since this density is unknown, generating samples from it is not

Unscented particle filter

The key point of an SISR algorithm is the importance function $q(x_t \mid x_{0:t-1}, y_{1:t})$ used to generate particles at each time step. In [14] it is proved that the optimal function for this task (in terms of minimizing the variance of the particle weights $w_t$) is $p(x_t \mid x_{0:t-1}, y_{1:t})$. Note that this optimal importance sampling density (OISD) is conditioned on the current observations. In practical applications, determining it is commonly a non-trivial task. The unscented particle filter [38] proposes as

Rao-Blackwellized particle filter

The strategy considered in this section is based on factorizing the desired density $p(x_{0:t} \mid y_{1:t})$. If the state to be estimated is divided into two parts, $x_{0:t} = [x_{0:t}^{P1} \; x_{0:t}^{P2}]^T$, then this density can be factored as $p(x_{0:t}^{P1}, x_{0:t}^{P2} \mid y_{1:t}) = p(x_{0:t}^{P1} \mid y_{1:t})\, p(x_{0:t}^{P2} \mid x_{0:t}^{P1}, y_{1:t})$. The Rao-Blackwellization (RB) technique [14], [46] proposes to use this structural information to infer analytically one part of the state ($x_{0:t}^{P2}$) conditionally upon the other part ($x_{0:t}^{P1}$), which is estimated,

Partitioned sampling

Our third proposal to deal with the curse of dimensionality in PFs arises from the following observation: in most real contour tracking applications, the dynamics of global and local transformations are independent, and therefore they can be modeled separately. Would it also be possible to estimate them separately? In general, the answer to this question is clearly no, as observations jointly manifest the effect of both transformations. However, paying attention to specific contour

Experimental results

This section details the work done to objectively assess the effectiveness of the proposed algorithms. Synthetic and real sequences have been processed to evaluate them quantitatively under different criteria. Examples of the performance achieved can be seen in the videos available at www.cvc.uab.es/adas/projects/contourtracking/PR/.

Conclusions

In this paper we have proposed novel adaptations of three well-known variance-reduction techniques to the problem of particle-based visual contour tracking using ASMs: the UPF, the RBPF, and the PS algorithm. Our proposals differ from other approaches in the shape model used, which accounts for local and global contour transformations, and in a more rigorous model of the contour observation process, which leads to a more accurate interpretation of the evidence extracted from frames.

In the

Acknowledgments

This work has been partially funded by Grants TRA2007–62526/AUT of the Spanish Education and Science Ministry, and Consolider Ingenio 2010: MIPRCV (CSD2007–00018).

About the Author—DANIEL PONSA received the BSc degree in Computer Science from the Universitat Autònoma de Barcelona (UAB) in 1996, the MSc degree in Computer Vision in 1998, and the PhD in 2007. From 1996 till 2003 he worked as teaching assistant at the Computer Science Department of the UAB. He is currently a full-time researcher at the Computer Vision Center research group on advanced driver assistance systems by computer vision.

References (66)

  • M. Kass et al., Snakes: active contour models, International Journal of Computer Vision (1988)
  • V. Caselles et al., Geodesic active contours, International Journal of Computer Vision (1997)
  • C. Xu, A. Yezzi, J.L. Prince, On the relationship between parametric and geometric active contours, in: 34th Asilomar...
  • D. Freedman et al., Active contours for tracking distributions, IEEE Transactions on Image Processing (2004)
  • T. Zhang et al., Improving performance of distribution tracking through background mismatch, IEEE Transactions on Pattern Analysis and Machine Intelligence (2005)
  • A. Doucet et al., On sequential Monte Carlo sampling methods for Bayesian filtering, Statistics and Computing (2000)
  • A. Doucet et al., Sequential Monte Carlo Methods in Practice (2001)
  • M.S. Arulampalam et al., A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking, IEEE Transactions on Signal Processing (2002)
  • M. Isard, A. Blake, Contour tracking by stochastic propagation of conditional density, in: Proceedings of the European...
  • M. Isard et al., CONDENSATION—conditional density propagation for visual tracking, International Journal of Computer Vision (1998)
  • A. Baumberg, Learning deformable models for tracking human motion, Ph.D. Thesis, The University of Leeds, School of...
  • A. Blake et al., Active Contours (1998)
  • T. Cootes, C. Taylor, Statistical models of appearance for computer vision, Technical Report, Imaging Science and...
  • D. Cremers, Dynamical statistical shape priors for level set-based tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence (2006)
  • Y. Rathi et al., A generic framework for tracking using particle filter with dynamic shape prior, IEEE Transactions on Image Processing (2007)
  • Y. Rathi et al., Tracking deforming objects using particle filtering for geometric active contours, IEEE Transactions on Pattern Analysis and Machine Intelligence (2007)
  • P. Tissainayagam, Visual tracking: development, performance evaluation, and motion model switching, Ph.D. Thesis,...
  • D. Reynard, A. Wildenberg, A. Blake, J.A. Marchant, Learning dynamics of complex motions from image sequences, in:...
  • A. Wildenberg, Learning and initialisation for visual tracking, Ph.D. Thesis, University of Oxford, Robotics Research...
  • B. North, A. Blake, Learning dynamical models using expectation-maximisation, in: ICCV, 1998, pp....
  • J. MacCormick, Probabilistic modelling and stochastic algorithms for visual localisation and tracking, Ph.D. Thesis,...
  • D. Mackay, Introduction to Monte Carlo methods, Learning in Graphical Models, MIT Press, Cambridge, MA, 1999, pp....
  • A. Kong et al., Sequential imputations and Bayesian missing data problems, Journal of the American Statistical Association (1994)

About the Author—ANTONIO M. LÓPEZ received the BSc degree in Computer Science from the Universitat Politècnica de Catalunya in 1992, the MSc degree in image processing and artificial intelligence from the Universitat Autònoma de Barcelona (UAB) in 1994 and the PhD in 2000. Since 1992 he has been giving lectures at the Computer Science Department of the UAB, where he is currently an associate professor. He is responsible for the research group on advanced driver assistance systems by computer vision in the Computer Vision Center at the UAB.
