Abstract

A key topic in classification is the accuracy loss produced when the data distribution in the training (source) domain differs from that in the testing (target) domain. This is increasingly recognized as a relevant problem for many computer vision tasks such as image classification, object detection, and object category recognition. In this paper, we present a novel domain adaptation method that leverages multiple target domains (or sub-domains) in a hierarchical adaptation tree. The core idea is to exploit the commonalities and differences of the jointly considered target domains. Given the relevance of structural SVM (SSVM) classifiers, we apply our idea to the adaptive SSVM (A-SSVM; Xu et al., IEEE Trans Pattern Anal Mach Intell 36(12):2367–2380, 2014a), which requires only the target-domain samples together with the existing source-domain classifier to perform the desired adaptation. We term our proposal hierarchical A-SSVM (HA-SSVM). As a proof of concept, we use HA-SSVM for pedestrian detection, object category recognition, and face recognition. For pedestrian detection we apply HA-SSVM to the deformable part-based model (DPM; Felzenszwalb et al., IEEE Trans Pattern Anal Mach Intell 32(9):1627–1645, 2010), while for the other two tasks HA-SSVM is applied to multi-category classifiers. We show that HA-SSVM increases detection/recognition accuracy with respect to adaptation strategies that ignore the structure of the target data. Since the sub-domains of the target data are not always known a priori, we also show how HA-SSVM can incorporate sub-domain discovery for object category recognition.
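
To make the idea of a hierarchical adaptation tree concrete, the following is a minimal sketch and not the HA-SSVM formulation from the paper. It assumes a plain binary linear SVM instead of a structural SVM, uses an adaptive-SVM style regulariser that penalises the distance to a parent classifier, and trains the tree level by level rather than jointly. The function names `adapt_linear_svm` and `adapt_tree`, the hyperparameters, and the two-level tree layout are all hypothetical.

```python
def adapt_linear_svm(parent_w, samples, labels, C=1.0, epochs=200, lr=0.01):
    """Adapt a linear classifier toward new data while staying near parent_w.

    Minimises 0.5 * ||w - parent_w||^2 + C * sum(hinge losses) with plain
    batch subgradient descent. Labels are expected to be +1 or -1.
    """
    w = list(parent_w)
    for _ in range(epochs):
        # Regulariser gradient: pulls w back toward the parent classifier.
        grad = [wi - pi for wi, pi in zip(w, parent_w)]
        for x, y in zip(samples, labels):
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            if margin < 1.0:
                # Hinge-loss subgradient for samples inside the margin.
                for j, xj in enumerate(x):
                    grad[j] -= C * y * xj
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def adapt_tree(source_w, subdomains, **kw):
    """Two-level adaptation tree (an illustrative assumption).

    subdomains maps a sub-domain name to a (samples, labels) pair.
    The root is adapted from the source classifier on the pooled target
    data; each leaf is then adapted from the root on its own sub-domain.
    """
    pooled_x = [x for xs, _ in subdomains.values() for x in xs]
    pooled_y = [y for _, ys in subdomains.values() for y in ys]
    root_w = adapt_linear_svm(source_w, pooled_x, pooled_y, **kw)
    leaves = {
        name: adapt_linear_svm(root_w, xs, ys, **kw)
        for name, (xs, ys) in subdomains.items()
    }
    return root_w, leaves
```

In this sketch the root captures what the target sub-domains have in common, while each leaf is free to deviate from the root where its own data disagree with the pooled data. Only target samples and the source weights are needed, which mirrors the requirement stated for A-SSVM above.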

Notes

  1. It is publicly available under the name CVC-07 DPM Virtual-World Pedestrian Dataset at www.cvc.uab.es/adas.

  2. www.cvlibs.net/datasets/kitti/.

  3. The RMRC’2013 program shows that we gave a talk as winners of the pedestrian detection challenge; see http://ttic.uchicago.edu/~rurtasun/rmrc/program.php.

  4. http://www.cvlibs.net/datasets/kitti/eval_object.php.

  5. http://www-prima.inrialpes.fr/Pointing04/data-face.html.

References

  • Aytar, Y., & Zisserman, A. (2011). Tabula rasa: Model transfer for object category detection. In Proceedings of international conference on computer vision, Singapore.

  • Behley, J., Steinhage, V., & Cremers, A. B. (2013). Laser-based segment classification using a mixture of bag-of-words. In IEEE international conference on intelligent robots and systems, New York.

  • Ben-David, S., Blitzer, J., Crammer, K., Kulesza, A., Pereira, F., & Vaughan, J. (2009). A theory of learning from different domains. Machine Learning, 79(1), 151–175.

  • Bergamo, A., & Torresani, L. (2010). Exploring weakly-labeled web images to improve object classification: A domain adaptation approach. In Advances in neural information processing systems, Vancouver.

  • Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE conference on computer vision and pattern recognition, San Diego.

  • Daumé III, H. (2007). Frustratingly easy domain adaptation. In Meeting of the association for computational linguistics, Prague.

  • Daumé III, H. (2009). Bayesian multitask learning with latent hierarchies. In UAI, Montreal.

  • Deng, J., Krause, J., Berg, A., & Li, F.-F. (2012). Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition. In IEEE conference on computer vision and pattern recognition, Washington.

  • Dollár, P., Wojek, C., Schiele, B., & Perona, P. (2012). Pedestrian detection: An evaluation of the state of the art. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4), 743–761.

  • Duan, L., Tsang, I. W., Xu, D., & Chua, T.-S. (2009). Domain adaptation from multiple sources via auxiliary classifiers. In International conference on machine learning, Montreal.

  • Duan, L., Xu, D., & Tsang, I. W. (2012). Learning with augmented features for heterogeneous domain adaptation. In International conference on machine learning, Edinburgh.

  • Ess, A., Leibe, B., & Gool, L. V. (2007). Depth and appearance for mobile scene analysis. In International conference on computer vision, Rio de Janeiro.

  • Felzenszwalb, P., Girshick, R., McAllester, D., & Ramanan, D. (2010). Object detection with discriminatively trained part based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9), 1627–1645.

  • Finkel, J. R., & Manning, C. D. (2009). Hierarchical Bayesian domain adaptation. In NAACL, Colorado.

  • Geiger, A., Wojek, C., & Urtasun, R. (2011). Joint 3D estimation of objects and scene layout. In Advances in neural information processing systems, Granada.

  • Geiger, A., Lenz, P., & Urtasun, R. (2012). Are we ready for autonomous driving? The KITTI vision benchmark suite. In IEEE conference on computer vision and pattern recognition, Washington.

  • Georghiades, A., Belhumeur, P., & Kriegman, D. (2001). From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 643–660.

  • Girshick, R. (2012). From rigid templates to grammars: Object detection with structured models. Ph.D. thesis, The University of Chicago, Chicago.

  • Girshick, R., Felzenszwalb, P., & McAllester, D. (2012). Discriminatively trained deformable part models, release 5. http://www.people.cs.uchicago.edu/rbg/latent-release5/.

  • Gong, B., Shi, Y., Sha, F., & Grauman, K. (2012). Geodesic flow kernel for unsupervised domain adaptation. In IEEE conference on computer vision and pattern recognition, Providence.

  • Gong, B., Grauman, K., & Sha, F. (2013a). Connecting the dots with landmarks: Discriminatively learning domain-invariant features for unsupervised domain adaptation. In International conference on machine learning, Atlanta.

  • Gong, B., Grauman, K., & Sha, F. (2013b). Reshaping visual datasets for domain adaptation. In Advances in neural information processing systems, Lake Tahoe.

  • Gong, B., Grauman, K., & Sha, F. (2014). Learning kernels for unsupervised domain adaptation with applications to visual object recognition. International Journal of Computer Vision, 109(1–2), 3–27.

  • Gopalan, R., Li, R., & Chellappa, R. (2011). Domain adaptation for object recognition: An unsupervised approach. In International conference on computer vision, Barcelona.

  • Gourier, N., Hall, D., & Crowley, J. L. (2004). Estimating face orientation from robust detection of salient facial features. In International conference in pattern recognition, New York.

  • Griffin, G., Holub, A., & Perona, P. (2007). Caltech-256 object category dataset. Technical report, California Institute of Technology.

  • Hoffman, J., Kulis, B., Darrell, T., & Saenko, K. (2012). Discovering latent domains for multisource domain adaptation. In European conference on computer vision, Florence.

  • Hoffman, J., Rodner, E., Donahue, J., Saenko, K., & Darrell, T. (2013). Efficient learning of domain invariant image representations. In International conference on learning representations, Arizona.

  • Hoffman, J., Rodner, E., Donahue, J., Kulis, B., & Saenko, K. (2014). Asymmetric and category invariant feature transformations for domain adaptation. International Journal of Computer Vision, 109(1–2), 28–41.

  • Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., & Darrell, T. (2014). Caffe: Convolutional architecture for fast feature embedding. arXiv:1408.5093.

  • Jiang, J. (2008). A literature survey on domain adaptation of statistical classifiers. Technical report, School of Information Systems, Singapore Management University.

  • Kan, M., Wu, J., Shan, S., & Chen, X. (2014). Domain adaptation for face recognition: Targetize source domain bridged by common subspace. International Journal of Computer Vision, 109(1–2), 94–109.

  • Kulis, B., Saenko, K., & Darrell, T. (2011). What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In IEEE conference on computer vision and pattern recognition, Washington.

  • Lee, K., Ho, J., & Kriegman, D. (2005). Acquiring linear subspaces for face recognition under variable lighting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(5), 684–698.

  • Lu, B., Chellappa, R., & Nasrabadi, N. M. (2015). Incremental dictionary learning for unsupervised domain adaptation. In British machine vision conference, Swansea.

  • Mansour, Y., Mohri, M., & Rostamizadeh, A. (2008). Domain adaptation with multiple sources. In Advances in neural information processing systems, Vancouver.

  • Mirrashed, F., & Rastegar, M. (2013). Domain adaptive classification. In International conference on computer vision, Sydney.

  • Mosek. (2013). Optimization toolkit. http://www.mosek.com.

  • Nguyen, H., Ho, H. T., Patel, V., & Chellappa, R. (2015). Dash-n: Joint hierarchical domain adaptation and feature learning. IEEE Transactions on Image Processing, 24(12), 5479–5491.

  • Ni, J., Qiu, Q., & Chellappa, R. (2013). Subspace interpolation via dictionary learning for unsupervised domain adaptation. In IEEE conference on computer vision and pattern recognition, Oregon.

  • Pan, S., & Yang, Q. (2009). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1345–1359.

  • Park, D., Ramanan, D., & Fowlkes, C. (2010). Multiresolution models for object detection. In European conference on computer vision, Crete.

  • Pepikj, B., Stark, M., Gehler, P., & Schiele, B. (2015). Multi-view and 3d deformable part models. In IEEE transactions on pattern analysis and machine intelligence, New York.

  • Premebida, C., Carreira, J., Batista, J., & Nunes, U. (2014). Pedestrian detection combining RGB and dense LIDAR data. In IEEE international conference on intelligent robots and systems, Chicago.

  • Saenko, K., Kulis, B., Fritz, M., & Darrell, T. (2010). Adapting visual category models to new domains. In European conference on computer vision, Hersonissos, Heraklion, Crete.

  • Tang, K., Ramanathan, V., Fei-Fei, L., & Koller, D. (2012). Shifting weights: Adapting object detectors from image to video. In Advances in neural information processing systems, Lake Tahoe.

  • Teh, Y., Daumé III, H., & Roy, D. (2007). Bayesian agglomerative clustering with coalescents. In Advances in neural information processing systems, Vancouver.

  • Vázquez, D., López, A., & Ponsa, D. (2012). Unsupervised domain adaptation of virtual and real worlds for pedestrian detection. In International conference in pattern recognition, Tsukuba.

  • Vázquez, D., López, A., Marín, J., Ponsa, D., & Gerónimo, D. (2014). Virtual and real world adaptation for pedestrian detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(4), 797–809.

  • Xu, J., Ramos, S., Vázquez, D., & López, A. (2014a). Domain adaptation of deformable part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(12), 2367–2380.

  • Xu, J., Vázquez, D., López, A., Marín, J., & Ponsa, D. (2014b). Learning a part-based pedestrian detector in a virtual world. IEEE Transactions on Intelligent Transportation Systems, 15(5), 2121–2131.

  • Xu, H., Zheng, J., & Chellappa, R. (2015). Bridging the domain shift by domain adaptive dictionary learning. In British machine vision conference, Swansea.

  • Yang, J., Yan, R., & Hauptmann, A. (2007). Cross-domain video concept detection using adaptive SVMs. In ACM multimedia, Augsburg.

  • Yebes, J., Bergasa, L., & García, M. (2015). Visual object recognition with 3D-aware features in KITTI urban scenes. Sensors, 15(4), 9228–9250.

  • Zhu, L., Chen, Y., Yuille, A., & Freeman, W. (2010). Latent hierarchical structural learning for object detection. In IEEE conference on computer vision and pattern recognition, San Francisco.

Acknowledgments

This work is supported by the Spanish MEC Project TRA2014-57088-C2-1-R, the Spanish DGT Project SPIP2014-01352, the Generalitat de Catalunya Project 2014-SGR-1506, Jiaolong Xu’s Chinese Scholarship Council (CSC) Grant No.2011611023, and Sebastian Ramos’ FPI Grant BES-2012-058280. Finally, we also want to thank the NVIDIA Corporation for the generous support in the form of different GPU hardware units.

Author information

Corresponding author

Correspondence to Jiaolong Xu.

Additional information

Communicated by M. Hebert.

About this article

Cite this article

Xu, J., Ramos, S., Vázquez, D. et al. Hierarchical Adaptive Structural SVM for Domain Adaptation. Int J Comput Vis 119, 159–178 (2016). https://doi.org/10.1007/s11263-016-0885-6

  • DOI: https://doi.org/10.1007/s11263-016-0885-6
