Human Head Pose Estimation on SASE Database Using Random Hough Regression Forests

  • Conference paper
Video Analytics. Face and Facial Expression Recognition and Audience Measurement (VAAM 2016, FFER 2016)

Abstract

In recent years, head pose estimation has become an important task in face analysis scenarios. Given the availability of high-resolution 3D sensors, a high-resolution head pose database would benefit the community. In this paper, Random Hough Forests are used to estimate 3D head pose and location on a new 3D head database, SASE; the results serve as the baseline performance on the new data for an upcoming international head pose estimation competition. The data in SASE were acquired with a Microsoft Kinect 2 camera and include RGB and depth information for 50 subjects with a large sample of head poses, allowing methods to be tested for real-life scenarios. We briefly review the database and present baseline head pose estimation results based on Random Hough Forests.

The original version of this chapter was revised: The spelling of the second author’s name was corrected. The erratum to this chapter is available at DOI: 10.1007/978-3-319-56687-0_14

Acknowledgement

This work has been partially supported by the Estonian Research Grant (PUT638), Spanish projects TIN2013-43478-P and TIN2016-74946-P, the European Commission Horizon 2020 granted project SEE.4C under call H2020-ICT-2015, and the Estonian Centre of Excellence in IT (EXCITE) funded by the European Regional Development Fund.

Author information

Corresponding author

Correspondence to Iiris Lüsi.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Lüsi, I., Escalera, S., Anbarjafari, G. (2017). Human Head Pose Estimation on SASE Database Using Random Hough Regression Forests. In: Nasrollahi, K., et al. Video Analytics. Face and Facial Expression Recognition and Audience Measurement. VAAM 2016, FFER 2016. Lecture Notes in Computer Science, vol 10165. Springer, Cham. https://doi.org/10.1007/978-3-319-56687-0_12

  • DOI: https://doi.org/10.1007/978-3-319-56687-0_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56686-3

  • Online ISBN: 978-3-319-56687-0

  • eBook Packages: Computer Science, Computer Science (R0)
