Abstract
In recent years, head pose estimation has become an important task in face analysis. Given the availability of high-resolution 3D sensors, a high-resolution head pose database would benefit the community. In this paper, Random Hough Forests are used to estimate 3D head pose and location on a new 3D head database, SASE; the results serve as the baseline performance on the new data for an upcoming international head pose estimation competition. The data in SASE were acquired with a Microsoft Kinect 2 camera and include RGB and depth information for 50 subjects across a large sample of head poses, which allows methods to be tested in real-life scenarios. We briefly review the database and present baseline head pose estimation results based on Random Hough Forests.
The original version of this chapter was revised: The spelling of the second author’s name was corrected. The erratum to this chapter is available at DOI: 10.1007/978-3-319-56687-0_14
Acknowledgement
This work has been partially supported by the Estonian Research Grant (PUT638), Spanish projects TIN2013-43478-P and TIN2016-74946-P, the European Commission Horizon 2020 granted project SEE.4C under call H2020-ICT-2015, and the Estonian Centre of Excellence in IT (EXCITE) funded by the European Regional Development Fund.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Lüsi, I., Escalera, S., Anbarjafari, G. (2017). Human Head Pose Estimation on SASE Database Using Random Hough Regression Forests. In: Nasrollahi, K., et al. Video Analytics. Face and Facial Expression Recognition and Audience Measurement. VAAM 2016, FFER 2016. Lecture Notes in Computer Science, vol 10165. Springer, Cham. https://doi.org/10.1007/978-3-319-56687-0_12
DOI: https://doi.org/10.1007/978-3-319-56687-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56686-3
Online ISBN: 978-3-319-56687-0
eBook Packages: Computer Science, Computer Science (R0)