ABSTRACT
With the rapid growth in recent years of both the number of wearable-camera users and the amount of data they produce, there is a strong need for automatic retrieval and summarization techniques. This work addresses the problem of automatically summarizing egocentric photo streams captured with a wearable camera by taking an image retrieval perspective. After non-informative images are removed with a new CNN-based filter, the remaining images are ranked by relevance to ensure semantic diversity and finally re-ranked by a novelty criterion to reduce redundancy. To assess the results, a new evaluation metric is proposed that takes into account the non-uniqueness of the solution. Experiments on a dataset of 7,110 images from 6 different subjects, evaluated by experts, yielded 95.74% expert satisfaction and a Mean Opinion Score of 4.57 out of 5.0.
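The three stages named in the abstract (filtering, relevance ranking, novelty re-ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring functions, so the sketch assumes each image already carries a hypothetical `informative` score (standing in for the CNN-based filter), a `relevance` score, and a `feature` vector, and it uses a greedy trade-off between relevance and cosine-similarity redundancy as a stand-in novelty criterion. The names `summarize`, `informative_threshold`, and `trade_off` are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def summarize(images, k, informative_threshold=0.5, trade_off=0.5):
    """Return the ids of up to k images chosen by filter -> rank -> re-rank.

    Each image is a dict with the assumed keys "id", "informative",
    "relevance", and "feature"; none of these names come from the paper.
    """
    # Stage 1: drop images the (assumed) informativeness score rejects.
    kept = [im for im in images if im["informative"] >= informative_threshold]

    # Stage 2: rank the remaining images by relevance, highest first.
    ranked = sorted(kept, key=lambda im: im["relevance"], reverse=True)

    # Stage 3: greedy re-ranking that penalizes similarity to already
    # selected images, so near-duplicates are pushed down the list.
    selected = []
    while ranked and len(selected) < k:
        def score(im):
            redundancy = max(
                (cosine(im["feature"], s["feature"]) for s in selected),
                default=0.0,
            )
            return trade_off * im["relevance"] - (1 - trade_off) * redundancy

        best = max(ranked, key=score)
        selected.append(best)
        ranked.remove(best)
    return [im["id"] for im in selected]


# Toy data: "b" is a near-duplicate of "a"; "d" is non-informative.
photos = [
    {"id": "a", "informative": 0.9, "relevance": 0.9, "feature": [1.0, 0.0]},
    {"id": "b", "informative": 0.9, "relevance": 0.8, "feature": [1.0, 0.05]},
    {"id": "c", "informative": 0.9, "relevance": 0.6, "feature": [0.0, 1.0]},
    {"id": "d", "informative": 0.1, "relevance": 1.0, "feature": [1.0, 1.0]},
]

print(summarize(photos, k=2))
```

In this toy run the near-duplicate "b" loses its second place to the less relevant but more novel "c", and "d" never enters the ranking despite its high relevance score, which is the qualitative behavior the abstract describes.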
Index Terms
- Semantic Summarization of Egocentric Photo Stream Events