Abstract
In this paper we propose a segmentation-free approach to word spotting. Word images are first encoded into feature vectors using Fisher Vector. Then, these feature vectors are used together with pyramidal histogram of characters labels (PHOC) to learn SVM-based attribute models. Documents are represented by these PHOC based word attributes. To efficiently compute the word attributes over a sliding window, we propose to use an integral image representation of the document using a simplified version of the attribute model. Finally we re-rank the top word candidates using the more discriminative full version of the word attributes. We show state-of-the-art results for segmentation-free query-by-example word spotting in single-writer and multi-writer standard datasets.
Keywords
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Marti, U.V., Bunke, H.: Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition systems. IJPRAI 15, 65–90 (2001)
Vinciarelli, A., Bengio, S., Bunke, H.: Offline recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Trans. PAMI 26, 709–720 (2004)
Rodríguez-Serrano, J., Perronnin, F.: Local gradient histogram features for word spotting in unconstrained hand-written documents. In: International Conference on Frontiers in Handwriting Recognition (2008)
Frinken, V., Fischer, A., Manmatha, R., Bunke, H.: A novel word spotting method based on recurrent neural networks. IEEE Trans. PAMI 34, 211–224 (2012)
Leydier, Y., Ouji, A., Lebourgeois, F., Emptoz, H.: Towards an omnilingual word retrieval system for ancient manuscripts. Pattern Recogn. 42, 2089–2105 (2009)
Zhang, X., Tan, C.L.: Segmentation-free keyword spotting for handwritten documents based on heat kernel signature. In: International Conference on Document Analysis and Recognition, pp. 827–831 (2013)
Rusiñol, M., Aldavert, D., Toledo, R., Lladós, J.: Browsing heterogeneous document collections by a segmentation-free word spotting method. In: International Conference on Document Analysis and Recognition, pp. 63–67 (2011)
Rothacker, L., Rusiñol, M., Fink, G.A.: Bag-of-features HMMs for segmentation-free word spotting in handwritten documents. In: International Conference on Document Analysis and Recognition, pp. 1305–1309 (2013)
Csurka, G., Dance, C.R., Fan, L., Willamowski, J., Bray, C.: Visual categorization with bags of keypoints. In: Workshop on Statistical Learning in Computer Vision, European Conference on Computer Vision, pp. 1–22 (2004)
Almazán, J., Gordo, A., Fornés, A., Valveny, E.: Segmentation-free word spotting with exemplar SVMs. Pattern Recogn. 47, 3967–3978 (2014)
Almazán, J., Gordo, A., Fornés, A., Valveny, E.: Word spotting and recognition with embedded attributes. IEEE Trans. PAMI 36, 2552–2566 (2014)
Rath, T., Manmatha, R.: Word spotting for historical documents. IJDAR 9, 139–152 (2007)
Marti, U.V., Bunke, H.: The IAM-database: an english sentence database for off-line handwriting recognition. IJDAR 5, 39–46 (2002)
Leslie, C., Eskin, E., Noble, W.: The spectrum kernel: a string kernel for SVM protein classification. In: Pacific Symposium on Biocomputing (2002)
Lodhi, H., Saunders, C., Shawe-Taylor, J., Cristianini, N., Watkins, C.: Text classification using string kernels. JMLR 2, 419–444 (2002)
Chum, O., Philbin, J., Sivic, J., Isard, M., Zisserman, A.: Total recall: automatic query expansion with a generative feature model for object retrieval. In: International Conference on Computer Vision, pp. 1–8 (2007)
Arandjelović, R., Zisserman, A.: Three things everyone should know to improve object retrieval. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2911–2918 (2012)
Perronnin, F., Sánchez, J., Mensink, T.: Improving the Fisher kernel for large-scale image classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 143–156. Springer, Heidelberg (2010)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition (2005)
Kovalchuk, A., Wolf, L., Dershowitz, N.: A simple and fast word spotting method. In: International Conference on Frontiers in Handwriting Recognition (2014)
Fischer, A., Keller, A., Frinken, V., Bunke, H.: HMM-based word spotting in handwritten documents using subword models. In: International Conference on Pattern Recognition (2010)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ghosh, S.K., Valveny, E. (2015). A Sliding Window Framework for Word Spotting Based on Word Attributes. In: Paredes, R., Cardoso, J., Pardo, X. (eds) Pattern Recognition and Image Analysis. IbPRIA 2015. Lecture Notes in Computer Science(), vol 9117. Springer, Cham. https://doi.org/10.1007/978-3-319-19390-8_73
Download citation
DOI: https://doi.org/10.1007/978-3-319-19390-8_73
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19389-2
Online ISBN: 978-3-319-19390-8
eBook Packages: Computer ScienceComputer Science (R0)