Abstract
In this paper, we address the problem of symbol spotting in technical document images, applied to scanned and vectorized line drawings. Like any information spotting architecture, our approach has two components. First, symbols are decomposed into primitives, which are compactly represented; second, a primitive indexing structure efficiently retrieves similar primitives. Primitives are encoded in terms of attributed strings representing closed regions. Similar strings are clustered in a lookup table so that the set median strings act as indexing keys. A voting scheme formulates hypotheses at those locations of the line drawing image where there is a high concentration of regions similar to the queried ones and, therefore, a high probability of finding the queried graphical symbol. The proposed approach is illustrated in a framework consisting of spotting furniture symbols in architectural drawings. It has been shown to work even in the presence of noise and distortion introduced by the scanning and raster-to-vector processes.
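The indexing and voting pipeline summarized above can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not taken from the paper: regions are plain character strings rather than attributed strings, similarity is ordinary Levenshtein distance, clustering is a greedy single pass, and votes are accumulated on a coarse grid. All function names, the threshold, and the cell size are hypothetical.

```python
from collections import Counter


def edit_distance(a, b):
    """Plain Levenshtein distance (stand-in for attributed string matching)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def set_median(strings):
    """Member of the set with the smallest summed distance to all members."""
    return min(strings, key=lambda s: sum(edit_distance(s, t) for t in strings))


def build_table(regions, threshold):
    """Cluster region strings greedily; index each cluster by its set median.

    regions: list of (region_string, (x, y)) pairs from the drawing.
    Returns a dict {median_string: [(x, y), ...]}.
    """
    clusters = []
    for s, loc in regions:
        for cluster in clusters:
            # Compare against the first member of the cluster (an assumption).
            if edit_distance(s, cluster[0][0]) <= threshold:
                cluster.append((s, loc))
                break
        else:
            clusters.append([(s, loc)])
    table = {}
    for cluster in clusters:
        key = set_median([s for s, _ in cluster])
        table.setdefault(key, []).extend(loc for _, loc in cluster)
    return table


def spot(table, query_strings, threshold, cell=100):
    """Vote for grid cells that contain regions similar to the query regions."""
    votes = Counter()
    for q in query_strings:
        for key, locations in table.items():
            if edit_distance(q, key) <= threshold:
                for x, y in locations:
                    votes[(x // cell, y // cell)] += 1
    return votes.most_common()


# Toy drawing: one symbol instance near (410..420, 30..60) plus unrelated regions.
regions = [("abab", (10, 20)), ("abcb", (410, 30)),
           ("dddddd", (420, 60)), ("eeffee", (800, 800))]
table = build_table(regions, threshold=1)
hypotheses = spot(table, ["abab", "dddddd"], threshold=1)
print(hypotheses)
```

In this toy run, the grid cell that contains regions similar to both query regions collects the most votes, which is the intuition behind the hypothesis locations described in the abstract.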
Acknowledgements
The authors would like to thank the anonymous reviewers for their helpful and constructive comments, as well as the architect Enric Farrerons for providing the floor-plan images and Silvia Sánchez for proofreading the manuscript. This work has been partially supported by the Spanish projects TIN 2006-15694-C02-02 and CONSOLIDER-INGENIO 2010 (CSD 2007-00018).
Cite this article
Rusiñol, M., Lladós, J. & Sánchez, G. Symbol spotting in vectorized technical drawings through a lookup table of region strings. Pattern Anal Applic 13, 321–331 (2010). https://doi.org/10.1007/s10044-009-0161-2
DOI: https://doi.org/10.1007/s10044-009-0161-2