Abstract
This paper addresses the performance evaluation of symbol recognition and spotting systems. We propose a new approach to generating synthetic graphics documents that contain non-isolated symbols in a real context. The approach relies on a set of constraints that place the symbols on a pre-defined background according to the properties of a particular domain (architecture, electronics, engineering, etc.). In this way, a large number of images resembling real documents can be obtained simply by defining the set of constraints and providing a few pre-defined backgrounds. Because the documents are generated synthetically, the groundtruth (the location and label of every symbol) is available automatically. We have applied this approach to generate a large database of architectural drawings and electronic diagrams, which shows the flexibility of the system. Performance evaluation experiments with a symbol localization system show that our approach can generate documents with different features, and that these differences are reflected in variations in the localization results.
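The idea of constraint-based placement with automatic groundtruth can be illustrated with a minimal sketch. It is not the system described in the paper: the `Box` type, the `generate_document` function, the rectangular "free zones" and the no-overlap rule are all hypothetical simplifications chosen for illustration, and no image is actually rendered.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle (hypothetical helper, not from the paper)."""
    x: int
    y: int
    w: int
    h: int

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (self.x + self.w <= other.x or other.x + other.w <= self.x
                    or self.y + self.h <= other.y or other.y + other.h <= self.y)

    def contains(self, other: "Box") -> bool:
        return (self.x <= other.x and self.y <= other.y
                and other.x + other.w <= self.x + self.w
                and other.y + other.h <= self.y + self.h)


def generate_document(zones, symbol_sizes, n_symbols, rng, max_tries=1000):
    """Place up to n_symbols symbols inside the free zones of a background.

    zones:        list of Box, regions of the background where symbols may go
                  (an assumed, very simple form of positioning constraint)
    symbol_sizes: dict mapping a symbol label to its (width, height)
    Returns the groundtruth as a list of (label, Box) pairs.
    """
    groundtruth = []
    labels = sorted(symbol_sizes)
    tries = 0
    while len(groundtruth) < n_symbols and tries < max_tries:
        tries += 1
        label = rng.choice(labels)
        w, h = symbol_sizes[label]
        zone = rng.choice(zones)
        if w > zone.w or h > zone.h:
            continue  # symbol does not fit in this zone
        box = Box(rng.randint(zone.x, zone.x + zone.w - w),
                  rng.randint(zone.y, zone.y + zone.h - h), w, h)
        if any(box.overlaps(placed) for _, placed in groundtruth):
            continue  # constraint: symbols must not overlap each other
        # The groundtruth is recorded at the moment of placement.
        groundtruth.append((label, box))
    return groundtruth


rng = random.Random(0)
zones = [Box(0, 0, 200, 100), Box(300, 0, 100, 300)]
sizes = {"door": (30, 30), "sink": (20, 40)}
gt = generate_document(zones, sizes, 5, rng)
for label, box in gt:
    print(label, box)
```

Because every placement is recorded as it is made, the location and label of each symbol come for free, which is the property the abstract emphasizes. A real generator would also need domain-specific constraints (orientation, alignment with walls or wires, scaling) that this sketch leaves out.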
Cite this article
Delalandre, M., Valveny, E., Pridmore, T. et al. Generation of synthetic documents for performance evaluation of symbol recognition & spotting systems. IJDAR 13, 187–207 (2010). https://doi.org/10.1007/s10032-010-0120-x
DOI: https://doi.org/10.1007/s10032-010-0120-x