
Generation of synthetic documents for performance evaluation of symbol recognition & spotting systems

  • Original Paper
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

This paper addresses the performance evaluation of symbol recognition & spotting systems. We propose a new approach to generating synthetic graphics documents that contain non-isolated symbols in a real context. The approach is based on a set of constraints that let us place symbols on a pre-defined background according to the properties of a particular domain (architecture, electronics, engineering, etc.). In this way, we can obtain a large number of images resembling real documents simply by defining the set of constraints and providing a few pre-defined backgrounds. Because the documents are synthetically generated, the groundtruth (the location and label of every symbol) is automatically available. We have applied this approach to generate a large database of architectural drawings and electronic diagrams, which shows the flexibility of the system. Performance evaluation experiments with a symbol localization system show that our approach can generate documents with different features, and that these differences are reflected in the variation of the localization results.
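The idea of constraint-driven placement with groundtruth as a by-product can be illustrated with a small sketch. This is not the authors' system: the abstract does not specify the constraint model, so the sketch assumes the simplest possible one, in which each symbol label has a list of allowed rectangular zones on the background and placed symbols must not overlap. The names `Box`, `generate_layout`, `zones` and `symbols`, and the example labels and sizes, are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle: top-left corner (x, y), width w, height h."""
    x: int
    y: int
    w: int
    h: int

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (
            self.x + self.w <= other.x
            or other.x + other.w <= self.x
            or self.y + self.h <= other.y
            or other.y + other.h <= self.y
        )

    def contains(self, other: "Box") -> bool:
        return (
            self.x <= other.x
            and self.y <= other.y
            and other.x + other.w <= self.x + self.w
            and other.y + other.h <= self.y + self.h
        )


def generate_layout(zones, symbols, n_symbols, rng, max_tries=1000):
    """Place up to n_symbols symbols by rejection sampling.

    zones:   label -> list of Box where that symbol may appear (assumed constraint)
    symbols: label -> (width, height) of the symbol model
    Returns the groundtruth as a list of (label, Box) pairs.
    """
    groundtruth = []
    labels = sorted(symbols)
    for _ in range(max_tries):
        if len(groundtruth) >= n_symbols:
            break
        label = rng.choice(labels)
        w, h = symbols[label]
        zone = rng.choice(zones[label])
        if zone.w < w or zone.h < h:
            continue  # symbol cannot fit in this zone
        # randint is inclusive at both ends, so the box stays inside the zone.
        x = rng.randint(zone.x, zone.x + zone.w - w)
        y = rng.randint(zone.y, zone.y + zone.h - h)
        candidate = Box(x, y, w, h)
        if any(candidate.overlaps(box) for _, box in groundtruth):
            continue  # violates the no-overlap constraint, try again
        groundtruth.append((label, candidate))
    return groundtruth


# Hypothetical example: doors near the top wall, sofas inside a room area.
zones = {"door": [Box(0, 0, 1000, 40)], "sofa": [Box(100, 100, 600, 500)]}
symbols = {"door": (30, 30), "sofa": (120, 60)}
layout = generate_layout(zones, symbols, n_symbols=8, rng=random.Random(0))
for label, box in layout:
    print(label, box)
```

Because every placement is recorded at the moment it is accepted, the list returned by `generate_layout` already is the groundtruth (label and location of each symbol); no manual annotation step is needed. Changing the zones or symbol sizes yields a different family of documents from the same background.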

Author information

Corresponding author

Correspondence to Mathieu Delalandre.

About this article

Cite this article

Delalandre, M., Valveny, E., Pridmore, T. et al. Generation of synthetic documents for performance evaluation of symbol recognition & spotting systems. IJDAR 13, 187–207 (2010). https://doi.org/10.1007/s10032-010-0120-x

  • DOI: https://doi.org/10.1007/s10032-010-0120-x
