Conference Information

=============================================
Document Recognition and Retrieval XII (EI-117)
Call for Papers and Announcement
=============================================

Please email questions to

This conference is part of the IS&T/SPIE's International Symposium on Electronic Imaging 2005, 17-20 January 2005
San Jose Marriott and San Jose Convention Center, San Jose, CA, USA

Conference Chairs: Elisa H. Barney Smith, Boise State Univ.; Kazem Taghva, Univ. of Nevada/Las Vegas

Program Committee: James Allan, Univ. of Massachusetts/Amherst; Tim Andersen, Boise State Univ.; Apostolos Antonacopoulos, Univ. of Liverpool (United Kingdom); Francine R. Chen, Palo Alto Research Ctr.; Xiaoqing Ding, Tsinghua Univ. (China); David S. Doermann, Univ. of Maryland/College Park; Hiromichi Fujisawa, Hitachi, Ltd. (Japan); Jianying Hu, IBM Thomas J. Watson Research Ctr.; Matthew F. Hurst, Intelliseek, Inc.; Tapas Kanungo, IBM Almaden Research Ctr.; Xiaofan Lin, Hewlett-Packard Labs.; Daniel P. Lopresti, Lehigh Univ.; Thomas A. Nartker, Univ. of Nevada/Las Vegas; Sargur N. Srihari, Univ. at Buffalo; George R. Thoma, National Library of Medicine; Marcel Worring, Univ. van Amsterdam (Netherlands); Berrin A. Yanikoglu, Sabanci Univ. (Turkey)

The fields of document recognition and retrieval have grown rapidly in recent years. This development has been fueled by rising accuracy rates for omnifont and handprint optical character recognition (OCR), decreasing costs for the computational power needed to run such sophisticated algorithms, and the emergence of new application areas such as the World Wide Web (WWW), digital libraries, and video- and camera-based OCR. The use of OCR is spreading from high-volume, niche domains to more general tasks, including the processing of noisy "real-world" documents, photocopies, and faxes. Beyond OCR, document recognition includes the recovery of a document's logical structure and format.
This encompasses decomposing a document into its fundamental components (sentences, paragraphs, figures, tables, etc.), tagging these units, and then determining a higher-level structure for the document as a whole. Advanced machine learning techniques may allow the full recovery of the structure of tables and equations, and thus an understanding of their content, or the conversion of line drawings from raster to vector format, where the resulting graphical objects are endowed with semantic meaning. Syntactic representation of logical structure (e.g., using grammars) and syntax-directed recognition form another important area in which research contributions are solicited.

One primary reason for digitizing existing paper materials is, of course, to simplify retrieval and organization of information. We are therefore particularly interested in papers that address any of the following issues:

(1) retrieval in the face of corrupted readings of the terms in a document;
(2) retrieval based on sketches, images, tables, diagrams, or other non-linguistic objects that appear in the document;
(3) retrieval based on text appearing with non-standard alignment, in images or graphics;
(4) recognition and tagging of mathematical arrays and equations which serve as indicators of subject content or methodology used in the document;
(5) novel methods for retrieval and organization of information based on text or other information in a document.

Papers addressing retrieval-specific issues are encouraged to use a standard methodology from either statistics (such as the ROC representation) or IR (such as precision versus recall) to assess the effectiveness of proposed techniques against the endpoint goal of correct recognition and retrieval of the entire document, or a section thereof.
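
For authors unfamiliar with the IR measures mentioned above, the following is a minimal sketch of how precision and recall might be computed for a single query. It is an illustration only, not a required evaluation protocol: the function names, the document IDs, and the choice to treat results as sets are all assumptions made for this example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document IDs returned by the system (hypothetical IDs)
    relevant:  document IDs judged relevant in the ground truth
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Precision: fraction of retrieved documents that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def precision_recall_curve(ranked, relevant):
    """Return (precision, recall) at each rank cutoff k = 1..len(ranked)."""
    return [precision_recall(ranked[:k], relevant)
            for k in range(1, len(ranked) + 1)]


if __name__ == "__main__":
    ranked = ["d3", "d1", "d7", "d2"]   # system output, best match first
    relevant = {"d1", "d2", "d5"}       # ground-truth relevant documents
    for k, (p, r) in enumerate(precision_recall_curve(ranked, relevant), 1):
        print(f"k={k}: precision={p:.2f} recall={r:.2f}")
```

Plotting precision against recall over the rank cutoffs gives the "precision versus recall" view; an ROC representation would instead plot the true-positive rate against the false-positive rate, which additionally requires knowing the size of the non-relevant collection.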

Papers are solicited in the following areas:

Recognition
* algorithms and systems for machine-printed and handwritten character and word recognition, especially for degraded documents (e.g., faxes or old/historical documents)
* large scale conversion of historical document collections
* quality assurance methods and systems in DRR
* character and word segmentation techniques
* identification and analysis of tables or equations
* page segmentation, including hierarchical decomposition of documents into text regions, colored/textured background, halftones, line-art, etc.
* logical structure analysis, linguistic representation of structure and syntax-directed recognition of logical structure
* raster-to-vector conversion of line-art, maps, and technical drawings
* filtering and enhancement techniques for document images
* document image compression
* document degradation models
* video- and camera-based OCR
* applications of document recognition to the WWW and digital libraries
* techniques to support spoken language access to document text (audio browsing of document databases)
* multilingual character recognition
* other topics relating to document analysis and character recognition.

Retrieval
* impact of recognition accuracy on retrieval effectiveness
* recovery and use of logical structure for retrieval
* information extraction from forms
* relevance feedback techniques for document retrieval
* cross-language and multi-lingual retrieval
* categorization of text documents and imaged documents
* summarization of text documents and imaged documents
* keyword spotting in document images
* approximate string matching algorithms for OCR text
* non-textual retrieval methods
* image and multimedia search
* interfaces for retrieval
* benchmarking and evaluation issues
* other topics relating to the retrieval of documents and document images.

Note: submissions to Document Recognition and Retrieval XII should be abbreviated papers (5-7 pages).
The paper should be informative and address the following questions:

i) What is the paper about?
ii) What is the original contribution?
iii) What is the most closely related work by others and how does this work differ?
iv) How can others make use of this work?
v) What are the main experimental/theoretical results?

Full papers (10-12 pages) will be needed for the final proceedings.

For more information and submission instructions, please see:
http://www.electronicimaging.org
or
http://electronicimaging.org/call/05/conferences/index.cfm?fuseaction=EI117

Abbreviated papers (5-7 pages) Due Date: 5 July 2004
Manuscript (10-12 pages) Due Date: 25 October 2004
Final Summary (200 word abstract for program book) Due Date: 15 November 2004

Proceedings of this conference will be published and available at the meeting. The Abstract and Manuscript due dates must be strictly observed. Submissions imply the intent of at least one author to register, attend the symposium, and present the paper (either orally or in poster format).