* * * FIRST CALL FOR PARTICIPATION * * *
International Workshop on
Performance Evaluation Issues in Multilingual OCR
Sunday, September 19, 1999 (just before ICDAR'99)
Bangalore, India
WORKSHOP CHAIRS
Tapas KANUNGO University of Maryland, College Park, MD USA
Henry S. BAIRD Xerox PARC, Palo Alto, CA USA
ORGANIZING COMMITTEE
Badr AL-BADR King Abdulaziz City, Saudi Arabia
Torsten CAESAR Siemens ElectroCom, Germany
Bhabatosh CHANDA ISI Calcutta, India
Doug COOPER Southeast Asian Software Research Center, Thailand
Andreas DENGEL DFKI, Germany
Steve DENNIS U. S. Government, USA
Xiaoqing DING Tsinghua University, P.R. China
David DOERMANN University of Maryland, USA
Michel GILLOUX Service de Recherche Technique de la Poste, France
Robert M. HARALICK University of Washington, USA
Tin Kam HO Bell Laboratories, Lucent Technologies, USA
Donna HARMAN National Institute of Standards & Technology, USA
Jonathan HULL Ricoh CRC, USA
Fumitaka KIMURA Mie University, Japan
Hsi-Jian LEE National Chiao Tung University, R.O. China
Seong-Whan LEE Korea University, Korea
Tomohiko MORIOKA Japan Advanced Institute for Science & Tech., Japan
S. P. MUDUR National Center for Software Technology, India
Yasuaki NAKANO Shinshu University, Japan
Kris POPAT Xerox PARC, USA
Philip RESNIK University of Maryland, USA
A. Lawrence SPITZ Document Recognition Technologies, USA
Rohini SRIHARI CEDAR, SUNY Buffalo, USA
Ching Y. SUEN Concordia University, Montreal, Canada
Yuan Yan TANG Hong Kong Baptist University, China
Vadim TERESCHENKO ABBYY Software House, Russia
Jun TSUKUMO NEC, Kanagawa, Japan
Toru WAKAHARA NTT Human Interface Laboratories, Japan
TECHNICAL FOCUS
This workshop will explore evaluation methodologies for multilingual OCR
systems. By `multilingual' we mean to include systems that are capable
of reading more than one language in the same document, as well as
one-language-per-document systems that can be easily retargeted to new
languages. We hope to bring together researchers from many countries to
discuss these and related questions:
-- What methodologies should be used to evaluate multilingual OCR systems?
How do we compare accuracies across languages?
-- What ground-truthed data sets are now available in various languages?
What kind of datasets need to be collected? How is this to be
achieved? Which organizations might be willing to support such an
effort?
-- What multilingual OCR evaluation tools and error visualization tools
are available or should be developed?
-- What OCR evaluation methods and metrics will be useful for OCR-based
machine translation and cross-language information retrieval?
-- What are the most pressing open research problems, promising dissertation
topics, etc.?
WORKSHOP FORMAT
This will be a one-day workshop for a maximum of 70 participants.
Each participant will submit an extended abstract which will be distributed
at the Workshop. All participants are expected to contribute to the
discussions.
At the outset of the workshop, three volunteers will present brief, informal
summaries of the i) methodologies, ii) corpora, and iii) tools mentioned in
the submitted abstracts. Then we will split up into three working groups,
focused on these topics, and proceed to discuss key issues, attempt to resolve
questions, compile lists of resources, and draw up recommendations. Finally,
in a plenary session, representatives of each group will present their
recommendations and invite general discussion. There will be several
opportunities for informal discussion and socializing.
After the workshop, the organizing committee will compile a Workshop Summary,
based on the working group notes, and make it available on the Web. It is
hoped that the workshop will stimulate cooperative follow-on activities
that will accelerate the pace of research in multilingual document image
analysis.
EXTENDED ABSTRACT SUBMISSION
Each potential participant or group of participants should submit an extended
abstract electronically via E-mail (in plain ASCII), no later than March 30,
1999, to:
Tapas Kanungo
Center for Automation Research
University of Maryland
College Park, MD 20742
E-mail: mlocr@cfar.umd.edu
The abstract should include the name, address, telephone, fax, and E-mail
address of the author(s). It should ordinarily be limited to six printed
pages including references (no figures, please). Longer submissions
may be admitted in special cases, e.g. for catalogues of resources.
Accepted abstracts will be distributed at the Workshop and posted
on the Workshop website.
WORKSHOP WEBSITE
http://www.cfar.umd.edu/~kanungo/workshop/mlocr.html