Text in Everything Workshop (TiE)

Tel Aviv, Israel, October 2022



in conjunction with ECCV 2022



Understanding written communication through vision is a key aspect of
human civilization and should also be an important capacity of
intelligent agents aspiring to function in man-made
environments. Interpreting written information in our environment is
essential in order to perform most everyday tasks like making a
purchase, using public transportation, finding a place in the city,
getting an appointment, or checking whether a store is open or not, to
mention just a few. As such, the analysis of written communication in
images and videos has recently attracted increased interest and seen
significant progress across a variety of text-based vision tasks. While
in earlier years the main focus of this discipline was on OCR and the
ability to read business documents, today the field encompasses
applications that go beyond text recognition and require reasoning
over multiple modalities, such as the structure and layout of
documents.


Recent advances in this field have been a result of a
multi-disciplinary perspective spanning not only computer vision, but
also natural language processing, document and layout understanding,
knowledge representation and reasoning, data mining, information
retrieval, and more. The goal of this workshop is to raise awareness
about the aforementioned topics in the broader computer vision
community, and to bring together vision, NLP, and other researchers to
drive a new wave of progress by cross-pollinating ideas between
text/document understanding and fields beyond vision.


The workshop will be a hybrid, full-day event comprising invited
talks, oral and poster presentations of submitted papers, and a special
challenge on Out-of-Vocabulary scene text understanding.


Keynote speakers

    Xiang Bai (Huazhong University of Science and Technology)
    Tal Hassner (Meta AI)
    Aishwarya Agrawal (University of Montreal, DeepMind)
    Sharon Fogel (AWS AI Labs)


Topics of Interest

The workshop welcomes original work on any text-dependent computer
vision application, such as:

    Scene text understanding
    Scene text VQA
    Image-text aware cross-modal retrieval
    Image-text for fine-grained classification
    Text in video
    Document VQA
    Document layout prediction
    Table detection
    Information extraction


Challenge on Out-of-Vocabulary Scene Text Understanding

A challenge on Out-of-Vocabulary Scene Text Understanding (OOV-ST)
will be organised in the context of this workshop. The OOV-ST
challenge aims to evaluate the ability of text extraction models to
deal with out-of-vocabulary (OOV) words, i.e. words that have never been
encountered in the training sets of the most common scene text
understanding datasets to date. The challenge is organised jointly by
Amazon Research, Google Research, Meta AI, and the Computer Vision


To participate in the OOV-ST Challenge, please join through the RRC Portal.



Important dates

Paper Submission Deadline:    August 1, 2022
Notification to Authors:      August 15, 2022
Workshop Camera Ready Due:    August 22, 2022
Workshop Date:                October 2022



Organizers

    Ron Litman (AWS AI Labs)
    Aviad Aberdam (AWS AI Labs)
    Shai Mazor (AWS AI Labs)
    Hadar Averbuch-Elor (Cornell University)
    Dimosthenis Karatzas (Computer Vision Center / Autonomous University of Barcelona)
    R. Manmatha (AWS AI Labs)