ICDAR2017 Competition on Reading Chinese Text in the Wild


Our competition is based on a dataset of more than 12,000 images. Most of the images are collected in the wild by phone cameras. Some are screenshots. The images exhibit various kinds of scenes, including street views, posters, menus, indoor scenes, and screenshots of phone apps.


Training images and annotations_v1.2 (7.6G): Download from BaiduYun (Access Code: 5mzs) / Download from hust.edu.cn

Testing images (3.8G): Download from Google Drive / Download from hust.edu.cn


Every image in the dataset is annotated with text line positions and text transcripts. Positions are represented by quadrilaterals, whose vertices are listed in clockwise order, starting from the top-left vertex.
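The vertex ordering can be sanity-checked with a signed-area (shoelace) test. The sketch below is only an illustration and is not part of the competition toolkit: the function names `signed_area` and `is_clockwise` are hypothetical, and it assumes the usual image coordinate system with the origin at the top-left and the y axis pointing down, in which a visually clockwise polygon has a positive signed area.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def signed_area(quad: Sequence[Point]) -> float:
    """Shoelace formula: signed area of a polygon given as (x, y) vertices."""
    total = 0.0
    n = len(quad)
    for i in range(n):
        x0, y0 = quad[i]
        x1, y1 = quad[(i + 1) % n]  # wrap around to close the polygon
        total += x0 * y1 - x1 * y0
    return total / 2.0


def is_clockwise(quad: Sequence[Point]) -> bool:
    """True if the vertices run clockwise as seen on screen.

    Assumption: image coordinates with y growing downward, so a visually
    clockwise polygon has a positive signed area.
    """
    return signed_area(quad) > 0
```

For example, a rectangle listed as top-left, top-right, bottom-right, bottom-left passes the check, while the same vertices in reverse order do not.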

Each dataset image is associated with a groundtruth text file with the same file name.


Each line in the groundtruth file represents the annotation for one text instance (a word or sentence). The file format is:

<x1>,<y1>,<x2>,<y2>,<x3>,<y3>,<x4>,<y4>,<difficult>,"<transcript>"
<x1>,<y1>,<x2>,<y2>,<x3>,<y3>,<x4>,<y4>,<difficult>,"<transcript>"
...
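A groundtruth file in this format could be read roughly as follows. This is a minimal sketch under several assumptions not stated on this page: coordinates are integers, the `<difficult>` field is `0` or `1`, and the transcript is the last field, wrapped in double quotes. The names `TextInstance`, `parse_gt_line`, and `parse_gt_text` are hypothetical. The line is split on the first nine commas only, so commas inside a transcript are preserved.

```python
from typing import List, NamedTuple, Tuple


class TextInstance(NamedTuple):
    quad: List[Tuple[int, int]]  # four (x, y) vertices, clockwise from top-left
    difficult: bool              # assumed: "1" means difficult, anything else not
    transcript: str


def parse_gt_line(line: str) -> TextInstance:
    """Parse one annotation line into a TextInstance."""
    # Drop a possible byte-order mark and surrounding whitespace, then split
    # on the first 9 commas only so the transcript keeps its own commas.
    fields = line.lstrip("\ufeff").strip().split(",", 9)
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}: {line!r}")
    coords = [int(v) for v in fields[:8]]  # assumed integer pixel coordinates
    quad = list(zip(coords[0::2], coords[1::2]))
    difficult = fields[8].strip() == "1"
    transcript = fields[9].strip()
    # Remove one pair of enclosing double quotes, if present.
    if len(transcript) >= 2 and transcript[0] == '"' and transcript[-1] == '"':
        transcript = transcript[1:-1]
    return TextInstance(quad, difficult, transcript)


def parse_gt_text(text: str) -> List[TextInstance]:
    """Parse the full contents of a groundtruth file, skipping blank lines."""
    return [parse_gt_line(ln) for ln in text.splitlines() if ln.strip()]
```

A file's contents would be passed to `parse_gt_text` after reading it with an encoding that matches the dataset. The encoding is not specified here, so it is worth checking before relying on the transcripts.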

Important Dates

  • January 20 - April 30, 2017: Registration open
  • April 15, 2017: Test dataset available
  • April 15, 2017: Submission open
  • April 30, 2017: Submission deadline