This is a limited proof of concept to search for research data, not a production system.

Title: READ dataset Bozen

Type: Dataset

Citation: Sánchez, Joan Andreu, Romero, Verónica, Toselli, Alejandro H., Vidal, Enrique (2016): READ dataset Bozen. Zenodo. Dataset. https://zenodo.org/record/218236

Authors: Sánchez, Joan Andreu (Pattern Recognition and Human Language Technologies); Romero, Verónica (Pattern Recognition and Human Language Technologies); Toselli, Alejandro H. (Pattern Recognition and Human Language Technologies); Vidal, Enrique (Pattern Recognition and Human Language Technologies)

Summary

This dataset arises from the READ project (Horizon 2020).

The dataset consists of a subset of documents from the Ratsprotokolle collection, which is composed of minutes of the council meetings held from 1470 to 1805 (about 30,000 pages) and will be used in the READ project. The documents are written in Early Modern German. The number of writers is unknown. The handwriting in this collection is complex enough to challenge HTR software.

The training dataset is composed of 400 pages. Most pages consist of a single text block that poses many difficulties for line detection and extraction. The ground truth for this set is provided in PAGE format, annotated at line level.
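As an illustration of what line-level ground truth in PAGE format looks like to a consumer, the following minimal Python sketch reads the transcription of each TextLine element from a PAGE XML document. It assumes the usual PAGE layout (TextLine elements containing TextEquiv/Unicode). The sample document, its line ids, and its text are invented for this example and are not taken from the dataset; the PAGE schema version used by the dataset is not stated in this record, so the sketch matches elements by local name and ignores the namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_page_lines(xml_text):
    """Return (line id, transcription) pairs for every TextLine, in document order."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        text = ""
        # The transcription lives in TextLine/TextEquiv/Unicode.
        for child in elem:
            if local_name(child.tag) != "TextEquiv":
                continue
            for sub in child:
                if local_name(sub.tag) == "Unicode":
                    text = sub.text or ""
        lines.append((elem.get("id"), text))
    return lines


# Invented sample, NOT from the dataset: one text region with two lines.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="100" imageHeight="100">
    <TextRegion id="r1">
      <Coords points="0,0 100,0 100,100 0,100"/>
      <TextLine id="l1">
        <Coords points="0,0 100,0 100,40 0,40"/>
        <TextEquiv><Unicode>first example line</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="l2">
        <Coords points="0,50 100,50 100,90 0,90"/>
        <TextEquiv><Unicode>second example line</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

if __name__ == "__main__":
    for line_id, text in read_page_lines(SAMPLE):
        print(line_id, text)
```

Matching on local names keeps the sketch independent of the schema version; a stricter reader would bind the exact namespace declared in the files.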

More information

  • DOI: 10.5281/zenodo.218236

Subjects

  • ICFHR2016 Competition on Handwritten Text Recognition on the READ Dataset

Dates

  • Publication date: 2016
  • Issued: December 22, 2016


We do not yet have good examples for much of the data below this point. If you have good examples for any of the fields that appear below, please share them in the #rdi Slack channel. Thanks!

Format

electronic resource

Related items

  • Relationship: IsPartOf; URI: https://zenodo.org/communities/scriptnet
  • Relationship: IsPartOf; URI: https://zenodo.org/communities/zenodo