Handwritten Text Recognition for Ancient Documents

Alfons Juan, Verónica Romero, Joan Andreu Sánchez, Nicolás Serrano, Alejandro H. Toselli, Enrique Vidal
Proceedings of the First Workshop on Applications of Pattern Analysis, PMLR 11:58-65, 2010.

Abstract

Huge amounts of legacy documents are being published by on-line digital libraries worldwide. However, for these raw digital images to be really useful, they need to be transcribed into a textual electronic format that would allow unrestricted indexing, browsing and querying. In some cases, adequate transcriptions of the handwritten text images are already available. In this work three systems are presented to deal with this sort of document. The first two address two different approaches for semi-automatic transcription of document images. The third system implements an alignment method to find mappings between word images of a handwritten document and their respective words in its given transcription.

Cite this Paper


BibTeX
@InProceedings{pmlr-v11-juan10a,
  title     = {Handwritten Text Recognition for Ancient Documents},
  author    = {Juan, Alfons and Romero, Verónica and Sánchez, Joan Andreu and Serrano, Nicolás and Toselli, Alejandro H. and Vidal, Enrique},
  booktitle = {Proceedings of the First Workshop on Applications of Pattern Analysis},
  pages     = {58--65},
  year      = {2010},
  editor    = {Diethe, Tom and Cristianini, Nello and Shawe-Taylor, John},
  volume    = {11},
  series    = {Proceedings of Machine Learning Research},
  address   = {Cumberland Lodge, Windsor, UK},
  month     = {01--03 Sep},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v11/juan10a/juan10a.pdf},
  url       = {https://proceedings.mlr.press/v11/juan10a.html},
  abstract  = {Huge amounts of legacy documents are being published by on-line digital libraries world wide. However, for these raw digital images to be really useful, they need to be transcribed into a textual electronic format that would allow unrestricted indexing, browsing and querying. In some cases, adequate transcriptions of the handwritten text images are already available. In this work three systems are presented to deal with this sort of documents. The first two address two different approaches for semi-automatic transcription of document images. The third system implements an alignment method to find mappings between word images of a handwritten document and their respective words in its given transcription.}
}
Endnote
%0 Conference Paper
%T Handwritten Text Recognition for Ancient Documents
%A Alfons Juan
%A Verónica Romero
%A Joan Andreu Sánchez
%A Nicolás Serrano
%A Alejandro H. Toselli
%A Enrique Vidal
%B Proceedings of the First Workshop on Applications of Pattern Analysis
%C Proceedings of Machine Learning Research
%D 2010
%E Tom Diethe
%E Nello Cristianini
%E John Shawe-Taylor
%F pmlr-v11-juan10a
%I PMLR
%P 58--65
%U https://proceedings.mlr.press/v11/juan10a.html
%V 11
%X Huge amounts of legacy documents are being published by on-line digital libraries world wide. However, for these raw digital images to be really useful, they need to be transcribed into a textual electronic format that would allow unrestricted indexing, browsing and querying. In some cases, adequate transcriptions of the handwritten text images are already available. In this work three systems are presented to deal with this sort of documents. The first two address two different approaches for semi-automatic transcription of document images. The third system implements an alignment method to find mappings between word images of a handwritten document and their respective words in its given transcription.
RIS
TY  - CPAPER
TI  - Handwritten Text Recognition for Ancient Documents
AU  - Alfons Juan
AU  - Verónica Romero
AU  - Joan Andreu Sánchez
AU  - Nicolás Serrano
AU  - Alejandro H. Toselli
AU  - Enrique Vidal
BT  - Proceedings of the First Workshop on Applications of Pattern Analysis
DA  - 2010/09/30
ED  - Tom Diethe
ED  - Nello Cristianini
ED  - John Shawe-Taylor
ID  - pmlr-v11-juan10a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 11
SP  - 58
EP  - 65
L1  - http://proceedings.mlr.press/v11/juan10a/juan10a.pdf
UR  - https://proceedings.mlr.press/v11/juan10a.html
AB  - Huge amounts of legacy documents are being published by on-line digital libraries world wide. However, for these raw digital images to be really useful, they need to be transcribed into a textual electronic format that would allow unrestricted indexing, browsing and querying. In some cases, adequate transcriptions of the handwritten text images are already available. In this work three systems are presented to deal with this sort of documents. The first two address two different approaches for semi-automatic transcription of document images. The third system implements an alignment method to find mappings between word images of a handwritten document and their respective words in its given transcription.
ER  -
APA
Juan, A., Romero, V., Sánchez, J.A., Serrano, N., Toselli, A.H. & Vidal, E. (2010). Handwritten Text Recognition for Ancient Documents. Proceedings of the First Workshop on Applications of Pattern Analysis, in Proceedings of Machine Learning Research 11:58-65. Available from https://proceedings.mlr.press/v11/juan10a.html.
