MonicaBerti
When Printed Hypertexts Go Digital: Information Extraction from the Parsing of Indices
Matteo Romanello - Monica Berti - Alison Babeu - Gregory Crane, When Printed Hypertexts Go Digital: Information Extraction from the Parsing of Indices, in Hypertext 2009: Proceedings of the 20th ACM Conference on Hypertext and Hypermedia, pages -, Torino, Italy: ACM Digital Library, 2009-07.

Modern critical editions of ancient works generally include manually created indices of the other sources quoted in the text. Because such indices can be regarded as a form of domain-specific language, the paper presents a parsing-based approach to extracting information from them, in order to support the creation of a collection of fragmentary texts. The paper first considers the characteristics and structure of quotation indices and their importance when dealing with fragmentary texts. Finally, it presents the results of applying a fuzzy parser to the OCR transcription of an index of quotations, with the aim of extracting information from potentially noisy input.
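To make the idea of tolerant parsing over noisy OCR output more concrete, here is a minimal sketch in Python. It is not the parser described in the paper: it substitutes a plain line-by-line regular expression for a real fuzzy parser, and the entry layout it assumes (quoted source, locus, dot leaders, page list) is hypothetical, as are the names `IndexEntry`, `parse_line`, `parse_index` and the sample lines. The point it illustrates is only that recognizable entries are extracted while unrecognized lines are set aside rather than causing the whole parse to fail.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class IndexEntry(NamedTuple):
    source: str        # quoted author/work as printed in the index (assumed field)
    locus: str         # passage within the quoted source, e.g. "1.1-5"
    pages: List[int]   # pages of the edition where the quotation occurs


# Hypothetical entry layout: "<source> <locus> <dot leaders> <page list>".
ENTRY = re.compile(
    r"^\s*(?P<source>[A-Za-z][A-Za-z. ]*?)\s+"
    r"(?P<locus>\d+(?:\.\d+)*(?:\s*-\s*\d+)?)"
    r"[\s.\u2026]+"
    r"(?P<pages>[\dIlO]+(?:\s*,\s*[\dIlO]+)*)\s*$"
)

# Assumed OCR confusions inside page numbers: I/l read for 1, O read for 0.
OCR_DIGIT_FIXES = str.maketrans({"I": "1", "l": "1", "O": "0"})


def parse_line(line: str) -> Optional[IndexEntry]:
    """Return an IndexEntry if the line looks like an entry, else None."""
    match = ENTRY.match(line)
    if match is None:
        return None
    pages = [
        int(page.strip().translate(OCR_DIGIT_FIXES))
        for page in match.group("pages").split(",")
    ]
    return IndexEntry(match.group("source").strip(), match.group("locus"), pages)


def parse_index(text: str) -> Tuple[List[IndexEntry], List[str]]:
    """Parse every non-blank line; collect unparseable lines instead of failing."""
    entries: List[IndexEntry] = []
    skipped: List[str] = []
    for line in text.splitlines():
        if not line.strip():
            continue
        entry = parse_line(line)
        if entry is None:
            skipped.append(line)
        else:
            entries.append(entry)
    return entries, skipped


if __name__ == "__main__":
    sample = "INDEX LOCORUM\nHom. Il. 1.1-5 ..... 2I, 45\n#@ garbage\nThuc. 2.34 .. 7\n"
    parsed, rejected = parse_index(sample)
    for entry in parsed:
        print(entry)
    print("skipped:", rejected)
```

Keeping the skipped lines, rather than discarding them, leaves room for manual review of whatever the OCR damaged beyond what this simple pattern can recover.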
 
 
Copyright © 2004 - 2009 Monica Berti