Document layout analysis for semantic information extraction


Using machines to automatically extract relevant information from unstructured and semi-structured sources has practical significance in today's life and business. In this context, although understanding the meaning of words is important, identifying self-consistent geometric and logical regions of interest (blocks, cells, columns and tables, as well as paragraphs, titles and captions, to mention only a few) is of paramount importance too. This complex process goes under the name of document layout analysis. In this work, we discuss newly designed techniques that solve this problem effectively by combining syntactic and semantic aspects of a document. The techniques described here form the basis of KnowRex, a comprehensive system for ontology-driven Information Extraction.


Adrian, Weronika T.; Leone, Nicola; Manna, Marco; Marte, Cinzia

2017-01-01



Year: 2017
ISBN: 9783319701684
Keywords: Answer set programming; Document layout analysis; Information extraction; Knowledge representation; Ontologies; Table recognition; Theoretical Computer Science; Computer Science (all)
Appears in types: 4.1 Contribution in conference proceedings


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/276674
