A Neural-Machine-Translation System Resilient to Out of Vocabulary Words for Translating Natural Language to SPARQL

Ricca F.;Cuteri B.
2021-01-01

Abstract

The development and diffusion of ontologies have enabled the creation of large banks of information spanning multiple domains, known as knowledge bases. Ontologies represent information with semantic meaning, which makes the data machine-interpretable. However, exploiting such rich knowledge is difficult for most potential users, who know neither the knowledge-base definition nor how to write SPARQL queries. Systems that translate natural language questions into SPARQL queries have the potential to overcome this problem. In this paper, we propose an approach that combines Named Entity Recognition and Neural Machine Translation to automatically translate natural language questions into executable SPARQL queries. The resulting approach is robust to terms that do not occur in the training set. We evaluate our approach on Monument and QALD-9, two well-known datasets for Question Answering over the DBpedia ontology.
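The abstract does not describe how Named Entity Recognition and Neural Machine Translation are combined. One plausible reading, sketched below purely as an illustration, is placeholder masking: entity mentions are replaced by placeholders before translation and restored in the generated query, so the translation step never needs to have seen the entity itself. The gazetteer `ENTITY_LINKS`, the lookup table `TEMPLATE_TRANSLATIONS`, and all function names are hypothetical stand-ins for trained models and are not taken from the paper.

```python
import re

# Hypothetical gazetteer standing in for a trained NER and entity-linking model.
ENTITY_LINKS = {
    "Eiffel Tower": "dbr:Eiffel_Tower",
    "Colosseum": "dbr:Colosseum",
}

# Hypothetical lookup table standing in for a trained NMT model.
# Prefix declarations (dbr:, dbo:) are omitted for brevity.
TEMPLATE_TRANSLATIONS = {
    "where is the <ent0> located ?": "SELECT ?x WHERE { <ent0> dbo:location ?x }",
}


def mask_entities(question):
    """Replace each known entity mention with a numbered placeholder."""
    slots = {}
    masked = question
    for mention, uri in ENTITY_LINKS.items():
        if mention.lower() in masked.lower():
            placeholder = f"<ent{len(slots)}>"
            masked = re.sub(re.escape(mention), placeholder, masked,
                            flags=re.IGNORECASE)
            slots[placeholder] = uri
    return masked, slots


def normalize(text):
    """Lowercase the text and split the question mark into its own token."""
    return " ".join(text.lower().replace("?", " ?").split())


def translate_template(masked_question):
    """Stand-in for the NMT step: map a masked question to a masked query."""
    return TEMPLATE_TRANSLATIONS[normalize(masked_question)]


def restore_entities(masked_query, slots):
    """Substitute the linked resources back into the generated query."""
    query = masked_query
    for placeholder, uri in slots.items():
        query = query.replace(placeholder, uri)
    return query


def question_to_sparql(question):
    masked, slots = mask_entities(question)
    return restore_entities(translate_template(masked), slots)


print(question_to_sparql("Where is the Colosseum located?"))
print(question_to_sparql("Where is the Eiffel Tower located?"))
```

Both questions reduce to the same masked template, so under this assumption the translation step only has to learn the question pattern, not the individual entity names.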
Year: 2021
ISBN: 978-3-031-08420-1; 978-3-031-08421-8
Keywords: Knowledge base; Natural Language Processing; Neural machine translation; Question answering

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/356317