SPARQL-QA-v2 system for Knowledge Base Question Answering
Borroto M. A.;Ricca F.
2023-01-01
Abstract
Accessing the large volumes of information available in public knowledge bases can be difficult for users unfamiliar with formal languages such as the SPARQL query language and ontology definition languages. This issue can be overcome by systems that answer natural language questions over a knowledge base, a task known in the literature as Knowledge Base Question Answering (KBQA). More specifically, many KBQA systems aim to automatically translate questions into the corresponding SPARQL queries, which are then executed over the knowledge base to obtain the answers. Effective state-of-the-art KBQA systems are based on neural machine translation, but they easily fail to recognize words that are Out Of the Vocabulary (OOV) of the training set. This is a serious issue when querying large ontologies, where the list of entities is huge and evolves over time. In this paper, we present SPARQL-QA-v2, a system that combines Named Entity Linking, Named Entity Recognition, and Neural Machine Translation in an innovative way to address the problem of generating SPARQL queries from questions posed in natural language. We demonstrate empirically that SPARQL-QA-v2 is effective and resilient to OOV words, and that it delivers state-of-the-art performance on well-known datasets for question answering over the DBpedia and Wikidata knowledge bases.