Storing and Retrieving XPath Fragments in Structured P2P Networks
CUZZOCREA A
2006-01-01
Abstract
In this paper, we address the problem of storing and retrieving XML data over structured Peer-to-Peer (P2P) networks. Such networks are becoming popular because of their access efficiency. An open problem with them is the kinds of queries they can handle. File name lookups, used in popular unstructured networks, are not suitable for many new data formats. Keyword-based searches are also not appropriate for XML data, which must be identified by the whole path leading to an element rather than by the element name alone. We discuss the extensions needed to properly identify XML data in structured P2P networks. A global document is split into fragments, which are stored locally within the peers according to their own themes. Each fragment is enhanced with a small set of lightweight path expressions that have the convenient side effect of yielding a decentralized catalog. Since a mediated global schema would not be a reasonable assumption in a highly dynamic P2P network, we show that XPath query evaluation relying only on this catalog effectively biases the search towards particular peers. Our approach does not suffer from the network and data size limits of previous proposals and is scalable for large P2P networks. To validate our ideas, we have devised XP2P (XPath for P2P), on which we have conducted a comprehensive experimental study using various XML datasets.
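The idea of identifying fragments by path expressions in a structured network can be illustrated with a minimal sketch. It is not the XP2P implementation: the class `ToyRing`, the helper `key_of`, the use of SHA-1, the peer names, and the longest-prefix lookup strategy are all assumptions made for illustration. Each fragment is published under the path expression of its root, that path is hashed onto a ring of peers, and a linear XPath query is resolved by probing progressively shorter prefixes of the query path.

```python
import hashlib
from bisect import bisect_left


def key_of(text: str, bits: int = 16) -> int:
    """Map a string to an identifier on a 2**bits ring (SHA-1 is an arbitrary, assumed choice)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class ToyRing:
    """Hypothetical in-memory stand-in for a structured P2P overlay."""

    def __init__(self, peer_names):
        # Peers are placed on the ring by hashing their names.
        self.peers = sorted((key_of(name), name) for name in peer_names)
        self.ids = [peer_id for peer_id, _ in self.peers]
        # Per-peer local storage: root path expression -> fragment text.
        self.store = {name: {} for name in peer_names}

    def responsible_peer(self, path: str) -> str:
        """Return the first peer whose identifier follows the key of `path` (wrapping around)."""
        index = bisect_left(self.ids, key_of(path)) % len(self.peers)
        return self.peers[index][1]

    def publish(self, root_path: str, fragment: str) -> None:
        """Store a fragment at the peer responsible for its root path expression."""
        self.store[self.responsible_peer(root_path)][root_path] = fragment

    def locate(self, query_path: str):
        """Probe the full query path, then ever shorter prefixes, until a stored fragment is found."""
        steps = query_path.strip("/").split("/")
        for length in range(len(steps), 0, -1):
            prefix = "/" + "/".join(steps[:length])
            peer = self.responsible_peer(prefix)
            if prefix in self.store[peer]:
                return peer, prefix
        return None


# Usage with made-up peers and a made-up fragment.
ring = ToyRing(["peer-a", "peer-b", "peer-c", "peer-d"])
ring.publish("/library/books", "<books><book><title>A</title></book></books>")
hit = ring.locate("/library/books/book[2]/title")
miss = ring.locate("/museum/paintings")
```

In this sketch a query needs no global schema: it reaches the peer holding the enclosing fragment by hashing path prefixes only, and the rest of the path would then be evaluated locally on that fragment.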