Clustering Transactional XML Data with Semantically-Enriched Content and Structural Features
TAGARELLI, Andrea; GRECO, Sergio
2004-01-01
Abstract
We address the problem of clustering XML data according to semantically-enriched features extracted by analyzing content and structural specifics in the data. Content features are selected from the textual contents of XML elements, while structure features are extracted from XML tag paths on the basis of ontological knowledge. Moreover, we conceive a transactional model for representing sets of semantically cohesive XML structures, and exploit such a model to effectively and efficiently cluster XML data. The resulting clustering framework was successfully tested on some collections extracted from the DBLP XML archive.