Many recent papers have investigated more effective and efficient algorithms for search engines. Traditional search engines rank results using textual information only and, as several works have shown, such rankings are not very useful for extracting relevant information. Recent research instead takes a new approach, called Topic Distillation, whose main task is to find relevant documents using a different similarity criterion: retrieved documents are those related to the query topic, even if they do not contain the query string. Current algorithms for topic distillation first compute a base set containing all the relevant pages and then obtain the authoritative pages by applying an iterative procedure. In this paper, we present a different approach, which computes the authoritative pages by analyzing the structure of the base set. The technique applies a statistical approach to the co-citation matrix of the base set to find the most co-cited pages, and combines link analysis with an evaluation of page content. Several experiments have shown the validity of our approach.
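As a rough illustration of what a co-citation matrix over a base set looks like, the sketch below counts, for each pair of pages, how many other pages link to both. It is not the statistical procedure of the paper, which the abstract does not specify; the function names `cocitation_matrix` and `most_cocited`, the link-pair input format, and the simple row-sum score are all assumptions made for this example.

```python
def cocitation_matrix(links, pages):
    """Build a co-citation matrix for a base set of pages.

    links: iterable of (source, target) hyperlink pairs (assumed input format).
    pages: ordered list of page identifiers forming the base set.
    Returns C, where C[i][j] is the number of pages linking to both i and j.
    """
    index = {page: i for i, page in enumerate(pages)}
    n = len(pages)
    # citers[j] holds the indices of the pages that link to page j.
    citers = [set() for _ in range(n)]
    for src, dst in links:
        # Ignore links leaving the base set and self-links.
        if src in index and dst in index and src != dst:
            citers[index[dst]].add(index[src])
    return [[len(citers[i] & citers[j]) for j in range(n)] for i in range(n)]


def most_cocited(links, pages, top=3):
    """Rank pages by total co-citation with other pages (illustrative score only)."""
    c = cocitation_matrix(links, pages)
    n = len(pages)
    scores = {
        pages[i]: sum(c[i][j] for j in range(n) if j != i) for i in range(n)
    }
    # Highest score first; ties broken by page identifier for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top]


if __name__ == "__main__":
    pages = ["a", "b", "c", "d"]
    links = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("a", "b")]
    print(most_cocited(links, pages, top=2))
```

In this toy base set, pages `c` and `d` are both cited by `a` and `b`, so they are the most co-cited pair. A real system would combine such link-based evidence with page content evaluation, as the abstract describes.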
A Probabilistic Approach for Distillation and Ranking of Web Pages / Greco, Gianluigi; Greco, Sergio; Zumpano, Ester. - In: WORLD WIDE WEB. - ISSN 1386-145X. - 4:3(2001), pp. 189-207.
|Title:||A Probabilistic Approach for Distillation and Ranking of Web Pages|
|Publication date:||2001|
|Appears in categories:||1.1 Journal article|