Computing theoretically-sound upper bounds to expected support for frequent pattern mining problems over uncertain big data

CUZZOCREA, Alfredo Massimiliano;
2016

Abstract

Frequent pattern mining aims to discover implicit, previously unknown, and potentially useful knowledge in the form of sets of frequently co-occurring items, events, or objects. In probabilistic datasets of uncertain data, each item in a transaction is usually associated with an existential probability expressing the likelihood of its presence in that transaction. To mine frequent patterns from such data, the UF-growth algorithm captures the relevant information in a UF-tree structure so that the expected support of each pattern can be computed. A pattern is considered frequent if its expected support meets or exceeds the user-specified threshold. A challenge, however, is that the UF-tree can be large. To address this, several algorithms use smaller trees from which upper bounds to expected support can be computed. In this paper, we examine these upper bounds and determine which ones are tighter for frequent pattern mining of uncertain big data.
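
To make the notions of expected support and an upper bound concrete, the following is a minimal sketch, not the method of the paper. It assumes that items within a transaction are independent, so the expected support of a pattern is the sum, over the transactions containing it, of the product of its items' existential probabilities. The bound shown is one simple illustrative choice (for patterns of two or more items, the product of the two highest probabilities in each containing transaction) and is not necessarily one of the bounds examined in the paper. The dataset and the function names `expected_support` and `capped_upper_bound` are hypothetical.

```python
from math import prod

# Hypothetical toy dataset: each transaction maps an item to its
# existential probability in that transaction.
transactions = [
    {"a": 0.9, "b": 0.8, "c": 0.5},
    {"a": 0.6, "c": 0.5},
    {"b": 0.5, "c": 1.0},
]


def expected_support(pattern, transactions):
    """Sum, over transactions containing the pattern, of the product of
    the existential probabilities of the pattern's items (assumes
    independence between items)."""
    total = 0.0
    for t in transactions:
        if all(item in t for item in pattern):
            total += prod(t[item] for item in pattern)
    return total


def capped_upper_bound(pattern, transactions):
    """Illustrative upper bound for patterns with at least two items.

    Every probability is at most 1, so the product over the pattern's
    items cannot exceed the product of the two highest probabilities
    in the same transaction."""
    if len(pattern) < 2:
        raise ValueError("this bound is only defined for patterns of 2+ items")
    total = 0.0
    for t in transactions:
        if all(item in t for item in pattern):
            first, second = sorted(t.values(), reverse=True)[:2]
            total += first * second
    return total


exact = expected_support({"a", "c"}, transactions)
bound = capped_upper_bound({"a", "c"}, transactions)
```

For the pattern {a, c}, the exact expected support is 0.9 × 0.5 + 0.6 × 0.5 = 0.75, while the bound is 0.9 × 0.8 + 0.6 × 0.5 = 1.02. The bound needs only one number per transaction rather than every item probability, which is the kind of trade-off that lets a smaller tree stand in for the UF-tree at the cost of some looseness.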
ISBN: 9783319405803
Big data
Data analysis
Data mining
Data science
Uncertainty
Computer Science (all)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/20.500.11770/312784