Enabling anonymized open-data linkage by authorized parties

Buccafurri, Francesco; De Angelis, Vincenzo
2023-01-01

Abstract

Nowadays, many entities collect useful information about users in order to provide their services, and publish it as open data. To prevent privacy leakage, data are often anonymized prior to publication. Unfortunately, anonymization strongly hinders data linkage, which can be very useful for analysis purposes. In this paper, we address this problem by proposing a technique that enriches anonymized open data with pseudo-random labels. In this way, authorized parties (i.e., the analysts) can link data about the same user coming from different sources. For non-authorized parties, instead, the labels carry no information, and thus introduce no additional privacy threats with respect to the original open data. In other words, our solution recovers linkage capabilities on anonymized open data, thus enabling more powerful data exploitation. Indeed, the linked open data paradigm, involving both the public sector and business, is recognized as one of the most promising approaches for boosting societal growth. To offer a concrete solution, we refer to an existing open-data standard and implement the protocol through a SAML-based SSO framework adhering to the eIDAS regulation.
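The idea of labels that look random to everyone except authorized analysts can be illustrated with a small sketch. This is not the protocol proposed in the paper, whose details are not given in the abstract and which relies on a SAML-based SSO framework. It is a generic textbook construction (a keyed pseudo-random function used as a one-time pad), and every name in it (`id_key`, `link_key`, `make_label`, `open_label`, `link_records`) is a hypothetical choice made only for illustration. A label issuer is assumed to hold both keys, while analysts are assumed to hold only `link_key`.

```python
import hashlib
import hmac
import secrets
from collections import defaultdict

PSEUDONYM_LEN = 16  # bytes of the stable per-user pseudonym (assumed size)
NONCE_LEN = 16      # bytes of fresh randomness per label (assumed size)


def pseudonym(id_key: bytes, user_id: str) -> bytes:
    """Stable per-user pseudonym, so analysts never see the real identifier."""
    mac = hmac.new(id_key, user_id.encode("utf-8"), hashlib.sha256)
    return mac.digest()[:PSEUDONYM_LEN]


def _pad(link_key: bytes, nonce: bytes) -> bytes:
    """Pseudo-random pad derived from the linkage key and a nonce."""
    return hmac.new(link_key, nonce, hashlib.sha256).digest()[:PSEUDONYM_LEN]


def make_label(id_key: bytes, link_key: bytes, user_id: str) -> str:
    """Return a fresh label: nonce || (pad XOR pseudonym), hex-encoded.

    Two labels for the same user differ, because each uses a new nonce.
    """
    nonce = secrets.token_bytes(NONCE_LEN)
    pad = _pad(link_key, nonce)
    masked = bytes(a ^ b for a, b in zip(pseudonym(id_key, user_id), pad))
    return (nonce + masked).hex()


def open_label(link_key: bytes, label: str) -> bytes:
    """Recover the pseudonym from a label; requires the linkage key."""
    raw = bytes.fromhex(label)
    nonce, masked = raw[:NONCE_LEN], raw[NONCE_LEN:]
    pad = _pad(link_key, nonce)
    return bytes(a ^ b for a, b in zip(masked, pad))


def link_records(link_key: bytes, records):
    """Group (label, payload) pairs that belong to the same pseudonym."""
    groups = defaultdict(list)
    for label, payload in records:
        groups[open_label(link_key, label)].append(payload)
    return list(groups.values())


if __name__ == "__main__":
    id_key = secrets.token_bytes(32)    # held by the hypothetical label issuer
    link_key = secrets.token_bytes(32)  # shared with authorized analysts only
    records = [
        (make_label(id_key, link_key, "user-1"), "source A, record 1"),
        (make_label(id_key, link_key, "user-2"), "source A, record 2"),
        (make_label(id_key, link_key, "user-1"), "source B, record 1"),
    ]
    # Only a holder of link_key can tell which records share a user.
    for group in link_records(link_key, records):
        print(group)
```

In this sketch, an observer without `link_key` sees a different random-looking label on every record, so the labels alone do not reveal which records belong to the same user. An analyst holding `link_key` recovers only a pseudonym, not the original identifier. Key distribution, authentication of analysts, and integrity protection of labels are deliberately left out.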
2023
Open data
eIDAS
Anonymity
Record linkage
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/362829