Enabling anonymized open-data linkage by authorized parties
Buccafurri, Francesco; De Angelis, Vincenzo
2023-01-01
Abstract
Many entities collect information about their users in order to deliver a service, and then publish it as open data. To prevent privacy leakage, the data are often anonymized before publication. Unfortunately, anonymization strongly hinders data linkage, which is very useful for analysis. In this paper, we address this problem by proposing a technique that enriches anonymized open data with pseudo-random labels. These labels enable authorized parties (i.e., the analysts) to link data about the same user coming from different sources. For non-authorized parties, the labels carry no information, so they introduce no additional privacy threats with respect to the original open data. In other words, our solution recovers linkage capabilities on anonymized open data, thus enabling more powerful data exploitation. Indeed, the linked open data paradigm, involving both the public sector and business, is recognized as one of the most promising approaches for boosting societal growth. To offer a concrete solution, we refer to an existing open-data standard and implement the protocol through a SAML-based SSO framework adhering to the eIDAS regulation.
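To make the idea of pseudo-random labels more concrete, the following is a minimal sketch, not the protocol proposed in the paper. It assumes a hypothetical label issuer that derives a stable, never-published pseudonym per user with HMAC, then masks that pseudonym with a fresh nonce under a key shared only with the analysts. All names (`pseudonym`, `make_label`, `open_label`, `idp_key`, `link_key`) are illustrative assumptions, and the SAML/eIDAS integration described in the abstract is not modeled.

```python
import hashlib
import hmac
import secrets


def pseudonym(idp_key: bytes, user_id: str) -> bytes:
    """Stable 32-byte pseudonym for a user; assumed to stay private."""
    return hmac.new(idp_key, user_id.encode("utf-8"), hashlib.sha256).digest()


def make_label(link_key: bytes, pseud: bytes) -> str:
    """Publishable label: a random nonce plus the pseudonym masked by a
    keyed pseudo-random pad. Each call yields a fresh-looking label."""
    nonce = secrets.token_bytes(16)
    mask = hmac.new(link_key, nonce, hashlib.sha256).digest()
    masked = bytes(a ^ b for a, b in zip(pseud, mask))
    return (nonce + masked).hex()


def open_label(link_key: bytes, label: str) -> bytes:
    """Analyst side: strip the pad to recover the linkable pseudonym."""
    raw = bytes.fromhex(label)
    nonce, masked = raw[:16], raw[16:]
    mask = hmac.new(link_key, nonce, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(masked, mask))


# Hypothetical keys: idp_key is held by the issuer only,
# link_key is shared with authorized analysts only.
idp_key = secrets.token_bytes(32)
link_key = secrets.token_bytes(32)

# The same user appears in two anonymized datasets with different labels.
p = pseudonym(idp_key, "user-123")
label_source_a = make_label(link_key, p)
label_source_b = make_label(link_key, p)

# An analyst holding link_key can tell both records belong to one user.
same_user = open_label(link_key, label_source_a) == open_label(link_key, label_source_b)
```

In this sketch, two labels for the same user differ because each uses a fresh nonce, so without `link_key` they are not expected to reveal that the records belong together. An analyst holding `link_key` recovers the same pseudonym from both and can join the records, without learning the underlying user identifier.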