Combining dissimilarity matrices by using rank correlations

2016-01-01

Abstract

Recently there has been increased interest in methods for aggregating multiple matrices observed on a fixed set of entities, where each matrix expresses a particular notion of the dissimilarity of one entity from another. An optimization-based procedure is developed that returns a global dissimilarity matrix in the form of a weighted average of the partial matrices. The weights are determined with the goal of overcoming the conflicts and overlaps that inevitably arise when different sources of data become part of the same representational structure. One important aspect in this context is the coefficient used to measure the degree of association between matrices. It is common to adopt the vector correlation proposed by Escoufier, but this has the drawback of depending on Pearson's correlation, which is highly sensitive to outliers. To mitigate this problem, we propose replacing the Pearson coefficient with rank correlations, which are less affected by measurement errors, nonlinearity or outliers. The results obtained with real and simulated data confirm that applying vector rank correlations attenuates the adverse effects of anomalies and, in the case of clean and faultless data, yields weights that essentially agree with those obtained using the Escoufier coefficient.
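The following sketch illustrates the general idea in Python. It is not the optimization procedure of the paper, which the abstract does not specify. As a stand-in, each matrix is weighted by its mean association with the other matrices (an assumption made purely for illustration), and the association is computed on the vectorized upper triangles with either Pearson's or Spearman's coefficient. The function names, the weighting rule and the toy matrices `D1`, `D2`, `D3` are all hypothetical.

```python
from itertools import combinations
from math import sqrt

def upper_triangle(d):
    """Vectorize the strictly upper triangle of a square dissimilarity matrix."""
    return [d[i][j] for i, j in combinations(range(len(d)), 2)]

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(x):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def combine(matrices, corr=spearman):
    """Weighted average of dissimilarity matrices.

    Illustrative weighting rule (an assumption, not the paper's method):
    weight of a matrix = its mean correlation with the others, clipped at 0
    and normalized so that the weights sum to 1.
    """
    vecs = [upper_triangle(d) for d in matrices]
    k = len(vecs)
    raw = [
        max(0.0, sum(corr(vecs[a], vecs[b]) for b in range(k) if b != a) / (k - 1))
        for a in range(k)
    ]
    total = sum(raw)
    weights = [v / total for v in raw] if total > 0 else [1.0 / k] * k
    n = len(matrices[0])
    combined = [
        [sum(w * d[i][j] for w, d in zip(weights, matrices)) for j in range(n)]
        for i in range(n)
    ]
    return weights, combined

# Toy data: D2 equals D1 except for one grossly inflated entry (an outlier).
D1 = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
D2 = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 60], [3, 5, 60, 0]]
D3 = [[0, 2, 1, 3], [2, 0, 5, 4], [1, 5, 0, 6], [3, 4, 6, 0]]

r_pearson = pearson(upper_triangle(D1), upper_triangle(D2))
r_spearman = spearman(upper_triangle(D1), upper_triangle(D2))
weights, combined = combine([D1, D2, D3], corr=spearman)
```

In this toy case the single outlier lowers the Pearson association between `D1` and `D2`, while the Spearman association is unchanged because the ordering of the dissimilarities is preserved. Passing `corr=pearson` to `combine` shows how the weights shift when the outlier-sensitive coefficient is used instead.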
Keywords: DISTATIS; Multivariate association; Three-way data
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/141923