Combining dissimilarity matrices by using rank correlations
2016-01-01
Abstract
Recently there has been increased interest in methods for aggregating multiple matrices observed on a fixed set of entities, where each matrix expresses a particular notion of the dissimilarity of one entity from another. An optimization-based procedure is developed that returns a global dissimilarity matrix in the form of a weighted average of the partial matrices. The weights are chosen to overcome the conflicts and overlaps that inevitably arise when different sources of data become part of the same representational structure. One important aspect in this context is the coefficient used to measure the degree of association between matrices. It is customary to adopt the vector correlation proposed by Escoufier, but this has the drawback of depending on Pearson's correlation, which is highly sensitive to outliers. To mitigate this problem, we propose replacing the Pearson coefficient with rank correlations, which are less affected by measurement errors, nonlinearity, or outliers. Results obtained with real and simulated data confirm that vector rank correlations attenuate the adverse effects of anomalies and, for clean data, yield weights that essentially agree with those obtained using the Escoufier coefficient.
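The ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure. The function `rv_coefficient` follows the standard definition of Escoufier's RV coefficient. The function `rank_rv_coefficient`, which ranks each column before computing RV, is only one assumed way of building a rank-based analogue, since the abstract does not define the paper's vector rank correlations. The weights passed to `combine` are hypothetical placeholders and do not come from the paper's optimization.

```python
import numpy as np

def rv_coefficient(x, y):
    """Escoufier's RV coefficient between two matrices sharing the same rows."""
    xc = x - x.mean(axis=0)          # column-center both matrices
    yc = y - y.mean(axis=0)
    sxy = xc.T @ yc                  # cross-product matrices
    sxx = xc.T @ xc
    syy = yc.T @ yc
    num = np.trace(sxy @ sxy.T)
    den = np.sqrt(np.trace(sxx @ sxx) * np.trace(syy @ syy))
    return num / den

def column_ranks(x):
    """Replace each column by its ranks 1..n (assumes no ties within a column)."""
    return x.argsort(axis=0).argsort(axis=0) + 1.0

def rank_rv_coefficient(x, y):
    """Assumed rank analogue: RV computed on column-wise ranks."""
    return rv_coefficient(column_ranks(x), column_ranks(y))

def combine(matrices, weights):
    """Weighted average of dissimilarity matrices; weights are normalized to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.tensordot(w, np.stack(matrices), axes=1)

# Two partial dissimilarity matrices on the same six entities (synthetic data).
rng = np.random.default_rng(0)
pts = rng.normal(size=(6, 2))
diff = pts[:, None, :] - pts[None, :, :]
d_euclid = np.linalg.norm(diff, axis=-1)
d_manhattan = np.abs(diff).sum(axis=-1)

print(rv_coefficient(d_euclid, d_manhattan))
print(rank_rv_coefficient(d_euclid, d_manhattan))
global_d = combine([d_euclid, d_manhattan], weights=[0.5, 0.5])  # placeholder weights
```

Because ranks are unchanged by any strictly increasing transformation, the rank-based variant in this sketch is unaffected when one matrix is rescaled nonlinearly, which is the kind of robustness the abstract attributes to rank correlations.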