Multilingual document classification via transductive learning
TAGARELLI, Andrea
2015-01-01
Abstract
We present a transductive-learning-based framework for multilingual document classification, originally proposed in [7]. A key aspect of our approach is the use of a large-scale multilingual knowledge base, BabelNet, to model documents written in different languages in a common conceptual space, without requiring any language translation process. Results on real-world multilingual corpora highlight the superiority of the proposed document model over existing language-dependent representation approaches, and the significance of the transductive setting for multilingual document classification.