This paper presents a method for developing a malware ontology structure by detecting malware instances on Twitter. The ontology represents a semi-automatic classifier fed by the data extracted from tweets. In particular, the automatic part of the presented methodology relies on a pattern-based approach to detect trigger expressions leading to new information about malware, whilst the manual one covers the evaluation of the results by domain-experts, who also validate the reliability of the semantic relationships within the ontology framework. We present preliminary results on the application of our methodology to tweets extracted from MalwareBazaar database showing how the documents’ collection analysis, through Natural Language Processing (NLP) tasks, can support the knowledge retrieval and documents’ classification procedures for building early warning system of detected malware. Results obtained from this research paper within the time framework of 2023 are referred to the previous version of the current social network X.

Towards a semi-automatic classifier of malware through tweets for early warning threat detection

Claudia Lanza
;
2024-01-01

Abstract

This paper presents a method for developing a malware ontology structure by detecting malware instances on Twitter. The ontology represents a semi-automatic classifier fed by the data extracted from tweets. In particular, the automatic part of the presented methodology relies on a pattern-based approach to detect trigger expressions leading to new information about malware, whilst the manual one covers the evaluation of the results by domain-experts, who also validate the reliability of the semantic relationships within the ontology framework. We present preliminary results on the application of our methodology to tweets extracted from MalwareBazaar database showing how the documents’ collection analysis, through Natural Language Processing (NLP) tasks, can support the knowledge retrieval and documents’ classification procedures for building early warning system of detected malware. Results obtained from this research paper within the time framework of 2023 are referred to the previous version of the current social network X.
2024
Malware; Classification; NLP; Twitter; Text Mining.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.11770/366017
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact