NeuroBooster: A Robust Classifier for the Discovery of Neuropeptide Sequences based on Meta-learning Approach
Cuzzocrea, Alfredo
2024-01-01
Abstract
Neuropeptides (NPs) are fragile proteins that serve as essential signaling molecules in the nervous system, where they play a key role in modulating various physiological processes. Identifying neuropeptide sequences relevant to specific disorders would help accelerate the development of diagnostic tools. This study proposes a new approach to detecting NPs, called NeuroBooster, a meta-learning method based on a multi-layer perceptron (MLP) and a bagging classifier. The investigation first focused on five feature extraction strategies, composition-based (such as amino acid composition (AAC), pseudo amino acid composition (PAAC), physicochemical properties, and quasi-sequence-order (QSO)) and transfer-learning-based (such as BERT and F2V). We then applied XGB feature selection to the BERT and F2V features to retain the 100 most important dimensions. The predicted probabilities of NPs from the 8 preliminary models were merged into a second-stage dataset with 40 feature dimensions, which was passed to three classic models and two meta-models and evaluated under rigorous criteria. Compared with the existing predictor, the proposed NeuroBooster model achieved a higher accuracy of 91.91% on the independent test. We also identified the important features in these five models, underscoring that physicochemical properties are potential targets for identification and may reveal new avenues for therapies.
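As a rough illustration of two steps named in the abstract, the sketch below computes an AAC vector for one sequence and concatenates base-model probabilities into a single second-stage feature vector. It is not the authors' implementation. The function names `aac` and `stack_probabilities` are hypothetical, and reading the 40 dimensions as 8 base-model probabilities for each of 5 feature sets is an assumption that the abstract does not state explicitly.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector is aligned.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence: str) -> list[float]:
    """Amino acid composition: fraction of each standard residue in the sequence."""
    seq = sequence.upper()
    # Non-standard symbols (e.g. X) are ignored in this sketch.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def stack_probabilities(probs_by_feature_set: dict[str, list[float]]) -> list[float]:
    """Concatenate base-model probabilities into one second-stage vector.

    Assumption (not confirmed by the abstract): each feature set contributes
    one predicted NP probability per base model, so 5 feature sets with
    8 base models each give 5 * 8 = 40 second-stage features.
    """
    vector: list[float] = []
    # Sort keys so the column order is the same for every sequence.
    for name in sorted(probs_by_feature_set):
        vector.extend(probs_by_feature_set[name])
    return vector
```

In a full pipeline the probabilities would come from base models trained with cross-validation, so that the meta-model never sees predictions made on a model's own training data. That part is omitted here.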


