pLoc_bal-mGpos Is a Powerful Artificial Intelligence Tool for Predicting the Subcellular Localization of Gram-Positive Bacterial Proteins According to Their Sequence Information Alone
Recently, a very useful web server, or AI (artificial intelligence) tool, was established for predicting the subcellular localization of Gram-positive bacterial proteins purely from their sequence information in multi-label systems [1], in which the same protein may occur in, or move between, two or more locations, so that its identification (ID) needs two or more labels as well, namely the “multi-label mark” [2].
The AI tool is named “pLoc_bal-mGpos”, where “bal” indicates that the tool was trained on a balanced training dataset [3–9], and “m” indicates that it can handle multi-label systems. Below, we demonstrate how the AI tool works.
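To make the multi-label idea concrete, the following minimal Python sketch shows one common way to encode a protein's locations as a 0/1 indicator vector. It is an illustration only, not the web server's internal representation: the location names in `LOCATIONS` and the helper functions `encode_labels` and `decode_labels` are assumptions introduced here.

```python
# Illustrative location names (an assumption); the label set actually used
# by the web server is not listed in this communication.
LOCATIONS = ["cell membrane", "cell wall", "cytoplasm", "extracellular"]


def encode_labels(locations):
    """Return a 0/1 indicator vector with one slot per entry in LOCATIONS."""
    unknown = set(locations) - set(LOCATIONS)
    if unknown:
        raise ValueError(f"unknown location(s): {sorted(unknown)}")
    return [1 if name in locations else 0 for name in LOCATIONS]


def decode_labels(vector):
    """Return the location names whose slot is set in the indicator vector."""
    return [name for name, flag in zip(LOCATIONS, vector) if flag]


# A single-location protein carries one label ...
single = encode_labels(["cytoplasm"])
# ... while a protein found in two locations carries two labels at once.
multi = encode_labels(["cell membrane", "extracellular"])

print(single, decode_labels(single))
print(multi, decode_labels(multi))
```

A multi-label predictor outputs a whole vector of this kind for each query protein, rather than a single class.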
Clicking the link at http://www.jci-bioinfo.cn/pLoc_bal-mGpos/ opens the top page of the pLoc_bal-mGpos web server (Figure 1). Then click the Example button to load the example query protein sequences as the input. After you click the Submit button, the page shown in Figure 2 appears. The corresponding outcomes are detailed in [4]. As can be seen there, nearly all the success rates achieved by the AI tool for the Gram-positive bacterial proteins in each of the 6 subcellular locations are within the range of 98–99%. Such high prediction quality is well beyond the reach of any of its counterparts.
Figure 1. A partial screenshot of the top page of pLoc_bal-mGpos (adapted from [4] with permission).
Figure 2. A partial screenshot of the webpage obtained by following Step 3 of Section 3.5 (adapted from [4] with permission).
In addition to being highly accurate and easy to use, the AI tool was constructed in strict compliance with “Chou’s 5-steps rule” and hence has the following merits, as noted by many investigators (see, e.g., [10–91] as well as three comprehensive review papers [2, 92, 93]): (1) crystal-clear logic development, (2) completely transparent operation, (3) results that other investigators can easily reproduce, (4) high potential for stimulating other sequence-analysis methods, and (5) great convenience for the majority of experimental scientists.
Besides, the PseAAC (Pseudo Amino Acid Composition) approach [94–96] was also used in developing the AI tool. It is a very powerful approach for formulating protein samples by capturing their special features, and it has been used by many investigators [97–222].
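As a rough illustration of the PseAAC idea, the sketch below builds a (20 + λ)-dimensional vector: 20 amino acid frequencies followed by λ sequence-order correlation factors, all normalized to sum to 1. It follows the classic form of PseAAC and is not necessarily the feature formulation used inside pLoc_bal-mGpos. Several choices are assumptions made here for brevity: a single physicochemical scale (Kyte-Doolittle hydropathy) stands in for the several properties of the original formulation, and the function name `pseaac`, the default `lam=3`, and the weight `0.05` are illustrative.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here as a stand-in property scale
# (an assumption of this sketch, not the scale set of the original method).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}


def standardize(scale):
    """Shift and rescale a property scale to zero mean and unit variance."""
    mean = sum(scale.values()) / len(scale)
    std = (sum((v - mean) ** 2 for v in scale.values()) / len(scale)) ** 0.5
    return {aa: (v - mean) / std for aa, v in scale.items()}


def pseaac(sequence, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional classic-style PseAAC vector."""
    seq = sequence.upper()
    if any(aa not in AMINO_ACIDS for aa in seq):
        raise ValueError("sequence contains non-standard residues")
    if not 0 <= lam < len(seq):
        raise ValueError("lam must satisfy 0 <= lam < sequence length")
    prop = standardize(KD)
    counts = Counter(seq)
    freqs = [counts[aa] / len(seq) for aa in AMINO_ACIDS]
    # theta_k: mean squared property difference between residues k apart.
    thetas = []
    for k in range(1, lam + 1):
        diffs = [(prop[seq[i + k]] - prop[seq[i]]) ** 2
                 for i in range(len(seq) - k)]
        thetas.append(sum(diffs) / len(diffs))
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


vec = pseaac("MKKLLPTAAAGLLLLAAQPAMA", lam=3)
print(len(vec), round(sum(vec), 6))
```

The first 20 components reflect composition alone, while the trailing λ components retain some sequence-order information that plain amino acid composition discards.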
Moreover, the IHTS (Inserting Hypothetical Training Samples) treatment has also been utilized to balance out the training dataset [57, 60, 84].
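The IHTS procedure itself is not described in this communication. As a loose illustration of what balancing a training dataset with hypothetical samples can look like, the sketch below swaps in a generic interpolation-based oversampling scheme (in the spirit of SMOTE): smaller classes are padded with synthetic points drawn on the line segment between two existing samples of the same class. The function name `balance_by_interpolation` and the toy data are assumptions, and this should not be read as the IHTS algorithm of [57, 60, 84].

```python
import random


def balance_by_interpolation(samples_by_class, seed=0):
    """Pad every class up to the size of the largest one.

    Each synthetic sample is a random convex combination of two existing
    samples from the same class (generic oversampling, not IHTS itself).
    """
    rng = random.Random(seed)
    target = max(len(samples) for samples in samples_by_class.values())
    balanced = {}
    for label, samples in samples_by_class.items():
        if not samples:
            raise ValueError(f"class {label!r} has no samples to interpolate")
        augmented = [list(s) for s in samples]  # keep the originals first
        while len(augmented) < target:
            a = rng.choice(samples)
            b = rng.choice(samples)
            t = rng.random()
            augmented.append([x + t * (y - x) for x, y in zip(a, b)])
        balanced[label] = augmented
    return balanced


# Toy feature vectors: class "b" is under-represented.
data = {
    "a": [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    "b": [[0.0, 1.0], [1.0, 0.0]],
}
balanced = balance_by_interpolation(data)
print({label: len(samples) for label, samples in balanced.items()})
```

After balancing, every class contributes the same number of training samples, which keeps a classifier from being dominated by the largest class.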
For the important roles of the “5-steps rule” in driving proteome and genome analyses and drug development, see a series of recent papers [2, 93, 223–233], in which the rule and its wide applications are presented from various angles.
References
Article Type: Short Communication
Publication history
Received: February 15, 2023
Accepted: February 16, 2023
Published: February 19, 2023
Citation:
Kuo-Chen C (2023) The Ploc_Bal-Mgpos is a Powerful Artificial Intelligence Tool for Predicting the Subcellular Localization of Gram-Positive Bacterial Proteins According to Their Sequence Information Alone. Clar J Infect Dis Ther 04(01): 280–292.
Kuo-Chen Chou*
Gordon Life Science Institute, Boston, Massachusetts 02478, USA
*Corresponding author: Kuo-Chen Chou, Gordon Life Science Institute, Boston, Massachusetts 02478, USA