
The PTM-related Information
About 5% of Swiss-Prot proteins have known tertiary structures in the PDB. For proteins without a known tertiary structure, two previously published tools, RVP-net (Shandar Ahmad, et al., 2003) and PSIPRED (McGuffin LJ, et al., 2000), were applied to predict solvent accessibility and secondary structure, respectively. RVP-net is a feed-forward neural network that predicts a real-valued accessible surface area (ASA), ranging from 0% to 100%, for each amino acid residue based on its neighboring residues. We applied RVP-net to predict the real-valued ASA for the amino acid residues of all Swiss-Prot proteins.
PTM-related Information | Download for Windows | Download for Linux |
The experimentally verified PTM sites | | |
The predicted accessible surface areas (ASA) of Swiss-Prot proteins | | |
The predicted secondary structures of Swiss-Prot proteins | | |
The HMM Predicted PTM sites
In this update, a KinasePhos-like method was applied to 20 types of PTM that have enough experimentally verified sites (more than 30). To reduce the number of false positive predictions, we set the predictive parameters to the values at which the prediction specificity reaches 100%, and then scanned all Swiss-Prot protein sequences for potential PTM sites. Each predicted PTM site is reported in a tab-delimited file with the Swiss-Prot ID, modified location, PTM description, HMMER bit score, and HMMER E-value.
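The 100% specificity setting can be illustrated with a minimal sketch. It assumes the tuned parameter is a single HMMER bit-score cutoff and that scores for known positive and negative sites are already available. The function names and the example scores are hypothetical and are not taken from the KinasePhos implementation.

```python
from typing import Optional, Sequence


def specificity(negative_scores: Sequence[float], cutoff: float) -> float:
    """Fraction of negative sites rejected when sites scoring >= cutoff are predicted."""
    rejected = sum(1 for score in negative_scores if score < cutoff)
    return rejected / len(negative_scores)


def cutoff_for_full_specificity(
    positive_scores: Sequence[float], negative_scores: Sequence[float]
) -> Optional[float]:
    """Lowest observed positive score that is above every negative score.

    Predicting every site with score >= this cutoff yields no false positives
    on the given negatives. Returns None if no positive site outscores all
    negatives. Assumes negative_scores is not empty.
    """
    highest_negative = max(negative_scores)
    passing = [score for score in positive_scores if score > highest_negative]
    return min(passing) if passing else None


# Hypothetical bit scores, for illustration only.
positives = [12.5, 3.0, 8.1, 9.4]
negatives = [2.0, 7.9, -1.5]
cutoff = cutoff_for_full_specificity(positives, negatives)
print(cutoff, specificity(negatives, cutoff))
```

A cutoff chosen this way trades sensitivity for specificity: in the example, the positive site scoring 3.0 falls below the cutoff and would be missed.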
PTM Type | Description | Download for Windows | Download for Linux |
N-linked glycosylation | Asparagine | | |
O-linked glycosylation | Lysine, serine, and threonine | | |
C-linked glycosylation | Tryptophan | | |
Acetylation | Alanine, lysine, methionine, and serine | | |
Phosphorylation | Serine, threonine, tyrosine, and histidine | | |
Methylation | Lysine and arginine | | |
Amidation | Asparagine, glycine, isoleucine, leucine, methionine, phenylalanine, proline, and tyrosine | | |
Palmitoylation | S-palmitoyl cysteine | | |
Farnesylation | S-farnesyl cysteine | | |
Myristoylation | N-myristoyl glycine | | |
Sulfation | Sulfotyrosine | | |
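The tab-delimited prediction files described above could be read with a short script like the following. This is a sketch under stated assumptions: the five columns are taken to appear in the order listed in the text, the file is assumed to have no header line, and the record type, function name, and sample rows are hypothetical.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class PredictedSite(NamedTuple):
    swissprot_id: str
    position: int       # modified location in the protein sequence
    description: str    # PTM description
    bit_score: float    # HMMER bit score
    e_value: float      # HMMER E-value


def read_predicted_sites(handle: TextIO) -> Iterator[PredictedSite]:
    """Yield one PredictedSite per tab-delimited line (assumed column order)."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 5:
            continue  # skip blank or incomplete lines
        sp_id, pos, desc, score, e_value = row[:5]
        yield PredictedSite(sp_id, int(pos), desc, float(score), float(e_value))


# Hypothetical rows, not taken from the real download files.
SAMPLE = (
    "PROT1_HUMAN\t87\tN-linked glycosylation\t15.2\t1.3e-05\n"
    "PROT2_HUMAN\t412\tPhosphorylation\t8.7\t0.0021\n"
)

for site in read_predicted_sites(io.StringIO(SAMPLE)):
    print(site.swissprot_id, site.position, site.description, site.bit_score)
```

For a downloaded file, the same function can be given an open file handle in place of the `io.StringIO` object.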
The non-redundant PTM test set built with the proposed benchmark
To compare the predictive performance of different prediction tools for the same type of PTM, a unified positive set and negative set are needed. A benchmark non-redundant PTM dataset was constructed following the method of MeMo. To avoid overestimating performance, protein sequences containing the same type of PTM site were clustered by BLASTCLUST at a threshold of 30% identity. If two protein sequences shared more than 30% identity, we re-aligned their fragment sequences, each a window of 2n+1 residues centered on the modified site, with BL2SEQ. If two PTM fragment sequences were 100% identical and the PTM sites were at the same position in the corresponding whole proteins, only one site was kept and the other was discarded. Finally, more than 10 types of PTM, each with more than 30 experimentally verified sites, were included in the benchmark non-redundant PTM test set. The benchmark was also used to construct the negative set.
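The final filtering step can be sketched as follows. This is a simplified illustration, not the pipeline used to build the benchmark: cluster labels are assumed to come from a separate clustering run, the BL2SEQ re-alignment is replaced by an exact string comparison of the fragments (which is what 100% identity amounts to for equal-length windows), "same position" is approximated by the same residue number, and padding with `-` near sequence ends is an added convention. All names are hypothetical.

```python
from typing import List, NamedTuple


class Site(NamedTuple):
    cluster_id: int   # hypothetical label from a prior clustering step
    protein_id: str
    position: int     # 1-based position of the modified residue
    sequence: str     # full protein sequence


def fragment(sequence: str, position: int, n: int) -> str:
    """Return the window of 2n+1 residues centered on the 1-based position.

    Positions that fall outside the sequence are padded with '-'.
    """
    idx = position - 1
    left = sequence[max(0, idx - n):idx]
    right = sequence[idx + 1:idx + 1 + n]
    return left.rjust(n, "-") + sequence[idx] + right.ljust(n, "-")


def remove_redundant_sites(sites: List[Site], n: int = 6) -> List[Site]:
    """Keep the first site for each (cluster, position, fragment) combination."""
    seen = set()
    kept = []
    for site in sites:
        key = (site.cluster_id, site.position,
               fragment(site.sequence, site.position, n))
        if key not in seen:
            seen.add(key)
            kept.append(site)
    return kept


# Hypothetical sequences: seq_b differs from seq_a only far from position 10.
seq_a = "MGSNKSKPKDASQRRRSLEPAENVHGAGG"
seq_b = seq_a[:20] + "TTTTTTTTT"
sites = [
    Site(1, "A", 10, seq_a),
    Site(1, "B", 10, seq_b),   # same cluster, position, and fragment as A
    Site(1, "C", 17, seq_a),
    Site(2, "D", 10, seq_a),   # different cluster, so it is kept
]
print([s.protein_id for s in remove_redundant_sites(sites)])
```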
All types of PTM are categorized by the modified amino acid, and both the positive and negative sets are provided in tab-delimited format. The positive set contains the Swiss-Prot ID, modified position, PTM description, and the sequence from 6 amino acids upstream to 6 amino acids downstream of the modified site. For PTM types that occur at the protein N-terminus or C-terminus, the extracted sequence window is instead 0 ~ +10 or -10 ~ 0 (position 0 is the modified site), respectively.
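The window rules above can be expressed compactly. The sketch below assumes 1-based modified positions and simply truncates a window that runs past the end of the sequence; the truncation behavior, the location labels, and the example sequence are assumptions made for illustration.

```python
# (upstream, downstream) residue counts around the modified site (position 0).
WINDOWS = {
    "internal": (6, 6),      # -6 ~ +6
    "n_terminal": (0, 10),   #  0 ~ +10
    "c_terminal": (10, 0),   # -10 ~ 0
}


def site_window(sequence: str, position: int, location: str = "internal") -> str:
    """Return the sequence window around a 1-based modified position."""
    upstream, downstream = WINDOWS[location]
    idx = position - 1
    if not 0 <= idx < len(sequence):
        raise ValueError("position lies outside the sequence")
    start = max(0, idx - upstream)
    return sequence[start:idx + downstream + 1]


# Hypothetical 24-residue sequence.
seq = "MGCVQCKDKEATKLTEERDGSLNQ"
print(site_window(seq, 13))                # 13 residues centered on position 13
print(site_window(seq, 2, "n_terminal"))   # modified site plus 10 downstream
print(site_window(seq, 24, "c_terminal"))  # 10 upstream plus modified site
```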
PTM Type | Substrate | Download for Windows | Download for Linux |
N-linked Glycosylation | Asparagine | | |
O-linked Glycosylation | Lysine, serine, and threonine | | |
C-linked Glycosylation | Tryptophan | | |
Phosphorylation | Serine, threonine, tyrosine, and histidine (with the annotation of catalytic kinase) | | |
Acetylation | Alanine, glycine, lysine, methionine, serine, and threonine | | |
Methylation | Lysine and arginine | | |
Myristoylation | N-myristoyl glycine | | |
Palmitoylation | N-palmitoyl cysteine and S-palmitoyl cysteine | | |
Farnesylation | S-farnesyl cysteine | | |
Carboxylation | 4-carboxyglutamate | | |
Sulfation | Tyrosine | | |
Ubiquitylation | Glycyl lysine isopeptide (Lys-Gly)(interchain with G-Cter ubiquitin) | | |
Sumoylation | Glycyl lysine isopeptide (Lys-Gly)(interchain with G-Cter in SUMO) | | |