The statistics of external PTM resources in dbPTM 2.0
Four external biological databases containing protein post-translational modification information, Swiss-Prot, Phospho.ELM, O-GLYCBASE, and UbiProt, are integrated into the proposed resource. As shown in the table below, release 53.0 of Swiss-Prot contributes 17,957 experimentally validated PTM sites within 8,086 proteins, and 124,933 putative PTM sites (annotated as "by similarity", "potential" or "probable" in the 'MOD_RES' fields) within 29,356 proteins. Phospho.ELM entries store information about substrate proteins for which the exact positions of the residues phosphorylated by cellular kinases are known. A total of 13,612 experimentally verified phosphorylation sites within 4,419 proteins were obtained from Phospho.ELM version 6.0. O-GLYCBASE version 6.0 provides 242 glycoproteins containing 2,765 experimentally verified O-linked, N-linked, and C-linked glycosylation sites. Of these, 185 glycoproteins correspond to Swiss-Prot proteins and carry 2,353 experimentally verified glycosylation sites. In addition, UbiProt, a more recent PTM database, stores 417 ubiquitylated proteins, which contain 165 ubiquitylation sites.
Resource | Description | Version | Statistics |
Swiss-Prot | Experimental post-translational modifications (PTMs) | Release 53 | 17,957 PTM sites within 8,086 proteins |
Swiss-Prot | Putative PTMs | Release 53 | 124,933 PTM sites within 29,356 proteins |
Phospho.ELM | Experimental phosphorylation sites | Release 6.0 | 13,612 phosphorylation sites within 4,419 proteins |
O-GLYCBASE | Experimental glycosylation sites | Release 6.0 | 2,353 PTM sites within 185 glycoproteins |
UbiProt | Ubiquitylated proteins and ubiquitylation sites | Release 1.0 | 417 proteins containing 165 ubiquitylation sites |
The statistics of experimental and putative PTM sites in dbPTM 2.0
The statistics of the experimental and putative PTM sites in dbPTM 2.0 are shown in Table 1. The PTM sites collected from Swiss-Prot, Phospho.ELM, O-GLYCBASE and UbiProt were categorized by PTM type; after removing redundancy, the number of non-redundant sites of each PTM type was calculated. In total, dbPTM 2.0 contains 36,466 experimentally verified PTM sites; the type with the largest number of experimental sites is phosphorylation, with 22,363 sites. To provide potential PTM information for proteins without PTM annotations, 20 types of PTM sites were predicted against Swiss-Prot protein sequences using the learned HMMs at a specificity threshold of 100%, yielding 2,860,047 predicted sites in total. Phosphorylation, which involves many types of kinase, also has the largest number of predicted sites.
PTM Type | Substrate | Number of experimental sites | Number of putative sites from Swiss-Prot | Number of putative sites by HMMs |
N-linked Glycosylation | Asparagine and lysine | 3,036 | 72,125 | 479,955 |
O-linked Glycosylation | Lysine, proline, serine, threonine, and tyrosine | 1,896 | 2,558 | 386,545 |
C-linked Glycosylation | Tryptophan | 49 | 31 | 4,015 |
Phosphorylation | Serine, threonine, tyrosine, aspartate, histidine or cysteine | 22,363 | 27,200 | 1,815,472 |
Acetylation | N-terminal of some residues and side chain of lysine or cysteine | 2,071 | 5,143 | 1,206 |
Amidation | Generally at the C-terminal of a mature active peptide after oxidative cleavage of last glycine | 2,150 | 1,117 | 24,352 |
Hydroxylation | Generally of asparagine, aspartate, proline or lysine | 1,033 | 1,074 | 9,743 |
Methylation | Generally of N-terminal phenylalanine, side chain of lysine, arginine, histidine, asparagine or glutamate, and C-terminal cysteine | 746 | 2,846 | 18,716 |
Pyrrolidone Carboxylic Acid | N-terminal glutamine which has formed an internal cyclic lactam | 598 | 584 | 12,322 |
Gamma-Carboxyglutamic Acid | 4-carboxyglutamate | 371 | 361 | 1,924 |
Farnesylation | S-farnesyl cysteine | 61 | 216 | 5,349 |
Myristoylation | N-myristoyl glycine | 108 | 765 | 10,998 |
Palmitoylation | N-palmitoyl cysteine and S-palmitoyl cysteine | 210 | 3,582 | 27,841 |
Geranyl-geranylation | S-geranylgeranyl cysteine | 47 | 819 | 14,317 |
S-diacylglycerol cysteine | S-diacylglycerol cysteine | 36 | 1,529 | 8,977 |
GPI anchoring | C-terminal asparagine, aspartate and serine | 27 | 681 | - |
Deamidation | Deamidated asparagine and deamidated glutamine (needs to be followed by a G) | 38 | 26 | 2,022 |
Sulfation | Serine, threonine, and tyrosine | 165 | 570 | 15,654 |
Sumoylation | Glycyl lysine isopeptide (Lys-Gly)(interchain with G-Cter in SUMO) | 77 | 259 | 10,342 |
Ubiquitylation | Glycyl lysine isopeptide (Lys-Gly)(interchain with G-Cter in ubiquitin) | 286 | 516 | 8,865 |
ADP-ribosylation | ADP-ribosylarginine, ADP-ribosylasparagine, ADP-ribosylcysteine, ADP-ribosylserine | 3 | 203 | - |
Formylation | Of the N-terminal methionine | 28 | 35 | - |
Citrullination | Citrulline of arginine | 27 | 91 | - |
Nitration | Nitrated tyrosine | 47 | 5 | 1,432 |
Bromination | 6'-bromotryptophan | 18 | 3 | - |
FAD | O-8alpha-FAD tyrosine, Pros-8alpha-FAD histidine, S-8alpha-FAD cysteine, and Tele-8alpha-FAD histidine | 12 | 116 | - |
S-nitrosylation | S-nitrosocysteine | 9 | 93 | - |
Others | - | 889 | 2,358 | - |
Total | - | 36,466 | 124,933 | 2,860,047 |
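The merging step described above (pool the sites from the four resources, remove redundancy, then count per PTM type) can be illustrated with a minimal Python sketch. It assumes that two records are redundant when they share the same protein, residue position, and PTM type; the source does not state the exact criterion. The function name `count_nonredundant_sites`, the record layout, and the protein identifiers are hypothetical and serve only as illustration.

```python
from collections import Counter

def count_nonredundant_sites(records):
    """Count unique PTM sites per PTM type.

    records: iterable of (source, protein_id, position, ptm_type) tuples.
    Assumption: a site is identified by (protein_id, position, ptm_type),
    so the same site reported by two resources is counted once.
    """
    unique_sites = {
        (protein_id, position, ptm_type)
        for _source, protein_id, position, ptm_type in records
    }
    return Counter(ptm_type for _protein, _position, ptm_type in unique_sites)

# Toy records with made-up protein identifiers (not real accessions).
records = [
    ("Swiss-Prot", "PROT_A", 15, "Phosphorylation"),
    ("Phospho.ELM", "PROT_A", 15, "Phosphorylation"),  # same site, second source
    ("Phospho.ELM", "PROT_A", 42, "Phosphorylation"),
    ("O-GLYCBASE", "PROT_B", 7, "O-linked Glycosylation"),
    ("UbiProt", "PROT_C", 101, "Ubiquitylation"),
]

counts = count_nonredundant_sites(records)
for ptm_type, n_sites in sorted(counts.items()):
    print(f"{ptm_type}: {n_sites}")
```

In this toy input the phosphorylation site at position 15 of `PROT_A` appears in two resources but contributes a single non-redundant site.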
The statistics of experimental PTM sites in dbPTM 1.0 and 2.0
PTM type | Substrate | No. of experimental PTM sites in dbPTM 1.0 | No. of experimental PTM sites in dbPTM 2.0 |
Glycosylation | N-linked, O-linked, and C-linked glycosylation | 4,586 | 4,951 |
Phosphorylation | Serine, threonine, tyrosine, aspartate, histidine or cysteine | 3,367 | 22,363 |
Acetylation | N-terminal of some residues and side chain of lysine or cysteine | 1,019 | 2,071 |
Methylation | Generally of N-terminal phenylalanine, side chain of lysine, arginine, histidine, asparagine or glutamate, and C-terminal cysteine | 613 | 746 |
Lipidation | S-farnesyl cysteine, N-terminal myristoylation, palmitoylation, GPI anchoring | 520 | 646 |
Hydroxylation | Generally of asparagine, aspartate, proline or lysine | 816 | 1,033 |
Dihydroxylation | 3,4-dihydroxyarginine, 3,4-dihydroxyproline, 4,5-dihydroxylysine | 180 | 182 |
Deamidation | Deamidated asparagine and deamidated glutamine (needs to be followed by a G) | 33 | 38 |
S-Nitrosylation | S-nitrosocysteine | 5 | 9 |
Amidation | Generally at the C-terminal of a mature active peptide after oxidative cleavage of last glycine | 1,554 | 2,150 |
Sulfation | Serine, threonine, and tyrosine | 144 | 165 |
Sumoylation | Glycyl lysine isopeptide (Lys-Gly)(interchain with G-Cter in SUMO) | 22 | 77 |
Ubiquitylation | Glycyl lysine isopeptide (Lys-Gly)(interchain with G-Cter in ubiquitin) | 178 | 286 |
ADP-ribosylation | ADP-ribosylarginine, ADP-ribosylasparagine, ADP-ribosylcysteine, ADP-ribosylserine | 104 | 3 |
Pyrrolidone Carboxylic Acid | N-terminal glutamine which has formed an internal cyclic lactam | 567 | 598 |
Gamma-Carboxyglutamic Acid | 4-carboxyglutamate | 343 | 371 |
FAD | O-8alpha-FAD tyrosine, Pros-8alpha-FAD histidine, S-8alpha-FAD cysteine, and Tele-8alpha-FAD histidine | 12 | 12 |
Formylation | Of the N-terminal methionine | 35 | 28 |
Citrullination | Citrulline | 7 | 27 |
S-diacylglycerol cysteine | S-diacylglycerol cysteine | 48 | 36 |
Others | - | 328 | 889 |
Total | - | 14,589 | 36,466 |
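To make the comparison between the two releases easier to read, the short Python sketch below computes the absolute change and the ratio of experimental site counts for a few PTM types. The counts are copied from the dbPTM 1.0 and dbPTM 2.0 figures listed above; the dictionary and the helper name `change` are illustrative choices, not part of dbPTM.

```python
# (dbPTM 1.0, dbPTM 2.0) experimental site counts for selected PTM types.
experimental_sites = {
    "Phosphorylation": (3367, 22363),
    "Acetylation": (1019, 2071),
    "ADP-ribosylation": (104, 3),
    "Total": (14589, 36466),
}

def change(count_v1, count_v2):
    """Return (absolute difference, ratio) of version 2.0 relative to 1.0."""
    return count_v2 - count_v1, count_v2 / count_v1

changes = {name: change(*pair) for name, pair in experimental_sites.items()}

for name, (difference, ratio) in changes.items():
    print(f"{name}: {difference:+,d} sites ({ratio:.2f}x)")
```

The total grows roughly 2.5-fold, driven mainly by phosphorylation, while a few types such as ADP-ribosylation show fewer experimental sites in the newer release.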