The statistics of experimental and putative PTM sites in dbPTM

Because the contents of several online PTM resources are not directly accessible, a total of eleven PTM-related biological databases are integrated in dbPTM. To resolve the heterogeneity among data collected from different sources, the reported modification sites are mapped to UniProtKB protein entries by sequence comparison. Given the high throughput of mass spectrometry-based methods in post-translational proteomics, this update also includes manually curated MS/MS-identified peptides associated with PTMs, collected from research articles through a literature survey.

First, a table of PTM-related keywords is constructed by referring to the UniProtKB PTM list and the annotations of RESID. Then, all fields in the PubMed database are searched using the keywords in this table, and the full text of the matching research articles is downloaded. To cover the various proteomic identification experiments, a text-mining system is developed to survey full-text literature that potentially describes the site-specific identification of modified sites. Furthermore, to determine the locations of PTMs on a full-length protein sequence, the experimentally verified MS/MS peptides are mapped to UniProtKB protein entries based on their database identifier (ID) and sequence identity. During data mapping, MS/MS peptides that cannot be aligned exactly to a protein sequence are discarded. Finally, each mapped PTM site is annotated with its corresponding literature reference (PubMed ID).

All PTM types are categorized by the modified amino acid, and each is provided as a positive set in tab-delimited format. Each positive-set record contains the UniProt ID, the modified position, the PTM description, and the sequence window from 6 amino acids upstream to 6 amino acids downstream of the modified site. However, for PTM types that occur at the protein N-terminus or C-terminus, the sequences are extracted with a window of 0 to +10 or -10 to 0, respectively (position 0 is the modified site).
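The mapping and window-extraction steps described above can be illustrated with a minimal Python sketch. It is not the dbPTM pipeline itself: the function names (`map_peptide`, `extract_window`, `terminal_window`), the example sequence, and the use of `-` to pad windows that run past the ends of the protein are all assumptions made for illustration, since the text does not say how truncated windows are handled.

```python
from typing import List


def map_peptide(protein_seq: str, peptide: str, site_in_peptide: int) -> List[int]:
    """Return the 1-based protein position of the modified residue for every
    exact occurrence of the peptide. An empty list means the peptide does not
    align exactly and would be discarded."""
    positions = []
    start = protein_seq.find(peptide)
    while start != -1:
        # start is 0-based, site_in_peptide is 1-based, so the sum is 1-based.
        positions.append(start + site_in_peptide)
        start = protein_seq.find(peptide, start + 1)
    return positions


def extract_window(protein_seq: str, position: int, flank: int = 6, pad: str = "-") -> str:
    """Return the window from -flank to +flank around a 1-based position.
    Padding with `pad` near the termini is an assumption of this sketch."""
    idx = position - 1
    left = protein_seq[max(0, idx - flank):idx]
    right = protein_seq[idx + 1:idx + 1 + flank]
    return left.rjust(flank, pad) + protein_seq[idx] + right.ljust(flank, pad)


def terminal_window(protein_seq: str, position: int, terminus: str, length: int = 10) -> str:
    """Return the 0..+length window for N-terminal PTMs or the -length..0
    window for C-terminal PTMs (position 0 is the modified site)."""
    idx = position - 1
    if terminus == "N":
        return protein_seq[idx:idx + length + 1]
    if terminus == "C":
        return protein_seq[max(0, idx - length):idx + 1]
    raise ValueError("terminus must be 'N' or 'C'")


# Hypothetical protein sequence and peptide, used only to exercise the functions.
protein = "MKTAYSPLKQWERTGHIL"
sites = map_peptide(protein, "AYSPLK", site_in_peptide=3)   # modified S in the peptide
windows = [extract_window(protein, pos) for pos in sites]
```

A peptide that differs from the protein sequence by even one residue yields an empty list from `map_peptide`, which corresponds to the discard rule stated above.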

| PTM Type | Number of experimental sites | Number of literature references | Download |
|---|---|---|---|
| Phosphorylation | 571,032 | 52,381 | Windows, MAC / Linux |
| Acetylation | 137,442 | 21,251 | Windows, MAC / Linux |
| Ubiquitination | 118,495 | 1,130 | Windows, MAC / Linux |
| Succinylation | 17,596 | 62 | Windows, MAC / Linux |
| Methylation | 17,483 | 8,806 | Windows, MAC / Linux |
| Malonylation | 8,736 | 14 | Windows, MAC / Linux |
| N-linked Glycosylation | 7,916 | 1,842 | Windows, MAC / Linux |
| O-linked Glycosylation | 6,340 | 3,785 | Windows, MAC / Linux |
| Sumoylation | 5,450 | 178 | Windows, MAC / Linux |
| S-nitrosylation | 4,203 | 324 | Windows, MAC / Linux |
| Glutathionylation | 4,161 | 92 | Windows, MAC / Linux |
| Amidation | 2,907 | 896 | Windows, MAC / Linux |
| Hydroxylation | 1,725 | 285 | Windows, MAC / Linux |
| Pyrrolidone carboxylic acid | 908 | 529 | Windows |
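As a hedged illustration of how a downloaded positive set might be read, the sketch below parses tab-delimited records with Python's standard `csv` module. The column order (UniProt ID, modified position, PTM description, sequence window) follows the description of the positive set above, but the exact file layout, the absence of a header row, the `PtmSite` and `read_positive_set` names, and the sample record are assumptions made for this example only.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class PtmSite(NamedTuple):
    uniprot_id: str
    position: int      # 1-based position of the modified residue
    ptm: str           # PTM description
    window: str        # sequence window around the modified site


def read_positive_set(handle: TextIO) -> Iterator[PtmSite]:
    """Yield one PtmSite per tab-delimited record, skipping lines that do not
    have four fields or whose position field is not an integer."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 4:
            continue
        uniprot_id, position, ptm, window = row[:4]
        if not position.isdigit():
            continue  # e.g. a header line, if the file has one
        yield PtmSite(uniprot_id, int(position), ptm, window)


# Hypothetical record, not taken from dbPTM.
sample = "EXAMPLE_1\t6\tPhosphoserine\t-MKTAYSPLKQWE\n"
records = list(read_positive_set(io.StringIO(sample)))
```

For a real file, the same function can be called on a handle from `open(path, newline="")`, where `path` points at one of the downloaded archives after extraction.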