The PTM-related Information

About 5% of Swiss-Prot proteins have known tertiary structures in the PDB. For proteins without known tertiary structures, two previously published tools, RVP-net (Shandar Ahmad, et al., 2003) and PSIPRED (McGuffin LJ, et al., 2000), were applied to predict solvent accessibility and secondary structure, respectively. RVP-net is a feed-forward neural network that predicts a real value, ranging from 0% to 100%, for the accessible surface area (ASA) of each amino acid residue based on its neighborhood information. We applied RVP-net to predict the real-valued ASA for the amino acid residues of all Swiss-Prot proteins.

PTM-related Information
Download
for Windows
for Mac/Linux
The experimentally verified PTM sites
The predicted accessible surface areas (ASA) of Swiss-Prot proteins
The predicted secondary structures of Swiss-Prot proteins



The Benchmark Data Set for PTM Analyses

Because MS/MS-based experiments are labor-intensive, a variety of computational methods have been proposed to identify putative PTM sites from protein sequence. With so many PTM prediction methods available, it is difficult to determine the best prediction tool from cross-validation performance alone. Although most of these studies report independent testing results for their prediction methods, there is no standard dataset for comparing the predictive power of the various PTM prediction tools. Therefore, this update compiles non-homologous benchmark datasets for evaluating the predictive power of PTM site prediction tools. The results offer guidance to users who need to predict PTM sites with high sensitivity (Sn), high specificity (Sp), or balanced Sn and Sp.
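As a reminder of how Sn and Sp are usually computed when a prediction tool is run against a benchmark set, here is a minimal Python sketch. It uses the standard confusion-matrix definitions; the function name and the counts in the example are hypothetical and are not taken from dbPTM.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (Sn, Sp) from confusion-matrix counts.

    tp/fn: known PTM sites predicted as modified / missed.
    tn/fp: non-PTM sites predicted as unmodified / wrongly predicted as modified.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative site")
    sn = tp / (tp + fn)  # fraction of true PTM sites that were recovered
    sp = tn / (tn + fp)  # fraction of non-PTM sites that were correctly rejected
    return sn, sp


# Hypothetical counts, for illustration only.
sn, sp = sensitivity_specificity(tp=80, fn=20, tn=900, fp=100)
print(f"Sn = {sn:.2f}, Sp = {sp:.2f}")
```

A tool tuned for high Sp accepts fewer false positives at the cost of missing more true sites; a tool tuned for high Sn does the opposite.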

PTM Type
Download
for Windows
for Mac/Linux
Phosphorylation by PKA
Phosphorylation by CK2
Phosphorylation by PKC
Phosphorylation by CDK1
Phosphorylation by PKC
N-linked Glycosylation
Acetylation
O-linked Glycosylation
Amidation
Hydroxylation
Methylation
S-nitrosylation
N6-succinyllysine



The experimentally verified PTM sites

Because the contents of several online PTM resources are not readily accessible, a total of eleven PTM-related biological databases are integrated in dbPTM. To resolve the heterogeneity among data collected from different sources, the reported modification sites are mapped to UniProtKB protein entries by sequence comparison. Given the high throughput of mass spectrometry-based methods in post-translational proteomics, this update also includes manually curated MS/MS-identified peptides associated with PTMs, collected from research articles through a literature survey. First, a table of PTM-related keywords is constructed by referring to the UniProtKB/Swiss-Prot PTM list (http://www.uniprot.org/docs/ptmlist.txt) and the annotations of RESID. Then, all fields in the PubMed database are searched using the keywords in this table, and the full text of the matching research articles is downloaded. To cover the various proteomic identification experiments, a text-mining system is developed to survey the full-text literature for articles that potentially describe the site-specific identification of modified sites. Furthermore, to determine the locations of PTMs on a full-length protein sequence, the experimentally verified MS/MS peptides are mapped to UniProtKB protein entries based on their database identifier (ID) and sequence identity. During data mapping, MS/MS peptides that cannot be aligned exactly to a protein sequence are discarded. Finally, each mapped PTM site is annotated with the corresponding literature reference (PubMed ID).
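The peptide-to-protein mapping step can be illustrated with a minimal Python sketch. It assumes that "align exactly" means the peptide occurs as an exact substring of the protein sequence; the function name, the 0-based offset convention, and the example sequences are hypothetical and do not reproduce the dbPTM pipeline.

```python
def map_peptide_sites(protein_seq: str, peptide: str, modified_offsets: list[int]) -> list[int]:
    """Map modified residues in a peptide to 1-based protein positions.

    modified_offsets are 0-based indices within the peptide.
    An empty list means the peptide has no exact match and would be discarded.
    """
    positions = []
    start = protein_seq.find(peptide)
    while start != -1:  # collect every exact occurrence of the peptide
        positions.extend(start + off + 1 for off in modified_offsets)
        start = protein_seq.find(peptide, start + 1)
    return sorted(positions)


# Made-up sequences, for illustration only.
protein = "MKTAYSPEKRSPLK"
print(map_peptide_sites(protein, "SPEK", [0]))  # exact match: site mapped
print(map_peptide_sites(protein, "SPXK", [0]))  # no exact match: discarded
```

A peptide that occurs more than once in a protein yields several candidate positions; how such ambiguous cases are resolved is not described in the text above.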
      All PTM types are categorized by the modified amino acid, and the positive set for each is provided in tab-delimited format. Each positive-set entry contains the Swiss-Prot ID, the modified position, the PTM description, and the sequence from 6 amino acids upstream to 6 amino acids downstream of the modified site. However, for PTM types that occur at the protein N-terminus or C-terminus, the sequences are extracted with a window of 0 to +10 or -10 to 0, respectively (position 0 is the modified site).
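The window extraction described above can be sketched in Python as follows. The function name and example sequence are hypothetical, and truncating the window at the sequence ends (rather than padding it) is an assumption, since the text does not say how short flanks are handled.

```python
def extract_window(seq: str, pos: int, upstream: int = 6, downstream: int = 6) -> str:
    """Return the residues around a 1-based modified position.

    Assumption: a window that runs past either end of the sequence is truncated.
    """
    i = pos - 1  # convert to a 0-based index
    if not 0 <= i < len(seq):
        raise ValueError("position lies outside the sequence")
    return seq[max(0, i - upstream): i + downstream + 1]


seq = "ACDEFGHIKLMNPQRSTVWY"  # made-up 20-residue sequence
internal = extract_window(seq, 10)                          # -6 .. +6 around the site
n_term = extract_window(seq, 1, upstream=0, downstream=10)  # 0 .. +10
c_term = extract_window(seq, len(seq), upstream=10, downstream=0)  # -10 .. 0
print(internal, n_term, c_term)
```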

PTM Type | Substrate | Download (for Windows / for Mac/Linux)
N-linked Glycosylation | Asparagine
O-linked Glycosylation | Lysine, serine, and threonine
C-linked Glycosylation | Tryptophan
Phosphorylation | Serine, threonine, tyrosine, and histidine
Acetylation | Alanine, glycine, lysine, methionine, serine, and threonine
Methylation | Lysine and arginine
Myristoylation | N-myristoyl glycine
Palmitoylation | N-palmitoyl cysteine and S-palmitoyl cysteine
Prenylation | S-farnesyl cysteine
Carboxylation | 4-carboxyglutamate
Sulfation | Tyrosine
Ubiquitylation | Glycyl lysine isopeptide (Lys-Gly)(interchain with G-Cter ubiquitin)
Sumoylation | Glycyl lysine isopeptide (Lys-Gly)(interchain with G-Cter in SUMO)
S-Nitrosylation | S-nitrosylated cysteine



The HMM Predicted PTM sites

In this update, a KinasePhos-like method was applied to 20 types of PTM that have enough experimentally verified sites (more than 30 sites). To reduce the number of false positive predictions, we set the predictive parameters to the values at which the prediction specificity is 100% and then scanned all Swiss-Prot protein sequences for potential PTM sites. Each predicted PTM site is reported with its Swiss-Prot ID, modified location, PTM description, HMMER bit score, and HMMER E-value, in tab-delimited format.
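For readers who want to load the predicted sites programmatically, the following Python sketch parses the five tab-delimited fields listed above. It assumes the columns appear in the order given in the text and that the file has no header row; the class name, function name, and sample record are hypothetical.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class PredictedSite(NamedTuple):
    swissprot_id: str
    position: int
    description: str
    bit_score: float
    e_value: float


def read_predicted_sites(handle: TextIO) -> Iterator[PredictedSite]:
    """Yield one PredictedSite per tab-delimited line (assumed column order, no header)."""
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 5:
            continue  # skip blank or incomplete lines
        yield PredictedSite(row[0], int(row[1]), row[2], float(row[3]), float(row[4]))


# Made-up record, for illustration only.
sample = io.StringIO("EXAMPLE_HUMAN\t15\tPhosphoserine\t12.3\t1.5e-05\n")
sites = list(read_predicted_sites(sample))
print(sites[0])
```

To read a downloaded file instead of the in-memory sample, pass an open text file handle to `read_predicted_sites`.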

PTM Type | Description | Download (for Windows / for Mac/Linux)
N-linked glycosylation | Asparagine
O-linked glycosylation | Lysine, serine, and threonine
C-linked glycosylation | Tryptophan
Acetylation | Alanine, lysine, methionine, and serine
Phosphorylation | Serine, threonine, tyrosine, and histidine
Methylation | Lysine and arginine
Amidation | Asparagine, glycine, isoleucine, leucine, methionine, phenylalanine, proline, and tyrosine
Palmitoylation | S-palmitoyl cysteine
Prenylation | S-farnesyl cysteine
Myristoylation | N-myristoyl glycine
Sulfation | Sulfotyrosine