Last updated: 2010/04/30 08:16:37
About WoLF PSORT
WoLF PSORT predicts the subcellular localization sites of proteins
based on their amino acid sequences. The method, which is a major
extension to the venerable PSORTII program, makes predictions based on
both known sorting signal motifs and some correlative sequence
features such as amino acid content. Like PSORT and PSORTII, WoLF PSORT
displays some information about detected sorting signals which is useful
in helping users determine the reliability of the prediction in specific
Our experiments (presented at APBC06) show that the overall
prediction accuracy of WoLF PSORT is over 80%. For common localization
sites (e.g. cytosol, nucleus, mitochondria) WoLF PSORT makes
better-than-majority-classifier predictions even for queries that do
not have strong sequence similarity to any sequence in the
dataset. Thus WoLF PSORT is a useful complement to tools such as
The current dataset used to train WoLF PSORT contains over 12,000
animal sequences and more than 2,000 each of plant and fungal
sequences. It was gathered mainly from UniProt, but several hundred
Arabidopsis thaliana sequences from the Gene Ontology database
were also included.
Paul Horton, Keun-Joon Park, Takeshi Obayashi, Naoya Fujita, Hajime Harada, C.J. Adams-Collier, & Kenta Nakai,
"WoLF PSORT: Protein Localization Predictor",
Nucleic Acids Research, doi:10.1093/nar/gkm259, 2007.
Paul Horton, Keun-Joon Park, Takeshi Obayashi & Kenta Nakai,
"Protein Subcellular Localization Prediction with WoLF PSORT",
Proceedings of the 4th Annual Asia Pacific Bioinformatics Conference APBC06, Taipei, Taiwan. pp. 39-48, 2006.
WoLF PSORT is being developed by
These people have contributed to WoLF PSORT in the past:
- Keun-Joon PARK at CBRC, now at Korea Center for Disease Control & Prevention, Initial dataset preparation
- Takeshi OBAYASHI at Tokyo Institute of Technology, now at Tokyo University Human Genome Center, Initial plant dataset preparation
- C.J. Adams-Collier of Collier Technologies, Initial server design
The dataset is based mainly on annotation from UniProt and Gene
Ontology. The table below gives a correspondence between our
localization site definitions and Gene Ontology. However, many of our
entries are based solely on UniProt "Subcellular Localization" field
keywords, and in some of these cases the site assignment may not be
completely consistent with the GO cellular component annotation.
Localization Sites and corresponding GO cellular components.
| Abbrev | Localization Site     | GO Cellular Component |
| chlo   | chloroplast           | 0009507, 0009543      |
| E.R.   | endoplasmic reticulum | 0005783               |
| extr   | extracellular         | 0005576, 0005618      |
| golg   | Golgi apparatus       | 0005794(1)            |
| plas   | plasma membrane       | 0005886               |
| vacu   | vacuolar membrane     | 0005774(2)            |
Abbreviation, Localization Site, and corresponding GO Cellular Component(s) are given
for each localization site. Numbers in parentheses, such as "0005856(2)", indicate that descendant
"part_of" cellular components were also included, up to the specified depth (2 in this case).
For example, all of the children and grandchildren of "GO:0005856" were included as "cysk".
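The depth rule described above can be sketched as a breadth-first walk that stops after a fixed number of "part_of" steps. This is only an illustration, not the code used to build the dataset: the function name `descendants_within_depth`, the dictionary layout, and the placeholder child identifiers are all assumptions made for the example, and the toy hierarchy is not real Gene Ontology data.

```python
from collections import deque


def descendants_within_depth(children, root, max_depth):
    """Return root plus every term reachable in at most max_depth part_of steps."""
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        term, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand terms already at the depth limit
        for child in children.get(term, ()):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return seen


# Hypothetical toy hierarchy: each key maps a term to its "part_of" children.
# Only "GO:0005856" comes from the text; the other names are placeholders.
toy_children = {
    "GO:0005856": ["child_a", "child_b"],
    "child_a": ["grandchild_c"],
    "grandchild_c": ["great_grandchild_d"],  # 3 steps from the root
}

# "0005856(2)": the term itself, its children, and its grandchildren.
cysk_terms = descendants_within_depth(toy_children, "GO:0005856", 2)
```

With a depth of 2, `cysk_terms` holds the root, both children, and the grandchild, while the placeholder three steps away is left out.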
Stand-alone Package
WoLF PSORT package version 0.2
was released in September 2006. It is free for academic use and relatively easy
for industrial users to use as well. Please see the package documentation for details.
Prediction Accuracy by Localization Site
The accuracy varies greatly between localization
sites; the general trend is that sites with few
UniProt-annotated proteins are seldom correctly predicted.
In a separate
localization accuracy by utility page,
we have compiled some statistics to help answer this
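To make the terms used on this page concrete, here is a minimal sketch of overall accuracy, the majority-classifier baseline, and accuracy broken down by localization site. It is not the evaluation code behind the figures quoted here. The function names and the toy labels are invented for illustration, and "accuracy by site" is assumed to mean the fraction of proteins annotated with a site that were predicted as that site.

```python
from collections import Counter


def overall_accuracy(true_sites, predicted_sites):
    """Fraction of proteins whose predicted site matches the annotated site."""
    correct = sum(t == p for t, p in zip(true_sites, predicted_sites))
    return correct / len(true_sites)


def majority_baseline_accuracy(true_sites):
    """Accuracy of always predicting the most frequent annotated site."""
    most_common_count = Counter(true_sites).most_common(1)[0][1]
    return most_common_count / len(true_sites)


def accuracy_by_site(true_sites, predicted_sites):
    """For each annotated site, the fraction of its proteins predicted correctly."""
    totals = Counter(true_sites)
    hits = Counter(t for t, p in zip(true_sites, predicted_sites) if t == p)
    return {site: hits[site] / totals[site] for site in totals}


# Made-up labels using abbreviations from the table above (not real results).
toy_true = ["plas", "plas", "plas", "extr", "extr", "golg", "vacu"]
toy_pred = ["plas", "plas", "extr", "extr", "extr", "golg", "extr"]

overall = overall_accuracy(toy_true, toy_pred)
baseline = majority_baseline_accuracy(toy_true)
per_site = accuracy_by_site(toy_true, toy_pred)
```

In this toy data the rare site "vacu" has a single protein and it is mispredicted, so its per-site accuracy is zero even though the overall accuracy is well above the baseline.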
What's in a name
"WoLF" does not necessarily stand for anything. A rather dramatic
mnemonic would be "Where Life Functions". Originally it was going to
be "Learned Weight Features" but I wanted the acronym to be a
pronounceable English word. Women only Love Fools.
[Slides] from a presentation
introducing the issues involved in protein localization prediction.
- WoLF PSORT relies heavily on features inherited from PSORT (Nakai & Kanehisa)
- WoLF PSORT also uses some sequence features from iPSORT (Bannai
- The original server design was done by C.J. Collier. (But he is not to blame for subsequent hacking...)
Several evaluations have been done on the prediction accuracy of various
protein subcellular localization methods, including WoLF PSORT. In some,
WoLF PSORT has performed very poorly, but in others very well. To bias the
reader, I list two here in which WoLF PSORT performed well:
- W. Qian & J. Zhang, Genome Biology and Evolution, 198-204, 2009.
Comparing methods that do not rely strongly on sequence similarity for predicting
the subcellular localization of yeast proteins, they found WoLF PSORT to perform
the best (even so, it only attained an accuracy of 45%; MultiLoc was second at 36%).
- E.W. Klee & C.P. Sosa, Drug Discovery Today, 12:234-40, 2007.
They evaluated several programs on predicting human secreted proteins and concluded
that WoLF PSORT was the best overall.
Copyright (C) National Institute of Advanced Industrial Science and Technology (AIST), Computational Biology Research Center (CBRC). All Rights Reserved.