NAME

runWolfPsortSummary - Run WoLF PSORT subcellular localization prediction on input sequences and print in summary mode.


SYNOPSIS

runWolfPsortSummary [OPTIONS] organismType

runWolfPsortSummary (--usage|--help|--man)

Sequences are read from standard input.


DESCRIPTION

Run WoLF PSORT subcellular localization prediction on input sequences and print the results in summary mode. Input is expected in FASTA format on standard input.

The output looks roughly like

  seq1 extr_plas: 11.5, plas: 11, extr: 10, E.R.: 4, lyso: 4, pero: 1.5, cyto_pero: 1.5, vacu: 1
  seq2 extr: 25, lyso: 3, plas: 2, nucl: 1, E.R.: 1
  seq3 extr: 31, lyso: 1

Each line contains several localization classes with their scores. The localization classes are:

        abbrev.  site                         GO cellular component number
        extr     extracellular                0005576, 0005618
        cysk     cytoskeleton                 0005856
        cyto     cytosol (sans cytoskeleton)  0005829
        E.R.     endoplasmic reticulum        0005783
        golg     Golgi apparatus              0005794
        mito     mitochondria                 0005739
        nucl     nucleus                      0005634
        plas     plasma membrane              0005886
        pero     peroxisome                   0005777
        vacu     vacuolar membrane            0005774
        chlo     chloroplast                  0009507, 0009543
        lyso     lysosome                     0005764

The GO cellular component number is given here for reference, but most entries in our current dataset are actually based on UniProt and depend on its annotation. Localization classes containing an underscore indicate the possibility of localizing to two sites; for example, ``cyto_nucl'' indicates proteins which can localize to the cytosol, the nucleus, or both. No distinction is made between conditional and constitutive dual localization.
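The summary lines can be read back into a program with a few lines of code. The following is a minimal sketch in Python, assuming the output matches the sample above exactly (a sequence identifier, a space, then comma-separated ``class: score'' pairs); the function names are hypothetical and not part of WoLF PSORT, and output produced with options such as --print-sequence-description may need different handling.

```python
def parse_summary_line(line):
    """Split one summary line into (sequence id, {class: score})."""
    # The sequence identifier is everything before the first space.
    seq_id, _, rest = line.strip().partition(" ")
    scores = {}
    for item in rest.split(","):
        # Split on the last colon so the class name is kept whole.
        name, _, value = item.rpartition(":")
        scores[name.strip()] = float(value)
    return seq_id, scores


def dual_classes(scores):
    """Return the classes whose name joins two sites with an underscore."""
    return [name for name in scores if "_" in name]


sample = ("seq1 extr_plas: 11.5, plas: 11, extr: 10, E.R.: 4, "
          "lyso: 4, pero: 1.5, cyto_pero: 1.5, vacu: 1")
seq_id, scores = parse_summary_line(sample)
print(seq_id, scores)
print(dual_classes(scores))
```

Splitting on the last colon rather than the first keeps abbreviations such as E.R. intact, since they contain periods but no colons.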

In the output lines, the number after each site roughly corresponds to the number of nearest neighbors which have that localization site. In this example, 25 of the nearest neighbors of seq2 are labeled as extracellular proteins in the dataset. More exactly, the number after a dual class such as ``extr_plas'' is a function of the neighbor counts for the related sites, and in this case is

  #extr_plas + 0.5 * #extr + 0.5 * #plas
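As a worked instance of the formula above, the sketch below computes the score of a dual class from three neighbor counts. It is written in Python; the function name and the counts are hypothetical, chosen only for illustration, and are not taken from an actual WoLF PSORT run.

```python
def dual_site_score(n_dual, n_first, n_second):
    """Score of a dual class: neighbors labeled with the dual class count
    fully, neighbors labeled with either single site count half."""
    return n_dual + 0.5 * n_first + 0.5 * n_second


# Hypothetical counts: 1 neighbor labeled extr_plas, 10 labeled extr,
# 11 labeled plas.
score = dual_site_score(n_dual=1, n_first=10, n_second=11)
print(score)  # 1 + 5.0 + 5.5 = 11.5
```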


OPTIONS

-n, --just-print
Print the commands that would be executed without actually executing them. Mainly useful for debugging. Mnemonic: like make -n.

-f, --just-compute-features
Only perform the conversion from sequence to localization features, without predicting localization. Mainly useful for debugging.

--print-neighbors
Output nearest neighbors used to make the prediction.

-d, --print-sequence-description
Output sequence descriptions taken from fasta sequence input along with predictions.


ARGUMENTS

organismType
Type of the organism. Currently supported organism types are: ``animal'', ``plant'', and ``fungi''. This determines which dataset is used for the prediction. Note that the software does not check whether the organism type matches the actual source organism of the protein, although the results for a mismatched type may not be interesting.


EXAMPLES

cat human.fasta mouse.fasta | runWolfPsortSummary animal

runWolfPsortSummary fungi < orfs.fasta
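The same invocation can be driven from a script. The following is a minimal sketch in Python, assuming that runWolfPsortSummary is on the PATH and that the organismType argument required by the synopsis is passed last; the helper names build_command and run_summary are hypothetical. Only the option --print-sequence-description documented above is used.

```python
import subprocess

# Organism types listed under ARGUMENTS.
ORGANISM_TYPES = ("animal", "plant", "fungi")


def build_command(organism_type, print_description=False):
    """Build the argument list for one runWolfPsortSummary call."""
    if organism_type not in ORGANISM_TYPES:
        raise ValueError("unsupported organism type: " + organism_type)
    command = ["runWolfPsortSummary"]
    if print_description:
        command.append("--print-sequence-description")
    command.append(organism_type)
    return command


def run_summary(fasta_path, organism_type):
    """Feed a FASTA file on standard input and return the output lines.

    Assumes runWolfPsortSummary is installed and on the PATH.
    """
    with open(fasta_path) as fasta:
        result = subprocess.run(
            build_command(organism_type),
            stdin=fasta,
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.splitlines()


print(build_command("animal"))
```

Validating the organism type before launching the program gives an early error for a misspelled type instead of relying on the tool's own error handling.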


FILES

../data/animal.psort
../data/fungi.psort
../data/plant.psort
Dataset sequence data with localization site labels

../data/animal.wolff
../data/fungi.wolff
../data/plant.wolff
Dataset localization feature values

../data/animal.wolfw
../data/fungi.wolfw
../data/plant.wolfw
Feature weights

../data/animal.wolfu
../data/fungi.wolfu
../data/plant.wolfu
Utility matrix. Stipulates the value of predicting a protein of localization class A to be of class B.


AUTHOR

Paul Horton horton-p AT aist.go.jp


COPYRIGHT

This Script: Copyright (C) 2004-2006, Paul B. Horton & C.J. Collier, All Rights Reserved.

PSORT: Copyright (C) 1997, 2004-2006, Kenta Nakai & Paul B. Horton, All Rights Reserved.


REFERENCE

Paul Horton, Keun-Joon Park, Takeshi Obayashi & Kenta Nakai, ``Protein Subcellular Localization Prediction with WoLF PSORT'', Proceedings of the 4th Annual Asia Pacific Bioinformatics Conference APBC06, Taipei, Taiwan. pp. 39-48, 2006.


SEE ALSO

http://wolfpsort.org/

runWolfPsortHtmlTables