FASTXY compares a DNA sequence to a protein sequence data bank
 version 3.3t04 January 25, 2000
Please cite:
 Pearson et al, Genomics (1997) 46:24-36

query.data: 68 aa
 >, 68 bp.
 vs  /local/databases/fasta/swissprot library
searching /local/databases/fasta/swissprot library

       opt      E()
< 20   223     0:==
  22     8     0:=          one = represents 151 library sequences
  24    11     0:=
  26     6     2:*
  28    17    20:*
  30    89   119:*
  32   381   460:===*
  34  1029  1248:======= *
  36  2552  2562:================*
  38  4336  4235:============================*
  40  6430  5907:=======================================*===
  42  7710  7221:===============================================*====
  44  8510  7965:====================================================*====
  46  9049  8113:=====================================================*======
  48  8202  7767:===================================================*===
  50  7102  7087:==============================================*=
  52  5977  6231:======================================== *
  54  4730  5322:================================   *
  56  3796  4446:==========================   *
  58  3003  3650:====================    *
  60  2372  2957:================   *
  62  2053  2370:============== *
  64  1797  1885:============*
  66  1353  1490:=========*
  68  1167  1172:=======*
  70   975   918:======*
  72   812   718:====*=
  74   603   560:===*
  76   481   436:==*=
  78   410   339:==*
  80   350   263:=*=
  82   238   201:=*
  84   202   159:=*
  86   161   123:*=
  88   120    95:*          inset = represents 2 library sequences
  90    93    74:*
  92    59    57:*         :============================*=
  94    44    44:*         :=====================*
  96    26    34:*         :=============   *
  98    25    26:*         :============*
 100    27    20:*         :=========*====
 102     7    16:*         :====   *
 104    12    12:*         :=====*
 106    14     9:*         :====*==
 108    14     7:*         :===*===
 110     1     6:*         := *
 112     2     4:*         :=*
 114     0     3:*         : *
 116     4     3:*         :=*
 118     4     2:*         :*=
>120     6     2:*         :*==
31411157 residues in 86593 sequences
 statistics extrapolated from 60000 to 86367 sequences
  Expectation_n fit: rho(ln(x))= 3.9164+/-0.000474; mu= 5.5093+/- 0.027;
 mean_var=30.6950+/- 6.023, 0's: 185 Z-trim: 41  B-trim: 0 in 0/64
 Kolmogorov-Smirnov  statistic: 0.0316 (N=29) at  50

FASTX (3.34 January 2000) function [optimized, BL50 matrix (15:-5)] ktup: 2
 join: 36, opt: 30, gap-pen: -15/ -2 shift: -20, width:  16
 Scan time: 22.700


The best scores are:                                       opt bits E(86367)
sp|P38398|BRC1_HUMAN BREAST CANCER TYPE 1 SUSC (1863) [f]  149   56 1.7e-07
sp|Q95153|BRC1_CANFA BREAST CANCER TYPE 1 SUSC (1878) [f]  143   54 6.8e-07
sp|P28266|NODL_RHIME NODULATION PROTEIN L (EC  ( 183) [f]   67   28     3.6
sp|P18684|DIPD_PROTE DIPTERICIN D PRECURSOR.   ( 101) [f]   64   27     4.2
sp|P10836|DIPA_PROTE DIPTERICIN A.             (  82) [f]   63   27     4.4
sp|P48754|BRC1_MOUSE BREAST CANCER TYPE 1 SUSC (1812) [f]   75   31     4.5


>>>query.data, 68 aa vs /local/databases/fasta/swissprot library

>>sp|P38398|BRC1_HUMAN BREAST CANCER TYPE 1 SUSCEPTIBILI  (1863 aa)
 initn: 148 init1: 148 opt: 149  Z-score: 255.8  bits: 55.9 E(): 1.7e-07
Smith-Waterman score: 149;  95.238% identity in 21 aa overlap (6-68:1814-1834)

>sp|P38 2- 22: -----------------------------------------------:

              30        60
,      SWTEDNGFHAIGQMCEAPVVT
       .::::::::::::::::::::
sp|P38 AWTEDNGFHAIGQMCEAPVVT
          1820      1830


>>sp|Q95153|BRC1_CANFA BREAST CANCER TYPE 1 SUSCEPTIBILI  (1878 aa)
 initn: 143 init1: 143 opt: 143  Z-score: 244.9  bits: 53.9 E(): 6.8e-07
Smith-Waterman score: 143;  90.476% identity in 21 aa overlap (6-68:1822-1842)

>sp|Q95 2- 22: -----------------------------------------------:

              30        60
,      SWTEDNGFHAIGQMCEAPVVT
       .::::.:::::::::::::::
sp|Q95 AWTEDSGFHAIGQMCEAPVVT
            1830      1840


>>sp|P28266|NODL_RHIME NODULATION PROTEIN L (EC 2.3.1.-)  (183 aa)
 initn:  60 init1:  60 opt:  67  Z-score: 124.2  bits: 28.2 E():  3.6
Smith-Waterman score: 67;  45.833% identity in 24 aa overlap (1-63:116-139)

>sp|P28 1- 21:------------------------------------------------ :

         10        40
,      PDPGQRTMASMQLGRCVR---HLW
       ::  .. .:..:::: ::   :.:
sp|P28 PDDPEQRQAGLQLGRPVRIGKHVW
         120       130


>>sp|P18684|DIPD_PROTE DIPTERICIN D PRECURSOR.            (101 aa)
 initn:  63 init1:  63 opt:  64  Z-score: 122.9  bits: 27.1 E():  4.2
Smith-Waterman score: 64;  61.538% identity in 13 aa overlap (3-41:58-70)

>sp|P18 1- 13:------------------------------ :

               30
,      RSWTEDNGFHAIG
       . :: ::: :.::
sp|P18 KVWTSDNGRHSIG
        60        70


>>sp|P10836|DIPA_PROTE DIPTERICIN A.                      (82 aa)
 initn:  62 init1:  62 opt:  63  Z-score: 122.6  bits: 26.7 E():  4.4
Smith-Waterman score: 63;  61.538% identity in 13 aa overlap (3-41:40-52)

>sp|P10 1- 13:------------------------------ :

               30
,      RSWTEDNGFHAIG
       . :: ::: :.::
sp|P10 KVWTSDNGGHSIG
      40        50


>>sp|P48754|BRC1_MOUSE BREAST CANCER TYPE 1 SUSCEPTIBILI  (1812 aa)
 initn:  49 init1:  49 opt:  75  Z-score: 122.4  bits: 31.1 E():  4.5
Smith-Waterman score: 75;  50.000% identity in 20 aa overlap (6-65:1756-1775)

>sp|P48 2- 21: ---------------------------------------------- :

              30        60
,      SWTEDNGFHAIGQMCEAPVV
       .::::..   :::.:.: .:
sp|P48 AWTEDSNCPDIGQLCKARLV
        1760      1770



68 residues in 1 query   sequences
31411157 residues in 86593 library sequences
 Tcomplib (4 proc)[version 3.3t04 January 25, 2000]
 start: Sun Aug  6 21:23:16 2000 done: Sun Aug  6 21:23:33 2000
 Scan time: 22.700 Display time:  0.084

Function used was FASTXY