 FASTXY compares a DNA sequence to a protein sequence data bank
 version 3.3t04 January 25, 2000
Please cite:
 Pearson et al, Genomics (1997) 46:24-36

query.data: 68 aa
 >, 68 bp.
 vs /local/databases/fasta/swissprot library
searching /local/databases/fasta/swissprot library

       opt   E()
< 20   223     0:==
  22     8     0:=          one = represents 151 library sequences
  24    11     0:=
  26     6     2:*
  28    17    20:*
  30    89   119:*
  32   381   460:===*
  34  1029  1248:======= *
  36  2552  2562:================*
  38  4336  4235:============================*
  40  6430  5907:=======================================*===
  42  7710  7221:===============================================*====
  44  8510  7965:====================================================*====
  46  9049  8113:=====================================================*======
  48  8202  7767:===================================================*===
  50  7102  7087:==============================================*=
  52  5977  6231:======================================== *
  54  4730  5322:================================ *
  56  3796  4446:========================== *
  58  3003  3650:==================== *
  60  2372  2957:================ *
  62  2053  2370:============== *
  64  1797  1885:============*
  66  1353  1490:=========*
  68  1167  1172:=======*
  70   975   918:======*
  72   812   718:====*=
  74   603   560:===*
  76   481   436:==*=
  78   410   339:==*
  80   350   263:=*=
  82   238   201:=*
  84   202   159:=*
  86   161   123:*=
  88   120    95:*         inset = represents 2 library sequences
  90    93    74:*
  92    59    57:*         :============================*=
  94    44    44:*         :=====================*
  96    26    34:*         :============= *
  98    25    26:*         :============*
 100    27    20:*         :=========*====
 102     7    16:*         :==== *
 104    12    12:*         :=====*
 106    14     9:*         :====*==
 108    14     7:*         :===*===
 110     1     6:*         := *
 112     2     4:*         :=*
 114     0     3:*         : *
 116     4     3:*         :=*
 118     4     2:*         :*=
>120     6     2:*         :*==
31411157 residues in 86593 sequences
 statistics extrapolated from 60000 to 86367 sequences
 Expectation_n fit: rho(ln(x))= 3.9164+/-0.000474; mu= 5.5093+/- 0.027;
 mean_var=30.6950+/- 6.023, 0's: 185 Z-trim: 41 B-trim: 0 in 0/64
 Kolmogorov-Smirnov statistic: 0.0316 (N=29) at 50

FASTX (3.34 January 2000) function [optimized, BL50 matrix (15:-5)] ktup: 2
 join: 36, opt: 30, gap-pen: -15/ -2 shift: -20, width: 16
 Scan time: 22.700
The best scores are:                                        opt  bits  E(86367)
sp|P38398|BRC1_HUMAN BREAST CANCER TYPE 1 SUSC (1863) [f]   149    56   1.7e-07
sp|Q95153|BRC1_CANFA BREAST CANCER TYPE 1 SUSC (1878) [f]   143    54   6.8e-07
sp|P28266|NODL_RHIME NODULATION PROTEIN L (EC  ( 183) [f]    67    28       3.6
sp|P18684|DIPD_PROTE DIPTERICIN D PRECURSOR.   ( 101) [f]    64    27       4.2
sp|P10836|DIPA_PROTE DIPTERICIN A.             (  82) [f]    63    27       4.4
sp|P48754|BRC1_MOUSE BREAST CANCER TYPE 1 SUSC (1812) [f]    75    31       4.5
>>>query.data, 68 aa vs /local/databases/fasta/swissprot library
>sp|P38 2- 22: -----------------------------------------------:
30 60
,      SWTEDNGFHAIGQMCEAPVVT
       .::::::::::::::::::::
sp|P38 AWTEDNGFHAIGQMCEAPVVT
1820 1830
>sp|Q95 2- 22: -----------------------------------------------:
30 60
,      SWTEDNGFHAIGQMCEAPVVT
       .::::.:::::::::::::::
sp|Q95 AWTEDSGFHAIGQMCEAPVVT
1830 1840
>sp|P28 1- 21:------------------------------------------------ :
10 40
,      PDPGQRTMASMQLGRCVR---HLW
       ::  .. .:..:::: ::   :.:
sp|P28 PDDPEQRQAGLQLGRPVRIGKHVW
120 130
>sp|P18 1- 13:------------------------------ :
30
,      RSWTEDNGFHAIG
       . :: ::: :.::
sp|P18 KVWTSDNGRHSIG
60 70
>sp|P10 1- 13:------------------------------ :
30
,      RSWTEDNGFHAIG
       . :: ::: :.::
sp|P10 KVWTSDNGGHSIG
40 50
>sp|P48 2- 21: ---------------------------------------------- :
30 60
,      SWTEDNGFHAIGQMCEAPVV
       .::::..   :::.:.: .:
sp|P48 AWTEDSNCPDIGQLCKARLV
1760 1770
>>sp|P38398|BRC1_HUMAN BREAST CANCER TYPE 1 SUSCEPTIBILI (1863 aa)
initn: 148 init1: 148 opt: 149 Z-score: 255.8 bits: 55.9 E(): 1.7e-07
Smith-Waterman score: 149; 95.238% identity in 21 aa overlap (6-68:1814-1834)
>>sp|Q95153|BRC1_CANFA BREAST CANCER TYPE 1 SUSCEPTIBILI (1878 aa)
initn: 143 init1: 143 opt: 143 Z-score: 244.9 bits: 53.9 E(): 6.8e-07
Smith-Waterman score: 143; 90.476% identity in 21 aa overlap (6-68:1822-1842)
>>sp|P28266|NODL_RHIME NODULATION PROTEIN L (EC 2.3.1.-) (183 aa)
initn: 60 init1: 60 opt: 67 Z-score: 124.2 bits: 28.2 E(): 3.6
Smith-Waterman score: 67; 45.833% identity in 24 aa overlap (1-63:116-139)
>>sp|P18684|DIPD_PROTE DIPTERICIN D PRECURSOR. (101 aa)
initn: 63 init1: 63 opt: 64 Z-score: 122.9 bits: 27.1 E(): 4.2
Smith-Waterman score: 64; 61.538% identity in 13 aa overlap (3-41:58-70)
>>sp|P10836|DIPA_PROTE DIPTERICIN A. (82 aa)
initn: 62 init1: 62 opt: 63 Z-score: 122.6 bits: 26.7 E(): 4.4
Smith-Waterman score: 63; 61.538% identity in 13 aa overlap (3-41:40-52)
>>sp|P48754|BRC1_MOUSE BREAST CANCER TYPE 1 SUSCEPTIBILI (1812 aa)
initn: 49 init1: 49 opt: 75 Z-score: 122.4 bits: 31.1 E(): 4.5
Smith-Waterman score: 75; 50.000% identity in 20 aa overlap (6-65:1756-1775)
68 residues in 1 query sequences
31411157 residues in 86593 library sequences
Tcomplib (4 proc)[version 3.3t04 January 25, 2000]
start: Sun Aug 6 21:23:16 2000 done: Sun Aug 6 21:23:33 2000
Scan time: 22.700 Display time: 0.084
Function used was FASTXY
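
The per-hit summary lines in this report (the ">>" header, the "initn: ... E():" line, and the "Smith-Waterman score:" line) are regular enough to read programmatically. The following Python sketch is one possible way to do that; the function name `parse_hits` and the dictionary keys are hypothetical, and the regular expressions assume exactly the line layout shown above, so they may not match output from other FASTA versions. It also derives two values by arithmetic: the number of identical residues (percent identity times overlap length) and the number of query codons covered (nucleotide span divided by 3, assuming the span is a whole number of codons).

```python
import re

# Patterns modelled on the summary lines in this report (an assumption,
# not a documented FASTA output grammar).
HEADER_RE = re.compile(r"^>>(\S+)\s+(.*?)\s*\((\d+) aa\)\s*$")
SCORE_RE = re.compile(
    r"opt:\s*(\d+)\s+Z-score:\s*([\d.]+)\s+bits:\s*([\d.]+)\s+E\(\):\s*(\S+)"
)
OVERLAP_RE = re.compile(
    r"Smith-Waterman score:\s*(\d+);\s*([\d.]+)% identity in (\d+) aa overlap "
    r"\((\d+)-(\d+):(\d+)-(\d+)\)"
)


def parse_hits(text):
    """Return one dict per '>>' hit found in a FASTX report (hypothetical helper)."""
    hits = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            current = {
                "id": header.group(1),
                "description": header.group(2),
                "length": int(header.group(3)),
            }
            hits.append(current)
            continue
        if current is None:
            continue  # still in the report preamble
        score = SCORE_RE.search(line)
        if score:
            current["opt"] = int(score.group(1))
            current["z_score"] = float(score.group(2))
            current["bits"] = float(score.group(3))
            current["evalue"] = float(score.group(4))
            continue
        overlap = OVERLAP_RE.search(line)
        if overlap:
            pct = float(overlap.group(2))
            columns = int(overlap.group(3))
            q_start, q_end = int(overlap.group(4)), int(overlap.group(5))
            current["sw_score"] = int(overlap.group(1))
            current["percent_identity"] = pct
            current["overlap"] = columns  # alignment columns, gaps included
            current["query_span"] = (q_start, q_end)  # nucleotide coordinates
            current["library_span"] = (int(overlap.group(6)), int(overlap.group(7)))
            # Identical residues implied by the reported percentage.
            current["identities"] = round(pct / 100 * columns)
            # Query codons covered by the nucleotide span.
            current["query_codons"] = (q_end - q_start + 1) // 3
    return hits
```

For the BRC1_HUMAN hit this gives 20 identities over 21 columns and 21 query codons (nucleotides 6-68). For the NODL_RHIME hit the overlap is 24 columns but the query span 1-63 covers only 21 codons; the difference corresponds to the three gap positions shown in the query row of that alignment.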