At a true promoter region recognition level of approximately 50-55%, the TSSG program gives about one false positive prediction per 5000 bp. (This accuracy is similar to that obtained on the same test sequences by Prestridge's method.) We estimated the accuracy of TSS position prediction on 10 test genes for which both algorithms (ours and Prestridge's) found the promoter region:
Deviation of predicted TSS from the real TSS:
_____________________________________________________________________
Method/deviation        I  5 b  I 50 b  I 150 b I mean of observed
________________________I_______I_______I_______I___deviations_______
Prestridge's            I   0   I   3   I   7   I   81.2 base
________________________I_______I_______I_______I_____________________
TSSG                    I   7   I   3   I   0   I    7.3 base
________________________I_______I_______I_______I_____________________
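A summary of this kind can be reproduced from paired predicted and real TSS positions. The sketch below is only an illustration, not the authors' evaluation code: it assumes that the 5 b, 50 b and 150 b columns are non-overlapping bins with inclusive upper bounds (which is consistent with each row summing to 10) and that the mean is taken over absolute deviations. The function name and the sample positions are hypothetical.

```python
from bisect import bisect_left

# Upper bounds of the deviation bins, in bases (taken from the table header).
BIN_EDGES = (5, 50, 150)

def summarize_deviations(predicted, real, edges=BIN_EDGES):
    """Bin |predicted - real| by upper bound and return (counts, mean)."""
    deviations = [abs(p - r) for p, r in zip(predicted, real)]
    # One slot per bin plus a final slot for deviations above the last bound.
    counts = [0] * (len(edges) + 1)
    for d in deviations:
        counts[bisect_left(edges, d)] += 1
    mean = sum(deviations) / len(deviations)
    return counts, mean

# Hypothetical positions for illustration only (not the 10 test genes).
predicted = [1003, 2510, 4120, 7995]
real = [1000, 2500, 4000, 8000]
counts, mean = summarize_deviations(predicted, real)
print(counts, mean)
```

With inclusive upper bounds, a deviation of exactly 5 bases falls into the first bin; if the original evaluation used exclusive bounds, `bisect_right` would be the matching choice.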
Name: Seq_name:all
First three lines of sequence:
TTANGATTCGTTTCCATGGAGCTGCCCATGACCATTTACACCATATACATACTGTCTCTGAGCAGAGATACGACA
CTCAGGCTGGTGATAAGGGAACACAGCTGTCAGGGGGCCAGAAGCAGCGTGTCGCCATAGCCCGAGCCATCATCC
GCAACCCCAAACTGTTGCTCCTGGACGAGGCCACGTCTGCGCTCGACACTGAGAGTGAGAAGGTGAGACTTTATT

tssg Thu Jun 3 20:39:13 CDT 1999
>Seq_name:all
Length of sequence- 39951
Threshold for LDF- 4.00
8 promoter(s) were predicted
Pos.: 20076 LDF- 9.94 TATA box predicted at 20044
Pos.: 25269 LDF- 9.57
Pos.: 29344 LDF- 9.32 TATA box predicted at 29314
Pos.: 29652 LDF- 7.93 TATA box predicted at 29637
Pos.: 37397 LDF- 6.36
Pos.: 33041 LDF- 6.16 TATA box predicted at 33010
Pos.: 17708 LDF- 4.56 TATA box predicted at 17678
Pos.: 21328 LDF- 4.46 TATA box predicted at 21310

Transcription factor binding sites:

for promoter at position - 20076
20076 (+) S01069 ACCNNNNNNGGT
19910 (-) S01027 ACGCCC
20063 (-) S00922 AGAGG
20057 (+) S01554 ANCCTCTCY
19830 (+) S00880 ATTGG
19933 (-) S01904 CACCTG
20014 (+) S00089 CANYYY
20055 (+) S00089 CANYYY
19897 (-) S00089 CANYYY
19833 (-) S00089 CANYYY
19815 (-) S01616 CATTW
19805 (-) S01616 CATTW
19834 (-) S00633 CCAAT
19999 (+) S01187 CCCCGCCC
20000 (+) S00801 CCCGCC
19999 (+) S01936 CCCMNSSS
19866 (-) S00245 CCGAAAC
20001 (+) S00802 CCGCCC
19888 (-) S00489 CGTCA
19792 (+) S01622 CWKKANNY
20016 (-) S01622 CWKKANNY
19860 (-) S01622 CWKKANNY
19815 (-) S01622 CWKKANNY
20064 (-) S00090 GAGAGGA
19835 (-) S01089 GCCAA
19998 (+) S00216 GCCCCGCC
19836 (-) S01738 GGCCAAT
19833 (+) S00437 GGCCG
20005 (-) S00781 GGCGGG
20006 (-) S00978 GGGCGG
20002 (-) S00974 GGGGC
20006 (-) S01193 GGGNGGRR
20025 (+) S01998 GRGRTTKCAY
20025 (+) S00159 GRGRTTYCAY
20007 (-) S00064 KGGGCGGRRY
20007 (-) S01542 KRGGCGKRRY
19848 (+) S02023 MAMAG
20068 (-) S02023 MAMAG
19976 (-) S02023 MAMAG
19862 (-) S02023 MAMAG
19927 (+) S01950 RCAGNTG
19982 (+) S01190 RYYWSGTG
20004 (-) S01964 SCGSSSC
19802 (-) S00435 TACAAA
19822 (+) S01424 TGANTMA
19828 (-) S01424 TGANTMA
20007 (-) S01375 TGGGC
20007 (-) S00323 TGGGCGGGGC
19944 (+) S00250 TGRMCC
19950 (+) S00250 TGRMCC
19956 (+) S00250 TGRMCC
19851 (-) S02000 TKNNGNAAK
19795 (+) S01974 TRTTTGY
20013 (-) S01974 TRTTTGY
19944 (+) S01629 WGNAMCYK
19950 (+) S01629 WGNAMCYK
19956 (+) S01629 WGNAMCYK
20033 (-) S01629 WGNAMCYK
19999 (+) S01081 YYCCGCCC

for promoter at position - 25269
25130 (+) S01152 AAGTGA
25130 (+) S01153 AARKGA
25142 (-) S01027 ACGCCC
25082 (+) S01249 ACGTMAC
25218 (-) S00922 AGAGG
25021 (-) S00922 AGAGG
24971 (-) S00922 AGAGG
25161 (-) S00392 AGGAAG
25204 (-) S00536 CAGCTGGC
25204 (-) S02128 CAGNTGGC
25147 (+) S00089 CANYYY
25181 (-) S00089 CANYYY
25134 (-) S00089 CANYYY
25050 (-) S00089 CANYYY
25115 (+) S01616 CATTW
25198 (+) S02113 CCAGCTG
25164 (-) S01003 CCCAG
25256 (+) S01187 CCCCGCCC
25257 (+) S00801 CCCGCC
25257 (+) S00256 CCCKCCCWCCT
25256 (+) S01936 CCCMNSSS
25258 (+) S00802 CCGCCC
25061 (+) S00040 CCTGC
25249 (+) S00489 CGTCA
25140 (+) S00753 CGTGAC
25210 (+) S00794 CTTTCC
25005 (+) S01622 CWKKANNY
25088 (+) S01622 CWKKANNY
25115 (+) S01622 CWKKANNY
25255 (-) S01622 CWKKANNY
25038 (-) S01622 CWKKANNY
25002 (+) S00038 GAACAG
25217 (-) S01502 GAGGAA
25168 (-) S00973 GAGGC
25168 (-) S02135 GAGGCC
25175 (-) S00539 GATGGCCG
25242 (-) S00741 GATTTC
25026 (-) S01089 GCCAA
24975 (-) S01089 GCCAA
25255 (+) S00216 GCCCCGCC
25038 (+) S00437 GGCCG
25172 (-) S00437 GGCCG
25262 (-) S00781 GGCGGG
25263 (-) S00978 GGGCGG
25259 (-) S00974 GGGGC
25263 (-) S01193 GGGNGGRR
25088 (-) S00399 GTKACGT
25088 (-) S00104 GTKACGW
25003 (+) S02023 MAMAG
25065 (+) S02023 MAMAG
25086 (+) S02023 MAMAG
25128 (+) S02023 MAMAG
25192 (+) S02023 MAMAG
25233 (+) S02023 MAMAG
25024 (-) S02023 MAMAG
25205 (-) S01950 RCAGNTG
25261 (-) S01964 SCGSSSC
25141 (+) S00143 STGACTMA
25035 (-) S00484 TATCTC
25142 (+) S01426 TGACTCA
25142 (+) S01424 TGANTMA
25148 (-) S01424 TGANTMA
25142 (+) S01935 TGASTMA
25148 (-) S01935 TGASTMA
24972 (+) S02137 TGGCA
25161 (+) S01375 TGGGC
25161 (+) S00044 TGGNNNNNNGCCA
25225 (+) S00864 TGTCCT
25148 (-) S01595 TKAGTCA
25006 (+) S01773 XGGAYGT
25015 (-) S01773 XGGAYGT
25241 (+) S00346 YCSCCMNSSS
25256 (+) S01081 YYCCGCCC

for promoter at position - 29344
29046 (-) S00192 AATAAAT
29344 (+) S00880 ATTGG
29272 (+) S00089 CANYYY
29166 (-) S00089 CANYYY
29140 (-) S00089 CANYYY
29085 (+) S01003 CCCAG
29281 (+) S01003 CCCAG
29325 (+) S00801 CCCGCC
29326 (+) S00802 CCGCCC
29268 (-) S00057 CCTGAWWA
29079 (+) S01622 CWKKANNY
29144 (+) S01622 CWKKANNY
29235 (+) S01622 CWKKANNY
29275 (+) S01622 CWKKANNY
29137 (-) S01622 CWKKANNY
29343 (+) S00780 GATTGG
29330 (-) S00781 GGCGGG
29331 (-) S00978 GGGCGG
29332 (-) S00974 GGGGC
29332 (-) S00979 GGGGCGGG
29332 (-) S00331 GGGGCGGGAC
29331 (-) S01193 GGGNGGRR
29332 (-) S00064 KGGGCGGRRY
29332 (-) S01542 KRGGCGKRRY
29343 (-) S02023 MAMAG
29326 (+) S01964 SCGSSSC
29334 (-) S01964 SCGSSSC
29131 (+) S00087 TATAAA
29129 (+) S01540 TATAWAW
29072 (+) S00483 TATCTT
29088 (-) S01375 TGGGC
29286 (+) S00250 TGRMCC
29145 (+) S01052 TTTAAA
29315 (+) S01052 TTTAAA
29320 (-) S01052 TTTAAA
29150 (-) S01052 TTTAAA
29286 (-) S02121 WCTGG
29090 (-) S02121 WCTGG
29184 (+) S00487 WCTRG
29286 (-) S00487 WCTRG
29090 (-) S00487 WCTRG
29266 (-) S00381 WGATAR
29273 (-) S01629 WGNAMCYK
29324 (+) S01081 YYCCGCCC

for promoter at position - 29652
29536 (+) S01153 AARKGA
29537 (+) S01090 AATGA
29590 (+) S01027 ACGCCC
29553 (-) S00392 AGGAAG
29368 (+) S00880 ATTGG
29562 (+) S00880 ATTGG
29387 (+) S00089 CANYYY
29473 (+) S00089 CANYYY
29500 (+) S00089 CANYYY
29525 (+) S00089 CANYYY
29600 (+) S00089 CANYYY
29614 (+) S00089 CANYYY
29540 (-) S00089 CANYYY
29420 (-) S00089 CANYYY
29540 (-) S01616 CATTW
29482 (-) S01616 CATTW
29566 (-) S00633 CCAAT
29372 (-) S00633 CCAAT
29593 (+) S01187 CCCCGCCC
29582 (+) S00801 CCCGCC
29594 (+) S00801 CCCGCC
29586 (+) S01936 CCCMNSSS
29587 (+) S01936 CCCMNSSS
29593 (+) S01936 CCCMNSSS
29583 (+) S00802 CCGCCC
29595 (+) S00802 CCGCCC
29357 (+) S00040 CCTGC
29607 (-) S01954 CGGAAGTG
29402 (+) S00489 CGTCA
29524 (-) S00794 CTTTCC
29576 (+) S01622 CWKKANNY
29622 (-) S01622 CWKKANNY
29482 (-) S01622 CWKKANNY
29547 (-) S01502 GAGGAA
29367 (+) S00780 GATTGG
29592 (+) S00216 GCCCCGCC
29599 (-) S00781 GGCGGG
29587 (-) S00781 GGCGGG
29600 (-) S00978 GGGCGG
29588 (-) S00978 GGGCGG
29596 (-) S00974 GGGGC
29589 (-) S00974 GGGGC
29589 (-) S00979 GGGGCGGG
29600 (-) S01193 GGGNGGRR
29601 (-) S00064 KGGGCGGRRY
29601 (-) S01542 KRGGCGKRRY
29400 (+) S00144 KWCGTCA
29533 (-) S02023 MAMAG
29506 (-) S02023 MAMAG
29445 (-) S02023 MAMAG
29414 (-) S02023 MAMAG
29609 (-) S01770 RNMGGAWGT
29583 (+) S01964 SCGSSSC
29598 (-) S01964 SCGSSSC
29479 (-) S00972 TAGGC
29449 (-) S00087 TATAAA
29406 (-) S01418 TGACGACA
29553 (+) S00869 TGACTTCT
29632 (-) S02137 TGGCA
29601 (-) S01375 TGGGC
29601 (-) S00323 TGGGCGGGGC
29610 (-) S02000 TKNNGNAAK
29373 (+) S02121 WCTGG
29373 (+) S00487 WCTRG
29523 (+) S01629 WGNAMCYK
29474 (-) S02003 YGTCAGC
29593 (+) S01081 YYCCGCCC

for promoter at position - 37397
37252 (-) S00922 AGAGG
37152 (-) S00922 AGAGG
37228 (+) S00392 AGGAAG
37346 (+) S01905 CACGTG
37351 (-) S01905 CACGTG
37116 (+) S00089 CANYYY
37158 (+) S00089 CANYYY
37209 (+) S00089 CANYYY
37245 (+) S00089 CANYYY
37254 (+) S00089 CANYYY
37396 (+) S00089 CANYYY
37378 (-) S00089 CANYYY
37285 (-) S00089 CANYYY
37236 (-) S00089 CANYYY
37241 (+) S00243 CCACCA
37392 (+) S00243 CCACCA
37334 (-) S00956 CCCCCGCCCC
37333 (-) S01187 CCCCGCCC
37332 (-) S00801 CCCGCC
37327 (-) S00801 CCCGCC
37343 (-) S01936 CCCMNSSS
37342 (-) S01936 CCCMNSSS
37341 (-) S01936 CCCMNSSS
37340 (-) S01936 CCCMNSSS
37339 (-) S01936 CCCMNSSS
37338 (-) S01936 CCCMNSSS
37337 (-) S01936 CCCMNSSS
37336 (-) S01936 CCCMNSSS
37335 (-) S01936 CCCMNSSS
37334 (-) S01936 CCCMNSSS
37333 (-) S01936 CCCMNSSS
37304 (-) S01936 CCCMNSSS
37331 (-) S00802 CCGCCC
37120 (+) S00040 CCTGC
37323 (-) S00040 CCTGC
37319 (-) S00040 CCTGC
37315 (-) S00040 CCTGC
37227 (+) S01622 CWKKANNY
37236 (+) S00741 GATTTC
37329 (-) S00216 GCCCCGCC
37137 (-) S00437 GGCCG
37322 (+) S00781 GGCGGG
37327 (+) S00781 GGCGGG
37326 (+) S00978 GGGCGG
37189 (+) S00974 GGGGC
37301 (+) S00974 GGGGC
37325 (+) S00974 GGGGC
37340 (+) S00974 GGGGC
37325 (+) S00979 GGGGCGGG
37325 (+) S00326 GGGGCGGGGG
37326 (+) S01193 GGGNGGRR
37330 (+) S01193 GGGNGGRR
37331 (+) S01193 GGGNGGRR
37332 (+) S01193 GGGNGGRR
37333 (+) S01193 GGGNGGRR
37334 (+) S01193 GGGNGGRR
37335 (+) S01193 GGGNGGRR
37336 (+) S01193 GGGNGGRR
37196 (-) S00608 GTCGCC
37244 (-) S00839 GTGGAAA
37225 (+) S02023 MAMAG
37269 (+) S02023 MAMAG
37274 (+) S02023 MAMAG
37226 (+) S01770 RNMGGAWGT
37227 (+) S02024 SAGGAAGY
37187 (+) S01964 SCGSSSC
37299 (+) S01964 SCGSSSC
37323 (+) S01964 SCGSSSC
37331 (-) S01964 SCGSSSC
37309 (-) S01964 SCGSSSC
37302 (-) S01964 SCGSSSC
37219 (+) S00079 TGCRCNC
37219 (+) S01987 TGCRCRC
37284 (+) S02137 TGGCA
37106 (+) S01375 TGGGC
37394 (-) S01375 TGGGC
37159 (-) S00250 TGRMCC
37257 (+) S02121 WCTGG
37150 (+) S00487 WCTRG
37257 (+) S00487 WCTRG
37361 (+) S01773 XGGAYGT
37343 (-) S00346 YCSCCMNSSS
37342 (-) S00346 YCSCCMNSSS
37341 (-) S00346 YCSCCMNSSS
37340 (-) S00346 YCSCCMNSSS
37339 (-) S00346 YCSCCMNSSS
37338 (-) S00346 YCSCCMNSSS
37337 (-) S00346 YCSCCMNSSS
37336 (-) S00346 YCSCCMNSSS
37335 (-) S00346 YCSCCMNSSS
37331 (-) S00346 YCSCCMNSSS
37365 (+) S02003 YGTCAGC
37333 (-) S01081 YYCCGCCC

for promoter at position - 33041
32923 (+) S00922 AGAGG
33030 (+) S00922 AGAGG
33040 (+) S00922 AGAGG
32872 (-) S00922 AGAGG
32958 (-) S01946 ANATGG
32949 (-) S00908 CAACCAC
32840 (-) S01904 CACCTG
32793 (+) S00089 CANYYY
32954 (+) S00089 CANYYY
32969 (-) S00089 CANYYY
32931 (-) S00089 CANYYY
32859 (-) S00089 CANYYY
32750 (-) S00089 CANYYY
32954 (+) S01616 CATTW
32970 (+) S00243 CCACCA
32875 (-) S01003 CCCAG
32826 (+) S00040 CCTGC
32830 (+) S00040 CCTGC
33024 (-) S00489 CGTCA
32976 (+) S01622 CWKKANNY
32774 (-) S01622 CWKKANNY
33031 (+) S00973 GAGGC
32871 (-) S00973 GAGGC
32844 (-) S00973 GAGGC
33031 (+) S02135 GAGGCC
33008 (-) S00437 GGCCG
32873 (+) S00974 GGGGC
33026 (-) S00144 KWCGTCA
32880 (+) S02023 MAMAG
32963 (+) S02023 MAMAG
33027 (+) S02023 MAMAG
32833 (+) S01190 RYYWSGTG
33011 (+) S00087 TATAAA
33011 (+) S00615 TATAAAA
33011 (+) S01540 TATAWAW
33000 (-) S00483 TATCTT
32972 (-) S02137 TGGCA
32836 (-) S02137 TGGCA
32870 (+) S02121 WCTGG
32975 (+) S02121 WCTGG
33039 (-) S02121 WCTGG
33011 (-) S02121 WCTGG
32856 (-) S02121 WCTGG
32870 (+) S00487 WCTRG
32975 (+) S00487 WCTRG
33039 (-) S00487 WCTRG
33011 (-) S00487 WCTRG
32856 (-) S00487 WCTRG
32758 (-) S00487 WCTRG
33026 (-) S02101 WTCGTCA
32926 (+) S01773 XGGAYGT
32977 (+) S01773 XGGAYGT
32827 (-) S01773 XGGAYGT

for promoter at position - 17708
17684 (+) S01153 AARKGA
17634 (-) S01153 AARKGA
17633 (-) S01090 AATGA
17572 (+) S00922 AGAGG
17644 (-) S00922 AGAGG
17590 (+) S00392 AGGAAG
17425 (+) S00395 CACGCW
17501 (-) S00395 CACGCW
17620 (+) S00089 CANYYY
17638 (+) S00089 CANYYY
17666 (-) S00089 CANYYY
17601 (-) S00089 CANYYY
17445 (-) S00089 CANYYY
17409 (-) S00089 CANYYY
17630 (+) S01616 CATTW
17554 (+) S00243 CCACCA
17529 (-) S00243 CCACCA
17561 (+) S02113 CCAGCTG
17447 (-) S01003 CCCAG
17673 (+) S00753 CGTGAC
17649 (-) S00481 CTATCA
17655 (-) S01622 CWKKANNY
17589 (+) S01502 GAGGAA
17688 (+) S00973 GAGGC
17688 (+) S02135 GAGGCC
17451 (+) S00539 GATGGCCG
17454 (+) S00437 GGCCG
17601 (+) S01445 GTGAGTCAG
17437 (+) S02023 MAMAG
17547 (+) S02023 MAMAG
17660 (+) S02023 MAMAG
17683 (+) S02023 MAMAG
17697 (-) S02023 MAMAG
17603 (-) S02023 MAMAG
17544 (-) S02023 MAMAG
17525 (-) S02023 MAMAG
17568 (-) S01950 RCAGNTG
17472 (-) S01950 RCAGNTG
17609 (-) S00143 STGACTMA
17658 (+) S00435 TACAAA
17634 (+) S00972 TAGGC
17608 (-) S01426 TGACTCA
17602 (+) S00476 TGAGTCAG
17602 (+) S01424 TGANTMA
17608 (-) S01424 TGANTMA
17602 (+) S01935 TGASTMA
17608 (-) S01935 TGASTMA
17543 (+) S02137 TGGCA
17416 (-) S02137 TGGCA
17418 (+) S01037 TGTTCT
17602 (+) S01595 TKAGTCA
17442 (+) S02121 WCTGG
17418 (-) S02121 WCTGG
17442 (+) S00487 WCTRG
17418 (-) S00487 WCTRG
17644 (+) S00381 WGATAR
17627 (-) S01629 WGNAMCYK

for promoter at position - 21328
21100 (+) S01153 AARKGA
21314 (+) S01153 AARKGA
21168 (-) S01153 AARKGA
21039 (+) S01090 AATGA
21253 (+) S01090 AATGA
21315 (+) S01090 AATGA
21126 (-) S01090 AATGA
21257 (+) S00534 ACGTCA
21260 (-) S00534 ACGTCA
21260 (-) S01257 ACGTCAT
21146 (-) S00922 AGAGG
21306 (+) S00880 ATTGG
21049 (+) S00014 CACACACACA
21149 (+) S00395 CACGCW
21083 (+) S00089 CANYYY
21091 (+) S00089 CANYYY
21179 (+) S00089 CANYYY
21196 (+) S00089 CANYYY
21275 (+) S00089 CANYYY
21247 (-) S00089 CANYYY
21123 (+) S01616 CATTW
21318 (-) S01616 CATTW
21256 (-) S01616 CATTW
21226 (-) S01051 CCAAGT
21310 (-) S00633 CCAAT
21075 (-) S00345 CCCCCGGC
21075 (-) S01936 CCCMNSSS
21258 (+) S00489 CGTCA
21259 (-) S00489 CGTCA
21294 (+) S00252 CTGATTA
21123 (+) S01622 CWKKANNY
21182 (+) S01622 CWKKANNY
21159 (-) S01622 CWKKANNY
21112 (-) S01622 CWKKANNY
21159 (+) S00741 GATTTC
21256 (+) S00144 KWCGTCA
21261 (-) S00144 KWCGTCA
21328 (-) S02023 MAMAG
21237 (-) S02023 MAMAG
21169 (-) S02023 MAMAG
21254 (+) S00559 NTGACGTCAN
21263 (-) S00559 NTGACGTCAN
21254 (+) S00153 RTGACGT
21273 (-) S01190 RYYWSGTG
21091 (+) S01205 SWATWWAG
21311 (+) S00435 TACAAA
21106 (+) S00087 TATAAA
21255 (+) S01059 TGACGT
21262 (-) S01059 TGACGT
21255 (+) S00969 TGACGTC
21262 (-) S00969 TGACGTC
21255 (+) S00072 TGACGTCA
21262 (-) S00072 TGACGTCA
21255 (+) S02107 TGACGTYW
21262 (-) S02107 TGACGTYW
21255 (+) S01940 TGACGYMR
21262 (-) S01940 TGACGYMR
21041 (+) S01424 TGANTMA
21295 (+) S01424 TGANTMA
21047 (-) S01424 TGANTMA
21293 (-) S01037 TGTTCT
21094 (+) S02000 TKNNGNAAK
21240 (+) S00563 TNNAKYNNKNNMTNATGA
21278 (+) S00487 WCTRG
21202 (+) S00381 WGATAR
21319 (+) S01629 WGNAMCYK
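
For downstream processing, the promoter summary in the output above can be read with a small parser. The following is a minimal sketch, not part of TSSG itself: it assumes only the record layout visible in this example ("Pos.: <position> LDF- <score>", optionally followed by "TATA box predicted at <position>"). The regular expression, function name and dictionary keys are hypothetical.

```python
import re

# Matches one promoter record; the TATA box clause is optional.
PROMOTER_RE = re.compile(
    r"Pos\.:\s*(\d+)\s+LDF-\s*([\d.]+)(?:\s+TATA box predicted at\s+(\d+))?"
)

def parse_promoters(text):
    """Return one dict per 'Pos.:' record found in text."""
    promoters = []
    for match in PROMOTER_RE.finditer(text):
        tss, ldf, tata = match.groups()
        promoters.append({
            "tss": int(tss),                       # predicted TSS position
            "ldf": float(ldf),                     # LDF score
            "tata": int(tata) if tata else None,   # TATA box position, if any
        })
    return promoters

# Three records copied from the example output above.
sample = """\
Pos.: 20076 LDF- 9.94 TATA box predicted at 20044
Pos.: 25269 LDF- 9.57
Pos.: 29344 LDF- 9.32 TATA box predicted at 29314
"""
promoters = parse_promoters(sample)
for p in promoters:
    # Distance from the predicted TATA box to the predicted TSS, when present.
    offset = p["tss"] - p["tata"] if p["tata"] is not None else None
    print(p["tss"], p["ldf"], offset)
```

Because the pattern does not depend on line breaks, it also works on output whose lines have been joined into a single run of text.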