Gene Scan output from MIT
GENSCANW output for sequence yhm
GENSCAN 1.0 Date run: 11-Sep-99 Time: 12:20:23
Sequence YHM2 : 39951 bp : 45.84% C+G : Isochore 2 (43 - 51 C+G%)
Parameter matrix: HumanIso.smat
Predicted genes/exons:
Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------
1.01 Intr + 66 212 147 2 0 22 100 231 0.960 18.13
1.02 Intr + 285 474 190 2 1 78 -18 280 0.070 15.56
1.03 Intr + 2629 2781 153 2 0 51 81 228 0.197 18.44
1.04 Intr + 2858 2891 34 0 1 62 94 34 0.984 -1.32
1.05 Intr + 3474 3924 451 0 1 85 94 476 0.913 41.20
1.06 Intr + 4133 4257 125 1 2 45 113 140 0.999 11.58
1.07 Intr + 4377 4548 172 0 1 101 74 244 0.999 24.25
1.08 Intr + 4586 4747 162 1 0 -14 94 236 0.896 14.17
1.09 Intr + 5608 5718 111 0 0 64 53 115 0.986 6.18
1.10 Intr + 5805 5930 126 2 0 127 94 165 0.998 21.88
1.11 Intr + 6255 6458 204 2 0 14 58 472 0.649 36.20
1.12 Intr + 6543 6713 171 2 0 16 87 258 0.632 18.64
1.13 Intr + 7091 7252 162 1 0 73 68 227 0.994 19.37
1.14 Intr + 7344 7538 195 2 0 59 83 205 0.971 16.71
1.15 Intr + 7753 7899 147 0 0 79 12 199 0.770 11.83
1.16 Intr + 9203 9295 93 1 0 81 109 -12 0.493 0.36
1.17 Intr + 9569 9652 84 1 0 47 83 109 0.948 6.32
1.18 Intr + 9726 9929 204 2 0 66 2 461 0.990 34.70
1.19 Intr + 10031 10131 101 1 2 100 21 115 0.999 4.91
1.20 Intr + 10229 10369 141 2 0 102 76 126 0.997 12.27
1.21 Intr + 11243 11399 157 2 1 61 59 159 0.996 10.31
1.22 Intr + 11487 11684 198 2 0 107 69 278 0.993 27.35
1.23 Intr + 12136 12591 456 0 0 50 100 667 0.640 57.52
1.24 Term + 12664 12870 207 0 0 78 43 296 0.907 21.44
1.25 PlyA + 13091 13096 6 1.05
2.00 Prom + 13698 13737 40 -6.46
2.01 Init + 13741 13784 44 0 2 89 83 23 0.800 1.79
2.02 Intr + 13931 14027 97 2 1 17 50 118 0.695 0.91
2.03 Intr + 14431 14590 160 0 1 106 64 263 0.991 25.26
2.04 Intr + 14661 14709 49 1 1 90 93 41 0.998 2.54
2.05 Intr + 14939 15130 192 2 0 80 26 320 0.996 23.61
2.06 Intr + 15142 15403 262 1 1 6 69 527 0.940 40.09
2.07 Intr + 15483 15607 125 2 2 24 34 220 0.951 9.58
2.08 Intr + 15913 16084 172 1 1 94 74 429 0.990 42.05
2.09 Intr + 16171 16284 114 0 0 56 83 195 0.989 16.44
2.10 Intr + 17296 17406 111 0 0 68 37 168 0.994 10.28
2.11 Intr + 17475 17600 126 2 0 104 68 200 0.751 20.48
2.12 Intr + 17772 17975 204 2 0 61 58 467 0.649 40.40
2.13 Intr + 18060 18230 171 2 0 16 87 258 0.631 18.64
2.14 Intr + 18614 18775 162 1 0 68 60 252 0.985 20.57
2.15 Intr + 18867 19034 168 2 0 70 40 188 0.999 12.34
2.16 Intr + 19419 19565 147 2 0 82 43 272 0.999 22.53
2.17 Intr + 19670 19774 105 1 0 89 111 6 0.914 3.41
2.18 Intr + 19853 19930 78 1 0 13 99 83 0.757 1.65
2.19 Intr + 20075 20158 84 1 0 55 83 100 0.969 6.22
2.20 Intr + 20232 20435 204 2 0 69 44 417 0.991 34.80
2.21 Intr + 20522 20622 101 1 2 78 18 110 0.956 1.91
2.22 Intr + 20718 20858 141 0 0 89 76 181 0.999 16.47
2.23 Intr + 21566 21722 157 2 1 89 54 234 0.999 20.11
2.24 Intr + 21806 22003 198 1 0 50 69 254 0.889 19.25
2.25 Intr + 22822 22974 153 0 0 77 -36 371 0.999 23.87
2.26 Intr + 23132 23278 147 1 0 22 100 231 0.945 18.13
2.27 Term + 23351 23557 207 1 0 101 54 318 0.999 27.04
2.28 PlyA + 23697 23702 6 -3.44
3.19 PlyA - 23870 23865 6 1.05
3.18 Term - 24656 24536 121 1 1 79 53 187 0.999 12.25
3.17 Intr - 24861 24742 120 2 0 62 80 223 0.981 18.51
3.16 Intr - 25061 24968 94 0 1 96 69 115 0.999 9.42
3.15 Intr - 25228 25150 79 1 1 104 105 133 0.999 15.72
3.14 Intr - 25419 25296 124 2 1 93 85 165 0.993 17.29
3.13 Intr - 25624 25494 131 1 2 37 64 243 0.865 16.29
3.12 Intr - 25804 25697 108 1 0 20 109 189 0.985 14.68
3.11 Intr - 25960 25877 84 1 0 56 61 165 0.996 10.62
3.10 Intr - 26145 26044 102 0 0 75 76 104 0.974 8.27
3.09 Intr - 26356 26231 126 1 0 94 21 152 0.998 9.98
3.08 Intr - 26615 26522 94 1 1 84 86 26 0.951 1.97
3.07 Intr - 27094 26986 109 2 1 39 67 57 0.443 -1.96
3.06 Intr - 27298 27174 125 0 2 102 80 82 0.454 9.03
3.05 Intr - 27607 27399 209 1 2 98 66 94 0.913 6.08
3.04 Intr - 27793 27669 125 2 2 95 57 136 0.999 11.50
3.03 Intr - 30358 30296 63 2 0 93 76 21 0.515 0.09
3.02 Intr - 30561 30444 118 0 1 118 65 171 0.999 17.84
3.01 Init - 31347 31336 12 0 0 86 99 11 0.954 2.32
3.00 Prom - 31560 31521 40 -9.26
4.00 Prom + 31672 31711 40 -5.96
4.01 Sngl + 32379 33164 786 2 0 44 42 395 0.789 26.45
4.02 PlyA + 33177 33182 6 1.05
5.14 PlyA - 33340 33335 6 1.05
5.13 Term - 35785 35717 69 1 0 95 50 40 0.517 -1.16
5.12 Intr - 36001 35885 117 1 0 68 68 38 0.437 0.46
5.11 Intr - 36199 36062 138 1 0 69 77 206 0.979 18.26
5.10 Intr - 36443 36276 168 2 0 106 49 124 0.998 10.44
5.09 Intr - 36700 36488 213 1 0 83 61 267 0.941 22.41
5.08 Intr - 36892 36784 109 0 1 120 79 114 0.996 13.99
5.07 Intr - 37257 37087 171 2 0 88 85 130 0.996 11.76
5.06 Intr - 37465 37349 117 0 0 98 68 214 0.974 19.98
5.05 Intr - 37641 37550 92 0 2 86 89 206 0.991 19.29
5.04 Intr - 37866 37748 119 1 2 40 76 135 0.999 7.68
5.03 Intr - 38320 38179 142 1 1 38 97 173 0.599 13.13
5.02 Intr - 39043 38926 118 0 1 56 116 39 0.999 3.97
5.01 Init - 39582 39473 110 0 2 70 72 160 0.817 12.29
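The table above is whitespace-separated, so it can be read programmatically. The following is a minimal sketch, not part of GENSCAN itself: the function names `parse_row` and `coding_length_per_gene` are hypothetical, and the field layout is assumed from the header line (coding rows carry 13 fields, while Prom and PlyA rows carry only Tscr after Len). It sums the coding exon lengths per gene for a few rows copied from the table.

```python
from collections import defaultdict

# Row types that contribute to the coding sequence (see the Type legend).
CODING_TYPES = {"Init", "Intr", "Term", "Sngl"}


def parse_row(line):
    """Split one table row into a dict.

    Assumption: Prom/PlyA rows have only Tscr after Len; coding rows have
    Fr, Ph, I/Ac, Do/T, CodRg, P and Tscr.
    """
    f = line.split()
    gene, exon = f[0].split(".")
    row = {
        "gene": int(gene),
        "exon": int(exon),
        "type": f[1],
        "strand": f[2],
        "begin": int(f[3]),
        "end": int(f[4]),
        "len": int(f[5]),
    }
    if row["type"] in CODING_TYPES:
        keys = ("fr", "ph", "i_ac", "do_t", "codrg")
        row.update(zip(keys, map(int, f[6:11])))
        row["p"] = float(f[11])
        row["tscr"] = float(f[12])
    else:
        row["tscr"] = float(f[6])
    return row


def coding_length_per_gene(rows):
    """Sum Len over coding exons, grouped by gene number."""
    totals = defaultdict(int)
    for r in rows:
        if r["type"] in CODING_TYPES:
            totals[r["gene"]] += r["len"]
    return dict(totals)


# A few rows copied verbatim from the table above.
sample = """\
4.00 Prom + 31672 31711 40 -5.96
4.01 Sngl + 32379 33164 786 2 0 44 42 395 0.789 26.45
4.02 PlyA + 33177 33182 6 1.05
5.13 Term - 35785 35717 69 1 0 95 50 40 0.517 -1.16
5.12 Intr - 36001 35885 117 1 0 68 68 38 0.437 0.46
"""
rows = [parse_row(line) for line in sample.splitlines()]
print(coding_length_per_gene(rows))
```

For gene 4 the total is 786, which agrees with the 786_bp length in the GENSCAN_predicted_CDS_4 header. Gene 5 is only partially included in this sample, so its total here is not the full CDS length.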
Predicted peptide and coding sequence(s):
>YHM2|GENSCAN_predicted_peptide_1|1396_aa
RYDTQAGDKGTQLSGGQKQRVAIARAIIRNPKLLLLDEATSALDTESEKVVQEALDQARK
GRTCIVVAHRLSTIQNADCIAVFQGGVVVEKGTHQQLIAKKGVYHMLVTKQMDGWDILMV
TIGVLMAIVNGLVNPLMCIVFGEMTDSFIQEAKLSQNHNTSNPRANSTLEADMQRFSIYY
SILGFAVLVVAYLQMSLWTLTAARQAKRIRELFFHGIMQQDISWYDVTETGELNTRLTEW
VTHIIHTPVPVTAGVVVIICGVRFPGAHDVYKIQEGIGDKAGLLIQAASTFITSFVIGFV
HGWKLTLVILAISPVLGLSAALYSKLLTSFTSKEQTAYAKAGAVAAEVLSSIRTVFAFSG
QRKAIKRYHKNLEDARDMGIKKGVAANTATGFSFLMIYLSYALAFWYGTTLVLNKEYTIG
NLLTNKSVAAETVTTCVQMKVFFVVLYGAYIIGQASPNVQSFASARGAAYKVYNIIDHKP
NIDSFSEDGYKPEYIKGDIVFQNIHFSYPSRPEIKILNDMSFHVRNGQTIALVGSSGCGK
STTIQLLQRFYDPQKGSIFIDGHDIRSLNIRYLREMIGVVSQEPVLFATTITENIRYGRL
DVTQEEIERATKESNAYDFIMNLPDKFETLVGDRGTQLSGGQKQRIAIARALVRNPKILL
LDEATSALDAESETIVQAALDKVRLGRTTIVIAHRLSTIRNADIIAGFSNGEIVEQGTHS
QLMEIKGVYHGLVTMQSFQKLEDLEDSDYEPWVAEKSQLIESFSQSSLQRRRSTRGSLLA
VSEGTKEEKEKFECDQDNIEEDENVPPVSFFKVMRYNVSEWPYILVGTICAMINGAMQPV
FSIIFTEIIMFWGFQGFCFSKSGEILTLNLRLKAFISMMRQDLSWYDNPKNTVGALTTRL
AADAAHVQGAAGVRLAVMTQNFANLGTSIIISFVYGWELTLLILAVVPILAVAGAAEVKL
LTGHAAEDKKELEMAGKIATEAIENVRTVVSLTREPTFVALYEENLTVPYKNSQKKAKIY
GLTYSFSQAMIFFVYAACFRFGAWLIEAGRMDVEGVFLVVMTMLYGAMAVGEANTYAPNF
AKAKISASHLTMLINRQPAIDNLSEEEARLEKYDGNVLFEDVKFNYPSRPDVPVLQGLNL
EVQKGETLALVGSSGCGKSTTIQLLERFYDPREGRVLLDGVDVKQLNVHWLRSQIGIVSQ
EPVLFDCSLAENIAYGDNSRSVSMDEIVAAAKAANIHSFIEGLPQVAAVNQGKWLIPHLI
DSHGAAHDHLHHIQTVSEQRYDTQAGDKGTQLSGGQKQRVAIARAIIRNPKLLLLDEATS
ALDTESEKVVQEALDQARKGRTCIVVAHRLSTIQNADCIAVFQGGVVVEKGTHQQLIAKK
GVYHMLVTKQMGYHSG
>YHM2|GENSCAN_predicted_CDS_1|4191_bp
agatacgacactcaggctggtgataagggaacacagctgtcagggggccagaagcagcgt
gtcgccatagcccgagccatcatccgcaaccccaaactgttgctcctggacgaggccacg
tctgcgctcgacactgagagtgagaaggtggtgcaggaggcgttggaccaggccaggaag
ggcaggacgtgcatcgtcgtagcccaccgtctgtccaccatccagaacgccgactgcatc
gctgtgttccagggaggagtggtggtggaaaaggggacgcaccagcagctgatcgccaag
aagggagtgtaccacatgctggtcaccaaacagatggatggttgggacatcttgatggtc
accatcggagtgctgatggccattgtgaacgggctggtgaatcctctgatgtgtatcgtg
tttggtgagatgactgacagcttcatccaggaagccaaactgtcccaaaaccacaacaca
agcaaccccagagcaaacagcaccttagaagcagatatgcagagattctccatctactac
tccatcttggggtttgctgtgctggtagtggcgtacctgcagatgtctctgtggacccta
acggccgcgcggcaggccaaacgaattcgcgagttgtttttccacggcatcatgcagcag
gacatcagctggtatgacgtgactgagacaggagagctcaacacgcgtctcacagagtgg
gtgacgcacatcatacacactccagttcctgtcacagctggcgtggtcgttatcatatgt
ggtgttcgattccctggtgcgcacgatgtctacaagatccaggagggcatcggtgacaag
gcgggtctgctgatccaggcggcctccacctttatcacttcctttgttattggttttgta
catggatggaagctcaccctggtcatcctggccatcagccctgtgttgggtctctcagct
gccctttacagtaagttgctgacaagcttcaccagtaaagagcagacagcgtacgccaaa
gctggagctgtggcagcggaggtgctatcctccatcaggactgtgtttgccttcagtggc
caaagaaaagccatcaaaagatatcataagaacctggaggatgcgagggacatgggaata
aagaagggagttgctgctaacacggccacaggcttctcctttctgatgatctacctgtcc
tatgctctggccttctggtacgggactactctggtcctcaacaaagagtacaccattgga
aatttactgactaataagagcgttgctgcagaaacagtgaccacgtgtgtccaaatgaag
gtgttcttcgtcgtcctctacggggcatacattattggacaggcctctcccaacgtccag
tcctttgccagtgccagaggagcggcgtataaagtctacaacattatcgaccacaaaccg
aatattgacagcttttcagaggacggatacaagcctgaatacatcaaaggtgacattgta
ttccagaacatccacttcagctacccttcgaggccagaaattaaaatcttaaacgacatg
tcgtttcatgtgaggaacggacagaccattgctttggtggggagcagcggctgtggtaaa
agtaccaccatccagctgctgcagaggttctacgacccccagaaaggatccatatttatc
gacggtcacgacatccgctccctcaacatccgctacctgagagaaatgatcggagtggtc
agccaggagcccgttcttttcgccaccaccatcaccgagaacatcagatacggccgactg
gacgtgacgcaagaggagatcgaacgagccactaaagagtccaacgcttatgacttcatc
atgaaccttccagacaagtttgagacgctggtgggagatcgagggactcagctgagcgga
ggacagaagcagaggatcgccatcgctcgagctctggtccgcaaccctaaaatcctcctg
ctggacgaggccacgtctgccctcgatgctgagagcgagaccatcgtacaggctgctctg
gacaaggtccgactgggtcgcaccaccatcgtcatcgctcaccgactctcgaccatcaga
aacgccgacatcattgctggattcagtaatggtgaaatcgttgagcaggggactcacagc
cagctgatggagataaagggagtctatcatggcctggtgaccatgcagagctttcagaag
ctggaggatctggaagactcagactacgagccctgggtcgctgagaagagccagctgatc
gaatccttctcccagtcctccctgcagaggaggaggtccactagaggctccttgcttgct
gtctcagaaggaacaaaagaggagaaagaaaaatttgagtgcgatcaggacaacatagag
gaggatgagaacgttcctcccgtgtcgttctttaaagtgatgcgttacaacgtttctgag
tggccgtatattttggtaggaaccatctgcgccatgatcaacggtgcgatgcagccagtg
ttcagcatcatcttcaccgagatcattatgttttggggtttccagggtttctgtttcagt
aaatctggagaaattctgaccctgaacctcagactcaaagccttcatatctatgatgaga
caggacctcagctggtacgacaatcccaaaaacaccgttggcgctctcaccactaggctg
gctgccgacgccgcccacgtacaaggagctgcaggggtgcgcctggctgtaatgacgcag
aacttcgccaacctgggcaccagcatcatcatcagcttcgtgtacggctgggagctgacc
ctgctcatcctggccgtggtgcccatcctggctgtggccggagccgctgaggtcaagctg
ctgacaggacacgccgccgaagacaagaaggagctggagatggccggaaagatcgccaca
gaggccatcgagaatgtgagaactgtggtgtccctcaccagagaaccgacatttgtggct
ttatacgaggaaaatctaactgttccatacaagaactcccagaaaaaggccaaaatttat
ggcttaacctactccttctcacaggccatgatcttctttgtttacgctgcctgtttccgc
tttggagcctggctgatcgaagcaggacggatggatgtggagggagtgttccttgtggtt
atgacaatgctgtacggcgccatggctgtcggcgaggccaacacttatgctcccaacttc
gccaaagccaaaatctcagcctcccacctgacgatgctaataaacagacagccggccata
gataatctgtcagaggaggaagcgagactggagaaatacgacggcaacgttctttttgag
gacgtcaagtttaactacccgtcgcggcccgatgtgcctgtactacaagggctgaatctg
gaggtgcaaaagggagaaactctggccttggtgggcagcagcggttgtggaaagagcacc
accatccagctgctggagaggttttatgaccccagagaggggagagtgttgctggacggt
gtcgatgtgaaacagctgaacgttcactggctgaggtctcagatcggcatcgtctcccag
gagccggtgctgttcgactgctccctggctgagaacatcgcctacggagacaacagtcgc
tccgtgtccatggatgagatagtagctgctgctaaagcagccaacatccacagcttcatc
gaagggctgcctcaggtagcggctgtgaatcaggggaaatggttgattccacatttgatc
gattcccatggagctgcccatgaccatttacaccatatacaaactgtctctgagcagaga
tacgacactcaggctggtgataagggaacacagctgtcagggggccagaagcagcgtgtc
gccatagcccgagccatcatccgcaaccccaaactgttgctcctggacgaggccacgtct
gcgctcgacactgagagtgagaaggtggtgcaggaggcgttggaccaggccaggaagggc
aggacgtgcatcgtcgtagcccaccgtctgtccaccatccagaacgccgactgcatcgct
gtgttccagggaggagtggtggtggaaaaggggacgcaccagcagctgatcgccaagaag
ggagtgtaccacatgctggtcaccaaacagatgggctatcacagtggatga
>YHM2|GENSCAN_predicted_peptide_2|1292_aa
MALKIDTAETNGDLSHDSKDDGAKNEKKKKNKKEKPPQEPMVGPITLFRFADRWDVVLLI
SGTVMAMVNGTVMPLMCIVFGEMTDSFIYADMAQHNASGWNSTTTILNSTLQEDMQRFAI
YYSVLGFVVLLAAYMQVSFWTITAGRQVKRIRSLFFHCIMQQEISWFDVNDTGELNTRLT
EEFPASAFTLCTATLGGVDDLMDVLLFSNGSDVYKIQEGIGDKVGLLIQAYTTFITAFII
GFTTGWKLTLVILAVSPALAISAAFFSKVLASFTSKEQTAYAKAGAVAEEVLSAIRTVFA
FSGQTREIERYHKNLRDAKDVGVKKAISSNIAMGFTFLMIYLSYALAFWYGSTLILNFEY
TIGNLLTVFFVVLIGAFSVGQTSPNIQNFASARGAAYKVYSIIDNKPNIDSFSEDGFKPD
FIKGDIEFKNIHFNYPSRPEVKILNNMSLSVKSGQTIALVGSSGCGKSTTIQLLQRFYDP
EEGAVFIDGHDIRSLNIRYLREMIGVVSQEPVLFATTITENIRYGRLDVTQEEIERATKE
SNAYDFIMNLPDKFETLVGDRGTQLSGGQKQRIAIARALVRNPKILLLDEATSALDAESE
TIVQAALDKVRLGRTTIVVAHRLSTIRNADIIAGFSNGKIVEQGTHSQLMEIKGVYHGLV
TMQTFHNVEEENTAMSELSAGEKSPVEKTVSQSSIIRRKSTRGSSFAASEGTKEEKTEED
EDVPDVSFFKVLHLNIPEWPYILVGLICATINGAMQPVFAILFSKIITVFADPDRDSVRR
KSEFISLMFVVIGCVSFVTMFLQGYCFGKSGEILTLKLRLRAFTAMMRQDLSWYDNPQNT
VGALTTRLAADAAQVQGAAGVRLATIMQNFANLGTSIIIAFVYGWELTLLILAVVPLIAA
AGAAEIKLLAGHAAKDKKELEKAGKIATEAIENVRTVVSLSREPKFECLYEENLRVPYKN
SQKKAHVYGLTYSFSQAMIYFAYAACFRFGAWLIEAGRMDVEGVFLVVSAVLYGAMAVGE
ANTFAPNYAKAKMAASYLMMLINKKPAIDNLSEEGTSPEKYDGNVHFEGVKFNYPSRPDV
TILQGLNLKVKKGETLALVGSSGCGKSTTIQLLERFYDPREGRVSLDGVNVKQLNIHWLR
SQIGIVSQEPVLFDCSLAENIAYGDNSRSVSMDEIRYDTQAGDKGTQLSGGQKQRVAIAR
AIIRNPKLLLLDEATSALDTESEKVVQEALDQARKGRTCIVVAHRLSTIQNADCIAVFQG
GVVVEKGTHQQLIAKKGVYHMLVTKQMGYHND
>YHM2|GENSCAN_predicted_CDS_2|3879_bp
atggccttaaagatcgatacggccgaaacaaacggtgatctgagccatgattccaaggac
gatggtgccaagaatgaaaagaaaaagaagaataaaaaggaaaagccaccacaggagccc
atggtgggccccattactctgttccgatttgcagaccgctgggacgtcgtgctgctcatc
agcgggacagtgatggccatggtcaacggcaccgtgatgcccctcatgtgcattgtcttt
ggagaaatgacggacagttttatatacgctgacatggcccaacacaacgcaagtggctgg
aattctactactactattctgaacagcacgttacaggaggacatgcaaagattcgccatt
tattactccgtcttgggatttgttgtgctgctggccgcctacatgcaggtgtccttctgg
accataacagccgggcgccaggtgaaacgcatccgcagcttgtttttccactgcatcatg
cagcaggagatcagctggtttgacgtgaacgacacaggggagctcaacactcgactgacg
gaagagttcccagcttcagcgttcacgctctgtacggctacgctcggaggtgtagatgat
ctgatggacgtgcttcttttttccaatggcagcgacgtctacaagatccaggagggcatc
ggtgacaaggtggggctgctgatccaggcgtacaccaccttcatcacggccttcatcatc
ggcttcaccacgggctggaaactgacgctggtcatcctggccgtgagccccgcgctggcc
atttcggccgccttcttcagtaaagtgcttgcgtccttcaccagtaaggagcagacggcg
tacgccaaagccggagccgtggcggaggaagtgctgtccgccatcaggaccgtgttcgcc
ttcagtggtcagaccagagagattgagagataccacaagaacctgcgggacgcaaaggac
gtgggagtgaagaaggccatctcctccaacatcgccatgggcttcaccttcctgatgatc
tacctgtcctatgctctggccttctggtacgggagtacgctcatcctgaattttgagtac
accatcggcaatttactgactgtgttttttgtcgtgcttattggagcgttcagcgtcgga
cagacctctccgaacatccagaattttgccagcgcccgaggagccgcctataaagtgtac
agcatcatcgataacaagccaaacattgacagcttttcagaggacggtttcaagccggac
ttcatcaaaggtgacatcgagttcaagaacatccacttcaattacccttcgaggcctgaa
gtcaaaatcttgaacaacatgtctctgagcgtgaagagcggacagaccattgctttggtg
gggagcagcggctgtggcaaaagtaccaccatccagctgctgcagaggttctacgacccc
gaggaaggagctgtatttatcgacggtcacgacatccgctccctcaacatccgctacctg
agagagatgatcggagtggtcagccaggagcccgttcttttcgccaccaccatcaccgag
aacatcagatacggccgactggacgtgacgcaagaggagatcgaacgagccactaaagag
tccaacgcttatgacttcatcatgaaccttccagacaagtttgagacgctggtgggagat
cgagggactcagctgagcggaggacagaagcagaggatcgccatcgctcgagctctggtc
cgcaaccctaaaatcctcctgctggacgaggccacgtctgccctcgatgctgagagcgag
accatcgtacaggctgctctggacaaggtccgactgggtcgcaccaccatcgtggtcgct
caccgactctcgaccatcagaaacgccgacatcattgctggattcagtaatggcaaaatc
gtggagcaggggactcacagccagctgatggagataaagggagtctatcatggcctggtg
accatgcagacgttccacaatgtggaggaggaaaataccgccatgtcggagttatctgct
ggggagaagagccctgtggaaaagaccgtctcccagtcgtccatcatcaggaggaagtcc
accagagggtcctcgtttgccgcgtcagaaggaaccaaagaggaaaagacagaagaggat
gaagacgttcccgacgtgtcgttctttaaagtgctgcatctgaacatccccgagtggccc
tacatccttgtggggctcatctgcgctacgatcaatggagccatgcagccggtcttcgcc
atcctcttctccaagatcatcactgtgtttgcggatccagaccgtgattctgtcaggagg
aagagtgaattcatttctctgatgtttgtcgttattggctgtgtgtcatttgtcaccatg
tttttacagggttactgtttcggtaaatccggagagattctgacgctgaagctgagactc
cgggcgttcacggcgatgatgagacaggacctcagctggtacgacaatccccaaaacacc
gttggcgctctcaccactaggctggctgccgacgccgcccaagtacaaggagctgcaggg
gtgcgcctggcgacaataatgcagaacttcgccaacctgggcaccagcatcatcatcgcc
tttgtttacggctgggagctgaccctgctcatcctggccgtggtgcccctcatcgcggcc
gccggagccgctgagatcaagctgctcgcgggtcacgccgccaaagacaagaaggagctg
gagaaggccggaaagatcgccacagaggccatcgagaacgtcagaaccgtcgtgtccctc
agcagagaaccaaaatttgagtgtttatatgaggagaatctcagagtgccgtacaagaac
tcccagaaaaaggcccacgtgtacggcttaacctactccttctcccaggccatgatctac
tttgcttacgctgcctgtttccgcttcggagcctggctgattgaagcaggacggatggac
gtggagggagtgttcctggtggtttctgcggtgctgtacggcgccatggccgtgggggaa
gctaacacctttgctccgaactacgccaaggccaaaatggctgcttcctacctgatgatg
ctaataaacaagaagcccgccattgataacctctcagaggaggggacgtctccggaaaaa
tacgacggtaatgtgcatttcgagggtgttaaattcaactacccgtcgcggcccgatgtg
accatactccaggggctgaacctgaaggtgaaaaagggagaaactctggccttggtgggc
agcagcggttgtggaaagagcaccaccatccagctgctggagaggttttatgaccccaga
gaggggagagtgtcactggacggtgtcaacgtgaaacagctgaacattcactggctgagg
tctcagatcggcatcgtctcccaggagccggtgctgttcgactgctccctggctgagaac
atcgcctacggagacaacagtcgctccgtgtccatggatgagataagatacgacactcag
gctggtgataagggaacacagctgtcagggggccagaagcagcgtgtcgccatagcccga
gccatcatccgcaaccccaaactgttgctcctggacgaggccacgtctgcgctcgacact
gagagtgagaaggtggtgcaggaggcgttggaccaggccaggaagggcaggacgtgcatc
gtggtagcccaccgtctgtccaccatccagaacgccgactgcatcgctgtgttccaggga
ggagtggtggtggaaaaggggacgcaccagcagctgatcgccaagaagggagtgtaccac
atgctggtcaccaaacagatgggctatcacaacgactga
>YHM2|GENSCAN_predicted_peptide_3|647_aa
MSKQAEFEKIAEDVKKVKTRPTDQELLDLYGLYKQAIVGDVNTDRPGLLDLKGKAKWDAW
ESRKVRPFASEKEFKATEDIVRNFQQGVGKELHQRLLQRAETRRNWMFNTVLSSQLEQWW
LDAAYLEGRSPSQLTVNFAGPAPYLEHCWPPAEGTALERASICSWHMLQYWNLIRTERLA
PQKAGETPLDMDQFRMLYCTCKVPGVTKDAIRSYFKTELEGRCPSHLVVLCRGRIFTFDA
LCDGQILTPPELFRQLSYVRQCCDGNPEGEGVSALTTEERTRWAKAREYLISIDPHNETI
LELIQSSLFTICLDETQPYSTPENYTNLTRESLTGDPTIRWGDKSYNSVVYSDGTFGSNC
DHAPYDAMVLVTMCWYVDQRIQSTGGKWKGVDTVRVLPPPEELVFTVDEKVRSDIGRAKK
QYFESAQDLQVVCYAFTAFGKAAIKQKKLHPDTFIQLAMQLAYFKLHQRPGCCYETAMTR
KFYHGRTETMRPCTVEAVKWCTAMTDPSCEDNAKRKAMQLAFEKHNNLMAEAQEGRGFDR
HLLGLYLIAKEEGRPVPELFLDPLYAKSGGGGNFVLSSSLVGYTTVLGAVAPMVPHGYGF
FYRIREDRIVISISAWKSCRQTDAVSLFNVFSSCLHEMLHLATTSQL
>YHM2|GENSCAN_predicted_CDS_3|1944_bp
atgtccaagcaggcagaatttgagaagattgcagaggatgtgaagaaagtgaagacgagg
ccgacagaccaggagctgctggatctgtatggcctttacaaacaggcaattgttggagac
gtcaatacggacaggccaggacttctggatttaaagggaaaagctaagtgggatgcctgg
gaatccaggaaagttcgtccttttgcgtccgagaaggaattcaaggccacagaggacatt
gtgaggaacttccaacagggtgttggaaaagagctgcaccagaggctcctgcagagagct
gaaaccaggaggaactggatgtttaacactgtgctgtcctcccagctggagcaatggtgg
ttagacgctgcttatctggagggccgcagcccctctcagctgactgtgaacttcgcaggg
ccggcaccttatctggaacactgctggcctcctgccgagggaacagccctggagcgagcc
agtatttgctcgtggcatatgctgcagtactggaatctgatccgcacggagaggttggcg
ccgcagaaagctggcgaaacacctttagatatggaccagttcaggatgctgtactgcacc
tgcaaagtacccggggtgacgaaagacgctattcgtagctactttaaaacagagctcgag
gggaggtgcccttcccatttggtggttctttgtcgtggacgcatcttcacatttgatgcc
ctctgtgatggacaaatactgacgcccccagaactgttcaggcagctgagctacgtgaga
cagtgctgtgatgggaacccagagggggagggagtgagcgctctcactactgaagagagg
acgcgctgggcgaaggcccgagagtatctaataagtattgatccgcacaacgagaccatc
ctggagctcatccagagcagcctgttcaccatatgtctggatgagacgcagccttactcc
actccagagaactacaccaacctcacacgggagtctctcacgggtgatcccaccatccgc
tggggggacaaatcctacaattcagtcgtctattcagatggaacgtttggatccaactgt
gatcacgcgccgtacgacgccatggtgctggtgaccatgtgctggtacgtggaccagcga
attcaaagcaccggaggcaaatggaagggtgtggacacagtcagagtcctgccgcctccc
gaggagctggtgtttactgtggacgaaaaagtccgcagcgacatcggccgtgcgaaaaaa
caatactttgagtcggcgcaggacctgcaggttgtctgttacgccttcacggctttcgga
aaagccgccatcaagcagaaaaagctgcacccggacacgttcatccaactggcgatgcag
ctggcgtactttaaactgcaccagaggccagggtgttgctacgagacagccatgactcgc
aagttctaccacggcaggacggagaccatgaggccctgcaccgtggaggcggtgaaatgg
tgcacggccatgacggacccgtcgtgcgaggacaacgctaagaggaaagccatgcagctg
gcctttgagaaacacaacaacctgatggccgaggcccaggaaggacgaggcttcgacagg
caccttctcggcctgtatctcatcgccaaagaggagggacgtcctgttccggaactcttc
ttagatccgctctatgccaagagtggcggtggcggaaactttgtgctgtcgtccagcctg
gtgggctacaccacagttctgggcgcggtggcgccgatggttccccacggctacggcttc
ttctaccgtatccgagaggacaggattgtgatttccatatcggcctggaagtcctgccgc
cagaccgacgccgtgtccctgttcaacgtcttcagcagctgtctgcacgagatgctgcac
ctggcaacaacgtctcagctctga
>YHM2|GENSCAN_predicted_peptide_4|261_aa
MAAAAKSAKKESKRYIPTKTCFTSPFTPKWSPLPQEDMHFILNTLKENFVSIGLVKKEPK
VFRPWRKKKKQEAAQSQDSDLQVSQDAASQEPPKRGWTDVAARRKLAIGINEVTKALERN
ELKLLLVCKCVKPQHMMEHLITLSTTRDVPACQVPRLSQSVSEPLGLKSVLALGFRQCLP
QERDVFSNVVEAILPKVPPLDVPWLQDTPASIKPDENRGQKRRLETESEEGTPVSSTTLQ
PLKVKKIVPNSARKGKGKKKV
>YHM2|GENSCAN_predicted_CDS_4|786_bp
atggctgctgcagcaaagtctgctaaaaaggaaagcaaaaggtatattcccacaaagacc
tgcttcacttcaccattcacgccaaaatggagcccgctcccgcaagaagacatgcatttc
atcctgaacaccttaaaggaaaactttgtttccatcggacttgtgaaaaaagagccaaag
gtgtttcgtccttggcgtaaaaagaaaaagcaggaagctgctcaatcacaagattcagac
ctccaggtgagccaggatgctgcgagtcaggaacctccaaaacgtggatggacagatgtg
gcagctagaagaaagctggccattggaatcaacgaggtcaccaaagctttggagaggaat
gagctcaaactgctgctagtgtgtaagtgtgtcaagccacaacacatgatggagcacctc
ataacgctgagcacaacgagagacgtccctgcctgccaggtgcctcggctcagccagagt
gtgtcggagcctctggggctaaaaagcgtcctcgccttaggattcagacaatgtctcccc
caagagagggatgtgttcagtaacgtggttgaagccattttacccaaagtgccaccactg
gatgttccctggctccaagatacaccggccagtataaaacctgacgaaaacagaggccag
aagaggaggctcgaaactgagtctgaggaagggacgcctgtctcctccacaactctacaa
cctctcaaagtgaagaaaatagttcccaactctgcgaggaaaggcaaaggaaagaaaaag
gtctaa
>YHM2|GENSCAN_predicted_peptide_5|560_aa
MAEDSESAASQQSLELDDQDTCGIDGDNEEENEHLQGSPGGDLGAKRKKKKQKRKKEKPS
SGGAKSDSASDSQEFKNPTLPIQKLQDIQRAMELLSCQGPAKSIDEAAKHKYQFWDTQPV
PKLNEVVTSHGPIEADKENIRQEPYSLPQGFMWDTLDLGSAEVLKELYTLLNENYVEDDD
NMFRFDYSPNFLKWALRPPGWLPQWHCGVRVSSNKKLVGFISAIPADIRIYDTVKRMVEI
NFLCVHKKLRSKRVAPVLIREITRRVNLEGIFQAVYTAGVVLPKPVSTCRYWHRSLNPRK
LVEVKFSHLSRNMTLQRTMKLYRLPDSTKTPGLRPMERRDIRQVTELLQKFLKRFQLAPS
MTEEEVSHWFLPQDNIIDTYVVEVAGISLKDSDPECLGAGGALTDFASFYTLPSTVMHHP
LHRSLKAAYSFYNVHTQTPLLDLMNDALILAKLKGFDVFNALDLMENKVFLEKLKFGIGD
GNLQYYLYNWKCPSMEPDKPWLPFRSASSFSSRVPQGCYTNTGPKVTTDRGSCHLDMQFA
GRGGDNSSIINILNVKSCYA
>YHM2|GENSCAN_predicted_CDS_5|1683_bp
atggcggaggacagtgaatccgcagctagccagcagagcctggagctggacgaccaggac
acatgcgggatagacggggacaacgaagaggagaatgagcatctgcaagggagtccggga
ggggatttgggggctaaaaggaagaagaagaaacagaagaggaagaaagagaagccgagt
tcggggggagccaagtccgactctgcctctgactcccaggagttcaagaaccctactttg
cccattcagaagctgcaggacattcaacgagccatggagttactctcctgtcagggtcct
gcaaagagcatcgacgaggcggccaagcacaagtaccagttctgggacacgcagcctgta
cccaagttaaacgaggtggtgacgagtcacgggccaatagaggccgacaaagaaaacatt
cgacaggagccatattctttacctcaaggttttatgtgggacacgctggatctgggcagc
gcagaagtgctgaaggagttgtacacgttactgaacgagaactacgtggaggacgacgac
aacatgttcagattcgactattcgccaaactttctcaaatgggctctgcgtccgccgggc
tggctcccccagtggcactgcggcgtgcgagtgtcgtcgaacaagaagctggtgggcttc
atcagcgccattcccgctgacatccgcatctacgacacagtgaagaggatggtggaaatc
aacttcctgtgtgtgcacaagaagctgcgttcgaagcgcgtcgccccggtgctcatcagg
gagatcacgcggagggtgaacctagagggcatatttcaggccgtttacacggcaggagtg
gttctgcccaaacccgtgtccacgtgcaggtactggcaccgttctctgaaccccaggaag
cttgtggaagtgaagttctcccacctgagcagaaacatgaccctgcaacggaccatgaag
ctctacagattaccagacagcacgaagactcccggtctgcggccaatggagaggcgcgac
atccgccaggtcacagagctgctacagaaattcctgaaacgcttccagctcgcaccttcc
atgacggaagaggaggtgtctcactggttcctgccgcaggacaacataattgacacttat
gtagtggaggtagccgggatcagcctgaaggactcagacccagagtgcctcggtgctggg
ggcgcgctgacagactttgctagtttctacactctgccctcgactgtgatgcaccaccct
ctccacaggagcctgaaggccgcctactctttttacaacgttcacacacaaacccctctc
ctggatttgatgaacgacgcactgatcctggccaaactgaaagggttcgatgttttcaac
gccctggatctcatggagaataaagtgttcctggagaagctcaagtttggtataggagat
ggaaatctgcagtattacctctacaactggaaatgtccatctatggagcctgataagccg
tggttgcctttcaggtcggcctcgtccttcagtagcagggttcctcagggctgctacaca
aacactgggccaaaggtcaccacggaccgtggtagttgtcacctggacatgcagttcgct
gggagggggggggacaactcgtcaatcatcaacatcctgaatgtgaagtcatgctatgcc
tga
Explanation
Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon
        Intr = Internal exon
        Term = Terminal exon
        Sngl = Single-exon gene
        Prom = Promoter
        PlyA = poly-A signal
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a codon ending at x is in frame f = x mod 3)
Ph    : net phase of exon (length mod 3)
I/Ac  : initiation signal or acceptor splice site score (x 10)
Do/T  : donor splice site or termination signal score (x 10)
CodRg : coding region score (x 10)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)
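The Len, Ph and Fr definitions can be checked against rows of the table. This is a small sketch with hypothetical helper names. The frame check assumes that for an Init or Sngl exon on the + strand the first codon occupies Begin..Begin+2, so it is only applied to those two rows.

```python
def exon_length(begin, end):
    """Length in bp. Minus-strand rows list Begin > End, so take the absolute span."""
    return abs(end - begin) + 1


def net_phase(length):
    """Ph: exon length mod 3."""
    return length % 3


def frame_of_codon_ending_at(x):
    """Fr: a codon ending at position x is in frame x mod 3."""
    return x % 3


# (Gn.Ex, Type, strand, Begin, End, Len, Fr, Ph) copied from the table.
rows = [
    ("1.01", "Intr", "+", 66, 212, 147, 2, 0),
    ("2.01", "Init", "+", 13741, 13784, 44, 0, 2),
    ("3.18", "Term", "-", 24656, 24536, 121, 1, 1),
    ("4.01", "Sngl", "+", 32379, 33164, 786, 2, 0),
]

checks = []
for name, kind, strand, begin, end, length, fr, ph in rows:
    ok = exon_length(begin, end) == length and net_phase(length) == ph
    # Assumption: the first codon of a + strand Init/Sngl exon ends at Begin + 2.
    if strand == "+" and kind in ("Init", "Sngl"):
        ok = ok and frame_of_codon_ending_at(begin + 2) == fr
    checks.append((name, ok))

print(checks)
```

All four sampled rows are consistent with the definitions. The minus-strand frame is not checked here because the sketch makes no assumption about how codon end positions are counted on the opposite strand.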
Comments
The SCORE of a predicted feature (e.g., exon or splice site) is a
log-odds measure of the quality of the feature based on local sequence
properties. Thus, for example, a predicted donor splice site with
score > 100 is excellent; 50-100 is acceptable; 0-50 is weak; and
below 0 is poor (probably not a real donor site).
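The donor-site bands above can be written as a small lookup. This sketch assumes the thresholds apply to the Do/T value as printed in the table (which is already scaled by 10), and it assigns the boundary values 50 and 100 to "acceptable" and 0 to "weak"; the text does not say which band a boundary value belongs to. The function name is hypothetical. Note that for Term and Sngl rows the Do/T column holds a termination signal score, not a donor score.

```python
def classify_donor_score(score):
    """Map a printed Do/T donor score to the quality bands in the comment.

    Assumption: boundary values 50 and 100 count as "acceptable", 0 as "weak".
    """
    if score > 100:
        return "excellent"
    if score >= 50:
        return "acceptable"
    if score >= 0:
        return "weak"
    return "poor"


# Do/T values of four internal exons copied from the table.
donor_scores = {"1.06": 113, "1.01": 100, "1.15": 12, "1.02": -18}
labels = {exon: classify_donor_score(s) for exon, s in donor_scores.items()}
print(labels)
```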
The PROBABILITY of a predicted exon is the estimated probability under
GENSCAN's model of genomic sequence structure that the exon is correct.
This probability depends in general on global as well as local sequence
properties. This information can be used to assess the reliability of the
predicted exon, e.g., it would be better to design PCR primers based on
a predicted exon with probability > 0.95 than one with lower probability.
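As an illustration of the primer-design advice, the sketch below keeps only exons whose P exceeds 0.95. The function name and the default threshold argument are hypothetical; the (Gn.Ex, P) pairs are copied from the table.

```python
def high_confidence(exons, threshold=0.95):
    """Return the Gn.Ex labels of exons with probability above the threshold."""
    return [name for name, p in exons if p > threshold]


# (Gn.Ex, P) pairs copied from the table.
exons = [
    ("1.01", 0.960),
    ("1.02", 0.070),
    ("1.04", 0.984),
    ("1.06", 0.999),
    ("1.11", 0.649),
]
selected = high_confidence(exons)
print(selected)
```

Exon 1.04 passes this filter with P = 0.984 even though its Tscr is -1.32, which shows that the probability and the score measure different things and are worth inspecting together.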
Last modified: Sun, 12 Sept 1999