CcUC04G061640 (gene) Watermelon (PI 537277) v1

Overview
Name: CcUC04G061640
Type: gene
Organism: Citrullus colocynthis (Watermelon (PI 537277) v1)
Description: general transcription and DNA repair factor IIH subunit TFB1-1-like
Location: CicolChr04: 11752939 .. 11772300 (+)
RNA-Seq Expression: CcUC04G061640
Synteny: CcUC04G061640
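The Location field in the Overview packs the chromosome, start, end, and strand into one string. The sketch below shows one way to split it apart and derive the span of the gene. The helper name `parse_location` is hypothetical, and the length calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re

# Pattern for strings such as "CicolChr04: 11752939 .. 11772300 (+)".
LOCATION_RE = re.compile(r"^(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)$")


def parse_location(text):
    """Split a location string into sequence id, start, end, strand and length.

    Hypothetical helper; assumes 1-based, inclusive coordinates.
    """
    match = LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    seqid, start, end, strand = match.groups()
    start, end = int(start), int(end)
    return {
        "seqid": seqid,
        "start": start,
        "end": end,
        "strand": strand,
        "length": end - start + 1,  # inclusive span on the chromosome
    }


loc = parse_location("CicolChr04: 11752939 .. 11772300 (+)")
print(loc["seqid"], loc["strand"], loc["length"])
```

Under the inclusive-coordinate assumption the span works out to 19362 bp on the plus strand of CicolChr04.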
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

AAGTAATAAATAAATAAGAAAAGGAAATGAGAAAAAACGACTGGGATCGGGAGAGATCCAGAGAGAGAAGAGAAAAAGAGGGAAAATAAAAGAAGAAAGAAAGGAGTGAAATAGATGGAAAAAGAAAGGGAAGCGTAGAGAGAAAAGGCCGTGGAGAATTGGCGGCGACAAAGAGGATTGGCTGAAGGACCTATACGAAGCAAGAATCAGAAAAAGGGAGCTTGCAGAATCATTGATTAAGATCGTGGGTAAGTTTTCTACATAGTCGTTAAGGTTTGCCGTGAAAAGAAAAATCTAGATTGGGTCTTGTGGGTTCTATTCTTTAATTTGAAGGCTCCTGTGGATGTAATTTGGAACAAAAGTAGAAGAAATTGGAGAAGGTACAAGAAAGAAACATTTGGTGAAAATCCTTTAATTCTAGCAAGGAGATGGTTGAAAGGTTTTGCTAGCCACTTAGTTTTGAAAAATATTTTCTAGATTTTGACAATAGATTTCAATAGATTTCTATATTTTGACAACGCATAATGTATCTAGAGAGATGATTTATCATGTTCTTCTCCATCCCCATTTCCTGAGACAAATCGTTTTTTTGTGGCAAGTTGGGGTGTGAGTTGTTTAGTAGGGTCTTTGAGGTGAAAGCGCAATAGAGTTTTTAGAGTGCTTTGAATGTTCTCTGAGCAACATTTGATCACATATTCGATTTAATGTTTCTTTGTGGGCTTTGGTGGCGAAGCTTTTTGTAATTATCCATTACCATTAGGTCTTATTTTGCTTGATTGGAGGCCTTCTCTTTCTGTGGGCTTTTTTTTTTTTTTGTTTTCTTAGTTTTTTCCTTATTATTCATTATTTTTTCTTTTTTTTTTTTTATCTGCTCTTGTTTTGTTTTTTTTTTTTGTTTTCTTAGTTTTTTCCTTATTATTCATTATTTTTTCTTTTTTTTTTTTTATCTGCTCTTGTTTTGTTTTTTTTTTTTGTTTTCTTAGTTTTTTCCTTATTATTCATTATTTTTTCTTTTTTTTTTTTTATCTGCTCTTGTTTTGTTTTTTTTTTTTGTTTTCTTAGTTTTTTCCTTATTATTCATTATTTTTTCTTTTTTTTTTTTTATCTGCTCTTGTTTTGTTTTTTTTTTTTGTTTTCTTAGTTTTTTCCTTATTATTCATTATTTTTTCTTTTTTTTTTTTTATCTGCTCTTGTTTTGTTTTTTTTTTTTGTTTTCTTAGTTTTTTCCTTATTATTCATTATTTTTTCTTTTTTTTTTTTTATCTGCTCTTGTTTTGTTTTTTTTTTTTGTTTTTCTTTTTTTTTTTTTATCTGCCCTTGTATTGTTTCATTTTATTGTTTTCTCAGTGGAAGCCTTATTAATCATTAAAAAAACTATCGATAAACTTCAATATGTTTGGATTTTGGAAATCTTGAGATGATGAGATTTCCAAAGGGGTGATCTTAGGAAAGATCACCTTCTGATGATTAAACAGAGTCAGTATGGTGTGTTTAAGTTTGCTTACACCTTAGTTACGTTGTAGAGTACATTACTTGTTTACCTTTGTGGATTGGTTAACCGTGGTAATGCCAAAGCGATATAAGTTATTATGAGATATGTGTCTCGGTAATTTACAATGTTTATGTTACACTGAATTGCTATTTGTTATGATGCAATGAGCATGTTGTGGTGATATTATGCTTTTGGTGGTATTTTGTTGATGCATGGCTCCTTTAGTAGGGGTTACTTACCAACTATTCACATACTCACCCCTTACCCACCAGCTTTTTACACGTGATGAGGAAAGGAGGGTGACGATCCGGAGACTATTAGCTAGTTGAATTTCTTTTTGTAGTCTGGTTCAGTTTGGATACATTGTTACTGATTCTTTCTTTCATTATGTTATTGTTTCTGAAACAATGGAGCATTCTGAATTTCTTGGTTATCTTATCTATAAGTAAAAAATTTTCATGATCTATCTCAAACCATGGACACAAGGGACCCAACTTACTATGGCCAC
TTTAATGACTATTGAAGCAAGTCTTATAGCTTTCATGATCCATCTTTGGCCACATTTTTGGATTTTTATGATGGTTGTGTTTTTGAATATACGTAAATCTTGGAATTTCTGGAGATTTTGCTGTGAAATTTTGCTTGTCATCTTTCTCTGTCTCTTTTACTGGGATCTGATTGTTTATAGTGTTTTAAATGCCATAAGATGTTTGGTTTCAGTCAATTTCTTGGATCTGAATAATCTTTGCATCATGATGAATAGATTGATTTTCGTATCAATTCTTTCAGACTGATTGTCTCTGAGGGGGAGATGGGAACCAAGTATGTCCATAAGAGTGCCAAGTACAAGACCTCAGTTAAGGATCCTGGCATGCCTGGCGTTTTGGAAATGGTATTGCTCTAAAGTTTATTACTTTTGTGTCTGATTTTCTACCTATTTATCACCTTTGTTTTTGTTCTCTTTCAATTCATTTTTTTATGAGAACAACTTTCATTGAGAAAGAATGAAAAGACATATGCATACAGAAAACCAGATCCTAAAAAAGGAAGCACCCTCTAAACAAAAGTTAGAAAAATTAACAATAGTGCCTAAAGAGAAGTTACAAAAGGTCTTCGAAATCGAAGTCCACAAAGATACTTCTCTCAATTAATTATATATTTCTGATTTATTGTATTTGATGACCTAGTCTACTATTTTCTTAAGAGTATATGTATTTTATTGAAGTGAATTTGTAAGACAATATTCAAATTGTGCAAGTGGTACATGAGTGATTTGTTTTAGAGTTTTTCAGGTTAATTTTTCTAACTTTTGTTTCTGGAACATCTTCTTACGTTGTTTACTAGGAAGCATGGACACATGTTGTCCACAAATATGTTTGTAAATGCTTCAGATTCGTGTTCGACATGGACAATTTTGTGCCCCTTTTTCTTTTCTTTTCTTTTTCTCTCTCTCTCTTTAAAACAAAATTTTCAACTCAAAGAACATGCCAAAGACATGCCATGTGCACATAAGATATTAATTGTTGTTGGGAGAGTTTCATAGTTCCTTTTTTATGGATCTCTCCTACATGTCTGTTTAATTTATTTTAATGTTGTTACTTTTTATGGTTATTTTTGGTTTTTTCTTGCACATTGTATCTTGGAGCATTAGTCTTTTTTCATTTTCTTAAATAATAATAATAATAATAATAATAATAATAATAATAATAATAATAATAAATATTTGATTTCTTCTTATTTTGTTCATGTAGGGCCTCTTTCCAATTTCTTTGGTCAATATTCCCGTTGTGATATTTGTATAAAATGGGATGCATTTATTTTTTCTCACTAGGTTTGAAGTTTATTCACATGGTTTTTCTCTCCCTTCCATTGGAAGTTTGTCTCCTATAGACAGTTTTGTTTCATTATTTTATTGAAAAGTTTGCATCTTTTTCGAAAAAAGAAAAATGAAAGATGCTTATTTCTAATTATAGAAAAAGAAAATAAAACAATATGAAGTTTTGAAATGTAAGTGAGATGAAGGATGATTAGTTTCAGATTGAGTGTTAACCTTAGCTGCTAGTTTTCAAAGTCACACGTATTAACTTAACTAATTGGTGTATCTTTATCACTCACGAGTAGTATTAGATAAGTTTGATTGAGCTGTTCAATAGTTATTTCTATTATTGAGCCCCACAAGGATATGAAAGATTATCCATAATCCCTCCCCCCTCTCCATATTTATGATTTACACAAAAGTCTTCTTGTTCATGTAGAAATATTTTGATGTGCATCTGTGATTGGTTCAACCATTTAAGATCTAATTTAATTGACCACATAATTTGCTCAACTAATTTGGTTTTACTGTTTTCATTCTGTGCAGACAGAGCACAAGTTCGTATTTAGACCCAGTGATCCCACTTCAGCTTCTAAGCTTGATGTAGAGTTTAGATTTATTAAAGGTAACTACTTGATATATTTCTGAGGTTTCTACTTAATATATTTCTGCTACAATATTTATGGTATGTT
CTATATTCCCGTACGTGTAGTATTTGTACTTTTGGAAAATGGCCTTTTCTTTTCATGTTGAGCACCTTCTGTAAGTAGAATTGAACTGTTTGGATGTTTGATAATTAGAACTTTTGAGTAAAGTAGTGCATTGTTGGCACAATCCCAAATGAGGAATTCTTTACTCTTTGCACTCCACTATTCACTTATTATAACTCTGAGTCGCTGACTAACATCATCTGTAACCTCTATTTCCTGTAACCTTATGTTTGAATAAGATCTTGTTATATCTAAACTTCCCGTATTTTTCAGGCCACAAAAACACTAAGGAAGGATCAAATAAACCACCGTGGCTTAATCTCACCAGAGACCAGGTTTCTAATATTTTGATGGACATTATCTATGGATTTGGGTAAAGGAAAAGGGTCCAATTAAAAGGCACCATAAGTTAGATCATTGGAAATATTCCATGATTAAATCTTTGTACATATGATTTTAAAATAGAATATAATATATAAAAATTTCTATGTTCTTAGAAGTCCATTATGTAATTTTCACCCTCAAGCTTATCTTTAAATTCTCTTTCAGTAGTTCTTTATTGAAAATACTTTGTAATTGATGAGGTCTCTTAAGTGATTTCATCAAGGGCCTTTGGATATCAATCTCAAACATCATTTCCACAAAGCATATTGTGAGTCTTGAATTTATCTTATGAGATGACAATTAAAGACATGCTACATTTTTCAGGGTGGAAGTTACATTTTTGAGTTCAAAAATTTCTCCGATCTTCATGTTTGCCGCGAGTTTGTAGGTAAGCCAGGATTCAAAATTGTAAGCAATCTTTCTTTATTAAGCTGTAATGCAATTTACATCATTTATGCTTTTAGTTATATAAGAACTGTAGGTTCATTGATCTGAAAGATGGTGAATAGTTGTGAATTTTCTTGAGGGATTGATAAGGCCCCCTTGTTTGGTCTAATTATTCATGAAAGGCAATTAGAACAGAGGGATGAAGGGTGAATATCATAATCCCTTAACTTTATAAGGGCACCTCATAACCATCATGAGTTGGGCCTAGTGGTAAAAAGGAGACTTTTTTTTTTCCTTCTTTTCTCTTTTCTCTTTTTGATATATGTGAGTGTCTAGGCTAGCTTACGCGCACCTCAACTAATCTCACGGAACAACCTGCCTGACCCTACAACATTTGGGTGTCAAGGAAACCCGTAGGATATTAAATCCTAGGTAGGTGGCCACTATGATTGAACCCATACCCCTTGGCCCTTTGGCCTCTTTAAGATATTCCCATTACCACTAGGCCAACCCATGATGGGTTCAGTCTACGGTGGTCAATCTATGGTGGTAAAAAGGAGACATAGTCTCAATAAATGACTAAGAAGTCATGGGTTCAATCTATGGTTGTCACCTACCTAGGATTTTTAATATCCCACGAGTTTCCTTGATACCGAAATATTGTAGGTTCAGATGGGGTGTCCTATGAGATTAGTCGAGGTGCGCATAAGCTGGTTCAGATACTCATGGATATAAAAAAAATAAAAATAAGGGCACCTTAGACACACTACTTGGTAATAAGTTAGAGGGGAAAGACAAGGAAGCTGGGAGGTTGAATAATATGTTAAGATCAGGTGGGGAATTGCTTTGTTAACGCTTTCTTTTGGCTTATTTGGCCTGAATAAAATTGGCGCATCTTCGGCAACAAAGCGACATATTTCTCCACCTTTTTTGAACATCTAACATACTTTACTTTCTCTGGGTGTAAATTTAACACATTCTTTTGTGATTATAGTCTTACATCTCTTATTTCACAGTGGAACTGTGTCATGTAATCATCATGGCAATTCTTGCCCTTTTGTAATTTCATTTAATCAATGATATCTTCTTTGATGTTTCTCATAGAAAAAAGGTAGGGAATCATTTCGTAGTTTTGTCCAAACTTGATATTTGAAAGCGTAGGAGGATTTTTGAATCTCTTCCTAGGAATTTGTACTATCTAATTGCGATAT
GAGGAACACCGATAACCTATTCTGTTCTCCATTGCAATCGTTCTAATAATTTTCGTAATATTTCTCTATTCTTTATCTGAGGTTGTACTTATACCCTTTTCCTGTATTTGTTGGATTTTCTAGCCTGATGGTTTGGCATATTGTGCCTTCTCCAGGAAGTGCTTTAGCAAAGTCAGGAGAGGCTGCACAAGCTGCTCCCTCTGAGAGGCCTGTGGCGGCATTTCCTCATGAACAACTCAGTAAATCAGAAATGGAACTTCGGATGAGATGTTTGCAAGAGGATAGGTAGACTGATGACCCATATTATTGTAAACAAGAAGTCAACCTTTATCGTTTTGTGCTTAACAAGTCACGTAACATCTCTCAGCTCATCATTTATTTTGCAGTGAACTGCAAAAACTCCATAAACAATTTGTGATTGGTGGTGTGTTGACAGAATCTGAGTTTTGGGCAGCAAGGAAGGTGAGGAGAGGATCTTTCTTCCGTTTATCTATTTTGCATTTGTAGCTGACCCTTTTAGAACTCTGATTGTATATTTAATGTTATCCAGTTCACAAAGTATACTCCGAAGAAATATGACATTTTCTGCAATTAATTAATACAGAAATTACTGGAACGAGACAGCTCCAAAAAATCAAAACAACTGATTGGTTTTAAGAGTTCAATGGTTTTGGATACCAAACCAATGTCTGATGGTCGGGTACAGCATCATTACCTTTTTTCCTACCAAATCAAACTTAGTGTATTAATCATTGATATCTTATTTGTCAAATCCTTTTCCCTGCTTGGTTTAATTCCACTGGATTTAACATTGTTATCTTTTTTCTACTTTCAGACAAACAAGGTTACATTTAATTTGACACCGGAGATCAAATATCAGGCATGATTTGATTCTTTGTTCCTTCCTCACTCCCTCCCCAATTTTTTATATAATCAAGTATAGGTTGTTTCTTAACTTATTTATTTTATTCAAACAACATGTGTGAATGGGGAGAATTGAACTTTTGACCTCTTGATCGACTTGAGATGCGTTAAGGTCGATCTTTTCGTTTCTTTTGTTATTCTATTGAATTTGTAAATTATGTACCTTTAGGCTTTATTATTTATATATATATTGTTATTTTTTATTTTTTTGAAAGGAAGCAAAAACGTCTCATAGTCAACCTCTAAGGTAAGGAAAGTTAAAGGTCTAGAGAAGCCTGAGGGAAGGAAGTAAAAAATTTGTTCGGGAACTTGAGAAGCTTGATTGTTATGTTAACTATGGAGGGAAGAGCTGTCTAGTCTTATCTTATGTTGAAGATTTTATCTTGGAATGTGCAGGCGTTAGGAGGGAAGGTTAAGCAAAGTCTTGTCGAAGAAATTATTACCCAAGAGATTCCGAATTTGTTATTCTCGTTGAAACTAAGCGTCATTTGTATGATTGTCAGCTGATTAAAAGTATTTGGGTAGTAGAAGGAAGGTCGCAAGGTTAGCCTTGATTCGGCGAGCTCCATGGGAGGGATCTTGTTAATGTGGATGGTAGTTTTTTCAGCCCTCATGAAGTGATCAAAGGTTTACTCTTGGTTATTCTTGGTTTCAATCTCCTTTACAAATTAAAAAATGAGGTTGTTTTGGATATCAGGGGTGTATGGCCCCTCTAGGATTGGATGTAGAAAGAATTTTTTGGATGAATTGGGCAATTTGTACGATCTTTGTACCCTATATGATGTGTGGTGGGGGACTTCAACTTTTCCAAATCTTCGACTGAGAAGGATTTTGTAGGTGGAGCTACTAGGTCCATGGAACTCTTCAACGCCTTCATTGAAGAAGGTAATCTTGTTGACCGAAAATTAAATGGCAAATTAACCTTGGCCTGTAGTAGGGTTCACAGGAGAATTGATAGATTTTTGCTGTCTAAGGAGTGGCTGAATTGTGTTGGTGAAGTAGACAAGTGTTAGGTCCGAGAGGGACTTTGGGTGGAATTTAACCTACCTAGGTGACGATGTGACATCTTCTTCCTT
CTTTCATGTGTCCTAATGGGTCACCTTTTCAAAGATAAGAGAAGAACTCTTTAGTTGAACTTCAGGAGGACTTGCTCTTGGAGCCTTTGGTTTGCATGTAATGTGGTTTCAACAAAAGGATGCTTGACACTCTGACCTTTATTGATTCCATCACTTTCTTAACATTTTCATGGTGCAGATTGTCTCCTTTTTGTAGTTACAACTTAACTACGCTCACTACACAATGGACTCTTGTAATTATTACCTCATGATAGGACCATTTGTACTCATTTCTCTACATCAGTGAAATATTTGCTTTTGATAAAATAAATAAATAATTAAACAATTCTTGTACATCCTTGGCACAACATGTACCTATCCTAAGGCTTTTTAGCGTAGCAATTCATCTCCATGGTTAATGGTTTATGGAGCTTTCAATCATATGACAAGTCCCTCCCATCTATTTGACTCGTACACTTCTTCGCTCAGAAGCACGGTCACTTTAGTTCGGCAAGCATGTTTGTGTCTGACACGTGTCGGACACTCGAACACTTGTTAAACACGTATCGGACACTTATTAGTACATCAAAATGTATTAGATATGCATAGAACACTTGTTGAGTAGACTATAAAGGATAGATATATGACAATAATAATTACTTTTGAGCGTGAAGTACATCAAACTAAGTTTTTTAAGCATATAAATGTATCTACTTTGAATTTTCTTTTGGTATAAAAATGATATATATTTAAAAAAAATGTATATTTTAATAAACGTATCCTTCTGTGTTGTGTCCTAGATTTTAAAAATATGGCGTGTCACTATATCTGTGTTGTGTCGTATTCGTGTCTCATATTCGTATCTGTGCTTCTTAGCTTGTTAATACCATCTCAATTTCTTTACTTTCCACCCTTTTATTTTGTCTATCTCACTCTCCTCCTCTCTCTCGTCTCACTCCTCATTGTCTGTACTTTGTCTTCATCTCATGCTTCACTATATTAAATTCTCTCTTTTTGGTATGTTTTATGCCTCACCTCTCTAAGGCTCAGTTAATTTTACCTTTTCTGTGGAAAAATTATTTCTGGTATACTATGTCCCTCACCACACTCAAGTGCACACTTCATTTTTATTTTTCTATCTTGCATTAGGCTAAGGACTAGTGCTTGAGGAAATCCTTTGTTTTAAAAGATAATGGTTTCAACATTGTTACTTCTTCCAGGATTCTTCCTATTTTTATGCGTCATGTATTGGGTTTGTTATCTATTATCAAATAATTTGGTCTTCTTTTCTGTTTTCTTTTGTATAATCTGATATTGGTCCCAACATCCAGACGTTAGTCAAGTTATGCAACCCTTTTATCTACTTGAGTTATAATATGCTAATATAGTTAACCGACTTAGTGGTAAAAAAGAGACAGTCTCAATAAATGACTAAGAGATCATGGGTTCAATCTATGGTGGCCAACCTACCTAGGATTTAATATCCTACGAGTTTCCTTGACACCAAAATGTTGTAGGGTCAGGCAGGTTGTCTTGTGAGATTAGTCGAGGTGCGCGTAAGCTGGCTCAGACATTCACGAAGATAAAAAAAAAAAAATGCTAACATAGTTTCAACATTCTTTTTAATATATTTTTCTTCTTTATCTTGTTCAGATTTTTGCTCTGAAACCAGCTGTTCACCAGGCCTTCCTTAATCATGTTCCCAATAAGGTACTGCCAATACTGAAAAAAGTAACTAAGTTATCCTCTTCTTTCTTGTTGCTTTTAACCATGTCATTGATGAAGGTAATTGAATAAAGCAAATATAGAAAATGTTAATATATTTTATCATTTTCCTTATATTTAATAACAGATGTCAGAGAAAGACTTTTGGACAAAATATTTTAGAGCGGAGTACCTTCATAGTACCAAAAATTCTATTGCAGCTGCAGCAGAGGCTGCTGAAGACGAAGAACTTGCCCTTTTTCTGAAGGACGACGAGATATTGGCTGCTGAAACTCGGAAAAAGGTGGATTTTTGAA
AAATGTCAACCTCTTAGAGTTTTCTCTCCTATTAAAATACTTCGGGTGGTGTTTTTCTGATCACTTTTGTCATCCCATATTAAATTCTATTTATTTTAGACGTACCAAAATTTCAGCAACTTCGATTATCATTTGGTTCATGTTATTTCTAGATTCGGCATGTTGACCCTACATTGGATTTGGAAGCGGATCTAGGAGATGATTACACACACCTTCCAGTATGTTGGCCTTTATTTTGGTGTTTTGTTTGGATTCCTTTCTTGGTTCTTCATGGAGGATTTATCTTTATTTAAACTGTTATTATTCCTTTTGTATTCATTCTTATTTGTTCTTCTATTTCTTTGGTGAAGCTTTGTACATTTGAGGCTTTAGTCTCTTTCATTATTTCAATGAAATTTTCTTGTTTATTGTTTTAGAAAGAATAGATAGGGAAGACCTTTTAGGTAATAGAACTATTTTTTGTTTTAAAGGGTTATATTTAGATTGGTGGCTTTCTTATCATACTGGTTGATGCAAATTTTCTTAAATCTAGGATCATGGAATCTTTCGTGATGGTGGCAAGGAGATAACTGAATCACAAAATGAGCACTATAAAAGGACTTTGTCACAAGACCTTAATCGTCAAGGTGCAGTTGTTCTTGAAGGCAGAACTATAGGTTAGTTTCCAACTTGCGGTAAATTATATTCTTGAGTTGCAGTTATCAGGTAAAGGTCTTAACTACTTTATAGATACTCCTTGTTGAATATATCAAGGTCAAATATACCCTAACCTATTATGGTCTGACTTATATGATAGGTAGATTACTTAGAGTGGAGATTCCAGTTTGTAGCTACTGTTTTACAAATATACCTTGCCTATCATTTTATTATTGTTCTTCCATTATTATTATTATCACCATTTCCTTGTCTGCATCATTTGTCGCTCCTCAAAAATCATTTTGTCGAAGATTTTCTTATGTGGTCTAGGAGCTTTTGTTTCTTTTCCTTTGGGTTCCGACGCTGTTGTTCTGATAGGGAGGCGATAGATGTGGTTGTTCTTCTTTCTTTACTCGAGGTCACCCCTTTAGAAGAGGAAGGAGGGATGTTAGAGTTTGGCGTCCCAATCCTTTGGAAGAGTTCTTGTGCAAGCCTTTCTTTTAGTGCTTGTTTGATCCTTCTCCCTTAGGAGTGTTGGTCTTTACGGTTCGTTGGAGGATTAAGATTCCTAGGAAGGTGAGGTTCTTTACTTGGAAAGTCCTTCATAGCCATGTTAACACAATGAATCAGCTTGCAAGGAGGTTTCCTTCGTTTTGGGCCTTTTTGTTGTATTCTTTGTTAGAAGGTGGAAGAAGGATTGGACCATATTTTTTGGCGCTGTGATTTTGTGAGTAGTGGTTGGGATTTATTCCTCCAAATGTTTGATATGATGTATGTTAGTAAAGAGTTATGTTCCTCCAATCGGCTTCATTCCAAAATGGAAGTTATCAAGAGTTATGGTGGCTGGATAAAAATAAAAAAAATCTTCCATTACCTTTCTGGAAAAAATCAGTTTTTGAAGCCATTGGCAAGCAATTAGGAGGTTTATTAGCCATTTCTTCTCAAACTCTCAATGATCTTGATTATTCTTCAGCATTAATAAAGATTGAGGAAAATTTATGTGGGTTCATTCCAGCTATGAATATTAAAGATATTAATCTTGGTGTTTTCTCTATATATTTTTAGGATTCTGATTTTGAGGAAATCCAAGGGTCAGAAATTTCCAGTGTTGATAATTTCTCAAATTCATTAGATATTTTCCGAGCCAATTTAATTCCCAAAGATGTTGATAGATTCCTGTTTTTTTGGGCAAGTCAAACGGCTCCCCCTTTAATGGAAAGGGCATTTATTGCGAAGATAAAATCTTGGAGACCAATCAACAGACTTTTCTCTCCCCTGAATTCAATATTTTGTATGTCCAATGGTCAGATTCTCTTCCTTCTTTAATGCCGGGTGCAGCAGCCATAATTGCTTTCCTTCATCTG
CTCCCAAGGAGAATTTAATTCCTTCCTTTTGAAAAAAGGCAAAGAGGAATTTGGAAAGTCTCGAAGACTTTTTTCAAAAGCGTTGCCTTCTCCCGAGATGGAAGGAAACTCAAGAGTTTCATTGGAAAATAATTCTCCCCCGCTGGACGAGAACAATCAATTAATGCAACCTTCATTTTCCCCCGGTTTTAATGAAGTAGATTTATCTTAGAAAATTATTGTAAATAGAAAAAATATCAAACTATTTACAAATATAGAAAAATTTTACTGTCTATCAGTGATAGATTGCGATAGACTTTTATTGCTTGAGTGATAAAAGTCTATCCTAATCTATTACAATCGATCACTGATAGATAATGAGGTTTTTCTATATTTGTAAATAGTTTGACTCATTTGCTATATTTGGAAAAAGCATCTTTAGCATTTAATTTGGTGGCAAATAGCCGTAAAATTTCCGTGATGGAGAAGGATAGGTCAAATTCTCAAGCTCTCAAAACTCCTCATCAAAAATTAAAAAGACAACCATGTTGTCACTAGAACGATTATTCCAATTCCAAATGGGAAATTCAATTTAACGAGAGAAGTTCCATCCATTTCAAGTCCAAAAAGTCAAGTTTTAGAGGAGGATTTTGATGACGAATCTTCGGCAAGTGGTAGTAGTGAAGAACCATAAATTTTCGCTCCAAATTCGGCCTCAGAACAGGAAGATATTATTTTCGGGAAAGAGGTTGTAGATCTATTTCAAACTCCTACCTTCAAACAGTCACCATAGATGCTTTCTTGTAAGTTCTCCCTAATTAAGTCTCATGAAATCCCTTCAAAGTTTTCCTTCTTAATTGAAGCTTGCGGTTTGAAGTTCGGAAGGGTGATTCATCACAATTCACTTTAATTTTTCTGAAATGGGATTTTTAGTTTTTTCTATATCCAGGAAAGTTATGTGGTTGTTCTTCTGAAGCTACGGGTTTTGATTAAATTGGTTTGAAGTGTTTGGTTTTGTGGTTGGAGTTTGATTCATGTCCCAGCTAATATGGAGCTTTTCTTCTTCAAAGCTTATTTGGCATCATTCTTCTCTTTTAACTGCTGGTCGCTAGAAGAGTCTTCCTTGTTCTATCAAGCTCAAAGTTCAGGGTTCTTACCTTATCAGTATTCGGGTTGTTGCAGCAATTGTTCTATTCTTCTTTGGGTTTGATTTTTCATCCATCATGCTTGATGGGCTCTTTGTTAGTCTCAAGTCCAACCTGTGCAAGATATTGCGGTGGAGGCTTTTCTTCTATTTGTTTCAGGCTTGTACGAGTTTAAGTCTTCTGAAATTTGGATTTCTCGAGTTTTATGTTTAATTGGACAATTTTTACTTTTTCTTTAGAGTTTAGTCTTTATGATGTAATTTGGTTCTAGCCTTGTCTTGGCATGTTTGTAGTTTTATCTTTGTTTTCTCTCATGTACTTTGAGCTTTAGACTTATTCATCATATCAATGAACAGTCTTGTTTCTGTTTAAAAAAAATGATGTATGTTAGTCACAAAGATGCTAGTGTTATGATCGAGGAGTTCCTCTTCGATTTGCCATTTGGGAAGAAGGGGCGTTTTTTTATAGATATAGGGGGGGTGTAGGACACTGGAGGGGGGGGGGGGGGGTGAAAAGGGTTTATTAGTTAATTGAAACTTTTTTCCTAAGTGGGCCCAATTAAACAATTTAATACACTTTTTGCTAATAAATAATTTAAACAATAATGTTATACAAATAAACTAAACTCAAAATAATCAAAAAATTAATAATAAAACTTCTAAAGAACCTATTTGATTCTAATAACAAGACTGGTAAATGATATAGTTTAAGAACTATCCATCAAAATAAACCAATACAAGCAATTAATTTCAAGCATATAAATTAATTACAAAATTAAATATGAAAGGGACAGAGGAATTGACATTGGAAATTTTATAGTGGTTTTGCACAACCTGATCTGCATCCACTTCCCTAGCTCCTCTTAGATATGTCACT
AGAAACCTTTGACACTTTCCATGGTTTAGAGTCAAATTGCTATAATGTCTTTTTTTGGGGGCAAGATCAAATCTGATCCTTTCCATGATCGAGGATCAAACCGTTACAACATTCTTGTGGTGCGTTCAAGGGTAACCCCTTATAATAAGTTTAGAAATGAAGAACAACTTGACAAACTTTCTGAAAGAGTAGATATACAAATTTTGAGCTCACACCAAATCAATCATGACACTATATAAATCTCTCTCAAGAATAAGATAAAAAAAAAAAAAAAAAAAAAAATGGAAGCTTGGAGAGAGCAACAATGGAAGCTTTGTGTTTTTGGAGAGGGTTGAGAATTATAAAAATTGTGAAAATGTGGAGAAGTTGAAGAGTTTTAAAAGAGAGGAAAGTTGTTGGAAACCACATTAAATTTGGACACTTGGATGTTGTTAAAATTCAGATTTAAAAGAATGGTACTATGCATTAAATATATTTTTAAGAAAAATAAACTAGTTCGTTAGACAAACATATAAAATAATATAATAATATATATATTTTAAATCATTTCTTTAAAAACCCAACAAAAGCTCAAAAGGCCACCAAGTGTCACTCCTCCATTGCTCCAAGTGGCATCTTTCAATTTGTCCACTTTGATTAAAAAAAATTGTGATATCATCCGTTTGAAATTTTTTCGTTAAGCTTGTTTGGACTATATCTTCTTTGTTTGAGCTCGTATTTGAGTAATTTAAATTGCGTTGGAATCATTATTTTGAGCTCTATGCAATGGACACTTCAAAACACTACAATTGTTAGTGAATAAAAGCAAGTTTGTTATCATCGAACTATTAATTAATTTAAATTAATCCATTTGAGATTAAGGGCCAAAAATCTAGTGGGGTGTGTGATTTTGTGATTCTTGTGGGGTGGTTAGGGGTGTGGAAAGGGACCCCAAGGAGATTTGGTCCCTTGCTCCCTTCCATGTCATTTTGTGGGCTTCGATTTTGAAGATCTTTTGTAACTATTCTATAGGCACTATTCTCATAGTTGGAGTCCCTTCTTGTGAGGGAGCTCCCCTATTTTCTGTGGGCTCATTTCTATATTCCCGTGTATTGTTTCATTTTTTCCCTATGAAAGTTCTCATTTAAAAAAAATATATATACCAAAATGCTATTCAGCTAGCCCTTCCAAACTTGAAAATTGCTTGAAAAGCCAAAACTTCGTTAAAGCAAAACAGTTAGTCGAAGTCTCAAGATTACGAGAATTCCCAAAATTAAGTTCAGCCCTAAACAAGCCATCCAGTTAGACTTTAACTGATGTTGAACGTGTTGGGATTGATTTCACAAGAATAATGGGTGACTTATCTGCTCGAATCTTTTTTGACATTTTGAAATAAGGTGTAAAAACTATCTGTGAAATAAACCTTTGAATTATGATCAACATATTCTAGGTGGGCCTAAAAATCCACAAGGTCCATATACTTGCACTAACCACTGATTCTCCATCTGAATCATAAGTAGAACAGAATTGTTATCCTTCGGAGGTGAATGGCTAGACAATTAGGAGGTTCCTTTTGGGACCTTTTTTGCTTTGTACACTTTCTTTGACTCGTAGAACTTTTTTCATCCTTAGAAACTGATTGTTTAATTTTCTGCATTGCTGCTTTACAAAATTGAGAGTAGACATAGACTTGGATGATCCAAGGACAGTAGCAGAAGCACTTGTGCGGTCTAGACATGGTTAGTACGGCTTATAAAGATTTCTTTTTCCACTTCTAATTCTTCCACCAATTGCATCCTGAGCTTGTCACTAACTTTACTTCTATCATGTATTGAGAAATTAATTACGGGAACTGATCTTACAGCGGTAGAAGGAAATGAGAAGCAGATAGCACTTGATAGGATCTCTAGGATGACAGCGATTGAGGATCTTCAAGCACCTCATAGCCATCCTTTTGCTCCTCTTTGTATCAAGGTTTGATTTTTATTGGTGAAAGATGCTTAAGATCTACAATCATTGTA
GCAGAGGACTATTTGATGCCATATTTCCATAATAGGATCCCCGAGATTATTTTGATGCTCAACAAGCAAATGCCATCAAGATATTGGACGATACACGAGCAGGAATGCAACAAACTAAATGCAGTTTGGGTACTACAGAAGCATATTGCTCACTGAGGGAATCCATATCTGAGATCAAATCTTCCGGATTGAAGCTTCCCATAATTAAACCTGAAGTTGCTCTTATGGTAAAAGCACCTGCAGCATTCTCTTGCCAAAAACACGCATTTCTGCTATATATTGACAACTCTTCTTCTGCTACCAAACTTCTAATTATTCTTTTGTAGGTTTATAACGGATTAACTCAAAATATTTCTAGTACTAAATATCAACTGGGGAAAAACCCTCAGGAGAGTATTCTAGAGAGTTTACCAAACCCTACTAAGGAGGAACTTCTACACGTGAGTTGGTGCTGCTCTTGCTGTAATATTTGAATATATTTTGATGGCTGATTTTCAGGAGATGTTAAAATCACTCAATTTACCATCTCAGTAACCTTTGTTGGCCATAATAATCATATTTTGGACTTTTAAGTTGGTTTGCTGCCATACGTTAGTGTTCTCCAGTAACTTGTTTCTTCAGATATTACTGTTTAGCGAGGTATTGTTTTAAGAAAAGCTTATTGTTTCCAGCACTGGATCTCGATTCAAGAATTACTCAGGCATTTTTGGTCGTCTTACCCAATCACTACATTATATCTTTATACTAAAGTAAGTTCTAATTCTTTTTTCGTTAATAGGATGATCCTTCATTTGATATTTCACATTTTCATATTGAATGTAACTCCGTATTATCCAATCACGACATCATATCTTTATTTATTTATTTTTATATTGAATAAGTAGTTCAAGTTCTAGTTCAGTGATACTTCACATTTTCATGAACTTGAAATGATCAAGAATCTGTTTTGTTATAACTTATTACTGCTCATTGTTGTGCAAAAAGTTCACAATATTGAATTTAAGATTTACACTTTTTACAAAGCGTTCTCTCAAAGCTTCTATCTGAATTGGATAAATGATGCGAACCATTTCTCCAGTTAGTAAATGGGCTTATGTATTGTCGGCTTAGGCTCAATGAAGGCATAAGGCAATGGAATAAGAATAATGGTAAAGTTAGAATTTGATCTATAGCTTCTAATTGTCTTGGTTTGCTATTTTTGGCTCATTGTAATTCGAGCTATTGCTCTTTTCATTATATTTCAAAAAAAAAAAAATTGGTAAAATGTATTAATCTCTTAAACATTTTACAGATCCTGGAGGTTTCTAGCATTAGTTTTTGCTTCCAAACAAAAGATAGCTTTTAGGCCTTAATTGTCCCAACTCTTAAGAAGGAAGAGTTGGCAAATAGTCTCAGCTATAGTATCGAGCTTTGGCCCTCCCTATATCTTGCTGTTCCTTTACTTGTAGCATTGGTAAATTGAAAAGCCTCATTTTGGAGCACAATGGTGTTGGTAAACTATTTTGCAAGCTAAAGAAATGGTGATCTTTAACACCCTTTCTTGGGTGGGATACTTCGGGGTGTTTGGGGCACTAAGTTGGTTATTATAGCCTATAGGTTAAGTCTGTTTAGAGTGTAGACTATTTTAGAATGGGACACTGTGACAAAGAGACAGAGAGAGGATGAGAAGGAAATAGTAAATATTGTAACAAAAAGAGATTTGAAATAATAATATTGTAGTTAATTGTAGGTAAACACTGGAACAAAGAGAGATCTAGAATAGTAATATTGTAGTTAATTGTAGGTAAATACTTGGACAAATAGAGATTTGAAATAGTAATACTGTAGTTAGTTATAGGTTATAATAGTCACATCAACTATTTTAATTGGAGTCTAAAACATGGAGTGGATTATAATAGCATACTCCATGCACTCGAAGTTGGGGGTCTCTTAGATTTCGATACTGGAATGGTTGGTGCACGATTGGCCGTGTTGGAATAAAGTTTGAAACTTGGGATGTA
GAATCGCCATGAGAACTGATGGGTTGCCTTGCCTCATCATACGAGGGGTATATTAAGATTGTAACTGGCTGCTAGGATGCCCTAGAGCCACATTTGATTGTTTGAAGAATGTTATCTTTTCCTTCTATCTTCTCTATTCTGGATGAGATTTGACCTGAAAGTCAATGTGATGCACTTTTGAGTGTCACTTATATTAAATGATTCATACTGTGCATATACTGCATTGTTGTATGGATGGCCTGTACTCTTATTGCTTGTTGGTATGTTATTTGCTACGCAGGTGAGCAGATTGAAGGATGCCATGTCAAAAATATATCGACAGTTAGAGGTACAAGACTCTTCCCCCCTTCCCCCAATTTGATATTTTGAACAGAGCATGCAGTAGTGCGGTCTCTCTCTCTCTCTCTCTCTATATATATATATATTTGCCACAATCTGTTGTTGATACGTTAATGTAGTTACTTATGTGGAGTGTTTTGTAAAACATCTTTTTTCTTGCTCATGATTGGATTTTTTTCCAGCTGTCTTTTATATTTATATATACACACATATATATATTTGCATTATTATAGTTTTATGATACCGTTTGTATTAAAAACAGGAGATTAAGGAAACCGTGCTTGCGGATTTCCGCCACCAAGTATCTCTTTTGGTTCGTCCAATGCATCAGGTTTGTAGTGTGTTCTCTTTTGAAATTTACCCTTGTTTGAGGAGGAAAGAGAAAAAAATTCCAGAATTCTGGGAGTAACTATAGAGCTGTTGCGATAATATTGCTCTCAAACTTGTGAAAACCAATGCATCTAAACCTGGTCACGAATGATTTGATGGTATTGTTCAATTGCAACAATTATGTGACTTTAGGGGGATTGCTTGGTAATAACCAATTATGATTTGATGATAGTGAGCATGGTGGATGGAAAGATGACCATATATGTGTGTCTTATTCATGTTTGTAGGCTCTAGATGCTGCATTTCAGCATCATGACGCGGACTTGCAGAAGAGATCAGGAAAGGACCTCTTGCGTTATGCAGAAAAGATCAGCATCATGACACGGATACACTTCGACAATATGAATATAGAACCTCCTCAATTCCCTCAATGAGTTCATCTTCAGATGAAATTTTGCGGTCGACGGCCATCGGCACGTGGTGGACGAGGCTTTATAGAGCATTCAGTGTGCCCAATGCCTAATCTACAGTAATTATTTCTCCATATATAGAATAAACATTTAGAATTGTATTATTACTTTATATATATAATATGTGAGAAGTTACAAAATATTTTATCTTCAGTCACGGCGCCCAAAAATTTGACACCATGTATCAAAGTAATAAATTTCCCAAATAGGGTATTATCCTTAG

mRNA sequence

AAGTAATAAATAAATAAGAAAAGGAAATGAGAAAAAACGACTGGGATCGGGAGAGATCCAGAGAGAGAAGAGAAAAAGAGGGAAAATAAAAGAAGAAAGAAAGGAGTGAAATAGATGGAAAAAGAAAGGGAAGCGTAGAGAGAAAAGGCCGTGGAGAATTGGCGGCGACAAAGAGGATTGGCTGAAGGACCTATACGAAGCAAGAATCAGAAAAAGGGAGCTTGCAGAATCATTGATTAAGATCGTGGACTGATTGTCTCTGAGGGGGAGATGGGAACCAAGTATGTCCATAAGAGTGCCAAGTACAAGACCTCAGTTAAGGATCCTGGCATGCCTGGCGTTTTGGAAATGACAGAGCACAAGTTCGTATTTAGACCCAGTGATCCCACTTCAGCTTCTAAGCTTGATGTAGAGTTTAGATTTATTAAAGGCCACAAAAACACTAAGGAAGGATCAAATAAACCACCGTGGCTTAATCTCACCAGAGACCAGGGTGGAAGTTACATTTTTGAGTTCAAAAATTTCTCCGATCTTCATGTTTGCCGCGAGTTTGTAGGAAGTGCTTTAGCAAAGTCAGGAGAGGCTGCACAAGCTGCTCCCTCTGAGAGGCCTGTGGCGGCATTTCCTCATGAACAACTCAGTAAATCAGAAATGGAACTTCGGATGAGATGTTTGCAAGAGGATAGTGAACTGCAAAAACTCCATAAACAATTTGTGATTGGTGGTGTGTTGACAGAATCTGAGTTTTGGGCAGCAAGGAAGAAATTACTGGAACGAGACAGCTCCAAAAAATCAAAACAACTGATTGGTTTTAAGAGTTCAATGGTTTTGGATACCAAACCAATGTCTGATGGTCGGACAAACAAGGTTACATTTAATTTGACACCGGAGATCAAATATCAGGCATGATTTGATTCTTTGTTCCTTCCTCACTCCCTCCCCAATTTTTTATATAATCAAGTATAGGTTGTTTCTTAACTTATTTATTTTATTCAAACAACATGTGTGAATGGGGAGAATTGAACTTTTGACCTCTTGATCGACTTGAGATGCGTTAAGGTCGATCTTTTCGTTTCTTTTGTTATTCTATTGAATTTGTAAATTATGTACCTTTAGGCTTTATTATTTATATATATATTGTTATTTTTTATTTTTTTGAAAGGAAGCAAAAACGTCTCATAGTCAACCTCTAAGGTAAGGAAAGTTAAAGGTCTAGAGAAGCCTGAGGGAAGGAAGTAAAAAATTTGTTCGGGAACTTGAGAAGCTTGATTGTTATGTTAACTATGGAGGGAAGAGCTGTCTAGTCTTATCTTATGTTGAAGATTTTATCTTGGAATGTGCAGGCGTTAGGAGGGAAGGTTAAGCAAAGTCTTGTCGAAGAAATTATTACCCAAGAGATTCCGAATTTGTTATTCTCGTTGAAACTAAGCGTCATTTGTATGATTGTCAGCTGATTAAAAGTATTTGGGTAGTAGAAGGAAGGTCGCAAGGTTAGCCTTGATTCGGCGAGCTCCATGGGAGGGATCTTGTTAATGTGGATGGTAGTTTTTTCAGCCCTCATGAAGTGATCAAAGGTTTACTCTTGGTTATTCTTGGTTTCAATCTCCTTTACAAATTAAAAAATGAGGTTGTTTTGGATATCAGGGGTGTATGGCCCCTCTAGGATTGGATGTAGAAAGAATTTTTTGGATGAATTGGGCAATTTGTACGATCTTTGTACCCTATATGATGTGTGGTGGGGGACTTCAACTTTTCCAAATCTTCGACTGAGAAGGATTTTGTAGGTGGAGCTACTAGGTCCATGGAACTCTTCAACGCCTTCATTGAAGAAGGTAATCTTGTTGACCGAAAATTAAATGGCAAATTAACCTTGGCCTGTAGTAGGGTTCACAGGAGAATTGATAGATTTTTGCTGTCTAAGGAGTGGCTGAATTGTGTTGGTGAAGTAGACAAGTGTTAGGTCCGAGAGGGACTTTGGGTGGAATTTAACCTACCTAGGT
GACGATGTGACATCTTCTTCCTTCTTTCATGTGTCCTAATGGGTCACCTTTTCAAAGATAAGAGAAGAACTCTTTAGTTGAACTTCAGGAGGACTTGCTCTTGGAGCCTTTGGTTTGCATGTAATGTGGTTTCAACAAAAGGATGCTTGACACTCTGACCTTTATTGATTCCATCACTTTCTTAACATTTTCATGGTGCAGATTGTCTCCTTTTTGTAGTTACAACTTAACTACGCTCACTACACAATGGACTCTTGTAATTATTACCTCATGATAGGACCATTTGTACTCATTTCTCTACATCAGTGAAATATTTGCTTTTGATAAAATAAATAAATAATTAAACAATTCTTGTACATCCTTGGCACAACATGTACCTATCCTAAGGCTTTTTAGCGTAGCAATTCATCTCCATGGTTAATGGTTTATGGAGCTTTCAATCATATGACAAGTCCCTCCCATCTATTTGACTCGTACACTTCTTCGCTCAGAAGCACGGTCACTTTAGTTCGGCAAGCATGTTTGTGTCTGACACGTGTCGGACACTCGAACACTTGTTAAACACGTATCGGACACTTATTAGTACATCAAAATGTATTAGATATGCATAGAACACTTGTTGAGTAGACTATAAAGGATAGATATATGACAATAATAATTACTTTTGAGCGTGAAGTACATCAAACTAAGTTTTTTAAGCATATAAATGTATCTACTTTGAATTTTCTTTTGGTATAAAAATGATATATATTTAAAAAAAATGTATATTTTAATAAACGTATCCTTCTGTGTTGTGTCCTAGATTTTAAAAATATGGCGTGTCACTATATCTGTGTTGTGTCGTATTCGTGTCTCATATTCGTATCTGTGCTTCTTAGCTTGTTAATACCATCTCAATTTCTTTACTTTCCACCCTTTTATTTTGTCTATCTCACTCTCCTCCTCTCTCTCGTCTCACTCCTCATTGTCTGTACTTTGTCTTCATCTCATGCTTCACTATATTAAATTCTCTCTTTTTGGTATGTTTTATGCCTCACCTCTCTAAGGCTCAGTTAATTTTACCTTTTCTGTGGAAAAATTATTTCTGGTATACTATGTCCCTCACCACACTCAAGTGCACACTTCATTTTTATTTTTCTATCTTGCATTAGGCTAAGGACTAGTGCTTGAGGAAATCCTTTGTTTTAAAAGATAATGGTTTCAACATTGTTACTTCTTCCAGGATTCTTCCTATTTTTATGCGTCATGTATTGGGTTTGTTATCTATTATCAAATAATTTGGTCTTCTTTTCTGTTTTCTTTTGTATAATCTGATATTGGTCCCAACATCCAGACGTTAGTCAAGTTATGCAACCCTTTTATCTACTTGAGTTATAATATGCTAATATAGTTAACCGACTTAGTGGTAAAAAAGAGACAGTCTCAATAAATGACTAAGAGATCATGGGTTCAATCTATGGTGGCCAACCTACCTAGGATTTAATATCCTACGAGTTTCCTTGACACCAAAATGTTGTAGGGTCAGGCAGGTTGTCTTGTGAGATTAGTCGAGATTTTTGCTCTGAAACCAGCTGTTCACCAGGCCTTCCTTAATCATGTTCCCAATAAGATGTCAGAGAAAGACTTTTGGACAAAATATTTTAGAGCGGAGTACCTTCATAGTACCAAAAATTCTATTGCAGCTGCAGCAGAGGCTGCTGAAGACGAAGAACTTGCCCTTTTTCTGAAGGACGACGAGATATTGGCTGCTGAAACTCGGAAAAAGATTCGGCATGTTGACCCTACATTGGATTTGGAAGCGGATCTAGGAGATGATTACACACACCTTCCAGATCATGGAATCTTTCGTGATGGTGGCAAGGAGATAACTGAATCACAAAATGAGCACTATAAAAGGACTTTGTCACAAGACCTTAATCGTCAAGGTGCAGTTGTTCTTGAAGGCAGAACTATAGACATAGACTTGGATGATCCAAGGACAGTAGCAGAAGCACTTGTGC
GGTCTAGACATGCGGTAGAAGGAAATGAGAAGCAGATAGCACTTGATAGGATCTCTAGGATGACAGCGATTGAGGATCTTCAAGCACCTCATAGCCATCCTTTTGCTCCTCTTTGTATCAAGGATCCCCGAGATTATTTTGATGCTCAACAAGCAAATGCCATCAAGATATTGGACGATACACGAGCAGGAATGCAACAAACTAAATGCAGTTTGGGTACTACAGAAGCATATTGCTCACTGAGGGAATCCATATCTGAGATCAAATCTTCCGGATTGAAGCTTCCCATAATTAAACCTGAAGTTGCTCTTATGGTTTATAACGGATTAACTCAAAATATTTCTAGTACTAAATATCAACTGGGGAAAAACCCTCAGGAGAGTATTCTAGAGAGTTTACCAAACCCTACTAAGGAGGAACTTCTACACCACTGGATCTCGATTCAAGAATTACTCAGGCATTTTTGGTCGTCTTACCCAATCACTACATTATATCTTTATACTAAAGTGAGCAGATTGAAGGATGCCATGTCAAAAATATATCGACAGTTAGAGGAGATTAAGGAAACCGTGCTTGCGGATTTCCGCCACCAAGTATCTCTTTTGGTTCGTCCAATGCATCAGGCTCTAGATGCTGCATTTCAGCATCATGACGCGGACTTGCAGAAGAGATCAGGAAAGGACCTCTTGCGTTATGCAGAAAAGATCAGCATCATGACACGGATACACTTCGACAATATGAATATAGAACCTCCTCAATTCCCTCAATGAGTTCATCTTCAGATGAAATTTTGCGGTCGACGGCCATCGGCACGTGGTGGACGAGGCTTTATAGAGCATTCAGTGTGCCCAATGCCTAATCTACAGTAATTATTTCTCCATATATAGAATAAACATTTAGAATTGTATTATTACTTTATATATATAATATGTGAGAAGTTACAAAATATTTTATCTTCAGTCACGGCGCCCAAAAATTTGACACCATGTATCAAAGTAATAAATTTCCCAAATAGGGTATTATCCTTAG

Coding sequence (CDS)

ATGGGAACCAAGTATGTCCATAAGAGTGCCAAGTACAAGACCTCAGTTAAGGATCCTGGCATGCCTGGCGTTTTGGAAATGACAGAGCACAAGTTCGTATTTAGACCCAGTGATCCCACTTCAGCTTCTAAGCTTGATGTAGAGTTTAGATTTATTAAAGGCCACAAAAACACTAAGGAAGGATCAAATAAACCACCGTGGCTTAATCTCACCAGAGACCAGGGTGGAAGTTACATTTTTGAGTTCAAAAATTTCTCCGATCTTCATGTTTGCCGCGAGTTTGTAGGAAGTGCTTTAGCAAAGTCAGGAGAGGCTGCACAAGCTGCTCCCTCTGAGAGGCCTGTGGCGGCATTTCCTCATGAACAACTCAGTAAATCAGAAATGGAACTTCGGATGAGATGTTTGCAAGAGGATAGTGAACTGCAAAAACTCCATAAACAATTTGTGATTGGTGGTGTGTTGACAGAATCTGAGTTTTGGGCAGCAAGGAAGAAATTACTGGAACGAGACAGCTCCAAAAAATCAAAACAACTGATTGGTTTTAAGAGTTCAATGGTTTTGGATACCAAACCAATGTCTGATGGTCGGACAAACAAGGTTACATTTAATTTGACACCGGAGATCAAATATCAGGCATGA

Protein sequence

MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKEGSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPHEQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIGFKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQA
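The protein sequence above should be the conceptual translation of the coding sequence (CDS). As a minimal sketch of how that can be checked, the code below translates DNA with the standard genetic code, which is assumed here to apply to this nuclear gene. The function name `translate` is hypothetical, and only the first and last few codons of the CDS are used as input to keep the example short.

```python
# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon.

    Hypothetical helper; raises ValueError on ambiguous bases or a length
    that is not a multiple of three.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for pos in range(0, len(cds), 3):
        codon = cds[pos:pos + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"unsupported codon: {codon}")
        residue = CODON_TABLE[codon]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


# First ten codons and last four codons of the CDS shown in this record.
print(translate("ATGGGAACCAAGTATGTCCATAAGAGTGCC"))
print(translate("TATCAGGCATGA"))
```

Passing the full CDS string to `translate` and comparing the result with the listed protein sequence is the complete version of this check.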
Homology
BLAST of CcUC04G061640 vs. NCBI nr
Match: XP_008457278.1 (PREDICTED: probable RNA polymerase II transcription factor B subunit 1-1 isoform X1 [Cucumis melo])

HSP 1 Score: 407.1 bits (1045), Expect = 9.0e-110
Identity = 206/211 (97.63%), Positives = 207/211 (98.10%), Query Frame = 0

Query: 1   MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60
           MGTKYVHKSAKYKTSVKDPG PGVLEMTE KFVFRPSDPTSASKLDVEFRFIKGHKNTKE
Sbjct: 1   MGTKYVHKSAKYKTSVKDPGTPGVLEMTECKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60

Query: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPH 120
           GSNKPPWLNLT+DQGGSYIFEFKNFSDLHVCREFVGSALAK GEAAQ APSERPVAAFPH
Sbjct: 61  GSNKPPWLNLTKDQGGSYIFEFKNFSDLHVCREFVGSALAKLGEAAQ-APSERPVAAFPH 120

Query: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180
           EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG
Sbjct: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180

Query: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ
Sbjct: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 210
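The identity and positives percentages in a BLAST hit are the match counts divided by the alignment length. The sketch below recomputes the percentages reported for this hit and counts identical residues in the first alignment block. Both function names are hypothetical, and the per-block count is only an illustration, not how BLAST itself scores an alignment.

```python
def percent(matches, aligned_length):
    """Return matches / aligned_length as a percentage, rounded to 2 places."""
    return round(100.0 * matches / aligned_length, 2)


def count_identities(query, subject):
    """Count columns where two aligned strings carry the same residue.

    Hypothetical helper; gap characters ('-') never count as identities.
    """
    if len(query) != len(subject):
        raise ValueError("aligned strings must have the same length")
    return sum(1 for q, s in zip(query, subject) if q == s and q != "-")


# Figures reported for the hit against XP_008457278.1.
print(percent(206, 211))  # identity
print(percent(207, 211))  # positives

# First alignment block of the same hit (query and subject residues 1 to 60).
query_block = "MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKE"
sbjct_block = "MGTKYVHKSAKYKTSVKDPGTPGVLEMTECKFVFRPSDPTSASKLDVEFRFIKGHKNTKE"
print(count_identities(query_block, sbjct_block), "of", len(query_block))
```

The two mismatched columns in this block correspond to the blanks in the midline of the alignment.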

BLAST of CcUC04G061640 vs. NCBI nr
Match: XP_038894194.1 (general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X2 [Benincasa hispida])

HSP 1 Score: 404.1 bits (1037), Expect = 7.7e-109
Identity = 203/211 (96.21%), Positives = 206/211 (97.63%), Query Frame = 0

Query: 1   MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60
           MGTKYVHKSAKYKTS+KDPG PGVLEMTE KF+FRPSDPTSASKLDVEFRFIKGHKNTKE
Sbjct: 1   MGTKYVHKSAKYKTSIKDPGTPGVLEMTEWKFIFRPSDPTSASKLDVEFRFIKGHKNTKE 60

Query: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPH 120
           GSNKPPWLNLTRDQGGS IFEFKNFSDLHVCREFVGSALAKSGEAAQ AP ERPVAAFPH
Sbjct: 61  GSNKPPWLNLTRDQGGSIIFEFKNFSDLHVCREFVGSALAKSGEAAQ-APPERPVAAFPH 120

Query: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180
           EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERD+SKKSKQLIG
Sbjct: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDNSKKSKQLIG 180

Query: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ
Sbjct: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 210

BLAST of CcUC04G061640 vs. NCBI nr
Match: XP_038894195.1 (general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X3 [Benincasa hispida])

HSP 1 Score: 404.1 bits (1037), Expect = 7.7e-109
Identity = 203/211 (96.21%), Positives = 206/211 (97.63%), Query Frame = 0

Query: 1   MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60
           MGTKYVHKSAKYKTS+KDPG PGVLEMTE KF+FRPSDPTSASKLDVEFRFIKGHKNTKE
Sbjct: 1   MGTKYVHKSAKYKTSIKDPGTPGVLEMTEWKFIFRPSDPTSASKLDVEFRFIKGHKNTKE 60

Query: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPH 120
           GSNKPPWLNLTRDQGGS IFEFKNFSDLHVCREFVGSALAKSGEAAQ AP ERPVAAFPH
Sbjct: 61  GSNKPPWLNLTRDQGGSIIFEFKNFSDLHVCREFVGSALAKSGEAAQ-APPERPVAAFPH 120

Query: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180
           EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERD+SKKSKQLIG
Sbjct: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDNSKKSKQLIG 180

Query: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ
Sbjct: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 210

BLAST of CcUC04G061640 vs. NCBI nr
Match: XP_038894178.1 (general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894179.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894180.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894181.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894182.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894183.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894184.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894186.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894187.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894188.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894189.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894190.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894191.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894192.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida] >XP_038894193.1 general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [Benincasa hispida])

HSP 1 Score: 404.1 bits (1037), Expect = 7.7e-109
Identity = 203/211 (96.21%), Positives = 206/211 (97.63%), Query Frame = 0

Query: 1   MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60
           MGTKYVHKSAKYKTS+KDPG PGVLEMTE KF+FRPSDPTSASKLDVEFRFIKGHKNTKE
Sbjct: 1   MGTKYVHKSAKYKTSIKDPGTPGVLEMTEWKFIFRPSDPTSASKLDVEFRFIKGHKNTKE 60

Query: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPH 120
           GSNKPPWLNLTRDQGGS IFEFKNFSDLHVCREFVGSALAKSGEAAQ AP ERPVAAFPH
Sbjct: 61  GSNKPPWLNLTRDQGGSIIFEFKNFSDLHVCREFVGSALAKSGEAAQ-APPERPVAAFPH 120

Query: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180
           EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERD+SKKSKQLIG
Sbjct: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDNSKKSKQLIG 180

Query: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ
Sbjct: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 210

BLAST of CcUC04G061640 vs. NCBI nr
Match: XP_038894196.1 (general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X4 [Benincasa hispida])

HSP 1 Score: 404.1 bits (1037), Expect = 7.7e-109
Identity = 203/211 (96.21%), Positives = 206/211 (97.63%), Query Frame = 0

Query: 1   MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60
           MGTKYVHKSAKYKTS+KDPG PGVLEMTE KF+FRPSDPTSASKLDVEFRFIKGHKNTKE
Sbjct: 1   MGTKYVHKSAKYKTSIKDPGTPGVLEMTEWKFIFRPSDPTSASKLDVEFRFIKGHKNTKE 60

Query: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPH 120
           GSNKPPWLNLTRDQGGS IFEFKNFSDLHVCREFVGSALAKSGEAAQ AP ERPVAAFPH
Sbjct: 61  GSNKPPWLNLTRDQGGSIIFEFKNFSDLHVCREFVGSALAKSGEAAQ-APPERPVAAFPH 120

Query: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180
           EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERD+SKKSKQLIG
Sbjct: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDNSKKSKQLIG 180

Query: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ
Sbjct: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 210

BLAST of CcUC04G061640 vs. ExPASy Swiss-Prot
Match: Q3ECP0 (General transcription and DNA repair factor IIH subunit TFB1-1 OS=Arabidopsis thaliana OX=3702 GN=TFB1-1 PE=2 SV=1)

HSP 1 Score: 223.4 bits (568), Expect = 2.4e-57
Identity = 118/206 (57.28%), Positives = 146/206 (70.87%), Query Frame = 0

Query: 6   VHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKEGSNKP 65
           + K  KYK++VKDPG PG L + E   +F P+DP S SKL V  + IK  K TKEGSNKP
Sbjct: 6   IEKLVKYKSTVKDPGTPGFLRIREGMLLFVPNDPKSDSKLKVLTQNIKSQKYTKEGSNKP 65

Query: 66  PWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPHEQLSK 125
           PWLNLT  Q  S+IFEF+N+ D+H CR+F+  ALAK     +  P+ + V +   EQLS 
Sbjct: 66  PWLNLTNKQAKSHIFEFENYPDMHACRDFITKALAK----CELEPN-KSVVSTSSEQLSI 125

Query: 126 SEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIGFKSSM 185
            E+ELR + L+E+SELQ+LHKQFV   VLTE EFWA RKKLL +DS +KSKQ +G KS M
Sbjct: 126 KELELRFKLLRENSELQRLHKQFVESKVLTEDEFWATRKKLLGKDSIRKSKQQLGLKSMM 185

Query: 186 VLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           V   KP +DGRTN+VTFNLTPEI +Q
Sbjct: 186 VSGIKPSTDGRTNRVTFNLTPEIIFQ 206

BLAST of CcUC04G061640 vs. ExPASy Swiss-Prot
Match: Q9M322 (General transcription and DNA repair factor IIH subunit TFB1-3 OS=Arabidopsis thaliana OX=3702 GN=TFB1-3 PE=2 SV=2)

HSP 1 Score: 221.1 bits (562), Expect = 1.2e-56
Identity = 118/206 (57.28%), Positives = 143/206 (69.42%), Query Frame = 0

Query: 6   VHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKEGSNKP 65
           + K  KYK+ VKDPG  G LE++E   +F P+DP S  KL V+   IK  K TKEGSNKP
Sbjct: 1   MEKRVKYKSFVKDPGTLGSLELSEVMLLFVPNDPKSDLKLKVQTHNIKSQKYTKEGSNKP 60

Query: 66  PWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPHEQLSK 125
           PWLNLT  QG S+IFEF+N+ D+H CR+F+  ALAK  E        + V   P EQLS 
Sbjct: 61  PWLNLTSKQGRSHIFEFENYPDMHACRDFITKALAKCEE-----EPNKLVVLTPAEQLSM 120

Query: 126 SEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIGFKSSM 185
           +E ELR + L+E+SELQKLHKQFV   VLTE EFW+ RKKLL +DS +KSKQ +G KS M
Sbjct: 121 AEFELRFKLLRENSELQKLHKQFVESKVLTEDEFWSTRKKLLGKDSIRKSKQQMGLKSMM 180

Query: 186 VLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           V   KP +DGRTN+VTFNLT EI +Q
Sbjct: 181 VSGIKPSTDGRTNRVTFNLTSEIIFQ 201

BLAST of CcUC04G061640 vs. ExPASy Swiss-Prot
Match: Q55FP1 (General transcription factor IIH subunit 1 OS=Dictyostelium discoideum OX=44689 GN=gtf2h1 PE=3 SV=1)

HSP 1 Score: 59.3 bits (142), Expect = 6.1e-08
Identity = 32/90 (35.56%), Positives = 58/90 (64.44%), Query Frame = 0

Query: 123 LSKSEMELRMRCLQEDSELQKLHKQFV-IGGVLTESEFWAARKKLLERDSSKKSKQLIGF 182
           LS+ +++ R+  LQ + EL++L++Q V    V++ES+FW +RK +L+ DS++  KQ  G 
Sbjct: 182 LSEQQIKQRVILLQSNKELRELYEQMVNKDRVISESDFWESRKSMLKNDSTRSEKQHTGM 241

Query: 183 KSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 212
            S+++ D +P S+   N V +  TP + +Q
Sbjct: 242 PSNLLADVRPSSE-TPNAVHYRFTPTVIHQ 270

BLAST of CcUC04G061640 vs. ExPASy Swiss-Prot
Match: Q9DBA9 (General transcription factor IIH subunit 1 OS=Mus musculus OX=10090 GN=Gtf2h1 PE=1 SV=2)

HSP 1 Score: 56.2 bits (134), Expect = 5.1e-07
Identity = 51/158 (32.28%), Positives = 74/158 (46.84%), Query Frame = 0

Query: 52  IKGHKNTKEGSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPS 111
           IK  K + EG  K   L L    G +  F F N S     R+ V   L       Q  P 
Sbjct: 50  IKCQKISPEGKAKIQ-LQLVLHAGDTTNFHFSNESTAVKERDAVKDLL------QQLLPK 109

Query: 112 ERPVAAFPHEQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDS 171
                    ++ +  E+E + R LQED  L +L+K  V+  V++  EFWA R  +   DS
Sbjct: 110 --------FKRKANKELEEKNRMLQEDPVLFQLYKDLVVSQVISAEEFWANRLNVNATDS 169

Query: 172 SKKS-KQLIGFKSSMVLDTKPMSDGRTNKVTFNLTPEI 209
           S  S KQ +G  ++ + D +P +DG  N + +NLT +I
Sbjct: 170 STSSHKQDVGISAAFLADVRPQTDG-CNGLRYNLTSDI 191

BLAST of CcUC04G061640 vs. ExPASy Swiss-Prot
Match: P32780 (General transcription factor IIH subunit 1 OS=Homo sapiens OX=9606 GN=GTF2H1 PE=1 SV=1)

HSP 1 Score: 55.8 bits (133), Expect = 6.7e-07
Identity = 51/159 (32.08%), Positives = 74/159 (46.54%), Query Frame = 0

Query: 52  IKGHKNTKEGSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPS 111
           IK  K + EG  K   L L    G +  F F N S     R+ V   L       Q  P 
Sbjct: 50  IKCQKISPEGKAKIQ-LQLVLHAGDTTNFHFSNESTAVKERDAVKDLL------QQLLPK 109

Query: 112 ERPVAAFPHEQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDS 171
                    ++ +  E+E + R LQED  L +L+K  V+  V++  EFWA R  +   DS
Sbjct: 110 --------FKRKANKELEEKNRMLQEDPVLFQLYKDLVVSQVISAEEFWANRLNVNATDS 169

Query: 172 SKKS--KQLIGFKSSMVLDTKPMSDGRTNKVTFNLTPEI 209
           S  S  KQ +G  ++ + D +P +DG  N + +NLT +I
Sbjct: 170 SSTSNHKQDVGISAAFLADVRPQTDG-CNGLRYNLTSDI 192

BLAST of CcUC04G061640 vs. ExPASy TrEMBL
Match: A0A1S3C6E8 (probable RNA polymerase II transcription factor B subunit 1-1 isoform X1 OS=Cucumis melo OX=3656 GN=LOC103497008 PE=4 SV=1)

HSP 1 Score: 407.1 bits (1045), Expect = 4.4e-110
Identity = 206/211 (97.63%), Positives = 207/211 (98.10%), Query Frame = 0

Query: 1   MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60
           MGTKYVHKSAKYKTSVKDPG PGVLEMTE KFVFRPSDPTSASKLDVEFRFIKGHKNTKE
Sbjct: 1   MGTKYVHKSAKYKTSVKDPGTPGVLEMTECKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60

Query: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPH 120
           GSNKPPWLNLT+DQGGSYIFEFKNFSDLHVCREFVGSALAK GEAAQ APSERPVAAFPH
Sbjct: 61  GSNKPPWLNLTKDQGGSYIFEFKNFSDLHVCREFVGSALAKLGEAAQ-APSERPVAAFPH 120

Query: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180
           EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG
Sbjct: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180

Query: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ
Sbjct: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 210

BLAST of CcUC04G061640 vs. ExPASy TrEMBL
Match: A0A0A0KYD2 (PH_TFIIH domain-containing protein OS=Cucumis sativus OX=3659 GN=Csa_4G338430 PE=4 SV=1)

HSP 1 Score: 401.7 bits (1031), Expect = 1.8e-108
Identity = 203/212 (95.75%), Positives = 206/212 (97.17%), Query Frame = 0

Query: 1   MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60
           MGTKYVHKSAKYKTSVKDPG PGVLEMTE KFVFRPSDPTSASKLDVEFRFIKGHKNTKE
Sbjct: 93  MGTKYVHKSAKYKTSVKDPGTPGVLEMTECKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 152

Query: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPH 120
           GSNKPPWLNLT+DQGGSYIFEFKNFSDLHVCRE VGSALAK GEAAQ APSERPVAAFPH
Sbjct: 153 GSNKPPWLNLTKDQGGSYIFEFKNFSDLHVCRELVGSALAKLGEAAQ-APSERPVAAFPH 212

Query: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180
           EQLSK EMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLE+D+SKKSKQLIG
Sbjct: 213 EQLSKLEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLEQDNSKKSKQLIG 272

Query: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQA 213
           FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQA
Sbjct: 273 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQA 303

BLAST of CcUC04G061640 vs. ExPASy TrEMBL
Match: A0A6J1IB04 (probable RNA polymerase II transcription factor B subunit 1-1 OS=Cucurbita maxima OX=3661 GN=LOC111470852 PE=3 SV=1)

HSP 1 Score: 395.2 bits (1014), Expect = 1.7e-106
Identity = 200/211 (94.79%), Positives = 202/211 (95.73%), Query Frame = 0

Query: 1   MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60
           MGTKYV KSAKYKTSVKDPG PGVLEMTE KFVFRPSDPTSASKLDVEFRFIKGHKNTKE
Sbjct: 1   MGTKYVQKSAKYKTSVKDPGTPGVLEMTERKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60

Query: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPH 120
           GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEA Q APSE+ VA FPH
Sbjct: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAPQ-APSEKLVATFPH 120

Query: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180
           EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERD S KSKQL+G
Sbjct: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDYSTKSKQLVG 180

Query: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ
Sbjct: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 210

BLAST of CcUC04G061640 vs. ExPASy TrEMBL
Match: A0A6J1EXP7 (probable RNA polymerase II transcription factor B subunit 1-1 OS=Cucurbita moschata OX=3662 GN=LOC111439158 PE=3 SV=1)

HSP 1 Score: 395.2 bits (1014), Expect = 1.7e-106
Identity = 200/211 (94.79%), Positives = 202/211 (95.73%), Query Frame = 0

Query: 1   MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60
           MGTKYV KSAKYKTSVKDPG PGVLEMTE KFVFRPSDPTSASKLDVEFRFIKGHKNTKE
Sbjct: 1   MGTKYVQKSAKYKTSVKDPGTPGVLEMTERKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60

Query: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPH 120
           GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEA Q APSE+ VA FPH
Sbjct: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAPQ-APSEKLVATFPH 120

Query: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180
           EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERD S KSKQL+G
Sbjct: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDYSTKSKQLVG 180

Query: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ
Sbjct: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 210

BLAST of CcUC04G061640 vs. ExPASy TrEMBL
Match: A0A6J1DNY2 (probable RNA polymerase II transcription factor B subunit 1-1 isoform X2 OS=Momordica charantia OX=3673 GN=LOC111022945 PE=3 SV=1)

HSP 1 Score: 390.2 bits (1001), Expect = 5.5e-105
Identity = 195/211 (92.42%), Positives = 203/211 (96.21%), Query Frame = 0

Query: 1   MGTKYVHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKE 60
           MGTKYV KSAKYKTSVKDPG PGVLEMTE KFVF+PSDPTSASKLDVEFR+IKGHKNTKE
Sbjct: 1   MGTKYVQKSAKYKTSVKDPGTPGVLEMTERKFVFKPSDPTSASKLDVEFRYIKGHKNTKE 60

Query: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPH 120
           GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGE AQA+ SER VA FPH
Sbjct: 61  GSNKPPWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGETAQAS-SERHVATFPH 120

Query: 121 EQLSKSEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIG 180
           EQLSK EMELRM+CLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERD+SKKSKQL+G
Sbjct: 121 EQLSKLEMELRMKCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDNSKKSKQLVG 180

Query: 181 FKSSMVLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           FK+SMVLDTKPMSDGRTNKVTFNLTPEIKY+
Sbjct: 181 FKNSMVLDTKPMSDGRTNKVTFNLTPEIKYE 210

BLAST of CcUC04G061640 vs. TAIR 10
Match: AT1G55750.1 (BSD domain (BTF2-like transcription factors, Synapse-associated proteins and DOS2-like proteins) )

HSP 1 Score: 223.4 bits (568), Expect = 1.7e-58
Identity = 118/206 (57.28%), Positives = 146/206 (70.87%), Query Frame = 0

Query: 6   VHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKEGSNKP 65
           + K  KYK++VKDPG PG L + E   +F P+DP S SKL V  + IK  K TKEGSNKP
Sbjct: 6   IEKLVKYKSTVKDPGTPGFLRIREGMLLFVPNDPKSDSKLKVLTQNIKSQKYTKEGSNKP 65

Query: 66  PWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPHEQLSK 125
           PWLNLT  Q  S+IFEF+N+ D+H CR+F+  ALAK     +  P+ + V +   EQLS 
Sbjct: 66  PWLNLTNKQAKSHIFEFENYPDMHACRDFITKALAK----CELEPN-KSVVSTSSEQLSI 125

Query: 126 SEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIGFKSSM 185
            E+ELR + L+E+SELQ+LHKQFV   VLTE EFWA RKKLL +DS +KSKQ +G KS M
Sbjct: 126 KELELRFKLLRENSELQRLHKQFVESKVLTEDEFWATRKKLLGKDSIRKSKQQLGLKSMM 185

Query: 186 VLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           V   KP +DGRTN+VTFNLTPEI +Q
Sbjct: 186 VSGIKPSTDGRTNRVTFNLTPEIIFQ 206

BLAST of CcUC04G061640 vs. TAIR 10
Match: AT3G61420.1 (BSD domain (BTF2-like transcription factors, Synapse-associated proteins and DOS2-like proteins) )

HSP 1 Score: 221.1 bits (562), Expect = 8.6e-58
Identity = 118/206 (57.28%), Positives = 143/206 (69.42%), Query Frame = 0

Query: 6   VHKSAKYKTSVKDPGMPGVLEMTEHKFVFRPSDPTSASKLDVEFRFIKGHKNTKEGSNKP 65
           + K  KYK+ VKDPG  G LE++E   +F P+DP S  KL V+   IK  K TKEGSNKP
Sbjct: 1   MEKRVKYKSFVKDPGTLGSLELSEVMLLFVPNDPKSDLKLKVQTHNIKSQKYTKEGSNKP 60

Query: 66  PWLNLTRDQGGSYIFEFKNFSDLHVCREFVGSALAKSGEAAQAAPSERPVAAFPHEQLSK 125
           PWLNLT  QG S+IFEF+N+ D+H CR+F+  ALAK  E        + V   P EQLS 
Sbjct: 61  PWLNLTSKQGRSHIFEFENYPDMHACRDFITKALAKCEE-----EPNKLVVLTPAEQLSM 120

Query: 126 SEMELRMRCLQEDSELQKLHKQFVIGGVLTESEFWAARKKLLERDSSKKSKQLIGFKSSM 185
           +E ELR + L+E+SELQKLHKQFV   VLTE EFW+ RKKLL +DS +KSKQ +G KS M
Sbjct: 121 AEFELRFKLLRENSELQKLHKQFVESKVLTEDEFWSTRKKLLGKDSIRKSKQQMGLKSMM 180

Query: 186 VLDTKPMSDGRTNKVTFNLTPEIKYQ 212
           V   KP +DGRTN+VTFNLT EI +Q
Sbjct: 181 VSGIKPSTDGRTNRVTFNLTSEIIFQ 201

The following BLAST results are available for this feature:
NCBI nr
Match Name      E-value   Identity  Description
XP_008457278.1  9.0e-110  97.63     PREDICTED: probable RNA polymerase II transcription factor B subunit 1-1 isoform...
XP_038894194.1  7.7e-109  96.21     general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X2 [...
XP_038894195.1  7.7e-109  96.21     general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X3 [...
XP_038894178.1  7.7e-109  96.21     general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X1 [...
XP_038894196.1  7.7e-109  96.21     general transcription and DNA repair factor IIH subunit TFB1-1-like isoform X4 [...

ExPASy Swiss-Prot
Match Name      E-value   Identity  Description
Q3ECP0          2.4e-57   57.28     General transcription and DNA repair factor IIH subunit TFB1-1 OS=Arabidopsis th...
Q9M322          1.2e-56   57.28     General transcription and DNA repair factor IIH subunit TFB1-3 OS=Arabidopsis th...
Q55FP1          6.1e-08   35.56     General transcription factor IIH subunit 1 OS=Dictyostelium discoideum OX=44689 ...
Q9DBA9          5.1e-07   32.28     General transcription factor IIH subunit 1 OS=Mus musculus OX=10090 GN=Gtf2h1 PE...
P32780          6.7e-07   32.08     General transcription factor IIH subunit 1 OS=Homo sapiens OX=9606 GN=GTF2H1 PE=...

ExPASy TrEMBL
Match Name      E-value   Identity  Description
A0A1S3C6E8      4.4e-110  97.63     probable RNA polymerase II transcription factor B subunit 1-1 isoform X1 OS=Cucu...
A0A0A0KYD2      1.8e-108  95.75     PH_TFIIH domain-containing protein OS=Cucumis sativus OX=3659 GN=Csa_4G338430 PE...
A0A6J1IB04      1.7e-106  94.79     probable RNA polymerase II transcription factor B subunit 1-1 OS=Cucurbita maxim...
A0A6J1EXP7      1.7e-106  94.79     probable RNA polymerase II transcription factor B subunit 1-1 OS=Cucurbita mosch...
A0A6J1DNY2      5.5e-105  92.42     probable RNA polymerase II transcription factor B subunit 1-1 isoform X2 OS=Momo...

TAIR 10
Match Name      E-value   Identity  Description
AT1G55750.1     1.7e-58   57.28     BSD domain (BTF2-like transcription factors, Synapse-associated proteins and DOS...
AT3G61420.1     8.6e-58   57.28     BSD domain (BTF2-like transcription factors, Synapse-associated proteins and DOS...

InterPro
Analysis Name: InterPro Annotations of Watermelon (PI 537277) v1
Date Performed: 2022-01-31
IPR Term   IPR Description                Source       Source Term    Source Description                           Alignment
IPR013876  TFIIH p62 subunit, N-terminal  PFAM         PF08567        PH_TFIIH                                     coord: 22..86; e-value: 1.2E-6; score: 28.8
None       No IPR available               GENE3D       6.10.140.1200  -                                            coord: 124..165; e-value: 1.8E-13; score: 52.2
None       No IPR available               SUPERFAMILY  50729          PH domain-like                               coord: 9..101
IPR027079  TFIIH subunit Tfb1/GTF2H1      PANTHER      PTHR12856      TRANSCRIPTION INITIATION FACTOR IIH-RELATED  coord: 6..211
IPR035925  BSD domain superfamily         SUPERFAMILY  140383         BSD domain-like                              coord: 121..167
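
The `coord:` fields above give inclusive start and end positions on the protein, so a domain spanning 22..86 covers 65 residues. As a rough illustration only, the sketch below extracts those intervals from annotation text and reports their lengths. The helper names `parse_coords` and `span_length` are hypothetical, and the regular expression assumes the `coord: start..end` notation used in this record.

```python
import re

# Assumed notation: "coord: <start>..<end>", 1-based and inclusive.
COORD = re.compile(r"coord: (\d+)\.\.(\d+)")


def parse_coords(text):
    """Return every (start, end) interval found in the annotation text."""
    return [(int(start), int(end)) for start, end in COORD.findall(text)]


def span_length(interval):
    """Length of an inclusive interval, in residues."""
    start, end = interval
    return end - start + 1


# Coordinates copied from the InterPro rows of this record.
rows = "coord: 22..86 coord: 124..165 coord: 9..101 coord: 6..211 coord: 121..167"
for interval in parse_coords(rows):
    print(interval, span_length(interval))
```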

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name     Unique Name      Type
CcUC04G061640.1  CcUC04G061640.1  mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category            Term Accession  Term Name
biological_process  GO:0006289      nucleotide-excision repair
biological_process  GO:0006351      transcription, DNA-templated
cellular_component  GO:0000439      transcription factor TFIIH core complex