Clc01G12100 (gene) Watermelon (cordophanus) v2

Overview
Name: Clc01G12100
Type: gene
Organism: Citrullus lanatus subsp. cordophanus cv. cordophanus (Watermelon (cordophanus) v2)
Description: Retrovirus-related Pol polyprotein from transposon TNT 1-94
Location: ClcChr01: 20554450 .. 20566589 (-)
RNA-Seq Expression: Clc01G12100
Synteny: Clc01G12100
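
The Location field packs the chromosome, start, end, and strand into one string. A minimal Python sketch of how such a string could be parsed and the genomic span computed is given here. It assumes the coordinates are 1-based and inclusive, which this record does not state; the function names `parse_location` and `span_length` are hypothetical helpers, not part of any database API.

```python
import re

def parse_location(text: str):
    """Split a location string like 'ClcChr01: 20554450 .. 20566589 (-)'
    into (seqid, start, end, strand)."""
    m = re.fullmatch(r"\s*(\S+):\s*(\d+)\s*\.\.\s*(\d+)\s*\(([+-])\)\s*", text)
    if m is None:
        raise ValueError(f"unrecognised location string: {text!r}")
    return m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)

def span_length(start: int, end: int) -> int:
    """Length of the span, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1

seqid, start, end, strand = parse_location("ClcChr01: 20554450 .. 20566589 (-)")
print(seqid, strand, span_length(start, end))  # ClcChr01 - 12140
```

Under that assumption the gene spans 12,140 bp on the minus strand of ClcChr01.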
Sequences
The following sequences are available for this feature:

Gene sequence (with intron)

ATGTTTTTTACGGCAACAAGCACGCATTCTCCATGTGCAGCCATTGTGGTATATCTTACACCTAATAGCCCAAATATTCTTTTTTGACTCAATAACTTCAAAATTTTGATGACAAATGATTAAATAGCTCTTCACTGCATTTTTTAAATTAAACTTTGATTAAATTCCATCCCCGTACTCAAAGAATCAAATCTAATATTTTTCATCTCAATGCCTAAATTAGGTTCAATATTCATGTCATTCAAGTCTATCTCATTGAAGGTTTCAGGGACATCATCAACACCATAAAATCCATGAGTTGGTTGCTCTAGATCAATTTCACCATCGTCTATTTCAATTTCTTCTAAATTTGTGGCAATAGTTGTAAATTGACCTCTTTGTCATTACCACTTTCAACCCCACTATTATTTTCAATTTCTCTCGATGATGTTCGATTTAAAATGTTCGGTTGAGAAATATAGAGATCCACTGTGTTGGAGGGGATGAAGCAGTGCTCATCATAAACCTTAAATTAGGTTCACTCCATAAAGAGATAGGAATATAGAAAGTAAATCCTGCTATATAACTAACATATCTAAATATTACATGTATCTCACCATCTTTATCTATAGATAATGCATCACATAATCCCTCCATAAATCTTTTCCCAAATCGCATCACCTTCCAAGTCAAAATTCGTCTAAAACCGAAATTTGAATTTTATTTCAAATCTAAGTTTAAACAAATCCCATCTAAGGTCAAAATTGAAATTAAAAAAGAAAGTTGCCTTGGCAAACTTTGGCCCTCCAAGTTTTATGGTTCAAATTCAAATTCCTATCAAATCGGAATTCAAATTTCACTCTAAATTGAGTTATAAATTTGATAAAGGTCAATTTTCAAACTTTCATTACAAGTTCAAGCCTCTCAAGTCGTATTAACGTAAAATTGAAAAGTTTGGCACCATTCAAATTGTCACTCAAAGAAATTTGGCACCATTTAAATTTGGGCATATGAATTCAAATCATACAAGTTTAGCCTTTTGTTCTCAAATTAAATTAAATTGTTGATAAGTTCACAAACTTATCTCAAATTAAAATTTGGTCATATTCCAAAAATTTACTCCAAATTATGGTCTATCTGTACCAAATTCATAATTTGATTGATATGTCAAATGAAGTTAAAATCAAGAATAAATTATTTGAATTTTGATTTCAACTTCATCGTTTTTGAGATTCACCTACCGTATAATCATGCATAAATCTCAAAAAGGGGCATTTGTTGAATGAGAAATTTCAGTCAACCAAATTTTGCCACGTCATCACAACAAAAGTCCAAACGCATTGTGGAAATTATAAATTGGATCGAGCCAAGTAATTAAAAGTGCTCGAGTCCAAAGCCTGACCCAATTCAAGCCCATTGTGGAAATATTGGACCAAAGAAGTTATTGGGCCCAAGCCCAAATTTTGGGCCCAAAAGCGACTGGGCCCAAGCCATGAAGGCCCATAGAAACTCTATAAATAGAGGTCTCATCCTGCATCACTAGAGGTTGAAAAATTGAGGAACAAAGTATGGAAGCTCTAGAAACCTAAGTTTTGAAGAATTGAAGACTGAAGATCTTCTCCCAAGCTCACAAGTTCTGAAGATTGAAGCCCTGTAATTCCAAAGAACTAAAGATCGAGGTTCTCAAGCATCGAAGTTCTAGAAGCATCGAAGTTCTAGAGACCTGAAGATCGAGACTCTGAAGTTGAAGACTTTGAAGTTGAAAGCTCAAAGAAGATTTCGAAGATTCAAGTTGCTTGTAATTTTTTTTAGAGAGAATCAAGGGATCAAATACTAGAGATTGTACTCACAAACACTGAAATTAATATAAATACGAAGTTCAAATTCTGCAAACAAAATTTCTTTGGAAATCTCATCGAACAATTATTATTAATAAATTTTATAAATTATATTATTTTTAAATTTAATGTGTGGTACATTATTATCAGTTTGGAAAATTTATGATTTTATCTTCTTATAGT
TCATTCATTTTTAAAAATTAGATAAAAAAGTTATTTTCAAAATAATAAATTAAATTTACTTATAAAATTGGAAAAAAAAGTGAAAACCTAAAATATAAACAACTTGAACATAGGGTGTTTTATCAATTCACTTCTAAAGACTAATTCTAACCTGTATAACTTAACCTACAATTTTACTCTTAAACATAAACTTTACCTATAATTATACTTGATAATAATGGATTTTTTAAAACTTACTAAATGTCATTAGTTATTTCTAAAAATAAAGATCTAAAGTAGTAGAGAGTAAAATTTTACTATATTTGTAATTGTTTTCAACTATTTTTCTATTTTTTTAAAACAAAAATAAATAACCAAATATACAAAACCTTGACAGCTTGTGCAATTTTTCTCTTGAATAATCTCCACTGTAAACATATGAGTAATAAAAGAAAAAAAAATCAGTATAAAATCATGGTAATATATAATGAAATAAGATACGTATTATAGCCTACTAATAGAAAAATGTTGCACTAAATTTGTTCACAAAGTGTTGAAACTAAGGAGTATAATTTCAAAAATGAAGTGGTTGCCAGATTTCAAGATGAAAAACATTGAGAGACAAAGGAAGAAGAGTGACTTTAGGATGAAAAACATTGGAGGAACAAGATGAAGTTGAGAAAAAAAAAAAAATAAGGAGACGAGATTTTAAAGAAAAAATCCTTCAAACAAACATTTTTTTTTTCATTTTTAAACCAACAATTGTTGGTTCAAATAAGGGGAATAGCTAGAAAATAAGTCTTTTGACTTCCTTGTAAATGAATGCTTACGGTCGTTGATAAGTCACATCACGTCAACGGCAACAATTTTTTTTTCTTTGCATCTCTTTAAGAGACGCATTCAAATTTTTTTCGTTGCATGTCAGAATGCCTCTCTTCAACAGACGCATTCTGACCTCCGTTTATAATTCTGTTTGATCAAAAGGTGTTTTCGTCTATTTGATTTCGTTTTTCGTCTTTGATCGATTTGAATTAAGAGACGTATTGAAATAAAGATAAATAAATTACCCTCTTTTCCTAAATTTTTCTCAAACTATCCCTTTTCCCTAATTAGTTTTAAAAATATCATTTTTTACTAATTAACCCACTGAATCTTGGTTACGATCTTAGTCGAGATGGCCCAAGAAAGATTCTTTTTACGAATTTTGAAAAAAATGTGCTATTTTTGAAAAACCCTTTAGAGGATTACTTTGGACATTTCCCAAAAAAGAGTACCGATTCGTTTCTTAAATCGCGGGTTCAATTAGCTTAATCTCCACACATTTCGCCTTCGGAAATTTTGCATCTACAAAGCCCTATCTGACTTCCCAGTTCCTCCACTGCAACTACCAAAAACTCTCTCCTTCTTTCGGTAAGTAATCGTCTTCAAATAAGCTTCCATAGTTGTTCTATGTTTTCCCTTAATTGTTATGTTTCGATTGAGTCGATCGTATTCCTCTTCTTAGGAAAGGCGTTCAATCGATATGGCATCAGCTAGGACGGTCGCTAGAATCTTTTCCCGAAGGTTCTCGAGCAGCGGGAAGATTCTCAGCGAGGAAGAGAAGGCTGCTGAGAATGTCTACATCAAGGTGCTCATCTAAAGCCTCCGACCTGTGATTTTTACGTTTTCATGTTTTTAAGGGGGTTTGGGGGCTTCTCTGATTATGTTAGGATCATTTTGATTGTTTCATAGATTATAATTATAGAGATGAATAGAAAAAGTATGGCGAATTTTCTTGTTGCAAACGTACAGGTTGCTTGTTTTGAGAGGTTTTTAAGTTTCTCGATGAGTCGAGCTCTACGCTGTCAAATGAATCTGTTAAGAATAAAATTGGTTAGACTGTTCAATTTTATTTTTTTTTTTGAAATGAGATCATACCATTTGGTTTTAGAGACGGTTTTTAAGATTTTGGAGCAATTGGAGTCCCTTTGTGTTGGCTCGGAAAATTTTCTCTACATGTTCATTGAAGTCGGCAATAACAT
TGTGAGTGAAAGTATTGAGGAGAAGAATTTTGTCTAAGGGTCAGAAACAGACGATCATGTATTATTTTATCAACTAATTTTTTCTTTTATTTTAAATAATTACGTATTTAGATATATTATTAGGTTGGCATTTATCTTATTACTAGCGACTTTTAGTTTAGGGATTTCTAAGAATTTGCTGTTTCATAAATTATTATATTTGTATGATGAAATGAGTGCAAAGTATGAAGAATTTACTTCGGGAAATATAGATGGTCTTGCCTTTTGAGTTGTTTGAAGACCCAGTAATGGCTGTCTAAACTGGGTAATTAACAAAAGTTGCCTCTCTTAAGTTTTGATCAGGACAAATGATCCCTTTTTATAAAATTAATATATCTAGACTCCTCAACTTCTATCGAGTATGTGACCACACCCATTTACCCTCCATTTCATAAAAGAGTAAGAAATTCAAAATAGTTCCTCTCTCCCTCCTTCTCTTTTGTCCATTAACCACTTCAACTTCCCTTACCTCCCAGTTCCCATTCAATTTTTCAGCCAGATTTCATTGCTTGAAATCGTGTCTCTCCCACAAATCTACCCTAAACTCTCTCAACAATTGTCTTTCTTGATTTACATGGATTTGTTTGTGTTTTATATTGTTTTGGTTATAATGCATATTTTTGATCTTGTCTGATGTTTGGTTGGAAAAAATTGAAAATTTTATCGTTTAGGGCTGATTGAAAAGATAGGTCTTTTTTGCTTTCTATGTTTCTGTATTTTTTGTGTGAGATTTGTTTTTCTCTTTGACTGATTTTGGTGCATTGCATGCTATAAATTTTGGGGCTCGTCTAAGTTGATTTAGATACAAAAAAAACTCAAATTTTAGTTGTTAATTCGTTGGATTTCTTTCTGGAAACCGAAGCTTCCTGGTTGGGATGCATGACAGGGTCACTAGGAAGAAGAACATTACCTTTTCTGGCCTGGTGAAAAATGGCGGGTAGCAAGTATTTTTTCCCAGTCCTGATCCCAGTGAGGAAAGACAAAATGCTTGGTTGGATTTTCCCGATCATTTCTAGGTTGATCATTCTGGACCTTCTCCCAGTTGGATTTTCCCTGTTTATCATTTGGCACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGAAAAATACCAAGAACTTGAGAAGAAAAATCAATATTCACTGTGAAGCTCGATGAGATCTTCATTGGGGATGAACCCCAATCAAGAATTTACAAAACCCAGTTGGGAAGATAATCAATTGGCAGGCCAGGAACACAAATAATTTGAGCTATTCAAACAACAACGTGAAGAATATTGTATTGCATACCTTTTTTAAATTTTAAACTATTGATGGCAAATCAAATGTGAAGAAAGGTGAGATTTGTGGAGTACAACTTTTTGGAGTAAAATTTTGTTGGATGAAAATGATGAGGTGAGTGGGTGTGAGTGAGTTTAATTTTCTTTTAAGCATTAAATTATAGATTGGGTTGTTGATGATGGGTAAAATGTGGGCGAAAAATAATTAACATTGTATTTAATGAACTAATGAAGTAGAGGGTAAAGAGGTGATTTGCCATCAGAAGATTTTCATGCTTACATCATTTTCATTTTAAAGGGTCACTTTTGTCAATATTCCTGCCTAAGCTCTCTCCAACAAGTCTACGGTCCTTCAAATTTCTCTACCATTCCATTTTATTTTTATTTAGTATAGGCTCATATCTTCTTTTCTAGTGGGTTCGATGGTGTTATTGAAGTAGAAGTAGAGTATTAGTTTCTGGATATTGCCATAGAGATTCATGGCTTTGTAATCAGGACTGCTTCAGTTCTTCTTACTGCAGTTCTATAGTTCTTCATTTTGACACTGGAGTATAATTAAAGGCTCTTTCCAGGTAGTTGCACTTTCATTTTTTCATGCTTTTTCCAGGTAGAGTATAATAGATCTTTTTTATGTGGAACATGTCTAAGCTGAAGCATCAAATTGCTA
AGTCTTGGAGATTTTTCAATGCCTAGAGTTTATTTATATTTTATGAGCGTGAGCTAACATTTCTCCTTTACCATCTGCTTGTTATGTTGTCTTTTTTTACCTTGTTTTCATTCATATTTCATTTGGTTGACATTTTGCAGAAAACTGAACAAGAAAAACTGGAGAAGCTTGCACGCAAGGTATGCTTGTAGGTCCTTAAAATTTTCATTTGGAAACCTGGAGTTTGAAATTATGGGGATCTTGGATAATCTAGCTTTCACATGGTAAAATTTTATGGTATCTTAATAATTGTAAAACTACAACTTTATCTTACTGAAGTAGCTTTATATGTTCTTGCGAACAATGGGAAGATTCAAGAGCTCCCAACCTCTCGTAAGAGAACCCAAATACAGAGTATAACCAAAAAGATACCAACCCCACCCCTTGCAATCACGGTATATATACTAAGAAATAAAAGGCTTAACTAACAATATTCTTAGGGCCACAACATATGCAAACTGCAATCATTCACATGAGTATTCACACCCTTCCCTTATGTCTACGTGTGTAAACCTTCTTAATTGGGGGCCTATGTTTTGATGCAAACTCTATTAGGGTGTTATTAGAGCATTAAGGGCATATTAGTAATTAGCTAGAGAGTTTGTTATGCTTGTTGGTTATAAATAAAGAATGAGGAGAATGTAGAAGATGAACATGTTATTTCCTGTTAGTAGAATTTGGTGGAAGAGTGTATGCTGAATGTGATTCAAGGCGAAGAGCGTTTTGAGTTGGCAGTATCTTTTAACTTTGGCCGTACTTTTCTGTTGGGGTATTCTTTAGGAGAAATTGAGAGAGACGGGAAGCTCTCAAAACTTTCCATTATATATGTTTAATAATAAAATTGGTGGCTGAGTTTTTATCACACAACACACAGATCTATGATAATATAATTTTCTTATTTTGAGTGAGCTAAATGTTCTAATCTTCTCTAGGCTACAAATGCATTATGTTTCTAGGGACCTAAACCAGAAGAAAAGGCAGGAGGGTCAGTAACTGATTCCGTTCCCAGTGGCTCGGCCTCAACATCAGGAGCATCGACAGAGAAAATATCTACTGACAAACACCGGAATTATGCTGTTGTAGCTGGAACTGTGACGATTCTCGGTGCTCTTGGATGGTATCTCAAATCTAAAAAGAAGCCAGAAGAAGTCTAGGATTGAGTTAACAAATGATGTCATTTTCATATCATCAGCATCTGAAGCATGGCCTGGGGGGAGATGAGTTTTCTCTTTCCTCTGCTAAATAAATGAAAAATTGAGATGTCTTATGGATCTTGGTTAATGAATACATTCTGCTGTTCAGATTTACGTTTCTCTTTGAAAATGGTTTTCTTGGCCTTATTTCTGTCTACTTCCCTGGAAAATTAACCCCGAGAGACTGACGGCATGGAACATGCTCTGTGACAAGACTTGGTTCATTGGATTTCAATTTCACTTTCTCTTCAATAATATTGAGTAAATACTCCTTTACTATCGTTTCTAACGTGATTTCTAAGGGGGGGAAAAGAATTACTCCGCTACCATCACTTTGTGAAAAGCTAAGATGTATCTATCTAGTCCATCCCAAGATTTAATTGTGTTATCAAAACTTGAAATGGACAGATGATTTAATATTTCCAATCCAAAGTTGATTCAGTAAGGAAATAATTCGTTCTTAACATAAGCATCAGTAGTAGTCATTGTTAAATATTCTTGCTTGTATTTTAAAGGAAAAAAAAATTCAAAATAATCTGTGATAGTCTTAACTACTAGCTCTTGGAATATCCCAAGTAGATTTATTCCATGTTGTGATTACGGAGAAGAATATTGGCTTCTAAAAGGTGTGCAGAGGTTACCATTTACAATGATGTGGGAAAATTTTAAGTACTACCAATAATACTTTTTCCAGTCATTATTAGATAATACTTTTTAGTTTTTCCAGGATAGCAACCTATGGAGCTTGCTCCTTTGGATTTGTTGTTG
TTTCAATTAATTTCTATCATATTTTTCGAGCCTTTCATATGTCTAATCTCGAGTATCTGGAGTATGAGACTCTCGTGAAAAATCAATAATAGGTTTGGTTAGGAGACAACTTAAGCAAATTACTATTTATGATCGTTCTTTAATGTTCTCCATTGTTGATTTATGGTTATGTTGTAGCTGCCAATTTTTGAACTGATTAATGAGTTGTGTGTGATTATGTACTGTTGATGTTGGAGTTGCAGTCGGAAAGAAATTATTTAGGAGCGTTTTCTTTTGTGGTTTATCTAATTTTTGGAGAAACTTTGTAAGAAGCTTTTGCTAGCCACCTATTTTGAAAGGCGTATTTAGGGTTTGATAATAGTTTTTCGTAGTTGGAACTCATTGTCAACATGTGACATCTTAAGTTACGTAGGGTAATTTGGGGTGGGGTGTGTCGATGTGCTCAAAGGAAAATTTGAAAAATTGACTCAAATTTTTTGCCCAAATAGAGTTAGAATATGTTTGGGAGGGAGGAATTTTGACATGGCTAAATTTACTTTTGTCCTCATCAAAATGGCTTTTCAATCAAATATTGATGTTTGATGAATGTTAAATTGATTTTAAAACAAATCTAAAAAGAATAATTGGGTGAGAGGAATCACTCTAAAGTTTTACCCTCATTTCATACATTTATATTCTACCGATTGTTTCTTCTTTTACGCCCTAAATTGCAACACTGGTGTTTTGGCCTTTATATATTAGAGGCACTAGTGCGGAAACCACACACTCATTCGTTGCAGCCAGCACAATCAAAATTTCCTCACTAATTTTCTTTTCTATTTTCAAAAACTATTTATTTAAAATAATAATTGTAAGGATAAATTATAAATTTAGTATGTGTCTATTGGTTTTTTAACTTTAAAATAATTTAGTATCTAATAGGTCCTGATTCTTAAAAATGTACTAGGTATATGAAATTTTAGTTATATTTCCAATAGGTCCTTGGCATATTTTTTTAAGGTGAGCATAAATTAGTGCTAATTTGGTATGTACTTTTTTTTTGAGTTTGAAGGTTCAAACTCTCATGCCCGACTCACTCAATTTTTTTTTTCCTTTTTAAGAACCTAAAATATATTCAAGATTTTAAAATTTGACTTATCTATTAAATTTGAAAACATTAAATTTTAATTTTTTTTTTTTATCTACAAAGAGTCTTAAACTCTCAAAGACTCTCAATTAGTAAGTATATTAATTTTGTTACAAAATTTTCTAATGATTTATTGGATACTATTGAAAATTTAAGAAATTATTAGATTTATAAAGGTTTATAAGATACGAAGAACTTGTTTGATATTTTTAAAGTTTAGGACCAAAATTAATAGAGATAGATATTTATCATAATTATTTAACCAAAAAAAATTACCAAATAGAAAATATCTTTTCGTTATTTCGTAATTCTGAAATCACTCCTGAAACCAAACTGAATAATTCAAAATTATTTTTAGACATGAAAATCAGTTTTAAAAATTGGGAACCAAAAATCAGATTGATATGACAATCAATTTTAGCAAAATCAATTGTGCTTAGGTCTTCCAAATATGCACTTTCCTCGTTTTCGTGTTATTTTGTGCTTACGTCATCACTACTTAACAATCCGATTGTCAATCTCAATTTGGTCCTCATTCTCAATATGGTTCATTTACCATTCTAGCTTTCAATTTTATGGAGAAGATATTCTATTGGGATAGGAAAACATCGACTTAAGAATTTCAAATATTCATCATGAATTCATCACGTAAGTGTTCGTCATCCCTTTTCAGTATGTATCTTGGTTGATTTAATCATCCTATTAAAATCTATTTGGTGATGATCGCTTTAGCATTTGAATTTAACTGTTGCTAACATTTGGTATTAGAGCCAGGTTTTAATAGGAATTAAATCACTAGTTTAAGTTTATGAGAATTTGGGTATCTTCTCAATTGAAAGAACTTCCACAAGATTCCACAACATTGAAATAATTC
TTGTAGAATAATAATTAGTTTATATGAGATCAACAATACCAGAGAAGTATTCAACTGTAGCCGTAATTTAAATCTTTAATATTTTAGAGAGTGCGTTAGTCACCAAAGTGGCCCACTCGGAGTTCATTAAATGATTTAAATTATTTTATAAATGTTTATTTGATATAGCATCATAAATGACATTTAAGAATGATAAACATTTATAGTTATTAACATGAAATTATACGGTTACTCAAAGGTAGATCTTACTCGGTTAATAACAATTGAAGTAGGTTATTAATTTTGTTTAATTAATATATAATGTAGATCCAGAAATGACTTGTATATTAAAGTAAGCTTAGTGATTAACCTTATATATGTTATATGAATGATGCATGATTGATGGTTGTTGGTTGTGTTTAAATTTAGCCCAAAGGTGAATTTAAATCCAATGTTTTGGTATATGTTATAGTCTCAATTGTGGTTTAAATTCAAAGCAATTAATACGTAAGATGTAAATATTTGCAATGGTCTCTACTATGTCAAATATGACTGCCAGTAGGGTTGGCAAAAAACTAGTGAGGCTGGGTCCCCACGGGGCCCCGACCCGATCTGGTCGGGAAATCCCTAGTTTGACTAGGGATGTAGGTCAAACCGGGGAGATTCCCCAAGTACCTGTTCAGGGTCAGGGCAGGGATGGGGAGGGTATCCCCGCCTCGGCCTCGGTCACGATCTCCCCCCCGATATATTCCTGACCGCCCCCGTATATATATATGTATGTATGTTTGTATGTATGTATCTCGCTCTGTCTCGCTCTGTCTCTCTCTCTGTCTCGCTATGTTCATGATTGCTTTAGAAATCTTGTTTTTGTTTCTAGGTTGGATGTTTTAGGATTTGACTTTAAGGTTTGACATAGTTGTTTTTCATTGTATTGTGGAACGAACCTGTGTGGTATTGGTTCATTGTTGGGTACTTTGTATTGTTTCAATCTAAATTCTACTTTTGTTAATTCCCTGTTTAATGTTGAAAGTGGTGTTGGTATGAAAAGAAATGCACTAAATGAGTCTTCTGCTGACTTGTGCATAAAGACTTGGTCATATATCCAAAGAGAAGATTTTGAGACTAATAAAAGGTAGAGTTTTGTCTCAATTGGATTTTATCGATTGAAATGATACTTGTATGGATTGCATTAAGGGAAGCAAATGAAACATGCATCAAAACAGGCTGCCACAAGAAGTAGTGGATTACTCGATTTAATACATACAAATACTAGTGGTCCTTTTGATGTACCGTCTTGGGGTGGTGAAAGGTATTTCATCACTTTCATTGATGATTACTCTAGGTATTGTTACCTTTATTTACTGCATGAAAGAACTCAATCAGTTGATGTCCTTGAGACATTCATTACAGAGGTTGAGAGCAGTTAAATAGAAAAGTTATGGTAATACGTTCTGATAGAGGTGGTGAATACTATGGATAGACATCGGGAGTTGGACAAATTTCAAGTCCATTTAAGAAGCTCCTTAAGTCTAAGGGCACTTGTGCGCAGTATACAATGCCCAGATCGCCAAATCAAAATGGTGTAACGGAAAAGCGTAATCGTACATTGATGGAAATGGTAAGGAGTATGATGAATGATAGTGTTATACCTATTTCATTGTGGATGTATGCATTGAGGATAGTCACATACATATTGAATAGGATACCTAGTAAAGCAGTTCCTAAGACACCTTATGAACTGTGGACATCTAGGAAGCCTAGTTTAAGATATCTTCATGTGTGGGGCTGGCAACCTAAAATGAGGATATATAATCCACATGAAAAAAAGTTGGATTCCAAGACCATTAGTGGCTATTTCATTGGATATCCGAAAATGTCAAATGGGTATAGATTCTATTGTTCTAATCATAGTACGAGAATTGTAGAGTCTAGAAATGCTCGCTTCATTGAAAATGGCGGAGTTAGTGGGAGTGTGGGAGCACATGATGTAGAGATAAAAGAGTCATTGATGGACCAAAATC
CATCAAGTGATCCATCTCAAGTTGTTGTTCCTATTATTGTTGTCATACCCCCTCCCAAGTACCCTCTTAACCTAGGAGAAAGCATGAGGACAGCGGACACCGACCCTCTTGCGACATCCACTGCCAACCTTATACCTTGA

mRNA sequence

ATGTTTTTTACGGCAACAAGCACGCATTCTCCATGTGCAGCCATTGTGACTGAAGATCTTCTCCCAAGCTCACAAGTTCTGAAGATTGAAGCCCTTTCCTCCACTGCAACTACCAAAAACTCTCTCCTTCTTTCGGAAAGGCGTTCAATCGATATGGCATCAGCTAGGACGGTCGCTAGAATCTTTTCCCGAAGGTTCTCGAGCAGCGGGAAGATTCTCAGCGAGGAAGAGAAGGCTGCTGAGAATGTCTACATCAAGAAAACTGAACAAGAAAAACTGGAGAAGCTTGCACGCAAGGGACCTAAACCAGAAGAAAAGGCAGGAGGGTCAGTAACTGATTCCGTTCCCAGTGGCTCGGCCTCAACATCAGGAGCATCGACAGAGAAAATATCTACTGACAAACACCGGAATTATGCTGTTGTAGCTGGAACTGTGACGATTCTCGGTGCTCTTGGATGTAGGGTTGGCAAAAAACTAGTGAGGCTGGGTCCCCACGGGGCCCCGACCCGATCTGGTCGGGAAATCCCTAGTTTGACTAGGGATGTAGGTCAAACCGGGGAGATTCCCCAAGTACCTGTTCAGGGTCAGGGCAGGGATGGGGAGGGTATCCCCGCCTCGGCCTCGACATCGGGAGTTGGACAAATTTCAAGTCCATTTAAGAAGCTCCTTAAGTCTAAGGGCACTTGTGCGCAGTATACAATGCCCAGATCGCCAAATCAAAATGGTGTAACGGAAAAGCGTAATCGTACATTGATGGAAATGGTAAGGAGTATGATGAATGATAGTGTTATACCTATTTCATTGTGGATGTATGCATTGAGGATAGTCACATACATATTGAATAGGATACCTAGTAAAGCAGTTCCTAAGACACCTTATGAACTGTGGACATCTAGGAAGCCTAGTTTAAGATATCTTCATGTGTGGGGCTGGCAACCTAAAATGAGGATATATAATCCACATGAAAAAAAGTTGGATTCCAAGACCATTAGTGGCTATTTCATTGGATATCCGAAAATGTCAAATGGGTATAGATTCTATTGTTCTAATCATAGTACGAGAATTGTAGAGTCTAGAAATGCTCGCTTCATTGAAAATGGCGGAGTTAGTGGGAGTGTGGGAGCACATGATGTAGAGATAAAAGAGTCATTGATGGACCAAAATCCATCAAGTGATCCATCTCAAGTTGTTGTTCCTATTATTGTTGTCATACCCCCTCCCAAGTACCCTCTTAACCTAGGAGAAAGCATGAGGACAGCGGACACCGACCCTCTTGCGACATCCACTGCCAACCTTATACCTTGA

Coding sequence (CDS)

ATGTTTTTTACGGCAACAAGCACGCATTCTCCATGTGCAGCCATTGTGACTGAAGATCTTCTCCCAAGCTCACAAGTTCTGAAGATTGAAGCCCTTTCCTCCACTGCAACTACCAAAAACTCTCTCCTTCTTTCGGAAAGGCGTTCAATCGATATGGCATCAGCTAGGACGGTCGCTAGAATCTTTTCCCGAAGGTTCTCGAGCAGCGGGAAGATTCTCAGCGAGGAAGAGAAGGCTGCTGAGAATGTCTACATCAAGAAAACTGAACAAGAAAAACTGGAGAAGCTTGCACGCAAGGGACCTAAACCAGAAGAAAAGGCAGGAGGGTCAGTAACTGATTCCGTTCCCAGTGGCTCGGCCTCAACATCAGGAGCATCGACAGAGAAAATATCTACTGACAAACACCGGAATTATGCTGTTGTAGCTGGAACTGTGACGATTCTCGGTGCTCTTGGATGTAGGGTTGGCAAAAAACTAGTGAGGCTGGGTCCCCACGGGGCCCCGACCCGATCTGGTCGGGAAATCCCTAGTTTGACTAGGGATGTAGGTCAAACCGGGGAGATTCCCCAAGTACCTGTTCAGGGTCAGGGCAGGGATGGGGAGGGTATCCCCGCCTCGGCCTCGACATCGGGAGTTGGACAAATTTCAAGTCCATTTAAGAAGCTCCTTAAGTCTAAGGGCACTTGTGCGCAGTATACAATGCCCAGATCGCCAAATCAAAATGGTGTAACGGAAAAGCGTAATCGTACATTGATGGAAATGGTAAGGAGTATGATGAATGATAGTGTTATACCTATTTCATTGTGGATGTATGCATTGAGGATAGTCACATACATATTGAATAGGATACCTAGTAAAGCAGTTCCTAAGACACCTTATGAACTGTGGACATCTAGGAAGCCTAGTTTAAGATATCTTCATGTGTGGGGCTGGCAACCTAAAATGAGGATATATAATCCACATGAAAAAAAGTTGGATTCCAAGACCATTAGTGGCTATTTCATTGGATATCCGAAAATGTCAAATGGGTATAGATTCTATTGTTCTAATCATAGTACGAGAATTGTAGAGTCTAGAAATGCTCGCTTCATTGAAAATGGCGGAGTTAGTGGGAGTGTGGGAGCACATGATGTAGAGATAAAAGAGTCATTGATGGACCAAAATCCATCAAGTGATCCATCTCAAGTTGTTGTTCCTATTATTGTTGTCATACCCCCTCCCAAGTACCCTCTTAACCTAGGAGAAAGCATGAGGACAGCGGACACCGACCCTCTTGCGACATCCACTGCCAACCTTATACCTTGA

Protein sequence

MFFTATSTHSPCAAIVTEDLLPSSQVLKIEALSSTATTKNSLLLSERRSIDMASARTVARIFSRRFSSSGKILSEEEKAAENVYIKKTEQEKLEKLARKGPKPEEKAGGSVTDSVPSGSASTSGASTEKISTDKHRNYAVVAGTVTILGALGCRVGKKLVRLGPHGAPTRSGREIPSLTRDVGQTGEIPQVPVQGQGRDGEGIPASASTSGVGQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYALRIVTYILNRIPSKAVPKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLDSKTISGYFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGVSGSVGAHDVEIKESLMDQNPSSDPSQVVVPIIVVIPPPKYPLNLGESMRTADTDPLATSTANLIP
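
The protein sequence corresponds to a translation of the CDS. As a rough check, the sketch below translates the first and last few codons of the CDS with the standard genetic code; it assumes the standard code (NCBI translation table 1) applies to this nuclear gene, and `translate` is a hypothetical helper written for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    protein = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First eight codons of the CDS listed above.
print(translate("ATGTTTTTTACGGCAACAAGCACG"))  # MFFTATST
# Last five codons of the CDS, including the TGA stop.
print(translate("AACCTTATACCTTGA"))           # NLIP
```

Both fragments agree with the start (MFFTATST) and end (NLIP) of the protein sequence shown above.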
Homology
BLAST of Clc01G12100 vs. NCBI nr
Match: KAG7551855.1 (Integrase catalytic core [Arabidopsis thaliana x Arabidopsis arenosa])

HSP 1 Score: 261.5 bits (667), Expect = 1.3e-65
Identity = 123/191 (64.40%), Positives = 155/191 (81.15%), Query Frame = 0

Query: 213 GQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYA 272
           GQ   PF KLL+S+G CAQYTMP +P QNGV E+RNRTLM+MVRSM+++S +P+SLW+YA
Sbjct: 250 GQCPGPFAKLLESRGICAQYTMPGTPQQNGVAERRNRTLMDMVRSMLSNSSLPLSLWIYA 309

Query: 273 LRIVTYILNRIPSKAVPKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLDSKTISG 332
           L+  TY+LNR+PSKAVPKTP+ELWT RKPSLR+L VWG   +++ YNPHEKKLDS+T+SG
Sbjct: 310 LKTATYVLNRVPSKAVPKTPFELWTGRKPSLRHLRVWGCPAEVKSYNPHEKKLDSRTVSG 369

Query: 333 YFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGVSGSVGAHDVEIKESLMDQNPSSD 392
           +FIGYP+ S GY FYC NHSTRIVE+ NARFIENG  SGS  +  V+I+E  ++ +    
Sbjct: 370 FFIGYPEKSKGYTFYCPNHSTRIVETGNARFIENGQTSGSGESRKVDIQEIQVEVSSPDV 429

Query: 393 PSQVVVPIIVV 404
           PS+VVVPI+ V
Sbjct: 430 PSKVVVPIVSV 440
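
The percentages on the HSP summary line are the ratio of matching (or conservatively substituted) positions to the alignment length. The sketch below recomputes them from the fractions on that line; `hsp_percentages` is a hypothetical helper, and the regular expression only assumes the `label = n/m` layout seen in this record.

```python
import re

def hsp_percentages(summary_line: str) -> dict:
    """Return {label: percentage} for every 'label = n/m' fraction on an HSP summary line."""
    result = {}
    for label, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", summary_line):
        result[label] = round(100 * int(num) / int(den), 2)
    return result

line = "Identity = 123/191 (64.40%), Positives = 155/191 (81.15%), Query Frame = 0"
print(hsp_percentages(line))
```

For the first hit this gives 64.4 and 81.15, matching the values printed in parentheses.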

BLAST of Clc01G12100 vs. NCBI nr
Match: KAG7564986.1 (Integrase catalytic core [Arabidopsis suecica])

HSP 1 Score: 261.5 bits (667), Expect = 1.3e-65
Identity = 123/191 (64.40%), Positives = 155/191 (81.15%), Query Frame = 0

Query: 213 GQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYA 272
           GQ   PF KLL+S+G CAQYTMP +P QNGV E+RNRTLM+MVRSM+++S +P+SLW+YA
Sbjct: 576 GQCPGPFAKLLESRGICAQYTMPGTPQQNGVAERRNRTLMDMVRSMLSNSSLPLSLWIYA 635

Query: 273 LRIVTYILNRIPSKAVPKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLDSKTISG 332
           L+  TY+LNR+PSKAVPKTP+ELWT RKPSLR+L VWG   +++ YNPHEKKLDS+T+SG
Sbjct: 636 LKTATYVLNRVPSKAVPKTPFELWTGRKPSLRHLRVWGCPAEVKSYNPHEKKLDSRTVSG 695

Query: 333 YFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGVSGSVGAHDVEIKESLMDQNPSSD 392
           +FIGYP+ S GY FYC NHSTRIVE+ NARFIENG  SGS  +  V+I+E  ++ +    
Sbjct: 696 FFIGYPEKSKGYTFYCPNHSTRIVETGNARFIENGQTSGSGESRKVDIQEIQVEVSSPDV 755

Query: 393 PSQVVVPIIVV 404
           PS+VVVPI+ V
Sbjct: 756 PSKVVVPIVSV 766

BLAST of Clc01G12100 vs. NCBI nr
Match: RYE20332.1 (transposase, partial [Sphingobacteriaceae bacterium])

HSP 1 Score: 258.1 bits (658), Expect = 1.4e-64
Identity = 125/190 (65.79%), Positives = 153/190 (80.53%), Query Frame = 0

Query: 213 GQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYA 272
           GQ   PF K L+S+G CAQYTMP +P QNGV E+RNRTLM+MVRSM+++S +P SLWM+A
Sbjct: 21  GQCPGPFAKFLESRGICAQYTMPGTPQQNGVAERRNRTLMDMVRSMLSNSSLPKSLWMHA 80

Query: 273 LRIVTYILNRIPSKAVPKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLDSKTISG 332
           L+   Y+LNR+PSKAVPKTP+ELWT RKPSLR+LHV+G   ++RIYNPHE+KLDS+TISG
Sbjct: 81  LKTAVYLLNRVPSKAVPKTPFELWTGRKPSLRHLHVFGCPAEVRIYNPHERKLDSRTISG 140

Query: 333 YFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGVSGSVGAHDVEIKESLMDQNPSSD 392
           +FIGYP+ S GYRFYC NHSTRIVE+ NARFIENG VSGS+   +VEI+E  +       
Sbjct: 141 FFIGYPEKSKGYRFYCPNHSTRIVETGNARFIENGEVSGSLEPRNVEIQEVRVQVPLPLP 200

Query: 393 PSQVVVPIIV 403
            SQVVVP+ V
Sbjct: 201 SSQVVVPVHV 210

BLAST of Clc01G12100 vs. NCBI nr
Match: RZC25410.1 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Glycine soja])

HSP 1 Score: 255.0 bits (650), Expect = 1.2e-63
Identity = 117/170 (68.82%), Positives = 144/170 (84.71%), Query Frame = 0

Query: 213 GQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYA 272
           GQ  SPF KLL+ +G CAQYTMP +P QNGV+E+RN+TLM+MVRSM+ +S +P+SLWMYA
Sbjct: 641 GQHPSPFAKLLQKRGICAQYTMPGTPQQNGVSERRNKTLMDMVRSMLINSTLPVSLWMYA 700

Query: 273 LRIVTYILNRIPSKAVPKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLDSKTISG 332
           L+   Y+LNR+PSKAVPKTP+ELWT+R PS+R+LHVWG Q ++RIYNP E+KLD++TISG
Sbjct: 701 LKTAMYLLNRVPSKAVPKTPFELWTNRTPSMRHLHVWGCQAEIRIYNPQERKLDARTISG 760

Query: 333 YFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGVSGSVGAHDVEIKE 383
           YFIGYP+ S GY FYC NHSTRIVE+ NARFIENG +SGS    +VEIKE
Sbjct: 761 YFIGYPEKSKGYMFYCPNHSTRIVETGNARFIENGEISGSTVPREVEIKE 810

BLAST of Clc01G12100 vs. NCBI nr
Match: CAA7051484.1 (unnamed protein product [Microthlaspi erraticum])

HSP 1 Score: 253.1 bits (645), Expect = 4.5e-63
Identity = 121/189 (64.02%), Positives = 148/189 (78.31%), Query Frame = 0

Query: 213 GQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYA 272
           GQ   PF KLL+SKG CAQYTMP +P QNGV E+RNRTL EMVRSM+++S +P+SLW+YA
Sbjct: 540 GQCPGPFAKLLESKGICAQYTMPGTPQQNGVAERRNRTLKEMVRSMLSNSSLPLSLWIYA 599

Query: 273 LRIVTYILNRIPSKAVPKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLDSKTISG 332
           LR  TY+LNR+PSKAVPKTPYELWT RKPSLR+L VWG   ++R+YNPHEKKLDS+T+S 
Sbjct: 600 LRTATYVLNRVPSKAVPKTPYELWTGRKPSLRHLRVWGCPAEVRLYNPHEKKLDSRTLSS 659

Query: 333 YFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGVSGSVGAHDVEIKESLMDQNPSSD 392
           +FIGYP+ S GY FYC  HSTRIVE+ NARFIENG  SGS  +  V+I+E   + +    
Sbjct: 660 FFIGYPEKSKGYTFYCPKHSTRIVETGNARFIENGETSGSGESRKVDIQEIQDEVSSPVV 719

Query: 393 PSQVVVPII 402
              +VVPI+
Sbjct: 720 SPPIVVPIV 728

BLAST of Clc01G12100 vs. ExPASy Swiss-Prot
Match: Q9ZUX4 (Uncharacterized protein At2g27730, mitochondrial OS=Arabidopsis thaliana OX=3702 GN=At2g27730 PE=1 SV=1)

HSP 1 Score: 95.5 bits (236), Expect = 1.6e-18
Identity = 59/99 (59.60%), Positives = 73/99 (73.74%), Query Frame = 0

Query: 54  SARTVARIFSRRFSSSGKILSEEEKAAENVYIKKTEQEKLEKLARKGPKPEEKAGGSVTD 113
           + R   RI SRRF SSGK+LSEEE+AAENV+IKK EQEKL+KLAR+GP  E+ AG +   
Sbjct: 2   ATRNALRIVSRRF-SSGKVLSEEERAAENVFIKKMEQEKLQKLARQGP-GEQAAGSASEA 61

Query: 114 SVPSGSASTSGASTEKISTDKHRNYAVVAGTVTILGALG 153
            V   +AS S  S  K+S DK+RNYAVVAG V I+G++G
Sbjct: 62  KVAGATASASAESGPKVSEDKNRNYAVVAGVVAIVGSIG 98

BLAST of Clc01G12100 vs. ExPASy Swiss-Prot
Match: P04146 (Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3)

HSP 1 Score: 89.7 bits (221), Expect = 8.6e-17
Identity = 60/183 (32.79%), Positives = 101/183 (55.19%), Query Frame = 0

Query: 210 SGVGQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLW 269
           +G   +S+  ++    KG     T+P +P  NGV+E+  RT+ E  R+M++ + +  S W
Sbjct: 551 NGREYLSNEMRQFCVKKGISYHLTVPHTPQLNGVSERMIRTITEKARTMVSGAKLDKSFW 610

Query: 270 MYALRIVTYILNRIPSKAV---PKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLD 329
             A+   TY++NRIPS+A+    KTPYE+W ++KP L++L V+G    + I N  + K D
Sbjct: 611 GEAVLTATYLINRIPSRALVDSSKTPYEMWHNKKPYLKHLRVFGATVYVHIKN-KQGKFD 670

Query: 330 SKTISGYFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGV-SGSVGAHDVEIKESLM 389
            K+    F+GY    NG++ + + +   IV +R+    E   V S +V    V +K+S  
Sbjct: 671 DKSFKSIFVGYE--PNGFKLWDAVNEKFIV-ARDVVVDETNMVNSRAVKFETVFLKDSKE 729

BLAST of Clc01G12100 vs. ExPASy Swiss-Prot
Match: P10978 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum OX=4097 PE=2 SV=1)

HSP 1 Score: 87.8 bits (216), Expect = 3.3e-16
Identity = 54/168 (32.14%), Positives = 88/168 (52.38%), Query Frame = 0

Query: 200 GEGIPASASTSGVGQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMM 259
           G  +    S +G    S  F++   S G   + T+P +P  NGV E+ NRT++E VRSM+
Sbjct: 541 GRKLKRLRSDNGGEYTSREFEEYCSSHGIRHEKTVPGTPQHNGVAERMNRTIVEKVRSML 600

Query: 260 NDSVIPISLWMYALRIVTYILNRIPSKAVP-KTPYELWTSRKPSLRYLHVWGWQPKMRIY 319
             + +P S W  A++   Y++NR PS  +  + P  +WT+++ S  +L V+G +    + 
Sbjct: 601 RMAKLPKSFWGEAVQTACYLINRSPSVPLAFEIPERVWTNKEVSYSHLKVFGCRAFAHVP 660

Query: 320 NPHEKKLDSKTISGYFIGYPKMSNGYRFYCSNHSTRIVESRNARFIEN 367
                KLD K+I   FIGY     GYR +      +++ SR+  F E+
Sbjct: 661 KEQRTKLDDKSIPCIFIGYGDEEFGYRLW-DPVKKKVIRSRDVVFRES 707

BLAST of Clc01G12100 vs. ExPASy Swiss-Prot
Match: Q94HW2 (Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana OX=3702 GN=RE1 PE=2 SV=1)

HSP 1 Score: 70.1 bits (170), Expect = 7.0e-11
Identity = 42/136 (30.88%), Positives = 77/136 (56.62%), Query Frame = 0

Query: 235 PRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYALRIVTYILNRIPSKAVP-KTPY 294
           P +P  NG++E+++R ++E   ++++ + IP + W YA  +  Y++NR+P+  +  ++P+
Sbjct: 616 PHTPEHNGLSERKHRHIVETGLTLLSHASIPKTYWPYAFAVAVYLINRLPTPLLQLESPF 675

Query: 295 ELWTSRKPSLRYLHVWG--WQPKMRIYNPHEKKLDSKTISGYFIGYPKMSNGYRFYCSN- 354
           +      P+   L V+G    P +R YN H  KLD K+    F+GY    + Y   C + 
Sbjct: 676 QKLFGTSPNYDKLRVFGCACYPWLRPYNQH--KLDDKSRQCVFLGYSLTQSAY--LCLHL 735

Query: 355 HSTRIVESRNARFIEN 367
            ++R+  SR+ RF EN
Sbjct: 736 QTSRLYISRHVRFDEN 747

BLAST of Clc01G12100 vs. ExPASy Swiss-Prot
Match: Q9ZT94 (Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana OX=3702 GN=RE2 PE=4 SV=1)

HSP 1 Score: 67.8 bits (164), Expect = 3.5e-10
Identity = 58/206 (28.16%), Positives = 103/206 (50.00%), Query Frame = 0

Query: 235 PRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYALRIVTYILNRIPSKAVP-KTPY 294
           P +P  NG++E+++R ++EM  ++++ + +P + W YA  +  Y++NR+P+  +  ++P+
Sbjct: 595 PHTPEHNGLSERKHRHIVEMGLTLLSHASVPKTYWPYAFSVAVYLINRLPTPLLQLQSPF 654

Query: 295 ELWTSRKPSLRYLHVWG--WQPKMRIYNPHEKKLDSKTISGYFIGYPKMSNGYRFYCSNH 354
           +    + P+   L V+G    P +R YN H  KL+ K+    F+GY    + Y   C + 
Sbjct: 655 QKLFGQPPNYEKLKVFGCACYPWLRPYNRH--KLEDKSKQCAFMGYSLTQSAY--LCLHI 714

Query: 355 ST-RIVESRNARFIE--------NGGVSGSVGAHDVEIKESLMDQNPSSDPSQVVVPIIV 414
            T R+  SR+ +F E        N GVS S        +E   D  P+  PS   +P   
Sbjct: 715 PTGRLYTSRHVQFDERCFPFSTTNFGVSTS--------QEQRSDSAPNW-PSHTTLPTTP 774

Query: 415 VIPPPKYPLNLGESMRTADTDPLATS 429
           ++ P   P  LG  + T+   P + S
Sbjct: 775 LVLPA--PPCLGPHLDTSPRPPSSPS 785

BLAST of Clc01G12100 vs. ExPASy TrEMBL
Match: A0A4Q3ELL0 (Transposase (Fragment) OS=Sphingobacteriaceae bacterium OX=2021370 GN=EOP45_11240 PE=4 SV=1)

HSP 1 Score: 258.1 bits (658), Expect = 6.7e-65
Identity = 125/190 (65.79%), Positives = 153/190 (80.53%), Query Frame = 0

Query: 213 GQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYA 272
           GQ   PF K L+S+G CAQYTMP +P QNGV E+RNRTLM+MVRSM+++S +P SLWM+A
Sbjct: 21  GQCPGPFAKFLESRGICAQYTMPGTPQQNGVAERRNRTLMDMVRSMLSNSSLPKSLWMHA 80

Query: 273 LRIVTYILNRIPSKAVPKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLDSKTISG 332
           L+   Y+LNR+PSKAVPKTP+ELWT RKPSLR+LHV+G   ++RIYNPHE+KLDS+TISG
Sbjct: 81  LKTAVYLLNRVPSKAVPKTPFELWTGRKPSLRHLHVFGCPAEVRIYNPHERKLDSRTISG 140

Query: 333 YFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGVSGSVGAHDVEIKESLMDQNPSSD 392
           +FIGYP+ S GYRFYC NHSTRIVE+ NARFIENG VSGS+   +VEI+E  +       
Sbjct: 141 FFIGYPEKSKGYRFYCPNHSTRIVETGNARFIENGEVSGSLEPRNVEIQEVRVQVPLPLP 200

Query: 393 PSQVVVPIIV 403
            SQVVVP+ V
Sbjct: 201 SSQVVVPVHV 210

BLAST of Clc01G12100 vs. ExPASy TrEMBL
Match: A0A445LQ30 (Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Glycine soja OX=3848 GN=D0Y65_004205 PE=4 SV=1)

HSP 1 Score: 255.0 bits (650), Expect = 5.7e-64
Identity = 117/170 (68.82%), Positives = 144/170 (84.71%), Query Frame = 0

Query: 213 GQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYA 272
           GQ  SPF KLL+ +G CAQYTMP +P QNGV+E+RN+TLM+MVRSM+ +S +P+SLWMYA
Sbjct: 641 GQHPSPFAKLLQKRGICAQYTMPGTPQQNGVSERRNKTLMDMVRSMLINSTLPVSLWMYA 700

Query: 273 LRIVTYILNRIPSKAVPKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLDSKTISG 332
           L+   Y+LNR+PSKAVPKTP+ELWT+R PS+R+LHVWG Q ++RIYNP E+KLD++TISG
Sbjct: 701 LKTAMYLLNRVPSKAVPKTPFELWTNRTPSMRHLHVWGCQAEIRIYNPQERKLDARTISG 760

Query: 333 YFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGVSGSVGAHDVEIKE 383
           YFIGYP+ S GY FYC NHSTRIVE+ NARFIENG +SGS    +VEIKE
Sbjct: 761 YFIGYPEKSKGYMFYCPNHSTRIVETGNARFIENGEISGSTVPREVEIKE 810

BLAST of Clc01G12100 vs. ExPASy TrEMBL
Match: A0A6N2L229 (Uncharacterized protein OS=Salix viminalis OX=40686 GN=SVIM_LOCUS166415 PE=4 SV=1)

HSP 1 Score: 253.8 bits (647), Expect = 1.3e-63
Identity = 121/193 (62.69%), Positives = 153/193 (79.27%), Query Frame = 0

Query: 213 GQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYA 272
           GQ   PF KLL+SKG CAQYTMP +P QNGV E+RNRTLMEMVRSM+++  +P+SLW+YA
Sbjct: 591 GQCPGPFAKLLESKGICAQYTMPGTPQQNGVAERRNRTLMEMVRSMLSNCKLPMSLWIYA 650

Query: 273 LRIVTYILNRIPSKAVPKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLDSKTISG 332
           L+  TYILNR+PSKAVP TP+EL+  RKPSLR+LHVWG   +++ YNPHEKKLDS+T++G
Sbjct: 651 LKTATYILNRVPSKAVPMTPFELFKGRKPSLRHLHVWGCPAEVKPYNPHEKKLDSRTVNG 710

Query: 333 YFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGVSGSVGAHDVEIKESLMDQNPSSD 392
           YFIGYP+ S G+ FYC +HSTRIVE+ NARFIENG  SGS  +  V IKE  ++ +    
Sbjct: 711 YFIGYPEKSKGFVFYCPSHSTRIVETGNARFIENGETSGSSESRGVNIKEIRVEDSSPVV 770

Query: 393 PSQVVVPIIVVIP 406
           P+QVV+P++ V P
Sbjct: 771 PTQVVIPVVGVQP 783

BLAST of Clc01G12100 vs. ExPASy TrEMBL
Match: A0A6D2KEK6 (Uncharacterized protein OS=Microthlaspi erraticum OX=1685480 GN=MERR_LOCUS38719 PE=4 SV=1)

HSP 1 Score: 253.1 bits (645), Expect = 2.2e-63
Identity = 121/189 (64.02%), Positives = 148/189 (78.31%), Query Frame = 0

Query: 213 GQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYA 272
           GQ   PF KLL+SKG CAQYTMP +P QNGV E+RNRTL EMVRSM+++S +P+SLW+YA
Sbjct: 540 GQCPGPFAKLLESKGICAQYTMPGTPQQNGVAERRNRTLKEMVRSMLSNSSLPLSLWIYA 599

Query: 273 LRIVTYILNRIPSKAVPKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLDSKTISG 332
           LR  TY+LNR+PSKAVPKTPYELWT RKPSLR+L VWG   ++R+YNPHEKKLDS+T+S 
Sbjct: 600 LRTATYVLNRVPSKAVPKTPYELWTGRKPSLRHLRVWGCPAEVRLYNPHEKKLDSRTLSS 659

Query: 333 YFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGVSGSVGAHDVEIKESLMDQNPSSD 392
           +FIGYP+ S GY FYC  HSTRIVE+ NARFIENG  SGS  +  V+I+E   + +    
Sbjct: 660 FFIGYPEKSKGYTFYCPKHSTRIVETGNARFIENGETSGSGESRKVDIQEIQDEVSSPVV 719

Query: 393 PSQVVVPII 402
              +VVPI+
Sbjct: 720 SPPIVVPIV 728

BLAST of Clc01G12100 vs. ExPASy TrEMBL
Match: A0A6N2K712 (Uncharacterized protein OS=Salix viminalis OX=40686 GN=SVIM_LOCUS38880 PE=4 SV=1)

HSP 1 Score: 252.7 bits (644), Expect = 2.8e-63
Identity = 128/231 (55.41%), Positives = 166/231 (71.86%), Query Frame = 0

Query: 213 GQISSPFKKLLKSKGTCAQYTMPRSPNQNGVTEKRNRTLMEMVRSMMNDSVIPISLWMYA 272
           GQ   PF KLL+SKG CAQYTMP +P QNGV E+RNRTLMEMVRSM+++  +P+SLW+YA
Sbjct: 534 GQCPGPFAKLLESKGICAQYTMPGTPQQNGVAERRNRTLMEMVRSMLSNCKLPMSLWIYA 593

Query: 273 LRIVTYILNRIPSKAVPKTPYELWTSRKPSLRYLHVWGWQPKMRIYNPHEKKLDSKTISG 332
           L+  TYILNR+PSKAVP TP+EL+  RKPSLR+L VWG   +++ YNPHEKKLDS+T+SG
Sbjct: 594 LKTATYILNRVPSKAVPMTPFELFKGRKPSLRHLRVWGCPAEVKPYNPHEKKLDSRTVSG 653

Query: 333 YFIGYPKMSNGYRFYCSNHSTRIVESRNARFIENGGVSGSVGAHDVEIKESLMDQNPSSD 392
           YFIGYP+ S G+ FYC +HSTRIVE+ NARFIENG  SGS  +  V IKE  ++ +    
Sbjct: 654 YFIGYPEKSKGFVFYCPSHSTRIVETGNARFIENGETSGSSESRGVNIKEIRVEDSSPVV 713

Query: 393 PSQVVVPIIVVIPPPK---------YPLNLGESMRTADTDPLATSTANLIP 435
           P+QVVVP++ V P  +          PLN     +  + +P+A    +++P
Sbjct: 714 PTQVVVPVVGVQPNTEIEQQNEEHTVPLN-----QEIENEPIAIQEQHIVP 759

BLAST of Clc01G12100 vs. TAIR 10
Match: AT2G27730.1 (copper ion binding)

HSP 1 Score: 95.5 bits (236), Expect = 1.1e-19
Identity = 59/99 (59.60%), Positives = 73/99 (73.74%), Query Frame = 0

Query: 54  SARTVARIFSRRFSSSGKILSEEEKAAENVYIKKTEQEKLEKLARKGPKPEEKAGGSVTD 113
           + R   RI SRRF SSGK+LSEEE+AAENV+IKK EQEKL+KLAR+GP  E+ AG +   
Sbjct: 2   ATRNALRIVSRRF-SSGKVLSEEERAAENVFIKKMEQEKLQKLARQGP-GEQAAGSASEA 61

Query: 114 SVPSGSASTSGASTEKISTDKHRNYAVVAGTVTILGALG 153
            V   +AS S  S  K+S DK+RNYAVVAG V I+G++G
Sbjct: 62  KVAGATASASAESGPKVSEDKNRNYAVVAGVVAIVGSIG 98

The following BLAST results are available for this feature:

vs. NCBI nr
Match Name | E-value | Identity (%) | Description
KAG7551855.1 | 1.3e-65 | 64.40 | Integrase catalytic core [Arabidopsis thaliana x Arabidopsis arenosa]
KAG7564986.1 | 1.3e-65 | 64.40 | Integrase catalytic core [Arabidopsis suecica]
RYE20332.1 | 1.4e-64 | 65.79 | transposase, partial [Sphingobacteriaceae bacterium]
RZC25410.1 | 1.2e-63 | 68.82 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 [Glycine soja]
CAA7051484.1 | 4.5e-63 | 64.02 | unnamed protein product [Microthlaspi erraticum]

vs. ExPASy Swiss-Prot
Match Name | E-value | Identity (%) | Description
Q9ZUX4 | 1.6e-18 | 59.60 | Uncharacterized protein At2g27730, mitochondrial OS=Arabidopsis thaliana OX=3702...
P04146 | 8.6e-17 | 32.79 | Copia protein OS=Drosophila melanogaster OX=7227 GN=GIP PE=1 SV=3
P10978 | 3.3e-16 | 32.14 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Nicotiana tabacum...
Q94HW2 | 7.0e-11 | 30.88 | Retrovirus-related Pol polyprotein from transposon RE1 OS=Arabidopsis thaliana O...
Q9ZT94 | 3.5e-10 | 28.16 | Retrovirus-related Pol polyprotein from transposon RE2 OS=Arabidopsis thaliana O...

vs. ExPASy TrEMBL
Match Name | E-value | Identity (%) | Description
A0A4Q3ELL0 | 6.7e-65 | 65.79 | Transposase (Fragment) OS=Sphingobacteriaceae bacterium OX=2021370 GN=EOP45_1124...
A0A445LQ30 | 5.7e-64 | 68.82 | Retrovirus-related Pol polyprotein from transposon TNT 1-94 OS=Glycine soja OX=3...
A0A6N2L229 | 1.3e-63 | 62.69 | Uncharacterized protein OS=Salix viminalis OX=40686 GN=SVIM_LOCUS166415 PE=4 SV=...
A0A6D2KEK6 | 2.2e-63 | 64.02 | Uncharacterized protein OS=Microthlaspi erraticum OX=1685480 GN=MERR_LOCUS38719 ...
A0A6N2K712 | 2.8e-63 | 55.41 | Uncharacterized protein OS=Salix viminalis OX=40686 GN=SVIM_LOCUS38880 PE=4 SV=1

vs. TAIR 10
Match Name | E-value | Identity (%) | Description
AT2G27730.1 | 1.1e-19 | 59.60 | copper ion binding

InterPro
Analysis Name: InterPro Annotations of Watermelon (cordophanus) v2
Date Performed: 2022-01-31
IPR Term | IPR Description | Source | Source Term | Source Description | Alignment
IPR036397 | Ribonuclease H superfamily | GENE3D | 3.30.420.10 | - | coord: 208..317; e-value: 1.3E-19; score: 72.2
None | No IPR available | MOBIDB_LITE | mobidb-lite | disorder_prediction | coord: 415..434
None | No IPR available | MOBIDB_LITE | mobidb-lite | disorder_prediction | coord: 112..128
None | No IPR available | MOBIDB_LITE | mobidb-lite | disorder_prediction | coord: 93..130
None | No IPR available | MOBIDB_LITE | mobidb-lite | disorder_prediction | coord: 187..214
None | No IPR available | PANTHER | PTHR33878 | OS08G0559000 PROTEIN | coord: 53..152
None | No IPR available | PANTHER | PTHR33878:SF1 | OS08G0559000 PROTEIN | coord: 53..152
IPR001584 | Integrase, catalytic core | PROSITE | PS50994 | INTEGRASE | coord: 202..300; score: 11.072097
IPR012337 | Ribonuclease H-like superfamily | SUPERFAMILY | 53098 | Ribonuclease H-like | coord: 215..308

Relationships

The following mRNA feature(s) are a part of this gene:

Feature Name | Unique Name | Type
Clc01G12100.1 | Clc01G12100.1 | mRNA


GO Annotation
GO Assignments
This gene is annotated with the following GO terms.
Category | Term Accession | Term Name
biological_process | GO:0071897 | DNA biosynthetic process
biological_process | GO:0015074 | DNA integration
biological_process | GO:0009231 | riboflavin biosynthetic process
molecular_function | GO:0008686 | 3,4-dihydroxy-2-butanone-4-phosphate synthase activity
molecular_function | GO:0003887 | DNA-directed DNA polymerase activity
molecular_function | GO:0003676 | nucleic acid binding
molecular_function | GO:0008270 | zinc ion binding