ID   SPCC1902   standard; DNA; FUN; 5701 BP.
XX
AC   AL049521;
XX
SV   AL049521.2
XX
DT   25-MAR-1999 (Rel. 59, Created)
DT   08-NOV-1999 (Rel. 61, Last updated, Version 2)
XX
DE   S.pombe chromosome III cosmid c1902.
XX
KW   gaf1; GATA zinc finger; RNA-binding region; RNP-1 signature.
XX
OS   Schizosaccharomyces pombe (fission yeast)
OC   Eukaryota; Fungi; Ascomycota; Schizosaccharomycetes;
OC   Schizosaccharomycetales; Schizosaccharomycetaceae; Schizosaccharomyces.
XX
RN   [1]
RP   1-5701
RA   Seeger K., Harris D., Wood V., Rajandream M.A., Barrell B.G.;
RT   ;
RL   Submitted (17-MAR-1999) to the EMBL/GenBank/DDBJ databases.
RL   European Schizosaccharomyces genome sequencing project, Sanger Centre, The
RL   Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, E-mail:
RL   barrell@sanger.ac.uk
XX
DR   GOA; Q10280; Q10280.
DR   GOA; Q9Y7P6; Q9Y7P6.
DR   SPTREMBL; Q9Y7P6; Q9Y7P6.
DR   SWISS-PROT; Q10280; GAF1_SCHPO.
XX
CC   Notes:
CC   Details of yeast sequencing at the Sanger Centre are available on
CC   the World Wide Web.
CC   (URL, http://www.sanger.ac.uk/Projects/S_pombe/)
CC   During 1995 to 1996 about 66% of S. pombe chromosome 1 was sequenced
CC   by the Sanger Centre. The sequencing of the S. pombe genome is now
CC   being continued with funding from The European Commission.
CC   Fourteen European sequencing laboratories, including the Sanger Centre,
CC   are participating in the project.
CC   Protein coding regions (CDS) have been predicted with the help
CC   of computer analysis using the Genefinder program in PomBase
CC   (an ACEDB database) with additional predictions for the
CC   branch-acceptor sites supplied by the program Sp3splice.
CC   CAUTION: It is possible that for any individual CDS we may have
CC   underestimated or overestimated the number of introns/exons or
CC   we may not have chosen the correct splice donor/acceptor sites.
CC   CDS are numbered using the following system eg SPBC25H2.01c.
CC   SP (S. pombe), B (chromosome 2), c25H2 (cosmid name),
CC   .01 (first CDS), c (complementary strand).
CC   The more significant matches with motifs in the PROSITE
CC   database are also included but some of these may be fortuitous.
CC   The length in codons is given for each CDS.
CC   IMPORTANT: This sequence MAY NOT be the entire insert of
CC   the sequenced clone. It may be shorter because we only
CC   sequence overlapping sections once, or longer, because we
CC   arrange for a small overlap between neighbouring submissions.
CC   Cosmid c1902 is overlapped at the 5' end by cosmid c663, EMBL entry
CC   SPCC663, accession number AL031307 and EMBL entry SPCC417,
CC   accession number AL035076.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..5701
FT                   /chromosome="III"
FT                   /db_xref="taxon:4896"
FT                   /organism="Schizosaccharomyces pombe"
FT                   /strain="972h-"
FT                   /clone="cosmid c1902"
FT                   /map="IIIR"
FT   misc_feature    1..103
FT                   /note="nominal overlap with SPCC663 S. pombe chromosome 3"
FT   CDS             join(complement(1967..2006),complement(1691..1852),
FT                   complement(103..1625))
FT                   /db_xref="GOA:Q9Y7P6"
FT                   /db_xref="SPTREMBL:Q9Y7P6"
FT                   /label=SPCC1902.02
FT                   /note="SPCC1902.02, len:574, SIMILARITY:Schizosaccharomyces
FT                   pombe, YAI7_SCHPO, hypothetical 62.5 kd protein , (561 aa),
FT                   fasta scores: opt: 368, E():1.2e-14, (25.9% identity in 517
FT                   aa)"
FT                   /gene="SPCC1902.02"
FT                   /product="hypothetical protein"
FT                   /gene="SPCC663.16c"
FT                   /protein_id="CAB40004.1"
FT                   /translation="MSEEDTSKLRILSVGSNAISAFISWRLSESKACHTTLIWRNRCES
FT                   VLSEGIRIRSSVFGSTKWKPDVVAPTVEQLAMNSEPFDYIFVCLKILPSVYNLDTAIKE
FT                   VVTPGHTCIVLNTTGIVGAEKELQHAFPNNPVLSFVLPDQFAQRGPLQFEHTTFAADSA
FT                   KSVIYVGLTEEEDDVPDSVQDAMIETLTLTLEAGGVSCDFLSKIQKKQWETGVGHMCFY
FT                   PLSIINDEPNLALMYRLKSFAKVIDGLMDEAFSIAQAQGCEFEPEKLDVLKRHIVNRML
FT                   ATPRPSYPYQDYIAHRPLEVAVLLGYPVEIAKELGVSVPRMETLLALFDAKNKRNLTVR
FT                   AGTPQSSPNFNPAMRRSPVGAASRSPSRSTIGISNRIGSVDDLLNTRQFTSSPIGGSMP
FT                   KGPNSIYKIPSASMVNLSSPLVTSPSGLNPTGRPSRFGGRVRGNPLTMNKAGSVSDLLS
FT                   TSTNMASSDALETASMIGVPSAMPPNSFDMLTLTQRRNRRNNQSSSPPAPSDRRYTMGA
FT                   RRPVQTRGMTDSVIPILEDPMSTLYDTSRYPTRNSKPATPKASRPPSIASTVHRRMD"
FT   misc_feature    complement(1163..1186)
FT                   /note="PS00030 Eukaryotic putative RNA-binding region RNP-1
FT                   signature"
FT   misc_feature    complement(1626..1643)
FT                   /note="ctaatcgagtcattttag, splice branch and acceptor"
FT   misc_feature    complement(1685..1690)
FT                   /note="gtatgt, splice donor sequence"
FT   misc_feature    complement(1853..1866)
FT                   /note="ctaacgtgctttag, splice branch and acceptor"
FT   misc_feature    complement(1961..1966)
FT                   /note="gtatgg, splice donor sequence"
FT   CDS             complement(3518..5701)
FT                   /codon_start=1
FT                   /db_xref="GOA:Q10280"
FT                   /db_xref="SWISS-PROT:Q10280"
FT                   /label=gaf1
FT                   /note="SPCC1902.01, len:727"
FT                   /partial
FT                   /gene="gaf1"
FT                   /gene="SPCC1902.01"
FT                   /gene="SPCC417.01c"
FT                   /product="gata-type transcription factor family"
FT                   /protein_id="CAB40003.1"
FT                   /translation="ADTTGSSLMEFNYIQRRVRKTSFDESTAKSKKRSIADSHFPDPNA
FT                   MQRPHDLESQPFSYPKIHASNSFNFVKRDIDSSNFSNLDASALPISPPSDFFSVHSHNL
FT                   PNAPPSIPANSNNSASPNQRIKASPKHADTDVLGLDFDMTPSEPSSFPENGGFPSFVDA
FT                   NTHEQTLFPSSATNSFSFEHGSAGFPIPGSVPSTSYHANTASEDGFSSSYNSQGLFGIS
FT                   SPLSSGVTPNQSFFPDVSGNNIFDVSRNNHEVSSPLIQSPGSYVSMPSINMVSSLPISA
FT                   PVPNSNSQFPRRPNTFRTNSSKSVGQGSSGVDSNQENAESFNPSISSHNSAEWASGETT
FT                   GHSSNSPLPGSDMFSPQFMRVGTAMGVAPVRSNSSNNFGQNFFHQTSPQFSAVPHRKVS
FT                   AQDTNLMGSSPGMYNHMPYLNRATSANSITSPGVLPEGMAASLKKRTTNTAATPQAALP
FT                   TTLDTKKDRSVSFNINKNAEKPTVSNAAEDKKGDANTRRANATNPTPTCTNCQTRTTPL
FT                   WRRSPDGQPLCNACGLFMKINGVVRPLSLKTDVIKKRNRGVGTSATPKQSGGRKGSTRK
FT                   SSSKSSSAKSTAADMKPKADSKSISPGFVGGNQSLSSERIPLDPSMRSPLQQQSSENES
FT                   KSQSMLSANNLNAGVNDFGLGFSEGLGSAHLDSNDSSMVQGKNDFAPVVDSPLFDAFDT
FT                   DLGMSSVAESHTMNMDPSDLSRVSKSWDWYSVM"
FT   misc_feature    complement(4076..4195)
FT                   /note="Match to PF00320 GATA, GATA zinc finger"
FT   misc_feature    5575..5701
FT                   /note="nominal overlap with SPCC417 EM:AL049512 S. pombe
FT                   chromosome 3"
XX
SQ   Sequence 5701 BP; 1922 A; 1074 C; 1180 G; 1525 T; 0 other;
     cggaagtaga agattatacc gaatattcaa acattggaga ggaacaataa attaaaaaag        60
     gaatttaatt agtttacacg aaagaaagac aacctcaaag atctagtcca tcctacgatg       120
     caccgtagag gcaatcgaag gaggccgaga agcctttgga gtcgccggtt tggaattcct       180
     agtagggtaa cgactagtat cgtaaagagt tgacattgga tcctctaaaa tagggatcac       240
     tgaatcggtc attcctcttg tttgaactgg acgcctagca cccatagtat agcgtcgatc       300
     ggacggcgca ggaggactgg aagattgatt atttcgacga tttcgccttt gagttaatgt       360
     cagcatatca aacgaatttg gaggcatggc tgaaggcaca cctatcatgg acgcagtttc       420
     caaggcatcc gagctagcca tgttagtaga agtacttaaa agatcagaaa cagagccagc       480
     cttgttcatc gttaaaggat tgccgcgaac gcgacctcca aaacgtgaag gcctgcccgt       540
     gggattcaag cctgaaggag aggtaacgag aggcgaagac aagtttacca tagaagcact       600
     aggaattttg tatatggagt tgggaccttt gggcatagag cctccgatag gagaagaagt       660
     aaactgtcta gtgttcaaca agtcatcgac acttccaata cggttagata tgccaatggt       720
     ggagcgagat ggtgagcgtg aggcagcacc cactggagag cgacgcattg caggattgaa       780
     attgggggaa gattgaggcg ttccagctct cacagttaaa tttcttttat ttttagcatc       840
     aaacagagct agtaaagttt ccatacgagg aacggatacg ccaagctcct tggcgatttc       900
     gactgggtaa cccaataaaa ctgcaacctc caaaggtcga tgtgcaatat aatcctgata       960
     aggataagaa ggtcgaggag ttgcaagcat acggttgaca atatgacgtt tcaaaacatc      1020
     caacttttca ggttcaaatt cgcagccttg ggcttgtgca atggaaaatg cttcatccat      1080
     cagaccatca attacttttg caaaagactt aaggcggtac attaacgcta aatttggttc      1140
     gtcatttata atagacaatg ggtagaaaca catatgacca acccctgttt cccattgctt      1200
     tttctgaatt ttggaaagaa agtcacaaga gacaccacct gcttccaaag ttaatgtaag      1260
     agtttcgatc attgcatctt gaactgaatc ggggacatcg tcctcctctt cagtgagtcc      1320
     aacataaatg acacttttag cagaatccgc agcaaatgtg gtatgctcaa attggagagg      1380
     ccccctttga gcaaattgat caggcaaaac aaaagagaga accggattat tagggaatgc      1440
     gtgctgtaat tctttttcgg ctcctactat accagtggta tttaagacga tgcacgtgtg      1500
     gccaggggta actacctctt taattgctgt atctaaatta tatactgatg gaagaatttt      1560
     taaacataca aaaatatagt cgaacggttc agaattcata gcaagctgct caactgtcgg      1620
     ggcaactaaa atgactcgat tagcaacttg atacggagaa aaaataataa gttaaattca      1680
     accaacatac caacatccgg tttccattta gttgaaccaa aaacacttga acgaatacgg      1740
     ataccttcag aaagaacact ctcgcaacga tttctccaaa ttaaagtagt atgacaagcc      1800
     ttactctcgg agagacgcca gctgataaat gcagaaatcg cattgcttcc gactaaagca      1860
     cgttagcaaa aaaattataa attttttaaa agcgatcaaa ttaatgcacg cgatattcaa      1920
     tacaaaaaat aaaataacac gcagtttaaa caaaaaaaaa ccataccact taatatgcga      1980
     agcttagatg tatcttcttc cgacatcgtt ttttaacagc aaccttaagg gatataaaaa      2040
     atcctggaac acaataaaag atactattaa atatctattt atataaaaaa gatatttttt      2100
     caaaacgtcc aatattaaaa caaaagaagg atcaaccttc tctattctat agtttcgctt      2160
     ttaccgtata tacacacttc caagcaacaa catccacaaa caaactaaat tctgcgcttg      2220
     tatttttctg aaaaggaaag tttgtgttct acaaactttt ctttatgaat taagtcgttt      2280
     ttcttagcga gaaattgctc caaactggaa tttattgatt ctagcaagga gctagtgttt      2340
     ggtagacggt aagtgaacag ctgtgtgtat gtttatgttg gaagatatta gcattctgga      2400
     gtgatgcttt gctataattc attttaataa actaattaat tatattcaaa tatttttctt      2460
     tcttttcaaa acgtaatcat gcttgagtat atagcgccaa gaagttttga aaggctggca      2520
     cagccgacat acacatatac acctcaacca aaagcacttt agcttagctg aatggttggg      2580
     atttagaaac cccaaacttc aaaatcttta agtaactcca aattccagat attccaattt      2640
     ttttttattt taattgatta gaaaattttt ttttataaat tgttctagct actgacttta      2700
     agagtatgaa ttcaagatta tactatgact acggtataaa catggtcact taataactta      2760
     tattcaaaag atgaatagca aactaaacta acaacaatca catgtgtaaa caggattcca      2820
     gaacgaacgc caaggattga gacaaacaaa gatagttaca aaccaagatt tttatacaaa      2880
     tataaatatg taggttccga cctccattta acgagcatac aaaagctaca aagaaaccca      2940
     taattatcgg acgaagagtc aaagaaaaaa gtttgacata aacagattat aaaaaaacaa      3000
     atatatgcct aatcggacga attgaaacaa attaactcta taattaaaga cataaataac      3060
     ctcaagtagt agaattctgt ttttttttat aaaaacacag tgcagacatt tcagtaatga      3120
     gtttgaagcg aataaagaga attaaaagaa gtaaataaac aatatttgtc aaaacatcac      3180
     aaaattagga aaagctcaac ttgtatcacc aatcggtttc taagaaccga actaaccaaa      3240
     cacaccaaag aggcagtaag gataatgcca tctgccagca gaaaaactcc catcaaattt      3300
     aaattacttt gtgggaacaa ggagccaaat agtgatgaag ccatacttcg tccatcatgc      3360
     acacgtaagc ctcttgctca tacaattaat cgacttttcc gacaagaaaa aaattcaagt      3420
     cgaaaatata ctatctaccg aaaatgatga tcaacgcatt gtgttttata taaaaaacaa      3480
     tcgaaaatca gcgaaactgc catagtttcg atataattta cataacgcta taccaatccc      3540
     aggatttgga aactcgtgac aaatccgatg gatccatatt catagtgtgc gattcagcaa      3600
     cagaagacat gcccaagtct gtgtcaaaag catcaaataa aggcgaatca acaacaggcg      3660
     caaagtcgtt cttaccttgc accatagacg agtcattaga atcaaggtgg gcactaccca      3720
     atccttcaga aaaacctaac ccaaagtcat ttacaccagc gtttaagttg ttggctgata      3780
     gcatggactg ggatttggat tcattttcag aggactgttg ctgcaaagga ctccgcatgc      3840
     taggatctaa aggtatcctt tctgaggata aggattgatt tccaccaacg aatccaggag      3900
     aaatagattt gctatcagct ttcggcttca tatcagcagc agtcgatttg gcactcgacg      3960
     atttagagct gctcttccta gttgatccct tacgcccgcc tgactgttta ggtgtagccg      4020
     aggtaccaac accacgatta cgttttttaa taacgtcggt ttttaaacta aggggtctca      4080
     cgacaccatt gatcttcata aacaaaccac aagcattaca caaaggttgt ccatcgggac      4140
     tgcgacgcca caatggagtg gtacgtgtct gacaatttgt acaggtagga gtggggttag      4200
     tggcattggc tctacgggta tttgcatctc cctttttgtc ttcagcagcg tttgaaaccg      4260
     ttggcttctc ggcgttctta ttgatattaa atgacacaga acgatctttt ttcgtatcca      4320
     acgtagtagg tagtgcagct tgcggagtag cagcagtatt ggtggtgcgt ttcttcaaac      4380
     ttgcagccat cccttcgggc agcaccccag gagaagttat ggagttagcg gaagtagcgc      4440
     gatttaagta aggcatatga ttgtacatcc caggactaga ccccattaaa ttggtatcct      4500
     gagcacttac tttgcgatga ggaacagcag aaaattgtgg actggtttga tgaaagaaat      4560
     tctgcccaaa gttgttagag gaattgctgc gtactggagc aacacccata gcggttccaa      4620
     cacgcatgaa ttgcggagaa aacatatcag aaccagggag aggactgttg gatgaatgcc      4680
     cggtagtttc accagatgcc cattcagcac tattgtggct cgaaatagaa ggattaaatg      4740
     attcagcatt ttcctgattt gaatcaacgc cagaagaacc ttgacctaca gatttgcttg      4800
     aattcgtacg gaaagtatta ggacgacggg ggaattgaga gttcgagttc ggtacagggg      4860
     ccgagatagg tagcgaacta accatattga tagatggcat actgacatat gatcctggtg      4920
     actggattaa tggcgacgaa acttcatgat tgtttctaga gacatcaaag atattatttc      4980
     ccgaaacatc gggaaaaaat gattgattag gtgtgacacc agaacttaaa ggtgatgaaa      5040
     ttccaaaaag cccttggcta ttataggatg agctgaagcc atcttctgat gcagtattag      5100
     catgatatga ggtggaagga acactgcctg gaatgggaaa tccagcagag ccatgctcaa      5160
     aactaaagga gttggtggcg gaagaaggaa aaagagtttg ttcatgagta ttggcgtcaa      5220
     caaatgaggg aaagcctcca ttttctggaa atgatgaagg ttctgatgga gtcatatcaa      5280
     agtccaatcc caaaacatcc gtatctgcat gctttggaga agctttgata cgctgattag      5340
     gagaagcaga attattagaa ttagcgggaa tggaaggggg agcattaggc aaattgtgag      5400
     aatgaacact aaaaaagtca gaaggaggag atataggcaa tgctgaggcg tcaagattag      5460
     aaaaattaga gctgtcaatg tctctcttaa cgaaattaaa agaattagaa gcatgaatct      5520
     ttgggtagct aaatggttgg gattccaaat catggggtct ttgcatagca ttgggatctg      5580
     gaaaatgaga atcggcaata gagcgctttt tactttttgc agtagactcg tcaaagcttg      5640
     ttttacgcac gcgacgctga atataattaa attccatcaa cgagcttcca gtggtatcag      5700
     c                                                                      5701
//