ID   SPAPJ696   standard; DNA; FUN; 5404 BP.
XX
AC   AL133359;
XX
SV   AL133359.1
XX
DT   08-DEC-1999 (Rel. 62, Created)
DT   14-JAN-2000 (Rel. 62, Last updated, Version 3)
XX
DE   S.pombe chromosome I PCR product p696.
XX
KW   aminopeptidase; conserved hypothetical; PX domain; SH3;
KW   Src homology domain; vacuolar protein sorting-associated protein.
XX
OS   Schizosaccharomyces pombe (fission yeast)
OC   Eukaryota; Fungi; Ascomycota; Schizosaccharomycetes;
OC   Schizosaccharomycetales; Schizosaccharomycetaceae; Schizosaccharomyces.
XX
RN   [1]
RP   1-5404
RA   McDougall R.C., Rajandream M.A., Barrell B.G., Brown S., Harris D.;
RT   ;
RL   Submitted (03-DEC-1999) to the EMBL/GenBank/DDBJ databases.
RL   European Schizosaccharomyces genome sequencing project, Sanger Centre, The
RL   Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, E-mail:
RL   barrell@sanger.ac.uk
XX
DR   GOA; Q9URW5; Q9URW5.
DR   GOA; Q9URW7; Q9URW7.
DR   SPTREMBL; Q9URW5; Q9URW5.
DR   SPTREMBL; Q9URW6; Q9URW6.
DR   SPTREMBL; Q9URW7; Q9URW7.
XX
CC   Notes:
CC   Details of yeast sequencing at the Sanger Centre are available on
CC   the World Wide Web.
CC   (URL, http://www.sanger.ac.uk/Projects/S_pombe/)
CC   During 1995 to 1996 about 66% of S. pombe chromosome 1 was sequenced
CC   by the Sanger Centre. The sequencing of the S. pombe genome is now
CC   being continued with funding from The European Commission.
CC   Fourteen European sequencing laboratories, including the Sanger Centre,
CC   are participating in the project.
CC   Protein coding regions (CDS) have been predicted with the help
CC   of computer analysis using the Genefinder program in PomBase
CC   (an ACEDB database) with additional predictions for the
CC   branch-acceptor sites supplied by the program Sp3splice.
CC   CAUTION: It is possible that for any individual CDS we may have
CC   underestimated or overestimated the number of introns/exons or
CC   we may not have chosen the correct splice donor/acceptor sites.
CC   CDS are numbered using the following system eg SPBC25H2.01c.
CC   SP (S. pombe), B (chromosome 2), c25H2 (cosmid name),
CC   .01 (first CDS), c (complementary strand).
CC   The more significant matches with motifs in the PROSITE
CC   database are also included but some of these may be fortuitous.
CC   The length in codons is given for each CDS.
CC   IMPORTANT: This sequence MAY NOT be the entire insert of
CC   the sequenced clone. It may be shorter because we only
CC   sequence overlapping sections once, or longer, because we
CC   arrange for a small overlap between neighbouring submissions.
CC   PCR product p696 is overlapped at the 5' end by cosmid c1296,
CC   EMBL entry SPAC1296, accession number AL035439, and at the 3'
CC   end by cosmid c22G7, EMBL entry SPAC22G7, accession number Z54328.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..5404
FT                   /chromosome="I"
FT                   /db_xref="taxon:4896"
FT                   /organism="Schizosaccharomyces pombe"
FT                   /strain="972h-"
FT                   /map="IL"
FT   misc_feature    1..108
FT                   /note="nominal overlap with cosmid SPAC1296 S. pombe
FT                   chromosome 1"
FT   CDS             join(complement(1658..2020),complement(1137..1590),
FT                   complement(883..1088),complement(212..838))
FT                   /db_xref="GOA:Q9URW7"
FT                   /db_xref="SPTREMBL:Q9URW7"
FT                   /label=SPAPJ696.01c
FT                   /note="SPAPJ696.01c, len:549, LOW SIMILARITY:Saccharomyces
FT                   cerevisiae, VP17_YEAST, vacuolar protein sorting-associated
FT                   protein vps17., (551 aa), fasta scores: opt: 747, E():0,
FT                   (28.2% identity in 547 aa)"
FT                   /product="putative vacuolar protein sorting-associated
FT                   protein"
FT                   /gene="SPAPJ696.01c"
FT                   /protein_id="CAB62421.1"
FT                   /translation="MSSQHSTDDLMNSHVFSGGFSTLDDKGFQDVPIHTDMPGSISVEP
FT                   SSEDANVGSVNGNINETPVFEADRLIAEATMNPSAASSTTGENSISQTGSGPFLRIRIV
FT                   DIEAENSKDPVIKMNVQTTLPAYRSKLYKNVRRTHAEFKKFAKYLISTHPECLIPAVPE
FT                   AKTSVSSGIKEDLIYLKSGLQSWLNYVSTNPNLLYDPELQLFVESDYGYSPLINTGNPT
FT                   SGLKRKALKQFPLPPDPCQALANLRPIVKSFYKNAKDAEIKLEKLVNRKQSLALTHADL
FT                   GQSLIDYSVEEQHNGLANALNRVGKMLQAISDVRIMQSSKQLVTLADSLCYASDNAFVV
FT                   KEILSNRHILMRDLISSKNQTNSYLSAANRLQDSPKISKARTDDALQALEVARVHEKLL
FT                   SDKVDFVTLNLVKESKTYTKKTSVSLQKAIREYVEKEAYYERRLLSIMESIRPHIRNID
FT                   PFGGLSRLGREEYPRRLSNPPPSQKTNQDAWTNRKRPGYSSSFDGSSQSTFNPSNNDGA
FT                   HNTSENADELVEPPIGNERLDPKSVANLLNAI"
FT   misc_feature    complement(839..852)
FT                   /note="cttactggaattag, splice branch and acceptor"
FT   misc_feature    complement(877..882)
FT                   /note="gtaaaa, splice donor sequence"
FT   misc_feature    complement(1089..1105)
FT                   /note="ctaacttttcgacatag, splice branch and acceptor"
FT   misc_feature    complement(1131..1136)
FT                   /note="gtacgt, splice donor sequence"
FT   misc_feature    complement(1394..1735)
FT                   /note="Pfam match to entry PF00787 PX, PX domain,"
FT   misc_feature    complement(1591..1602)
FT                   /note="ctaacttcgtag, splice branch and acceptor"
FT   misc_feature    complement(1652..1657)
FT                   /note="gtatgt, splice donor sequence"
FT   CDS             join(2884..2920,2973..3013,3101..4315)
FT                   /db_xref="SPTREMBL:Q9URW6"
FT                   /label=SPAPJ696.02
FT                   /note="SPAPJ696.02, len:430, SIMILARITY:Mus musculus,
FT                   O08641, ray protein., (340 aa), fasta scores: opt: 735,
FT                   E():0, (37.1% identity in 428 aa)"
FT                   /product="conserved hypothetical SH3 domain protein"
FT                   /gene="SPAPJ696.02"
FT                   /protein_id="CAB62422.1"
FT                   /translation="MGLHNPLPSSLKSECKKAGKILTSFVDPRQTLGAQEVIPPSVLTN
FT                   AKGLVIMTVLKAGFLFSGRIGSGLIVARLDDGTWSAPSAVMTGGMGVGAQIGSELTDFV
FT                   IILNSKAAVQTFARLGSITLGGNLSIAAGPLGRNAEAGGGASVGGMAPMFSYSKTKGLF
FT                   AGVSLEGSVLVERRDANRSLYRGDITAKRLLSGQVAQPAAADPLYRVLNSKIFNLNRGD
FT                   EGDIYNDVPIYADDEPEDIWGPSSKSTKRRDSADRSSSYSRRGDSYRSNRSRAHDDDDE
FT                   DDYSFSRSKSLSRKTAGGSLRSSKMDNRRSKYADTPSPRRSRSYSDEDEESVYSSDVST
FT                   ESSSQFSSRSSEYSKPSRPTAPKPKFKQDSLGPNQARAMYSFAGEQPGDLSFQKGDIID
FT                   IVERSGSHDDWWTGRIGYREGIFPANYVKLS"
FT   misc_feature    2921..2926
FT                   /note="gtatgt, splice donor sequence"
FT   misc_feature    2961..2972
FT                   /note="ctaacatttaag, splice branch and acceptor"
FT   misc_feature    3014..3019
FT                   /note="gtatgt, splice donor sequence"
FT   misc_feature    3083..3100
FT                   /note="ttaactgttttaatttag, splice branch and acceptor"
FT   misc_feature    4145..4303
FT                   /note="Match to PF00018 SH3, Src homology domain 3 Score
FT                   78.78"
FT   CDS             complement(<4676..5404)
FT                   /codon_start=1
FT                   /db_xref="GOA:Q9URW5"
FT                   /db_xref="SPTREMBL:Q9URW5"
FT                   /label=SPAPJ696.03c
FT                   /note="SPAPJ696.03c, len:242, SIMILARITY:Arabidopsis
FT                   thaliana, O23206, aminopeptidase-like protein., (634 aa),
FT                   fasta scores: opt: 890, E():0, (54.4% identity in 241 aa)"
FT                   /product="putative aminopeptidase"
FT                   /gene="SPAPJ696.03c"
FT                   /protein_id="CAB62423.1"
FT                   /translation="MFMGLSFETISSTGPNGAVIHYSPPATGSAIIDPTKIYLCDSGAQ
FT                   YKDGTTDVTRTWHFGEPSEFERQTATLALKGHIALANIVFPKGTTGYMIDVLARQYLWK
FT                   YGLDYLHGTGHGVGSFLNVHELPVGIGSREVFNSAPLQAGMVTSNEPGFYEDGHFGYRV
FT                   ENCVYITEVNTENRFAGRTYLGLKDLTLAPHCQKLIDPSLLSPEEVKYLNEYHSEVYTT
FT                   LSPMLSVSAKKWLSKHTSPI"
FT   misc_feature    5305..5404
FT                   /note="nominal overlap with cosmid SPAC22G7 S. pombe
FT                   chromosome 1"
XX
SQ   Sequence 5404 BP; 1620 A; 994 C; 1052 G; 1738 T; 0 other;
     ggtaaataat gcaatgaaat accttacgta ctaaagttat ttttgatata ttgcttgaaa        60
     gacacagcca tgaatatctt gctgcttagt acttgaccag taaagatcat tagaatgcgg       120
     tattattaaa caggctacca acgataaaac gtaaataagt cttgaattaa tattttatcc       180
     aaatcaaaca acattatcat agcaaatcta gttaaatagc atttagtaag ttggctacgc       240
     ttttaggatc aaggcgttca ttaccaattg ggggttccac taattcgtca gcattttccg       300
     aagtattatg ggcaccatcg ttgttggacg gattaaaggt actttgagac gacccgtcaa       360
     agctggatga gtagccaggt cttttgcgat tagtccaagc atcttgatta gttttttgac       420
     taggaggtgg attgcttaac cgacgtggat attcctctct tccaagtctc gacaatccac       480
     caaaaggatc aatgttgcgg atatggggac ggatggattc cattatggac aaaagacgtc       540
     tttcatagta agcttctttc tccacatact ctcgaattgc cttctgtaag gaaacagaag       600
     tttttttagt atacgttttt gattctttaa ccaagtttaa ggtaacaaaa tccactttat       660
     ccgacaacaa cttttcatga acacgtgcaa cctcaagggc ttgtaaagca tcatctgtac       720
     gagctttact tatttttggg ctatcttgta aacggttggc agccgaaaga taggaattag       780
     tctgattttt actacttatc agatcgcgca taaggatatg cctgttcgat aaaatttcct       840
     aattccagta agtttaaatt acaaaaaaat cgaagttttt acctttacaa caaatgcgtt       900
     atcggaagca taacacagag aatcagctaa tgttacaagc tgtttgctgg attgcataat       960
     tcgaacatct gaaatagctt gtagcatctt tccaaccctg ttcaatgcat ttgctaatcc      1020
     attgtgctgt tcctcgaccg aataatcgat caaagattgt cccaaatcag catgtgttaa      1080
     tgccaaagct atgtcgaaaa gttagcaatt atgtaaaaaa tatttttttt acgtacattg      1140
     ttttcgatta acgagttttt caagtttaat ttctgcgtcc tttgcatttt tgtaaaaaga      1200
     tttaacgata ggtcgtaaat tagcaagagc ctgacaagga tctggtggca aaggaaactg      1260
     ttttaaagct tttcttttca atcctgatgt aggattgccc gtattaatca atggagaata      1320
     accataatca ctttcgacga aaagttgaag ctcgggatca tatagcaaat taggatttgt      1380
     agacacataa ttaagccatg attgtaagcc tgattttaag taaattaaat cctcctttat      1440
     tccgctactg acggatgtct tcgcctcggg aacggcagga attaaacatt cgggatgagt      1500
     agaaattaaa tattttgcaa attttttaaa ttctgcgtga gttcttcgaa catttttgta      1560
     caacttggaa cgataagccg gtaaagttgt ctacgaagtt agaaatcgaa gagttaagta      1620
     ttggagggaa aataaactac agacgttaca cacatacctg aacattcatt tttatgacag      1680
     ggtcttttga attttcagct tcaatatcaa ctattcgaat cctcaaaaaa ggcccagatc      1740
     ctgtttgcga gatactgttt tctccagtag tggatgaggc tgcagacgga ttcatcgtgg      1800
     cttcagctat caagcgatct gcttcaaaaa caggtgtttc gtttatattt ccattaactg      1860
     atccaacatt cgcgtcttct gaagaaggtt caactgatat ggagccaggc atatcagtat      1920
     gaataggaac atcttgaaag cctttgtcat caagggtgga aaagcctccg ctgaaaacgt      1980
     gagaattcat taaatcatct gtgctgtgtt gagatgacat tggtcaagcg attgttgtct      2040
     atcaaacgtg ataataatta gatttggatt caacaaacac aaagcaagaa agtagcagct      2100
     ataactgaat aatcaattgt aatggatagg agggtatgat tatacaaaat tctagtagag      2160
     tacaagcctc tatactgtat gattattttt aagatttaga agaaggtttt aacactttgc      2220
     aatggaaaag agaagtgaat acaaatgttg ggcagttgtc attactactt acattgttat      2280
     cgcggcgatt caacaacatt cgtaatctgc ttttacatta caaagtgtga ttcgttattc      2340
     taggtatatc aaatccaccg tatttttgtc tctttaaaat gatattctta aatctatatt      2400
     ttgttatatt tttagttgtc taagggtttc caaggttaaa agaatggttt taaatctaga      2460
     tggatagttt tcaggtgctt tctttgtaag gattgtaatg cagcattcga gctgttgtaa      2520
     atattagtgt agttatttca cattcagact tccttgattg aagtaaggta attgttacat      2580
     cggtttcgtt aagtatttgc acgtccatcc ataatgctaa gcaatctata attttgagtc      2640
     aataacaatt acttcgatat aaaccaaatc tatttaaaaa acgctaaaca atttacatac      2700
     gattttggga attcacaact gatttttgct ttaaccacag ttaaagtaac tggcttctac      2760
     gtcattatcg tatgcttacc gaattcatct ctaaagttct gtatatgttg ggatgcaatg      2820
     ggttccatgc tgctgctata acttaccctc ttcgtcaaac attcataaaa acaatcccta      2880
     aatatgggtc ttcataaccc tttaccttcg tctttaaaga gtatgtaatg actgatcatc      2940
     aattattgtt ggaaatgtaa ctaacattta aggcgaatgc aagaaggctg ggaaaatttt      3000
     gactagtttt gtggtatgtc aattttaaaa ttaagtttac tttttattat taaatgaccc      3060
     attgccttga tagaaaattt tattaactgt tttaatttag gacccgaggc aaactcttgg      3120
     agcgcaagaa gttattccgc cgtcggtatt gacgaatgct aaaggcttag ttattatgac      3180
     tgtacttaag gccggctttc ttttctctgg tagaattgga tccggtctaa ttgtcgcacg      3240
     tcttgacgat ggtacttggt ctgctccttc cgcagttatg actggcggga tgggagttgg      3300
     cgcacaaatt ggctctgaat taacagattt tgttattata cttaattcca aagctgctgt      3360
     tcaaactttt gctcggttgg gaagtattac tttgggtgga aacctttcaa tagccgctgg      3420
     acctttgggc cgaaatgccg aagctggtgg tggtgcaagt gttggcggca tggcgcctat      3480
     gttctcgtat agtaaaacca aaggtctttt cgcaggtgtt tctcttgaag gatctgtgtt      3540
     ggttgaacgt cgtgacgcta atcgaagtct ttatagaggt gatattactg ctaagcgact      3600
     tctttcgggc caagtagctc aacccgctgc agcggatccc ctttatcggg tccttaactc      3660
     taaaatattt aatttgaaca gaggtgatga aggtgacatt tataatgatg ttcctattta      3720
     tgctgatgat gagcccgaag atatctgggg tccctcctca aagtctacta aacgtcgaga      3780
     ctctgcagac cgatcttcct cttactctcg tcgtggtgac tcgtaccgca gcaatcgtag      3840
     tcgggctcat gatgatgatg atgaagatga ttatagtttc agtcgtagta aatctctttc      3900
     acggaaaact gcaggcggtt ctttacgttc ttctaaaatg gacaaccgta gatccaaata      3960
     tgcggatact ccatcccccc gtcgcagtcg tagttatagc gacgaagacg aagaaagtgt      4020
     ttatagttct gatgttagca cagaatcttc ctcgcaattc tcttctagga gttcagaata      4080
     tagcaagcca tctcgtccaa cggcaccaaa gcctaagttt aaacaggatt ctcttggacc      4140
     aaaccaagcc cgtgccatgt attcttttgc tggtgaacag ccaggtgatc tatcttttca      4200
     aaagggcgat atcattgata ttgtcgaaag aagtggttcc catgatgatt ggtggactgg      4260
     aagaattggc taccgcgaag gtatttttcc agctaactat gtaaaattgt cgtaatttcg      4320
     tttttaccgg tgcattaatc atatatatcc ttaatccccc tacattattt ctttatagct      4380
     ttttaccctt ttaccccttt acttttgaag caccgacttt tttagttttt acgctaaagg      4440
     aatagaaaaa aaaaataact cttaacccgg ctctatcatt aaaggtcttt gtacttaatt      4500
     ggtggctaca ataaaatcat atgtttataa ttctgttata atttttcaca tgaacgtagt      4560
     gctatttttt aacatgaatt gcaaaggagt atttgtgatg gcaaaataat atacacattg      4620
     aagtaatttc gtatacgatt gcattaattc aacatagctg agtttctcaa taccttcaaa      4680
     ttggggatgt atgtttggaa agccactttt tggcagatac agaaagcatt ggactcaatg      4740
     tggtgtaaac ttcagagtga tactcattca aatacttgac ctcctcggga gaaagaagag      4800
     atggatcaat aagcttttga caatggggtg caagagtgag gtctttaagt cccaaatagg      4860
     tgcggccagc aaagcggttt tcagtgttca cttctgtaat atacacacaa ttttcaacac      4920
     gatatccaaa atgtccatct tcataaaaac caggttcatt gcttgtaacc atgccagctt      4980
     gtaatggtgc actattaaaa acttcgcgag atccaatgcc gactggtagt tcatgaacat      5040
     ttaaaaagct tcctactcca tggccagtac catgtaaata gtccaagcca tatttccaaa      5100
     gatattgacg agcgagtaca tcaatcatgt aaccagtagt tccttttggg aaaacaatat      5160
     ttgcaagagc aatgtggcct ttcaaagcca acgttgcagt ctgacgctca aactcagacg      5220
     gctctccaaa atgccacgtt ctagtaacat ccgtggtacc atctttatac tgtgcaccag      5280
     aatcacaaag ataaatctta gtcggatcta tgattgcact tccagttgcg ggaggcgagt      5340
     aatggataac tgccccattc ggtccagttg aactaatagt ttcaaatgac agacccataa      5400
     acag                                                                   5404
//