ID   SPAPJ760 standard; DNA; FUN; 5298 BP.
XX
AC   AL162631;
XX
SV   AL162631.1
XX
DT   29-MAR-2000 (Rel. 63, Created)
DT   22-MAY-2000 (Rel. 63, Last updated, Version 2)
XX
DE   S.pombe chromosome I PCR product pJ760.
XX
KW   dipeptidyl aminopeptidase; extensin-like; proline-rich protein;
KW   SH3 domain (x2).
XX
OS   Schizosaccharomyces pombe (fission yeast)
OC   Eukaryota; Fungi; Ascomycota; Schizosaccharomycetes;
OC   Schizosaccharomycetales; Schizosaccharomycetaceae; Schizosaccharomyces.
XX
RN   [1]
RP   1-5298
RA   Harris D., Wood V., Rajandream M.A., Barrell B.G.;
RT   ;
RL   Submitted (29-MAR-2000) to the EMBL/GenBank/DDBJ databases.
RL   European Schizosaccharomyces genome sequencing project, Sanger Centre, The
RL   Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, E-mail:
RL   barrell@sanger.ac.uk
XX
DR   GOA; Q9P7E8; Q9P7E8.
DR   GOA; Q9P7E9; Q9P7E9.
DR   SPTREMBL; Q9P7E7; Q9P7E7.
DR   SPTREMBL; Q9P7E8; Q9P7E8.
DR   SPTREMBL; Q9P7E9; Q9P7E9.
XX
CC   Notes:
CC   Details of yeast sequencing at the Sanger Centre are available on
CC   the World Wide Web.
CC   (URL, http://www.sanger.ac.uk/Projects/S_pombe/)
CC   During 1995 to 1996 about 66% of S. pombe chromosome 1 was sequenced
CC   by the Sanger Centre. The sequencing of the S. pombe genome is now
CC   being continued with funding from The European Commission.
CC   Fourteen European sequencing laboratories, including the Sanger Centre,
CC   are participating in the project.
CC   Protein coding regions (CDS) have been predicted with the help
CC   of computer analysis using the Genefinder program in PomBase
CC   (an ACEDB database) with additional predictions for the
CC   branch-acceptor sites supplied by the program Sp3splice.
CC   CAUTION: It is possible that for any individual CDS we may have
CC   underestimated or overestimated the number of introns/exons or
CC   we may not have chosen the correct splice donor/acceptor sites.
CC   CDS are numbered using the following system eg SPBC25H2.01c.
CC   SP (S. pombe), B (chromosome 2), c25H2 (cosmid name),
CC   .01 (first CDS), c (complementary strand).
CC   The more significant matches with motifs in the PROSITE
CC   database are also included but some of these may be fortuitous.
CC   The length in codons is given for each CDS.
CC   IMPORTANT: This sequence MAY NOT be the entire insert of
CC   the sequenced clone. It may be shorter because we only
CC   sequence overlapping sections once, or longer, because we
CC   arrange for a small overlap between neighbouring submissions.
CC   PCR product pJ760 is overlapped at the 5' end by cosmid c14C4,
CC   EMBL entry SPAC14C4, accession number Z98596, and at the 3'
CC   end by cosmid c2H10, EMBL entry SPAC2H10, accession number AL034486.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..5298
FT                   /chromosome="I"
FT                   /db_xref="taxon:4896"
FT                   /organism="Schizosaccharomyces pombe"
FT                   /strain="972h-"
FT                   /map="IR"
FT   misc_feature    1..107
FT                   /note="nominal overlap with cosmid SPAC14C4, EM:Z98596 S.
FT                   pombe chromosome 1"
FT   CDS             complement(1..436)
FT                   /db_xref="GOA:Q9P7E9"
FT                   /db_xref="SPTREMBL:Q9P7E9"
FT                   /label=SPAPJ760.01c
FT                   /note="SPAPJ760.01c, SIMILARITY: Saccharomyces cerevisiae.,
FT                   DAP2_YEAST, dipeptidyl aminopeptidase B., (818 aa), fasta
FT                   scores: opt: 830, E(): 0, (26.2% identity in 748 aa)"
FT                   /partial
FT                   /gene="SPAPJ760.01c"
FT                   /gene="SPAC14C4.15c"
FT                   /product="dipeptidyl aminopeptidase"
FT                   /protein_id="CAB83084.1"
FT                   /translation="MNAYEGDTLNNHGKSSTRQHWRKRSAVSSSLEFSSYEESNSPIEN
FT                   TEVLKVSEIEAKKRRRKKHRYIYLAVCLFFLASVLSCAIIFRFYLHTNRENFSLFKNDS
FT                   YKQKEPITVSHFGESIFLPYHQDIEWITSTEGTVLYYDQST"
FT   CDS             join(complement(3566..3713),complement(1075..3500))
FT                   /db_xref="GOA:Q9P7E8"
FT                   /db_xref="SPTREMBL:Q9P7E8"
FT                   /label=app1
FT                   /note="SPAPJ760.02c, len:857, SIMILARITY: some to extensin
FT                   & mucin like proteins"
FT                   /gene="SPAPJ760.02c"
FT                   /gene="app1"
FT                   /product="actin binding protein with SH3 domains"
FT                   /protein_id="CAB83085.1"
FT                   /translation="MSFQLDTSTHGAEIRNVYEKVLSGADDCSWAIFGYEKGQGNILKV
FT                   VASGNDNDEFLDEFDENAVLFGFLRVKDVNTGLNKFVLVCWCGEAAARKGLFSIHMATV
FT                   SNLLKGYHVQITGRESSDLNMDDIIRRVADASGSKYSVHTSNSTPQSKHNAFYDASQTF
FT                   GSTAKVAPAPAPSTKTPLANISKPVVQAQKDSKDNSWDDSSKQSNTQTANTTSNLRVPV
FT                   NASWSDAGRKEKSQENKPKPTPFGSGGPSKPTPFESHGPAKQISVQPSEHPKPSISTTT
FT                   TGSSYRSAESSHAPTTPDHFKLTPLTKLEPQPPSGSPSKKPVSELEELHTAGNVNLSAR
FT                   RALFEKKESSTKNVENPVSHHLKSPVRTSFPPASTTASKQDSPSTVPVDKQETAKPINK
FT                   QVSSNETSAQEEPRESVAALRARFAKANVSENNDPPTFPKTAAKISSFNSKAGTSFAKP
FT                   RPFTNNPNPISAPEKPTSGESLSLNPPPAMPKVFPERDISSASQKAAQPSVITPSVPQP
FT                   PAAPVVPEAPSVHQPPAAPVAPEVPSAPQRPAAPVVPEAPSVPQRPAVPVVPEALSVPQ
FT                   PPVAPVAPEVPSVPQPPVAPVVPEAPSVPQPPVAPVAPEVPSVPQRPAVPVVPEAPSVP
FT                   QPPAAPVVPEVPSVPQRPAVPVVPEAPSVPQPPAAPVVPEVPSVPQPPAVPVVPEAGQL
FT                   NEPVVPPLPPHDETQEPQVGGDVKATEHTQPTKTPAIVIYDYSPEEENEIELVENEQIQ
FT                   ILEFVDDGWWLGENSKGQQGLFPSNYVEITGPNETANNPPAEPQAGGPGKSVKAIYDYQ
FT                   AQEDNELSFFEDEIIANVDCVDPNWWEGECHGHRGLFPSNYVEEI"
FT   misc_feature    complement(1078..1242)
FT                   /note="Match to PF00018 SH3, SH3 domain Score 86.55"
FT   misc_feature    complement(1303..1461)
FT                   /note="Match to PF00018 SH3, SH3 domain Score 77.96"
FT   misc_feature    complement(3501..3519)
FT                   /note="ctaactttataattttaag, splice branch and acceptor"
FT   misc_feature    complement(3560..3565)
FT                   /note="gtacgg, splice donor sequence"
FT   CDS             complement(4536..5036)
FT                   /db_xref="SPTREMBL:Q9P7E7"
FT                   /note="SPAPJ760.03c, len:166"
FT                   /gene="SPAPJ760.03c"
FT                   /product="hypothetical threonine-rich protein"
FT                   /protein_id="CAB83086.1"
FT                   /translation="MFLRSIFQTLCAVSFLAGSVFADSGVSIVSTPATTTVYLVRTVDC
FT                   SSSEVTSQPVVTVYNVLKPDTVTFTVTETAGSYAKRSIEIDSDSVSPTSATTTTPVASA
FT                   TDVSVYSASIHVPTGNPPVDTHNPLSYDTEVTATTTFSIALPKFNKGDRVSSANTYSVS
FT                   FVA"
FT   misc_feature    5185..5298
FT                   /note="nominal overlap with cosmid SPAC2H10, EM:AL034486 S.
FT                   pombe chromosome 1"
XX
SQ   Sequence 5298 BP; 1654 A; 929 C; 1199 G; 1516 T; 0 other;
     atgtgctttg atcgtagtat aatacggtac cttcagttga ggtaatccat tcgatatctt        60
     gatgatacgg cagaaaaata gactcaccaa agtgacttac ggtgatcggc tctttttgct       120
     tataggaatc attcttgaag agtgagaagt tttcccgatt agtatgaaga taaaaacgga       180
     atattatagc acacgataaa actgaagcta agaaaaagag acagactgct agataaatgt       240
     atcgatgctt tttgcgccga cgctttttag cttcaatttc tgaaaccttt aaaacttcag       300
     tgttctctat ggggctgttc gactcttcat atgatgaaaa ttctaaagac gatgatactg       360
     cgctacgttt ccgccagtgt tgtctggttg atgactttcc atggttgttc aaggtgtcac       420
     cttcatatgc attcattcat gccactggaa taagtgactg agaagcgcca atttttactt       480
     ggtcttctta atatattgtc aaacagtatt gaatgtgatt tgatacattc gatgaccaca       540
     aatacaaata taaaatagcg aaatgcaacg aggaaatatg gagtagcaaa tacatatatc       600
     actatgctgt aatgaagtag gctaataaga ttgattattt tattgaatat ttgattagtt       660
     ctcgaatcat gagcacattt aaaagtatta cttctttgcc aatacacaaa ccaatttgca       720
     tatactcatg gcttataagt tggtgggtaa taaagtagca taatttgaac aaagaatcat       780
     aacatatata aacaccgaat tcgatacaaa taaatatata tcaaatataa taaattcaac       840
     agtgataggt aaaagaagaa acttaaaaaa tcagactttt attttcttta tttattttgg       900
     cttgcaaggc ttccagatga ttgattcatg aacacttctg gctagtagca aaagttttat       960
     gattgaccta aagagcattc caagcattta taaagtcata ttaaacccac atttctaggt      1020
     aaaagacagc cattgacaat atgcaaaggc ggtaggtacg tttaacgttc aagtctaaat      1080
     ttcctctaca taatttgagg gaaacagtcc tcggtgacca tgacactcac cttcccacca      1140
     attgggatca acacagtcca cattggcgat aatttcgtct tcaaaaaagc tgagctcgtt      1200
     atcttcttga gcctgataat cataaattgc cttcacagac ttaccaggac cccctgcttg      1260
     tggctcggca ggaggattat tggcagtctc attaggtcca gttatttcaa cgtagttaga      1320
     aggaaataaa ccttgttgac ctttagaatt ttctcctaac caccatccat catcgacaaa      1380
     ttccaaaatc tgaatctgct cattttctac taactcgatt tcgttttctt cctcaggaga      1440
     ataatcatat ataactatgg ccggagtctt tgtaggttgt gtatgctctg tagctttgac      1500
     atctccaccg acttgaggtt cttgagtttc gtcatgcggt gggagaggag gaaccacggg      1560
     ctcattaagt tgtccagctt ctggtacaac aggaacagcg gggggttgag gaactgaagg      1620
     aacctcaggt acaacggggg cagcaggtgg ttgagggacc gaaggtgctt caggtacaac      1680
     gggaacagca ggcctctgag ggaccgaagg aacttcaggt acaacaggag cagcaggtgg      1740
     ttgggggacc gaaggggctt ctggtacaac gggaacagca ggcctctgag gaactgaagg      1800
     aacttcaggt gcaacaggag caacaggtgg ttgaggaact gaaggagctt caggtacaac      1860
     aggagcaaca ggtggttgag gaactgaagg aacttcaggt gcaacaggag caacaggtgg      1920
     ttgggggacc gaaagggctt ctggtacaac gggaacagca ggcctctgag gaactgaagg      1980
     agcttctggt acaactggag cagcaggcct ctgaggggcc gaaggaactt caggtgcaac      2040
     tggagcagca ggcggttggt ggaccgaagg agcttctggt acaactggag cagcaggcgg      2100
     ttgaggaacc gaaggagtaa tgactgatgg ttgcgccgct ttttgtgaag cagatgagat      2160
     atccctttct ggaaaaacct ttggcattgc tggaggtgga tttaacgaaa gactttcgcc      2220
     agatgtcggt ttttcaggag cgcttatggg attgggatta tttgtaaaag gtcgaggttt      2280
     ggcaaaagaa gttcctgcct tcgaattaaa gctcgatatt ttagcagccg tcttgggaaa      2340
     tgtaggagga tcgttatttt cactaacatt tgcttttgcg aaacgagctc gaagcgctgc      2400
     aacgctttca cgtggctctt cctgtgctga agtttcattg gatgaaacct gtttatttat      2460
     aggtttggcg gtttcttgtt tgtcgacagg taccgttgat ggggaatctt gtttgctggc      2520
     agtagttgaa gcaggaggaa aacttgtgcg aacaggactt ttaagatgat gtgatactgg      2580
     gttttcaaca ttcttagttg aggattcttt cttttcgaac aaagcacgtc tagcagataa      2640
     attcacatta ccagcggtat gcagctcttc taattcagag acaggttttt ttgagggaga      2700
     accagaaggt ggttgaggtt cgagtttagt caaaggagta agtttaaaat gatccggagt      2760
     agttggagcg tgactggatt ctgcagagcg atagcttgag ccagtagttg tagttgaaat      2820
     agaaggcttc gggtgttcgc taggctgaac agagatttgc ttagctggcc catgagattc      2880
     aaaaggagta ggctttgatg gcccaccaga cccaaaagga gtgggtttag gcttattttc      2940
     ttgagatttt tctttacgac cggcatccga ccacgatgca tttacaggga cacgtaagtt      3000
     tgaagtagta tttgcagtct gggtattgct ttgcttagac gaatcgtccc aactattgtc      3060
     ctttgagtct ttctgcgctt gcaccactgg ctttgaaata ttggccaaag gggttttggt      3120
     agaaggagct ggtgcagggg caaccttggc agtacttcca aaagtctggc tagcatcgta      3180
     aaatgcgtta tgctttgatt gaggagttga attagaagta tgaacggagt atttactacc      3240
     actagcatct gctactcgac gaataatatc atccatattt agatcagacg actcacgtcc      3300
     cgttatttga acatgataac ctttcaaaag attggaaacc gtagccatat gtatagagaa      3360
     caaaccttta cgggctgctg cttcaccaca ccaacatacc aaaacaaatt tatttaatcc      3420
     agtgttgaca tccttcacac gtagaaaccc aaacagcaca gcgttttcat caaattcatc      3480
     aagaaattca tcattgtcgt cttaaaatta taaagttaga aaaatcgtca tcaaaataaa      3540
     taaaactatt gtaaaaacac cgtacttcca gaagccacca cttttaagat atttccttgt      3600
     cctttttcat atccaaaaat ggcccaagag caatcatcag caccgctgag tactttttca      3660
     taaacgtttc taatctcagc tccatgcgtg ctcgtatcta gttgaaatga cattttgggc      3720
     gggaatggta aatggaagga attgcgctgt acaatatgct tccgtcattg tgttcctcat      3780
     tcatggtact atgtgctatt atcatattag taaaagtgga aatcttgttt atgaataagc      3840
     aattgtaata ctctagtgtt ttaataatat tatggatttc ggaaatatga tcgtctcaat      3900
     gaacgcttag tccttcatgg ttgatattag tttgacgtag cattagtgta tttcagctaa      3960
     ttctaaggta ttgatatttc tttatacgta tagttaatat taactacctt ttagtcacta      4020
     catacatttc aaaagctata ggtagagtat actaataata ggaatatgtt taactatctc      4080
     tagttgaatc agcgatagat gattttctct gagtctgtaa tttgtttgat aattaaaata      4140
     ataaaaacaa ttttctatga atataatgat tttgatgctt ttcgaatgtt catgaaaaaa      4200
     tctggatcac accagttttc tgtaaacaac aatggaatgc tggccagcag catttgaagg      4260
     attcagttga caaaagatca aaataaaggg tgatcctaca aacagcaaga ctagtaagtt      4320
     ttgaacatgt ataagatatt gttgttaaga attgatagga aaaacatgat cagtatttcg      4380
     ggtccaaaaa aaaaaataaa cgaaatgaag tgcaggtaga atgaacatga aaatgacaga      4440
     ccaatggaaa tgatagatac ttagaaagtt attgctcgtg caatcctcga cggattgaag      4500
     gcgtccaaaa agttttacaa gtaggcatca gagtattagg caacaaagga gacactgtag      4560
     gtgttggcag agctaacacg gtctccttta ttaaactttg gcaaggcaat tgaaaaggtg      4620
     gtagtagccg tcacctcagt gtcataagaa agagggttat gagtgtccac tggcggatta      4680
     ccggtaggaa catgaatgct ggcagagtat acggagacgt cagtagctga agcaacagga      4740
     gtagtcgtag tagcgctagt tggggaaact gaatcgctgt caatttcaat ggaacgcttc      4800
     gcgtaggaac cggcagtctc agtaacagtg aaagttacag tgtctggttt gaggacattg      4860
     taaacagtaa caacaggttg ggaggtgact tcagagcttg aacagtcaac agtgcgaact      4920
     aaatagactg tagtagtagc aggggtagag acaatcgaaa cacccgaatc agcaaaaacg      4980
     cttccagcga ggaaagagac agcgcacaag gtttggaaga tagagcgaag aaacatttta      5040
     taataaagta attttttgta aatagtcact caaagaaaac aaaaacaaaa agcaatgagt      5100
     attgaaatta tcccaattac ttgaaagaag tacaaaagca aggccttttg aaatataaaa      5160
     aaagatttaa gttattttac gcctgatctg ccaacttata taaaaatgaa ttaaggcgct      5220
     gaaaatcgaa ctgatggctg gtgaactgat caccattgta caaactggac cgtgtagcat      5280
     tttttttttt ttttcaaa                                                    5298
//