Index of /monthly_releases/2025/pombase-2025-04-01/training_data_for_ML_and_AI

Name                                 Last modified      Size
alleles.tsv                          2025-04-01 02:00   1.2M
publications_with_annotations.txt    2025-04-01 02:00    76K

This directory contains files that compile data intended for training
language models and other artificial intelligence models for curation
purposes.

 - alleles.tsv
   Information about curated alleles:
    - gene_systematic_id
    - gene_name
    - allele_current_internal_id - the internal PomBase ID for the allele,
                                   which changes each release
    - allele_name
    - allele_type
    - allele_description
    - allele_synonyms
   See this recent PomBase publication for details about allele
   nomenclature:
     https://doi.org/10.1093/genetics/iyad143

 - publications_with_annotations.txt is a list of the PubMed
   identifiers (PMID) of all publications with annotations in PomBase.
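
As an illustration of how the two files above might be read, here is a
minimal Python sketch. It is not an official PomBase loader. It assumes
that alleles.tsv is tab-separated with the columns in the order listed
above, that a header row may or may not be present, and that
publications_with_annotations.txt holds one identifier per line. The
function names read_alleles and read_pmids are hypothetical; please
check these assumptions against the downloaded files.

```python
import csv

# Column order as listed in this README. That the file uses exactly this
# order is an assumption to verify against the real download.
ALLELE_COLUMNS = [
    "gene_systematic_id",
    "gene_name",
    "allele_current_internal_id",
    "allele_name",
    "allele_type",
    "allele_description",
    "allele_synonyms",
]


def read_alleles(lines):
    """Parse tab-separated allele rows into a list of dicts.

    `lines` is any iterable of text lines, such as an open file.
    A first row that repeats the column names is treated as a header
    and skipped; rows with fewer fields are padded with empty strings.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for index, fields in enumerate(reader):
        if not fields:
            continue  # skip blank lines
        if index == 0 and fields[0] == ALLELE_COLUMNS[0]:
            continue  # skip a header row if one is present
        padded = fields + [""] * (len(ALLELE_COLUMNS) - len(fields))
        rows.append(dict(zip(ALLELE_COLUMNS, padded)))
    return rows


def read_pmids(lines):
    """Return the set of non-empty, whitespace-stripped identifiers."""
    return {line.strip() for line in lines if line.strip()}
```

Both functions accept an open file, for example
open("alleles.tsv", encoding="utf-8", newline=""). Because
allele_current_internal_id changes each release, it is probably safer
to key any derived data on gene_systematic_id and allele_name instead.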

These files are part of PomBase release 2025-04-01.

If you use this dataset, please cite:
  Kim Rutherford, Manuel Lera-Ramírez, Valerie Wood
  PomBase: a Global Core Biodata Resource - growth, collaboration, and sustainability
  Genetics, February 2024
  https://doi.org/10.1093/genetics/iyae007