Index of /monthly_releases/2025/pombase-2025-04-01/training_data_for_ML_and_AI
Name Last modified Size Description
alleles.tsv 2025-04-01 02:00 1.2M
publications_with_annotations.txt 2025-04-01 02:00 76K
This directory contains files that compile data intended for
training language models and other artificial intelligence models
for curation purposes.
- alleles.tsv
Information about curated alleles:
- gene_systematic_id
- gene_name
- allele_current_internal_id - the internal PomBase ID for the allele,
which changes with each release
- allele_name
- allele_type
- allele_description
- allele_synonyms
See this recent PomBase publication for details about allele
nomenclature:
https://doi.org/10.1093/genetics/iyad143
- publications_with_annotations.txt is a list of the PubMed
identifiers (PMID) of all publications with annotations in PomBase.
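As a minimal sketch of how the two files described above might be loaded, the Python below reads both into plain data structures. It rests on several assumptions that this README does not confirm: that alleles.tsv is tab-separated with its columns in the order listed above, that it may or may not begin with a header row, and that publications_with_annotations.txt holds one identifier per line. The function names read_alleles and read_pmids are hypothetical helpers introduced for illustration only.

```python
import csv

# Column names as listed in this README; ASSUMED to match the
# column order in alleles.tsv.
ALLELE_COLUMNS = [
    "gene_systematic_id",
    "gene_name",
    "allele_current_internal_id",
    "allele_name",
    "allele_type",
    "allele_description",
    "allele_synonyms",
]


def read_alleles(handle):
    """Return a list of dicts, one per allele row (hypothetical helper)."""
    # QUOTE_NONE: treat quote characters in descriptions as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        # ASSUMPTION: the file may start with a header row; skip it if so.
        if fields[0] == ALLELE_COLUMNS[0]:
            continue
        # Pad short rows so that every column name gets a value.
        fields = fields + [""] * (len(ALLELE_COLUMNS) - len(fields))
        rows.append(dict(zip(ALLELE_COLUMNS, fields)))
    return rows


def read_pmids(handle):
    """Return the non-empty lines of the publication list, stripped.

    ASSUMPTION: one identifier per line; each line is kept exactly as
    written, with no normalization of any prefix.
    """
    return [line.strip() for line in handle if line.strip()]


# Possible usage, with both files downloaded to the working directory:
#
#   with open("alleles.tsv", encoding="utf-8", newline="") as fh:
#       alleles = read_alleles(fh)
#   with open("publications_with_annotations.txt", encoding="utf-8") as fh:
#       pmids = read_pmids(fh)
```

Because allele_current_internal_id changes with each release, it is probably safer not to use it as a stable key when combining data from different releases.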
These files are part of PomBase release 2025-04-01.
For use of this dataset please cite:
Kim Rutherford, Manuel Lera-Ramírez, Valerie Wood
PomBase: a Global Core Biodata Resource - growth, collaboration, and sustainability
Genetics, February 2024
https://doi.org/10.1093/genetics/iyae007