
Software and pipeline


The workflow diagram (right) illustrates the data, tools, and processing steps used to produce the work presented on this site.

The data, tools, and scripts that make up the workflow are listed below. Each can be reached through the URLs on this page or by clicking the corresponding node in the diagram.

RNA quantification pipeline

[Workflow diagram nodes: btau_ARS-UCD1.2, FAANG, Ensembl, NCBI, FastQC, Trim Galore, STAR, RSEM, Nextflow, nf-co, nf-core/rnaseq, EpiDB, BTO, Uberon, CLO]


Data sources

Genome Sequence Reference Files

  • Genome version (Bos taurus ARS-UCD1.2): https://ftp.ensembl.org/pub/release-110/fasta/bos_taurus/dna/Bos_taurus.ARS-UCD1.2.dna_sm.toplevel.fa.gz
Transcript / Gene Annotation Files

  • Transcriptome GTF file (NCBI – Bos taurus): https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/002/263/795/
  • Transcriptome GTF file (Ensembl – Bos taurus): https://ftp.ensembl.org/pub/release-110/gtf/bos_taurus/
Pipeline Code

  • nf-core/rnaseq pipeline (Nextflow), version 3.12.0: https://nf-co.re/rnaseq/3.12.0

    Shell script to run the pipeline

    #!/bin/bash
    #SBATCH -o "stdout.%j"
    #SBATCH -e "stderr.%j"
    #SBATCH --mem 100GB
    ##SBATCH -p amd
    #SBATCH -N 1
    #SBATCH -n 20
    #SBATCH -t 12:00:00
    # GENOME and GTF must be set to the genome FASTA and GTF files listed under "Data sources".
    module load nextflow
    # Cache directory for the Singularity images pulled by Nextflow
    export NXF_SINGULARITY_CACHEDIR=/path/NXFContainers/
    nextflow run nf-core/rnaseq --input sample_sheet.csv --outdir star_rsem_output \
        --fasta "${GENOME}" --gtf "${GTF}" --aligner star_rsem \
        -profile singularity -resume
    * This script runs jobs on an HPC cluster with the SLURM job scheduler. It may be modified to fit your computing environment.
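The run script reads its samples from sample_sheet.csv. As a minimal sketch of how such a file could be assembled, the helper below writes the four-column header used by nf-core/rnaseq 3.12.0 (sample, fastq_1, fastq_2, strandedness) and one row per paired-end sample. The function name make_samplesheet and the <sample>_R1.fastq.gz / <sample>_R2.fastq.gz naming convention are assumptions made for illustration, not part of the EpiDB workflow; adjust them to match your own files.

```shell
#!/bin/bash
# Sketch only: build an nf-core/rnaseq sample sheet from paired-end FASTQ files.
# Assumption: reads are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
make_samplesheet() {
    local fastq_dir="$1"
    local r1 r2 sample

    # Header expected by nf-core/rnaseq
    echo "sample,fastq_1,fastq_2,strandedness"

    for r1 in "$fastq_dir"/*_R1.fastq.gz; do
        # If nothing matches, the glob stays literal; skip it
        [ -e "$r1" ] || continue

        # Derive the mate file and skip samples whose R2 is missing
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1, skipping" >&2
            continue
        fi

        sample="$(basename "$r1" _R1.fastq.gz)"
        # "auto" lets the pipeline infer library strandedness
        echo "${sample},${r1},${r2},auto"
    done
}
```

For example, `make_samplesheet /path/fastq > sample_sheet.csv` would write the sheet next to the run script. Samples without a matching R2 file are reported on stderr and left out, so single-end libraries would need a separate rule.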



    The processed TPM values can be visualized with our in-house tool as bar graphs showing expression levels by tissue and gene.
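To inspect the values behind such a bar graph without the in-house tool, the per-sample TPM for one gene can be pulled from the merged RSEM table. The sketch below assumes the merged gene-level TPM file written by the star_rsem route of nf-core/rnaseq (expected at star_rsem_output/star_rsem/rsem.merged.gene_tpm.tsv), a tab-separated table whose first two columns are gene_id and transcript_id(s), followed by one column per sample. The function name gene_tpm is hypothetical, and the file location should be checked against your own output directory.

```shell
#!/bin/bash
# Sketch only: print "sample<TAB>TPM" for one gene from a merged TPM table.
# Assumption: column 1 is gene_id, column 2 is transcript_id(s),
# and columns 3 onward hold one TPM value per sample.
gene_tpm() {
    local tpm_file="$1" gene="$2"
    awk -F'\t' -v gene="$gene" '
        # Header row: remember the sample name of each TPM column
        NR == 1 { for (i = 3; i <= NF; i++) name[i] = $i; next }
        # Matching gene row: emit one line per sample
        $1 == gene { for (i = 3; i <= NF; i++) print name[i] "\t" $i }
    ' "$tpm_file"
}
```

Calling `gene_tpm star_rsem_output/star_rsem/rsem.merged.gene_tpm.tsv <gene_id>` prints one line per sample, which can be passed to any plotting program. A gene identifier that is not in the table produces no output.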


    * All data use at EpiDB follows the FAANG data sharing statement.
    Iowa State University EpiDB Team, Koltes Lab