RNA-seq Bioinformatics Toolkit

This Docker image contains a comprehensive set of bioinformatics tools for RNA-seq analysis, based on the requirements described in the AWS Setup guide from rnabio.org. Currently, the tools are limited to those needed for the bulk RNA-seq parts of the course.

Contents

Base Image

System Dependencies

All necessary system libraries and dependencies as specified in the “Perform basic linux configuration” section:

Bioinformatics Tools

Python Packages

Scientific Computing:

Data Analysis & Visualization:

R and Bioconductor

R Libraries:

Bioconductor Libraries:

Versioning

This Docker image follows Semantic Versioning (SemVer) for version management:

Version Management

Use the included version management script to handle releases:

# Check current version
./version.sh current

# Increment patch version (1.1.0 -> 1.1.1)
./version.sh patch

# Increment minor version (1.1.0 -> 1.2.0)
./version.sh minor

# Increment major version (1.1.1 -> 2.0.0)
./version.sh major

# Set specific version
./version.sh set 1.1.1
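
The increments in the comments above follow the usual SemVer rule: bumping one component resets every component to its right to zero. The following is a minimal sketch of that rule only. It is not the contents of version.sh, and the function name bump_version is hypothetical.

```shell
# Hypothetical helper illustrating SemVer increments; NOT the real version.sh.
# Usage: bump_version <major|minor|patch> <MAJOR.MINOR.PATCH>
# The body runs in a subshell, so its variables do not leak into the caller.
bump_version() (
    part="$1"
    # Split the version string on dots into its three components.
    IFS=. read -r major minor patch <<EOF
$2
EOF
    case "$part" in
        major) major=$((major + 1)); minor=0; patch=0 ;;
        minor) minor=$((minor + 1)); patch=0 ;;
        patch) patch=$((patch + 1)) ;;
        *) echo "unknown part: $part" >&2; exit 1 ;;
    esac
    echo "${major}.${minor}.${patch}"
)

bump_version patch 1.1.0   # prints 1.1.1
bump_version major 1.1.1   # prints 2.0.0
```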

The script automatically:

Automated Release Process

Use the release script to build, tag, and push Docker images to registries:

# Build and push to default registry (griffithlab)
./release.sh

# Push to a different registry
./release.sh --registry your-username

# Only build and tag locally (don't push)
./release.sh --tag-only

# Push to GitHub Container Registry
./release.sh --registry ghcr.io/your-username

# Show help
./release.sh --help

The release script:

For monorepo releases: The script uses component-specific tags like rnaseq-toolkit-docker-v1.1.1 to distinguish this Docker component from other parts of the repository.
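
As an illustration of the component-specific tag format, the sketch below builds a tag name from a component and a version, and recovers the version from a tag. Both function names are hypothetical, and release.sh may construct its tags differently.

```shell
# Hypothetical helpers; the real release.sh may construct tags differently.
component_tag() {
    # $1 = component name, $2 = version without the leading "v"
    printf '%s-v%s\n' "$1" "$2"
}

tag_version() {
    # Strip everything up to and including the last "-v" to recover the version.
    printf '%s\n' "${1##*-v}"
}

component_tag rnaseq-toolkit-docker 1.1.1    # prints rnaseq-toolkit-docker-v1.1.1
tag_version rnaseq-toolkit-docker-v1.1.1     # prints 1.1.1
```

With tags in this shape, `git tag -l 'rnaseq-toolkit-docker-v*'` lists only the tags that belong to this component.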

Post-Release Checklist

After modifying the Dockerfile and cutting a new release, confirm that each of the following steps is done:

  1. Update the changelog - Ensure CHANGELOG.md includes all relevant changes for the new version
  2. Commit all changes - Verify that all modifications are properly committed to the repository
  3. Create a git tag - Use the version management script to create an appropriate git tag
  4. Push git tag to GitHub - Push the created tag to make it available on GitHub
  5. Create GitHub release - Use the provided release template to create a new release on GitHub

# Example workflow after Dockerfile modifications:
./version.sh patch              # or minor/major as appropriate
git add -A                      # stage all changes
git commit -m "Release v1.1.1"  # commit changes
git push origin master          # push commits to GitHub
git push origin --tags          # push tags to GitHub
./release.sh                    # build and push Docker image
# Then create GitHub release using the generated template
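
The first checklist item can be verified mechanically before releasing. The sketch below is a hypothetical pre-release check, not one of the repository's scripts: it succeeds only if the changelog file contains the version string. It is a plain substring match, so it confirms that the version is mentioned, not that the entry is complete.

```shell
# Hypothetical pre-release check; not part of the repository's scripts.
# Succeeds only if the changelog file contains the given version string.
changelog_mentions() {
    # $1 = path to the changelog, $2 = version such as 1.1.1
    grep -qF -- "$2" "$1"
}

# Example (assumes CHANGELOG.md is in the current directory):
# changelog_mentions CHANGELOG.md 1.1.1 || echo "CHANGELOG.md has no entry for 1.1.1" >&2
```

For the second checklist item, `git status --porcelain` prints nothing when every change has been committed.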

Building the Docker Image

Prerequisites

Build Instructions

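  1. Navigate to the dockerfile directory (the path below is specific to one machine; adjust it to the location of your own rnabio.org checkout):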
    cd /Users/obigriffith/git/rnabio.org/docker/course/dockerfile_approach/
    
  2. Make the build script executable:
    chmod +x build.sh
    
  3. Run the build script:
    ./build.sh
    

    Or build directly with Docker:

    docker build -t rnaseq_toolkit:latest .
    

Usage

Pre-built Docker Image

A pre-built image is available on Docker Hub:

Image: griffithlab/rnaseq-toolkit

Available tags:

Running the Container

Basic usage (pre-built image):

# Pull and run the latest version with Apache web server
docker pull griffithlab/rnaseq-toolkit:latest
docker run -it -p 8080:8080 griffithlab/rnaseq-toolkit:latest

# Or run a specific version
docker run -it -p 8080:8080 griffithlab/rnaseq-toolkit:1.1.1

With Volume Mounting

To access files from your host system and serve them via Apache:

# Mount your data directory to /workspace
docker run -it -p 8080:8080 -v /path/to/your/data:/workspace griffithlab/rnaseq-toolkit:latest
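
Docker treats the host side of -v as a named volume unless it is an absolute path, so a relative directory needs to be resolved first. The helper below is a hypothetical sketch of one way to do that; the function name workspace_mount is not part of the image or its scripts.

```shell
# Hypothetical helper: turn a (possibly relative) directory into a -v argument
# that binds it to /workspace. Fails if the directory does not exist.
workspace_mount() {
    abs_dir=$(cd "$1" 2>/dev/null && pwd) || return 1
    printf '%s:/workspace\n' "$abs_dir"
}

# Example (left commented so the block has no side effects):
# docker run -it -p 8080:8080 -v "$(workspace_mount ./data)" griffithlab/rnaseq-toolkit:latest
```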

With Docker Engine Access

To enable Docker functionality within the container (for running other bioinformatics containers):

# Mount the Docker socket to enable host Docker engine access
docker run -it -p 8080:8080 -v /path/to/your/data:/workspace -v /var/run/docker.sock:/var/run/docker.sock griffithlab/rnaseq-toolkit:latest
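
Before starting the container this way, it can help to confirm that the socket actually exists on the host at the path used above. A minimal sketch, with a hypothetical function name:

```shell
# Hypothetical pre-flight check; returns success only if the path is a socket.
is_socket() {
    [ -S "$1" ]
}

# Example:
# is_socket /var/run/docker.sock || echo "Docker socket not found on this host" >&2
```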

Notes about Docker socket mounting:

Running Specific Tools

You can run specific tools directly:

docker run --rm griffithlab/rnaseq-toolkit:latest samtools --help
docker run --rm griffithlab/rnaseq-toolkit:latest hisat2 --help
docker run --rm griffithlab/rnaseq-toolkit:latest stringtie --help
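
The same pattern can be looped to smoke-test several tools at once. The sketch below is hypothetical and assumes each tool accepts --help. The container runtime is passed as the first argument so the loop can be tried with a stand-in command. Exit statuses for --help vary between tools, so a FAILED line is a prompt to look closer rather than proof that a tool is missing.

```shell
# Hypothetical smoke test; the function name and output format are illustrative.
IMAGE="griffithlab/rnaseq-toolkit:latest"

check_tools() {
    # $1 = container runtime command (normally "docker"); remaining args = tool names
    runner="$1"
    shift
    for tool in "$@"; do
        if "$runner" run --rm "$IMAGE" "$tool" --help >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "FAILED: $tool"
        fi
    done
}

# Example:
# check_tools docker samtools hisat2 stringtie
```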

Apache Web Server

The container automatically starts an Apache web server that serves the /workspace directory on port 8080. This allows you to:

Access the web interface:
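
With the -p 8080:8080 mapping used in the examples above, the server should answer at http://localhost:8080/ on the host once the container has started. The sketch below polls that address with curl until it responds. The function name is hypothetical, and the check assumes the root URL returns a successful HTTP status.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# $1 = URL, $2 = number of attempts (default 10)
wait_for_http() {
    url="$1"
    attempts="${2:-10}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -s silent, -f fail on HTTP errors, -o discard body, --max-time bound each try
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        # Pause between attempts, but not after the final one.
        if [ "$i" -le "$attempts" ]; then
            sleep 1
        fi
    done
    return 1
}

# Example (assumes the container was started with -p 8080:8080):
# wait_for_http http://localhost:8080/ && echo "Apache is serving /workspace"
```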

Features:

Image Optimization Features

This Dockerfile is optimized for size and build efficiency:

Architecture

The image is built for the x86_64 architecture and should run on:

Note for Apple Silicon users: The build script automatically uses --platform linux/amd64 to ensure x86_64 compatibility and avoid ARM-related issues.
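
When building or running with Docker directly instead of through the build script, the same flag can be passed by hand. The helper below is a hypothetical sketch that picks the flag from the host architecture; it is not taken from build.sh.

```shell
# Hypothetical helper: choose extra docker flags from the host architecture.
# $1 = machine name as reported by `uname -m`
platform_flag() {
    case "$1" in
        arm64|aarch64) echo "--platform linux/amd64" ;;  # Apple Silicon and other ARM hosts
        *) echo "" ;;                                      # x86_64 hosts need no flag
    esac
}

# Examples (left commented so the block has no side effects):
# docker build $(platform_flag "$(uname -m)") -t rnaseq_toolkit:latest .
# docker run $(platform_flag "$(uname -m)") -it -p 8080:8080 griffithlab/rnaseq-toolkit:latest
```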

Troubleshooting

Build Issues

Runtime Issues

Docker Socket Issues

Development Notes

This Docker image has been optimized through several iterations to address common build and runtime issues:

Key Improvements Made

Known Working Versions

Documentation

Changelog

See CHANGELOG.md for detailed version history and changes.

Source Code

Support

This image is designed to support the RNA-seq analysis workflows described in the rnabio.org tutorial series. For questions about specific tools, refer to their respective documentation.

Getting Help