Usage

This section provides some simple usage examples. For more information about what the workflow does, see Single-cell RNA sequencing workflow.

Running the wrapper script

lts_workflows_sm_scrnaseq comes with a wrapper script that calls snakemake and provides additional help messages. Running the wrapper without any arguments prints a help message; any arguments that are provided are passed on to snakemake.

$ lts_workflows_sm_scrnaseq
$ lts_workflows_sm_scrnaseq -l
$ lts_workflows_sm_scrnaseq all
$ lts_workflows_sm_scrnaseq --use-conda all
$ lts_workflows_sm_scrnaseq -s /path/to/Snakefile --use-conda all

If no Snakefile is provided, the wrapper script automatically loads the Snakefile from the package root directory (see Example Snakefile below). Note that you will have to pass a configuration file with the --configfile parameter.
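For example, assuming your configuration lives in a file named config.yaml (a hypothetical name), a typical invocation could be:

$ lts_workflows_sm_scrnaseq --configfile config.yaml all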

If you need to add custom rules, or want to hardcode parameters such as the working directory and configuration file, you can of course copy and edit the provided Snakefile.
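As a minimal sketch, such a customized Snakefile might look as follows (the working directory, configuration file name, and extra rule are hypothetical):

# -*- snakemake -*-
from lts_workflows_sm_scrnaseq import WORKFLOW

# hardcode the working directory and configuration file (hypothetical paths)
workdir: "/path/to/workdir"
configfile: "config.yaml"

# include the main workflow file shipped with the package
include: WORKFLOW

# a custom rule layered on top of the workflow
rule copy_summary:
    input: "results/summary.txt"
    output: "results/summary_copy.txt"
    shell: "cp {input} {output}"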

Running snakemake

You can of course bypass the provided wrapper script and run snakemake directly on your own Snakefile. In that case, the intended usage is to include the main workflow file in your Snakefile; see the examples in the test directory.

$ snakemake -s Snakefile -d /path/to/workdir --configfile config.yaml all

Running Docker/Singularity containers

lts_workflows_sm_scrnaseq is also shipped with all of its dependencies packaged in a Docker image. This eliminates some of the installation issues, at the cost of having to download a large image file (>5GB). The entry point of the image is the lts_workflows_sm_scrnaseq wrapper script.

$ docker pull scilifelablts/lts-workflows-sm-scrnaseq
$ docker run scilifelablts/lts-workflows-sm-scrnaseq
$ docker run -v /path/to/workdir:/workspace -w /workspace scilifelablts/lts-workflows-sm-scrnaseq -l
$ docker run -v /path/to/workdir:/workspace -w /workspace scilifelablts/lts-workflows-sm-scrnaseq all
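Since the wrapper requires a configuration file (see above), a typical run mounts the working directory and passes the configuration file from there (config.yaml is a hypothetical name):

$ docker run -v /path/to/workdir:/workspace -w /workspace scilifelablts/lts-workflows-sm-scrnaseq --configfile config.yaml all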

Docker is not allowed on many HPC systems, in which case you may be able to use Singularity instead. Docker images can, in most cases, be converted into Singularity images with:

$ singularity pull docker://scilifelablts/lts-workflows-sm-scrnaseq

This will create a file lts-workflows-sm-scrnaseq.simg (or similar), which is the Singularity image. It can then be started with:

$ singularity run lts-workflows-sm-scrnaseq.simg

The method above is convenient if you want to run on a resource without internet access, where installing via Conda isn't an option. A typical use case is then to request a large node and run the whole workflow within the Singularity container on that node. This can be inefficient, since the jobs have very different memory/CPU requirements and you are limited to however many cores are available on the node. A better, though somewhat trickier, option is to run the individual jobs in the Singularity container while running the overall workflow on the local system (see Installation for the recommended way to set this up). This can be achieved with:

$ snakemake [your normal stuff] --use-singularity --use-conda

The --use-singularity flag tells Snakemake to execute the jobs in the Singularity container specified by config['workflow']['singularity']. By default this is set automatically to match the current release. If the dependencies for all rules could coexist in the same environment, this would be all that was needed. Unfortunately, some rules require Python 2, which is what the --use-conda flag is for: Snakemake will create Conda environments for those specific rules and activate them within the Singularity container. If you want this to be done for all rules, for example in order to use a different Singularity container, you can set config['workflow']['use_conda_for_py3'] to True.
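Assuming the configuration file is in YAML format, the relevant settings might look like this (the image URI is an assumption; by default it is set automatically):

workflow:
  singularity: "docker://scilifelablts/lts-workflows-sm-scrnaseq"  # container used for the jobs
  use_conda_for_py3: true  # use Conda environments for all rules, not only the Python 2 ones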

Say you are on a system without internet access and would like to submit jobs to the cluster scheduler as individual jobs. You can then combine the two methods: start the workflow in the Singularity container, and execute the jobs in the same container on the compute nodes.

$ singularity pull docker://scilifelablts/lts-workflows-sm-scrnaseq
$ [upload lts-workflows-sm-scrnaseq.simg to cluster]
$ singularity exec lts-workflows-sm-scrnaseq.simg snakemake [your normal stuff] --use-singularity --use-conda
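Concretely, on a SLURM system this could look something like the following (the sbatch arguments and job count are assumptions about your scheduler setup):

$ singularity exec lts-workflows-sm-scrnaseq.simg snakemake --configfile config.yaml --use-singularity --use-conda --cluster "sbatch -t 120" --jobs 50 all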

Remember to set config['workflow']['singularity'] to use the local image.
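For example (assuming the image file was uploaded to the working directory):

workflow:
  singularity: "lts-workflows-sm-scrnaseq.simg"  # path to the local image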

NOTE: This currently won't work for jobs that should run locally, since that would mean running a Singularity image within a Singularity image. Consider this an advanced feature that probably requires some tinkering to get running, and which is most applicable to very large projects.

Example Snakefile

The provided minimal Snakefile looks as follows:

# -*- snakemake -*-
from lts_workflows_sm_scrnaseq import WORKFLOW

# WORKFLOW is the path to the main workflow file shipped with the package
include: WORKFLOW