Snakemake rules

A rule describes a task and is the backbone of a Snakemake pipeline. In hydra-genetics we have set up a format for rules and their names.

Create a rule template

Using the hydra-genetics tool it is very easy to create a new rule template following the hydra-genetics standard.

Installation

Activate a python virtual environment. Then install the hydra-genetics tools using pip

source venv/bin/activate
pip install hydra_genetics

Make a rule skeleton

The hydra-genetics tools will set up a rule skeleton and place the rule in module_name/workflow/rules/new_rule.smk.

hydra-genetics create-rule -c picard_mark_duplicates -t picard -m alignment -a my_name -e name@email.com [OPTIONS]
Option Description
-c, --command TEXT command that will be run, will be used to name the rule [required]
-t, --tool TEXT tool that will be used to run the command, if provided it will be used during the naming of the rule, ex samtools
-m, --module TEXT name module/workflow where rule will be added. Expected folder structure is module_name/workflow/,
the rule will be added to a subfolder named rules, env.yaml to a subfolder named envs. [required]
-a, --author TEXT Name of the main author(s) [required]
-e, --email TEXT E-mail(s) of the main author(s) [required]
-o, --outdir TEXT Output directory for where module is located (default: current dir)
--help Show help message

Example rule (workflow/rules/example_rule.smk)

All parts of the example rule should always be present.

rule example_rule:
    input:
      in="module/example_rule/{sample}_{type}.suffix1",
    output:
      out=temp("module/example_rule/{sample}_{type}.suffix2"),
    params:
      extra=config.get("example_rule", {}).("extra", ""),
    log:
      "module/example_rule/{sample}_{type}.suffix2.log"
    benchmark:
      repeat("module/example_rule/{sample}_{type}.suffix2.benchmark.tsv", config.get("example_rule", {}).get("benchmark_repeats", 1),)
    threads: config.get("example_rule", {}).get(“threads”, config["default_resources"][threads])
    resources:
      mem_mb=config.get("example_rule", {}).get("mem_mb", config["default_resources"]["mem_mb"]),
      mem_per_cpu=config.get("example_rule", {}).get("mem_per_cpu", config["default_resources"]["mem_per_cpu"]),
      partition=config.get("example_rule", {}).get("partition", config["default_resources"]["partition"]),
      threads=config.get("example_rule", {}).get("threads", config["default_resources"]["threads"]),
      time=config.get("example_rule", {}).get("time", config["default_resources"]["time"]),
    container:
        config.get("example_rule", {}).get("container", config["default_container"])
    conda:
        "../envs/example_tool.yaml"
    message:
        "{rule}: Do stuff on {input.in}"
    wrapper/script/shell/run:
        “Some run command”