_images/nf-core-logo.png

nf-core

Introduction to nf-core

nf-core is a community effort to collect a curated set of analysis pipelines built using Nextflow. As such, nf-core is:

  • A community of users and developers.

  • A curated set of analysis pipelines built using Nextflow.

  • A set of guidelines (standard).

  • A set of helper tools.

Community

The nf-core community is a collaborative effort that has been growing since its creation in early 2018, as you can check on the nf-core stats site.

_images/nfcore_community.png _images/nfcore_community_map.png

Pipelines

Currently, 72 pipelines are available as part of nf-core (41 released, 25 under development and 6 archived). You can browse all of them on this link.

Guidelines

All nf-core pipelines must meet a series of requirements or guidelines. These guidelines ensure that all nf-core pipelines follow the same standard and stick to current computational standards to achieve reproducibility, interoperability and portability. The guidelines are made available on this link.

Helper tools

To ease the use and development of nf-core pipelines, the community makes available a set of helper tools that we will introduce in this tutorial.

Paper

The main nf-core paper was published in 2020 in Nature Biotechnology and describes the community and the nf-core framework.

_images/nf-core-paper.png

Installation

You can use Conda to install nf-core tools. The command below creates a new named environment that includes nf-core; we then activate it.

conda create -n nf-core python=3.8 nf-core -c bioconda -c conda-forge -y
conda activate nf-core

Tip

You can find alternative installation methods in the nf-core documentation.

We can now check the nf-core available commands:

    $ nf-core -h

                                      ,--./,-.
      ___     __   __   __   ___     /,-._.--~\
|\ | |__  __ /  ` /  \ |__) |__         }  {
| \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                      `._,._,'

nf-core/tools version 2.6 - https://nf-co.re



Usage: nf-core [OPTIONS] COMMAND [ARGS]...

nf-core/tools provides a set of helper tools for use with nf-core Nextflow pipelines.
It is designed for both end-users running pipelines and also developers creating new pipelines.

╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --version                   Show the version and exit.                                           │
│ --verbose   -v              Print verbose output to the console.                                 │
│ --log-file  -l  <filename>  Save a verbose log to a file.                                        │
│ --help      -h              Show this message and exit.                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands for users ─────────────────────────────────────────────────────────────────────────────╮
│ list        List available nf-core pipelines with local info.                                    │
│ launch      Launch a pipeline using a web GUI or command line prompts.                           │
│ download    Download a pipeline, nf-core/configs and pipeline singularity images.                │
│ licences    List software licences for a given workflow.                                         │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands for developers ────────────────────────────────────────────────────────────────────────╮
│ create             Create a new pipeline using the nf-core template.                             │
│ lint               Check pipeline code against nf-core guidelines.                               │
│ modules            Commands to manage Nextflow DSL2 modules (tool wrappers).                     │
│ schema             Suite of tools for developers to manage pipeline schema.                      │
│ bump-version       Update nf-core pipeline version number.                                       │
│ sync               Sync a pipeline TEMPLATE branch with the nf-core template.                    │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────╮
│ subworkflows      Commands to manage Nextflow DSL2 subworkflows (tool wrappers).                 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯

As shown in the output above, nf-core tools provides some commands meant for users and some meant for developers. Here, we will discuss how nf-core can be used from a user's point of view.

Tip

If you are interested in learning how nf-core could help you develop your own pipelines, please refer to the tools page on the nf-core site or follow this tutorial.

Commands for users

Listing pipelines

To show all the available nf-core pipelines, we can use the nf-core list command. This command also provides other information, such as the latest version of each nf-core pipeline, when it was released, and when you last pulled the pipeline to your local system.

    $ nf-core list

                                      ,--./,-.
      ___     __   __   __   ___     /,-._.--~\
|\ | |__  __ /  ` /  \ |__) |__         }  {
| \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                      `._,._,'

nf-core/tools version 2.6 - https://nf-co.re

┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ Pipeline Name          ┃ Stars ┃ Latest Release ┃     Released ┃    Last Pulled ┃ Have latest release? ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ coproid                │     7 │          1.1.1 │    yesterday │              - │ -                    │
│ cutandrun              │    31 │            3.0 │  1 weeks ago │              - │ -                    │
│ smrnaseq               │    44 │          2.1.0 │  2 weeks ago │ 16 minutes ago │ Yes (v2.1.0)         │
│ nascent                │     5 │          2.0.0 │  2 weeks ago │              - │ -                    │
│ hgtseq                 │    12 │          1.0.0 │  2 weeks ago │              - │ -                    │
│ hlatyping              │    35 │          2.0.0 │  3 weeks ago │              - │ -                    │
│ demultiplex            │    14 │          1.0.0 │  4 weeks ago │              - │ -                    │
│ scrnaseq               │    64 │          2.1.0 │  4 weeks ago │              - │ -                    │
│ chipseq                │   129 │          2.0.0 │ 1 months ago │ 18 minutes ago │ Yes (v2.0.0)         │
│ rnaseq                 │   529 │            3.9 │ 1 months ago │ 19 minutes ago │ Yes (v3.9)           │
│ isoseq                 │     5 │          1.1.1 │ 1 months ago │              - │ -                    │
│ sarek                  │   205 │          3.0.2 │ 1 months ago │   7 months ago │ No (v2.7.1)          │
│ airrflow               │    22 │          2.3.0 │ 2 months ago │              - │ -                    │
│ ampliseq               │    99 │          2.4.0 │ 2 months ago │              - │ -                    │
│ mag                    │   103 │          2.2.1 │ 2 months ago │              - │ -                    │
│ epitopeprediction      │    22 │          2.1.0 │ 3 months ago │              - │ -                    │
│ eager                  │    76 │          2.4.5 │ 3 months ago │              - │ -                    │
│ viralrecon             │    79 │            2.5 │ 4 months ago │  12 months ago │ No (v2.2)            │
│ rnafusion              │    81 │          2.1.0 │ 4 months ago │              - │ -                    │
│ fetchngs               │    64 │            1.7 │ 4 months ago │   5 months ago │ No (v1.5)            │
│ circdna                │    10 │          1.0.1 │ 5 months ago │              - │ -                    │
│ nanoseq                │    88 │          3.0.0 │ 5 months ago │              - │ -                    │
│ rnavar                 │    13 │          1.0.0 │ 5 months ago │              - │ -                    │
│ mnaseseq               │     7 │          1.0.0 │ 5 months ago │              - │ -                    │
│ atacseq                │   119 │          1.2.2 │ 6 months ago │   20 hours ago │ No (dev - 88d4e6d)   │
    [..truncated..]

Tip

The pipelines can be sorted by latest release (-s release, default), by the last time you pulled a local copy (-s pulled), alphabetically (-s name) or by the number of GitHub stars (-s stars).

Filtering available nf-core pipelines

It is also possible to add keywords after the list command so that the list of pipelines is shortened to those whose name or description matches the keywords. We can use the command below to filter on the atac keyword:

    $ nf-core list atac

                                      ,--./,-.
      ___     __   __   __   ___     /,-._.--~\
|\ | |__  __ /  ` /  \ |__) |__         }  {
| \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                      `._,._,'

nf-core/tools version 2.6 - https://nf-co.re

┏━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ Pipeline Name ┃ Stars ┃ Latest Release ┃     Released ┃  Last Pulled ┃ Have latest release? ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ atacseq       │   119 │          1.2.2 │ 6 months ago │ 17 hours ago │ No (dev - 88d4e6d)   │
│ hicar         │     2 │          1.0.0 │ 6 months ago │            - │ -                    │
└───────────────┴───────┴────────────────┴──────────────┴──────────────┴──────────────────────┘

Launching pipelines

The launch command lets you launch nf-core pipelines, and also other Nextflow pipelines, via a web-based graphical interface or an interactive command-line wizard. This command is handy for pipelines with a considerable number of parameters, since it displays the documentation alongside each parameter and validates your inputs.

We can now launch an nf-core pipeline:

nf-core launch

To render the description of the parameters, their grouping and their defaults, the tool uses the nextflow_schema.json file. This JSON file is bundled with the pipeline and includes all the information mentioned above; see an example here.

The chosen non-default parameters are dumped into a JSON file called nf-params.json. This file can be provided to new executions using the --params-in flag. See below an example of a params JSON file:

{
    "outdir": "results",
    "input": "https://raw.githubusercontent.com/nf-core/test-datasets/atacseq/samplesheet/v2.0/samplesheet_test.csv"
}

It is a good practice in terms of reproducibility to explicitly indicate the version (revision) of the pipeline that you want to use. This is done using the -r flag e.g. nf-core launch atacseq --params-in nf-params.json -r dev.
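To tie the two points above together, here is a minimal shell sketch that writes the example parameters file and assembles two ways of reusing it with a pinned revision. The first command is the one quoted above. The second is an assumption about how you might prefer to work: it skips the wizard and relies on Nextflow's own -params-file option, with the test and docker profiles chosen purely for illustration. Both commands are only printed, so the snippet is safe to paste on a machine without nf-core or Nextflow installed.

```shell
# Recreate the example parameters file from this section.
cat > nf-params.json <<'EOF'
{
    "outdir": "results",
    "input": "https://raw.githubusercontent.com/nf-core/test-datasets/atacseq/samplesheet/v2.0/samplesheet_test.csv"
}
EOF

# Revision to pin; "dev" follows the example above, a release tag is passed the same way.
revision="dev"

# Option 1: open the launch wizard pre-filled with the saved parameters.
launch_cmd="nf-core launch atacseq --params-in nf-params.json -r $revision"

# Option 2 (assumption: you want to bypass the wizard): hand the file straight to Nextflow.
run_cmd="nextflow run nf-core/atacseq -r $revision -profile test,docker -params-file nf-params.json"

# Print both commands; run whichever fits your setup.
echo "$launch_cmd"
echo "$run_cmd"
```

Keeping nf-params.json and the revision together (for example under version control) makes it easier to repeat the same run later.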

nf-core configs and profiles

nf-core configs

We have already introduced Nextflow configuration files and profiles during the course. Config files are used by nf-core pipelines to specify the computational requirements of the pipeline, define custom parameters and set which software management system is used (Docker, Singularity or Conda). As an example, take a look at the base.config file, which sets sensible defaults for the computational resources needed by the pipeline.

nf-core core profiles

nf-core pipelines use profiles to bundle a set of configuration attributes. We can then activate these attributes with the -profile Nextflow command-line option. All nf-core pipelines come with a set of common "Core profiles": the conda, docker and singularity profiles, which define which software manager to use, and the test profile, which specifies a minimal test dataset to check that the pipeline works properly.

Note

Each configuration file can include one or several profiles.

Institutional profiles

Institutional profiles are profiles where you can specify the configuration attributes for your institution's system. They are hosted in https://github.com/nf-core/configs, and every pipeline pulls this repository when it is run. The idea is that these profiles set the custom config attributes needed to run nf-core pipelines at your institution (scheduler, container technology, resources, etc.). This way, all the users of a cluster can use the profile simply by setting the profile of their institution (-profile institution).

Tip

You can use more than one profile at a time by separating them with a comma and no space, e.g. -profile test,docker

Custom config

If you need to provide any custom parameter or setting when running an nf-core pipeline, you can do so by creating a local custom config file and adding it to your command with the -c flag.
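As a minimal sketch of this, the snippet below writes a small custom config and builds a command that passes it with -c. The file name custom.config and the resource values are illustrative assumptions. The max_cpus, max_memory and max_time parameter names are borrowed from the nf-core/atacseq test configuration shown later in this tutorial and may not exist in every pipeline or release. The command is only printed, so nothing is executed until you run it yourself.

```shell
# Write a local config file; the name and the values are illustrative only.
cat > custom.config <<'EOF'
// Upper limits for the resources the pipeline may request.
params {
    max_cpus   = 4
    max_memory = 8.GB
    max_time   = 24.h
}
EOF

# Add the file to the run with -c, next to the profiles selected with -profile.
run_cmd="nextflow run nf-core/atacseq -r dev -profile test,docker -c custom.config --outdir results"
echo "$run_cmd"
```

Keeping such overrides in a separate file means the pipeline code itself stays untouched, and the same file can be reused across runs.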

_images/nfcore_config.png

*Image from https://carpentries-incubator.github.io/workflows-nextflow.

Note

Profiles will be prioritized from left to right in case conflicting settings are found.

Running pipelines with test data

All nf-core pipelines include a special configuration named test. This configuration defines all the files and parameters needed to test the full pipeline functionality with a minimal dataset. Thus, although the functionality of the pipeline is maintained, the results are often not meaningful. As an example, the snippet below shows the test configuration of the nf-core/atacseq pipeline.

/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Defines input files and everything required to run a fast and simple pipeline test.
    Use as follows:
        nextflow run nf-core/atacseq -profile test,<docker/singularity> --outdir <OUTDIR>
----------------------------------------------------------------------------------------
*/

params {
    config_profile_name = 'Test profile'
    config_profile_description = 'Minimal test dataset to check pipeline function'

    // Limit resources so that this can run on GitHub Actions
    max_cpus = 2
    max_memory = 6.GB
    max_time = 12.h

    // Input data
    input = 'https://raw.githubusercontent.com/nf-core/test-datasets/atacseq/samplesheet/v2.0/samplesheet_test.csv'
    read_length = 50

    // Genome references
    mito_name = 'MT'
    fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/atacseq/reference/genome.fa'
    gtf = 'https://raw.githubusercontent.com/nf-core/test-datasets/atacseq/reference/genes.gtf'

    // For speed to avoid CI time-out
    fingerprint_bins = 100

    // Avoid preseq errors with test data
    skip_preseq = true
}

Tip

You can find the current version of the nf-core/atacseq test.config here.

Downloading pipelines

If your HPC system or server does not have an internet connection, you can still run nf-core pipelines by fetching the pipeline files first and then manually transferring them to your system.

The nf-core download command simplifies this process and ensures the correct versioning of all the code and containers needed to run the pipeline. By default, the command downloads the pipeline code and the institutional nf-core/configs files. Again, the -r flag allows you to fetch a given revision of the pipeline.

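Finally, you can also download any Singularity image files required by the pipeline if you specify the --singularity flag (nf-core/tools 2.x releases, such as the 2.6 version shown above, expose this as --container singularity).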

Tip

If you don't provide any option to nf-core download, an interactive prompt will ask you for the required options.

We can now try to download the atacseq pipeline using the command below:

nf-core download atacseq -r dev

Now we can inspect the structure of the downloaded directory:

$ tree -L 2 nf-core-atacseq-dev/

nf-core-atacseq-dev/
├── configs
│   ├── ..truncated..
│   ├── nextflow.config
│   ├── nfcore_custom.config
│   ├── pipeline
│   └── README.md
├── singularity-images
│   ├── depot.galaxyproject.org-singularity-ataqv-1.3.0--py39hccc85d7_2.img
│   ├── ..truncated..
│   └── depot.galaxyproject.org-singularity-ucsc-bedgraphtobigwig-377--h446ed27_1.img
└── workflow
    ├── assets
    ├── bin
    ├── ..truncated..
    ├── main.nf
    ├── modules
    ├── ..truncated..
    └── workflows
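The downloaded directory can then be packed, moved to the offline system and run from its workflow folder. The sketch below is a hedged illustration rather than a recipe: samplesheet.csv is a hypothetical input file, the singularity profile is assumed to suit the target system, and any other parameters your pipeline requires are omitted. The run command is only printed.

```shell
# Directory created by the download command above.
download_dir="nf-core-atacseq-dev"

# Pack the directory for transfer, but only if it exists on this machine.
if [ -d "$download_dir" ]; then
    tar -czf "$download_dir.tar.gz" "$download_dir"
fi

# On the offline system, after unpacking, point Nextflow at the local workflow
# folder instead of the nf-core/atacseq repository name.
# samplesheet.csv is a hypothetical input; add the parameters your run needs.
offline_cmd="nextflow run $download_dir/workflow -profile singularity --input samplesheet.csv --outdir results"
echo "$offline_cmd"
```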

Pipeline output

nf-core pipelines produce a MultiQC report which summarises results at the end of the execution along with software versions of the different tools used, nf-core pipeline version and Nextflow version itself.

Each pipeline provides an example of a MultiQC report from a real execution on the nf-core website. For instance, you can find the report corresponding to the current version of nf-core/atacseq here.

Acknowledgements

This nf-core tutorial has been built taking inspiration from the official nf-core tools documentation and the Carpentries materials "Introduction to Bioinformatics workflows with Nextflow and nf-core", which can be found here.