Sunday, January 22, 2023

Installation of Nextflow

What is Nextflow?


I have recently been setting up and learning Nextflow, a workflow management system that combines various processing and computation steps.

Nextflow enables scalable and reproducible scientific workflows using software containers. It allows the adaptation of pipelines written in the most common scripting languages.

 https://www.nextflow.io

Reproducible analysis of omics data is important, yet for various reasons reproducibility is often not maintained through the entire research process. There is much discussion about how to improve and guarantee the reproducibility of computational analyses, especially in biology and the life sciences (for example, here). Nextflow has potential as an analysis management system for describing bioinformatics pipelines with sufficient reproducibility.

Nextflow is already used in bioinformatics workflows such as the EPI2ME Labs long-read data analysis tools (https://labs.epi2me.io/wfindex). I came across this workflow management system while searching for nanopore analysis applications for transcriptome data. I found wf-transcriptomes from epi2me-labs (https://github.com/epi2me-labs/wf-transcriptomes), which uses Nextflow. I think it would not only be a convenient way to install analysis tools for long-read sequencing, but could also be used in my daily work to describe workflows. That is why I started learning this tool.


Quick check and getting started


Nextflow can be used on Linux, macOS, and other Unix-like systems (so-called POSIX-compatible systems). The web page shows three steps to install Nextflow:

  1. Make sure Java 11 or later is installed on your system
  2. In a terminal, run curl -s https://get.nextflow.io | bash
  3. Run Nextflow, for example, ./nextflow run hello
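
As a rough sketch of step 1, the Java requirement could be checked from the shell before running the installer. The function names java_major_version and java_is_recent_enough are hypothetical helpers of my own, not part of Nextflow, and the parsing assumes the usual version "X.Y.Z" line printed by java -version.

```shell
# Extract the major version number from a `java -version` style string.
# Handles both the legacy "1.8.0_292" scheme and the newer "17.0.3" scheme.
java_major_version() {
    version_string="$1"
    # Pull out the quoted version, e.g. 17.0.3 from:
    #   openjdk version "17.0.3" 2022-04-19 LTS
    quoted=$(printf '%s\n' "$version_string" \
        | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    major=${quoted%%.*}
    if [ "$major" = "1" ]; then
        # Legacy scheme: 1.8.0_292 means Java 8
        rest=${quoted#1.}
        major=${rest%%.*}
    fi
    printf '%s\n' "$major"
}

# Hypothetical helper: succeeds when the Java found on PATH is version 11 or later.
# `java -version` writes to stderr, hence the 2>&1 redirection.
java_is_recent_enough() {
    major=$(java_major_version "$(java -version 2>&1)")
    [ -n "$major" ] && [ "$major" -ge 11 ]
}

# Possible usage (not executed here, since it needs network access):
#   java_is_recent_enough && curl -s https://get.nextflow.io | bash
```

Keeping the parsing separate from the call to java makes the version logic easy to try on a saved string without touching the system.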

I usually use conda-forge on my personal computer. Nextflow can also be installed with conda through the bioconda channel. If you already use conda, this is probably the easiest way to try the tool quickly.


In the console, I executed the following command:


(base) tk$ conda install -c bioconda nextflow


The package plan listed nextflow and openjdk.


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    coreutils-8.25             |                1         1.7 MB  bioconda
    nextflow-22.10.4           |       h4a94de4_0        24.8 MB  bioconda
    openjdk-17.0.3             |       hbc0c0cd_5       157.4 MB  conda-forge
    ------------------------------------------------------------
                                           Total:       183.9 MB



I proceeded with this plan. It was quite easy and completed quickly. I then checked the Java version:


(base) tk$ java -version
openjdk version "17.0.3" 2022-04-19 LTS
OpenJDK Runtime Environment Zulu17.34+19-CA (build 17.0.3+7-LTS)
OpenJDK 64-Bit Server VM Zulu17.34+19-CA (build 17.0.3+7-LTS, mixed mode, sharing)



Executing nextflow -h showed the help message:


Usage: nextflow [options] COMMAND [arg...]

Options:
  -C
     Use the specified configuration file(s) overriding any defaults
  -D
     Set JVM properties
  -bg
     Execute nextflow in background
  -c, -config
     Add the specified file to configuration set
  -config-ignore-includes
     Disable the parsing of config includes
  -d, -dockerize
     Launch nextflow via Docker (experimental)
  -h
     Print this help
  -log
     Set nextflow log file path
  -q, -quiet
     Do not print information messages
  -syslog
     Send logs to syslog server (eg. localhost:514)
  -v, -version
     Print the program version

Commands:
  clean         Clean up project cache and work directories
  clone         Clone a project into a folder
  config        Print a project configuration
  console       Launch Nextflow interactive console
  drop          Delete the local copy of a project
  help          Print the usage help for a command
  info          Print project and system runtime information
  kuberun       Execute a workflow in a Kubernetes cluster (experimental)
  list          List all downloaded projects
  log           Print executions log and runtime info
  pull          Download or update a project
  run           Execute a pipeline project
  secrets       Manage pipeline secrets (preview)
  self-update   Update nextflow runtime to the latest available version
  view          View project script file(s)




The installed version can be shown by typing nextflow -version:


      N E X T F L O W
      version 22.10.4 build 5836
      created 09-12-2022 09:58 UTC (18:58 JDT)
      cite doi:10.1038/nbt.3820
      http://nextflow.io
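
If a script needs the version number alone, it could be pulled out of this banner. This is a minimal sketch, assuming the banner keeps the "version X.Y.Z build N" line shown above; nextflow_version_from_banner is a hypothetical helper name, not a Nextflow command.

```shell
# Hypothetical helper: print the X.Y.Z version found in a `nextflow -version` banner.
# Assumes a line of the form "version 22.10.4 build 5836", possibly indented.
nextflow_version_from_banner() {
    printf '%s\n' "$1" \
        | sed -n 's/^[[:space:]]*version \([0-9][0-9.]*\).*/\1/p' \
        | head -n 1
}

# Possible usage with a real installation (not executed here):
#   nextflow_version_from_banner "$(nextflow -version)"
```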



I ran the classic "Hello world" pipeline provided as a demo:


(base) tk$ nextflow run hello
N E X T F L O W  ~  version 22.10.4
Pulling nextflow-io/hello ...
 downloaded from https://github.com/nextflow-io/hello.git
Launching `https://github.com/nextflow-io/hello` [kickass_brenner] DSL2 - revision: 4eab81bd42 [master]
executor >  local (4)
[3b/ff8648] process > sayHello (2) [100%] 4 of 4 ✔
Hola world!

Hello world!

Bonjour world!

Ciao world!



 

It seems that Nextflow was successfully installed on my system.
Installation was quite easy with bioconda, and installing without conda is probably not difficult either.
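
I installed into the base environment, but a dedicated conda environment might keep Nextflow and its Java dependency apart from other tools. The sketch below is one possible way to do that and is not what I actually ran: create_nextflow_env, the DRY_RUN variable, and the default environment name nextflow-env are all assumptions made up for illustration.

```shell
# Hypothetical helper: create a dedicated conda environment containing Nextflow.
# With DRY_RUN=1 the conda command is only printed, not executed.
create_nextflow_env() {
    env_name="${1:-nextflow-env}"   # default environment name is an assumption
    # Build the command as positional parameters so it can be printed or run as-is.
    set -- conda create --yes --name "$env_name" -c conda-forge -c bioconda nextflow
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Possible usage (not executed here, since it needs conda and network access):
#   create_nextflow_env nextflow-env
#   conda activate nextflow-env
```

Listing conda-forge before bioconda follows the channel order that bioconda recommends.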

What's next?


The documentation is available here: https://www.nextflow.io/docs/latest/index.html
I would like to study how to use this system and incorporate it into my research activities at home.