Saturday, January 28, 2023

Learning nextflow 1: get started

The other day, I installed the workflow manager Nextflow on my computer. I used the conda package manager with the bioconda channel, which was quite an easy way to install Nextflow on my system.

Installation of nextflow - T.Y. Blog

There is a nice tutorial for Nextflow (https://www.nextflow.io/docs/latest/index.html). On the Get started page, there is a simple sample workflow script that splits a string into chunks and converts them to uppercase.

The Nextflow scripting page (https://www.nextflow.io/docs/latest/script.html) introduces the language used for Nextflow scripts. It is an extension of the Groovy programming language.


Back on the Get started page, let's take a look at the tutorial script:

 tutorial.nf 


params.str = 'Hello world!'

process splitLetters {
  output:
    path 'chunk_*'

  """
  printf '${params.str}' | split -b 6 - chunk_
  """
}

process convertToUpper {
  input:
    path x
  output:
    stdout

  """
  cat $x | tr '[a-z]' '[A-Z]'
  """
}

workflow {
  splitLetters | flatten | convertToUpper | view { it.trim() }
}




First, the script defines one string parameter:


params.str = 'Hello world!'



This string is processed, and the processed result is printed at the end.
After params.str, the script defines two processes.

The first is a process named splitLetters:


process splitLetters {
  output:
    path 'chunk_*'

  """
  printf '${params.str}' | split -b 6 - chunk_
  """
}


In this process, the output: block comes first. The processes page (https://www.nextflow.io/docs/latest/process.html#outputs) explains output: as follows:

The  output  block allows you to define the output channels of a process, similar to function outputs. A process may have at most one output block, and it must contain at least one output.

I am an absolute beginner in this language, but it seems that the text between the triple quotes (""" ... """) is the script executed by the splitLetters process. It prints the parameter with printf and pipes it to split, which writes the input into files of 6 bytes each whose names start with chunk_.
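
To see what this script body does outside Nextflow, the same pipeline can be run by hand in a shell. This is only a minimal sketch, assuming a POSIX shell with split and mktemp available; the helper name split_into_chunks and the scratch directory are my own additions and not part of the tutorial.

```shell
# Reproduce the body of splitLetters by hand in a scratch directory.
# split_into_chunks is a hypothetical helper name, not part of Nextflow.
split_into_chunks() {
  # $1: the string to split, $2: directory to write the chunk files into.
  # printf '%s' is used so the string is never treated as a format string.
  ( cd "$2" && printf '%s' "$1" | split -b 6 - chunk_ )
}

workdir=$(mktemp -d)
split_into_chunks 'Hello world!' "$workdir"

# split names its output files chunk_aa, chunk_ab, ... in input order.
# The brackets make the trailing space in the first chunk visible.
for f in "$workdir"/chunk_*; do
  printf '%s: [%s]\n' "$(basename "$f")" "$(cat "$f")"
done
# prints:
#   chunk_aa: [Hello ]
#   chunk_ab: [world!]
```

The output path 'chunk_*' in the process then matches both files, which is why the next process receives two inputs.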

Next comes a process named convertToUpper:


process convertToUpper {
  input:
    path x
  output:
    stdout

  """
  cat $x | tr '[a-z]' '[A-Z]'
  """
}


This process has an input block because it receives the output of splitLetters. It converts the contents of the file received through the input block to uppercase with tr '[a-z]' '[A-Z]'.
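
The tr call can also be tried on its own. The sketch below assumes a POSIX shell; to_upper is a hypothetical helper name that I introduce only to wrap the same tr command used in convertToUpper.

```shell
# to_upper is a hypothetical helper wrapping the tr call from convertToUpper.
to_upper() {
  # Map every lowercase ASCII letter to its uppercase counterpart;
  # other characters (spaces, punctuation) pass through unchanged.
  printf '%s' "$1" | tr '[a-z]' '[A-Z]'
}

to_upper 'Hello '   # prints "HELLO " (the trailing space is kept)
echo
to_upper 'world!'   # prints "WORLD!"
echo
```

In the workflow, the trailing space of the first chunk is removed afterwards by view { it.trim() }, so the console shows HELLO without it.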

Running the script prints the following output on the console:


(base) tk$ nextflow run tutorial.nf
N E X T F L O W  ~  version 22.10.5
Launching `tutorial.nf` [mad_mcclintock] DSL2 - revision: 5af7f346f0
executor >  local (3)
[f5/563a73] process > splitLetters       [100%] 1 of 1 ✔
[63/7f8425] process > convertToUpper (2) [100%] 2 of 2 ✔
HELLO
WORLD!



The first process, splitLetters, is executed once, as shown in the output:


[f5/563a73] process > splitLetters       [100%] 1 of 1 ✔



The next process, convertToUpper, is executed twice because there are two chunks, "Hello " (with a trailing space) and "world!".


[63/7f8425] process > convertToUpper (2) [100%] 2 of 2 ✔



If params.str is 6 bytes or shorter, split in splitLetters makes only one chunk, and convertToUpper is executed once.


N E X T F L O W  ~  version 22.10.5
Launching `tutorial.nf` [curious_einstein] DSL2 - revision: 72afad773e
executor >  local (2)
[f5/b5207d] process > splitLetters       [100%] 1 of 1 ✔
[ab/b1650f] process > convertToUpper (1) [100%] 1 of 1 ✔
HELLO
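
The number of convertToUpper tasks follows directly from the byte length of the string, because split -b 6 writes one file per 6 bytes of input. The sketch below works this out in a POSIX shell; chunk_count is a hypothetical helper name of my own, and the 6 is simply the value passed to split -b in the tutorial.

```shell
# chunk_count is a hypothetical helper: it predicts how many files
# `split -b 6` creates for a given string, i.e. ceil(byte_length / 6).
chunk_count() {
  # wc -c counts bytes; tr strips the padding some wc versions add.
  len=$(printf '%s' "$1" | wc -c | tr -d ' ')
  echo $(( (len + 5) / 6 ))
}

chunk_count 'Hello world!'   # 12 bytes -> prints 2
chunk_count 'Hello'          #  5 bytes -> prints 1
chunk_count 'Hello!'         #  6 bytes -> prints 1
```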



If --str is specified on the command line, for example,


(base) tk$ nextflow run tutorial.nf --str 'Bonjour le monde'


then the default value of params.str is overridden by "Bonjour le monde".


N E X T F L O W  ~  version 22.10.5
Launching `tutorial.nf` [special_wright] DSL2 - revision: 3f2cae5687
executor >  local (4)
[55/9358d8] process > splitLetters       [100%] 1 of 1 ✔
[55/6d93c7] process > convertToUpper (2) [100%] 3 of 3 ✔
BONJOU
ONDE
R LE M


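In this case, the second process was executed three times. You can see "3 of 3" on the convertToUpper line. The chunks are not printed in their original order because the convertToUpper tasks run in parallel, so the results arrive in whatever order the tasks finish.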
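
For comparison, running the same two commands sequentially in a shell always yields the chunks in their original order. This is a rough sketch assuming a POSIX shell with split and mktemp; the variable names scratch and ordered are my own and do not come from the tutorial.

```shell
# Split the string exactly as splitLetters does, in a scratch directory.
scratch=$(mktemp -d)
( cd "$scratch" && printf '%s' 'Bonjour le monde' | split -b 6 - chunk_ )

# Uppercase each chunk one after another. The glob expands in sorted
# order (chunk_aa, chunk_ab, chunk_ac), which is the input order.
ordered=$(
  for f in "$scratch"/chunk_*; do
    printf '[%s]\n' "$(tr '[a-z]' '[A-Z]' < "$f")"
  done
)
printf '%s\n' "$ordered"
# prints:
#   [BONJOU]
#   [R LE M]
#   [ONDE]
```

The 16-byte string gives chunks of 6, 6 and 4 bytes, which matches the three convertToUpper tasks in the Nextflow run.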


On the Get started page, there is a tip about the delimiter of a nested scope:

As of version 20.11.0-edge, any . (dot) character in a parameter name is interpreted as the delimiter of a nested scope. For example, --foo.bar Hello will be interpreted as params.foo.bar. If you want to have a parameter name that contains a . (dot) character, escape it using the back-slash character, e.g. --foo\.bar Hello.

Prompted by this tip, I modified tutorial.nf to experiment with parameter names. For example, I renamed the parameter from params.str to params.str1:

 tutorial2.nf 

params.str1 = 'Hello world!'

process splitLetters {
  output:
    path 'chunk_*'

  """
  printf '${params.str1}' | split -b 6 - chunk_
  """
}



For this script, I can override the default parameter with --str1 instead of --str:


(base) tk$ nextflow run tutorial2.nf --str1 'Bonjour le monde'


N E X T F L O W  ~  version 22.10.5
Launching `tutorial.nf` [grave_gilbert] DSL2 - revision: 421400f9e3
executor >  local (4)
[2d/b9e426] process > splitLetters       [100%] 1 of 1 ✔
[d7/1053d8] process > convertToUpper (2) [100%] 3 of 3 ✔
BONJOU
ONDE
R LE M




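With --str, I cannot override the default parameter; the default value of params.str1 is used instead. (The output below matches a default of 'Hello12' rather than the 'Hello world!' shown in the listing above.)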


(base) tk$ nextflow run tutorial2.nf --str 'Bonjour le monde'


N E X T F L O W  ~  version 22.10.5
Launching `tutorial.nf` [gloomy_lamport] DSL2 - revision: 421400f9e3
executor >  local (3)
[25/dbde42] process > splitLetters       [100%] 1 of 1 ✔
[00/93720f] process > convertToUpper (2) [100%] 2 of 2 ✔
HELLO1
2




If the parameter name is just a number, a compilation error is returned:

 tutorial3.nf 


params.1 = 'Hello world!'

process splitLetters {
  output:
    path 'chunk_*'

  """
  printf '${params.1}' | split -b 6 - chunk_
  """
}



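The output was as follows (the line quoted in the error shows 'Hello12', the value in the file at the time of that run):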


(base) tk$ nextflow run tutorial3.nf 


N E X T F L O W  ~  version 22.10.5
Launching `tutorial3.nf` [tiny_joliot] DSL2 - revision: 7f88a4f3a1
Script compilation error
- file : /XXX/YYY/ZZZ/tutorial3.nf
- cause: The LHS of an assignment should be a variable or a field accessing expression @ line 1, column 7.
   params.1 = 'Hello12'
         ^

1 error







These tutorials can be found at:


----------------
Revised: 2023.1.29