Nextflow

Nextflow is a workflow management system for running scientific workflows across platforms in a scalable, portable, and reproducible way.

Here, we’ll run a demo of the microscopy pipeline mcmicro to correct uneven illumination (Reference).

Note

This notebook demonstrates how to integrate lamindb with Nextflow runs. Typically, you run the Nextflow workflow from the command line or Nextflow Tower and then register the input and output data with a script.

Setup

Let’s load an instance that already has example data.

!lamin load nextflow-mcmicro
💡 connected lamindb: testuser1/nextflow-mcmicro
import lamindb as ln
💡 connected lamindb: testuser1/nextflow-mcmicro

Run and register Nextflow workflow

!nextflow run https://github.com/labsyspharm/mcmicro --in exemplar-001 --start-at illumination --stop-at registration
N E X T F L O W  ~  version 23.10.1
Launching `https://github.com/labsyspharm/mcmicro` [cheesy_poitras] DSL2 - revision: 69ee2efe21 [master]
[20/b50e6e] Submitted process > illumination (3)
[36/3c3d3e] Submitted process > illumination (1)
[32/f25f90] Submitted process > illumination (2)
[11/2092de] Submitted process > registration:ashlar (1)
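
The log above lists one line per submitted task, with a work-directory prefix, the process name, and a task index. As a minimal sketch, assuming the plain (non-ANSI) log format shown above, these lines can be parsed in Python to check which processes ran. The names `parse_submitted` and `run_pipeline` are hypothetical helpers introduced only for illustration; they are not part of Nextflow or lamindb.

```python
import re
import subprocess

# Matches lines such as "[20/b50e6e] Submitted process > illumination (3)".
# Assumption: the plain-text log format shown in this notebook's output.
SUBMITTED = re.compile(
    r"^\[(?P<workdir>[0-9a-f]{2}/[0-9a-f]{6})\] "
    r"Submitted process > (?P<process>.+?)(?: \((?P<index>\d+)\))?$"
)


def parse_submitted(log_text: str) -> list[dict]:
    """Extract work-directory prefix, process name, and task index per task."""
    tasks = []
    for line in log_text.splitlines():
        match = SUBMITTED.match(line.strip())
        if match:
            tasks.append(
                {
                    "workdir": match["workdir"],
                    "process": match["process"],
                    "index": int(match["index"]) if match["index"] else None,
                }
            )
    return tasks


def run_pipeline() -> str:
    """Run the same command as in the cell above and return its stdout."""
    cmd = [
        "nextflow", "run", "https://github.com/labsyspharm/mcmicro",
        "--in", "exemplar-001",
        "--start-at", "illumination",
        "--stop-at", "registration",
    ]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


# Log text copied from the output above; header lines are ignored by the parser.
example = """\
N E X T F L O W  ~  version 23.10.1
[20/b50e6e] Submitted process > illumination (3)
[36/3c3d3e] Submitted process > illumination (1)
[32/f25f90] Submitted process > illumination (2)
[11/2092de] Submitted process > registration:ashlar (1)
"""
tasks = parse_submitted(example)
for task in tasks:
    print(task["workdir"], task["process"], task["index"])
```

Here `run_pipeline` is only defined, not called; `parse_submitted(run_pipeline())` would apply the same parsing to a fresh run, provided Nextflow is installed and prints this log format.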

Now we register our Nextflow run by running our registration script.

!python register_mcmicro_run.py
💡 connected lamindb: testuser1/nextflow-mcmicro
💡 saved: Transform(uid='GYnAh9Kv7kHOydQ5', name='mcmicro', version='1.0.0', type='pipeline', reference='https://github.com/labsyspharm/mcmicro', updated_at=2024-04-22 15:51:00 UTC, created_by_id=1)
💡 saved: Run(uid='wJFsBeIvVYB84YLGlZRY', transform_id=2, created_by_id=1)
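
The contents of `register_mcmicro_run.py` are not shown in this notebook. The following is a minimal sketch of what such a script could look like, not the actual script. It assumes the lamindb version that produced the output above, in which `ln.Transform` accepts `name`, `version`, `type`, and `reference`; newer versions may differ. The helper names `find_outputs` and `register_run` are hypothetical. The sketch registers only the pipeline output and does not store the Nextflow session ID on the run or link the input artifacts.

```python
from pathlib import Path


def find_outputs(root: Path, pattern: str = "*.ome.tif") -> list[Path]:
    """Return the files in root/registration that match pattern, sorted."""
    return sorted((root / "registration").glob(pattern))


def register_run(root: Path) -> None:
    """Register the pipeline, one run, and its output files (sketch only)."""
    # Imported here so that find_outputs works without lamindb installed.
    # Assumption: a lamindb instance is already loaded, as in the Setup section.
    import lamindb as ln

    # Field values taken from the "saved: Transform(...)" line above.
    transform = ln.Transform(
        name="mcmicro",
        version="1.0.0",
        type="pipeline",
        reference="https://github.com/labsyspharm/mcmicro",
    )
    transform.save()

    run = ln.Run(transform=transform)
    run.save()

    # Attach each output file to the run so that it appears in the lineage.
    for path in find_outputs(root):
        ln.Artifact(path, description="mcmicro", run=run).save()
```

With an instance loaded, calling `register_run(Path("exemplar-001"))` from the directory in which the pipeline ran would register `exemplar-001/registration/exemplar-001.ome.tif` under the description `mcmicro`, which is the description the lineage query in this notebook filters on.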

Data lineage

View data lineage:

output = ln.Artifact.filter(description__icontains="mcmicro").one()
output.view_lineage()
[Figure: lineage graph of the mcmicro output artifact]

View the database content:

ln.view()
Artifact
id | uid | storage_id | key | suffix | accessor | description | version | size | hash | hash_type | n_objects | n_observations | transform_id | run_id | visibility | key_is_virtual | created_at | updated_at | created_by_id
11 | AYPHT6pT2S4kCVqX461p | 1 | exemplar-001/registration/exemplar-001.ome.tif | .tif | None | mcmicro | None | 175490712 | tGsxMvgLmr2uUwjaSSd1-s | sha1-fl | None | None | 2 | 2 | 1 | False | 2024-04-22 15:51:01.150780+00:00 | 2024-04-22 15:51:01.150818+00:00 | 1
10 | oTRrVmOqFUAUU413EkHY | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | None | 22119019 | _PVbSfSL5apmaZkcpL2mbw | md5 | None | None | 1 | 1 | 1 | False | 2024-04-22 15:49:05.467460+00:00 | 2024-04-22 15:49:05.467481+00:00 | 1
9 | eSf8gIp3P6c47grO3ODg | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | None | 22119019 | fUw4NVqV-Zy_OdxiMetlfg | md5 | None | None | 1 | 1 | 1 | False | 2024-04-22 15:49:05.466864+00:00 | 2024-04-22 15:49:05.466885+00:00 | 1
8 | wCOikxE2ai0jAQ1qy3uJ | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | None | 22119019 | H_ya8KoVaaeu_Ve_N14TAg | md5 | None | None | 1 | 1 | 1 | False | 2024-04-22 15:49:05.466093+00:00 | 2024-04-22 15:49:05.466113+00:00 | 1
7 | gMFaIATmF5BczgFgyLsB | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | None | 22119019 | qpmIHKbuxwe2sE_rdcPqfA | md5 | None | None | 1 | 1 | 1 | False | 2024-04-22 15:49:05.465500+00:00 | 2024-04-22 15:49:05.465521+00:00 | 1
6 | iwHqXEhcfp2KkUYWcBNv | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | None | 22119019 | idW8uRMTLfXNJHnboZy8GQ | md5 | None | None | 1 | 1 | 1 | False | 2024-04-22 15:49:05.464877+00:00 | 2024-04-22 15:49:05.464901+00:00 | 1
5 | bS6Kd83VghBRKJrIrcmT | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | None | 22119019 | Yw4DJkg2QQ7ez4j2_qWN_Q | md5 | None | None | 1 | 1 | 1 | False | 2024-04-22 15:49:05.464245+00:00 | 2024-04-22 15:49:05.464268+00:00 | 1
Run
id | uid | transform_id | started_at | finished_at | created_by_id | json | report_id | environment_id | is_consecutive | reference | reference_type | created_at
1 | Uc2SIFXhikeSDkSOMK3f | 1 | 2024-04-22 15:48:50.235378+00:00 | None | 1 | None | None | None | True | None | None | 2024-04-22 15:48:50.235534+00:00
2 | wJFsBeIvVYB84YLGlZRY | 2 | 2024-04-22 15:51:00.230613+00:00 | None | 1 | None | None | None | None | nextflow\nbe6a1f2f-56ca-4e7d-99da-8282f1f75b02 | nextflow_id | 2024-04-22 15:51:00.230774+00:00
Storage
id | uid | root | description | type | region | created_at | updated_at | created_by_id
1 | cvulaYEI | /home/runner/work/nextflow-lamin-usecases/next... | None | local | None | 2024-04-22 15:48:48.679913+00:00 | 2024-04-22 15:48:48.679938+00:00 | 1
Transform
id | uid | name | key | version | description | type | latest_report_id | source_code_id | reference | reference_type | created_at | updated_at | created_by_id
2 | GYnAh9Kv7kHOydQ5 | mcmicro | None | 1.0.0 | None | pipeline | None | None | https://github.com/labsyspharm/mcmicro | None | 2024-04-22 15:51:00.225028+00:00 | 2024-04-22 15:51:00.225067+00:00 | 1
1 | vk3IKwALJm36lBuO | Download | None | None | None | pipeline | None | None | None | None | 2024-04-22 15:48:50.229641+00:00 | 2024-04-22 15:48:50.229669+00:00 | 1
User
id | uid | handle | name | created_at | updated_at
1 | DzTjkKse | testuser1 | Test User1 | 2024-04-22 15:48:48.675881+00:00 | 2024-04-22 15:48:48.675909+00:00

Clean up the test instance:

!lamin delete --force nextflow-mcmicro
💡 deleting instance testuser1/nextflow-mcmicro
❗ manually delete your stored data: /home/runner/work/nextflow-lamin-usecases/nextflow-lamin-usecases/docs

If you are interested in registering bulk RNA-seq data with Lamin, you can have a look at our nf-core/rnaseq example.