Nextflow

Nextflow is a workflow management system used for executing scientific workflows across platforms scalably, portably, and reproducibly.

Here, we’ll run a demo of the microscopy pipeline mcmicro, from the illumination step (which corrects uneven illumination) through the registration step.

Note

Typically, you run the Nextflow workflow from the command line or Seqera Platform and then register input and output data with a script. The Seqera Platform allows for post-run scripts that can automate this process.

Let’s load an instance that already has example data.

!lamin load nextflow-mcmicro
💡 connected lamindb: testuser1/nextflow-mcmicro
import lamindb as ln
💡 connected lamindb: testuser1/nextflow-mcmicro

Run and register Nextflow workflow

!nextflow run https://github.com/labsyspharm/mcmicro --in exemplar-001 --start-at illumination --stop-at registration
N E X T F L O W  ~  version 23.10.1
Launching `https://github.com/labsyspharm/mcmicro` [cheeky_bernard] DSL2 - revision: cbd0f14967 [master]
[e7/b5dd1f] Submitted process > illumination (2)
[5b/91c48f] Submitted process > illumination (3)
[71/347fcd] Submitted process > illumination (1)
[d4/d29c18] Submitted process > registration:ashlar (1)
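
To check programmatically which processes a run submitted, for example before registering its outputs, the console lines above can be parsed. This is a minimal sketch that assumes the log format shown in this run; `parse_submitted` is a hypothetical helper and not part of Nextflow or lamindb.

```python
import re
from collections import Counter

# Matches lines such as "[e7/b5dd1f] Submitted process > illumination (2)".
SUBMITTED = re.compile(
    r"^\[(?P<task>[0-9a-f]{2}/[0-9a-f]{6})\] "
    r"Submitted process > (?P<process>\S+)(?: \((?P<index>\d+)\))?$"
)


def parse_submitted(log_text: str) -> list[dict]:
    """Return one dict (task, process, index) per 'Submitted process' line."""
    tasks = []
    for line in log_text.splitlines():
        match = SUBMITTED.match(line.strip())
        if match:
            tasks.append(match.groupdict())
    return tasks


# Console output copied from the run above.
log_text = """\
N E X T F L O W  ~  version 23.10.1
[e7/b5dd1f] Submitted process > illumination (2)
[5b/91c48f] Submitted process > illumination (3)
[71/347fcd] Submitted process > illumination (1)
[d4/d29c18] Submitted process > registration:ashlar (1)
"""

tasks = parse_submitted(log_text)
counts = Counter(task["process"] for task in tasks)
print(dict(counts))  # {'illumination': 3, 'registration:ashlar': 1}
```

Lines that are not task submissions, such as the version banner, are skipped.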

Now we register our Nextflow run by running our registration script.

!python register_mcmicro_run.py
💡 connected lamindb: testuser1/nextflow-mcmicro
💡 saved: Transform(version='1.0.0', uid='egNPcFNWhVZx', name='mcmicro', type='pipeline', reference='https://github.com/labsyspharm/mcmicro', updated_at=2024-05-19 23:25:18 UTC, created_by_id=1)
💡 saved: Run(uid='f3TQR6OFKQHVfv0rIJUn', transform_id=2, created_by_id=1)
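
The contents of `register_mcmicro_run.py` are not shown on this page. The sketch below illustrates what such a script might do, assuming the lamindb version used here, in which `Transform` takes the `name`, `version`, `type`, and `reference` fields printed above (other versions may differ). `find_outputs` and `register_run` are hypothetical helpers, and the `.ome.tif` suffix and the `"mcmicro"` description are taken from the artifact listed further down.

```python
from pathlib import Path


def find_outputs(output_dir: str, suffix: str = ".ome.tif") -> list[Path]:
    """Return files under output_dir whose name ends with suffix, sorted by path."""
    return sorted(p for p in Path(output_dir).rglob(f"*{suffix}") if p.is_file())


def register_run(output_dir: str) -> None:
    """Save a pipeline Transform, a Run, and one Artifact per output file."""
    # Imported here so find_outputs stays usable without a connected instance.
    import lamindb as ln

    # Field values mirror the Transform record printed in the output above.
    transform = ln.Transform(
        name="mcmicro",
        version="1.0.0",
        type="pipeline",
        reference="https://github.com/labsyspharm/mcmicro",
    )
    transform.save()

    run = ln.Run(transform=transform)
    run.save()

    for path in find_outputs(output_dir):
        # Passing run= links the artifact to the run that produced it.
        ln.Artifact(str(path), description="mcmicro", run=run).save()
```

Under these assumptions, calling `register_run("exemplar-001/registration")` after the workflow finishes would register the registration output. Linking input artifacts to the run is left out of this sketch.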

Data lineage

View data lineage:

output = ln.Artifact.filter(description__icontains="mcmicro").one()
output.view_lineage()
[Figure: data lineage graph of the mcmicro output artifact]

View the database content:

ln.view()
Artifact
| id | version | created_at | created_by_id | updated_at | uid | storage_id | key | suffix | accessor | description | size | hash | hash_type | n_objects | n_observations | transform_id | run_id | visibility | key_is_virtual |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 11 | None | 2024-05-19 23:25:19.806632+00:00 | 1 | 2024-05-19 23:25:19.806691+00:00 | u1ekYnwQ554LK48M6Wcu | 1 | exemplar-001/registration/exemplar-001.ome.tif | .tif | None | mcmicro | 175490712 | AfbcL67v_OP0kNfIHTgC47 | sha1-fl | None | None | 2 | 2 | 1 | False |
| 10 | None | 2024-05-19 23:23:29.955844+00:00 | 1 | 2024-05-19 23:23:29.955885+00:00 | YWIsfK7FrVgjH654firB | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | 22119019 | Yw4DJkg2QQ7ez4j2_qWN_Q | md5 | None | None | 1 | 1 | 1 | False |
| 9 | None | 2024-05-19 23:23:29.955232+00:00 | 1 | 2024-05-19 23:23:29.955271+00:00 | RQuNInfFVKOTeYvZNgSq | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | 22119019 | fUw4NVqV-Zy_OdxiMetlfg | md5 | None | None | 1 | 1 | 1 | False |
| 8 | None | 2024-05-19 23:23:29.954631+00:00 | 1 | 2024-05-19 23:23:29.954672+00:00 | G7vLxEdOcovWfVMAqMf6 | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | 22119019 | H_ya8KoVaaeu_Ve_N14TAg | md5 | None | None | 1 | 1 | 1 | False |
| 7 | None | 2024-05-19 23:23:29.954021+00:00 | 1 | 2024-05-19 23:23:29.954062+00:00 | 1OYWJOjiJAftLhSxy0Oe | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | 22119019 | qpmIHKbuxwe2sE_rdcPqfA | md5 | None | None | 1 | 1 | 1 | False |
| 6 | None | 2024-05-19 23:23:29.953410+00:00 | 1 | 2024-05-19 23:23:29.953451+00:00 | OrWItc6otSBCAXXyFxUe | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | 22119019 | idW8uRMTLfXNJHnboZy8GQ | md5 | None | None | 1 | 1 | 1 | False |
| 5 | None | 2024-05-19 23:23:29.952776+00:00 | 1 | 2024-05-19 23:23:29.952816+00:00 | Uz6yJdMyED0qy9DtdUDG | 1 | exemplar-001/illumination/exemplar-001-cycle-0... | .tif | None | None | 22119019 | _PVbSfSL5apmaZkcpL2mbw | md5 | None | None | 1 | 1 | 1 | False |
Run
| id | uid | transform_id | started_at | finished_at | created_by_id | report_id | environment_id | is_consecutive | reference | reference_type | created_at |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | RqZYx2wkekJAdaABqAEP | 1 | 2024-05-19 23:23:15.329866+00:00 | None | 1 | None | None | True | None | None | 2024-05-19 23:23:15.330007+00:00 |
| 2 | f3TQR6OFKQHVfv0rIJUn | 2 | 2024-05-19 23:25:18.959156+00:00 | None | 1 | None | None | None | nextflow\na05dde95-111e-43d4-83c9-3fdfbe7c1c84 | nextflow_id | 2024-05-19 23:25:18.959306+00:00 |
Storage
| id | created_at | created_by_id | run_id | updated_at | uid | root | description | type | region | instance_uid |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 2024-05-19 23:23:13.963245+00:00 | 1 | None | 2024-05-19 23:23:13.963305+00:00 | FhvJ40ujhtv3 | /home/runner/work/nextflow-lamin-usecases/next... | None | local | None | 7XJiuVOUySVN |
Transform
| id | version | uid | name | key | description | type | latest_report_id | source_code_id | reference | reference_type | created_at | updated_at | created_by_id |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2 | 1.0.0 | egNPcFNWhVZx | mcmicro | None | None | pipeline | None | None | https://github.com/labsyspharm/mcmicro | None | 2024-05-19 23:25:18.955811+00:00 | 2024-05-19 23:25:18.955846+00:00 | 1 |
| 1 | None | rIqtPQJeYDuJ | Download | None | None | pipeline | None | None | None | None | 2024-05-19 23:23:15.325114+00:00 | 2024-05-19 23:23:15.325143+00:00 | 1 |
User
| id | uid | handle | name | created_at | updated_at |
| --- | --- | --- | --- | --- | --- |
| 1 | DzTjkKse | testuser1 | Test User1 | 2024-05-19 23:23:13.958453+00:00 | 2024-05-19 23:23:13.958479+00:00 |

Clean up the test instance:

!lamin delete --force nextflow-mcmicro
💡 deleting instance testuser1/nextflow-mcmicro
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.10.14/x64/bin/lamin", line 8, in <module>
    sys.exit(main())
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/rich_click/rich_command.py", line 367, in __call__
    return super().__call__(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/rich_click/rich_command.py", line 152, in main
    rv = self.invoke(ctx)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/lamin_cli/__main__.py", line 103, in delete
    return delete(instance, force=force)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/lamindb_setup/_delete.py", line 136, in delete
    isettings.storage.root.rmdir()
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/pathlib.py", line 1215, in rmdir
    self._accessor.rmdir(self)
OSError: [Errno 39] Directory not empty: '/home/runner/work/nextflow-lamin-usecases/nextflow-lamin-usecases/docs'
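
The traceback ends in `pathlib.Path.rmdir`, which only removes empty directories. Here the storage root is the `docs` folder, which is not empty, so the call raises `OSError`. The sketch below reproduces that behaviour in a temporary scratch directory (a stand-in, not the instance's storage root) and contrasts it with `shutil.rmtree`, which removes a directory along with its contents. It illustrates standard-library behaviour only and is not a recommended fix for `lamin delete`.

```python
import shutil
import tempfile
from pathlib import Path

# Scratch directory standing in for a storage root that still holds a file.
root = Path(tempfile.mkdtemp())
(root / "leftover.txt").write_text("not managed by the instance")

try:
    root.rmdir()  # succeeds only if the directory is empty
    removed = True
except OSError as error:
    removed = False
    print(f"rmdir failed: {error.strerror}")

# shutil.rmtree deletes the directory together with everything inside it,
# so use it only on a directory that is safe to discard entirely.
shutil.rmtree(root)
print(root.exists())  # False
```

Pointing the instance at a dedicated storage directory, rather than one that also holds notebooks or scripts, would avoid having unrelated files in the storage root in the first place.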