Redun#
Here, we’ll see how to track redun workflow runs with LaminDB.
Note
This use case is based on github.com/ricomnl/bioinformatics-pipeline-tutorial.
Setup#
!lamin init --storage . --name redun-lamin-fasta
💡 connected lamindb: testuser1/redun-lamin-fasta
Register the workflow#
import lamindb as ln
import json
💡 connected lamindb: testuser1/redun-lamin-fasta
Register the workflow in the Transform registry:
ln.Transform(
    name="lamin-redun-fasta",
    type="pipeline",
    version="0.1.0",
    reference="https://github.com/laminlabs/redun-lamin-fasta",
).save()
Transform(uid='yI1SutqL22AqAjZG', name='lamin-redun-fasta', version='0.1.0', type='pipeline', reference='https://github.com/laminlabs/redun-lamin-fasta', updated_at=2024-05-01 09:53:14 UTC, created_by_id=1)
How to amend a redun workflow.py to register input & output files in LaminDB?
To register, track, and query input files via LaminDB, we added the following lines:
# register input files in lamindb
ln.save(ln.Artifact.from_dir(input_dir))
# query & track this pipeline
transform = ln.Transform.filter(name="lamin-redun-fasta", version="0.1.0").one()
ln.track(transform=transform)
# query input files
input_filepaths = [
    file.stage() for file in ln.Artifact.filter(key__startswith="fasta/")
]
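The query above relies on the input artifacts having keys that start with `fasta/` (compare the `key` column of the Artifact table shown under "View the database content"). As a rough illustration of that naming pattern only, and not of how LaminDB derives keys internally, the following standard-library sketch lists the files of a directory under keys of the form `<directory name>/<file name>`. The helper name `expected_keys` is hypothetical.

```python
from pathlib import Path


def expected_keys(input_dir: str) -> list[str]:
    """Return '<dir name>/<file name>' for each regular file in input_dir.

    Hypothetical helper: it only mirrors the key pattern visible in this
    guide (e.g. 'fasta/SOX2.fasta'); it does not call LaminDB.
    """
    # resolve() so that inputs like "." or "./fasta/" yield a real directory name
    directory = Path(input_dir).resolve()
    return sorted(
        f"{directory.name}/{path.name}"
        for path in directory.iterdir()
        if path.is_file()
    )
```

Comparing this list with the keys returned by the `ln.Artifact.filter(...)` query is one way to check that every input file was registered.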
To register the output file via LaminDB, we added the following line to the last task:
ln.Artifact(output_path).save()
Run redun#
Let’s see what the input files are:
!ls ./fasta
KLF4.fasta MYC.fasta PO5F1.fasta SOX2.fasta
And call the workflow:
!redun run workflow.py main --input-dir ./fasta --tag run=test-run 1> redun_stdout.txt 2>redun_stderr.txt
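The shell redirections `1>` and `2>` above send standard output and standard error to separate files. When the workflow is launched from a Python script rather than a notebook cell, the same separation can be sketched with the standard library. The helper name `run_and_capture` is hypothetical; the redun arguments in the comment are copied from the command above.

```python
import subprocess


def run_and_capture(cmd: list[str], stdout_path: str, stderr_path: str) -> int:
    """Run cmd, write stdout and stderr to separate files, return the exit code."""
    with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
        completed = subprocess.run(cmd, stdout=out, stderr=err, check=False)
    return completed.returncode


# Example call (requires redun on PATH and workflow.py in the working directory):
# run_and_capture(
#     ["redun", "run", "workflow.py", "main",
#      "--input-dir", "./fasta", "--tag", "run=test-run"],
#     "redun_stdout.txt",
#     "redun_stderr.txt",
# )
```

Returning the exit code instead of raising leaves it to the caller to decide whether a non-zero status should stop the notebook.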
Inspect the output:
!cat redun_stdout.txt
💡 connected lamindb: testuser1/redun-lamin-fasta
❗ this creates one artifact per file in the directory - you might simply call ln.Artifact(dir) to get one artifact for the entire directory
❗ no run & transform get linked, consider calling ln.track()
❗ no run & transform get linked, consider calling ln.track()
❗ no run & transform get linked, consider calling ln.track()
❗ no run & transform get linked, consider calling ln.track()
💡 loaded: Transform(uid='yI1SutqL22AqAjZG', name='lamin-redun-fasta', version='0.1.0', type='pipeline', reference='https://github.com/laminlabs/redun-lamin-fasta', updated_at=2024-05-01 09:53:14 UTC, created_by_id=1)
💡 saved: Run(uid='TpXYZ0HoCIwrJiuIT2qg', transform_id=1, created_by_id=1)
File(path=/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/results.tgz, hash=0cfc1b60)
And the last line of the stderr log:
!tail -1 redun_stderr.txt
[redun] Execution duration: 2.06 seconds
View data lineage:
artifact = ln.Artifact.filter(key="data/results.tgz").one() # query by key
artifact.view_lineage()
Register the redun execution id#
To be able to query LaminDB by redun execution ID, we can export the ID from redun and store it on the run record:
# export the run information from redun
!redun log --exec --exec-tag run=test-run --format json --no-pager > redun_exec.json
# load the redun execution id from the JSON and store it in the LaminDB run record
with open("redun_exec.json") as f:
    redun_exec = json.load(f)
artifact.run.reference = redun_exec["id"]
artifact.run.reference_type = "redun_id"
artifact.run.save()
Run(uid='TpXYZ0HoCIwrJiuIT2qg', started_at=2024-05-01 09:53:18 UTC, reference='af5b932b-d5b8-4d15-8ec3-8df546247d22', reference_type='redun_id', transform_id=1, created_by_id=1)
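The cell above assumes that `redun_exec.json` holds a single JSON object with an `id` key. A minimal sketch of a more defensive read, under that same assumption, is shown below; the helper name `read_execution_id` is hypothetical, and the check simply fails early if the export has a different shape (for example, if it is empty or not a JSON object).

```python
import json
from pathlib import Path


def read_execution_id(path: str) -> str:
    """Return the 'id' field of a redun execution exported as JSON.

    Assumes the file contains one JSON object with an 'id' key, as in the
    cell above; raises ValueError for any other shape.
    """
    record = json.loads(Path(path).read_text())
    if not isinstance(record, dict) or "id" not in record:
        raise ValueError(f"{path} does not look like a single execution record")
    return str(record["id"])
```

The returned string can then be assigned to `artifact.run.reference` exactly as in the cell above.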
View the database content#
ln.view()
Artifact
id | uid | storage_id | key | suffix | accessor | description | version | size | hash | hash_type | n_objects | n_observations | transform_id | run_id | visibility | key_is_virtual | created_at | updated_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
5 | 9q2tzTGy1J4nEv7UcjAn | 1 | data/results.tgz | .tgz | None | None | None | 83759 | R9DDhFqyndmsZk2gl_9KEQ | md5 | None | None | 1.0 | 1.0 | 1 | False | 2024-05-01 09:53:20.659884+00:00 | 2024-05-01 09:53:20.659922+00:00 | 1 |
4 | u0xXvsr3hzuTtnfCB2fR | 1 | fasta/SOX2.fasta | .fasta | None | None | None | 414 | C5q_yaFXGk4SAEpfdqBwnQ | md5 | None | None | NaN | NaN | 1 | False | 2024-05-01 09:53:18.705408+00:00 | 2024-05-01 09:53:18.705429+00:00 | 1 |
3 | 7MRWXzC5aLvscjO3tIN0 | 1 | fasta/MYC.fasta | .fasta | None | None | None | 536 | WGbEtzPw-3bQEGcngO_pHQ | md5 | None | None | NaN | NaN | 1 | False | 2024-05-01 09:53:18.704841+00:00 | 2024-05-01 09:53:18.704863+00:00 | 1 |
2 | x8fujBCvzQI8Tyll8QBK | 1 | fasta/KLF4.fasta | .fasta | None | None | None | 609 | LyuoYkWs4SgYcH7P7JLJtA | md5 | None | None | NaN | NaN | 1 | False | 2024-05-01 09:53:18.704076+00:00 | 2024-05-01 09:53:18.704100+00:00 | 1 |
1 | xH45eVTFuB5m6eTWzgT8 | 1 | fasta/PO5F1.fasta | .fasta | None | None | None | 477 | -7iJgveFO9ia0wE1bqVu6g | md5 | None | None | NaN | NaN | 1 | False | 2024-05-01 09:53:18.702875+00:00 | 2024-05-01 09:53:18.702910+00:00 | 1 |
Run
id | uid | transform_id | started_at | finished_at | created_by_id | json | report_id | environment_id | is_consecutive | reference | reference_type | created_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | TpXYZ0HoCIwrJiuIT2qg | 1 | 2024-05-01 09:53:18.710674+00:00 | None | 1 | None | None | None | None | af5b932b-d5b8-4d15-8ec3-8df546247d22 | redun_id | 2024-05-01 09:53:18.710926+00:00 |
Storage
id | uid | root | description | type | region | created_at | updated_at | created_by_id |
---|---|---|---|---|---|---|---|---|
1 | Jey7HFrJ | /home/runner/work/redun-lamin-fasta/redun-lami... | None | local | None | 2024-05-01 09:53:13.125072+00:00 | 2024-05-01 09:53:13.125097+00:00 | 1 |
Transform
id | uid | name | key | version | description | type | latest_report_id | source_code_id | reference | reference_type | created_at | updated_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | yI1SutqL22AqAjZG | lamin-redun-fasta | None | 0.1.0 | None | pipeline | None | None | https://github.com/laminlabs/redun-lamin-fasta | None | 2024-05-01 09:53:14.647999+00:00 | 2024-05-01 09:53:14.648030+00:00 | 1 |
User
id | uid | handle | name | created_at | updated_at |
---|---|---|---|---|---|
1 | DzTjkKse | testuser1 | Test User1 | 2024-05-01 09:53:13.120779+00:00 | 2024-05-01 09:53:13.120812+00:00 |
Delete the test instance:
!lamin delete --force redun-lamin-fasta
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.10.14/x64/bin/lamin", line 8, in <module>
sys.exit(main())
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/rich_click/rich_command.py", line 360, in __call__
return super().__call__(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/rich_click/rich_command.py", line 152, in main
rv = self.invoke(ctx)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/lamin_cli/__main__.py", line 103, in delete
return delete(instance, force=force)
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/lamindb_setup/_delete.py", line 130, in delete
n_objects = check_storage_is_empty(
File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/lamindb_setup/core/upath.py", line 720, in check_storage_is_empty
raise InstanceNotEmpty(message)
lamindb_setup.core.upath.InstanceNotEmpty: Storage location contains 31 objects (2 ignored) - delete them prior to deleting the instance
['/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/.lamindb/_is_initialized', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/.redun/redun.db', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/.redun/redun.ini', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/989b97bf67a75b2d8b0df2ca74b868b7.lndb', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/__pycache__/workflow.cpython-310.pyc', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/changelog.md', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/KLF4.count.tsv', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/KLF4.peptides.txt', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/KLF4.plot.png', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/MYC.count.tsv', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/MYC.peptides.txt', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/MYC.plot.png', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/PO5F1.count.tsv', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/PO5F1.peptides.txt', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/PO5F1.plot.png', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/SOX2.count.tsv', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/SOX2.peptides.txt', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/SOX2.plot.png', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/protein_report.tsv', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/data/results.tgz', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/fasta/KLF4.fasta', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/fasta/MYC.fasta', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/fasta/PO5F1.fasta', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/fasta/SOX2.fasta', 
'/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/guide.md', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/index.md', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/redun.ipynb', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/redun_exec.json', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/redun_stderr.txt', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/redun_stdout.txt', '/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/workflow.py']
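The deletion above fails because the storage location still contains files, and the error message asks for them to be removed first. One way to review what is left before cleaning up is sketched below using only the standard library. The helper name `list_storage_files` is hypothetical, and the sketch does not replicate LaminDB's own check (which, per the message above, ignores some objects); it simply lists every regular file below a directory.

```python
from pathlib import Path


def list_storage_files(root: str) -> list[str]:
    """Return all regular files below root as sorted, POSIX-style relative paths."""
    base = Path(root)
    return sorted(
        path.relative_to(base).as_posix()
        for path in base.rglob("*")
        if path.is_file()
    )


# Example: print the files under the current directory, which is the storage
# root chosen by `lamin init --storage .` in this guide.
# for name in list_storage_files("."):
#     print(name)
```

Since the instance in this guide uses the notebook's own directory as storage, the listing also contains the notebook, `workflow.py`, and the redun database, so the output is best reviewed by hand before anything is deleted.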