lamindb.Transform
- class lamindb.Transform(name: str, key: str | None = None, version: str | None = None, type: TransformType | None = None, is_new_version_of: Transform | None = None)
Bases: Registry, HasParents, IsVersioned
Data transformations.
A transform can refer to a simple Python function, a script, a notebook, or a pipeline. If you execute a transform, you generate a run of a transform (Run). A run has input and output data.
A pipeline is typically created with a workflow tool (Nextflow, Snakemake, Prefect, Flyte, MetaFlow, redun, Airflow, …) and stored in a versioned repository.
Transforms are versioned so that a given transform maps 1:1 to a specific version of code. If you switch on sync_git_repo, any script-like transform is synced with its hashed state in a git repository.
If you execute a transform, you generate a Run record. The definition of transforms and runs is consistent with the OpenLineage specification, where a Transform record would be called a “job” and a Run record a “run”.
- Parameters:
name – str
A name or title.
key – str | None = None
A short name or path-like semantic key.
version – str | None = None
A version.
type – TransformType | None = "pipeline"
Either 'notebook', 'pipeline' or 'script'.
is_new_version_of – Transform | None = None
An old version of the transform.
Notes
Examples
Create a transform for a pipeline:
>>> transform = ln.Transform(name="Cell Ranger", version="7.2.0", type="pipeline")
>>> transform.save()
Create a transform from a notebook:
>>> ln.track()
View parents of a transform:
>>> transform.view_parents()
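The relationship between transforms and runs described above can be illustrated with a small, self-contained sketch. It does not use lamindb: `ToyTransform`, `ToyRun`, and `execute` are hypothetical stand-ins that only mirror the idea that one versioned transform (a "job" in OpenLineage terms) can yield many runs, each with its own input and output data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyTransform:
    """Hypothetical stand-in for a transform record (an OpenLineage "job")."""
    name: str
    version: str
    type: str = "pipeline"


@dataclass
class ToyRun:
    """Hypothetical stand-in for a run record (an OpenLineage "run")."""
    transform: ToyTransform
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def execute(transform, inputs):
    # Executing a transform yields a new run that records inputs and outputs.
    run = ToyRun(transform=transform, inputs=list(inputs))
    run.outputs = [f"{name}.processed" for name in run.inputs]
    return run


cell_ranger = ToyTransform(name="Cell Ranger", version="7.2.0")
run_a = execute(cell_ranger, ["sample_a.fastq"])
run_b = execute(cell_ranger, ["sample_b.fastq"])
```

Both runs point to the same transform record, which is what lets a given output be traced back to one specific version of code.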
Fields
- version CharField
Version (default None).
Defines version of a family of records characterized by the same stem_uid.
Consider using semantic versioning with Python versioning.
- id AutoField
Internal id, valid only in one DB instance.
- uid CharField
Universal id.
- name CharField
A name or title. For instance, a pipeline name, notebook title, etc.
- key CharField
A key for concise reference & versioning (optional).
- description CharField
A description (optional).
- type CharField
Transform type (default "pipeline").
- latest_report ForeignKey
Latest run report.
- source_code ForeignKey
Source of the transform if stored as artifact within LaminDB.
- reference CharField
Reference for the transform, e.g., a URL.
- reference_type CharField
Type of reference, e.g., ‘url’ or ‘doi’.
- created_at DateTimeField
Time of creation of record.
- updated_at DateTimeField
Time of last update to record.
- created_by ForeignKey
Creator of record, a User.
- ulabels ManyToManyField
Accessor to the related objects manager on the forward and reverse sides of a many-to-many relation.
In the example:
class Pizza(Model):
    toppings = ManyToManyField(Topping, related_name='pizzas')
Pizza.toppings and Topping.pizzas are ManyToManyDescriptor instances.
Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.
- parents ManyToManyField
Parent transforms (predecessors) in data flow.
These are auto-populated whenever a transform loads an artifact or collection as run input.
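How `parents` could be derived from run inputs is sketched below. This is a plain-Python illustration under stated assumptions, not lamindb's implementation: the `produced_by` and `parents` mappings and the helpers `record_output` and `record_input` are hypothetical names introduced only for this example.

```python
from collections import defaultdict

# Hypothetical bookkeeping: which transform produced each artifact,
# and which transforms are parents (predecessors) of each transform.
produced_by = {}
parents = defaultdict(set)


def record_output(transform, artifact):
    # Remember the transform whose run created this artifact.
    produced_by[artifact] = transform


def record_input(transform, artifact):
    # Loading an artifact as run input links its producer as a parent.
    producer = produced_by.get(artifact)
    if producer is not None and producer != transform:
        parents[transform].add(producer)


record_output("Cell Ranger", "counts.h5")
record_input("Clustering notebook", "counts.h5")
record_output("Clustering notebook", "clusters.csv")
record_input("Report script", "clusters.csv")
```

In this toy setup, the notebook ends up with the pipeline as its parent, and the script with the notebook as its parent, without either link being declared by hand.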
Methods
- delete()
- Return type: None
- get_type_display(*, field=<django.db.models.fields.CharField: type>)