lamindb.FeatureSet
- class lamindb.FeatureSet(features: Iterable[Registry], type: str | None = None, name: str | None = None)
Feature sets.
Stores references to sets of Feature and other registries that may be used to identify features (e.g., bionty.Gene or bionty.Protein).
- Parameters:
features (Iterable[Registry]) – An iterable of Feature records to hash, e.g., [Feature(...), Feature(...)]. Is turned into a set upon instantiation. If you’d like to pass values, use from_values() or from_df().
type (str | None, default: None) – The simple type. Defaults to None for sets of Feature records, and otherwise defaults to "number" (e.g., for sets of Gene).
name (str | None, default: None) – A name.
Note
Feature sets are useful because you likely have many datasets that measure the same features. In LaminDB, they are all linked against the exact same feature set. If you instead linked each dataset against single features (say, genes), the link tables would explode in size.
A feature set is identified by the hash of the feature uids in the set.
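The note above can be illustrated with a minimal plain-Python sketch. It does not use LaminDB: the helper `feature_set_hash`, the uids, and the hashing scheme (SHA-256 over the sorted unique uids) are assumptions made for illustration and are not claimed to match the library's actual hash. It shows that datasets measuring the same features resolve to one set identity, and compares the number of links needed in each approach.

```python
import hashlib


def feature_set_hash(feature_uids):
    """Order-independent hash over the unique feature uids (illustrative scheme only)."""
    unique_sorted = sorted(set(feature_uids))
    joined = "\n".join(unique_sorted).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()[:20]


# Hypothetical feature uids measured by three datasets
datasets = {
    "dataset_a": ["uid_gene1", "uid_gene2", "uid_gene3"],
    "dataset_b": ["uid_gene3", "uid_gene1", "uid_gene2"],  # same features, different order
    "dataset_c": ["uid_gene1", "uid_gene2", "uid_gene3"],
}

# Linking against feature sets: one link per dataset
set_links = {name: feature_set_hash(uids) for name, uids in datasets.items()}
n_set_links = len(set_links)
n_distinct_sets = len(set(set_links.values()))

# Linking against single features: one link per (dataset, feature) pair
n_single_feature_links = sum(len(uids) for uids in datasets.values())
```

With thousands of genes per dataset, the per-feature link count grows with datasets times features, while the per-set link count grows only with the number of datasets.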
See also
from_values() – Create from values.
from_df() – Create from dataframe columns.
Examples
Create a featureset from df with types:
>>> import lamindb as ln
>>> import pandas as pd
>>> df = pd.DataFrame({"feat1": [1, 2], "feat2": [3.1, 4.2], "feat3": ["cond1", "cond2"]})
>>> feature_set = ln.FeatureSet.from_df(df)
Create a featureset from features:
>>> features = ln.Feature.from_values(["feat1", "feat2"], type=float)
>>> feature_set = ln.FeatureSet(features)
Create a featureset from feature values:
>>> import bionty as bt
>>> feature_set = ln.FeatureSet.from_values(adata.var["ensemble_id"], bt.Gene.ensembl_gene_id, organism="mouse")
>>> feature_set.save()
Link a feature set to an artifact:
>>> artifact.features.add_feature_set(feature_set, slot="var")
Link features to an artifact (will create a featureset under the hood):
>>> artifact.features.add(features)
Properties
- members
A queryset for the individual records of the set.
Fields
- created_at DateTimeField
Time of creation of record.
- created_by ForeignKey
Creator of record, a User.
- run ForeignKey
Last run that created or updated the record, a Run.
- id AutoField
Internal id, valid only in one DB instance.
- uid CharField
A universal id (hash of the set of feature values).
- name CharField
A name (optional).
- n IntegerField
Number of features in the set.
- dtype CharField
Data type, e.g., “number”, “float”, “int”. Is None for Feature. For Feature, types are expected to be heterogeneous and defined on a per-feature level.
- registry CharField
The registry that stores the feature identifiers, e.g., 'core.Feature' or 'bionty.Gene'. Depending on the registry, .members stores, e.g., Feature or Gene records.
- hash CharField
The hash of the set.
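To make the relationship between the registry and dtype fields concrete, here is a small plain-Python sketch. The class `FeatureSetSketch` is hypothetical and is not the real LaminDB model. It mirrors only a few of the fields listed above, and the check in `__post_init__` is an assumption that encodes the documented rule that dtype is None for sets of Feature records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeatureSetSketch:
    """Hypothetical mirror of some fields listed above (not the real model)."""

    uid: str
    n: int
    registry: str
    dtype: Optional[str] = None
    name: Optional[str] = None

    def __post_init__(self):
        # Sets of Feature records carry no set-level dtype:
        # types are defined on each individual feature instead.
        if self.registry == "core.Feature" and self.dtype is not None:
            raise ValueError("dtype must be None for sets of Feature records")


# A heterogeneous set of Feature records: no set-level dtype
obs_set = FeatureSetSketch(uid="hypothetical-uid-1", n=3, registry="core.Feature")

# A homogeneous set of Gene records: one dtype for the whole set
gene_set = FeatureSetSketch(
    uid="hypothetical-uid-2", n=2, registry="bionty.Gene", dtype="number"
)
```

The design point is that a single dtype only makes sense when every member of the set shares it, as with a set of genes holding numeric measurements.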
Methods
- classmethod from_df(df, field=FieldAttr(Feature.name), name=None, mute=False, organism=None, public_source=None)
Create feature set for validated features.
- Return type:
FeatureSet | None
- classmethod from_values(values, field=FieldAttr(Feature.name), type=None, name=None, mute=False, organism=None, public_source=None, raise_validation_error=True)
Create feature set for validated features.
- Parameters:
values (List[str] | Series | array) – A list of values, like feature names or ids.
field (DeferredAttribute, default: FieldAttr(Feature.name)) – The field of a reference registry to map values.
type (str | None, default: None) – The simple type. Defaults to None if reference registry is Feature, defaults to "float" otherwise.
name (str | None, default: None) – A name.
organism (str | Registry | None, default: None) – An organism to resolve gene mapping.
public_source (Registry | None, default: None) – A public ontology to resolve feature identifier mapping.
raise_validation_error (bool, default: True) – Whether to raise a validation error if some values are not valid.
- Raises:
ValidationError – If some values are not valid.
- Return type:
Examples
>>> features = ["feat1", "feat2"]
>>> feature_set = ln.FeatureSet.from_values(features)
>>> genes = ["ENS980983409", "ENS980983410"]
>>> feature_set = ln.FeatureSet.from_values(genes, bt.Gene.ensembl_gene_id, float)
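The effect of raise_validation_error can be sketched in plain Python without a LaminDB instance. Everything here is a stand-in: `ValidationError` is defined locally, `validate_values` and `known` are hypothetical, and the fallback when the flag is False (keep the valid subset, or return None if nothing validates) is an assumption, since the text above only documents the raising case.

```python
class ValidationError(ValueError):
    """Local stand-in for the ValidationError named above (hypothetical)."""


def validate_values(values, known_values, raise_validation_error=True):
    """Check values against a reference collection of known identifiers.

    Raises ValidationError if some values are not valid and the flag is set.
    Otherwise returns the valid subset, or None if no value is valid
    (this fallback is an assumption of the sketch).
    """
    invalid = [v for v in values if v not in known_values]
    if invalid and raise_validation_error:
        raise ValidationError(f"{len(invalid)} value(s) are not valid: {invalid}")
    valid = [v for v in values if v in known_values]
    return valid or None


# Hypothetical reference registry content
known = {"feat1", "feat2"}

all_valid = validate_values(["feat1", "feat2"], known)
partially_valid = validate_values(["feat1", "typo"], known, raise_validation_error=False)
none_valid = validate_values(["typo"], known, raise_validation_error=False)
```

Raising by default surfaces misspelled or unregistered identifiers early, instead of silently building a smaller set than intended.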
- save(*args, **kwargs)
Save.
- Return type:
None