Autogenerated Variants
When creating a resource, a variant can be explicitly defined in the variant field of Sources, Transformations, Features, Labels, and Training Sets. If no variant is defined, a randomly generated variant is created and used.
The same randomly generated variant is added to each resource until featureform is imported again or featureform.set_run() is called.
Setting A Run
Variants can be explicitly defined by calling featureform.set_run("my_variant") with a string argument. This string will be used as the variant from that point forward. Calling featureform.set_run() with no arguments will create a new auto-generated variant.
Example 1: Using set_run() without arguments will generate a random run name.
import featureform as ff
ff.set_run()
postgres.register_table(
    name="transactions",
    table="transactions_table",
)
# Applying will register the source as name=transactions, variant=<randomly-generated>
Example 2: Using set_run() with arguments will set the variant to the provided name.
import featureform as ff
ff.set_run("last_30_days")
postgres.register_table(
    name="transactions",
    table="transactions_table",
)
# Applying will register the source as name=transactions, variant=last_30_days
Example 3: Generated and explicitly set variant names can be used together.
import featureform as ff
ff.set_run()
file = spark.register_file(
    name="transactions",
    path="my/transactions.parquet",
    variant="last_30_days",
)

@spark.df_transformation(inputs=[file])
def customer_count(transactions):
    return transactions.groupBy("CustomerID").count()
# Applying without a variant for the dataframe transformation will result in
# the transactions source having a variant of last_30_days and the transformation
# having a randomly generated variant
Example 4: The variant is also autofilled within SQL transformations.
import featureform as ff
ff.set_run("last_30_days")
@postgres.sql_transformation()
def my_transformation():
    return "SELECT CustomerID, Amount FROM {{ transactions }}"
# The variant will be autofilled so the SQL query is returned as:
# "SELECT CustomerID, Amount FROM {{ transactions.last_30_days }}"
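The autofill behavior in Example 4 can be illustrated with a small stand-alone sketch. This is not Featureform's implementation: the function name autofill_variants and the regular expression are hypothetical, and the sketch assumes that placeholders which already name a variant are left unchanged.

```python
import re

# Matches {{ name }} or {{ name.variant }}, tolerating inner whitespace.
# Hypothetical pattern for illustration; not taken from Featureform.
PLACEHOLDER = re.compile(r"\{\{\s*([\w-]+)(?:\.([\w-]+))?\s*\}\}")


def autofill_variants(query: str, run: str) -> str:
    """Add `run` as the variant to placeholders that do not name one."""

    def fill(match: re.Match) -> str:
        name, variant = match.group(1), match.group(2)
        # Assumption: an explicitly written variant takes precedence.
        return "{{ " + name + "." + (variant or run) + " }}"

    return PLACEHOLDER.sub(fill, query)


query = "SELECT CustomerID, Amount FROM {{ transactions }}"
filled = autofill_variants(query, "last_30_days")
pinned = autofill_variants("SELECT * FROM {{ transactions.v1 }}", "last_30_days")
```

Here filled carries the run name as the variant, while pinned keeps the variant that was written explicitly in the query.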
Parameters:
Name | Type | Description | Default |
---|---|---|---|
run | str | Name of a run to be set. | '' |
Getting A Run
The current run can be retrieved by calling featureform.get_run(), which returns the current run as a string.
This is useful when calling serving functions from the same notebook in which resources are being applied.
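The run semantics described on this page can be modeled in a few lines of plain Python. The sketch below is only a toy model under stated assumptions, not Featureform's code: the RunState class, its resolve_variant method, and the "auto-" name format are all hypothetical, chosen to show how an explicit variant, a named run, and an auto-generated run interact.

```python
import secrets


class RunState:
    """Toy model of the run/variant behavior (hypothetical, not Featureform's code)."""

    def __init__(self):
        # A fresh state starts with an auto-generated run.
        self._run = self._generate()

    @staticmethod
    def _generate():
        # Assumed name format; the real generated names may look different.
        return f"auto-{secrets.token_hex(4)}"

    def set_run(self, run=""):
        # An empty string (the default) requests a new auto-generated run.
        self._run = run if run else self._generate()

    def get_run(self):
        return self._run

    def resolve_variant(self, variant=""):
        # An explicit variant on a resource wins over the current run.
        return variant if variant else self._run


state = RunState()
state.set_run("last_30_days")
source_variant = state.resolve_variant()      # falls back to the named run
pinned_variant = state.resolve_variant("v1")  # explicit variant wins
state.set_run()
generated_variant = state.resolve_variant()   # uses the new auto-generated run
```

In this model, the value returned by get_run() is the variant that any resource registered without an explicit variant would receive, which mirrors why reading the current run is handy when requesting those resources later.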