
Autogenerated Variants

When creating a resource, variants can be explicitly defined in the variant field of Sources, Transformations, Features, Labels, and Training Sets. If no variant is defined, a randomly generated variant is created and used.

The same randomly generated variant is applied to every resource until the featureform module is imported again or featureform.set_run() is called.

Setting A Run

Variants can be explicitly defined by calling featureform.set_run("my_variant") with a string argument. This string will be used as the variant from that point forward. Calling featureform.set_run() with no arguments will create a new auto-generated variant.
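The behavior described above can be pictured with a small stand-alone model. This is only a sketch of the semantics, not Featureform's implementation: the module-level `_current_run`, the helper `_random_variant`, the 8-letter name format, and `resolve_variant` are all hypothetical names introduced for illustration.

```python
import random
import string


def _random_variant(length=8):
    """Return a random lowercase name (hypothetical format, for illustration)."""
    return "".join(random.choices(string.ascii_lowercase, k=length))


# Hypothetical module-level state: a variant is generated once at import time.
_current_run = _random_variant()


def set_run(run=""):
    """Use `run` as the variant from now on; generate a new one if it is empty."""
    global _current_run
    _current_run = run if run else _random_variant()


def get_run():
    """Return the current run name."""
    return _current_run


def resolve_variant(explicit=""):
    """An explicitly passed variant wins; otherwise fall back to the current run."""
    return explicit if explicit else get_run()


set_run("last_30_days")
print(resolve_variant())          # last_30_days
print(resolve_variant("custom"))  # custom

set_run()                         # switch to a fresh generated variant
generated = get_run()
print(resolve_variant() == generated)  # True: the generated variant is reused
```

In this model, an explicit variant on a resource always takes precedence, and every resource without one shares whatever run is current at the time it is defined.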

Example 1: Using set_run() without arguments will generate a random run name.

import featureform as ff
ff.set_run()

# `postgres` is assumed to be a previously registered Postgres provider.
postgres.register_table(
    name="transactions",
    table="transactions_table",
)

# Applying will register the source as name=transactions, variant=<randomly-generated>

Example 2: Using set_run() with arguments will set the variant to the provided name.

import featureform as ff
ff.set_run("last_30_days")

postgres.register_table(
    name="transactions",
    table="transactions_table",
)

# Applying will register the source as name=transactions, variant=last_30_days

Example 3: Generated and explicitly set variant names can be used together.

import featureform as ff
ff.set_run()

file = spark.register_file(
    name="transactions",
    path="my/transactions.parquet",
    variant="last_30_days"
)

@spark.df_transformation(inputs=[file])
def customer_count(transactions):
    return transactions.groupBy("CustomerID").count()


# Applying without a variant for the dataframe transformation will result in
# the transactions source having a variant of last_30_days and the transformation
# having a randomly generated variant

Example 4: The current run is also applied to source references inside SQL transformations.

import featureform as ff
ff.set_run("last_30_days")

@postgres.sql_transformation()
def my_transformation():
    return "SELECT CustomerID, Amount FROM {{ transactions }}"

# The variant will be autofilled so the SQL query is returned as:
# "SELECT CustomerID, Amount FROM {{ transactions.last_30_days }}"
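The autofill step in Example 4 can be approximated with a short regular-expression rewrite. This is a sketch under assumptions rather than Featureform's actual logic: the function name `autofill_variants` is hypothetical, and it assumes that placeholders which already name a variant (such as `{{ transactions.v1 }}`) are left unchanged.

```python
import re

# Matches {{ name }} placeholders that do not already carry a ".variant" suffix.
# \w+ cannot match a dot, so {{ name.variant }} is skipped.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def autofill_variants(query, run):
    """Rewrite each {{ name }} as {{ name.run }} (hypothetical helper)."""
    return _PLACEHOLDER.sub(
        lambda m: "{{ " + m.group(1) + "." + run + " }}", query
    )


query = "SELECT CustomerID, Amount FROM {{ transactions }}"
print(autofill_variants(query, "last_30_days"))
# SELECT CustomerID, Amount FROM {{ transactions.last_30_days }}
```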

Parameters:

Name    Type    Description                 Default
run     str     Name of a run to be set.    ''

Getting A Run

The current run can be retrieved by calling featureform.get_run(), which returns it as a string.

This is useful when calling serving functions from the same notebook in which the resources are applied.

Get the current run name.

Examples:

import featureform as ff

client = ff.Client()
f = client.features(("avg_transaction_amount", ff.get_run()), {"user": "123"})

Returns:

Name    Type    Description
run     str     The name of the current run.