Reusing Providers
Featureform's API lets you reuse definitions that have already been applied. You can retrieve previously applied providers and resources and continue building on them.
To reuse a provider, call the get method for its provider type, such as get_postgres.
Example
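import featureform as ff

# Retrieve the already-applied Postgres provider by name
postgres = ff.get_postgres("prod-instance")
# Register an additional table on the retrieved provider
postgres.register_table(
    name="transactions",
    variant="2022",
    table="2022_transactions",
)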
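Every getter documented below follows the same get_<provider> naming pattern, so a lookup by provider type can be sketched as a small dispatch helper. This is a minimal sketch, not part of Featureform: `get_provider`, `DOCUMENTED_PROVIDER_TYPES`, and the stand-in `fake_ff` client are hypothetical names introduced for illustration, and the stand-in exists only so the snippet runs without a Featureform deployment.

```python
from types import SimpleNamespace

# Provider types that have a getter documented in this section.
DOCUMENTED_PROVIDER_TYPES = (
    "bigquery", "kubernetes", "mongodb", "postgres", "clickhouse",
    "redis", "redshift", "s3", "snowflake", "spark",
)


def get_provider(client, provider_type, name):
    """Call client.get_<provider_type>(name) for a documented provider type.

    `client` is any object exposing the getters, for example the
    `featureform` module imported as `ff`. Hypothetical helper, not a
    Featureform API.
    """
    if provider_type not in DOCUMENTED_PROVIDER_TYPES:
        raise ValueError(f"no documented getter for provider type {provider_type!r}")
    getter = getattr(client, f"get_{provider_type}")
    return getter(name)


# Stand-in client (assumption) that mimics only the getter's call shape.
fake_ff = SimpleNamespace(get_postgres=lambda name: ("postgres", name))
result = get_provider(fake_ff, "postgres", "prod-instance")
```

With the real library, passing `ff` as `client` would be expected to behave like calling `ff.get_postgres("prod-instance")` directly.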
Available Providers
BigQuery
Get a BigQuery provider. The returned object can be used to register additional resources.
Examples:
bigquery = ff.get_bigquery("bigquery-quickstart")
transactions = bigquery.register_table(
    name="transactions",
    variant="kaggle",
    description="Fraud Dataset From Kaggle",
    table="Transactions",  # This is the table's name in BigQuery
)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of BigQuery provider to be retrieved | required |

Returns:

Name | Type | Description |
---|---|---|
bigquery | OfflineSQLProvider | Provider |

K8s Runner
Get a k8s provider. The returned object can be used to register additional resources.
Examples:
k8s = ff.get_kubernetes("k8s-azure-quickstart")
transactions = k8s.register_file(
    name="transactions",
    variant="kaggle",
    description="Fraud Dataset From Kaggle",
    path="path/to/blob",
)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of k8s provider to be retrieved | required |

Returns:

Name | Type | Description |
---|---|---|
k8s | OfflineK8sProvider | Provider |

MongoDB
Get a MongoDB provider. The returned object can be used to register additional resources.
Examples:
mongodb = ff.get_mongodb("mongodb-quickstart")
average_user_transaction.register_resources(
    entity=user,
    entity_column="user_id",
    inference_store=mongodb,
    features=[
        {"name": "avg_transactions", "variant": "quickstart", "column": "avg_transaction_amt", "type": "float32"},
    ],
)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of MongoDB provider to be retrieved | required |

Returns:

Name | Type | Description |
---|---|---|
mongodb | OnlineProvider | Provider |

Postgres
Get a Postgres provider. The returned object can be used to register additional resources.
Examples:
postgres = ff.get_postgres("postgres-quickstart")
transactions = postgres.register_table(
    name="transactions",
    variant="kaggle",
    description="Fraud Dataset From Kaggle",
    table="Transactions",  # This is the table's name in Postgres
)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of Postgres provider to be retrieved | required |

Returns:

Name | Type | Description |
---|---|---|
postgres | OfflineSQLProvider | Provider |

ClickHouse
Get a ClickHouse provider. The returned object can be used to register additional resources.
Examples:
clickhouse = ff.get_clickhouse("clickhouse-quickstart")
transactions = clickhouse.register_table(
    name="transactions",
    variant="kaggle",
    description="Fraud Dataset From Kaggle",
    table="Transactions",  # This is the table's name in ClickHouse
)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of ClickHouse provider to be retrieved | required |

Returns:

Name | Type | Description |
---|---|---|
clickhouse | OfflineSQLProvider | Provider |

Redis
Get a Redis provider. The returned object can be used to register additional resources.
Examples:
redis = ff.get_redis("redis-quickstart")
average_user_transaction.register_resources(
    entity=user,
    entity_column="user_id",
    inference_store=redis,
    features=[
        {"name": "avg_transactions", "variant": "quickstart", "column": "avg_transaction_amt", "type": "float32"},
    ],
)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of Redis provider to be retrieved | required |

Returns:

Name | Type | Description |
---|---|---|
redis | OnlineProvider | Provider |

Redshift
Get a Redshift provider. The returned object can be used to register additional resources.
Examples:
redshift = ff.get_redshift("redshift-quickstart")
transactions = redshift.register_table(
    name="transactions",
    variant="kaggle",
    description="Fraud Dataset From Kaggle",
    table="Transactions",  # This is the table's name in Redshift
)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of Redshift provider to be retrieved | required |

Returns:

Name | Type | Description |
---|---|---|
redshift | OfflineSQLProvider | Provider |

S3
Get an S3 provider. The returned object can be used with other providers such as Spark and Databricks.
Examples:
s3 = ff.get_s3("s3-quickstart")
spark = ff.register_spark(
    name="spark-emr-s3",
    description="A Spark deployment we created for the Featureform quickstart",
    team="featureform-team",
    executor=emr,  # assumes an executor object named emr is already defined
    filestore=s3,  # the S3 file store retrieved above
)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of S3 provider to be retrieved | required |

Returns:

Name | Type | Description |
---|---|---|
s3 | FileStore | Provider |

Snowflake
Get a Snowflake provider. The returned object can be used to register additional resources.
Examples:
snowflake = ff.get_snowflake("snowflake-quickstart")
transactions = snowflake.register_table(
    name="transactions",
    variant="kaggle",
    description="Fraud Dataset From Kaggle",
    table="Transactions",  # This is the table's name in Snowflake
)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of Snowflake provider to be retrieved | required |

Returns:

Name | Type | Description |
---|---|---|
snowflake | OfflineSQLProvider | Provider |

Spark
Get a Spark provider. The returned object can be used to register additional resources.
Examples:
spark = ff.get_spark("spark-quickstart")
transactions = spark.register_file(
    name="transactions",
    variant="kaggle",
    description="Fraud Dataset From Kaggle",
    file_path="s3://bucket/path/to/file/transactions.parquet",  # Path to the file
)
Parameters:

Name | Type | Description | Default |
---|---|---|---|
name | str | Name of Spark provider to be retrieved | required |

Returns:

Name | Type | Description |
---|---|---|
spark | OfflineSQLProvider | Provider |
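
The return type of each getter hints at how the result is used in the examples above: online providers are passed as inference_store, the file store is passed as filestore, and the remaining providers register tables or files. As a quick reference, the mapping can be sketched as plain data. The dictionary simply transcribes the Returns tables in this section; `RETURN_TYPES` and `getters_returning` are hypothetical names introduced here, not Featureform APIs.

```python
# Getter name -> documented return type, transcribed from the Returns tables.
RETURN_TYPES = {
    "get_bigquery": "OfflineSQLProvider",
    "get_kubernetes": "OfflineK8sProvider",
    "get_mongodb": "OnlineProvider",
    "get_postgres": "OfflineSQLProvider",
    "get_clickhouse": "OfflineSQLProvider",
    "get_redis": "OnlineProvider",
    "get_redshift": "OfflineSQLProvider",
    "get_s3": "FileStore",
    "get_snowflake": "OfflineSQLProvider",
    "get_spark": "OfflineSQLProvider",
}


def getters_returning(type_name):
    """Return the sorted getter names documented as returning `type_name`."""
    return sorted(g for g, t in RETURN_TYPES.items() if t == type_name)


# Getters whose result can serve as an inference store in the examples above.
online_getters = getters_returning("OnlineProvider")
```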