Primary Sources

Tables

Tables can be registered from any supported SQL database: BigQuery, Postgres, Redshift, ClickHouse, and Snowflake.

Register a SQL table as a primary data source.

Example

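# Look up a previously registered Postgres provider by name.
postgres = client.get_provider("my_postgres")
# Register an existing SQL table as a primary data source.
table = postgres.register_table(
    name="transactions",
    variant="july_2023",
    table="transactions_table",
)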

Parameters:

Name         Type                       Description                            Default
name         str                        Name of table to be registered         required
variant      str                        Name of variant to be registered       ''
table        str                        Name of SQL table                      required
owner        Union[str, UserRegistrar]  Owner                                  ''
description  str                        Description of table to be registered  ''

Returns:

Name    Type                   Description
source  ColumnSourceRegistrar  source
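
The example above passes only the required arguments and a variant. As a rough sketch, the optional owner and description parameters listed above could be supplied as follows. The wrapper function register_transactions and the owner and description values are hypothetical, and the provider argument is assumed to be the object returned by client.get_provider.

```python
def register_transactions(provider):
    """Register the transactions table, filling in the optional fields.

    `provider` is assumed to be a SQL provider obtained from
    client.get_provider(...); this wrapper is illustrative only.
    """
    return provider.register_table(
        name="transactions",
        variant="july_2023",
        table="transactions_table",
        owner="data_team",  # hypothetical owner name
        description="Card transactions for July 2023",  # hypothetical text
    )
```

Wrapping the call in a function keeps the registration arguments in one place, which may be convenient when the same table is registered from several scripts.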

Files

Files can be registered from the Local, Spark, and Kubernetes providers. Supported file types are: CSV and Parquet.
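
Because only CSV and Parquet files are supported, it can help to check a path before attempting to register it. The helper below is a minimal sketch and not part of the Featureform API. It assumes the file type can be inferred from the file extension (.csv or .parquet), which may not hold for every storage layout.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: the file type is inferred from the extension alone.
SUPPORTED_EXTENSIONS = {".csv", ".parquet"}


def is_supported_file(file_path: str) -> bool:
    """Return True if the path ends in a supported extension."""
    # urlparse separates the scheme and bucket from the object path,
    # so a query string or fragment does not confuse the suffix check.
    path = urlparse(file_path).path
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A caller might run this check first and raise a descriptive error for unsupported files instead of passing them on to the provider.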

Spark

Spark mode can register single files.

Register a Spark data source as a primary data source.

Examples

spark = client.get_provider("my_spark")
transactions = spark.register_file(
    name="transactions",
    variant="quickstart",
    description="A dataset of fraudulent transactions",
    file_path="s3://featureform-spark/featureform/transactions.parquet"
)

Parameters:

Name         Type                       Description                                 Default
name         str                        Name of table to be registered              required
variant      str                        Name of variant to be registered            ''
file_path    str                        The URI of the file. Must be the full path  required
owner        Union[str, UserRegistrar]  Owner                                       ''
description  str                        Description of table to be registered       ''

Returns:

Name    Type                   Description
source  ColumnSourceRegistrar  source
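
Since file_path must be the full path, a quick sanity check before registration can catch relative or incomplete URIs. The following is a heuristic sketch only. The helper looks_like_full_uri is hypothetical, and it assumes that "full path" means a URI with a scheme, a bucket or host, and an object path, as in the s3:// example above.

```python
from urllib.parse import urlparse


def looks_like_full_uri(file_path: str) -> bool:
    """Heuristic: the URI has a scheme, a bucket/host, and an object path."""
    parsed = urlparse(file_path)
    # Strip slashes so a bare "s3://bucket/" counts as having no object path.
    return bool(parsed.scheme and parsed.netloc and parsed.path.strip("/"))
```

This does not verify that the file exists or that the provider can read it. It only rejects strings that are clearly not complete URIs.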

Kubernetes Pandas Runner

Register a Kubernetes Runner data source as a primary data source.

Examples

k8s = client.get_provider("my_k8s")
transactions = k8s.register_file(
    name="transactions",
    variant="quickstart",
    description="A dataset of fraudulent transactions",
    # The parameter is named `path` here, per the parameter list below.
    path="s3://featureform-spark/featureform/transactions.parquet"
)

Parameters:

Name         Type                       Description                            Default
name         str                        Name of table to be registered         required
variant      str                        Name of variant to be registered       ''
path         str                        The path to blob store file            required
owner        Union[str, UserRegistrar]  Owner                                  ''
description  str                        Description of table to be registered  ''

Returns:

Name    Type                   Description
source  ColumnSourceRegistrar  source