Primary Sources
Tables
Tables can be registered from any supported SQL database. Supported databases are BigQuery, Postgres, Redshift, ClickHouse, and Snowflake.
Register a SQL table as a primary data source.
Example
postgres = client.get_provider("my_postgres")
table = postgres.register_table(
    name="transactions",
    variant="july_2023",
    table="transactions_table",
)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Name of table to be registered | required |
variant | str | Name of variant to be registered | '' |
table | str | Name of SQL table | required |
owner | Union[str, UserRegistrar] | Owner | '' |
description | str | Description of table to be registered | '' |
Returns:
Name | Type | Description |
---|---|---|
source | ColumnSourceRegistrar | source |
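When several SQL tables share one variant, the registration call can simply be repeated in a loop. The following is a minimal sketch, not part of the library: `register_tables` is a hypothetical helper, and it assumes only that the provider object exposes `register_table` with the `name`, `variant`, `table`, and `description` parameters listed above.

```python
from typing import Any, Dict


def register_tables(provider: Any, tables: Dict[str, str], variant: str = "",
                    description: str = "") -> Dict[str, Any]:
    """Register each SQL table and return the resulting sources keyed by name.

    `tables` maps the name to register to the underlying SQL table name.
    `provider` is assumed to expose register_table() as documented above.
    """
    sources = {}
    for name, sql_table in tables.items():
        # One register_table() call per entry, all sharing the same variant.
        sources[name] = provider.register_table(
            name=name,
            variant=variant,
            table=sql_table,
            description=description,
        )
    return sources
```

For instance, `register_tables(postgres, {"transactions": "transactions_table"}, variant="july_2023")` would issue a call equivalent to the single-table example.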
Files
Files can be registered from the Local, Spark, and Kubernetes providers. Supported file types are: CSV and Parquet.
Spark
Spark mode can register single files.
Register a Spark data source as a primary data source.
Examples
spark = client.get_provider("my_spark")
transactions = spark.register_file(
    name="transactions",
    variant="quickstart",
    description="A dataset of fraudulent transactions",
    file_path="s3://featureform-spark/featureform/transactions.parquet",
)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Name of table to be registered | required |
variant | str | Name of variant to be registered | '' |
file_path | str | The URI of the file. Must be the full path | required |
owner | Union[str, UserRegistrar] | Owner | '' |
description | str | Description of table to be registered | '' |
Returns:
Name | Type | Description |
---|---|---|
source | ColumnSourceRegistrar | source |
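Because `file_path` must be a full URI and only CSV and Parquet files are supported, it can help to check the value before registering. The sketch below is illustrative only: `check_file_uri` is a hypothetical helper, not a library function, and it assumes that "full path" means an object-store style URI with both a scheme and a bucket or host, as in the `s3://` example above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# File types listed as supported in this section.
SUPPORTED_EXTENSIONS = {".csv", ".parquet"}


def check_file_uri(uri: str) -> str:
    """Return `uri` unchanged if it looks like a full URI to a CSV or Parquet file."""
    parsed = urlparse(uri)
    # Assumption: a "full path" carries a scheme (s3) and a bucket/host part.
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"expected a full URI such as s3://bucket/key, got {uri!r}")
    extension = PurePosixPath(parsed.path).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported file type {extension!r} in {uri!r}")
    return uri
```

The checked value can then be passed straight through, for example `file_path=check_file_uri("s3://featureform-spark/featureform/transactions.parquet")`.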
Kubernetes Pandas Runner
Register a Kubernetes Runner data source as a primary data source.
Examples
k8s = client.get_provider("my_k8s")
transactions = k8s.register_file(
    name="transactions",
    variant="quickstart",
    description="A dataset of fraudulent transactions",
    path="s3://featureform-spark/featureform/transactions.parquet",
)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Name of table to be registered | required |
variant | str | Name of variant to be registered | '' |
path | str | The path to the file in the blob store | required |
owner | Union[str, UserRegistrar] | Owner | '' |
description | str | Description of table to be registered | '' |
Returns:
Name | Type | Description |
---|---|---|
source | ColumnSourceRegistrar | source |
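The parameter tables list the file location under different keywords: `file_path` for Spark and `path` for the Kubernetes Pandas Runner. A small helper can keep that difference in one place. This is a sketch under assumptions: `file_registration_kwargs` and the provider kind labels `"spark"` and `"k8s"` are hypothetical, and the keyword names are taken from the tables on this page, so they should be checked against the installed library version.

```python
from typing import Dict

# Keyword that carries the file location, per the parameter tables on this page.
# Assumption: these names match the installed library version.
_LOCATION_KEYWORD = {"spark": "file_path", "k8s": "path"}


def file_registration_kwargs(provider_kind: str, name: str, location: str,
                             variant: str = "", description: str = "") -> Dict[str, str]:
    """Build keyword arguments for register_file() on the given provider kind."""
    try:
        location_keyword = _LOCATION_KEYWORD[provider_kind]
    except KeyError:
        raise ValueError(f"unknown provider kind {provider_kind!r}") from None
    return {
        "name": name,
        "variant": variant,
        "description": description,
        location_keyword: location,
    }
```

The result is meant to be unpacked into the call, for example `k8s.register_file(**file_registration_kwargs("k8s", "transactions", uri, variant="quickstart"))`, where `uri` is the full location of the file.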