Providers

Credentials

Credentials are reusable objects: define them once in a definitions file and pass them to every provider you register in the same cloud.

Cloud Providers

AWS

Credentials for an AWS account.

Example

aws_credentials = ff.AWSCredentials(
    access_key="<AWS_ACCESS_KEY>",
    secret_key="<AWS_SECRET_KEY>"
)

Parameters:

Name Type Description Default
access_key str

AWS Access Key.

required
secret_key str

AWS Secret Key.

required
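
Hard-coding keys in a definitions file makes them easy to leak. The following is a minimal sketch, assuming the keys are exported under the standard AWS environment variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The helper name aws_credential_kwargs is hypothetical and not part of Featureform; only the parameter names access_key and secret_key come from the table above.

```python
import os


def aws_credential_kwargs(env=None):
    """Build keyword arguments for ff.AWSCredentials from environment variables.

    Hypothetical helper: it only prepares a dict and never imports Featureform.
    """
    env = os.environ if env is None else env
    # Map each constructor parameter to the environment variable that feeds it.
    mapping = {
        "access_key": "AWS_ACCESS_KEY_ID",
        "secret_key": "AWS_SECRET_ACCESS_KEY",
    }
    missing = [var for var in mapping.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {param: env[var] for param, var in mapping.items()}
```

The resulting dict can then be unpacked into the constructor, for example ff.AWSCredentials(**aws_credential_kwargs()).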

Google Cloud

Credentials for a Google Cloud project.

Example

gcp_credentials = ff.GCPCredentials(
    project_id="<project_id>",
    credentials_path="<path_to_credentials>"
)

Parameters:

Name Type Description Default
project_id str

The project id.

required
credentials_path str

The path to the credentials file.

required
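
Service account key files downloaded from Google Cloud are JSON documents that include a project_id field, so the two parameters can be derived from one path instead of being typed separately. This is a sketch under the assumption that credentials_path points at such a key file; gcp_credential_kwargs is a hypothetical helper, not a Featureform function.

```python
import json


def gcp_credential_kwargs(credentials_path):
    """Derive ff.GCPCredentials keyword arguments from a key file.

    Assumes a service account JSON key, which carries a "project_id" field.
    """
    with open(credentials_path, encoding="utf-8") as handle:
        key = json.load(handle)
    project_id = key.get("project_id")
    if not project_id:
        raise ValueError(f"no project_id found in {credentials_path}")
    return {"project_id": project_id, "credentials_path": credentials_path}
```

Reading the project ID from the file keeps it from drifting out of sync with the credentials it belongs to.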

Spark

Generic

Credentials for a generic Spark cluster.

Example

spark_credentials = ff.SparkCredentials(
    master="yarn",
    deploy_mode="cluster",
    python_version="3.7.12",
    core_site_path="core-site.xml",
    yarn_site_path="yarn-site.xml"
)

spark = ff.register_spark(
    name="spark",
    executor=spark_credentials,
    ...
)

Parameters:

Name Type Description Default
master str

The hostname of the Spark cluster. (The same that would be passed to spark-submit).

required
deploy_mode str

The deploy mode of the Spark cluster. (The same that would be passed to spark-submit).

required
python_version str

The Python version running on the cluster. Versions 3.7 through 3.11 are supported.

required
core_site_path str

The path to the core-site.xml file. (For Yarn clusters only)

''
yarn_site_path str

The path to the yarn-site.xml file. (For Yarn clusters only)

''
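
The supported range for python_version is easy to get wrong with plain string comparison, since "3.10" sorts before "3.7" as text. A minimal sketch of a numeric check, assuming the range is inclusive at both ends and that only the major and minor components matter; is_supported_python is a hypothetical helper name.

```python
def is_supported_python(version, minimum=(3, 7), maximum=(3, 11)):
    """Return True if a version string such as "3.7.12" is within the range."""
    parts = version.split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        # Not enough components, or a component that is not a number.
        return False
    # Tuples compare element by element, so (3, 10) sorts after (3, 7).
    return minimum <= (major, minor) <= maximum
```

Running this check before building ff.SparkCredentials gives an early, local failure instead of one on the cluster.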

Databricks

Credentials for a Databricks cluster.

Example

databricks = ff.DatabricksCredentials(
    username="<my_username>",
    password="<my_password>",
    host="<databricks_hostname>",
    token="<databricks_token>",
    cluster_id="<databricks_cluster>",
)

spark = ff.register_spark(
    name="spark",
    executor=databricks,
    ...
)

Parameters:

Name Type Description Default
username str

Username for a Databricks cluster.

required
password str

Password for a Databricks cluster.

required
host str

The hostname of a Databricks cluster.

required
token str

The token for a Databricks cluster.

required
cluster_id str

ID of an existing Databricks cluster.

required

EMR

Credentials for an EMR cluster.

Example

emr = ff.EMRCredentials(
    emr_cluster_id="<cluster_id>",
    emr_cluster_region="<cluster_region>",
    credentials=aws_credentials,  # an ff.AWSCredentials object
)

spark = ff.register_spark(
    name="spark",
    executor=emr,
    ...
)

Parameters:

Name Type Description Default
emr_cluster_id str

ID of an existing EMR cluster.

required
emr_cluster_region str

Region of an existing EMR cluster.

required
credentials AWSCredentials

Credentials for an AWS account with access to the cluster.

required

Provider Registration

This section provides reference material and examples for registering each provider that Featureform supports.

Azure Blob Store

Register an Azure Blob Store provider.

Azure Blob Storage can be used as the storage component for Spark or the Featureform Pandas Runner.

Examples:

blob = ff.register_blob_store(
    name="azure-quickstart",
    container_name="my_company_container",
    root_path="custom/path/in/container",
    account_name="<azure_account_name>",
    account_key="<azure_account_key>",
    description="An azure blob store provider to store offline and inference data"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of Azure blob store to be registered

required
container_name str

(Immutable) Azure container name

required
root_path str

(Immutable) A custom path in container to store data

required
account_name str

(Immutable) Azure account name

required
account_key str

(Mutable) Secret azure account key

required
description str

(Mutable) Description of Azure Blob provider to be registered

''
team str

(Mutable) The name of the team registering the filestore

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

None
properties dict

(Mutable) Optional grouping mechanism for resources

None

Returns:

Name Type Description
blob StorageProvider

Provider has all the functionality of OnlineProvider
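
The (Immutable) and (Mutable) markers in the parameter tables suggest which fields may change when a provider is registered again under the same name. The source does not spell out the exact behavior, so the following is only a sketch under that reading: it compares two plain dicts and reports immutable fields that differ. The helper changed_immutable_fields and the constant AZURE_BLOB_IMMUTABLE are hypothetical names; the field list is taken from the Azure Blob Store table above.

```python
# Fields marked (Immutable) in the Azure Blob Store parameter table.
AZURE_BLOB_IMMUTABLE = ("name", "container_name", "root_path", "account_name")


def changed_immutable_fields(registered, proposed, immutable=AZURE_BLOB_IMMUTABLE):
    """List immutable fields whose values differ between two definitions."""
    return [
        field
        for field in immutable
        if registered.get(field) != proposed.get(field)
    ]
```

An empty list means only mutable fields, such as account_key or description, were changed.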

BigQuery

Register a BigQuery provider.

Examples:

bigquery = ff.register_bigquery(
    name="bigquery-quickstart",
    description="A BigQuery deployment we created for the Featureform quickstart",
    project_id="quickstart-project",
    dataset_id="quickstart-dataset",
    credentials=ff.GCPCredentials(...)
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of BigQuery provider to be registered

required
project_id str

(Immutable) The Project name in GCP

required
dataset_id str

(Immutable) The Dataset name in GCP under the Project Id

required
credentials GCPCredentials

(Mutable) GCP credentials to access BigQuery

required
description str

(Mutable) Description of BigQuery provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
bigquery OfflineSQLProvider

Provider

Cassandra

Register a Cassandra provider.

Examples:

cassandra = ff.register_cassandra(
    name="cassandra",
    description="Example inference store",
    team="Featureform",
    host="0.0.0.0",
    port=9042,
    username="cassandra",
    password="cassandra",
    consistency="THREE",
    replication=3
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of Cassandra provider to be registered

required
host str

(Immutable) DNS name of Cassandra

required
port str

(Mutable) Port

required
username str

(Mutable) Username

required
password str

(Mutable) Password

required
consistency str

(Mutable) Consistency

'THREE'
replication int

(Mutable) Replication

3
description str

(Mutable) Description of Cassandra provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
cassandra OnlineProvider

Provider

DynamoDB

Register a DynamoDB provider.

Examples:

dynamodb = ff.register_dynamodb(
    name="dynamodb-quickstart",
    description="A Dynamodb deployment we created for the Featureform quickstart",
    credentials=aws_creds,
    region="us-east-1"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of DynamoDB provider to be registered

required
region str

(Immutable) Region to create dynamo tables

required
credentials AWSCredentials

(Mutable) AWS credentials with permissions to create DynamoDB tables

required
should_import_from_s3 bool

(Mutable) Determines whether feature materialization will occur via a direct import of data from S3 to new table (see docs for details)

False
description str

(Mutable) Description of DynamoDB provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
dynamodb OnlineProvider

Provider

Firestore

Register a Firestore provider.

Examples:

firestore = ff.register_firestore(
    name="firestore-quickstart",
    description="A Firestore deployment we created for the Featureform quickstart",
    project_id="quickstart-project",
    collection="quickstart-collection",
    credentials=ff.GCPCredentials(...)
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of Firestore provider to be registered

required
project_id str

(Immutable) The Project name in GCP

required
collection str

(Immutable) The Collection name in Firestore under the given project ID

required
credentials GCPCredentials

(Mutable) GCP credentials to access Firestore

required
description str

(Mutable) Description of Firestore provider to be registered

''
team str

(Mutable) The name of the team registering the provider

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
firestore OfflineSQLProvider

Provider

Google Cloud Storage

Register a GCS store provider.

Examples:

gcs = ff.register_gcs(
    name="gcs-quickstart",
    credentials=ff.GCPCredentials(...),
    bucket_name="bucket_name",
    root_path="featureform/path/",
    description="A GCS store provider for offline data"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of GCS store to be registered

required
bucket_name str

(Immutable) The bucket name

required
root_path str

(Immutable) Custom path to be used by featureform

required
credentials GCPCredentials

(Mutable) GCP credentials to access the bucket

required
description str

(Mutable) Description of GCS provider to be registered

''
team str

(Mutable) The name of the team registering the filestore

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
gcs FileStoreProvider

Provider has all the functionality of OfflineProvider

HDFS

Register an HDFS store provider.

This has the functionality of an offline store and can be used as a parameter to a k8s or Spark provider.

Examples:

hdfs = ff.register_hdfs(
    name="hdfs-quickstart",
    host="<host>",
    port="<port>",
    path="<path>",
    username="<username>",
    description="An HDFS store provider for offline data"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of HDFS store to be registered

required
host str

(Immutable) The hostname for HDFS

required
path str

(Immutable) A storage path within HDFS

''
port str

(Mutable) The IPC port for the Namenode for HDFS. (Typically 8020 or 9000)

required
username str

(Mutable) A Username for HDFS

''
description str

(Mutable) Description of HDFS provider to be registered

''
team str

(Mutable) The name of the team registering HDFS

''

Returns:

Name Type Description
hdfs FileStoreProvider

Provider

Kubernetes Pandas Runner

Register an offline store provider to run on Featureform's own k8s deployment.

Examples:

k8s = ff.register_k8s(
    name="k8s",
    store=azure_blob_store,  # a previously registered file store provider
    docker_image="my-repo/image:version"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of provider

required
store FileStoreProvider

(Mutable) Reference to registered file store provider

required
docker_image str

(Mutable) A custom docker image using the base image featureformcom/k8s_runner

''
description str

(Mutable) Description of the provider to be registered

''
team str

(Mutable) A string parameter describing the team that owns the provider

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

MongoDB

Register a MongoDB provider.

Examples:

mongodb = ff.register_mongodb(
    name="mongodb-quickstart",
    description="A MongoDB deployment",
    username="my_username",
    password="myPassword",
    database="featureform_database",
    host="my-mongodb.host.com",
    port="10225",
    throughput=10000
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of MongoDB provider to be registered

required
database str

(Immutable) MongoDB database

required
host str

(Immutable) MongoDB hostname

required
port str

(Immutable) MongoDB port

required
username str

(Mutable) MongoDB username

required
password str

(Mutable) MongoDB password

required
throughput int

(Mutable) The maximum RU limit for autoscaling in CosmosDB

1000
description str

(Mutable) Description of MongoDB provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
mongodb OnlineProvider

Provider

Pinecone

Register a Pinecone provider.

Examples:

pinecone = ff.register_pinecone(
    name="pinecone-quickstart",
    project_id="2g13ek7",
    environment="us-west4-gcp-free",
    api_key="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of Pinecone provider to be registered

required
project_id str

(Immutable) Pinecone project id

required
environment str

(Immutable) Pinecone environment

required
api_key str

(Mutable) Pinecone api key

required
description str

(Mutable) Description of Pinecone provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
pinecone OnlineProvider

Provider

Postgres

Register a Postgres provider.

Examples:

postgres = ff.register_postgres(
    name="postgres-quickstart",
    description="A Postgres deployment we created for the Featureform quickstart",
    host="quickstart-postgres",  # The internal dns name for postgres
    port="5432",
    user="postgres",
    password="password", #pragma: allowlist secret
    database="postgres"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of Postgres provider to be registered

required
host str

(Immutable) Hostname for Postgres

required
database str

(Immutable) Database

required
port str

(Mutable) Port

'5432'
user str

(Mutable) User

required
password str

(Mutable) Password

required
sslmode str

(Mutable) SSL mode

'disable'
description str

(Mutable) Description of Postgres provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
postgres OfflineSQLProvider

Provider
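
Connection details for Postgres are often handed around as a single URL. A minimal sketch that splits such a URL into the keyword arguments listed above, using only the Python standard library. The helper name postgres_kwargs is hypothetical; the port is kept as a string with a fallback of "5432" and sslmode falls back to "disable", mirroring the defaults in the table.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def postgres_kwargs(url):
    """Split a postgresql:// URL into register_postgres keyword arguments."""
    parts = urlsplit(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    query = parse_qs(parts.query)
    return {
        "host": parts.hostname,
        # The parameter table lists port as a string.
        "port": str(parts.port or 5432),
        # urlsplit leaves percent-escapes in place, so decode them here.
        "user": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),
        "database": parts.path.lstrip("/"),
        "sslmode": query.get("sslmode", ["disable"])[0],
    }
```

The result can be combined with the remaining arguments, for example ff.register_postgres(name="postgres-quickstart", **postgres_kwargs(url)).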

ClickHouse

Register a ClickHouse provider.

Examples:

clickhouse = ff.register_clickhouse(
    name="clickhouse-quickstart",
    description="A ClickHouse deployment we created for the Featureform quickstart",
    host="quickstart-clickhouse",  # The internal dns name for clickhouse
    port=9000,
    user="default",
    password="", #pragma: allowlist secret
    database="default"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of ClickHouse provider to be registered

required
host str

(Immutable) Hostname for ClickHouse

required
database str

(Immutable) ClickHouse database

required
port int

(Mutable) Port

9000
ssl bool

(Mutable) Enable SSL

False
user str

(Mutable) User

required
password str

(Mutable) ClickHouse password

required
description str

(Mutable) Description of ClickHouse provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
clickhouse OfflineSQLProvider

Provider

Redis

Register a Redis provider.

Examples:

redis = ff.register_redis(
    name="redis-quickstart",
    host="quickstart-redis",
    port=6379,
    password="password",
    description="A Redis deployment we created for the Featureform quickstart"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of Redis provider to be registered

required
host str

(Immutable) Hostname for Redis

required
db str

(Immutable) Redis database number

0
port int

(Mutable) Redis port

6379
password str

(Mutable) Redis password

''
description str

(Mutable) Description of Redis provider to be registered

''
team str

(Mutable) Name of team

''
tags Optional[List[str]]

(Mutable) Optional grouping mechanism for resources

None
properties Optional[dict]

(Mutable) Optional grouping mechanism for resources

None

Returns:

Name Type Description
redis OnlineProvider

Provider

Redshift

Register a Redshift provider.

Examples:

redshift = ff.register_redshift(
    name="redshift-quickstart",
    description="A Redshift deployment we created for the Featureform quickstart",
    host="quickstart-redshift",  # The internal dns name for redshift
    port="5432",
    user="redshift",
    password="password", #pragma: allowlist secret
    database="dev"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of Redshift provider to be registered

required
host str

(Immutable) Hostname for Redshift

required
database str

(Immutable) Redshift database

required
port str

(Mutable) Port

required
user str

(Mutable) User

required
password str

(Mutable) Redshift password

required
sslmode str

(Mutable) SSL mode

'disable'
description str

(Mutable) Description of Redshift provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
redshift OfflineSQLProvider

Provider

S3

Register an S3 store provider.

This has the functionality of an offline store and can be used as a parameter to a k8s or Spark provider.

Examples:

s3 = ff.register_s3(
    name="s3-quickstart",
    credentials=aws_creds,
    bucket_name="bucket_name",
    bucket_region="<bucket_region>",
    path="path/to/store/featureform_files/in/",
    description="An S3 store provider for offline data"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of S3 store to be registered

required
bucket_name str

(Immutable) AWS Bucket Name

required
bucket_region str

(Immutable) AWS region the bucket is located in

required
path str

(Immutable) The path used to store featureform files in

''
credentials AWSCredentials

(Mutable) AWS credentials to access the bucket

required
description str

(Mutable) Description of S3 provider to be registered

''
team str

(Mutable) The name of the team registering the filestore

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
s3 FileStoreProvider

Provider has all the functionality of OfflineProvider

Snowflake

Current

Register a Snowflake provider.

Examples:

snowflake = ff.register_snowflake(
    name="snowflake-quickstart",
    username="snowflake",
    password="password", #pragma: allowlist secret
    account="account",
    organization="organization",
    database="snowflake",
    schema="PUBLIC",
    description="A Snowflake deployment we created for the Featureform quickstart"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of Snowflake provider to be registered

required
account str

(Immutable) Account

required
organization str

(Immutable) Organization

required
database str

(Immutable) Database

required
schema str

(Immutable) Schema

'PUBLIC'
username str

(Mutable) Username

required
password str

(Mutable) Password

required
warehouse str

(Mutable) Specifies the virtual warehouse to use by default for queries, loading, etc.

''
role str

(Mutable) Specifies the role to use by default for accessing Snowflake objects in the client session

''
description str

(Mutable) Description of Snowflake provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
snowflake OfflineSQLProvider

Provider

Legacy

Register a Snowflake provider using legacy credentials.

Examples:

snowflake = ff.register_snowflake_legacy(
    name="snowflake-quickstart",
    username="snowflake",
    password="password",
    account_locator="account-locator",
    database="snowflake",
    schema="PUBLIC",
    description="A Snowflake deployment we created for the Featureform quickstart"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of Snowflake provider to be registered

required
account_locator str

(Immutable) Account Locator

required
schema str

(Immutable) Schema

'PUBLIC'
database str

(Immutable) Database

required
username str

(Mutable) Username

required
password str

(Mutable) Password

required
warehouse str

(Mutable) Specifies the virtual warehouse to use by default for queries, loading, etc.

''
role str

(Mutable) Specifies the role to use by default for accessing Snowflake objects in the client session

''
description str

(Mutable) Description of Snowflake provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
snowflake OfflineSQLProvider

Provider
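
The two Snowflake variants differ only in how the account is identified: the current form takes account and organization, while the legacy form takes account_locator. A minimal sketch of choosing between them from a plain config dict; snowflake_registration is a hypothetical helper that returns the name of the registration function to call rather than importing Featureform itself.

```python
def snowflake_registration(config):
    """Pick the registration function name that matches the account fields.

    Returns a (function_name, config) pair; the config is passed through
    unchanged.
    """
    has_current = "account" in config and "organization" in config
    has_legacy = "account_locator" in config
    if has_current and has_legacy:
        raise ValueError("pass account/organization or account_locator, not both")
    if has_current:
        return "register_snowflake", config
    if has_legacy:
        return "register_snowflake_legacy", config
    raise ValueError("need account and organization, or account_locator")
```

A caller could then dispatch with getattr(ff, function_name)(**config), so one definitions file can serve both account styles.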

Spark

Register a Spark provider that runs on a given executor.

Examples:

spark = ff.register_spark(
    name="spark-quickstart",
    description="A Spark deployment we created for the Featureform quickstart",
    team="featureform-team",
    executor=databricks,
    filestore=azure_blob_store
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of Spark provider to be registered

required
executor ExecutorCredentials

(Mutable) An Executor Provider used for the compute power

required
filestore FileStoreProvider

(Mutable) A FileStoreProvider used for storage of data

required
description str

(Mutable) Description of Spark provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
spark OfflineSparkProvider

Provider

Weaviate

Register a Weaviate provider.

Examples:

weaviate = ff.register_weaviate(
    name="weaviate-quickstart",
    url="https://<CLUSTER NAME>.weaviate.network",
    api_key="<API KEY>",
    description="A Weaviate project for using embeddings in Featureform"
)

Parameters:

Name Type Description Default
name str

(Immutable) Name of Weaviate provider to be registered

required
url str

(Immutable) Endpoint of Weaviate cluster, either in the cloud or via another deployment operation

required
api_key str

(Mutable) Weaviate api key

required
description str

(Mutable) Description of Weaviate provider to be registered

''
team str

(Mutable) Name of team

''
tags List[str]

(Mutable) Optional grouping mechanism for resources

[]
properties dict

(Mutable) Optional grouping mechanism for resources

{}

Returns:

Name Type Description
weaviate OnlineProvider

Provider