Azure plugins for Feast (FEAture STore)

Overview

Feast on Azure

This project provides resources to enable running a feast feature store on Azure.

Feast Azure Provider

The Feast Azure provider acts like a plugin that allows Feast users to connect to:

  • Azure SQL DB and/or Synapse SQL as the offline store
  • Azure cache for Redis as the online store
  • Azure blob storage for the feast registry store

📐 Architecture

The interoperable design of feast means that many Azure services can be used to produce and/or consume features (for example: Azure ML, Synapse, Azure Databricks, Azure functions, etc).

azure provider architecture

For more details, including setup please navigate to the provider directory in this repo

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

Comments
  • SQL error on creating table - notebook part1-load-data.ipynb

    SQL error on creating table - notebook part1-load-data.ipynb

    I have followed all steps mentioned in https://github.com/Azure/feast-azure/tree/main/provider/tutorial and had even changes roleDefinitionID parameter as per previous defect but still I'm getting same error. image

    opened by delta0123 6
  • SQL error on creating table - notebook part1-load-data.ipynb

    SQL error on creating table - notebook part1-load-data.ipynb

    I've followed the steps at https://github.com/Azure/feast-azure/tree/main/provider/tutorial but couldnt get table for customer data to load. Did a printf and made sure keyvault secret has a value. I am using the one button deploy method (not k8 cluster).

    Screenshot: Error: OperationalError: (pyodbc.OperationalError) ('HYT00', '[HYT00] [Microsoft][ODBC Driver 17 for SQL Server]Login timeout expired (0) (SQLDriverConnect)') (Background on this error at: https://sqlalche.me/e/14/e3q8)

    image

    documentation provider 
    opened by cbtham 6
  • Sample notebooks

    Sample notebooks

    Hey all, Glad to see this initiative. Earlier this year, I tried Feast 0.9 on GCP/GKE : https://paravatha.medium.com/feast-setup-your-own-ml-feature-store-on-kubernetes-5b3193c2b62c

    Unfortunately, I don't have access to create Redis. So, If you have any sample notebooks for just the offline store part, I'd like to give it a try.

    documentation 
    opened by paravatha 6
  • Adding local helm chart and changing install script

    Adding local helm chart and changing install script

    Adding helm chart and changing script, so feast can be installed from local helm chart that contains the updates rbac apis thats compatible with latest versions of Kubernetes.

    This PR solves the issue where the older rbac v1beta apis are not compatible with latest Kubernetes version. This fixes that problem but the chart needs to be installed from local feast-0.9.5-helmchart folder.

    opened by jainr 5
  • Error - Cannot update to latest feast 0.17

    Error - Cannot update to latest feast 0.17

    Updating to 0.17 with pip install --upgrade feast fails with error message:

    ERROR: feast-azure-provider 0.2.2 has requirement feast[redis]==0.15.1, but you'll have feast 0.17.0 which is incompatible.

    Tried modifying setup.py in my own fork with deployment template pointing to my own fork that has ==0.17 specified but still error out. own fork: https://github.com/cbtham/feast-azure/blob/main/provider/sdk/setup.py

    It seems like the source of feast-azure-provider(URL below) needs to be updated to support 0.17 because the current version 0.2.2 hardcoded ==0.15.1

    setup install_requires=[ "feast[redis]==0.15.1"

    https://pypi.org/project/feast-azure-provider/

    priority/p0 provider 
    opened by cbtham 4
  • feast cluster installation fails with rbac.authorization.k8s.io/v1beta API version (ClusterRole)

    feast cluster installation fails with rbac.authorization.k8s.io/v1beta API version (ClusterRole)

    Following the instructions listed in cluster installation on AKS cluster v1.22.4: https://github.com/Azure/feast-azure/tree/main/cluster/setup

    Initial installation steps pass, but api versions are not recognized in newer version of AKS (stable API version rbac.authorization.k8s.io/v1 should work instead of rbac.authorization.k8s.io/v1beta1).

    Any change to specify a newer version of helm chart in the installation shell script?

    Here is the issue which I encounter:

    sudo ./installfeast.sh (parameters passed here) INFO: Install feast error: failed to create secret secrets "feast-postgresql" already exists Error: unable to build kubernetes objects from release manifest: [unable to recognize "": no matches for kind "ClusterRole" in version "rbac.authorization.k8s.io/v1beta1", unable to recognize "": no matches for kind "ClusterRoleBinding" in version "rbac.authorization.k8s.io/v1beta1", unable to recognize "": no matches for kind "Role" in version "rbac.authorization.k8s.io/v1beta1", unable to recognize "": no matches for kind "RoleBinding" in version "rbac.authorization.k8s.io/v1beta1"]

    cluster 
    opened by andrijaperovic 4
  • Issue on loading feast - missing module

    Issue on loading feast - missing module

    In the tutorial - step 2 in notebook 2 "Registre..." fails with missing module. The error is triggered by the command "fs = FeatureStore("./feature_repo")" The Feast evrsion is 0.15.1 and the Azure feast provider is 0.2.1.

    Error "snippet" first lines and last line. ModuleNotFoundError Traceback (most recent call last) /anaconda/envs/azureml_py38/lib/python3.8/site-packages/feast/infra/online_stores/redis.py in 43 from redis import Redis ---> 44 from rediscluster import RedisCluster ......... FeastModuleImportError: Could not import RedisOnlineStoreConfig module 'feast.infra.online_stores.redis'

    bug 
    opened by jcordtz 2
  • Race Condition with Blobfuse Mount and CI SetupScript

    Race Condition with Blobfuse Mount and CI SetupScript

    When trying to clone the feast-azure repo, I'm running into a race condition - where the Users directory hasn't yet been mounted to the CI when the cd /home/azureuser/cloudfiles/code/Users portion of the script is run.

    Here's the error in the logs...

    Successfully installed pymssql-2.2.2
    /mnt/batch/tasks/startup/wd/scripts/creation.sh: line 1: cd: /home/azureuser/cloudfiles/code/Users: No such file or directory
    Cloning into 'feast-azure'...
    
    bug tutorial 
    opened by ezwiefel 2
  • Conda Environment Creation Fails for Deployment

    Conda Environment Creation Fails for Deployment

    Summary: When attempting to create the conda environment "feast-env" as found in the training and deployment notebook, the environment creation stalls out after an hour during "installing pip dependencies" and never completes.

    Details: First occurrence of this issue was during attempts to deploy a scoring container, which timed out after an hour. Looking at the logs, it simply stops at the line "Installing pip dependencies: ... working ...", with no further detail.

    Attempting to build the conda environment from the "model_service_env.yml" in a local terminal fail in the same spot. Manually removing the pip packages from the yml file and manually installing those packages in the conda environment does not fail. Also, removing just the "feast-azure-provider" package causes to build to succeed, and it can then be manually installed. However, since Azure ML does not allow the pip installs to be done in separate steps, this is not a viable workaround for the container deployment.

    bug 
    opened by tarockey 2
  • Incorrect Redis Cache Connection string translation

    Incorrect Redis Cache Connection string translation

    Hello , In using the feature online_store with the sample code form your provider example here.

    With the current redis client version for Python 3.8 ( using code version as of date ) 0.3.0 https://pypi.org/project/feast-azure-provider/?msclkid=6debb6a2b9b511ec876e161bfda67531

    Looks like the connection string does not exclude the default values for the Redis Cache connection string .elements Sample : myfsonlinesotre.redis.cache.windows.net:6380,password=zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz=,ssl=True,abortConnect=False

    If you provide the connection string as is , Python Redis client throw exception about the abortConnect element

    The only way the connection string work is by removing the ,abortConnect=False portion

    Working Connection String myfsonlinesotre.redis.cache.windows.net:6380,password=zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz=,ssl=True

    -George Gergues. Best of luck

    opened by Gergues 1
  • Error when deploying service  - ImportError: cannot import name 'json' from itsdangerous

    Error when deploying service - ImportError: cannot import name 'json' from itsdangerous

    I got following error when deploying service based on tutorial in:

    Error: ... ImportError: cannot import name 'json' from itsdangerous ...

    I have fixed that by pinning version of itsdangerous library in inference.dockerfile to 2.0.1 (newer version doesn't work):

    RUN pip install 'azureml-defaults==1.35.0' \
                    'feast-azure-provider==0.2.2' \
                    'scikit-learn==0.22.2.post1' \
    		'joblib===1.1.0'\
                    'itsdangerous==2.0.1'
    
    

    I followed this SO: https://stackoverflow.com/questions/71189819/python-docker-importerror-cannot-import-name-json-from-itsdangerous

    opened by michalmar 1
  • Error occurred while loading customers, drivers, orders data in /provider/tutorial/notebooks/part1-load-data.ipynb

    Error occurred while loading customers, drivers, orders data in /provider/tutorial/notebooks/part1-load-data.ipynb

    Hi, I'm new to feast and was trying to reproduce the tutorial in azure portal. Faced the below error.

    ProgrammingError: (pyodbc.ProgrammingError) ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Not able to validate external location because The remote server returned an error: (409) Conflict. (105215) (SQLExecDirectW)')
    [SQL: COPY INTO dbo.driver_hourly
    FROM 'https://feastonazuredatasamples.blob.core.windows.net/feastdatasamples/driver_hourly.csv'
    WITH
    (
    	FILE_TYPE = 'CSV'
    	,FIRSTROW = 2
    	,MAXERRORS = 0
    )
    ]
    

    Thanks in advance!

    opened by likhith00 0
  • Bump pyspark from 3.1.3 to 3.2.2 in /cluster/sdk/python

    Bump pyspark from 3.1.3 to 3.2.2 in /cluster/sdk/python

    Bumps pyspark from 3.1.3 to 3.2.2.

    Commits
    • 78a5825 Preparing Spark release v3.2.2-rc1
    • ba978b3 [SPARK-39099][BUILD] Add dependencies to Dockerfile for building Spark releases
    • 001d8b0 [SPARK-37554][BUILD] Add PyArrow, pandas and plotly to release Docker image d...
    • 9dd4c07 [SPARK-37730][PYTHON][FOLLOWUP] Split comments to comply pycodestyle check
    • bc54a3f [SPARK-37730][PYTHON] Replace use of MPLPlot._add_legend_handle with MPLPlot....
    • c5983c1 [SPARK-38018][SQL][3.2] Fix ColumnVectorUtils.populate to handle CalendarInte...
    • 32aff86 [SPARK-39447][SQL][3.2] Avoid AssertionError in AdaptiveSparkPlanExec.doExecu...
    • be891ad [SPARK-39551][SQL][3.2] Add AQE invalid plan check
    • 1c0bd4c [SPARK-39656][SQL][3.2] Fix wrong namespace in DescribeNamespaceExec
    • 3d084fe [SPARK-39677][SQL][DOCS][3.2] Fix args formatting of the regexp and like func...
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies python 
    opened by dependabot[bot] 0
  • How to load historical features directly into spark dataframe

    How to load historical features directly into spark dataframe

    We have been using Feast with a SQL db as an offline store and used JDBC to append features from a Spark dataframe directly to a table in SQL. Now for a recommender we'd like to build a historical dataset to train models on which will use a couple hundred-millions rows. Each is a customer with a timestamp. Feast's get_historical_features only takes a pandas dataframe as entity or a SQL query, so a workaround has been to store the entity df in the SQL db and use the query to fetch the features like so:

    sql_job = fs.get_historical_features(
        entity_df="SELECT * FROM test_entitity_df",
        features=[
            'feature_view1:feature1',
            'feature_view1:feature2',
        ]
    )
    

    However, the sql_job only has to_df, to_arrow, or persist functionality. My question is, how to load features efficiently into a Spark DF for training? One solution would be to store the result of the Feast query in a sql table and use JDBC again to load that into Spark, however, I cannot seem to get the persist functionality to work as the docs on SavedDatasetStorage is very limited. Please advice.

    Resources: https://docs.feast.dev/reference/offline-stores/overview#functionality https://docs.feast.dev/getting-started/concepts/dataset#creating-a-saved-dataset-from-historical-retrieval

    opened by VincentPe 0
  • Adding Terraform as Infa as a Code

    Adding Terraform as Infa as a Code

    Dear All,

    I am quite used to work with terraform and Azure and I am happy to help adding the terraform script for creation of the two current infra setup.

    Any suggestion or mention to add ?

    Thanks Rob

    opened by kamakay 0
  • Example deployment fails

    Example deployment fails

    train and deploy with Feast notebook fails (Part 3). When the following block is executed:

    import uuid
    from azureml.core.model import InferenceConfig
    from azureml.core.environment import Environment
    from azureml.core.model import Model
    
    # get the registered model
    model = Model(ws, "order_model")
    
    # create an inference config i.e. the scoring script and environment
    inference_config = InferenceConfig(
        entry_script="score.py", 
        environment=env, 
        source_directory="src"
    )
    
    # deploy the service
    service_name = "orders-service" + str(uuid.uuid4())[:4]
    service = Model.deploy(
        workspace=ws,
        name=service_name,
        models=[model],
        inference_config=inference_config,
        deployment_config=aciconfig,
    )
    
    service.wait_for_deployment(show_output=True)
    

    The error is: ImportError: cannot import name 'case_insensitive_dict' from 'azure.core.utils'

    Longer trace: Error:

    {
      "code": "AciDeploymentFailed",
      "statusCode": 400,
      "message": "Aci Deployment failed with exception: Error in entry script, ImportError: cannot import name 'case_insensitive_dict' from 'azure.core.utils' (/azureml-envs/feast/lib/python3.8/site-packages/azure/core/utils/__init__.py), please run print(service.get_logs()) to get details.",
      "details": [
        {
          "code": "CrashLoopBackOff",
          "message": "Error in entry script, ImportError: cannot import name 'case_insensitive_dict' from 'azure.core.utils' (/azureml-envs/feast/lib/python3.8/site-packages/azure/core/utils/__init__.py), please run print(service.get_logs()) to get details."
        }
      ]
    }
    
    opened by Ritaja 3
  • Feast azure to support Feast 0.19.x+ version

    Feast azure to support Feast 0.19.x+ version

    Hi, thankyou for creating this extension on feast, I am just curious if there will be any update on version on feast that will be used with feast-azure-provider, right now the default feast version its downloading is 0.18.x, but we are using 0.19+ version, so is there any way I can use feast azure provider with updated version of feast.

    opened by mdrijwan123 0
Releases(v0.3.0)
  • v0.3.0(Mar 15, 2022)

    What's Changed

    • Fixed tutorial issues by @samuel100 in https://github.com/Azure/feast-azure/pull/25
    • Bump commons-io from 2.5 to 2.7 in /cluster/sdk/spark/ingestion by @dependabot in https://github.com/Azure/feast-azure/pull/13
    • Bump pyyaml from 5.3.1 to 5.4 in /cluster/sdk/python by @dependabot in https://github.com/Azure/feast-azure/pull/14
    • Bump cryptography from 3.1 to 3.3.2 in /cluster/sdk/python by @dependabot in https://github.com/Azure/feast-azure/pull/15
    • Adding missing db-dtypes by @jainr in https://github.com/Azure/feast-azure/pull/44
    • Adding spark.hadoop.fs.azure properties needed for NativeAzureFileSy… by @andrijaperovic in https://github.com/Azure/feast-azure/pull/43
    • Adding local helm chart and changing install script by @jainr in https://github.com/Azure/feast-azure/pull/45
    • Jainr patch 1 by @jainr in https://github.com/Azure/feast-azure/pull/50
    • Support pandas df for point-in-time join and account key auth for blob by @bastrik in https://github.com/Azure/feast-azure/pull/53
    • Update Feast to support 0.18.x and later by @cbtham in https://github.com/Azure/feast-azure/pull/51

    New Contributors

    • @dependabot made their first contribution in https://github.com/Azure/feast-azure/pull/13
    • @jainr made their first contribution in https://github.com/Azure/feast-azure/pull/44
    • @andrijaperovic made their first contribution in https://github.com/Azure/feast-azure/pull/43
    • @cbtham made their first contribution in https://github.com/Azure/feast-azure/pull/51

    Full Changelog: https://github.com/Azure/feast-azure/compare/v0.2.2...v0.3.0

    Source code(tar.gz)
    Source code(zip)
  • v0.2.2(Nov 16, 2021)

  • v0.2.1(Nov 14, 2021)

  • v0.2.0(Nov 14, 2021)

  • v0.1.0(Oct 14, 2021)

    What's Changed

    • mssql server offline store by @DvirDukhan in https://github.com/Azure/feast-azure/pull/1
    • Add CI/CD (build/publish, test) pipelines by @bastrik in https://github.com/Azure/feast-azure/pull/2

    New Contributors

    • @DvirDukhan made their first contribution in https://github.com/Azure/feast-azure/pull/1
    • @bastrik made their first contribution in https://github.com/Azure/feast-azure/pull/2

    Full Changelog: https://github.com/Azure/feast-azure/commits/v0.1.0

    Source code(tar.gz)
    Source code(zip)
Owner
Microsoft Azure
APIs, SDKs and open source projects from Microsoft Azure
Microsoft Azure
Tools and Docker images to make a fast Ruby on Rails development environment

Tools and Docker images to make a fast Ruby on Rails development environment. With the production templates, moving from development to production will be seamless.

1 Nov 13, 2022
pyinfra automates infrastructure super fast at massive scale. It can be used for ad-hoc command execution, service deployment, configuration management and more.

pyinfra automates/provisions/manages/deploys infrastructure super fast at massive scale. It can be used for ad-hoc command execution, service deployme

Nick Barrett 2.1k Dec 29, 2022
Daemon to ban hosts that cause multiple authentication errors

__ _ _ ___ _ / _|__ _(_) |_ ) |__ __ _ _ _ | _/ _` | | |/ /| '_ \/ _` | ' \

Fail2Ban 7.8k Jan 09, 2023
A lobby boy will create a VPS server when you need one, and destroy it after using it.

Lobbyboy What is a lobby boy? A lobby boy is completely invisible, yet always in sight. A lobby boy remembers what people hate. A lobby boy anticipate

226 Dec 29, 2022
This project shows how to serve an TF based image classification model as a web service with TFServing, Docker, and Kubernetes(GKE).

Deploying ML models with CPU based TFServing, Docker, and Kubernetes By: Chansung Park and Sayak Paul This project shows how to serve a TensorFlow ima

Chansung Park 104 Dec 28, 2022
🐳 RAUDI: Regularly and Automatically Updated Docker Images

🐳 RAUDI: Regularly and Automatically Updated Docker Images RAUDI (Regularly and Automatically Updated Docker Images) automatically generates and keep

SecSI 534 Dec 29, 2022
Flexible and scalable monitoring framework

Presentation of the Shinken project Welcome to the Shinken project. Shinken is a modern, Nagios compatible monitoring framework, written in Python. It

Gabès Jean 1.1k Dec 18, 2022
This projects provides the documentation and the automation(code) for the Oracle EMEA WLA COA Demo UseCase.

COA DevOps Training UseCase This projects provides the documentation and the automation(code) for the Oracle EMEA WLA COA Demo UseCase. Demo environme

Cosmin Tudor 1 Jan 28, 2022
Python job scheduling for humans.

schedule Python job scheduling for humans. Run Python functions (or any other callable) periodically using a friendly syntax. A simple to use API for

Dan Bader 10.4k Jan 02, 2023
A simple python application for running a CI pipeline locally This app currently supports GitLab CI scripts

🏃 Simple Local CI Runner 🏃 A simple python application for running a CI pipeline locally This app currently supports GitLab CI scripts ⚙️ Setup Inst

Tom Stowe 0 Jan 11, 2022
A job launching library for docker, EC2, GCP, etc.

doodad A library for packaging dependencies and launching scripts (with a focus on python) on different platforms using Docker. Currently supported pl

Justin Fu 55 Aug 27, 2022
Nagios status monitor for your desktop.

Nagstamon Nagstamon is a status monitor for the desktop. It connects to multiple Nagios, Icinga, Opsview, Centreon, Op5 Monitor/Ninja, Checkmk Multisi

Henri Wahl 361 Jan 05, 2023
Automate SSH in python easily!

RedExpect RedExpect makes automating remote machines over SSH very easy to do and is very fast in doing exactly what you ask of it. Based on ssh2-pyth

Red_M 19 Dec 17, 2022
Oracle Cloud Infrastructure Object Storage fsspec implementation

Oracle Cloud Infrastructure Object Storage fsspec implementation The Oracle Cloud Infrastructure Object Storage service is an internet-scale, high-per

Oracle 9 Dec 18, 2022
Helperpod - A CLI tool to run a Kubernetes utility pod with pre-installed tools that can be used for debugging/testing purposes inside a Kubernetes cluster

Helperpod is a CLI tool to run a Kubernetes utility pod with pre-installed tools that can be used for debugging/testing purposes inside a Kubernetes cluster.

Atakan Tatlı 2 Feb 05, 2022
This Docker container is build to run on a server an provide an easy to use interface for every student to vote for their councilors

This Docker container is build to run on a server and provide an easy to use interface for every student to vote for their councilors.

Robin Adelwarth 7 Nov 23, 2022
DC/OS - The Datacenter Operating System

DC/OS - The Datacenter Operating System The easiest way to run microservices, big data, and containers in production. What is DC/OS? Like traditional

DC/OS 2.3k Jan 06, 2023
Dockerized service to backup all running database containers

Docker Database Backup Dockerized service to automatically backup all of your database containers. Docker Image Tags: docker.io/jandi/database-backup

Jan Dittrich 16 Dec 31, 2022