Add REST API support for feature registry #99

Merged
merged 20 commits into from
Apr 16, 2022
Changes from 9 commits
108 changes: 108 additions & 0 deletions docs/how-to-guides/deploy-feathr-api-as-webapp.md
@@ -0,0 +1,108 @@
# Feathr API
The API currently supports the following functionality:

1. Get Feature by Qualified Name
2. Get Feature by GUID
3. Get List of Features
4. Get Lineage for a Feature


## Build and run locally
### Install
__NOTE:__ You can run the following command in your local Python environment or on an Azure virtual machine.
Install the dependencies from the requirements file:
```bash
$ pip install -r requirements.txt
```

### Run
This command starts the uvicorn server locally on port 8080. The `--reload` flag restarts the server whenever your code changes.
```bash
uvicorn api:app --port 8080 --reload
```
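
In `uvicorn api:app`, `api` is the Python module (`api.py`) and `app` is the ASGI application object inside it. The sketch below is a dependency-free stand-in for such an object, not the module added in this PR (which presumably uses a web framework; the `/docs` page mentioned later suggests FastAPI). The `/health` route, the response bodies, and the `call_app` helper are hypothetical and exist only for illustration.

```python
import asyncio
import json


async def app(scope, receive, send):
    """Minimal ASGI application: answers GET /health, 404 otherwise.

    The /health route is a hypothetical example, not a route from the PR.
    """
    if scope["type"] != "http":
        return  # this sketch only handles plain HTTP requests
    if scope["method"] == "GET" and scope["path"] == "/health":
        status, payload = 200, {"status": "ok"}
    else:
        status, payload = 404, {"detail": "not found"}
    body = json.dumps(payload).encode("utf-8")
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [
            (b"content-type", b"application/json"),
            (b"content-length", str(len(body)).encode("ascii")),
        ],
    })
    await send({"type": "http.response.body", "body": body})


def call_app(method, path):
    """Drive the ASGI app in-process and return (status, decoded JSON body)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": method, "path": path, "headers": []}
    asyncio.run(app(scope, receive, send))
    return sent[0]["status"], json.loads(sent[1]["body"])
```

Saved as `api.py`, an object like this could be served with the command above; `call_app` exercises it without starting a server.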

## Build and deploy on Azure
Follow these steps to build the API as a docker container, push it to Azure Container Registry, and deploy it as a webapp. The instructions are written for Mac/Linux but should also work on Windows. If you lack the required privileges, you may need to use `sudo` (Mac/Linux) or run docker as administrator (Windows).

1. Install the Azure CLI by following the instructions [here](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest).

1. Create an Azure Container Registry. First create the resource group:
```bash
az group create --name <your_rg_name> --location <location example:westus>
```

Then create the container registry
```bash
az acr create --resource-group <your_rg_name> --name <registry-name> --sku Basic
```

1. Log in to your Azure Container Registry (ACR) account.
```bash
$ az acr login --name <registry-name>
```

1. Clone the repository and navigate to the api folder
```bash
$ git clone git@github.com:linkedin/feathr.git

$ cd feathr/feathr_project/feathr/api

```

1. Build the docker container locally. Docker must be installed and running on your machine; to set it up, follow the instructions [here](https://docs.docker.com/get-started/).
__Note: <your_username>/image_name is not a mandatory format for the image name. It is a useful convention that avoids tagging your image again when you push it to a registry. The name can be anything you want in the format below.__

```bash
$ docker build -t feathr/api .
```

1. Run the `docker images` command to see your newly created image
```bash
$ docker images

REPOSITORY   TAG      IMAGE ID       CREATED         SIZE
feathr/api   latest   a647ea749b9b   5 minutes ago   529MB
```

1. Before you can push an image to your registry, you must tag it with the fully qualified name of your ACR login server. The login server name has the format <registry-name>.azurecr.io (all lowercase), for example, mycontainerregistry007.azurecr.io. Tag the image (the commands below use `feathracr` as the example registry name; substitute your own):
```bash
$ docker tag feathr/api:latest feathracr.azurecr.io/feathr/api:latest
```
1. Push the image to the registry
```bash
$ docker push feathracr.azurecr.io/feathr/api:latest
```
1. List the images from your registry to see your recently pushed image
```
az acr repository list --name feathracr --output table
```
Output:
```
Result
----------
feathr/api
```
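
The tagging rule above (`<registry-name>.azurecr.io/<repository>:<tag>`, all lowercase) can be sketched in a few lines of Python. The helper names `acr_login_server`, `acr_image_ref`, and `push_commands` are hypothetical, and the name check assumes the documented ACR constraint of 5 to 50 alphanumeric characters. The commands are only assembled as argument lists, not executed.

```python
import re


def acr_login_server(registry_name: str) -> str:
    """Return the login server for a registry, e.g. 'feathracr.azurecr.io'."""
    name = registry_name.lower()  # login server names are all lowercase
    # Assumption: ACR registry names are 5-50 alphanumeric characters.
    if not re.fullmatch(r"[a-z0-9]{5,50}", name):
        raise ValueError(f"invalid registry name: {registry_name!r}")
    return f"{name}.azurecr.io"


def acr_image_ref(registry_name: str, repository: str = "feathr/api",
                  tag: str = "latest") -> str:
    """Return the fully qualified image reference used for tag and push."""
    return f"{acr_login_server(registry_name)}/{repository}:{tag}"


def push_commands(registry_name: str, local_image: str = "feathr/api:latest"):
    """Return the docker tag/push commands from the steps above as argv lists."""
    repository, _, tag = local_image.partition(":")
    remote = acr_image_ref(registry_name, repository, tag or "latest")
    return [
        ["docker", "tag", local_image, remote],
        ["docker", "push", remote],
    ]
```

For the example registry used in this guide, `push_commands("feathracr")` yields the same two commands shown in the tag and push steps; each list could be passed to `subprocess.run` if you want to script the flow.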

## Deploy image to Azure WebApp for Containers

1. Go to the [Azure portal](https://portal.azure.com) and search for your container registry.
1. Select repositories from the left pane and click the latest tag. Click the three dots on the right side of the tag and select the __Deploy to WebApp__ option. If the __Deploy to WebApp__ option is greyed out, enable the Admin User on the registry by updating it.

![Container Image 1](../images/feathr_api_image_latest.png)

![Container Image 2](../images/feathr_api_image_latest_options.png)


1. Provide a name for the deployed webapp, along with the subscription to deploy the app into, the resource group, and the App Service plan.

![Container Image](../images/feathr_api_image_latest_deployment.png)

1. When you get the notification that your app has been deployed successfully, click the __Go to Resource__ button.


1. On the app overview page, take the URL of the deployed app (listed under URL) and open it with `/docs` appended (https://<app_name>.azurewebsites.net/docs). You should see the API documentation.

![API docs](../images/api-docs.png)

Congratulations, you have successfully deployed the Feathr API.
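
As a quick check after deployment, the documentation URL from the last step can be probed from Python. This is a minimal sketch using only the standard library; `docs_url` and `is_up` are hypothetical helper names, and `my-feathr-api` in the usage note is a placeholder app name.

```python
import urllib.request


def docs_url(app_name: str) -> str:
    """Build the documentation URL of the deployed webapp."""
    return f"https://{app_name}.azurewebsites.net/docs"


def is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # OSError covers URLError/HTTPError and timeouts;
        # ValueError covers malformed URLs.
        return False
```

Calling `is_up(docs_url("my-feathr-api"))` with your own app name should return `True` once the container has started, which can take a few minutes after the first deployment.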

Binary file added docs/images/api-docs.png
Binary file added docs/images/feathr_api_image_latest.png
Binary file added docs/images/feathr_api_image_latest_options.png
113 changes: 91 additions & 22 deletions feathr_project/feathr/_feature_registry.py
@@ -8,15 +8,12 @@
 from jinja2 import Template
 from loguru import logger
 from pyapacheatlas.auth import ServicePrincipalAuthentication
-from pyapacheatlas.core import (AtlasEntity, AtlasProcess, PurviewClient,
-                                TypeCategory)
+from pyapacheatlas.core import (AtlasEntity, AtlasProcess, PurviewClient)
 from pyapacheatlas.core.typedef import (AtlasAttributeDef, EntityTypeDef,
                                         RelationshipTypeDef)
 from pyapacheatlas.core.util import GuidTracker
-from pyexpat import features
 from pyhocon import ConfigFactory

-from feathr._envvariableutil import _EnvVaraibleUtil
 from feathr._file_utils import write_to_file
 from feathr.anchor import FeatureAnchor
 from feathr.feature import Feature
@@ -27,31 +24,43 @@


 class _FeatureRegistry():

-    def __init__(self, config_path):
+    def __init__(self, config_path: Optional[str] ):
         """
         Initializes the feature registry, doing the following:
         - Use an Azure Service Principal to communicate with Azure Purview
         - Initialize an Azure Purview Client
         - Initialize the GUID tracker, project name, etc.
         """
-        envutils = _EnvVaraibleUtil(config_path)
-        self.project_name = envutils.get_environment_variable_with_default('project_config', 'project_name')
-        self.FEATURE_REGISTRY_DELIMITER = envutils.get_environment_variable_with_default('feature_registry', 'purview', 'delimiter')
-        self.azure_purview_name = envutils.get_environment_variable_with_default('feature_registry', 'purview', 'purview_name')

-        self.oauth = ServicePrincipalAuthentication(
-            tenant_id=_EnvVaraibleUtil.get_environment_variable(
-                "AZURE_TENANT_ID"),
-            client_id=_EnvVaraibleUtil.get_environment_variable(
-                "AZURE_CLIENT_ID"),
-            client_secret=_EnvVaraibleUtil.get_environment_variable(
-                "AZURE_CLIENT_SECRET")
-        )
+        # Read from the config file
+        if config_path is not None:
+            from feathr._envvariableutil import _EnvVaraibleUtil
+            envutils = _EnvVaraibleUtil(config_path)
+            self.project_name = envutils.get_environment_variable_with_default('project_config', 'project_name')
+            self.FEATURE_REGISTRY_DELIMITER = envutils.get_environment_variable_with_default('feature_registry', 'purview', 'delimiter')
+            self.azure_purview_name = envutils.get_environment_variable_with_default('feature_registry', 'purview', 'purview_name')
+            self.oauth = ServicePrincipalAuthentication(
+                tenant_id=_EnvVaraibleUtil.get_environment_variable(
+                    "AZURE_TENANT_ID"),
+                client_id=_EnvVaraibleUtil.get_environment_variable(
+                    "AZURE_CLIENT_ID"),
+                client_secret=_EnvVaraibleUtil.get_environment_variable(
+                    "AZURE_CLIENT_SECRET")
+            )
+        else:
+            self.project_name = os.getenv('project_name')
+            self.FEATURE_REGISTRY_DELIMITER = os.getenv('delimiter', '__')
+            self.azure_purview_name = os.getenv('purview_name')
+            self.oauth = ServicePrincipalAuthentication(
+                tenant_id=os.getenv("AZURE_TENANT_ID"),
+                client_id=os.getenv("AZURE_CLIENT_ID"),
+                client_secret=os.getenv("AZURE_CLIENT_SECRET")
+            )

         self.purview_client = PurviewClient(
-            account_name=self.azure_purview_name,
-            authentication=self.oauth
-        )
+                account_name=self.azure_purview_name,
+                authentication=self.oauth
+            )
         self.guid = GuidTracker(starting=-1000)
         self.entity_batch_queue = []
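
The `else` branch in this hunk reads every setting from environment variables when no config file is given, with `'__'` as the default delimiter. A minimal, standalone sketch of that fallback is shown below. The `RegistrySettings` class and the `settings_from_env` and `missing_settings` helpers are hypothetical names introduced for illustration; only the variable names and the `'__'` default come from the diff.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class RegistrySettings:
    """Values the constructor needs when no config file is supplied."""
    project_name: Optional[str]
    delimiter: str
    purview_name: Optional[str]
    tenant_id: Optional[str]
    client_id: Optional[str]
    client_secret: Optional[str]


def settings_from_env(env: Mapping[str, str] = os.environ) -> RegistrySettings:
    """Read the settings using the same variable names as the diff."""
    return RegistrySettings(
        project_name=env.get("project_name"),
        delimiter=env.get("delimiter", "__"),  # same default as the diff
        purview_name=env.get("purview_name"),
        tenant_id=env.get("AZURE_TENANT_ID"),
        client_id=env.get("AZURE_CLIENT_ID"),
        client_secret=env.get("AZURE_CLIENT_SECRET"),
    )


def missing_settings(settings: RegistrySettings) -> List[str]:
    """List the fields that were not set in the environment."""
    return [name for name, value in vars(settings).items() if value is None]
```

Because `os.getenv` returns `None` for an unset variable, a check along the lines of `missing_settings` is one way to surface a misconfigured deployment before the Purview client is created.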

Expand Down Expand Up @@ -739,6 +748,7 @@ def register_features(self, workspace_path: Optional[Path] = None):
logger.info(
"Finished registering features. See {} to access the Purview web interface", webinterface_path)

@classmethod
def get_registry_client(self):
"""
Return a client object and users can operate more on it (like doing search)
@@ -807,3 +817,62 @@ def get_features_from_registry(self, project_name: str, workspace_path: str):
         with open(feature_gen_file, "w") as online_config:
             online_config.write(feature_gen_conf_content)
         logger.info("Writing online configuration from feathr registry to {}", feature_gen_file)
+
+    def get_feature_by_fqdn_type(self, qualifiedName, typeName):
+        """
+        Get a single feature by its QualifiedName and Type
+        Returns the feature else throws an AtlasException with 400 error code
+        """
+        response = self.purview_client.get_entity(qualifiedName=qualifiedName, typeName=typeName)
+        entities = response.get('entities')
+        for entity in entities:
+            if entity.get('typeName') == typeName and entity.get('attributes').get('qualifiedName') == qualifiedName:
+                return entity
+
+    def get_feature_by_fqdn(self, qualifiedName):
+        """
+        Get feature by qualifiedName
+        Returns the feature else throws an AtlasException with 400 error code
+        """
+        guid = self.get_feature_guid(qualifiedName)
+        return self.get_feature_by_guid(guid)
+
+    def get_feature_by_guid(self, guid):
+        """
+        Get a single feature by its GUID
+        Returns the feature else throws an AtlasException with 400 error code
+        """
+        response = self.purview_client.get_single_entity(guid=guid)
+        return response
+
+    def get_feature_lineage(self, guid):
+        """
+        Get a feature's lineage by its GUID
+        Returns the lineage else throws an AtlasException with 400 error code
+        """
+        return self.purview_client.get_entity_lineage(guid=guid)
+
+    def get_feature_guid(self, qualifiedName):
+        """
+        Get guid of a feature given its qualifiedName
+        """
+        search_term = "qualifiedName:{0}".format(qualifiedName)
+        entities = self.purview_client.discovery.search_entities(search_term)
+        for entity in entities:
+            if entity.get('qualifiedName') == qualifiedName:
+                return entity.get('id')
+
+    def search_features(self, searchTerm):
+        """
+        Search the registry for the given query term
+        For a ride hailing company a few examples could be "taxi", "passenger", "fare" etc.
+        It's a keyword search on the registry metadata
+        """
+        search_term = "qualifiedName:{0}".format(searchTerm)
+        entities = self.purview_client.discovery.search_entities(search_term)
+        return entities
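
The qualified-name lookup added in this hunk works in two steps: search for candidates, keep the one whose `qualifiedName` matches exactly, then fetch the entity by its GUID. The sketch below replays that flow against an in-memory stand-in so it can run without Purview. `InMemoryCatalog`, `find_guid`, `get_by_qualified_name`, and the sample feature names are hypothetical; the stand-in only assumes, as the diff does, that search results are dicts with `id` and `qualifiedName` keys. Unlike the diff, it raises an explicit `LookupError` when no exact match exists instead of passing `None` on as a GUID.

```python
from typing import Dict, Iterable, Iterator, Optional


class InMemoryCatalog:
    """Hypothetical stand-in for the Purview client used in the diff."""

    def __init__(self, entities: Iterable[Dict[str, str]]):
        self._by_guid = {entity["id"]: entity for entity in entities}

    def search_entities(self, query: str) -> Iterator[Dict[str, str]]:
        # Imitates a keyword search: the term after "qualifiedName:" may
        # match several entities, not only the exact one.
        term = query.split(":", 1)[1]
        return (e for e in self._by_guid.values() if term in e["qualifiedName"])

    def get_single_entity(self, guid: str) -> Dict[str, str]:
        return self._by_guid[guid]


def find_guid(catalog: InMemoryCatalog, qualified_name: str) -> Optional[str]:
    """Return the GUID of the exact match, or None if there is none."""
    query = "qualifiedName:{0}".format(qualified_name)
    for entity in catalog.search_entities(query):
        if entity.get("qualifiedName") == qualified_name:
            return entity.get("id")
    return None


def get_by_qualified_name(catalog: InMemoryCatalog, qualified_name: str):
    """Resolve qualifiedName -> GUID -> entity, failing loudly on a miss."""
    guid = find_guid(catalog, qualified_name)
    if guid is None:
        raise LookupError("no feature named {0!r}".format(qualified_name))
    return catalog.get_single_entity(guid)
```

The exact-match filter matters because a search for one name can also return entities whose names merely contain it, such as a `_v2` variant of the same feature.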




