Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Surely delete temporary credentials files of Google Cloud #10203

Closed
yu-iskw opened this issue Feb 13, 2023 · 4 comments · Fixed by #12590 or #14536
Closed

Surely delete temporary credentials files of Google Cloud #10203

yu-iskw opened this issue Feb 13, 2023 · 4 comments · Fixed by #12590 or #14536
Assignees
Labels
enhancement New feature or request Ingestion P0 Highest priority

Comments

@yu-iskw
Copy link
Contributor

yu-iskw commented Feb 13, 2023

Is your feature request related to a problem? Please describe.
Take the BigQuery connector, for instance. The set_google_credentials function can be called multiple times during ingesting BigQuery metadata, if we pass raw service account key information to a database source. In reality, the close method of BigQuerySource tries to delete the temporary credentials file. However, temp_credentials property isn’t set anywhere, if I am correct. Aside from that, the get_connection function creates a temporary credentials file every time. I am doubting created credentials files remains. If I am correct, I would like to know how to completely delete created temporary files.

Specifically, the issue happens with the configuration.

  1. We pass raw credentials to the BigQuery connector,
  2. We pass multiple project IDs

In my opinion, we can have the similar issue on other source connectors, as we don't have an abstract feature or function to surely delete used temporary credentials file. To avoid the situation, we can also improve functions to deal with connections.

Describe the solution you'd like
There are a couple of ideas to surely delete temporary credentials files:

  • Plan A: We authenticat only one time, when initializing an object of BigQuerySource.
  • Plan B: We improve functions around connections like get_connection by authenticating and removing credentials on the connection side. By doing so, we may have to redesign the way to deal with connections.

Describe alternatives you've considered
N/A

Additional context
N/A

@yu-iskw yu-iskw added the enhancement New feature or request label Feb 13, 2023
@yu-iskw yu-iskw changed the title Surely delete credentials of Google Cloud Surely delete temporary credentials files of Google Cloud Feb 14, 2023
@yu-iskw
Copy link
Contributor Author

yu-iskw commented Feb 22, 2023

I havne't completely design how to enhance the functions around Google Cloud authorization. But, the official python SDK offers a couple of functions to safely refresh or revoke credentials. We can probably take advantage of them.

from google.oauth2.credentials import Credentials
from google.auth.transport.requests import Request
from google.oauth2 import service_account

# Create a credentials object from a service account key file
creds = service_account.Credentials.from_service_account_file('path/to/keyfile.json')

# Make a request to revoke the credentials
creds.revoke(Request())

@pmbrull
Copy link
Collaborator

pmbrull commented Jul 14, 2023

let's clean this up in the close call for the workflow

@ayush-shah ayush-shah moved this to 🔖 Ready in Release 1.1.1 Jul 21, 2023
@ayush-shah ayush-shah moved this from 🔖 Ready to 🏗 In progress in Release 1.1.1 Jul 26, 2023
@vanshika18
Copy link
Contributor

I will take care of this issue.

@ayush-shah ayush-shah assigned vanshika18 and unassigned ayush-shah Jul 26, 2023
@ulixius9 ulixius9 moved this from 🏗 In progress to 👀 In review in Release 1.1.1 Jul 26, 2023
@github-project-automation github-project-automation bot moved this from 👀 In review to 🔖 Ready in Release 1.1.1 Jul 27, 2023
@TeddyCr TeddyCr reopened this Sep 13, 2023
@TeddyCr TeddyCr removed this from Release 1.1.1 Sep 13, 2023
@TeddyCr
Copy link
Contributor

TeddyCr commented Sep 13, 2023

This needs to be properly handled for multi project ingestion

@ayush-shah ayush-shah added the P0 Highest priority label Oct 19, 2023
@ayush-shah ayush-shah self-assigned this Oct 19, 2023
@harshach harshach moved this to Ingestion - Bugs & Minor Features in Release 1.3.0 Oct 22, 2023
ayush-shah pushed a commit that referenced this issue Jan 2, 2024
@pmbrull pmbrull moved this from Ingestion - Bugs & Minor Features to Done in Release 1.3.0 Jan 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Ingestion P0 Highest priority
Projects
No open projects
Status: Done
5 participants