Long column names considered repeated because their names got truncated in code #11887

Closed
isabela-angelo-ze-delivery opened this issue Jun 5, 2023 · 3 comments

@isabela-angelo-ze-delivery

Affected module
Ingestion

Describe the bug
I have a Glue table with two columns:
installments_installment_chargeback_refunds_chargeback_refund_payment_id
installments_installment_chargeback_refunds_chargeback_refund_payment_date

And during the ingestion I get the following error:
Failed to ingest CreateTableRequest [***] due to api request failure: Column name installments_installment_chargeback_refunds_chargeback_refund_pa is repeated
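
A minimal sketch of why the two names collide, assuming each name is simply cut to its first 64 characters (the limit is inferred from the error message; the slicing below is an illustration, not the actual ingestion code):

```python
# Illustration only: assumes column names are cut to their first 64 characters.
MAX_LEN = 64  # limit inferred from the truncated name in the error message

columns = [
    "installments_installment_chargeback_refunds_chargeback_refund_payment_id",
    "installments_installment_chargeback_refunds_chargeback_refund_payment_date",
]

# Both names share their first 64 characters, so truncation makes them equal.
truncated = [name[:MAX_LEN] for name in columns]
collide = len(set(truncated)) < len(truncated)

print(truncated[0])
print(collide)
```

The two names only differ after the shared prefix, so both are reduced to the name reported in the error.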

To Reproduce

Create a table in Glue with two or more column names that are longer than 64 characters and share the same first 64 characters.
Run the ingestion process in Airflow.

Expected behavior
The column names are distinct, so the error shouldn't happen.

Version:

  • OS: debian:bullseye-slim ( image python:3.9.7-slim-bullseye )
  • Python version: 3.9.7
  • OpenMetadata version: 0.13.2
  • OpenMetadata Ingestion package version: openmetadata-ingestion[glue]==0.13.2.8

Additional context
I checked the code and found the line where the column name is cut to just 64 chars:
https://github.com/open-metadata/OpenMetadata/blob/26db825b7195000c9a550cbec889f985a024ac97/ingestion/src/metadata/ingestion/source/database/glue/metadata.py#LL298C46-L298C48

I know it's not nice to keep increasing this number, so maybe there is a way to deduplicate these column names in the Column class.
What do you think?

@SuperBo

SuperBo commented Jul 10, 2023

This also affects Databricks ingestion with Unity Catalog.

@harshach
Collaborator

harshach commented Aug 3, 2023

@SuperBo @isabela-angelo-ze-delivery this is now fixed. Please give it another try. cc @pmbrull

harshach closed this as completed Aug 3, 2023

@SuperBo

SuperBo commented Oct 4, 2023

Hi @harshach, @pmbrull, this issue is happening again in the recent versions 1.1.5 and 1.1.6 for Databricks.

The latest source code (main) reflects this.

https://github.com/open-metadata/OpenMetadata/blob/5a3d759b48d93647fefd7cc9bc1399a28b44e67f/ingestion/src/metadata/ingestion/source/database/databricks/unity_catalog/metadata.py#L463C16-L463C16
