Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error with MySQL profile ingestion #13974

Closed
arudra123 opened this issue Nov 14, 2023 · 3 comments · Fixed by #14834
Closed

Error with MySQL profile ingestion #13974

arudra123 opened this issue Nov 14, 2023 · 3 comments · Fixed by #14834
Assignees
Labels
bug Something isn't working collaboration P1 High priority profiler support

Comments

@arudra123
Copy link

Affected module
after connecting the mysql data and after running the metadata ingestion, profiler ingestion is getting failed with the below error

Describe the bug

Hi team, in the latest instance i am getting below error when running the mysql profile ingestion
AS thirdQuartile
FROM aveva-sales-db.Owners
LIMIT %(param_1)s]
[parameters: {'param_1': 1}]
(Background on this error at: https://sqlalche.me/e/14/f405)
[2023-11-14T09:35:34.131+0000] {profiler_interface.py:404} ERROR - Owners.Owner metric_type.value: local variable 'row' referenced before assignment
[2023-11-14T09:35:34.132+0000] {status.py:65} WARNING - Owners.Owner metric_type.value: local variable 'row' referenced before assignment
[2023-11-14T09:35:39.907+0000] {profiler_interface.py:520} WARNING - Unexpected exception computing metrics: list index out of range
[2023-11-14T09:35:44.718+0000] {workflow_output_handler.py:361} INFO - Source: Processed 158 records, filtered 0 records, found 0 errors
[2023-11-14T09:35:44.718+0000] {workflow_output_handler.py:361} INFO - Processor: Processed 155 records, filtered 0 records, found 51 errors
[2023-11-14T09:35:44.719+0000] {workflow_output_handler.py:361} INFO - Processor: Processed 155 records, filtered 0 records, found 0 errors
[2023-11-14T09:35:44.719+0000] {workflow_output_handler.py:361} INFO - Sink: Processed 154 records, filtered 0 records, found 0 errors
[2023-11-14T09:36:07.975+0000] {profiler_interface.py:302} INFO - Skipping metrics for Products.Product due to (pymysql.err.ProgrammingError) (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '-sales-db.Products,\n (SELECT @counter := COUNT() FROM aveva-sales...' at line 10")
[SQL: /
{"app": "OpenMetadata", "version": "1.2.0.1"} */

Screenshots or steps to reproduce
image

Expected behavior
after connecting the mysql database the profiler ingestion should run successfully

Version:
Lastest version as on 14-11-2023

Additional context
Add any other context about the problem here.

@pmbrull
Copy link
Collaborator

pmbrull commented Nov 14, 2023

From the support channel

My sql version: 10.3.38-MariaDB-0ubuntu0.20.04.1
-- `aveva-sales-db`.Accounts definition
CREATE TABLE `Accounts` (
  `Account Name` varchar(50) DEFAULT NULL,
  `State or Province` varchar(50) DEFAULT NULL,
  `Country` varchar(50) DEFAULT NULL,
  `AccountID` varchar(50) DEFAULT NULL,
  `AccountSeq` int(11) DEFAULT NULL,
  `IndustrySeq` int(11) DEFAULT NULL,
  `AccountOwnerSeq` int(11) DEFAULT NULL,
  `Account Number` varchar(50) DEFAULT NULL,
  `TerritorySeq` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
-- `aveva-sales-db`.Campaigns definition
CREATE TABLE `Campaigns` (
  `CampaignSeq` int(11) DEFAULT NULL,
  `Campaign Type` varchar(50) DEFAULT NULL,
  `Campaign` varchar(50) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
-- `aveva-sales-db`.Contacts definition
CREATE TABLE `Contacts` (
  `Contact` varchar(50) DEFAULT NULL,
  `Job Title` varchar(50) DEFAULT NULL,
  `ContactSeq` int(11) DEFAULT NULL,
  `AccountSeq` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
-- `aveva-sales-db`.Industries definition
CREATE TABLE `Industries` (
  `Industry` varchar(64) DEFAULT NULL,
  `IndustrySeq` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
-- `aveva-sales-db`.Opportunities definition
CREATE TABLE `Opportunities` (
  `Purchase Process` varchar(50) DEFAULT NULL,
  `Decision Maker` varchar(50) DEFAULT NULL,
  `Status` varchar(50) DEFAULT NULL,
  `Sales Stage` varchar(50) DEFAULT NULL,
  `Value` varchar(50) DEFAULT NULL,
  `Weeks Open` int(11) DEFAULT NULL,
  `DaysToClose` int(11) DEFAULT NULL,
  `Discount` varchar(50) DEFAULT NULL,
  `Probability` varchar(50) DEFAULT NULL,
  `Rating` varchar(50) DEFAULT NULL,
  `Days Remaining In Pipeline` int(11) DEFAULT NULL,
  `Probability (raw)` varchar(50) DEFAULT NULL,
  `AccountSeq` int(11) DEFAULT NULL,
  `OpportunitySeq` int(11) DEFAULT NULL,
  `ProductSeq` int(11) DEFAULT NULL,
  `SystemUserSeq` int(11) DEFAULT NULL,
  `CampaignSeq` varchar(50) DEFAULT NULL,
  `Blank` varchar(50) DEFAULT NULL,
  `DateDiff-Days` int(11) DEFAULT NULL,
  `Opportunity Created On` varchar(50) DEFAULT NULL,
  `CloseDate` varchar(50) DEFAULT NULL,
  `Days Remaining In Pipeline (bins)` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
-- `aveva-sales-db`.Owners definition
CREATE TABLE `Owners` (
  `Owner` varchar(50) DEFAULT NULL,
  `Manager` varchar(50) DEFAULT NULL,
  `SystemUserSeq` int(11) DEFAULT NULL,
  `systemuserid` varchar(50) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

@TeddyCr
Copy link
Contributor

TeddyCr commented Nov 14, 2023

Maria DB engine appear to compile to MySQL dialect and execute this query

@compiles(MedianFn, Dialects.MySQL)
def _(elements, compiler, **kwargs): # pylint: disable=unused-argument
"""Median computation for MySQL"""
col = compiler.process(elements.clauses.clauses[0])
table = elements.clauses.clauses[1].value
percentile = elements.clauses.clauses[2].value
return """
(SELECT
{col}
FROM (
SELECT
{col},
ROW_NUMBER() OVER () AS row_num
FROM
{table},
(SELECT @counter := COUNT(*) FROM {table}) t_count
ORDER BY {col}
) temp
WHERE temp.row_num = ROUND({percentile} * @counter)
)
""".format(
col=col, table=table, percentile=percentile
)
.

Maria DB seems to natively handle median computation we can look into it and handle the execution similar to what we did for single store

@TeddyCr TeddyCr added bug Something isn't working P1 High priority labels Nov 14, 2023
@harshach harshach moved this to Data Quality - Teddy, Pere, Harsha, Shailesh, Rupesh in Release 1.3.0 Nov 30, 2023
@NiharDoshi99
Copy link
Contributor

NiharDoshi99 commented Jan 23, 2024

Hi, i would like to work on this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working collaboration P1 High priority profiler support
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

5 participants