Validator throws KeyError 'table.head' while interactively creating Expectation Suite on a BigQuery datasource #5424

riwim · 2022-07-01T13:21:45Z

Describe the bug
I am getting the same error as described in #3540 when interactively creating an Expectation Suite on a BigQuery datasource via CLI. As requested in the discussion, I am opening a new issue for this.

In the "Edit Your Expectation Suite" notebook provided by great_expectations suite new, the following function call throws an error:

validator.head(n_rows=5, fetch_all=False)

Thrown error:

KeyError                                  Traceback (most recent call last)
Input In [11], in <cell line: 1>()
----> 1 validator.head(n_rows=5, fetch_all=False)

File some-path/.venv/lib/python3.9/site-packages/great_expectations/validator/validator.py:2146, in Validator.head(self, n_rows, domain_kwargs, fetch_all)
   2141 if domain_kwargs is None:
   2142     domain_kwargs = {
   2143         "batch_id": self.execution_engine.active_batch_data_id,
   2144     }
-> 2146 data: Any = self.get_metric(
   2147     metric=MetricConfiguration(
   2148         metric_name="table.head",
   2149         metric_domain_kwargs=domain_kwargs,
   2150         metric_value_kwargs={
   2151             "n_rows": n_rows,
   2152             "fetch_all": fetch_all,
   2153         },
   2154     )
   2155 )
   2157 df: pd.DataFrame
   2158 if isinstance(
   2159     self.execution_engine, (PandasExecutionEngine, SqlAlchemyExecutionEngine)
   2160 ):

File some-path/.venv/lib/python3.9/site-packages/great_expectations/validator/validator.py:891, in Validator.get_metric(self, metric)
    889 def get_metric(self, metric: MetricConfiguration) -> Any:
    890     """return the value of the requested metric."""
--> 891     return self.get_metrics(metrics={metric.metric_name: metric})[
    892         metric.metric_name
    893     ]

File some-path/.venv/lib/python3.9/site-packages/great_expectations/validator/validator.py:856, in Validator.get_metrics(self, metrics)
    848 """
    849 metrics: Dictionary of desired metrics to be resolved, with metric_name as key and MetricConfiguration as value.
    850 Return Dictionary with requested metrics resolved, with metric_name as key and computed metric as value.
    851 """
    852 resolved_metrics: Dict[Tuple[str, str, str], Any] = self.compute_metrics(
    853     metric_configurations=list(metrics.values())
    854 )
--> 856 return {
    857     metric_configuration.metric_name: resolved_metrics[metric_configuration.id]
    858     for metric_configuration in metrics.values()
    859 }

File some-path/.venv/lib/python3.9/site-packages/great_expectations/validator/validator.py:857, in <dictcomp>(.0)
    848 """
    849 metrics: Dictionary of desired metrics to be resolved, with metric_name as key and MetricConfiguration as value.
    850 Return Dictionary with requested metrics resolved, with metric_name as key and computed metric as value.
    851 """
    852 resolved_metrics: Dict[Tuple[str, str, str], Any] = self.compute_metrics(
    853     metric_configurations=list(metrics.values())
    854 )
    856 return {
--> 857     metric_configuration.metric_name: resolved_metrics[metric_configuration.id]
    858     for metric_configuration in metrics.values()
    859 }

KeyError: ('table.head', 'batch_id=15a077d486452b3e1c894458758b7972', '04166707abe073177c1dd922d3584468')
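
For readers following the traceback: the lookup `resolved_metrics[metric_configuration.id]` fails whenever the requested metric never made it into the resolved dictionary. The snippet below is a minimal toy model of that situation, not Great Expectations' actual code. `compute_metrics`, `run_query`, and `get_metric` here are hypothetical stand-ins, and the assumption that a failed backend query leaves the metric unresolved (rather than re-raising) is mine.

```python
from typing import Any, Dict, List, Tuple

# A metric id shaped like the one in the KeyError above:
# (metric name, domain kwargs id, value kwargs id).
MetricId = Tuple[str, str, str]


def run_query(metric_id: MetricId) -> Any:
    """Hypothetical backend call that always fails, like the BigQuery job did."""
    raise RuntimeError("backend query failed")


def compute_metrics(metric_ids: List[MetricId]) -> Dict[MetricId, Any]:
    """Hypothetical resolver that skips metrics whose query fails (an assumption)."""
    resolved: Dict[MetricId, Any] = {}
    for metric_id in metric_ids:
        try:
            resolved[metric_id] = run_query(metric_id)
        except RuntimeError:
            # The failure is swallowed, so nothing is stored for this id.
            continue
    return resolved


def get_metric(metric_id: MetricId) -> Any:
    resolved = compute_metrics([metric_id])
    # Same shape as the failing line in the traceback: a plain dict lookup.
    return resolved[metric_id]


metric_id: MetricId = ("table.head", "batch_id=abc", "value_kwargs_hash")
caught = None
try:
    get_metric(metric_id)
except KeyError as err:
    caught = err
    print("KeyError:", err)
```

If this model is right, the KeyError is a symptom and the real failure is whatever prevented the metric from being computed, so the backend logs are the place to look.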

To Reproduce
Steps to reproduce the behavior:

1. Initialize a GE project
2. Add a BigQuery datasource via great_expectations datasource new
3. Create a new Expectation Suite via great_expectations suite new
4. Choose Interactively and select your datasource and data asset
5. Execute the notebook code, including the validator.head() call
6. See the error above

Expected behavior
Calling validator.head() should not raise a KeyError.

Environment

Operating System: MacOS 12.3.1
Great Expectations Version: 0.15.11

Additional context

I examined the GCP logs for the period in which validator.head() was called. I can rule out a permission error, because the service account I used had maximum rights on the GCP project during debugging. However, the BigQuery service logs errors for the JobService.InsertJob method that are not due to insufficient permissions:

"serviceName": "bigquery.googleapis.com",
"methodName": "google.cloud.bigquery.v2.JobService.InsertJob",
"authorizationInfo": [
  {
    "resource": "projects/my-project",
    "permission": "bigquery.jobs.create",
    "granted": true,
    "resourceAttributes": {}
  }
],

The error itself is reported in the response object jobStatus:

"jobStatus": {
  "errors": [
    {
      "code": 3,
      "message": "Cannot access field id on a value with type ARRAY<STRUCT<id STRING>> at [1:4656]"
    }
  ],
  "errorResult": {
    "message": "Cannot access field id on a value with type ARRAY<STRUCT<id STRING>> at [1:4656]",
    "code": 3
  },
  "jobState": "DONE"
},

Some fields of the table I use are nested fields. Does the validator have problems with these?
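
As a rough illustration of what the BigQuery message means: a column of type ARRAY<STRUCT<id STRING>> is a list of records, so there is no single `id` to read from it directly. In BigQuery standard SQL the array has to be flattened first (for example with UNNEST). The pure-Python analogy below uses a made-up row and hypothetical helper names. It only mirrors the shape of the problem and says nothing about how the validator builds its query.

```python
from typing import Any, Dict, List

# Made-up row: "items" plays the role of an ARRAY<STRUCT<id STRING>> column.
row: Dict[str, Any] = {"name": "order-1", "items": [{"id": "a"}, {"id": "b"}]}


def field_of_array(row: Dict[str, Any]) -> Any:
    """Analog of selecting items.id directly: the array itself has no 'id'."""
    return row["items"]["id"]  # raises TypeError, since a list is not keyed by name


def unnest_ids(row: Dict[str, Any]) -> List[str]:
    """Analog of flattening the array first, then reading id from each element."""
    return [item["id"] for item in row["items"]]


direct_access_failed = False
try:
    field_of_array(row)
except TypeError:
    direct_access_failed = True

ids = unnest_ids(row)
print(direct_access_failed, ids)
```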

talagluck · 2022-07-13T17:46:14Z

Hi @riwim - thanks for raising this. I would have expected that this would be due to issues with temp table creation (or table creation in the case of BigQuery). Could you please share the BatchRequest that you used with this?

omerjakub · 2022-11-08T13:18:55Z

Hello guys,
I get the same error when I try to create a suite on ClickHouse via the CLI:

Does anybody else have the same issue?

talagluck · 2022-11-09T16:03:47Z

Hi @poohdini1994 - that makes sense, since we don't yet have full support for Clickhouse, and that metric is not yet implemented for Clickhouse.

omerjakub · 2022-11-09T16:39:00Z

Hi @talagluck, do you know when it will be supported?

talagluck added the community and devrel labels Jul 13, 2022

Adenmin mentioned this issue Jul 25, 2022

KeyError 'table.head' when running validator.head() using IBM DB2 datasource #5583

Closed

talagluck mentioned this issue Jul 29, 2022

[BUGFIX] Fix table.head metric issue when using BQ without temp tables #5630

Merged

talagluck closed this as completed in #5630 Aug 2, 2022
