Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(drill): no rows returned #27073

Merged
merged 1 commit into from
Feb 12, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,7 @@ def get_git_sha() -> str:
],
"db2": ["ibm-db-sa>0.3.8, <=0.4.0"],
"dremio": ["sqlalchemy-dremio>=1.1.5, <1.3"],
"drill": ["sqlalchemy-drill==0.1.dev"],
"drill": ["sqlalchemy-drill>=1.1.4, <2"],
"druid": ["pydruid>=0.6.5,<0.7"],
"duckdb": ["duckdb-engine>=0.9.5, <0.10"],
"dynamodb": ["pydynamodb>=0.4.2"],
Expand Down
37 changes: 30 additions & 7 deletions superset/db_engine_specs/drill.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,11 @@
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

from __future__ import annotations

from datetime import datetime
from typing import Any, Optional
from typing import Any
from urllib import parse

from sqlalchemy import types
Expand Down Expand Up @@ -60,8 +63,8 @@

@classmethod
def convert_dttm(
cls, target_type: str, dttm: datetime, db_extra: Optional[dict[str, Any]] = None
) -> Optional[str]:
cls, target_type: str, dttm: datetime, db_extra: dict[str, Any] | None = None
) -> str | None:
sqla_type = cls.get_sqla_column_type(target_type)

if isinstance(sqla_type, types.Date):
Expand All @@ -76,8 +79,8 @@
cls,
uri: URL,
connect_args: dict[str, Any],
catalog: Optional[str] = None,
schema: Optional[str] = None,
catalog: str | None = None,
schema: str | None = None,
) -> tuple[URL, dict[str, Any]]:
if schema:
uri = uri.set(database=parse.quote(schema.replace(".", "/"), safe=""))
Expand All @@ -89,15 +92,15 @@
cls,
sqlalchemy_uri: URL,
connect_args: dict[str, Any],
) -> Optional[str]:
) -> str | None:
"""
Return the configured schema.
"""
return parse.unquote(sqlalchemy_uri.database).replace("/", ".")

@classmethod
def get_url_for_impersonation(
cls, url: URL, impersonate_user: bool, username: Optional[str]
cls, url: URL, impersonate_user: bool, username: str | None
) -> URL:
"""
Return a modified URL with the username set.
Expand All @@ -117,3 +120,23 @@
)

return url

@classmethod
def fetch_data(
cls,
cursor: Any,
limit: int | None = None,
) -> list[tuple[Any, ...]]:
"""
Custom `fetch_data` for Drill.

When no rows are returned, Drill raises a `RuntimeError` with the message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@betodealmeida is the Drill SQLAlchemy dialect or DB-API not conforming to the standard? Ideally it would be great to adopt the "shift left" approach and ensure this is handled correctly upstream as opposed for us having to write custom logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@john-bodley I'm one of the authors of the Drill SQLAlchemy dialect. It's been a few years however, so I don't remember much about how it works. If something should be fixed there, I'm happy to do it. However, could someone please explain what the exact issue is?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cgivre looking at PEP-249 it seems like a StopIteration should be raised if Cursor.next() is called and the result set is exhausted. In the base engine spec the fetch_data() method calls Cursor.fetchall() or Cursor.fetchmany() rather than Cursor.next(). It would be prudent to double check the Drill DB-API and SQLAlchemy dialects are behaving in accordance with the standards.

I guess the TL;DR is I'm not sure why we need this custom logic for Drill.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@john-bodley we believe this is now fixed sqlalchemy-drill. That's certainly where it should be fixed.

"generator raised StopIteration". This method catches the exception and
returns an empty list instead.
"""
try:
return super().fetch_data(cursor, limit)
except RuntimeError as ex:
if str(ex) == "generator raised StopIteration":
return []
raise

Check warning on line 142 in superset/db_engine_specs/drill.py

View check run for this annotation

Codecov / codecov/patch

superset/db_engine_specs/drill.py#L137-L142

Added lines #L137 - L142 were not covered by tests
Loading