-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Elasticsearch SQL query support #3393
Conversation
7725ef7
to
a4db311
Compare
a4db311
to
32d0523
Compare
json_data = json_dumps({ | ||
"columns": result_columns, | ||
"rows": result_rows | ||
}) | ||
except KeyboardInterrupt: | ||
except KeyboardInterrupt as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for contributing, @fernandoprocy. LGTM!
mapping_url = "{0}/{1}/_mapping".format(self.server_url, index_name) | ||
|
||
mappings, error = self._get_query_mappings(mapping_url) | ||
if error: | ||
return None, error | ||
|
||
if mode == "sql": | ||
url = "{0}/_xpack/sql".format(self.server_url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API call returns results in the same format as _search
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. But fortunately, the return object has another schema similar to the one Redash uses (json object with column and rows separately), so it's kinda easy to adapt.
@@ -270,6 +268,19 @@ def collect_aggregations(mappings, rows, parent_key, data, row, result_columns, | |||
row[column] = value[0] if isinstance(value, list) and len(value) == 1 else value | |||
|
|||
result_rows.append(row) | |||
elif 'rows' in raw_result and 'columns' in raw_result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the format the SQL endpoint returns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, pretty similar to the desired form, with only a few transformations being necessary. You can see it here in the first example:
https://www.elastic.co/blog/an-introduction-to-elasticsearch-sql-with-practical-examples-part-1
Thank you for your effort on this and apologies that it wasn't merged eventually. Due to the state of the current implementation and the inability to properly test new changes, we decided to commit to making a new Elasticsearch query runner (#4293) and will address this and the other issues there. |
I saw that the current implementation didn't support SQL-like queries on Elasticsearch, so I made a few modifications that seem to do the job.
While working on that, I discover a perhaps undesired behavior with the original DSL implementation, where if you query for nested fields it would return a full json object instead of the value (for example: "region.city" would return something like "{"city": "Abc"}" instead of just "Abc"). Perhaps that will net another PR.
In the implementation I added, if you want to do SQL-like queries you have to pass a parameter "mode" == sql like below:
It's my first time opening a PR here, any heads up is welcome.