Allow fetching all rows from results endpoint #8389

betodealmeida · 2019-10-14T18:23:16Z

SUMMARY

Currently, when results are fetched from /superset/results/, we apply DISPLAY_MAX_ROW to limit the amount of data displayed in the UI. At Lyft, we have other clients accessing Superset programmatically, and we would like to bypass the limit when fetching data from these clients.

I changed the endpoint behavior so that by default DISPLAY_MAX_ROW is not applied, but it can be passed optionally. The frontend was changed to query /superset/results/${DISPLAY_MAX_ROW}, returning only a subset of the rows.

TEST PLAN

Tested with curl, confirmed it works. Added unit tests.

ADDITIONAL INFORMATION

REVIEWERS

codecov-io · 2019-10-14T18:34:50Z

Codecov Report

Merging #8389 into master will increase coverage by 0.1%.
The diff coverage is 50%.

@@            Coverage Diff            @@
##           master    #8389     +/-   ##
=========================================
+ Coverage   67.57%   67.67%   +0.1%     
=========================================
  Files         448      448             
  Lines       22527    22492     -35     
  Branches     2364     2364             
=========================================
  Hits        15222    15222             
+ Misses       7167     7132     -35     
  Partials      138      138

Impacted Files	Coverage Δ
...erset/assets/src/SqlLab/components/QuerySearch.jsx	`58.65% <ø> (ø)`	⬆️
superset/views/base.py	`70.64% <ø> (-0.29%)`	⬇️
.../assets/src/SqlLab/components/TabbedSqlEditors.jsx	`83.33% <ø> (ø)`	⬆️
...uperset/assets/src/SqlLab/components/SouthPane.jsx	`91.42% <ø> (ø)`	⬆️
...uperset/assets/src/SqlLab/components/SqlEditor.jsx	`52.81% <ø> (ø)`	⬆️
...rset/assets/src/SqlLab/components/QueryHistory.jsx	`83.33% <ø> (ø)`	⬆️
...perset/assets/src/SqlLab/components/QueryTable.jsx	`59.25% <0%> (ø)`	⬆️
superset/views/utils.py	`89.77% <100%> (+2.27%)`	⬆️
...uperset/assets/src/SqlLab/components/ResultSet.jsx	`79.77% <100%> (ø)`	⬆️
superset/assets/src/SqlLab/components/App.jsx	`77.77% <50%> (ø)`	⬆️
... and 66 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2117d1e...efe9349.

etr2460

could you add documentation for this flag somewhere?

betodealmeida · 2019-10-15T18:34:32Z

@etr2460 will do! I'll also add some unit tests.

villebro · 2019-10-16T04:44:51Z

docs/sqllab.rst

+applications. When retrieving results from asynchronous queries ran in SQL Lab
+from the results backend, the config `DISPLAY_MAX_ROW` will still be applied,
+even though the results might not necessarily be rendered in a display. In order
+to bypass the limit you can pass the query parameter `bypass_display_limit=true`


nit: would ignore_display_limit be more precise?

I don't have any strong preferences. @etr2460, any thoughts on this?

oh yeah, i like ignore_display_limit better personally

etr2460 · 2019-10-16T22:03:06Z

Sorry, I probably should have thought of this before, but maybe we should have the client explicitly ask for DISPLAY_MAX_ROW rows from the results instead of the backend automatically applying it. So instead of adding an ignore_limit setting, we make the default pass back the entire result set, and the client adds a rows query param to /results that requests DISPLAY_MAX_ROW rows. I think this might be a bit cleaner, and would make the superset backend behave more like a service with multiple consumers than just assuming the client is asking by default. Thoughts, @villebro @betodealmeida?

betodealmeida · 2019-10-16T22:06:09Z

No worries, I agree that's a better approach.
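To illustrate the proposal above, here is a minimal sketch of how a backend could read an optional rows query parameter and fall back to the full result set when it is absent. It uses only the standard library; `requested_rows` and `select_rows` are hypothetical helper names, not Superset functions, and the URL shape is assumed from the discussion.

```python
from typing import Any, List, Optional
from urllib.parse import parse_qs, urlsplit


def requested_rows(url: str) -> Optional[int]:
    """Return the value of the `rows` query parameter, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("rows")
    if not values:
        return None  # no limit requested: the caller wants the full result set
    return int(values[0])


def select_rows(data: List[Any], url: str) -> List[Any]:
    """Hypothetical helper: slice `data` only when the client asked for a limit."""
    rows = requested_rows(url)
    return data if rows is None else data[:rows]


all_rows = list(range(10))
full = select_rows(all_rows, "/superset/results/abc/")            # programmatic client
subset = select_rows(all_rows, "/superset/results/abc/?rows=4")   # UI asking for a subset
```

With this shape the limit becomes the client's decision: the UI sends its display limit, and other consumers simply omit the parameter.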

betodealmeida · 2019-10-17T20:40:15Z

cc: @etr2460

etr2460

looks much better! one other question

etr2460 · 2019-10-21T18:12:00Z

superset/views/utils.py

@@ -176,7 +176,9 @@ def get_datasource_info(
    return datasource_id, datasource_type


-def apply_display_max_row_limit(sql_results: Dict[str, Any]) -> Dict[str, Any]:
+def apply_display_max_row_limit(
+    sql_results: Dict[str, Any], rows: Optional[int] = None


is this ever called with rows not defined?

Yes, synchronous queries will still call this to limit the response:

    payload = json.dumps(
        apply_display_max_row_limit(data),
        default=utils.pessimistic_json_iso_dttm_ser,
        ignore_nan=True,
        encoding=None,
    )

I think it makes sense to limit the sync response, disallowing users from bypassing it. If the query returns more than DISPLAY_MAX_ROW rows and the user needs that data, they should run it asynchronously, IMHO.

Async vs sync queries are dependent on the config of the datasource, right? Which is something the user doesn't have any control over unless they're an admin. I guess if you're calling this with an API, then you can decide if you want to make a sync or async query yourself though.

I think it's fine, but it's just another little wart with superset that we need to remember
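The behavior discussed in this thread can be sketched roughly as follows: when rows is omitted (the synchronous path), the configured default applies; when rows is passed (the results endpoint), the caller's value wins. This is a simplified stand-in, not the code from the PR: the function name, the module-level `DISPLAY_MAX_ROW` constant, the dict keys, and the `displayLimitReached` flag are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional

DISPLAY_MAX_ROW = 3  # stand-in for the Superset config value; tiny for illustration


def apply_row_limit_sketch(
    sql_results: Dict[str, Any], rows: Optional[int] = None
) -> Dict[str, Any]:
    """Truncate sql_results["data"] to `rows`, or to DISPLAY_MAX_ROW if rows is None."""
    limit = rows if rows is not None else DISPLAY_MAX_ROW
    data = sql_results.get("data", [])
    if len(data) > limit:
        # Return a new dict rather than mutating the caller's payload.
        return {**sql_results, "data": data[:limit], "displayLimitReached": True}
    return sql_results


results = {"status": "success", "data": [{"n": i} for i in range(10)]}
sync_payload = apply_row_limit_sketch(results)           # sync path: default limit
async_payload = apply_row_limit_sketch(results, rows=5)  # explicit limit from the client
```

In this sketch a synchronous caller has no way to lift the limit, which matches the position taken in the thread that large results should be fetched asynchronously.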

etr2460

one other comment about the url/api design (sorry i didn't catch all these at once), but after that i think it lgtm

etr2460 · 2019-10-22T16:58:03Z

superset/views/core.py

@@ -2459,8 +2459,9 @@ def cache_key_exist(self, key):

    @has_access_api
    @expose("/results/<key>/")
+    @expose("/results/<key>/<int:rows>")


it seems a little weird to encode the limit in the url like this. I think it would be preferable to construct the url like: /results/key?rows=1000.
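For comparison, a client could build the query-parameter form of the URL like this. This is a sketch using only the standard library; `results_url` is a hypothetical helper, and the /superset/results/ prefix and trailing slash are taken from the PR description and the existing route rather than from the final implementation.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def results_url(key: str, rows: Optional[int] = None) -> str:
    """Build a results URL, adding ?rows=N only when a limit is requested."""
    path = f"/superset/results/{quote(key, safe='')}/"
    if rows is None:
        return path  # no limit: ask for every row
    return f"{path}?{urlencode({'rows': rows})}"


unlimited = results_url("abc")
limited = results_url("abc", rows=1000)
```

Keeping the limit in the query string leaves the resource path identical for both kinds of client, so only the optional parameter differs.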

etr2460

awesome, thanks for all the iteration! lgtm

villebro

Sorry @etr2460, I missed your comment re: replacing ignore with explicit DISPLAY_MAX_ROW. Agree, it's a much, much better approach. LGTM!

* Allow bypassing DISPLAY_MAX_ROW
* Add unit tests and docs
* Fix tests
* Fix mock
* Fix unit test
* Revert config change after test
* Change behavior
* Address comments

Allow bypassing DISPLAY_MAX_ROW

179e676

pull-request-size bot added the size/XS label Oct 14, 2019

etr2460 requested changes Oct 15, 2019

betodealmeida mentioned this pull request Oct 15, 2019

Allow bypassing DISPLAY_MAX_ROW lyft/incubator-superset#72

Merged

Add unit tests and docs

fe7479f

betodealmeida added the enhancement:request, sqllab, lyft, and minor-review labels Oct 15, 2019

villebro reviewed Oct 16, 2019

betodealmeida added 2 commits October 16, 2019 14:11

Fix tests

636014e

Fix mock

bbd01ef

betodealmeida added 3 commits October 16, 2019 15:25

Fix unit test

4312d3d

Revert config change after test

5bb171a

Change behavior

ea21872

pull-request-size bot added size/L and removed size/XS labels Oct 17, 2019

betodealmeida changed the title ~~Allow bypassing DISPLAY_MAX_ROW~~ Allow fetching all rows from results endpoint Oct 17, 2019

etr2460 reviewed Oct 21, 2019

etr2460 reviewed Oct 22, 2019

Address comments

efe9349

etr2460 approved these changes Oct 24, 2019

villebro approved these changes Oct 24, 2019

betodealmeida merged commit e704e29 into apache:master Oct 25, 2019

betodealmeida mentioned this pull request Oct 28, 2019

Revert "Allow bypassing DISPLAY_MAX_ROW" lyft/incubator-superset#74

Merged

mistercrunch added the 🏷️ bot and 🚢 0.36.0 labels Feb 28, 2024
