show_query translates only partial of R query in SQL #703

chocwaffles · 2021-07-09T05:05:39Z

show_query() appears to translate only the steps from pivot_wider() onward, not the steps before it?

library(sparklyr)
library(dplyr)
#load spark connection
sc <- spark_connect(method = "databricks") #remotely spark_home = "c:/programdata/anaconda3/lib/site-packages/pyspark"
sessionInfo()

iris_tbl <- copy_to(sc, iris)
iris_tbl

iris_tbl %>% filter(Species != 'setosa') %>%  show_query()

The SQL is translated correctly here; the filter shows up as a WHERE clause:

<SQL>
SELECT *
FROM `iris`
WHERE (`Species` != "setosa")

Now add pivot_wider() to the pipeline and look at the result:

iris_tbl %>% filter(Species != 'setosa') %>% pivot_wider(names_from = 'Species', values_from = 'Sepal_Length') %>% show_query()

The SQL is translated incorrectly here; the filter no longer appears:

<SQL>
SELECT `Sepal_Width`, `Petal_Length`, `Petal_Width`, FIRST(IF(ISNULL(`versicolor`) OR ISNAN(`versicolor`), NULL, `versicolor`), TRUE) AS `versicolor`, FIRST(IF(ISNULL(`virginica`) OR ISNAN(`virginica`), NULL, `virginica`), TRUE) AS `virginica`
FROM `sparklyr_tmp_7fa799d2_542c_4288_8c2c_18b90167c322`
GROUP BY `Sepal_Width`, `Petal_Length`, `Petal_Width`

sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_1.0.2    sparklyr_1.7.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5         compiler_4.0.4     pillar_1.4.7       later_1.1.0.1     
 [5] dbplyr_2.0.0       TeachingDemos_2.10 r2d3_0.2.3         base64enc_0.1-3   
 [9] tools_4.0.4        uuid_0.1-4         digest_0.6.27      jsonlite_1.7.2    
[13] lifecycle_0.2.0    tibble_3.0.4       pkgconfig_2.0.3    rlang_0.4.11      
[17] cli_2.2.0          rstudioapi_0.13    shiny_1.5.0        DBI_1.1.0         
[21] parallel_4.0.4     yaml_2.2.1         fastmap_1.0.1      withr_2.3.0       
[25] hwriter_1.3.2      httr_1.4.2         generics_0.1.0     vctrs_0.3.5       
[29] htmlwidgets_1.5.3  rprojroot_2.0.2    tidyselect_1.1.0   glue_1.4.2        
[33] forge_0.2.0        R6_2.5.0           fansi_0.4.1        blob_1.2.1        
[37] tidyr_1.1.2        purrr_0.3.4        SparkR_3.1.1       magrittr_2.0.1    
[41] promises_1.1.1     hwriterPlus_1.0-3  htmltools_0.5.0    ellipsis_0.3.1    
[45] assertthat_0.2.1   mime_0.9           Rserve_1.8-7       xtable_1.8-4      
[49] httpuv_1.5.4       config_0.3         utf8_1.1.4         crayon_1.3.4

mgirlich · 2022-03-28T12:56:40Z

Unfortunately, pivot_wider() cannot be lazy. Collecting the data is necessary to figure out the column names. The documentation of pivot_wider() has been updated to make this clearer in the next version.
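To illustrate why the column names are the sticking point, here is a minimal sketch of a pivot that stays lazy because the new column names are hard-coded instead of being discovered from the data. It is not what pivot_wider() does internally: it assumes dbplyr is installed, uses a dbplyr lazy_frame() as a stand-in for iris_tbl (so no Spark connection is needed), and swaps in max() as the aggregate rather than the FIRST(...) shown in the issue.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for iris_tbl: a lazy table with no database behind it.
# Only the column names and types matter; the single row is never queried.
iris_lazy <- lazy_frame(
  Sepal_Length = 5.1, Sepal_Width = 3.5,
  Petal_Length = 1.4, Petal_Width = 0.2,
  Species = "setosa"
)

# The output columns (versicolor, virginica) are written out by hand.
# Finding them from the data, e.g. with distinct() and pull(), would
# require running a query first, which is why pivot_wider() cannot be lazy.
wide_query <- iris_lazy %>%
  filter(Species != "setosa") %>%
  group_by(Sepal_Width, Petal_Length, Petal_Width) %>%
  summarise(
    # max() is an assumption made for this sketch, not pivot_wider()'s aggregate
    versicolor = max(ifelse(Species == "versicolor", Sepal_Length, NA), na.rm = TRUE),
    virginica  = max(ifelse(Species == "virginica", Sepal_Length, NA), na.rm = TRUE)
  )

# Because nothing was collected, the filter is still part of the generated SQL.
show_query(wide_query)
```

With the names fixed up front, the whole pipeline remains a single lazy query, so the WHERE clause from filter() stays visible in the show_query() output.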

hadley transferred this issue from tidyverse/dplyr Sep 16, 2021

mgirlich mentioned this issue Jan 4, 2022: more functions should be generic tidyverse/tidyr#1071 (Open, 7 tasks)

mgirlich mentioned this issue Mar 16, 2022: Update pivot_wider() documentation #794 (Merged)

mgirlich closed this as completed Mar 28, 2022
