Support disk.frame objects #5

Closed
ianmcook opened this issue Oct 24, 2019 · 9 comments
Labels
enhancement New feature or request

Comments

@ianmcook
Owner

See DiskFrame/disk.frame#196

@ianmcook ianmcook added the enhancement New feature or request label Oct 24, 2019
@ianmcook
Owner Author

Blocked by DiskFrame/disk.frame#197 (using a Mac dev environment)

@ianmcook
Owner Author

DiskFrame/disk.frame#197 is resolved. Now blocked by DiskFrame/disk.frame#217

@ianmcook
Owner Author

ianmcook commented Jan 5, 2020

Also blocked by DiskFrame/disk.frame#250

@xiaodaigh

xiaodaigh commented Jul 30, 2020

Hey, all blockers are resolved and it is working, though with some bugs. See:

library(disk.frame)
library(tidyquery)     # provides query()
library(nycflights13)  # assumed source of the airports data
setup_disk.frame()

airports.df = as.disk.frame(airports)

# this works
airports.df %>%
  query("SELECT name as name1, lat as lat1, lon as lon1 ORDER BY lat DESC") %>% 
  collect

but this doesn't

airports.df %>%
  query("SELECT name, lat, lon as lon1 ORDER BY lat DESC LIMIT 5") %>% 
  collect

complaining about

Error: The SELECT list includes two or more long expressions with no aliases assigned to them. You must assign aliases to these expressions
In addition: There were 17 warnings (use warnings() to see them)

and the warnings()

Warning messages:
1: In readChar(rc, nchars) : truncating string with embedded nuls
2: In readChar(rc, nchars) : truncating string with embedded nuls
3: In readChar(rc, nchars) : truncating string with embedded nuls
4: In readChar(rc, nchars) : truncating string with embedded nuls
5: In readChar(rc, nchars) : truncating string with embedded nuls
6: In readChar(rc, nchars) : truncating string with embedded nuls
7: In readChar(rc, nchars) : truncating string with embedded nuls
8: In readChar(rc, nchars) : truncating string with embedded nuls
9: In readChar(rc, nchars) : truncating string with embedded nuls
10: In readChar(rc, nchars) : truncating string with embedded nuls
11: In readChar(rc, nchars) : truncating string with embedded nuls
12: In readChar(rc, nchars) : truncating string with embedded nuls
13: In readChar(rc, nchars) : truncating string with embedded nuls
14: In readChar(rc, nchars) : truncating string with embedded nuls
15: In readChar(rc, 1L, useBytes = TRUE) : truncating string with embedded nuls
16: In readChar(rc, 1L, useBytes = TRUE) : truncating string with embedded nuls
17: In readChar(rc, 1L, useBytes = TRUE) : truncating string with embedded nuls
18: In arrange.disk.frame(., ...) :
  `arrange.disk.frame` is now deprecated. Please use `chunk_arrange` instead. This is in preparation for a more powerful `arrange` that sorts the whole disk.frame

@ianmcook
Owner Author

Thanks @xiaodaigh, I'll take a look at this soon.

@ianmcook
Owner Author

ianmcook commented Aug 1, 2020

@xiaodaigh this error is happening because colnames() is returning NULL on a disk.frame object. Should I be using names(collect(get_chunk(df, 1))) to get the column names, as you suggest at https://diskframe.com/reference/colnames.html?

@xiaodaigh

I see. The design of disk.frame is a little odd at this stage, so names(get_chunk(df, 1)) should suffice. But it's kind of weird to make you run this disk.frame-specific code. Let me fix colnames for disk.frame.

See DiskFrame/disk.frame#299
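
The fallback discussed above can be sketched in base R. This is only an illustration, not the code used in either package: `column_names()` and its `first_chunk` argument are hypothetical names, and a plain list of data frames stands in for a chunked object so the example runs without {disk.frame} installed.

```r
# Hypothetical helper: prefer colnames(), and fall back to the names of the
# first chunk when colnames() returns NULL.
column_names <- function(x, first_chunk = function(obj) obj[[1]]) {
  nms <- colnames(x)
  if (is.null(nms)) {
    nms <- names(first_chunk(x))
  }
  nms
}

# Stand-in for a chunked object: a plain list of data frames.
# colnames() returns NULL for it, much like the behaviour reported above.
chunked <- list(
  data.frame(name = "A", lat = 1, lon = 2),
  data.frame(name = "B", lat = 3, lon = 4)
)

column_names(chunked)        # falls back to the first chunk
column_names(mtcars[1:3])    # ordinary data frames use colnames() directly
```

With a real disk.frame, the fallback would presumably be passed as `first_chunk = function(obj) get_chunk(obj, 1)`, following the suggestion in this thread.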

@xiaodaigh

Another approach, which I think might be better, is to make query an S3 generic so this would work:

query <- function(data, ...) {
  UseMethod("query")
}

query.data.frame <- function(data, sql)  {
    query_(data, sql, TRUE)
}

Then, on the {disk.frame} side, I can do something like this:

query.disk.frame = create_chunk_mapper(tidyquery::query)

airports.df %>%
  query("SELECT name, lat, lon as lon1") %>% 
  collect

To test, this should definitely work:

airports.df %>%
  query.disk.frame("SELECT name, lat, lon as lon1") %>% 
  collect

This is already on a branch on {disk.frame}'s side.
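
The dispatch idea above can be illustrated with a minimal base R sketch. Everything in it is a stand-in: `run_query()` is a hypothetical generic (not `tidyquery::query`), the "query" is just a vector of column names rather than SQL, and the `chunked` class is a toy substitute for a disk.frame with a hand-written per-chunk method in place of `create_chunk_mapper()`.

```r
# Generic: dispatches on the class of `data`.
run_query <- function(data, ...) {
  UseMethod("run_query")
}

# Stand-in for the data.frame method; `cols` plays the role of the SQL string.
run_query.data.frame <- function(data, cols, ...) {
  data[, cols, drop = FALSE]
}

# Hypothetical chunked class: apply the data.frame method to every chunk
# and keep the result chunked until it is explicitly combined.
run_query.chunked <- function(data, ...) {
  chunks <- lapply(unclass(data), run_query, ...)
  structure(chunks, class = "chunked")
}

chunked <- structure(
  list(
    data.frame(name = "A", lat = 1, lon = 2),
    data.frame(name = "B", lat = 3, lon = 4)
  ),
  class = "chunked"
)

result <- run_query(chunked, c("name", "lat"))

# Rough analogue of collect(): bind the per-chunk results together.
combined <- do.call(rbind, unclass(result))
```

One consequence of this design is that each chunk is processed independently, so operations that need the whole table (a global ORDER BY or LIMIT, for example) would only apply per chunk unless handled separately.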

@ianmcook
Owner Author

ianmcook commented Nov 5, 2022

Closing because {disk.frame} has been soft-deprecated.

@ianmcook ianmcook closed this as completed Nov 5, 2022