fread: Read less rows (1327) than were allocated (1607) #1116

Closed
slowteetoe opened this issue Apr 15, 2015 · 1 comment

@slowteetoe

Reading a CSV file (https://gist.github.com/slowteetoe/528c78213fcd80f05419) that contains quoted fields with embedded newlines works correctly (in data.table 1.9.5), but fread returns the warning:

Read less rows (1327) than were allocated (1607). Run again with verbose=TRUE and please report.

fread("restaurants.csv", verbose=TRUE)
# Input contains no \n. Taking this to be a filename to open
# File opened, filesize is 0.000104 GB.
# Memory mapping ... ok
# Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.
# Positioned on line 1 after skip or autostart
# This line is the autostart and not blank so searching up for the last non-blank ... line 1
# Detecting sep ... ','
# Detected 6 columns. Longest stretch was from line 1 to line 30
# Starting data input on line 1 (either column names or first row of data). First 10 characters: name,zipCo
# All the fields on line 1 are character fields. Treating as the column names.
# Count of eol: 3982 (including 1 at the end)
# Count of sep: 8036
# nrow = MIN( nsep [8036] / ncol [6] -1, neol [3982] - nblank [1] ) = 1607
# Type codes (   first 5 rows): 414144
# Type codes (+ middle 5 rows): 414144
# Type codes (+   last 5 rows): 414144
# Type codes: 414144 (after applying colClasses and integer64)
# Type codes: 414144 (after applying drop or select (if supplied)
# Allocating 6 column slots (6 - 0 dropped)
#    0.000s (  4%) Memory map (rerun may be quicker)
#    0.000s (  4%) sep and header detection
#    0.001s ( 21%) Count rows (wc -l)
#    0.000s (  5%) Column type detection (first, middle and last 5 rows)
#    0.000s (  1%) Allocation of 1327x6 result (xMB) in RAM
#    0.003s ( 58%) Reading data
#    0.000s (  0%) Allocation for type bumps (if any), including gc time if triggered
#    0.000s (  0%) Coercing data already read in type bumps (if any)
#    0.000s (  7%) Changing na.strings to NA
#    0.004s        Total
#                            name zipCode    neighborhood councilDistrict
#    1:                       410   21206       Frankford               2
#    2:                      1919   21231     Fells Point               1
#    3:                     SAUTE   21224          Canton               1
#    4:        #1 CHINESE KITCHEN   21211         Hampden              14
#    5:     #1 chinese restaurant   21223        Millhill               9
#   ---                                                                  
#1323: ZEN WEST ROADSIDE CANTINA   21212        Rosebank               4
#1324:                   ZIASCOS   21231 Washington Hill               1
#1325:          ZINK'S CAF\u0090   21213   Belair-Edison              13
#1326:              ZISSIMOS BAR   21211         Hampden               7
#1327:                    ZORBAS   21224       Greektown               2
#       policeDistrict                          Location 1
#    1:   NORTHEASTERN   4509 BELAIR ROAD\nBaltimore, MD\n
#    2:   SOUTHEASTERN      1919 FLEET ST\nBaltimore, MD\n
#    3:   SOUTHEASTERN     2844 HUDSON ST\nBaltimore, MD\n
#    4:       NORTHERN    3998 ROLAND AVE\nBaltimore, MD\n
#    5:   SOUTHWESTERN 2481 frederick ave\nBaltimore, MD\n
#   ---                                                   
#1323:       NORTHERN       5916 YORK RD\nBaltimore, MD\n
#1324:   SOUTHEASTERN      1313 PRATT ST\nBaltimore, MD\n
#1325:   NORTHEASTERN  3300 LAWNVIEW AVE\nBaltimore, MD\n
#1326:       NORTHERN       1023 36TH ST\nBaltimore, MD\n
#1327:   SOUTHEASTERN   4710 EASTERN Ave\nBaltimore, MD\n
# Warning message:
# In fread("restaurants.csv", verbose = TRUE) :
#   Read less rows (1327) than were allocated (1607). Run again with verbose=TRUE and please report.

sessionInfo()
# R version 3.1.3 (2015-03-09)
# Platform: x86_64-apple-darwin13.4.0 (64-bit)
# Running under: OS X 10.9.5 (Mavericks)
# 
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
# 
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
# [1] data.table_1.9.5 devtools_1.7.0  
# 
# loaded via a namespace (and not attached):
# [1] bitops_1.0-6   chron_2.3-45   httr_0.6.1     plyr_1.8.1     Rcpp_0.11.5   
# [6] RCurl_1.95-4.5 reshape2_1.4.1 stringr_0.6.2  tools_3.1.3
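
For what it's worth, here is a rough sketch of why a raw newline count can overshoot the number of records in a file like this one. It uses base R only and a made-up three-record sample shaped like the rows printed above; the sample and all variable names are assumptions for illustration, and it does not try to reproduce fread's internal row estimate. The last two lines check the verbose log's eol count against the rows read, assuming every record has two embedded newlines like the ten rows shown.

```r
# Hypothetical sample: the last field is quoted and holds a comma plus two
# embedded newlines, like the "Location 1" column in the output above.
csv_text <- paste0(
  'name,zipCode,Location 1\n',
  '410,21206,"4509 BELAIR ROAD\nBaltimore, MD\n"\n',
  '1919,21231,"1919 FLEET ST\nBaltimore, MD\n"\n',
  'SAUTE,21224,"2844 HUDSON ST\nBaltimore, MD\n"\n'
)

# Raw counts that ignore quoting.
n_eol <- lengths(regmatches(csv_text, gregexpr("\n", csv_text, fixed = TRUE)))
n_sep <- lengths(regmatches(csv_text, gregexpr(",", csv_text, fixed = TRUE)))

# Quote-aware parse with base R: embedded newlines stay inside their field.
parsed <- read.csv(text = csv_text, stringsAsFactors = FALSE, check.names = FALSE)
n_records <- nrow(parsed)

# Same arithmetic on the figures from the verbose log: one header line plus
# three newlines per record (assumed: two embedded, one terminating).
rows_read <- 1327
expected_eol <- 1 + rows_read * 3
```

Here `n_eol` is 10 and `n_sep` is 11 while only 3 records are parsed, and `expected_eol` comes out at 3982, which matches the "Count of eol" line in the log.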
@arunsrinivasan
Member

The issue is exactly the same as #1239. See there for the cause.

PS: remember to also credit #1239 and #1201.
