Parquet to CSV convert problem #291
Comments
hi, @ahmadhega
@xitongsys, I sent you more details by mail ([email protected]).
hi, @ahmadhega

package main

import (
	"log"

	"github.com/xitongsys/parquet-go-source/local"
	"github.com/xitongsys/parquet-go/reader"
)

// Shoes describes one row of shoes_orders.parquet.
// Pointer fields are used so that null values can be represented.
type Shoes struct {
	Name *string `parquet:"name=Name, type=UTF8, encoding=PLAIN_DICTIONARY"`
	Size *int64  `parquet:"name=Size, type=INT32, encoding=PLAIN"`
}

func main() {
	// Open the Parquet file for reading.
	fr, err := local.NewLocalFileReader("shoes_orders.parquet")
	if err != nil {
		log.Println("Can't open file", err)
		return
	}
	// Create a reader that decodes rows into Shoes values (parallelism 1).
	pr, err := reader.NewParquetReader(fr, new(Shoes), 1)
	if err != nil {
		log.Println("Can't create parquet reader", err)
		return
	}
	num := int(pr.GetNumRows())
	stus := make([]Shoes, num) // read all rows in the file
	if err = pr.Read(&stus); err != nil {
		log.Println("Read error", err)
	}
	log.Println(stus)
	pr.ReadStop()
	fr.Close()
}
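Since the goal of the issue is a CSV file, here is a minimal sketch of how rows like the ones read above could be written out with the standard `encoding/csv` package. The `Shoes` struct is a simplified copy (parquet tags omitted), and `shoesCSV`, the header names, and the choice to write nulls as empty fields are assumptions made for illustration, not part of parquet-go.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"strconv"
	"strings"
)

// Shoes is a simplified copy of the struct above, without parquet tags.
type Shoes struct {
	Name *string
	Size *int64
}

// shoesCSV (hypothetical helper) renders rows as CSV text.
// Nil pointers, i.e. null values, become empty fields.
func shoesCSV(rows []Shoes) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	if err := w.Write([]string{"Name", "Size"}); err != nil {
		return "", err
	}
	for _, r := range rows {
		name, size := "", ""
		if r.Name != nil {
			name = *r.Name
		}
		if r.Size != nil {
			size = strconv.FormatInt(*r.Size, 10)
		}
		if err := w.Write([]string{name, size}); err != nil {
			return "", err
		}
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	name, size := "boot", int64(42)
	// The second row has no values, as a row of nulls would.
	out, err := shoesCSV([]Shoes{{Name: &name, Size: &size}, {}})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(out)
}
```

In a real conversion the slice filled by `pr.Read` would be passed in place of the hand-built rows, and the output would go to a file rather than a string.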
zolstein pushed a commit to zolstein/parquet-go that referenced this issue on Jun 23, 2023:
…itongsys#289)
* refactor packages to use encoding.Values container
* refactor page and dictionary creation to use encoding.Values
* go vet fix
* reduce memory footprint of encoding.Values
* refactor encoding.Encoding to use simple Go types
* port parquet-go package to use pair of values+offsets to represent byte arrays
* add fuzz tests back
* optimize DELTA_LENGTH_BYTE_ARRAY decoding (xitongsys#291)
* optimize DELTA_LENGTH_BYTE_ARRAY decoding
* add link to online documentation
* fix
* add a unit test for decodeByteArrayLengths
* Update encoding/delta/length_byte_array_amd64.s
Co-authored-by: Kevin Burke <[email protected]>
* optimize DELTA_LENGTH_BYTE_ARRAY encoding (xitongsys#292)
Co-authored-by: Kevin Burke <[email protected]>
* account for size of offsets buffer when benchmarking throughput
* optimize DELTA_BYTE_ARRAY decoding (xitongsys#294)
* PR feedback
Co-authored-by: Kevin Burke <[email protected]>
Hi,
Creating a Parquet file from a CSV file with the parquet-go library and converting it back with the same library works for me. But when I create the Parquet file with a Python tool:
import pandas as pd
df = pd.read_csv("file.csv")
df.to_parquet("file.parquet")
converting it to a CSV file fails.
To be more specific, the function
(self *ParquetReader) Read(dstInterface interface{}) error
fails with the error: "runtime error: index out of range [74] with length 4". Any idea?