All the datasets in Brasil.IO will use the datapackage specification (for more info, see this milestone) and I think it could be the default way to access data in Serenata also (there are libraries to deal with it automatically so we don't need to create converters, just the datapackage spec). What do you think?
What is the problem?
Dealing with the CSV generated by the toolbox is not trivial: before calling pd.read_csv we need to define a lot of dtypes, and in Jarbas we spent a bunch of lines of code deserializing data (converting strings to date objects, integers and floats).
How can this be addressed?
@turicas and I talked today and he suggested that the toolbox could offer an API not only to generate a CSV version of our datasets, but also a high level iterator for them. Something like:
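A minimal sketch of what such an iterator could look like, not the toolbox's actual API: the function name `iter_rows`, the `SCHEMA` mapping and the column names are all hypothetical, and the sample data is made up for illustration. It only uses the standard library and treats empty cells as `None`.

```python
import csv
import io
from datetime import date
from decimal import Decimal

# Hypothetical schema: column name -> converter. In practice this could be
# derived from a dataset definition instead of being written by hand.
SCHEMA = {
    "document_id": int,
    "issue_date": date.fromisoformat,
    "net_value": Decimal,
}


def iter_rows(csv_file, schema=SCHEMA):
    """Yield one dict per CSV row, converting each value per the schema.

    Columns missing from the schema stay as strings; empty cells become None.
    """
    for row in csv.DictReader(csv_file):
        yield {
            name: schema.get(name, str)(value) if value != "" else None
            for name, value in row.items()
        }


# Made-up sample standing in for a CSV file generated by the toolbox.
sample = io.StringIO(
    "document_id,issue_date,net_value,supplier\n"
    "1,2016-08-01,12.50,ACME\n"
    "2,2016-08-02,,Other\n"
)
rows = list(iter_rows(sample))
```

Because the conversion lives in one place, callers such as Jarbas would not need their own deserialization code; they would just loop over `iter_rows(...)`.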
And the output would be an object with proper types (int, Decimal, date etc.).
Who could help with this issue?
@turicas ; )