ENH: TrueFX Tick DataReader #153
Comments
I wonder if someone here knows why this takes so long to process. I'm trying to read 1 month of AUD/USD tick data, which is not that big.
In [26]: %time df=pd.read_csv("AUDUSD-2014-01.csv", names=['Symbol', 'Date', 'Bid', 'Ask'])
CPU times: user 3.31 s, sys: 481 ms, total: 3.79 s
Wall time: 4.13 s
In [27]: df
Out[27]:
Symbol Date Bid Ask
0 AUD/USD 20140101 21:55:34.404 0.88796 0.88922
1 AUD/USD 20140101 21:55:34.444 0.88805 0.88914
2 AUD/USD 20140101 21:55:34.475 0.88809 0.88910
3 AUD/USD 20140101 21:55:48.962 0.88811 0.88908
4 AUD/USD 20140101 21:56:38.293 0.88808 0.88887
... ... ... ... ...
1947101 AUD/USD 20140131 21:59:48.048 0.87525 0.87589
1947102 AUD/USD 20140131 21:59:54.599 0.87527 0.87589
1947103 AUD/USD 20140131 21:59:56.927 0.87531 0.87588
1947104 AUD/USD 20140131 21:59:59.365 0.87531 0.87574
1947105 AUD/USD 20140131 22:00:00.038 0.87531 0.87574
[1947106 rows x 4 columns]
In [28]: %time df['Date'] = pd.to_datetime(df['Date'])
CPU times: user 6min 27s, sys: 3.46 s, total: 6min 30s
Wall time: 6min 39s
Passing parse_dates to read_csv directly is no faster:
In [13]: %time df=pd.read_csv("AUDUSD-2014-01.csv", names=['Symbol', 'Date', 'Bid', 'Ask'], parse_dates=['Date'])
CPU times: user 7min 54s, sys: 4.65 s, total: 7min 59s
Wall time: 8min 32s
This is odd, because once the data is loaded, calculations on it are very quick:
In [48]: del df['Spread']
In [49]: %time df['Spread']=df['Ask']-df['Bid']
CPU times: user 17.9 ms, sys: 31.7 ms, total: 49.6 ms
Wall time: 28.1 ms
In [51]: %time df.resample(how='ohlc', rule='1D')
CPU times: user 67.2 ms, sys: 4.27 ms, total: 71.4 ms
Wall time: 70.5 ms
Out[51]:
Bid Ask Spread
open high low close open high low close open high low close
Date
2014-01-01 0.88796 0.88979 0.88755 0.88928 0.88922 0.88997 0.88825 0.88942 0.00126 0.00126 0 0.00014
2014-01-02 0.88913 0.89434 0.88427 0.89050 0.88949 0.89441 0.88436 0.89057 0.00036 0.00044 0 0.00007
2014-01-03 0.89046 0.90043 0.88846 0.89465 0.89057 0.90053 0.88857 0.89468 0.00011 0.00049 0 0.00003
2014-01-04 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2014-01-05 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... .. ...
2014-01-27 0.86945 0.87595 0.86773 0.87307 0.86958 0.87598 0.86780 0.87316 0.00013 0.00102 0 0.00009
2014-01-28 0.87304 0.88205 0.87299 0.87853 0.87316 0.88209 0.87308 0.87853 0.00012 0.00162 0 0.00000
2014-01-29 0.87844 0.88257 0.87246 0.87464 0.87877 0.88269 0.87253 0.87473 0.00033 0.00273 0 0.00009
2014-01-30 0.87462 0.88044 0.87102 0.87955 0.87473 0.88050 0.87110 0.87962 0.00011 0.00047 0 0.00007
2014-01-31 0.87952 0.88232 0.86944 0.87531 0.87962 0.88239 0.86948 0.87574 0.00010 0.00081 0 0.00043
[31 rows x 12 columns]
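A likely cause of the slow conversion is that pd.to_datetime has to work out the timestamp layout when no format is given. The sketch below is only an illustration: it assumes the file is comma-separated and that every timestamp follows the layout visible in the output above (for example 20140101 21:55:34.404), and it uses a tiny in-memory sample instead of the real AUDUSD-2014-01.csv. The last step uses the method-style resample call, since the how= keyword shown above was removed in later pandas versions.

```python
import io

import pandas as pd

# Tiny in-memory sample in the assumed TrueFX layout (comma-separated, no header).
# The rows are copied from the output shown in this thread.
sample = io.StringIO(
    "AUD/USD,20140101 21:55:34.404,0.88796,0.88922\n"
    "AUD/USD,20140101 21:55:34.444,0.88805,0.88914\n"
    "AUD/USD,20140131 21:59:59.365,0.87531,0.87574\n"
    "AUD/USD,20140131 22:00:00.038,0.87531,0.87574\n"
)

df = pd.read_csv(sample, names=["Symbol", "Date", "Bid", "Ask"])

# An explicit format avoids per-value format guessing during parsing.
df["Date"] = pd.to_datetime(df["Date"], format="%Y%m%d %H:%M:%S.%f")
df = df.set_index("Date")

df["Spread"] = df["Ask"] - df["Bid"]

# Daily OHLC bars on the numeric columns only; days without ticks come out as NaN.
ohlc = df[["Bid", "Ask", "Spread"]].resample("1D").ohlc()
print(ohlc.head())
```

On the real file, the same format string can be passed to pd.to_datetime after read_csv; whether it brings the conversion time down to an acceptable level would need to be measured with %time as above.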
... speeds up the conversion, but the whole process of downloading, reading, and converting is still very slow. Here are the results with 2 months of tick data for AUDUSD (using nose-timer, https://github.com/mahmoudimus/nose-timer):
When the data has been downloaded previously and stored in the SQLite cache:
Direct requests to URLs like http://www.truefx.com/dev/data/2014/JANUARY-2014/AUDUSD-2014-01.zip are now redirected. Users need to be registered. I see (at least) 2 solutions:
TrueFX (http://www.truefx.com/) provides free tick data.
It would be nice to add this data to DataReader.
See PR #152.
I'm still facing some issues, such as very long processing times.
Any help is welcome