Add a numpy engine for reading using numpy.genfromtxt() #452

dcslagel · 2021-04-18T20:33:59Z

Description:

This draft branch implements numpy.genfromtxt() as an alternate data reader. It is work toward issue #446, "Use an accelerated numpy or pandas reader".

It builds on the work in "Add data section reader which uses pandas.read_csv" #450.
It defaults to engine="numpy" for testing purposes.
It adds a separate function: read_data_section_iterative_numpy_engine().
When a file indicates that it is wrapped, it is read with engine="normal" instead.
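
As a rough illustration of what the numpy engine relies on, here is a minimal sketch of numpy.genfromtxt() reading a whitespace-delimited, unwrapped data section. The sample values and variable names are made up for the example and are not taken from lasio's reader.py.

```python
import io

import numpy as np

# A made-up, unwrapped data section: one depth step per line, four curves.
data_section = """\
1670.000   123.450 2550.000    0.450
1669.875   123.450 2550.000    0.450
1669.750   123.450 2550.000    0.450
"""

# genfromtxt splits on whitespace by default and returns a 2-D float array.
data = np.genfromtxt(io.StringIO(data_section))

# Transpose so that each row holds the values of one curve.
curves = data.T

print(data.shape)
print(curves[0])
```

Because the whole section is parsed in one call rather than token by token in Python, this is where the speed gain would be expected to come from.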

Test Results:

Highlights:

The speed test is about 2.9x faster than the master branch (mean of 370 ms versus 1,090 ms).
Overall test coverage drops slightly, by 1 percentage point, from 86% to 85%.
There are currently 21 test failures. These generally divide into AssertionErrors and lasio.exceptions.LASDataErrors.

---------- coverage: platform darwin, python 3.9.4-final-0 -----------
Name                       Stmts   Miss  Cover
----------------------------------------------
lasio/__init__.py             13      2    85%
lasio/convert_version.py      20     20     0%
lasio/defaults.py             11      0   100%
lasio/examples.py             42     10    76%
lasio/excel.py                88     34    61%
lasio/exceptions.py            6      0   100%
lasio/las.py                 451     65    86%
lasio/las_items.py           199     29    85%
lasio/las_version.py          50     14    72%
lasio/reader.py              446     45    90%
lasio/writer.py              171      9    95%
----------------------------------------------
TOTAL                       1497    228    85%

Benchmark comparison with Master branch: numpy-genfromtxt-explore is about 2.9x faster (mean of 370 ms versus 1,090 ms)

------------------------------------------------------------------------------------------------- benchmark: 2 tests -------------------------------------------------------------------------------------------------
Name (time in ms)                                  Min                   Max                  Mean            StdDev                Median               IQR            Outliers     OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_read_v12_sample_big (NOW)                366.4775 (1.0)        373.0084 (1.0)        370.3707 (1.0)      2.9769 (1.68)       371.9692 (1.0)      5.1055 (1.62)          1;0  2.7000 (1.0)           5           1
test_read_v12_sample_big (0001_1c1220f)     1,087.4714 (2.97)     1,091.4720 (2.93)     1,089.7887 (2.94)     1.7711 (1.0)      1,090.2695 (2.93)     3.1479 (1.0)           1;0  0.9176 (0.34)          5           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Benchmark comparison with Pandas_readcsv branch: Pandas_read_csv is about 2x faster (mean of 180 ms versus 362 ms)

--------------------------------------------------------------------------------------------- benchmark: 2 tests ---------------------------------------------------------------------------------------------
Name (time in ms)                                Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_read_v12_sample_big (0001_bfa11ec)     176.7494 (1.0)      184.5382 (1.0)      180.4695 (1.0)      2.8277 (1.0)      180.5998 (1.0)      3.2284 (1.0)           2;0  5.5411 (1.0)           5           1
test_read_v12_sample_big (NOW)              358.8873 (2.03)     367.7130 (1.99)     361.8933 (2.01)     3.7042 (1.31)     360.7237 (2.00)     5.4340 (1.68)          1;0  2.7632 (0.50)          5           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
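
The ratios shown in parentheses in the two tables can be reproduced from the Mean column. The snippet below is only a worked check of that arithmetic; the numbers are copied from the tables and the variable names are made up.

```python
# Mean times in milliseconds, copied from the two benchmark tables.
master_mean = 1089.7887
numpy_mean_vs_master = 370.3707
pandas_mean = 180.4695
numpy_mean_vs_pandas = 361.8933

# How many times faster the numpy branch is than master.
speedup_over_master = master_mean / numpy_mean_vs_master

# How many times slower the numpy branch is than the pandas branch.
slowdown_vs_pandas = numpy_mean_vs_pandas / pandas_mean

# pytest-benchmark reports OPS as 1 / Mean, with Mean in seconds.
numpy_ops = 1000.0 / numpy_mean_vs_master

print(f"{speedup_over_master:.2f}x faster than master")
print(f"{slowdown_vs_pandas:.2f}x slower than pandas.read_csv")
print(f"{numpy_ops:.4f} operations per second")
```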

This commit adds the normal data section reader back into reader.py and also adds some keyword arguments to LASFile.read.

The behaviour is unchanged if engine == "normal" (default). If engine == "pandas", then:

- If the file is wrapped:
  - if pandas_engine_wrapped_error is True (default), a LASDataError exception is raised;
  - if False, a logger.warning message is emitted.
- The data section is then read using reader.py:read_data_section_iterative_pandas_engine().
- If an exception is raised in that function:
  - if pandas_engine_error == "retry" (default), the data section is re-read by the normal parser;
  - otherwise, if it is "error", the exception is raised.

One problem is that pd.read_csv doesn't always raise an exception as we'd perhaps like it to.

This checkin is a quick hack to get an initial view of using numpy.genfromtxt() for importing data sections. It is based on the pandas-readcsv branch content and makes the following changes:

- Set 'pandas' as the default engine. This is so we can run all the current tests with 'pandas' (actually numpy.genfromtxt()) and get an initial view of any test failures.
- Replace the actual 'pandas.read_csv(...)' call with numpy.genfromtxt().

kinverarity1 · 2021-04-21T13:11:10Z

Reverting to the normal engine for wrapped files brings it down to 18 test failures.

---------------------------------------------------- benchmark: 1 tests ---------------------------------------------------
Name (time in ms)                 Min       Max      Mean   StdDev    Median      IQR  Outliers     OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------
test_read_v12_sample_big     436.3694  468.9999  450.1731  13.2622  448.1868  21.1479       2;0  2.2214       5           1
---------------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=========================== short test summary info ============================
FAILED tests/test_null_policy.py::test_null_policy_9999_aggressive - Assertio...
FAILED tests/test_null_policy.py::test_null_policy_9999_all - AssertionError:...
FAILED tests/test_null_policy.py::test_null_policy_custom_1_caught_9998 - Ass...
FAILED tests/test_null_policy.py::test_null_policy_custom_2 - AssertionError:...
FAILED tests/test_null_policy.py::test_null_policy_ERR_strict - AssertionErro...
FAILED tests/test_null_policy.py::test_null_policy_runon_replaced_1 - lasio.e...
FAILED tests/test_null_policy.py::test_null_policy_runon_replaced_2 - lasio.e...
FAILED tests/test_null_policy.py::test_null_policy_runon_ok_1 - lasio.excepti...
FAILED tests/test_null_policy.py::test_null_policy_runon_ok_2 - lasio.excepti...
FAILED tests/test_null_policy.py::test_null_policy_small_non_zero_neg_nums - ...
FAILED tests/test_read.py::test_comma_decimal_mark_data - assert nan == 123.42
FAILED tests/test_read.py::test_missing_a_section - lasio.exceptions.LASDataE...
FAILED tests/test_read.py::test_blank_line_in_header - lasio.exceptions.LASDa...
FAILED tests/test_read.py::test_data_characters_1 - AssertionError: assert na...
FAILED tests/test_read.py::test_data_characters_2 - AssertionError: assert na...
FAILED tests/test_read.py::test_data_characters_types - AssertionError: asser...
FAILED tests/test_read.py::test_read_incorrect_shape - lasio.exceptions.LASDa...
FAILED tests/test_read.py::test_quoted_substrings_in_data_section - lasio.exc...
============ 18 failed, 217 passed, 2 skipped, 1 warning in 10.15s =============

dcslagel · 2021-04-22T21:35:34Z

Remaining test failures:

FAILED tests/test_null_policy.py::test_null_policy_9999_aggressive - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_9999_all - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_custom_1_caught_9998 - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_custom_2 - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_ERR_strict - AssertionError: assert nan == 'ERR'
FAILED tests/test_null_policy.py::test_null_policy_small_non_zero_neg_nums - AssertionError: assert False
FAILED tests/test_read.py::test_comma_decimal_mark_data - assert nan == 123.42
FAILED tests/test_read.py::test_data_characters_1 - AssertionError: assert nan == '00:00:00'
FAILED tests/test_read.py::test_data_characters_2 - AssertionError: assert nan == '01-Jan-20'
FAILED tests/test_read.py::test_data_characters_types - AssertionError: assert False
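
The `assert nan == ...` failures above are consistent with how numpy.genfromtxt() treats tokens it cannot convert: with the default float dtype, unparseable fields become nan instead of raising. The sketch below uses made-up rows that mimic the time, date and comma-decimal cases; it does not reproduce the actual test fixtures.

```python
import io

import numpy as np

# Made-up rows mixing a numeric column with time, date and comma-decimal tokens.
text = "1.0 00:00:00 123,42\n2.0 01-Jan-20 5.5\n"

# With the default float dtype, tokens that float() cannot parse become nan.
data = np.genfromtxt(io.StringIO(text))

print(data)
```

So a test that expects the string '00:00:00' or the value 123.42 back from such a field will see nan until the engine handles those cases explicitly or hands the file to the normal engine.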

Make these temporary changes to enable integrating the numpy-engine:

- Route "aggressive" and "all" null_policies to the normal-engine.
- Set tests that fail for numpy-engine to XFAIL. These tests will continue to pass for the normal-engine.
- First draft of using genfromtxt's usemask and missing_values to align functionality with the normal-engine. This needs follow-up work.
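
For readers unfamiliar with those two genfromtxt() arguments, here is a minimal sketch of how they interact. The null value -999.25 and the sample rows are assumptions chosen for illustration; this is not the code from the commit.

```python
import io

import numpy as np

# Made-up data section in which -999.25 stands for a missing reading.
text = "1.0 -999.25 3.0\n2.0 5.5 -999.25\n"

# Passing missing_values as a string applies it to every column.
# usemask=True returns a masked array with the matching fields masked.
data = np.genfromtxt(
    io.StringIO(text),
    missing_values="-999.25",
    usemask=True,
)

print(data.mask)

# Convert to a plain array with nan in the masked positions.
filled = data.filled(np.nan)
print(filled)
```

Note that missing_values is compared against the raw text of each field, so a token written as -999.2500 would not match "-999.25". That may be one reason why aligning this with the normal-engine's null policies needs more work.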

dcslagel · 2021-04-23T23:17:21Z

@Boorhin and I tried to resolve the remaining test failures for the numpy-engine, but they are going to take more time. So the latest checkin on dcslagel:numpy-genfromtxt-explore consists of temporary workarounds (routing some null_policy keys to the normal-engine) and marking the failing tests with pytest.mark.xfail(). Xfail enables these tests to run and pass for the normal-engine but be ignored when tested with the numpy-engine. Run pytest -rxXs to see these tests listed with a todo comment.

I think we should squash this branch to one commit, set the default engine to 'normal' and merge it. That will enable folks to use the faster numpy engine when they configure for it. Then we can follow up by finishing the work of resolving the remaining test failures. How do you feel about this approach?

Remaining failing tests:

FAILED tests/test_null_policy.py::test_null_policy_ERR_strict - AssertionError: assert nan == 'ERR'
FAILED tests/test_read.py::test_comma_decimal_mark_data - assert nan == 123.42
FAILED tests/test_read.py::test_data_characters_1 - AssertionError: assert nan == '00:00:00'
FAILED tests/test_read.py::test_data_characters_2 - AssertionError: assert nan == '01-Jan-20'
FAILED tests/test_read.py::test_data_characters_types - AssertionError: assert False

Thanks!,
DC

kinverarity1 · 2021-04-24T03:29:02Z

Yes, I think that's a good approach. I still would like to change the read and null policy and substitutions approach to align with a future release where the numpy engine is default, but I'll do that in a separate PR. Thanks @dcslagel and @Boorhin for doing all this! 🎉

Boorhin · 2021-04-24T06:59:47Z

So while I was waking up, I worked out what you could do in the short term to fulfil the null policies: build a combined mask, something like polmask = np.array([array == X for X in NULL_POLICIES["all"]]).any(axis=0), and then apply it with np.ma.masked_where(polmask, array).

It is not recommended to have NaN in an array; the strategy is to mask the values you don't need, because NaN can create unexpected results in calculations. To mask invalid values there is a function, np.ma.masked_invalid or something like that.

The other thing is that there will be only one NULL value per object. In a vectorised approach it makes no sense to predefine NULL; you have to test whether the masking works, but it is your object that should be tested, not a bunch of variables.

I am currently writing a parser for you; it may take a while as I am thinly stretched at the moment, but it aims at parsing all the LAS versions. Cheers
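
The masking idea can be sketched roughly as follows. The list of null values here is hypothetical and is not lasio's NULL_POLICIES table; np.isin is used in place of the per-value comparison because it builds the combined mask in one call.

```python
import numpy as np

# Hypothetical sentinel values, for illustration only.
null_values = [-999.25, -9999.0, 9999.25]

curve = np.array([1670.0, -999.25, 2550.0, 9999.25, np.nan])

# Mask NaN (and inf) first.
masked = np.ma.masked_invalid(curve)

# Build one boolean mask covering every sentinel, then merge it in.
polmask = np.isin(curve, null_values)
masked = np.ma.masked_where(polmask, masked)

print(masked)

# Reductions skip the masked entries instead of propagating NaN.
print(masked.mean())
```

The original values stay in the underlying array, so a different null policy could be applied later by rebuilding the mask rather than re-reading the file.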

kinverarity1 · 2021-04-25T04:21:21Z

@dcslagel I've been working on the other major data-section-reader issue in PR #461, and it will conflict with this PR as we are working in the same part of the code. Happy to have a chat either here or there about which PR we merge first (this one or that one). I'm happy to do the work of merging them.

dcslagel · 2021-04-25T19:40:29Z

@kinverarity1,

I put one small change request in a review for #461. Other than that, #461 should go right in to the master/main branch without waiting for this pull-request.

More thoughts about this pull-request (452):

I've had second thoughts about rushing to get it into the master/main branch without all tests really passing. More generally, my concern is that adding another reader-engine will increase the maintenance workload.

Here is the approach I'm currently leaning toward.

1. Revert 528bd81 "Numpy-engine temp workarounds for failing tests" (and a few other commits, back to just after d3cea21 "Remove pandas reader code").
2. Merge "Allow different data types per curve in data section reader" #461 to master/main.
3. Merge master/main to this pull-request: 452.
4. Continue to work through the failing tests on 452 (syncing with master/main as needed). Once they are all passing, assess whether we can completely replace the normal-reader engine with the numpy-reader engine. Or, if we run into tests that we are not able to configure the numpy-reader to pass, continue to look at and work on other options for improving read performance while maintaining Lasio capability and stability.

Does this seem like a good set of next steps? Or are there other steps that will work?

kinverarity1 · 2021-04-26T00:14:29Z

I agree, let's do that!

dcslagel · 2021-05-02T21:20:44Z

The current commit accomplishes steps 1-3 in #452 (comment).

Here is the current set of 23 failing tests.
These generally divide into AssertionErrors and lasio.exceptions.LASDataErrors.

======================================================= short test summary info ========================================================
FAILED tests/test_null_policy.py::test_null_policy_9999_aggressive - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_9999_all - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_custom_1_caught_9998 - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_custom_2 - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_ERR_strict - AssertionError: assert nan == 'ERR'
FAILED tests/test_null_policy.py::test_null_policy_runon_replaced_1 - lasio.exceptions.LASDataError: 
FAILED tests/test_null_policy.py::test_null_policy_runon_replaced_2 - lasio.exceptions.LASDataError: 
FAILED tests/test_null_policy.py::test_null_policy_runon_ok_1 - lasio.exceptions.LASDataError: 
FAILED tests/test_null_policy.py::test_null_policy_runon_ok_2 - lasio.exceptions.LASDataError: 
FAILED tests/test_null_policy.py::test_null_policy_small_non_zero_neg_nums - AssertionError: assert False
FAILED tests/test_read.py::test_comma_decimal_mark_data - assert nan == 123.42
FAILED tests/test_read.py::test_missing_a_section - lasio.exceptions.LASDataError: 
FAILED tests/test_read.py::test_blank_line_in_header - lasio.exceptions.LASDataError: 
FAILED tests/test_read.py::test_issue92 - TypeError: 'numpy.float64' object does not support item assignment
FAILED tests/test_read.py::test_data_characters_1 - AssertionError: assert nan == '00:00:00'
FAILED tests/test_read.py::test_data_characters_2 - AssertionError: assert nan == '01-Jan-20'
FAILED tests/test_read.py::test_data_characters_types - AssertionError: assert False
FAILED tests/test_read.py::test_read_incorrect_shape - lasio.exceptions.LASDataError: 
FAILED tests/test_read.py::test_quoted_substrings_in_data_section - lasio.exceptions.LASDataError: 
FAILED tests/test_read.py::test_sample_dtypes_specified - assert False
FAILED tests/test_read.py::test_sample_dtypes_specified_as_dict - assert False
FAILED tests/test_read.py::test_sample_dtypes_specified_as_false - assert False
FAILED tests/test_write.py::test_write_single_step - TypeError: 'numpy.float64' object does not support item assignment
====================================== 23 failed, 218 passed, 2 skipped, 8648 warnings in 10.36s =======================================
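
The two `TypeError: 'numpy.float64' object does not support item assignment` failures look like they could come from files with a single data row, which would fit the later commit "Handle single row data in np.genfromtxt"; that link is an assumption, not something confirmed in this thread. What is certain is that genfromtxt() returns a 1-D array for a single row, as this sketch with made-up values shows:

```python
import io

import numpy as np

# A data section with only one row of three curves.
single = np.genfromtxt(io.StringIO("1670.0 123.45 2550.0\n"))
print(single.shape)  # 1-D, not (1, 3)

# Code that expects 2-D data gets a scalar from the first index...
first = single[0]
try:
    first[0] = 0.0  # ...and item assignment on a scalar fails.
except TypeError as exc:
    message = str(exc)
    print(message)

# One possible remedy: force the result back to two dimensions.
fixed = np.atleast_2d(single)
print(fixed.shape)
```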

dcslagel · 2021-05-26T23:13:06Z

Commit 0f8b1bf

======================================================== short test summary info =========================================================
FAILED tests/test_null_policy.py::test_null_policy_9999_aggressive - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_9999_all - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_custom_1_caught_9998 - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_custom_2 - AssertionError: assert False
FAILED tests/test_null_policy.py::test_null_policy_ERR_strict - AssertionError: assert nan == 'ERR'
FAILED tests/test_null_policy.py::test_null_policy_runon_replaced_1 - lasio.exceptions.LASDataError: Traceback (most recent call last):
FAILED tests/test_null_policy.py::test_null_policy_runon_replaced_2 - lasio.exceptions.LASDataError: Traceback (most recent call last):
FAILED tests/test_null_policy.py::test_null_policy_runon_ok_1 - lasio.exceptions.LASDataError: Traceback (most recent call last):
FAILED tests/test_null_policy.py::test_null_policy_runon_ok_2 - lasio.exceptions.LASDataError: Traceback (most recent call last):
FAILED tests/test_null_policy.py::test_null_policy_small_non_zero_neg_nums - AssertionError: assert False
FAILED tests/test_read.py::test_comma_decimal_mark_data - assert nan == 123.42
FAILED tests/test_read.py::test_missing_a_section - lasio.exceptions.LASDataError: Traceback (most recent call last):
FAILED tests/test_read.py::test_blank_line_in_header - lasio.exceptions.LASDataError: Traceback (most recent call last):
FAILED tests/test_read.py::test_data_characters_1 - AssertionError: assert nan == '00:00:00'
FAILED tests/test_read.py::test_data_characters_2 - AssertionError: assert nan == '01-Jan-20'
FAILED tests/test_read.py::test_data_characters_types - AssertionError: assert False
FAILED tests/test_read.py::test_read_incorrect_shape - lasio.exceptions.LASDataError: Traceback (most recent call last):
FAILED tests/test_read.py::test_quoted_substrings_in_data_section - lasio.exceptions.LASDataError: Traceback (most recent call last):
FAILED tests/test_read.py::test_sample_dtypes_specified - assert False
FAILED tests/test_read.py::test_sample_dtypes_specified_as_dict - assert False
FAILED tests/test_read.py::test_sample_dtypes_specified_as_false - assert False
=============================================== 21 failed, 220 passed, 2 skipped in 10.42s ===============================================

dcslagel · 2021-06-11T21:33:21Z

@kinverarity1 , @Boorhin, @donald-keighley,

Commit 468a0c6 passes all the tests and retains the speed for the big file read. I wasn't able to completely replace the 'normal' parser with a 'numpy' parser for several reasons:

1. Only the normal parser handles wrapped files.
2. When there is a custom null_policy, I haven't been able to work out how to get numpy to use the custom policy. It might be doable, but I haven't figured it out so far...
3. When there are custom dtypes (non-'auto'), numpy's performance degrades to the level of the normal parser.

So the current program flow in this pull-request is:

1. Set the default parser to be the numpy parser.
2. Check for conditions 1-3 above and, if any of them is true, fall back to the normal parser.
3. If the parser is still the numpy parser, attempt to run with it. If an exception is thrown, retry with the normal parser.
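
The flow above could be sketched roughly like this. All function and parameter names are hypothetical stand-ins, not lasio's actual API, and the "normal" parser here is a deliberately simplified substitute that just splits every token and reshapes.

```python
import io

import numpy as np


def read_normal(text, n_curves):
    """Stand-in for the normal parser (hypothetical simplification).

    It ignores line breaks entirely, so wrapped lines are tolerated.
    """
    values = [float(token) for token in text.split()]
    return np.array(values).reshape(-1, n_curves)


def read_numpy(text, n_curves):
    """Fast path: np.genfromtxt raises ValueError on ragged rows."""
    return np.genfromtxt(io.StringIO(text)).reshape(-1, n_curves)


def read_data_section(text, n_curves, wrapped=False,
                      custom_null_policy=False, custom_dtypes=False):
    """Return (data, engine_used) following the three-step flow."""
    # Step 1: numpy is the default engine.
    engine = "numpy"
    # Step 2: any of the three conditions routes to the normal parser.
    if wrapped or custom_null_policy or custom_dtypes:
        engine = "normal"
    # Step 3: try numpy, and retry with the normal parser on any exception.
    if engine == "numpy":
        try:
            return read_numpy(text, n_curves), "numpy"
        except Exception:
            engine = "normal"
    return read_normal(text, n_curves), engine


data, used = read_data_section("1 2 3\n4 5 6\n", n_curves=3)
print(used, data.shape)

data, used = read_data_section("1 2\n3\n4 5\n6\n", n_curves=3, wrapped=True)
print(used, data.shape)
```

The cost of this design is that a file which fails in the numpy parser is effectively read twice, which seems an acceptable trade for files that are well formed.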

This adds some complexity and means maintaining both parsers, but it obtains the speed improvement for many well-formed LAS files.

Current Test Results:

Name                       Stmts   Miss  Cover
----------------------------------------------
lasio/__init__.py             13      2    85%
lasio/convert_version.py      20     20     0%
lasio/defaults.py             11      0   100%
lasio/examples.py             42     10    76%
lasio/excel.py                88     34    61%
lasio/exceptions.py            6      0   100%
lasio/las.py                 457     65    86%
lasio/las_items.py           199     29    85%
lasio/las_version.py          50     14    72%
lasio/reader.py              446     28    94%
lasio/writer.py              171      9    95%
----------------------------------------------
TOTAL                       1503    211    86%
Coverage XML written to file coverage.xml


--------------------------------------------------- benchmark: 1 tests ------------------------------
Name (time in ms)                 Min       Max      Mean  StdDev    Median     IQR  Outliers     OPS
-----------------------------------------------------------------------------------------------------
test_read_v12_sample_big     335.3062  353.3825  341.6552  6.8935  339.8994  6.2280       1;0  2.9269
-----------------------------------------------------------------------------------------------------

--
Let me know if this change could be accepted (or rejected) or
needs some additional changes to be approved and merged.

Thank you,
DC

kinverarity1

Thanks @dcslagel for getting this to mergeable state! 🎆

I think we should go ahead with this - there are ways to improve but at least it gives people a default boost in speed for most files.

dcslagel · 2021-06-28T16:44:26Z

Okay, proceeding with the merge. First, I tested the merge in my local environment. All tests pass and the speed test is:

test_read_v12_sample_big 332.0824

Finalizing the merge via the GitHub interface so that it will be signed with GitHub's verification signature.

Thanks,
DC

Boorhin · 2021-06-29T17:42:22Z

Congratulations guys!
I would have liked to be more available, but I have a big project for now. However, I learned a few things in numpy that may help for LASIO.

kinverarity1 and others added 9 commits April 17, 2021 21:32

Replace data section reader with pandas.read_csv

7b4c45f

Fix GH CI bug?

4246ce1

Rebase and make separate engine for 'numpy' reader

57bf80f

Remove pandas engine

f5dc804

Use normal engine for wrapped files

b65a0dd

Format code with black

0cbf3ed

Remove pandas reader code

d3cea21

Update from branch 'master' to numpy-genfromtxt-explore

6d4c0af

This was referenced Apr 21, 2021

Drop regexp subs #454

Closed

Add data section reader which uses pandas.read_csv #450

Closed

dcslagel changed the title ~~Experimental exploration with numpy.genfromtxt~~ Add a numpy engine for reading using numpy.genfromtxt() Apr 22, 2021

dcslagel added the data-section-parser A bug or enhancement relating to the data section parser label Apr 22, 2021

Handle numpy-engine data read exceptions

a614c1c

- If numpy-engine throws an exception on data-read then retry with the normal engine.
- Remove '_iterative' from the names of the data-read engine functions.

kinverarity1 mentioned this pull request Apr 23, 2021

Read data section as dataframe #424

Closed

dcslagel force-pushed the numpy-genfromtxt-explore branch from df91e29 to 528bd81 Compare April 23, 2021 23:08

kinverarity1 mentioned this pull request Apr 24, 2021

Use an accelerated numpy reader (np.genfromtxt) #446

Closed

kinverarity1 mentioned this pull request Apr 24, 2021

quick and dirty #448

Closed

change 'f' formating to oldstyle for python 3.5

db551e8

kinverarity1 mentioned this pull request Apr 25, 2021

Allow different data types per curve in data section reader #461

Merged

kinverarity1 mentioned this pull request Apr 26, 2021

Support reading and writing all LAS 3.0 features #5

Open

9 tasks

dcslagel added 5 commits May 1, 2021 10:02

Merge-Squash numpy-genfromtext-explore on merge-base

0cb8d5b

Sync most of numpy-engine work to current master 91f8eab

e276941

Interm checkin add numpy-engine but hangs on a test

39117b3

This merge hangs on the following test. tests/test_enhancements.py::test_autodepthindex_point_one_inch

Fix merge issues

b8b5bcb

- Remove unneeded 'run_normal_engine'
- remove remove_data_line_filter
- Move curve_data_gen Transform to numpy-eng
- Transpose curve_data_gen to pass test_autodepthindex

Sync with master's 91f8eab commit

ac9e125

- reshape-in-data-reader
- Add test for not replacing NULL in index curve
- Enable writing empty LAS file
- Fix rounding issue when writing LAS file
- Add Gitter Badge

dcslagel mentioned this pull request May 2, 2021

Numpy engine dc5 #464

Closed

Handle single row data in np.genfromtxt

058b1bf

dcslagel added 2 commits June 2, 2021 13:58

Return to exception fallback to old parser

b54f538

Add conditions for falling back to the normal parser

468a0c6

dcslagel marked this pull request as ready for review June 11, 2021 21:33

dcslagel requested a review from kinverarity1 June 11, 2021 21:33

kinverarity1 approved these changes Jun 27, 2021

dcslagel merged commit f62686e into kinverarity1:master Jun 28, 2021

dcslagel deleted the numpy-genfromtxt-explore branch July 20, 2022 17:33
