ENH: Binary ngc file parser #830

ksunden · 2018-12-11T19:20:07Z

Still need:

populate label for variables
replace pass with meaningful warning when header is unexpected
distribute example files (those I have are currently unpublished and larger than I'd like, and they compress only mildly)
tests
Update doc rst files
~~ASCII text file parser (probably separate method, possibly separate PR entirely)~~ (The ASCII format seems untenable to me to parse in a particularly sane way, given the lack of metadata. It can be done, but it is not clear to me how to do so without either making lots of assumptions or adding so many parameters that it becomes difficult to use. A specific-case parser can be maintained out of tree more easily for now. I am open to discussion, and to being convinced that a user-friendly parser is possible, but I do feel that any such contributions are separable from the implementation of this particular parser.)

ksunden · 2018-12-11T19:27:09Z

My whiteboard notes from writing this parser:

The red circle in the middle is a block of n 32-bit values (where n is a 16-bit int read immediately prior).
I do not know what the values mean, but I use them such that nonzero means an array for that index follows, and zero means it does not. This is an assumption that I made based on a small number of examples, but it held for those.
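
A minimal sketch of that reading step, for illustration only. The function name `read_presence_flags`, the little-endian byte order, and the unsigned integer types are assumptions made for this example; they are not taken from the parser in this PR, and the demo bytes are synthetic rather than from a real ngc file.

```python
import io
import struct


def read_presence_flags(stream, byteorder="<"):
    """Read a 16-bit count n, then n 32-bit values.

    Returns one bool per index: True where the value is nonzero,
    which is interpreted here as "an array for this index follows".
    Byte order and signedness are assumptions, not confirmed facts.
    """
    (n,) = struct.unpack(byteorder + "H", stream.read(2))
    values = struct.unpack(f"{byteorder}{n}I", stream.read(4 * n))
    return [v != 0 for v in values]


# Synthetic bytes: count of 3, then the values 7, 0, 1.
demo = io.BytesIO(struct.pack("<H3I", 3, 7, 0, 1))
flags = read_presence_flags(demo)
print(flags)
```

Only the nonzero test is used; the actual magnitude of each value is ignored, which matches the "meaning unknown" caveat above.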

There is a table of metadata at the bottom, which may be useful someday, but is not parsed now, and likely won't be until it is specifically requested/implemented by the interested party.

untzag · 2018-12-11T23:07:55Z

@ksunden please consider if there is a more meaningful name than from_ngc

a more explicit name that refers to the instrument or software that generates such files would be preferred

ksunden · 2018-12-12T00:13:05Z

It is unclear to me whether this file format is specific to a particular instrument. All the examples I have are from one instrument; all I know is that it is a Horiba Raman microscope, and I do not know much more. The program is something like LabSpec.

The header of the file is "NGSNextGen", which mostly leads me to "Next Generation Sequencing". However, I have not seen any other files in the sequencing community that share the format, so I am not sure that is a real link.
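
As a rough sketch of the "meaningful warning when header is unexpected" item, a check along these lines could be used. The helper name `check_header` is hypothetical, and the assumptions that the string sits at byte 0 and is ASCII-encoded are mine, not confirmed by this thread.

```python
import io
import warnings

# Header string reported in this thread; its position (byte 0) and
# encoding (ASCII) are assumptions made for this sketch.
EXPECTED_HEADER = b"NGSNextGen"


def check_header(stream):
    """Return True if the stream starts with the expected header.

    Emits a warning instead of raising, so the caller can still
    attempt to parse a file whose header differs.
    """
    header = stream.read(len(EXPECTED_HEADER))
    if header != EXPECTED_HEADER:
        warnings.warn(
            f"Unexpected file header {header!r}; expected "
            f"{EXPECTED_HEADER!r}. Parsing may fail."
        )
        return False
    return True


# Synthetic input for illustration only, not a real ngc file.
ok = check_header(io.BytesIO(b"NGSNextGen\x00\x01"))
print(ok)
```

Warning rather than raising keeps the parser usable on variants of the format that have not been seen yet.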

ksunden · 2019-01-14T18:30:43Z

@darienmorrow Do you have any distributable datasets?

I can take care of most of the points fairly quickly, but need datasets to test against

darienmorrow · 2019-01-21T15:53:17Z

@ksunden The datasets I have are rather large. We should talk about whether we actually want to distribute them. The microscope these are taken from can also take 1D spectra, but I don't have any examples of those.
examples.zip

ksunden · 2019-07-17T21:30:42Z

Aramis_acquisition_information.pdf
Aramis_acquisitions.zip

These are the small test data files (and some additional information).

The same acquisition is stored in several formats (ASCII, ng[cs], t[vs]f). Only ngc is intended to be supported by this parser at this time. ngs may be supported soon as well, depending on how similar it is and what the headers give me to work with.

ksunden · 2019-07-18T20:02:18Z

For now, I have only put the example data in the tests directory. If we wish to include some of those files, or some other dataset, in datasets itself, I am open to doing so.

untzag

🌵

ENH: Binary ngc file parser

95b5d61

ksunden added the enhancement label Dec 11, 2018

ksunden added this to the 3.2.1 milestone Dec 11, 2018

ksunden self-assigned this Dec 11, 2018

ksunden requested review from darienmorrow, ddkohler and untzag as code owners December 11, 2018 19:20

Update _ngc.py

bd2a881

Merge branch 'master' into from_ngc

1617eac

ksunden modified the milestones: 3.2.1, 3.2.2 Jan 15, 2019

ksunden modified the milestones: 3.2.2, 3.2.3 Mar 21, 2019

ksunden added 6 commits April 15, 2019 17:26

Merge branch 'master' into from_ngc

7f95267

Merge branch 'master' into from_ngc

c83602c

Add warnings when headers are not what is expected

7ac55c3

Populate label for variables

209d47b

Update documentation for ngc

dd69d73

MAINT: fix warnings, delete print

1e5b350

ksunden added 2 commits July 18, 2019 11:43

Update parser to handle non-rectangular aquisitions

70761d6

Add tests for ngc parser

ca35418

ksunden changed the title ~~[WIP] ENH: Binary ngc file parser~~ ENH: Binary ngc file parser Jul 18, 2019

darienmorrow approved these changes Aug 13, 2019

ksunden added 2 commits August 17, 2019 16:27

Rename from_ngc to from_Aramis

c5c8c82

add the files with renames...

0513043

untzag approved these changes Aug 17, 2019

untzag merged commit 9693647 into master Aug 17, 2019

untzag deleted the from_ngc branch August 17, 2019 22:50
