ENH: Binary ngc file parser #830
My whiteboard notes from writing this parser: the red circle in the middle is a block of n 32-bit values, where n is a 16-bit integer read immediately prior. There is a table of metadata at the bottom, which may be useful someday but is not parsed now, and likely won't be until an interested party specifically requests or implements it.
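As an illustration of the count-then-values layout described in the notes above, here is a minimal sketch of reading one such block. It is not the parser from this PR: the function name `read_counted_block` is hypothetical, and the byte order (little-endian), the unsigned count, and the interpretation of the 32-bit values as floats are all assumptions that the notes do not confirm.

```python
import io
import struct


def read_counted_block(stream, byte_order="<"):
    """Read a 16-bit count n, then n 32-bit values, from a binary stream.

    Assumptions (not confirmed by the notes): the count is unsigned,
    the values are IEEE 754 floats, and the byte order is little-endian.
    """
    raw_count = stream.read(2)
    if len(raw_count) != 2:
        raise EOFError("stream ended before the 16-bit count")
    (n,) = struct.unpack(byte_order + "H", raw_count)

    raw_values = stream.read(4 * n)
    if len(raw_values) != 4 * n:
        raise EOFError("stream ended inside the block of 32-bit values")
    # Unpack n consecutive 32-bit floats into a tuple.
    return struct.unpack(f"{byte_order}{n}f", raw_values)


# Usage with an in-memory stand-in for a file: count = 3, then three floats.
payload = struct.pack("<H3f", 3, 1.0, 2.0, 3.0)
values = read_counted_block(io.BytesIO(payload))
```

If the values turn out to be integers rather than floats, only the format character in the final `struct.unpack` call would change (`i` or `I` instead of `f`).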
@ksunden please consider whether there is a more meaningful name. A more explicit name that refers to the instrument or software that generates such files would be preferred.
It is unclear to me whether this file format is specific to a particular instrument. All examples I have are from one instrument, and all I know is that it is a Horiba Raman microscope; I do not know much more. The program is something like LabSpec. The header of the file is "NGSNextGen", which mostly leads me to "Next Generation Sequencing", though I have not seen any other files in the sequencing community that share the format, so I am not sure that is a real link.
@darienmorrow Do you have any distributable datasets? I can take care of most of the points fairly quickly, but I need datasets to test against.
@ksunden The datasets I have are rather large. We should talk about whether we actually want to distribute them. The microscope these are taken from can also take 1D spectra, but I don't have any examples of that.
Aramis_acquisition_information.pdf
These are the small test data files (and some additional information). The same acquisition is stored in several formats (ASCII, ng[cs], t[vs]f); only ngc is intended to be supported by this parser at this time. ngs may follow soon, depending on how similar it is and what the headers give me to work with.
For now, I have only put the example data in the tests directory. If we wish to include some of those files, or some other dataset, in datasets itself, I am open to doing so.
🌵
Still need:
- label for variables
- pass with meaningful warning when header is unexpected
- ASCII text file parser (probably separate method, possibly separate PR entirely)

(The ASCII format seems untenable to me to parse in a particularly sane way, given the lack of metadata. It can be done, but it is not clear to me how to do so without either making lots of assumptions or adding so many parameters that it becomes difficult to use. A specific-case parser can be maintained out of tree more easily for now. I am open to discussion or to being convinced that it is possible to write a parser that is user-friendly, but I do feel that any such contributions are separable from the implementation of this particular parser.)
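For the "meaningful warning when header is unexpected" item, one possible shape is sketched below. It is only an illustration, not the code in this PR: `check_header` and `EXPECTED_HEADER` are hypothetical names, the header string is the one quoted earlier in this thread, and it is an assumption that the header sits at the very start of the file as plain ASCII bytes.

```python
import io
import warnings

# Header string quoted in this thread; its exact position and encoding
# in the file are assumptions made for this sketch.
EXPECTED_HEADER = b"NGSNextGen"


def check_header(stream, expected=EXPECTED_HEADER):
    """Return True if the stream starts with the expected header.

    On a mismatch, emit a warning that names both the header found and
    the header expected, then return False so the caller can decide
    whether to keep parsing.
    """
    found = stream.read(len(expected))
    if found != expected:
        warnings.warn(
            f"Unexpected file header {found!r} (expected {expected!r}); "
            "parsed values may be incorrect.",
            stacklevel=2,
        )
        return False
    return True


# Usage with in-memory stand-ins for files.
ok = check_header(io.BytesIO(b"NGSNextGen" + b"\x00" * 4))
```

Warning rather than raising lets a file with a slightly different header still be attempted, while telling the user exactly what was found.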