[Question] Parsing binaries of unknown size #545

Closed
GabriellaNicoleRamirez opened this issue Mar 26, 2019 · 8 comments
GabriellaNicoleRamirez commented Mar 26, 2019

Hi, I'm currently working with some custom physics data and I've been using https://www.carvesystems.com/news/parsing-binaries-with-kaitai-struct/ as a reference, but I don't know exactly where the footer is located.

Currently, we've commented out the parts that deal with the footer and read the footer in as just another event. The function that graphs the data then drops the last event, since it's really the footer.

Struct definition

seq:
  - id: file_header
    type: pixie4e_header
  - id: events
    type: event
    repeat: eos
    # repeat: until
    # repeat-until: _.header.evt_info == 0x0100
  # - id: file_footer
  #   type: pixie_eor

Event Definition

event:
  seq:
    - id: header
      type: channel_header
    - id: data
      type: u2
      repeat: expr
      repeat-expr: header.num_trace_blks * _root.file_header.blk_size

Footer Definition

# pixie_eor:
#   seq:
#     - id: evt_pattern
#       type: u2
#     - id: evt_info
#       type: u2
#     - id: num_trace_blks
#       type: u2
#     - id: num_trace_blks_prev
#       type: u2
#     - id: reserved
#       type: u2
#       repeat: expr
#       repeat-expr: 0x1C

Is there a way I can test each byte to check whether it really starts an event (has all of the identifiers of an event) and, if it doesn't and it's near the end of the file, read it as the footer?

Otherwise, can I tell it to stop reading events "this many bits" before the end of the file and read the rest as the footer?

Thanks!

Edit: added code

@KOLANICH

> can I tell it to stop reading events "this many bits" before the end of the file and read the rest as the footer?

You can. _io.size gives the total size of the stream, if I remember right.

@GabriellaNicoleRamirez
Author

Thanks! I'll look into it.

@GreyCat
Member

GreyCat commented Mar 27, 2019

Not really, and it will be a while before that kind of functionality is implemented in KS.

For starters, there is no concept of "valid" or "invalid" parsing in KS right now, except for very rudimentary validation checks done by contents: .... To some extent, #435 should cover it once it is implemented.

Also, there's no way to "test each byte", i.e. to launch a scan or search repeatedly. There is a separate proposal to implement something close to that in #538.

That said, @KOLANICH is correct: you can try to get away with limiting your input to some size relative to the overall size of the file (e.g. something like size: _io.size - 123) and/or declaring the footer as an instance positioned there specifically:

footer:
  pos: _io.size - 123
  type: footer
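
For context, here is a minimal sketch of where such an instance would sit in a complete .ksy file. The 123 is the placeholder offset from the snippet above; the meta block and the single footer field are assumptions added only so that the example stands on its own.

```yaml
meta:
  id: footer_at_end        # hypothetical format name
  endian: le               # assumption: little-endian data
instances:
  footer:
    # pos belongs under instances, not under seq;
    # it is an absolute offset from the start of the stream
    pos: _io.size - 123    # placeholder footer length
    type: footer
types:
  footer:
    seq:
      - id: marker         # hypothetical field, for illustration only
        type: u2
```

An instance declared this way is parsed lazily on first access, so it does not interfere with whatever seq reads from the front of the file.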

@GabriellaNicoleRamirez
Author

GabriellaNicoleRamirez commented Mar 27, 2019

Okay, thanks for the insight! Really appreciate it.

@GabriellaNicoleRamirez
Author

GabriellaNicoleRamirez commented Mar 28, 2019

So, I've found out the sizes of the header and footer (32 words * 16-bit = 64 bytes, each) and I've tried using pos but I get a compiler error: /seq/o/pos: unknown key found.

I tried using size and defining the header and footer sizes as 64 bytes, but then I get an error about reading -96 bytes.

At the moment, there is no way to find out the number of data entries in the file (it's not defined in the binary), but can I still pass in the "middle" chunk of the data and have it read until eos? Otherwise, is there a way to read the header up to 64 bytes, read the data and repeat until 64 bytes from eof?

Thanks for the advice!

@GreyCat
Member

GreyCat commented Mar 29, 2019

pos is for instances; your error message suggests that you've put it into a seq attribute.

Reading "until end of substream limited to all but last 64 bytes" would be something like

seq:
  - id: all_but_footer
    type: all_but_footer
    size: _io.size - 64
  - id: footer
    type: footer
    size: 64
types:
  all_but_footer:
    seq:
      # ...
      - id: elements
        type: element
        repeat: eos # this will be limited to substream, not main stream
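
Applied to the layout described earlier in this thread (a header, then events, then a 64-byte footer), the same idea might look like the sketch below. The type names pixie4e_header, event and pixie_eor come from the original post and are assumed to be defined as shown there. The body wrapper type, the meta block, the reading of the 64-byte header figure as the file header, and the use of _io.pos to account for bytes already consumed by the header are all assumptions.

```yaml
meta:
  id: pixie4e              # hypothetical format name
  endian: le               # assumption: little-endian data
seq:
  - id: file_header
    type: pixie4e_header
    size: 64               # assumption: the 64-byte figure refers to the file header
  - id: body
    type: body
    # everything between the current position and the last 64 bytes
    size: _io.size - _io.pos - 64
  - id: file_footer
    type: pixie_eor
    size: 64
types:
  body:
    seq:
      - id: events
        type: event
        repeat: eos        # stops at the end of the body substream
  # pixie4e_header, event and pixie_eor: as defined in the original post
```

Computing the body size from _io.pos rather than from a hard-coded header length keeps the expression correct if the header size changes, and it avoids subtracting the footer from a stream that has already been partly consumed.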

@GabriellaNicoleRamirez
Author

GabriellaNicoleRamirez commented Mar 29, 2019

Oh! Thanks! I should have double-checked in the documentation. I'll give it a try.

@GreyCat
Member

GreyCat commented Apr 15, 2019

It seems that the answer was given, so I'm closing this discussion. Please feel free to reopen if anything remains unresolved.

GreyCat closed this as completed Apr 15, 2019