Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decode procedures read header fields without checking length #116

Closed
jphickey opened this issue Dec 13, 2021 · 0 comments · Fixed by #137
Closed

Decode procedures read header fields without checking length #116

jphickey opened this issue Dec 13, 2021 · 0 comments · Fixed by #137
Assignees
Milestone

Comments

@jphickey
Copy link
Contributor

CF uses many variable-length fields in its header structure. This requires aggressive buffer bounds checking at each point during decode, since a bad value (such as a bad "length" on an LV parameter) can throw off the decoder which may end up reading past the end of data and into undefined behavior territory.

CF really only sanity checks the length at a couple points here in the initial header receive:

Headers alone:

if (CF_AppData.engine.in.bytes_received < hsize)

Full PDU length field:

if ((temp + hsize) != CF_AppData.engine.in.bytes_received)

However, note that even for the first check above (line 874) the code has already invoked CF_HeaderSize here:

const int hsize = CF_HeaderSize(ph);

Notably, the CF_HeaderSize() function reads data from the packet header to compute header size. Therefore this has already read some packet data before the length is even initially sanity checked. If the input data was very short (such as from a MID misconfiguration) this would potentially segfault by immediately reading beyond the buffer.

Furthermore, every packet type that utilizes an LV or TLV style sub-fields (e.g. EOF, FIN, MD) needs to re-check the bounds at each of these entries. For example in RecvMd it only confirms that the size is sufficient for fixed-size fields here:

if (CF_AppData.engine.in.bytes_received < (CF_HeaderSize(ph) + sizeof(CF_CFDP_PduMd_t)))

But later on when copying the LV data at these places, there is no check:

lv_ret = CF_CFDP_CopyDataFromLv((uint8 *)t->history->fnames.src_filename, (CF_CFDP_lv_t *)md->filename_lvs);

CF/fsw/src/cf_cfdp.c

Lines 950 to 951 in a894069

lv_ret =
CF_CFDP_CopyDataFromLv((uint8 *)t->history->fnames.dst_filename, (CF_CFDP_lv_t *)(md->filename_lvs + offs));

These functions only check that the length is less than CF_FILENAME_MAX_LEN ... it does not check if the length has gone beyond the end of the input buffer.

Recommendation: Each and every step of encode + decode should confirm that the process is not reading or writing past the end of the buffer. Particularly for variable length fields.

@jphickey jphickey added the bug label Dec 13, 2021
@jphickey jphickey self-assigned this Dec 13, 2021
jphickey added a commit to jphickey/CF that referenced this issue Dec 17, 2021
Improves the distinction between PDU data being actively interpreted
or created during the PDU receive or transmit process, and the encoded
form of that data.

CF formerly treated the two as the same, directly referencing the
encoded form of the data.  This creates many repeated translations.
Furthermore, it would sometimes write a modified value back to the
packet in a partially-decoded form, so it was never clear what
was in a given buffer at a given time (it could be native byte
order or network byte order, in the same fields).

This introduces a "logical" buffer which correlates to the CFDP
buffer, but is used for all in-work or temporary value storage.
All values in the logical buffer are normalized to the native
machine, that is they are aligned properly and always in the
correct byte order for the host, so they can be used as normal
values without any need for translation.

When it comes time to transmit data to/from the network, a
dedicated Encode/Decode function is used, to translate the
entire content from its native form to the network form, or
vice versa.

FSW should typically not access the encoded form of data,
outside of the codec routines, except under very limited
circumstances with good reason (such as dynamically updating
the total_length field in the base header after encode).
jphickey added a commit to jphickey/CF that referenced this issue Dec 17, 2021
Improves the distinction between PDU data being actively interpreted
or created during the PDU receive or transmit process, and the encoded
form of that data.

CF formerly treated the two as the same, directly referencing the
encoded form of the data.  This creates many repeated translations.
Furthermore, it would sometimes write a modified value back to the
packet in a partially-decoded form, so it was never clear what
was in a given buffer at a given time (it could be native byte
order or network byte order, in the same fields).

This introduces a "logical" buffer which correlates to the CFDP
buffer, but is used for all in-work or temporary value storage.
All values in the logical buffer are normalized to the native
machine, that is they are aligned properly and always in the
correct byte order for the host, so they can be used as normal
values without any need for translation.

When it comes time to transmit data to/from the network, a
dedicated Encode/Decode function is used, to translate the
entire content from its native form to the network form, or
vice versa.

FSW should typically not access the encoded form of data,
outside of the codec routines, except under very limited
circumstances with good reason (such as dynamically updating
the total_length field in the base header after encode).
jphickey added a commit to jphickey/CF that referenced this issue Dec 17, 2021
Improves the distinction between PDU data being actively interpreted
or created during the PDU receive or transmit process, and the encoded
form of that data.

CF formerly treated the two as the same, directly referencing the
encoded form of the data.  This creates many repeated translations.
Furthermore, it would sometimes write a modified value back to the
packet in a partially-decoded form, so it was never clear what
was in a given buffer at a given time (it could be native byte
order or network byte order, in the same fields).

This introduces a "logical" buffer which correlates to the CFDP
buffer, but is used for all in-work or temporary value storage.
All values in the logical buffer are normalized to the native
machine, that is they are aligned properly and always in the
correct byte order for the host, so they can be used as normal
values without any need for translation.

When it comes time to transmit data to/from the network, a
dedicated Encode/Decode function is used, to translate the
entire content from its native form to the network form, or
vice versa.

FSW should typically not access the encoded form of data,
outside of the codec routines, except under very limited
circumstances with good reason (such as dynamically updating
the total_length field in the base header after encode).
jphickey added a commit to jphickey/CF that referenced this issue Jan 5, 2022
Improves the distinction between PDU data being actively interpreted
or created during the PDU receive or transmit process, and the encoded
form of that data.

CF formerly treated the two as the same, directly referencing the
encoded form of the data.  This creates many repeated translations.
Furthermore, it would sometimes write a modified value back to the
packet in a partially-decoded form, so it was never clear what
was in a given buffer at a given time (it could be native byte
order or network byte order, in the same fields).

This introduces a "logical" buffer which correlates to the CFDP
buffer, but is used for all in-work or temporary value storage.
All values in the logical buffer are normalized to the native
machine, that is they are aligned properly and always in the
correct byte order for the host, so they can be used as normal
values without any need for translation.

When it comes time to transmit data to/from the network, a
dedicated Encode/Decode function is used, to translate the
entire content from its native form to the network form, or
vice versa.

FSW should typically not access the encoded form of data,
outside of the codec routines, except under very limited
circumstances with good reason (such as dynamically updating
the total_length field in the base header after encode).
jphickey added a commit to jphickey/CF that referenced this issue Jan 5, 2022
Improves the distinction between PDU data being actively interpreted
or created during the PDU receive or transmit process, and the encoded
form of that data.

CF formerly treated the two as the same, directly referencing the
encoded form of the data.  This creates many repeated translations.
Furthermore, it would sometimes write a modified value back to the
packet in a partially-decoded form, so it was never clear what
was in a given buffer at a given time (it could be native byte
order or network byte order, in the same fields).

This introduces a "logical" buffer which correlates to the CFDP
buffer, but is used for all in-work or temporary value storage.
All values in the logical buffer are normalized to the native
machine, that is they are aligned properly and always in the
correct byte order for the host, so they can be used as normal
values without any need for translation.

When it comes time to transmit data to/from the network, a
dedicated Encode/Decode function is used, to translate the
entire content from its native form to the network form, or
vice versa.

FSW should typically not access the encoded form of data,
outside of the codec routines, except under very limited
circumstances with good reason (such as dynamically updating
the total_length field in the base header after encode).
jphickey added a commit to jphickey/CF that referenced this issue Jan 5, 2022
Improves the distinction between PDU data being actively interpreted
or created during the PDU receive or transmit process, and the encoded
form of that data.

CF formerly treated the two as the same, directly referencing the
encoded form of the data.  This creates many repeated translations.
Furthermore, it would sometimes write a modified value back to the
packet in a partially-decoded form, so it was never clear what
was in a given buffer at a given time (it could be native byte
order or network byte order, in the same fields).

This introduces a "logical" buffer which correlates to the CFDP
buffer, but is used for all in-work or temporary value storage.
All values in the logical buffer are normalized to the native
machine, that is they are aligned properly and always in the
correct byte order for the host, so they can be used as normal
values without any need for translation.

When it comes time to transmit data to/from the network, a
dedicated Encode/Decode function is used, to translate the
entire content from its native form to the network form, or
vice versa.

FSW should typically not access the encoded form of data,
outside of the codec routines, except under very limited
circumstances with good reason (such as dynamically updating
the total_length field in the base header after encode).
jphickey added a commit to jphickey/CF that referenced this issue Jan 5, 2022
Improves the distinction between PDU data being actively interpreted
or created during the PDU receive or transmit process, and the encoded
form of that data.

CF formerly treated the two as the same, directly referencing the
encoded form of the data.  This creates many repeated translations.
Furthermore, it would sometimes write a modified value back to the
packet in a partially-decoded form, so it was never clear what
was in a given buffer at a given time (it could be native byte
order or network byte order, in the same fields).

This introduces a "logical" buffer which correlates to the CFDP
buffer, but is used for all in-work or temporary value storage.
All values in the logical buffer are normalized to the native
machine, that is they are aligned properly and always in the
correct byte order for the host, so they can be used as normal
values without any need for translation.

When it comes time to transmit data to/from the network, a
dedicated Encode/Decode function is used, to translate the
entire content from its native form to the network form, or
vice versa.

FSW should typically not access the encoded form of data,
outside of the codec routines, except under very limited
circumstances with good reason (such as dynamically updating
the total_length field in the base header after encode).
jphickey added a commit to jphickey/CF that referenced this issue Jan 5, 2022
jphickey added a commit to jphickey/CF that referenced this issue Jan 6, 2022
A significant cleanup and overhaul of CF unit tests
to follow patterns that are more consistent with
other CFE/OSAL test modules.

Since the FSW change in this PR requires significant test updates
to go with it, this replaces all the "cfdp" tests with new
implementations where there is a 1:1 ratio between test
functions and implementation functions.

Ths also does some minor cleanup on the FSW side where needed
for testability.

Note that all stub files in this version are now directly generated
using the tool provided with UtAssert.  These generated stubs should
not be modified.
astrogeco added a commit that referenced this issue Jan 6, 2022
Fix #116 (plus others), separate logical vs. network PDU buffers
@skliper skliper added this to the Draco milestone Jan 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants