Merge pull request #269 from marshallpierce/mp/decode-precisely
Precise decode output slice length checking
marshallpierce authored Mar 2, 2024
2 parents 37670c5 + efb6c00 commit 5d70ba7
Showing 12 changed files with 701 additions and 814 deletions.
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "base64"
version = "0.21.7"
version = "0.22.0"
authors = ["Alice Maz <[email protected]>", "Marshall Pierce <[email protected]>"]
description = "encodes and decodes base64 as bytes or utf8"
repository = "https://github.com/marshallpierce/rust-base64"
6 changes: 6 additions & 0 deletions RELEASE-NOTES.md
@@ -1,3 +1,9 @@
# 0.22.0

- `DecodeSliceError::OutputSliceTooSmall` is now conservative rather than precise. That is, the error will only occur if the decoded output _cannot_ fit, meaning that `Engine::decode_slice` can now be used with exactly-sized output slices. As part of this, `Engine::internal_decode` now returns `DecodeSliceError` instead of `DecodeError`, but that is not expected to affect any external callers.
- `DecodeError::InvalidLength` now refers specifically to the _number of valid symbols_ being invalid (i.e. `len % 4 == 1`), rather than just the number of input bytes. This avoids confusing scenarios when based on interpretation you could make a case for either `InvalidLength` or `InvalidByte` being appropriate.
- Decoding is somewhat faster (5-10%)

# 0.21.7

- Support getting an alphabet's contents as a str via `Alphabet::as_str()`
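The length rules in the 0.22.0 notes above can be sketched with plain arithmetic. This is a minimal illustration, not the crate's implementation: `exact_decoded_len` and `fits_exactly_sized` are hypothetical helpers, and the sketch assumes padding has already been stripped so that only valid symbols are counted.

```rust
/// Hypothetical helper (not part of the base64 crate): the exact number of
/// bytes produced by `symbols` non-padding base64 symbols, or `None` when the
/// symbol count is invalid (`symbols % 4 == 1`).
fn exact_decoded_len(symbols: usize) -> Option<usize> {
    // Every complete quad of 4 symbols decodes to 3 bytes.
    let full_quads = symbols / 4;
    // A trailing partial quad must hold 2 or 3 symbols.
    let tail_bytes = match symbols % 4 {
        0 => 0,
        2 => 1, // 12 bits -> 1 byte
        3 => 2, // 18 bits -> 2 bytes
        _ => return None, // 1 symbol = 6 bits, which cannot form a byte
    };
    Some(full_quads * 3 + tail_bytes)
}

/// Hypothetical check: an output slice is large enough whenever it can hold
/// the exact decoded length, so an exactly-sized slice is accepted.
fn fits_exactly_sized(symbols: usize, output_len: usize) -> Option<bool> {
    exact_decoded_len(symbols).map(|needed| output_len >= needed)
}
```

For example, `aGVsbG8=` has 7 non-padding symbols and decodes to the 5 bytes of `hello`, so a 5-byte output slice suffices under this rule, while a 4-byte slice does not.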
3 changes: 1 addition & 2 deletions benches/benchmarks.rs
@@ -102,9 +102,8 @@ fn do_encode_bench_slice(b: &mut Bencher, &size: &usize) {
fn do_encode_bench_stream(b: &mut Bencher, &size: &usize) {
let mut v: Vec<u8> = Vec::with_capacity(size);
fill(&mut v);
let mut buf = Vec::new();
let mut buf = Vec::with_capacity(size * 2);

buf.reserve(size * 2);
b.iter(|| {
buf.clear();
let mut stream_enc = write::EncoderWriter::new(&mut buf, &STANDARD);
74 changes: 60 additions & 14 deletions src/decode.rs
@@ -9,18 +9,20 @@ use std::error;
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum DecodeError {
/// An invalid byte was found in the input. The offset and offending byte are provided.
/// Padding characters (`=`) interspersed in the encoded form will be treated as invalid bytes.
///
/// Padding characters (`=`) interspersed in the encoded form are invalid, as they may only
/// be present as the last 0-2 bytes of input.
///
/// This error may also indicate that extraneous trailing input bytes are present, causing
/// otherwise valid padding to no longer be the last bytes of input.
InvalidByte(usize, u8),
/// The length of the input is invalid.
/// A typical cause of this is stray trailing whitespace or other separator bytes.
/// In the case where excess trailing bytes have produced an invalid length *and* the last byte
/// is also an invalid base64 symbol (as would be the case for whitespace, etc), `InvalidByte`
/// will be emitted instead of `InvalidLength` to make the issue easier to debug.
InvalidLength,
/// The length of the input, as measured in valid base64 symbols, is invalid.
/// There must be 2-4 symbols in the last input quad.
InvalidLength(usize),
/// The last non-padding input symbol's encoded 6 bits have nonzero bits that will be discarded.
/// This is indicative of corrupted or truncated Base64.
/// Unlike `InvalidByte`, which reports symbols that aren't in the alphabet, this error is for
/// symbols that are in the alphabet but represent nonsensical encodings.
/// Unlike [DecodeError::InvalidByte], which reports symbols that aren't in the alphabet,
/// this error is for symbols that are in the alphabet but represent nonsensical encodings.
InvalidLastSymbol(usize, u8),
/// The nature of the padding was not as configured: absent or incorrect when it must be
/// canonical, or present when it must be absent, etc.
@@ -30,8 +32,10 @@ pub enum DecodeError {
impl fmt::Display for DecodeError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match *self {
Self::InvalidByte(index, byte) => write!(f, "Invalid byte {}, offset {}.", byte, index),
Self::InvalidLength => write!(f, "Encoded text cannot have a 6-bit remainder."),
Self::InvalidByte(index, byte) => {
write!(f, "Invalid symbol {}, offset {}.", byte, index)
}
Self::InvalidLength(len) => write!(f, "Invalid input length: {}", len),
Self::InvalidLastSymbol(index, byte) => {
write!(f, "Invalid last symbol {}, offset {}.", byte, index)
}
@@ -48,9 +52,7 @@ impl error::Error for DecodeError {}
pub enum DecodeSliceError {
/// A [DecodeError] occurred
DecodeError(DecodeError),
/// The provided slice _may_ be too small.
///
/// The check is conservative (assumes the last triplet of output bytes will all be needed).
/// The provided slice is too small.
OutputSliceTooSmall,
}

@@ -338,3 +340,47 @@ mod tests {
}
}
}

#[allow(deprecated)]
#[cfg(test)]
mod coverage_gaming {
use super::*;
use std::error::Error;

#[test]
fn decode_error() {
let _ = format!("{:?}", DecodeError::InvalidPadding.clone());
let _ = format!(
"{} {} {} {}",
DecodeError::InvalidByte(0, 0),
DecodeError::InvalidLength(0),
DecodeError::InvalidLastSymbol(0, 0),
DecodeError::InvalidPadding,
);
}

#[test]
fn decode_slice_error() {
let _ = format!("{:?}", DecodeSliceError::OutputSliceTooSmall.clone());
let _ = format!(
"{} {}",
DecodeSliceError::OutputSliceTooSmall,
DecodeSliceError::DecodeError(DecodeError::InvalidPadding)
);
let _ = DecodeSliceError::OutputSliceTooSmall.source();
let _ = DecodeSliceError::DecodeError(DecodeError::InvalidPadding).source();
}

#[test]
fn deprecated_fns() {
let _ = decode("");
let _ = decode_engine("", &crate::prelude::BASE64_STANDARD);
let _ = decode_engine_vec("", &mut Vec::new(), &crate::prelude::BASE64_STANDARD);
let _ = decode_engine_slice("", &mut [], &crate::prelude::BASE64_STANDARD);
}

#[test]
fn decoded_len_est() {
assert_eq!(3, decoded_len_estimate(4));
}
}
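The distinction drawn in the `DecodeError` doc comments, between a byte that is not in the alphabet and a count of valid symbols that cannot form whole bytes, can be sketched as follows. This is an illustration under stated assumptions, not the crate's decoder: `SketchError`, `is_standard_symbol`, and `check_symbols` are hypothetical names, padding is ignored, the standard alphabet is assumed, and the real crate may order its checks differently.

```rust
/// Hypothetical error type mirroring two `DecodeError` variants for illustration.
#[derive(Debug, PartialEq, Eq)]
enum SketchError {
    /// Offset and value of a byte that is not in the alphabet.
    InvalidByte(usize, u8),
    /// Count of valid symbols that cannot form a well-formed final quad.
    InvalidLength(usize),
}

/// Assumption: the standard alphabet (A-Z, a-z, 0-9, '+', '/').
fn is_standard_symbol(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'+' || b == b'/'
}

/// Hypothetical check over unpadded input: report the first byte outside the
/// alphabet, otherwise reject a symbol count whose last quad has one symbol.
fn check_symbols(input: &[u8]) -> Result<usize, SketchError> {
    for (offset, &byte) in input.iter().enumerate() {
        if !is_standard_symbol(byte) {
            return Err(SketchError::InvalidByte(offset, byte));
        }
    }
    // All bytes are valid symbols here, so the length is the symbol count.
    if input.len() % 4 == 1 {
        return Err(SketchError::InvalidLength(input.len()));
    }
    Ok(input.len())
}
```

In this sketch a stray trailing newline is reported as `InvalidByte` with its offset, while `InvalidLength` is reserved for input made up entirely of valid symbols whose count leaves a single symbol in the last quad.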