Reading a file with length -1 #39

Open
xd009642 opened this issue Jan 15, 2020 · 10 comments

@xd009642

A WAV file whose length field is set to 0xFFFFFFFF fails to parse and is reported as an ill-formed WAV file. However, programs like ffmpeg write that value when the length can't be determined in advance, e.g. for streaming interfaces.

Is there a way to handle this, or should I override the length field myself before passing into hound for now?

@ruuda
Owner

ruuda commented Jan 15, 2020

Apart from getting an IO error at EOF, it should work with current master, but only by accident. v3.4.0 contained a check that the number of samples is a multiple of the number of channels, which may be violated if the length is 0xffffffff. It looks like that check was lost in #25, which makes a length of 0xffffffff work regardless of the number of channels.
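
To illustrate why a length of 0xffffffff can trip a "samples must be a multiple of channels" check, here is a minimal sketch. The helper `is_whole_number_of_frames` is hypothetical and is not hound's code; the exact form of the v3.4.0 check is not shown in this thread. It assumes the sample count is derived by truncating division of the byte length.

```rust
/// Hypothetical helper (not hound's implementation): reports whether a data
/// chunk of `data_len` bytes holds a whole number of frames, where one frame
/// is one sample per channel.
/// Assumes `bytes_per_sample` and `channels` are both non-zero.
fn is_whole_number_of_frames(data_len: u32, bytes_per_sample: u32, channels: u32) -> bool {
    // Truncating division: a trailing partial sample is ignored here.
    let num_samples = data_len / bytes_per_sample;
    num_samples % channels == 0
}
```

With 16-bit stereo audio, 0xffffffff bytes gives an odd sample count, so the check fails; with mono it passes, and a length of 0xfffffff0 passes for stereo as well.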

@xd009642
Author

Hmm maybe I need to check again, I saw the error and thought it didn't work! Though I'm working on a stream of bytes coming in from a network so I ended up piping into ffmpeg and piping out raw pcm. I suppose I can sort out a similar interface in Rust with a BufReader around a Vec and adding the bytes to the vec as they come in (though mutability might pose an issue).

@ruuda
Owner

ruuda commented Jan 16, 2020

You should be able to wrap a BufReader around the socket itself.

@xd009642
Author

Not possible, I don't think: it's a field in a oneof from a gRPC stream, which interleaves audio data and other streamed data. I have another API with HTTP streaming though, and that is probably a good shout there!
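
For the case where audio bytes arrive as separate messages and there is no socket to wrap, one option is an adapter that turns a channel of byte chunks into a blocking `Read`. This is a minimal sketch using only the standard library; `ChannelReader` is a hypothetical name, and it assumes the code receiving the gRPC messages forwards each audio payload as a `Vec<u8>` over an `mpsc` channel.

```rust
use std::io::{self, Read};
use std::sync::mpsc::Receiver;

/// Hypothetical adapter: exposes byte chunks arriving on a channel as a
/// blocking `Read`, so no shared mutable `Vec` is needed.
struct ChannelReader {
    rx: Receiver<Vec<u8>>,
    current: Vec<u8>,
    offset: usize,
}

impl ChannelReader {
    fn new(rx: Receiver<Vec<u8>>) -> Self {
        Self { rx, current: Vec::new(), offset: 0 }
    }
}

impl Read for ChannelReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if buf.is_empty() {
            return Ok(0);
        }
        // Block for the next chunk once the current one is used up.
        // Empty chunks are skipped by the loop condition.
        while self.offset == self.current.len() {
            match self.rx.recv() {
                Ok(chunk) => {
                    self.current = chunk;
                    self.offset = 0;
                }
                // All senders were dropped: report end of stream.
                Err(_) => return Ok(0),
            }
        }
        let n = buf.len().min(self.current.len() - self.offset);
        buf[..n].copy_from_slice(&self.current[self.offset..self.offset + n]);
        self.offset += n;
        Ok(n)
    }
}
```

The receiving side would send each audio payload with `tx.send(bytes)` and drop the sender when the stream ends; the reader can then be wrapped in a `BufReader` and handed to anything that accepts a `Read`.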

@xd009642
Author

But yeah, you can close this if you want. My only thing is it might be nice to mention in the docs that if the length isn't set, there will likely be an EOF error once the file is fully read.

@ruuda
Owner

ruuda commented Jan 17, 2020

There are some improvements for writing wav with unknown length on master, I’ll make sure to have reading and writing consistent and documented for the next release. I’ll keep this issue open until then.

@vi
Contributor

vi commented Apr 22, 2020

I think it should be explicitly supported. An all-ones pattern should trigger a mode where the length is not checked at all, so that more than 4 GB of raw audio can be transferred.

Imagine setting up an internet radio station that is based on some command-line pipeline, where one of the pipes is used for raw wav data. You don't want it to suddenly stop after a month of service just because we exhausted 0xFFFFFFFF.
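
A minimal sketch of what such a mode could look like, independent of hound: when the declared length is all ones, samples are read until the stream ends rather than until a byte count is reached. The function name `for_each_sample_i16` and the constant `UNKNOWN_LEN` are hypothetical, and the sketch assumes 16-bit little-endian PCM with the reader already positioned at the start of the sample data.

```rust
use std::io::{self, ErrorKind, Read};

/// Marker value for "length not known in advance" (all ones).
const UNKNOWN_LEN: u32 = 0xFFFF_FFFF;

/// Hypothetical sketch: calls `on_sample` for every 16-bit little-endian
/// sample and returns how many were read. The count is a `u64`, so it is not
/// limited to what a 32-bit length field can describe.
fn for_each_sample_i16<R: Read>(
    mut reader: R,
    declared_len: u32,
    mut on_sample: impl FnMut(i16),
) -> io::Result<u64> {
    // `None` means "do not check the length, read until end of stream".
    let limit = if declared_len == UNKNOWN_LEN {
        None
    } else {
        Some(u64::from(declared_len) / 2)
    };
    let mut count: u64 = 0;
    let mut bytes = [0u8; 2];
    while limit.map_or(true, |n| count < n) {
        match reader.read_exact(&mut bytes) {
            Ok(()) => {
                on_sample(i16::from_le_bytes(bytes));
                count += 1;
            }
            // In unknown-length mode, end of stream is the normal way to
            // finish. A trailing single byte is dropped.
            Err(e) if limit.is_none() && e.kind() == ErrorKind::UnexpectedEof => break,
            // With a known length, running out of data early is an error.
            Err(e) => return Err(e),
        }
    }
    Ok(count)
}
```

Samples are handed to a callback instead of being collected, since a stream that runs for a month should not be buffered in memory.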

@xkr47

xkr47 commented Apr 30, 2022

For all you waiting for this feature, I wrote a hacky wrapper to work around the issue, when using hound 3.4.0. It assumes the wav header layout to be very specific. 😁 It is currently tuned for ffmpeg 4.1.8. Please adjust offsets as necessary for the format you need to read in. Or implement code to look for the data chunk and post it here 😉

Example usage, load audio data from ffmpeg stdout:

    use std::io::BufReader;
    use std::process::{Command, Stdio};

    let child = Command::new("ffmpeg")
        .arg("-i")
        .arg(&video_file)
        .arg("-c:a")
        .arg("pcm_s16le")
        .arg("-f")
        .arg("wav")
        .arg("-")
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .spawn()?;

    let stdout = child.stdout.unwrap();
    let stdout = WavLengthPatcher::new(stdout);
    let stdout = BufReader::new(stdout);
    let reader = hound::WavReader::new(stdout)?;
    let sample_iter = reader.into_samples::<i16>()
        .map_while(|sample| match sample {
            Ok(sample) => {
                Some(sample)
            },
            Err(hound::Error::IoError(e)) => {
                if let Some(ee) = e.get_ref() {
                    if "Failed to read enough bytes." == &format!("{}", ee) {
                        return None
                    }
                }
                panic!("Error reading audio data: {:#?}", e);
            }
            Err(e) => panic!("Error reading audio data: {:#?}", e)
        });

Implementation:

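use std::io::Read;

/// Wraps a reader and rewrites the two "unknown length" fields of the WAV
/// header as the bytes pass through, so that hound 3.4.0 accepts the stream.
struct WavLengthPatcher<R: Read> {
    stream: R,
    /// Bytes seen so far; only tracked until the first 0x4e bytes have passed.
    pos: u8,
}

impl<R: Read> WavLengthPatcher<R> {
    fn new(stream: R) -> Self {
        Self {
            stream,
            pos: 0,
        }
    }
}

impl<R: Read> Read for WavLengthPatcher<R> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        let l = self.stream.read(buf)?;
        if self.pos < 0x4e {
            // Rewrite the low byte of the RIFF length (offset 0x04) and of the
            // data chunk length (offset 0x4A) from 0xFF to 0xF0, turning
            // 0xFFFFFFFF into 0xFFFFFFF0 (lowest 4 bits zeroed).
            let capped_l = l.min(0x4e) as u8;
            for p in 0..capped_l {
                match p + self.pos {
                    0x04 | 0x4A => {
                        assert_eq!(buf[p as usize], 0xFF);
                        buf[p as usize] = 0xF0;
                    }
                    // The remaining length bytes must already be 0xFF;
                    // anything else means the header layout is not the expected one.
                    0x05 | 0x06 | 0x07 | 0x4b | 0x4c | 0x4d => {
                        assert_eq!(buf[p as usize], 0xFF);
                    }
                    _ => (),
                }
            }
            self.pos += capped_l; // cannot overflow: both operands are at most 0x4e
        }
        Ok(l)
    }
}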
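
As a starting point for the "look for the data chunk" variant mentioned above, here is a minimal sketch that walks the RIFF chunk list instead of relying on fixed offsets. `find_data_len_offset` is a hypothetical helper, not part of hound; it assumes the beginning of the stream has already been buffered into a slice.

```rust
/// Hypothetical helper: given the first bytes of a WAV stream, returns the
/// byte offset of the data chunk's 4-byte length field, or `None` if the
/// buffered bytes are not a RIFF/WAVE header or do not reach the data chunk.
fn find_data_len_offset(header: &[u8]) -> Option<usize> {
    if header.len() < 12 || &header[0..4] != b"RIFF" || &header[8..12] != b"WAVE" {
        return None;
    }
    // Chunks start right after the 12-byte RIFF/WAVE preamble.
    let mut pos = 12usize;
    while let Some(chunk) = header.get(pos..) {
        if chunk.len() < 8 {
            break; // not enough bytes for a chunk id plus length
        }
        if &chunk[0..4] == b"data" {
            return Some(pos + 4);
        }
        let size = u32::from_le_bytes([chunk[4], chunk[5], chunk[6], chunk[7]]) as usize;
        // Skip id, length and payload; payloads are padded to an even size.
        pos = pos.checked_add(8)?.checked_add(size)?.checked_add(size & 1)?;
    }
    None
}
```

For a header with a 16-byte `fmt ` chunk followed by a 26-byte `LIST` chunk, this returns 0x4A, the same offset the wrapper above hard-codes. Whether every ffmpeg version produces exactly that layout is an assumption that should be checked against real output.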

@vi
Contributor

vi commented May 1, 2022

@xkr47 Will Hound stop reading such a patched file after 4GB of data flows through?

@xkr47

xkr47 commented May 29, 2022

No idea, you'll have to try 😅
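
Whether hound stops is not settled in this thread, but the amount of audio that the patched length of 0xFFFFFFF0 bytes can describe is simple arithmetic. The sketch below uses a hypothetical helper, `seconds_of_audio`, and assumes a reader that honors the declared length; the 44.1 kHz stereo figures used afterwards are assumptions, and only the 16-bit sample format (`pcm_s16le`) comes from the example above.

```rust
/// Hypothetical helper: seconds of uncompressed PCM audio that fit in
/// `data_len` bytes. Assumes `bits_per_sample` is a multiple of 8 and that
/// none of the format parameters is zero.
fn seconds_of_audio(data_len: u32, sample_rate: u32, channels: u16, bits_per_sample: u16) -> f64 {
    let bytes_per_second =
        u64::from(sample_rate) * u64::from(channels) * u64::from(bits_per_sample / 8);
    f64::from(data_len) / bytes_per_second as f64
}
```

For 16-bit stereo at 44.1 kHz, `seconds_of_audio(0xFFFF_FFF0, 44_100, 2, 16)` comes to a little under 6.8 hours, so a reader that stops at the declared length would end well before a month-long stream does.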

4 participants