Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File read_at, non-mutable platform independent version #47229

Closed
PSeitz opened this issue Jan 6, 2018 · 10 comments
Closed

File read_at, non-mutable platform independent version #47229

PSeitz opened this issue Jan 6, 2018 · 10 comments

Comments

@PSeitz
Copy link
Contributor

PSeitz commented Jan 6, 2018

I have multiple threads reading from a std::fs::File handle. The existing Read Trait for File is not so nice in this case because of the mutability&mut File. I created the non-mutable platform independent version from the plattform specific counterparts like below.
It would be nice to have this in the standard library.

#[cfg(any(windows))]
fn load_exact_bytes_at(buffer: &mut Vec<u8>, file: &File, offset: u64) {
    use std::os::windows::fs::FileExt;
    let mut data_read = 0;
    while data_read < buffer.len() {
        data_read += file.seek_read(buffer, offset).unwrap();
    }
}

#[cfg(any(unix))]
fn load_exact_bytes_at(buffer: &mut Vec<u8>, file: &File, offset: u64) {
    use std::os::unix::fs::FileExt;
    let mut data_read = 0;
    while data_read < buffer.len() {
        data_read += file.read_at(buffer, offset).unwrap();
    }
}
@ranma42
Copy link
Contributor

ranma42 commented Jan 6, 2018

Assuming that seek_read actually changes the current position of the file, I am afraid the missing mutability on the self argument was overlooked.

@PSeitz
Copy link
Contributor Author

PSeitz commented Jan 6, 2018

@tbu- mentions only WriteFile here: #35704 (comment), but the seek_write version is not &mut

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Jan 6, 2018

&File implements Read so using File methods from multiple threads is perfectly possible: They only need a mutable reference to the &File, of which each thread can have a separate copy.

@PSeitz
Copy link
Contributor Author

PSeitz commented Jan 6, 2018

I know this is possible, but the unnecessary(?) mut complicates things (a lot in my case, without the plattform specific implementation).

@tbu-
Copy link
Contributor

tbu- commented Jan 6, 2018

Here's how to read from a immutable reference to a file:

use std::fs::File;
use std::io::Read;
use std::io;
fn read(mut file: &File) -> io::Result<()> {
    let mut buf = [0; 4];
    file.read(&mut buf)?;
    Ok(())
}

Your implementation of load_exact_bytes_at (and the write counterpart) is buggy if the operating system returns with a short read (write), or if the operating system returns an error.

EDIT: Removed the &mut in (&mut file).read(&mut buf)?.

@hanna-kruppe
Copy link
Contributor

@PSeitz Complicates how? You already have a &File. You just need to add a mut to the binding (the file: &File parameter or local) that you want to call the method on. No function signature or type change, no lifetime annotations change, etc., just a simple three letters for (at most) every Read method call. That doesn't seem like a significant burden, and especially not like one big enough to introduce a whole new API.

@PSeitz
Copy link
Contributor Author

PSeitz commented Jan 6, 2018

Okay, I thought a &mut was required. In this case, I don't understand how e.g. that snippet below works. Is this a race condition? Does Rust protect me from race conditions on a single file handle?

extern crate rayon;
use std::io::SeekFrom;
use std::io::prelude::*;
use std::fs::File;
use rayon::prelude::*;

fn main() {

    let file =  File::open("file").unwrap();
        
    [0,1].par_iter()
        .for_each(|p| {
            let mut bytes = vec![];
            bytes.resize(8, 0);
            load_bytes(&mut bytes, &file, *p as u64);
        });

}

fn load_bytes(buffer: &mut Vec<u8>, mut file: &File, offset: u64) {
    file.seek(SeekFrom::Start(offset)).unwrap();
    file.read_exact(buffer).unwrap();
}

@hanna-kruppe
Copy link
Contributor

A &mut self is required, but since the impl is for &File, the self parameter is a &mut &File.

In this case, I don't understand how e.g. that snippet below works. Is this a race condition? Does Rust protect me from race conditions on a single file handle?

The file I/O is racy, yes. Rust does not provide guaranteed protection from "logical races", just from data races. The operating systems' I/O operations are data-race-free even when multiple threads work on the same file, but even a single I/O operation is rarely atomic (e.g., a read might get interrupted). Multiple separate operations (such as the seek and read_exact in your example) can't be atomic without external synchronization. Therefore, working on the same file from multiple threads is generally very wrong (unless there's other synchronization between the threads).

@PSeitz
Copy link
Contributor Author

PSeitz commented Jan 6, 2018

Therefore, working on the same file from multiple threads is generally very wrong (unless there's other synchronization between the threads).

Do you mean file or std::fs::File? Do you refer to that read-only scenario?

I thought read-only with multiple file handles should be no problem.
Multiple threads with read and writes are of course bad without extra synchronization.

It would be nice if the API was designed, to protect from races on a single std::fs::File.

@hanna-kruppe
Copy link
Contributor

Do you mean file or std::fs::File? Do you refer to that read-only scenario?

I meant the std::fs::File type.

I thought read-only with multiple file handles should be no problem.

A std::fs::File object has a current read position, which is changed any time you read or seek.

Multiple threads with read and writes are of course bad without extra synchronization.

Yes (though again, just a race, not a data race).

It would be nice if the API was designed, to protect from races on a single std::fs::File.

Maybe, but backwards compatibility prevents us from changing this.


Can this issue be closed now? It seems the proposed API is no longer needed. If you have any further questions, I'd direct you to http://users.rust-lang.org/ or other forums.

@sfackler sfackler closed this as completed Jan 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants