stdin and stdout support #208

Closed
wants to merge 2 commits into from

ThomasWaldmann
Contributor

this fixes the stdin/stdout part of issue #22, no FIFOs yet.

echo "foobar" | attic create archive::test -

when using stdin, the saved path is always "stdin". practically, you only have this 1 entry in the archive anyway as it would be rather unusual to mix "-" with real filenames (but even that works).
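For illustration only, the "-" convention described above could be sketched roughly like this. This is not the code from this pull request: `open_input` and `read_chunks` are hypothetical helper names, and the optional `stdin` parameter exists only so the sketch can be tried without a real pipe.

```python
import io
import sys


def open_input(path, stdin=None):
    """Return (stored_name, binary file object) for one path argument.

    Hypothetical helper, not attic's actual code: "-" selects stdin and
    the entry is stored under the fixed name "stdin".
    """
    if path == "-":
        # Fall back to the real binary stdin when no stream is injected.
        return "stdin", (stdin if stdin is not None else sys.stdin.buffer)
    return path, open(path, "rb")


def read_chunks(fileobj, size=64 * 1024):
    """Yield successive blocks until EOF; works for pipes and regular files."""
    while True:
        block = fileobj.read(size)
        if not block:
            break
        yield block
```

Reading in fixed-size blocks matters here because a pipe has no known length and cannot be seeked.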

attic extract --stdout archive::test somefile

when using stdout, ALL extracted regular files' data will be written to stdout (just concatenated).
practically, you only extract 1 file usually when using stdout.
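A minimal sketch of the "just concatenated" behaviour, under stated assumptions: `extract_to_stdout` is a hypothetical name, and `items` is a made-up list of `(mode, chunks)` pairs standing in for archive entries, not attic's real item format.

```python
import io
import stat


def extract_to_stdout(items, out):
    """Write the data of every regular file in `items` to `out`, back to back.

    Non-regular entries (directories, symlinks, ...) are skipped, and no
    separator is written between files.
    """
    for mode, chunks in items:
        if stat.S_ISREG(mode):
            for chunk in chunks:
                out.write(chunk)
```

With more than one regular file selected, the output has no markers showing where one file ends and the next begins, which is why extracting a single file is the usual case.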

@ThomasWaldmann
Contributor Author

opinions? suggestions? code review?

@danmbox

danmbox commented Apr 25, 2015

Thomas, so you have a single stdin path? I would suggest at least a --stdin-name, so you can pipe different files into attic. A more advanced option would be, when the stream is a tar archive, to pick up names from the tar metadata

@ThomasWaldmann
Contributor Author

it is currently using the convention that "-" instead of a filename means "read from stdin".
piping any file into attic would then work like "attic create repo - < any_file".

so why would we need --stdin-name (or --read-from like I would call it)?

@danmbox

danmbox commented Apr 25, 2015

That's not what I meant. When you archive a directory, attic stores metadata from the filesystem (i.e. it knows that you've archived /etc/hostname, /etc/passwd etc). When you archive from "-", the path metadata is simply "stdin" (AFAICT from your commits). So --stdin-name would provide the missing filename metadata for the piped-in stream.

More interestingly, if a tar archive is piped in, attic could pick up metadata from the tar format. This is what a --stdin-format would do (leaving room for --stdin-format=plain, --stdin-format=tar, --stdin-format=zip, so that you can treat a piped-in tarfile either as a single file OR as a collection of files).
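To make the tar idea concrete, here is a rough sketch using Python's standard `tarfile` module in stream mode (`"r|"`), which reads members sequentially and never seeks, so it works on a pipe. `iter_tar_members` is a hypothetical name, and nothing like it exists in this pull request.

```python
import io
import tarfile


def iter_tar_members(stream):
    """Yield (name, data) for each regular file in a tar stream, in order.

    Mode "r|" processes the stream block by block without seeking, so
    `stream` may be a pipe such as sys.stdin.buffer. For brevity each
    member is read fully into memory; a real implementation would read
    it in chunks instead.
    """
    with tarfile.open(fileobj=stream, mode="r|") as tar:
        for member in tar:
            if member.isreg():
                yield member.name, tar.extractfile(member).read()
```

In stream mode each member has to be consumed before moving on to the next one, which is what the loop above does.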

@ThomasWaldmann
Contributor Author

@danmbox please give specific use cases / examples where that metadata would be interesting. Also, how would it be used (e.g. at extract time)? E.g. if I make a disk image by reading from /dev/sda, that filename (and other metadata of that file) would be rather uninteresting. What is really interesting is the image contents, like "win 7 pro 64bit std install, machine type x", but I can't get that info from /dev/sda's metadata; I would have to put it manually in the archive name.

Also, I have no idea why you are referring to piped tar archives here. I'd suggest you split that off to a separate ticket.

@dnnr

dnnr commented Apr 25, 2015

What's wrong with using the archive name for that? Since backing up stdin will always just be an archive containing a single (unnamed) object, you could simply indicate the name of that object using the name of the archive.

And I don't see why it should be attic's job to extract any information from a tar stream, or any other specific file format, for that matter. There's already a tool for that task and it's called tar.

@danmbox

danmbox commented Apr 25, 2015

@dnnr, the archive name is one piece of metadata, the filenames are a different piece of metadata. Let's not mix them up. And unless you have something like tar parsing, it's impossible to create a snapshot containing multiple files without saving the stream to a temporary disk file (which kills performance, and also requires extra disk space, proportional to the uncompressed size).

@danmbox

danmbox commented Apr 25, 2015

In the meantime I've tried pg_dump -CFt mydb | /opt/attic/bin/attic create /tmp/attic::pg_mydb several times and I can attest that it works. My comments regarding filename metadata could be saved for later as an enhancement request, if this pull request gets merged.

@danmbox

danmbox commented Apr 25, 2015

The tar format idea is stolen from zbackup, and is one way to allow piping in a collection of files (not everyone wants to pipe in just virtual images). E.g. pg_dump -Ft generates multiple filenames inside its tar archive stream, and they are useful to see when listing the snapshot.

@ThomasWaldmann
Contributor Author

closing this pull request, seems unwanted.
