rsync should support reflink similar to cp --reflink #119

Closed
purpleidea opened this issue Nov 25, 2020 · 19 comments

Comments

@purpleidea

More details in: https://bugzilla.samba.org/show_bug.cgi?id=10170

Manual import:


Description samba 2013-09-27 00:25:30 UTC

new filesystems support the reflink system call...
the standard GNU cp utility now supports this too.
this functionality is invaluable for future rsync.
@BasketCase recommended I write this as a feature.

Here is some background reference:
http://thread.gmane.org/gmane.linux.file-systems/31535

Thanks in advance, happy to answer any questions I can!

James

Comment 1 Kevin Korb 2013-09-27 00:31:00 UTC

*** Bug 9041 has been marked as a duplicate of this bug. ***

Comment 2 David Taylor 2015-01-06 20:32:31 UTC

Created attachment 10585 [details]
prototype patch for --reflink support

I have written a rough draft of a patch to enable support for (Btrfs) reflinks in rsync.  It is very lightly tested.

Only Btrfs is supported (using BTRFS_IOC_CLONE), and this is intended to provide a better alternative to --inplace for updating a backup which will be snapshotted repeatedly.  It avoids the problems of --inplace with partial updates and stale hardlinks.

It adds two new options:

--reflink

When updating an existing file, rather than creating an entirely new temporary file, rsync will create a 'reflink' copy of the original file and update it in place.  When creating a backup copy of a file, rsync will also create a 'reflink' copy rather than rewriting the file.  If a 'reflink' copy cannot be performed it will fall back to the normal, non-reflink behaviour.

--reflink-always behaves like --reflink, but disables the fallback if a reflink fails.
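
The difference between the two options (try a reflink and fall back, versus fail when the reflink fails) can be sketched roughly as follows. This is a minimal Python illustration and not the code from the attached patch: the helper name `clone_or_copy` and its `always` parameter are hypothetical, and it assumes the generic Linux `FICLONE` ioctl, whose request number is the same as `BTRFS_IOC_CLONE`.

```python
import fcntl
import os
import shutil

# Linux FICLONE request number, _IOW(0x94, 9, int).
# BTRFS_IOC_CLONE is defined with the same value.
FICLONE = 0x40049409


def clone_or_copy(src, dst, always=False):
    """Reflink src to dst (hypothetical helper, not rsync's implementation).

    Returns True if the filesystem made a reflink, False if a plain byte
    copy was used instead. With always=True the fallback is disabled and
    the OSError from the failed clone is re-raised.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the filesystem to share src's extents with dst.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError:
            if always:
                os.unlink(dst)  # do not leave an empty file behind
                raise
            # Fallback: ordinary copy of the file contents.
            shutil.copyfileobj(fsrc, fdst)
            return False
```

On a filesystem without reflink support the call still succeeds, but returns False and uses as much space as a normal copy.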

Comment 3 roland 2015-01-06 20:58:30 UTC

fantastic !

did not know about that feature in btrfs, but as i already did inplace backups with rsync on btrfs, i think i will give that a try.

i just wondered about reflink support in zfs and putting a link here for reference: https://github.com/zfsonlinux/zfs/issues/405

Comment 4 James 2015-01-09 03:16:05 UTC

(In reply to David Taylor from comment #2)

Sweet... Glad to hear someone is hacking on this.

Comment 5 kdave 2015-03-27 01:40:43 UTC

(In reply to David Taylor from comment #2)

A few things:

* thanks for working on reflink support in rsync :)

* btrfs is not the only filesystem to support reflink, ocfs2 does as well, so you might make the formulations more generic

* for btrfs, there are 2 types of the cloning ioctl:
 1) that does file-to-file clone (same for ocfs2), IOC_CLONE
 2) range cloning, IOC_CLONE_RANGE

I believe for rsync option 2 should be implemented in the similar way --inplace works.

* the proposed patch does not work in all scenarios:

$ btrfs subvol create subv1
$ btrfs subvol create subv2
$ <create big file subv1/testfile>
$ rsync --reflink subv1/testfile subv2

according to strace, rsync does not call the clone ioctl at all (none of the forked processes)

$ rsync --reflink-always subv1/testfile subv2
testfile
    240,475,844 100%   43.54MB/s    0:00:05 (xfr#1, to-chk=0/1)
rsync: open "/subv2/testfile": No such file or directory (2)
rsync: reflink of "/subv2/.testfile.8bxNk0" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1165) [sender=3.1.2dev]

and the target file is not created, the same command without --reflink works.

* for speed comparison, I did 'cp --reflink subv1/testfile subv2' and it takes near to no time, compared to 5 seconds for bare copy, I'm expecting that rsync --reflink would take comparable time to cp+reflink
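
For reference, the range-cloning variant mentioned in the list above (option 2) could look roughly like this when called from Python. It is only a sketch under stated assumptions: the helper name `clone_range_or_copy` is hypothetical, it uses the generic Linux `FICLONERANGE` ioctl (same request number as `BTRFS_IOC_CLONE_RANGE`), and it falls back to ordinary reads and writes when the ioctl fails.

```python
import fcntl
import os
import struct

# Linux FICLONERANGE request number, _IOW(0x94, 13, struct file_clone_range).
# BTRFS_IOC_CLONE_RANGE is defined with the same value.
FICLONERANGE = 0x4020940D


def clone_range_or_copy(src_fd, dst_fd, src_off, length, dst_off):
    """Share one byte range of src_fd with dst_fd (hypothetical helper).

    Returns True if the range was cloned, False if it was copied with
    ordinary reads and writes because the clone ioctl failed.
    """
    # struct file_clone_range { s64 src_fd; u64 src_offset, src_length, dest_offset; }
    arg = struct.pack("qQQQ", src_fd, src_off, length, dst_off)
    try:
        fcntl.ioctl(dst_fd, FICLONERANGE, arg)
        return True
    except OSError:
        copied = 0
        while copied < length:
            want = min(1 << 20, length - copied)
            chunk = os.pread(src_fd, want, src_off + copied)
            if not chunk:
                break  # source is shorter than the requested range
            copied += os.pwrite(dst_fd, chunk, dst_off + copied)
        return False
```

The kernel generally expects block-aligned offsets for this ioctl, so a real implementation would have to round the ranges it clones accordingly.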



@WayneD
Member

WayneD commented Nov 29, 2020

Rsync is optimized for remote copies, where reflink isn't possible. A future version of rsync might have a more local-optimized copy mode that would allow this, but nothing is likely in the near future. There is already a bugzilla bug about reflink support.

@WayneD WayneD closed this as completed Nov 29, 2020
@purpleidea
Author

purpleidea commented Nov 29, 2020 via email

@WayneD
Member

WayneD commented Nov 29, 2020

The patch for my version of using reflinks instead of hard-links for linking destination files together is in the rsync-patches repo: "clone-dest.diff" (since I think a clone is a more generic term that can apply to more than just btrfs, and "dest" keeps it grouped with the other --*-dest options).

@purpleidea
Author

purpleidea commented Nov 29, 2020 via email

@WayneD
Member

WayneD commented Nov 29, 2020

People need to test it first. Once it's been proven to be correct then it will be an easy merge.

@WayneD
Member

WayneD commented Dec 13, 2020

FYI, I've done some simple testing of the patch (clone-dest.diff) and made some fixes.

@purpleidea
Author

@WayneD That's awesome!! This is out of my expertise to review unfortunately, but I'd love to see it grow tests and then go upstream. I'll try to ping some people to have a look... It's one of those horrible situations where you (Wayne) are responsible for the tool that everyone uses, but few people contribute to.

@purpleidea
Author

Patch is here if anyone is looking btw: https://github.com/WayneD/rsync-patches/blob/master/clone-dest.diff

@Massimo-B

Is there any news about this? Are there any plans to integrate the patch?

@WayneD
Member

WayneD commented Apr 11, 2022

Nobody has tested and reported back.

@sinyb

sinyb commented Sep 20, 2022

I have tested it now... it seems to be working as described, but this is not really all that useful.
Yes, it efficiently handles files whose attributes have changed (unlike --link-dest), but it works on whole files only, and that is the problem...
When copying large files (read: VM images), it sees that the mtime (and maybe the size, but not always) has changed and transfers the whole image instead of reflinking the unchanged blocks and transferring only the changed data.
At the moment I use:
rsync -av --backup --no-whole-file src remote::dst
and it works fine, but it takes up almost twice as much space as a proper reflink would, because it copies the unchanged blocks from the backup image (this is not possible with --inplace!) and transfers the changes.
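
The block-level behaviour asked for here (keep unchanged blocks shared, write only the changed ones) depends on knowing which blocks differ. A deliberately simplified sketch, assuming fixed-size blocks compared at identical offsets; rsync's own delta algorithm works differently (it uses rolling checksums) and is not shown. The name `changed_blocks` and the 4096-byte block size are hypothetical.

```python
def changed_blocks(old_path, new_path, block_size=4096):
    """Return the indices of blocks in new_path that differ from old_path.

    Blocks are compared at the same offset only, so inserted or shifted
    data shows up as a run of changed blocks.
    """
    changed = []
    with open(old_path, "rb") as old, open(new_path, "rb") as new:
        index = 0
        while True:
            new_block = new.read(block_size)
            if not new_block:
                break  # end of the new file
            old_block = old.read(block_size)
            if new_block != old_block:
                changed.append(index)
            index += 1
    return changed
```

Blocks that are not in the returned list are candidates for range cloning from the previous backup; the listed blocks would have to be written as new data.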

@axet

axet commented Feb 9, 2025

The rsync --reflink option could work for remote transfers as well. When I copy a bunch of duplicate data from the local host (where it is already reflinked) to a remote host, it would be nice if rsync avoided duplicating the data and created reflinks on the remote instead.

@DADA30000

> The rsync --reflink option could work for remote transfers as well. When I copy a bunch of duplicate data from the local host (where it is already reflinked) to a remote host, it would be nice if rsync avoided duplicating the data and created reflinks on the remote instead.

Reflinks are just links on a local filesystem; they literally cannot be used on remotes. What you are talking about is delta transfer, and that is the main feature of rsync.

@DADA30000

> I have tested it now... it seems to be working as described, but this is not really all that useful.
> Yes, it efficiently handles files whose attributes have changed (unlike --link-dest), but it works on whole files only, and that is the problem...
> When copying large files (read: VM images), it sees that the mtime (and maybe the size, but not always) has changed and transfers the whole image instead of reflinking the unchanged blocks and transferring only the changed data.
> At the moment I use:
> rsync -av --backup --no-whole-file src remote::dst
> and it works fine, but it takes up almost twice as much space as a proper reflink would, because it copies the unchanged blocks from the backup image (this is not possible with --inplace!) and transfers the changes.

I think you can just delete the dest file first and then reflink it, problem solved.

@purpleidea
Author

> reflinks are just links on local filesystem

Rsync is used on lots of local-only environments, such as rsnapshot, for example.

@Massimo-B

> reflinks are just links on local filesystem,

I don't think so. Reflinks on COW filesystems are not visible the way soft links are, and they are still different from hardlinks. Hardlinks point to the same inode, so changes made through either hardlink are applied to the same file. Reflinked files can both be changed independently; they are basically 2 different files, and only the filesystem internally knows about the reflink.

cp --reflink advises the filesystem to reflink if supported; rsync could do the same locally.
cp and rsync can advise a reflink because they are creating a copy from the source to the target file. I don't think they can detect existing reflinks and reproduce that reflinked structure on a remote target without using btrfs internals, as https://github.com/kilobyte/compsize does, for instance.
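
The hardlink half of that distinction is easy to check from user space. A small sketch, assuming a filesystem that supports hardlinks; the function name `link_semantics_demo` is hypothetical, and a plain `shutil.copyfile` stands in for a reflink copy, since both give the new name its own inode (a reflink additionally shares data blocks, which is not visible at this level).

```python
import os
import shutil
import tempfile


def link_semantics_demo():
    """Compare a hardlink and an independent copy of the same file."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original")
        hardlink = os.path.join(d, "hardlink")
        copy = os.path.join(d, "copy")

        with open(original, "w") as f:
            f.write("v1")
        os.link(original, hardlink)      # second name for the same inode
        shutil.copyfile(original, copy)  # separate inode, like a reflink copy

        with open(original, "w") as f:   # modify the file through its first name
            f.write("v2")

        with open(hardlink) as f:
            hardlink_content = f.read()
        with open(copy) as f:
            copy_content = f.read()

        return {
            "hardlink_same_inode": os.stat(original).st_ino == os.stat(hardlink).st_ino,
            "copy_same_inode": os.stat(original).st_ino == os.stat(copy).st_ino,
            "hardlink_content": hardlink_content,  # follows the change: "v2"
            "copy_content": copy_content,          # unaffected: "v1"
        }
```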

@DADA30000

> > reflinks are just links on local filesystem,
>
> I don't think so. Reflinks on COW filesystems are not visible the way soft links are, and they are still different from hardlinks. Hardlinks point to the same inode, so changes made through either hardlink are applied to the same file. Reflinked files can both be changed independently; they are basically 2 different files, and only the filesystem internally knows about the reflink.
>
> cp --reflink advises the filesystem to reflink if supported; rsync could do the same locally. cp and rsync can advise a reflink because they are creating a copy from the source to the target file. I don't think they can detect existing reflinks and reproduce that reflinked structure on a remote target without using btrfs internals, as https://github.com/kilobyte/compsize does, for instance.

Yeah, I know what reflinks are, I use them a lot. They are BASICALLY links to another file at the FS level, with the ability to lay changes on top of it.
Also, why did you quote my message? I was trying to explain to a guy why it's not doable on REMOTES.

@axet

axet commented Feb 22, 2025

> I tried to explain why it's not doable on REMOTES to a guy

You didn't explain it. You simply do not understand how it works. The remote filesystem can have reflink support the same as the local filesystem. When you copy files from local to remote, you could copy 1 file + 10 reflinks and get 1 file + 10 reflinks. Instead you create 11 files on the remote filesystem. Also, you do not understand that by saying "links" you are referring to "soft links".

@DADA30000

> > I tried to explain why it's not doable on REMOTES to a guy
>
> You didn't explain it. You simply do not understand how it works. The remote filesystem can have reflink support the same as the local filesystem. When you copy files from local to remote, you could copy 1 file + 10 reflinks and get 1 file + 10 reflinks. Instead you create 11 files on the remote filesystem. Also, you do not understand that by saying "links" you are referring to "soft links".

Yes, I know (even though "reflink" literally has the word "link" in it, but OK). That's why I used links as an analogy, and that's why I said "basically", even highlighted it in caps: on a basic level, they are just "linked" files at the FS level, any changes to those "links" are just "laid on top" of the shared data, and all those links are "equal", so you need to delete all of them to really get rid of the file data.
And yeah, now your idea makes sense. I initially thought that you wanted to make only a reflink on the remote filesystem that would somehow magically connect to a file on the source filesystem.
But even now it's clear that this will be really hard to implement :P
