Without patching, the rsync utility cannot detect when a file has been renamed or moved between directories inside the synced tree. There is a --fuzzy option that saves bandwidth by building upon similar files on the target side, but it only looks in the same directory.
This becomes a problem when you need to synchronize a large file tree over a slow connection after a big reorganization since the last rsync run. A real-world example: Joe stores many GiB of family photos and videos at home and periodically backs them up to a remote server.
$ rsync -avHP --delete-after ~/family/Photos remotebox:backups
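Before reorganizing, run the same command once more so that both sides are in sync, then make a hard-linked working copy of the tree:
$ rsync -avHP --delete-after ~/family/Photos remotebox:backups
$ cd ~/family
$ cp -rlp Photos Photos-work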
The cp finishes very quickly because of its switches: -r copies directories recursively, -l hard-links files instead of copying them, and -p preserves mode, ownership and timestamps (for non-hardlinked content such as directories).
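As a minimal sketch of what the hard-linked copy looks like on disk, the following builds a throwaway tree and copies it the same way. The temporary directory and the file name img1.jpg are made up for illustration, and the -l switch assumes a cp that supports it (GNU coreutils does).

```shell
# Throwaway tree under a temp directory (hypothetical paths, not Joe's real data).
demo=$(mktemp -d)
mkdir -p "$demo/Photos/2019"
echo "fake image data" > "$demo/Photos/2019/img1.jpg"

# Hard-linked working copy, with the same switches as above.
cp -rlp "$demo/Photos" "$demo/Photos-work"

# Both names show the same inode number in the first column,
# so no file data was duplicated.
ls -li "$demo/Photos/2019/img1.jpg" "$demo/Photos-work/2019/img1.jpg"
```

Only the directories are newly created; every regular file in Photos-work is just a second name for the data already stored under Photos.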
Do the reorganization in the Photos-work directory: you can rename, move, add and delete any files. But DON’T TOUCH the tree in Photos. This directory (with the same set of paths on both machines) will allow rsync to quickly find the data to clone under Photos-work on the remote machine.
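To illustrate why an untouched Photos tree helps, here is a small sketch showing that a file renamed inside the working copy remains a hard link to its old path. All paths and file names are hypothetical, and it again assumes a cp with the -l switch.

```shell
# Throwaway tree plus hard-linked working copy (hypothetical paths).
demo=$(mktemp -d)
mkdir -p "$demo/Photos/2019"
echo "fake image data" > "$demo/Photos/2019/img1.jpg"
cp -rlp "$demo/Photos" "$demo/Photos-work"

# Reorganize inside Photos-work only: move and rename one file.
mkdir -p "$demo/Photos-work/holidays/summer"
mv "$demo/Photos-work/2019/img1.jpg" "$demo/Photos-work/holidays/summer/beach.jpg"
rmdir "$demo/Photos-work/2019"

# The old path in Photos is untouched, and the new path still shares its inode.
ls -li "$demo/Photos/2019/img1.jpg" "$demo/Photos-work/holidays/summer/beach.jpg"
```

That shared inode is the information rsync -H can use: the new path is a link to a file whose old path already exists on the remote side.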
When you’re done reorganizing, you run this:
$ rsync -avHP --delete-after --no-inc-recursive ~/family/Photos ~/family/Photos-work remotebox:backups
- As an rsync expert you are surely aware that slashes at the end of the rsync paths have a strict meaning. If not, consult the manpage.
- You may want to run it with the safety -n switch first to see what would happen. You will see =>’s marking the hard-linking.
With --no-inc-recursive, rsync collects all hard-links before it transfers anything. It is now able to reconstruct Photos-work on the remote machine IN SECONDS. Next you finalize by:
$ mv Photos Photos-OLD
$ mv Photos-work Photos
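The whole procedure can be rehearsed locally before trusting it with real data. The sketch below is only an approximation of the setup above: a local directory named backups stands in for remotebox:backups, all paths and file names are made up, and it assumes an rsync 3.x that understands --no-inc-recursive and a cp with the -l switch. It is skipped when rsync is not installed.

```shell
# Throwaway "home" tree and a local stand-in for the remote backup directory.
demo=$(mktemp -d)
mkdir -p "$demo/family/Photos/2019" "$demo/backups"
echo "fake image data" > "$demo/family/Photos/2019/img1.jpg"

if command -v rsync >/dev/null 2>&1; then
    # 1. Make sure both sides are in sync before reorganizing.
    rsync -aH --delete-after "$demo/family/Photos" "$demo/backups"

    # 2. Hard-linked working copy, then reorganize only inside it.
    cp -rlp "$demo/family/Photos" "$demo/family/Photos-work"
    mkdir -p "$demo/family/Photos-work/holidays"
    mv "$demo/family/Photos-work/2019/img1.jpg" \
       "$demo/family/Photos-work/holidays/beach.jpg"
    rmdir "$demo/family/Photos-work/2019"

    # 3. Dry run first (-n): nothing is changed, rsync only reports its plan.
    rsync -aHvn --delete-after --no-inc-recursive \
        "$demo/family/Photos" "$demo/family/Photos-work" "$demo/backups"

    # 4. Real run: sync the old and the new tree together.
    rsync -aH --delete-after --no-inc-recursive \
        "$demo/family/Photos" "$demo/family/Photos-work" "$demo/backups"

    # On the backup side the renamed file shares its inode with the old path,
    # so it was linked there rather than transferred again.
    ls -li "$demo/backups/Photos/2019/img1.jpg" \
           "$demo/backups/Photos-work/holidays/beach.jpg"
fi
```

Absolute paths are used instead of cd so the snippet does not change the caller's working directory. Note that neither source path ends with a slash, so rsync creates Photos and Photos-work as subdirectories of the backup directory.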