June 17th, 2014

Using rsync on OS X

OS X includes rsync (v. 2) by default. Rsync is terribly useful for copying files, either between local drives or across networks. Across networks, by default it uses ssh to secure data transmission, and it uses checksums to verify that data wasn't corrupted in transmission (which is why I prefer it to, say, sftp or scp).

With the “-a” (archive) option, rsync also automagically recurses through directories and sub-directories. This can be a problem on OS X, because the OS X file manager litters dot-files all over the place; e.g., “.DS_Store” will show up on every drive you mount. Since rsync isn't built specifically for OS X, it treats these as regular files and replicates them along with everything else. That's not a problem, per se, but it is pointless and I find it annoying. You can use the exclude option in rsync to ignore those files, e.g., “--exclude=".??*"”, which matches any name that starts with a dot followed by at least two more characters.
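To see which names a pattern like “.??*” catches, here is a minimal sketch that uses the shell's own “case” matching as a stand-in. That is an assumption: rsync has its own pattern matcher, but for “?” and “*” applied to a single file name the two should agree. The helper name matches_exclude and the sample file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a bare file name matches .??*
# (a dot, then at least two more characters).
matches_exclude() {
    case $1 in
        .??*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Sample names only; none of these need to exist on disk.
for name in .DS_Store .Trashes .git .a notes.txt; do
    if matches_exclude "$name"; then
        echo "excluded: $name"
    else
        echo "kept:     $name"
    fi
done
```

The first three names come out as excluded, while “.a” and “notes.txt” are kept. So the pattern skips every dot-file except those with a single character after the dot.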

Another problem with using rsync on OS X is that nothing automatically prevents the system from going to sleep while an rsync transfer is ongoing. You can overcome this with the built-in utility caffeinate.

So, based on all that, here is what I've currently got in my OS X .profile file:

alias Rsync='caffeinate -is rsync -aPWIh --size-only --exclude=".??*"'

The “-is” options for caffeinate tell it what sorts of sleep to prevent: per its man page, “-i” prevents idle sleep and “-s” prevents system sleep (the latter only while on AC power). Admittedly, I'm not sure I really need both of those, but it works. I briefly cover the rsync options at the bottom of this post.
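If the same .profile gets copied to a machine that has no caffeinate (a Linux box, say), a small wrapper can fall back to running the command directly. This is only a sketch of one way to do it; the function name stay_awake is hypothetical, and the “-is” flags are the same ones used in the alias.

```shell
#!/bin/sh
# Hypothetical wrapper: run any command under caffeinate when it is
# available, otherwise run the command as-is.
stay_awake() {
    if command -v caffeinate >/dev/null 2>&1; then
        # -i: prevent idle sleep; -s: prevent system sleep (on AC power)
        caffeinate -is "$@"
    else
        "$@"
    fi
}
```

Usage would look like “stay_awake rsync -aPWIh --size-only --exclude='.??*' [source] [destination]”. Unlike an alias, a function also works inside scripts, where aliases are normally not expanded.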

What rsync expects after all the options is “[source host/folders] [destination host/folders]”. Exactly how you specify those can be complicated, so check the documentation for rsync.

You can also stick additional rsync options in front of the source. I nearly always start with a “dry run” by specifying “-n”. The dry run displays everything that rsync thinks you want it to do, without actually doing any of it (which is particularly important if you use any of the “delete” options that rsync has). It is also a way to make sure you've correctly specified any remote hosts.
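The dry-run-first habit can be wrapped up so it is hard to forget. The following is a rough sketch, not something from my .profile: the function name preview_then_sync is made up, and it simply runs rsync twice with whatever options and paths you hand it, first with “-n” added and then, if you confirm, without it.

```shell
#!/bin/sh
# Hypothetical helper: dry run first, real transfer only after a "y".
preview_then_sync() {
    # Pass 1: -n lists what rsync would do without changing anything.
    rsync -n "$@" || return 1

    printf 'Proceed with the real transfer? [y/N] '
    read -r answer
    case $answer in
        y|Y) rsync "$@" ;;       # Pass 2: same arguments, for real
        *)   echo 'Skipped.' ;;
    esac
}
```

For example, “preview_then_sync -aPh ~/Documents/ /Volumes/Backup/Documents/” (both paths are placeholders). Note that a trailing slash on the source means “the contents of this folder”, while leaving it off copies the folder itself into the destination.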

I use rsync to move files from my shiny MacBook to external drives, to USB flash drives, to NAS devices, to iOS devices, to local servers, to servers out on the intarwebs, etcetcetc. To say that rsync is widely supported doesn't entirely explain just how widely supported it is. Suffice it to say, if something uses the Linux kernel, chances are that it either already has rsync installed (though you may need to turn it on), or someone has already ported rsync to it and all you have to do is install it.

Summary of the rsync options

The rsync options are “archive” (-a), “display progress” (-P, which also keeps partially transferred files), “whole file only” (-W, instead of determining and transferring only the parts of each file that differ between source and destination; see the rsync documentation for details), “ignore time-stamps” (-I; by default, rsync treats a file as changed when its size or modification time differs), and “human-readable units” (-h, i.e., use “MB” instead of raw byte counts). The “--size-only” flag goes with the “-I” flag. It tells rsync to compare only the size of a file that exists on both the source and the destination: if the sizes match, the file is skipped.

I disable the use of time-stamps for identifying changed files because I don't want to futz with getting the clocks synchronized on all my various computerators and devices. I disable the fancy algorithm for only sending the parts of files that have changed because the NAS devices I have are limited more by their CPU than by network speeds.
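For reference, here is the same set of options spelled out with their long names, which I find easier to read back months later. Treat it as a sketch: the function name rsync_spelled_out is hypothetical, and it leaves out caffeinate to keep the focus on the rsync flags.

```shell
#!/bin/sh
# Hypothetical function: the short flags -aPWIh written in long form.
#   -a  --archive             recurse; keep links, permissions, times, owners
#   -P  --partial --progress  keep partial files, show progress
#   -W  --whole-file          skip the delta-transfer algorithm
#   -I  --ignore-times        don't rely on matching time-stamps
#   -h  --human-readable      print sizes with unit suffixes
rsync_spelled_out() {
    rsync \
        --archive \
        --partial --progress \
        --whole-file \
        --ignore-times \
        --human-readable \
        --size-only \
        --exclude='.??*' \
        "$@"
}
```

Calling “rsync_spelled_out -n [source] [destination]” should behave like the alias with a dry run added, minus the sleep prevention.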

Note: I've made this world-viewable, so I've turned on comment screening. (I can guarantee at least one spammer will post a reply. Probably in Русский.)

