
November 9, 2012

How FUSE Can Break Rsync Backups

Update: See my followup article, Easily Running FUSE in an Isolated Mount Namespace, for a solution to this problem.

FUSE is cool, but by its nature has to introduce some non-standard semantics that you wouldn't see with a "real" filesystem. Sometimes these non-standard semantics can cause problems, as demonstrated by a recent experience I had with rsync-based backups on a multi-user Linux server that I administer.

The server performs regular snapshot backups of the filesystem using rsync and its --link-dest option to provide hard link-based dedupe. One day, an enterprising user mounted an sshfs filesystem in his home directory. That night, the snapshot backup failed with this error message:

rsync: readlink_stat("/home/aganlxl/cit") failed: Permission denied (13)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1060) [sender=3.0.7]

The problem is that, by default, only the user who owns the FUSE mount is allowed to access it. Not even root can. This is a security measure, since a malicious user could wreak havoc with a malicious FUSE mount (imagine an infinite filesystem, for example).

This behavior can be changed (with the 'allow_root' mount option, which non-root users may use only if the superuser has enabled 'user_allow_other' in fuse.conf), but that's not the answer. Besides the security implications, that would cause rsync to descend into the sshfs mount and start backing up the remote system!

The problem is that rsync needs to be able to access everything it's backing up. Running as root, this is usually not a problem. Root not being able to access something on the filesystem seems weird, but is actually nothing new - root-squashed NFS mounts can also cause this. But FUSE mounts are worse. To begin with, unlike root-squashed NFS mounts, users are allowed to plop down FUSE mounts anywhere they like.

But worse, root isn't even allowed to stat FUSE mounts. This means that rsync's -x option (to not cross filesystem boundaries) can't even be used to exclude FUSE mounts, since rsync needs to stat the directory to determine if it's a mount point! This behavior is outside of POSIX, which says that stat shall return EACCES if "search permission is denied for a component of the path prefix." Here the denial applies to the final component of the path (the mount point itself), not to a directory leading up to it.
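To make the stat dependency concrete, here is a minimal sketch of the kind of device-number comparison a "don't cross filesystems" check relies on. It is not rsync's actual code, and crosses_filesystem is a hypothetical helper name. The point is that the check cannot even start if the stat call fails, which is what happens (with EACCES) when root looks at another user's FUSE mount.

```c
#include <errno.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not taken from rsync): compare the device number
 * of `path` with that of the backup root.
 *
 * Returns 1 if `path` lives on a different filesystem than `root`,
 *         0 if both are on the same filesystem,
 *        -1 if either lstat() fails; errno then holds the reason.
 */
int crosses_filesystem(const char *root, const char *path)
{
    struct stat root_st, path_st;

    if (lstat(root, &root_st) != 0)
        return -1;

    /* For another user's FUSE mount this is the call that fails. */
    if (lstat(path, &path_st) != 0)
        return -1;

    return root_st.st_dev != path_st.st_dev;
}
```

A caller that gets -1 with errno set to EACCES has no device number to compare, so it cannot tell a FUSE mount point from an ordinary unreadable directory.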

In my opinion, a worthwhile addition to fuse.conf would be an option to restrict FUSE mounts to specific directories. With such an option, the admin could restrict FUSE mounts to locations that aren't backed up.

Until then, FUSE is disabled on this particular server.


November 8, 2012

Remote SSH Commands and Broken Connections

One problem with executing commands via ssh (that is, on ssh's command line, not via an interactive login shell) is that the command isn't terminated when the ssh connection dies. You can see this by running:

ssh otherhost /bin/sleep 600

and interrupting ssh with Ctrl+C. On otherhost, sleep will still be running. Its parent, the sshd process forked to handle the connection, will be gone, and sleep will have been reparented to init (PID 1).

You don't get this problem when you use ssh interactively. All the processes that you start have a controlling terminal, and when the connection dies and the controlling terminal goes away, the processes you started are killed with SIGHUP (unless they detached from the controlling terminal, such as with setsid).

One solution is to always allocate a terminal, by specifying ssh's -t option:

ssh -t otherhost /bin/sleep 600

But this isn't always feasible, especially if you're running ssh from a script which doesn't have a terminal.

We need a way to kill the remote command when its parent sshd process dies. Fortunately, on Linux, the prctl syscall provides a solution:

PR_SET_PDEATHSIG (since Linux 2.1.57)

Set the parent process death signal of the calling process to arg2 (either a signal value in the range 1..maxsig, or 0 to clear). This is the signal that the calling process will get when its parent dies. This value is cleared for the child of a fork(2).

We can write a simple C wrapper, called diewithparent, which calls prctl and then execs the command:

#include <signal.h>
#include <sys/prctl.h>
#include <unistd.h>

int main (int argc, char** argv)
{
    prctl(PR_SET_PDEATHSIG, SIGTERM);
    execvp(argv[1], argv + 1);
    return 127; /* only reached if exec failed */
}

And use it like this:

ssh otherhost diewithparent /bin/sleep 600

Download the complete C source, which features error checking and options parsing (so you can specify the signal number). (Compile with cc -o diewithparent diewithparent.c.)
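The downloadable source isn't reproduced here. As an independent sketch of what the error checking around the prctl call might look like, the hypothetical helper below (die_with_parent is my name for it, not something from diewithparent.c) checks prctl's return value and then checks whether the parent already went away before the request was registered. The expected_parent argument is an assumption: the caller has to record getppid() as early as it can.

```c
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to deliver `sig` to this process
 * when its parent dies.
 *
 * Returns 0 on success,
 *         1 if the parent had already changed (it died before the
 *           request was registered, so no signal will ever arrive),
 *        -1 if prctl() itself failed; errno is set by prctl().
 */
int die_with_parent(int sig, pid_t expected_parent)
{
    if (prctl(PR_SET_PDEATHSIG, (unsigned long)sig) != 0)
        return -1;

    /* Close the window between process start and the prctl() call. */
    if (getppid() != expected_parent)
        return 1;

    return 0;
}
```

A wrapper could call die_with_parent(SIGTERM, parent) right before execvp, where parent was saved from getppid() at the top of main, and exit immediately if the result is nonzero. That still leaves a small window before main runs, so treat this as a mitigation rather than a guarantee.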

Naturally, this is a Linux-only solution. On other systems, the best solution (as far as I can tell) would be to fork and exec the command. In the parent, continuously poll the parent PID (with getppid()). When it changes to 1, you know the parent has died so you kill the command.
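The polling approach could be sketched roughly as follows. This is an untested-in-production illustration, not a drop-in tool: run_until_parent_dies is a hypothetical name, the 100 ms poll interval is arbitrary, and instead of waiting for the parent PID to become 1 it compares against the PID recorded at startup, which is my own variation (it also covers systems where orphans are adopted by a process other than init).

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] with arguments argv in a child
 * process and kill it with SIGTERM if our own parent goes away.
 *
 * Returns the command's exit status, 128 + signal number if it was
 * killed by a signal, or -1 if fork() or waitpid() failed.
 */
int run_until_parent_dies(char *const argv[])
{
    const struct timespec poll_interval = { 0, 100 * 1000 * 1000 }; /* 100 ms */
    pid_t original_parent = getppid();
    pid_t child = fork();

    if (child < 0)
        return -1;

    if (child == 0) {
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    for (;;) {
        int status;
        pid_t done = waitpid(child, &status, WNOHANG);

        if (done == child) /* the command finished on its own */
            return WIFEXITED(status) ? WEXITSTATUS(status)
                                     : 128 + WTERMSIG(status);
        if (done < 0)
            return -1;

        if (getppid() != original_parent) {
            /* Our parent died: take the command down with us. */
            kill(child, SIGTERM);
            waitpid(child, &status, 0);
            return 128 + SIGTERM;
        }

        nanosleep(&poll_interval, NULL);
    }
}
```

The cost compared to the prctl version is an extra process that wakes up periodically, and the command can outlive the connection by up to one poll interval.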
