March 27, 2013
Running a Robust NTP Daemon
Accurate time is essential on a server, and running ntpd is the best way to ensure it. Unfortunately, ntpd, especially on Debian, can be finicky, and in the past I've had trouble with clock drift and with ntpd failing to start on boot. Here are my best practices to avoid the problems.
On Debian, make sure lockfile-progs is installed
On Debian (and likely Ubuntu too), there's a nasty race condition on boot between ntpdate and ntpd. When the network comes up, ifupdown runs ntpdate to synchronize the clock. But at about the same time, ntpd starts. If ntpdate is still running when ntpd starts, ntpd can't bind to the local NTP port and terminates. Sometimes ntpd starts on boot and sometimes it doesn't!
The Debian scripts avoid this using locks, but only if the lockfile-progs package is installed. This is a Recommends: for ntpdate, but if you don't install Recommends: by default, you may miss this.
If you use DHCP, don't request ntp-servers
If your system gets its IP address from DHCP using dhclient, then by default dhclient will update your ntpd configuration with NTP server information it receives from the DHCP server. It is extremely frustrating to configure reliable upstream NTP servers only to have them replaced with unreliable servers (such as those that advertise phony leap seconds, as has happened to me multiple times). And the last thing you want is your configuration management system fighting with dhclient over what NTP servers to use.
To prevent this, edit /etc/dhcp/dhclient.conf and remove ntp-servers from the request line.
Don't use the undisciplined local clock (i.e. server 127.127.1.0)
Make sure these lines aren't in your ntp.conf:
server 127.127.1.0
fudge 127.127.1.0 stratum 10
These lines enable the Undisciplined Local Clock, and cause ntpd to start using your local clock as a time source if the real NTP servers aren't reachable. This can be useful if you want to keep a group of servers on a local network in sync even if your Internet connection goes down, but in general you don't need or want this. I've seen strange situations where the local clock becomes preferred over the real NTP servers, resulting in clock drift that goes uncorrected. Best to disable the local clock by removing all references to 127.127.1.0.
March 2, 2013
GCC's Implementation of basic_istream::ignore() is Broken
The implementation of std::basic_istream::ignore()
in GCC's C++ standard library suffers from a serious flaw.
After ignoring the n characters as requested, it checks to see
if end-of-file has been reached. If it has, then the stream's eofbit is set.
The problem is that to check for end-of-file, ignore()
has to essentially peek ahead in the stream one character beyond what
you've ignored. That means that if
you ask to ignore all the characters
currently available in the stream buffer, ignore() causes
an underflow of the buffer. If it's a file stream, the buffer can be
refilled by reading from the filesystem in a finite amount of time,
so this is merely inefficient. But if it's a socket, this underflow can be fatal:
your program may block forever waiting for bytes that never come. This is horribly
unintuitive and is inconsistent with the behavior of std::basic_istream::read(),
which does not check for end-of-file after reading the requested number of characters.
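One way to see whether a given standard library peeks ahead is to hand ignore() a stream buffer that counts its own underflows. The following is a minimal sketch, not taken from GCC or from any bug report: CountingBuf, ProbeResult, and probe_exact_ignore are hypothetical names invented for this illustration. The buffer holds a fixed string and returns end-of-file on underflow, which is where a socket-backed buffer would block.

```cpp
#include <ios>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical test buffer: it exposes a fixed string as its get area and
// counts every underflow(), i.e. every request for data beyond that string.
class CountingBuf : public std::streambuf {
public:
    explicit CountingBuf(const std::string& s) : data_(s), underflows_(0) {
        char* p = &data_[0];
        setg(p, p, p + data_.size());
    }
    int underflows() const { return underflows_; }

protected:
    int_type underflow() override {
        ++underflows_;
        return traits_type::eof();  // a socket-backed buffer would block here
    }

private:
    std::string data_;
    int underflows_;
};

struct ProbeResult {
    std::streamsize ignored;  // value of gcount() after ignore()
    int underflows;           // how often the library asked for more data
    bool eof;                 // whether eofbit ended up set
};

// Ignore exactly as many characters as are buffered, then report what the
// standard library did. The result depends on the implementation in use.
ProbeResult probe_exact_ignore(const std::string& buffered) {
    CountingBuf buf(buffered);
    std::istream in(&buf);
    in.ignore(static_cast<std::streamsize>(buffered.size()));
    ProbeResult r = { in.gcount(), buf.underflows(), in.eof() };
    return r;
}
```

Calling probe_exact_ignore("hello") always reports five ignored characters. A library that behaves as described above would also report at least one underflow and a set eofbit, while a library that stops as soon as n characters are extracted would report zero underflows.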
The origin of this problem is that the C++ standard is perhaps not as clear as it should be regarding
ignore(). From section 27.7.2.3:
basic_istream<charT,traits>& ignore(streamsize n = 1, int_type delim = traits::eof());
Effects: Behaves as an unformatted input function (as described in 27.7.2.3, paragraph 1). After constructing a sentry object, extracts characters and discards them. Characters are extracted until any of the following occurs:
- if n != numeric_limits<streamsize>::max() (18.3.2), n characters are extracted
- end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit), which may throw ios_base::failure (27.5.5.4));
- traits::eq_int_type(traits::to_int_type(c), delim) for the next available input character c (in which case c is extracted).
Note that the Standard does not specify the order in which the checks should be performed, suggesting
that a conformant implementation may check for end-of-file before checking if n characters have been
extracted, as GCC does. You may think that the order is implicit in the ordering of the bullet points,
but if it were, then why would the Standard explicitly state the order in the case of getline()?
From section 27.7.2.3:
basic_istream<charT,traits>& getline(char_type* s, streamsize n, char_type delim);
Effects: Behaves as an unformatted input function (as described in 27.7.2.3, paragraph 1). After constructing a sentry object, extracts characters and stores them into successive locations of an array whose first element is designated by s. Characters are extracted and stored until one of the following occurs:
- end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit));
- traits::eq(c, delim) for the next available input character c (in which case the input character is extracted but not stored);
- n is less than one or n - 1 characters are stored (in which case the function calls setstate(failbit)).
These conditions are tested in the order shown.
At least this is one GCC developer's justification
for GCC's behavior. However, I have a different take:
I believe that the only way to satisfy the Standard's requirements for ignore() is to perform the checks
in the order presented. The Standard says that "characters are extracted until any of the following occurs."
That means that when n characters have been extracted, ignore() needs to terminate, since this condition is
among "any of the following." But, if ignore() first checks for end-of-file and blocks forever, then it
doesn't terminate. This constrains the order in which a conformant implementation can check the conditions, and is
perhaps why the Standard does not need to specify an explicit order here, but does for getline() where it
really does want the end-of-file check to occur first.
I have left a comment on the GCC bug stating
my interpretation. One problem with fixing this bug is that it will break code that has come to depend on eofbit being
set if you ignore all the data remaining on a stream, though I'm frankly skeptical that much code would make
that assumption. Also, both LLVM's libcxx
and Microsoft Visual Studio (version 2005, at least) implement ignore() according to my interpretation
of the Standard.
In the meantime, be very, very careful with your use of ignore(). Only use it on
file streams or when you know you'll be ignoring fewer characters than are available to be read.
And don't rely on eofbit being set one way or the other.
If you need a more reliable version of ignore(), I've written a
non-member function implementation which takes a std::basic_istream
as its first argument. It is very nearly a drop-in replacement for the member function
(it even properly throws exceptions depending on the stream's exceptions mask), except that it returns
the number of bytes ignored (not a reference to the stream) in lieu of making the number of bytes
available by a call to gcount(). (It's not possible for a non-member function
to set the value returned by gcount().)
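To make the idea concrete, here is a rough sketch of what such a non-member function could look like. It is not the implementation referred to above: the name ignore_no_peek and every detail of the body are assumptions made for illustration. It tests the character count before touching the stream buffer, so the buffer is never consulted once n characters have been discarded.

```cpp
#include <ios>
#include <istream>
#include <limits>
#include <sstream>
#include <streambuf>

// Hypothetical non-member replacement for the member ignore().
// Returns the number of characters discarded instead of setting gcount().
// Simplification: exceptions thrown by the stream buffer itself are not
// translated into badbit, which the member functions would do.
template <class CharT, class Traits>
std::streamsize ignore_no_peek(std::basic_istream<CharT, Traits>& in,
                               std::streamsize n = 1,
                               typename Traits::int_type delim = Traits::eof())
{
    std::streamsize count = 0;

    // noskipws = true: behave as an unformatted input function.
    typename std::basic_istream<CharT, Traits>::sentry ok(in, true);
    if (!ok)
        return count;

    const bool unlimited = (n == std::numeric_limits<std::streamsize>::max());
    std::basic_streambuf<CharT, Traits>* buf = in.rdbuf();

    // The count is checked first, so no character beyond the n-th is
    // ever requested from the buffer.
    while (unlimited || count < n) {
        const typename Traits::int_type c = buf->sbumpc();
        if (Traits::eq_int_type(c, Traits::eof())) {
            // setstate() throws if eofbit is in the stream's exceptions mask.
            in.setstate(std::ios_base::eofbit);
            break;
        }
        ++count;
        if (Traits::eq_int_type(c, delim))
            break;  // the delimiter is extracted and counted
    }
    return count;
}
```

With a std::istringstream holding "hello", ignore_no_peek(in, 5) returns 5 and leaves eofbit clear. Only a further call, which actually runs out of data, returns 0 and sets eofbit.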
March 1, 2013
Why Do Hackers Love Namecheap and Hate Name.com?
Update: as of 2022, I no longer use or recommend name.com. My preferred registrars are, in order, Amazon Route 53, Gandi, and Google Domains.
Namecheap has brilliant marketing. The day that GoDaddy announced their support of SOPA, Namecheap pounced on the opportunity. They penned a passionate blog post and declared December 29, 2011 "Move Your Domain Day," complete with a patriotic theme, a coupon code "SOPAsucks," and donations to the EFF. Move Your Domain Day was such a success that it has its own Wikipedia article. Namecheap led the charge against GoDaddy, and I think it's safe to assume that most people who transferred from GoDaddy because of SOPA transferred to Namecheap. Now they seem to be the preferred registrar of the Hacker News/Reddit crowd.
Now consider Name.com. They too opposed SOPA and encouraged transfers using a "nodaddy" coupon code. But they didn't exert nearly as much effort as Namecheap and as a consequence probably lost out on a lot of potential transfers.
But Name.com has a bigger problem. They get raked over the coals on Hacker News because their free DNS hosting service adds a wildcard record that points their users' otherwise non-existent subdomains to their own ad-laden landing page. I think that's bad and they shouldn't do it. But at the same time, people should understand the distinction between domain registration and DNS hosting.
I'm very happy with Name.com as a domain registrar. It is the best I've
used (among Network Solutions, GoDaddy, Directnic, Gandi, and 1&1) and the
first that I haven't had any significant complaints about. I haven't
used Namecheap. Namecheap looks like a good registrar too, but Name.com appears
at least as good, if not better. Their UI is friendly and uncluttered. Their
about page makes them seem just as non-evil as Namecheap.
Name.com has long supported both IPv6 glue records and DNSSEC (Namecheap
recently added IPv6 glue but still has no DNSSEC support).
Name.com has two-factor authentication, which is pretty important for such a critical service.
When you buy a domain from Name.com, you're paying for the registration. You don't have to use their DNS service, especially when there are so many good options for DNS hosting: Amazon's Route 53 is very inexpensive, Cloudflare offers DNS as part of their free plan, Hurricane Electric has a free DNS service, Linode has free DNS for their customers, and there are paid providers like ZoneEdit and SlickDNS. Or you can host your own DNS.
As a general rule, registrars make crummy DNS providers. Usually the interface is clunky and they don't support all the record types. Only a few months after registering my first domain with Network Solutions, their entire DNS service suffered an hours-long outage during which my domain was unresolvable. Ever since, I've hosted my own DNS without a problem (recently I added Linode as my slave).
I don't have a dog in this race, but I think it would be a shame for someone to exclude a good registrar like Name.com from consideration just because they're a bad DNS provider. It would also be a shame for someone to use any registrar's crummy DNS service when there are so many better options out there.
February 9, 2013
Easily Running FUSE in an Isolated Mount Namespace
I've previously discussed how FUSE's nonstandard semantics can cause problems with rsync-based backups. In short, when stat() is called on a FUSE mount owned by another user, the kernel returns EACCES, even though POSIX says EACCES is for when a file's path can't be traversed. This is done to isolate the effects of an unstable or malicious FUSE filesystem to only the user who mounted it.
In my opinion, instead of stretching POSIX by returning EACCES, a better way to isolate FUSE mounts would be to make them invisible to other users. This has been discussed before, first in 2005 with a patch to add "private mounts" to Linux and later in 2006 with a proposal for stat() to return a fake stat structure for FUSE mounts. However, both times the proposals were rejected in favor of using the more general namespace support along with shared subtrees to achieve isolated FUSE mounts.
Unfortunately, while namespaces and shared subtrees are quite powerful, they have not seen widespread adoption, and userspace support for them is limited to some basic command primitives that don't do much on their own. While there is a PAM namespaces module, it's tailored to giving users isolated /tmp directories.
So, I wrote a very simple C program called with-fuse. with-fuse takes a command as its argument and executes that command with gid fuse and in an isolated mount namespace. Any mounts and unmounts performed inside the private namespace are invisible to the rest of the system. At the same time, mounts and unmounts performed in the global namespace are immediately visible inside the private namespace. with-fuse can be safely installed setuid-root to give users on the system a means of using FUSE without affecting other users.
Example:
$ with-fuse /bin/sh
$ sshfs ...
$ exit
For with-fuse to work, the following command must be run at system boot (for example, from /etc/rc.local):
mount --make-rshared /
Note that with-fuse creates a per-process namespace, not a per-user namespace. That means that the mounts created in one with-fuse namespace will not be visible in another with-fuse namespace, even if both namespaces are owned by the same user. Therefore, the user may wish to run a terminal multiplexer like GNU Screen inside his with-fuse namespace, in order to share the namespace among several shells:
$ with-fuse screen
To ensure that users only use FUSE from within a with-fuse namespace, /dev/fuse should be owned by group fuse and have 660 permissions. No user should be a member of group fuse, as with-fuse will take care of granting that GID.
You can download the source for with-fuse here. It's short and extensively commented if you'd like to learn how it works.
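For readers who want the general shape before opening the source, the core of such a wrapper might look roughly like the sketch below. This is not the actual with-fuse code: the function names are hypothetical, the steps are my guess at a plausible sequence (unshare the mount namespace, mark its mounts as slaves so changes flow in but not out, take on the fuse GID, drop root, exec), and a real setuid program needs more careful error reporting and environment handling than is shown here. It is Linux-specific.

```cpp
#include <sched.h>      // unshare, CLONE_NEWNS (Linux-specific)
#include <sys/mount.h>  // mount, MS_REC, MS_SLAVE
#include <sys/types.h>  // gid_t
#include <unistd.h>     // setgid, setuid, getuid, execvp
#include <cerrno>

// Hypothetical helper: detach the calling process from the global mount
// namespace. Needs CAP_SYS_ADMIN (for example via setuid-root); on failure
// it returns false and leaves the cause in errno.
bool enter_isolated_mount_namespace() {
    if (unshare(CLONE_NEWNS) != 0)
        return false;
    // Make every mount in the new namespace a slave of its original:
    // mounts made in the global (shared) namespace still propagate in,
    // but mounts made here do not propagate back out.
    if (mount("none", "/", nullptr, MS_REC | MS_SLAVE, nullptr) != 0)
        return false;
    return true;
}

// Hypothetical entry point: run argv[0] with GID fuse_gid inside a private
// mount namespace. Only returns (with -1) if something failed.
int run_with_fuse(gid_t fuse_gid, char* const argv[]) {
    if (!enter_isolated_mount_namespace())
        return -1;
    if (setgid(fuse_gid) != 0)   // assumption: fuse becomes the process GID
        return -1;
    if (setuid(getuid()) != 0)   // drop root back to the invoking user
        return -1;
    execvp(argv[0], argv);
    return -1;                   // execvp only returns on error
}
```

The group is changed before the user ID because an unprivileged process can no longer call setgid() with an arbitrary GID once root has been dropped.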
December 18, 2012
Insecure and Inconvenient: Gmail's Broken Certificate Validation
Gmail has a feature to periodically fetch mail from POP accounts into your Gmail account. Although Gmail supports POP over SSL, for a long time Gmail did not check the validity of the POP server's SSL certificate. This was a security flaw: an active attacker could pull off a man-in-the-middle attack and intercept the POP traffic by presenting a bogus certificate. This was something that needed to be fixed, though in practice the likelihood of an active attack between the Gmail servers and your POP server was probably very low, much lower than the chance of an active attack between your laptop in a coffee shop and your POP server.
This month they decided to fix the problem, and their help now states:
Gmail uses "strict" SSL security. This means that we'll always enforce that your other provider's remote server has a valid SSL certificate. This offers a higher level of security to better protect your information.
Unfortunately, some cursory testing reveals that this "strict" security falls short and does nothing to prevent active attacks. While Gmail properly rejects self-signed certificates, it does not verify that the certificate's common name matches the server's hostname. Even if Gmail thinks it's connecting to alice.com's POP server, it will blindly accept a certificate for mallory.com, as long as it's signed by a recognized certificate authority. Consequently, an attacker can still successfully pull off a man-in-the-middle attack by purchasing a certificate for his own domain and using it in the attack.
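The missing step is small. As a simplified illustration (the function names are hypothetical, and this is not how Gmail or any particular TLS library implements it), a name check compares the name in the certificate against the hostname the client intended to reach, ignoring case and allowing a wildcard only as the entire leftmost label. Production code should rely on its TLS library's own hostname verification instead (OpenSSL, for instance, provides X509_check_host), and real validators also consult the certificate's subjectAltName entries.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Case-insensitive comparison of two ASCII host names.
bool equals_ignore_case(const std::string& a, const std::string& b) {
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Hypothetical check: does the name found in the certificate cover the
// host we meant to connect to? A leading "*." matches exactly one label.
bool certificate_name_matches(const std::string& cert_name,
                              const std::string& host) {
    if (cert_name.size() > 2 && cert_name[0] == '*' && cert_name[1] == '.') {
        const std::size_t dot = host.find('.');
        if (dot == std::string::npos || dot == 0)
            return false;
        // Compare ".example.com" against everything after host's first label.
        return equals_ignore_case(cert_name.substr(1), host.substr(dot));
    }
    return equals_ignore_case(cert_name, host);
}
```

Under this check, a certificate issued for mallory.com is rejected when the client meant to reach alice.com, no matter which certificate authority signed it.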
As an example, here's the error message you receive when you try to use an invalid username and password with a server using a self-signed certificate (note: since this article was written, this server no longer uses a self-signed certificate so you can't reproduce with this particular hostname):

In contrast, here's what you get when you try to use an invalid username and password with a server using a properly-signed but misnamed certificate. It's not a certificate error from Gmail, but an authentication error from the server (meaning the password has already been transmitted):

The certificate's common name is really techhouse.org, not techhouse.brown.edu, which we can verify using the openssl command:
$ openssl s_client -connect techhouse.brown.edu:995
...
subject=/C=US/postalCode=02912/ST=Rhode Island/L=Providence/street=Box 6820 Brown University/O=Technology House/OU=TH Server/OU=Provided by DNC Holdings, Inc./OU=Direct NIC Pro SSL/CN=techhouse.org
issuer=/C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=COMODO High-Assurance Secure Server CA
...
Furthermore, while this change does absolutely nothing to improve security, it seriously inconveniences users who want to connect to POP servers with self-signed certificates. There is no way to override certificate errors, which means that users who are willing to accept the low risk of an active attack in exchange for saving some money on a "real" certificate are out of luck. Insultingly, plain text POP is still an option, so Google can't claim to be protecting users from themselves. While self-signed certificates are verboten, transmitting your password and mail in the clear on the Internet where it can be passively eavesdropped is just fine.
So Google has made the whole situation worse and not one iota better. But more troubling, it shows that not even Google can get this right all the time, which reduces my confidence not only in the overall security of Google's services, but also in the ability of people who aren't Google to do this properly. Certificate validation bugs like forgetting to check the common name are all too common and were the subject of a great paper recently published titled "The most dangerous code in the world." If you want to learn more, especially if you think you might ever program with SSL, you have got to read that paper.