December 1, 2022
Checking if a Certificate is Revoked: How Hard Can It Be?
SSLMate's Certificate Transparency Search API now returns two new fields that tell you if, why, and when the certificate was revoked:
"revoked":true,
"revocation":{"time":"2021-10-27T21:38:48Z","reason":0,"checked_at":"2022-10-18T14:49:56Z"},
(See the complete API response)
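For a consumer of the API, the two fields can be decoded with a small struct. This is a minimal Go sketch, not official client code: the type and function names are illustrative, and it assumes the "revocation" object is simply absent when there is nothing to report, which the excerpt above does not confirm.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Revocation mirrors the "revocation" object shown in the excerpt above.
type Revocation struct {
	Time      time.Time `json:"time"`
	Reason    int       `json:"reason"` // CRL reason code
	CheckedAt time.Time `json:"checked_at"`
}

// Issuance holds only the two revocation-related fields; the real API
// response contains many more. The name is hypothetical.
type Issuance struct {
	Revoked    bool        `json:"revoked"`
	Revocation *Revocation `json:"revocation"` // assumed nil when absent
}

// parseIssuance decodes one JSON object into an Issuance.
func parseIssuance(data []byte) (Issuance, error) {
	var iss Issuance
	err := json.Unmarshal(data, &iss)
	return iss, err
}

func main() {
	sample := []byte(`{"revoked":true,"revocation":{"time":"2021-10-27T21:38:48Z","reason":0,"checked_at":"2022-10-18T14:49:56Z"}}`)
	iss, err := parseIssuance(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	if iss.Revoked && iss.Revocation != nil {
		fmt.Printf("revoked at %s (reason %d)\n",
			iss.Revocation.Time.Format(time.RFC3339), iss.Revocation.Reason)
	}
}
```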
This simple-sounding feature was obnoxious to implement, and required dealing with some amazingly creative screwups by certificate authorities, and a clunky system called the Common CA Database that's built on Salesforce. Just how dysfunctional is the WebPKI? Buckle up and find out!
Background on Certificate Revocation
There are two ways for a CA to publish that a certificate is revoked: the online certificate status protocol (OCSP), and certificate revocation lists (CRLs).
With OCSP, you send an HTTP request to the CA's OCSP server asking, "hey is the certificate with this serial number revoked?" and the CA is supposed to respond "yeah" or "nah", but often responds with "I dunno" or doesn't respond at all. CAs are required to support OCSP, and it's easy to find a CA's OCSP server (the URL is included in the certificate itself) but I didn't want to use it for the CT Search API: each API response can contain up to 100 certificates, so I'd have to make up to 100 OCSP requests just to build a single response. Given the slowness and unreliability of OCSP, that was a no go.
With CRLs, the CA publishes one or more lists of all revoked serial numbers.
This would be much easier to deal with: I could write a cron job to download every
CRL, insert all the entries into my trusty PostgreSQL database, and then building
a CT Search API response would be as simple as JOINing with the crl_entry
table!
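The "turn a CRL into rows" part of that plan might look roughly like the Go sketch below. It is an illustration under assumptions, not the actual cron job: crlEntry is a hypothetical row shape for the crl_entry table, sampleCRL exists only to produce a throwaway CRL for the demo, and signature verification (for example with the CRL's CheckSignatureFrom method) and the database insert are left out.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// crlEntry is a hypothetical row shape for the crl_entry table.
type crlEntry struct {
	Serial    string // decimal serial number
	RevokedAt time.Time
}

// entriesFromCRL parses a DER-encoded CRL and returns one entry per revoked
// certificate. It does NOT verify the CRL's signature.
func entriesFromCRL(der []byte) ([]crlEntry, error) {
	crl, err := x509.ParseRevocationList(der)
	if err != nil {
		return nil, fmt.Errorf("parsing CRL: %w", err)
	}
	// Newer Go releases also expose RevokedCertificateEntries; the older
	// field is used here so the sketch works on Go 1.19 and later.
	entries := make([]crlEntry, 0, len(crl.RevokedCertificates))
	for _, rc := range crl.RevokedCertificates {
		entries = append(entries, crlEntry{
			Serial:    rc.SerialNumber.String(),
			RevokedAt: rc.RevocationTime,
		})
	}
	return entries, nil
}

// sampleCRL builds a throwaway CRL signed by a freshly generated test key,
// purely so the demo has something to parse.
func sampleCRL(serials ...int64) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	issuer := &x509.Certificate{
		Subject:      pkix.Name{CommonName: "Example Test CA"},
		KeyUsage:     x509.KeyUsageCRLSign,
		SubjectKeyId: []byte{1, 2, 3, 4},
	}
	now := time.Now()
	tmpl := &x509.RevocationList{
		Number:     big.NewInt(1),
		ThisUpdate: now,
		NextUpdate: now.Add(24 * time.Hour),
	}
	for _, s := range serials {
		tmpl.RevokedCertificates = append(tmpl.RevokedCertificates, pkix.RevokedCertificate{
			SerialNumber:   big.NewInt(s),
			RevocationTime: now,
		})
	}
	return x509.CreateRevocationList(rand.Reader, tmpl, issuer, key)
}

func main() {
	der, err := sampleCRL(7, 42)
	if err != nil {
		fmt.Println("building sample CRL:", err)
		return
	}
	entries, err := entriesFromCRL(der)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, e := range entries {
		fmt.Println(e.Serial, e.RevokedAt.Format(time.RFC3339))
	}
}
```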
Historically, CRLs weren't an option because not all CAs published CRLs, but on October 1, 2022, both Mozilla and Apple began requiring all CAs in their root programs to publish CRLs. Even better, they required CAs to disclose the URLs of their CRLs in the Common CA Database (CCADB), which is available to the public in the form of a chonky CSV file. Specifically, two new columns were added to the CSV: "Full CRL Issued By This CA", which is populated with a URL if the CA publishes a single CRL, and "JSON Array of Partitioned CRLs", which is populated with a JSON array of URLs if the CA splits its list of revoked certificates across multiple CRLs.
So I got to work writing a cron job in Go that would 1) download and parse the CCADB CSV file to determine the URL of every CRL 2) download, parse, and verify every CRL and 3) insert the CRL entries into PostgreSQL.
How hard could this be?
This wasn't my first rodeo so I knew it would be hard. And I was right! The only question was what flavor of dysfunction I'd be encountering.
CCADB Sucks
The CCADB is a database run by Mozilla that contains information about publicly-trusted certificate authorities. The four major browser makers (Mozilla, Apple, Chrome, and Microsoft) use the CCADB to keep track of the CAs which are trusted by their products.
CCADB could be a fairly simple CRUD app, but instead it's built on
Salesforce, which means it's actual crud. CAs use a clunky enterprise-grade UI to update their information, such as to disclose
their CRLs. Good news: there's an API. Bad news: here's how to get API credentials:
Salesforce will redirect to the callback url (specified in 'redirect_uri'). Quickly freeze the loading of the page and look in the browser address bar to extract the 'authorization code', save the code for the next steps.
To make matters worse, CCADB's data model is wrong (it's oriented around certificates rather than subject+key) which means the same information about a CA needs to be entered in multiple places. There is very little validation of anything a CA inputs. Consequently, the information in the CCADB is often missing, inconsistent, or just flat out wrong.
In the "Full CRL Issued By This CA" column, I saw:
- URLs without a protocol
- Multiple URLs
- The strings "expired" and "revoked"
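A defensive check for this column could reject all three kinds of junk before any download is attempted. The Go sketch below is only a heuristic under stated assumptions: the function name is invented, and treating any whitespace, comma, or semicolon as a sign of "multiple URLs" is a simplification, since those characters can technically appear in a valid URL.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// validateFullCRLField returns the CRL URL from a "Full CRL Issued By This CA"
// cell, or an error if the cell does not look like exactly one http(s) URL.
// An empty cell yields "" and no error (no full CRL disclosed).
func validateFullCRLField(field string) (string, error) {
	s := strings.TrimSpace(field)
	if s == "" {
		return "", nil
	}
	// Heuristic: separators inside the cell suggest several URLs were pasted in.
	if strings.ContainsAny(s, " \t\r\n,;") {
		return "", fmt.Errorf("%q looks like more than one URL", s)
	}
	u, err := url.Parse(s)
	if err != nil {
		return "", fmt.Errorf("%q is not a URL: %w", s, err)
	}
	// Catches scheme-less URLs as well as stray words like "expired".
	if u.Scheme != "http" && u.Scheme != "https" {
		return "", fmt.Errorf("%q has no http(s) scheme", s)
	}
	if u.Host == "" {
		return "", fmt.Errorf("%q has no host", s)
	}
	return s, nil
}

func main() {
	for _, cell := range []string{
		"http://crl.example.com/ca.crl",
		"crl.example.com/ca.crl",
		"http://a.example/1.crl, http://b.example/2.crl",
		"expired",
	} {
		_, err := validateFullCRLField(cell)
		fmt.Printf("%q -> valid=%v\n", cell, err == nil)
	}
}
```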
Meanwhile, the data for "JSON Array of Partitioned CRLs" could be divided into three categories:
- The empty array ([]).
- A comma-separated list of URLs, with no square brackets or quotes.
- A comma-separated list of URLs, with square brackets but without quotes.
In other words, the only well-formed JSON in sight was the empty array.
Initially, I assumed that CAs didn't know how to write non-trivial JSON, because that seems like a skill they would struggle with. Turned out that Salesforce was stripping quotes from the CSV export. OK, CAs, it's not your fault this time. (Well, except for the one who left out square brackets.) But don't get too smug, CAs - we haven't tried to download your CRLs yet.
(The CSV was eventually fixed, but to unblock my progress I had to parse this column with a mash of strings.Trim and strings.Split. Even Mozilla had to resort to such hacks to parse their own CSV file.)
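A tolerant parser along those lines might look like the Go sketch below. It is a reconstruction of the general idea, not the code actually used: the function name is invented, and it tries real JSON first so that it keeps working once the export is fixed.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parsePartitionedCRLs extracts URLs from a "JSON Array of Partitioned CRLs"
// cell. It accepts well-formed JSON as well as the quote-stripped and
// bracket-less variants described above.
func parsePartitionedCRLs(field string) []string {
	s := strings.TrimSpace(field)
	if s == "" {
		return nil
	}
	// Happy path: the cell really is a JSON array of strings.
	var urls []string
	if err := json.Unmarshal([]byte(s), &urls); err == nil {
		return urls
	}
	// Fallback: drop optional brackets, split on commas, trim junk.
	urls = nil
	s = strings.TrimPrefix(s, "[")
	s = strings.TrimSuffix(s, "]")
	for _, part := range strings.Split(s, ",") {
		part = strings.Trim(part, " \t\r\n\"")
		if part != "" {
			urls = append(urls, part)
		}
	}
	return urls
}

func main() {
	for _, cell := range []string{
		`[]`,
		`http://a.example/1.crl,http://b.example/2.crl`,
		`[http://a.example/1.crl, http://b.example/2.crl]`,
		`["http://a.example/1.crl"]`,
	} {
		fmt.Printf("%-50s -> %d URL(s)\n", cell, len(parsePartitionedCRLs(cell)))
	}
}
```

Note that splitting on commas would mangle a URL that itself contains a comma, so this is only reasonable as a stopgap for data known to be malformed.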
CAs Suck
Once I got the CCADB CSV parsed successfully, it was time to download some CRLs! Surely, this would be easy - even though CRLs weren't mandatory before October 1, the vast majority of CAs had been publishing CRLs for years, and plenty of clients were already consuming them. Surely, any problems would have been discovered and fixed by now, right?
Ah hah hah hah hah.
I immediately ran into some fairly basic issues, like Amazon's CRLs returning a 404 error, D-TRUST encoding CRLs as PEM instead of DER, or Sectigo disclosing a CRL with a non-existent hostname because they forgot to publish a DNS record, as well as some more... interesting issues:
GoDaddy
Since root certificate keys are kept offline, CRLs for root certificates
have to be generated manually during a signing ceremony.
Signing ceremonies are extremely serious affairs that involve donning
ceremonial robes, entering a locked cage, pulling a laptop out of a safe,
and manually running openssl commands based on a script - and not the
shell variety, but the reams of dead tree variety. Using the
openssl command is hell in the best of circumstances - now imagine
doing it from inside a cage. The smarter CAs write dedicated ceremony tooling
instead of using openssl. The rest bungle ceremonies on the regular,
as GoDaddy did here when they generated CRLs with an obsolete version number and without a required extension, which consequently couldn't be parsed by Go. To GoDaddy's credit,
they are now planning to switch to dedicated
ceremony tooling. Sometimes things do get better!
GlobalSign
Instead of setting the CRL signature algorithm based on the algorithm of the issuing CA's key, GlobalSign was setting it based on the algorithm of the issuing CA's signature. So when an elliptic curve intermediate CA was signed by an RSA root CA, the intermediate CA would produce CRLs that claimed to have RSA signatures even though they were really elliptic curve signatures.
After receiving my report, GlobalSign fixed their logic and added a test case.
Google Trust Services
Here is the list of CRL revocation reason codes defined by RFC 5280:
CRLReason ::= ENUMERATED {
unspecified (0),
keyCompromise (1),
cACompromise (2),
affiliationChanged (3),
superseded (4),
cessationOfOperation (5),
certificateHold (6),
-- value 7 is not used
removeFromCRL (8),
privilegeWithdrawn (9),
aACompromise (10) }
And here is the protobuf enum that Google uses internally for revocation reasons:
enum RevocationReason {
UNKNOWN = 0;
UNSPECIFIED = 1;
KEYCOMPROMISE = 2;
CACOMPROMISE = 3;
AFFILIATIONCHANGED = 4;
SUPERSEDED = 5;
CESSATIONOFOPERATION = 6;
CERTIFICATEHOLD = 7;
PRIVILEGEWITHDRAWN = 8;
AACOMPROMISE = 9;
}
As you can see, the reason code for unspecified is 0, and the protobuf enum value for unspecified is 1. The reason code for keyCompromise is 1 and the protobuf enum value for keyCompromise is 2. Therefore, by induction, all reason codes are exactly one less than the protobuf enum value. QED.
That was the logic of Google's code, which generated CRL reason codes by subtracting one from the protobuf enum value, instead of using a lookup table or switch statement. Of course, when it came time to revoke a certificate for the reason "privilegeWithdrawn", this resulted in a reason code of 7, which is not a valid reason code. Whoops.
At least this bug only materialized a few months ago, unlike most of the other CAs mentioned here, who had been publishing busted CRLs for years.
After receiving my report, Google fixed the CRL and added a test case, and will contribute to CRL linting efforts.
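For contrast, here is what the lookup-table approach looks like in Go. This is a sketch, not Google's code: the Go constants simply mirror the protobuf enum quoted above, the reason codes are the RFC 5280 values listed above, and the map and function names are invented.

```go
package main

import "fmt"

// RevocationReason mirrors the protobuf enum quoted above.
type RevocationReason int

const (
	Unknown RevocationReason = iota
	Unspecified
	KeyCompromise
	CACompromise
	AffiliationChanged
	Superseded
	CessationOfOperation
	CertificateHold
	PrivilegeWithdrawn
	AACompromise
)

// crlReasonCodes maps each enum value to its RFC 5280 CRLReason code.
// Unknown has no entry, and neither code 7 (unused) nor removeFromCRL (8)
// can ever be produced because the enum has no counterpart for them.
var crlReasonCodes = map[RevocationReason]int{
	Unspecified:          0,
	KeyCompromise:        1,
	CACompromise:         2,
	AffiliationChanged:   3,
	Superseded:           4,
	CessationOfOperation: 5,
	CertificateHold:      6,
	PrivilegeWithdrawn:   9,
	AACompromise:         10,
}

// crlReasonCode returns the CRL reason code for r, or an error if r has no
// defined mapping.
func crlReasonCode(r RevocationReason) (int, error) {
	code, ok := crlReasonCodes[r]
	if !ok {
		return 0, fmt.Errorf("no CRL reason code for enum value %d", r)
	}
	return code, nil
}

func main() {
	code, _ := crlReasonCode(PrivilegeWithdrawn)
	// Lookup table versus the subtract-one shortcut.
	fmt.Println(code, int(PrivilegeWithdrawn)-1) // prints "9 7"
}
```

The explicit table costs a few more lines than the arithmetic, but it makes the gap at 7 and 8 visible to anyone reading the code, and an unmapped value fails loudly instead of producing an invalid CRL.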
Conclusion
There are still some problems that I haven't investigated yet, but at this point, SSLMate knows the revocation status of the vast majority of publicly-trusted SSL certificates, and you can access it with just a simple HTTP query.
If you need to programmatically enumerate all the SSL certificates for a domain, such as to inventory your company's SSL certificates, then check out SSLMate's Certificate Transparency Search API. I don't know of any other service that pulls together information from over 40 Certificate Transparency logs and 3,500+ CRLs into one JSON API that's queryable by domain name. Best of all, I stand between you and all the WebPKI's dysfunction, so you can work on stuff you actually like, instead of wrangling CSVs and debugging CRL parsing errors.
Comments
Anonymous on 2022-12-02 at 06:35:
Excellent write up. Not many blogs go this in depth on certs and PKI. I’m glad I don’t have to deal with CAs on a daily basis and only have to run OpenSSL occasionally for CSRs.