May 18, 2022
Parsing a TLS Client Hello with Go's cryptobyte Package
In my original post about SNI proxying, I showed how you can parse a TLS Client Hello message (the first message that the client sends to the server in a TLS connection) in Go using an amazing hack that involves calling tls.Server with a read-only net.Conn wrapper and a GetConfigForClient callback that saves the tls.ClientHelloInfo argument.
I'm using this hack in snid, and if you accessed this blog post over IPv4,
it was used to route your connection.
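For readers who haven't seen it, the hack can be sketched roughly like this. It's a minimal reconstruction from the description above, not necessarily the exact code that snid runs. The names readOnlyConn, readClientHello, and sniOverPipe are illustrative, and the in-memory pipe is only there to make the example runnable without a network.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"io"
	"net"
	"time"
)

// readOnlyConn satisfies net.Conn but refuses writes, so tls.Server can read
// the Client Hello yet can never send anything back to the client.
type readOnlyConn struct{ reader io.Reader }

func (c readOnlyConn) Read(p []byte) (int, error)       { return c.reader.Read(p) }
func (c readOnlyConn) Write([]byte) (int, error)        { return 0, io.ErrClosedPipe }
func (c readOnlyConn) Close() error                     { return nil }
func (c readOnlyConn) LocalAddr() net.Addr              { return nil }
func (c readOnlyConn) RemoteAddr() net.Addr             { return nil }
func (c readOnlyConn) SetDeadline(time.Time) error      { return nil }
func (c readOnlyConn) SetReadDeadline(time.Time) error  { return nil }
func (c readOnlyConn) SetWriteDeadline(time.Time) error { return nil }

// readClientHello runs a throwaway server handshake that is expected to fail.
// The callback copies the ClientHelloInfo before the failure happens.
func readClientHello(r io.Reader) (*tls.ClientHelloInfo, error) {
	var hello *tls.ClientHelloInfo
	err := tls.Server(readOnlyConn{reader: r}, &tls.Config{
		GetConfigForClient: func(h *tls.ClientHelloInfo) (*tls.Config, error) {
			hello = new(tls.ClientHelloInfo)
			*hello = *h
			return nil, nil
		},
	}).Handshake()
	if hello == nil {
		return nil, err
	}
	return hello, nil
}

// sniOverPipe is a demo helper: it starts a TLS client on one end of an
// in-memory pipe and extracts the requested server name on the other end.
func sniOverPipe(serverName string) (string, error) {
	clientEnd, serverEnd := net.Pipe()
	defer clientEnd.Close()
	defer serverEnd.Close()
	go tls.Client(clientEnd, &tls.Config{ServerName: serverName}).Handshake()

	hello, err := readClientHello(serverEnd)
	if err != nil {
		return "", err
	}
	return hello.ServerName, nil
}

func main() {
	name, err := sniOverPipe("example.com")
	fmt.Println(name, err)
}
```

Note that the handshake error is deliberately ignored once the callback has fired: failing is the whole point, since the fake connection can't be written to.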
However, it's pretty gross, and only gives me access to the parts of the Client Hello message that are exposed in the tls.ClientHelloInfo struct. So I've decided to parse the Client Hello properly, using the golang.org/x/crypto/cryptobyte package, which is a great library that makes it easy to parse length-prefixed binary messages, such as those found in TLS.
cryptobyte was added to Go's quasi-standard x/crypto library in 2017. Since then, more and more parts of Go's TLS and X.509 libraries have been updated to use cryptobyte for parsing, often leading to significant performance gains.
In this post, I will show you how to use cryptobyte to parse a TLS Client Hello message, and introduce https://tlshello.agwa.name, an HTTP server that returns a JSON representation of the Client Hello message sent by your client.
Using cryptobyte
The cryptobyte parser is centered around the cryptobyte.String type, which is just a slice of bytes that points to the message that you are parsing:
type String []byte
cryptobyte.String contains methods that read a part of the message and advance the slice to point to the next part.
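For example, let's say you have a message consisting of a variable-length string prefixed by a 16-bit big-endian length, followed by a 32-bit big-endian integer. With the name "Andrew" and the ID 5961228 used below, that message is the following 12 bytes:
00 06 41 6E 64 72 65 77 00 5A F6 0C
(the length 6, then "Andrew", then the ID 0x005AF60C)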
First, you create a cryptobyte.String variable, message, which points to the above bytes. Then, to read the name, you use ReadUint16LengthPrefixed:
var name cryptobyte.String
message.ReadUint16LengthPrefixed(&name)
ReadUint16LengthPrefixed reads two things. First, it reads the 16-bit length. Second, it reads the number of bytes specified by the length. So, after the above function call, name points to the 6-byte string "Andrew", and message is mutated to point to the remaining 4 bytes containing the ID.
To read the ID, you use ReadUint32:
var id uint32
message.ReadUint32(&id)
After this call, id contains 5961228 (0x5AF60C) and message is empty.
Note that cryptobyte.String's methods return a bool indicating if the read was successful. In real code, you'd want to check the return value and return an error if necessary. It's also a good idea to call Empty to make sure that the string is really empty at the end, so you can detect and reject trailing garbage.
cryptobyte.String's methods are generally zero-copy. In the above example, name will point to the same memory region which message originally pointed to. This makes cryptobyte very efficient.
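Zero-copy has one practical consequence worth keeping in mind: a parsed field stays valid only as long as the underlying buffer isn't modified or reused. The sketch below uses plain slicing as a stand-in for a zero-copy read (cryptobyte.String is itself a []byte), so it runs with the standard library alone; aliasDemo is a made-up name.

```go
package main

import "fmt"

// aliasDemo mimics a zero-copy read with plain slicing and contrasts it with
// an explicit copy of the same bytes.
func aliasDemo() (view, copied string) {
	buf := []byte{0x00, 0x06, 'A', 'n', 'd', 'r', 'e', 'w'}

	name := buf[2:8]                      // a view into buf, like a zero-copy read
	saved := append([]byte(nil), name...) // an independent copy of the same bytes

	buf[2] = 'a' // simulate the buffer being reused for something else

	return string(name), string(saved)
}

func main() {
	view, copied := aliasDemo()
	fmt.Println(view, copied) // prints "andrew Andrew"
}
```

So if you plan to keep a field around after the read buffer is recycled, copy it first.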
Parsing the TLS Client Hello
Let's write a function that takes the bytes of a TLS Client Hello handshake message as input, and returns a struct with info about the TLS handshake:
func UnmarshalClientHello(handshakeBytes []byte) *ClientHelloInfo
We start by constructing a cryptobyte.String from handshakeBytes:
handshakeMessage := cryptobyte.String(handshakeBytes)
For guidance, we turn to Section 4 of RFC 8446, which describes TLS 1.3's handshake protocol.
Here's the definition of a handshake message:
struct {
    HandshakeType msg_type;    /* handshake type */
    uint24 length;             /* remaining bytes in message */
    select (Handshake.msg_type) {
        case client_hello:          ClientHello;
        case server_hello:          ServerHello;
        case end_of_early_data:     EndOfEarlyData;
        case encrypted_extensions:  EncryptedExtensions;
        case certificate_request:   CertificateRequest;
        case certificate:           Certificate;
        case certificate_verify:    CertificateVerify;
        case finished:              Finished;
        case new_session_ticket:    NewSessionTicket;
        case key_update:            KeyUpdate;
    };
} Handshake;
The first field in the message is a HandshakeType, which is an enum defined as:
enum {
    client_hello(1),
    server_hello(2),
    new_session_ticket(4),
    end_of_early_data(5),
    encrypted_extensions(8),
    certificate(11),
    certificate_request(13),
    certificate_verify(15),
    finished(20),
    key_update(24),
    message_hash(254),
    (255)
} HandshakeType;
According to the above definition, a Client Hello message has a value of 1. The last entry of the enum specifies the largest possible value of the enum. In TLS, enums are transmitted as a big-endian integer using the smallest number of bytes needed to represent the largest possible enum value. That's 255, so HandshakeType is transmitted as an 8-bit integer. Let's read this integer and verify that it's 1:
var messageType uint8
if !handshakeMessage.ReadUint8(&messageType) || messageType != 1 {
	return nil
}
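The "smallest number of bytes" rule for enums is easy to express in code. Here's a small sketch of it; enumWidth is a hypothetical helper, not something cryptobyte provides.

```go
package main

import "fmt"

// enumWidth returns how many bytes TLS uses on the wire for an enum whose
// largest declared value is maxValue.
func enumWidth(maxValue uint32) int {
	width := 1
	for maxValue > 0xff {
		maxValue >>= 8
		width++
	}
	return width
}

func main() {
	fmt.Println(enumWidth(255))   // HandshakeType: prints 1
	fmt.Println(enumWidth(65535)) // an enum with maximum 65535: prints 2
}
```

In practice you'd just pick ReadUint8 or ReadUint16 by looking at the last entry of the enum in the RFC, but the function shows where that choice comes from.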
The second field, length, is a 24-bit integer specifying the number of bytes remaining in the message. The third and last field depends on the type of handshake message. Since it's a Client Hello message, it has type ClientHello.
Let's read these two fields using ReadUint24LengthPrefixed and then make sure there are no bytes remaining in handshakeMessage:
var clientHello cryptobyte.String
if !handshakeMessage.ReadUint24LengthPrefixed(&clientHello) || !handshakeMessage.Empty() {
	return nil
}
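When testing a parser like this, it helps to be able to go the other way and frame a body as a handshake message. The sketch below does that with the standard library only; wrapHandshake is a hypothetical test helper. (cryptobyte also has a Builder type with length-prefix methods that could be used instead.)

```go
package main

import "fmt"

// wrapHandshake is a hypothetical test helper: it prepends the one-byte
// msg_type and the 24-bit big-endian length to body.
func wrapHandshake(msgType uint8, body []byte) ([]byte, error) {
	n := len(body)
	if n > 0xffffff {
		return nil, fmt.Errorf("body of %d bytes does not fit in a uint24 length", n)
	}
	out := []byte{msgType, byte(n >> 16), byte(n >> 8), byte(n)}
	return append(out, body...), nil
}

func main() {
	msg, err := wrapHandshake(1, []byte{0xaa, 0xbb})
	fmt.Printf("% x %v\n", msg, err) // prints "01 00 00 02 aa bb <nil>"
}
```

Feeding the result into the two reads above should yield a clientHello pointing at exactly the body bytes, with nothing left over in handshakeMessage.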
clientHello now points to the bytes of the ClientHello structure, which is defined in Section 4.1.2 as follows:
struct {
    ProtocolVersion legacy_version;
    Random random;
    opaque legacy_session_id<0..32>;
    CipherSuite cipher_suites<2..2^16-2>;
    opaque legacy_compression_methods<1..2^8-1>;
    Extension extensions<8..2^16-1>;
} ClientHello;
The first field is legacy_version, whose type is defined as a 16-bit integer:
uint16 ProtocolVersion;
To read it, we do:
var legacyVersion uint16
if !clientHello.ReadUint16(&legacyVersion) {
	return nil
}
Next, random, whose type is defined as:
opaque Random[32];
That means it's an opaque sequence of exactly 32 bytes. To read it, we do:
var random []byte
if !clientHello.ReadBytes(&random, 32) {
	return nil
}
Next, legacy_session_id. Like random, it is an opaque sequence of bytes, but this time the RFC specifies the length as a range, <0..32>. This syntax means it's a variable-length sequence that's between 0 and 32 bytes long, inclusive. In TLS, the length is transmitted just before the byte sequence as a big-endian integer using the smallest number of bytes necessary to represent the largest possible length. In this case, that's one byte, so we can read legacy_session_id using ReadUint8LengthPrefixed:
var legacySessionID []byte
if !clientHello.ReadUint8LengthPrefixed((*cryptobyte.String)(&legacySessionID)) {
	return nil
}
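One thing a one-byte length prefix can't enforce is the declared ceiling: the length byte can say anything up to 255, while the RFC caps legacy_session_id at 32. A stricter parser might check the bounds after reading. The sketch below shows the idea with a hand-rolled, standard-library-only stand-in for the read; readVector8 and its parameters are hypothetical names, not part of cryptobyte.

```go
package main

import "fmt"

// readVector8 reads a vector with a one-byte length prefix from b and rejects
// it if the length falls outside [minLen, maxLen] or the input is truncated.
// It returns the vector body and the remaining bytes.
func readVector8(b []byte, minLen, maxLen int) (body, rest []byte, ok bool) {
	if len(b) < 1 {
		return nil, nil, false
	}
	n := int(b[0])
	if n < minLen || n > maxLen || len(b)-1 < n {
		return nil, nil, false
	}
	return b[1 : 1+n], b[1+n:], true
}

func main() {
	in := []byte{0x02, 0xde, 0xad, 0x13, 0x01}
	id, rest, ok := readVector8(in, 0, 32)
	fmt.Printf("% x | % x | %v\n", id, rest, ok) // prints "de ad | 13 01 | true"
}
```

With the cryptobyte-based code above, the equivalent check would simply be a test of len(legacySessionID) against 32 after the read succeeds.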
Now we're on to cipher_suites, which is where things start to get interesting. As with legacy_session_id, it's a variable-length sequence, but rather than being a sequence of bytes, it's a sequence of CipherSuite values, each of which is defined as a pair of 8-bit integers:
uint8 CipherSuite[2];
In TLS, the length of the sequence is specified in bytes, rather than number of items. For cipher_suites, the largest possible length is just shy of 2^16, which means a 16-bit integer is used, so we'll use ReadUint16LengthPrefixed to read the cipher_suites field:
var ciphersuitesBytes cryptobyte.String
if !clientHello.ReadUint16LengthPrefixed(&ciphersuitesBytes) {
	return nil
}
Now we can iterate to read each item. Since a CipherSuite is two bytes, each one can be read as a single big-endian 16-bit integer:
for !ciphersuitesBytes.Empty() {
	var ciphersuite uint16
	if !ciphersuitesBytes.ReadUint16(&ciphersuite) {
		return nil
	}
	// do something with ciphersuite, like append to a slice
}
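As one possible "do something", the 16-bit values can be turned into readable names with tls.CipherSuiteName from the standard library. The sketch below works on the body of cipher_suites (the bytes after the length prefix) and uses encoding/binary rather than cryptobyte so that it has no dependencies; cipherSuiteNames is a hypothetical helper.

```go
package main

import (
	"crypto/tls"
	"encoding/binary"
	"fmt"
)

// cipherSuiteNames decodes the body of cipher_suites into human-readable
// names. It reports false if the body is not a whole number of 2-byte items.
func cipherSuiteNames(body []byte) ([]string, bool) {
	if len(body)%2 != 0 {
		return nil, false
	}
	names := make([]string, 0, len(body)/2)
	for ; len(body) > 0; body = body[2:] {
		id := binary.BigEndian.Uint16(body)
		// Suites that Go doesn't know by name come back as a hex string.
		names = append(names, tls.CipherSuiteName(id))
	}
	return names, true
}

func main() {
	names, ok := cipherSuiteNames([]byte{0x13, 0x01, 0xc0, 0x2f})
	fmt.Println(names, ok)
}
```

The explicit odd-length check plays the same role as the failing ReadUint16 in the loop above: a dangling single byte at the end means the message is malformed.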
Next, legacy_compression_methods, which is similar to legacy_session_id:
var legacyCompressionMethods []uint8
if !clientHello.ReadUint8LengthPrefixed((*cryptobyte.String)(&legacyCompressionMethods)) {
	return nil
}
Finally, we reach the extensions field, which is another variable-length sequence, this time containing the Extension struct, defined as:
struct {
    ExtensionType extension_type;
    opaque extension_data<0..2^16-1>;
} Extension;
ExtensionType is an enum with maximum value 65535 (i.e. a 16-bit integer). As with cipher_suites, we read all the bytes in the field into a cryptobyte.String:
var extensionsBytes cryptobyte.String
if !clientHello.ReadUint16LengthPrefixed(&extensionsBytes) {
	return nil
}
Since this is the last field, we want to make sure clientHello is now empty:
if !clientHello.Empty() {
	return nil
}
Now we can iterate to read each Extension item:
for !extensionsBytes.Empty() {
	var extType uint16
	if !extensionsBytes.ReadUint16(&extType) {
		return nil
	}
	var extData cryptobyte.String
	if !extensionsBytes.ReadUint16LengthPrefixed(&extData) {
		return nil
	}
	// Parse extData according to extType
}
And that's it! You can see working code, including parsing of several common extensions, in my tlshacks package.
tlshello.agwa.name
To test this out, I wrote an HTTP server that returns a JSON representation of the Client Hello. This is rather handy for checking what ciphers and extensions a client supports. You can check out what your client's Client Hello looks like at https://tlshello.agwa.name.
Making the Client Hello message available to an HTTP handler required some gymnastics, including writing a net.Conn wrapper struct that peeks at the first TLS handshake message and saves it in the struct, and then a ConnContext callback that grabs the saved message out of the wrapper struct and makes it available in the request's context. You can read the code if you're curious.
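The general shape of those gymnastics can be sketched as follows. This is a simplified illustration rather than the code behind tlshello.agwa.name: the names recordingConn, helloKey, connContext, and demo are made up, the wrapper assumes the whole Client Hello arrives in a single TLS record (it can legally span several), and a fake 9-byte record over an in-memory pipe stands in for a real client.

```go
package main

import (
	"context"
	"crypto/tls"
	"encoding/binary"
	"errors"
	"fmt"
	"net"
)

// recordingConn is a hypothetical net.Conn wrapper that keeps a copy of the
// bytes read through it until the first TLS record is complete.
type recordingConn struct {
	net.Conn
	first []byte // bytes seen so far; trimmed to the first record once done
	done  bool
}

func (c *recordingConn) Read(p []byte) (int, error) {
	n, err := c.Conn.Read(p)
	if !c.done {
		c.first = append(c.first, p[:n]...)
		// A record is a 5-byte header (type, version, length) plus a fragment.
		if len(c.first) >= 5 {
			end := 5 + int(binary.BigEndian.Uint16(c.first[3:5]))
			if len(c.first) >= end {
				c.first, c.done = c.first[:end], true
			}
		}
	}
	return n, err
}

// handshakeBytes strips the record header from the saved record.
func (c *recordingConn) handshakeBytes() ([]byte, error) {
	if !c.done || c.first[0] != 22 { // 22 = handshake content type
		return nil, errors.New("no complete handshake record seen")
	}
	return c.first[5:], nil
}

type helloKey struct{}

// connContext has the signature expected by http.Server's ConnContext field.
// It stores the wrapper itself, since the callback may run before any bytes
// have been read from the connection.
func connContext(ctx context.Context, c net.Conn) context.Context {
	if tc, ok := c.(*tls.Conn); ok {
		c = tc.NetConn() // unwrap to the conn that the TLS layer reads from
	}
	if rc, ok := c.(*recordingConn); ok {
		ctx = context.WithValue(ctx, helloKey{}, rc)
	}
	return ctx
}

// demo pushes a fake handshake record through the wrapper and then fetches
// the saved bytes back out via the context, as a handler would.
func demo() ([]byte, error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()
	go client.Write([]byte{22, 3, 1, 0, 4, 1, 0, 0, 0})

	rc := &recordingConn{Conn: server}
	ctx := connContext(context.Background(), rc)
	if _, err := rc.Read(make([]byte, 64)); err != nil {
		return nil, err
	}
	return ctx.Value(helloKey{}).(*recordingConn).handshakeBytes()
}

func main() {
	fmt.Println(demo())
}
```

In a real server, the wrapper would be applied by a net.Listener that sits underneath the TLS listener, and the handler would pass the saved bytes to a function like UnmarshalClientHello.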
I'm happy to say that deploying this HTTP server was super easy thanks to snid. This service cannot run behind an HTTP reverse proxy - it has to terminate the TLS connection itself. Without snid, I would have needed to use a dedicated IPv4 address.
Comments
Reader Chris on 2023-09-21 at 10:12:
Thank you so much for this blog post and the related code. It has been instrumental in me learning how to extract data from a specific extension that our clients inject in to the Client Hello message. I also see what you mean by gymnastics in order to place the TeeReader in at the right point to get the bytes available for storing.
My next task is to inject an extension from the client side in order to replicate my client's messages as part of a test bed.
Thanks once again
Andrew Ayer on 2023-09-21 at 15:23:
Thanks for your comment Chris! I'm very glad my post was useful.