Google Summer of Code 2023 (org: GnuTLS)

Hello, I'm Ajit Singh, registered as a Pokémon and also a computer science student. I was fortunate enough to participate in Google Summer of Code (GSoC) 2023. I worked with GnuTLS on improving privacy by adding an implementation of Encrypted Client Hello (ECH).

Project Overview

Most modern applications use Transport Layer Security (TLS) to establish a secure connection between a client and a server. TLSv1.3 encrypts most of the handshake, including the server certificate. However, there is still room for improvement: the ClientHello message, which contains sensitive metadata, is still sent in plaintext.

Encrypted Client Hello (ECH) is a draft extension for TLSv1.3 that enables clients to encrypt ClientHello messages in the TLS handshake. This prevents the sensitive metadata, such as the Server Name Indication (SNI) and Application Layer Protocol Negotiation (ALPN), from being leaked in plaintext.

The earlier iteration of this extension was called ESNI (Encrypted Server Name Indication). As the name suggests, it was proposed to encrypt only the SNI field. However, it was later renamed to ECH because it now encrypts the entire ClientHello message.

GnuTLS is a free and open-source implementation of the TLS and Secure Sockets Layer (SSL) protocols. It is used by a variety of applications, including web browsers, email clients, and file transfer programs. This project aims to add support for the Encrypted Client Hello (ECH) extension to the GnuTLS library.

Implementation

GnuTLS relies on the nettle library, a collection of cryptographic algorithms and functions. ECH relies on the Hybrid Public Key Encryption (HPKE) scheme, a relatively new cryptographic scheme that combines the benefits of symmetric and asymmetric encryption. HPKE has been implemented for nettle, but that work is not yet merged into the master branch.

The community bonding period allowed me to get a good understanding of the project and its requirements. I then started the project by porting the HPKE code from the npocs:nettle repository to GnuTLS. This involved figuring out and resolving header file dependencies and refactoring the code until it worked with GnuTLS, which was quite a lot of work. This became my first merge request for GSoC.

The next challenge was to actually start implementing ECH. The ECH draft is pretty big, so I needed to plan my implementation carefully. I decided to approach it incrementally, starting with the simplest parts of the protocol and working my way up to the more complex parts.

I implemented the pieces in the order of the numbered sections below, [1] to [9].

This order allowed me to take on the challenges of ECH from the easiest to the hardest. By starting with the simplest parts of the draft, I was able to get a good understanding of how ECH works and to build up my confidence before I tackled the more complex parts.

Add new extensions [1]

Before starting, it was helpful to have entries for two new extensions: encrypted_client_hello (0xfe0d) and ech_outer_extensions (0xfd00). Registering these two extensions within GnuTLS allowed me to get familiar with the library.

Deserialize ECH configs [2]

This project was tested for interoperability with Cloudflare and others. To detect whether ECH is working, I used the [ech-check] website, which shows whether ECH was detected and displays the values of the inner and outer SNI. At this point I had serialized ECH configs, fetched from the DNS record using dig defo.ie +short TYPE65. They first needed to be decoded from base64; deserializing them then simply meant following the ECHConfig structure given in the ECH draft.

In the notation opaque HpkePublicKey<1..2^16-1>, the angle brackets mean that HpkePublicKey can be anywhere from 1 byte to 2^16-1 bytes long. A fixed-length field is read by simply reading that many bytes. For a variable-length field, the parser first reads the length of the field, then reads that many bytes, and finally stores the result as structured data. Initially, even this much was quite complicated for me to understand.
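As a rough illustration of the fixed-length versus length-prefixed reading described above, the following Python sketch parses the example config.pem body shown in the Result section. It is not the GnuTLS code; the Reader class and parse_ech_config_list are names made up for this example, and the field layout is assumed to follow the ECHConfig structure of the ECH draft (version 0xfe0d).

```python
import base64

# Base64 body of the example config.pem shown in the Result section.
PEM_BODY = (
    "AEb+DQBCGwAgACDSupslkfIkg/C0be/yDdZqtUJs4ssKG5IgWHadWXn4KQAEAAEA"
    "ASUTY2xvdWRmbGFyZS1lc25pLmNvbQAA"
)


class Reader:
    """Hypothetical helper: a cursor over bytes in TLS presentation syntax."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def fixed(self, n):
        # Fixed-length field: read exactly n bytes.
        if self.pos + n > len(self.data):
            raise ValueError("truncated input")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk

    def uint(self, n):
        return int.from_bytes(self.fixed(n), "big")

    def vector(self, len_bytes):
        # Variable-length field: read the length prefix, then that many bytes.
        return self.fixed(self.uint(len_bytes))

    def done(self):
        return self.pos == len(self.data)


def parse_ech_config_list(raw):
    configs = []
    entries = Reader(Reader(raw).vector(2))       # the whole ECHConfigList
    while not entries.done():
        version = entries.uint(2)
        body = Reader(entries.vector(2))          # ECHConfig length + contents
        if version != 0xFE0D:
            continue                              # skip unknown versions
        cfg = {"version": version}
        cfg["config_id"] = body.uint(1)
        cfg["kem_id"] = body.uint(2)
        cfg["public_key"] = body.vector(2)        # opaque HpkePublicKey<1..2^16-1>
        suites = Reader(body.vector(2))           # list of (kdf_id, aead_id) pairs
        cfg["cipher_suites"] = []
        while not suites.done():
            cfg["cipher_suites"].append((suites.uint(2), suites.uint(2)))
        cfg["maximum_name_length"] = body.uint(1)
        cfg["public_name"] = body.vector(1)       # opaque public_name<1..255>
        cfg["extensions"] = body.vector(2)
        configs.append(cfg)
    return configs


configs = parse_ech_config_list(base64.b64decode(PEM_BODY))
print(configs[0]["public_name"].decode(), hex(configs[0]["kem_id"]))
```

Every variable-length field goes through the same vector() call; only the width of the length prefix changes (1 byte for public_name, 2 bytes for the others).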

Grease ECH [3]

The only challenge in implementing GREASE ECH was understanding how to use the nettle HPKE interface to generate the encapsulated HPKE key and then serialize the structure. This is where help from my mentors came in: generating the encapsulated key simply involved calling the hpke_encap function. This section also prepared me for the later use of the HPKE functionality.
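To make the serialized structure concrete, here is a minimal Python sketch of a GREASE encrypted_client_hello extension, assuming the ECHClientHello layout from the ECH draft (type, cipher suite, config_id, enc, payload). It is an illustration only: the function names are hypothetical, the payload length is arbitrary, and random bytes stand in for the encapsulated key that the project obtained through hpke_encap.

```python
import os


def vec(data, len_bytes):
    """Serialize a variable-length vector: length prefix, then the bytes."""
    return len(data).to_bytes(len_bytes, "big") + data


def grease_ech_extension(payload_len=144):
    ech_type = b"\x00"                        # ECHClientHelloType outer(0)
    kdf_id = (0x0001).to_bytes(2, "big")      # HKDF-SHA256 (example choice)
    aead_id = (0x0001).to_bytes(2, "big")     # AES-128-GCM (example choice)
    config_id = os.urandom(1)                 # random, matches no real config
    # Assumption: 32 random bytes stand in for an X25519 encapsulated key.
    enc = os.urandom(32)
    payload = os.urandom(payload_len)         # random bytes instead of ciphertext
    body = ech_type + kdf_id + aead_id + config_id + vec(enc, 2) + vec(payload, 2)
    # Generic TLS extension framing: 2-byte type (0xfe0d), 2-byte length, body.
    return (0xFE0D).to_bytes(2, "big") + vec(body, 2)


ext = grease_ech_extension()
print(len(ext), ext[:4].hex())
```

A server that does not support ECH ignores the extension, and one that does fails to decrypt it, so the handshake continues as if ECH had not been offered.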

APIs Addition for GnuTLS [4]

Next, some APIs are required: one to deserialize the ECHConfigs, select a valid ECHConfig and store it in the session, and one to enable or disable the ECH feature. gnutls-cli, which is used to test the project, uses these APIs to load the ECH configs. This section also added command-line options to gnutls-cli for passing the name of a PEM-encoded ECHConfigs file as an argument. gnutls-cli reads this file and decodes it to raw data (the serialized ECHConfigs).

Offering Real ECH [5]

Next came support for real ECH. This was the most important and most challenging part of the project; a single oversight here could turn into an issue that took days to resolve. The core of the problem was how to create copies of the ClientHello. These are not plain copies: each one carries its own modifications, and the modifications cannot all be applied at the same point. A copy is required for ClientHelloInner and another for ClientHelloOuter, and each of those has a further modified form: EncodedClientHelloInner is derived from ClientHelloInner, and ClientHelloOuterAAD is derived from ClientHelloOuter.

So what is the big challenge with copying and modifying? Initially, just coming up with a solution was hard. One reason is the order in which the copies need to be generated: ClientHelloOuter has EncodedClientHelloInner as its payload, and ClientHelloOuterAAD is required while encrypting that payload. Ignoring this order would mean storing every copy in the session and fetching it back whenever it is needed. This can be avoided by keeping an original (or base) ClientHello, which is modified whenever a new copy needs to be generated. The other reason is that GnuTLS, being implemented in C, has no structured representation in which to keep this original ClientHello; its implementation simply serializes the ClientHello and sends it to the server. Therefore a complete original copy of the ClientHello must exist even before ClientHelloOuter is generated.

One possible solution was to parse the serialized ClientHello and modify it directly. This is easier if the offset of each extension is stored in an array, so that a modification can jump directly to that extension.

This approach has to generate and save a baseClientHello, and then use functions such as replaceExt(tls_id, data, offset) and addExt(tls_id, data, offset) to generate the modified copies. Its only disadvantage is that it requires multiple insertions and deletions.

The other approach, which is the one I used, is to modify the gen_hello_extensions() function so that it generates the extension data with the modifications needed for each ClientHello copy. The modified function first stores all extension data in the session, so that extensions which do not need to change can simply be copied from there instead of being generated again. It is then called once per copy, with a flag indicating which copy the extensions are for. The remaining fields of the TLS ClientHello are simply rewritten in the new copy, and the extension data is appended at the end.
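The idea of generating each copy from cached extension data plus a flag can be sketched in a few lines of Python. This is only a model of the approach, not the modified gen_hello_extensions(); the Copy enum, the gen_extensions function and the example extension values are all hypothetical.

```python
from enum import Enum


class Copy(Enum):
    INNER = "inner"
    OUTER = "outer"


def gen_extensions(cached, overrides, which):
    """Build the extension list for one ClientHello copy.

    cached:    list of (ext_type, data) generated once and stored
    overrides: {Copy: {ext_type: data}} holding values that differ per copy
    which:     flag selecting the copy being generated
    """
    per_copy = overrides.get(which, {})
    # Reuse the cached data unless this copy needs its own value.
    return [(t, per_copy.get(t, data)) for t, data in cached]


# Extension types: 0 = server_name, 51 = key_share, 0xfe0d = encrypted_client_hello.
cached = [(0, b"secret.example"), (51, b"keyshare"), (0xFE0D, b"")]
overrides = {
    Copy.INNER: {0xFE0D: b"\x01"},                          # type inner(1)
    Copy.OUTER: {0: b"public.example", 0xFE0D: b"<ech>"},   # cover name + ECH body
}
inner = gen_extensions(cached, overrides, Copy.INNER)
outer = gen_extensions(cached, overrides, Copy.OUTER)
print(inner[0], outer[0])
```

Extensions such as key_share come straight from the cache in both copies, which is what avoids regenerating them.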

Encoding ClientHelloInner [6]

At this point in the implementation, it is reasonable to skip compression of the ClientHelloInner and proceed directly to encoding it. Encoding is done by generating a new, modified copy of the ClientHello as specified in section 5.1 of the draft. It uses the modified gen_hello_extensions to generate the extensions, and adjusts the other required TLS fields while generating this serialized copy.

Encrypting ECH [7]

Once I had decided on an approach for generating copies of the ClientHello, I started with ClientHelloOuter, which carries the ECH extension. Generating the data for this extension requires EncodedClientHelloInner, which serves as its payload field. This section involved generating that payload for ClientHelloOuter and encrypting it through the HPKE interface, using the public key provided in the ECHConfig. Encrypting the payload also requires ClientHelloOuterAAD, which serves as the additional authenticated data and so authenticates the ClientHelloOuter.
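The dependency between the payload and ClientHelloOuterAAD can be shown as a two-pass construction. The Python sketch below assumes, following the ECH draft, that ClientHelloOuterAAD is the ClientHelloOuter with the payload replaced by zero bytes of the same length. All names are hypothetical, and fake_seal is a stand-in that only appends a MAC; it does not encrypt and is not HPKE.

```python
import hashlib
import hmac

TAG_LEN = 16  # tag length of the AEADs commonly used with HPKE


def build_ech_body(config_id, enc, payload):
    """ECHClientHello (outer): type, cipher suite, config_id, enc, payload."""
    return (
        b"\x00"                     # type = outer(0)
        + b"\x00\x01\x00\x01"       # kdf_id, aead_id (example values)
        + bytes([config_id])
        + len(enc).to_bytes(2, "big") + enc
        + len(payload).to_bytes(2, "big") + payload
    )


def encrypt_outer(build_outer, encoded_inner, seal):
    # Pass 1: build ClientHelloOuterAAD with a zeroed payload whose length
    # equals the final ciphertext length.
    placeholder = bytes(len(encoded_inner) + TAG_LEN)
    aad = build_outer(placeholder)
    # Pass 2: seal the inner hello under that AAD, then rebuild the outer
    # hello around the real ciphertext.
    ciphertext = seal(aad, encoded_inner)
    if len(ciphertext) != len(placeholder):
        raise ValueError("ciphertext length does not match placeholder")
    return build_outer(ciphertext), aad


def fake_seal(aad, plaintext):
    # NOT encryption: plaintext plus a truncated HMAC over aad and plaintext,
    # used here only so the sketch runs without an HPKE library.
    tag = hmac.new(b"demo-key", aad + plaintext, hashlib.sha256).digest()
    return plaintext + tag[:TAG_LEN]


build = lambda payload: b"HDR" + build_ech_body(0x1B, bytes(32), payload)
outer, aad = encrypt_outer(build, b"encoded-inner", fake_seal)
print(len(outer) == len(aad))
```

Because the placeholder has the same length as the ciphertext, every length field in the AAD already has its final value, so the outer hello can be rebuilt without touching anything but the payload bytes.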

Determining ECH Acceptance [8]

This section wasn't that challenging in itself, but it led me back to the TLS RFC and then to the implementations of HKDF-Extract and HKDF-Expand-Label within GnuTLS, as well as the function that generates the transcript hash.

This brought a new challenge: what is the transcript hash? It is computed by hashing the concatenation of each included handshake message. The tricky part is that if ECH is accepted, ClientHelloInner has to be treated as the original ClientHello in the transcript hash; otherwise ClientHelloOuter is. As a result, ClientHelloInner is stored in the session as handshake_hash_buffer_inner, so that if ECH is accepted the handshake can proceed by replacing the original handshake_hash_buffer, and the other corresponding variables, with their inner counterparts.

To determine acceptance, the hash has to be computed as described in section 7.2 of the ECH draft. The transcript_ech_conf input also includes the ServerHello, with the last 8 bytes of the server random set to zero. Finally, the client compares the last 8 bytes of the server random with the computed value to determine whether the server accepted ECH.
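A minimal Python sketch of this check follows. It assumes SHA-256 as the handshake hash, no HelloRetryRequest, and the "ech accept confirmation" label from the ECH draft; the function names are made up for this example and the GnuTLS implementation uses its own HKDF helpers instead. Both messages are taken as full handshake messages including the 4-byte header.

```python
import hashlib
import hmac

# Offset just past ServerHello.random: handshake header (4) +
# legacy_version (2) + random (32).
RANDOM_END = 4 + 2 + 32


def hkdf_extract(salt, ikm):
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand_label(secret, label, context, length):
    # HkdfLabel as defined in RFC 8446, section 7.1.
    full_label = b"tls13 " + label
    info = (
        length.to_bytes(2, "big")
        + bytes([len(full_label)]) + full_label
        + bytes([len(context)]) + context
    )
    # One HKDF-Expand block suffices while length <= 32 with SHA-256.
    return hmac.new(secret, info + b"\x01", hashlib.sha256).digest()[:length]


def accept_confirmation(inner_random, client_hello_inner, server_hello):
    # Zero the last 8 bytes of ServerHello.random before hashing.
    zeroed = server_hello[:RANDOM_END - 8] + bytes(8) + server_hello[RANDOM_END:]
    transcript = hashlib.sha256(client_hello_inner + zeroed).digest()
    secret = hkdf_extract(bytes(32), inner_random)
    return hkdf_expand_label(secret, b"ech accept confirmation", transcript, 8)


def ech_accepted(inner_random, client_hello_inner, server_hello):
    expected = accept_confirmation(inner_random, client_hello_inner, server_hello)
    received = server_hello[RANDOM_END - 8:RANDOM_END]
    return hmac.compare_digest(expected, received)
```

If ech_accepted returns true, the handshake continues with the ClientHelloInner transcript; otherwise it continues with ClientHelloOuter.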

While testing interoperability with the hosted ECH-enabled test server (defo.ie), it was hard to debug issues, so I set up a local ECH test server running Cloudflare's ECH-enabled TLS library.

Compressing the clientHelloInner [9]

Repeating large extensions, such as "key_share" with post-quantum algorithms, between ClientHelloInner and ClientHelloOuter can lead to excessive size. To reduce the size impact, the client MAY substitute extensions which it knows will be duplicated in ClientHelloOuter. It does so by removing and replacing extensions from EncodedClientHelloInner with a single "ech_outer_extensions" extension. [section5.1]

The ech_outer_extensions extension carries this data: a list of references to the extensions that the server needs to restore when decompressing. To implement compression, the first extension in the list is replaced with ech_outer_extensions and the remaining extensions in the list are skipped, which keeps them in order. The current implementation compresses only the key_share extension by default.
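The replace-first, skip-the-rest rule can be sketched as follows in Python. This is an illustration under the assumption that the ech_outer_extensions body is a 1-byte length followed by the 2-byte types of the referenced extensions, as in the ECH draft; compress_extensions is a hypothetical name, not a GnuTLS function.

```python
ECH_OUTER_EXTENSIONS = 0xFD00


def compress_extensions(inner_exts, outer_types):
    """Replace extensions shared with ClientHelloOuter by one reference list.

    inner_exts:  list of (ext_type, data) from ClientHelloInner, in order
    outer_types: set of extension types to take from ClientHelloOuter
    """
    referenced = [t for t, _ in inner_exts if t in outer_types]
    if not referenced:
        return list(inner_exts)
    # Body: 1-byte length prefix, then the 2-byte type of each reference.
    types = b"".join(t.to_bytes(2, "big") for t in referenced)
    body = bytes([len(types)]) + types
    out, placed = [], False
    for ext_type, data in inner_exts:
        if ext_type not in outer_types:
            out.append((ext_type, data))
        elif not placed:
            # First referenced extension: substitute ech_outer_extensions.
            out.append((ECH_OUTER_EXTENSIONS, body))
            placed = True
        # Later referenced extensions are skipped entirely.
    return out


# Extension types: 0 = server_name, 51 = key_share, 43 = supported_versions.
inner = [(0, b"sni"), (51, b"large-key-share"), (43, b"versions")]
print(compress_extensions(inner, {51}))
```

With only key_share referenced, as in the current default, the large key share is replaced by a 3-byte body naming extension type 51.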

Result

Testing ECH interoperability with the OpenSSL ECH test server (https://defo.ie)

Using gnutls-cli directly against the test server defo.ie to check the ECH status:

./src/gnutls-cli defo.ie --ech-configs-file config.pem

Example config.pem

  
    -----BEGIN ECH CONFIGS-----
    AEb+DQBCGwAgACDSupslkfIkg/C0be/yDdZqtUJs4ssKG5IgWHadWXn4KQAEAAEA
    ASUTY2xvdWRmbGFyZS1lc25pLmNvbQAA
    -----END ECH CONFIGS-----
  

NOTE: running gnutls-cli with ECH enabled prints the retry ECH configs in the debug log (use -d2 with gnutls-cli).

Use dig to fetch the ECH configs manually:

dig defo.ie +short TYPE65

Given valid ECH configs, gnutls-cli proceeds to connect to the server using ECH. The client can then send an HTTP request to the ECH test server.

Example HTTP Request:

  
  GET /ech-check.php HTTP/1.1
  Host: defo.ie
  

The following is part of the HTML response from the server:


  <center>
    <Table width=600 border=0><P><h1>defo.ie</h1>

    <p>This is the defo.ie ECH check page that tells you if ECH was used.</p>

    <p> PHP sez it's Tuesday 22nd of August 2023 04:06:42 AM(UTC)</p>

    <p>SSL_ECH_OUTER_SNI: cover.defo.ie <br />
    SSL_ECH_INNER_SNI: defo.ie <br />
    SSL_ECH_STATUS: success <img src="greentick-small.png" alt="good" /> <br/>
    </p>
    </TD>
    </TR>
  <center>
  

Testing interoperability with the Chrome ECH test server (https://tls-ech.dev/)

The text within the <h2> tag displays the ECH status.


    ./src/gnutls-cli tls-ech.dev --ech-configs-file configs.pem

    ...

    GET / HTTP/1.1
    Host: tls-ech.dev

    ...

    <body>
    <center>
    <h1>tls-ech.dev</h1>
    <h2>You are using ECH. :)</h2>
    <a href="/ech.dns">Active ECH Config</a><br/>
    <br/>
    <hr>
    <br/>

    <a href="https://tls-ech.dev">Normal Server</a><br/>
    <a href="https://stale.tls-ech.dev">Stale Server Config</a><br/>
    <a href="https://wrong.tls-ech.dev">Wrong Public Name</a><br/>
    <a href="https://tls12.tls-ech.dev">TLS 1.2</a><br/>
    </center>
    </body>
  

To disable ECH in gnutls-cli:

./src/gnutls-cli defo.ie --disable-ech

NOTE: As the final result, src/gnutls-cli is able to interoperate with different ECH implementations.

Contributions:

Future Work

This implementation only provides ECH support for the client and still lacks a few features, such as HelloRetryRequest (HRR) handling and PSK support. Future tasks are to implement these remaining features and to add server-side support for ECH.

Appreciation for mentors

I would like to take this opportunity to thank my mentors Daiki Ueno, Sahana Prasad, Zoltan and Norbert Pocs for their guidance, support, and willingness to answer my questions throughout this project.

It has been an amazing experience to work on a real-world project with mentors who are experts in their field. I have learned so much and I am excited to continue my research in this area.