I was doing a mental exercise and it went someplace that I think is not terribly useful but still interesting enough to share.

Say you want to give somebody a copy of a web page that you’ve previously retrieved, and you want to convince them that what you’re giving them really did come from the original web server at some point.

This could be useful, for example, with the Internet Archive. If I want to contribute my own crawl results, how can the Archive verify that I didn’t forge the pages I’m submitting?

Note that the recipient can’t just fetch the same page themselves and compare it to what you gave them. It may have changed since the time when you retrieved it, or they may not have access to that particular document.

What I really want is for the origin server to have digitally signed the document. There’s an Internet Draft for Signed HTTP Exchanges which does exactly that, but for the purposes of this exercise I want to be able to do it for any web server as it is deployed today, rather than requiring support for a new protocol.

So how else can I get an existing web server to sign its response for me?

Abusing TLS

If I’m using HTTPS then the server is already sending me a digital signature that’s publicly verifiable against their TLS certificate, in the CertificateVerify message during the TLS handshake. Can I take advantage of that?

Well, not exactly. That signature only covers the preceding parts of the handshake. TLS uses an assortment of new symmetric keys for everything else, all derived from a shared secret produced by an ephemeral Diffie-Hellman key exchange.
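
To make that concrete, here’s a rough sketch of the bytes a TLS 1.3 server signs in CertificateVerify, following the layout in RFC 8446 section 4.4.3. The function name and the placeholder handshake messages are mine, and I’m assuming SHA-256 as the transcript hash (the real one depends on the negotiated cipher suite).

```python
import hashlib


def certificate_verify_input(handshake_messages: list[bytes]) -> bytes:
    """Build the byte string a TLS 1.3 server signs in CertificateVerify."""
    # Transcript hash over the handshake messages so far (ClientHello up to
    # and including Certificate). SHA-256 is an assumption for this sketch.
    transcript_hash = hashlib.sha256(b"".join(handshake_messages)).digest()
    return (
        b"\x20" * 64                              # 64 bytes of 0x20 padding
        + b"TLS 1.3, server CertificateVerify"    # context string
        + b"\x00"                                 # separator
        + transcript_hash
    )


# Placeholder bytes standing in for the real handshake messages.
handshake = [b"client hello", b"server hello", b"encrypted extensions", b"certificate"]
signed = certificate_verify_input(handshake)
print(len(signed))
```

Nothing in that input depends on the HTTP request or response, which is exactly the problem: the signature vouches for the handshake and nothing after it.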

The TLS handshake does guarantee that as long as neither party has leaked any of the ephemeral keys, then the client can trust that the server it’s talking to also possesses the private key corresponding to the server’s public certificate.

Because the cryptography in use is based on symmetric keys, by definition both the client and server share all the same keys after the handshake. That means that either end could forge messages pretending to be from the other. But that’s okay, because if I receive a message from you that claims to be from me, I’m not fooled. The worst either end could do is willfully fool themselves, and there are easier ways to do that.
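
Here’s a toy illustration of that point. It uses HMAC-SHA256 as a stand-in for the AEAD ciphers TLS 1.3 really uses, and the key and messages are made up, but the shape of the problem is the same: whoever holds the shared key can produce a tag that verifies.

```python
import hashlib
import hmac

# Placeholder for a key that both ends hold after the handshake.
shared_key = b"derived by both sides during the handshake"


def tag(key: bytes, message: bytes) -> bytes:
    """Authenticate a message with a symmetric key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, candidate: bytes) -> bool:
    return hmac.compare_digest(tag(key, message), candidate)


# What the server actually sent.
server_message = b"HTTP/1.1 200 OK\r\n\r\nthe real page"
server_tag = tag(shared_key, server_message)

# What the client can make up afterwards, using the very same key.
forged_message = b"HTTP/1.1 200 OK\r\n\r\nwhatever I like"
forged_tag = tag(shared_key, forged_message)

print(verify(shared_key, server_message, server_tag))
print(verify(shared_key, forged_message, forged_tag))
```

Both checks pass, so a third party looking at the tags alone has no way to tell which message the server wrote.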

Those guarantees go out the window as soon as I try to convince a third party of the authenticity of these messages, though.

The only way I can think of to do that is to reveal the ephemeral keys, together with the full encrypted stream. At that point, the recipient can verify that the entire stream was produced by somebody who had those keys, and verify that the server’s offered certificate was valid at the time of the conversation. But they can’t verify that I didn’t forge part or all of the application data. In fact, they’re now in a position to do their own forgeries!
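
As a sketch of why revealing the secrets hands over everything: in TLS 1.3 the record protection key and IV are derived from a traffic secret with HKDF-Expand-Label (RFC 8446 section 7.1), so anyone holding the secret can recompute them. I’m assuming here that the thing being revealed is a traffic secret and that the cipher suite is TLS_AES_128_GCM_SHA256 (16-byte key, 12-byte IV); the secret value below is a placeholder.

```python
import hashlib
import hmac
import struct


def hkdf_expand(secret: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand from RFC 5869, using SHA-256."""
    output, block, counter = b"", b"", 1
    while len(output) < length:
        block = hmac.new(secret, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def hkdf_expand_label(secret: bytes, label: bytes, context: bytes, length: int) -> bytes:
    """HKDF-Expand-Label from RFC 8446: the label is prefixed with 'tls13 '."""
    full_label = b"tls13 " + label
    info = (
        struct.pack("!H", length)
        + bytes([len(full_label)]) + full_label
        + bytes([len(context)]) + context
    )
    return hkdf_expand(secret, info, length)


def traffic_keys(traffic_secret: bytes) -> tuple[bytes, bytes]:
    """Derive the record protection key and IV for AES-128-GCM."""
    key = hkdf_expand_label(traffic_secret, b"key", b"", 16)
    iv = hkdf_expand_label(traffic_secret, b"iv", b"", 12)
    return key, iv


# Placeholder secret; a real one comes out of the TLS key schedule.
revealed_secret = bytes(32)
my_keys = traffic_keys(revealed_secret)
recipient_keys = traffic_keys(revealed_secret)
print(my_keys == recipient_keys)
```

The recipient ends up with exactly the keys I had, which is what lets them check the stream and also what lets them forge one.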

Trusted timestamping proxy

I think there’s a simple fix for that problem, although it makes the solution much less useful than I hoped for.

Let’s introduce a simple proxy server. I’ll connect to the origin server through this proxy. The proxy will log the bytes that are sent back and forth through it, but it won’t understand them: it’ll only see the encrypted TLS traffic. At the end of the connection, it will give me a digital signature over a timestamp plus the complete connection log. I can check that the signature from the proxy matches what I saw from my end of the conversation, and discard the log if not.
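
Here’s a minimal sketch of the bookkeeping I have in mind. Everything in it is hypothetical: the ConnectionLog class, the statement layout (an 8-byte timestamp followed by one SHA-256 digest per direction), and the sample bytes. The signing step itself is left out, since Python’s standard library has no public-key signatures; the proxy would sign the statement bytes with whatever scheme it publishes a key for. Hashing each direction as a continuous stream means the proxy and I agree even if TCP hands us the bytes in different-sized chunks.

```python
import hashlib
import struct


class ConnectionLog:
    """Running digests of the opaque bytes seen in each direction."""

    def __init__(self) -> None:
        self._to_server = hashlib.sha256()
        self._to_client = hashlib.sha256()

    def client_sent(self, data: bytes) -> None:
        self._to_server.update(data)

    def server_sent(self, data: bytes) -> None:
        self._to_client.update(data)

    def statement(self, timestamp: int) -> bytes:
        """Bytes the proxy would sign: timestamp plus both stream digests."""
        return (
            struct.pack("!Q", timestamp)
            + self._to_server.digest()
            + self._to_client.digest()
        )


# The proxy's view: encrypted bytes arriving in arbitrary chunks.
proxy_log = ConnectionLog()
proxy_log.client_sent(b"\x16\x03\x01 client ")
proxy_log.client_sent(b"hello bytes")
proxy_log.server_sent(b"\x16\x03\x03 server hello and ciphertext")

# My view of the same connection, chunked differently.
client_log = ConnectionLog()
client_log.client_sent(b"\x16\x03\x01 client hello bytes")
client_log.server_sent(b"\x16\x03\x03 server hello ")
client_log.server_sent(b"and ciphertext")

closed_at = 1700000000  # example Unix timestamp chosen by the proxy
print(proxy_log.statement(closed_at) == client_log.statement(closed_at))
```

If the two statements differ, the proxy logged something other than what I saw, and I throw the log away.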

(Introducing this proxy doesn’t change the security story for TLS: it already had to be resilient against passive or active attackers sitting between the client and the server.)

Now I can give you the signed connection log, plus the ephemeral keys used during the connection.

If you trust that the proxy only produces signatures over traffic that actually passed through it, then you have enough information to verify that I’ve given you an authentic conversation with the origin server.

Unless…

Well, almost. Suppose I can convince the proxy to connect to a server under my control. That server can complete the handshake with the origin server, then share the ephemeral keys with me via another channel. From there the two of us can collaborate to conduct a completely forged conversation through the proxy, getting its signature on traffic that the origin server never saw.

Part of the process of verifying a TLS certificate involves checking that the DNS name in the certificate matches the DNS name the client was trying to connect to. We can’t make the proxy entirely trustworthy, but we can make it less untrustworthy if it does its best to ensure that the server it connects to corresponds to the requested DNS name.

At the least this means any DNSSEC signatures should be included in the proxy’s traffic log, so a recipient can be convinced that the DNS response came from the same origin.

I don’t know of any way to secure the IP connection, though.

Wilder options

Personally I feel like I’d be pretty willing to accept evidence from a trusted timestamping proxy like this, assuming the proxy were operated by a reasonably neutral party.

If that isn’t enough for you, I suspect there’s a way to use secure multi-party computation to have the proxy and the client both participate in the conversation, in such a way that neither is capable of leaking the ephemeral keys during the conversation.

But this is already more complicated than I hoped for, and doesn’t offer much advantage over just having people fetch pages from origin servers themselves. So I’m calling it a day.

Reading the TLS 1.3 specification carefully enough to work all this out was sure an interesting exercise though!