Deploying web applications is hard, and there’s a bewildering variety of ways to do it. Today I’d like to confuse the matter further by proposing yet another approach. As a preview, here are the components I’ll discuss:

  • TLS termination and when not to do it
  • TLS Server Name Indication to route entire connections
  • PROXY protocol to forward clients’ addresses without parsing the request stream
  • Unix sockets to communicate with backend applications
  • Unix permissions to isolate applications

Background

There is, of course, no one best way to make a web site available to the world, because people’s requirements vary a lot. For example, maybe your site is like this blog, which I write in Markdown and compile to static HTML using Jekyll, and any static hosting service will do. Things get more complicated if you want to do any server-side computation, or even just control your response status codes and headers.

Personally, I want to be able to put a new experiment online quickly at any time, and I want to be able to use any feature of the evolving web (such as WebSockets or HTTP/2 Server Push), with any kind of server-side framework, database, or other tooling. To that end, I currently have a dedicated server hosted by Hetzner, so that I can try anything that I’m capable of making work on bare metal.

“Platform as a Service” (PaaS) offerings like Heroku are popular for quick experiments that might turn into big projects. They impose constraints on your application that make it easier to scale, at least for them, and maybe also for you. For example, your application can’t store any persistent data on its local disk; it has to use an external service like a database or storage provider. This allows applications to crash without losing data, migrate easily across physical machines, and scale up by running duplicate copies of identical code on as many computers as necessary.

But I’m not concerned about making my web sites scale; I desperately hope I never create the next Google or Facebook. A single computer these days can serve an awful lot of people so long as you don’t write code that’s accidentally quadratic or otherwise unnecessarily resource-intensive.

I don’t care about scaling and I don’t want constraints, so Heroku et al. offer exactly the wrong trade-off for me. But there is one feature I love from providers like Heroku: As a developer, your deployment process is little more than recording what software you need installed and then running git push.

In order to get that simple workflow on my own server, I’ve been using Dokku for a couple of years. Built on top of the Herokuish compatibility layer, Dokku provides a fully open-source single-server PaaS that can generally run the same application images that Heroku can. Dokku doesn’t have the breadth of one-click third-party integrations that Heroku does, but on the other hand, it allows you to use arbitrary Docker settings, such as persistent volume mounts, which don’t fit Heroku’s scaling model. Anyway, I never wanted to pay for any of those third-party integrations, so I don’t miss them.

Unfortunately, I’ve found that Dokku still imposes constraints I don’t like. I believe fixing all of them would push it too far from being compatible with Heroku, and therefore it wouldn’t be the same project any more. So I’ve been working out what I want from first principles.

My requirements

I want to share a single server and a single IP address among multiple applications, possibly managed by different people, who may all distrust each other to some degree. For my personal server these would be friends who I just want to protect from accidental interference, but ideally it wouldn’t be a big leap from there to real security.

Authorized people should be able to add new applications with minimal or no involvement from the system administrator, depending on how much trust the administrator wants to extend; but they had better not be able to interfere with other people’s applications.

I want to use only standard Unix permissions to control how people with access to the server can interact with their own and others’ applications.

Sharing an IP address requires multiplexing the HTTP and HTTPS ports between the different applications, but the reverse proxy which does that should be as minimal as possible. Minimalism is good for security, but that isn’t the key reason. If the proxy processes the protocol stream in any way, that can impose several different kinds of constraints on backend applications:

  • Any interpretation the proxy performs means it can only forward traffic it understands, which keeps applications from taking advantage of the full features of HTTP/2 or other specifications. For example, as far as I can tell it is currently impossible to implement RFC8030 (Generic Event Delivery Using HTTP Push, part of the Push API) if nginx is your reverse proxy, because that server only supports sending a push promise that it immediately resolves.

  • If the proxy imposes arbitrary limits, such as a maximum upload size or a “Web Application Firewall”, then some applications will need those limits adjusted, and there is no one-size-fits-all configuration. That means untrusted people need to be able to change the configuration of a shared component, which makes it way more difficult to show that people can’t interfere with the configuration of each other’s applications.

Each application should be assigned one or more hostnames, which are not shared with any other application. Sharing at the level of subpaths or other properties of a request gets substantially more complicated. More importantly, hostnames are special in HTTPS because they’re provided early via TLS Server Name Indication, so the system can make routing decisions without inspecting the HTTP protocol stream at all.

I thought I wanted the proxy to provide TLS termination, so the administrator can provide wildcard certificates and DNS records for some domain. That would allow people to host applications at any subdomain of that domain without having to configure TLS or DNS themselves. However, the HTTP/2 specification says that browsers are allowed to send requests for any domain listed in the server’s certificate over the same connection. Per section 9.1.1 of RFC7540:

    For example, a certificate with a subjectAltName of *.example.com might permit the use of the same connection for requests to URIs starting with https://a.example.com/ and https://b.example.com/.

That means routing based on SNI is only safe if each certificate is associated with exactly one backend. So instead I require the backend application to be responsible for provisioning its own TLS certificates, probably using ACME/Let’s Encrypt.

I’d like to have the option of using Docker, KVM, and so on, but I also want the option not to use any container or virtualization technology for a given application. I want to run an application and all its dependencies, such as databases, caches, and task queues, entirely under my own privileges. I want to be able to use any software I can figure out how to build, and it shouldn’t be necessary to run any of it with elevated privileges.

It’s nice to support zero-downtime rolling upgrades even on a personal server used primarily for experiments. I want to start up a new version of an application and test that it’s working, before atomically switching outside traffic over to it and gracefully shutting down the old version. But I want to do that without reloading or reconfiguring the reverse proxy, because again, otherwise it’s hard to show that people can’t interfere with each other’s applications.

Solution

I haven’t found any existing software that I’m fully satisfied with, so I wrote my own, with a focus on minimizing the amount of code that has to be trusted. I used Rust, for its combination of memory safety, performance, and event-driven async/await syntax, to write two reverse proxies:

  • sniproxy for routing TLS connections according to SNI
  • httpdir for routing HTTP/1.1 requests according to the Host header

They follow very similar designs. Neither program takes any command-line options, or even a configuration file in a traditional sense.

First off, you have to pass a listening TCP socket as stdin (file descriptor 0). There are many ways to do that, whether using classic tools like inetd or new alternatives like systemd. If you need to listen on multiple sockets, run a separate copy of the proxy for each socket.

This has several advantages:

  • It keeps the implementation simpler and easier to review because there are no configuration options to parse.

  • It means that the proxy itself never needs to have elevated privileges, which are otherwise required to bind to port numbers below 1024.

  • A service manager can hang on to the socket and later pass it to a new version of the proxy, while allowing the old version to finish any connections it’s already handling and gracefully exit.

  • If the single-threaded proxy turns out to be CPU-bound, the administrator can run multiple copies of it listening to the same socket, and pin each one to a separate CPU core.
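
To make that handoff concrete, here is a minimal sketch, not the proxies’ actual code, of how a plain synchronous Rust program could adopt a listener that its service manager has already bound and passed in as file descriptor 0:

    use std::net::TcpListener;
    use std::os::unix::io::FromRawFd;

    fn main() -> std::io::Result<()> {
        // The service manager has already bound the socket and passed it to us
        // as file descriptor 0, so we never call bind() and never need
        // elevated privileges.
        // Safety: fd 0 must really be a listening TCP socket owned by this process.
        let listener = unsafe { TcpListener::from_raw_fd(0) };
        for stream in listener.incoming() {
            let _connection = stream?;
            // ...look up the requested hostname and proxy bytes to its backend...
        }
        Ok(())
    }

The real proxies are single-threaded and asynchronous rather than blocking like this sketch, but the fd 0 handoff works the same way.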

Second, create a directory to hold your backend configuration, containing a subdirectory for each hostname you want to serve. I’ll call this directory hosts/ here. When you start either of these proxies, this top-level configuration directory must be its current working directory.

Unix permissions on the hosts/ directory determine who can create, delete, or rename a host. You could make it writable only by root, but if you want to be more permissive you can set the “sticky bit” (e.g. mode 1777) so that, like /tmp, anyone can create a directory but can only delete or rename directories they own.
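
In shell terms that’s nothing more than mkdir hosts followed by chmod 1777 hosts; for completeness, here’s the same setup sketched in Rust:

    use std::fs;
    use std::os::unix::fs::PermissionsExt;

    fn main() -> std::io::Result<()> {
        // Mode 1777: world-writable with the sticky bit, like /tmp, so anyone
        // can create a host directory but only its owner can delete or rename it.
        fs::create_dir("hosts")?;
        fs::set_permissions("hosts", fs::Permissions::from_mode(0o1777))?;
        Ok(())
    }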

Hostnames must be represented without a trailing dot (.) and in lowercase ASCII. So for international domain names, use the “A-label” form: if you’re hosting a web site at “🕸💍.ws”, then you’d put the backend configuration for that site in hosts/xn--sr8hvo.ws/.

Since you can’t create multiple directory entries with the same name, the filesystem enforces that all configured hostnames are unique. It does not enforce that all configured hostnames are legal or valid, but the proxies will reject an illegal hostname without checking for its presence in the filesystem, so any misconfigured backends will just be silently ignored.

In this configuration, a directory like hosts/jamey.thesharps.us/ should be owned by the user and/or group who is authorized to manage the application.

Each hostname subdirectory can have these files:

  • http-socket (required for httpdir): a Unix domain socket that your backend application listens on for HTTP/1.1 connections.

  • tls-socket (required for sniproxy): a Unix domain socket that your backend application listens on for TLS connections.

  • send-proxy-v1: if this file is present, sniproxy will prefix every connection to this backend with a PROXY protocol v1 header.
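
For reference, the PROXY protocol v1 header is a single human-readable line sent before any bytes from the client. Here’s a sketch of the format, with made-up addresses:

    fn main() {
        // One line of text before any client data:
        //   PROXY <TCP4|TCP6> <client addr> <server addr> <client port> <server port>\r\n
        // The addresses below are made up, just to show the wire format.
        let header = format!(
            "PROXY TCP4 {} {} {} {}\r\n",
            "203.0.113.7", "192.0.2.10", 51234, 443
        );
        assert_eq!(header, "PROXY TCP4 203.0.113.7 192.0.2.10 51234 443\r\n");
    }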

Any of these sockets or directories may be symlinks, which allows atomically switching them to connect to a different socket at any time. However, those symlinks should be relative and should only point within the same directory, because ideally the proxy should be run in a chroot and so may not have access to sockets located elsewhere.
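
For example, a deploy script can switch a backend to a new version by creating the new symlink under a temporary name and renaming it over the old one, which is atomic on POSIX filesystems. A sketch, using hypothetical socket names:

    use std::fs;
    use std::os::unix::fs::symlink;

    fn main() -> std::io::Result<()> {
        // app-v2.sock (a hypothetical name) is already listening and has been
        // tested. rename() replaces the old symlink atomically: a connection
        // arriving mid-switch sees either the old target or the new one,
        // never a missing file.
        let dir = "hosts/example.com";
        symlink("app-v2.sock", format!("{dir}/http-socket.new"))?;
        fs::rename(format!("{dir}/http-socket.new"), format!("{dir}/http-socket"))?;
        Ok(())
    }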

You can change the configuration without restarting the proxies: they look up the target socket each time a connection comes in, and don’t need to know which hostnames to serve in advance. This is especially important if untrusted users will be managing any hostname configurations because allowing them to restart or reload the reverse proxy is risky.
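
To give a feel for how little work that per-connection lookup involves, here is a rough sketch, not the actual source, of what httpdir conceptually does once it knows the requested hostname, assuming the configuration directory is the current working directory:

    use std::io;
    use std::os::unix::net::UnixStream;
    use std::path::Path;

    // Reject anything that isn't a plausible lowercase-ASCII hostname before
    // touching the filesystem, so strange names can never escape hosts/.
    fn valid_hostname(name: &str) -> bool {
        !name.is_empty()
            && name.len() <= 253
            && !name.starts_with('.')
            && !name.ends_with('.')
            && !name.contains("..")
            && name.bytes().all(|b| {
                b.is_ascii_lowercase() || b.is_ascii_digit() || b == b'-' || b == b'.'
            })
    }

    // Resolve hosts/<name>/http-socket on every connection, so configuration
    // changes take effect immediately without reloading anything.
    fn connect_backend(name: &str) -> io::Result<UnixStream> {
        if !valid_hostname(name) {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "bad hostname"));
        }
        UnixStream::connect(Path::new("hosts").join(name).join("http-socket"))
    }

sniproxy’s equivalent would open tls-socket instead, and check whether send-proxy-v1 exists before deciding to prepend a PROXY protocol header.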

Alternatives

I examined a lot of existing options before giving up and writing my own.

I’d like to give a special shout-out to Melvil, a proof of concept my friend Getty wrote some years ago, as evidence that my ideas are hardly unique. I independently ended up with a very similar design, although Getty was trying to put more functionality into the proxy layer, while I aimed to identify the least configurable thing which could provide the most flexibility to the backends.

Aside from Melvil, every alternative I found would have required me to write shell scripts or similar to construct a suitable configuration file from the directory structure I wanted to use. I was concerned about the possibility of directory names which aren’t valid hostnames but might lead to syntactically invalid configuration files if I substituted them naively into a template. Also, any time some backend’s configuration changed, I’d have to reconfigure or restart the proxy.

The classic TLS termination proxy is stunnel, and I thought it looked promising for a while, but I never quite managed to make enough sense of its documentation to put together even a minimal proof of concept. It seems to support a variety of protocols and configuration options, but it isn’t clear whether they work together in all combinations. I got particularly confused about whether turning on PROXY protocol support would affect the incoming connection from the client or the outgoing connection to the backend. Anyway, I eventually decided I didn’t want TLS termination at all, and that’s all stunnel really does.

All the remaining alternatives that I considered put both TLS and HTTP/1.1 routing in the same application. I believe this is a task which should not share any state across connections, so there’s no reason to handle more than one socket per process. And there are good reasons to isolate parsers for different protocols from each other: a buggy HTTP parser shouldn’t be able to leak information from or deny service to another TLS connection, and vice versa.

Part of what convinced me that routing based purely on SNI could work was an existing SNI Proxy project. I read that implementation carefully and quite like a lot of the ideas in it. I didn’t have a lot of confidence in the TLS protocol parser though—which says “This was created based primarily on Wireshark dissection of a TLS handshake and RFC4366”—and in fact I believe that implementation would break if a client sent fragmented TLS records. Whether that’s actually a flaw remains an open question, since well-behaved clients don’t normally do that during the TLS handshake, but I decided I wanted to accept anything that the TLS 1.3 specification allows.

I spent the most time fiddling with haproxy. Eventually I had a somewhat sketchy configuration that I was pretty happy with, which would terminate TLS but use only SNI to decide which backend to route the decrypted stream to. This was the point when I discovered that HTTP/2 specifically allows clients to reuse a connection for a different hostname, as long as both hostnames appear in the certificate that the server presented for that connection. Honestly I think aggressive connection reuse is a good idea, but it made my prototype worthless so I was pretty grumpy for a bit.

Anyway, terminating TLS meant that haproxy had to negotiate extensions like ALPN before identifying what backend to connect to, rather than letting the backend decide what extensions to support. I had to hard-code things like “this server supports both HTTP/2 and HTTP/1.1”, and then force the backend to deal with the result. In my experiments I was quite happy with nghttp2’s nghttpx for providing transparent HTTP/2 support to backends that didn’t have it natively. But then I realized that haproxy could do the same thing, so I folded that into my haproxy configuration… before abandoning that approach entirely.

One of the first tools I tried was Traefik, which is designed around letting some container management system provide the information about which backends are configured. I thought about using the file backend and pretending to be a container management system, even for my backends that weren’t actually in containers. But I couldn’t see how to ensure that a given hostname was only managed by the user or group I authorized to do so, unless I wrote a custom plugin, and I didn’t want that level of complexity.

I also tried coercing nginx into giving me this kind of delegated authority over configuration. That seemed almost plausible! Because nginx include directives require that the named file be syntactically valid by itself, if I wrapped the user configuration inside a location / section, I could have relatively simple scripts generate templated configuration that would at least somewhat isolate backends from each other. Of course there are some kinds of configuration someone might reasonably want that aren’t allowed inside a location section, but this seemed like it would support most of the common needs at least.

But a lot of nginx configuration directives can have global effects. For example, if caching is turned on, any location can set an arbitrary cache key, potentially overwriting cache entries for other hosts. Turning off caching mitigates that particular problem, but then backends might need to configure their own caching layer, possibly by running another copy of nginx. If they have to manage a more capable reverse proxy anyway, why put another one with all that code sitting unused in front of it?

Anyway, for this approach to work I would need to audit every configuration directive of every module, and re-audit on every upgrade. So that’s definitely not an option.

Finally, I also considered Caddy and h2o, but both have a single configuration file with no clear method for safely templating it from untrusted configuration fragments, so I didn’t see much advantage to examining them further for this purpose. They do both look very promising for backend use, though; h2o was easier to configure as a quick static file server than anything else I tried.

Results

In total I wrote about 650 lines of Rust source for the two reverse proxies and I can explain every line of output from running either one under strace, so I feel I did pretty well toward my minimalism goals. I spent about two weeks putting these tools together, once I gave up on the existing alternatives.

70% of that source code is in sniproxy, since it contains my parser for extracting the SNI extension from the initial ClientHello message. It’s also a bit more sophisticated about handling signals, graceful shutdowns, and long hostnames, so httpdir might yet catch up somewhat in complexity. On the other hand, httpdir compiles to a much larger binary, because it pulls in an entire HTTP implementation from the hyper library.

I haven’t reached the point where I can git push to deploy my applications behind these proxies, but at this point the only other thing an administrator has to provide is a per-user service manager that starts at system boot. That could mean running loginctl enable-linger on a systemd installation, or starting a per-user copy of daemontools, supervisord, or anything else that each user can add their own services to.

As long as users have some way to set up long-lived services that get started again automatically after a reboot, and the ability to add host configurations for httpdir and sniproxy, it’s possible to build the rest of the glue needed for a “PaaS” without administrative privileges. I have some plans for how I intend to do that using Nix, but I hope to see other people tackle that part of the challenge in different ways too, because I’m certain I won’t get that right on the first try.

So please check out sniproxy and httpdir and see if they’d be useful to you too!