Monday, June 19, 2017

Search-based compiler code generation

I've been thinking about something for a while now, and promised a year ago in my Optimal Optimization post that I would write it up.

One of the difficulties in writing a programming language compiler is that there is a tension between making the pieces of the compiler modular and maintainable, versus making the compiler produce good code. The biggest cause of this tension is over "phase ordering", and I'll defer to one of the many academic papers on the subject to explain it:
Current optimizing compilers typically contain several different optimization phases. Each phase attempts to apply a series of transformations, each of which consists of a sequence of changes that preserves the semantic behavior of the program while typically improving its efficiency. Many of these optimization phases use and share resources (such as machine registers), and also need specific conditions in the code to be applicable. As a result, optimization phases interact with each other by enabling and disabling opportunities for other phases to be applied. (Prasad Kulkarni et al, "Practical Exhaustive Optimization Phase Order Exploration and Evaluation")
Researchers have explored a variety of ways to determine, for a given piece of source code, which passes should be applied and in what order. The above paper, and several others by the same authors, describe efficient ways to search over all possible phase orders. Many people have applied genetic algorithms or machine learning techniques to this problem as well; for an interesting example plus good citations to earlier work, see Sameer Kulkarni & John Cavazos, "Mitigating the Compiler Optimization Phase-Ordering Problem using Machine Learning".

But you might ask, why do we divide the compiler's work into phases in the first place? It's in support of our old software engineering friend: abstraction. It's much easier to reason about a particular transformation if we don't have to think about all of the others at the same time.

Still, could we avoid the phase order problem entirely by structuring the compiler in some other way?

Strongly normalizing intermediate representations

Alan C. Lawrence, in his 2007 PhD thesis ("Optimizing compilation with the Value State Dependence Graph"), cited a maxim of compiler folklore, "The most optimizing compiler is the most normalizing compiler," and explained it this way:
Normalization refers to where many different source programs result in the same machine code after compilation—that is, where the same machine code is produced regardless of which form was written by the programmer. … This allows the programmer to select from such representations according to which is the easiest to comprehend or to fit his ideas or program into, and leave the compiler to select the output according to efficiency concerns.
Traditionally, compilers have represented the structure of a function using a Control Flow Graph (CFG) composed of "basic blocks" containing sequential instructions, plus control-flow edges between basic blocks that describe the different paths the program may follow.

The CFG representation locks in details of evaluation order that aren't inherent in the original program, so many optimizations that are designed for CFGs just re-order instructions in various ways. Different orderings may have different performance or code size implications, so the challenge is to identify the best ordering for your particular goals.

By contrast, the Value State Dependence Graph (VSDG) does not force instructions into a sequential order, but instead keeps track of data dependencies and serialization requirements using directed edges in a graph. As a result, a VSDG describes all equivalent orderings of the same code in a single, normalized, form. (Other program representations have done that for parts of the code that are free of side effects, but VSDG extends it to control flow and memory access.)
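
To make that more concrete, here is a toy sketch, in Rust, of what such a graph might look like. This is just an illustrative representation I've made up for this post, not the data structure defined in any of the VSDG papers.

    // Illustrative only: a toy dependency graph in the spirit of a VSDG.
    #[derive(Debug)]
    enum Node {
        Const(i64),
        Add(usize, usize),   // value dependencies on two other nodes, by index
        Load(usize),         // value dependency on an address node
        Store(usize, usize), // value dependencies on an address and a value
    }

    #[derive(Debug)]
    struct Graph {
        nodes: Vec<Node>,
        // State (serializing) edges: (a, b) means node a must happen before
        // node b, even though b doesn't consume a value produced by a.
        state_edges: Vec<(usize, usize)>,
    }

    fn main() {
        // x = *p + 1; *q = x;  -- nothing here pins down a full instruction
        // order beyond the value edges and the one required state edge.
        let g = Graph {
            nodes: vec![
                Node::Const(0),    // 0: the address p (as a placeholder constant)
                Node::Load(0),     // 1: *p
                Node::Const(1),    // 2: the literal 1
                Node::Add(1, 2),   // 3: *p + 1
                Node::Const(8),    // 4: the address q (as a placeholder constant)
                Node::Store(4, 3), // 5: *q = x
            ],
            // The load must stay before the store in case p and q alias.
            state_edges: vec![(1, 5)],
        };
        println!("{:?}", g);
    }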

Lawrence later noted:
Many optimizations, which would be separate passes in a CFG compiler, seem trivial in a VSDG framework, due to the VSDG’s normalizing properties. That is, they automatically (or implicitly) occur merely by representing a program as a VSDG.
However, this does not entirely avoid the problem: rather, it merely delays the decision until the IR is transformed into some form which does make the distinction, namely until the sequentialization stage. Thus, sequentialization wraps up many such optimizations together, and concentrates their difficulty into one place. However, this approach does seem helpful in addressing phase-order problems…

Although a single VSDG can represent a large number of seemingly-different CFGs, it is possible to reduce the number of equivalent CFGs just by adding extra serializing edges. The edges initially present in the graph represent true dependencies: A must be evaluated before B, either because B needs the result of A, or because A and B have side effects that must not be re-ordered. Artificially adding more dependencies imposes a specific evaluation order for the remaining computations.

The VSDG sequentialization stage can be defined as introducing serializing edges until exactly one CFG remains. Producing a sequentialization optimized for speed is NP-complete, but because the only choices are over where to add serializing edges, the VSDG is a convenient way to express whatever constraints, heuristics, or cost functions you want.

Although Lawrence's thesis discusses sequentialization extensively, if you want to know more on that subject I recommend James Stanier's 2011 PhD thesis ("Removing and Restoring Control Flow with the Value State Dependence Graph") as well as Bahmann et al's 2015 paper, "Perfect Reconstructability of Control Flow from Demand Dependence Graphs".

Combined Register Allocation and Code Motion

Neil Johnson, in chapter 6 of his 2004 PhD thesis ("Code Size Optimization for Embedded Processors"), proposed an elegant algorithm for combining a particular pair of traditionally-separate optimization passes using the VSDG, which makes a nice example of both the phase ordering problem and the ways VSDG can help solve it.

Register allocation is the process of assigning temporary values to specific hardware registers. Sometimes there are too many temporary values that the program needs to hang on to at the same time, and they can't all fit in registers at once. When that happens, there are traditionally two choices: either "spill" some of the registers to memory so they can be used for something else until the older values are needed again; or forget some temporary values and recompute them later. Both options result in a larger and slower program.

Code motion is a loop optimization that moves an expression out of the body of a loop if that expression will evaluate to the same value on every iteration of the loop. The only way this transformation could make the expression be evaluated more times than in the original program is if the loop never executes, so this is almost always an improvement.

However, code motion increases pressure on the register allocator, because there are more temporary values that need to be available throughout the execution of the loop. Under some circumstances, the program will actually be faster if it recomputes the loop-invariant expression every time it's needed, because that may free up registers and avoid expensive spills to memory.
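
As a small, contrived illustration of this tension (written in Rust rather than C, but the shape is the same): hoisting the invariant multiply saves work on every iteration, but it also keeps one more value live across the whole loop, which is exactly what the register allocator may not be able to afford.

    // Before code motion: the loop-invariant product a * b is recomputed
    // on every iteration, but nothing extra stays live across the loop.
    fn before(a: i32, b: i32, xs: &mut [i32]) {
        for x in xs.iter_mut() {
            *x += a * b;
        }
    }

    // After code motion: one multiply total, but `scale` now has to stay
    // live (ideally in a register) for the entire loop.
    fn after(a: i32, b: i32, xs: &mut [i32]) {
        let scale = a * b;
        for x in xs.iter_mut() {
            *x += scale;
        }
    }

    fn main() {
        let mut v = vec![1, 2, 3];
        before(2, 3, &mut v);
        after(2, 3, &mut v);
        assert_eq!(v, vec![13, 14, 15]);
    }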

Johnson uses the VSDG to implement a greedy algorithm that only performs code motion when it won't interfere with register allocation. His isn't the first algorithm to combine these two passes, but using VSDG makes his approach particularly easy to understand. I recommend reading his paper if you'd like to know more.

Recasting optimization as a search problem

Although several of the VSDG papers mention combinatorial search as a potential solution to various challenges, I have not found any signs that anyone has actually tried search with the VSDG. I'd be interested to see Johnson's algorithm, for example, revisited with an exhaustive search in place of the greedy policy, and an analysis of the compile-time cost versus run-time and code-size benefit of making optimal choices.

There is a history of using search in various ways for code optimization:
  • I already cited "Practical Exhaustive Optimization Phase Order Exploration and Evaluation" as an example of using search to identify an optimal phase order for a traditional compiler on a given program.
  • Samuel Sanseri's 2001 Master's thesis ("Toward An Optimizing JIT Compiler For IA-64") demonstrated that for VLIW architectures, search-based instruction selection can produce more efficient code than GCC's hand-tuned heuristics. (It also promised to demonstrate that it could do so with fast enough compile time to be usable in a JIT setting, though I can't find evidence in the paper that compile time was actually measured. Still, I think the claim is plausible.)
  • João Dias' 2008 PhD thesis ("Automatically Generating the Back End of a Compiler Using Declarative Machine Descriptions") demonstrated using search to automatically generate the instruction selector in the backend of a compiler. Although compiler frontend techniques haven't changed dramatically since lex and yacc were written in the 1970s, instruction selection is still done today using extensive hand-written heuristics. Dias' approach replaces those heuristics with pattern-matching rules generated before compiling the compiler itself, so the slow search time is amortized across all the programs compiled for the same architecture.
These projects demonstrate a range of applications of combinatorial search covering both offline analysis used to generate better compilers, and online analysis to dynamically make better optimization decisions during compilation.

However, I have yet to find anyone who has tried using search to decide whether to apply an optimization or transformation at each individual location where it's valid. Since applying an optimization in a particular spot might prevent applying other optimizations in that spot, the current greedy methods may produce suboptimal results no matter which order they run optimization phases in.

Here's a rough outline of an approach that I think would be interesting to try:
  1. Construct a VSDG, and aggressively normalize it using equivalents of common subexpression elimination and partial redundancy elimination. These are not guaranteed to be a good idea, but they can be selectively undone later using node splitting if necessary. Also apply optimizations that are guaranteed to be improvements, such as constant folding, algebraic identities, and dead node elimination. Furthermore, some optimizations, like constant propagation and copy propagation, are implicit in VSDG construction (e.g. if there's no "copy" node available).
  2. Begin a search over the space of VSDGs that may be formed from the initial graph using any sequence of these operations:
    • first, split some subset of nodes whose results are demanded multiple times;
    • then, fully cover the graph with target machine instructions;
    • and finally, add serializing edges and spill registers until the graph corresponds to a single CFG and all edges can be assigned to registers without conflicts.
  3. Evaluate each candidate CFG according to the current optimization criteria, such as smallest code size or fastest execution time.
Although this still looks like a multi-phase optimization pipeline, the key difference is that we allow the compiler to backtrack over any decision that might turn out to be a bad idea. There may be multiple reasonable choices for nodes to split, multiple ways to select target machine instructions that compute the same results with different tradeoffs, and multiple ways to sequentialize and insert spills. I'd like to know what happens if you effectively check all combinations of those choices.
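
To show the shape I have in mind, here's a minimal backtracking skeleton in Rust. Every type and helper in it is a hypothetical placeholder (there is no such library), and a real implementation would prune with branch-and-bound rather than enumerating everything, but it captures the "backtrack over any decision" structure:

    // All types and helper functions here are made-up placeholders.
    struct Vsdg;      // the normalized graph from step 1
    struct Candidate; // one fully sequentialized, register-allocated CFG

    fn node_splittings(_g: &Vsdg) -> Vec<Vsdg> { Vec::new() }         // step 2, splitting choices
    fn instruction_coverings(_g: &Vsdg) -> Vec<Vsdg> { Vec::new() }   // step 2, covering choices
    fn sequentializations(_g: &Vsdg) -> Vec<Candidate> { Vec::new() } // step 2, serializing choices
    fn cost(_c: &Candidate) -> u64 { 0 }                              // step 3: code size or estimated time

    fn search(initial: &Vsdg) -> Option<(Candidate, u64)> {
        let mut best: Option<(Candidate, u64)> = None;
        for split in node_splittings(initial) {
            for covered in instruction_coverings(&split) {
                for candidate in sequentializations(&covered) {
                    // A real implementation would prune here, comparing a lower
                    // bound on the eventual cost against `best` before descending.
                    let c = cost(&candidate);
                    if best.as_ref().map_or(true, |&(_, b)| c < b) {
                        best = Some((candidate, c));
                    }
                }
            }
        }
        best
    }

    fn main() {
        // With the stub enumerators above this finds nothing; the point is the shape.
        assert!(search(&Vsdg).is_none());
    }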

Of course, the details matter if the result is ever going to be used in a serious compiler. I have some questions about how to handle certain things, like pointers or break statements or exceptions, in a VSDG; but ignoring those concerns, the search also needs to produce reasonable answers quickly or most people will reject it. So a practical implementation needs good heuristics for value-ordering and for branch-and-bound pruning. Given those, I suspect Limited Discrepancy Search would be a good search strategy. (Sanseri reported good results with these choices in his thesis on IA-64 instruction selection, which seems like a good confirmation for my intuition.)
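
For readers who haven't met it: Limited Discrepancy Search explores only the paths that disagree with the heuristic's preferred choice at most k times, for increasing k, so a good heuristic yields a good answer early while the search remains complete. Here's a tiny self-contained sketch over a binary choice tree; nothing in it is specific to compilers.

    // Illustrative only: Limited Discrepancy Search over a binary choice tree.
    // Branch 0 is the heuristic's preferred choice at every node; taking
    // branch 1 instead costs one "discrepancy".
    fn lds(depth: usize, discrepancies: usize, path: &mut Vec<usize>, out: &mut Vec<Vec<usize>>) {
        if depth == 0 {
            out.push(path.clone());
            return;
        }
        // Following the heuristic is always allowed.
        path.push(0);
        lds(depth - 1, discrepancies, path, out);
        path.pop();
        // Deviating from the heuristic uses up one discrepancy.
        if discrepancies > 0 {
            path.push(1);
            lds(depth - 1, discrepancies - 1, path, out);
            path.pop();
        }
    }

    fn main() {
        for k in 0..=3 {
            let mut leaves = Vec::new();
            lds(3, k, &mut Vec::new(), &mut leaves);
            // k = 0 visits only the heuristic's path; each larger k widens the sweep.
            println!("discrepancy limit {}: {} leaves visited", k, leaves.len());
        }
    }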

Anyone want to give this a try? I don't have high hopes that anyone reading my blog has spare time for such a project, but there's no harm in asking, right?

Delegating search to SMT solvers

As a side note, I also want to highlight various efforts to use SMT solvers to improve optimization. Effectively, such projects delegate the search to the SMT solver, which implements a good set of general-purpose heuristics. This is a smart approach, although querying an SMT solver is still a fairly expensive thing to do, so people have resisted doing it during normal compilation. As a result, these tools are primarily used offline, to identify improvements that should be made to a compiler, rather than online during compilation.
  • Sebastian Buchwald's "Optgen: A Generator for Local Optimizations" generates all possible peephole optimizations smaller than a given size limit, and in the paper he reported finding that GCC, LLVM, and the Intel C Compiler were all missing some optimizations that Optgen found.
  • Souper examines output from LLVM's optimization passes and looks for patterns that can be shown using SMT queries to have better alternatives. I can't find any published papers about this, but see John Regehr's blog for some reported results, such as "A Few Synthesizing Superoptimizer Results" from 2015. 
  • Lopes et al. built Alive, a tool to prove LLVM peephole optimizations correct or generate counterexamples explaining why they're wrong; see "Provably Correct Peephole Optimizations with Alive".

Conclusions

  • Compilers are structured with many independent optimization passes because they're hard enough to get right that way, and worse if you try to combine them; but keeping them separate leads to suboptimal code.
  • Choosing a different intermediate representation in the compiler changes which parts of the compiler are hard to write. That may enable more sophisticated compilers by pushing complexity to places where it is easier to manage.
  • Since optimizations interact in complicated ways, I'd like to see more research into different ways to apply combinatorial search to find the best combination of optimizations for a given piece of code. I've proposed one approach that I haven't seen attempted yet.

Responses

Edit: Thanks to CalChris on Hacker News for the link to Unison, a recent LLVM pass that performs combined instruction scheduling and register allocation via constraint solving. See Lozano et al, "Register Allocation and Instruction Scheduling in Unison".

Thursday, April 6, 2017

Corrode update: control flow translation correctness

I've been pretty quiet about Corrode for a few months. That's partly because I was traveling some, partly because I've been busy with non-Corrode client work, partly because I've been setting up my next Rust-related contracts that I hope to get to announce soon... but mostly, I've been quiet about Corrode because I've been trying to fix some very difficult bugs since December, and it was demoralizing to fight with my code for so long.

So I'm very pleased to announce that Corrode now translates arbitrary C control flow to Rust, correctly!

One thing this has already enabled: the new rust-lzo crate on crates.io is a Rust implementation of the LZO compression algorithm, translated using Corrode from the Linux kernel's implementation. Hooray!

I last talked about the challenges around getting control flow right in November ("Go To Statement Considered (Mostly) Harmless"). At that time I mentioned that Corrode couldn't handle "irreducible" control flow, or some other cases. Later I discovered that sometimes it wasn't even producing correct answers for the cases that I thought were working.

Corrode's new control-flow algorithm is based on Emscripten's "Relooper" algorithm, as described in "Emscripten: An LLVM-to-JavaScript Compiler". I'd like to thank Brian Anderson of the Mozilla Rust team for reminding me that Emscripten exists and already solved almost exactly the problem I faced! (Unless I'm mixing people up again, in which case I'd like to thank whoever actually told me that, instead.)

The Relooper is fully general: it can handle any control flow that you can express in C, including arbitrary 'goto' statements. And I have confidence that my implementation really will handle any wacky control flow you can come up with, partly because at this point I have tested millions of randomly-generated control-flow graphs against relatively simple correctness properties using QuickCheck. I found so many bugs that way! QuickCheck then simplified each bug-triggering control-flow graph down to the smallest graph that would still trigger a bug, making it easier to figure out why it failed.
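
The real tests are QuickCheck properties over Corrode's actual relooper (Corrode is written in Haskell), so I can't usefully reproduce them here, but the overall shape of this kind of testing is easy to sketch. Here's a small self-contained Rust analogue, with a made-up graph transformation and property, just to show the pattern: generate lots of random CFGs, transform each one, and assert an invariant that must hold if the transformation is correct.

    // A tiny deterministic xorshift RNG so the example has no dependencies.
    fn next(state: &mut u64) -> u64 {
        *state ^= *state << 13;
        *state ^= *state >> 7;
        *state ^= *state << 17;
        *state
    }

    // A CFG as an adjacency list: node i has edges to the nodes in cfg[i].
    fn random_cfg(state: &mut u64, nodes: usize) -> Vec<Vec<usize>> {
        let mut cfg = Vec::with_capacity(nodes);
        for _ in 0..nodes {
            let mut edges = Vec::new();
            for target in 0..nodes {
                if next(state) % 4 == 0 {
                    edges.push(target);
                }
            }
            cfg.push(edges);
        }
        cfg
    }

    fn reachable(cfg: &[Vec<usize>], start: usize) -> Vec<bool> {
        let mut seen = vec![false; cfg.len()];
        let mut stack = vec![start];
        while let Some(n) = stack.pop() {
            if !seen[n] {
                seen[n] = true;
                stack.extend(cfg[n].iter().copied());
            }
        }
        seen
    }

    // The made-up "transformation" under test: drop edges out of unreachable nodes.
    fn prune(cfg: &[Vec<usize>], start: usize) -> Vec<Vec<usize>> {
        let seen = reachable(cfg, start);
        cfg.iter()
            .enumerate()
            .map(|(i, edges)| if seen[i] { edges.clone() } else { Vec::new() })
            .collect()
    }

    fn main() {
        let mut state = 1u64;
        for _ in 0..100_000 {
            let cfg = random_cfg(&mut state, 8);
            // Property: pruning must not change which nodes are reachable from node 0.
            assert_eq!(reachable(&cfg, 0), reachable(&prune(&cfg, 0), 0));
        }
        println!("checked 100000 random CFGs");
    }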

I now have Travis-CI testing 100,000 random CFGs after every build. This is not a hardship because the implementation is fast: on my 2013-era laptop, testing that many examples only takes around 20 seconds!

Some of the bugs I encountered were in cases that are not covered by the Emscripten paper. I couldn't find the current implementation of the Relooper when I looked through the Emscripten source code, so I don't know whether Emscripten's real implementation handles those cases and they were just left out of the paper, or whether they are covered in the paper and I simply missed them. I also discovered that one of the cases in the paper can't happen because it's subsumed by the others. I intend to ping the paper's author about these surprises sooner or later...

This algorithm is tricky to get right and requires very carefully defining invariants and preconditions. Because I needed to reason about the algorithm in a very formal way to have any hope of getting it right, I thought it might be a good use case for annotating the implementation with refinement types as implemented in LiquidHaskell. Unfortunately, I had too much trouble writing types that LiquidHaskell would accept, so I gave up. Still, just making the effort to write down a semi-formal model helped my understanding a lot, so it wasn't wasted effort! I hope to find a good excuse to play with LiquidHaskell again sometime.

Although I wasn't able to formalize my reasoning for why Corrode's implementation of the Relooper is correct, I at least wrote down an English-language sketch of the correctness and termination proofs for the algorithm, as part of the Literate Haskell documentation for Corrode's CFG module. Please read it, I spent a lot of time writing it ;-)

I'm slowly evolving this algorithm beyond Emscripten's implementation, because for Corrode I want the output to be as human-readable and maintainable as possible. (Emscripten's output is only intended for a web browser to parse, and you need only glance at JavaScript that humans write to realize that nobody's much bothered if we give our web browsers massive headaches.) So far I have a simple post-processing pass to merge any "Simple" block with an immediately-following "Multiple" block, which eliminates the need to introduce an extra variable in the most common cases; and a straightforward check to avoid using labeled "break" and "continue" statements when referring to the most immediately enclosing loop. I have more clean-ups planned, inspired by trying to read the output from Corrode on real C source files.

I'm sure the question you're asking yourself by now is, "But how is the CVS Rust port coming along?" Significantly better now, thank you! There are now only five source files that Corrode refuses to translate, for various reasons, although quite a few more still won't compile after translation. And at least one source file now works fine that previously, when I tried converting it to Rust, made CVS fail its test suite. So a lot of the remaining work is just in cleaning up the output, plus a bit of handling more corner cases of C.

Corrode has many limitations left to fix. But at least control flow isn't one of them any more!

Tuesday, January 3, 2017

Which projects should convert to Rust?

Since the 1.0 release of the Rust programming language in May 2015, people have been exploring different ideas about what to use it for.
  • Does Rust make C obsolete?
  • If so, should we replace existing C programs with Rust equivalents?
  • Or should we leave existing code alone, write bindings that let us re-use it from Rust, and just write new code in this new language?
Of course, there's no one-size-fits-all answer to these questions. But I think there's enough evidence now from various experiments to put together some pretty decent guidelines, which I'll summarize here in case you're impatient:
  1. Rewriting existing software in Rust is an excellent educational exercise, worth doing if you just want to learn something.
  2. If you want your Rust version to be adopted, be prepared to give strong evidence that it's equivalent to the original implementation. This is easiest if the project has a comprehensive test suite, and if much of your translation work can be automated.
  3. Translating a project to Rust is justifiable if the bugs it has the most trouble with are the kinds of bugs Rust is most helpful on—notably, memory safety bugs.
  4. If a project's community is resistant to switching to Rust, any effort you put into porting may be wasted. Make sure you build community support around your work!
For more details, read on.

I'm taking this evidence from two kinds of projects. First, from quite a few efforts to replace pieces of C with Rust equivalents, such as:
  • rusl: an experimental musl libc incremental rewrite in Rust
  • Librsvg gets Rusty: incrementally rewriting parts of an SVG library in Rust, while maintaining binary compatibility with the fully-C version of the library
  • uutils/coreutils: rewriting GNU coreutils in Rust from scratch
  • Trust-DNS: from-scratch implementation of a DNS client and server, inspired by CVEs in BIND
A second source of evidence comes from my efforts at automatic translation from C to Rust using Corrode, and especially my Mozilla-supported experiment on translating CVS.

Education

Bear in mind that many of these porting efforts, such as rusl and librsvg, were started for educational purposes (see, for instance, rusl's goals and non-goals). I think it's safe to guess that everyone who has made a serious attempt at a project like this has learned a lot! So let's make that guideline #1:
If you want to learn a language you don't know already, one very educational way to do it is to rewrite some piece of software you already understand.
(That guideline isn't even specific to Rust. In 2002, I did an exercise in writing a run-length-encoding compressor in as many languages as I had patience to try; I picked C++, Perl, Java, Smalltalk, x86 assembly, and Haskell. It was very helpful for discovering some pros and cons of each language. I had not tried Smalltalk or Haskell before and that exercise led to my continued interest in Haskell years later.)

But what if you want more return on investment than just an educational experience? We'll have to look at the results of these projects more closely.

Correctness, safety, and security

These projects all share at least one big challenge: How can you be sure the Rust version behaves correctly? Nobody wants to replace code that works with code that has new bugs, and while Rust rules out certain classes of memory and type safety problems, it makes no guarantees about program logic. But then, how can you have any confidence that the original version is correct? Perhaps the answer to that question can answer the earlier one.

The above examples all use automated testing to gain confidence in their correctness. The musl, librsvg, and coreutils projects each provide a test suite that these porting efforts adopted to test that their Rust rewrites produced the same results as the original implementations. So, assuming the tests pass for both implementations, we can be sure that the Rust version works exactly like the original, at least for those parts that are tested.

That's not as satisfying as I'd like. For example, the author of rusl points out that not only is the test suite for musl incomplete, but musl doesn't pass all of it. Still, re-using an existing test suite provides pretty good confidence.

Trust-DNS doesn't re-use an existing test suite, but its author wrote a test suite based on the relevant DNS specifications, and some of its tests query well-known public DNS servers as a simple interoperability check.

That approach has hazards as well. NASA Langley Research Center's 1990 follow-up to the Knight/Leveson "Experimental Evaluation of the Assumption of Independence in Multi-Version Programming" from 1985 found that when multiple teams attempt to write separate programs implementing the same specification, their bugs are not statistically independent. When multiple implementations failed on the same inputs, they found that "lack of understanding by the programmers of key points in the specification was a major contributor".

Trust-DNS encountered an instance of this problem, in fact, related to this bit of RFC6840: "The guidance in the above paragraph differs from what has been published before but is consistent with current common practice." Different DNS developers interpreted a particular specification section in different ways, leading to non-interoperable implementations.

But if the implementation and the tests are written by the same person, then on top of the problems of incomplete test suites, any mis-interpretation of the specification will affect both the implementation and its tests, resulting in an incorrect implementation that appears to be correct because it passes its test suite.

That said, writing a test suite from a specification is far better than nothing, and I'm sure this approach has caught bugs in Trust-DNS.

Anyway, correctness isn't all that matters. The safety and security properties that Rust guarantees are not the only such properties you might care about, and test suites can't cover the gap. If some invalid input causes the C version to return an error and keep going, but causes the Rust version to panic or go into an infinite loop, that could be an exploitable denial-of-service bug. Or if the Rust version is missing an access-control check, it could disclose information that an attacker wasn't authorized for. It's not feasible to write tests to catch all of these cases.

Equivalence

I'd summarize all these concerns as guideline #2:
If you want to replace existing software with a Rust implementation, then the Rust version must behave equivalently to the original, aside from inputs that crash the original. To convince other people to adopt your new implementation, you must give them confidence that the new implementation really is equivalent.
Reusing the original implementation's test suite is a good start at providing this confidence. Demonstrating that the new implementation can interoperate with other implementations, as Trust-DNS does, is another good step.

But I think better approaches are available for demonstrating equivalence when the new implementation is supposed to match an existing one. (These don't help Trust-DNS, but do help the others.)

First, you could mechanically prove that the two implementations are equivalent. The developers of KLEE tested equivalence between busybox and GNU coreutils, for example. KLEE automatically generated test cases that maximized code coverage in both code bases, finding inputs which produced different outputs. In several cases, those differences represented significant correctness bugs; in others, just features that weren't implemented in one version or the other. The Software Analysis Workbench from Galois is also designed for this, and I suspect afl-fuzz could be used in a similar way. Each tool has different trade-offs between performance, completeness, and ease-of-use.

I'm surprised that I haven't seen any of the port-to-Rust projects use any of those tools, to be honest.

Second, and my personal favorite, is to automate as much of the porting effort as possible. If you've been following my blog posts on Corrode, then you know this is the approach I've been working on. This changes the problem from "do I trust the Rust version?" to "do I trust the translator?" Don't get me wrong: establishing trust in the automation is hard too! But so long as the translation tools can be reused across many projects, it's worth spending a lot of time on their correctness.

Any of the above techniques for establishing equivalence may be enough on its own to convince people to adopt your shiny new Rust implementation. But when the software you're converting is especially important, you may need to combine techniques. For example, in my CVS experiments, I'm using Corrode to automate much of the conversion, but I'm also relying on the extensive test suite that CVS has to check that neither Corrode nor I made mistakes. Since Corrode is still young, I've found bugs in it this way, making the test suite an excellent sanity check.

Time and resources

This suggests guideline #3:
The amount of energy you put into converting a project to Rust should be proportional to the importance of that project, and to the consequences of those bugs which Rust can help with.
A program that doesn't interact with the Internet or with files that might come from untrusted sources, like a single-player game or a scientific simulation, may not be worth spending much effort converting to Rust. To be clear, I think there are benefits to choosing Rust when starting work on scientific simulations or games! But once you have already written and debugged code for these sorts of programs, the marginal benefit of switching to Rust is much lower.

On the other hand, a small library that is used by a huge number of programs and often handles untrusted data, such as zlib or a multimedia codec, might be worth spending a lot of time on:
  1. because Rust's memory-safety guarantees address these libraries' most common sources of bugs,
  2. because the consequences of those bugs are widespread due to the number of users,
  3. and because the amount of code to translate is small enough that the effort is nicely bounded.
Widely used network programs such as NTP, SSH, web servers, etc., may be strong candidates for the same reasons.

For libraries that are not worth translating to Rust, it's probably worth at least using tools like rust-bindgen to generate Rust bindings so Rust developers can take advantage of the time and effort that have gone into those libraries.

Ideally, grant programs like the Linux Foundation's Core Infrastructure Initiative or Mozilla Open Source Support should fund efforts to translate critical open-source projects to Rust. Of course, until there's a significant example where a Rust implementation has replaced its predecessor, this may be a hard sell.

And on that note, here's guideline #4:
If you want your Rust port of a project to be adopted, you'd better make sure you have buy-in from that project's community.
This is just standard project management advice, of course, and applies to both open-source and commercial in-house projects. A project's community is deeply invested in their existing implementation, especially in projects with a long history. The larger and more established the community is, the harder it will be to convince a critical mass of their developers to learn Rust to continue working on a project that many of them probably feel is fine as-is.

You could say that the reason I chose to try converting CVS to Rust is that not only does it satisfy guideline #3, but it also has too few people caring about it to complain about my attempts. 😉

I find it unlikely, for example, that the Linux kernel will ever switch to Rust. (Although that won't stop me from testing Corrode on it!) In fact, I'd expect some hypothetical new operating system to supplant Linux before Linux developers would switch programming languages. (I'm cheering for seL4, personally, perhaps together with Genode.)

On the other hand, it looks like developers working on GNOME libraries like librsvg may wind up being relatively early adopters.

The gap between "nobody wants this" and "everybody's on board" will move over time, as Rust's advantages are demonstrated in more projects. For the moment, it's still a young language: while I expect projects willing to make the switch will be relatively rare for the moment, I believe we'll see some significant early adopters soon. Naturally, I hope my work on Corrode will help make that happen sooner!

Summary

To recap:
  1. Rewriting existing software in Rust is an excellent educational exercise, worth doing if you just want to learn something.
  2. If you want your Rust version to be adopted, be prepared to give strong evidence that it's equivalent to the original implementation. This is easiest if the project has a comprehensive test suite, and if much of your translation work can be automated.
  3. Translating a project to Rust is justifiable if the bugs it has the most trouble with are the kinds of bugs Rust is most helpful on—notably, memory safety bugs.
  4. If a project's community is resistant to switching to Rust, any effort you put into porting may be wasted. Make sure you build community support around your work!

Friday, December 9, 2016

How to translate a large C project to Rust

In October, I started working on translating CVS from C to Rust, and today I'd like to answer these questions about how that's going:
  • How does a Corrode-aided porting effort work, in practice?
  • How much progress have I made on CVS, and how hard would it be to do the same for other projects today?

How to translate a new project using Corrode

Here's the process I've followed while working on translating CVS to Rust. You can find the scripts and patches described here in my cvs-rs repository.

0. Does it build?

Before doing anything else, I made sure that I could build CVS from unmodified source using GCC. This is important! If it doesn't work with a standard C compiler, there is absolutely no way Corrode is going to give you good results. Or as Charles Babbage put it:
On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
If the project you're translating has a test suite, this is also a good time to check that the test suite passes before you start changing things!

1. Trial run

Next, to get a rough idea of how much work the translation will eventually require, I tried just substituting Corrode and rustc in place of GCC in the build. Most build systems for open source projects make it easy to do that, usually by setting a "CC" variable to the path you want to use as your C compiler. In the case of CVS, I did it this way:
make CC=/path/to/corrode/scripts/corrode-cc
corrode-cc is a wrapper script that does these steps:
  1. Run Corrode to translate one C source file to a standalone Rust module.
  2. Run rustc on that Rust module, asking it to treat it as a dylib-type crate but to only emit an object file.
  3. If either of those steps fails, save the error message and then run GCC to get an object file anyway.
So the result of this step is that we have a bunch of Rust modules lying around, plus a bunch of files recording various error messages. Hopefully the build completed successfully, and if we're lucky, there's even some Rust-compiled code in it, and if we're really lucky, the new binary still works!

To estimate how much work the translation is going to take, you can look at several factors:
  • How many distinct errors did the build produce? The corrode-cc script writes error messages to files named "errors-<hash>", where the hash is over the error message itself, so if multiple files run into identical errors in some header file that they all include, that error will only show up once. Error messages:
    • may indicate that the project relies on a feature of C that Corrode does not yet handle,
    • or may indicate a bug in Corrode,
    • or may indicate code that Rust can't verify is safe.
  • How many object files (*.o) did the build produce?
  • How many Rust modules (*.rs)? In the best case, there will be one for each object file, but currently Corrode fails to translate a variety of C patterns, and whenever Corrode fails it refuses to generate any output. Those cases may indicate that you'll need to patch the C source to make it easier to translate. This can be tricky.
  • How many of the object files were compiled via Rust? If the corrode-cc wrapper script had to fall back to compiling via GCC, then there are usually small edits you can make to the Corrode-generated Rust to make rustc accept it. This is tedious but generally pretty easy.

I found that the easiest way to check whether an object file was compiled using GCC or Rust was to check if the string ".rs" appears in it:
grep -lF .rs *.o
(Of course this might have false positives, so if you have a better approach please let me know!)

Based on these results you should get some idea how close Corrode alone will get you, and how much manual work you'll need to do to complete the translation.

2. Integrate Rust into the build system

For the CVS case study, I wanted to use Corrode as if it were just another C compiler. So the C source is the canonical implementation, and I patched the build system to compile some source files via Rust instead.

Which source files should you do this to first? Maybe pick just one that worked in step 1.

For CVS, this amounted to making the following edits to src/Makefile.am:
  • Delete a selected subset of C sources from the cvs_SOURCES variable.
  • Create a RUST_SOURCES variable where, for example, if I removed "checkin.c" from cvs_SOURCES, then I added "checkin.rs" to RUST_SOURCES.
  • I added these rules:
    %.rs: CC=corrode
    %.rs: %.c
            $(COMPILE) -c $< >/dev/null
    
    .PHONY: rust-sources
    rust-sources: $(RUST_SOURCES)
    
    libcvs-rs.a: lib.rs $(RUST_SOURCES)
            rustc -A bad-style -A unused-mut -g -O \
            -C debug-assertions=on --crate-type=staticlib \
            -o $@ $<
    
  • Finally, I added libcvs-rs.a to cvs_DEPENDENCIES, and "libcvs-rs.a -ldl -lpthread -lgcc_s -lc -lm -lrt -lutil" to cvs_LDADD.
Also, I created a top-level Rust module in src/lib.rs which just re-exports declarations from the translated modules. So if checkin.rs is in RUST_SOURCES, lib.rs contains a "pub use checkin;" item.

Note that I split out a phony target just for ensuring that all the Rust sources have been built. That allowed me to split the build process into phases that run before and run after Corrode:
  1. Apply patches to the C source as needed that make it easier to translate.
  2. Run "make rust-sources" to auto-generate a rough version of each selected C source file.
  3. Apply additional patches, this time to the generated Rust, as needed to improve the translation.
  4. Run "make" to complete the build.
As Corrode and related tools improve, there should be less need for patches to either the C or the Rust source. If someday we can fully automate this process, then this multi-phase build approach can go away entirely.

I'm using quilt to manage the collections of patches. I have two reasons for doing it this way, and maybe neither will apply to you:
  • I wanted people to be able to learn from the cvs-rs repository, so I'm using the patch series as a way of communicating aspects of the process I'm following. If you're just doing a one-off conversion, you don't necessarily need to document the steps you took along the way.
  • Corrode is still under active development, so I'm frequently re-running it. Recording all the manual changes I'm making in separate patches makes it easier for me to manage my work-in-progress.

3. Translate more, or translate better

With that foundation in place, now comes the fun part: namely, "everything else!"

My current process for translating CVS involves doing either of these two tasks, over and over:
  • Pick a new C source file, move it to the build-via-Rust list, and see if it works.
  • Pick some piece of generated Rust, and see if I can improve it (by making it safer or more idiomatic).
You can keep doing these steps until there's nothing left to do, or you get bored.

Status

So far, I have translated 6.4% of the non-comment, non-blank lines in the src/ subdirectory of CVS, from 10 source files.

Sometimes, translating a thousand-line source file has taken 10 minutes. Other times, I've spent an entire afternoon comparing the generated Rust to the original C without spotting any differences, and yet the Rust version doesn't pass the test suite.

So there's more work to be done on Corrode, to make it reliably convert as many kinds of C source as possible. At this point, I'm going back to improving Corrode for a bit, rather than focusing on translating more of CVS.

Still, if you're interested in trying Corrode, I'd encourage you to try going through at least step 1 on whatever project you think is interesting. See how far you get, and if you find a project where Corrode works well, I would love to hear about it!


    Wednesday, November 16, 2016

    Go To Statement Considered (Mostly) Harmless

    While this post's title is a mash-up of "Hitchhiker's Guide to the Galaxy" with Dijkstra's 1968 letter, "Go To Statement Considered Harmful", today I have a bunch of Corrode-related news to share, not just commentary on "goto". But let's start there, because I'm super proud of this...

    Translating C's "switch" and "goto" to Rust

    Six months ago I wrote 'How to eliminate "goto" statements', about this question:
    C supports "goto" statements. Rust does not. Can programs with "goto" statements be turned into programs without them?
    Over the last two weeks I've finally gotten to write code tackling this problem. (Thanks again to Mozilla for supporting this project! I could not afford to dedicate the time this took without their support.)

    The result: Corrode now understands all forms of C statements except for the GCC extensions for "local labels", "computed goto", and inline assembly. The last holdouts were "goto" statements, obviously, as well as "switch" statements, which in C can lead to similarly complex control flow.

    I'm so excited, because this is a huge milestone! Lots of C code uses "switch" statements, and regardless of Dijkstra's opinions, lots of C code uses "goto" as well. None of the remaining GCC extensions to statement syntax are used widely enough to have nearly as much impact.

    The implementation looks very little like I expected it would six months ago. General implementation strategy:
    Gee, translating "goto" statements is hard. I know! I'll turn everything into "goto" statements, and then turn the resulting mess back into structured Rust. That'll be easier!
    If you're interested in the details, check out the re-structuring implementation. It's a 12-page paper describing the algorithms and invariants, with code interspersed in literate programming style, like the rest of Corrode. (And if you find any of the explanations confusing, I would love to receive pull requests making the documentation easier to understand!)

    It is still possible to write C programs using "goto" or "switch" that Corrode can't translate to Rust yet, but they're less common than you might expect. They fall into two categories:
    • If it's possible to enter a loop at two or more different places, the control flow is called "irreducible" and Corrode can't translate it, yet, because it fundamentally can't be represented in Rust without duplicating code or introducing new variables. The classic example of this is "Duff's Device". Many algorithms in compilers have trouble with irreducible control flow, so Corrode is not alone in this. But in "A study of irreducibility in C programs" from 2010, Stanier and Watson found that less than 0.05% of functions had irreducible control flow, across 15 large open source projects. So this is not a big concern, although I have thoughts on how to handle it anyway.
    • The more common case involves certain "goto fail"-style control flow. A surprisingly wide range of "goto fail"-style control flow actually works just fine when Corrode translates it to Rust; I haven't yet figured out how to characterize the cases that don't.
    One more challenge with the new implementation: It produces very noisy Rust.
    • The output is full of extra "break" and "continue" statements in places they aren't strictly necessary.
    • Every loop has a label, in case there are any multi-level exits that need to refer to it, and every "break" or "continue" uses the appropriate loop label even if it doesn't need to.
    • Corrode no longer generates "while" loops, just infinite "loop" statements with a conditional "break" at the beginning.
    All three of those problems should be fixable, but I needed to get the basic algorithm right first before figuring out how to produce more readable Rust.
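
    To illustrate those three problems, the current output for a simple C while-loop is shaped roughly like the first function below (a hand-written approximation, not literal Corrode output), and the second function is the kind of thing I'd eventually like to emit instead:

        // Roughly the shape described above: a labeled infinite loop with a
        // conditional break at the top, plus redundant labels and continues.
        fn noisy(mut i: i32, n: i32) -> i32 {
            'a: loop {
                if !(i < n) {
                    break 'a;
                }
                i += 1;
                continue 'a;
            }
            i
        }

        // The more readable form that the planned clean-ups would allow.
        fn tidy(mut i: i32, n: i32) -> i32 {
            while i < n {
                i += 1;
            }
            i
        }

        fn main() {
            assert_eq!(noisy(0, 3), tidy(0, 3));
        }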

    Progress translating CVS to Rust

    Last time I talked about why CVS is my current Corrode case study. Between the above work on goto/switch and a bunch of smaller fixes, the "Rustification" of CVS is going pretty well!

    Corrode is now able to produce Rust output of some sort for about 64% of the C source files in CVS. (I'm not counting C source for other platforms that doesn't get built on my system; I get 91 .rs files when I build CVS with Corrode, compared to 143 .o files in the build overall.)

    Of the remaining errors causing Corrode to fail on CVS source files, roughly one-third are due to over-complicated control flow. Another third are due to various uses of C-style union types; see Corrode issue #22 for background. (I need to work out how to merge the partial implementation of unions from pull request #63!)

    The rest are mostly things that will probably never work in Rust, but these usually can be manually fixed in the C source before running Corrode. Some examples:
    • Enum-typed global variables without an explicit initializer are a problem because C's default zero-initialization policy doesn't make sense in Rust, where enum types aren't fully isomorphic to integers. I suspect I'll edit the CVS source to provide initializers for these.
    • Rust doesn't allow declaring functions that take variable numbers of arguments, but the CVS code base declares several. These functions will just have to stay in C, or their callers will need to be rewritten to avoid a varargs-style API.
    • One CVS header declares a function pointer that takes exactly 3 arguments, but actually expects the corresponding function to take a variable number of arguments. C allows passing more arguments than declared, but Rust doesn't. I expect that changing the declaration to explicitly accept a variable number of arguments will fix the immediate problem, although then of course I'll run into the varargs problem from the previous point.
    • Rust doesn't support C's `alloca` function for variable-sized stack allocations. I guess I'll probably carefully replace it with `malloc`/`free` pairs in the right places, and then try to clean up the resulting Rust.
    Once Corrode has generated a pile of Rust source files, then we get to find out how many of them the Rust compiler will actually accept. At the moment, the answer is about 44%. Here's what goes wrong with the other 56%:
    • Nearly 75% of the errors in the generated Rust are "use of possibly uninitialized variable" complaints.
      • This is often because some uninitialized variable is passed by reference to a function that only uses it as an out-parameter, which means it's safe, but the compiler can't tell that. I think the right option is to refactor functions with out-parameters to return tuples in Rust, optionally with a C compatibility wrapper if the function still needs to be callable from non-Rust functions. (There's a small sketch of this refactoring just after this list.)
      • Another class of "uninitialized variable" errors is because the C program assigned to all of the fields of a struct individually, but Rust can't conclude that the struct itself is initialized. This case should be fixable by using C99 structure literals in the pre-Corrode source, or hacking up the Rust output to use struct constructors.
    • 16% of error messages are due to issue #77, which is that Corrode doesn't translate nullable function pointers in a way that the Rust compiler will accept. (I had assumed I could treat them the same as raw pointers, which are nullable and allow arbitrary casts, but nope; in Rust, function pointers are strongly type-safe and memory-safe.)
    • 3.3% are due to Rust not supporting as many kinds of constant expressions as C does. In particular, I think it ought to be possible to evaluate if-expressions, calls to "std::mem::size_of", and calls to "as_ptr" on types such as "str", so long as their arguments or operands are all constant. (I haven't checked exactly what current nightly Rust allows, yet, but I don't think this has changed.)
    • 2.3% are due to issue #92, which is that Rust doesn't allow a parameter or variable binding to have the same name as a static variable that's in scope. I'm not sure I quite understand why Rust does this, so if somebody could explain what's going on in that issue I'd appreciate it!
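    To make that out-parameter refactoring concrete (from the first "uninitialized variable" bullet above), here is a minimal hand-written sketch; the function names are made up for illustration and aren't taken from CVS. The Rust version returns a tuple, and a thin wrapper keeps the old out-parameter shape for callers that still expect it.

        // Illustrative only: the idiomatic version returns its results directly.
        fn parse_number(input: &str) -> (bool, i64) {
            match input.trim().parse::<i64>() {
                Ok(n) => (true, n),
                Err(_) => (false, 0),
            }
        }

        // Compatibility wrapper preserving the old out-parameter calling style.
        fn parse_number_compat(input: &str, out: &mut i64) -> bool {
            let (ok, n) = parse_number(input);
            if ok {
                *out = n;
            }
            ok
        }

        fn main() {
            let mut n = 0;
            assert!(parse_number_compat("42", &mut n));
            assert_eq!(n, 42);
        }
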
    The last few issues are pretty evenly divided into these categories:
    • The new algorithm for handling arbitrary control flow doesn't yet try to place variable declarations correctly, so a variable that was in scope in the original C may not be in scope in the generated Rust. It happens to almost always work out correctly anyway, so I'm not too worried about it yet, but I'm tracking it in issue #93.
    • Corrode sometimes translates uses of C-style unions that it isn't actually able to handle correctly, but the error isn't detected until the Rust compiler tries to figure out what to do with it. This will be fixed by just fully implementing unions.
    • When dealing with enum types, Corrode currently adheres a little too closely to the letter of the C standard, and aggressively int-promotes enum values. Several errors would be fixed if we leave values as enum type as long as possible.
    Over time, I'd love to automate as many of these fixes as possible. But in the short term I'll probably just manually fix each issue so I can find out what other challenges await.

    Next up

    Now that I have a fairly substantial chunk of generated Rust to play with, I'm going to start scrutinizing it to find interesting patterns. Based on what I find, I'll write up advice for people who want to turn unsafe C-style Rust into safe, idiomatic Rust, and if we're lucky, somebody will be able to automate some of that advice.

    Until then, I invite you to:
    • read the implementation of the C translator and the control-flow graph algorithms;
    • try running Corrode on your own programs and see how well it does;
    • and contribute documentation, bug reports, or code if you can!

    Friday, October 28, 2016

    Corrode update: support from Mozilla, and new features

    It's been a while since I've written about Corrode; I've just been quietly working on it and merging contributions from several other people. I have a few things I'd like to mention today.

    Funding from Mozilla

    The big news: Mozilla is providing funding for me to work on Corrode for the next few months! I'm super excited to get to work on some of the trickier challenges in translating C to Rust, such as eliminating "goto" statements. More on that later, but first, let me tell you about the specific goal that Mozilla is providing support for.

    One challenge for computing today is bit-rot in software that nobody wants to maintain any more, but that's still useful to some community. (This is a problem for both open-source and proprietary software, but let's focus on the open-source side as there are fewer roadblocks for a person who's motivated to do something about it.) When users discover bugs and security flaws in unmaintained software, they're often just out of luck.

    Mozilla has an interest in finding better ways for people to build software, whether that's through better programming tools like Rust, or more effective ways of organizing an open-source community. So the experiment I'm working on is to see if it makes sense to "rescue" unmaintained open source projects that were written in C by semi-automatically translating them to Rust.

    To find an interesting case study, I looked through Debian's list of orphaned packages to find projects that have a reasonable amount of C source. I especially wanted to find a project with a network component, so if there are security flaws they may be remotely exploitable, because Rust's safety guarantees have more impact there. Then I cross-checked Debian's "Popularity Contest", an opt-in census of which packages are installed on Debian systems, to pick a package that's relatively widely-installed despite being unmaintained.

    The package I settled on: CVS, the venerable "Concurrent Versions System". The first release of CVS was on November 19, 1990, almost 26 years ago! (This is not the oldest codebase I've worked on, but it's close.) The last upstream release of CVS was in 2008, even though it had a security vulnerability discovered in 2012. This is clearly 50k lines of unmaintained C.

    CVS was largely supplanted by Subversion a decade ago, which has largely been supplanted by git today. Anyone choosing CVS for a new project would get asked some pointed questions. But there are still tons of open source projects where their history is only available via CVS; you can find plenty of examples hosted at sites like Sourceforge or Savannah. And according to the Popularity Contest statistics, 6.5% of Debian users still have CVS installed. So there's value in keeping the CVS implementation alive.

    I'm currently working on improving Corrode to be able to translate as much of CVS as I can manage. After that I'll use the generated Rust source to study how much manual work is needed to clean up Corrode's output, and see if there are opportunities for automating more. This effort will wrap up by the end of the year, and hopefully I'll have plenty to report by then.

    Thanks so much to Mozilla for supporting this work!

    Recent changes

    I want to take a moment to thank all the people who have contributed to Corrode so far. I've merged pull requests from: Alec Theriault, Nathan Bergey, Vickenty Fesunov, Fabian Zaiser, Jeff Waugh, Jeremie Jost, Nabil Hassein, Amin Bandali, Getty Ritter, Robert Grosse, Sean Jensen-Grey, and Taylor Cramer. Thanks also to everyone who has filed bug reports, or commented on them to clarify details about Rust or C!

    It's now possible to use Corrode as a replacement for GCC, with the corrode-cc script from the Corrode source tree. There are still a lot of bugs, but it's good enough that I'm usually testing Corrode now by running `make CC=corrode-cc` in various source trees, including CVS and musl-libc.

    All C control-flow statements are implemented now, thanks to Fabian finishing off do-while loops... with the significant exceptions of switch and goto statements, which I'm working on now in a new 'cfg' branch. (See the first post I wrote about Corrode for background; I've been looking forward to this part for months!)

    I've fixed a bunch of small issues that either made Corrode give up during translation due to not understanding the input, or made it generate invalid Rust output. The biggest change was to only translate declarations that are actually needed for the current translation unit. C header files are full of all sorts of junk, but if you don't actually reference a declaration then Corrode doesn't need to report any translation errors for that declaration. Fixing this let me translate my first 14 source files in the Linux kernel, so that was pretty exciting!

    One change motivated specifically by my work on CVS: Corrode now handles K&R-style function definitions. Yes, CVS still uses pre-C89 style code in some places, while using C99 features in other places. That'll happen in a project of this age.

    One interesting bug was around calling through function pointers stored in struct fields. In Rust, `s.f()` is method-call syntax: it looks for a method named "f" on "s". To call a function pointed to by a field named "f", you need parentheses, like so: `(s.f)()`. The pretty-printer for Corrode's Rust AST now inserts these parentheses as needed.
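    Here's a small self-contained example of the difference; the `Handler` struct and its field name are made up purely for illustration, not taken from Corrode's output:

    // Illustration only: `Handler` and `f` are invented names.
    struct Handler {
        f: fn() -> i32,
    }

    fn three() -> i32 { 3 }

    fn main() {
        let s = Handler { f: three };
        // `s.f()` would be method-call syntax, and fails to compile here
        // because Handler has no method named `f`.
        let result = (s.f)(); // calls the function stored in the field
        println!("{}", result); // prints 3
    }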

    Another pretty-printer bug is something I tried to squash before, but hopefully got right this time: blocks in expressions. In statement position, Rust's parser effectively inserts a semicolon after a block, so programmers don't need to follow every if-statement or for-loop with one. If we want to generate something like this:
    if c { t } else { f } as f32;
    (which is a weird thing to do, but cases like this keep coming up) then we need to wrap the if-statement in parentheses to keep Rust's parser from treating the end of the else-branch as the end of the statement. However, we don't want to insert parentheses when they aren't needed, because sometimes Rust will actually warn that there are unnecessary parentheses! Fortunately I think I have parentheses in exactly the right places now.
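    Here's a minimal self-contained version of that situation; the variable names are invented, and this is just an illustration of the parse issue, not Corrode's actual output:

    // Illustration only: variable names are invented.
    fn main() {
        let c = true;
        let (t, f) = (1i32, 2i32);
        // Written as `if c { t } else { f } as f32;`, this statement does not
        // parse: Rust treats the if-expression as a complete statement and
        // then rejects the trailing `as f32`. The parentheses fix that.
        (if c { t } else { f }) as f32;
    }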

    And, because it was driving me mad, I fixed indentation for pretty-printing if-else ladders. They used to zig-zag further to the right, and sometimes some branches would all be on one line while other branches were split across multiple lines. Fixing this had no effect on correctness of the output but it's substantially more readable now!

    Coming up next...

    I'm preparing a new repository for the CVS source code that will include scripts to let anyone reproduce my progress on that translation. I'm having a little trouble scripting some of the steps I do, so that's taking longer than I hoped, but it'll happen at some point.

    A significant fraction of source files in CVS contain either switch or goto statements, so my major short-term focus is on translating those control-flow constructs correctly. If you'd like to help, check out the discussion around goto and switch.

    I'll post progress reports like this regularly. Keep an eye out!

    Friday, July 15, 2016

    Testing strategies for Corrode

    Last week I posted about Corrode, my C-to-Rust translator.

    I'm quite proud of Corrode, to be perfectly frank! And apparently other people think it's cool too, since it was on the front page of Hacker News and /r/rust for a solid day, and it spent the subsequent week on GitHub's list of trending projects. The past week has been very exciting, I gotta say!

    A moment of reflection, then. There are a few things an open source project really ought to have:
    1. Responsive maintainers. People have been trying Corrode on their own code and filing issues, and a few people have been busily submitting pull requests. I've been responding to these as quickly as I can.
    2. Solid documentation. I think documentation is pretty well covered, considering some of the feedback:
      • "Absolutely blown away by the detail of the documentation."
      • "The documentation is amazing."
      • "That's an impressive use of literate programming."
      • "WOW... This is just amazing. The most important part of the source is highly educative literate haskell."
      • "Who are you and what did you do with Jamey?" —a former co-worker who had to maintain my code after I left
    3. Verification/validation. Um... well... did you see my documentation?
    So it's time for a real plan for verifying that Corrode does what it's supposed to. I've explored several strategies so far, which have interesting trade-offs.

    Randomized testing with Csmith and C-Reduce

    "Csmith is a tool that can generate random C programs that statically and dynamically conform to the C99 standard. It is useful for stress-testing compilers, static analyzers, and other tools that process C code."

    Corrode now has a script in scripts/csmith-test which performs the following steps:
    1. Generate a random C program using Csmith, and verify that the program compiles using GCC, as our reference compiler. If either of these fail, we won't discover any interesting Corrode-related bugs with this program, so discard it.
    2. Convert the program to Rust using Corrode, and compile it with rustc. If either of these fail, this may indicate an interesting bug; report it to the user.
    3. Run the GCC-compiled program. If it times out, or exits with an error, or produces output that doesn't look right for a Csmith-generated program, then odds are low that we'll learn anything from the Rust version and we just discard this case.
    4. Run the Rust-compiled program and compare its output to the GCC-compiled version. If they're different, that probably indicates an interesting bug; report it to the user.
    Usually it only takes a few seconds to run through that process; on the other hand, these tests usually all pass :-) so I've been running the script in a loop which stops once an interesting failure is actually found.
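    For concreteness, here's a rough sketch of that loop in Rust rather than shell. It is not the actual scripts/csmith-test script: the tool names, flags, and CSMITH_HOME layout are assumptions, and it leaves out the timeout and output-sanity checks the real script performs.

    // Rough sketch of the differential-testing loop; flags and paths are assumptions.
    use std::{env, fs, process::Command};

    // Run a command and return its stdout, or None if it couldn't run or exited with an error.
    fn run(cmd: &str, args: &[&str]) -> Option<Vec<u8>> {
        let out = Command::new(cmd).args(args).output().ok()?;
        if out.status.success() { Some(out.stdout) } else { None }
    }

    fn main() {
        let runtime = format!("{}/runtime", env::var("CSMITH_HOME").unwrap());

        loop {
            // 1. Generate a random program; skip it unless GCC, our reference compiler, accepts it.
            let program = match run("csmith", &[]) {
                Some(p) => p,
                None => continue,
            };
            fs::write("test.c", &program).unwrap();
            if run("gcc", &["-I", runtime.as_str(), "-o", "test_gcc", "test.c"]).is_none() {
                continue;
            }

            // 2. Translate to Rust with Corrode and compile with rustc; a failure
            //    in either step may be an interesting Corrode bug.
            if run("corrode", &["-I", runtime.as_str(), "test.c"]).is_none()
                || run("rustc", &["-o", "test_rust", "test.rs"]).is_none()
            {
                eprintln!("translation or rustc failure -- keeping test.c");
                break;
            }

            // 3. Run the GCC-compiled program; if it misbehaves, discard this case.
            let c_output = match run("./test_gcc", &[]) {
                Some(o) => o,
                None => continue,
            };

            // 4. Run the Rust-compiled program and compare its output to the reference.
            if run("./test_rust", &[]) != Some(c_output) {
                eprintln!("output mismatch -- keeping test.c");
                break;
            }
        }
    }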

    Once the script has produced an interestingly-buggy test case, then it tries to delete as much code from the test as possible while preserving the bug that made it interesting. C-Reduce does the heavy lifting for this part, using a variety of heuristics starting with "what happens if I delete this line? No? How about this one?"

    Unlike the Csmith test cycle, C-Reduce can easily run for an hour or two trying to find chunks that are OK to delete. At the moment this is my only complaint about C-Reduce: I have so many possible bugs to investigate that spending an hour or two waiting on each one feels like a lot. That said, it's doing a very good job of narrowing down the portions of code actually responsible for bugs, so I guess I'll keep it. :-)

    So far I've found and fixed three bugs in Corrode using Corrode's csmith-test script.
    • I applied C99's "usual arithmetic conversions" to the left-shift and right-shift operators, but I should instead have applied the "integer promotions" to both operands and then taken the left-hand side's type. Fixed in commit 3fd9040.
    • I never realized that the type of a C99 integer literal depends not only on whether it has 'U' or 'L' suffixes, but also on whether it's written in decimal (as opposed to octal or hex), and also on whether its value fits in a given type. Fixed in commit 93671a7.
    • Apparently, I find the "usual arithmetic conversions" hard to get right. I've "fixed" them in three commits in a row and Csmith keeps finding new corners for me.
    My big problem with the usual arithmetic conversions is that I'm trying to translate C's long to Rust's isize, but in various corner cases that means the translated program can behave differently than it would under a standard-conforming C compiler. I may just decide that's a "feature"...
    I have not found any cases yet where I could blame rustc for any wrong output, but you never know... it could happen!
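    To make the first two of those rules concrete, here's a hedged sketch of what a faithful translation looks like; the variable names and exact casts are illustrative, not Corrode's actual output, and the C types assume a typical platform with 32-bit int and 64-bit long:

    // Illustration only -- not Corrode's actual output.
    fn main() {
        // C:  unsigned char c = 200;  long s = 1;
        //     int r = c << s;
        // The left operand is promoted to int and the result takes that type;
        // the right operand's type doesn't matter (unlike the "usual arithmetic
        // conversions" used for operators such as + and *).
        let c: u8 = 200;
        let s: i64 = 1;
        let r: i32 = (c as i32) << s;
        println!("{}", r); // 400

        // C integer literals: 0x80000000 fits in unsigned int, so that is its
        // type; the decimal constant 2147483648 never becomes unsigned and
        // ends up as long instead.
        let hex: u32 = 0x8000_0000;
        let dec: i64 = 2147483648;
        println!("{} {}", hex, dec);
    }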

    Building real software: musl libc

    The musl library is a from-scratch implementation of the C standard library, which "is lightweight, fast, simple, free, and strives to be correct in the sense of standards-conformance and safety."

    Adam Perry has been manually porting musl to Rust as an exercise to discover "the advantages and pitfalls of a C-to-Rust rewrite project," so I thought it might be a good candidate for testing Corrode.

    Of particular importance for my purposes, musl doesn't use system headers, which means that, compared with other real C codebases, it exercises fewer of the parts of C that Corrode doesn't support yet.

    To test Corrode on musl, I first made the corrode executable accept all the options that GCC/Clang do (more or less), and just pass anything it doesn't understand to the C preprocessor.

    With that, you can run "make CC=corrode -k" and get a "foo.rs" file alongside every "foo.c" file that Corrode was successfully able to translate! The "-k" option tells make to keep going as long as possible even when some commands fail.

    At the moment, Corrode translates about 10% of the source files in musl, which I feel is pretty good! I was especially excited when I got it to translate bsearch.c and a bunch of the string.h functions, since those do funny things with pointers...

    Now, I haven't yet tested whether any of these source files translate correctly, and in fact I know a few of them don't (if they have static local variables, or declare arrays). At the moment all I can tell you is that the output looks mostly reasonable, in my human judgment.

    At some point I'd like to integrate Adam Perry's build and test scripts from his manual rewriting effort, which would let me validate that Corrode+rustc generate code that passes musl's test suite.

    Software Analysis Workbench

    "The Software Analysis Workbench (SAW) provides the ability to formally verify properties of code written in C, Java, and Cryptol. It leverages automated SAT and SMT solvers to make this process as automated as possible, and provides a scripting language, called SAW Script, to enable verification to scale up to more complex systems."

    Csmith-generated programs print some computed values just before exiting, and that's all you can observe about them during testing. If there are errors mid-computation but they cancel out by the end, or otherwise have no effect on the final output, Csmith-based testing won't find those.

    SAW, by contrast, is designed to show that two functions compute equivalent outputs. In principle, we could use SAW to prove that the code clang generates is equivalent to the code which Corrode+rustc generate on a per-function basis.

    Unfortunately, current git master of SAW can't parse the LLVM bitcode which rustc produces. I hear SAW's maintainers are working on that, so I'm hoping this will be an effective option for Corrode testing soon!

    Unit tests and randomized property testing

    You were probably wondering why I haven't mentioned these yet, weren't you?

    Csmith was low-hanging fruit: It has proven to be effective at finding bugs in Corrode with very little work on my part. And musl has quickly shown me which C constructs are used by a fairly substantial real-world code base, so I could focus on implementing those.

    Unit tests and QuickCheck-style property testing, on the other hand, require a lot more thought from the developer. I just haven't put in that level of thought yet.

    As Corrode matures, I think we'll need both focused regression tests and randomized property tests that run on every commit, supplemented by periodic longer-duration Csmith runs.

    Conclusions

    Corrode doesn't support enough of C yet to translate most real programs, largely due to system headers declaring things like unions and enums even if your program never uses them. But for the subset of C which Corrode does support, at this point I feel pretty good about the correctness of its translation!