Friday, December 9, 2016

How to translate a large C project to Rust

In October, I started working on translating CVS from C to Rust, and today I'd like to answer these questions about how that's going:
  • How does a Corrode-aided porting effort work, in practice?
  • How much progress have I made on CVS, and how hard would it be to do the same for other projects today?

How to translate a new project using Corrode

Here's the process I've followed while working on translating CVS to Rust. You can find the scripts and patches described here in my cvs-rs repository.

0. Does it build?

Before doing anything else, I made sure that I could build CVS from unmodified source using GCC. This is important! If it doesn't work with a standard C compiler, there is absolutely no way Corrode is going to give you good results. Or as Charles Babbage put it:
On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
If the project you're translating has a test suite, this is also a good time to check that the test suite passes before you start changing things!

1. Trial run

Next, to get a rough idea of how much work the translation will eventually require, I tried just substituting Corrode and rustc in place of GCC in the build. Most build systems for open source projects make it easy to do that, usually by setting a "CC" variable to the path you want to use as your C compiler. In the case of CVS, I did it this way:
make CC=/path/to/corrode/scripts/corrode-cc
corrode-cc is a wrapper script that does these steps:
  1. Run Corrode to translate one C source file to a standalone Rust module.
  2. Run rustc on that Rust module, asking it to treat it as a dylib-type crate but to only emit an object file.
  3. If either of those steps fails, save the error message and then run GCC to get an object file anyway.
So the result of this step is that we have a bunch of Rust modules lying around, plus a bunch of files recording various error messages. Hopefully the build completed successfully, and if we're lucky, there's even some Rust-compiled code in it, and if we're really lucky, the new binary still works!
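To make those three steps concrete, here's a rough sketch of the same fallback logic written in Rust (the real corrode-cc in the Corrode repository is a script, its exact flags differ, and it also captures the error output into the errors-<hash> files described below, which this sketch omits; the "module.rs" name is just a placeholder):

    use std::env;
    use std::process::Command;

    fn main() {
        // The arguments the build system would normally hand to the C compiler.
        let args: Vec<String> = env::args().skip(1).collect();

        // Step 1: ask Corrode to translate the C source file into a Rust module.
        let corrode_ok = Command::new("corrode")
            .args(&args)
            .status()
            .map(|s| s.success())
            .unwrap_or(false);

        // Step 2: if that worked, compile the Rust module as a dylib-type crate,
        // but only emit an object file.
        let rustc_ok = corrode_ok
            && Command::new("rustc")
                .args(&["--crate-type=dylib", "--emit=obj", "module.rs"])
                .status()
                .map(|s| s.success())
                .unwrap_or(false);

        // Step 3: if either step failed, fall back to GCC so the build keeps going.
        if !rustc_ok {
            let status = Command::new("gcc")
                .args(&args)
                .status()
                .expect("failed to run gcc");
            std::process::exit(status.code().unwrap_or(1));
        }
    }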

To estimate how much work the translation is going to take, you can look at several factors:
  • How many distinct errors did the build produce? The corrode-cc script writes error messages to files named "errors-<hash>", where the hash is over the error message itself, so if multiple files run into identical errors in some header file that they all include, that error will only show up once. Error messages:
    • may indicate that the project relies on a feature of C that Corrode does not yet handle,
    • or may indicate a bug in Corrode,
    • or may indicate code that Rust can't verify is safe.
  • How many object files (*.o) did the build produce?
  • How many Rust modules (*.rs)? In the best case, there will be one for each object file, but currently Corrode fails to translate a variety of C patterns, and whenever Corrode fails it refuses to generate any output. Those cases may indicate that you'll need to patch the C source to make it easier to translate. This can be tricky.
  • How many of the object files were compiled via Rust? If the corrode-cc wrapper script had to fall back to compiling via GCC, then there are usually small edits you can make to the Corrode-generated Rust to make rustc accept it. This is tedious but generally pretty easy.

I found that the easiest way to check whether an object file was compiled using GCC or Rust was to check if the string ".rs" appears in it:
grep -lF .rs *.o
(Of course this might have false positives, so if you have a better approach please let me know!)

Based on these results you should get some idea how close Corrode alone will get you, and how much manual work you'll need to do to complete the translation.

2. Integrate Rust into the build system

For the CVS case study, I wanted to use Corrode as if it were just another C compiler. So the C source is the canonical implementation, and I patched the build system to compile some source files via Rust instead.

Which source files should you do this to first? Maybe pick just one that worked in step 1.

For CVS, this amounted to making the following edits to src/Makefile.am:
  • I deleted a selected subset of C sources from the cvs_SOURCES variable.
  • I created a RUST_SOURCES variable: for example, when I removed "checkin.c" from cvs_SOURCES, I added "checkin.rs" to RUST_SOURCES.
  • I added these rules:
    %.rs: CC=corrode
    %.rs: %.c
            $(COMPILE) -c $< >/dev/null
    
    .PHONY: rust-sources
    rust-sources: $(RUST_SOURCES)
    
    libcvs-rs.a: lib.rs $(RUST_SOURCES)
            rustc -A bad-style -A unused-mut -g -O \
            -C debug-assertions=on --crate-type=staticlib \
            -o $@ $<
    
  • Finally, I added libcvs-rs.a to cvs_DEPENDENCIES, and "libcvs-rs.a -ldl -lpthread -lgcc_s -lc -lm -lrt -lutil" to cvs_LDADD.
Also, I created a top-level Rust module in src/lib.rs which just re-exports declarations from the translated modules. So if checkin.rs is in RUST_SOURCES, lib.rs contains a "pub use checkin;" item.
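For illustration, here's roughly what that top-level module can look like (a minimal sketch; the real src/lib.rs in cvs-rs may differ, and any module name other than checkin is made up):

    // src/lib.rs: pull each translated file into the crate so rustc links it
    // into libcvs-rs.a. Add one line per entry in RUST_SOURCES.
    pub mod checkin;   // from checkin.rs
    // pub mod version;   // hypothetical: the next translated file would go here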

Note that I split out a phony target just for ensuring that all the Rust sources have been built. That allowed me to split the build process into phases that run before and run after Corrode:
  1. Apply patches to the C source as needed that make it easier to translate.
  2. Run "make rust-sources" to auto-generate a rough version of each selected C source file.
  3. Apply additional patches, this time to the generated Rust, as needed to improve the translation.
  4. Run "make" to complete the build.
As Corrode and related tools improve, there should be less need for patches to either the C or the Rust source. If someday we can fully automate this process, then this multi-phase build approach can go away entirely.

I'm using quilt to manage the collections of patches. I have two reasons for doing it this way, and maybe neither will apply to you:
  • I wanted people to be able to learn from the cvs-rs repository, so I'm using the patch series as a way of communicating aspects of the process I'm following. If you're just doing a one-off conversion, you don't necessarily need to document the steps you took along the way.
  • Corrode is still under active development, so I'm frequently re-running it. Recording all the manual changes I'm making in separate patches makes it easier for me to manage my work-in-progress.

3. Translate more, or translate better

With that foundation in place, now comes the fun part: namely, "everything else!"

My current process for translating CVS involves doing either of these two tasks, over and over:
  • Pick a new C source file, move it to the build-via-Rust list, and see if it works.
  • Pick some piece of generated Rust, and see if I can improve it (by making it safer or more idiomatic).
You can keep doing these steps until there's nothing left to do, or you get bored.

Status

So far, I have translated 6.4% of the non-comment, non-blank lines in the src/ subdirectory of CVS, from 10 source files.

Sometimes, translating a thousand-line source file has taken 10 minutes. Other times, I've spent an entire afternoon comparing the generated Rust to the original C without spotting any differences, and yet the Rust version doesn't pass the test suite.

So there's more work to be done on Corrode, to make it reliably convert as many kinds of C source as possible. At this point, I'm going back to improving Corrode for a bit, rather than focusing on translating more of CVS.

Still, if you're interested in trying Corrode, I'd encourage you to try going through at least step 1 on whatever project you think is interesting. See how far you get, and if you find a project where Corrode works well, I would love to hear about it!


    Wednesday, November 16, 2016

    Go To Statement Considered (Mostly) Harmless

    While this post's title is a mash-up of "Hitchhiker's Guide to the Galaxy" with Dijkstra's 1968 letter, "Go To Statement Considered Harmful", today I have a bunch of Corrode-related news to share, not just commentary on "goto". But let's start there, because I'm super proud of this...

    Translating C's "switch" and "goto" to Rust

    Six months ago I wrote 'How to eliminate "goto" statements', about this question:
    C supports "goto" statements. Rust does not. Can programs with "goto" statements be turned into programs without them?
    Over the last two weeks I've finally gotten to write code tackling this problem. (Thanks again to Mozilla for supporting this project! I could not afford to dedicate the time this took without their support.)

    The result: Corrode now understands all forms of C statements except for the GCC extensions for "local labels", "computed goto", and inline assembly. The last holdouts were "goto" statements, obviously, as well as "switch" statements, which in C can lead to similarly complex control flow.

    I'm so excited, because this is a huge milestone! Lots of C code uses "switch" statements, and regardless of Dijkstra's opinions, lots of C code uses "goto" as well. None of the remaining GCC extensions to statement syntax are used widely enough to have nearly as much impact.

    The implementation looks very little like I expected it would six months ago. General implementation strategy:
    Gee, translating "goto" statements is hard. I know! I'll turn everything into "goto" statements, and then turn the resulting mess back into structured Rust. That'll be easier!
    If you're interested in the details, check out the re-structuring implementation. It's a 12-page paper describing the algorithms and invariants, with code interspersed in literate programming style, like the rest of Corrode. (And if you find any of the explanations confusing, I would love to receive pull requests making the documentation easier to understand!)

    It is still possible to write C programs using "goto" or "switch" that Corrode can't translate to Rust yet, but they're less common than you might expect. They fall into two categories:
    • If it's possible to enter a loop at two or more different places, the control flow is called "irreducible" and Corrode can't translate it, yet, because it fundamentally can't be represented in Rust without duplicating code or introducing new variables. The classic example of this is "Duff's Device". Many algorithms in compilers have trouble with irreducible control flow, so Corrode is not alone in this. But in "A study of irreducibility in C programs" from 2010, Stanier and Watson found that less than 0.05% of functions had irreducible control flow, across 15 large open source projects. So this is not a big concern, although I have thoughts on how to handle it anyway.
    • The more common case involves certain "goto fail"-style control flow. A surprisingly wide range of that style actually works just fine when Corrode translates it to Rust; I haven't yet figured out how to characterize the cases that don't.
    One more challenge with the new implementation: It produces very noisy Rust.
    • The output is full of extra "break" and "continue" statements in places they aren't strictly necessary.
    • Every loop has a label, in case there are any multi-level exits that need to refer to it, and every "break" or "continue" uses the appropriate loop label even if it doesn't need to.
    • Corrode no longer generates "while" loops, just infinite "loop" statements with a conditional "break" at the beginning.
    All three of those problems should be fixable, but I needed to get the basic algorithm right first before figuring out how to produce more readable Rust.
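    To illustrate all three points, here's a hand-written example in the style of that output (not literal Corrode output): a simple C counting loop such as "while (i < n) i++;" currently comes out looking roughly like this:

        fn count_up(n: i32) -> i32 {
            let mut i: i32 = 0;
            'loop1: loop {
                if !(i < n) {
                    break 'loop1;
                }
                i = i + 1;
                continue 'loop1;
            }
            i
        }

        fn main() {
            println!("{}", count_up(5));
        }

    An idiomatic version would just be an unlabeled "while i < n" loop, which is exactly the kind of cleanup those three fixes would enable.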

    Progress translating CVS to Rust

    Last time I talked about why CVS is my current Corrode case study. Between the above work on goto/switch and a bunch of smaller fixes, the "Rustification" of CVS is going pretty well!

    Corrode is now able to produce Rust output of some sort for about 64% of the C source files in CVS. (I'm not counting C source for other platforms that doesn't get built on my system; I get 91 .rs files when I build CVS with Corrode, compared to 143 .o files in the build overall.)

    Of the remaining errors causing Corrode to fail on CVS source files, roughly one-third are due to over-complicated control flow. Another third are due to various uses of C-style union types; see Corrode issue #22 for background. (I need to work out how to merge the partial implementation of unions from pull request #63!)

    The rest are mostly things that will probably never work in Rust, but these usually can be manually fixed in the C source before running Corrode. Some examples:
    • Enum-typed global variables without an explicit initializer are a problem because C's default zero-initialization policy doesn't make sense in Rust, where enum types aren't fully isomorphic to integers. I suspect I'll edit the CVS source to provide initializers for these.
    • Rust doesn't allow defining functions that take a variable number of arguments, but the CVS code base defines several. Those functions will just have to stay in C, or their callers will need to be rewritten to avoid the varargs-style API.
    • One CVS header declares a function pointer that takes exactly 3 arguments, but actually expects the corresponding function to take a variable number of arguments. C allows passing more arguments than declared, but Rust doesn't. I expect that changing the declaration to explicitly accept a variable number of arguments will fix the immediate problem, although then of course I'll run into the varargs problem from the previous point.
    • Rust doesn't support C's `alloca` function for variable-sized stack allocations. I guess I'll probably carefully replace it with `malloc`/`free` pairs in the right places, and then try to clean up the resulting Rust.
    Once Corrode has generated a pile of Rust source files, then we get to find out how many of them the Rust compiler will actually accept. At the moment, the answer is about 44%. Here's what goes wrong with the other 56%:
    • Nearly 75% of the errors in the generated Rust are "use of possibly uninitialized variable" complaints.
      • This is often because some uninitialized variable is passed by reference to a function that only uses it as an out-parameter, which means it's safe, but the compiler can't tell that. I think the right option is to refactor functions with out-parameters to return tuples in Rust, optionally with a C compatibility wrapper if the function still needs to be callable from the remaining C code (see the sketch just after this list).
      • Another class of "uninitialized variable" errors is because the C program assigned to all of the fields of a struct individually, but Rust can't conclude that the struct itself is initialized. This case should be fixable by using C99 structure literals in the pre-Corrode source, or hacking up the Rust output to use struct constructors.
    • 16% of error messages are due to issue #77, which is that Corrode doesn't translate nullable function pointers in a way that the Rust compiler will accept. (I had assumed I could treat them the same as raw pointers, which are nullable and allow arbitrary casts, but nope; in Rust, function pointers are strongly type-safe and memory-safe.)
    • 3.3% are due to Rust not supporting as many kinds of constant expressions as C does. In particular, I think it ought to be possible to evaluate if-expressions, calls to "std::mem::size_of", and calls to "as_ptr" on types such as "str", so long as their arguments or operands are all constant. (I haven't checked exactly what current nightly Rust allows, yet, but I don't think this has changed.)
    • 2.3% are due to issue #92, which is that Rust doesn't allow a parameter or variable binding to have the same name as a static variable that's in scope. I'm not sure I quite understand why Rust does this, so if somebody could explain what's going on in that issue I'd appreciate it!
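    Here's the kind of out-parameter refactoring I have in mind, as a hand-written sketch with made-up names (not code from CVS):

        // After a direct translation, the C shape is something like
        //     unsafe fn parse_len(input: *const u8, out_len: *mut usize) -> c_int
        // and every caller has an uninitialized `len` local that rustc rejects.
        // Returning the results as a tuple avoids the uninitialized local entirely:
        fn parse_len(input: &[u8]) -> (i32, usize) {
            // stand-in for the real computation
            if input.is_empty() { (-1, 0) } else { (0, input.len()) }
        }

        fn main() {
            let (status, len) = parse_len(b"example");
            println!("status={} len={}", status, len);
        }

    A thin extern "C" wrapper with the old pointer-based signature can then keep the function callable from the C code that hasn't been translated yet.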
    The last few issues are pretty evenly divided into these categories:
    • The new algorithm for handling arbitrary control flow doesn't yet try to place variable declarations correctly, so a variable that was in scope in the original C may not be in scope in the generated Rust. It happens to almost always work out correctly anyway, so I'm not too worried about it yet, but I'm tracking it in issue #93.
    • Corrode sometimes translates uses of C-style unions that it isn't actually able to handle correctly, but the error isn't detected until the Rust compiler tries to figure out what to do with it. This will be fixed by just fully implementing unions.
    • When dealing with enum types, Corrode currently adheres a little too closely to the letter of the C standard and aggressively int-promotes enum values. Several errors would be fixed if we left values in their enum types for as long as possible.
    Over time, I'd love to automate as many of these fixes as possible. But in the short term I'll probably just manually fix each issue so I can find out what other challenges await.

    Next up

    Now that I have a fairly substantial chunk of generated Rust to play with, I'm going to start scrutinizing it to find interesting patterns. Based on what I find, I'll write up advice for people who want to turn unsafe C-style Rust into safe, idiomatic Rust, and if we're lucky, somebody will be able to automate some of that advice.

    Until then, I invite you to:
    • read the implementation of the C translator and the control-flow graph algorithms;
    • try running Corrode on your own programs and see how well it does;
    • and contribute documentation, bug reports, or code if you can!

    Friday, October 28, 2016

    Corrode update: support from Mozilla, and new features

    It's been a while since I've written about Corrode; I've just been quietly working on it and merging contributions from several other people. I have a few things I'd like to mention today.

    Funding from Mozilla

    The big news: Mozilla is providing funding for me to work on Corrode for the next few months! I'm super excited to get to work on some of the trickier challenges in translating C to Rust, such as eliminating "goto" statements. More on that later, but first, let me tell you about the specific goal that Mozilla is providing support for.

    One challenge for computing today is bit-rot in software that nobody wants to maintain any more, but that's still useful to some community. (This is a problem for both open-source and proprietary software, but let's focus on the open-source side as there are fewer roadblocks for a person who's motivated to do something about it.) When users discover bugs and security flaws in unmaintained software, they're often just out of luck.

    Mozilla has an interest in finding better ways for people to build software, whether that's through better programming tools like Rust, or more effective ways of organizing an open-source community. So the experiment I'm working on is to see if it makes sense to "rescue" unmaintained open source projects that were written in C by semi-automatically translating them to Rust.

    To find an interesting case study, I looked through Debian's list of orphaned packages to find projects that have a reasonable amount of C source. I especially wanted to find a project with a network component, where security flaws may be remotely exploitable and Rust's safety guarantees therefore have more impact. Then I cross-checked Debian's "Popularity Contest", an opt-in census of which packages are installed on Debian systems, to pick a package that's relatively widely-installed despite being unmaintained.

    The package I settled on: CVS, the venerable "Concurrent Versions System". The first release of CVS was on November 19, 1990, almost 26 years ago! (This is not the oldest codebase I've worked on, but it's close.) The last upstream release of CVS was in 2008, even though it had a security vulnerability discovered in 2012. This is clearly 50k lines of unmaintained C.

    CVS was largely supplanted by Subversion a decade ago, which in turn has largely been supplanted by git today. Anyone choosing CVS for a new project would get asked some pointed questions. But there are still tons of open source projects whose history is only available via CVS; you can find plenty of examples hosted at sites like Sourceforge or Savannah. And according to the Popularity Contest statistics, 6.5% of Debian users still have CVS installed. So there's value in keeping the CVS implementation alive.

    I'm currently working on improving Corrode to be able to translate as much of CVS as I can manage. After that I'll use the generated Rust source to study how much manual work is needed to clean up Corrode's output, and see if there are opportunities for automating more. This effort will wrap up by the end of the year, and hopefully I'll have plenty to report by then.

    Thanks so much to Mozilla for supporting this work!

    Recent changes

    I want to take a moment to thank all the people who have contributed to Corrode so far. I've merged pull requests from: Alec Theriault, Nathan Bergey, Vickenty Fesunov, Fabian Zaiser, Jeff Waugh, Jeremie Jost, Nabil Hassein, Amin Bandali, Getty Ritter, Robert Grosse, Sean Jensen-Grey, and Taylor Cramer. Thanks also to everyone who has filed bug reports, or commented on them to clarify details about Rust or C!

    It's now possible to use Corrode as a replacement for GCC, with the corrode-cc script from the Corrode source tree. There are still a lot of bugs, but it's good enough that I'm usually testing Corrode now by running `make CC=corrode-cc` in various source trees, including CVS and musl-libc.

    All C control-flow statements are implemented now, thanks to Fabian finishing off do-while loops... with the significant exceptions of switch and goto statements, which I'm working on in a new 'cfg' branch. (See the first post I wrote about Corrode for background; I've been looking forward to this part for months!)

    I've fixed a bunch of small issues that either made Corrode give up during translation due to not understanding the input, or made it generate invalid Rust output. The biggest change was to only translate declarations that are actually needed for the current translation unit. C header files are full of all sorts of junk, but if you don't actually reference a declaration then Corrode doesn't need to report any translation errors for that declaration. Fixing this let me translate my first 14 source files in the Linux kernel, so that was pretty exciting!

    One change motivated specifically by my work on CVS: Corrode now handles K&R-style function definitions. Yes, CVS still uses pre-C89 style code in some places, while using C99 features in other places. That'll happen in a project of this age.

    One interesting bug was around calling through function pointers stored in struct fields. In Rust, `s.f()` calls a trait method named "f" on "s". To call a function pointed to by a field named "f", you need parentheses, like so: `(s.f)()`. The pretty-printer for Corrode's Rust AST now inserts these parentheses as needed.
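    Here's a minimal reproduction of that rule (hypothetical names, not CVS code):

        struct Handlers {
            f: fn() -> i32,
        }

        fn forty_two() -> i32 { 42 }

        fn main() {
            let s = Handlers { f: forty_two };
            // `s.f()` would look for a method named `f` on `Handlers`; the extra
            // parentheses select the field first and then call through it.
            let result = (s.f)();
            println!("{}", result);
        }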

    Another pretty-printer bug is something I tried to squash before, but hopefully got right this time: blocks in expressions. Rust's parser effectively automatically inserts semicolons after blocks under certain circumstances, so programmers don't need to follow up every if-statement or for-loop with a semicolon. If we want to generate something like this:
    if c { t } else { f } as f32;
    (which is a weird thing to do, but cases like this keep coming up) then we need to wrap the if-statement in parentheses to keep Rust's parser from treating the end of the else-branch as the end of the statement. However, we don't want to insert parentheses when they aren't needed, because sometimes Rust will actually warn that there are unnecessary parentheses! Fortunately I think I have parentheses in exactly the right places now.
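    Concretely, the parenthesized form that Corrode needs to emit looks like this (a toy example with i32 values standing in for t and f):

        fn main() {
            let c = true;
            let (t, f) = (1i32, 0i32);
            // Without the parentheses, the parser would end the expression statement
            // at the else-branch's closing brace and then reject the trailing `as f32`.
            (if c { t } else { f }) as f32;
            println!("t={} f={}", t, f);
        }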

    And, because it was driving me mad, I fixed indentation for pretty-printing if-else ladders. They used to zig-zag further to the right, and sometimes some branches would all be on one line while other branches were split across multiple lines. Fixing this had no effect on correctness of the output but it's substantially more readable now!

    Coming up next...

    I'm preparing a new repository for the CVS source code that will include scripts to let anyone reproduce my progress on that translation. I'm having a little trouble scripting some of the steps I do, so that's taking longer than I hoped, but it'll happen at some point.

    A significant fraction of the source files in CVS contain either switch or goto statements, so my major short-term focus is on translating those control-flow constructs correctly. If you'd like to help, check out the discussion around goto and switch.

    I'll post progress reports like this regularly. Keep an eye out!

    Friday, July 15, 2016

    Testing strategies for Corrode

    Last week I posted about Corrode, my C-to-Rust translator.

    I'm quite proud of Corrode, to be perfectly frank! And apparently other people think it's cool too, since it was on the front page of Hacker News and /r/rust for a solid day, and it spent the subsequent week on GitHub's list of trending projects. The past week has been very exciting, I gotta say!

    A moment of reflection, then. There are a few things an open source project really ought to have:
    1. Responsive maintainers. People have been trying Corrode on their own code and filing issues, and a few people have been busily submitting pull requests. I've been responding to these as quickly as I can.
    2. Solid documentation. I think documentation is pretty well covered, considering some of the feedback:
      • "Absolutely blown away by the detail of the documentation."
      • "The documentation is amazing."
      • "That's an impressive use of literate programming."
      • "WOW... This is just amazing. The most important part of the source is highly educative literate haskell."
      • "Who are you and what did you do with Jamey?" —a former co-worker who had to maintain my code after I left
    3. Verification/validation. Um... well... did you see my documentation?
    So it's time for a real plan for verifying that Corrode does what it's supposed to. I've explored several strategies so far, which have interesting trade-offs.

    Randomized testing with Csmith and C-Reduce

    "Csmith is a tool that can generate random C programs that statically and dynamically conform to the C99 standard. It is useful for stress-testing compilers, static analyzers, and other tools that process C code."

    Corrode now has a script in scripts/csmith-test which performs the following steps:
    1. Generate a random C program using Csmith, and verify that the program compiles using GCC, as our reference compiler. If either of these fail, we won't discover any interesting Corrode-related bugs with this program, so discard it.
    2. Convert the program to Rust using Corrode, and compile it with rustc. If either of these fail, this may indicate an interesting bug; report it to the user.
    3. Run the GCC-compiled program. If it times out, or exits with an error, or produces output that doesn't look right for a Csmith-generated program, then odds are low that we'll learn anything from the Rust version and we just discard this case.
    4. Run the Rust-compiled program and compare its output to the GCC-compiled version. If they're different, that probably indicates an interesting bug; report it to the user.
    Usually it only takes a few seconds to run through that process; but on the other hand, usually these tests all pass :-) so I've been running this script in a loop which stops once an interesting failure is actually found.

    Once the script has produced an interestingly-buggy test case, then it tries to delete as much code from the test as possible while preserving the bug that made it interesting. C-Reduce does the heavy lifting for this part, using a variety of heuristics starting with "what happens if I delete this line? No? How about this one?"

    Unlike the Csmith test cycle, C-Reduce can easily run for an hour or two trying to find chunks that are OK to delete. At the moment this is my only complaint about C-Reduce: I have so many possible bugs to investigate that spending an hour or two waiting on each one feels like a lot. That said, it's doing a very good job of narrowing down the portions of code actually responsible for bugs, so I guess I'll keep it. :-)

    So far I've found and fixed three bugs in Corrode using Corrode's csmith-test script.
    • I applied C99's "usual arithmetic conversions" to the left-shift and right-shift operators, but I should instead have applied the "integer promotions" to both operands and then taken the left-hand side's type. Fixed in commit 3fd9040.
    • I never realized that the type of a C99 integer literal depends not only on whether it has 'U' or 'L' suffixes, but also on whether it's written in decimal (as opposed to octal or hex), and also on whether its value fits in a given type. Fixed in commit 93671a7.
    • Apparently, I find the "usual arithmetic conversions" hard to get right. I've "fixed" them in three commits in a row and Csmith keeps finding new corners for me.
    My big problem with the usual arithmetic conversions is that I'm trying to translate C's long to Rust's isize, but in various corner cases that means you can distinguish between Corrode and a standard-conforming C compiler. I may just decide that's a "feature"...
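    To make the first of those shift bugs concrete, here's a sketch of the rule in Rust terms (my own example, not one of the Csmith-generated test cases):

        fn main() {
            let uc: u8 = 200; // plays the role of a C `unsigned char`
            let ul: u64 = 3;  // plays the role of a C `unsigned long`
            // In C, `uc << ul` applies the integer promotions to each operand and then
            // takes the promoted *left* operand's type, so the result is an int (1600).
            // The buggy "usual arithmetic conversions" version would instead have used
            // the common type of both operands (unsigned long).
            let shifted: i32 = (uc as i32) << ul;
            println!("{}", shifted);
        }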
    I have not found any cases yet where I could blame rustc for any wrong output, but you never know... it could happen!

    Building real software: musl libc

    The musl library is a from-scratch implementation of the C standard library, which "is lightweight, fast, simple, free, and strives to be correct in the sense of standards-conformance and safety."

    Adam Perry has been manually porting musl to Rust as an exercise to discover "the advantages and pitfalls of a C-to-Rust rewrite project," so I thought it might be a good candidate for testing Corrode.

    Of particular importance for my purposes, musl doesn't use system headers, which means that, compared with other real C code-bases, it makes less frequent use of the parts of C that Corrode doesn't support yet.

    To test Corrode on musl, I first made the corrode executable accept all the options that GCC/Clang do (more or less), and just pass anything it doesn't understand to the C preprocessor.

    With that, you can run "make CC=corrode -k" and get a "foo.rs" file alongside every "foo.c" file that Corrode was successfully able to translate! The "-k" option tells make to keep going as long as possible even when some commands fail.

    At the moment, Corrode translates about 10% of the source files in musl, which I feel is pretty good! I was especially excited when I got it to translate bsearch.c and a bunch of the string.h functions, since those do funny things with pointers...

    Now, I haven't yet tested whether any of these source files translate correctly, and in fact I know a few of them don't (if they have static local variables, or declare arrays). At the moment all I can tell you is that the output looks mostly reasonable, in my human judgment.

    At some point I'd like to integrate Adam Perry's build and test scripts from his manual rewriting effort, which would let me validate that Corrode+rustc generate code that passes musl's test suite.

    Software Analysis Workbench

    "The Software Analysis Workbench (SAW) provides the ability to formally verify properties of code written in C, Java, and Cryptol. It leverages automated SAT and SMT solvers to make this process as automated as possible, and provides a scripting language, called SAW Script, to enable verification to scale up to more complex systems."

    Csmith-generated programs print some computed values just before exiting, and that's all you can observe about them during testing. If there are errors mid-computation but they cancel out by the end, or otherwise have no effect on the final output, Csmith-based testing won't find those.

    SAW, by contrast, is designed to show that two functions compute equivalent outputs. In principle, we could use SAW to prove that the code clang generates is equivalent to the code which Corrode+rustc generate on a per-function basis.

    Unfortunately, current git master of SAW can't parse the LLVM bitcode which rustc produces. I hear SAW's maintainers are working on that, so I'm hoping this will be an effective option for Corrode testing soon!

    Unit tests and randomized property testing

    You were probably wondering why I haven't mentioned these yet, right?

    Csmith was low-hanging fruit: It has proven to be effective at finding bugs in Corrode with very little work on my part. And musl has quickly shown me which C constructs are used by a fairly substantial real-world code base, so I could focus on implementing those.

    Unit tests and QuickCheck-style property testing, on the other hand, require a lot more thought from the developer. I just haven't put in that level of thought yet.

    As Corrode matures, I think we'll need both focused regression tests and randomized property tests to test at every commit, which can be supplemented by periodic longer-duration Csmith runs.

    Conclusions

    Corrode doesn't support enough of C yet to translate most real programs, largely due to system headers declaring things like unions and enums even if your program never uses them. But for the subset of C which Corrode does support, at this point I feel pretty good about the correctness of its translation!

    Friday, July 8, 2016

    Translating C to Rust (and how you can help)

    I've been working for a few months on a C to Rust source code translator called "Corrode", and it's starting to come together into something that looks promising!

    Corrode is kind of a weird project in that it's really not very much code—less than a thousand lines at its core!—and yet people expect it to be this huge scary thing. Admittedly, it has required a lot more thought per line of code than many programs do, so maybe you should be a little scared of the scope of a project like this.

    But...

    I'd like to convince you that you can contribute to Corrode's development, and in this post I'll talk about some reasons why it's easier than it sounds, and some of the ways I'm trying to make first contributions as painless as I can.

    (If you just want to read up on how Corrode does its magic, or learn about some bizarre corners of C, check out the detailed design documentation.)

    Type-directed development

    Corrode is implemented in Haskell. Wait, come back! It's no big deal, I swear: just about everything I wrote has a simple Rust equivalent, only with different syntax. The only reason I wrote Corrode in Haskell is that I couldn't find a complete enough C parser for Rust, and I was already familiar with the language-c parser for Haskell. So I believe that if you can write Rust, you can pick up enough Haskell to work on Corrode.

    In most compiler front-ends, the parser's job is to construct an "abstract syntax tree" (AST), and language-c is no different. What does an AST look like? Take a peek at the Language.C.Syntax.AST module. It defines a series of types, where each type represents some piece of C syntax. For example, a CExternalDeclaration represents one thing that can appear at the top-level of a C source file (that is, outside of any function), while a CExpression represents one operator and its operands (like "a + 2").

    So something you might wonder about a project like Corrode is: how can we tell when we're done? The neat thing is that these AST types tell us exactly what cases we have to handle before we'll be able to translate all legal C programs. Once we've written down a translation for every constructor of CExpression, for instance, there is no way for language-c to hand us some magic other kind of expression it hadn't previously told us about.
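    The same guarantee is easy to see in Rust terms (an illustrative toy, not language-c's actual types): matching on an enum must be exhaustive, so the compiler tells you exactly which cases remain.

        enum Expr {
            Literal(i64),
            Add(Box<Expr>, Box<Expr>),
            Neg(Box<Expr>),
        }

        fn eval(e: &Expr) -> i64 {
            // If Expr ever grew another constructor, this match would stop compiling
            // until we handled it, which is how "are we done yet?" gets answered.
            match e {
                Expr::Literal(n) => *n,
                Expr::Add(a, b) => eval(a) + eval(b),
                Expr::Neg(a) => -eval(a),
            }
        }

        fn main() {
            let two_minus_one = Expr::Add(
                Box::new(Expr::Literal(2)),
                Box::new(Expr::Neg(Box::new(Expr::Literal(1)))),
            );
            println!("{}", eval(&two_minus_one));
        }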

    As a result, the process of creating Corrode has been an almost mechanical cycle of finding one of the unimplemented cases, thinking about what the equivalent Rust should be for that case, and writing code to generate that Rust.

    Corrode defines a couple of common error-reporting functions, and one of them is called "unimplemented". One quick way to find some place you could contribute is to search for calls to that function and see if any of the missing cases looks like something you know how to tackle. (As of this writing, there are 11 such calls, although some are catch-all handlers for several unimplemented cases at once.)

    Open issues tagged "easy"

    Of course, some of the cases are not yet implemented because they're actually super difficult to handle correctly. For example, the switch and goto statements are in that category, for reasons I discussed in an earlier post (How to eliminate "goto" statements). Just searching for calls to unimplemented won't tell you whether you've picked one of the tricky cases.

    As an alternative, I've configured Corrode's issue tracker with a label called "easy", which I use on issues that I think are a good introduction to hacking on Corrode. I am deliberately avoiding implementing these cases, even though I know how I would implement them, except when they keep me from finishing something else.

    While I was a resident at the Recurse Center a few weeks ago (which was a wonderful experience, by the way, and highly recommended), some of the "Recursers" pair-programmed solutions to some of these easy-tagged issues with me. I was so happy to get to merge pull requests from Jeremie Jost (implement alignof) and Nabil Hassein (implement global variables). Jeremie also got a pull request merged in language-c for an issue we discovered along the way. These were all fairly small patches, and you could do this too!

    Literate programming

    Literate programming turns the usual emphasis of programming on its head. Instead of writing a bunch of machine-readable source code that has human-readable comments interspersed, in literate programming you write human-readable documentation that has some machine-readable source code scattered in it.

    For Corrode, there isn't very much source code, but the assumptions and rationale behind the implementation are intricate. I was going to write a detailed design document to help other people get started hacking on this project. But then I realized that I'd want the documentation to refer to specific parts of the implementation, and when reading the implementation I'd want the relevant parts of the documentation to be easy to find.

    The literate programming style is perfect for this! Check out the core source file of Corrode... which is just a Markdown document. My goal was to tell the story of why and how, and as you read that story the what is right there for easy cross-referencing.

    GitHub renders the Markdown nicely, including syntax highlighting for the code snippets. (And you can use pandoc to generate PDF or other formats from it too.) I also throw in snippets of C or Rust source, which help when explaining some of the quirks but are stripped out along with the rest of the non-Haskell text during compilation.

    If you want to use Markdown in your own Literate Haskell source, there are a couple of tricks to it:
    • Configure GHC to pre-process .lhs source files with markdown-unlit, by adding it to your .cabal file's build-depends and adding "-pgmL markdown-unlit" to your ghc-options. (That package's README documents how to turn your README into a test suite where all your code snippets are actually tested at build time, which is a very cool idea, but wasn't quite what I wanted in this case.)
    • Name your literate Haskell source files with a ".md" extension, so GitHub treats them as Markdown instead of as source code. Then create a symlink from "foo.lhs" to "foo.md" so GHC can find the module too.
    Of course, I know how all the code I wrote works, so there are likely sections where my explanation of what's going on doesn't actually clarify anything. I would dearly love to get pull requests that improve the documentation without changing the code at all. If you find some of the text confusing, see if you can find a better way to say it! This is an excellent way to gain a deeper understanding of how the code works, too. Huge thanks to Amin Bandali for sending in one documentation pull request already, explaining how to compile and run Corrode. (Oops, I should have thought of that!)

    Conclusion

    I would very much like to work with you to make Corrode into a tool that many C programmers can use to try switching to Rust. You can help:
    • by improving the documentation, so more people will feel comfortable contributing to and using Corrode;
    • by writing translations for C features which are not yet implemented, especially features I've tagged as "easy";
    • or just by trying it out on your own C source and reporting issues as they come up.
    I look forward to seeing your pull requests! :-)

    Tuesday, May 24, 2016

    Optimal optimization

    For most of the history of compilers, computers have been slow, and programmer time has been more valuable than program runtime. Compilers could, in theory, transform bulky and slow software into code that's faster and smaller than any human would write by hand, but the trade-off has rarely been worthwhile. Instead we get just those optimizations that always improve code, plus a few heuristically-guided optimizations that help sometimes but may miss opportunities for improvements.

    Some compilers are very explicit about their goals with regard to these trade-offs between compile time and run-time performance. Google's "Go" compiler aims for fast compile times and is less concerned about optimization, for instance. The recently open-sourced Chez Scheme compiler has the philosophy that "Compile time is important as well as run time, so compiler optimizations are generally expected to pay their own way, i.e., indirectly benefit compiler performance enough to cover the direct cost of the optimization." These are somewhat extreme examples, but generally compilers lean toward optimizations that don't require very much computation to apply.

    What might a compiler look like if we were willing to solve NP-complete problems in exchange for better generated code? In this post I'll start with the problem of optimal inlining, and explore other ideas in future posts.

    Inlining and its dual, procedural abstraction

    One important optimization implemented in many compilers is the ability to inline the body of a function into every place where that function was called.

    The immediate benefit of inlining is that a function call requires some extra instructions to set up a stack frame and then undo it when the function returns, so inlining eliminates those instructions and uses slightly less memory bandwidth.

    On the other hand, if the function is called more than once, then duplicating the instructions in that function could make the total program size substantially larger. In embedded systems, that could mean having to buy more expensive parts with more storage. On desktop and server CPUs, larger programs are less likely to fit in cache and could be significantly slower than an equivalent small program. So it's important to decide carefully whether inlining is worthwhile.

    Typically, compilers decide whether to inline based on heuristics. If the function is called only once, then it's pretty safe to just inline it. Otherwise, the compiler probably has some size limit: if the function is fewer than this many instructions, say, then we'll inline it.

    But inlining can have indirect benefits. If some calls to a function pass a constant value as one of its arguments, and the function chooses one of several branches based on that argument, then after inlining the compiler may be able to delete most of the function body as dead code.
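    A small made-up example of that indirect benefit:

        // `step` branches on its `mode` argument. At a call site that passes a
        // constant mode, inlining turns the branch into `if 0 == 0 { ... }`, which
        // constant folding collapses to one arm; the other arm becomes dead code.
        fn expensive(x: i32) -> i32 {
            (0..1000).fold(x, |acc, i| acc ^ i)
        }

        fn step(mode: u32, x: i32) -> i32 {
            if mode == 0 { x + 1 } else { expensive(x) }
        }

        fn main() {
            // After inlining and folding, this call is effectively just `41 + 1`.
            println!("{}", step(0, 41));
        }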

    We could try to adjust the heuristics to notice this case, but then some even more complicated situation will come up. In general, the only way to be sure you've found all the optimization opportunities is to try, for each call site, either inlining it or not. Since inlining one call may expose new opportunities for improving another call, in principle you'd have to try all 2^n combinations. (I think that may actually be under-counting.) This is a combinatorial search problem.

    Let's look at the dual of this problem for a moment. Imagine the programmer wrote a bunch of copy-pasted code that does similar things over and over. They could have extracted a common pattern out to a new function and called that function each place they needed it, but for whatever reason they didn't. The resulting program is larger than it could be as a result, but presumably the resulting straight-line code is easily optimized and fast (aside from cache effects).

    What if, after optimization, we could discover that all that copy-pasted code is still very similar, and the compiler could choose to extract it to a function in order to reduce code size? Now we have the opposite dilemma: will the cost of creating a stack frame outweigh the code size advantages of extracting a function? But this time, the question is easier to answer. We've already applied all the intraprocedural optimizations we possibly could, and extracting a function can't expose any new opportunities. (At least, none I can think of; counterexamples welcome!) So we can use only local information to make the decision, no search required.

    The literature I've found on this calls it "procedural abstraction", and the reason we don't generally see compilers doing it is that it's computationally very expensive to find sections of code that are similar enough that they can be extracted to new functions. I've found three general approaches:
    1. Compute a fingerprint for each basic block. If several basic blocks have the same fingerprint, they're candidates for abstraction. This only works if each block has the same instructions, in the same order (but allowing for some variation, such as different register names), so it has low odds of finding matching code. It's fast, though. See, for example, Johan Runeson's 2000 master's thesis, "Code Compression through Procedural Abstraction before Register Allocation".
    2. Use a suffix trie to identify matching sub-sequences of instructions. This still requires instructions to appear in the same order, but can match instruction sequences that are smaller than a whole basic block. It's still reasonably fast.
    3. Treat the code as a directed graph and apply an algorithm for "frequent graph-based pattern mining", such as gSpan. This is NP-complete, but can match instructions regardless of what order they're in. See, for example, "Graph-Based Procedural Abstraction" by Dreweke et al, in the 2007 Proceedings of the International Symposium on Code Generation and Optimization; or chapter 4 of Neil E. Johnson's 2004 PhD thesis, "Code size optimization for embedded processors".

    Of course, for this exercise, what's a little asymptotic complexity? Obviously the hardest option is the only sensible choice.

    With a procedural abstraction algorithm in hand, let's go back to the problem of deciding when inlining is a good idea. Now we can take a conceptually easy approach:

    Inline everything and let procedural abstraction sort it out!

    (Recursive functions need a little care. If the language semantics allow tail-call optimization, then transform tail recursion into loops where possible first. After that, identify cycles in the call-graph and leave the targets of back-edge calls un-inlined. Those details aside, inline everything else!)
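    For the tail-recursion part, that transformation is the standard one; a quick Rust sketch:

        // Tail-recursive form: the recursive call is the last thing the function does.
        fn gcd_rec(a: u64, b: u64) -> u64 {
            if b == 0 { a } else { gcd_rec(b, a % b) }
        }

        // The same function after the tail-call-to-loop transformation: the recursive
        // call is replaced by reassigning the parameters and looping.
        fn gcd_loop(mut a: u64, mut b: u64) -> u64 {
            while b != 0 {
                let r = a % b;
                a = b;
                b = r;
            }
            a
        }

        fn main() {
            assert_eq!(gcd_rec(48, 18), gcd_loop(48, 18));
            println!("{}", gcd_loop(48, 18));
        }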

    At this point we've replaced a small NP-complete problem (where n is the number of function calls) with a much larger NP-complete problem (where n is the number of instructions in the program after inlining every call). But there are two reasons this is interesting:
    • We'll get the optimal set of function boundaries for whatever other optimizations we've chosen to implement. If we have an intraprocedural optimization (like dead-code elimination) that can take advantage of having a particular call inlined, then it will be inlined. If the other optimizations we've implemented don't fire after inlining, then the function will be outlined again. But in the process we'll shrink the code further by identifying redundant snippets in the original program.
    • Since the author of the compiler doesn't need to think about how inlining might interact with any other optimizations, this optimization is perfectly modular. That's great for verifying that the implementation is correct, to avoid wrong-code compiler bugs, and great for maintenance as the compiler evolves.
    As a side note, I haven't found any papers yet on applying the procedural abstraction idea to re-rolling loops, but it seems like a natural extension to me. (Please let me know if you've seen this elsewhere.) If a block of code does almost the same thing several times in a row with different constants, then we can extract those constants into an array and loop over the array. Whether that's faster or slower than a series of function calls could be a tricky question, but it's probably a code-size win.

    Next time...

    Inlining and procedural abstraction are one example of optimization opportunities which compilers generally leave on the table to avoid excessively long compile times. With all the complexity in a modern compiler it should be no surprise that there are more opportunities!

    I have one more post like this planned; perhaps more will occur to me later. I'll write about the Value State Dependence Graph, and opportunities to improve on optimizations described in papers like Alan C. Lawrence's 2007 PhD thesis, "Optimizing compilation with the Value State Dependence Graph", among others. I think I can outline a unified framework for register allocation, code motion, and instruction selection, that would maximize performance for a variety of instruction set architectures with instruction-level parallelism. But we'll see how that turns out, next time!

    Thursday, May 19, 2016

    Perspectives on commit history

    I've just read three blog posts talking about what policy open source projects should adopt with regard to what the Linux kernel community calls "the perfect patch".
    All three write from a perspective of extensive experience using git for collaboration on open source projects. They fully understand the alternatives, and I respect their positions.

    I'd like to offer a fourth position that's close to Joey's, with a slightly different emphasis: If you want new contributors to your project, never tell them to rebase. Don't ask them to "git rebase -i", don't ask them to "git pull --rebase", none of that.

    Similarly, if you're teaching people how to use git for the first time, please do not teach them anything about rebase, except perhaps to say, "There's this thing called rebase. Don't use it until you understand 'git reflog' and have a pretty good sense of git's object model."

    My problem with rebase is that, as Joey wrote, it "works well except when it doesn't, when some detail airbrushed out of the only remaining history turns out to be important." Worse, the tools for fixing rebase mistakes require deep understanding of how git works.

    If you have a merge commit that went wrong (perhaps there are merge conflict markers still in it, or conflicts were resolved incorrectly) then you can check out the pre-merge version on a branch, re-do the merge, and use "git merge -s ours" to replace the effects of the bad merge. It's a bit tedious, but the procedure is reliable, and someone more experienced can do it for you if necessary. If you have somebody else's rebased commits that had merge conflicts that they resolved incorrectly, there's nothing you can do. If they're your own commits then you can dig around in your reflog and hope you can find the pre-rebase branch head, but that takes much more effort than if you simply merged.

    To Matthew's point, squashing branches instead of merging them has many of the same problems, plus the separate problems he called out in his post. So I agree with him that squashing pull requests is a poor choice as well.

    A side effect of this advice is that the Linux kernel style of demanding perfect patches is impractical if you want newcomers to contribute, unless you're willing to put in a lot of work on mentoring. (Which is one reason why Outreachy is a wonderful program!)

    I sympathize with the kernel community and others who've adopted similar policies. As Joey pointed out, using "git bisect" to track down a bug requires a lot more work if your history contains lots of commits which don't compile or which fail for unrelated reasons—but as he also pointed out, adding a "--merges-only" option to bisect would make that much less of an issue, and we shouldn't design our processes around the UX failures of our tools if we can just fix the tools instead.

    A more compelling concern is that code review, which is still one of the best tools we have for software quality, is more difficult if you're trying to review a messy series of patches. So evaluate your costs and benefits. Does your project actually do code review? If not, don't worry about clean patches, because you have bigger problems. Even if you do code review, is it really that hard to evaluate the history of how your contributors actually made their changes? The Linux kernel has tens of millions of lines of code, making code review inherently challenging even with perfect patches. Is your project honestly anywhere near that scale?

    It may be that whenever Josh Triplett gets around to releasing his current project (which I've seen nothing public on yet), we'll be able to have a different conversation around the ergonomics of managing a patch series. Until then, relaxing OCD requirements around what your commit history looks like in gitk or what your individual diffs contain allows you to accept contributions from more people and get them ramped up on improving your project. For most projects, that's worth more than anything else.

    Monday, May 16, 2016

    Angular velocity and quaternion derivatives

    OK, this post is all about math which frustrated me for weeks, a year and a half ago. I've forgotten most of the details since then and I'm not fully convinced I understood them in the first place, but the question has come up again in discussing sensor fusion at the Portland State Aerospace Society (PSAS), so I'm going to try to re-discover the details here.

    At the time I was working on a high-assurance quadcopter autopilot (SMACCMPilot), but it turns out that quadcopters have many of the same kinds of sensors as rockets, and similar sensor fusion problems. (I talked about sensor fusion for rockets in my first post here and I'm sure it'll come up again.)

    I have the impression that this topic is one where people who've actually been trained in aerospace think the answer is obvious. Consider the response I got when I tried to ask the author of a Kalman filter I wanted to re-use, "What representation is expected for gyro measurements?"

    I have no formal aerospace training and I still don't think it's obvious at all. Being told to "See standard texts on strapdown nav" was not helpful—especially since I have no idea which texts are "standard", even if one of the best sources I eventually found did turn out to be a book titled "Strapdown Inertial Navigation Technology".

    The question

    The short, technical, version of the question: How do you use a three-axis rate gyro to update a quaternion that represents your vehicle's current orientation? (If you already know what this question means you can skip to the next section.)

    Quadcopters and rockets both usually have "rate gyros", which are sensors that will tell you how fast your vehicle is spinning. You might interpret a rate gyro's output as a number measured in degrees per second. (The actual units don't matter for anything except your sanity, of course. Degrees per second is probably what you'll find in your sensor's datasheet; radians per second may be more convenient when working with trig functions; cycles per second might be best when presenting numbers to humans.)

    Each gyro measures rotation around a single axis; you need three of them to measure complete 3D angular velocity. In aerospace the three rotation axes are usually called "roll", "pitch", and "yaw", when measured with respect to the vehicle itself.

    [Figure: the six degrees of freedom, including the roll, pitch, and yaw rotation axes of a vehicle]

    But for many purposes we don't care how fast the vehicle is rotating. What we want to know is which direction the vehicle is pointing, with respect to some convenient frame of reference, such as North/East/Down (NED) or Earth-Centered/Earth-Fixed (ECEF).

    It might seem natural to represent that orientation as an angle around each of the three axes; that's called the Euler angle representation. It has a well-known problem called "gimbal lock", which has been written about extensively elsewhere, so I won't cover it here. Euler angles are also weird because rotations aren't commutative, so you have to pick a convention for which order to apply the rotations in, and there are literally a dozen choices. Ick.

    Because of the problems with the Euler angle representation, aerospace systems usually use either a 3x3 rotation matrix or a unit quaternion. Quaternions have several advantages over the matrix representation: they're smaller (4 numbers instead of 9), and numerical errors that creep in from floating-point inaccuracy are easier to fix up on quaternions (just re-normalize).

    Quaternions have the disadvantage that they are utterly baffling, but let's not let that stop us, eh?

    If we assume that we know precisely what direction the vehicle was pointing when it powered on, and we have perfect measurements of how fast it's been rotating since then, then conceptually, we can just add up all those rotations to figure out which direction the vehicle is pointing now. In practice it isn't so easy because we don't have perfect information about the initial orientation nor about the rotations since then, but dealing with those problems is what sensor fusion is all about.

    Sensor fusion algorithms such as the Kalman filter augment a model which assumes perfect information to turn it into one that can take advantage of imperfect information instead. So in this post I'll pretend like we do have perfect information. Using the model we'll derive here in a sensor fusion implementation is left as an exercise for the reader. (Or you can study my Haskell sensor fusion example, or Paul Riseborough's Matlab/C++ version whose example I was trying to follow. Just be careful with the latter as I found plenty of implementation bugs in it, and be careful with the former because I make no guarantees about having the right model.)

    Putting all that together: how do we do the "adding up" of a series of 3-axis rotation measurements taken from rate gyros, given a unit quaternion representation for orientation?

    Why am I confused?

    When asking for help understanding Paul Riseborough's filter, I wrote:
    [The filter] constructs deltaQuat by assuming that correctedDelAng is in an axis-angle representation, where the magnitude of the Vector3f is the angle, and the direction of the vector is the axis of rotation.

    I expected that any gyro measurement would instead be represented as Euler angles, that is, three orthogonal measurements of rotation rate. That's not algebraically equivalent to the axis-angle representation, as far as I can tell, except when any two of the components are 0.
    Recently one of the students in PSAS asked me:
    I thought what I needed to do was store three vectors for local pitch/yaw/roll directions, in terms of my local tangent plane coordinate frame. Then, when I get data from the rate gyros, I can convert the rotations into quaternions using the local pitch/yaw/roll axis vectors and then compute the updated local axis vectors.

    But that seems wrong because I know rotations are non-commutative, and I'd have to choose what order to perform the rotations in. So I think there's something I'm missing.
    I think these are basically the same confusion. It turns out that this is really easy mathematically, but physically speaking, I still don't understand why!

    The answer

    Short version:
    1. If your current orientation is represented by a quaternion q, and you define an update quaternion p by making the vector part of p equal the gyro measurement vector while the scalar part is 0, then the derivative of q is 0.5q·p.
    2. Discretized, that means that the new value of q is equal to the old value of q, times a quaternion derived by treating the gyro measurement (scaled by the length of the time step) as an axis/angle representation: the magnitude of that 3-vector is the rotation angle, and its direction, the normalized vector, is the axis around which the vehicle is rotating.
    3. Normalizing the gyro measurement vector means dividing each element by the vector's magnitude, but if the vehicle isn't rotating then that magnitude is 0. An equivalent version without any division by zero comes from taking the Taylor series expansion of the axis/angle formula, which also lets you decide how much precision you care about in your approximations of sine and cosine. (See the code sketch right after this list.)
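    That sketch, in Rust: it's only an illustration I wrote of points 1 through 3, not code from Titterton and Weston or from any flight software. It assumes the gyro measurement is already in radians per second, stores quaternions as [w, x, y, z], and guards the zero-rotation case explicitly (the Taylor series version from point 3 appears further down).

    /// Hamilton product a * b of two quaternions stored as [w, x, y, z].
    fn quat_mul(a: [f64; 4], b: [f64; 4]) -> [f64; 4] {
        [
            a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3],
            a[0] * b[1] + a[1] * b[0] + a[2] * b[3] - a[3] * b[2],
            a[0] * b[2] - a[1] * b[3] + a[2] * b[0] + a[3] * b[1],
            a[0] * b[3] + a[1] * b[2] - a[2] * b[1] + a[3] * b[0],
        ]
    }

    /// Advance the orientation quaternion `q` by one gyro sample `gyro`
    /// (rad/s) taken over a time step of `dt` seconds.
    fn gyro_update(q: [f64; 4], gyro: [f64; 3], dt: f64) -> [f64; 4] {
        // sigma is this step's rotation in axis/angle form: its magnitude is
        // the rotation angle, its direction is the rotation axis.
        let sigma = [gyro[0] * dt, gyro[1] * dt, gyro[2] * dt];
        let angle =
            (sigma[0] * sigma[0] + sigma[1] * sigma[1] + sigma[2] * sigma[2]).sqrt();

        // Convert axis/angle to a unit quaternion r. If the vehicle isn't
        // rotating, angle is 0 and we'd divide by zero, so fall back to the
        // identity rotation.
        let r = if angle > 0.0 {
            let (s, c) = ((angle / 2.0).sin(), (angle / 2.0).cos());
            let k = s / angle; // so sigma * k = (normalized axis) * sin(angle/2)
            [c, sigma[0] * k, sigma[1] * k, sigma[2] * k]
        } else {
            [1.0, 0.0, 0.0, 0.0]
        };

        // New orientation: old orientation times the incremental rotation.
        quat_mul(q, r)
    }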
    My main sources for these answers are the book "Strapdown Inertial Navigation Technology", 2nd edition, by Titterton and Weston; and a nearly-unreadable PDF titled "Quaternion Differentiation".

    Both sources assert point #1, but I think that the PDF gives a derivation justifying it. I'm not entirely sure, to be honest, because although I'm convinced of every algebra step in that paper, I'm not sure what I can conclude from them, or why I should believe their initial assumptions. It's really poorly written.

    Point #2 is derived from point #1 in the book, in section 11.2.5, on pages 319-320. The method is basically the same as that shown in the Wikipedia article on discretization:
    1. Re-write the differential equation so that it's a matrix multiplication instead of a quaternion multiplication, because the discretization procedure works on matrices, not quaternions. They do this by replacing the right-hand side with 0.5W·q, where W is the matrix representation of the conjugate of p. (I think they take the conjugate because they've moved it to the left of q instead of the right? I'm not sure.) Note that q still has the same 4 elements, only now we're pretending it's any old 4-vector instead of a quaternion.
    2. Assume that the discrete time steps are short enough that the angular velocity is basically constant. This is the "zero-order hold" assumption, according to Wikipedia.
    3. Integrate W over the current time-step, which is easy given zero-order hold: just multiply all the elements by T, the length of the time-step. (The book doesn't explain this step but it's really important that you multiply by delta-t somewhere!)
    4. Compute the matrix exponential of 0.5·T·W, and multiply the result by the old q to get the new q.
    5. But we want quaternions instead of matrices, so at this point the book turns that matrix exponential back into a quaternion through magic. A similar development was given in section 11.2.1 on pages 312-313, so they handwave it here. I'll handwave it too, but if you squint at the power series definition of the matrix exponential you can see how it's like the Taylor series expansion of the sine and cosine terms in the axis-angle to quaternion formula. Specifically, if we define σ as the gyro measurement vector times the length of the time step, then the rotation axis is the normalized direction of σ, and the rotation angle is the magnitude of σ. If we call the resulting quaternion r, then the new q is the old q times r. (Notice we've magically flipped back to multiplying on the right! I can't explain that. I bet there's another conjugate hiding in the hand-waving.) The whole chain is summarized symbolically just below.
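    Glossing over the same conjugate and left-versus-right subtleties the book does, with ω the gyro measurement, T the time step, and ⊗ quaternion multiplication, the steps above amount to:

        \dot{q} = \tfrac{1}{2}\, q \otimes p = \tfrac{1}{2}\, W q
        q_{k+1} = \exp\!\bigl(\tfrac{1}{2}\, T W\bigr)\, q_k = q_k \otimes r
        r = \bigl(\cos\tfrac{|\sigma|}{2},\ \tfrac{\sigma}{|\sigma|}\,\sin\tfrac{|\sigma|}{2}\bigr), \qquad \sigma = \omega T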
    OK. That was a lot of magic, but I don't feel too terrible about it because (a) at least lots of sources seem to agree on the basic steps and (b) I find them convincing enough that these are principled steps and not just stuff somebody made up that seemed to work.

    The book goes on to derive point #3 from point #2. I don't want to go into detail on the Taylor series expansion of sine and cosine in this post, but if it helps, I wrote a Haskell function which computes an arbitrary-order approximation of the axis-angle to quaternion formula. In addition to avoiding any division by 0, this definition is nice because it needs no square roots, only three multiplies plus one division and one addition per order, and it computes both the sine and cosine terms at the same time. On non-x86 CPUs, which typically don't have hardware sine/cosine instructions, I think this is about as efficient an algorithm as you can get.
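    For concreteness, here's a fixed-order version of that idea in Rust. It isn't a translation of my Haskell function (which computes an arbitrary-order approximation); it just substitutes s = |σ|² into the Taylor series for the cosine and sine terms, so there is no square root and nothing that can divide by zero:

    /// Approximate axis/angle-to-quaternion conversion for the rotation
    /// vector `sigma` (the gyro measurement times the time step, in radians),
    /// using third-order series in s = |sigma|^2.
    fn axis_angle_to_quat_approx(sigma: [f64; 3]) -> [f64; 4] {
        let s = sigma[0] * sigma[0] + sigma[1] * sigma[1] + sigma[2] * sigma[2];

        // cos(|sigma|/2)          = 1   - s/8  + s^2/384  - s^3/46080  + ...
        // sin(|sigma|/2)/|sigma|  = 1/2 - s/48 + s^2/3840 - s^3/645120 + ...
        let w = 1.0 - s / 8.0 + s * s / 384.0 - s * s * s / 46080.0;
        let k = 0.5 - s / 48.0 + s * s / 3840.0 - s * s * s / 645120.0;

        [w, sigma[0] * k, sigma[1] * k, sigma[2] * k]
    }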

    Conclusion

    Just like when I was first banging my head against this question, in today's post I have not managed to justify the answers here from first principles of physics or anything. I decided then, and I'm still OK with this now, that I had dug far enough to be satisfied that there is some underlying reason why this is the right math.

    But if anyone wants to explain where that initial formula of 0.5q·p comes from, please leave a comment!

    Monday, May 9, 2016

    How to eliminate "goto" statements

    I wanted to write a program that would translate C programs into Rust programs, at least for some subset of C programs which are somehow "simple enough". You can follow along with my attempt in my GitHub repository for "Corrode". It isn't all that interesting yet.

    One reason I haven't gotten very far on it is that along the way I got distracted by a question. C supports "goto" statements. Rust does not. Can programs with "goto" statements be turned into programs without them?

    This turns out to be an interesting question, which people have been studying since at least a 1966 paper by Böhm and Jacopini. Wikipedia's "Structured program theorem" article gives good background on the question and its history. (If you want a formal treatment, try the 2008 paper, "The Böhm–Jacopini Theorem Is False, Propositionally".)

    The short answer is that yes, it is always possible to rewrite a program to not use "goto" statements. But you might not like the result.

    Impact

    Why have so many people studied this question and for so long? I wouldn't blame you if you thought it sounded like a point of mere academic curiosity. But I think we can divide the continued interest into two categories.

    The first category is code style, quality, and expressiveness: If we design our programming languages to force programmers to use "structured" programming style instead of goto statements, are we preventing them from writing useful programs? The Böhm-Jacopini theorem shows that the answer is "no": this kind of restriction does not change what programs can be written, in principle.

    That said, we have many language features which are not strictly necessary, and if you would like to see just how bad of an idea it is to get rid of them, try programming in some of the "Turing tarpit" class of esoteric programming languages. In this post I'm not going to take a position on whether "goto" should be allowed, because I think the answer is complicated; but the designers of Rust have chosen a position in that debate, and that's all that matters for my current project.

    The second category is in program analysis and optimization algorithms, which often only work on structured programs. According to several of the papers I've read while researching this question, modern compilers may just refuse to optimize functions that contain unstructured control flow. It would be nice to know whether we should avoid "goto" statements in performance-critical code, or whether compilers could be made smarter!

    As a specific example, a few years ago I was chatting with a developer working on the Mesa 3D Graphics Library about their compiler for the OpenGL Shading Language. This developer told me that on GPUs that were common at the time, the only control flow instructions were for structured control flow. The hardware understood "if" statements and "while" loops, and if the compiler couldn't translate a shader into those primitives, it either rejected the program or it generated much slower code; I forget which. (I think this had something to do with GPUs using a single instruction pointer for multiple threads of execution.) So a good answer to this question might have had a significant impact on video game performance at the time! Unfortunately, I think by then there wasn't much interest in performance tuning for that generation of hardware, so Mesa's compiler never implemented the known algorithms.

    My current interest is mostly in the first category, because I would like to transform C programs into Rust programs without destroying the structure of the program. In an ideal world, the output would resemble a program that a person might reasonably have written by hand. But I think the second category is also fascinating, particularly since program optimization can have a big impact.

    Approaches

    Wikipedia describes a trivial, but terrible, approach: wrap the entire function in a "while"-loop, and use a variable to keep track of which part of the function should run next. The extra variable then plays the same role as the program counter in a CPU, which shows that any program you could run on a CPU can be transformed this way. But as well-known computer scientist Donald Knuth pointed out, this transformation destroys the original structure of the program, so this is not a satisfying answer.
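    To make that concrete, here's a tiny, entirely made-up example in Rust of what the transformation produces: a State enum plays the role of the program counter, and every goto in the hypothetical C original becomes an assignment to it.

    enum State {
        Start,
        Loop,
        Done,
    }

    fn count_up(limit: u32) -> u32 {
        let mut n = 0;
        let mut state = State::Start;
        loop {
            state = match state {
                State::Start => State::Loop, // was: goto loop_top;
                State::Loop => {
                    if n < limit {
                        n += 1;
                        State::Loop // was: goto loop_top;
                    } else {
                        State::Done // was: goto done;
                    }
                }
                State::Done => return n,
            };
        }
    }

    The result computes the right answer, but the original loop structure is gone; you have to mentally simulate the state variable to see it, which is exactly Knuth's complaint.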

    The textbook "Compilers: Principles, Techniques, and Tools" by Aho, Sethi, and Ullman, usually referred to as "the dragon book" due to its cover art, is often used as the textbook for compilers courses. (I've heard people recommend against using it as an introductory text, but it's the only general-purpose compilers book I've actually read, so I'll cite it here...) In chapter 10, "Code Optimization", it describes an algorithm called "T1 - T2 Analysis", which identifies any control flow that can be described using only while and if statements, without goto. This algorithm works even on functions written entirely using conditional goto statements, as long as an equivalent while program exists. Such functions are called "reducible".

    If a function is irreducible, then T1-T2 analysis identifies some parts of the function that are causing trouble. One way to make those parts work without goto statements is to duplicate them, so that the code at the target of the goto now appears at every place which originally branched to it. This is called "node splitting", and sometimes labeled "T3" as it's a step you can take between applying the T1 and T2 steps of T1-T2 analysis. Node splitting can lead to a function which is exponentially larger than the original, though (see "Folklore Confirmed: Reducible Flow Graphs are Exponentially Larger"). Rather a lot of research has gone into splitting as few nodes as possible (see "Controlled Node Splitting" and "Handling Irreducible Loops: Optimized Node Splitting vs. DJ-Graphs", for two examples).

    In fact there are several more ways to deal with irreducible flow graphs, and I recommend reading chapter 6 of Reverse Compilation Techniques, Cristina Cifuentes' PhD thesis, if you would like a much more complete overview with plenty of citations.

    Conclusion

    Maybe we shouldn't care about irreducible flow graphs. It turns out that the ways people actually use "goto" statements almost always still lead to reducible flow graphs, according to James Stanier and Des Watson's 2010 paper, "A study of irreducibility in C programs". They examined 15 open source projects which together contained 1,709 "goto" statements. They found that out of 10,427 functions, only 5 had irreducible control flow! They also found that the code generators "flex" and "bison", despite generating plenty of "goto" statements, never generated irreducible functions for the eight grammars they tested.

    So for my project, I'll probably just use the T1-T2 approach once I encounter some interesting C source that's using goto statements, and until then I'll do naive syntax-driven transformations.

    But finding what people have written about this topic has been a fun diversion!

    Wednesday, April 27, 2016

    Rocket thrust and sensor fusion

    Inspired by various people, notably Julia Evans and (naturally) Sarah Sharp, I'm going to try writing again. (My previous personal blog had its last post in 2009.)

    I've been part of the Portland State Aerospace Society since 1999, which is to say, over half my life—so while I'm not going to introduce it in detail here, you can guess it's important to me. Check out the web site and our pile of GitHub repos. Maybe you'll feel inspired to contribute to open-source rocketry!

    I just learned something about rocket motors

    Last night at our weekly general meeting, I learned something from Aaron Baker, one of the students in the group:
    Rocket motors produce thrust (force) proportional to the rate at which they're losing mass.
    (NASA has good background on the thrust equation, if you want more details and feel OK about first-year calculus-based physics. They claim their explanation is suitable for middle-school students but I think they may have mis-judged their audience.)
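    For reference, the rocket form of that thrust equation is roughly:

        F = \dot{m}\, V_e + (p_e - p_0)\, A_e

    where \dot{m} is the rate at which mass flows out the nozzle, V_e is the exhaust velocity, and the second term corrects for the difference between the exhaust pressure and the ambient pressure acting over the nozzle exit area. The first term is the "thrust is proportional to the mass you're losing" part.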

    Now, in hindsight, this feels almost obvious: a rocket throws burning bits of fuel out the nozzle in order to move, and if you throw more fuel out you'll move more. But it had never occurred to me that the two were related, before.

    OK, that's, uh, fascinating. Honest. Why do I care?

    The short answer is: It's useful to have a model of the physics of our rocket's flight that has as few parameters as possible, and this observation leads to a simpler model.

    A brief tour of rocket physics

    While a rocket is in flight, there are three major things affecting how it moves:
    1. If the rocket motor is burning, then it's making the rocket go faster.
    2. If the rocket is in atmosphere, then pushing against the air slows the rocket down.
    3. Gravity is always pulling on the rocket, and for my current purposes we can pretend like only Earth's gravity matters and it's always slowing the rocket down.
    How much effect does gravity have? We can pretend like it only depends on how high above the Earth you are.

    How much effect does air resistance have? We can pretend like it only depends on how fast the rocket is going, and how thick the atmosphere is; and for the latter, we can again pretend like it only depends on how high above the Earth you are.

    So far we only have two parameters to this model, altitude and speed, and those are both things our sensors can tell us directly.

    But the rocket motor adds two more parameters:
    • How hard is the motor pushing on the rocket right now? That changes during the burn, and is part of the design of the motor.
    • And how much does the remaining fuel weigh? As the rocket gets lighter, it becomes easier to move. So pushing the same amount gives you more speed than it did before, like the difference between pushing a bicycle and pushing a car.
    And we can't measure either of these parameters while the rocket is in flight! The closest we can get is the ratio of thrust to total vehicle mass, which is an acceleration... but the acceleration we measure is the combination of accelerations due to all three of the forces above, and we can't separate out just the part due to the rocket motor.

    Here's where Aaron's observation comes in: thrust is proportional to the rate at which the vehicle is losing mass. So if we can measure our total acceleration, subtract off what our model says air resistance and gravity should have contributed, and we know what the total vehicle mass was a moment ago, we can compute both the current thrust and the new total vehicle mass. Et voilà, two parameters derived from one measurement!
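    As a sketch of that bookkeeping in Rust (this is not PSAS flight code: the one-dimensional "straight up" frame, the effective exhaust velocity V_E relating thrust to mass flow, and the assumption that the drag model hands us an acceleration directly are all simplifications made up for illustration):

    // Effective exhaust velocity: the assumed constant of proportionality
    // between thrust and mass flow rate, in m/s. Made up for this example.
    const V_E: f64 = 2000.0;
    const G: f64 = 9.81; // gravitational acceleration, m/s^2 (also simplified)

    /// One time step, treating the rocket as moving straight up. Following
    /// the post's framing, `measured_accel` combines all three forces, while
    /// `drag_accel` is the drag magnitude the model predicts; drag and
    /// gravity both point downward during ascent, hence the additions.
    fn thrust_and_mass(mass: f64, measured_accel: f64, drag_accel: f64, dt: f64) -> (f64, f64) {
        // What's left after removing drag and gravity is the motor's part.
        let thrust_accel = measured_accel + drag_accel + G;
        let thrust = mass * thrust_accel;

        // Thrust proportional to mass flow: dm/dt = -thrust / V_E.
        let new_mass = mass - thrust * dt / V_E;
        (thrust, new_mass)
    }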

    Of course it isn't actually that easy because we can't measure anything precisely enough.

    Measurement precision is hard

    Among the many challenges in building a rocket that can reach orbit, one is that it's moving faster than a human can react, and so the rocket needs to be able to steer itself. To do that, the rocket needs to know where it should be going, and where it actually is—so that when it isn't where it should be, it can take some action to get going the right direction again.

    Figuring out where you are is a tricky problem. You probably use GPS for that. On our rocket we also use GPS, but it isn't good enough by itself for several reasons:
    1. Commercial GPS receivers are not built with the amateur rocketry community in mind. (They're missing out on dozens to hundreds of sales each year!) So they tend to stop reporting positions during the most critical parts of the flight. There are both technical and legal reasons for that, but that's an entirely different discussion.
    2. The best commercial GPS receivers tell you where you are ten or twenty times per second. In one tenth of a second, our rocket may have traveled up to 40 meters (or 120 feet) at 375 meters per second (840 miles per hour). In other words, a lot can happen to a rocket in a tenth of a second!
    3. GPS can tell you your position and your velocity, but not which direction you're pointing (or equivalently, which direction your rocket motor is pointing). For that, your phone probably uses an electronic compass... which you may have noticed is often wrong. Yeah, electronic compasses have their own problems.
    For all those reasons, we don't rely purely on GPS. We also use accelerometers, gyroscopes, a pressure altimeter, and the previously mentioned electronic compass (technically called a 3D magnetometer).

    This raises a new problem: Now we have a bunch of sensors and they're all wrong! Every sensor measurement has some error—think of stepping on your bathroom scale and how it's obviously never right—and we use sensors that are as cheap as we can get away with, so our measurements are especially bad. Now we need to combine them all to get some idea of what's really going on. When they disagree, we need some way of making our best guess of the truth.

    This is the "sensor fusion" problem: how can we get the most truth out of each sensor, fusing all the measurements together into a reasonable guess? I'm not going into how to do sensor fusion today. If you want to do your own research, two standard algorithms you can look up are the Kalman Filter and the Particle Filter. Aaron is taking Prof. McNames' class on sensor fusion and related techniques right now, and Prof. McNames is posting video of his State Space Tracking course lectures as he goes, so you might find that to be a good resource.

    Putting it all together

    Sensor fusion algorithms do best when you give them an accurate model of the system you're measuring, so for sensor fusion applied to rockets, we'd like to use a physics model that incorporates all the major forces.

    But the more parameters the model needs, the worse any sensor fusion algorithm will perform. (Prof. McNames and others refer to this as the "curse of dimensionality".) I think of it this way: If I can explain the measurements I'm seeing two different ways, by changing different parameters, then it's going to be harder to guess which explanation is correct. If instead I can express one of those parameters in terms of another, then that rules out some explanations and makes it easier to find the right answer.

    So I'm excited about Aaron's insight because it allows us to use a more accurate physics model without adding too many parameters, which may lead to better results from cheaper sensors!

    p.s. Check out this video of our microgravity test of a reaction wheel system for attitude control in the cubesat we're building. Honestly I could just watch this on loop.