Oxcart and Clojure

Well, it’s a month into Google Summer of Code, and I still haven’t actually written anything about my project, better known as Oxcart beyond what little per function documentation I have written for Oxcart and the interview I did with Eric N.. So it’s time to fix that.

Motivation

Clojure isn’t “fast”, it’s simply “fast enough”. Rich, while really smart guy with awesome ideas isn’t a compilers research team and didn’t design Clojure with a laundry list of tricks that he wanted to be able to play in the compiler in mind. Instead in the beginning, Rich designed a language that he wanted to use to do work, built a naive compiler for it, confirmed that JVM JITs could run the resulting code sufficiently fast, and got on with actually building things.

The consequent of Rich’s priorities is that Clojure code is in fact fast enough to do just about whatever you want, but it could be faster. Clojure is often criticized for its slow startup time and its large memory footprint. Most of this footprint is not so much a consequent of fundamental limitations of Clojure as a language (some of it is but that’s for another time) as it is a consequent of how the existing Clojure compiler runtime pair operate together.

So Clojure wasn’t designed to have a sophisticated compiler, it doesn’t have such a compiler, and for some applications Clojure is slow compared to other equivalent languages as a result of not having these things. So for GSoC I proposed to build a prototype compiler which would attempt to build Clojure binaries tuned for performance and I got accepted.

Validating complaints

Okay, so I’ve made grand claims about the performance of Clojure, that it could be faster and soforth. What exactly do I find so distasteful in the language implementation?

Vars are the first and primary whipping boy. Vars, defined over here, are data structures which Clojure uses to represent bindings between symbols and values. These bindings, even when static at compile time, are interred at runtime in a thread shared global bindings table, and then thread local bindings tables contain “local” bindings which take precedence over global bindings. This is why (Clojure.core/alter-var-root!) and the other var manipulation functions in the Clojure standard library have the -! postfix used to annotate transactional memory mutation, because only a transaction can modify the root bindings of a thread shared and thread safe Var.

Now Vars are awesome because they are thread shared. This means that if you drop a REPL in a live program you can start re-defining Vars willy nilly and your program will “magically” update. Why does this work? Because Clojure programs never hold a reference to a “function”, instead they hold a thread synchronized var which names a function and get the latest function named by the var every time they have to call that function.

This is great because it enables the REPL driven development and debugging pattern upon which many people rely, however for the overwhelming majority of applications the final production form of a program will never redefine a Var. This means that consequently the nontrivial overhead of performing thread synchronization, fetching the function named by a var, checking that the fn is in fact Clojure.lang.IFn and then calling the function is wasted overhead that the reference Clojure implementation incurs every single function call. The worst part about this is that Var has a volatile root which poisons the JVM’s HotSpot JIT by providing a break point at function boundaries which the JIT can’t inline or analyze across.

IFn is another messy part of Clojure programs. The JVM does not have semantics for representing a pointer to a “function”. The result is that Clojure doesn’t really have “functions” when compiled to JVM bytecode, instead Clojure has classes providing .invoke() methods and instances of those classes, or more often Vars naming IFns may be passed around as values. This isn’t a bad thing for the most part, except that IFns use Vars to implement their .invoke() methods and vars are slow.

The real reason that this can be an issue is why do we need an IFn with multiple invoke methods? Because in Clojure functions can have multiple arities, and the single instance of an IFn could be invoked multiple ways where a JVM Method cannot be.

The real issue with IFns is that they exist at all. Every single class referenced by a JVM program must be read from disk, verified, loaded and compiled. This means that there is a load time cost to each individual class in a program. Clojure exacerbates this cost by not only generating a lot of small classes to implement IFns. When a namespace is required or otherwise loaded by a compiled Clojure program, the Clojure runtime loads foo__init.class, which creates instances of every IFn used to back a Var in the namespace foo and installs those definitions in the global var name table. Note that this loading is single threaded, so all synchronization at load time is wasteful. Also note that if there are top level forms like (println "I got loaded!") in a Clojure program those are evaluated when the namespace containing the form is loaded.

The sales pitch

So what’s the short version here? Clojure has Vars in order to enable a dynamic rebinding model of programming which deployed applications do not typically need. Because applications do not tend to use dynamic binding for application code, we can discard Vars and directly use the IFn classes to which the Vars refer. This could be a significant win just because it removes that volatile root on Vars that poisons the JIT.

This opens up more opportunities for playing tricks in the compiler, because we don’t really need IFn objects for the most part. Remember that IFn objects are only needed because methods aren’t first class on the JVM and to support dynamic redefinitions we need a first class value for Vars to point to. If all definitions are static, then we don’t need Vars, so we can find the fully qualified class and method that a given function invocation points to, freeing a Clojure compiler to do static method invocation. This should be a performance win as it allows the JVM JIT to escape type checking of the object on which the method is invoked and it allows the JIT to inline in the targeted method among other tricks.

If we can throw away IFns by implementing functions as say static methods on a namespace “class” rather than having seperate classes for each function, then we cut down on program size in terms of classes which should somewhat reduce the memory footprint of Clojure programs on disk and in memory in addition to reducing load time.

Oxcart

So what is Oxcart? Oxcart is a compiler for a subset of Clojure which seeks to implement exactly the performance hat tricks specified above and a few more. For the most part these are simple analysis operations with well defined limitations. In fact, most of what’s required to use Oxcart to compile Clojure code is already built and working in that Oxcart can currently rewrite Clojure programs to aid and execute the above transformations.

Oxcart is also a huge exercise in the infrastructure around the Clojure.tools.analyzer contrib library, as Oxcart is the first full up compiler to use tools.analyzer, tools.analyzer.jvm and tools.emitter.jvm as more than the direct compilation pipeline which tools.emitter.jvm implements. This means that Oxcart has interesting representational issues in how passes and analysis data are handled and shared between passes, let alone how the data structure describing the entire program is built and the performance limitations of various possible representations thereof.

So what’s Oxcart good for? right now: nothing. Oxcart doesn’t have an AOT file emitter yet, and relies on tools.emitter.jvm for eval and as such is no faster for evaling code in a live JVM than Clojure is. At present I’m working on building an AOT emitter which will enable me to start doing code generation and profiling Oxcart against Clojure. I hope to post an initial emitter and a trivial benchmark comparing a pair of mutually recursive math functions between Clojure and Oxcart.

Before you go

I’ve know I’ve said this entire time that Oxcart is a Clojure compiler. That’s a misnomer. Oxcart doesn’t compile Clojure and never will. Clojure has stuff like eval, eval-string, resolve, load-string, load and all the bindings stuff that allow Clojure programmers to reach around the compiler’s back and change bindings and definitions at runtime. These structures are not and never will be supported by Oxcart because supporting them would require disabling optimizations. Oxcart also doesn’t support non-def forms at the top level. Oxcart programs are considered to be a set of defs and an entry point. Oxcart also assumes that definitions are single. Redefining a var is entirely unsupported, abet not yet a source of warnings.

Some of these differences are sufficiently extreme that I’m honestly on the fence about whether Oxcart is really Clojure or some yet undefined “Oxlang” more in the style of Shen, but for now I’ll stick to building a prototype :D

^D