Oxcart and Clojure
26 Jun 2014Well, it’s a month into Google Summer of Code, and I still haven’t actually written anything about my project, better known as Oxcart beyond what little per function documentation I have written for Oxcart and the interview I did with Eric N.. So it’s time to fix that.
Motivation
Clojure isn’t “fast”, it’s simply “fast enough”. Rich, while really smart guy with awesome ideas isn’t a compilers research team and didn’t design Clojure with a laundry list of tricks that he wanted to be able to play in the compiler in mind. Instead in the beginning, Rich designed a language that he wanted to use to do work, built a naive compiler for it, confirmed that JVM JITs could run the resulting code sufficiently fast, and got on with actually building things.
The consequent of Rich’s priorities is that Clojure code is in fact fast enough to do just about whatever you want, but it could be faster. Clojure is often criticized for its slow startup time and its large memory footprint. Most of this footprint is not so much a consequent of fundamental limitations of Clojure as a language (some of it is but that’s for another time) as it is a consequent of how the existing Clojure compiler runtime pair operate together.
So Clojure wasn’t designed to have a sophisticated compiler, it doesn’t have such a compiler, and for some applications Clojure is slow compared to other equivalent languages as a result of not having these things. So for GSoC I proposed to build a prototype compiler which would attempt to build Clojure binaries tuned for performance and I got accepted.
Validating complaints
Okay, so I’ve made grand claims about the performance of Clojure, that it could be faster and soforth. What exactly do I find so distasteful in the language implementation?
Vars are the first and primary whipping boy. Vars
,
defined over here,
are data structures which Clojure uses to represent bindings between
symbols and values. These bindings, even when static at compile time,
are interred at runtime in a thread shared global bindings table, and
then thread local bindings tables contain “local” bindings which take
precedence over global bindings. This is why
(Clojure.core/alter-var-root!)
and the other var manipulation
functions in the Clojure standard library have the -!
postfix used
to annotate transactional memory mutation, because only a transaction
can modify the root bindings of a thread shared and thread safe Var
.
Now Var
s are awesome because they are thread shared. This means that
if you drop a REPL in a live program you can start re-defining Var
s
willy nilly and your program will “magically” update. Why does this
work? Because Clojure programs never hold a reference to a “function”,
instead they hold a thread synchronized var which names a function and
get the latest function named by the var every time they have to call
that function.
This is great because it enables the REPL driven development and
debugging pattern upon which many people rely, however for the
overwhelming majority of applications the final production form of a
program will never redefine a Var
. This means that consequently the
nontrivial overhead of performing thread synchronization, fetching the
function named by a var, checking that the fn is in fact
Clojure.lang.IFn
and then calling the function is wasted overhead
that the reference Clojure implementation incurs every single function
call. The worst part about this is that Var
has a volatile root
which poisons the JVM’s HotSpot JIT by providing a break point at
function boundaries which the JIT can’t inline or analyze across.
IFn is another messy part of Clojure programs. The JVM does not
have semantics for representing a pointer to a “function”. The result
is that Clojure doesn’t really have “functions” when compiled to JVM
bytecode, instead Clojure has classes providing .invoke()
methods
and instances of those classes, or more often Var
s naming IFn
s may
be passed around as values. This isn’t a bad thing for the most part,
except that IFn
s use Var
s to implement their .invoke()
methods
and vars are slow.
The real reason that this can be an issue is why do we need an IFn
with multiple invoke methods? Because in Clojure functions can have
multiple arities, and the single instance of an IFn
could be invoked
multiple ways where a JVM Method cannot be.
The real issue with IFn
s is that they exist at all. Every single
class referenced by a JVM program must be read from disk, verified,
loaded and compiled. This means that there is a load time cost to each
individual class in a program. Clojure exacerbates this cost by not
only generating a lot of small classes to implement IFn
s. When a
namespace is require
d or otherwise load
ed by a compiled Clojure
program, the Clojure runtime loads foo__init.class
, which creates
instances of every IFn
used to back a Var
in the namespace foo
and installs those definitions in the global var name table. Note that
this loading is single threaded, so all synchronization at load time
is wasteful. Also note that if there are top level forms like
(println "I got loaded!")
in a Clojure program those are evaluated
when the namespace containing the form is loaded.
The sales pitch
So what’s the short version here? Clojure has Var
s in order to
enable a dynamic rebinding model of programming which deployed
applications do not typically need. Because applications do not tend
to use dynamic binding for application code, we can discard Var
s and
directly use the IFn
classes to which the Var
s refer. This could
be a significant win just because it removes that volatile
root on
Var
s that poisons the JIT.
This opens up more opportunities for playing tricks in the compiler,
because we don’t really need IFn
objects for the most part. Remember
that IFn
objects are only needed because methods aren’t first class
on the JVM and to support dynamic redefinitions we need a first class
value for Var
s to point to. If all definitions are static, then we
don’t need Var
s, so we can find the fully qualified class and method
that a given function invocation points to, freeing a Clojure compiler
to do static method invocation. This should be a performance win as it
allows the JVM JIT to escape type checking of the object on which the
method is invoked and it allows the JIT to inline in the targeted
method among other tricks.
If we can throw away IFn
s by implementing functions as say static
methods on a namespace “class” rather than having seperate classes for
each function, then we cut down on program size in terms of classes
which should somewhat reduce the memory footprint of Clojure programs
on disk and in memory in addition to reducing load time.
Oxcart
So what is Oxcart? Oxcart is a compiler for a subset of Clojure which seeks to implement exactly the performance hat tricks specified above and a few more. For the most part these are simple analysis operations with well defined limitations. In fact, most of what’s required to use Oxcart to compile Clojure code is already built and working in that Oxcart can currently rewrite Clojure programs to aid and execute the above transformations.
Oxcart is also a huge exercise in the infrastructure around the
Clojure.tools.analyzer
contrib library, as Oxcart is the first full
up compiler to use tools.analyzer
, tools.analyzer.jvm
and
tools.emitter.jvm
as more than the direct compilation pipeline which
tools.emitter.jvm
implements. This means that Oxcart has interesting
representational issues in how passes and analysis data are handled
and shared between passes, let alone how the data structure describing
the entire program is built and the performance limitations of various
possible representations thereof.
So what’s Oxcart good for? right now: nothing. Oxcart doesn’t have an
AOT file emitter yet, and relies on tools.emitter.jvm
for eval
and
as such is no faster for eval
ing code in a live JVM than Clojure
is. At present I’m working on building an AOT emitter which will
enable me to start doing code generation and profiling Oxcart against
Clojure. I hope to post an initial emitter and a trivial benchmark
comparing a pair of mutually recursive math functions between Clojure
and Oxcart.
Before you go
I’ve know I’ve said this entire time that Oxcart is a Clojure
compiler. That’s a misnomer. Oxcart doesn’t compile Clojure and never
will. Clojure has stuff like eval
, eval-string
, resolve
,
load-string
, load
and all the bindings
stuff that allow Clojure
programmers to reach around the compiler’s back and change bindings
and definitions at runtime. These structures are not and never will be
supported by Oxcart because supporting them would require disabling
optimizations. Oxcart also doesn’t support non-def forms at the top
level. Oxcart programs are considered to be a set of defs and an entry
point. Oxcart also assumes that definitions are single. Redefining a
var is entirely unsupported, abet not yet a source of warnings.
Some of these differences are sufficiently extreme that I’m honestly
on the fence about whether Oxcart is really Clojure or some yet
undefined “Oxlang” more in the style of Shen, but for now I’ll stick
to building a prototype :D
^D