Oxcart going forwards
11 Dec 2014
When I last wrote about Oxcart, work pretty much went on hiatus due to my return to school. As there has been some recent interest in the status of Lean Clojure overall, I thought I’d take the opportunity to review the state of Oxcart and the plan for Oxcart going forwards.
Oxcart and Clojure
The static linking approach and lack of run time dynamism found in Oxcart are explicitly at odds with the philosophy of core Clojure. Where Clojure was designed to enable live development and makes performance sacrifices to enable such development as discussed here, Oxcart attempts to offer the complementary set of trade-offs. Oxcart is intended as a pre-deployment static compiler designed to take a working application and, to the greatest extent possible, wring more performance out of unchanged (but restricted) Clojure, as PyPy does for Python. As Oxcart explicitly avoids the dynamic bindings which Clojure embraces, Alex Miller, the Clojure Community Manager, has repeatedly stated that he expects to see little cross pollination from Oxcart and related work to Clojure itself.
This would be all well and good, were it not for the existing behavior of
Clojure’s clojure.lang.RT class. As currently implemented in Clojure 1.6 and
1.7, RT uses its <clinit> method to compile the following resources with
clojure.lang.Compiler:
- “clojure/core”
- “clojure/core_proxy”
- “clojure/core_print”
- “clojure/genclass”
- “clojure/core_deftype”
- “clojure/core/protocols”
- “clojure/gvec”
- “clojure/instant”
- “clojure/uuid”
These represent about 10799 lines of code, all of which could easily be
statically compiled and, most importantly, tree shaken ahead of time by Oxcart
or another tool rather than being loaded at boot time. This also means that the
unavoidable booting of Clojure itself from source can easily dominate loading
user programs, especially after static compilation to raw classes. A quick
benchmark on my machine shows that booting a Clojure 1.6 instance and loading a
ns containing only a -main that only prints “hello world” takes ~2.8 seconds
from source, compared to ~2.5 seconds booting the same program compiled with
Oxcart, suggesting that the cost of booting Clojure is the shared ~2.5 second
boot time. This is the test.hello benchmark in Oxcart’s demos.
$ git clone git@github.com:oxlang/oxcart.git &&\
cd oxcart &&\
git checkout 0.1.2 &&\
bash bench.sh test.hello
Running Clojure 1.6.0 compiled test.hello....
Hello, World!
real 0m1.369s
user 0m3.117s
sys 0m0.083s
Oxcart compiling test.hello....
Running Oxcart compiled test.hello....
Hello, World!
real 0m1.212s
user 0m2.487s
sys 0m0.073s
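The shared boot cost follows from how the JVM treats static initializers: the first use of a class runs its whole initializer before anything else can happen. Here is a minimal Java sketch of that behavior. It uses a hypothetical EagerRuntime class and a placeholder loadResource method, and takes nothing from Clojure's actual RT source:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a runtime class; this is NOT clojure.lang.RT.
final class EagerRuntime {
    // Records which "resources" were loaded, in order.
    static final List<String> LOADED = new ArrayList<>();

    // Static initializer: the JVM runs this once, in full, when the class
    // is first initialized (it is compiled into the <clinit> method).
    static {
        for (String resource : new String[] {"clojure/core", "clojure/core_print"}) {
            loadResource(resource);
        }
    }

    // Placeholder (assumption): a real runtime would read and compile
    // the named resource here instead of just recording it.
    static void loadResource(String name) {
        LOADED.add(name);
    }

    // Even a trivial static call forces initialization, and with it
    // every load performed in the static block above.
    static String greet() {
        return "Hello, World!";
    }

    public static void main(String[] args) {
        // By the time main starts, the static block has already run.
        System.out.println(greet());
        System.out.println(LOADED);
    }
}
```

However little of the runtime a program uses, it pays for everything the initializer does, which is why a compiler that only transforms user code cannot remove this cost.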
Then there’s the test.load benchmark. This benchmark as-is strains credulity
because it compiles 502 functions, of which only -main, which uses none of the
other 501, will be invoked. This reflects more on program loading time than on
the loading time of “clojure/core”, but I still think it is instructive in the
costs of boot time compilation, showing a ~7s boot time for Clojure compared to
a ~2.5s boot time for Oxcart. As arbitrary slowdowns from macroexpansions which
call Thread/sleep would be entirely possible, I consider this program within
the bounds of “fairness”.
A Fork in the Road
There are two solutions to this limitation, and both of them involve changing
the behavior of Clojure itself. The first is my proposed lib-clojure refactor.
Partitioning Clojure is a bit extreme, and in toying with the proposed
RT → Util changes over here I’ve found that they work quite nicely even with a
monolithic Clojure artifact. Unfortunately there seems to be little interest
from Clojure’s Core team (as judged via Alex’s communications over the last few
months) in these specific changes or in the static compilation approach to
reducing the deployment overhead of Clojure programs. The second is to fork
Clojure and then make the lib-clojure changes, which solves the problem of
convincing Core that lib-clojure is a good idea but brings its own suite of
problems.
Oxcart was intended to be my undergraduate thesis work. While the 16-25% speedup
previously reported is impressive, Oxcart does nothing novel or even interesting
under the hood. It only performs four real program transformations: lambda
lifting, two kinds of static call site linking and tree shaking. While I suppose
this is impressive for an undergrad, this project also leaves a lot on the table
in terms of potential utility due to its inability to alter RT’s unfortunate
loading behavior. I also think there is low hanging fruit in doing unreachable
form elimination and effect analysis, probably enough that Oxcart as-is would
not be “complete” even were its emitter more stable.
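Of the transformations named above, tree shaking is the simplest to sketch: start from the entry point, follow call edges, and keep only what is reached. The Java below is an illustrative toy, not Oxcart's implementation. The TreeShake class, the call graph representation (var name mapped to the names it calls) and the example var names are all assumptions made for the sketch:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; the graph shape is an assumption for illustration.
final class TreeShake {
    // Returns every definition reachable from `root` by following call edges.
    // Anything absent from the result is dead code and can be dropped.
    static Set<String> reachable(Map<String, List<String>> callGraph, String root) {
        Set<String> seen = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(root);
        while (!work.isEmpty()) {
            String current = work.pop();
            // add() returns false for names already visited, so cycles terminate.
            if (seen.add(current)) {
                for (String callee : callGraph.getOrDefault(current, List.of())) {
                    work.push(callee);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
            "app/-main", List.of("app/greet"),
            "app/greet", List.of("clojure.core/println"),
            "app/unused", List.of("app/greet"));
        // app/unused is never reached from app/-main, so it is shaken out.
        System.out.println(reachable(graph, "app/-main"));
    }
}
```

A reachability pass like this is only sound when call targets are known statically, which is why it pairs naturally with static call site linking and sits poorly with run time rebinding.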
I’m reluctant to simply fork Clojure, mainly because I don’t think that the
changes I’ve been kicking about for lib-clojure actually add anything to Clojure
as a language. If I were to fork Clojure, it’d be for Oxlang, which actually
seeks to make major changes to Clojure, not just tweak some plumbing. But
writing a language so I can write a compiler is frankly silly, so that’s not
high on the options list. The worst part of this is that forking Clojure makes
everything about using Oxcart harder. Now you have dependencies at build time
(all of “stock” Clojure) that don’t exist at deployment time (my “hacked”
Clojure). Whatever hack that requires winds up either complicating everyone’s
project.clj or living in an otherwise uncalled-for leiningen plugin, just like
lein-skummet. Tooling needs to be able to get around this too when every library
you’d want to use explicitly requires [org.clojure/clojure ...], which totally
goes away once Oxcart emits the bits you need and throws the rest out. Most of
all I don’t want to maintain a fork for feature parity as time goes on. However,
I also don’t see any other way to get around RT’s existing behavior, since the
RT → Util refactor touches almost every java file in Clojure.
Flaws in the Stone
Oxcart itself also needs a bunch of work. Nicola has done an awesome job with
tools.analyzer and tools.emitter.jvm, but I’m presently convinced that while it’s
fine for a naive emitter (what TEJVM is), it’s a sub-optimal substrate for a
whole program representation and for whole program transforms.
Consider renaming a local symbol. In the LLVM compiler infrastructure, “locals”
and other program entities are represented as mutable nodes to which references
are held by clients (say call sites or use sites). A rename is then simply an
update in place on the node to be changed. All clients see the change with no
change to their own state. This makes replacements, renames and so forth
constant time updates. Unfortunately, due to the program model used by
tools.analyzer and tools.emitter.jvm, such efficient updates are not possible.
Instead most rewrites degenerate into worst case traversals of the entire
program AST when they could be much more limited in scope. Cutaway is one
experiment in this direction, but it at best approximates what
clojure.core.logic.pldb is capable of. I hope that over Christmas I’ll have time
to play with using pldb to store, search and rewrite a “flattened” form of
tools.analyzer ASTs.
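The difference between the two program models can be shown with a toy example. The Java sketch below uses hypothetical Binding, Use and Node types that mirror neither LLVM's nor tools.analyzer's real data structures. It contrasts a rename through a shared mutable node with a rename over an immutable tree that stores names by value:

```java
import java.util.ArrayList;
import java.util.List;

// All types here are illustrative assumptions, not real compiler APIs.
final class RenameDemo {
    // Mutable binding node shared by every use site.
    static final class Binding {
        String name;
        Binding(String name) { this.name = name; }
    }

    // A use site holds a reference to the binding, not a copy of its name.
    static final class Use {
        final Binding target;
        Use(Binding target) { this.target = target; }
        String render() { return target.name; }
    }

    // Immutable tree node: the symbol is stored by value at each node.
    static final class Node {
        final String symbol;
        final List<Node> children;
        Node(String symbol, List<Node> children) {
            this.symbol = symbol;
            this.children = children;
        }
    }

    // Renaming in the immutable tree rebuilds it, visiting every node.
    static Node rename(Node node, String from, String to) {
        List<Node> children = new ArrayList<>();
        for (Node child : node.children) {
            children.add(rename(child, from, to));
        }
        return new Node(node.symbol.equals(from) ? to : node.symbol, children);
    }

    // Flattens a tree to a space-separated string, parent before children.
    static String render(Node node) {
        StringBuilder out = new StringBuilder(node.symbol);
        for (Node child : node.children) {
            out.append(' ').append(render(child));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Shared mutable node: one assignment, every use site sees it.
        Binding x = new Binding("x");
        List<Use> uses = List.of(new Use(x), new Use(x));
        x.name = "y";
        System.out.println(uses.get(0).render() + " " + uses.get(1).render());

        // Immutable tree: the whole structure is walked and rebuilt.
        Node tree = new Node("let", List.of(
            new Node("x", List.of()),
            new Node("inc", List.of(new Node("x", List.of())))));
        System.out.println(render(rename(tree, "x", "y")));
    }
}
```

The first rename costs one field write regardless of program size. The second costs a full traversal even though only two nodes change, which is the overhead a flattened, indexed representation would aim to avoid.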
Oxcart is out of date with tools.emitter.jvm and tools.analyzer. This shouldn’t
be hard to fix, but I just haven’t kept up with Nicola’s ongoing work over the
course of the last semester. This will probably get done over Christmas as well.
Oxcart doesn’t support a bunch of stuff. As of right now, defmulti, defmethod,
deftype, defprotocol, proxy, extend-type and extend-protocol aren’t supported.
I’m pretty sure all of these actually work, or could easily work; they just
didn’t get done in the GSoC time frame.
Finally, and I think this is the thing that’s really blocking me from working on
Oxcart: it can’t compile clojure.core anyway. This is a huge failing on my part
in terms of emitter completeness, but it’s a moot point because even if I could
compile clojure.core with Oxcart, RT is gonna load it anyway at boot time. I
also suspect that this is an incompleteness in the project as a whole which
probably makes it an unacceptable thesis submission, although I haven’t spoken
with my adviser about it yet.
The Endgame
As of right now I think it’s fair to call Oxcart abandoned. I don’t think it’s a worthwhile investment of my time to build and maintain a language fork that doesn’t have to be a fork. I talked with Alexander, one of the clojure-android developers and a fellow GSoC Lean Clojure student/researcher, about this stuff. The agreement we reached was that until 1.7 is released there’s no way that the lib-clojure changes will even get considered, and that the most productive thing we can do as community members is probably to wait for 1.8 planning and then try to sell lib-clojure and related cleanup work on the basis of enabling clojure-android and lean clojure/Oxcart. Practically speaking in terms of my time, however, if it’s going to be a few months until 1.7 and then a year until 1.8, that only leaves me my last semester of college to work on Oxcart against an official version of Clojure that can really support it. If that’s what it takes to do Oxcart, I’ll likely just find a different thesis project or plan on graduating without a thesis.
(with-plug That said, serious interest in Oxcart as a deployment tool or another contributor would probably be enough to push me over the futility hump of dealing with a Clojure fork and get Oxcart rolling.)
^d