On Future Languages

As Paul Phillips notes at the end of his talk We’re Doing It All Wrong [2013], a programming language is ultimately incidental to building software rather than critical. The “software developer” industry is not paid to write microcode; software as an industry exists to deliver business solutions. Similarly, computers themselves are by and large incidental: they are agents for delivering data warehousing, search and transmission solutions. Very few people are employed to improve software and computer technology for its own sake, compared to the hordes of industry programmers who ultimately use computers as a magic loom to create value for a business. In this light I think it’s meaningful to investigate recent trends in programming languages and software design as they pertain to solving problems, not to writing code.

As the last four or five decades of writing software stand witness, software development is rarely an exercise in constructing a perfect program, delivering it, and watching pleased as a client makes productive use of it until the heat death of the universe. Rather, software solutions, once delivered, are discovered to have flaws, or to solve a different problem than the client thought they would, or to not solve it quite right, or the problem itself changes. All of these changes to the environment in which the software exists demand changes to the software itself.

Malleability

In the past, we have attempted to develop software as if we were going to deploy perfect products forged of pure adamantium which would endure forever unchanged. This begot the entire field of software architecture and the top-down school of software design. The issue with this model, as the history of top-down software engineering stands testament, is that business requirements change if they are known, and must be discovered and quantified if they are unknown. This is an old problem with no good solution. In the face of incomplete and/or changing requirements, all that can be done, as Parnas argues, is to evolve software as rapidly and efficiently as possible to meet changes.

In the context of expecting change, languages and the other tools used to develop changing software must be efficient to re-architect and change. As Paul Phillips says in the above talk, “modification is undesirable, modifiability is paramount”.

Looking at languages which have enjoyed traction in my lifetime, the trend seems to have been that, with the exception of tasks for which the implementation language was itself a requirement, the pendulum of both language design and language use has been swinging away from statically compiled languages like Java, C and C++ (the Algol family) towards interpreted languages (Perl, Python, Ruby, JavaScript) which trade off some performance for interactive development and immediacy of feedback.

Today, that trend seems to be swinging the other way. Scala, a statically checked language based around extensive type inference with some interactive development support, has been making headway. Java and C++ seem to have stagnated but are by no means dead and gone. Google Go, Mozilla Rust, Apple Swift and others have appeared on the scene, also fitting into this intermediary range between interactive and statically compiled, with varying styles of type inference used to achieve static typing while reducing programmer load. Meanwhile the hot frontier in language research seems to be static typing and type inference, of which the liveliness of the Haskell ecosystem is ample proof.

Just looking at these two trends, I think it’s reasonable to conclude that interpreted, dynamically checked, dynamically dispatched languages like Python and Ruby succeeded at providing more malleable programming environments than the languages which came before them (the Algol family & co). However, while making changes in a dynamically checked language is well supported, maintaining correctness is difficult, because there is no compiler or type checker to warn you that you’ve broken something. This limits the utility of malleable environments, because software which crashes or gives garbage results is of no value compared to software which behaves correctly. Yet, as previously argued, software is not some work of divine inspiration which springs fully formed from the mind onto the screen. Rather, the development of software is an evolutionary undertaking involving revision (which is well suited to static type checking) and discovery (which may not be).
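To make the hazard concrete, consider a routine Clojure refactor. This is a minimal sketch with hypothetical names; the point is only that nothing complains until the stale call site actually runs.

```clojure
(defn area [w h]            ; original signature: two positional arguments
  (* w h))

(defn room-area []          ; a caller elsewhere in the codebase
  (area 3 4))

;; Later, a refactor changes the signature to take a map...
(defn area [{:keys [w h]}]
  (* w h))

;; ...and the stale caller still loads without a whisper of complaint.
;; The breakage surfaces only at runtime:
;; (room-area) => clojure.lang.ArityException: Wrong number of args (2)
```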

As a checked program must by definition always be in a legal state (correct with respect to the type system), this precludes some elements of development by trial and error: the type system will ultimately constrain the flexibility of the program, requiring systematic restructuring where an isolated change could have sufficed. I don’t argue that this is a flaw; it is simply a trade-off I note between allowing users to express and experiment with globally “silly” or “erroneous” constructs, and the strict requirement that all programs be well formed and well typed even when, with respect to some context or the programmer’s intent, the program may in fact already be well formed.

As program correctness is an interesting property, and one which static model checking (including “type systems”) is well suited to assisting with, I do not mean to discount typed languages. Ultimately, a correct program must be well typed with respect to some type system, whether that system is formalized or not.

Program testing can be used to show the presence of bugs, but never to show their absence!

~ Dijkstra (1970)

Static model checking, on the other hand, can prove the absence of flaws with respect to some system. This property alone makes static model checking an indispensable part of software assurance, as it cannot be replaced by any non-proof-based methodology such as assertion contracts or test cases.

Given this apparent trade-off between flexibility and correctness, Typed Racket, Typed Clojure and the recent efforts at typed Python are interesting, because they provide halfway houses between the “wild west” of dynamically dispatched, dynamically checked languages like traditional Python, Ruby and Perl, and the eminently valuable model checking of statically typed languages. They enable programmers to evolve a dynamically checked system, passing through states of varying soundness towards a better system, and then, once it has reached a point of stability, solidify it with static typing, property-based testing and other static reasoning techniques, without translating programs to another language which features stronger static analysis properties.
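A minimal sketch of that workflow with Typed Clojure’s core.typed, assuming its ann/check-ns API and a hypothetical function of my own invention: the function is written and exercised dynamically first, and the happy-path usage works fine. Only when the annotation is added and the namespace is checked does the overlooked nil branch surface.

```clojure
(ns example.hosts
  (:require [clojure.core.typed :as t]))

;; Evolved dynamically first; (lookup-host "prod") works at the REPL.
(t/ann lookup-host [t/Str -> t/Str])
(defn lookup-host [env]
  (get {"prod" "db.prod.example.com"
        "test" "db.test.example.com"} env))

;; Once the module stabilizes, solidify it:
;; (t/check-ns 'example.hosts)
;; Type Error: the actual return type is (t/Option t/Str), because
;; (get ...) returns nil for an unknown env -- a case no happy-path
;; test exercised, but one the checker refuses to ignore.
```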

Utility & Rapidity from Library Support

Related to malleability, in terms of ultimately delivering a solution to a problem that gets you paid, is the ability to get something done in the first place. Gone (forever, I expect) are the days when programs were built without external libraries. Looking at recent languages, package/artifact management and tooling which trivializes leveraging open source software has EXPLODED. Java has mvn, Python has pip, Ruby has gem, Haskell has cabal, Node has npm, and the Mozilla Rust team deemed a package manager so critical to the long-term success of the language that they built their cargo system long before the first official release or even release candidate of the language.
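Clojure’s Leiningen is the same story. As a sketch (project name and versions illustrative), the entire cost of buying into a routing library and a database driver is a few lines of project.clj; lein deps then pulls everything from Clojars or Maven Central.

```clojure
;; project.clj -- `lein deps` resolves and fetches all of this
(defproject webapp "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [compojure "1.3.1"]                ; HTTP routing
                 [com.novemberain/monger "2.0.0"]]) ; MongoDB driver
```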

Why are package managers and library infrastructure critical? Because they enable massive code reuse, especially of vendor code. Building a webapp? Need a datastore? The cost of buying into whatever proprietary database you may choose has been driven so low, by the ease with which $free (and sometimes even free as in freedom) official drivers can be found and slotted into place, that it’s silly. The same goes for less vendor-specific code: regex libraries, logic engine libraries, graphics libraries and many more exist in previously undreamed-of abundance (for better or worse) today.

The XKCD stacksort algorithm is a tongue-in-cheek reference to the sheer volume of free as in freedom (forget free as in beer) code which can be found and leveraged in developing software today.

This doesn’t just go for “library” support; I’ll also include FFI support here. Java, the JVM family of languages, Haskell, OCaml and many others gain much broader applicability from having FFI interfaces for leveraging the decades of C and C++ libraries which predate them. Similarly, Clojure, Scala and the other “modern” crop of JVM languages gain huge utility and library improvements from being able to reach through to Java and leverage the entire Java ecosystem selectively when appropriate.
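In Clojure’s case, “reaching through” is nearly free syntactically. A minimal sketch (the helper names are mine): matching text and computing a SHA-256 digest straight against the JDK, with no binding layer or wrapper library involved.

```clojure
(ns example.interop
  (:import (java.security MessageDigest)
           (java.math BigInteger)))

;; Clojure regex literals *are* java.util.regex.Pattern instances.
(defn emails [text]
  (re-seq #"[\w.]+@[\w.]+" text))

;; Calling straight into the JDK:
(defn sha-256 [^String s]
  (let [md (MessageDigest/getInstance "SHA-256")]
    (format "%064x" (BigInteger. 1 (.digest md (.getBytes s "UTF-8"))))))
```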

While it’s arguably unfair to compare languages on the quantity of libraries available, as this metric neglects functionally immeasurable quality and utility, the presence of any libraries at all is, in terms of potential productivity, a legitimate point of comparison against their comparative absence.

What good is a general-purpose building material, capable of constructing any manner of machine, when simple machines such as the wheel or the ramp must be reconstructed by every programmer? Not nearly as much as a building material providing these things on a reusable basis, at little or no effort to the builder, regardless of the ease with which one may custom-build such tools as needed.

So What’s the Big Deal

I look at this and expect two trends to come out of it. The first is that languages with limited interoperability and/or limited library bases are dead. Stone cold dead. Scheme (RⁿRS) and Common Lisp were my introductions to the Lisp family of languages. They are arguably elegant and powerful tools; however, compared to tools such as Python they seem to offer at best equal leverage, due to a prevailing lack of user-friendly, let alone newbie-friendly, library support compared to other available languages.

I have personally written 32KLoC in Clojure (more counting intermediary diffs) that I can find on my laptop. Why? Because Clojure, unlike my experiences with Common Lisp and Scheme, escapes the proverbial Lisp curse, simply thanks to tooling which makes sharing libraries and infrastructure cheaper than reinvention. Reinvention still occurs, as it always will, but the marginal cost of improving an existing tool vs. writing a new one is in my experience a compelling motivator for maintaining existing toolkits and software. This means that Clojure, at least, seems to have broken free of the black hole of perpetual reinvention, and consequently liberates programmers to attack real, application-critical problems rather than fighting simply to build a suitable environment.

It’s not that Clojure is somehow a better language; arguably it ultimately isn’t, since it lacks inbuilt facilities for many interesting static proof techniques. But that’s not the point. As argued above, the language(s) we use are ultimately incidental to the task of building a solution to some problem. What I really want is leverage from libraries, flexibility in development, optional and/or incremental type systems, and good tooling. At this task, Clojure seems to be a superior language.

The Long View

This is not all to say that I think writing languages is pointless, nor that the languages we have today are the best we’ve ever had, let alone the best we ever will have, at simultaneously providing utility, malleability and safety. Nor is it to say that we’ll be on the JVM forever due to the value of legacy libraries, or something equally silly. It is, however, to say that I look with doubt upon language projects which do not have the backing of a “major player”, a research entity or some other group willing to fund long-term development in spite of short-term futility, simply because the price of bootstrapping a “new” language into a state of compelling utility runs to man-years.

This conclusion is, arguably, my biggest stumbling block with my Oxlang project. It’s not that the idea of the language is bad; it’s that a trade-off must be carefully made between novelty and utility. Change too much and Oxlang will be isolated from the rest of the Clojure ecosystem, losing hugely in terms of libraries as a result. Change too little and it won’t be interesting compared to Clojure. Go far enough and it will cross the borders of hard static typing, entering the land of Shen, OCaml and Haskell, and, as argued above, I think sacrifice interesting flexibility for uncertain gains.