Formalism In Software 'Engineering'

This post was originally authored for the UTCS Facebook group, in the context of discussing the role of formalism in software “engineering” as regards this article from BuisnessInsider arguing that “The title “engineer” is cheapened by the tech industry.”

Another article in the same vein.

The argument at hand is that “Software Engineering” isn’t “real engineering” because we play it fast, loose and don’t use formalisms to the extent that other fields do. So lets consider formalisms in the context of computer science.

They’re pretty great. We can prove that functions are with respect to the proof criteria done and correct for all time. Sorting algorithms, Boyer-Moore, data structures and soforth do well defined things correctly and fall in this category.

On the other hand, most of the work we’ve done in the direction of formalisms is to be polite ultimately useless. We can’t do general proof, forget automate general proof (by application of the halting problem). Even without the halting problem we have logic incompleteness. So really I think that saying we have formalisms in general software engineering is pretty silly. The classical “formal” tools we have suffer from immense limitations and while some heros may succeed with them anyway, they are far from the realms of mortals.

So what of the claim that we do “software engineering”? If you dig very far into software engineering literature, you’ll find that it’s kinda a sham. I’ll direct you here to Parnas among other authors such as Brooks who was previously mentioned to support this point. The fundamental problem in software engineering isn’t designing or writing software. That’s easy, we teach undergrads and nontechnical people to program all the time. The real problem we face is coping with the long term consequences of change in a software system.

Users don’t really understand what it is that they want most of the time. If you look at the development of large software systems, you’ll find a huge amount of learning occurs as you build something. As you write a program, you are forced to explore the problem domain thereof. In doing so, you will unavoidably learn more about the problem and discover things you should have designed differently. Worse, users seeing intermediary results will get a better understanding for what it is that they want and start asking for it. So we are fated to loose from both sides. The only way to win is to reduce your iteration time both in terms of cleaning up your mess and getting something in front of your users faster. This is not somehow a problem unique to software development, it is shared by all engineering. The real draw of modern automated manufacturing is just this, that design improvements can be rolled out faster.

What is unique to software is that when we “deliver” I think it’s silly to say that whatever we’ve built is “done”. When we say that we’re “done” with a ship and the last rivet is fitted, the cost of changing that ship is enormous. Similarly when the last brick of a bridge is laid, the cost to tear it up and start over again is prohibitive. In contrast, we work with programmable machines which practically speaking never decay and whose behavior is anything but fixed at the time of “manufacture”.

Because of the huge change costs, vast amounts of time in traditional engineering is spent gathering and evaluating design criteria because it must be. The price of getting it wrong is simply too high. Unless you’re a Defense contractor you can’t afford to deliver the wrong thing late and over budget (ayyyy). Instead you have to figure out what the right thing is and how to build it right the first time on time and under budget if possible. Case in point, during the development of the digital telephone, Bell Labs spent a huge amount of money doing user studies. They examined different dialing keypad layouts and discovered the layout we use today. They tested human hearing to understand how much (read how little) information was required to support acceptable calls. And on and on, mostly done ahead of time due to the huge investments involved.

We have the opposite problem. It costs us next to nothing to change our design as our design criteria evolve. This means that we can afford to play fast and loose. We can always patch it later. Look at Tesla who’s famously OTA updating their fleet of cars to of all things add features that weren’t present when their customers purchased the cars. It’s literally more time (read cost) efficient for us to bang out a release and get it in front of users to try and refine our conception of what they want than it is for us to gather requirements and do an in depth design.

An interesting case study is Chris Granger’s programming language Eve which was “designed” (maybe discovered is the better word) by both doing a ton of research on prior art and user study after user study to try and understand how users were learning and making use of Eve. Critically, changes were made not just to the UI but to the semantics of the language as exposed to users to try and smooth out the pain points while preserving utility. This stands in stark contrast to langauges like Pascal which came more or less fully formed into the world and have changed little in response to requirements or user studies.

Similarly modern “continuous deployment” and “continuous testing” infrastructure is laudable because it helps programmers get software out the door faster. This tighter feedback loop between users and programmers helps us refine the “right thing” and deliver that thing. Faster is better. Sound familiar?

Move fast and break things

This is where we need to come back to formalism. Because this really isn’t working out for facebook and it won’t work out for you. While I will yet write an article on breaking changes and the necessity of really breaking software, the majority of breaking changes in the above sense are bugs and are accidental rather than essential. We are, as the other engineers accuse us, really bad at doing software development. We need a way to cope with change and ensure stability at the same time. If we choose our abstractions well through experience and art, we can get some of this. But there are very few people with the art to make design decisions which will mitigate real unknown future change. What we really need are tools which help us control and understand the impacts of our changes.

One of these tools is obviously testing and code coverage. We can understand the impact of our changes or our failure to change in terms of what tests we’ve broken or failed to fix. We can also understand changes in terms of its impact on the coverage of our programs. If we add a bunch of new code that isn’t tested, then our tests explore less of the behavior of our programs and are less valuable and the indicator of test coverage can tell us this. If we delete a bunch of code that wasn’t tested or wasn’t used, then we’ve reduced the state space of our program (which is a great thing!) and our coverage should go up to reflect this.

These are all really powerful tools, but they are more the practices and methodologies which have gotten us this bad name of not being real engineering. What of proof as a tool?

Our medium being inherently different and the cost mechanics also being different I think that any bemoaning of software correctness going out the window is plainly silly. Of course software won’t be “correct”. Much software is too complex for any meaningful “correctness” criterium to be established. It’s far more efficient to build something that sorta works and iterate than to obsess over getting it right. Proof, small software shops that can iterate quickly are eating the big slow movers lunches if you care to look at the last few decades of our industry’s history.

This suggests that formalism needs to take a different place in software engineering. It needs to exist not as an engine for absolute correctness, but as a tool to be used to help enable iteration. Type directed programming and error prediction in Haskell is I think a great and under explored use of this. If you want to make a change, make that change and run the typechecker (proof engine with regards to whatever properties you encoded in the type system) and all the errors you get are the places where your existing program needs to be changed in order to accomodate the desired change. This is not general correctness, this is not halting, this is just a tool for letting you as an engineer track the criteria which make what you’re building good enough and whether they are preserved.

This conception of lightweight formalism(s) is I think the future of the industry. Real theorem proving has its place in hardware design and in other corners of the industry where the cost benefit tradeoffs are again different, but it’s simply too effort intensive to be practicable for the “average” programmer working on yet another Uber for X.

So. Parnas. Fake it. Ship it. Think about it. Fake it again and again. Use formalisms to help you fake it. But formalism for any classical value of formalism is likely nonsense unless you have good reason otherwise.

What are we doing here by all this pushing around of constraints?

I think this is engineering.