Software company in accordance with the theory of constraints, Part 1
This is going to be a slightly different article from my usual technology stories. It’s going to be, weirdly, about project management in software organisations. Being in the CTO role for the past three years has taught me a lot and challenged my assumptions in many ways, and now I want to share my outlook on the problem of managing software organisations in general and project management in particular.
It is going to be a three-part story. The first two parts will basically be rants about things that “they don’t teach us in school”, where I challenge some of the assumptions that we take for granted in the software industry. In part three, we will look at a more practical implementation of those ideas, the kind that makes for a better functioning software organisation.
Considering that this is still a hotly debated topic, and everyone feels entitled to an opinion, here is a bit of a trigger warning: I’m going to say things you might find either too trivial or categorically wrong. Just breathe in and out slowly. I’m not challenging your skills and knowledge here; it’s not even about you, really. It is about those who are lost at sea and want to find their way back home.
Oh, and by the way, I stole all of this from various books and personal experiences. The bulk of it comes from the works of Eliyahu M. Goldratt. Take it for what you prefer; I’m just reassembling it here in a way that makes sense to me, and hopefully to you as well.
Okay, without further ado, let’s dive into it.
The assumptions we take for granted
Common sense is not actually that common
— Eliyahu M. Goldratt
The interesting thing about software engineers is that we love making technology and, at the same time, we hate managing the outcomes of our work. I find this fascinating. Talk to any programmer serious about their profession and they will speak endlessly about the nuts and bolts of software, the underlying rules, why things should be done a specific way, where we’re going in technology, and so on. Then ask the same person how they manage the outcomes of their work, and you’ll get a blank stare.
Okay, to be fair, some people are a bit more enlightened. They will tell you that they use SCRUM, kanban, agile, or whatever else might look great on their CV at the moment. Some might even articulate what “lean” really means. Yet, if you start poking at those beliefs with basic “whys?”, you will inevitably run into a lack of understanding of the principles that make an organisation work. People often just do what they were told on the process management side, without questioning the dogma that much. And, sooner or later, when you make them really think about it, you will start hearing the same list of excuses:
Software development is too complex to be predictable.
Software is not manufacturing, it is more of an art form.
Software projects should be treated as a creative undertaking rather than an assembly pipeline.
And the list goes on and on.
Truth be told, I said all the same things for the bulk of my career too. This idea that software engineering is “special” is so ubiquitous in our industry that nobody seems to question it. And yet, the more I work as an engineer, the more convinced I become that there is no real evidence behind this assumption other than our own failures in delivering software projects.
Don’t believe me? Here, try to answer these questions, or go ask any software engineer on your team the same thing:
A construction company is developing a residential building. It’s a two-year project with hundreds of people involved. It includes the building construction itself, load bearing and all. It has electrical systems, communications, water supply, sewage and trash disposal systems, interior design, all sorts of people traffic management, fire alarms and evacuation processes, car parking, noise reduction considerations, recreational areas, energy efficiency concerns, features for people with special needs, integration into the local community, etc. And that’s just the beginning. The whole thing is made out of real-life materials of varying quality, things go wrong all the time, suppliers mess up, contractors drop out, people come and go, and even the plans change all the time following customer requests. The question is: how come they usually deliver all this complexity of a multi-year project on time and on budget, while we often can’t deliver a 4-week React.js project involving 5 people on time?
Here is another one. A person has been in a car crash and was admitted to an ER facility. The person has a broken limb, a severe concussion, diabetes, heart disease, unknown allergies, and significant loss of blood, and in addition they are knocked out cold and completely unresponsive. Do you honestly believe that the complexity of making a web app out of HTML and CSS, with 95% of your codebase coming pre-made from open source, is anywhere near the complexity of nursing this person back to health, with several distinct specialities and processes involved? And yet, a modern ER has an over 90% success rate of nursing patients in critical condition back to normality in the span of several weeks. In software, though, if you ship 60% of your features on time, you’re basically a god of engineering and hugely above the average.
And the list goes on and on: car manufacturing, social changes, science and R&D departments, music festivals, building highways and roads, space travel, agriculture, wars, etc. They all have complexities beyond a single human’s understanding, tightly linked with randomness, and yet we’ve been doing them successfully for decades.
So, let’s dial back a bit and ask ourselves the same question. Is software engineering that much more unpredictable than other things in the real world? Is it so special that no normal rules apply? Or maybe, just maybe, it is a self-fulfilling prophecy based on nothing but our own ignorance? We don’t know how to deal with complexity and randomness effectively, and we deny ourselves learning experiences by keeping ourselves exempt from the “real life” rules, and so we keep coming back to where we were: chronically delayed projects and burned-out engineers.
I am personally starting to believe that the “software is special” mindset is a variation of the “dog ate my homework” excuse, and that we can do significantly better than that. But, before we go into solutions, let’s step back and understand what the problem we’re dealing with looks like.
What is the purpose of an organisation?
If you go to different people in an organisation and ask them “what is the purpose of the organisation and your role in it?”, you will get all sorts of answers. A CEO will say “profits”, sales will say “to make sure customers buy”, engineers will say “ship quality solutions”, project management will say “timelines”. It’s actually a sort of fractal problem: if you go inside a project team, for example, different roles will give you different answers as well.
The challenge here is to grasp the underlying principle. Without understanding the mechanics of a machine, it is very difficult to optimise it. Without that perspective, all you can do is get stuck messing with local optimums. And that’s what we mostly do in software: sales promise god knows what, engineering is focused on code quality, and project management is stuck in between, trying to please everyone in the most efficient way.
The problem is that a system made out of optimal parts doesn’t make for an optimal system. The simplest example is air travel: airplane maintenance is highly optimised, airports are highly optimised, security is highly optimised, transport to and from the airport is highly optimised too. Yet, when was the last time you had an actually pleasant experience flying somewhere? The same is true for a software company: it can have highly optimised departments, but the overall result can be, and often is, well, frustrating.
If you want a simple answer to this question, then the purpose of a software organisation is to make money, plain and simple. We can argue back and forth about the various aspects of an organisation, but the simple fact remains: if you’re not making money, you’re not going to last long, regardless of how good your software and processes are.
Step into the void
I think everyone will agree that “making money” is not the best way of looking at the problem; it doesn’t give us enough understanding of the inner workings of an organisation to make improvements. If all you do is look at profits, all you can optimise is costs and margins. We need to structure the problem further, and here is where the theory of constraints chimes in.
In accordance with the theory of constraints, one can look at an organisation as a supply chain. In the real world, suppliers in a chain sell to each other all the time, and they can record “profits” by treating each other as customers. But, in reality, unless the last company in the chain actually sells to an end user, no one really makes sales; not in the long term anyway. The products just accumulate within the system.
Just as with a real world supply chain, any part of an organisation, or even all of its sections can excel at what they do and register as a “success”. And yet, unless the customer buys what a company creates, nobody really makes profits.
To simplify the problem and step away from confusion of costs and investments and who owes whom, the theory of constraints steps into more of the systems thinking territory and operates with the three top level concepts:
Throughput: how quickly customers get their product in the quantities required
Inventory: how many products and unfinished parts are stuck in the system
Cost of operation: how much it costs to run the organisation
In classical organisation optimisation processes, people are usually concerned with cost optimisation. And generally it makes sense: the less you pay to produce the same products, the better your margins and the more profit you make. The reality of the matter is, though, that cost accounting is the least effective way to manage supply chains. Moreover, if taken too far, cost accounting can suffocate critical parts of the delivery chain and be downright detrimental to the effectiveness of the overall system.
If the goal of a company is to make money then, in terms of the structure above, we can say that the goal of an organisation is to maximise the throughput. And that is what most of the contemporary LEAN, and subsequently agile, methodologies are primarily focused on. Where the theory of constraints takes it slightly further is that it adds into consideration the impact of inventory on the throughput of an organisation.
The basic premise of the theory of constraints is that the more inventory a system has trapped inside, the less throughput it can achieve, and the higher its cost of operation will be. The name of the game is to drive the inventory down to improve the throughput, while keeping the cost of operation as low as possible.
How does it work in real life?
Since this theory grew out of the work of Taiichi Ohno, the father of the widely praised Toyota Production System, it would be fitting to explain it from the perspective of a car manufacturing process.
Let’s say you have a car manufacturing plant. It has three work centres that produce spare parts, a heat treatment facility, and an assembly line, all linked like so:
Wheels ----------------------------> Assembly -> finished car
Bodies ----> Heat treatment -----------^
Engines ----------^
Wheels go into assembly directly, but Bodies and Engines need heat treatment, which is a slow process that is done in large batches and can only process one type of part at a time. And, obviously, one needs all three parts to assemble a car.
Now imagine the problem where, say, the Engines facility outproduces the Bodies facility and starts clogging up the heat treatment machinery, which prevents Bodies from going through as quickly as they should. Naturally, the assembly line will start having too many Wheels, a normal or excessive amount of Engines, and not enough Bodies. Although all the facilities in this system work at their full capacity, the throughput of the overall system is less than ideal, because the assembly line does not have enough Bodies to assemble the finished cars.
The critical insight that the theory of constraints makes is that the problem will manifest as an ever-growing inventory of spare parts within the system. Wheels will pile up in front of the assembly line, and Bodies will pile up in front of the heat treatment facility. To improve the throughput of the system, one needs to focus on the constraints of the system: in this case, the capacity and/or efficiency of the heat treatment facility, and maybe the production volume of Wheels and Engines.
As a result, driving down the inventory of parts that are stuck in the system will produce a more efficient plant with better throughput, and reduce the cost of operation by limiting the overproduction of excess parts. It will also free up resources, as spare parts cost money in raw materials, and if they’re stuck in the system, the money is stuck in the system as well.
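To make the plant a bit more tangible, here is a minimal sketch of it as a toy simulation in Python. Everything in it is my own assumption for illustration: the production rates, the heat treatment capacity, the rule that the oven takes whichever queue is longer, and the idea that assembly is never the limit. None of these numbers come from a real plant.

```python
from dataclasses import dataclass


@dataclass
class Plant:
    wheels: int = 0       # Wheels waiting at assembly
    raw_bodies: int = 0   # Bodies queued for heat treatment
    raw_engines: int = 0  # Engines queued for heat treatment
    bodies: int = 0       # treated Bodies waiting at assembly
    engines: int = 0      # treated Engines waiting at assembly
    cars: int = 0         # finished cars, i.e. the throughput

    @property
    def inventory(self) -> int:
        """Every part that is stuck somewhere inside the plant."""
        return (self.wheels + self.raw_bodies + self.raw_engines
                + self.bodies + self.engines)


def simulate(ticks, wheel_rate=10, body_rate=6, engine_rate=10,
             heat_capacity=12):
    """Run the plant for a number of ticks (all rates are hypothetical)."""
    plant = Plant()
    for _ in range(ticks):
        # Every work centre produces at its own full speed.
        plant.wheels += wheel_rate
        plant.raw_bodies += body_rate
        plant.raw_engines += engine_rate

        # Heat treatment handles one part type per tick, in a batch of
        # at most heat_capacity parts. Assumed rule: longer queue first.
        if plant.raw_engines >= plant.raw_bodies:
            batch = min(plant.raw_engines, heat_capacity)
            plant.raw_engines -= batch
            plant.engines += batch
        else:
            batch = min(plant.raw_bodies, heat_capacity)
            plant.raw_bodies -= batch
            plant.bodies += batch

        # A car needs one of each part; assembly itself is never the limit.
        built = min(plant.wheels, plant.bodies, plant.engines)
        plant.wheels -= built
        plant.bodies -= built
        plant.engines -= built
        plant.cars += built
    return plant


scenarios = [
    ("everyone at full speed", {}),
    ("upstream matched to the constraint", {"wheel_rate": 6, "engine_rate": 6}),
]
for label, rates in scenarios:
    plant = simulate(40, **rates)
    print(f"{label}: cars={plant.cars}, inventory={plant.inventory}")
```

In this toy model the plant can never finish more than 6 cars per tick, because only 6 Bodies are made per tick. With Wheels produced at 10 per tick, at least 4 of them are added to the pile in front of assembly every single tick. Slow Wheels and Engines down to the same 6 per tick and the inventory stops growing, while the number of finished cars stays bound by the same limit.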
What about software organisations though?
It is time for us to return to the world of software development. And here is the problem: we don’t buy raw materials, and we could also argue that we rarely even move spare parts between different departments these days; we killed silos and waterfalls in 1999. Moreover, software engineering is a creative endeavour, right? It can take god knows how long. We can’t just apply those ideas directly from manufacturing!
Or can we? I would argue that a modern software organisation built after contemporary LEAN practices is basically a straight-up supply chain that looks somewhat like this:
design -> project management -> engineering -> QA -> marketing -> sales
Okay, it is often a bit more complicated than this; not everything goes through design or QA, for example. Yet, if you squint hard enough, you can see that “something” travels through an organisation from ideation to the features the end user receives. And so, we can translate the theory of constraints terminology to the world of software organisations as follows:
Throughput: how quickly features are shipped to the end users
Inventory: how many features are stuck in the backlog, design or development
Cost of operation: salaries + cost of infrastructure
By the “features” here, I obviously mean everything that travels through the system in order for the end user to receive what they’re paying for. Those include: documentation, user features, design and usability improvements, bug fixes, technical debt, infrastructure management, etc.
And now you can apply the same principles to a software organisation as you would to a car manufacturing plant. Your goal in managing the company well will be the same: to reduce the inventory in order to improve the throughput.
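As a rough sketch of that translation, the three measures can be computed from whatever tracker you already use. The `WorkItem` type, the stage names and all the numbers below are hypothetical, made up purely for illustration; substitute your own.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    title: str
    stage: str  # assumed stages: backlog, design, development, qa, shipped


def throughput(items, weeks):
    """Average number of items shipped to end users per week."""
    shipped = sum(1 for item in items if item.stage == "shipped")
    return shipped / weeks


def inventory(items):
    """Everything that has entered the system but has not been shipped."""
    return sum(1 for item in items if item.stage != "shipped")


def cost_of_operation(monthly_salaries, monthly_infrastructure):
    """Salaries plus cost of infrastructure, per month."""
    return sum(monthly_salaries) + monthly_infrastructure


# Hypothetical snapshot of a small team after a 4 week period.
items = [
    WorkItem("login page", "shipped"),
    WorkItem("password reset bug", "shipped"),
    WorkItem("billing export", "qa"),
    WorkItem("upgrade ORM", "development"),
    WorkItem("onboarding redesign", "design"),
    WorkItem("dark mode", "backlog"),
    WorkItem("API docs", "backlog"),
]

print("throughput per week:", throughput(items, weeks=4))
print("inventory:", inventory(items))
print("cost per month:", cost_of_operation([6000, 6500, 7000], 1200))
```

Note that bug fixes, technical debt and documentation are counted the same way as user features here. Anything that has to travel through the system counts as inventory until it ships.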
The symptoms of a failing software organisation
Let’s apply some of the structures from the previous chapter to identify classical problems in a software organisation.
For example, let’s look at good old technical debt. In accordance with the theory of constraints, technical debt goes onto the company’s inventory; it is a thing for the engineers to do. In this respect you don’t even need to really understand where it occurs or why: the sheer fact of its existence bloats your inventory and hence reduces your throughput. There are no two ways around it. If you want your organisation to move fast, you need to shave technical debt off.
A similar story happens with bugs. If a significant part of your inventory consists of bugs, a significant portion of your resources will be consumed by them, and your throughput in shipped features will be way less than it could be. If you want your organisation to perform well and ship fast, don’t do bugs: write tests religiously and automate everything.
Another symptom, and admittedly my favourite, is something called “feature debt”. It is often the case that an organisation has very little technical debt and very few bugs, accompanied by a huge backlog of features to make. And guess what, your backlog is inventory too! If things are piling up in there, everything new will have to go to the end of the queue, and your throughput will suffer. It’s a clear sign that you need to either increase development capacity or exercise “laser focus”.
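A back-of-the-envelope sketch shows why a long backlog hurts. It assumes a strict first-in, first-out queue and a steady weekly capacity, which is a simplification of any real team, and the function names and numbers are made up for illustration.

```python
def weeks_until_started(backlog_size, weekly_capacity):
    """How long a brand new item waits at the end of a FIFO backlog."""
    if weekly_capacity <= 0:
        raise ValueError("weekly_capacity must be positive")
    return backlog_size / weekly_capacity


def backlog_after(weeks, backlog_size, weekly_arrivals, weekly_capacity):
    """Backlog size after some weeks of steady arrivals and deliveries."""
    for _ in range(weeks):
        backlog_size = max(0, backlog_size + weekly_arrivals - weekly_capacity)
    return backlog_size


# Hypothetical team: 60 items queued, 5 delivered per week.
print(weeks_until_started(60, 5))   # a new idea waits 12 weeks
print(backlog_after(10, 60, 8, 5))  # more arrivals than capacity: it grows
print(backlog_after(10, 60, 3, 5))  # "laser focus": it shrinks
```

The two levers in the code are the same two named above: raise `weekly_capacity`, or lower `weekly_arrivals` by saying no more often.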
There can be more subtle problems in software organisations that manifest through inventory as well. Take, for example, the existence of expeditors in an organisation, which is always a bad sign. It means that things don’t go through the system as fast as desired and, instead of fixing the problem, someone got impatient and started clubbing people with a police baton. Why doesn’t this work? Simple: by expediting one thing over others, those people disrupt the flow of features that goes through the company. This, inevitably, drives the inventory up, as people drop unfinished work to switch priorities. And, by now, I hope you know what growing inventory means. That is correct: your throughput will suffer.
Now you know why the ever-tempting “let’s pause everything and refactor” approach rarely leads to anything good. It’s a form of expediting: you’re rushing in something you personally deem important at the expense of the overall system performance.
Conclusion, I suppose
If you boil this problem down to its essence, it’s really easy to trivialise the situation and start thinking about it in terms of “well, doh! just remove bottlenecks”. It all might start to look like common sense. But, reality is a bit more complex than that, and common sense is not actually that common.
Bottlenecks in systems rarely appear because someone did a substandard job. Quite the opposite: system constraints are usually the result of very smart people working hard to get their jobs done. There are always reasons why things are the way they are. More often than not, it is very easy to get lost in all those requirements and reasons, and miss the symptoms of failure.
Moreover, without clearly seeing how inventory moves through a company, it is difficult to understand what is and what is not a bottleneck. In this situation, you will decide that something is a bottleneck if it performs below some arbitrary level. Which will naturally lead one on the path of cost accounting and local optimums; exactly the situation you want to move away from.
The interesting fact is that most of the older industries exhibited the same symptoms as us in the past. The existence of expeditors, ever-failing project timelines and burned-out employees: none of those are specific to software development. And they fixed a lot of those too. A significant portion of those changes came from the ideas pioneered by Taiichi Ohno at Toyota. They started small, by looking at the flow of inventory in their production lines, and introduced the concept of LEAN manufacturing.
Meanwhile, the software development industry keeps excusing itself for being young and rebellious. We do borrow heavily from Japanese automobile makers: LEAN, kanban, agile, SCRUM all have the same roots. The problem is that, oftentimes, those are carbon copies of solutions rather than more conscientious attempts to really understand how software organisations function. And so we go on forever oscillating between autonomy and centralised control, in the hope of fixing that which we cannot see.
Fortunately, the theory of constraints steps a bit away from specific tactics and tries to create a more abstract framework, one that can provide guidance and an understanding of the same innate structures, and that can be applied across industries and technologies. We had a bit of a sneak peek into its methods in this article. In the next one we will go a step further and take a look at the way complexity and randomness coexist and influence each other: why our best-laid plans often don’t work, why SCRUM is failing you, and why our ubiquitous obsession with better estimations is a futile waste of everyone’s time.
Oh, wow. This turned into a bit lengthier read than I originally planned. But, regardless, I hope I have managed to demonstrate that we can borrow heavily from other industries to improve software organisations. Well, at least I hope I have planted a seed of doubt in your mind with this article. So, please, put a comment below and let me know what you think. Or shoot me a message on twitter. I’m curious what you make of this.
And finally, if you’re a part of a meetup or an organisation that is passionate about this sort of thing, and you want me to come over and give a talk on the subject, shoot me a message. I’m always happy to give talks.