Versett Frontier Article - Three pillars of sustainable software
Note: This article was written during my time at Versett for the organization's Frontier series, and is posted here for posterity. I strongly recommend you visit the site and explore the writing of Versett whose past and present members have much smarter perspectives than my own.
If you ask business leaders if they want to move fast, the answer will be yes. It will likely be accompanied by a bit of confusion: who would want to move slowly? If you ask the same people if they’d like to move faster, the answer will most likely be the same. The world is in constant change, and so is the competition. This is especially true now that software is increasingly embedded in the core of most companies’ offerings.
What allows some companies in this space to move faster than others? Why do some companies hit a wall while others maintain momentum? While there is no silver bullet to success, a major factor is how well they are able to manage their technologies long-term.
The Three Pillar Model of Technology
A sustainable model can be described using three core pillars: growth, support, and stability. Think of these pillars as needs that must be balanced like the legs of a three-legged stool.
The first pillar is growth. In order to grow, you build: fresh product features, initiatives, marketing campaigns. The implementation of new or improved offerings is necessary to retain customers and create competitive moats. The second pillar, support, is driven entirely by the first. This could be customers asking for clarification or reporting issues that need to be addressed. These needs scale with continued investment in the first pillar because the more ways a user can interact with your organization, the more potential there is for user or system error.
Almost all day-to-day operations are in service of one of these two pillars. Logistics are a customer support mechanism, while analytics and reporting exist to inform growth. That means the third pillar, stability, is often invisible because it is entirely dedicated to supporting the other two.
The Forgotten Pillar: Stability
Stability work means continuous investment in keeping systems running smoothly so that all other operations remain uninterrupted. In order to manage this effectively it’s valuable to understand two primary areas where complexity compounds and how that impacts support: bugs and system design.
System Bugs
There is an acceptable number of bugs in software, and it’s not zero. Coverity, a code analysis tool by Synopsys, puts the defect density of good-quality software at 1.0 defects per 1,000 lines of code. In Code Complete, Steve McConnell puts that number closer to 15 to 50 errors per 1,000 lines of delivered code. Considering that typical software can consist of tens to hundreds of thousands, if not millions, of lines of code, this adds up quickly.
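To make "this adds up quickly" concrete, here is a minimal sketch of the arithmetic. The densities are the figures quoted above; the codebase sizes and the function name `expected_defects` are hypothetical, chosen only for illustration.

```python
def expected_defects(lines_of_code: int, defects_per_kloc: float) -> float:
    """Estimate defect count from size and density (defects per 1,000 lines)."""
    return lines_of_code / 1000 * defects_per_kloc


# Densities quoted in the text above (defects per 1,000 lines of code).
DENSITIES = {
    "1.0 per KLOC": 1.0,
    "15 per KLOC": 15.0,
    "50 per KLOC": 50.0,
}

# Hypothetical codebase sizes, for illustration only.
CODEBASE_SIZES = [50_000, 500_000, 2_000_000]

if __name__ == "__main__":
    for loc in CODEBASE_SIZES:
        for label, density in DENSITIES.items():
            estimate = expected_defects(loc, density)
            print(f"{loc:>9,} lines at {label}: ~{estimate:,.0f} defects")
```

Even at the most optimistic density, a half-million-line system would carry roughly 500 latent defects under this simple linear estimate.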
The share of bugs that slip past testing into production and reach end users is called the defect escape rate. Each organization has a different tolerance. A high tolerance allows companies to move fast and stay competitive in the short term. This was embodied organizationally by Facebook’s original mantra “move fast and break things.” The opposite is seen in NASA: its operations are prioritized around ensuring no bug ever makes it into a situation where it could cost lives. This was well detailed in the iconic “They Write the Right Stuff” by Charles Fishman.
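One common way to quantify defect escape rate is as a ratio: defects found in production divided by all defects found in a given period. The sketch below assumes that definition; the function name and the sample counts are hypothetical, and teams may count defects differently.

```python
def defect_escape_rate(found_in_production: int, found_before_release: int) -> float:
    """Fraction of all known defects that were first found in production.

    Assumes the ratio definition described above; returns 0.0 when no
    defects were recorded at all.
    """
    if found_in_production < 0 or found_before_release < 0:
        raise ValueError("defect counts must be non-negative")
    total = found_in_production + found_before_release
    if total == 0:
        return 0.0
    return found_in_production / total


if __name__ == "__main__":
    # Hypothetical release: 45 bugs caught before shipping, 5 reported by users.
    rate = defect_escape_rate(found_in_production=5, found_before_release=45)
    print(f"Defect escape rate: {rate:.0%}")
```

Tracking this number per release gives an organization a way to see whether its actual tolerance matches the one it intends.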
Defect management is a good example of balancing the pillars. Growth introduces bugs, support identifies and works around them, and stability work removes them, reducing the burden on support and ensuring that the problems don’t compound. Organizations that aim for long-term growth embrace bugs as a reality and define processes that incorporate bug fixes into their work cycles. They invest in reworking areas of the system that consistently introduce bugs. This is stability work that comes at the cost of short-term growth as a trade-off for long-term success.
System Design
System design is another area of complexity management where the pillars need to be balanced. Most systems in fast-moving companies are designed to only solve today's problems. This is often intentional, and happens for two reasons: the more systems are built for eventualities, the more complex they are from day one; and the more a system is designed into the future, the more likely it is to be wrong.
Companies and software can expect both predictable and unpredictable change. An example of predictable change is an increase in customers over time. Unpredictable change can be big, such as an organizational pivot, or small, such as adding a new feature to respond to a competitor. Well-designed systems can handle change to a certain extent, but eventually the needs of the organization will surpass anything the system was designed to handle, and the system begins to fail.
Organizations do not succeed by designing the best system on day one. They do so by constantly reviewing and redesigning small pieces of their systems over time. Growth puts strain on how systems are designed, and support increases in response to that strain. This can look like software outages and latency, manual operational processes that are quickly overwhelmed, or recurring problems that look like extremely tricky bugs but are actually the system failing in areas where it was never designed to scale. The stability work of periodically refactoring the design of your systems prevents catastrophic failures and complete rebuilds.
Building For a Complex Future
Each pillar can be viewed through a lens of complexity management: growth work is the continuous addition of complexity, support work is a reaction to existing complexity, and stability at its core is the reduction of complexity. As complexity expands, support increases, often exponentially. The more complex a system is, the harder it is to change, so growth slows down. When this continues unmitigated, companies hit a wall: support becomes overwhelmed, new initiatives grind to a halt, and the Andon Cord is pulled.
In the end, companies that balance these pillars are the ones that last. The better a company is at stabilizing and reducing complexity, the greater its competitive advantage.