What Should We Do To Prevent Software From Failing?

This article first appeared in the MIT Sloan Management Review Blog on May 20, 2019.

It takes an army of trained, licensed, and accredited professionals to build a skyscraper in most cities around the world. But what about the software platforms and machine learning tools that have become crucial components of the world’s financial, military, medical, and communications ecosystems?

The critical software and technical systems we rely on daily are like invisible skyscrapers all around us — yet we often don’t know who designed them, how they were constructed, or whether they hide defects that could lead to massive inconvenience, financial chaos, or catastrophic failures.

Design flaws are introduced into software systems around the world every day, and the most serious ones can have widespread and costly effects. Even industries that require higher levels of regulation and certification can face catastrophic consequences from software design flaws. In October 2018, for example, the U.S. Government Accountability Office warned that many sophisticated weapons systems are vulnerable to cyberattack after testers playing the role of adversaries hacked numerous weapons’ control systems. That same month, the Food and Drug Administration issued an alert on two medical devices due to software vulnerabilities that could allow a hacker to hijack the device and change its function, potentially with lethal consequences for the device user.

Such consequences have been in the spotlight in recent months, as Boeing has faced sharp scrutiny over whether software design flaws were a factor in two back-to-back fatal crashes of 737 Max jets, and whether those flaws were preventable.

To help prevent other catastrophes, industries that provide critical products and services built with rapidly evolving hardware and software need to consider how they can ensure their businesses have a level of digital resiliency that justifies the trust society has placed in them. This will require technical architects, software developers, and hardware designers to create a commonly accepted set of requirements that software, hardware, and network professionals must satisfy in order to practice their craft. If they don’t, the government just might.

Business and society run on increasingly complex software, so it’s time we require a license to write critical code

Consider the construction industry, which has had formal standards in place for decades. A licensed architect must create the design for a building, and an army of professional engineers must approve the structure as well as the electrical and mechanical systems, to ensure the project meets or exceeds all building codes and safety standards. All of these professionals have years of schooling and relevant work experience and have passed rigorous certification exams.

Tall buildings rarely collapse, and when they do fall down — or even display structural weaknesses — extensive resources are deployed to figure out what mistakes were made so procedures can be modified. In some cases, the professionals who made mistakes lose their licenses.

When it comes to software and hardware design and development, the requirements are far less formalized. While many of the billions of lines of software code that run big parts of society’s infrastructure are written by highly skilled engineers and computer scientists, there is no requirement to ensure this is the case. There are industry standards for some elements of technical infrastructure development, but because there is no enforcement mechanism, the standards are rarely followed.

Except in rare cases, such as the platforms used in the airline industry and the space program, no professional engineer or architect signs off on the plans for critical computer programs and hardware platforms, and no government inspector certifies them for use. Not every software application or coding project carries the same level of potential risk, so any effort to raise quality and resiliency will likely require a tiered approach.

To mitigate the massive risks of critical system failure, the private sector should join together to further professionalize the design and implementation of software. To start, the industry should establish a professional accreditation framework that issues licenses to coders who work on critical infrastructure. One model might be the Financial Industry Regulatory Authority, a nongovernmental organization that tests, certifies, and monitors those who work in the U.S. brokerage industry to ensure they have the skills to perform their jobs.

Of course, licensing and registration wouldn’t solve all problems, as anyone who has ever experienced a bad financial adviser, architect, or doctor can attest. But it is a step in the right direction. A potential positive outcome of such an approach would be the further leveling of the playing field from a diversity perspective. After all, it would be difficult to argue that an individual programmer or designer wasn’t as qualified as another if they both had the same level of industry certification.

If industry fails to self-regulate, governments might seize the opportunity. Already, governments in the U.S. are venturing down this path. Several financial regulators have come together to propose the Enhanced Cyber Risk Management Standards framework, which is focused on cyber resiliency, and California has passed its Consumer Privacy Act, which is focused on enhanced data privacy. Issues in both areas often result from poorly written code or badly designed hardware.

The future clearly will run on increasingly complex software, and it is only a matter of time before another mistake in a critical piece of software or hardware results in a sensitive data breach, financial instability, or further loss of life. It is time to recognize the invisible skyscrapers all around us and take steps to prevent them from falling down.