Systems engineering has come a long way since the 1960s. Defense and aerospace data management systems, which initially evolved under centralized authority, must now adapt to highly distributed organizations with multiple authorities and open, modular development needs. Organizational management techniques have evolved to smooth logistics and collaboration between contributors, and data management and modeling tools have been developed to orchestrate information, goals, and processes in a centralized manner. These tools work wonderfully when matched with centralized organizational or project structures. The challenge is that U.S. Government efforts to drive modularity and reuse among its systems (like the 2021 National Defense Authorization Act (NDAA) requirement for a Modular Open Systems Approach (MOSA)) foster decentralized organizational structures. New and emerging projects trend toward reusable components assembled with contributions from distributed organizations. It is not clear whether defense and aerospace data management systems and strategies will adapt.
Lessons from the Space Race
In the 1960s and ‘70s, the United States and the Soviet Union were locked in an epic competition to be the first nation to send a crewed spacecraft to the moon. It was a herculean task—not only of engineering, but of organizational design, logistics, and systems management. According to Stephen B. Johnson, aerospace engineering professor and author of The Secret of Apollo: Systems Management in American and European Space Programs, the Apollo program “cost more than $19 billion through the first Moon landing and used 300,000 individuals working for 20,000 contractors and 200 universities in 80 countries” (Johnson 5).
The “space race” reached its zenith in a series of moon landings beginning in 1969. The United States had won. As the United States wrangled its widely dispersed multitude of talented engineers and scientists to design and redesign its rockets, pioneer groundbreaking new technology, and ultimately explore the moon, the European Space Vehicle Launcher Development Organisation (ELDO) — a cooperative effort between Britain, France, West Germany, Italy, Belgium, and the Netherlands — was working to design its own space launch vehicle. Yet where the United States succeeded, ELDO failed, its rockets exploding again and again. ELDO itself was eventually dissolved and reformed.
The United States had a centralized organization with centralized authority, management, and oversight, an organizational approach included in what Johnson dubs “systems management.” ELDO was decentralized, with multiple nations negotiating and renegotiating authority and responsibilities, and attempting to capitalize on one another’s technologies and markets while protecting their own. The saga of these two organizations, and how their approaches to systems management led to success on the one hand and failure on the other, is told in fascinating detail in Johnson’s The Secret of Apollo.1 The United States applied its central control to enforce rigorous communication guidelines (such as change controls on design documents). ELDO was mired in miscommunication and a lack of authoritative information, as related by a report on its Europa II project:
“Europa II seems in a continuous state of research and development with major changes made from one launch to the next almost independently of whether the previous flight objectives have been achieved. No single, complete specification existed for the entire vehicle. Without clear specifications, engineers did not have clear goals for defining telemetry measurements, for limiting the weight of the vehicle, or for ensuring quality and redundancy across the project. The end result was ‘a launch vehicle with little design coherence, and posing complicated integration and operational problems.’”2
Many factors contributed to ELDO’s failure, but one clear culprit was the organization’s lack of an Authoritative Source of Truth (ASoT) to give engineers clear information about their designs, constraints, and objectives. An ASoT is a capability that gives definitive answers to queries about a collection of systems. Lacking the capability to communicate authoritatively about their work, systems engineers from different organizations or countries struggled to collaborate to the point that completing the complex task of building a space-worthy rocket proved not just difficult, but impossible.
Data Management: From Apollo to Linux
The Apollo project was a marvel of systems engineering, successful despite the organizational challenges of an unprecedented number of contributors. Through systems management, NASA provided an Authoritative Source of Truth that enabled engineers across the Apollo program to access definitive information about the program. ELDO had no such mechanism. Apollo was centralized and ELDO was decentralized. Apollo had an ASoT. ELDO did not.
Yet the difference between Apollo’s success and ELDO’s failure did not hinge simply on the presence or absence of an ASoT, nor on their differing organizational structures. An ASoT does not guarantee success, and decentralized stakeholders do not spell doom. Rather, success requires an Authoritative Source of Truth implemented in a manner that is consistent with the organizational structure of the project. With the right ASoT, a project with decentralized stakeholders can succeed.
Case in point: the Linux Kernel. Linux was famously started by Linus Torvalds in 1991. Linus and thousands of his peers developed and continue to develop Linux to this day. And though Linux has many stakeholders and contributors around the world, each with overlapping priorities and objectives, it works.
To support the heterogeneous and decentralized nature of Linux stakeholders, Linus Torvalds and the Linux community created the version control tool Git in 2005. Git provides an information management system—a critical part of an Authoritative Source of Truth—that has a data management structure consistent with that of the Linux community. The Linux community is distributed and heterogeneous; Git supports distributed contributors with varying objectives and skill sets. Its features for branching, merging, and forking3 allow multiple variations of a project managed by different organizations to share a common baseline. In other words: decentralized data management for a decentralized organization. By harnessing the power of a context-appropriate ASoT, Linux has been able to make the most of its distributed structure, turning a potential liability into an asset. It has been massively successful in multiple domains, dominating the supercomputer industry and providing the foundation for the Android operating system.
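The mechanism behind this is worth making concrete. The sketch below is a toy model of Git's underlying data structure, not Git itself: each "repository" is a full copy of a content-addressed commit graph, so independent organizations can work behind their own firewalls and later merge by exchanging only commits. The organization names and changes are invented for illustration.

```python
import hashlib

def commit(parents, change):
    """A commit is identified by a hash of its parents and content,
    so any two copies of the same history agree on every ID."""
    h = hashlib.sha1((repr(sorted(parents)) + change).encode()).hexdigest()[:8]
    return {"id": h, "parents": parents, "change": change}

def ancestors(repo, cid):
    """All commits reachable from cid -- the history it builds on."""
    seen, stack = set(), [cid]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(repo[c]["parents"])
    return seen

# A "repository" is just a mapping of commit id -> commit.
base = commit([], "initial kernel baseline")
upstream = {base["id"]: base}

# Two organizations clone the full history -- no central server required.
org_a = dict(upstream)
org_b = dict(upstream)

# Each commits independently, behind its own firewall.
a1 = commit([base["id"]], "org A: new scheduler")
org_a[a1["id"]] = a1
b1 = commit([base["id"]], "org B: driver fix")
org_b[b1["id"]] = b1

# To integrate, org A only needs org B's commits, not access to B's systems.
org_a.update(org_b)
common = ancestors(org_a, a1["id"]) & ancestors(org_a, b1["id"])

# The shared baseline is rediscovered from the data itself,
# which is what makes a merge of the two lines of work well-defined.
merge = commit([a1["id"], b1["id"]], "merge A and B")
org_a[merge["id"]] = merge
```

Because every commit's identity is derived from its content and ancestry, no central registry has to assign names: replicas stay consistent by construction, which is the property that lets a decentralized organization share an authoritative baseline.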
The moral of the story: a distributed organization (or set of organizations) can not only survive, but thrive if it has an Authoritative Source of Truth that is consistent with its structure.
Growing Complexity, Growing Need
In the modern systems engineering landscape, this strategy of matching the right ASoT to the right organizational or project structure is more relevant than ever. A project can limp along with an ill-fitting ASoT or no ASoT, but results and efficiency will suffer. Many software developers might recall working on school projects where data management consisted of emailing source code between team members: friction abounds.
Since the 1960s, software and systems have become dramatically more complex. Cyber-physical systems, from power plants to fighter jets, medical devices to rocket ships, are all around us—intricate webs of components, processes, and people. Designing, validating, and building complex systems nearly inevitably requires collaboration and input from multiple companies and specialists. And for the biggest and most cutting-edge projects, the number of contributors and stakeholders trying to collaborate to build a rocket, or an airplane, or a chip, is often staggering.
For these projects to succeed, for them to be more like NASA or Linux than ELDO, they need an Authoritative Source of Truth consistent with their organizational structure. Without it, engineers simply don’t have the data they need to be successful. The result: cost overrun, missed deadlines, and a failure to innovate. New and emerging U.S. Defense and Aerospace projects are starting to look more like Linux and ELDO than like Apollo. The question is: will our data management approach keep up so we succeed like Linux, or will we fail to establish an ASoT like ELDO?
The Problem Persists
In 2019 and 2020 I led a study4 to investigate the nature of an Authoritative Source of Truth. My team assembled a set of requirements for an ASoT and conducted a series of demonstrations showing how an ASoT could be assembled in practice. We interviewed a variety of government and industry stakeholders and reviewed a wide variety of data management tools. We found a problem.
The data management tools available today (such as Product Lifecycle Management (PLM) tools) are structured for use by centralized organizations. These tools have been successful because they structure data in a congruous manner: the organization is hierarchical, so the data storage is hierarchical. The organization is centrally controlled, and the data storage is centrally controlled. Putting this in terms of database normalization, most PLM tools today have been normalized to accommodate use by a single organization with broad control of the system specifications. For example, automotive companies own their designs. Although automotive companies use third-party suppliers, they control the final product. They function as a single organization with a single data management system; a single source of truth.
Yet in systems procurement, the U.S. Government does not have unilateral control of system specifications. As the defense industrial base grew after the Second World War, the Government began to procure systems as one might buy a car: the manufacturer controls the entire architectural specification and retains that control, while the buyer receives only the end product.
This trend is changing. Recent initiatives for modularity in defense systems mean architectural specification ownership is no longer unified in a single organization. When the Government procures a complex system like an airplane or a helicopter today, different entities may contribute and control different pieces of the system specification data set. Some data are owned entirely by the Government. Some data are owned entirely by contractors. And often data on different parts of a system’s design or implementation are dispersed between different contractors who worked on different pieces of the puzzle. Tools that work well when used by a single organization induce unexpected friction when applied by two organizations, even when working in concert. Funding sources and priorities may be mixed between multiple Government and commercial concerns.
Sounds a bit like the challenges faced by ELDO back in the ‘60s, doesn’t it?
To put it bluntly, the challenges that befell ELDO are the same challenges we face today. These challenges have reared their heads with particular ferocity in United States Department of Defense (DoD) programs. The F-35, for example, was developed in a distributed manner; eight different nations contributed components and expertise to its creation. The F-35 continues to experience delays and overruns, as described in a 2022 Government Accountability Office (GAO) report:
“As of 2021, the program office now plans to complete Block 4 capability deliveries 3 years later than the original schedule due to software quality issues, funding challenges, and the addition of new capabilities, among others.”5
Reports on the F-35 program indicate that some of its challenges are due in part to a mismatch between the F-35 development processes and the data systems used in those processes. For example, the Autonomic Logistics Information System (ALIS) has been an ongoing source of challenges:
“We have previously reported on numerous long-standing challenges with ALIS, including technical complexity, poor usability, inaccurate or missing data, and challenges deploying the system due to its bulky hardware. In addition, some ALIS software and hardware components will become obsolete in 2023, several older hardware items are no longer in production, and program officials stated that they have struggled with limited ease of access to logistics data on contractor and operational servers.”6
The F-35 is a complex system with many pieces contributed by many stakeholders. The reported challenges indicate that one source of its failings is data management systems that do not align with its organizational structure.
The following figure contrasts the organizational structures we have considered so far.
- Lower left – NASA pioneered centralized data management (as part of “systems management”) for large systems. The Apollo program had centralized stakeholders and an ASoT that provided centralized data management.
- Upper left – Systems development with a centralized organization (e.g., a team of students working on a school project) and decentralized data management (each student managing their own contributions to the project separately and emailing updates to one another). This structure induces friction by adding effort every time the students need to share data.
- Upper right – Linux development is decentralized (many stakeholders have their own data storage and management, replicated across organizations using Git repositories). The Linux ASoT is distributed and aligned with this structure, allowing organizations to manage their contributions to and derivations of the Linux kernel with little friction. For example, two companies collaboratively developing elements of the Linux kernel do not have to open their firewalls to one another.
- Lower right – The F-35 has many stakeholders with different concerns, yet its data management is centralized through its system integrator. Each stakeholder’s contribution to the F-35 ASoT may be part of another, unrelated data management system. For example, the Global Positioning System (GPS) sensor may be provided by a company that sells it for commercial and defense use and has its own system for managing data.
When you have multiple organizations with ownership over a process and data, a single source of truth is the wrong tool. U.S. Government programs need Authoritative Sources of Truth aligned with multi-organization workflows. Unfortunately, few of the data management tools available today are well suited to providing an ASoT for distributed development of cyber-physical systems.
How do we Succeed?
So how do we actually do this? And what does it mean to “align with multi-organization workflows?” It sounds like buzzword gibberish. Let’s break it down and add some definitions:
- A data management system is some combination of software and hardware that stores information. A server providing a network share drive is an example of a data management system. A data management system is part of an Authoritative Source of Truth.
- A multi-organization workflow is a process in which multiple separate organizations interact using data (sending data back and forth between them, for example).
- Alignment refers to the cost of executing a workflow.
A data management system that is well-aligned with a workflow adds little cost when executing the workflow between multiple organizations. A data management system that is poorly aligned with a workflow adds significant cost when executing the workflow between multiple organizations.
For example, suppose organizations A and B are working together on a project and do not have an ASoT with a well-aligned data management system. Organization A has to spend 8 hours of effort to export data and send it to organization B. Organization B has to spend 8 hours to download and import data from organization A. If organizations A and B frequently exchange data, this workflow adds significant cost with minimal program benefit.
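The arithmetic behind that claim is simple enough to write down. The exchange frequency and labor rate below are illustrative assumptions layered on top of the 8-hour figures from the example, not data from the study:

```python
# Hours per exchange, from the example above.
export_hours = 8          # org A: export and send data
import_hours = 8          # org B: download and import data

# Illustrative assumptions (not figures from the study):
exchanges_per_year = 50   # roughly weekly data handoffs
labor_rate = 100          # assumed dollars per engineer-hour

hours_per_exchange = export_hours + import_hours
annual_hours = hours_per_exchange * exchanges_per_year
annual_cost = annual_hours * labor_rate

print(annual_hours)  # 800 engineer-hours per year lost to friction
print(annual_cost)   # $80,000 per year, for a single pair of organizations
```

Under these assumptions, one poorly aligned workflow between one pair of organizations burns 800 engineer-hours a year; a program with many organizational boundaries multiplies that cost at every boundary.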
When establishing an Authoritative Source of Truth, evaluate the workflows that need to use your data management system and consider the effort required to execute them, especially when they cross organizational boundaries.7
How do I decide what tools to use?
We are charting new territory here. We can look to Linux as an example for software development, but Government-funded cyber-physical systems bring challenges that open source software projects do not experience: mixed data rights, including the handling of proprietary or restricted design information, and mixed, interdependent hardware and software development activities, where engineers may be writing code to support hardware that does not yet exist.
The playbook is still being written, but the established practice of database management can provide some clues. The late, great John Carlis got me started in data modeling and I’ve used his methods again and again, his voice echoing in my mind reminding me that user is a four-letter word, that I should anchor my understanding with instances, and that September 19 is International Talk Like a Pirate Day.
Computers are only useful when they have data on which to operate. Data modelers (and their alter-egos, Database Administrators (DBAs)) use a regular process for determining how to store data. As professor Carlis would say, a data modeler starts by asking, “What do you want to remember about this?”
After the modeler collects the names and details of things the data system needs to remember, he or she starts to organize it. There are many ways to approach this, but the traditional approach is called normalization. In normalization, the data modeler organizes data to avoid redundancy and maximize flexibility. However, normalization has a point of diminishing returns. As any DBA will tell you, normalization must be balanced against the practical capabilities of the system storing the data. Microsoft’s article on this subject summarizes the concern nicely:
“Adhering to the third normal form, while theoretically desirable, is not always practical. If you have a Customers table and you want to eliminate all possible interfield dependencies, you must create separate tables for cities, ZIP codes, sales representatives, customer classes, and any other factor that may be duplicated in multiple records. In theory, normalization is worth [pursuing]. However, many small tables may degrade performance or exceed open file and memory capacities.”
This sounds a lot like the centralization/decentralization challenge in creating an ASoT. There may be a mathematically optimal organization of data that minimizes storage space or maximizes flexibility, but if you optimize without accounting for how the data is used you can run into problems. In a database this means you may want to combine data in a single table, even when multiple tables would be storage-optimal, because the data are commonly used together. In an ASoT you may want to use redundant and distributed tools like Git, even if centralized storage might make access control simpler. You must evaluate your project’s organizational structure when assembling an ASoT. We can generalize this notion of diminishing returns for data storage:
The structure of your data storage must be consistent with the structure of the processes that use the data.
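The database analogy can be made concrete with a toy example. The customer and city data below are invented for illustration (echoing the hypothetical Customers table in the Microsoft quote); the point is where each form of storage pays its cost:

```python
# Denormalized: one table, with city facts repeated on every customer row.
customers_flat = [
    {"name": "Acme",  "city": "Dayton",  "zip": "45402"},
    {"name": "Borg",  "city": "Dayton",  "zip": "45402"},  # redundant copy
    {"name": "Cyber", "city": "Wichita", "zip": "67202"},
]

# Normalized (toward third normal form): city facts stored once,
# referenced from customers by key.
cities = {
    "DAY": {"city": "Dayton",  "zip": "45402"},
    "ICT": {"city": "Wichita", "zip": "67202"},
}
customers_norm = [
    {"name": "Acme",  "city_id": "DAY"},
    {"name": "Borg",  "city_id": "DAY"},
    {"name": "Cyber", "city_id": "ICT"},
]

def lookup(name):
    """The normalized form pays a join on every read."""
    cust = next(c for c in customers_norm if c["name"] == name)
    return {**cust, **cities[cust["city_id"]]}

# Both forms answer the same questions; they differ in where the cost lands.
# The flat form pays in redundancy: changing Dayton's zip means updating
# two rows. The normalized form pays a join on every access.
record = lookup("Borg")
```

Neither layout is "correct" in the abstract: the right choice depends on how the data is used, which is exactly the point being made about ASoTs and organizational structure.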
As we move away from the single-system-integrator model of weapons system procurement, we must look to new tools that align with new processes. Git works for open source because it was designed to align with the open source process model. Similar solutions are needed for different domains and contexts. A database is one method of data storage. Digital engineering models are another. Just as DBAs structure a database according to its use, engineers should structure digital engineering models according to their use. And just as DBAs must balance the performance needs of users against storage optimization and flexibility, ASoT architects must balance the needs of multiple organizational stakeholders against data availability, replication, and control.
As humanity pursues ambitious distributed cyber-physical projects like piloted missions to Mars and next-generation aircraft, we need to increase our capacity for creating Authoritative Sources of Truth that align to the structure and needs of distributed projects. This means building and applying data management tools and data representation methods that align with the structure of our projects.
Here at Galois, we’re doing just that. Already, we’ve developed Model-Based Engineering Tools and methods like the Curated Access to Model-based Engineering Tools (CAMET) Library and helped create methods like the Architecture Centric Virtual Integration Process (ACVIP) to provide analysis capabilities and methods specifically to support decentralized projects. We apply languages like the Architecture Analysis and Design Language (AADL) because their rigorous semantics reduce friction when crossing organizational boundaries.
As procurement processes and the complexity of engineering projects themselves evolve, we’re building dynamic, context-appropriate tools to meet those projects’ needs. As we look ahead to a future of astounding technological innovation and scientific exploration, we’re excited to be a part of whatever comes next.
1 Johnson, Stephen B. The Secret of Apollo: Systems Management in American and European Space Programs (New Series in NASA History) (p. 155). Johns Hopkins University Press. Kindle Edition.
2 Europa II Project Review Commission, Group No. 5, sections 0.4.1.1, 0.4.2.3.
3 “Forking” is technically a feature of Git repositories like Github or Gitlab, not of Git itself.
4 Funded by U.S. Army Combat Capabilities Development Command Aviation & Missile Center under contract no. W911W6-17-D-0003/W911W6-19-F-703D.