Visualizing Codebases as Cities

Understanding large code bases is hard. It can take months for an experienced engineer to get comfortable navigating and manipulating large software projects. At Galois, we’re developing tools to generate 3D representations of code bases visualized as cities to help engineers get oriented faster, and to develop a shared visual reference for team conversations and problem-solving. 

Building Shared Metaphors

Engineers love to use spatial reasoning, but software, unlike hardware, has no natural spatial analogies or metaphors. Engineers often build their own spatial mental models, but those models are not always consistent from one person to the next. Even a simple question like “Where is that function defined?” carries unstated spatial assumptions that two engineers may not share.

We treat code bases like cities because they share common traits: a city is hierarchical, consisting of neighborhoods, blocks, and buildings. A city changes, but it does so slowly and intentionally. Most people already know how to navigate cities, and we can use that common experience as the basis for shared metaphors that improve communication and understanding. This spatial understanding also means we can navigate many things by memory, without needing verbose labels on common locations.

The ArduPilot code base rendered as a cityscape

Spatial <–> Linguistic

The right hemisphere of the brain is often described as spatially oriented and the left hemisphere as language-oriented: the left favors categories, while the right favors relations. Learning a code base mixes both kinds of reasoning. Engineers read language (in the form of comments and source code) but have to build mental models of relations. Learning the same thing in multiple modalities (hearing, seeing, touching) contributes to richer semantic understanding. We believe that viewing a code base as a cityscape will help join the spatial and linguistic capabilities of engineers’ brains.

When one part of the code calls a function defined elsewhere, an engineer might ask, “Where is that other function?” Integrated Development Environments (IDEs) like Visual Studio Code or tools like grep can help an engineer locate the text of the other function, but additional investigation is necessary to uncover the context required to meaningfully understand where a function is defined. What structures contain the function? What structures does the function contain? What elements of the architecture are closely related to the function? This information is available today, but it takes brainpower and time to identify. If the engineer can’t find it, or if the relationships and structures are not intuitive, that adds friction to understanding.

Rendering a codebase as a cityscape gives substance to these otherwise conceptual contexts and structures. The size of structures containing various functions can be visualized as buildings of different sizes. Related functions or elements may be represented as other buildings in close physical proximity. Relationships and dependencies between elements of code or components can be represented as roads or subways. 
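To make the mapping concrete, here is a minimal sketch in Python of how such a layout could be computed. It is not the Blender prototype described in this post; the `Building` class, the `layout_city` function, the input shapes, and the scaling constants are all hypothetical choices made for illustration. It assumes each module becomes a neighborhood (one row of lots), each function becomes a building whose height scales with its line count, and each call becomes a road between two buildings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Building:
    name: str      # qualified function name, e.g. "nav.update" (hypothetical)
    district: str  # containing module, drawn as a neighborhood
    x: float       # ground-plane position
    y: float
    height: float  # proportional to lines of code


def layout_city(modules, calls, lot=2.0, block=10.0, height_per_line=0.1):
    """Place one building per function and one road per call edge.

    modules: {module_name: {function_name: lines_of_code}}
    calls:   iterable of (caller, callee) qualified names
    """
    buildings = {}
    # Each module gets its own row, so functions in the same module end up
    # side by side, which is the "close physical proximity" idea.
    for row, (module, functions) in enumerate(sorted(modules.items())):
        for col, (func, loc) in enumerate(sorted(functions.items())):
            qualified = f"{module}.{func}"
            buildings[qualified] = Building(
                name=qualified,
                district=module,
                x=col * lot,
                y=row * block,
                height=loc * height_per_line,
            )
    # A road is the pair of ground positions it connects; edges that point
    # at unknown functions are skipped rather than guessed.
    roads = [
        ((buildings[a].x, buildings[a].y), (buildings[b].x, buildings[b].y))
        for a, b in calls
        if a in buildings and b in buildings
    ]
    return buildings, roads


if __name__ == "__main__":
    demo_modules = {
        "nav": {"update": 120, "reset": 15},
        "sensors": {"read_gps": 60},
    }
    demo_calls = [("nav.update", "sensors.read_gps")]
    city, roads = layout_city(demo_modules, demo_calls)
    for building in city.values():
        print(building)
    print(roads)
```

A real renderer would likely use a smarter packing scheme (for example, nesting blocks inside neighborhoods), but even this grid keeps the property that matters for the metaphor: the layout is deterministic, so the same function lands in the same place every time the city is rebuilt.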

Visualizing Abstract Systems

Creating a semi-permanent model that gives a team a shared understanding of a complex codebase is just the first step. Once we have common spatial metaphors on which to anchor ideas and understanding, we can create dynamic spatial visualizations that integrate with state-of-the-art systems engineering tools to aid design space exploration, highlight potential vulnerabilities, or show the cascading impact of changes within a system.

At Galois, we have built, and continue to build, many tools that help engineers work rigorously and rapidly. Tools like VERSE will help engineers apply formal methods to their code. Tools like Taphos help infer the structure of code and the relationships within it. We plan to overlay the capabilities of Taphos and VERSE as layers on a cityscape so that their analytical results are easy to tie to established metaphors. For example, the connections lifted by Taphos may appear like a public transit map over a cityscape. Using “landmarks” in the code base, we’ll be able to help engineers contextualize Taphos results more rapidly, such as the potential impacts of a code change.
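As a rough illustration of the "impact of a code change" idea, the sketch below walks a call graph backwards to find everything that could be affected when one function changes. It assumes the relationships are available as simple (caller, callee) pairs. This is not Taphos's actual output format or API, and the function name `impacted_by` is hypothetical.

```python
from collections import deque


def impacted_by(changed, calls):
    """Return every function that directly or transitively calls `changed`.

    calls: iterable of (caller, callee) name pairs (an assumed input shape).
    """
    # Invert the call graph: for each callee, remember who calls it.
    callers = {}
    for caller, callee in calls:
        callers.setdefault(callee, set()).add(caller)

    # Breadth-first walk up the caller chain, tolerating cycles.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen and caller != changed:
                seen.add(caller)
                queue.append(caller)
    return seen
```

In a cityscape overlay, the returned set could be the buildings that light up along a "transit line" leading to the changed function, so an engineer sees at a glance which neighborhoods a change reaches.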

These efforts are still in the early stages. The screenshot above is from a prototype tool we implemented in Blender, showing an early concept of how this cityscape visualization could look. In the coming months, we hope to build a Visual Studio Code plugin to make this capability available alongside the code. Stay tuned and follow along as Galois continues its work to improve software and systems engineering processes.


Acknowledgment: This research was influenced by "Code City," which is worth checking out as well!