Generative AI, Mission Critical Systems, and the Tightrope of Trust

Edited image generated by MidJourney

The public release of ChatGPT and DALL-E 2 radically changed our expectations for the near future of AI technologies. Given the demonstrated capability of large generative models (LGMs), the ways in which they immediately captured public imagination, and the level of publicized planned capital investment, we can anticipate rapid integration of these models into current architectures and future products.

At the time of ChatGPT's release, the generative model behind it was already years old, and we are already seeing the rapid publication of experimental results demonstrating extensions and augmentations of these models that push their capabilities even further.

If we were to focus only on the present, blog posts about these models would risk being out of date as soon as they were published. To put it simply: things are moving fast.

As we consider these models, our analysis is deliberately forward-facing, focused specifically on their future development and potential for integration into critical systems. We are interested in particular in how the capabilities and design of these generative approaches interact with safety-critical notions of trust. Trust is a cornerstone for critical systems and an area of particular interest and expertise for us at Galois. In the context of these newly released generative models, how trust is measured and the role trust plays in evaluating system performance require our attention and, we would argue, redefinition.

A conversation grappling with trust in automation wouldn't be so crucial or time-sensitive if these models were less capable. LGMs rapidly generate plausible content with minimal prompting that is both syntactically and stylistically compelling, and they already have demonstrable utility and traction in the public sphere. There is, however, a recognized tension between generative capacity and accuracy of output, a tension that has been highlighted in discussions about the models and in research efforts designed to enhance their performance. While these models' capabilities are impressive, neither the architecture nor the behavior of LGMs warrants a broad assignment of trust. Plausible content is not guaranteed to be truthful, and the utility of applied technologies does not imply trustworthiness.

In a forthcoming series of blog posts, we aim to tackle this issue of trust in generative systems. We will be taking a peek "under the hood" of LGMs to explore both the ways this technology might merit caution, and how future applications could be transformative. Most of all, we will highlight and grapple with the various ways in which evaluating trust in LGMs is both a context-dependent and a deeply technical endeavor. How should we conceptualize trust, and how might we effectively evaluate these technologies prior to their introduction into safety-critical systems and workflows? In what contexts and to what ends are these generative technologies trustworthy? If they are useful, but not trustworthy, how do we mitigate their inherent risk? In these posts, we aim to define and examine the tightrope that must be walked for these generative models to usefully integrate into safety-critical systems.