
Trustworthy AI




Lt. Gen. John Shanahan, US Air Force (Retired).

At its core, AI assurance is about recognizing and mitigating the risks of AI-enabled military systems. AI assurance is a crucial part of all AI development. It combines two things: the principles of test and evaluation (T&E) and the tenets of responsible AI.

AI assurance is applied at all stages of the AI lifecycle. Across the world, the number of principles may differ, and the exact words may differ, but generally the principles are the same: responsible, equitable, traceable, reliable, and governable. Even though AI is viewed as software, testing and evaluation are as important for AI-enabled systems as for any other hardware or software system in the military. The goal is to ensure that AI systems are trustworthy.

An AI-enabled system is trustworthy to the extent that: one, when used correctly, it will do what it is supposed to do; two, when used correctly, it will not do what it is not supposed to do; and three, humans can dependably use it correctly. The last one brings in the considerations of human-machine teaming. This theme will be one of the most crucial things we must work through over the next decade: defining the interdependencies, responsibilities, and roles between machines and humans.

Many aspects of AI testing resemble testing of other military systems. The same fundamental systems engineering principles apply. However, some significant differences must be considered when developing, testing, fielding, and sustaining AI systems. Sometimes that sustainment part is neglected, and it may be one of the most important aspects when you are talking about AI systems.

The difference between how models are trained and how they perform once fielded is important. What we call mission- or domain-specific adaptation includes the very real possibility of distribution drift, seen in models when they are fielded in the Department of Defense.
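One common way to surface distribution drift is to compare the statistics of inputs seen in the field against the training distribution. The sketch below is a hypothetical, minimal Python example (not any DoD tooling); it computes a two-sample Kolmogorov-Smirnov statistic and flags drift when the gap between the two empirical distributions exceeds an illustrative threshold.

```python
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the empirical CDFs of the two samples (0 = identical,
    1 = completely disjoint)."""
    a, b = sorted(sample_a), sorted(sample_b)
    n_a, n_b = len(a), len(b)
    i = j = 0
    max_gap = 0.0
    for x in sorted(set(a) | set(b)):
        while i < n_a and a[i] <= x:
            i += 1
        while j < n_b and b[j] <= x:
            j += 1
        max_gap = max(max_gap, abs(i / n_a - j / n_b))
    return max_gap

random.seed(0)
# Hypothetical model-input feature: training-time vs. fielded samples,
# where the fielded environment has shifted (mean 0.0 -> 0.8).
training_scores = [random.gauss(0.0, 1.0) for _ in range(1000)]
fielded_scores = [random.gauss(0.8, 1.0) for _ in range(1000)]

DRIFT_THRESHOLD = 0.1  # illustrative cutoff; tuned per system in practice
drifted = ks_statistic(training_scores, fielded_scores) > DRIFT_THRESHOLD
```

In practice a monitoring pipeline would run a check like this per feature on a rolling window of fielded inputs and alert operators when drift is detected, rather than waiting for model accuracy to degrade.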

There are unanticipated or unknown failure modes, and the need for explainability, auditability, and predictability. And, related directly to this idea of sustainment, there is what we call continuous integration and continuous delivery. If you're not updating your AI models every month, they're getting too stale.
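The monthly-cadence point can be expressed as a simple freshness gate in a delivery pipeline. This is a hedged sketch, not an actual DoD pipeline: it assumes a hypothetical record of when each fielded model was last trained and flags any model older than the stated one-month cadence for retraining.

```python
from datetime import date, timedelta

MAX_MODEL_AGE = timedelta(days=30)  # the "every month" cadence from the talk

def needs_refresh(last_trained: date, today: date) -> bool:
    """True when a fielded model has gone stale under the monthly cadence."""
    return today - last_trained > MAX_MODEL_AGE

# Hypothetical fleet of fielded models and their last training dates.
fleet = {
    "target_recognition_v3": date(2024, 1, 5),
    "logistics_forecast_v7": date(2024, 2, 20),
}
today = date(2024, 3, 1)
stale_models = [name for name, trained in fleet.items()
                if needs_refresh(trained, today)]
```

In a real CI/CD setup a check like this would run on a schedule and trigger the retraining and T&E pipeline automatically, rather than relying on someone noticing the model has aged out.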

And then there is the risk of skewed, corrupted, and incomplete data sets, and all varieties of adversarial attacks. When you compare traditional hardware or software systems with AI-enabled systems, roughly 70 per cent of tests and evaluations are the same.
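Guarding against corrupted or incomplete data sets typically starts with automated audits before data ever reaches a training pipeline. The sketch below is a hypothetical Python example (the field names and ranges are invented for illustration) that flags records which are missing required fields or contain physically implausible values.

```python
def audit_records(records, required_fields, valid_ranges):
    """Return indices of records that are incomplete or out of range."""
    flagged = []
    for idx, rec in enumerate(records):
        missing = any(rec.get(f) is None for f in required_fields)
        out_of_range = any(
            rec.get(f) is not None and not (lo <= rec[f] <= hi)
            for f, (lo, hi) in valid_ranges.items()
        )
        if missing or out_of_range:
            flagged.append(idx)
    return flagged

# Hypothetical sensor log feeding a training set.
sensor_log = [
    {"speed": 12.0, "altitude": 300.0},
    {"speed": None, "altitude": 310.0},    # incomplete record
    {"speed": 9000.0, "altitude": 305.0},  # implausible value
]
bad = audit_records(
    sensor_log,
    required_fields=["speed", "altitude"],
    valid_ranges={"speed": (0.0, 400.0)},
)
# bad -> [1, 2]
```

Checks like this catch accidental corruption and gaps; deliberate adversarial manipulation (data poisoning, evasion attacks) requires additional, adversary-aware testing on top of basic data hygiene.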

And, of course, we cannot consider AI's limitations and technical risks in isolation. We also need to address policy and legal matters.

