Quality Assurance of AI - Richard Seidl

Written by Richard Seidl | Mar 4, 2024 11:00:00 PM

AI has a lot to offer, and it is up to our imagination to decide where to use it, what it should do and how it should work. Regardless of the area of application, the quality requirements are high, and conventional test methods only get us so far. Results are sometimes not reproducible and therefore hard to predict, so we first need to define quality differently and then adapt how we test.

“What we had completely overlooked was that there was a timestamp from the camera on all the photos. And then the AI learned the timestamp - and nothing else” - Nils Röttger, Gerhard Runze

Nils Röttger has more than 15 years of experience in quality assurance. He was already involved in software testing during his studies at the University of Göttingen. He has been working at imbus AG in Möhrendorf since 2008, currently as a senior consultant and project manager for mobile testing and AI testing. In his many conference presentations and as an author of books and specialist articles, he regularly addresses current testing topics.

Dr. Gerhard Runze holds a doctorate in electrical engineering from the Friedrich-Alexander University of Erlangen-Nuremberg and worked in the telecommunications industry from 1999 to 2015 in various roles, including as a developer and test team leader. Since 2015, he has been working at imbus AG as a senior consultant for software quality, specializing in embedded software, agile testing and AI, and is also active as a trainer for ISTQB® training courses. He is co-author of the “German Standardization Roadmap AI”, has contributed to the ISTQB® Certified Tester AI Testing curriculum and is co-author of a companion book on this topic.

Highlights of this episode:

  • Nils Röttger and Gerhard Runze are experts in the field of quality assurance for AIs and authors of a book on the subject
  • We discuss what quality means in an AI and which new quality features and test methods are relevant
  • A striking example is the testing of an AI-based heating control system, in which the AI unexpectedly learned the timestamp on photos
  • We shed light on the challenges of reproducibility and test environments in AI testing
  • There are many parallels between classical testing and AI testing: methods such as pairwise testing or A/B testing are also used for AI systems
  • We highlight the importance of generative AI for creating test data
  • The book “Basiswissen KI testen” is intended to help testers to further their education in the field of AI testing and offers practical exercises

The art of AI testing

Today we are talking about quality assurance of AI systems, the challenges and methods. Gerhard Runze and Nils Röttger share their personal experiences and give an insight into the complexity of AI tests. One thing is certain: testers need to change their mindset here; conventional testing methods won’t get very far.

A new era for testers

Today I’m talking to Nils Röttger and Gerhard Runze about the exciting topic of quality assurance for artificial intelligence. Both are not only experts in their field, but also authors of the book ‘Basiswissen KI testen’. Their thoughts on quality in AI and how to test it effectively open up new perspectives and show that testing AI is far more than just a technical challenge.

Quality assurance meets artificial intelligence

The conversation began with a fundamental question: What does quality mean in an AI? Nils and Gerhard pointed out that aspects such as autonomy and ethical considerations must now be taken into account in addition to classic quality characteristics such as functionality and performance. They paid particular attention to the fact that the functional correctness of an AI is often a statistical quantity: instead of a single right or wrong result, a test checks whether the system is right often enough. For many testers, this is a paradigm shift.
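
To make the idea of functionality as a statistical quantity concrete, here is a minimal sketch of an acceptance test that checks an accuracy threshold over many samples instead of a single expected output. The names `noisy_is_even` and `REQUIRED_ACCURACY` and the 5 % error rate are assumptions made for illustration, not something from the conversation.

```python
import random

def evaluate_accuracy(predict, samples):
    """Return the fraction of (input, expected) pairs the model gets right."""
    correct = sum(1 for x, expected in samples if predict(x) == expected)
    return correct / len(samples)

def noisy_is_even(x, rng, error_rate=0.05):
    """Hypothetical stand-in for a trained model: right most of the time."""
    truth = x % 2 == 0
    return (not truth) if rng.random() < error_rate else truth

rng = random.Random(0)  # fixed seed so the evaluation itself is repeatable
samples = [(x, x % 2 == 0) for x in range(1000)]

accuracy = evaluate_accuracy(lambda x: noisy_is_even(x, rng), samples)

# The pass criterion is a threshold on a statistic, not an exact output.
REQUIRED_ACCURACY = 0.90  # assumed acceptance level
passed = accuracy >= REQUIRED_ACCURACY
print(f"accuracy={accuracy:.3f} passed={passed}")
```

A single wrong prediction does not fail this test; only falling below the agreed threshold does, which is the shift in thinking described above.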

The challenges of AI testing

One of the biggest challenges in AI testing is the reproducibility of test results. As Nils explains, training a neural network involves many random decisions, so two training runs can produce different results. Nevertheless, it is crucial that the test itself remains reproducible in principle. This realization underlines the importance of a new way of thinking for testers when dealing with AI.
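
One common way to keep a test reproducible despite random training is to drive all randomness from a fixed seed and to compare runs with different seeds against a tolerance instead of an exact value. The following is a toy sketch under those assumptions; `train_threshold` is a hypothetical stand-in for a real training run, not a method described by the guests.

```python
import random

def train_threshold(data, seed, epochs=20, lr=0.1):
    """Toy training run: learn a 1-D decision threshold.

    Random initialization and shuffling both come from one seeded generator.
    """
    rng = random.Random(seed)
    threshold = rng.uniform(0.0, 1.0)  # random initial "weight"
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)  # random sample order, as in many training loops
        for x, label in data:
            predicted = x > threshold
            if predicted and not label:
                threshold += lr * (x - threshold)  # false positive: move up
            elif label and not predicted:
                threshold -= lr * (threshold - x)  # false negative: move down
    return threshold

# Fixed synthetic data set: the true decision boundary is 0.5.
data_rng = random.Random(123)
data = [(x, x > 0.5) for x in (data_rng.random() for _ in range(200))]

run_a = train_threshold(data, seed=42)
run_b = train_threshold(data, seed=42)  # same seed: identical result
run_c = train_threshold(data, seed=7)   # other seed: only close, not equal

TOLERANCE = 0.1  # assumed acceptable deviation from the true boundary
print(run_a == run_b, abs(run_a - 0.5) < TOLERANCE, abs(run_c - 0.5) < TOLERANCE)
```

With the seed pinned, a failing test can be rerun and debugged; the tolerance check states what "good enough" means across different seeds.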

Of aha moments and learning curves

Gerhard shared a particularly striking example with us: in a heating control project, an AI mistakenly learned the timestamp visible in photos instead of the desired settings - a classic example of ‘shit in, shit out’. This aha moment clearly showed how essential a deep understanding of the data and the learning process of an AI is for successful tests.
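
A shortcut like the timestamp can be checked for by blanking the suspicious image region and counting how often the prediction changes. This is only a sketch of that idea: the 4x4 "photos", the timestamp row and both models are invented for illustration and are not taken from the project Gerhard described.

```python
def mask_region(image, rows, cols, fill=0):
    """Return a copy of the image with the given rows/cols set to `fill`."""
    masked = [list(row) for row in image]
    for r in rows:
        for c in cols:
            masked[r][c] = fill
    return masked

def masking_sensitivity(model, images, rows, cols):
    """Fraction of images whose prediction changes when the region is blanked."""
    changed = sum(
        1 for img in images if model(img) != model(mask_region(img, rows, cols))
    )
    return changed / len(images)

def make_photo(setting, stamp):
    """Hypothetical 4x4 photo: three content rows plus one timestamp row."""
    return [[setting] * 4 for _ in range(3)] + [[stamp] * 4]

photos = [make_photo(s, t) for s, t in [(1, 9), (1, 8), (5, 2), (5, 3)]]

def shortcut_model(image):
    """Hypothetical model that reads only the timestamp row."""
    return "high" if sum(image[3]) > 20 else "low"

def content_model(image):
    """Hypothetical model that reads only the actual content."""
    return "high" if sum(image[0]) > 10 else "low"

timestamp_rows, all_cols = [3], range(4)
print(masking_sensitivity(shortcut_model, photos, timestamp_rows, all_cols))
print(masking_sensitivity(content_model, photos, timestamp_rows, all_cols))
```

A model whose output depends on a region that should be irrelevant shows a high sensitivity value, while a model that looks at the content is unaffected by the mask.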

Innovative test methods for innovative technologies

The conversation took a turn towards specific methods for testing AIs. From metamorphic testing to pairwise testing and the use of A/B testing, various approaches were discussed. These methods allow testers to approach the unique challenges of AI testing and offer an exciting outlook on the future of software quality assurance.
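
Metamorphic testing is useful when the exact expected output is unknown: instead of checking a single result, the test checks a relation between the outputs for related inputs. Below is a minimal sketch; the heating model, the value ranges and the chosen relation ("colder outside must not mean less heating power") are assumptions for illustration only.

```python
import random

def predict_heating_power(outdoor_temp_c, target_temp_c):
    """Hypothetical stand-in for a trained model, returning power in kW."""
    return max(0.0, 0.4 * (target_temp_c - outdoor_temp_c))

def find_violations(model, trials=200, seed=1):
    """Check the metamorphic relation on random inputs.

    Relation: lowering the outdoor temperature must not lower the
    predicted heating power. Returns the inputs that violate it.
    """
    rng = random.Random(seed)
    violations = []
    for _ in range(trials):
        outdoor = rng.uniform(-20.0, 25.0)
        target = rng.uniform(18.0, 24.0)
        drop = rng.uniform(0.5, 10.0)
        if model(outdoor - drop, target) < model(outdoor, target):
            violations.append((outdoor, target, drop))
    return violations

def broken_model(outdoor_temp_c, target_temp_c):
    """Deliberately wrong model: heats more when it is warmer outside."""
    return 5.0 + 0.1 * outdoor_temp_c

print(len(find_violations(predict_heating_power)))
print(len(find_violations(broken_model)))
```

No expected kW value is needed anywhere in the test; the relation alone is enough to expose the broken model.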

Standards and standardization

Nils and Gerhard pointed out the importance of standardization in the field of AI testing. Projects such as the DIN standardization roadmap show the need for clear guidelines, which help ensure that quality assurance does not become a game of chance in the world of artificial intelligence.