Olaf is an expert in audio AI. The field poses many challenges, especially in German-speaking countries: a wide range of dialects, and training data that is necessary but often missing. He takes us into the world of voice AI, explains how we can train and test these systems, and reveals which tasks he actually gives ChatGPT.
“Unfortunately, we have the problem with audio that the data volumes are much larger. That’s why there aren’t yet any cool models like OpenAI ChatGPT or something like that” - Olaf Thiele
Olaf digitizes spoken German. He trains models for the transcription and synthesis of German audio material with the help of artificial intelligence. As these models see wider use, testing and quality assurance for them are becoming increasingly important.
Highlights of this episode:
This podcast episode is about audio AI testing, the challenges and opportunities, and how we can make progress with the latest technologies.
Olaf emphasized that testing audio AI is not only technically challenging, but also requires an enormous amount of creative thinking. He explained the difficulties caused by the limited availability of German-language data and by the many dialects. The challenge is to collect enough diverse data to train a model effectively. He also described how hard it is to reproduce results across different hardware configurations and to ensure that models generalize.
Olaf talked about ChatGPT as a tool for generating test data. This could be a revolutionary way to expand the range of possible test scenarios. With such tools, testers can have a wide variety of data produced without manual effort. This could be particularly useful for speech models, where variation in pronunciation or dialect is difficult to simulate by hand.
Olaf expressed hope for further progress in tools and methodologies, particularly through platforms such as Hugging Face. These could introduce standardized procedures for training and testing AI models. Such a development would not only simplify testing, but also allow more meaningful comparisons between different models.
In addition to the technical aspects, we also talked about the ethical considerations of using AI technologies. The need to consider carefully what models should and should not learn was emphasized. We also discussed the European AI Act, which provides for potential regulation of the use of AI systems.
The conversation ended with an optimistic outlook on the opportunities opened up by advanced AI technologies. Despite the many challenges, both Olaf and I are convinced that through innovative approaches and continuous research, the potential of audio AI can be fully realized.