
Data and data processes

Richard Seidl 11/12/2024


Test data plays a central role in modern software development. It enables data errors to be identified at an early stage and quality assurance to be optimized. The use of AI and low-code solutions makes test automation more efficient. Close cooperation between IT and the specialist department is crucial in order to harmonize data processes and increase quality.

Podcast on data and data processes

In this episode, I talked to Joshua and Hermann about quality, test automation and agility. Hermann explained how important it is to detect data errors early on and how to systematically generate target results in order to compare them with actual results. He emphasized that there is often a lack of suitable tools to carry out these tests efficiently. Joshua added that their methods help companies to harmonize and test data from different systems. We also talked about the role of artificial intelligence in the testing process and how it can help make suggestions for testing and improve collaboration between IT and business departments. Finally, there were insights into the challenges and benefits of visualizing data processes to optimize quality assurance.

“The problem is that you have data errors. They often appear far too late. In other words, you actually want to test data during development.” Hermann Friebel

Hermann Friebel, founder and Managing Director of FINARIS Financial Software Development GmbH since 2001, has almost four decades of expertise in software development and testing in the areas of securities trading and risk controlling.

Since joining FINARIS in 2015, Joshua Claßen has established himself as a senior consultant for backend test automation of complex banking applications. Through his work with RapidRep, the predecessor of SQACE, he gained extensive experience in automated data testing and data quality assurance with various clients.

Highlights of the Episode

The challenges of detecting data errors and testing them early on in the development process are considerable.
The target data generation method often only covers 70% of cases, which is why test data enrichment is necessary.
Collaboration between IT and the business department is crucial to improving data processes and quality.
A methodology that makes it possible to systematically test data processes without double implementation is advantageous.
Visual representations and low-code solutions facilitate collaboration between business and IT.

Efficient test automation and data quality

The challenge of data errors

In many companies, data errors are recognized only at a late stage. Although the IT department usually has technical access to the data, it often lacks the business expertise needed to identify data errors early. In most cases, the specialist department develops and checks specific test cases itself, but this is often cumbersome and inefficient.

Systematic target results as a solution

A promising method for solving this problem is to systematically generate target results and compare them with the actual results. However, this technique often covers only about 70% of the relevant cases. To close the gap, the remaining 30% should be covered with additionally generated test data. In this way, nearly complete coverage of the possible scenarios can be achieved.
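The target/actual comparison can be illustrated with a minimal Python sketch. It is not the tooling discussed in the episode: the business rule (market value = quantity × price), the field names and the sample rows are all assumptions made up for illustration. The point is only that the target result is derived from a compact rule and then compared row by row with what the system under test delivered.

```python
from decimal import Decimal


def expected_market_value(position):
    # Hypothetical business rule used to derive the target result.
    return Decimal(position["quantity"]) * Decimal(position["price"])


def compare_target_actual(source_rows, actual_rows, key="id"):
    """Return (key, target, actual) for every row where target and actual differ."""
    actual_by_key = {row[key]: row for row in actual_rows}
    deviations = []
    for row in source_rows:
        target = expected_market_value(row)
        actual_row = actual_by_key.get(row[key])
        # A row missing from the actual results counts as a deviation (actual is None).
        actual = None if actual_row is None else Decimal(actual_row["market_value"])
        if actual != target:
            deviations.append((row[key], target, actual))
    return deviations


# Illustrative input data and the (partly wrong) output of the system under test.
source = [
    {"id": "P1", "quantity": "10", "price": "2.50"},
    {"id": "P2", "quantity": "3", "price": "100.00"},
    {"id": "P3", "quantity": "7", "price": "1.10"},
]
actual = [
    {"id": "P1", "market_value": "25.00"},
    {"id": "P2", "market_value": "310.00"},  # deliberately wrong value
    # P3 is deliberately missing
]

deviations = compare_target_actual(source, actual)
for key, target, found in deviations:
    print(f"{key}: target={target} actual={found}")
```

Here P2 is reported because the value differs and P3 because the row is missing. Cases that the existing data never exercises would have to be added as generated test rows in `source`.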

Collaboration between IT and the specialist department

Close cooperation between IT and the specialist department is essential in order to improve the detection of data errors. By using special tools, employees from the specialist department can actively work on the data processes and support their further development. This collaboration not only promotes the quality of the test results, but also mutual understanding between IT and specialist departments, which leads to better data processes in the long term.

Use of low-code components

The use of low-code components helps to simplify the testing process. With these tools, many tasks can be implemented more quickly and without in-depth technical knowledge. Especially in large companies with many departments in which similar problems often occur, low-code components help to avoid redundant solutions and make processes more efficient.

The influence of AI on test automation

Artificial intelligence (AI) has the potential to revolutionize test automation. AI can help to automatically generate suggestions for tests or create code, making the process more efficient and saving valuable time. However, it is important to remain realistic: AI-supported systems make work easier and optimize processes, but they do not take on all tasks completely autonomously.

Frequently asked questions about test data

How do you ensure that test data remains up-to-date and relevant?

To ensure that test data remains current and relevant, regular reviews and updates are crucial. Test data should be regularly compared with real data and adapted to changes in the requirements or in the software. Automated tests can help to generate test data dynamically so that it reflects the current usage scenarios. Close cooperation with the specialist departments also helps to identify and create relevant test data. Together, these measures help to maintain the quality of the tests.

When does it make sense to use synthetic test data instead of real production data?

It makes sense to use synthetic test data when privacy is important or when real production data is not available. Synthetic test data can also be used to test specific scenarios that are difficult to reproduce with real data. It also enables tests to be carried out cost-effectively and securely without the risk of data leakage or misuse. When developing and testing new functions, it is often more flexible to use than real production data.

What are the most common challenges in test data management?

The most common challenges in test data management are ensuring data quality and integrity. There is often a lack of realistic test data that covers relevant scenarios. Data protection regulations make it difficult to use real data, while generating synthetic test data can be time-consuming. In addition, the management and provision of test data for different test environments is often uncoordinated. Ultimately, a lack of automation leads to inefficient processes that lengthen the test cycle.

What different types of test data are there and how are they used?

There are different types of test data used in software development. These include:
1. Real (production) data: data taken from production for accurate simulation.
2. Batch or dummy data: randomly generated data for extensive testing.
3. Boundary value data: test data that lies at the limits of the permitted input values.
4. Positive and negative test data: data representing correct and incorrect inputs.
Together, these types of test data help to test the functionality, security and performance of software solutions effectively.
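Test data at the limits of an input range, combined with the positive/negative distinction, can be sketched in a few lines of Python. The function name `boundary_values` and the integer range are assumptions for illustration; real fields may need other step sizes or data types.

```python
def boundary_values(minimum, maximum):
    """Return (value, is_valid) pairs at and just around the limits of [minimum, maximum]."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    candidates = [minimum - 1, minimum, minimum + 1,
                  maximum - 1, maximum, maximum + 1]
    # Remove duplicates (possible for very narrow ranges) while keeping the order.
    values = list(dict.fromkeys(candidates))
    # Values inside the range are positive cases, values outside are negative cases.
    return [(value, minimum <= value <= maximum) for value in values]


print(boundary_values(1, 100))
# -> [(0, False), (1, True), (2, True), (99, True), (100, True), (101, False)]
```

The pairs marked `False` serve as negative test inputs that the software should reject, the others as positive inputs it should accept.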

What is meant by the pseudonymization of test data?

Pseudonymization of test data means changing personal data in such a way that it can no longer be assigned to a specific person without additional information. In the case of test data, identification features are replaced by fictitious values to ensure data protection. This allows test results to be analyzed without endangering the privacy of the persons concerned. Pseudonymization is particularly important in software development and data analysis in order to comply with legal requirements.
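A minimal Python sketch of this idea follows, assuming a simple mapping table as the "additional information". The class name `Pseudonymizer`, the `PERSON-` prefix and the sample records are hypothetical. In practice the mapping would have to be stored separately from the test data and protected accordingly.

```python
class Pseudonymizer:
    """Replaces identifying values with fictitious ones and keeps the mapping separately."""

    def __init__(self, prefix="PERSON"):
        self._prefix = prefix
        self._forward = {}   # original value -> pseudonym
        self._reverse = {}   # pseudonym -> original value (the "additional information")

    def pseudonymize(self, value):
        # The same original value always receives the same pseudonym,
        # so relationships between records are preserved.
        if value not in self._forward:
            pseudonym = f"{self._prefix}-{len(self._forward) + 1:05d}"
            self._forward[value] = pseudonym
            self._reverse[pseudonym] = value
        return self._forward[value]

    def reidentify(self, pseudonym):
        # Only possible for whoever holds the mapping table.
        return self._reverse[pseudonym]


records = [
    {"name": "Erika Mustermann", "balance": 1200},
    {"name": "Max Mustermann", "balance": 50},
    {"name": "Erika Mustermann", "balance": 300},
]
pseudonymizer = Pseudonymizer()
test_records = [{**r, "name": pseudonymizer.pseudonymize(r["name"])} for r in records]
print(test_records)
```

Both records of the same person end up with the same pseudonym, so aggregations per person still work on the test data, while the names themselves no longer appear in it.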

What is test data anonymization and why is it important?

Test data anonymization is the process of masking or altering personal or sensitive information in test data to protect the identity of individuals. It is important for complying with privacy laws and building trust, while still allowing developers and testers to work with realistic data without compromising privacy. Anonymization allows companies to work more securely while maintaining the quality of their software.
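As a rough illustration of masking and altering, the following Python sketch drops the direct identifier and coarsens two other fields. The field names and the chosen level of generalization are assumptions. Whether a given transformation is sufficient for real anonymization depends on the data set and the remaining re-identification risk, which this sketch does not assess.

```python
def anonymize(record):
    """Return a copy of the record without direct identifiers and with coarsened details."""
    return {
        # The name is dropped entirely; no mapping back to the person is kept.
        "birth_year": record["birth_date"][:4],             # ISO date reduced to the year
        "postal_area": record["postal_code"][:2] + "***",   # postal code reduced to its area
        "balance": record["balance"],                       # non-identifying value kept as-is
    }


customer = {
    "name": "Erika Mustermann",
    "birth_date": "1984-07-15",
    "postal_code": "60311",
    "balance": 1200,
}
print(anonymize(customer))
# -> {'birth_year': '1984', 'postal_area': '60***', 'balance': 1200}
```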

What are test data generators and how do they create test data?

Test data generators are tools that automatically create test data to test software applications. They generate structured data in various formats based on defined rules or templates. This allows developers to simulate realistic scenarios without having to enter data manually. Test data can be generated randomly or rule-based to meet specific requirements. As a result, these generators save time and minimize sources of error while helping to ensure that the systems under test behave correctly under realistic conditions.
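A very small rule-based generator might look like the Python sketch below. The rule set, field names and value ranges are invented for illustration and do not describe any particular product. A fixed seed keeps the random parts reproducible, so a failing test can be rerun with the same data.

```python
import random

# Hypothetical rules: one function per field, receiving the random source and the row number.
RULES = {
    "customer_id": lambda rng, i: f"C{i:04d}",                        # rule-based
    "currency": lambda rng, i: rng.choice(["EUR", "USD", "CHF"]),     # random from a fixed set
    "amount": lambda rng, i: round(rng.uniform(0.01, 10_000.00), 2),  # random within a range
}


def generate_rows(rules, count, seed=42):
    """Generate `count` rows by applying every field rule; the same seed yields the same rows."""
    rng = random.Random(seed)
    return [
        {field: rule(rng, i) for field, rule in rules.items()}
        for i in range(1, count + 1)
    ]


rows = generate_rows(RULES, 5)
for row in rows:
    print(row)
```

New fields or formats can be added by extending `RULES` without touching the generator function itself.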

What is synthetic test data and how is it created?

Synthetic test data is artificially generated data that is used to analyze and validate software applications. It imitates real data without containing confidential information. It is created using algorithms that mimic the patterns and structures of real data. This can be done using techniques such as data anonymization, statistical modelling or scripting. Synthetic test data makes it possible to carry out tests without taking data protection risks and is particularly useful for software development and quality assurance.
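The statistical modelling approach can be sketched in Python as follows. This is a deliberately simple assumption: amounts are modelled with a normal distribution and currencies by their observed frequencies. The sample rows and field names are made up, and real data usually needs a richer model (correlations between fields, skewed distributions).

```python
import random
import statistics
from collections import Counter


def fit_profile(rows):
    """Summarize the sample: mean and standard deviation of amounts, currency frequencies."""
    amounts = [row["amount"] for row in rows]
    return {
        "mean": statistics.mean(amounts),
        "stdev": statistics.stdev(amounts),  # needs at least two rows
        "currencies": Counter(row["currency"] for row in rows),
    }


def synthesize(profile, count, seed=0):
    """Draw new rows that follow the profile but copy no original record."""
    rng = random.Random(seed)
    names = list(profile["currencies"])
    weights = list(profile["currencies"].values())
    return [
        {
            # Negative draws are clipped to zero, since amounts are assumed non-negative.
            "amount": round(max(0.0, rng.gauss(profile["mean"], profile["stdev"])), 2),
            "currency": rng.choices(names, weights=weights)[0],
        }
        for _ in range(count)
    ]


sample = [
    {"amount": 120.0, "currency": "EUR"},
    {"amount": 80.5, "currency": "EUR"},
    {"amount": 310.0, "currency": "USD"},
    {"amount": 45.25, "currency": "EUR"},
]
profile = fit_profile(sample)
synthetic_rows = synthesize(profile, 10)
print(synthetic_rows[:3])
```

Only the aggregated profile is carried over from the sample, so the generated rows resemble the original data in shape without reproducing individual records.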

What is test data and how is it used?

Test data is specific data that is used to check software applications during the testing process. It helps to evaluate the functionality, performance and security of an application. Test data is created to represent different scenarios, including normal and exception conditions. It can be generated manually or automatically. The use of test data enables testers to identify errors and ensure that the software meets the desired requirements before it is released.

