“There’s a budget again. It just needs to say AI on it.” - Richard Seidl
Test data management is currently experiencing a renaissance. Driven, of course, by AI and the possibilities it offers: better, more accurate test data, simple generation, effortless management across system boundaries. Ideas are bubbling over as to what will be possible … or would be … or well, maybe … hm.
There are a few classic challenges in software development. While we have halfway solved some of them by now, releases (pipelines) and test environments (cloud), for example, test data is another matter. Especially in application landscapes with many different systems and data repositories, test data initiatives can quickly get out of hand. And it is not easy, because the challenges are manifold.
But no problem at all: just throw all the rules, requirements, etc. into an AI, and it generates our test data across all systems, almost in real time. A dream. I’m pretty sure it won’t work that easily, though. There are already some very nice approaches to generating and managing test data, as sketched below, but my observation is that they often treat symptoms. I would rather ask two other questions first.
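To make that concrete, here is a minimal sketch of the kind of rule-based generation that already works well today, using the Python library Faker. The CRM/billing split and the shared customer ID are illustrative assumptions, not a reference to any particular system; the point is that the hard part is keeping one logical record consistent across systems, not producing the values themselves.

```python
# A minimal sketch of rule-based test data generation with the Faker
# library (https://faker.readthedocs.io). The "crm" / "billing" split
# and the shared customer_id are illustrative assumptions, not a real
# system layout.
from faker import Faker

fake = Faker("de_DE")
Faker.seed(42)  # reproducible runs, so tests stay deterministic

def generate_customer(customer_id: int) -> dict:
    """One logically consistent customer, as seen by two systems."""
    name = fake.name()
    return {
        # CRM view of the customer
        "crm": {"customer_id": customer_id, "name": name,
                "email": fake.email(), "address": fake.address()},
        # Billing view: same key and same name, plus payment data
        "billing": {"customer_id": customer_id, "name": name,
                    "iban": fake.iban()},
    }

customers = [generate_customer(i) for i in range(1, 101)]
```

This keeps a single record consistent within one run. Keeping it consistent across dozens of real systems with their own rules is exactly where such approaches start treating symptoms.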
What data do I really need? (And of those: which do I really need?) Just because we can store everything doesn’t mean we have to. Adding a field to a table is easy, but the downstream effects can be dramatic. So: leave it out, and delete all structures that are not needed (see the sketch below). A radical cure for (test) data!
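As one illustration of such a radical cure, here is a hypothetical sketch that compares a database schema against the fields an application actually uses and flags the rest as deletion candidates. The app.db file and the hard-coded used_fields set are assumptions for the example; in practice that set would come from code analysis or interface contracts.

```python
# Hypothetical sketch: flag table columns that no known consumer uses,
# as candidates for removal. used_fields is hard-coded here; in real
# life it would be derived from code or contract analysis.
import sqlite3

conn = sqlite3.connect("app.db")  # assumed local test database
used_fields = {("customer", "id"), ("customer", "name"),
               ("customer", "email")}

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
for table in tables:
    # PRAGMA table_info returns (cid, name, type, notnull, default, pk)
    for _, column, *_ in conn.execute(f"PRAGMA table_info({table})"):
        if (table, column) not in used_fields:
            print(f"candidate for removal: {table}.{column}")
```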
Do I have a suitable data architecture? In cross-system architectures I see many interfaces and dependencies, but rarely an overall picture of the data content, the data flows, and where which data is sensibly stored, so that nothing is held redundantly or circularly. Get that right, and things start to fall into place; a minimal sketch follows.
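One cheap way to start building that overall picture: declare exactly one owning system per data entity and flag every additional copy. The system and entity names below are made up purely for illustration.

```python
# Sketch of a minimal "data ownership" map: every entity gets exactly
# one owning system; every other stored copy gets flagged. All names
# here are invented for the example.
owner = {"customer": "crm", "invoice": "billing", "product": "catalog"}

stored_in = {
    "customer": {"crm", "billing", "shop"},   # three copies!
    "invoice": {"billing"},
    "product": {"catalog", "shop"},
}

for entity, systems in stored_in.items():
    copies = systems - {owner[entity]}
    if copies:
        print(f"{entity}: redundant copies in {sorted(copies)} "
              f"(owner: {owner[entity]})")
```

Even a table this small makes the conversation concrete: every flagged copy is either a deliberate replica with a defined sync direction, or a problem.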
And once the data is in halfway decent shape, then I’ll think about something with AI 😉