The Critical Importance of Test Data

Posted by at 11:58h

The business imperative for faster enterprise application development has never been stronger. Competitiveness, and ultimately business survival, depend on an enterprise’s ability not only to create the right software but also to deploy it fast. At the same time, quality cannot be compromised, as the business cost of launching bad applications has also skyrocketed.

In this climate, it is no wonder that a lot of effort has been devoted to creating good tools for test automation. However, an overwhelming number of these tools concentrate on test scripts and do little to solve the test data problem. Yet even the best test script does little to secure quality unless it is executed with quality test data that truly reflects the business reality the application will run in. Scripts without the right data are like a car without fuel.



Getting the right test data can be difficult. The larger and older the organization, the more complex the problem typically is, because metadata documentation is either missing or poorly maintained. With no documentation to refer to, it becomes difficult to capture data relationships. The consequence of missing data relationships is that tests are executed in an artificially simplified environment that may differ significantly from the real road test, which happens only when the application is deployed to production. The inherent risks are frequently not known, assessed, or consciously managed.
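When documentation is missing, one common starting point for recovering data relationships is to look at the data itself. The sketch below is a minimal illustration, not any particular product's method: it flags a column as a candidate foreign key when all of its values appear in a unique column of another table. The table names, column names, and the candidate_foreign_keys helper are hypothetical, and a real schema would produce false positives that a person still has to review.

```python
from itertools import permutations


def candidate_foreign_keys(tables):
    """Return (child_table, child_col, parent_table, parent_col) tuples where
    every non-null child value also appears in a unique parent column."""
    # Collect the non-null values of every column, keyed by (table, column).
    columns = {}
    for table_name, rows in tables.items():
        if not rows:
            continue
        for col in rows[0]:
            columns[(table_name, col)] = [r[col] for r in rows if r[col] is not None]

    found = []
    for (child_t, child_c), (parent_t, parent_c) in permutations(columns, 2):
        if child_t == parent_t:
            continue  # this sketch only looks across tables
        child_vals = columns[(child_t, child_c)]
        parent_vals = columns[(parent_t, parent_c)]
        parent_is_unique = len(set(parent_vals)) == len(parent_vals)
        if parent_is_unique and child_vals and set(child_vals) <= set(parent_vals):
            found.append((child_t, child_c, parent_t, parent_c))
    return sorted(found)


# Hypothetical sample data standing in for two undocumented tables.
tables = {
    "customers": [
        {"customer_id": 1, "name": "Ada"},
        {"customer_id": 2, "name": "Grace"},
        {"customer_id": 3, "name": "Linus"},
    ],
    "orders": [
        {"order_id": 10, "customer_id": 1, "total": 250},
        {"order_id": 11, "customer_id": 3, "total": 75},
        {"order_id": 12, "customer_id": 1, "total": 40},
    ],
}

print(candidate_foreign_keys(tables))
```

On this sample the only candidate reported is orders.customer_id pointing at customers.customer_id. On real data the check is best treated as a list of leads to confirm with the business, since small integer columns often match by coincidence.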

A better test data environment requires an understanding of data at the enterprise level, including the relationships between data elements. Developers and testers must be able to create comprehensive synthetic data and to quickly obtain and mask production data in quantities that ensure good test coverage from both a technical and a business perspective.

Synthetic data is particularly important for new applications or when major changes are implemented. The challenge is to create data that truly reflects future production data, as this requires a good understanding of the business and of how users will use the application. As few enterprise applications are created in a vacuum, the synthetic data’s relationships to existing data must also be carefully considered. This is a business/IT problem requiring closer business/IT collaboration than is typically seen in most companies.

Testing with production data has the advantage that we know it reflects real usage, at least as the application has been used so far. A critical success factor is the ability to obtain relevant subsets from production easily. Testers frequently have to rely on backlogged DBAs to get appropriate data, which delays releases or slows down the resolution of production problems. Testers clearly need to be able to obtain most of the data on their own, with the help of easy-to-use tools.
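What makes a subset "relevant" is largely that it stays consistent: every extracted row should bring along the rows it refers to. The following is a minimal sketch of that idea using Python's built-in sqlite3 module. The customers and orders tables, their columns, and the build_source and extract_subset helpers are assumptions made for illustration, with an in-memory database standing in for production.

```python
import sqlite3


def build_source():
    """Create a small in-memory database that stands in for production."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(customer_id),
            total INTEGER
        );
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
        INSERT INTO orders VALUES (10, 1, 250), (11, 3, 75), (12, 1, 40), (13, 2, 500);
        """
    )
    return conn


def extract_subset(conn, min_total):
    """Select orders by a business rule, then pull only the customers they reference."""
    orders = conn.execute(
        "SELECT order_id, customer_id, total FROM orders "
        "WHERE total >= ? ORDER BY order_id",
        (min_total,),
    ).fetchall()

    # Follow the foreign key so the subset has no dangling references.
    customer_ids = sorted({row[1] for row in orders})
    if not customer_ids:
        return orders, []
    placeholders = ",".join("?" * len(customer_ids))
    customers = conn.execute(
        "SELECT customer_id, name FROM customers "
        f"WHERE customer_id IN ({placeholders}) ORDER BY customer_id",
        customer_ids,
    ).fetchall()
    return orders, customers


conn = build_source()
orders, customers = extract_subset(conn, 200)
print(orders)
print(customers)
```

Here the rule "orders of at least 200" selects two orders and exactly the two customers who placed them. In a real schema the same walk has to be repeated across every relationship, which is why knowing those relationships comes first.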

Another critical need is the ability to mask sensitive information without losing the properties of the data. The need for masking is growing rapidly with increased regulation in most industries. Masking must be quick, it must truly protect the data, and the masked data must still behave in the program as the unmasked data would have.
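One way to keep masked data behaving like the original is to make the masking deterministic and format-preserving: the same input always yields the same output, so joins still match, and digits stay digits, so validation code still runs. The sketch below illustrates this with a keyed hash from Python's standard library. The mask_value helper and the sample key are hypothetical, and this is an illustration of the idea rather than a vetted format-preserving encryption scheme; it is one-way, and whether it is strong enough for a given regulation is a separate question.

```python
import hashlib
import hmac
import string


def mask_value(value, secret):
    """Replace letters and digits with keyed pseudo-random ones,
    keeping length, character classes, and punctuation intact."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    for i, ch in enumerate(value):
        byte = digest[i % len(digest)]  # reuse digest bytes for long inputs
        if ch in string.digits:
            masked.append(string.digits[byte % 10])
        elif ch in string.ascii_uppercase:
            masked.append(string.ascii_uppercase[byte % 26])
        elif ch in string.ascii_lowercase:
            masked.append(string.ascii_lowercase[byte % 26])
        else:
            masked.append(ch)  # separators and other characters pass through
    return "".join(masked)


# The key below is a placeholder; a real key would be kept out of source code.
masked = mask_value("123-45-6789", b"demo-secret")
print(masked)
```

Because the output depends only on the value and the key, the same identifier masked in two different tables still joins correctly. Note that an individual character can map to itself by chance, and identical values always produce identical masks, which is useful for testing but also something to weigh when assessing how much protection is needed.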

ROKITT’s testing solutions address all business processes needed for efficient test automation, including test data automation. Test data automation is inherently organization-dependent, so we deliver solutions that combine strong frameworks with consulting to help our clients discover their data and make the overall testing process faster and better.

ROKITT’s Enterprise Data Manager automatically discovers relationships within and across databases and displays them in a graph. It can also retrieve data, create synthetic data, subset data from any environment, and mask data. It enables the organization to manage its data effectively and perform comprehensive testing to deliver quality products.
