Bj Rollison, Microsoft
Testing is the most challenging job in software development. The software tester must select a small number of tests from countless possible tests and perform them within a limited period of time, often with too few resources. Additionally, tests usually employ only a fraction of the possible data that may be used by customers or by malicious users. Whether we are unit testing, testing an API, or executing end-to-end user scenarios or acceptance tests, the test data is usually the keystone of many functional tests.
Testers often craft test data representing typical customer inputs, as well as invalid data for a given input control or parameter. But defining a broad set of test data from all possible inputs, for either positive or negative testing, is often a non-trivial part of the testing effort. Also, while static test data is useful, its effectiveness wears out with repeated use in subsequent iterations of the tests that use it.
One possible way to increase the breadth of test data coverage is to use random test data. But random test data is sometimes disregarded because it may not “look like” customer data, or because invalid constructs in the random data may produce false positives that wrongly indicate a failure in the system. This paper explains the fundamental principles of parameterized random test data generation, which can be used to overcome many of the problems associated with random test data. It also demonstrates how parameterized random test data can increase test coverage and expose unexpected issues in software.
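As a rough illustration of the idea, the sketch below shows one way random data could be constrained by parameters so that it stays valid for the input under test and can be reproduced after a failure. It is not taken from the paper: the function name `random_string`, its parameters, and the choice of Python are assumptions made only for this example.

```python
import random
import string


def random_string(seed, min_len=1, max_len=20,
                  alphabet=string.ascii_letters + string.digits):
    """Return a random string constrained by the given parameters.

    Hypothetical helper, not the paper's implementation.
    """
    # A dedicated, seeded generator makes the value reproducible:
    # the same seed and parameters always yield the same string.
    rng = random.Random(seed)
    # The length bounds and alphabet keep the data within the
    # constructs the input under test is expected to accept.
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    seed = 12345
    value = random_string(seed, min_len=5, max_len=10,
                          alphabet=string.ascii_lowercase)
    # Logging the seed alongside the value lets a failing input be regenerated.
    print(f"seed={seed} value={value!r}")
```

Recording the seed with each test result is the design choice that matters here: a failure found with random data can then be replayed exactly, which addresses one common objection to random testing.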
Bj Rollison, 2011 Technical Paper, Abstract, Paper