A Method to Efficiently Select Tests Based on Function Coverage
We all know that tests must be run to discover software defects. But how many tests are "good enough"? Many companies run the full regression suite to validate each new software version. Depending on your situation, this may be fine. If you are part of a start-up, where the rate of code change is high and there are potentially fewer tests, it may even be preferred. But if you are QA for a mature product with many tests, limited resources, or both, you might wonder about the value of running all the tests when little or nothing has changed. Some available tools can help, such as Jest for JavaScript and pytest for Python. They all have their strengths and weaknesses. There is often the fear: what if we miss something? What about side effects? Who owns the tests upstream and downstream from me in a flow? The application described in this paper addresses these concerns. From the detected changes, it determines what should be tested, with risk reduced to tolerable levels. The end result is a system that you can be confident will provide proper coverage with the fewest tests.
1. A method to not run tests that won't fail because they don't cover a code change.
A. Only tests that do cover a change are run, providing fast turnaround time.
B. If a test does not cover a code change, it will be unlikely to fail.
C. If no code has changed, for example because a developer is on vacation, it is very unlikely that tests will fail.
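The selection rule in item 1 can be sketched in a few lines. This is a minimal illustration, not the application described in this paper: it assumes a per-test function coverage map has already been collected, and the names `select_tests` and `coverage_map`, along with the test and function names, are hypothetical.

```python
from typing import Dict, Set


def select_tests(coverage: Dict[str, Set[str]], changed: Set[str]) -> Set[str]:
    """Return the tests whose covered functions overlap the changed functions."""
    return {test for test, funcs in coverage.items() if funcs & changed}


# Hypothetical coverage map: test name -> functions the test executes.
coverage_map = {
    "test_login": {"auth.check_password", "db.get_user"},
    "test_report": {"report.render", "db.get_user"},
    "test_export": {"export.to_csv"},
}

# Only report.render changed, so only the test that covers it is selected.
print(sorted(select_tests(coverage_map, {"report.render"})))  # ['test_report']

# No changes (the "developer on vacation" case) selects nothing.
print(sorted(select_tests(coverage_map, set())))  # []
```

A change to a shared function such as `db.get_user` would select both `test_login` and `test_report`, which is the common-function concern raised later in item 5.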
2. The growth of a continually expanding hardware pool may be slowed.
Running fewer tests means a possible reduction in hardware. The freed hardware could be used for other tasks, such as additional exploratory testing.
3. There are ways to reduce the risk of running fewer tests. Some examples:
A. Run all tests as before, but at a longer interval.
B. Run all tests for public releases until confidence is high.
4. Examples of other benefits
A. Can determine how much code a test covers.
B. Can identify which functions have a large impact, i.e., functions that affect a large portion of the test suite.
C. Can be used as an automated way to map/categorize tests to a given feature.
D. Can aid in detecting tests that have a low chance of failing for any code change.
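Benefits A and B follow directly from the same coverage data. The sketch below assumes the same hypothetical test-to-functions map as input; `coverage_size` and `function_impact` are illustrative names, not part of any existing tool.

```python
from collections import defaultdict
from typing import Dict, Set


def coverage_size(coverage: Dict[str, Set[str]]) -> Dict[str, int]:
    """Benefit A: number of functions each test covers."""
    return {test: len(funcs) for test, funcs in coverage.items()}


def function_impact(coverage: Dict[str, Set[str]]) -> Dict[str, float]:
    """Benefit B: fraction of the test suite that executes each function."""
    tests_by_function = defaultdict(set)
    for test, funcs in coverage.items():
        for func in funcs:
            tests_by_function[func].add(test)
    total = len(coverage)
    return {func: len(tests) / total for func, tests in tests_by_function.items()}


# Hypothetical coverage map: test name -> functions the test executes.
coverage_map = {
    "test_login": {"auth.check_password", "db.get_user"},
    "test_report": {"report.render", "db.get_user"},
    "test_export": {"export.to_csv"},
}

sizes = coverage_size(coverage_map)
impact = function_impact(coverage_map)

# The function touched by the largest share of the suite.
print(max(impact, key=impact.get))  # db.get_user
```

The inverted map built inside `function_impact` is also one possible starting point for benefit C, since it groups tests by the code they exercise.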
5. Some examples of concerns
A. Which tests comprise an optimal set? What is optimal? Probably the shortest run time with the most coverage. But what if a long-running test is the only one that would exhibit a failure?
B. For whom is this tool meant? Its usage may vary depending on whether it is being used by a development engineer or a quality assurance engineer. The developer wants a fast turnaround time so that development is not delayed. QA staff may prefer more thorough testing that takes more time to complete.
C. Code changes to common functions may cause a high percentage of tests to be selected. Additional filtering may be necessary to reduce the number of tests, which creates an opportunity to miss a defect. Running all tests as before, at an increased interval, will catch the missed defect.
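One way to make concerns A and C concrete is a greedy heuristic that repeatedly picks the test covering the most still-uncovered changed functions per second of run time. This is a standard weighted set-cover approximation offered only as an illustration, not the filtering used by the application described here; `greedy_select`, the run times, and all test and function names are assumptions.

```python
from typing import Dict, List, Set, Tuple


def greedy_select(
    coverage: Dict[str, Set[str]],
    runtimes: Dict[str, float],
    changed: Set[str],
) -> Tuple[List[str], Set[str]]:
    """Pick tests that cover the changed functions at low total run time.

    Returns the chosen tests in pick order and any changed functions
    that no test covers. Run times are assumed to be positive.
    """
    remaining = set(changed)
    chosen: List[str] = []
    while remaining:
        best, best_score = None, 0.0
        for test, funcs in coverage.items():
            if test in chosen:
                continue
            gain = len(funcs & remaining)
            score = gain / runtimes[test]  # newly covered functions per second
            if score > best_score:
                best, best_score = test, score
        if best is None:  # nothing left covers the remaining functions
            break
        chosen.append(best)
        remaining -= coverage[best]
    return chosen, remaining


# Hypothetical data: one slow test covers both changes, two fast tests cover one each.
coverage_map = {
    "test_fast": {"a"},
    "test_slow": {"a", "b"},
    "test_mid": {"b"},
}
runtimes = {"test_fast": 1.0, "test_slow": 10.0, "test_mid": 2.0}

chosen, uncovered = greedy_select(coverage_map, runtimes, {"a", "b"})
print(chosen, uncovered)  # ['test_fast', 'test_mid'] set()
```

The example also shows the risk named in concern A: `test_slow` is filtered out even though it covers both changes, so a defect that only it would expose is missed until the next full run.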
Jack Marvin, Software Quality Engineer, Siemens
Software QA Engineer at Siemens, with 20+ years of experience in automation, exploratory testing and performance testing.
Trevor Hammock, Software QA/Developer, Siemens
Software QA/Developer at Siemens, with 5 years of testing experience ranging from automation to C++ and Python for test suite development.