Better Flaky Failure Analysis

Test automation is expected to provide a consistent pass/fail signal. However, when run repeatedly with no changes, tests will sometimes report different results. The transient nature of these failures makes it difficult to assess changes in behavior: was the change in signal due to a recently introduced issue, or was the behavior already present at some low frequency? The resulting ambiguity causes regressions to be missed.
There are too many of these failures to fully enumerate. We have tried ignoring results from 'unreliable' tests and rerunning failed tests to look for a passing result; both approaches lose signal and degrade the automation over time.
Our approach uses a baseline dataset and the results of a single rerun to analyze failures. The baseline dataset enumerates typical failures but also helps produce a statistical model that predicts low frequency failure behavior. Using the baseline data, a single rerun, and our statistical model, we can more accurately assess if a failure is new or existing.
Using this approach, our pull-request validation job pass rates have moved from 35% to 80%.
- anticipating failure behavior based on a single rerun
- how to pivot failure analysis based on failure frequency
- how to pull useful signal out of flaky failures
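
To make the idea of combining a baseline with a single rerun concrete, here is a minimal sketch in Python. It is not the model described in this talk, whose details are not given here. It assumes a simple two-hypothesis Bayesian comparison: the failure is either "existing" (it recurs at its smoothed baseline rate) or "new" (a regression that fails at an assumed high rate). The names Baseline, baseline_failure_rate, and probability_new, along with the default regression_rate and prior_new values, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    """Hypothetical baseline record for one failure signature."""
    failures: int  # times this failure appeared in baseline runs
    runs: int      # total baseline runs of the test


def baseline_failure_rate(baseline: Baseline) -> float:
    # Laplace smoothing (an assumption of this sketch): a failure never
    # seen in the baseline still gets a small nonzero rate.
    return (baseline.failures + 1) / (baseline.runs + 2)


def probability_new(
    baseline: Baseline,
    rerun_failed: bool,
    regression_rate: float = 0.9,  # assumed failure rate of a real regression
    prior_new: float = 0.1,        # assumed prior that any failure is new
) -> float:
    """Posterior probability that an observed failure is a new issue.

    The evidence is the initial failure plus the outcome of one rerun.
    """
    p = baseline_failure_rate(baseline)
    q = regression_rate
    # Likelihood of (initial failure, rerun outcome) under each hypothesis.
    like_existing = p * (p if rerun_failed else 1 - p)
    like_new = q * (q if rerun_failed else 1 - q)
    evidence = prior_new * like_new + (1 - prior_new) * like_existing
    return prior_new * like_new / evidence


# A failure never seen in 1000 baseline runs that also fails on rerun.
rare = probability_new(Baseline(failures=0, runs=1000), rerun_failed=True)
# A failure seen in 20% of baseline runs that passes on rerun.
common = probability_new(Baseline(failures=200, runs=1000), rerun_failed=False)
print(f"rare failure, rerun failed: {rare:.3f}")
print(f"common failure, rerun passed: {common:.3f}")
```

Under these assumptions, a failure that is rare in the baseline and repeats on rerun scores as very likely new, while a failure that is common in the baseline and passes on rerun scores as very likely existing. The rerun outcome shifts the estimate rather than overriding the original failure.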

Michael Robinson, Principal Software Engineer, Microsoft

Michael Robinson has been working on test automation at Microsoft for 19 years. During that time, he has worked on the automation systems and frameworks used to validate Microsoft Office at scale and has seen those systems grow from executing a few hundred tests a week to millions per day. Michael has a Bachelor of Science in Computer Science from the University of Missouri-Rolla. He is married with two small children. In his spare time, he enjoys playing video games, taking care of his cars, anything related to technology, and gardening.

Jenava Sexton, Senior Product Manager, Microsoft

Jenava Sexton is a Senior Product Manager at Microsoft, working with test automation systems for the MS Office Engineering System. She has worked in her role for three and a half years. Over the course of her 20-year career, Jenava has created and implemented many business systems across a wide variety of industries and company sizes. She is passionate about building software that makes complex business processes and decisions clear and straightforward for users. Jenava earned an MBA from the University of Washington and a BA from Bennington College. She has two small children, a spouse, and a large standard poodle. In her free time, she enjoys sewing, crocheting, and yoga.
Connect on LinkedIn: https://www.linkedin.com/in/jenavasexton/