Flaky tests have gained attention from the research community in recent years, and with good reason. These tests waste time and resources and reduce the reliability of the test suites and build systems they affect. However, most existing work on flaky tests focuses exclusively on traditional unit tests. This focus overlooks UI tests, which have larger input spaces and more diverse running conditions than traditional unit tests. In addition, UI tests tend to be more complex and resource-heavy, which makes them poorly suited to detection techniques that rerun test suites multiple times.
In this paper, we present a study of flaky UI tests. We analyze 235 flaky UI test samples found in 62 projects from both web and Android environments. We identify the common underlying root causes of flakiness in UI tests, the strategies used to manifest the flaky behavior, and the strategies used to fix flaky UI tests. These findings can provide a foundation for the development of detection and prevention techniques for flakiness arising in UI tests.