Android has become the most popular mobile operating system. Correspondingly, a growing amount of Android malware has been developed and spread to steal users’ private information. One type of malware uses benign behaviors to camouflage its malicious behaviors. Its malicious component occupies only a small part of the code of the application (app for short) and is strongly coupled with the benign part. Such malware may cause false negatives when detectors extract features from the entire app for classification, because the malicious features may be hidden among benign ones. Some previous work divides the app into several parts to discover the malicious part. However, these partitioning methods presume that the connections between the normal part and the malicious part are weak (e.g., in repackaged apps).
In this paper, we call this type of malware Android covert malware and build the first dataset of covert malware. To detect these malware samples, we first conduct static analysis to extract function call graphs. Through in-depth analysis of these graphs, we observe that although the correlations between the normal part and the malicious part are high, the degree of these correlations falls within a distinctive range of distribution. Based on this observation, we design a novel system, HomDroid, to detect covert malware by analyzing the homophily of call graphs. Our evaluation on a dataset of 4,840 benign apps and 3,385 covert malicious apps shows that the ideal correlation threshold for distinguishing the normal part from the malicious part is 3. With this threshold, HomDroid detects 96.8% of covert malware (a false negative rate of 3.2%), while the false negative rates of three other state-of-the-art systems (PerDroid, Drebin, and MaMaDroid) are 30.7%, 16.3%, and 15.2%, respectively.
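To make the notion of call-graph homophily concrete, the following is a minimal sketch, not HomDroid's implementation. It assumes the common edge-homophily definition (the fraction of edges whose endpoints share a label) and a simple count of calls that cross between the two parts; the abstract does not state the exact correlation measure. The method names, the labels, and the helper functions `edge_homophily` and `cross_part_calls` are all hypothetical.

```python
def edge_homophily(edges, label):
    """Fraction of call edges whose caller and callee share a label.

    Assumption: this is the standard edge-homophily ratio, used here
    only for illustration; it may differ from the paper's measure.
    """
    if not edges:
        return 0.0
    same = sum(1 for caller, callee in edges if label[caller] == label[callee])
    return same / len(edges)


def cross_part_calls(edges, label):
    """Count call edges that cross between differently labelled parts."""
    return sum(1 for caller, callee in edges if label[caller] != label[callee])


# Toy call graph as (caller, callee) pairs; every name is hypothetical.
edges = [
    ("MainActivity.onCreate", "Ui.render"),
    ("Ui.render", "Net.fetch"),
    ("MainActivity.onCreate", "Tracker.collect"),
    ("Tracker.collect", "Sms.send"),
    ("Net.fetch", "Tracker.collect"),
]

# Hypothetical partition of methods into a normal and a suspicious part.
label = {
    "MainActivity.onCreate": "normal",
    "Ui.render": "normal",
    "Net.fetch": "normal",
    "Tracker.collect": "suspicious",
    "Sms.send": "suspicious",
}

homophily = edge_homophily(edges, label)
crossing = cross_part_calls(edges, label)
print(f"edge homophily: {homophily:.2f}, cross-part calls: {crossing}")
```

In this toy graph, two of the five calls cross between the parts, which illustrates the strong coupling that makes covert malware hard to partition. How such a measure is compared against the threshold of 3 is left open here, since the abstract does not specify it.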