Date of Award
Doctor of Philosophy
Mobile devices have become ubiquitous in recent years. Android, the leading platform in the mobile ecosystem, has over 2.5 million apps published in the Google Play Market. This enormous ecosystem creates fierce competition between apps with similar functionality, in which low app quality has been shown to increase the churn rate considerably. Additionally, the complex event-driven, framework-based architecture that developers use to implement apps imposes several challenges and has led to new varieties of code smells and bugs. There is a need for tools that assure the quality of apps, such as program analysis and testing tools. One of the foundational challenges in developing these tools is the sequencing, or ordering, of callback methods invoked from external events (e.g. GUI events) and framework calls. Even for a small subset of callbacks, it has been shown that current state-of-the-art tools fail to generate sequences of callbacks that match the runtime behavior of Android apps.
This thesis explores the construction and applications of new representations and program analyses for event-driven, framework-based mobile applications, specifically Android apps. In Android, we observe that changes of control flow between entry points are mostly handled by the framework using callbacks. These callbacks can be executed synchronously or asynchronously when an external event happens (e.g. a click event) or a framework call is made. In framework-based systems, method calls to the framework can invoke sequences of callbacks. Because of the high overhead introduced by libraries such as the Android framework, most current tools for the analysis of Android apps skip the analysis of these libraries. As a result, these analyses miss the correct order of the callbacks invoked in framework calls. This thesis presents a new specification called Predicate Callback Summary (PCS) to summarize how library or API methods invoke callbacks. PCSs enable inter-procedural analysis of Android apps without the overhead of analyzing the whole framework, and they help developers understand how their code (callback methods) is executed by the framework. We show that our static analysis technique for computing PCSs is accurate and scalable, considering the complexity of the millions of lines of code in the Android framework.
With PCSs, we have information about the control flow of callbacks invoked in framework calls but lack information about how external events can execute callbacks. To integrate event-driven control flow behavior with the control flow behavior generated from framework calls, we designed a novel program representation, namely Callback Control Flow Automata (CCFA). The design of CCFA is based on the Extended Finite State Machine (EFSM) model, which extends the Finite State Machine (FSM) by labeling transitions with information such as guards. In a CCFA, a state represents whether the execution path enters or exits a callback. The transition from one state to another represents the transfer of control flow between callbacks. We present an analysis to automatically construct CCFAs by combining two callback control flow representations developed in previous research, namely Window Transition Graphs (WTGs) and PCSs. To demonstrate the usefulness of our representation, we integrated CCFAs into two client analyses: a taint analysis using FLOWDROID, and a value-flow analysis that computes source and sink pairs of a program. Our evaluation shows that we can compute CCFAs efficiently and that CCFAs improve callback coverage over WTGs. As a result of using CCFAs, we obtained 33 more true positive security leaks than FLOWDROID across the 55 apps we analyzed. With a low false positive rate, we found that 22.76% of the source-sink pairs we computed are located in different callbacks and that 31 out of 55 apps contain source-sink pairs spread across components.
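The state and transition structure described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the class names, the `trigger` label, the `button_enabled` guard variable, and the example callbacks are all hypothetical, chosen only to show states that mark callback entry or exit and transitions that carry a guard in the style of an EFSM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Env = Dict[str, bool]  # hypothetical guard environment (e.g. GUI state)


@dataclass(frozen=True)
class State:
    callback: str  # name of the callback, e.g. "onCreate"
    kind: str      # "entry" or "exit" of that callback


@dataclass(frozen=True)
class Transition:
    source: State
    target: State
    trigger: str                  # assumed label: an event name or "return"
    guard: Callable[[Env], bool]  # EFSM-style guard on the transition


def always(env: Env) -> bool:
    """Guard that is satisfied in every environment."""
    return True


class CCFA:
    """Toy automaton over callback entry/exit states (illustrative only)."""

    def __init__(self, initial: State) -> None:
        self.initial = initial
        self.transitions: List[Transition] = []

    def add(self, source: State, target: State, trigger: str,
            guard: Callable[[Env], bool] = always) -> None:
        self.transitions.append(Transition(source, target, trigger, guard))

    def run(self, triggers: List[str], env: Env) -> List[str]:
        """Follow triggers from the initial state and return the callbacks
        entered, stopping at the first trigger with no enabled transition."""
        current = self.initial
        entered = [current.callback] if current.kind == "entry" else []
        for trigger in triggers:
            match = next(
                (t for t in self.transitions
                 if t.source == current and t.trigger == trigger and t.guard(env)),
                None,
            )
            if match is None:
                break
            current = match.target
            if current.kind == "entry":
                entered.append(current.callback)
        return entered


def build_example() -> CCFA:
    """Hypothetical app: onCreate runs, then a click may run onClick."""
    create_in, create_out = State("onCreate", "entry"), State("onCreate", "exit")
    click_in, click_out = State("onClick", "entry"), State("onClick", "exit")
    ccfa = CCFA(create_in)
    ccfa.add(create_in, create_out, "return")
    # The click transition is guarded: it is taken only if the button is enabled.
    ccfa.add(create_out, click_in, "click", lambda env: env["button_enabled"])
    ccfa.add(click_in, click_out, "return")
    return ccfa
```

Calling `build_example().run(["return", "click", "return"], {"button_enabled": True})` follows the guarded click transition and reports both callbacks. With the guard false, the walk stops after `onCreate`. A real CCFA would be derived from WTGs and PCSs rather than written by hand.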
In the last part of this thesis, we use CCFAs to develop a new family of coverage criteria based on callback sequences for more effective testing of Android apps. We present two studies that help us identify which types of callbacks are important for detecting bugs. Guided by these empirical results, we define three coverage criteria based on callback sequences. Our evaluation shows that our coverage criteria are more effective than statement coverage and GUI-based event coverage at guiding test input generation.
Danilo Dominguez Perez
Dominguez Perez, Danilo, "The construction and applications of callback control flow graphs for event-driven and framework-based mobile apps" (2019). Graduate Theses and Dissertations. 17004.