

Android Benchmark Reproduction Framework


ReproDroid is a framework which can be used to create, refine and execute reproducible benchmarks for Android app analysis tools.

!!! Update can be found in the Errata section below !!!


The complete ReproDroid framework consists of BREW and its underlying AQL-System, which uses the AQL. The picture below summarizes how the framework works. BREW takes a set of apps or a complete benchmark as input and issues one AQL-Query per benchmark case. The queries arrive one after another at an AQL-System, which produces one AQL-Answer per query. To do so, it uses the analysis tools specified in BREW’s configuration file. BREW gathers all AQL-Answers and, based on them, compiles a final report, e.g. for a benchmark.
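The workflow described above can be sketched in a few lines of Python. This is only an illustration of the "one query per case, one answer per query, one report at the end" loop, not BREW's actual implementation: `BenchmarkCase`, `build_query`, `run_benchmark`, and `toy_tool` are hypothetical names, the query string is a placeholder rather than real AQL syntax, and the TP/FP/TN/FN summary is an assumed report shape.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkCase:
    """One benchmark case (hypothetical shape): an app and its expected outcome."""
    app: str
    expected: bool  # True = positive / expected case


def build_query(case: BenchmarkCase) -> str:
    # Placeholder text only; this is NOT real AQL syntax.
    return f"QUERY-FOR {case.app}"


def run_benchmark(cases, answer_query):
    """Issue one query per case, collect one answer per query, then summarize."""
    report = Counter()
    for case in cases:
        query = build_query(case)
        found = answer_query(query)  # stand-in for the AQL-System and its tools
        if case.expected:
            report["TP" if found else "FN"] += 1
        else:
            report["FP" if found else "TN"] += 1
    return dict(report)


def toy_tool(query: str) -> bool:
    # Toy "analysis tool": reports a finding only if the app name contains "Leak".
    return "Leak" in query


cases = [
    BenchmarkCase("LeakA.apk", expected=True),
    BenchmarkCase("LeakB.apk", expected=False),
    BenchmarkCase("CleanA.apk", expected=True),
    BenchmarkCase("CleanB.apk", expected=False),
]
report = run_benchmark(cases, toy_tool)
print(report)
```

Passing the tool as a plain function mirrors the idea that the analysis tools are configured separately from the benchmark driver, so the same cases can be replayed against a different tool.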

The tools and results presented in the proposing paper can be downloaded for inspection here. To work with the framework, we suggest downloading the up-to-date version of BREW. The underlying AQL-System is also available in a newer version.


To refine benchmarks and to determine the associated results, the Benchmark Refinement and Execution Wizard (BREW) has been used. There are two versions available for download:

A tutorial on how to fully load ReproDroid benchmark results can be found here.

Documentation of the Android App Analysis Query Language (AQL) and of the AQL-System using it is also available online:


None of the six evaluated tools is bundled with BREW or the AQL-System. This tutorial explains how to set up a configuration file in order to use a tool. The six evaluated tools themselves can be downloaded from their associated websites:


All results determined with ReproDroid can be found in this section.


The refined versions of DroidBench 2.0 and 3.0 as well as the extended DroidBench version can be downloaded here. Every download includes:


Extensions for DroidBench

The Feature-Checking and Intent-Matching benchmark extensions can be downloaded here. Both are available for Android API 19 and 26. Every download includes:



The refined version of ICC-Bench 2.0 can be downloaded here. It includes:



The iteratively refined version of DIALDroidBench can be downloaded here. It includes:



All benchmarks above that are based on DroidBench contain four mislabeled benchmark cases:

| Category | Benchmark Case | Wrong Label | Correct Label |
| --- | --- | --- | --- |
| Aliasing | SimpleAliasing1 | Negative / Not-Expected Case | Positive / Expected Case |
| UnreachableCode | UnreachableBoth | Positive / Expected Case | Negative / Not-Expected Case |
| UnreachableCode | UnreachableSink1 | Positive / Expected Case | Negative / Not-Expected Case |
| UnreachableCode | UnreachableSource1 | Positive / Expected Case | Negative / Not-Expected Case |
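If you keep your own copy of the ground truth, the four corrections listed above could be applied with a small patch step like the following. This is a minimal sketch under stated assumptions: the dictionary layout (a `(category, case)` key mapped to an "expected" flag) and the names `CORRECTIONS`, `apply_errata`, and `SomeOtherCase` are hypothetical and do not reflect the file format BREW uses.

```python
# Corrected labels from the errata table: True = positive / expected case.
CORRECTIONS = {
    ("Aliasing", "SimpleAliasing1"): True,
    ("UnreachableCode", "UnreachableBoth"): False,
    ("UnreachableCode", "UnreachableSink1"): False,
    ("UnreachableCode", "UnreachableSource1"): False,
}


def apply_errata(labels):
    """Return a copy of `labels` with the errata corrections applied."""
    fixed = dict(labels)
    for key, expected in CORRECTIONS.items():
        if key in fixed:  # only touch cases that are actually present
            fixed[key] = expected
    return fixed


# Hypothetical ground truth carrying the four wrong labels plus one unrelated case.
old_labels = {
    ("Aliasing", "SimpleAliasing1"): False,
    ("UnreachableCode", "UnreachableBoth"): True,
    ("UnreachableCode", "UnreachableSink1"): True,
    ("UnreachableCode", "UnreachableSource1"): True,
    ("Aliasing", "SomeOtherCase"): False,  # made-up case, must stay unchanged
}
new_labels = apply_errata(old_labels)
changed = sorted(case for (_, case) in old_labels if old_labels[(_, case)] != new_labels[(_, case)])
print(changed)
```

Results computed against the old labels need to be re-scored afterwards, since a flipped label turns, for example, a former false positive into a true positive.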

Furthermore, the results in the category Reflection were reported incorrectly: the filter included the category Reflection_ICC in the category Reflection, a simple substring-matching mistake. (The results for most benchmarks and all tools above will be re-evaluated and published here as soon as possible, although this might still take a while.)
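The kind of mistake described here is easy to reproduce. The sketch below is not the original filter code; `in_category_substring` and `in_category_exact` are hypothetical helpers that only illustrate why a substring test pulls Reflection_ICC cases into Reflection while an exact comparison does not.

```python
def in_category_substring(case_category: str, wanted: str) -> bool:
    # Buggy variant: "Reflection" is a substring of "Reflection_ICC".
    return wanted in case_category


def in_category_exact(case_category: str, wanted: str) -> bool:
    # Fixed variant: the category name must match as a whole.
    return case_category == wanted


categories = ["Reflection", "Reflection_ICC", "Aliasing"]

buggy = [c for c in categories if in_category_substring(c, "Reflection")]
fixed = [c for c in categories if in_category_exact(c, "Reflection")]

print(buggy)  # ['Reflection', 'Reflection_ICC']
print(fixed)  # ['Reflection']
```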

DroidBench 3.0 (updated) + TaintBench

Here you can find an updated version of the DroidBench 3.0 benchmark (DroidBench website) and the new TaintBench benchmark (TaintBench website). To open them, you need BREW version 2.0.0 or newer. These are the two benchmarks we recommend for evaluating your Android taint analysis tool.




Felix Pauck (FoelliX)
Paderborn University