ARPfix Benchmark Results
Link to ARPfix benchmark: https://github.com/sqlab-sustech/APER-ARPfix-benchmark
Link to Aper executable: APER tool
Link to Lint source code: com/android/tools/lint/checks/PermissionDetector.kt
Link to ARPDroid repository: https://bitbucket.org/malindadoo/arpdroid/src/develop/
Link to RevDroid repository: https://github.com/letitb/revdroid
We compare the tools by the following counts:
- True positive (\(TP\)): number of buggy versions that have warnings
- True negative (\(TN\)): number of patched versions that have no warnings
- False positive (\(FP\)): number of patched versions that have warnings
- False negative (\(FN\)): number of buggy versions that have no warnings
The evaluation metrics:
\[Precision=\frac{TP}{TP+FP}\]
\[Recall=\frac{TP}{TP+FN}\]
\[F_1\ score=\frac{2 \cdot Precision \cdot Recall}{Precision+Recall}\]
\[FPR=\frac{FP}{FP+TN}\]
1: Type-1
Tools | True Positive | True Negative | False Positive | False Negative | Failed | Precision(%) | Recall(%) | F1(%) | FPR(%) |
---|---|---|---|---|---|---|---|---|---|
Lint | 16 | 22 | 13 | 19 | - | 55.17 | 45.71 | 50.00 | 81.25 |
ARPDroid | 13 | 22 | 13 | 22 | - | 50.00 | 37.14 | 42.62 | 100.00 |
RevDroid | 15 | 26 | 6 | 17 | 6 | 71.43 | 46.88 | 56.61 | 40.00 |
Aper | 26 | 32 | 3 | 9 | - | 89.66 | 74.29 | 81.25 | 11.59 |
2: Type-2
Tools | True Positive | True Negative | False Positive | False Negative | Failed | Precision(%) | Recall(%) | F1(%) | FPR(%) |
---|---|---|---|---|---|---|---|---|---|
Lint | 14 | 15 | 10 | 11 | - | 58.33 | 56.00 | 57.14 | 71.43 |
Aper | 23 | 19 | 6 | 2 | - | 79.31 | 92.00 | 85.19 | 26.09 |
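
The metric definitions above can be applied directly to the counts in each table row. The following is a minimal Python sketch of that calculation, not the script used to produce the tables; the names `Counts`, `_ratio`, and `metrics` are hypothetical and introduced only for illustration. It assumes the percentages are rounded to two decimal places and that F1 is computed from the unrounded precision and recall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for one tool on one benchmark type."""
    tp: int  # buggy versions that have warnings
    tn: int  # patched versions that have no warnings
    fp: int  # patched versions that have warnings
    fn: int  # buggy versions that have no warnings


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def metrics(c: Counts) -> dict:
    """Apply the four formulas and return percentages rounded to 2 places."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    f1 = _ratio(2 * precision * recall, precision + recall)
    fpr = _ratio(c.fp, c.fp + c.tn)  # FPR as defined above: FP / (FP + TN)
    values = [("precision", precision), ("recall", recall),
              ("f1", f1), ("fpr", fpr)]
    return {name: round(100 * value, 2) for name, value in values}


# Counts copied from the Aper row of the Type-1 table.
aper_type1 = Counts(tp=26, tn=32, fp=3, fn=9)
print(metrics(aper_type1))
```

For the Aper Type-1 row this gives a precision of 89.66, a recall of 74.29, and an F1 score of 81.25, in line with the table. The FPR it returns follows the formula FP/(FP+TN) as stated above and may not agree with the FPR column as listed, so treat it as the value implied by the definition rather than a reproduction of that column.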