Review with Security Concern Dataset
1. Introduction
Giving potential users pertinent information about security concerns drawn from user feedback is essential to raising the security level of the entire mobile ecosystem. According to a usable security survey, star ratings and reviews can inform users' download and permission decisions. However, many apps with high star ratings still have security problems. To confirm this problem, we built a dataset by collecting 56,439,878 star ratings and reviews for 8,999 popular game apps on the Google Play Store. This study proposes a model that uses an active learning method to distinguish Review with Security Concern (RSC) from other user reviews.
2. Publication
3. Dataset
This dataset consists of reviews of popular Android game apps that have each been downloaded more than 10,000,000 times from Google Play.
3-1. Data attributes
all_reviews
All collected reviews (raw version)
Consists of a total of 8,999 apps
train_reviews
Review data for training
Consists of a total of 6,502 reviews labeled according to the three types below.
3-2. Security Concern type
Data Leakage
Excessive Advertising
Inexpedient Payment
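As a minimal sketch of how the three types above might be handled when working with train_reviews, the Python snippet below models them as an enum and counts reviews per type. The (review text, label) pair layout, the `SecurityConcern` and `count_by_type` names, and the sample rows are all assumptions made for illustration; they are not the dataset's documented schema or actual contents.

```python
from collections import Counter
from enum import Enum


class SecurityConcern(Enum):
    """The three Security Concern types listed in section 3-2."""
    DATA_LEAKAGE = "Data Leakage"
    EXCESSIVE_ADVERTISING = "Excessive Advertising"
    INEXPEDIENT_PAYMENT = "Inexpedient Payment"


def count_by_type(labeled_reviews):
    """Count reviews per Security Concern type.

    `labeled_reviews` is an iterable of (review_text, label) pairs.
    This pair layout is an assumption, not the dataset's real schema.
    """
    counts = Counter(SecurityConcern(label) for _, label in labeled_reviews)
    # Report every type, including those with zero reviews.
    return {concern: counts.get(concern, 0) for concern in SecurityConcern}


# Made-up rows for illustration only (not taken from the dataset).
sample = [
    ("This game asks for my contacts for no reason", "Data Leakage"),
    ("An ad pops up after every level", "Excessive Advertising"),
    ("Another ad every time I open the menu", "Excessive Advertising"),
]

for concern, n in count_by_type(sample).items():
    print(f"{concern.value}: {n}")
```

A label string that does not match one of the three types raises a `ValueError`, which can help catch inconsistent labels when the real files are loaded.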
4. Download
We share this data for academic purposes. The data was collected from publicly available Google Play content, and the dataset was labeled manually. If you have any feedback regarding the dataset, please contact us. If you use our dataset in your experiments, please cite our paper.
Dataset Download Link: Download
5. Contact
Sangho Lee (lee35@korea.ac.kr), Huy Kang Kim (cenda@korea.ac.kr)