This project forked from armor-ai/idea

License: MIT License

IDEA

IDEA (IDentifying Emerging issues from App reviews) is a framework for detecting emerging issues from version-sensitive app reviews. You can view a visualized demo of IDEA's output on the YouTube dataset.

IDEA employs a novel method, AOLDA (Adaptively Online Latent Dirichlet Allocation), to model version-sensitive topic distributions. Emerging topics are then identified with an anomaly detection algorithm. IDEA labels each topic with the most relevant phrases and sentences, using a ranking scheme that considers both semantic relevance and user sentiment. For more details, see the following paper:

Cuiyun Gao, Jichuan Zeng, Michael Lyu, Irwin King. Online App Review Analysis for Identifying Emerging Issues. ICSE 2018.
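
To make the idea of anomaly-based detection concrete, here is a minimal sketch. It is not IDEA's implementation: the use of Jensen-Shannon divergence, the mean-plus-k-standard-deviations threshold, the value of k, and the toy distributions are all assumptions chosen for illustration. The paper describes the actual method.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    def kl(a, b):
        # Terms with zero probability in `a` contribute nothing.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2.0 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def flag_anomalies(scores, k=1.0):
    """Return indices whose score exceeds mean + k * standard deviation."""
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return [i for i, s in enumerate(scores) if s > mean + k * std]

# Toy topic-word distributions for two consecutive app versions (made up).
previous = [[0.5, 0.3, 0.2], [0.1, 0.1, 0.8], [0.4, 0.4, 0.2]]
current = [[0.5, 0.3, 0.2], [0.8, 0.1, 0.1], [0.4, 0.35, 0.25]]

# One divergence score per topic: how much the topic changed between versions.
scores = [js_divergence(p, c) for p, c in zip(previous, current)]
emerging = flag_anomalies(scores)
print(emerging)
```

In this toy data only the second topic shifts sharply between versions, so it is the one that stands out from the others.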

Input Data Format

Save raw input reviews one per line in the following format. Attributes are separated by ******, and only the first five attributes are required. Declare the number of attributes in the variable InfoNum under the [Info] section. Here, InfoNum=6.

rating******review text******title******date******version******nation
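
As a quick check that a file follows this format, a line can be split on the separator and mapped to named fields. The sketch below is only an illustration: the function name, the field names, and the sample review are hypothetical and are not taken from IDEA's source.

```python
from collections import namedtuple

SEPARATOR = "******"
# Field order follows the format line above; the Python names are illustrative.
FIELDS = ("rating", "review_text", "title", "date", "version", "nation")
Review = namedtuple("Review", FIELDS)

def parse_review_line(line, info_num=6):
    """Split one raw review line into named fields."""
    parts = line.rstrip("\n").split(SEPARATOR)
    if len(parts) != info_num:
        raise ValueError("expected %d attributes, got %d" % (info_num, len(parts)))
    # Pad missing optional attributes so every record has the same shape.
    parts += [""] * (len(FIELDS) - len(parts))
    return Review(*parts[:len(FIELDS)])

# Made-up sample line with all six attributes.
sample = SEPARATOR.join(
    ["5", "Great app, but it crashes after the update", "Crash", "2017-05-01", "12.1.0", "US"]
)
review = parse_review_line(sample)
print(review.version)
```

All values are kept as strings; converting the rating or the date is left to the caller.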

Usage

  1. Install the Python dependencies for IDEA:
$ cd IDEA/
$ ./install.sh

IDEA is built on Python 2.7 under Linux or macOS. Add sudo before the installation command if you need administrator permission.

  2. Note: the first time you use IDEA on a machine, you need to compile the pyx and C sources. Make sure _lda.c and _lda.so have been deleted before running the command:
$ cd src/
$ python build_pyx.py build_ext --inplace
  3. Run the main program using the sample data. This may take several minutes.
$ python main.py

The input and output parameters can be changed in the config.ini file.
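
To inspect a setting programmatically, config.ini can be read with Python's standard INI parser. The sketch below is written for Python 3; under Python 2.7 the module is named ConfigParser and has no read_string method. The sample text is a hypothetical excerpt: only the [Info] section and InfoNum are named in this README, and the real file contains more options.

```python
import configparser

# Hypothetical excerpt of config.ini, for illustration only.
SAMPLE_CONFIG = """
[Info]
InfoNum = 6
"""

def read_info_num(config_text):
    """Return the declared number of attributes per review line."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.getint("Info", "InfoNum")

print(read_info_num(SAMPLE_CONFIG))
```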

Dataset in the Paper

Researchers interested in obtaining the full dataset (including the validation files) used in the paper may submit a data request form. They will then be provided with the data usage agreement and further information on obtaining the data.

Visualization

  1. The source code for visualization is in the visualization folder. To prepare the input for visualization, first run
$ python get_input <result_folder> <topic_number>

result_folder ----- the output directory of IDEA, which should contain the apk name, e.g., '../result/youtube/'
topic_number  ----- the number of topics
  2. Use a local HTTP server to display the topic river. For Python 2, run python -m SimpleHTTPServer 7778; for Python 3, run python -m http.server 7778. Here 7778 is the port number, so open localhost:7778 in the browser to view the visualization.

On Linux or Mac, you can simply run:

$ bash visualize.sh <result_folder> <K>

Validation

  1. Download the word2vec model trained on 4 million app reviews from this link, and unzip it into the model folder.

  2. Set the value of Validate in config.ini to 1, and run the script.

$ python main.py

Citation

Please cite the ICSE paper if you use IDEA, the datasets, or the trained word2vec model in your work:

@inproceedings{gao2018online,
  title={Online App Review Analysis for Identifying Emerging Issues},
  author={Cuiyun Gao and Jichuan Zeng and Michael R. Lyu and Irwin King},
  booktitle={Proceedings of the 40th International Conference on Software Engineering, {ICSE} 2018},
  year={2018}
}

History

2018-2-4: first version of IDEA
