This is the online repository of the ESEC/FSE 2021 paper titled "Lightweight Global and Local Contexts Guided Method Name Recommendation with Prior Knowledge".
All datasets used in our study are open-sourced. We provide the links to each of them below.
- Empirical dataset (here)
- MNR task datasets: Java-small, Java-med, Java-large (here)
- MNR task dataset: MNire's (here)
- MCC task dataset (here)
Cognac is implemented on top of the PyTorch version of the pointer-generator network. It is built on PyTorch-1.5 and TensorFlow-1.12. We use FastText to embed each token and the Python package javalang to perform program analysis. Installation instructions for javalang are available here.
To reproduce our study, you need to:
- Execute dataextractor.py to extract the inputs of Cognac.
- Execute train_fasttext.py to train the FastText model on the data extracted in the previous step.
- Train, validate, and test the model by executing start_train.sh, start_eval.sh, and start_decode.sh, respectively.
- If you want to reproduce the MCC task, execute decode_mcc.py and then cal_sim.py.
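The steps above can be chained in a small driver script. The following is a minimal sketch, not part of the repository: the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes that every script is run from the repository root without arguments and that the `.sh` files are launched with `bash`. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys


def build_commands(include_mcc=False):
    """Return the reproduction steps in order, as argument lists.

    Assumption: each script takes no command-line arguments and is
    located in the current working directory.
    """
    commands = [
        [sys.executable, "dataextractor.py"],   # extract the inputs of Cognac
        [sys.executable, "train_fasttext.py"],  # train the FastText model
        ["bash", "start_train.sh"],             # train
        ["bash", "start_eval.sh"],              # validate
        ["bash", "start_decode.sh"],            # test
    ]
    if include_mcc:
        # Optional MCC task: decode first, then compute similarity.
        commands.append([sys.executable, "decode_mcc.py"])
        commands.append([sys.executable, "cal_sim.py"])
    return commands


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_pipeline(build_commands(include_mcc=True), dry_run=True)
```

Passing `dry_run=False` would execute the steps one after another and abort on the first non-zero exit code, which keeps a failed extraction from silently feeding the later stages.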
We cannot guarantee that other reproduction studies will achieve exactly the same results as ours. Such deviations can arise for the following reasons:
- The hyperparameters in the config.py file may need to be fine-tuned.
- In datasetextractor.py, we set a threshold that limits the time spent parsing each Java file. Hence, servers with different hardware configurations may parse different numbers of methods.
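To illustrate why a per-file time threshold makes the number of parsed methods depend on the hardware, here is a minimal sketch. It is an assumption about the general mechanism, not the code used in the extractor script: `parse_with_budget`, `make_hypothetical_parser`, and `count_parsed` are hypothetical names, and the "parser" merely sleeps in proportion to the input length to simulate a faster or slower machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def parse_with_budget(parse_fn, source, budget_seconds):
    """Run parse_fn(source); return None if it exceeds the time budget."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(parse_fn, source)
    try:
        return future.result(timeout=budget_seconds)
    except FutureTimeout:
        return None  # the file is skipped because parsing took too long
    finally:
        pool.shutdown(wait=False)  # do not block on a still-running parse


def make_hypothetical_parser(seconds_per_char):
    """Stand-in for a real parser: cost grows with the source length."""
    def parse(source):
        time.sleep(len(source) * seconds_per_char)
        return len(source)
    return parse


def count_parsed(sources, seconds_per_char, budget_seconds):
    """Count how many sources are parsed within the budget."""
    parser = make_hypothetical_parser(seconds_per_char)
    results = [parse_with_budget(parser, s, budget_seconds) for s in sources]
    return sum(result is not None for result in results)


if __name__ == "__main__":
    sources = ["ab", "a" * 100]  # one small and one large "file"
    fast = count_parsed(sources, seconds_per_char=0.0, budget_seconds=0.3)
    slow = count_parsed(sources, seconds_per_char=0.01, budget_seconds=0.3)
    print("parsed on fast machine:", fast)
    print("parsed on slow machine:", slow)
```

With the same budget, the simulated slower machine drops the large input, so the two runs end up with different amounts of extracted data. Recording the number of parsed methods alongside the results makes such differences easier to spot when comparing reproductions.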