michaelpradel / deepbugs
DeepBugs is a framework for learning bug detectors from an existing code corpus.
License: MIT License
When using BugDetection.py, unknown tokens are not converted into UNK, and instances containing them are skipped.
For example, this happens in python/LearningDataSwappedArgs.py, lines 65-69.
In that case, this could be fixed by having python/Util.py replace OOV tokens in the call before yielding it.
I've checked this for the swapped-arguments bugs, but I guess that the same happens for the other bug types.
Please let me know if I have misunderstood how the above works.
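The fix proposed above could be sketched roughly as follows. This is only an illustration, not the code in python/Util.py: the record fields `callee` and `arguments`, the helper names, and the token strings are all assumptions made for the example.

```python
UNK = "UNK"  # placeholder token name, assumed for this sketch


def replace_oov(tokens, vocabulary, unk=UNK):
    """Return the tokens with every out-of-vocabulary entry replaced by unk."""
    return [t if t in vocabulary else unk for t in tokens]


def normalize_call(call, vocabulary):
    """Copy a call record and map unknown callee/argument tokens to UNK.

    The 'callee' and 'arguments' keys are hypothetical field names.
    """
    fixed = dict(call)
    fixed["callee"] = replace_oov([call["callee"]], vocabulary)[0]
    fixed["arguments"] = replace_oov(call["arguments"], vocabulary)
    return fixed


# Hypothetical vocabulary and call record.
vocabulary = {"ID:setTimeout", "ID:fn", UNK}
call = {"callee": "ID:setTimeout", "arguments": ["ID:delay", "ID:fn"]}
normalized = normalize_call(call, vocabulary)
print(normalized)
```

With this kind of normalization applied before a call is yielded, an instance with an unknown token would reach the learner with UNK in its place instead of being skipped.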
Have you performed any experiments with larger embeddings from Word2Vec?
For example, 512 or 1024 dimensions.
Do you know if that results in better or worse performance?
Part 3 of "Embeddings for Identifiers" has the following command:
python3 python/EmbeddingsLearnerWord2Vec.py token_to_number_*.json encoded_tokens_*.json
But when the command is run with the appropriate filenames for token_to_number_*.json and encoded_tokens_*.json, I get an error:
python3: can't open file 'python/EmbeddingsLearnerWord2Vec.py': [Errno 2] No such file or directory
Looking in the python directory, I see a file named EmbeddingLearnerWord2Vec.py, not EmbeddingsLearnerWord2Vec.py.
Running the command with EmbeddingLearnerWord2Vec.py works fine.
I found 5 types in BugDetection.py, but only 3 types are listed in README.md. Can the other two types (IncorrectAssignment and MissingArg) be used?
I seem to have difficulty running the initial step 1 command from the README.md:
node javascript/extractFromJS.js idsLitsWithTokens --parallel 4 data/js/programs_50_training.txt data/js/programs_50
With all the files correctly in place, it prints the following:
Total number of files: 0
Left in worklist: 0. Spawning an instance.
Left in worklist: 0. Spawning an instance.
Left in worklist: 0. Spawning an instance.
Left in worklist: 0. Spawning an instance.
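One thing that may be worth checking when the extractor reports zero files is whether the paths in the list file resolve from the directory the command is run in. The sketch below assumes that the list file contains one path per line, which is an assumption about the file format rather than something stated here; `missing_paths` is a hypothetical helper.

```python
import os
import tempfile


def missing_paths(list_file, base_dir="."):
    """Return the listed paths that do not exist relative to base_dir."""
    missing = []
    with open(list_file, encoding="utf-8") as f:
        for line in f:
            path = line.strip()
            if path and not os.path.exists(os.path.join(base_dir, path)):
                missing.append(path)
    return missing


# Self-contained demo using a temporary directory and made-up file names.
with tempfile.TemporaryDirectory() as base:
    open(os.path.join(base, "a.js"), "w").close()
    list_file = os.path.join(base, "programs.txt")
    with open(list_file, "w", encoding="utf-8") as f:
        f.write("a.js\nb.js\n")
    result = missing_paths(list_file, base)

print(result)
```

If every listed path is reported as missing, the list file and the working directory probably disagree about where the corpus lives.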
I have read your paper and tried to understand how positive and negative training samples are created.
Is the transformation process available in this repo? If yes, please let me know the names of the files that traverse the ASTs and extract the function calls so that the examples can be created.
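For readers with the same question, the general idea for the swapped-arguments case can be sketched as below: the call as found in the corpus is treated as a correct example, and a copy with its two arguments swapped is treated as a buggy one. This is a simplified illustration and not the repository's code; the record fields and the function name are hypothetical.

```python
def make_training_pairs(call):
    """Build one correct and one seeded-bug example from a two-argument call.

    Label 0 marks the original argument order, label 1 the swapped order.
    Calls with any other number of arguments are ignored in this sketch.
    """
    callee = call["callee"]
    args = call["arguments"]
    if len(args) != 2:
        return []
    original = (callee, args[0], args[1])
    swapped = (callee, args[1], args[0])
    return [(original, 0), (swapped, 1)]


# Hypothetical call record extracted from some AST.
examples = make_training_pairs(
    {"callee": "setTimeout", "arguments": ["fn", "delay"]}
)
print(examples)
```

In the actual pipeline the names would presumably be turned into embedding vectors before being passed to the classifier; that step is omitted here.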
In the paper, there is a code snippet from the Angular.js project, where the setTimeout function is used incorrectly:
browserSingleton.startPoller(100,
  function(delay, fn) {
    setTimeout(delay, fn);
  })
However, I am unable to find such an instance in the Angular.js code repo.
Can you please share the link to the buggy code/commit or pull request?
Hello,
I have two questions about DeepBugs:
Best regards
Florian
Hi
When I run the 'node javascript/extractFromJS.js calls --parallel 4 ..' command on the bigger corpus for programs_eval.txt (or programs_training.txt), multiple 'calls_..'.json files get created instead of a single 'calls_..'.json file (corresponding to programs_eval.txt).
Which of these 'calls_..'.json files should I use in the next step (training the classifier) in Python?
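If a single combined input is wanted, one option is to gather all of the generated files and concatenate their contents. The sketch below assumes that each file holds a JSON array of call records, which is an assumption about the output format; `load_all_calls` and the demo file names are hypothetical.

```python
import glob
import json
import os
import tempfile


def load_all_calls(pattern):
    """Concatenate the JSON arrays from every file matching the glob pattern."""
    calls = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as f:
            calls.extend(json.load(f))
    return calls


# Self-contained demo with two made-up output files.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "calls_1.json"), "w", encoding="utf-8") as f:
        json.dump([{"callee": "f"}], f)
    with open(os.path.join(tmp, "calls_2.json"), "w", encoding="utf-8") as f:
        json.dump([{"callee": "g"}, {"callee": "h"}], f)
    merged = load_all_calls(os.path.join(tmp, "calls_*.json"))

print(len(merged))
```

Whether merging is needed at all depends on whether the training script accepts several input files at once, which this sketch does not assume either way.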
I have followed all the instructions in the README file. The first part works fine, and I get calls_* files for training and eval. When I run the second step, the message below is printed along with a bunch of other output:
Traceback (most recent call last):
File "python/BugDetection.py", line 134, in
learning_data.pre_scan(training_data_paths, validation_data_paths)
Hi,
I would like to simply try out each bug detector, but the README.md does not seem to explain how to run each bug detector after the learning and validation process. That is, I expect that, given a JavaScript file as input, a bug detector reports the buggy locations in the file as output.
Can you give me some help?
Hello
Can you share type_to_vector.json and node_type_to_vector.json, or share an idea of how to generate them?
I will be waiting for a response.
Thanks and regards
Shivam