saltudelft / type4py
Type4Py: Deep Similarity Learning-Based Type Inference for Python
License: Apache License 2.0
When using the preprocess command with only the -o argument, the code crashes with the following error:
UnboundLocalError: local variable 'train_files_vars' referenced before assignment
This is because in the extract at lines 302 to 319 in 93828c3, the train_files_vars variable is only initialised in the if branch.
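The failure mode described here can be reproduced in isolation. The following is a minimal sketch, not the project's actual code: the function names, the use_split flag, and the file names are hypothetical, and only the variable name train_files_vars comes from the error message. It shows a name bound in just one branch, and one possible fix that gives it a default before the conditional.

```python
from typing import List


def load_vars(use_split: bool) -> List[str]:
    """Buggy shape: the name is bound only when the condition holds."""
    if use_split:
        train_files_vars = ["a.py", "b.py"]  # hypothetical data
    # If use_split is False, nothing was ever assigned to the name,
    # so reading it here raises UnboundLocalError.
    return train_files_vars


def load_vars_fixed(use_split: bool) -> List[str]:
    """One possible fix: bind a default before the branch."""
    train_files_vars: List[str] = []
    if use_split:
        train_files_vars = ["a.py", "b.py"]  # hypothetical data
    return train_files_vars


if __name__ == "__main__":
    try:
        load_vars(False)
    except UnboundLocalError as exc:
        print("crashed:", exc)
    print("fixed:", load_vars_fixed(False))
```

Whether an empty default or an explicit error is the right behaviour when the branch is skipped depends on what the preprocess step expects downstream.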
Hello, thank you for creating and providing this great project! I plan to use this project for my bachelor thesis. Therefore, I am mainly interested in the inference functionality provided with infer.py on branch server (branch infer seems to be outdated).
I am aware of the VS Code extension and the public JSON API. I, however, prefer to use this project locally.
Since infer.py takes a pre-trained model as a program argument, I followed all the steps in the README to train such a model.
Unfortunately, the script crashes with the following message (excerpt):
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: tok for the following indices
index: 0 Got: 7 Expected: 1
Please fix either the inputs or the model.
Below you can find a link to a Google Colab notebook with all the steps from start (downloading the ManyTypes4Py dataset, pip-installing type4py, preprocessing) to finish (training a model, trying to infer the types of a single file) and the corresponding output from when I ran it the last time (including the full error backtrace on the bottom):
https://colab.research.google.com/drive/1kRIffMlgGCeW55wXelksGrXfSd0WjhKQ?usp=sharing
It should be relatively self-explanatory. Evidently, I use a fork of this project and not the project itself. The differences are minor though: in learn.py, I just re-uncommented the .to(DEVICE) calls (c42144d), as otherwise it would lead to a crash in the notebook (vectors are on different devices). The remaining changes don't affect Python files and are not relevant to this issue.
Further, I am using venv, although I doubt this has any negative influence on the execution of this project.
My question is, how can I successfully use infer.py? How can I obtain a proper compatible model for it?
Are any of those steps in the linked notebook incorrect?
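One way to read the error excerpt in this issue: the model input named tok expects size 1 on axis 0, while the array passed at inference time has size 7 on that axis. The sketch below is not ONNX Runtime code and makes no claim about how the library performs its check. It is a plain-Python helper with a hypothetical name that compares a supplied shape against an expected one and reports mismatches in the same index/got/expected terms; the shapes beyond axis 0 are invented for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def find_shape_mismatches(
    got: Sequence[int],
    expected: Sequence[Optional[int]],
) -> List[Tuple[int, int, int]]:
    """Return (index, got, expected) for every axis whose size differs.

    An expected size of None stands for a dynamic axis that accepts any size.
    """
    if len(got) != len(expected):
        raise ValueError(f"rank mismatch: got {len(got)}, expected {len(expected)}")
    mismatches = []
    for index, (g, e) in enumerate(zip(got, expected)):
        if e is not None and g != e:
            mismatches.append((index, g, e))
    return mismatches


if __name__ == "__main__":
    # Hypothetical shapes: only the 7-versus-1 on axis 0 comes from the error.
    for index, g, e in find_shape_mismatches((7, 7, 100), (1, 7, 100)):
        print(f"index: {index} Got: {g} Expected: {e}")
    # With a dynamic first axis, the same input would be accepted.
    print(find_shape_mismatches((7, 7, 100), (None, 7, 100)))
```

If this reading is right, the two usual ways out are feeding the inputs one item at a time or exporting the model with a dynamic first axis; which one applies here depends on how the model was exported during training.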
The JSON output file is not JSON conformant in two aspects:
This may affect some simpler JSON parsers; better JSON parsers can handle these minor errors just fine.
'error': None
#should be
"error": "None"
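Output of the form 'error': None is what Python prints for a dict's own text representation, so one plausible cause (an assumption, not confirmed by the issue) is that the result was written with str() or repr() rather than a JSON serializer. The sketch below uses only the standard json module to contrast the two; the result dict and the is_valid_json helper are hypothetical.

```python
import json

# Hypothetical result object mirroring the field quoted in the issue.
result = {"error": None}

as_repr = str(result)         # Python literal syntax: single quotes, None
as_json = json.dumps(result)  # JSON syntax: double quotes, null


def is_valid_json(text: str) -> bool:
    """Return True if the standard json module can parse the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(as_repr, is_valid_json(as_repr))
    print(as_json, is_valid_json(as_json))
```

Note that json.dumps writes null for a missing error rather than the string "None" suggested in the issue; either is valid JSON, but consumers would need to know which one to expect.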
Hey there,
I am currently trying to get this project up and running and was following the instructions to train the model using the ManyTypes4Py dataset. Unfortunately, the preprocess command just skips the dataset (or rather, does not find any relevant information). I solved this issue by removing the files all_fns.csv and all_vars.csv and symlinking processed_projects_complete to processed_projects.
Did I miss anything during the setup? Are those steps expected and should be added to the documentation?
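The workaround described in this issue could be scripted roughly as follows. This is a sketch under assumptions the issue does not confirm: that both CSV files and the processed_projects_complete directory sit directly in the dataset root, and that the intended link is a new entry named processed_projects pointing at processed_projects_complete. The function name and the dataset path are hypothetical.

```python
from pathlib import Path


def apply_workaround(dataset_dir: Path) -> None:
    """Remove the stale CSV files and expose the complete projects folder
    under the name processed_projects (assumed layout, see lead-in)."""
    for name in ("all_fns.csv", "all_vars.csv"):
        # missing_ok requires Python 3.8+; absent files are simply skipped.
        (dataset_dir / name).unlink(missing_ok=True)

    link = dataset_dir / "processed_projects"
    target = dataset_dir / "processed_projects_complete"
    # Only create the link if nothing with that name exists yet.
    if not link.exists() and not link.is_symlink():
        # A relative target keeps the link valid if the dataset is moved.
        link.symlink_to(target.name, target_is_directory=True)


# Usage (hypothetical path):
#   apply_workaround(Path("/data/ManyTypes4PyDataset"))
```

Deleting the CSV files is destructive, so it may be safer to run this on a copy of the dataset first.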
It would be interesting to see how well the TypeWriter algorithm (https://software-lab.org/publications/TypeWriter_arXiv_1912.03768.pdf) for searching type annotation suggestions works against type4py. We might get dramatically better results for two reasons:
pyre incremental is orders of magnitude faster than pyre was when the TypeWriter paper was written, so we may be able to try many more combinations and get correspondingly better results
At one point we'd considered hacking this very quickly as an internal project in my company, but we ran out of time. I think it would be better done open-source anyway because then
I'm unsure if I can find time to prioritize this in the next 6 months at work but it's a little more likely if I treat it as a side project, which would also open the door to an informal weekend hackathon as a way to kick it off :)
I could do this in a separate repository or inside of type4py. What do you think, @mir-am? And does this sound interesting to you?
I experimented with the type prediction (http://localhost:5001/api/predict?tc=0) using the provided docker image.
I noticed that depending on the analysed source code, I get different amounts of type predictions per parameter/return/variable type.
Is it possible to retrieve a fixed number of predicted types?
For example, I would like to retrieve the Top-10 type predictions for each parameter and return type.
Best regards
Florian