agemagician / codetrans
Pretrained Language Models for Source code
License: MIT License
Hi, I've been trying to reproduce the results of Code Documentation Generation but have failed to do so. Could you please explain how you process the input (do you directly use the provided tokenized data, or tokenize manually with tree_sitter), and how you calculate the smoothed BLEU-4 scores? See below for the details:
Take JavaScript for example: the results for CodeTrans-TF-Small/Base/Large reported in the paper are 17.23, 18.25, 18.98, respectively.
First, I directly used the tokenized data provided by CodeBERT (or CodeXGLUE); my reproduced results are 15.8, 16.96, and 17.67.
Second, I tokenized the source code with tree_sitter following your provided pipeline (i.e., CodeTrans/prediction/multitask/fine-tuning/function documentation generation/javascript/small_model.ipynb), and the obtained results are 15.28, 16.91, and 17.61.
Other facts: I calculate the smoothed BLEU-4 score following CodeXGLUE (https://github.com/microsoft/CodeXGLUE/blob/main/Code-Text/code-to-text/evaluator/evaluator.py). I truncate the source and target sequences to at most 512 tokens before feeding them to the model.
We also cannot reproduce the results for the other languages on the Code Documentation Generation task. Please help resolve this. Thanks in advance!
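For concreteness, here is a minimal sketch of the kind of smoothed BLEU-4 I am computing: a pure-Python approximation using add-one smoothing on the higher-order n-gram counts and whitespace tokenization. This is my own sketch, not the CodeXGLUE evaluator itself, so exact scores may differ slightly:

```python
# Minimal sketch of smoothed sentence-level BLEU-4 (add-one smoothing on
# n-gram match counts for n > 1). Approximates, but does not reproduce,
# the CodeXGLUE evaluator; tokens are assumed to be whitespace-separated.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def smoothed_bleu4(reference, candidate):
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_prec = 0.0
    for n in range(1, 5):
        cand_ngrams = ngrams(cand, n)
        ref_ngrams = ngrams(ref, n)
        match = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        if n > 1:  # smooth higher-order precisions so a zero count is not fatal
            match += 1
            total += 1
        if match == 0:
            return 0.0
        log_prec += math.log(match / total)
    # brevity penalty, as in standard BLEU
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_prec / 4)
```

An exact-match hypothesis scores 1.0; the reported corpus score is then the average over all test examples.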
For example, let's teach it LangChain with a carefully annotated dataset of question-answer pairs.
Thanks in advance.
I'm trying to fine-tune the model on a Kotlin dataset for code comment/code documentation tasks, but I'm getting RuntimeError: Could not infer dtype of dict.
More details are available at the link below:
https://stackoverflow.com/questions/75399318/runtimeerror-could-not-infer-dtype-of-dict
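For what it's worth, that error typically appears when a nested dict (e.g. a tokenizer's full output) ends up inside a dataset feature, so the collator effectively calls torch.tensor on a dict. A minimal sketch of a preprocessing shape that avoids this; fake_tokenize is a purely hypothetical stand-in for a real tokenizer call:

```python
# Hedged sketch of the usual cause of "RuntimeError: Could not infer dtype
# of dict": a map() function stores the tokenizer's output for the targets
# as a nested dict, so batching tries torch.tensor(dict). Copying out the
# id lists keeps every feature a flat list of ints.
# fake_tokenize is an assumption, standing in for a real tokenizer.

def fake_tokenize(text, max_length=8):
    # stand-in: pretend each character is a token id
    ids = [ord(c) for c in text][:max_length]
    return {"input_ids": ids, "attention_mask": [1] * len(ids)}

def preprocess(example):
    model_inputs = fake_tokenize(example["code"])
    targets = fake_tokenize(example["doc"])
    # WRONG: model_inputs["labels"] = targets  -> nests a dict in a feature
    model_inputs["labels"] = targets["input_ids"]  # flat list of ints
    return model_inputs

row = preprocess({"code": "fun f()", "doc": "docs"})
```

If every feature in each mapped example is a flat list of numbers, tensor conversion during batching should no longer hit a dict.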
Hi, when playing with the example notebook provided at the following link: https://github.com/agemagician/CodeTrans/blob/main/prediction/single%20task/source%20code%20summarization/python/base_model.ipynb
I noticed the summary is an interrogative sentence.
But it seems that when the example was first created, the expected output was a declarative sentence, as follows:
Has the model been updated recently so that it outputs the summary differently?
Thank you!
Hi, I have a question about the difference between these three tasks: code documentation generation, code summarization, and code comment generation. My understanding is that all three tasks generate natural-language descriptions for a code snippet.
Rather than just using the pre-trained model for the single task Source Code Summarization, would it be better to integrate a recent LLM into it?
The many Jupyter notebooks with pipelines for the tasks your model can perform are great, but it would also be nice to have a fine-tuning script. Ideally it would be a slight modification of the transformers run_mlm.py, but a custom script would suffice.
Are you planning to provide checkpoints for your models?
Hi.
I was trying to run your code in single task/api generation/t5 interface/base_model.ipynb on Colab, and I am receiving the error below after calling model.predict.
model.predict(
input_file="input.txt",
output_file=predict_outputs_path,
checkpoint_steps=840000,
beam_size=4,
vocabulary=vocab,
# Select the most probable output token at each step.
temperature=0,
)
=======================================
INFO:tensorflow:Using config: {'_model_dir': 'base', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 5000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=100, num_shards=None, num_cores_per_replica=1, per_host_input_for_training=4, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None, eval_training_input_configuration=2, experimental_host_call_every_n_steps=1, experimental_allow_per_host_v2_parallel_get_next=False, experimental_feed_hook=None), '_cluster': None}
INFO:tensorflow:_TPUContext: eval_on_tpu True
WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-11-e1bf683e0ef5> in <module>()
7 vocabulary=vocab,
8 # Select the most probable output token at each step.
----> 9 temperature=0,
10 )
4 frames
/usr/local/lib/python3.7/dist-packages/mesh_tensorflow/transformer/utils.py in infer_model(estimator, vocabulary, sequence_length, batch_size, model_type, model_dir, eval_checkpoint_step, checkpoint_paths, decode_fn)
1853 batch_size=batch_size,
1854 sequence_length=sequence_length,
-> 1855 checkpoint_path=checkpoint_path)
1856
1857
TypeError: 'str' object is not callable
In call to configurable 'infer_model' (<function infer_model at 0x7f68488db950>)
How can I fix this issue?