intel / neuro-vectorizer

NeuroVectorizer is a framework that uses deep reinforcement learning (RL) to predict optimal vectorization compiler pragmas for for loops in C and C++ codes.

Home Page: https://arxiv.org/abs/1909.13639

License: BSD 3-Clause "New" or "Revised" License

Python 1.26% Shell 0.07% C 98.67%

neuro-vectorizer's Introduction

DISCONTINUATION OF PROJECT

This project will no longer be maintained by Intel.

Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.

Intel no longer accepts patches to this project.

If you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the open source software community, please create your own fork of this project.

Contact: [email protected]

NeuroVectorizer v0.0.1

NeuroVectorizer is a framework that uses deep reinforcement learning (RL) to predict optimal vectorization compiler pragmas for for loops in C and C++ code. The framework currently integrates with the LLVM compiler and can inject vectorization and interleaving factors. Unrolling factors can also be supported by adding them as an action in the RL environment. More details are available in the paper, which appeared at CGO 2020 and passed all the artifact evaluations for reproducibility.

Dependencies:

TF2 (pip install tensorflow).
Ray (pip install ray==0.8.4).
RLlib (pip install ray[rllib]==0.8.4).
LLVM (you need to have /usr/lib/llvm-X.Y/lib/libclang.so.1 or equivalent working). Currently tested with /usr/lib/llvm-6.0/lib/libclang.so.1.
clang (pip install clang). Current tested version is clang-6.0.0.2.

For more detailed install instructions tested on Ubuntu click here.

The framework takes the text code of loops (detects them in the code) and uses an AST embedding generator. The output of this generator is fed to a neural network agent that predicts the optimal factors.
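As an illustration of the last step, the following is a minimal sketch of how a predicted vectorization factor (VF) and interleaving factor (IF) could be written back into the source as a Clang loop pragma. The helper name `inject_pragma` and the regex-based loop detection are assumptions made for this example; the repository's own loop detection and pragma handling may differ, and this sketch annotates every `for` line it finds rather than selecting specific loops.

```python
import re

# Matches a line that starts a `for` statement, capturing its indentation.
FOR_LOOP = re.compile(r"^(\s*)for\s*\(")


def inject_pragma(code: str, vf: int, interleave: int) -> str:
    """Insert a Clang loop pragma above every line that starts a for loop."""
    out = []
    for line in code.splitlines():
        match = FOR_LOOP.match(line)
        if match:
            indent = match.group(1)
            out.append(
                f"{indent}#pragma clang loop vectorize_width({vf}) "
                f"interleave_count({interleave})"
            )
        out.append(line)
    return "\n".join(out)


example = """void add(int *a, int *b, int n) {
    for (int i = 0; i < n; i++) {
        a[i] += b[i];
    }
}"""
print(inject_pragma(example, vf=4, interleave=2))
```

In this sketch the agent's two predicted factors map directly onto the `vectorize_width` and `interleave_count` arguments of `#pragma clang loop`.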

To run training:

- cd preprocess
- source ./configure.sh  # you may need to edit SOURCE_DIR so that it points to your training data
- source ./preprocess.sh  # generates the bag-of-words embeddings of the ASTs for the training data (the training set is in "training_data"; feel free to add more samples)
- python autovec.py

Important notes:

Some of the error messages when running source ./preprocess.sh are dumped to code2vec/data/for_loops/stderr.txt.
Training might take a long time to finish.
autovec.py uses the RLlib/Tune API explained here: https://ray.readthedocs.io/en/latest/tune-package-ref.html.
O3_runtimes.pkl and obs_encodings.pkl are provided in ./training_data. O3_runtimes.pkl stores the -O3 runtimes measured on an Intel® AVX Intel® Xeon® Processor E5-2667 v2, and obs_encodings.pkl stores the encodings of the program ASTs so that you don't have to recompute them when training on the provided training data. If you have a different processor, remove O3_runtimes.pkl; otherwise training will use -O3 runtimes measured on the wrong processor!
If you want to use another model in the embedding generator, you need to modify get_obs function in "envs/neurovec.py".
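The note about O3_runtimes.pkl can be followed with a small helper such as the one below. This is a sketch under assumptions: the function name `set_aside_o3_runtimes` is made up for this example, and it renames the file to a `.bak` copy instead of deleting it, which is a more cautious variant of the "remove" step described above. It assumes, as the note implies, that the baseline runtimes are measured again when the cached file is absent.

```python
from pathlib import Path


def set_aside_o3_runtimes(training_dir: str = "./training_data") -> bool:
    """Rename O3_runtimes.pkl so cached baselines from another CPU are not reused.

    Returns True if a file was renamed, False if there was nothing to do.
    """
    cached = Path(training_dir) / "O3_runtimes.pkl"
    if not cached.exists():
        return False
    backup = cached.with_name(cached.name + ".bak")
    cached.replace(backup)  # overwrites an older backup if one exists
    return True
```

Calling `set_aside_o3_runtimes()` from the repository root before `python autovec.py` keeps the original file available in case you return to the original processor.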

To run rollout/inference on files in the provided dataset*:

python temp_rollout.py <~/ray_results/NeuroVectorizer/PPO_NeuroVectorizerEnv_*/checkpoint_*/checkpoint-*> --rollout_dir <./rollout_data> --compile

* If it is not in the dataset then use the --new_train_data flag.

Note that this command will raise ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task. The error occurs because the Ray worker is killed once inference has finished on all the files. You can safely ignore it.

The provided pretrained model:

A very basic pretrained model is provided as three checkpoints under ./checkpoints that you can use to explore and experiment with the framework. For example, you can run python temp_rollout.py checkpoints/checkpoint_100/checkpoint-100 --rollout_dir "./tests" --compile

Please reach out to Ameer Haj Ali for any questions.

To cite this work:

@inproceedings{ameerhajalicgo,
 author = {Haj-Ali, Ameer and Ahmed, Nesreen and Willke, Ted and Shao, Sophia and Asanovic, Krste and Stoica, Ion},
 title = {NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning},
 booktitle = {Proceedings of the 2020 International Symposium on Code Generation and Optimization},
 series = {CGO 2020},
 year = {2020},
 location = {San Diego, USA},
 publisher = {ACM},
}

neuro-vectorizer's Issues

Issue with temp_rollout.py

Hi @AmeerHajAli:
I'm done with configure.sh and preprocess.sh. When I run python temp_rollout.py checkpoints/checkpoint_100/checkpoint-100 --rollout_dir "./tests" --compile, I get the error below in error.txt.
AttributeError: 'NoneType' object has no attribute 'restore'.

Best Regards
TimHu

Error Messages

Hi,

I was wondering if you could explicitly explain in the README that the error messages are saved to a text file named "stderr.txt" and do not appear in the console. It took me a little while to figure this out, and I don't think it's obvious to people who are new to this program.

Is there any code for generating the MiBench and PolyBench results shown in Figs. 10 and 11 of the paper?

As shown in Figs. 10 and 11 of the paper "NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning", could you provide the source code for generating the MiBench and PolyBench results?

Related to training

Hi @AmeerHajAli ,

Thank you for the neuro-vectorizer code!

Would it be possible to tell me how much time it takes to complete training? There is only one comment in the README -

"Training might take a long time to finish.".

I understand training time is dependent on a number of factors, but if you could document the time it takes for training on some particular configuration, it would be very helpful.

EDIT - Training terminated after 200 epochs and took a little less than 2 hours, I had enabled GPU ("num_gpus": 1) and set "num_workers": 15

How often does the model get saved and in which path? I see the following configuration in autovec.py

checkpoint_freq = 1,

Does this mean the model is saved every 1 iteration? Each iteration takes about 30 seconds on my system and training has completed 150+ iterations. So does that mean I have 150 models saved somewhere? Please do let me know.

EDIT - Found all the 200 checkpoints under ~/ray_results/neurovectorizer_train/PPO_autovec_*
Maybe this could be documented for ease of use as well, if you think it is worth it.
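Since `checkpoint_freq = 1` leaves one checkpoint per iteration, a short script can pick out the most recent one to pass to temp_rollout.py. The sketch below assumes the directory layout shown in the README and in this issue, that is `~/ray_results/<experiment>/<trial>/checkpoint_N/checkpoint-N`; the function name `latest_checkpoint` is hypothetical. If several trials exist, it returns the highest-numbered checkpoint across all of them.

```python
import glob
import os
import re


def latest_checkpoint(results_dir: str = "~/ray_results"):
    """Return the path of the highest-numbered checkpoint file, or None."""
    pattern = os.path.join(
        os.path.expanduser(results_dir), "*", "*", "checkpoint_*", "checkpoint-*"
    )
    best_path, best_iter = None, -1
    for path in glob.glob(pattern):
        # Only accept names of the form checkpoint-N; this skips side files
        # such as checkpoint-N.tune_metadata.
        match = re.fullmatch(r"checkpoint-(\d+)", os.path.basename(path))
        if match and int(match.group(1)) > best_iter:
            best_path, best_iter = path, int(match.group(1))
    return best_path


print(latest_checkpoint())
```

The comparison is numeric, so `checkpoint-10` is ranked above `checkpoint-2`, which a plain string sort would get wrong.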

Problems encountered while running autovec.py

Hello, I encountered some problems while running autovec.py. I hope I can get your help!
First, when I ran autovec.py, I got the following errors:

Traceback (most recent call last):
  File "autovec.py", line 33, in <module>
    from envs.neurovec import NeuroVectorizerEnv
  File "/home/milet/neuro-vectorizer-master/envs/neurovec.py", line 47, in <module>
    from utility import get_bruteforce_runtimes, get_O3_runtimes, get_snapshot_from_code, get_runtime, get_vectorized_codes, init_runtimes_dict, get_encodings_from_local, MAX_LEAF_NODES, pragma_line
  File "/home/milet/neuro-vectorizer-master/utility.py", line 43, in <module>
    MAX_LEAF_NODES = os.environ['MAX_LEAF_NODES']
  File "/home/milet/anaconda3/lib/python3.7/os.py", line 679, in __getitem__
    raise KeyError(key) from None
KeyError: 'MAX_LEAF_NODES'

I checked my system's environment variables and found that 'MAX_LEAF_NODES' is not set, so to solve the problem I changed the code in utility.py.

before:
MAX_LEAF_NODES = os.environ['MAX_LEAF_NODES']
TEST_SHELL_COMMAND_TIMEOUT = os.environ['TEST_SHELL_COMMAND_TIMEOUT']
after:
MAX_LEAF_NODES = 100
TEST_SHELL_COMMAND_TIMEOUT = 10

Unfortunately, I then encountered a new problem. I couldn't work out the meaning of the error, so I hope someone can help me. Thank you!

Traceback (most recent call last):
  File "autovec.py", line 59, in <module>
    loggers=[TBXLogger]
  File "/home/milet/anaconda3/lib/python3.7/site-packages/ray/tune/tune.py", line 342, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_autovec_00000])
(pid=6111) 2021-05-15 15:50:25,693	WARNING deprecation.py:30 -- DeprecationWarning: `sample_batch_size` has been deprecated. Use `rollout_fragment_length` instead. This will raise an error in the future!
(pid=6111) 2021-05-15 15:50:25,693	INFO trainer.py:585 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=6111) cp: cannot stat './training_data/*': No such file or directory
(pid=6111) creating ./new_garbage_6111 directory
(pid=6111) running: cp -r ./training_data/* ./new_garbage_6111
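Regarding the first error in this issue: the KeyError arises because utility.py reads `MAX_LEAF_NODES` directly from `os.environ`. A likely cause, inferred from the README steps rather than confirmed by the authors, is that the variables are set by `source ./configure.sh` and autovec.py was started from a shell where that script had not been sourced. If hard-coding is still preferred, a fallback like the sketch below keeps the environment variable in charge when it is set. The helper `env_int` is hypothetical, and the fallback values 100 and 10 are the ones chosen by the issue author, not official defaults.

```python
import os


def env_int(name: str, default: int) -> int:
    """Read an integer setting from the environment, falling back to a default."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    return int(raw)  # raises ValueError if the value is not an integer


# Fallback values are the ones the issue author chose, not official defaults.
MAX_LEAF_NODES = env_int("MAX_LEAF_NODES", 100)
TEST_SHELL_COMMAND_TIMEOUT = env_int("TEST_SHELL_COMMAND_TIMEOUT", 10)
```

Note that `os.environ` values are always strings, so the explicit `int(...)` conversion matters wherever the value is used as a number.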

Missing scripts for reproducing with pretrained code2vec model

In the NeuroVectorizer paper (the version submitted to CGO 2020), in the Artifact Description appendix "Experiment Workflow" section, you mention two options for running the framework: one that takes weeks, and one that takes a day or less (skipping the code2vec training). The instructions for the first option are included in the README in this repo, but the instructions for the second option (and any other associated Python scripts) are not included in this repo.

Would you be able to provide scripts and/or instructions for 1) training the RL model without retraining code2vec and/or 2) running the experiment scripts mentioned in the artifact description?

[Running on RTX GEFORCE 1070 8GB Machine]

Hi @AmeerHajAli,

Thank you for yesterday's response.
I'm able to move forward now.
However, configure.sh contains some default values for a Tesla K80 GPU.
How should I change WORD_VOCAB_SIZE, PATH_VOCAB_SIZE, TARGET_VOCAB_SIZE, and the other default values to match my machine's specifications?

I'm currently working on i9-9th gen processor with 32DDR4 and 1070 RTX card.

Thanks and Regards,
Vinayak N Baddi

related to rollout

Hi @AmeerHajAli:
I find that the rollout result is non-deterministic for the same code. That is, if I run the rollout multiple times, the generated VF and IF can differ for the same code. Is there any way to make the rollout deterministic?

Best Regards
TimHu

[Issue with autovec.py]

Hi Team,

First of all thank you so much for the open implementation.
I read the paper from the NeurIPS'19 ML for Systems workshop. The results look very promising. I was trying to replicate the results from the paper using this repo. I cloned the GitHub repo and followed the steps in the README. I'm done with configure.sh and preprocess.sh, but when I run python autovec.py, I get the error below.

Error: ModuleNotFoundError: No module named 'my_model'.

Note: I know the module is present inside the code2vec directory, and I'm running the code from the repo's home directory. Please help me resolve the issue; otherwise I would have to spend more time changing it manually in all the files.

Thanks and Regards,
Vinayak N Baddi
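One possible workaround for the `ModuleNotFoundError` above, offered as a sketch and not as the maintainers' fix, is to make the code2vec directory importable before the failing import runs. The helper name `add_code2vec_to_path` is hypothetical, and the sketch assumes that `my_model.py` lives directly under `code2vec/` as the issue author reports.

```python
import os
import sys


def add_code2vec_to_path(repo_root: str = ".") -> str:
    """Put <repo_root>/code2vec at the front of sys.path and return that path."""
    code2vec_dir = os.path.abspath(os.path.join(repo_root, "code2vec"))
    if code2vec_dir not in sys.path:
        sys.path.insert(0, code2vec_dir)
    return code2vec_dir


add_code2vec_to_path()
# After this call, `import my_model` can resolve if code2vec/my_model.py exists.
```

Setting the `PYTHONPATH` environment variable to include the code2vec directory before launching autovec.py has the same effect without touching any source file.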

Version

Please tell me the versions of TensorFlow, NumPy, Ray, and ray[rllib] that you used. Thanks.