apoorvsinghnegi / double-deep-q-learning-for-resource-allocation

This project forked from engineer1999/double-deep-q-learning-for-resource-allocation

Reproduce results of the research article "Deep Reinforcement Learning Based Resource Allocation for V2V Communications"

Deep Reinforcement Learning based Resource Allocation for V2V Communications

This repository implements the double deep Q-learning reinforcement learning algorithm for the resource allocation problem in vehicle-to-vehicle (V2V) communication, based on the research paper "Deep Reinforcement Learning based Resource Allocation for V2V Communications" by Hao Ye, Geoffrey Ye Li, and Biing-Hwang Fred Juang. The original code, which implements deep Q-learning, was developed by IIT-lab in Paper-with-Code-of-Wireless-communication-Based-on-DL.

I have modified the code so that the results of the research paper can be reproduced.

Installation and use

Linux

Fork the repository and open a terminal (Ctrl+Alt+T).

cd <path-to-the-python-files>

pip3 install -r requirement.txt

After the installation succeeds, close the terminal, open a new one, and run the program with the commands below.

cd <path-to-the-python-files>

python3 agent.py

Running this code takes a long time (36 hours on a 7th-generation Intel Core i7).

Tips and Tricks

Use the command below to save the terminal output to a .txt file while the program runs. This is helpful when debugging the code.

python3 agent.py 2>&1 | tee SomeFile.txt

This runs the program as before while also writing both standard output and standard error to SomeFile.txt.

Results reproduced using Deep-Q learning

Sum Rate of V2I vs Number of Vehicles

Figure-1

The figure above shows the V2I sum rate versus the number of vehicles. As the number of vehicles increases, the number of V2V links increases; as a result, the interference to the V2I links grows and the V2I capacity drops.
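The relationship between interference and V2I capacity can be illustrated with the Shannon rate formula. The following is a minimal sketch, not the repository's code: the function name `v2i_sum_rate`, its parameters, and the default noise power are all hypothetical, and rates are expressed per unit bandwidth.

```python
import math


def v2i_sum_rate(signal_powers, interference_powers, noise_power=1e-3):
    """Sum of Shannon rates (bits/s/Hz) over all V2I links.

    signal_powers[i]       -- received power of V2I link i (linear units)
    interference_powers[i] -- total V2V interference on the sub-band of link i
    noise_power            -- assumed noise power (illustrative default)
    """
    total = 0.0
    for signal, interference in zip(signal_powers, interference_powers):
        sinr = signal / (noise_power + interference)
        total += math.log2(1.0 + sinr)
    return total


# More V2V interference on the same sub-bands lowers the V2I sum rate.
low_interference = v2i_sum_rate([1.0, 1.0], [0.01, 0.01])
high_interference = v2i_sum_rate([1.0, 1.0], [0.5, 0.5])
```

Under these assumptions, `high_interference` is smaller than `low_interference`, which matches the downward trend described for the figure.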

Probability of Satisfied V2V links vs the number of vehicles

Figure-2

The figure shows the probability that the V2V links satisfy the latency constraint versus the number of vehicles. As the number of vehicles increases, the number of V2V links increases, which makes it more difficult to ensure that every vehicle satisfies the latency constraint.
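One way to read this metric is as the fraction of V2V links that deliver their payload before the deadline. The sketch below assumes each link is described by a list of achieved rates per millisecond; the helper names, the payload size, and the deadline are hypothetical and are not taken from the repository.

```python
def link_satisfied(rates_bits_per_ms, payload_bits, deadline_ms):
    """Return True if the payload is fully delivered within the deadline."""
    remaining = payload_bits
    for rate in rates_bits_per_ms[:deadline_ms]:
        remaining -= rate
        if remaining <= 0:
            return True
    return False


def satisfied_probability(links, payload_bits, deadline_ms):
    """Fraction of links that meet the latency constraint."""
    satisfied = sum(
        1 for rates in links if link_satisfied(rates, payload_bits, deadline_ms)
    )
    return satisfied / len(links)


# Two hypothetical links: the first delivers 250 bits in 3 ms, the second does not.
example_links = [[100, 100, 100], [50, 50, 50]]
probability = satisfied_probability(example_links, payload_bits=250, deadline_ms=3)
```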

The Probability of power level selection with the remaining time for transmission

Figure-3

The figure above shows the probability that the agent chooses each power level for different amounts of time left for transmission. In general, the probability of choosing the maximum power is low when there is abundant time for transmission, while the agent selects the maximum power with high probability to satisfy the V2V latency constraint when only a small amount of time is left. However, when only 10 ms are left, the probability of choosing the maximum power level suddenly drops to about 0.6. The agent learns that, even with the maximum power, the latency constraint will be violated with high probability, and that switching to a lower power earns more reward by reducing interference to the V2I and other V2V links.

Therefore, we can infer that the improvement of the deep reinforcement learning based approach comes from learning the implicit relationship between the state and the reward function.
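A curve like the one in Figure-3 can be estimated from logged agent decisions. The following is a rough sketch that assumes decisions are recorded as `(time_left_ms, power_level)` pairs, where `power_level` is an index into the available power levels. The log format and the function name are hypothetical, not the repository's.

```python
from collections import Counter, defaultdict


def power_selection_probabilities(decisions, num_power_levels):
    """Empirical probability of each power level, grouped by time left.

    decisions -- iterable of (time_left_ms, power_level) pairs
    Returns a dict mapping time_left_ms to a list of probabilities,
    one entry per power level index.
    """
    counts = defaultdict(Counter)
    for time_left_ms, power_level in decisions:
        counts[time_left_ms][power_level] += 1

    probabilities = {}
    for time_left_ms, counter in counts.items():
        total = sum(counter.values())
        probabilities[time_left_ms] = [
            counter[level] / total for level in range(num_power_levels)
        ]
    return probabilities


# Hypothetical log: index 2 stands for the maximum power level.
log = [(100, 0), (100, 0), (100, 2), (10, 2)]
probs = power_selection_probabilities(log, num_power_levels=3)
```

Plotting `probs[t][level]` against `t` for each level would give one curve per power level.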

Effect of Double Deep-Q Learning

The Probability of power level selection with the remaining time for transmission

Figure-4

Figure-4 shows the probability that the agent chooses each power level for different amounts of time left for transmission when double deep Q-learning is used. Compared with Figure-3, the probability of choosing the maximum power is lower when there is abundant time for transmission, and the probability of selecting the maximum power to satisfy the V2V latency constraint is higher when only a small amount of time is left.

Apart from this, when the agent has abundant time for transmission, it selects a low transmission power to reduce resource usage.
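For readers unfamiliar with the difference between the two algorithms, the sketch below contrasts the standard deep Q-learning target with the double deep Q-learning target for a single transition. It uses plain Python lists in place of network outputs. The function names and the discount factor are illustrative assumptions and do not reflect how `agent.py` is written.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard DQN: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN: the online network selects the action and the
    target network evaluates it, which reduces overestimation of Q-values."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for the next state (one entry per action).
q_online = [1.0, 2.0, 0.5]
q_target = [3.0, 1.0, 5.0]

standard = dqn_target(1.0, q_target, gamma=0.5)
double = double_dqn_target(1.0, q_online, q_target, gamma=0.5)
```

In this example the standard target uses the largest target-network value (5.0), while the double target uses the target-network value of the action preferred by the online network (1.0), so the double target is lower.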
