arkaju / real-time-recurrent-learning

RNN based design of a Controller for a Multi-Input Multi-Output (MIMO) System.

License: MIT License

Python 0.77% Jupyter Notebook 99.23%
recurrent-neural-networks controller

real-time-recurrent-learning's Introduction

Real-Time-Recurrent-Learning

Introduction

This project uses a class of discrete-time Recurrent Neural Networks (RNNs) known as Real Time Recurrent Networks (RTRNs) to obtain the desired response of a dynamic control system. The system considered is a model of an autonomous helicopter.

Dependencies:

  • Numpy
  • Matplotlib
  • Pandas
  • Scipy
  • Math

Real Time Recurrent Learning (RTRL) vs. Back Propagation Through Time (BPTT)

  • Vanilla recurrent neural networks are usually trained with BPTT, which propagates errors backward through the unrolled sequence. RTRL has no backward pass through time: the gradient information at t + 1 is propagated forward to compute the gradient information at t + 2, and so on. This is the key difference.

  • RTRL is a real-time or online technique, whereas BPTT is an offline technique.

  • Because RTRL works in real time, data need not be collected from the system over time before the model is learned. The model is learned as the data arrives, which matters because most control problems must be handled online.
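
The forward propagation of gradient information described above can be sketched as follows. This is a minimal illustration, assuming a small fully recurrent tanh network whose state is also its output; the names `rtrl_step`, `W`, `U` and `P`, the target, and the learning rate are hypothetical and are not taken from the repository.

```python
import numpy as np

def rtrl_step(W, U, h, x, P):
    """Advance the network one step and carry the sensitivities forward.

    P[i, j, k] approximates d h[i] / d W[j, k]. It is updated from its
    previous value, so no backward pass through time is needed.
    """
    n = h.size
    h_new = np.tanh(W @ h + U @ x)
    # Recurrent term: sum over l of W[i, l] * P[l, j, k]
    P_new = np.einsum("il,ljk->ijk", W, P)
    # Direct term: W[j, k] multiplies h[k] in the pre-activation of unit j
    idx = np.arange(n)
    P_new[idx, idx, :] += h
    # Chain rule through tanh
    P_new *= (1.0 - h_new ** 2)[:, None, None]
    return h_new, P_new

# Online use (assumed setup): update W at every step from the current error.
rng = np.random.default_rng(0)
n, m = 3, 2
W = 0.5 * rng.standard_normal((n, n))
U = 0.5 * rng.standard_normal((n, m))
h, P = np.zeros(n), np.zeros((n, n, n))
target = np.array([0.3, -0.2, 0.1])          # hypothetical desired state
for t in range(200):
    x = np.array([np.sin(0.1 * t), 1.0])     # hypothetical input signal
    h, P = rtrl_step(W, U, h, x, P)
    err = h - target
    grad_W = np.einsum("i,ijk->jk", err, P)  # gradient of 0.5 * ||err||^2 w.r.t. W
    W -= 0.05 * grad_W
```

The sensitivity tensor `P` is exact while `W` is held fixed and becomes an approximation once `W` changes at every step, which is the usual trade-off of online RTRL. Its size grows with the cube of the number of units, so the sketch is only practical for small networks.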

Overview:

Two RTRNs, based on the same network architecture, are utilized in the learning control system.

  1. One is used to approximate the nonlinear system.
  2. The other is used to mimic the desired system response output.

The learning rule is achieved by combining the two RTRNs to form the neural network control system.

An iterative learning control (ILC) algorithm is used to train the RTRNs. The weights learned at one time step are used in the next time step. The derivation is based on 2-D system theory and can be found in reference [1].

A 2-D nonlinear mathematical model is used to describe the dynamics of the RTRNs and the learning process. One dimension reflects the RTRN dynamics in the time domain, and the other reflects the iterative learning process.

Learning algorithm:

When RTRNs are used to approximate and control an unknown nonlinear system through an on-line learning process, they may be considered as subsystems of an adaptive control system. The weights of the networks are initialised randomly and need to be updated by a dynamical learning algorithm during the control process.

During the learning process, each variable of an RTRN depends on two independent variables: time (t) and iteration (k). At every time step (the system is discrete-time), the two RTRNs are trained separately, in sequence, until two errors fall below a chosen threshold: the error between the plant output and the RTRN 1 output, and the error between the desired system response output and the RTRN 2 output.
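
The two-index structure (time t on the outside, iteration k on the inside) can be sketched as below. This is only an illustration under stated assumptions: a linear model with a normalised gradient step stands in for the RTRN update derived in [1], and the regressor, toy plant, toy desired response, and the names `iterate_until_converged` and `run` are all hypothetical.

```python
import numpy as np

def iterate_until_converged(W, phi, target, lr=0.5, tol=1e-3, max_iters=100):
    """Iteration axis k: refine W at a fixed time step until the error is small.

    Uses a normalised gradient step on a linear model y = W @ phi, which is
    a stand-in (assumption) for the RTRN weight update.
    """
    for k in range(max_iters):
        err = target - W @ phi
        if np.linalg.norm(err) < tol:
            return W, k
        W = W + lr * np.outer(err, phi) / (phi @ phi + 1e-8)
    return W, max_iters

def run(T=30, seed=0):
    rng = np.random.default_rng(seed)
    W1 = 0.1 * rng.standard_normal((2, 3))   # model 1: tracks the plant output
    W2 = 0.1 * rng.standard_normal((2, 3))   # model 2: tracks the desired response
    iterations = []
    for t in range(T):                        # time axis t
        phi = np.array([np.sin(0.1 * t), np.cos(0.1 * t), 1.0])  # toy regressor
        y_plant = np.array([0.8 * phi[0] + 0.2, -0.5 * phi[1]])   # toy plant output
        y_desired = np.array([0.5, -0.25])                        # toy desired response
        # Weights learned at step t are the starting point at step t + 1.
        W1, k1 = iterate_until_converged(W1, phi, y_plant)
        W2, k2 = iterate_until_converged(W2, phi, y_desired)
        iterations.append((k1, k2))
    return W1, W2, iterations
```

Calling `run()` returns the final weights and the number of iterations used at each time step for each model. The `max_iters` cap is a design choice in this sketch: it keeps a time step from stalling if the threshold is never reached.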

Control:

The learned weights of both RTRNs are combined to establish a control rule. This control rule updates the control inputs U1 and U2 at every time step and drives the system output towards the desired output.
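
As a rough illustration of a per-time-step input update, the sketch below uses a damped Newton-style step on a toy two-input, two-output plant. It is not the control rule derived in [1]: the plant, the analytic Jacobian that stands in for the learned model, the gain, and all function names are assumptions made for this example.

```python
import numpy as np

B = np.array([[2.0, 0.5],
              [0.1, 1.0]])

def plant(u):
    """Toy 2-input, 2-output static plant (an assumption, not the helicopter model)."""
    return B @ u + 0.1 * np.tanh(u)

def model_jacobian(u):
    """Jacobian dy/du that a learned plant model would supply (analytic here)."""
    return B + 0.1 * np.diag(1.0 - np.tanh(u) ** 2)

def control_step(u, y, y_desired, gain=0.5):
    """Move the inputs so that the predicted output error shrinks."""
    J = model_jacobian(u)
    return u + gain * np.linalg.solve(J, y_desired - y)

def simulate(y_desired, steps=50):
    u = np.zeros(2)                      # u[0] and u[1] play the role of U1 and U2
    for _ in range(steps):
        u = control_step(u, plant(u), y_desired)
    return u
```

For example, `simulate(np.array([1.0, 0.5]))` returns inputs for which `plant(u)` is close to the requested output. A gain below 1 trades convergence speed for robustness to model error.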

Dataset:

The dataset is obtained by hardcoding the system dynamics (given as equations) and computing the outputs corresponding to a set of inputs. The equations are borrowed from [2], a paper on control and stability analysis for an autonomous helicopter. The columns represent quantities such as angular velocity, vertical velocity, and collective pitch angle.

NOTE:

  • The dataset was used only to obtain the mean and standard deviation for normalisation and to choose an arbitrary initial value. It is not required anywhere in the learning algorithm.

  • A small number delta was added to the intermediate matrix to avoid singularity.
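
Both notes can be illustrated with a short sketch. It assumes column-wise z-score normalisation and assumes that delta is added on the diagonal of the matrix before inversion; the source does not state where delta is added, and the data, the value of delta, and the function names below are hypothetical.

```python
import numpy as np

def fit_normaliser(data):
    """Column-wise mean and standard deviation of a (samples, features) array."""
    return data.mean(axis=0), data.std(axis=0)

def normalise(x, mean, std):
    return (x - mean) / std

def denormalise(z, mean, std):
    return z * std + mean

def regularised_inverse(M, delta=1e-6):
    """Invert M after adding delta on the diagonal (assumed placement)."""
    return np.linalg.inv(M + delta * np.eye(M.shape[0]))

# Synthetic stand-in for the dataset; only its statistics are used.
rng = np.random.default_rng(0)
data = rng.normal(loc=[5.0, -1.0], scale=[2.0, 0.5], size=(500, 2))
mean, std = fit_normaliser(data)
z = normalise(data, mean, std)

M = np.array([[1.0, 2.0],
              [2.0, 4.0]])          # exactly singular matrix
M_inv = regularised_inverse(M)      # invertible once delta is added
```

A larger delta makes the inverse better conditioned but biases the result further from the true inverse, so its value is a tuning choice.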

Results:

  1. The objective of this setup was to reach a desired value of 140 for output 1 and 0.5 for output 2. This was achieved in about 50 time steps on average, and the outputs then stayed constant.
  2. The results are sensitive to weight initialisation: on rare occasions, the output did not converge.
  3. Except for the first few time steps, a single iteration is enough for both RTRNs to converge.

Some of the plots of the plant output for different random number seeds are given below:

References:

  1. IEEE
  2. Helicopter Paper
  3. NPTEL Electrical Engineering: Intelligent Systems and Control. (for the lecture slide)

real-time-recurrent-learning's Issues

Not able to replicate the solution

It does not converge during the iteration, so I have set a maximum iteration limit of 100.
I modified your code for Example 2 of the IEEE paper, and then it worked. However, I am not able to replicate the results exactly as given by the author in the IEEE paper (Example 2). Have you tried this problem using your code? My outputs are attached below. As you can see, the maximum values are limited to 0.32. Do you know what I could do so that the maximum values of the output are not restricted? Blue is the controlled output and red is the reference.
Also, shouldn't we keep iterating until the plant output converges to the desired output (I mean a loop below the time step loop)?

Note: I am not normalising, as I do not know the inputs beforehand, since it is real time.
