Coder Social home page Coder Social logo

dgm's Introduction

DGM

Deep Galerkin Method

Environment setup:

conda create --name tensorflow python=3.6
conda activate tensorflow
conda install tensorflow-gpu
conda install notebook pylint
conda install matplotlib
conda env export > environment.yml

dgm's People

Contributors

adolfocorreia avatar

Stargazers

Loc Luong avatar  avatar Erdin Guma avatar Sofiane Haddad avatar  avatar Pongpisit Thanasutives avatar Ante Kapetanovic avatar  avatar  avatar Yuri Coutinho avatar Ram Balachandran avatar Zewen Shen avatar Fan Pu avatar Chi Feng avatar  avatar Tmn07 avatar  avatar

Watchers

Sofiane Haddad avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

dgm's Issues

Deep Galerkin Network

I am trying to use Deep Galerkin Method (DGM) to solve high dimensional PDEs and I face a problem. For illustrative purposes, I am posting a simple optimization problem below. The feed-forward network successfully recovers the optimal funciton, but DGM network fails to do so. Any help is highly appreciated.

import logging, os 
os.system('clear')
logging.disable(logging.WARNING) 
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
import tensorflow as tf
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)
import numpy as np 
import pandas as pd 
import time
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

# CLASS DEFINITIONS FOR NEURAL NETWORKS USED IN DEEP GALERKIN METHOD

#%% import needed packages
import tensorflow as tf

#%% LSTM-like layer used in DGM (see Figure 5.3 and set of equations on p. 45) - modification of Keras layer class

class LSTMLayer(tf.keras.layers.Layer):
    
    # constructor/initializer function (automatically called when new instance of class is created)
    def __init__(self, output_dim, input_dim, trans1 = "tanh", trans2 = "tanh"):
        '''
        Args:
            input_dim (int):       dimensionality of input data
            output_dim (int):      number of outputs for LSTM layers
            trans1, trans2 (str):  activation functions used inside the layer; 
                                   one of: "tanh" (default), "relu" or "sigmoid"
        
        Returns: customized Keras layer object used as intermediate layers in DGM
        '''        
        
        # create an instance of a Layer object (call initialize function of superclass of LSTMLayer)
        super(LSTMLayer, self).__init__()
        
        # add properties for layer including activation functions used inside the layer  
        self.output_dim = output_dim
        self.input_dim = input_dim
        
        if trans1 == "tanh":
            self.trans1 = tf.nn.tanh
        elif trans1 == "relu":
            self.trans1 = tf.nn.relu
        elif trans1 == "sigmoid":
            self.trans1 = tf.nn.sigmoid
        
        if trans2 == "tanh":
            self.trans2 = tf.nn.tanh
        elif trans2 == "relu":
            self.trans2 = tf.nn.relu
        elif trans2 == "sigmoid":
            self.trans2 = tf.nn.relu
        
        ### define LSTM layer parameters (use Xavier initialization)
        # u vectors (weighting vectors for inputs original inputs x)
        self.Uz = self.add_variable("Uz", shape=[self.input_dim, self.output_dim],
                                    initializer = tf.contrib.layers.xavier_initializer())
        self.Ug = self.add_variable("Ug", shape=[self.input_dim ,self.output_dim],
                                    initializer = tf.contrib.layers.xavier_initializer())
        self.Ur = self.add_variable("Ur", shape=[self.input_dim, self.output_dim],
                                    initializer = tf.contrib.layers.xavier_initializer())
        self.Uh = self.add_variable("Uh", shape=[self.input_dim, self.output_dim],
                                    initializer = tf.contrib.layers.xavier_initializer())
        
        # w vectors (weighting vectors for output of previous layer)        
        self.Wz = self.add_variable("Wz", shape=[self.output_dim, self.output_dim],
                                    initializer = tf.contrib.layers.xavier_initializer())
        self.Wg = self.add_variable("Wg", shape=[self.output_dim, self.output_dim],
                                    initializer = tf.contrib.layers.xavier_initializer())
        self.Wr = self.add_variable("Wr", shape=[self.output_dim, self.output_dim],
                                    initializer = tf.contrib.layers.xavier_initializer())
        self.Wh = self.add_variable("Wh", shape=[self.output_dim, self.output_dim],
                                    initializer = tf.contrib.layers.xavier_initializer())
        
        # bias vectors
        self.bz = self.add_variable("bz", shape=[1, self.output_dim])
        self.bg = self.add_variable("bg", shape=[1, self.output_dim])
        self.br = self.add_variable("br", shape=[1, self.output_dim])
        self.bh = self.add_variable("bh", shape=[1, self.output_dim])
    
    
    # main function to be called 
    def call(self, S, X):
        '''Compute output of a LSTMLayer for a given inputs S,X .    

        Args:            
            S: output of previous layer
            X: data input
        
        Returns: customized Keras layer object used as intermediate layers in DGM
        '''   
        
        # compute components of LSTM layer output (note H uses a separate activation function)
        Z = self.trans1(tf.add(tf.add(tf.matmul(X,self.Uz), tf.matmul(S,self.Wz)), self.bz))
        G = self.trans1(tf.add(tf.add(tf.matmul(X,self.Ug), tf.matmul(S, self.Wg)), self.bg))
        R = self.trans1(tf.add(tf.add(tf.matmul(X,self.Ur), tf.matmul(S, self.Wr)), self.br))
        
        H = self.trans2(tf.add(tf.add(tf.matmul(X,self.Uh), tf.matmul(tf.multiply(S, R), self.Wh)), self.bh))
        
        # compute LSTM layer output
        S_new = tf.add(tf.multiply(tf.subtract(tf.ones_like(G), G), H), tf.multiply(Z,S))
        
        return S_new

#%% Fully connected (dense) layer - modification of Keras layer class
   
class DenseLayer(tf.keras.layers.Layer):
    
    # constructor/initializer function (automatically called when new instance of class is created)
    def __init__(self, output_dim, input_dim, transformation=None):
        '''
        Args:
            input_dim:       dimensionality of input data
            output_dim:      number of outputs for dense layer
            transformation:  activation function used inside the layer; using
                             None is equivalent to the identity map 
        
        Returns: customized Keras (fully connected) layer object 
        '''        
        
        # create an instance of a Layer object (call initialize function of superclass of DenseLayer)
        super(DenseLayer,self).__init__()
        self.output_dim = output_dim
        self.input_dim = input_dim
        
        ### define dense layer parameters (use Xavier initialization)
        # w vectors (weighting vectors for output of previous layer)
        self.W = self.add_variable("W", shape=[self.input_dim, self.output_dim],
                                   initializer = tf.contrib.layers.xavier_initializer())
        
        # bias vectors
        self.b = self.add_variable("b", shape=[1, self.output_dim])
        
        if transformation:
            if transformation == "tanh":
                self.transformation = tf.tanh
            elif transformation == "relu":
                self.transformation = tf.nn.relu
        else:
            self.transformation = transformation
    
    
    # main function to be called 
    def call(self,X):
        '''Compute output of a dense layer for a given input X 

        Args:                        
            X: input to layer            
        '''
        
        # compute dense layer output
        S = tf.add(tf.matmul(X, self.W), self.b)
                
        if self.transformation:
            S = self.transformation(S)
        
        return S

#%% Neural network architecture used in DGM - modification of Keras Model class
    
class DGMNet(tf.keras.Model):
    
    # constructor/initializer function (automatically called when new instance of class is created)
    def __init__(self, layer_width, n_layers, input_dim, final_trans=None):
        '''
        Args:
            layer_width: 
            n_layers:    number of intermediate LSTM layers
            input_dim:   spaital dimension of input data (EXCLUDES time dimension)
            final_trans: transformation used in final layer
        
        Returns: customized Keras model object representing DGM neural network
        '''  
        
        # create an instance of a Model object (call initialize function of superclass of DGMNet)
        super(DGMNet,self).__init__()
        
        # define initial layer as fully connected 
        # NOTE: to account for time inputs we use input_dim+1 as the input dimensionality
        self.initial_layer = DenseLayer(layer_width, input_dim, transformation = "tanh")
        
        # define intermediate LSTM layers
        self.n_layers = n_layers
        self.LSTMLayerList = []
                
        for _ in range(self.n_layers):
            self.LSTMLayerList.append(LSTMLayer(layer_width, input_dim))
        
        # define final layer as fully connected with a single output (function value)
        self.final_layer = DenseLayer(1, layer_width, transformation = final_trans)
    
    
    # main function to be called  
    def call(self,x):
        '''            
        Args:
            t: sampled time inputs 
            x: sampled space inputs

        Run the DGM model and obtain fitted function value at the inputs (t,x)                
        '''  
        
        # define input vector as time-space pairs
        X = tf.concat([x],1)
        
        # call initial layer
        S = self.initial_layer.call(X)
        
        # call intermediate LSTM layers
        for i in range(self.n_layers):
            S = self.LSTMLayerList[i].call(S,X)
        
        # call final LSTM layers
        result = self.final_layer.call(S)
        
        return result


#%% main class

class check():
        def __init__(self,v,x,layers,learning_rate,adam_iter,params):
            self.params=params
            self.v=v
            self.x=x
            self.learning_rate = learning_rate
            self.adam_iter = adam_iter
            self.lb = np.array([self.x[0][0]])
            self.ub = np.array([self.x[-1][0]])

            self.sess = tf.Session(config = tf.ConfigProto(allow_soft_placement = True, log_device_placement = True))
            self.x_tf = tf.placeholder(tf.float32, shape=[None,self.x.shape[1]])
            self.v_tf = tf.placeholder(tf.float32, shape=[None,self.v.shape[1]])
            self.x_u_tf = tf.placeholder(tf.float32, shape=[None,self.x.shape[1]])
            self.v_u_tf = tf.placeholder(tf.float32, shape=[None,self.v.shape[1]])
            
            self.weights_v,self.biases_v = self.initialize_nn(layers)
            self.weights_i,self.biases_i = self.initialize_nn(layers)
            
            
            with tf.variable_scope("control",reuse=True):
                self.i_pred = self.net_i(self.x_tf)
            
            with tf.variable_scope("value",reuse=True):
                self.v_pred = self.net_v(self.x_tf)
            
            self.error_i = self.policy_error(self.x_tf)
            
            self.loss_v = tf.math.reduce_max(tf.abs(self.v_pred-self.v_tf))
            
            self.loss =  tf.math.reduce_max(tf.abs(self.v_pred-self.v_tf)) + tf.reduce_mean(tf.square(self.error_i))
            
            self.optimizer_Adam = tf.train.AdamOptimizer(learning_rate=self.learning_rate)
            self.train_op_Adam = self.optimizer_Adam.minimize(self.loss)
            self.optimizer_Adam_v = tf.train.AdamOptimizer(learning_rate=self.learning_rate)
            self.train_op_Adam_v = self.optimizer_Adam.minimize(self.loss_v)
            
            init = tf.global_variables_initializer()
            self.sess.run(init)
        
        def policy_error(self,x):
            i=self.net_i(x)
            v_ = self.net_v(x+i)
            l = v_ - i*x**2
            error_i = tf.gradients(l,i)[0] 
            return error_i
        
        def initialize_nn(self,layers):
                weights = []
                biases = []
                num_layers = len(layers)
                for l in range(num_layers-1):
                    W = self.xavier_init(size = [layers[l],layers[l+1]])
                    b = tf.Variable(tf.zeros([1,layers[l+1]], dtype=tf.float32), dtype = tf.float32)
                    weights.append(W)
                    biases.append(b)
                
                return weights,biases            
        def xavier_init(self,size):
                in_dim = size[0]
                out_dim = size[1]
                xavier_stddev = np.sqrt(2/(in_dim + out_dim))
                try:
                    val = tf.Variable(tf.random.truncated_normal([in_dim,out_dim], stddev = xavier_stddev), dtype = tf.float32)
                except:
                    val = tf.Variable(tf.truncated_normal([in_dim,out_dim], stddev = xavier_stddev), dtype = tf.float32)
                return val
    
        def neural_net(self,X,weights,biases):
                num_layers = len(weights) +1
                H = 2.0*(X - self.lb)/(self.ub - self.lb) -1
                #H=X
                for l in range(num_layers-2):
                    W = weights[l]
                    b = biases[l]
                    H = tf.tanh(tf.add(tf.matmul(H,W),b))
                W = weights[-1]
                b = biases[-1]
                Y = tf.add(tf.matmul(H,W),b)
    
                return Y
        def net_v(self,eta):
                if self.params['DGM']==True:
                    model_v = DGMNet(self.params['neurons_per_layer'],self.params['num_layers'],1)
                    v_u = model_v(eta)
                
                else:
                    X = tf.concat([eta],1)
                    v_u = self.neural_net(X,self.weights_v,self.biases_v)
                return v_u
        def net_i(self,eta):
                if self.params['DGM']==True:
                    model_i = DGMNet(self.params['neurons_per_layer'],self.params['num_layers'],1)
                    i_u = model_i(eta)
                
                else:
                    X = tf.concat([eta],1)
                    i_u = self.neural_net(X,self.weights_i,self.biases_i)
                return i_u 
        def callback(self,loss):    
                print('Loss: ',loss)
        
        def train(self):
            #K.clear_session()
                
                         
                start_time = time.time()
                if True: #set this to true if you want adam to run 
                    tf_dict = {self.v_tf:self.v, self.x_tf:self.x}
                    for it in range(self.adam_iter):
                        self.sess.run(self.train_op_Adam_v, tf_dict)
                        # Print
                        if it % 1000 == 0:
                            elapsed = time.time() - start_time
                            loss_value = self.sess.run(self.loss_v, tf_dict)
                            print('It: %d, Loss: %.3e, Time: %.2f' % 
                                  (it, loss_value, elapsed))
                            start_time = time.time()
                
                    start_time = time.time()
                                    
                if True: #set this to true if you want adam to run 
                    tf_dict = {self.v_tf:self.v, self.x_tf:self.x}
                    for it in range(self.adam_iter):
                        self.sess.run(self.train_op_Adam, tf_dict)
                        # Print
                        if it % 1000 == 0:
                            elapsed = time.time() - start_time
                            loss_value = self.sess.run(self.loss, tf_dict)
                            print('It: %d, Loss: %.3e, Time: %.2f' % 
                                  (it, loss_value, elapsed))
                            start_time = time.time()
                            
                    start_time = time.time()

        def predict(self,X_star):
            i_star = self.sess.run(self.i_pred,{self.x_tf: X_star[:,0:1]})            
            v_star = self.sess.run(self.v_pred,{self.x_tf: X_star[:,0:1]}) 
            error = self.sess.run(self.error_i,{self.x_tf: X_star[:,0:1]})
            tf.reset_default_graph()
            return i_star,v_star,error


#%%
if __name__=="__main__":
        params={'DGM':True,'neurons_per_layer':50,'num_layers':4}
        x=np.linspace(-1,1,100).reshape(-1,1).astype(np.float32)
        v=(10 - x**2).reshape(-1,1).astype(np.float32)
        #architecture for feed-forward network
        layers =  [1, 10,1]
        learning_rate = 0.001
        adam_iter = 5000
        
        run = check(v,x,layers,learning_rate,adam_iter,params)
        run.train()
        
        i_star,v_star,error=run.predict(x)
        

The problem is to find the optimal function i that maximizes the function v=10-(x+i^2)-ix^2, where x is the state variable. That is, the optimal function i will depend on x. If I set 'DGM' as False in the parameter dictionary and run the code, I get the right solution (in this case the functions are coded as feed-forward neural network), where the correct analytical solution is i_star = 0.5*(-2x-x^2). If I set 'DGM' as False, the solution is incorrect. I tried with different number of layers and number of neurons per each layer, but DGM always gives incorrect solution.

Am I doing something wrong? Many thanks.

GPU

Is the DGM network optimized for GPU? The code as it stands does not utilize GPU. Changing the DGM network to feed-forward network fixes this problem though (I can see the GPU being used at full capacity). The training time with DGM network is too high with CPU unfortunately.

Any help is appreciated.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.