neurodiffgym / neurodiffeq

A library for solving differential equations using neural networks based on PyTorch, used by multiple research groups around the world, including at Harvard IACS.

Home Page: http://pypi.org/project/neurodiffeq/

License: MIT License

Python 98.90% TeX 0.68% Dockerfile 0.23% Shell 0.14% Batchfile 0.05%

differential-equations neural-networks pytorch pde-solver odes artificial-intelligence deep-learning physics-informed-neural-networks pypi time-series

neurodiffeq's Issues

Use one net instead of many for ode/pde systems

Allowing the neural nets to share weights can result in better solutions for some ode/pde systems. The user should be able to switch between multiple neural nets and single neural net.

Tests are taking too long

Tests usually take ~40 mins to run. This delayed feedback can slow down development. Is there a better way to test?

make xy_min and xy_max argument optional in pde.solve2D_system and pde.solve2D

When train_generator and valid_generator are specified by the user, xy_min and xy_max are redundant. The ode module has a similar issue.

use contour plot for Monitor2D

Main causes of test failures found

Causes of Test Failure

To understand the causes of the massive test failures, note the following:

PyTorch tensors (at the low level) are strictly typed and do not support dynamic casting. For example, one cannot add a Double tensor to a Float tensor without explicitly casting one to the other's type.
PyTorch supports globally setting the default type of tensors via torch.set_default_dtype(...), which is then used for all floating-point tensors created subsequently, until the current process exits. If torch.set_default_dtype is never called, PyTorch uses torch.float32 as the default.
When Python evaluates a script (for import or any other purpose), it runs everything in that module verbatim. In other words, Python does not intelligently evaluate only bar when a user executes from foo import bar. Instead, it runs (evaluates) the entire file foo.py and then binds bar in the current namespace.
[To be confirmed] When pytest is executed with no arguments, it searches for all files matching the test_*.py pattern, executes them one at a time (in the same process), registers all target functions beginning with test in all files, and runs all registered targets. Note that in this manner, if default types are specified multiple times in different .py files, only the last call takes effect.
In our case, default tensor types differ among test files. Therefore, running any individual test file yields no error (which is the common workflow when using an IDE like PyCharm), while running all tests together in a single process results in failure (which is the case for our travis-ci build config).
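
The interaction described above can be reproduced in a few lines. This is a minimal sketch, assuming a recent PyTorch; the variable names are illustrative and not taken from the test suite. Newer PyTorch versions promote dtypes for elementwise arithmetic, so the sketch uses torch.mm, which (in the versions I am aware of) still rejects mixed Float/Double operands.

```python
import torch

# Simulate two test modules that each set a different global default dtype.
torch.set_default_dtype(torch.float64)   # e.g. done at the top of one test file
a = torch.ones(2, 2)                     # created as float64

torch.set_default_dtype(torch.float32)   # e.g. done at the top of another test file
b = torch.ones(2, 2)                     # created as float32

try:
    torch.mm(a, b)                       # mixed Double/Float operands
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

print(a.dtype, b.dtype, mismatch_raised)
```

Tensors created before and after the second call keep their own dtypes, which is how objects built at import time in one test file can clash with objects built later in another.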

Behavior

To verify our theory, consider the following ways of running tests:

Scenario A: Running all test files together using

pytest tests/test_*.py 
# or, without arguments
pytest

causes some test cases to fail because of a Float vs Double type mismatch.

Scenario B: Running each test file one at a time using

for file in tests/test_*.py; do
    pytest "$file"
done

yields no error (except for the technical debt in test_pde_spherical.py)

Solution

Modify the following line in .travis.yml

  - pytest --cov-report term --cov=neurodiffeq/

to run pytest one file at a time.

I'm not familiar with Travis config's syntax, but probably something like

  - pytest tests/test_function_basis.py
  - pytest tests/test_ode.py
  - pytest tests/test_pde.py
  - pytest tests/test_pde_spherical.py
  - pytest tests/test_temporal.py

Is there a more elegant way of doing this?
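
One possible alternative to listing every file in the CI config is a small driver script that launches a fresh interpreter per test file. The sketch below is only an illustration under that assumption; build_commands and run_isolated are hypothetical names, not part of neurodiffeq or pytest.

```python
import glob
import subprocess
import sys


def build_commands(test_files):
    """Return one pytest command per test file, each meant for a fresh process."""
    return [[sys.executable, "-m", "pytest", path] for path in sorted(test_files)]


def run_isolated(pattern="tests/test_*.py"):
    """Run every matching test file in its own interpreter; return the worst exit code."""
    worst = 0
    for cmd in build_commands(glob.glob(pattern)):
        # A new process per file means global state (such as the default dtype) cannot leak.
        worst = max(worst, subprocess.call(cmd))
    return worst
```

A CI step could then call run_isolated() from a one-line script and pass its return value to sys.exit, so new test files are picked up without editing the config.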

Coupled PDEs

Hello Everyone
Can we use neurodiffeq to solve coupled PDEs in thermoelasticity?

remove tol keyword

The tol keyword and maxiter keyword in the solve/solve_system function are confusing if they coexist. For now, remove the tol keyword.

shuffle training set

Shuffling training set (with numpy.random.shuffle or torch.permute) results in a pytorch error.

I may have missed it, but does the library solve nonlinear ODEs? For example, can it solve dy/dx = 2 * y - y ^ 2 + 1 with y(0) = 0, a nonlinear Riccati equation? When I tried to solve it, the results were very wrong. Is there a trick to solving nonlinear equations with this library, or is there a method you can suggest?

Thank you very much.
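
For checking any numerical result on this problem, a reference solution helps. The Riccati equation above has the closed form y(x) = 1 + sqrt(2) * tanh(sqrt(2) * x + c) with c = 0.5 * ln((sqrt(2) - 1) / (sqrt(2) + 1)). The sketch below uses only the standard library; the function names are hypothetical, and it says nothing about how the library itself behaves on this equation.

```python
import math


def riccati_exact(x):
    """Closed-form solution of dy/dx = 2*y - y**2 + 1 with y(0) = 0."""
    root2 = math.sqrt(2.0)
    shift = 0.5 * math.log((root2 - 1.0) / (root2 + 1.0))
    return 1.0 + root2 * math.tanh(root2 * x + shift)


def riccati_residual(x, h=1e-5):
    """Residual dy/dx - (2*y - y**2 + 1), with dy/dx from a central difference."""
    y = riccati_exact(x)
    dydx = (riccati_exact(x + h) - riccati_exact(x - h)) / (2.0 * h)
    return dydx - (2.0 * y - y ** 2 + 1.0)
```

Comparing a trained network against riccati_exact on a grid of x values gives a direct error measure instead of a visual impression.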

allow tracking custom metrics

Fix FCNN `n_hidden_layers`

What will be changed

Following the discussion last Friday, the arguments n_hidden_layers (int) and n_hidden_units (int) in the constructor of neurodiffeq.networks.FCNN will be deprecated in favor of a single hidden_units (Tuple[int]) argument.

When to release

The change will be released in v0.2.2 instead of v0.3.0.
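
A deprecation like this is often handled by a small shim that accepts both spellings for a while. The sketch below is hypothetical: resolve_hidden_units is not a neurodiffeq function, the defaults are invented for illustration, and the mapping from the old arguments (one tuple entry per hidden layer) is an assumption that may not match the library's legacy semantics.

```python
import warnings


def resolve_hidden_units(hidden_units=None, n_hidden_units=None, n_hidden_layers=None):
    """Hypothetical helper: turn old-style arguments into a hidden_units tuple."""
    legacy_used = n_hidden_units is not None or n_hidden_layers is not None
    if hidden_units is not None:
        if legacy_used:
            raise ValueError("pass either hidden_units or the deprecated arguments, not both")
        return tuple(hidden_units)
    if legacy_used:
        warnings.warn(
            "n_hidden_units and n_hidden_layers are deprecated; use hidden_units",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: one tuple entry per hidden layer, all with the same width.
        width = 32 if n_hidden_units is None else n_hidden_units
        depth = 1 if n_hidden_layers is None else n_hidden_layers
        return (width,) * depth
    return (32, 32)  # assumed default, not necessarily the library's
```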

Rename arguments in ode functions/classes

The following changes will be made (while providing backward compatibility)

rename the argument x to u in neurodiffeq.conditions.{IVP,DirichletBVP}
rename the argument x to u in neurodiffeq.neurodiffeq.{diff,safe_diff,unsafe_diff}

Performance of computing partial derivative

Hi,

I am pretty new to neurodiffeq, thank you very much for the excellent library.

I am interested in the way, and the computational speed, of computing partial derivatives w.r.t. the inputs.

Take the forward ODE solver (1D, one unknown variable) as an example. The input is x, a batch of coordinates, and the output of the neural network is y, the approximate solution of the differential equation at these coordinates. If we view the neural network as a smooth function f that approximates the solution, the forward pass in training evaluates y = f(x): for each input element x_i, the network gives y_i = f(x_i), with i running from 0 to N-1, where N is the batch size. When constructing the loss function, one evaluates the residual of the differential equation, which usually requires \frac{\partial y_i}{\partial x_i} and higher-order derivatives.

My question relates to how \frac{\partial y_i}{\partial x_i} is evaluated. For example, x is an (N, 1) tensor and y is also an (N, 1) tensor, where N is the batch size. If you call autograd.grad(y, t, create_graph=True, grad_outputs=ones, allow_unused=True) as in the lines below
https://github.com/odegym/neurodiffeq/blob/718f226d40cfbcb9ed50d72119bd3668b0c68733/neurodiffeq/neurodiffeq.py#L21-L22
my understanding is that it will evaluate a Jacobian matrix of size (N, N) with elements \frac{\partial y_i}{\partial x_j} (i, j from 0 to N-1), regardless of the fact that y_i depends only on x_i, so the computation (and storage) of the off-diagonal elements is unnecessary. In other words, the computation could be done by evaluating N gradients, but the current method evaluates N * N.

My questions are:

Is what I state above correct, to your understanding?
If it is correct, do you think this affects computation speed?
If 1 is correct, do you know of any way, or have any plan, to reduce the computation needed?
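
As far as I understand reverse-mode automatic differentiation, autograd.grad with grad_outputs does not materialize the (N, N) Jacobian J. It computes a single vector-Jacobian product, ones^T J, in one backward pass. Because J is diagonal when each y_i depends only on x_i, that product equals the per-sample derivatives. The sketch below compares the two routes on a toy function (torch.sin stands in for the network; this is an illustration, not the library's code).

```python
import torch

n = 5
x = torch.linspace(0.0, 1.0, n).reshape(-1, 1).requires_grad_(True)
y = torch.sin(x)  # y_i depends only on x_i

# One backward pass: a vector-Jacobian product with a vector of ones.
(dy_dx,) = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y), create_graph=True)

# For comparison only: materialize the full Jacobian and read off its diagonal.
jac = torch.autograd.functional.jacobian(torch.sin, x.detach())  # shape (n, 1, n, 1)
diagonal = jac.reshape(n, n).diagonal().reshape(-1, 1)

print(torch.allclose(dy_dx, diagonal))
```

Only the second route builds all N * N entries; the first never forms them.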

Thanks!

boundary conditions with complex shape

Write docs on customizing boundary/initial conditions and using callbacks.

Unifying shapes of tensors throughout the library

Rethink the `diff` function

At the core of the neurodiffeq library is the diff(x, t) function, which computes the partial derivative ∂x/∂t evaluated at t. Usually, both tensor t and x have shapes of (n_samples, 1). When either x.shape or t.shape is malformed, however, there are cases where things could go wrong due to broadcasting. Such cases are so subtle that they have gone unnoticed for a long time.

All our generators (as defined in neurodiffeq.generator) currently return tensors with shapes (n_samples,) instead of (n_samples, 1). Efforts should be put into unifying the tensor shapes everywhere.
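
The broadcasting hazard mentioned above is easy to see in isolation. This is a minimal sketch with made-up tensors, independent of neurodiffeq: multiplying an (n, 1) tensor by an (n,) tensor silently produces an (n, n) result.

```python
import torch

n = 4
col = torch.arange(n, dtype=torch.float32).reshape(-1, 1)  # shape (n, 1)
flat = torch.arange(n, dtype=torch.float32)                # shape (n,)

product = col * flat                   # broadcasts to an (n, n) outer product, not elementwise
intended = col * flat.reshape(-1, 1)   # shapes agree, result stays (n, 1)

print(product.shape, intended.shape)
```

No error is raised in the first case, which is why such mistakes can go unnoticed once the result is reduced to a scalar loss.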

Here are two simple cases for review.

Case 1: Shapes don't matter

In this case, we try different combinations of x.shape and t.shape and check the shape of the output ∂x/∂t, namely:

[n, 1] and [n] --> [n]
[n] and [n] --> [n]
[n, 1] and [n, 1] --> [n, 1]
[n] and [n, 1] --> [n, 1]

To see this, run the following code. Note that d1, d2, d3, and d4, while having different shapes, hold the same values. This is the reason why we incorrectly believed in the soundness of the diff() function.

import torch
from neurodiffeq import diff

n = 10

t = torch.rand(n, requires_grad=True)
x = torch.sin(t)
d1 = diff(x.reshape(-1, 1), t)
d2 = diff(x.reshape(-1), t)

t = t.reshape(-1, 1)
x = torch.sin(t)
d3 = diff(x.reshape(-1, 1), t)
d4 = diff(x.reshape(-1), t)

Case 2: Shapes matter

In this second case, we examine two new operators – div and curl in spherical coordinates – and show that only when x.shape and t.shape are both (n, 1) will the vector identity div(curl(...)) == 0 hold.

Here is the definition of curl and divergence in spherical coordinates

import torch
from torch import sin
from neurodiffeq import diff

# these two operators have been recently implemented in neurodiffeq.operators
def spherical_curl(u_r, u_theta, u_phi, r, theta, phi):
    d_r = lambda u: diff(u, r)
    d_theta = lambda u: diff(u, theta)
    d_phi = lambda u: diff(u, phi)

    curl_r = (d_theta(u_phi * sin(theta)) - d_phi(u_theta)) / (r * sin(theta))
    curl_theta = (d_phi(u_r) / sin(theta) - d_r(u_phi * r)) / r
    curl_phi = (d_r(u_theta * r) - d_theta(u_r)) / r

    return curl_r, curl_theta, curl_phi


def spherical_div(u_r, u_theta, u_phi, r, theta, phi):
    div_r = diff(u_r * r ** 2, r) / r ** 2
    div_theta = diff(u_theta * sin(theta), theta) / (r * sin(theta))
    div_phi = diff(u_phi, phi) / (r * sin(theta))
    return div_r + div_theta + div_phi

Here we define a vector field q by specifying the rule to compute q given coordinates (r, theta, phi)

def compute_q(r, theta, phi):
    r_theta_phi = torch.stack([r.flatten(), theta.flatten(), phi.flatten()], dim=1)
    W = torch.tensor([
        [.01, .04, .07],
        [.02, .05, .08],
        [.03, .06, .09],
    ])
    q = torch.matmul(r_theta_phi, W)
    q = torch.tanh(q)
    return q[:, 0], q[:, 1], q[:, 2]

We then test the vector identity div(curl(q)) == 0 for q

import numpy as np

n = 10

# create r, theta, and phi with shape (n, 1)
r = torch.rand(n, 1, requires_grad=True) + 0.1
theta = torch.rand(n, 1, requires_grad=True) * np.pi
phi = torch.rand(n, 1, requires_grad=True)  * np.pi * 2
q_r, q_theta, q_phi = compute_q(r, theta, phi)

# bind the operators to the r, theta, phi created above
div = lambda u_r, u_theta, u_phi: spherical_div(u_r, u_theta, u_phi, r, theta, phi)
curl = lambda u_r, u_theta, u_phi: spherical_curl(u_r, u_theta, u_phi, r, theta, phi)

div_curl_q1 = div(*curl(q_r.reshape(-1, 1), q_theta.reshape(-1, 1), q_phi.reshape(-1, 1)))
div_curl_q2 = div(*curl(q_r.reshape(-1), q_theta.reshape(-1), q_phi.reshape(-1)))

# create r, theta, and phi with shape (n,)
r = r.reshape(-1)
theta = theta.reshape(-1)
phi = phi.reshape(-1)
q_r, q_theta, q_phi = compute_q(r, theta, phi)

# bind the operators to the r, theta, phi created above
div = lambda u_r, u_theta, u_phi: spherical_div(u_r, u_theta, u_phi, r, theta, phi)
curl = lambda u_r, u_theta, u_phi: spherical_curl(u_r, u_theta, u_phi, r, theta, phi)

div_curl_q3 = div(*curl(q_r.reshape(-1, 1), q_theta.reshape(-1, 1), q_phi.reshape(-1, 1)))
div_curl_q4 = div(*curl(q_r.reshape(-1), q_theta.reshape(-1), q_phi.reshape(-1)))

print(div_curl_q1, div_curl_q2, div_curl_q3, div_curl_q4, sep="\n")

Printing all four div_curl_q values shows that only div_curl_q1 is (approximately) equal to 0, which means both the dependent and independent variables must have shape (n, 1) for the differentiation to be correct.

Tidiness

Things that should be cleaned up:

how the library is broken into submodules and classes: Submodules like ode, pde, pde_spherical, and tdim (time-dependent PDEs, under development) are not mutually exclusive. Many of the functions and classes share very similar logic (e.g. ode.solve_system and pde.solve2D_system). This amplifies the work needed to make changes (e.g. every time I change pde.solve2D_system, I almost always need to make similar changes to ode.solve_system as well). We may want to extract this shared logic.
function signatures are not carefully thought out:
- keywords whose values can potentially contradict each other, e.g. What if in pde.solve2D_system, the domain indicated by xy_min and xy_max is different from that indicated by train_generator?
- keywords that come from different abstraction level, e.g. in pde.solve2D_system, there are keywords that come from the realm of differential equations (higher level) and keywords that come from the realm of deep learning (lower level). Maybe it's just me but that feels uncomfortable.
- keywords that always appear together, e.g. nets and single_net in pde submodule. This suggests that they really should be made into one parameter object.
- the names and order of keywords are not consistent across the package.
conversion between numpy.array and torch.tensor: numpy.array and torch.tensor are converted to each other everywhere and I find myself checking whether a sequence is a numpy.array or torch.tensor all the time. This makes working on / using the code less pleasant. Since numpy.array is only there because of matplotlib.pyplot and numpy.linalg, we should limit the use of numpy.array to those sections and use torch.tensor elsewhere. Also, numpy.array and torch.tensor use different default precisions (float64 and float32, respectively), so conversion can potentially introduce error.
the use of torch.tensor:
- I was not mindful of the memory footprint of torch.tensor when I wrote the code. There may be cases where requires_grad is set to True when it really should be False. There may also be cases where a tensor is re-created when it could be reused.
- I was not mindful when reshaping torch.tensor. I'm not sure of the impact on performance.
naming things:
- non-standard format: function names should not have uppercase letters (e.g. solve2D); uppercase variable names are not reserved for constants and enumerations.
- names that give false impressions: e.g. ExampleGenerator is not a python generator.
- confusing names: e.g. What is an additional_loss? What is an FCNN? Why are there t_0 and t_1 in ode.DirichletBVP which has nothing to do with time?
the use of strings as categorical variables: should have used Enum instead, in other cases (e.g. the as_type keyword of pde.Solution.__call__) should have used a bool instead.
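
To illustrate the last point, the sketch below replaces a string-valued option with an Enum while still accepting the legacy strings. SamplingMethod and parse_method are hypothetical names; the three values are the method strings that appear in code samples elsewhere on this page, and the real library may support others.

```python
from enum import Enum


class SamplingMethod(Enum):
    """Hypothetical enumeration of the sampling strategies named by string today."""
    UNIFORM = "uniform"
    EQUALLY_SPACED = "equally-spaced"
    EQUALLY_SPACED_NOISY = "equally-spaced-noisy"


def parse_method(method):
    """Accept either an enum member or its legacy string; reject anything else."""
    if isinstance(method, SamplingMethod):
        return method
    return SamplingMethod(method)  # raises ValueError on an unknown string
```

With this shape, a typo such as "unifrom" fails immediately at the call site instead of falling through a chain of string comparisons.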

added logging

Let there be logs.

Chebyshev Interpolation

Add Chebyshev interpolation as training point generation procedure.

Solving ODEs - Skipping correct reparametrization

Hi team,

I think there is an issue when solving ODEs with the initial condition x_0_prime=0.

It appears that in this case the correct reparametrization is skipped, and we suspect the code below from ode.py may be the cause.

        if self.x_0_prime:
            return self.x_0 + (t-self.t_0)*self.x_0_prime + ( (1-torch.exp(-t+self.t_0))**2 )*x
        else:
            return self.x_0 + (1-torch.exp(-t+self.t_0))*x

If self.x_0_prime is 0, the condition is falsy, so the if block is skipped and the second reparametrization is used, which does not enforce the derivative condition.
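
One common way to avoid this kind of truthiness trap is to test against None explicitly. The sketch below is a scalar stand-in for the two branches quoted above, written with the standard library only; it assumes x_0_prime defaults to None when no derivative condition is given, which may differ from the actual class.

```python
import math


def reparametrize(x, t, t_0, x_0, x_0_prime=None):
    """Scalar sketch of the two branches, selecting on `is not None`."""
    decay = 1.0 - math.exp(-(t - t_0))
    if x_0_prime is not None:            # 0.0 is a valid derivative value
        return x_0 + (t - t_0) * x_0_prime + decay ** 2 * x
    return x_0 + decay * x
```

With this check, x_0_prime=0.0 selects the first branch, whose squared decay factor keeps the derivative at t_0 equal to x_0_prime.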

from numpy.array to torch.tensor

For the solution of the differential equations, make the returned value torch.tensor instead of numpy.array.

add validation set

User should be able to specify a validation set (e.g. in the form of a 'generator' which can have different density/discretization/distribution from the training generator). Accordingly, the loss on the validation set should be visualized and returned.

Implement coolest reparameterization

IBVP2D for PDEs

custom loss function

Currently, the loss function is assumed to be a function of the reparameterized output of the neural net alone. What if the user wants a loss function that also takes the independent variable directly as an input?

Write documentation on callbacks.

solving simple ode

I am new to this. I am trying to solve the ODE dx/dt = t with x(0) = 1, and I found a huge difference between the ANN solution and the analytical solution. Please help me fix the problem. Here is the code:

from neurodiffeq import diff
from neurodiffeq.networks import FCNN
from neurodiffeq.ode import solve, IVP, Monitor, ExampleGenerator
import torch
from torch import nn, optim
import numpy as np
import matplotlib.pyplot as plt

ode = lambda x, t: diff(x,t) - t
t_min, t_max = 0.0, 1.0
N=5
fcnn = FCNN(n_hidden_units=5, n_hidden_layers=1, actv=nn.Tanh)
adam = optim.Adam(fcnn.parameters(), lr=0.001)
init_ode = IVP(t_0= t_min, x_0=1.0)
train_gen = ExampleGenerator(N, t_min= t_min, t_max= t_max, method="equally-spaced-noisy")

solution, loss_history = solve(
    ode=ode,
    condition=init_ode,
    train_generator=train_gen,
    t_min=t_min, t_max=t_max,
    net=fcnn,
    batch_size=N,
    max_epochs=1500,
    optimizer=adam,
    monitor=Monitor(t_min=t_min, t_max=t_max, check_every=100),
)

t_max2 = 4.0
N2 = 5
adam2 = optim.Adam(fcnn.parameters(), lr=0.001)
train_gen2 = ExampleGenerator(N2, t_min= t_min, t_max= t_max2, method="equally-spaced-noisy")

solution, _ = solve(
    ode=ode,
    condition=init_ode,
    train_generator=train_gen2,
    t_min=t_min, t_max=t_max2,
    net=fcnn,
    batch_size=N2,
    max_epochs=1000,
    optimizer=adam2,
    monitor=Monitor(t_min=t_min, t_max=t_max2, check_every=100),
)

ts = np.linspace(t_min, 4.0, 10)
x_ana = (ts**2)/2 +1
x_nn = solution(ts, as_type='np')
plt.figure()
plt.plot(ts, x_nn, label='ANN-based solution')
plt.plot(ts, x_ana, label='analytical solution')
plt.ylabel('x')
plt.xlabel('t')
plt.title('comparing solutions')
plt.legend()
plt.show()

ann solution
array([1. , 1.02371858, 1.15868232, 1.37826193, 1.70709725,
2.2127547 , 2.98944599, 4.11046892, 5.54030202, 7.00839809])

analytical solution
array([1. , 1.09876543, 1.39506173, 1.88888889, 2.58024691,
3.4691358 , 4.55555556, 5.83950617, 7.32098765, 9. ])

Model Saving & Reloading

Hi,

I am studying how transfer learning can enhance the training of physics-informed neural networks. NeuroDiffEq sparked my interest, and I was wondering whether it is possible to:

save a trained model, i.e. the parameters of the network and its architecture
reload the saved model and continue training from that non-random state.
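
Since the networks are PyTorch modules, the generic PyTorch mechanism for this is to save and reload a state_dict. The sketch below shows only that generic mechanism, with a plain nn.Sequential standing in for the network and an in-memory buffer standing in for a file; whether neurodiffeq offers its own helper for this is not something the sketch assumes.

```python
import io

import torch
from torch import nn


def make_net():
    # Stand-in architecture; the same constructor must be used when reloading.
    return nn.Sequential(nn.Linear(1, 8), nn.Tanh(), nn.Linear(8, 1))


trained = make_net()

# Save only the parameters; an in-memory buffer stands in for a file path here.
buffer = io.BytesIO()
torch.save(trained.state_dict(), buffer)

# Rebuild the same architecture, then load the saved parameters into it.
buffer.seek(0)
restored = make_net()
restored.load_state_dict(torch.load(buffer))

t = torch.linspace(0.0, 1.0, 5).reshape(-1, 1)
print(torch.equal(trained(t), restored(t)))
```

The restored network could then presumably be handed to a solver through the net argument used in other code samples on this page, so that training continues from the saved weights.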

Potential bug in the differential operator `diff`

Potential Bug?

I think I found an error in our implementation of the differential operator. I'll be drafting a commit to fix this shortly (PR #45). Honestly, I'm still confused about it. It's really strange that we should have such a fundamental bug. And I'm making the changes based on the following example.

Bug Description

Our differentiation function looks like this:

import torch
from torch import autograd

def diff(x, t, order=1):
    ones = torch.ones_like(t)
    der, = autograd.grad(x, t, create_graph=True, grad_outputs=ones)
    for i in range(1, order):
        der, = autograd.grad(der, t, create_graph=True, grad_outputs=ones)
    return der

Here, ones plays the role of the gradient flowing backward from downstream operations. However, this implementation assumes that x has the same shape as t. An error is raised if x and t have different shapes. For example:

>>> x = torch.arange(10, dtype=torch.float).requires_grad_(True)
>>> y = torch.arange(10, dtype=torch.float).requires_grad_(True)
>>> z = torch.dot(x, y)
>>> diff(z, x)
RuntimeError: Mismatch in shape: grad_output[0] has a shape of torch.Size([10]) and output[0] has a shape of torch.Size([]).

Proposed fix

Change ones to have the same shape as x instead of t, i.e., change

ones = torch.ones_like(t)

to

ones = torch.ones_like(x)

Now, if we re-run the above code, we get

>>> diff(z, x)
tensor([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.], grad_fn=<MulBackward0>)
>>> diff(z, y)
tensor([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.], grad_fn=<MulBackward0>)
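
Put together, the proposed change might look like the sketch below. It is named diff_fixed to make clear it is an illustration rather than the merged code, and it goes one step beyond the proposal by reshaping the seed for each higher-order pass (torch.ones_like(der)), which is my own assumption about what higher orders need.

```python
import torch
from torch import autograd


def diff_fixed(x, t, order=1):
    """Derivative of x with respect to t, seeding backprop with ones shaped like x."""
    der, = autograd.grad(x, t, create_graph=True, grad_outputs=torch.ones_like(x))
    for _ in range(1, order):
        # Each intermediate derivative is seeded with ones of its own shape.
        der, = autograd.grad(der, t, create_graph=True, grad_outputs=torch.ones_like(der))
    return der


x = torch.arange(10, dtype=torch.float).requires_grad_(True)
y = torch.arange(10, dtype=torch.float).requires_grad_(True)
z = torch.dot(x, y)
dz_dx = diff_fixed(z, x)  # d(x . y)/dx = y
```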

less code duplication

I find myself always making parallel changes to the ode and pde modules, which involves a lot of copy-pasting. Is it possible to factor out the common logic in solve_system and solve2D_system?

bump in Burgers' equation's solution

When solving Burgers' equation, a 'bump' always appears in the residual of the solution. The cause is unknown.

def test_burgers():

    nu = 1
    a = 2  # a parameter for the special case
    T = 0.1

    burgers = lambda u, x, t: diff(u, t) + u * diff(u, x) - nu * diff(u, x, order=2)

    ibvp = IBVP1D(
        x_min=0, x_min_val=lambda t: 0,
        x_max=1, x_max_val=lambda t: 0,
        t_min=0, t_min_val=lambda x: 2 * nu * np.pi * torch.sin(np.pi * x) / (a + torch.cos(np.pi * x))
    )
    net = FCNN(n_input_units=2, n_hidden_units=32, n_hidden_layers=1)

    solution_neural_net_burgers, _ = solve2D(
        pde=burgers, condition=ibvp, xy_min=[0, 0], xy_max=[1, T],
        net=net, max_epochs=300,
        train_generator=ExampleGenerator2D([50, 50], [0, 0], [1, T], method='equally-spaced-noisy'),
        batch_size=64,
        monitor=Monitor2D(check_every=10, xy_min=[0, 0], xy_max=[1, T])
    )

    def solution_analytical_burgers(x, t):
        numer = 2 * nu * np.pi * np.exp(-np.pi ** 2 * nu * t) * np.sin(np.pi * x)
        denom = a + np.exp(-np.pi ** 2 * nu * t) * np.cos(np.pi * x)
        return numer / denom

    xs = np.linspace(0, 1, 101)
    ts = np.linspace(0, T, 101)
    xx, tt = np.meshgrid(xs, ts)
    make_animation(solution_neural_net_burgers, xs, ts) # test animation
    sol_ana = solution_analytical_burgers(xx, tt)
    sol_net = solution_neural_net_burgers(xx, tt, as_type='np')
    assert isclose(sol_net, sol_ana, atol=0.1).all()
    print('Burgers test passed.')
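
Before looking for the cause in the network, it may be worth ruling out the reference solution. The sketch below checks, with central finite differences and the standard library only, that the analytical expression used in the test satisfies u_t + u * u_x = nu * u_xx. The function names and step size are my own choices, not part of the test suite.

```python
import math

NU, A = 1.0, 2.0  # same constants as in the test above


def u_exact(x, t):
    decay = math.exp(-math.pi ** 2 * NU * t)
    return 2 * NU * math.pi * decay * math.sin(math.pi * x) / (A + decay * math.cos(math.pi * x))


def burgers_residual(x, t, h=1e-4):
    """u_t + u*u_x - nu*u_xx evaluated with central finite differences."""
    u = u_exact(x, t)
    u_t = (u_exact(x, t + h) - u_exact(x, t - h)) / (2 * h)
    u_x = (u_exact(x + h, t) - u_exact(x - h, t)) / (2 * h)
    u_xx = (u_exact(x + h, t) - 2 * u + u_exact(x - h, t)) / h ** 2
    return u_t + u * u_x - NU * u_xx
```

If this residual stays near zero across the domain, the bump is more likely to come from training (sampling, capacity, or the reparametrization) than from the analytical formula.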

How to use l-bfgs optimization method in neurodiffeq?

GPU version?

Hi @feiyu-chen96, do you have plans for the GPU version in the near future?

Solution diverges for simple 2nd-order linear ODE

I was solving the Taylor-Couette equation, which, under mild assumptions, simplifies to this 2nd-order linear ODE:

Equation Statement

\frac{d^2 y}{dx^2} + \frac{1}{x} \frac{dy}{dx} - \frac{y}{x^2} = 0

The solution to the above equation should be

y = A x + \frac{B}{x}

where A and B are arbitrary constants.

Boundary Condition and Analytical Solution

Under the Dirichlet condition y(0.1) == y(10) == 10.1, the solution can be uniquely determined as y = x + 1/x (that is, A = B = 1).

Expected Behaviour

As the training proceeds, the net should return a solution that first goes down until x == 1 and then goes back up when x > 1.

Actual Behaviour

However, using the following code, the network gives a solution that keeps straying away from the analytical solution.

import torch
import numpy as np
import matplotlib.pyplot as plt
from neurodiffeq import diff
from neurodiffeq.ode import solve, Monitor, ExampleGenerator
from neurodiffeq.ode import IVP, DirichletBVP
from neurodiffeq.networks import FCNN

net = FCNN(n_input_units=1, n_output_units=1, n_hidden_units=512, n_hidden_layers=0)
x_1 = 0.1
x_2 = 10.0
ode = lambda y, x: diff(y, x, order=2) + diff(y, x) / x - (y / x ** 2)
condition = DirichletBVP(x_1, 10.1, x_2, 10.1)
monitor = Monitor(t_min=x_1, t_max=x_2, check_every=50)
train_generator = ExampleGenerator(256, t_min=x_1, t_max=x_2, method='uniform')
valid_generator = ExampleGenerator(2048, t_min=x_1, t_max=x_2, method='equally-spaced')
monitor.check_every = 50

# solve the ODE
solution, loss, internals = solve(
    ode=ode,
    condition=condition, 
    t_min=x_1, 
    t_max=x_2, 
    monitor=monitor, 
    max_epochs=5000,
    return_internal=True,
    train_generator=train_generator,
    valid_generator=valid_generator,
    batch_size=train_generator.size,
)

Here is a gif file that shows how the model is performing. Note that not only does the network give a solution that looks drastically different from the analytic one, but also the solution is being scaled in the y-direction. The latter can be deduced by the change of the maximum value of the y-axis over time.

use pytest.fixture instead of module-level variables
wrap warnings with pytest.warns