Physics-Informed Neural Networks#
Motivation#
For years, mathematicians and physicists have been trying to model the world with differential equations. With the advent of techniques such as machine learning, neural networks, and deep learning, together with greater computing power, the community has speculated that algorithms could automatically learn anything given enough data. However, it seems this is not really true.
Philosophical Question: Could a machine have discovered Newton’s laws?
PINNs#
In 2019, Raissi, Perdikaris and Karniadakis introduced Physics-Informed Neural Networks (PINNs): neural networks trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations (source). PINNs are nowadays used to solve PDEs, fractional equations, and integro-differential equations.
Idea
PINNs approximate PDE solutions by training a neural network to minimize a loss function that includes:
the initial and boundary conditions along the space-time domain’s boundary, and
the PDE residual at selected (collocation) points inside the domain.
As a simplified analogy, the initial and boundary condition points play the role of a usual training dataset, while the PDE residual embeds the physical law into the neural network.
Setup#
PINNs can solve differential equations expressed, in the most general form, as

$$
\begin{cases}
\mathcal{F}(u(z); \lambda) = f(z), & z \in \Omega, \\
\mathcal{B}(u(z)) = g(z), & z \in \partial \Omega,
\end{cases}
$$

defined on the domain \(\Omega \subset \mathbb{R}^d\) with boundary \(\partial \Omega\), where
\(z := (x_1, x_2, \ldots, t)^\top\) indicates the space-time coordinate vector,
\(u\) the unknown function,
\(\lambda\) the parameters related to the physics,
\(\mathcal{F}\) the non-linear differential operator,
\(f\) the function identifying the data of the problem,
\(\mathcal{B}\) the operator indicating arbitrary initial or boundary conditions, and
\(g\) the boundary function.
In the PINN methodology, \(u(z)\) is computationally predicted by a neural network, parametrized by a set of parameters \(\theta\), giving rise to the approximation

$$
\hat{u}_\theta(z) \approx u(z).
$$
The optimization problem we want to deal with is

$$
\min_\theta \; \omega_\mathcal{F} \, \mathcal{L}_\mathcal{F}(\theta) + \omega_\mathcal{B} \, \mathcal{L}_\mathcal{B}(\theta) + \omega_{\text{data}} \, \mathcal{L}_{\text{data}}(\theta),
$$

that is, a weighted sum of three loss terms:
\(\mathcal{L}_\mathcal{F}\), penalizing the residual of the differential equation,
\(\mathcal{L}_\mathcal{B}\), enforcing the boundary and initial conditions, and
\(\mathcal{L}_{\text{data}}\), fitting (eventually) some known data.
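To make this concrete, here is a minimal sketch (plain NumPy, not DeepXDE code) of such a composite loss; the arrays residual, bc_error, and data_error are hypothetical placeholders for the pointwise errors a real implementation would compute from the network output.

import numpy as np

# Minimal sketch of the composite PINN loss: a weighted sum of
# mean-squared errors for the PDE residual, the boundary/initial
# conditions, and (optionally) some known data.
def pinn_loss(residual, bc_error, data_error, w_f=1.0, w_b=1.0, w_d=1.0):
    loss_f = np.mean(residual ** 2)    # L_F: PDE residual term
    loss_b = np.mean(bc_error ** 2)    # L_B: boundary/initial term
    loss_d = np.mean(data_error ** 2)  # L_data: data-fitting term
    return w_f * loss_f + w_b * loss_b + w_d * loss_d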
Implementation: DeepXDE#
DeepXDE is a library for scientific machine learning and physics-informed learning. It supports several tensor libraries as backends (TensorFlow, PyTorch, JAX, among others).
One of its main features is that it enables user code to be compact, closely resembling the mathematical formulation. We don’t have to worry much about the technical aspects of the code, but rather about the mathematical model.
Example: Burgers Equation#
We will solve the Burgers equation

$$
\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = \frac{0.01}{\pi} \frac{\partial^2 u}{\partial x^2}, \qquad x \in [-1, 1], \quad t \in [0, 0.99],
$$

with the Dirichlet boundary conditions and initial condition

$$
u(-1, t) = u(1, t) = 0, \qquad u(x, 0) = -\sin(\pi x).
$$
Official documentation tutorial here.
Remarks:
If you are working on Google Colab, you need to install DeepXDE and enable the GPU. Run !pip install deepxde, then go to Menu > Runtime > Change Runtime and change the hardware acceleration to GPU.
If you are working on your own machine, please follow the install and setup documentation.
# Uncomment and run the next line if you are working on Google Colab
# !pip install deepxde
import numpy as np
import deepxde as dde
from deepxde.backend import tf
No backend selected.
Finding available backend...
2023-06-05 03:29:12.112854: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-06-05 03:29:12.163822: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-06-05 03:29:12.165043: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-06-05 03:29:13.132098: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Using backend: tensorflow.compat.v1
Other supported backends: tensorflow, pytorch, jax, paddle.
paddle supports more examples now and is recommended.
Found tensorflow.compat.v1
Setting the default backend to "tensorflow.compat.v1". You can change it in the ~/.deepxde/config.json file or export the DDE_BACKEND environment variable. Valid options are: tensorflow.compat.v1, tensorflow, pytorch, jax, paddle (all lowercase)
WARNING:tensorflow:From /opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:tensorflow:From /opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/site-packages/deepxde/nn/initializers.py:118: The name tf.keras.initializers.he_normal is deprecated. Please use tf.compat.v1.keras.initializers.he_normal instead.
First of all, we need to define the domain on which we will work.
geom = dde.geometry.Interval(-1, 1)
timedomain = dde.geometry.TimeDomain(0, 0.99)
geomtime = dde.geometry.GeometryXTime(geom, timedomain)
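As an optional sanity check, we can sample a few random space-time points from this geometry to see the coordinate layout used throughout (each row is a point \((x, t)\)); random_points is part of DeepXDE’s geometry API.

# Each row is a random point (x, t) with x in [-1, 1] and t in [0, 0.99].
print(geomtime.random_points(3))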
The PDE residual is

$$
\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} - \frac{0.01}{\pi} \frac{\partial^2 u}{\partial x^2} = 0,
$$

which we express as a function:
def pde(x, y):
    dy_x = dde.grad.jacobian(y, x, i=0, j=0)   # du/dx (column 0 of the input is x)
    dy_t = dde.grad.jacobian(y, x, i=0, j=1)   # du/dt (column 1 of the input is t)
    dy_xx = dde.grad.hessian(y, x, i=0, j=0)   # d^2u/dx^2
    return dy_t + y * dy_x - 0.01 / np.pi * dy_xx
The first argument to pde is a 2-dimensional vector, where the first component (x[:, 0]) is the \(x\)-coordinate and the second component (x[:, 1]) is the \(t\)-coordinate. The second argument is the network output, i.e., the solution \(u(x, t)\), but here we use y as the name of the variable.
Now, let’s continue with initial and boundary conditions
bc = dde.icbc.DirichletBC(
    geomtime, lambda x: 0, lambda _, on_boundary: on_boundary
)
ic = dde.icbc.IC(
    geomtime, lambda x: -np.sin(np.pi * x[:, 0:1]), lambda _, on_initial: on_initial
)
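The third argument of DirichletBC and IC is a selector: it receives a single point x and a boolean flag, and returns whether the condition applies there. Here lambda _, on_boundary: on_boundary keeps every boundary point. If we ever needed a condition on only one end of the interval, a hypothetical selector (not used in this example) could look like:

# Hypothetical selector: keep only boundary points on the left end x = -1.
def on_left(x, on_boundary):
    return on_boundary and np.isclose(x[0], -1.0)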
Finally, we can define our PDE problem in a TimePDE object:
data = dde.data.TimePDE(
geomtime,
pde,
[bc, ic],
num_domain=2540,
num_boundary=80,
num_initial=160
)
Where
num_domain is the number of training residual points sampled inside the domain,
num_boundary is the number of training points sampled on the boundary, and
num_initial is the number of residual points for the initial conditions.
At this point we choose the network. Here, we use a fully connected neural network of depth 4 (i.e., 3 hidden layers) and width 20, with the hyperbolic tangent as the activation function and a standard initialization scheme called Glorot normal.
net = dde.nn.FNN([2] + [20] * 3 + [1], "tanh", "Glorot normal")
Notice it has two neurons in the input layer, corresponding to time and the 1-dimensional space coordinate. However, the output layer has just one neuron, since \(u(t,x) \in \mathbb{R}\).
Now we need to create a Model object combining our PDE problem and the network. It is also necessary to indicate the optimizer and the learning rate we will use.
model = dde.Model(data, net)
model.compile("adam", lr=1e-3)
Compiling model...
Building feed-forward neural network...
'build' took 0.047117 s
/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/site-packages/deepxde/nn/tensorflow_compat_v1/fnn.py:116: UserWarning: `tf.layers.dense` is deprecated and will be removed in a future version. Please use `tf.keras.layers.Dense` instead.
return tf.layers.dense(
'compile' took 0.377277 s
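By default, the three loss terms (PDE residual, boundary condition, initial condition) are weighted equally. compile also accepts a loss_weights argument matching the weights \(\omega\) in the optimization problem above; the values below are purely illustrative and left commented out so that the training run shown next stays reproducible.

# Illustrative alternative (weights not tuned for this problem):
# model.compile("adam", lr=1e-3, loss_weights=[1, 10, 10])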
We then train the model for 15000 iterations:
losshistory, train_state = model.train(iterations=15000)
Training model...
Step Train loss Test loss Test metric
0 [6.55e-01, 1.87e-01, 5.50e-01] [6.55e-01, 1.87e-01, 5.50e-01] []
2023-06-05 03:29:16.271871: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:353] MLIR V1 optimization pass is not enabled
1000 [4.64e-02, 2.85e-03, 6.84e-02] [4.64e-02, 2.85e-03, 6.84e-02] []
2000 [3.60e-02, 5.00e-04, 5.31e-02] [3.60e-02, 5.00e-04, 5.31e-02] []
3000 [3.04e-02, 1.40e-04, 4.54e-02] [3.04e-02, 1.40e-04, 4.54e-02] []
4000 [2.03e-02, 8.63e-05, 2.73e-02] [2.03e-02, 8.63e-05, 2.73e-02] []
5000 [6.31e-03, 2.34e-05, 7.58e-03] [6.31e-03, 2.34e-05, 7.58e-03] []
6000 [3.87e-03, 1.04e-05, 3.68e-03] [3.87e-03, 1.04e-05, 3.68e-03] []
7000 [2.49e-03, 9.13e-06, 2.06e-03] [2.49e-03, 9.13e-06, 2.06e-03] []
After we train the network using Adam, we continue to train the network using L-BFGS to achieve a smaller loss:
model.compile("L-BFGS-B")
losshistory, train_state = model.train()
Compiling model...
'compile' took 0.144206 s
Training model...
Step Train loss Test loss Test metric
15000 [1.23e-03, 1.08e-05, 9.74e-04] [1.23e-03, 1.08e-05, 9.74e-04] []
16000 [3.42e-04, 1.71e-06, 1.75e-04]
17000 [1.35e-04, 1.56e-06, 5.74e-05]
18000 [7.67e-05, 4.28e-07, 1.97e-05]
19000 [4.58e-05, 3.63e-07, 7.05e-06]
20000 [2.60e-05, 1.73e-07, 3.14e-06]
21000 [1.49e-05, 1.01e-07, 1.56e-06]
22000 [8.42e-06, 7.99e-08, 1.14e-06]
INFO:tensorflow:Optimization terminated with:
Message: CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
Objective function value: 0.000009
Number of iterations: 6699
Number of functions evaluations: 7242
22242 [8.01e-06, 6.20e-08, 8.80e-07] [8.01e-06, 6.20e-08, 8.80e-07] []
Best model at step 22242:
train loss: 8.95e-06
test loss: 8.95e-06
test metric: []
'train' took 37.846161 s
dde.saveplot(losshistory, train_state, issave=False, isplot=True)
Let’s compare the predicted solution with the reference solution:
from pathlib import Path
burgers_filepath = Path().resolve().parent / "data" / "Burgers.npz"
data = np.load(burgers_filepath)
t, x, exact = data["t"], data["x"], data["usol"].T
xx, tt = np.meshgrid(x, t)
X = np.vstack((np.ravel(xx), np.ravel(tt))).T
y_true = exact.flatten()[:, None]
y_pred = model.predict(X)
f = model.predict(X, operator=pde)
print("Mean residual:", np.mean(np.absolute(f)))
print("L2 relative error:", dde.metrics.l2_relative_error(y_true, y_pred))
Mean residual: 0.008336143
L2 relative error: 0.05253457073427108
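To visualize the comparison on the whole space-time grid, we can reshape the flat prediction back onto the meshgrid; this assumes matplotlib is available.

import matplotlib.pyplot as plt

# Reshape the flat prediction back onto the (t, x) meshgrid and plot it.
plt.figure(figsize=(6, 4))
plt.pcolormesh(xx, tt, y_pred.reshape(xx.shape), cmap="rainbow", shading="auto")
plt.colorbar(label="u(x, t)")
plt.xlabel("x")
plt.ylabel("t")
plt.title("PINN prediction for the Burgers equation")
plt.show()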