Numerical instability between Tensorflow and Pytorch

### Issue type
Bug or help needed

### Relevant package versions
numpy == 1.24.1
tensorflow == 2.11.0
torch == 2.0.1

### Python version
3.8.0

### Current behaviour
The envelope forms in TensorFlow and PyTorch (defined [here](https://github.com/sanderlab/CellBox/blob/master/cellbox/cellbox/kernel.py#L11)) yield very similar results: the element-wise difference between the two outputs is on the order of 1e-8 to 1e-7. However, these differences accumulate over successive time steps in the ODE solver and become clearly noticeable after around 150 to 200 steps.
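To get a feel for how a perturbation of this size propagates, here is a minimal NumPy-only sketch. It assumes the ODE has the form `dx/dt = eps * tanh(W @ x + u) - alpha * x` and integrates it with Heun's method. The input vector `u`, the helper `heun_trajectory`, and the `1e-7` shift of the initial state (standing in for the backend discrepancy) are all hypothetical and not taken from CellBox. Whether the gap grows or decays depends on the sampled `W`.

```python
import numpy as np

def heun_trajectory(x0, W, u, eps=1.0, alpha=1.0, dt=0.1, n_steps=200):
    """Integrate dx/dt = eps * tanh(W @ x + u) - alpha * x with Heun's method.

    The ODE form is an assumption made for this sketch.
    """
    def f(x):
        return eps * np.tanh(W @ x + u) - alpha * x

    x = x0.copy()
    traj = [x.copy()]
    for _ in range(n_steps):
        k1 = f(x)                     # slope at the current state
        k2 = f(x + dt * k1)           # slope at the Euler-predicted state
        x = x + 0.5 * dt * (k1 + k2)  # average the two slopes
        traj.append(x.copy())
    return np.stack(traj)

rng = np.random.default_rng(0)
n_x = 5
W = rng.normal(loc=0.01, size=(n_x, n_x))
u = rng.normal(size=(n_x, 1))  # hypothetical input, not part of the issue's script
x0 = np.zeros((n_x, 1))

base = heun_trajectory(x0, W, u)
shifted = heun_trajectory(x0 + 1e-7, W, u)  # 1e-7 stands in for the backend gap

# Largest element-wise gap between the two trajectories at each time step
gap = np.max(np.abs(base - shifted), axis=(1, 2))
print(gap[[0, 50, 100, 200]])
```

Running this for several random `W` would show whether the dynamics amplify or damp a perturbation of this size, independently of TensorFlow or PyTorch.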

### Code to reproduce
The recommended envelope form for CellBox is tanh. The code below computes the output of the TensorFlow and PyTorch envelope functions in isolation, with the envelope form set to `tanh` via `KernelConfig`. No ODE solver is involved yet.

```python
import numpy as np
import tensorflow.compat.v1 as tf
import torch
tf.disable_v2_behavior()

class KernelConfig(object):
    def __init__(self):
        
        self.n_x = 5
        self.envelope_form = "tanh" # options: tanh, polynomial, hill, linear, clip linear
        self.envelope_fn = None
        self.polynomial_k = 2 # larger than 1
        self.ode_degree = 1
        self.envelope = 0
        self.ode_solver = "heun" # options: euler, heun, rk4, midpoint
        self.dT = 0.1
        self.n_T = 1000
        self.gradient_zero_from = None

args = KernelConfig()
W = np.random.normal(loc=0.01, size=(args.n_x, args.n_x))
eps = np.ones((args.n_x, 1), dtype=np.float32)
alpha = np.ones((args.n_x, 1), dtype=np.float32)
y0_np = np.zeros((args.n_x, 1))

# Test the envelope
def tensorflow_envelope():
    from cellbox.kernel import get_envelope
    envelope_fn = get_envelope(args)

    params = {}
    W_copy = np.copy(W)
    params["W"] = tf.convert_to_tensor(W_copy, dtype=tf.float32)
    if args.ode_degree == 1:
        def weighted_sum(x):
            return tf.matmul(params['W'], x)
    
    return envelope_fn(weighted_sum(tf.convert_to_tensor(params["W"], dtype=tf.float32))).eval(session=tf.compat.v1.Session())

def pytorch_get_envelope(args):
    """get the envelope form based on the given argument"""
    if args.envelope_form == 'tanh':
        args.envelope_fn = torch.tanh
    elif args.envelope_form == 'polynomial':
        k = args.polynomial_k
        assert k > 1, "Polynomial order has to be k > 1."
        if k % 2 == 1:  # odd order polynomial equation
            args.envelope_fn = lambda x: x ** k / (1 + torch.abs(x) ** k)
        else:  # even order polynomial equation
            args.envelope_fn = lambda x: x**k/(1+x**k)*torch.sign(x)
    elif args.envelope_form == 'hill':
        k = args.polynomial_k
        assert k > 1, "Hill coefficient has to be k > 1."
        # torch.relu works directly on tensors; the original referenced an unimported `nn`
        args.envelope_fn = lambda x: 2*(1-1/(1+torch.relu(x+1)**k))-1
    elif args.envelope_form == 'linear':
        args.envelope_fn = lambda x: x
    elif args.envelope_form == 'clip linear':
        args.envelope_fn = lambda x: torch.clamp(x, min=-1, max=1)
    else:
        raise ValueError("Illegal envelope function. Choose from [tanh, polynomial, hill, linear, clip linear]")
    return args.envelope_fn

def pytorch_envelope():
    envelope_fn = pytorch_get_envelope(args)
    params = {}
    W_copy = np.copy(W)
    params["W"] = torch.tensor(W_copy, dtype=torch.float32)
    if args.ode_degree == 1:
        def weighted_sum(x):
            return torch.matmul(params['W'], x)

    # params["W"] is already a float32 tensor, so it can be passed in directly
    return envelope_fn(weighted_sum(params["W"])).numpy()

tf_out = tensorflow_envelope()
torch_out = pytorch_envelope()
print(np.abs(tf_out - torch_out))
```
The output is:
```
[[0.0000000e+00 1.4901161e-08 0.0000000e+00 0.0000000e+00 0.0000000e+00]
 [5.9604645e-08 0.0000000e+00 5.9604645e-08 2.9802322e-08 5.9604645e-08]
 [1.1920929e-07 0.0000000e+00 9.3132257e-10 0.0000000e+00 2.9802322e-08]
 [2.9802322e-08 1.4901161e-08 5.9604645e-08 1.8626451e-09 5.9604645e-08]
 [5.9604645e-08 5.9604645e-08 5.9604645e-08 0.0000000e+00 0.0000000e+00]]
```
With `polynomial` and `args.polynomial_k = 2`:
```python
args.envelope_form = "polynomial"
args.polynomial_k = 2
tf_out = tensorflow_envelope()
torch_out = pytorch_envelope()
print(np.abs(tf_out - torch_out))
```
The output is:
```
[[0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]
 [0.0000000e+00 0.0000000e+00 5.9604645e-08 0.0000000e+00 0.0000000e+00]
 [0.0000000e+00 0.0000000e+00 1.4551915e-11 0.0000000e+00 0.0000000e+00]
 [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]
 [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]]
```
However, after changing the envelope form to `clip linear`:
```python
args.envelope_form = "clip linear"
tf_out = tensorflow_envelope()
torch_out = pytorch_envelope()
print(np.abs(tf_out - torch_out))
```
The output is:
```
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
```

This difference may be small, but it adds up within the ODE solver and causes the final results of the TensorFlow and PyTorch ODE solvers to differ significantly. The same issue persists when `args.envelope_form` is set to `hill` or `polynomial`. However, when `args.envelope_form` is set to `linear` or `clip linear`, the difference between the TensorFlow and PyTorch ODE solvers is exactly 0, which leads me to believe that the numerical discrepancy in the other envelope functions causes this behaviour.
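For scale, the nonzero differences printed above can be expressed in units of float32 spacing (ULP). The short sketch below uses only NumPy; the choice of `0.75` as a representative tanh output magnitude is an assumption made for illustration. The larger reported gaps come out at one or two ULPs, which suggests rounding differences between the two implementations rather than a logic error.

```python
import numpy as np

# Spacing between adjacent float32 values in [0.5, 1), i.e. 2**-24.
# 0.75 is an assumed, representative magnitude for a tanh output.
ulp_below_one = np.spacing(np.float32(0.75))

# Distinct nonzero differences copied from the tanh output above
reported = np.array([1.4901161e-08, 2.9802322e-08, 5.9604645e-08, 1.1920929e-07])

# Ratios are approximately 0.25, 0.5, 1 and 2
ratios = reported / ulp_below_one
print(ratios)
```

Ratios below 1 are possible because outputs of smaller magnitude have a proportionally smaller spacing.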

### Solution
Is there a way around this? If the two ODE solutions are very different, which one is correct?
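One possible way to judge which result is closer to the truth is to compare each float32 output against a float64 reference computed from the same float32-rounded `W`. The sketch below is NumPy-only; `reference_envelope` and `max_abs_error` are hypothetical helper names, and a NumPy float32 computation stands in for a backend output so the block runs without TensorFlow or PyTorch.

```python
import numpy as np

def reference_envelope(W32):
    """float64 reference for tanh(W @ W), starting from the float32-rounded W."""
    W64 = np.asarray(W32, dtype=np.float64)
    return np.tanh(W64 @ W64)

def max_abs_error(out32, W32):
    """Largest element-wise deviation of a float32 result from the reference."""
    diff = np.asarray(out32, dtype=np.float64) - reference_envelope(W32)
    return float(np.max(np.abs(diff)))

rng = np.random.default_rng(0)
W32 = rng.normal(loc=0.01, size=(5, 5)).astype(np.float32)

# Stand-in for a backend result: the same computation done in float32 by NumPy
stand_in = np.tanh(W32 @ W32)
err = max_abs_error(stand_in, W32)
print(err)
```

The same helper could be applied to `tf_out` and `torch_out` from the script above. If both errors are of similar size, neither backend is "the" correct one at float32 precision, and running the solver in float64 may be the more reliable comparison.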


