Computation#

General concepts and mechanisms#

Overview#

Binary operations between data arrays or datasets behave as follows:

Property    Action
coord       compare, abort on mismatch
data        apply operation
mask        combine with or
attr        typically ignored or dropped

In the special case of in-place operations such as += or *=, Scipp preserves the existing attributes of the left-hand side and ignores the attributes of the right-hand side.
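The rules listed above can be illustrated with a toy model in plain Python. This is only a sketch: ToyDataArray and toy_add are hypothetical names that are not part of Scipp, and treating a mask that is missing from one operand as all-False is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataArray:
    """Hypothetical stand-in for a data array; not part of Scipp."""
    data: list
    coords: dict = field(default_factory=dict)
    masks: dict = field(default_factory=dict)
    attrs: dict = field(default_factory=dict)


def toy_add(a, b):
    """Add two toy data arrays following the property/action rules."""
    # coord: compare, abort on mismatch
    if a.coords != b.coords:
        raise ValueError("Mismatch in coordinates")
    # data: apply operation
    data = [x + y for x, y in zip(a.data, b.data)]
    # mask: combine with or; a mask missing from one operand is treated
    # as all-False (an assumption of this sketch)
    no_mask = [False] * len(data)
    masks = {
        name: [p or q for p, q in zip(a.masks.get(name, no_mask),
                                      b.masks.get(name, no_mask))]
        for name in sorted(set(a.masks) | set(b.masks))
    }
    # attr: dropped from the result
    return ToyDataArray(data=data, coords=dict(a.coords), masks=masks)


x = [0.0, 1.0, 2.0]
lhs = ToyDataArray([1.0, 2.0, 3.0], {'x': x}, {'bad': [True, False, False]},
                   attrs={'note': 'left'})
rhs = ToyDataArray([10.0, 20.0, 30.0], {'x': x}, {'bad': [False, True, False]})
total = toy_add(lhs, rhs)
print(total.data)   # [11.0, 22.0, 33.0]
print(total.masks)  # {'bad': [True, True, False]}
print(total.attrs)  # {}
```

An element is masked in the result if it was masked in either input, and the attribute of the left-hand side does not survive the non-in-place operation.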

Dimension matching and transposing#

Operations “align” variables based on their dimension labels. That is, an operation between two variables whose memory layouts are transposed relative to each other behaves correctly:

[1]:
import numpy as np
import scipp as sc

a = sc.Variable(values=np.random.rand(2, 4),
                variances=np.random.rand(2, 4),
                dims=['x', 'y'],
                unit=sc.units.m)
b = sc.Variable(values=np.random.rand(4, 2),
                variances=np.random.rand(4, 2),
                dims=['y', 'x'],
                unit=sc.units.s)
a/b
[1]:
scipp.Variable (384 Bytes)
    • (x: 2, y: 4)
      float64
      m/s
      19.312, 9.997, ..., 0.421, 291.578
      σ = 79.801, 95.765, ..., 1.832, 4.879e+04
      Values:
      array([[1.93117158e+01, 9.99672866e+00, 5.11491490e-01, 1.21685867e+00], [1.30196514e-02, 3.97592672e-01, 4.20820145e-01, 2.91578462e+02]])

      Variances (σ²):
      array([[6.36813817e+03, 9.17098127e+03, 9.84906741e-01, 5.76137239e-01], [1.15138912e+00, 1.25817485e+00, 3.35629462e+00, 2.38009796e+09]])

Propagation of uncertainties#

If variables have variances, operations correctly propagate uncertainties (the variances), in contrast to a naive implementation using NumPy:

[2]:
result = a/b
result.values
[2]:
array([[1.93117158e+01, 9.99672866e+00, 5.11491490e-01, 1.21685867e+00],
       [1.30196514e-02, 3.97592672e-01, 4.20820145e-01, 2.91578462e+02]])
[3]:
a.values/np.transpose(b.values)
[3]:
array([[1.93117158e+01, 9.99672866e+00, 5.11491490e-01, 1.21685867e+00],
       [1.30196514e-02, 3.97592672e-01, 4.20820145e-01, 2.91578462e+02]])
[4]:
result.variances
[4]:
array([[6.36813817e+03, 9.17098127e+03, 9.84906741e-01, 5.76137239e-01],
       [1.15138912e+00, 1.25817485e+00, 3.35629462e+00, 2.38009796e+09]])
[5]:
a.variances/np.transpose(b.variances)
[5]:
array([[35.18395248,  0.95989352,  1.00649822, 23.04593062],
       [ 1.67252236,  1.71973821,  1.08625721,  4.08048179]])

The implementation assumes uncorrelated data and otherwise follows standard error propagation as described in, e.g., Wikipedia: Propagation of uncertainty. See also Propagation of uncertainties for the concrete equations used.
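To make the difference between cells [4] and [5] concrete, the following sketch applies the textbook first-order formula for the variance of a quotient of uncorrelated operands. The helper divide_with_variances is a hypothetical name used for illustration; the equations Scipp actually implements are the ones given in its own documentation.

```python
def divide_with_variances(a, var_a, b, var_b):
    """Return (value, variance) of a / b for uncorrelated a and b.

    First-order propagation: var(a/b) = var_a / b**2 + a**2 * var_b / b**4.
    """
    value = a / b
    variance = var_a / b**2 + a**2 * var_b / b**4
    return value, variance


value, variance = divide_with_variances(2.0, 0.5, 4.0, 1.0)
naive = 0.5 / 1.0  # dividing the variances directly, as in cell [5]
print(value, variance, naive)  # 0.5 0.046875 0.5
```

The function uses only arithmetic operators, so it should work element-wise on NumPy arrays as well as on floats. Dividing the variances directly gives a different, meaningless number.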

WARNING:

If an operand with variances is also broadcast in an operation then the resulting values will be correlated. Scipp has no way of tracking or handling such correlations. Subsequent operations that combine values of the result, such as computing the mean, will thus result in underestimated uncertainties. Generally, the differences are negligible only if the variances of the broadcast operand are negligible.

Scipp’s behavior in this case may change in the future.
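The size of this effect can be estimated with plain arithmetic. The sketch below assumes n independent values x_i to which a single uncertain value c is added by broadcasting, and then compares the variance of the mean obtained by treating the results as uncorrelated with the variance that accounts for the shared c. All numbers are made up for illustration.

```python
n = 4
var_x = [1.0] * n   # variances of the n independent values x_i
var_c = 4.0         # variance of the single value c that is broadcast

# y_i = x_i + c: each element gets variance var_x[i] + var_c
var_y = [v + var_c for v in var_x]

# Treating the y_i as uncorrelated when computing mean(y):
naive_var_mean = sum(var_y) / n**2

# mean(y) = mean(x) + c, so c contributes its full variance:
correct_var_mean = sum(var_x) / n**2 + var_c

print(naive_var_mean)    # 1.25
print(correct_var_mean)  # 4.25
```

The contribution of c is divided by n in the naive result although it does not average out, which is why the two results agree only when var_c is negligible.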

Broadcasting#

Missing dimensions in the operands are automatically broadcast. Consider:

[6]:
var_xy = sc.Variable(dims=['x', 'y'], values=np.arange(6).reshape((2,3)))
print(var_xy.values)
[[0 1 2]
 [3 4 5]]
[7]:
var_y = sc.Variable(dims=['y'], values=np.arange(3))
print(var_y.values)
[0 1 2]
[8]:
var_xy -= var_y
print(var_xy.values)
[[0 0 0]
 [3 3 3]]

Since var_y does not depend on dimension 'x', it is treated as constant along that dimension. That is, the same var_y values are subtracted from every slice of var_xy along dimension 'x'.

Coming back to our original variables a and b, we see that broadcasting integrates seamlessly with slicing and transposing:

[9]:
a.values
[9]:
array([[0.78037389, 0.67378761, 0.48722647, 0.87390092],
       [0.01137485, 0.33032904, 0.21765289, 0.69099728]])
[10]:
a -= a['x', 1]
a.values
[10]:
array([[0.76899904, 0.34345857, 0.26957358, 0.18290364],
       [0.        , 0.        , 0.        , 0.        ]])

Both operands may be broadcast, creating an output with the combination of input dimensions:

[11]:
sc.show(a['x', 1])
sc.show(a['y', 1])
sc.show(a['x', 1] + a['y', 1])
dims=('y',), shape=(4,), unit=m, variances=True
[diagram: variances and values along y]
dims=('x',), shape=(2,), unit=m, variances=True
[diagram: variances and values along x]
dims=('y', 'x'), shape=(4, 2), unit=m, variances=True
[diagram: variances and values along y and x]

Note that in-place operations such as += never change the shape of the left-hand side. That is, only the right-hand-side operand can be broadcast, and the operation fails if a broadcast of the left-hand side would be required.
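The two rules can be written down as a pair of small helper functions that operate on dimension labels only. This is a sketch, not Scipp code: output_dims and can_apply_in_place are hypothetical names, the ordering of the output dimensions is inferred from the ('y', 'x') result shown above, and shape checks are left out.

```python
def output_dims(lhs_dims, rhs_dims):
    """Dims of lhs + rhs: the lhs dims, then any dims only rhs has."""
    return tuple(lhs_dims) + tuple(d for d in rhs_dims if d not in lhs_dims)


def can_apply_in_place(lhs_dims, rhs_dims):
    """lhs += rhs keeps the shape of lhs, so rhs must not add dims."""
    return set(rhs_dims) <= set(lhs_dims)


print(output_dims(('y',), ('x',)))             # ('y', 'x')
print(can_apply_in_place(('x', 'y'), ('y',)))  # True
print(can_apply_in_place(('y',), ('x', 'y')))  # False
```

The last call corresponds to something like var_y += var_xy, which would require broadcasting the left-hand side and is therefore rejected.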

Units#

Units are required to be compatible:

[12]:
try:
    a + b
except Exception as e:
    print(str(e))
Cannot add m and s.

Coordinate and name matching#

In operations between datasets, data items are paired by name. How items without a counterpart are handled depends on the kind of operation:

  • In-place operations such as += accept a right-hand-side operand that omits items that the left-hand-side has. If the right-hand-side contains items that are not in the left-hand-side the operation fails.

  • Non-in-place operations such as + return a new dataset with items from the intersection of the inputs.

Coords are compared in operations with datasets or data arrays (or items of datasets). Operations fail if there is any mismatch in coord or label values.

[13]:
d1 = sc.Dataset(
    data={'a': sc.Variable(dims=['x', 'y'], values=np.random.rand(2, 3)),
          'b': sc.Variable(dims=['y', 'x'], values=np.random.rand(3, 2)),
          'c': sc.Variable(dims=['x'], values=np.random.rand(2)),
          'd': sc.scalar(value=1.0)},
    coords={
        'x': sc.Variable(dims=['x'], values=np.arange(2.0), unit=sc.units.m),
        'y': sc.Variable(dims=['y'], values=np.arange(3.0), unit=sc.units.m)})
d2 = sc.Dataset(
    data={'a': sc.Variable(dims=['x', 'y'], values=np.random.rand(2, 3)),
          'b': sc.Variable(dims=['y', 'x'], values=np.random.rand(3, 2))},
    coords={
        'x': sc.Variable(dims=['x'], values=np.arange(2.0), unit=sc.units.m),
        'y': sc.Variable(dims=['y'], values=np.arange(3.0), unit=sc.units.m)})
[14]:
d1 += d2
[15]:
try:
    d2 += d1
except Exception as e:
    print(str(e))
"Expected 'c' in <scipp.Dataset.keys {a, b}>."
[16]:
d3 = d1 + d2
for name in d3:
    print(name)
a
b
[17]:
d3['a'] -= d3['b'] # transposing
d3['a'] -= d3['x', 1]['b'] # broadcasting
try:
    d3['a'] -= d3['x', 1:2]['b'] # fail due to coordinate mismatch
except Exception as e:
    print(str(e))
Mismatch in coordinate 'x' in operation 'subtract_equals':
(x: 2)    float64              [m]  [0, 1]
vs
(x: 1)    float64              [m]  [1]

Arithmetics#

The arithmetic operations +, -, *, and / and their in-place variants +=, -=, *=, and /= are available for variables, data arrays, and datasets. They can also be combined with slicing.

Trigonometrics#

Trigonometric functions such as sin are supported for variables. Units for angles provide a safeguard and ensure correct results whether the input is given in degrees or in radians:

[18]:
rad = 3.141593*sc.units.rad
deg = 180.0*sc.units.deg
print(sc.sin(rad))
print(sc.sin(deg))
try:
    rad + deg
except Exception as e:
    print(str(e))
<scipp.Variable> ()    float64  [dimensionless]  [-3.4641e-07]
<scipp.Variable> ()    float64  [dimensionless]  [1.22465e-16]
Cannot add rad and deg.
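The two sine values printed above differ because 3.141593 is only a six-decimal approximation of pi, whereas 180 degrees converts to pi up to floating-point rounding. The following cross-check uses only the Python standard library and does not involve Scipp:

```python
import math

# 3.141593 is pi rounded to six decimals, so it overshoots pi slightly
from_rounded_pi = math.sin(3.141593)

# 180 degrees converted to radians gives pi up to floating-point rounding
from_degrees = math.sin(math.radians(180.0))

print(from_rounded_pi)
print(from_degrees)
```

The first value is of order 1e-7 and negative, the second of order 1e-16, consistent with the Scipp output above.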

Other#

See the list of free functions for an overview of other available operations.