What’s new in scipp

This page highlights feature additions and discusses major changes from recent releases. For a full list of changes see the Release Notes.

[1]:
import numpy as np
import scipp as sc

General

Unique dimensions and slicing of 1-D objects

New in 0.9

The new dim property checks whether an object is 1-D, and returns the only dimension label. An exception is raised if the object is not 1-D.

Example:

[2]:
x = sc.linspace(dim='x', start=0, stop=1, num=4)
x.dim
[2]:
'x'

New in 0.11

1-D objects can now be sliced without specifying a dimension.

Example:

[3]:
x[-1]
[3]:
scipp.Variable (8 Bytes out of 32 Bytes)
    • ()
      float64
      1.0
      Values:
      array(1.)

If an object is not 1-D then DimensionError is raised:

[4]:
var2d = sc.concat([x,x], 'y')
var2d[0]
---------------------------------------------------------------------------
DimensionError                            Traceback (most recent call last)
/tmp/ipykernel_9954/2472674120.py in <module>
      1 var2d = sc.concat([x,x], 'y')
----> 2 var2d[0]

DimensionError: Slicing with implicit dimension label is only possible for 1-D objects. Got Sizes[y:2, x:4, ] with ndim=2. Provide an explicit dimension label, e.g., var['y', 0] instead of var[0].

Logging support

New in 0.9

Scipp now provides a logger, and a pre-configured logging widget for Jupyter notebooks. See Logging.

Bound method equivalents to many free functions

New in 0.8

Many functions that were previously available only as free functions can now also be used as methods of variables and data arrays. See the documentation for individual classes for a full list.

Example:

[5]:
var = sc.arange(dim="x", unit="m", start=0, stop=12)
var.sum()  # Previously sc.sum(var)
[5]:
scipp.Variable (8 Bytes)
    • ()
      int64
      m
      66
      Values:
      array(66)

Note that sc.sum(var) will continue to be supported as well.

Unified conversion of unit and dtype

New in 0.11

Variables and data arrays have a new method, to, for conversion of dtype, unit, or both. This can be used to replace uses of to_unit and astype.

Example:

[6]:
var = sc.arange(dim='x', start=0, stop=4, unit='m')
var
[6]:
scipp.Variable (32 Bytes)
    • (x: 4)
      int64
      m
      0, 1, 2, 3
      Values:
      array([0, 1, 2, 3])

Use the unit keyword argument to convert to a different unit:

[7]:
var.to(unit='mm')
[7]:
scipp.Variable (32 Bytes)
    • (x: 4)
      int64
      mm
      0, 1000, 2000, 3000
      Values:
      array([ 0, 1000, 2000, 3000])

Use the dtype keyword argument to convert to a different dtype:

[8]:
var.to(dtype='float64')
[8]:
scipp.Variable (32 Bytes)
    • (x: 4)
      float64
      m
      0.0, 1.0, 2.0, 3.0
      Values:
      array([0., 1., 2., 3.])

If both unit and dtype are provided, the implementation attempts to apply the two conversions in optimal order to reduce or avoid the effect of rounding/truncation errors:

[9]:
var.to(dtype='float64', unit='km')
[9]:
scipp.Variable (32 Bytes)
    • (x: 4)
      float64
      km
      0.0, 0.001, 0.002, 0.003
      Values:
      array([0. , 0.001, 0.002, 0.003])

Operations

Creation functions

New in 0.11

Creation functions for datetimes were added:

  • Added epoch, datetime and datetimes.

[10]:
sc.datetime('now', unit='ms')
[10]:
scipp.Variable (8 Bytes)
    • ()
      datetime64
      ms
      2022-01-13T14:43:55.000
      Values:
      array('2022-01-13T14:43:55.000', dtype='datetime64[ms]')
[11]:
times = sc.datetimes(dims=['time'], values=['2022-01-11T10:24:03', '2022-01-11T10:24:03'])
times
[11]:
scipp.Variable (16 Bytes)
    • (time: 2)
      datetime64
      s
      2022-01-11T10:24:03, 2022-01-11T10:24:03
      Values:
      array(['2022-01-11T10:24:03', '2022-01-11T10:24:03'], dtype='datetime64[s]')

The new epoch function is useful for obtaining the time since epoch, i.e., a time difference (dtype='int64') instead of a time point (dtype='datetime64'):

[12]:
times - sc.epoch(unit=times.unit)
[12]:
scipp.Variable (16 Bytes)
    • (time: 2)
      int64
      s
      1641896643, 1641896643
      Values:
      array([1641896643, 1641896643])

from_pandas and from_xarray

New in 0.8

  • from_pandas for converting pandas.DataFrame to scipp.Dataset.

  • from_xarray for converting xarray.DataArray or xarray.Dataset to scipp.DataArray or scipp.Dataset, respectively.

Both functions are available in the compat submodule.

Reduction operations

Internal precision in summation operations

New in 0.9

Reduction operations such as sum of single-precision (float32) data now use double-precision (float64) internally to reduce the effects of rounding errors.
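
The effect of the accumulator precision can be illustrated with plain NumPy. This is only a sketch of the general idea, not scipp's implementation: a float32 running sum (here via `np.cumsum`) is compared with a float64 accumulator over the same float32 input.

```python
import numpy as np

values = np.full(1_000_000, 0.1, dtype=np.float32)

# Running sum kept in float32: rounding errors build up with every addition.
sum32 = float(np.cumsum(values, dtype=np.float32)[-1])

# Same float32 input, but accumulated in float64.
sum64 = float(np.sum(values, dtype=np.float64))

# Reference: the float32 representation of 0.1 times the number of elements.
exact = 1_000_000 * float(np.float32(0.1))
err32 = abs(sum32 - exact)
err64 = abs(sum64 - exact)
```

With the float64 accumulator the error stays far below that of the float32 running sum, which is the motivation for using double precision internally.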

Reductions over multiple inputs using reduce

New in 0.9

The new reduce function can be used for reduction operations that do not operate along a dimension of a scipp object but rather across a list or tuple of multiple scipp objects. The mechanism is a 2-step approach, with a syntax similar to groupby:

[13]:
a = sc.linspace(dim="x", start=0.0, stop=1.0, num=4)
b = sc.linspace(dim="x", start=0.2, stop=0.8, num=4)
c = sc.linspace(dim="x", start=0.2, stop=1.2, num=4)
sc.reduce([a, b, c]).sum()
[13]:
scipp.Variable (32 Bytes)
    • (x: 4)
      float64
      0.4, 1.267, 2.133, 3.0
      Values:
      array([0.4 , 1.26666667, 2.13333333, 3. ])
[14]:
reducer = sc.reduce([a, b, c])
reducer.min()
[14]:
scipp.Variable (32 Bytes)
    • (x: 4)
      float64
      0.0, 0.333, 0.600, 0.8
      Values:
      array([0. , 0.33333333, 0.6 , 0.8 ])
[15]:
reducer.max()
[15]:
scipp.Variable (32 Bytes)
    • (x: 4)
      float64
      0.2, 0.533, 0.867, 1.2
      Values:
      array([0.2 , 0.53333333, 0.86666667, 1.2 ])

Shape operations

concat replacing concatenate

New in 0.9

concat replaces concatenate, which is now deprecated and will be removed in 0.10. It supports a list of inputs rather than just two inputs.

[16]:
a = sc.scalar(1.2)
b = sc.scalar(2.3)
c = sc.scalar(3.4)
sc.concat([a, b, c], "x")
[16]:
scipp.Variable (24 Bytes)
    • (x: 3)
      float64
      1.2, 2.3, 3.4
      Values:
      array([1.2, 2.3, 3.4])

Vectors and matrices

General

New in 0.11

scipp.spatial has been restructured and extended:

  • New data types for spatial transforms were added:

    • vector3 (renamed from vector3_float64)

    • rotation3 (3-D rotation defined using quaternion coefficients)

    • translation3 (translation in 3-D)

    • linear_transform3 (previously matrix_3_float64, 3-D linear transform with, e.g., rotation and scaling)

    • affine_transform3 (affine transform in 3-D, combination of a linear transform and a translation, defined using 4x4 matrix)

  • The scipp.spatial submodule was extended with a number of new creation functions, in particular for the new dtypes.

  • matrix and matrices for creating “matrices” have been deprecated. Use scipp.spatial.linear_transform and scipp.spatial.linear_transforms instead.

Note that the scipp.spatial subpackage must be imported explicitly:

[17]:
from scipp import spatial
linear = spatial.linear_transform(value=[[1,0,0],[0,2,0],[0,0,3]])
linear
[17]:
scipp.Variable (72 Bytes)
    • ()
      linear_transform3
      [[1. 0. 0.] [0. 2. 0.] [0. 0. 3.]]
      Values:
      array([[1., 0., 0.], [0., 2., 0.], [0., 0., 3.]])
[18]:
trans = spatial.translation(value=[1,2,3], unit='m')
trans
[18]:
scipp.Variable (24 Bytes)
    • ()
      translation3
      m
      [1. 2. 3.]
      Values:
      array([1., 2., 3.])

Multiplication can be used to combine the various transforms:

[19]:
linear * trans
[19]:
scipp.Variable (128 Bytes)
    • ()
      affine_transform3
      m
      [[1. 0. 0. 1.] [0. 2. 0. 4.] [0. 0. 3. 9.] [0. 0. 0. 1.]]
      Values:
      array([[1., 0., 0., 1.], [0., 2., 0., 4.], [0., 0., 3., 9.], [0., 0., 0., 1.]])

Note that in the case of affine_transform3 the unit refers to the translation part. A unit for the linear part is currently not supported.

Coordinate transformations

New in 0.8

The transform_coords function has been added (also available as method of data arrays and datasets). It is a tool for transforming one or more input coordinates into one or more output coordinates. It automatically handles:

  • Renaming of dimensions, if dimension-coordinates are transformed.

  • Change of coordinates to attributes to avoid interference of coordinates consumed by the transformation in follow-up operations.

  • Conversion of event-coordinates of binned data, if present.

See Coordinate transformations for a full description.

Physical constants

New in 0.8

The scipp.constants (in the style of scipy.constants) submodule was added, providing physical constants from CODATA 2018. For full details see the module’s documentation.

Examples:

[20]:
from scipp.constants import hbar, m_e, physical_constants
[21]:
hbar
[21]:
scipp.Variable (8 Bytes)
    • ()
      float64
      J*s
      1.0545718176461565e-34
      Values:
      array(1.05457182e-34)
[22]:
m_e
[22]:
scipp.Variable (8 Bytes)
    • ()
      float64
      kg
      9.1093837015e-31
      Values:
      array(9.1093837e-31)
[23]:
physical_constants("speed of light in vacuum")
[23]:
scipp.Variable (8 Bytes)
    • ()
      float64
      m/s
      299792458.0
      Values:
      array(2.99792458e+08)
[24]:
physical_constants("neutron mass", with_variance=True)
[24]:
scipp.Variable (16 Bytes)
    • ()
      float64
      kg
      1.67492749804e-27
      σ = 9.5e-37
      Values:
      array(1.6749275e-27)

      Variances (σ²):
      array(9.025e-73)
[25]:
import numpy as np

N = int(800)
data = sc.DataArray(
    data=sc.Variable(dims=["time"], values=100 + np.random.rand(N) * 10, unit="K"),
    coords={
        "x": sc.Variable(dims=["time"], unit="m", values=np.random.rand(N)),
        "y": sc.Variable(dims=["time"], unit="m", values=np.random.rand(N)),
        "time": sc.Variable(
            dims=["time"], values=(10000 * np.random.rand(N)).astype("datetime64[s]")
        ),
    },
)
binned = sc.bin(
    data,
    edges=[
        sc.linspace(dim="x", unit="m", start=0.0, stop=1.0, num=5),
        sc.linspace(dim="y", unit="m", start=0.0, stop=1.0, num=5),
    ],
)
binned
[25]:
scipp.DataArray (25.33 KB)
    • x: 4
    • y: 4
    • x
      (x [bin-edge])
      float64
      m
      0.0, 0.25, 0.5, 0.75, 1.0
      Values:
      array([0. , 0.25, 0.5 , 0.75, 1. ])
    • y
      (y [bin-edge])
      float64
      m
      0.0, 0.25, 0.5, 0.75, 1.0
      Values:
      array([0. , 0.25, 0.5 , 0.75, 1. ])
    • (x, y)
      DataArrayView
      binned data [len=48, len=48, ..., len=54, len=60]
      Values:
      [<scipp.DataArray> Dimensions: Sizes[time:48, ] Coordinates: time datetime64 [s] (time) [1970-01-01T02:08:29, 1970-01-01T01:14:35, ..., 1970-01-01T00:29:11, 1970-01-01T00:14:26] x float64 [m] (time) [0.185679, 0.154261, ..., 0.0481469, 0.248385] y float64 [m] (time) [0.157335, 0.0184621, ..., 0.0831641, 0.0758221] Data: float64 [K] (time) [104.196, 109.117, ..., 103.01, 102.546] , <scipp.DataArray> Dimensions: Sizes[time:48, ] Coordinates: time datetime64 [s] (time) [1970-01-01T02:24:02, 1970-01-01T01:59:13, ..., 1970-01-01T01:20:54, 1970-01-01T00:28:46] x float64 [m] (time) [0.185029, 0.226894, ..., 0.141988, 0.0169694] y float64 [m] (time) [0.297743, 0.252622, ..., 0.362635, 0.313435] Data: float64 [K] (time) [109.845, 108.114, ..., 101.075, 100.451] , ..., <scipp.DataArray> Dimensions: Sizes[time:54, ] Coordinates: time datetime64 [s] (time) [1970-01-01T01:25:35, 1970-01-01T00:11:23, ..., 1970-01-01T02:17:47, 1970-01-01T00:46:16] x float64 [m] (time) [0.862611, 0.961646, ..., 0.915319, 0.851259] y float64 [m] (time) [0.703346, 0.593832, ..., 0.510264, 0.650847] Data: float64 [K] (time) [104.647, 103.79, ..., 104.966, 103.122] , <scipp.DataArray> Dimensions: Sizes[time:60, ] Coordinates: time datetime64 [s] (time) [1970-01-01T01:26:06, 1970-01-01T02:21:58, ..., 1970-01-01T00:35:48, 1970-01-01T01:14:51] x float64 [m] (time) [0.863391, 0.891747, ..., 0.873954, 0.883454] y float64 [m] (time) [0.894954, 0.879536, ..., 0.764983, 0.798934] Data: float64 [K] (time) [103.117, 109.604, ..., 103.861, 107.711] ]
[26]:
sc.show(binned)
[Figure: output of sc.show(binned). A 4×4 array of bins over dims x and y; the underlying buffer has dim time (length 800) with data in K, coords x and y in m, and coord time in s; the bin-edge coords x and y have length 5 and unit m.]

The bins property provides the properties data, coords, masks, and attrs of the bins. These behave like the corresponding properties of a data array while retaining the binned structure, so they can be used for computations involving information available on a per-bin basis:

[27]:
binned.bins.coords["time"]
[27]:
scipp.Variable (6.50 KB)
    • (x: 4, y: 4)
      VariableView
      binned data [len=48, len=48, ..., len=54, len=60]
      Values:
      [<scipp.Variable> (time: 48) datetime64 [s] [1970-01-01T02:08:29, 1970-01-01T01:14:35, ..., 1970-01-01T00:29:11, 1970-01-01T00:14:26], <scipp.Variable> (time: 48) datetime64 [s] [1970-01-01T02:24:02, 1970-01-01T01:59:13, ..., 1970-01-01T01:20:54, 1970-01-01T00:28:46], ..., <scipp.Variable> (time: 54) datetime64 [s] [1970-01-01T01:25:35, 1970-01-01T00:11:23, ..., 1970-01-01T02:17:47, 1970-01-01T00:46:16], <scipp.Variable> (time: 60) datetime64 [s] [1970-01-01T01:26:06, 1970-01-01T02:21:58, ..., 1970-01-01T00:35:48, 1970-01-01T01:14:51]]
[28]:
sc.show(binned.bins.coords["time"])
[Figure: output of sc.show(binned.bins.coords["time"]). A 4×4 array of bins over dims x and y, with an underlying buffer of dim time (length 800) and unit s.]

We can use this in our example to correct for a hypothetical clock error that depends on the x-y bin:

[29]:
clock_correction = sc.array(
    dims=["x", "y"], unit="s", values=(100 * np.random.rand(4, 4)).astype("int64")
)
clock_correction
[29]:
scipp.Variable (128 Bytes)
    • (x: 4, y: 4)
      int64
      s
      23, 98, ..., 46, 61
      Values:
      array([[23, 98, 98, 24], [17, 13, 71, 92], [47, 42, 28, 2], [45, 16, 46, 61]])
[30]:
binned.bins.coords["time"] += clock_correction

The properties can also be used to add or delete metadata entries:

[31]:
del binned.bins.coords["x"]

SciPy compatibility layer

New in 0.11

A number of subpackages providing wrappers for a subset of functions from the corresponding packages in SciPy were added.

Please refer to the function documentation for working examples.

Performance

New in 0.9

  • sc.lookup(histogram, dim)[var] is now faster if histogram is very long and is integer-valued. This is relevant in a number of event-filtering operations.