What’s new in scipp#

This page highlights feature additions and discusses major changes from recent releases. For a full list of changes see the Release Notes.

[1]:
import numpy as np
import scipp as sc

General#

Unique dimensions and slicing of 1-D objects#

New in 0.9

The new dim property checks whether an object is 1-D, and returns the only dimension label. An exception is raised if the object is not 1-D.

Example:

[2]:
x = sc.linspace(dim='x', start=0, stop=1, num=4)
x.dim
[2]:
'x'

New in 0.11

1-D objects can now be sliced without specifying a dimension.

Example:

[3]:
x[-1]
[3]:
scipp.Variable (8 Bytes out of 32 Bytes)
    • ()
      float64
      𝟙
      1.0
      Values:
      array(1.)

If an object is not 1-D then DimensionError is raised:

[4]:
var2d = sc.concat([x,x], 'y')
var2d[0]
---------------------------------------------------------------------------
DimensionError                            Traceback (most recent call last)
Input In [4], in <cell line: 2>()
      1 var2d = sc.concat([x,x], 'y')
----> 2 var2d[0]

DimensionError: Slicing with implicit dimension label is only possible for 1-D objects. Got Sizes[y:2, x:4, ] with ndim=2. Provide an explicit dimension label, e.g., var['y', 0] instead of var[0].

Logging support#

New in 0.9

Scipp now provides a logger, and a pre-configured logging widget for Jupyter notebooks. See Logging.

Slicing with stride#

New in 0.12

Positional slicing (slicing with integer indices, as opposed to slicing with a label matching a coordinate value) now supports strides.

Negative strides are currently not supported.

Examples:

[5]:
y = sc.arange('y', 10)
y[::2]
[5]:
scipp.Variable (40 Bytes out of 80 Bytes)
    • (y: 5)
      int64
      𝟙
      0, 2, 4, 6, 8
      Values:
      array([0, 2, 4, 6, 8])
[6]:
x = sc.linspace('x', 0.0, 1.0, num=5)
da = sc.DataArray(sc.ones(dims=['x', 'y'], shape=[4,10], unit='K'), coords={'x':x, 'y':y})
da['y', 1::2]
[6]:
scipp.DataArray (240 Bytes out of 440 Bytes)
    • x: 4
    • y: 5
    • x
      (x [bin-edge])
      float64
      𝟙
      0.0, 0.25, 0.5, 0.75, 1.0
      Values:
      array([0. , 0.25, 0.5 , 0.75, 1. ])
    • y
      (y)
      int64
      𝟙
      1, 3, 5, 7, 9
      Values:
      array([1, 3, 5, 7, 9])
    • (x, y)
      float64
      K
      1.0, 1.0, ..., 1.0, 1.0
      Values:
      array([[1., 1., 1., 1., 1.], [1., 1., 1., 1., 1.], [1., 1., 1., 1., 1.], [1., 1., 1., 1., 1.]])

Slicing a dimension with a bin-edge coordinate with a stride is ill-defined and not supported:

[7]:
da['x', ::2]
---------------------------------------------------------------------------
SliceError                                Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 da['x', ::2]

SliceError: Object has bin-edges along dimension x so slicing with stride 2 != 1 is not valid.
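To see why a stride is ill-defined for bin-edges, consider the edges of `x` used above. This plain-Python sketch does not use scipp; it only illustrates that every second bin cannot be described by a single contiguous array of edges.

```python
# Bin edges of the 'x' coordinate from the example above (4 bins).
edges = [0.0, 0.25, 0.5, 0.75, 1.0]

# Each bin is a (left, right) pair of neighboring edges.
bins = list(zip(edges[:-1], edges[1:]))

# A stride of 2 would select bins 0 and 2.
selected = bins[::2]
print(selected)  # [(0.0, 0.25), (0.5, 0.75)]

# A bin-edge coordinate requires that each bin starts where the previous
# one ends. That is not the case here, so no edge array can represent it.
contiguous = all(
    left[1] == right[0] for left, right in zip(selected[:-1], selected[1:])
)
print(contiguous)  # False
```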

Unified conversion of unit and dtype#

New in 0.11

Variables and data arrays have a new method, to, for conversion of dtype, unit, or both. This can be used to replace uses of to_unit and astype.

Example:

[8]:
var = sc.arange(dim='x', start=0, stop=4, unit='m')
var
[8]:
scipp.Variable (32 Bytes)
    • (x: 4)
      int64
      m
      0, 1, 2, 3
      Values:
      array([0, 1, 2, 3])

Use the unit keyword argument to convert to a different unit:

[9]:
var.to(unit='mm')
[9]:
scipp.Variable (32 Bytes)
    • (x: 4)
      int64
      mm
      0, 1000, 2000, 3000
      Values:
      array([ 0, 1000, 2000, 3000])

Use the dtype keyword argument to convert to a different dtype:

[10]:
var.to(dtype='float64')
[10]:
scipp.Variable (32 Bytes)
    • (x: 4)
      float64
      m
      0.0, 1.0, 2.0, 3.0
      Values:
      array([0., 1., 2., 3.])

If both unit and dtype are provided, the implementation attempts to apply the two conversions in optimal order to reduce or avoid the effect of rounding/truncation errors:

[11]:
var.to(dtype='float64', unit='km')
[11]:
scipp.Variable (32 Bytes)
    • (x: 4)
      float64
      km
      0.0, 0.001, 0.002, 0.003
      Values:
      array([0. , 0.001, 0.002, 0.003])
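The order of the two conversions matters because integer arithmetic cannot represent fractions of the target unit. The sketch below uses plain Python rather than scipp, with integer floor division standing in for an integer unit conversion; this is an assumption for illustration, and scipp's actual handling of integer conversions may differ in detail.

```python
# Values in metres, stored as integers (as in the example above).
metres = [0, 1, 2, 3]

# Unit first (integer arithmetic), then dtype: the fraction is lost
# before the values become floating point.
unit_first = [float(m // 1000) for m in metres]

# Dtype first, then unit: the conversion happens in floating point.
dtype_first = [float(m) / 1000 for m in metres]

print(unit_first)   # [0.0, 0.0, 0.0, 0.0]
print(dtype_first)  # [0.0, 0.001, 0.002, 0.003]
```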

Support for unit=None#

New in 0.12

Previously scipp used unit=sc.units.dimensionless (or the alias unit=sc.units.one) for anything that does not have a unit, such as strings, booleans, or bins. To distinguish actual physically dimensionless quantities from these cases, scipp now supports variables and, by extension, data arrays that have their unit set to None.

This change is accompanied by a number of related changes:

  • Creation functions use a default unit if none is given explicitly. The default for numbers (floating point or integer) is sc.units.dimensionless. The default for everything else, including bool, is None.

  • Comparison operations, which return variables with dtype=bool, have unit=None.

  • A new function index was added to allow for creation of a 0-D variable with unit=None. This complements scalar, which uses the default unit (depending on the dtype).

Examples:

[12]:
print(sc.array(dims=['x'], values=[1.1,2.2,3.3]))
print(sc.array(dims=['x'], values=[1,2,3]))
print(sc.array(dims=['x'], values=[False, True, False]))
print(sc.array(dims=['x'], values=['a','b','c']))
<scipp.Variable> (x: 3)    float64  [dimensionless]  [1.1, 2.2, 3.3]
<scipp.Variable> (x: 3)      int64  [dimensionless]  [1, 2, 3]
<scipp.Variable> (x: 3)       bool           [None]  [False, True, False]
<scipp.Variable> (x: 3)     string           [None]  ["a", "b", "c"]
[13]:
a = sc.array(dims=['x'], values=[1,2,3])
b = sc.array(dims=['x'], values=[1,3,3])
print(a == b)
print(a < b)
<scipp.Variable> (x: 3)       bool           [None]  [True, False, True]
<scipp.Variable> (x: 3)       bool           [None]  [False, True, False]
[14]:
(a == b).unit is None
[14]:
True

For some purposes we may use a coordinate with unique integer-valued identifiers. Since the identifiers do not have a physical meaning, we use unit=None. Note that this has to be given explicitly since otherwise integers are treated as numbers, i.e., the unit would be dimensionless:

[15]:
da = sc.DataArray(a, coords={'id':sc.array(dims=['x'], unit=None, values=[34,21,14])})
da
[15]:
scipp.DataArray (48 Bytes)
    • x: 3
    • id
      (x)
      int64
      34, 21, 14
      Values:
      array([34, 21, 14])
    • (x)
      int64
      𝟙
      1, 2, 3
      Values:
      array([1, 2, 3])

The index function can now be used to conveniently look up data by its identifier:

[16]:
da['id', sc.index(21)]
[16]:
scipp.DataArray (16 Bytes out of 48 Bytes)
    • ()
      int64
      𝟙
      2
      Values:
      array(2)
    • id
      ()
      int64
      21
      Values:
      array(21)

Operations#

Creation functions#

New in 0.11

Creation functions for datetimes were added:

  • Added epoch, datetime and datetimes.

[17]:
sc.datetime('now', unit='ms')
[17]:
scipp.Variable (8 Bytes)
    • ()
      datetime64
      ms
      2022-07-19T08:21:11.000
      Values:
      array('2022-07-19T08:21:11.000', dtype='datetime64[ms]')
[18]:
times = sc.datetimes(dims=['time'], values=['2022-01-11T10:24:03', '2022-01-11T10:24:03'])
times
[18]:
scipp.Variable (16 Bytes)
    • (time: 2)
      datetime64
      s
      2022-01-11T10:24:03, 2022-01-11T10:24:03
      Values:
      array(['2022-01-11T10:24:03', '2022-01-11T10:24:03'], dtype='datetime64[s]')

The new epoch function is useful for obtaining the time since epoch, i.e., a time difference (dtype='int64') instead of a time point (dtype='datetime64'):

[19]:
times - sc.epoch(unit=times.unit)
[19]:
scipp.Variable (16 Bytes)
    • (time: 2)
      int64
      s
      1641896643, 1641896643
      Values:
      array([1641896643, 1641896643])
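The value above can be cross-checked with the Python standard library. This sketch assumes the datetime is interpreted as UTC and that the epoch is the Unix epoch (1970-01-01T00:00:00):

```python
from datetime import datetime, timezone

time_point = datetime(2022, 1, 11, 10, 24, 3, tzinfo=timezone.utc)
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Time since epoch as an integer number of seconds.
seconds = int((time_point - unix_epoch).total_seconds())
print(seconds)  # 1641896643
```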

New in 0.12

zeros_like, ones_like, empty_like, and full_like can now be used with data arrays.

Example:

[20]:
x = sc.linspace('x', 0.0, 1.0, num=5)
da = sc.DataArray(sc.ones(dims=['x', 'y'], shape=[4,6], unit='K'), coords={'x':x})
sc.zeros_like(da)
[20]:
scipp.DataArray (232 Bytes)
    • x: 4
    • y: 6
    • x
      (x [bin-edge])
      float64
      𝟙
      0.0, 0.25, 0.5, 0.75, 1.0
      Values:
      array([0. , 0.25, 0.5 , 0.75, 1. ])
    • (x, y)
      float64
      K
      0.0, 0.0, ..., 0.0, 0.0
      Values:
      array([[0., 0., 0., 0., 0., 0.], [0., 0., 0., 0., 0., 0.], [0., 0., 0., 0., 0., 0.], [0., 0., 0., 0., 0., 0.]])

Utility methods and functions#

New in 0.12

  • Added squeeze method to remove length-1 dimensions from objects.

  • Added rename method to rename dimensions and associated dimension-coordinates (or attributes). This complements rename_dims, which only changes dimension labels but does not rename coordinates.

  • Added midpoints to compute bin-centers.

Example:

[21]:
x = sc.linspace('x', 0.0, 1.0, num=5)
da = sc.DataArray(sc.ones(dims=['x', 'y'], shape=[4,6], unit='K'), coords={'x':x})

A length-1 x-dimension…

[22]:
da['x', 0:1]
[22]:
scipp.DataArray (64 Bytes out of 232 Bytes)
    • x: 1
    • y: 6
    • x
      (x [bin-edge])
      float64
      𝟙
      0.0, 0.25
      Values:
      array([0. , 0.25])
    • (x, y)
      float64
      K
      1.0, 1.0, ..., 1.0, 1.0
      Values:
      array([[1., 1., 1., 1., 1., 1.]])

… can be removed with squeeze:

[23]:
da['x', 0:1].squeeze()
[23]:
scipp.DataArray (64 Bytes out of 232 Bytes)
    • y: 6
    • (y)
      float64
      K
      1.0, 1.0, ..., 1.0, 1.0
      Values:
      array([1., 1., 1., 1., 1., 1.])
    • x
      (x)
      float64
      𝟙
      0.0, 0.25
      Values:
      array([0. , 0.25])

squeeze returns a new object and leaves the original unchanged.

Renaming is most convenient using keyword arguments:

[24]:
da.rename(x='xnew')
[24]:
scipp.DataArray (232 Bytes)
    • xnew: 4
    • y: 6
    • xnew
      (xnew [bin-edge])
      float64
      𝟙
      0.0, 0.25, 0.5, 0.75, 1.0
      Values:
      array([0. , 0.25, 0.5 , 0.75, 1. ])
    • (xnew, y)
      float64
      K
      1.0, 1.0, ..., 1.0, 1.0
      Values:
      array([[1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.]])

rename returns a new object and leaves the original unchanged.

midpoints can be used to replace a bin-edge coordinate by bin centers:

[25]:
da.coords['x'] = sc.midpoints(da.coords['x'])
da
[25]:
scipp.DataArray (224 Bytes)
    • x: 4
    • y: 6
    • x
      (x)
      float64
      𝟙
      0.125, 0.375, 0.625, 0.875
      Values:
      array([0.125, 0.375, 0.625, 0.875])
    • (x, y)
      float64
      K
      1.0, 1.0, ..., 1.0, 1.0
      Values:
      array([[1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.]])
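For a 1-D bin-edge coordinate, the bin centers shown above are the averages of neighboring edges. The following plain-Python sketch reproduces the values; it illustrates the arithmetic only and is not scipp's implementation.

```python
edges = [0.0, 0.25, 0.5, 0.75, 1.0]

# One center per bin: the mean of its left and right edge.
centers = [0.5 * (left + right) for left, right in zip(edges[:-1], edges[1:])]
print(centers)  # [0.125, 0.375, 0.625, 0.875]
```

Note that the result has one element fewer than the input, which is why the coordinate is no longer marked as bin-edge.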

Reduction operations#

Internal precision in summation operations#

New in 0.9

Reduction operations such as sum of single-precision (float32) data now use double-precision (float64) internally to reduce the effects of rounding errors.
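The effect of the accumulator precision can be illustrated with NumPy. This is a sketch of the general idea, not of scipp's internals: the same float32 data is accumulated once in float32 and once in float64.

```python
import numpy as np

n = 1_000_000
data = np.full(n, 0.1, dtype=np.float32)

# Running sum carried out entirely in single precision.
narrow = float(np.cumsum(data, dtype=np.float32)[-1])

# Same input, but accumulated in double precision.
wide = float(data.sum(dtype=np.float64))

# Reference: n times the float32 value actually stored in the array.
reference = n * float(data[0])

print(f'float32 accumulator error: {abs(narrow - reference):.3g}')
print(f'float64 accumulator error: {abs(wide - reference):.3g}')
```

The input data and its memory footprint are unchanged; only the intermediate sum uses the wider type.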

Reductions over multiple inputs using reduce#

New in 0.9

The new reduce function can be used for reduction operations that do not operate along a dimension of a scipp object but rather across a list or tuple of multiple scipp objects. The mechanism is a 2-step approach, with a syntax similar to groupby:

[26]:
a = sc.linspace(dim="x", start=0.0, stop=1.0, num=4)
b = sc.linspace(dim="x", start=0.2, stop=0.8, num=4)
c = sc.linspace(dim="x", start=0.2, stop=1.2, num=4)
sc.reduce([a, b, c]).sum()
[26]:
scipp.Variable (32 Bytes)
    • (x: 4)
      float64
      𝟙
      0.4, 1.267, 2.133, 3.0
      Values:
      array([0.4 , 1.26666667, 2.13333333, 3. ])
[27]:
reducer = sc.reduce([a, b, c])
reducer.min()
[27]:
scipp.Variable (32 Bytes)
    • (x: 4)
      float64
      𝟙
      0.0, 0.333, 0.600, 0.8
      Values:
      array([0. , 0.33333333, 0.6 , 0.8 ])
[28]:
reducer.max()
[28]:
scipp.Variable (32 Bytes)
    • (x: 4)
      float64
      𝟙
      0.2, 0.533, 0.867, 1.2
      Values:
      array([0.2 , 0.53333333, 0.86666667, 1.2 ])

Shape operations#

concat replacing concatenate#

New in 0.9

concat replaces concatenate, which is now deprecated and will be removed in 0.10. Unlike concatenate, it supports a list of inputs rather than just two inputs.

[29]:
a = sc.scalar(1.2)
b = sc.scalar(2.3)
c = sc.scalar(3.4)
sc.concat([a, b, c], "x")
[29]:
scipp.Variable (24 Bytes)
    • (x: 3)
      float64
      𝟙
      1.2, 2.3, 3.4
      Values:
      array([1.2, 2.3, 3.4])

fold supports size -1#

New in 0.12

fold now accepts up to one size (or shape) entry with value -1. This indicates that the size should be computed automatically based on the input size and other provided sizes.

Example:

[30]:
var = sc.arange('xyz', 2448)
var.fold('xyz', sizes={'x':4, 'y':4, 'z':-1})
[30]:
scipp.Variable (19.12 KB)
    • (x: 4, y: 4, z: 153)
      int64
      𝟙
      0, 1, ..., 2446, 2447
      Values:
      array([[[ 0, 1, 2, ..., 150, 151, 152], [ 153, 154, 155, ..., 303, 304, 305], [ 306, 307, 308, ..., 456, 457, 458], [ 459, 460, 461, ..., 609, 610, 611]], [[ 612, 613, 614, ..., 762, 763, 764], [ 765, 766, 767, ..., 915, 916, 917], [ 918, 919, 920, ..., 1068, 1069, 1070], [1071, 1072, 1073, ..., 1221, 1222, 1223]], [[1224, 1225, 1226, ..., 1374, 1375, 1376], [1377, 1378, 1379, ..., 1527, 1528, 1529], [1530, 1531, 1532, ..., 1680, 1681, 1682], [1683, 1684, 1685, ..., 1833, 1834, 1835]], [[1836, 1837, 1838, ..., 1986, 1987, 1988], [1989, 1990, 1991, ..., 2139, 2140, 2141], [2142, 2143, 2144, ..., 2292, 2293, 2294], [2295, 2296, 2297, ..., 2445, 2446, 2447]]])
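The size replacing -1 follows from dividing the input length by the product of the other sizes, here 2448 / (4 * 4) = 153. The helper below, `resolve_sizes`, is a hypothetical name used only for this illustration and is not part of scipp:

```python
from math import prod


def resolve_sizes(total, sizes):
    """Replace a single -1 entry by the size implied by `total`."""
    unknown = [dim for dim, size in sizes.items() if size == -1]
    if len(unknown) > 1:
        raise ValueError('At most one size may be -1.')
    known = prod(size for size in sizes.values() if size != -1)
    if not unknown:
        if known != total:
            raise ValueError('Sizes do not match the input size.')
        return dict(sizes)
    if total % known != 0:
        raise ValueError('Input size is not divisible by the given sizes.')
    return {**sizes, unknown[0]: total // known}


resolved = resolve_sizes(2448, {'x': 4, 'y': 4, 'z': -1})
print(resolved)  # {'x': 4, 'y': 4, 'z': 153}
```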

Vectors and matrices#

General#

New in 0.11

scipp.spatial has been restructured and extended:

  • New data types for spatial transforms were added:

    • vector3 (renamed from vector3_float64)

    • rotation3 (3-D rotation defined using quaternion coefficients)

    • translation3 (translation in 3-D)

    • linear_transform3 (previously matrix_3_float64, 3-D linear transform with, e.g., rotation and scaling)

    • affine_transform3 (affine transform in 3-D, combination of a linear transform and a translation, defined using 4x4 matrix)

  • The scipp.spatial submodule was extended with a number of new creation functions, in particular for the new dtypes.

  • matrix and matrices for creating “matrices” have been deprecated. Use scipp.spatial.linear_transform and scipp.spatial.linear_transforms instead.

Note that the scipp.spatial subpackage must be imported explicitly:

[31]:
from scipp import spatial
linear = spatial.linear_transform(value=[[1,0,0],[0,2,0],[0,0,3]])
linear
[31]:
scipp.Variable (72 Bytes)
    • ()
      linear_transform3
      𝟙
      [[1. 0. 0.] [0. 2. 0.] [0. 0. 3.]]
      Values:
      array([[1., 0., 0.], [0., 2., 0.], [0., 0., 3.]])
[32]:
trans = spatial.translation(value=[1,2,3], unit='m')
trans
[32]:
scipp.Variable (24 Bytes)
    • ()
      translation3
      m
      [1. 2. 3.]
      Values:
      array([1., 2., 3.])

Multiplication can be used to combine the various transforms:

[33]:
linear * trans
[33]:
scipp.Variable (128 Bytes)
    • ()
      affine_transform3
      m
      [[1. 0. 0. 1.] [0. 2. 0. 4.] [0. 0. 3. 9.] [0. 0. 0. 1.]]
      Values:
      array([[1., 0., 0., 1.], [0., 2., 0., 4.], [0., 0., 3., 9.], [0., 0., 0., 1.]])

Note that in the case of affine_transform3 the unit refers to the translation part. A unit for the linear part is currently not supported.
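The 4x4 matrix above is what one obtains by composing the two transforms in homogeneous coordinates. The NumPy sketch below reproduces it; `homogeneous` is a hypothetical helper written for this illustration, and the correspondence to scipp's internal representation is an assumption based on the output shown.

```python
import numpy as np


def homogeneous(linear=None, translation=None):
    """Build a 4x4 homogeneous matrix from a 3x3 linear part and a 3-vector."""
    matrix = np.eye(4)
    if linear is not None:
        matrix[:3, :3] = linear
    if translation is not None:
        matrix[:3, 3] = translation
    return matrix


linear = homogeneous(linear=np.diag([1.0, 2.0, 3.0]))
trans = homogeneous(translation=[1.0, 2.0, 3.0])

# linear * trans: the translation is applied first, then the linear transform,
# so the translation part of the result is scaled by the linear part.
combined = linear @ trans
print(combined)
```

The last column [1, 4, 9] is the translation [1, 2, 3] after scaling by the diagonal entries 1, 2, and 3.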

SciPy compatibility layer#

New in 0.11

A number of subpackages providing wrappers for a subset of functions from the corresponding packages in SciPy were added.

Please refer to the function documentation for working examples.

Performance#

New in 0.12

  • sc.bin() is now faster when binning or grouping into thousands of bins or more.