What’s new in scipp

This page highlights feature additions and discusses major changes from recent releases. For a full list of changes see the Release Notes.

[1]:
import numpy as np
import scipp as sc

General

Bound method equivalents to many free functions

New in 0.8

Many functions that were previously available only as free functions can now also be used as methods of variables and data arrays. See the documentation of the individual classes for a full list.

Example:

[2]:
var = sc.arange(dim='x', unit='m', start=0, stop=12)
var.sum()  # Previously sc.sum(var)
[2]:
scipp.Variable (8 Bytes)
    • ()
      int64
      m
      66
      Values:
      array(66)

Note that sc.sum(var) will continue to be supported as well.

Python-like shallow/deep copy mechanism

New in 0.7

The most significant change in the scipp 0.7 release is a fundamental rework of all scipp data structures (variables, data arrays, and datasets). These now behave mostly like nested Python objects, i.e., sub-objects are shared by default. Previously there was no sharing mechanism and scipp always made deep-copies. Some of the effects are exemplified in the following.
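
The difference between shared and deep-copied sub-objects can be illustrated without scipp. The following is a minimal sketch using plain Python containers and the standard copy module. The names buffer and record are illustrative only, and the snippet is an analogy for the sharing behavior, not scipp's implementation.

```python
import copy

# The outer dict holds a reference to the inner list, it does not own a copy.
buffer = [0, 1, 2, 3]
record = {"data": buffer}

shallow = copy.copy(record)    # new dict, same inner list
deep = copy.deepcopy(record)   # new dict, new inner list

buffer[0] = 666                # modify the shared sub-object in place

print(record["data"][0])       # sees the change
print(shallow["data"][0])      # sees the change as well
print(deep["data"][0])         # unaffected, owns its own list
```

Only the deep copy is isolated from later modifications of the shared list.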

Variables

For variables on their own, the new and old implementations mostly yield the same user experience. Previously, views of variables, such as those created when slicing a variable along a dimension, were of a different type – VariableView – which kept alive the original Variable. This asymmetry is now gone. Slices or other views of variables are now also of type Variable, and all views share ownership of the underlying data.

If a variable refers only to a section of the underlying data buffer this is now indicated in the HTML view in the title line as part of the size, here “16 Bytes out of 96 Bytes”. This allows for identification of “small” variables that keep alive potentially large buffers:

[3]:
var = sc.arange(dim='x', unit='m', start=0, stop=12)
var['x', 4:6]
[3]:
scipp.Variable (16 Bytes out of 96 Bytes)
    • (x: 2)
      int64
      m
      4, 5
      Values:
      array([4, 5])

To create a variable with sole ownership of a buffer, use the copy() method:

[4]:
var['x', 4:6].copy()
[4]:
scipp.Variable (16 Bytes)
    • (x: 2)
      int64
      m
      4, 5
      Values:
      array([4, 5])

By default, copy() returns a deep copy. Shallow copies can be made by specifying deep=False, which preserves shared ownership of underlying buffers:

[5]:
shallow_copy = var['x', 4:6].copy(deep=False)
shallow_copy
[5]:
scipp.Variable (16 Bytes out of 96 Bytes)
    • (x: 2)
      int64
      m
      4, 5
      Values:
      array([4, 5])
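
A loose standard-library analogy for small views that keep a larger buffer alive is a memoryview over an array.array. The sketch below is not scipp code. It assumes that items of type code "q" occupy 8 bytes, which holds on common platforms.

```python
from array import array

buf = array("q", range(12))          # 12 signed integers in one buffer
view = memoryview(buf)[4:6]          # two elements, shares memory with buf
owned = array("q", view.tolist())    # independent copy of the two elements

print(f"{view.nbytes} Bytes out of {len(buf) * buf.itemsize} Bytes")

buf[4] = -1                          # write through the original buffer
print(view[0])                       # the view observes the change
print(owned[0])                      # the copy does not
```

As with copy() on a sliced variable, the copied object no longer references the large buffer.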

Data arrays

The move away from the previous “always deep copy” mechanism avoids a number of critical issues. However, as a result of the new sharing mechanism extra care must now be taken in some cases, just like when working with any other Python library. Consider the following example, using the same variable for data and a coordinate:

[6]:
da = sc.DataArray(data=var, coords={'x': var})
da += 666 * sc.units.m
da
[6]:
scipp.DataArray (192 Bytes)
    • x: 12
    • x
      (x)
      int64
      m
      666, 667, ..., 676, 677
      Values:
      array([666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677])
    • (x)
      int64
      m
      666, 667, ..., 676, 677
      Values:
      array([666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677])

The modification unintentionally also affected the coordinate. However, if we think of data arrays and coordinate dicts as Python-like objects, the behavior should then not be surprising.

Note that the original var is also affected:

[7]:
var
[7]:
scipp.Variable (96 Bytes)
    • (x: 12)
      int64
      m
      666, 667, ..., 676, 677
      Values:
      array([666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677])

To avoid this, use copy(), e.g.:

[8]:
da = sc.DataArray(data=var.copy(), coords={'x': var.copy()})
da += 666 * sc.units.m
da
[8]:
scipp.DataArray (192 Bytes)
    • x: 12
    • x
      (x)
      int64
      m
      666, 667, ..., 676, 677
      Values:
      array([666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677])
    • (x)
      int64
      m
      1332, 1333, ..., 1342, 1343
      Values:
      array([1332, 1333, 1334, 1335, 1336, 1337, 1338, 1339, 1340, 1341, 1342, 1343])

Apart from the more standard and pythonic behavior, one advantage of this is that creating data arrays from variables can now be cheap, without incurring copies of potentially large objects.

A related change is the introduction of read-only flags. Consider the following attempt to modify the data via a slice:

[9]:
try:
    da['x', 0].data = var['x', 2]
except sc.DataArrayError as e:
    print(e)
Read-only flag is set, cannot set new data.

Since da['x',0] is itself a data array, assigning to the data property would repoint the data to whatever is given on the right-hand side. However, this would not affect da, and the attempt to change the data would silently do nothing, since the temporary da['x',0] disappears immediately. The read-only flag protects us from this.

To actually modify the slice, use __setitem__ instead:

[10]:
da['x', 0] = var['x', 2]

Read-only flags were also introduced for variables, meta-data dicts (coords, masks, and attrs properties), data arrays and datasets. The flags solve a number of conceptual issues and serve as a safeguard against hidden bugs.
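
Why a read-only flag on temporary slices is useful can be sketched in plain Python. The Array class below is a hypothetical stand-in written for this illustration only. It does not reflect how scipp implements its flags, and unlike scipp its slices do not share a buffer with the parent.

```python
class Array:
    """Hypothetical stand-in for a data array, for illustration only."""

    def __init__(self, data, readonly=False):
        self._data = data
        self._readonly = readonly

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, new_data):
        # Without this check, assigning to the data of a temporary slice
        # would succeed but leave the parent object untouched.
        if self._readonly:
            raise RuntimeError("Read-only flag is set, cannot set new data.")
        self._data = new_data

    def __getitem__(self, index):
        # A slice is a new, short-lived object, flagged read-only.
        return Array(self._data[index], readonly=True)

    def __setitem__(self, index, value):
        # Writes into the existing storage of the parent.
        self._data[index] = value


a = Array([1, 2, 3])
try:
    a[0].data = 9      # would only repoint the temporary slice
except RuntimeError as err:
    message = str(err)
a[0] = 9               # modifies the parent in place
print(message)
print(a.data)
```

The failed assignment surfaces a bug that would otherwise go unnoticed, while item assignment on the parent does what was intended.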

Datasets

Just as creating data arrays from variables is now cheap (without deep copies), inserting items into datasets does not incur potentially expensive deep copies:

[11]:
ds = sc.Dataset()
ds['a'] = da  # shallow copy

Note that while the buffers are shared, the meta-data dicts such as coords, masks, or attrs are not. Compare:

[12]:
ds['a'].attrs['attr'] = 1.2 * sc.units.m
'attr' in da.attrs  # the attrs *dict* is copied
[12]:
False

with

[13]:
da.coords['x'] *= -1
ds.coords['x']  # the coords *dict* is copied, but the 'x' coordinate references same buffer
[13]:
scipp.Variable (96 Bytes)
    • (x: 12)
      int64
      m
      -666, -667, ..., -676, -677
      Values:
      array([-666, -667, -668, -669, -670, -671, -672, -673, -674, -675, -676, -677])

Indexing

Ellipsis

New in 0.8

Indexing with ellipsis (...) is now supported. This can be used, e.g., to replace data in an existing object without re-pointing the underlying reference to the object given on the right-hand side.

Example

[14]:
var1 = sc.ones(dims=['x'], shape=[4])
var2 = var1 + var1
da = sc.DataArray(data=sc.zeros(dims=['x'], shape=[4]))
da.data = var1  # replace data variable
da.data[...] = var2  # assign to slice, copy into existing data variable
var1  # now holds values of var2
[14]:
scipp.Variable (32 Bytes)
    • (x: 4)
      float64
      2.0, 2.0, 2.0, 2.0
      Values:
      array([2., 2., 2., 2.])

Changing var2 has no effect on da.data:

[15]:
var2 += 2222.0
da
[15]:
scipp.DataArray (32 Bytes)
    • x: 4
    • (x)
      float64
      2.0, 2.0, 2.0, 2.0
      Values:
      array([2., 2., 2., 2.])
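
The distinction between re-pointing a reference and copying into existing data also exists for built-in Python lists, where full-slice assignment plays the role of [...]. The following is a plain-Python analogy with illustrative names, not scipp code.

```python
var1 = [1.0, 1.0, 1.0, 1.0]
var2 = [2.0, 2.0, 2.0, 2.0]

holder = {"data": [0.0] * 4}
holder["data"] = var1        # rebind: holder now references var1 itself
holder["data"][:] = var2     # copy the values of var2 into the existing list

var2[0] = 2222.0             # later changes to var2 are not visible
print(var1)
print(holder["data"] is var1)
```

After the slice assignment var1 holds the values that var2 had at that time, and holder still references var1 rather than var2.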

Label-based indexing

New in 0.5

Indexing based on coordinate values is now possible:

  • Works just like position indexing (with integers).

  • Use a scalar variable as index (instead of an integer) to use label-based indexing.

  • Works with single values as well as slices (: notation).

See Label-based indexing for more details.

Example

[16]:
da = sc.DataArray(data=sc.zeros(dims=['x', 'day'], shape=(4, 3)))
da.coords['x'] = sc.linspace(dim='x', unit='m', start=0.1, stop=0.2, num=5)
da.coords['day'] = sc.array(dims=['day'], values=[1, 7, 31])
[17]:
da['day', sc.scalar(7)]
[17]:
scipp.DataArray (80 Bytes out of 160 Bytes)
    • x: 4
    • x
      (x [bin-edge])
      float64
      m
      0.1, 0.12, 0.15, 0.18, 0.2
      Values:
      array([0.1 , 0.125, 0.15 , 0.175, 0.2 ])
    • (x)
      float64
      0.0, 0.0, 0.0, 0.0
      Values:
      array([0., 0., 0., 0.])
    • day
      ()
      int64
      7
      Values:
      array(7)
[18]:
da['x', 0.13 * sc.units.m]  # selects bin containing this value
[18]:
scipp.DataArray (64 Bytes out of 160 Bytes)
    • day: 3
    • day
      (day)
      int64
      1, 7, 31
      Values:
      array([ 1, 7, 31])
    • (day)
      float64
      0.0, 0.0, 0.0
      Values:
      array([0., 0., 0.])
    • x
      (x)
      float64
      m
      0.12, 0.15
      Values:
      array([0.125, 0.15 ])
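
The bin selected for 0.13 m in the last example can be reproduced with a binary search over the bin edges. This is a sketch using the standard bisect module. The helper bin_containing is hypothetical, and the convention that a bin includes its left edge but not its right edge is an assumption made for this illustration.

```python
from bisect import bisect_right

# Bin edges of the 'x' coordinate from the example above, in metres.
edges = [0.1, 0.125, 0.15, 0.175, 0.2]


def bin_containing(edges, value):
    """Return i such that edges[i] <= value < edges[i + 1]."""
    if not edges[0] <= value < edges[-1]:
        raise IndexError(f"{value} is outside the bin edges")
    return bisect_right(edges, value) - 1


i = bin_containing(edges, 0.13)
print(i, edges[i:i + 2])
```

The two edges returned here are the same pair that appears as the x coordinate of the selected slice.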

Support for datetime64

New in 0.6

  • Previously we stored time-related information such as sample-temperature logs as integers.

  • Added support for datetime64 compatible with np.datetime64.

  • Time differences (np.timedelta64) are not used. We simply use integers, since in combination with scipp’s units this provides everything we need.

Example:

[19]:
var = sc.array(dims=['time'],
               values=np.arange(np.datetime64('2021-01-01T12:00:00'),
                                np.datetime64('2021-01-01T12:04:00')))

Datetimes and integers with time units interoperate naturally. We can offset a datetime by adding a duration:

[20]:
var + 123 * sc.Unit('s')
[20]:
scipp.Variable (1.88 KB)
    • (time: 240)
      datetime64
      s
      2021-01-01T12:02:03, 2021-01-01T12:02:04, ..., 2021-01-01T12:06:01, 2021-01-01T12:06:02
      Values:
      array(['2021-01-01T12:02:03', '2021-01-01T12:02:04', '2021-01-01T12:02:05', '2021-01-01T12:02:06', '2021-01-01T12:02:07', '2021-01-01T12:02:08', '2021-01-01T12:02:09', '2021-01-01T12:02:10', '2021-01-01T12:02:11', '2021-01-01T12:02:12', '2021-01-01T12:02:13', '2021-01-01T12:02:14', '2021-01-01T12:02:15', '2021-01-01T12:02:16', '2021-01-01T12:02:17', '2021-01-01T12:02:18', '2021-01-01T12:02:19', '2021-01-01T12:02:20', '2021-01-01T12:02:21', '2021-01-01T12:02:22', '2021-01-01T12:02:23', '2021-01-01T12:02:24', '2021-01-01T12:02:25', '2021-01-01T12:02:26', '2021-01-01T12:02:27', '2021-01-01T12:02:28', '2021-01-01T12:02:29', '2021-01-01T12:02:30', '2021-01-01T12:02:31', '2021-01-01T12:02:32', '2021-01-01T12:02:33', '2021-01-01T12:02:34', '2021-01-01T12:02:35', '2021-01-01T12:02:36', '2021-01-01T12:02:37', '2021-01-01T12:02:38', '2021-01-01T12:02:39', '2021-01-01T12:02:40', '2021-01-01T12:02:41', '2021-01-01T12:02:42', '2021-01-01T12:02:43', '2021-01-01T12:02:44', '2021-01-01T12:02:45', '2021-01-01T12:02:46', '2021-01-01T12:02:47', '2021-01-01T12:02:48', '2021-01-01T12:02:49', '2021-01-01T12:02:50', '2021-01-01T12:02:51', '2021-01-01T12:02:52', '2021-01-01T12:02:53', '2021-01-01T12:02:54', '2021-01-01T12:02:55', '2021-01-01T12:02:56', '2021-01-01T12:02:57', '2021-01-01T12:02:58', '2021-01-01T12:02:59', '2021-01-01T12:03:00', '2021-01-01T12:03:01', '2021-01-01T12:03:02', '2021-01-01T12:03:03', '2021-01-01T12:03:04', '2021-01-01T12:03:05', '2021-01-01T12:03:06', '2021-01-01T12:03:07', '2021-01-01T12:03:08', '2021-01-01T12:03:09', '2021-01-01T12:03:10', '2021-01-01T12:03:11', '2021-01-01T12:03:12', '2021-01-01T12:03:13', '2021-01-01T12:03:14', '2021-01-01T12:03:15', '2021-01-01T12:03:16', '2021-01-01T12:03:17', '2021-01-01T12:03:18', '2021-01-01T12:03:19', '2021-01-01T12:03:20', '2021-01-01T12:03:21', '2021-01-01T12:03:22', '2021-01-01T12:03:23', '2021-01-01T12:03:24', '2021-01-01T12:03:25', '2021-01-01T12:03:26', '2021-01-01T12:03:27', '2021-01-01T12:03:28', 
'2021-01-01T12:03:29', '2021-01-01T12:03:30', '2021-01-01T12:03:31', '2021-01-01T12:03:32', '2021-01-01T12:03:33', '2021-01-01T12:03:34', '2021-01-01T12:03:35', '2021-01-01T12:03:36', '2021-01-01T12:03:37', '2021-01-01T12:03:38', '2021-01-01T12:03:39', '2021-01-01T12:03:40', '2021-01-01T12:03:41', '2021-01-01T12:03:42', '2021-01-01T12:03:43', '2021-01-01T12:03:44', '2021-01-01T12:03:45', '2021-01-01T12:03:46', '2021-01-01T12:03:47', '2021-01-01T12:03:48', '2021-01-01T12:03:49', '2021-01-01T12:03:50', '2021-01-01T12:03:51', '2021-01-01T12:03:52', '2021-01-01T12:03:53', '2021-01-01T12:03:54', '2021-01-01T12:03:55', '2021-01-01T12:03:56', '2021-01-01T12:03:57', '2021-01-01T12:03:58', '2021-01-01T12:03:59', '2021-01-01T12:04:00', '2021-01-01T12:04:01', '2021-01-01T12:04:02', '2021-01-01T12:04:03', '2021-01-01T12:04:04', '2021-01-01T12:04:05', '2021-01-01T12:04:06', '2021-01-01T12:04:07', '2021-01-01T12:04:08', '2021-01-01T12:04:09', '2021-01-01T12:04:10', '2021-01-01T12:04:11', '2021-01-01T12:04:12', '2021-01-01T12:04:13', '2021-01-01T12:04:14', '2021-01-01T12:04:15', '2021-01-01T12:04:16', '2021-01-01T12:04:17', '2021-01-01T12:04:18', '2021-01-01T12:04:19', '2021-01-01T12:04:20', '2021-01-01T12:04:21', '2021-01-01T12:04:22', '2021-01-01T12:04:23', '2021-01-01T12:04:24', '2021-01-01T12:04:25', '2021-01-01T12:04:26', '2021-01-01T12:04:27', '2021-01-01T12:04:28', '2021-01-01T12:04:29', '2021-01-01T12:04:30', '2021-01-01T12:04:31', '2021-01-01T12:04:32', '2021-01-01T12:04:33', '2021-01-01T12:04:34', '2021-01-01T12:04:35', '2021-01-01T12:04:36', '2021-01-01T12:04:37', '2021-01-01T12:04:38', '2021-01-01T12:04:39', '2021-01-01T12:04:40', '2021-01-01T12:04:41', '2021-01-01T12:04:42', '2021-01-01T12:04:43', '2021-01-01T12:04:44', '2021-01-01T12:04:45', '2021-01-01T12:04:46', '2021-01-01T12:04:47', '2021-01-01T12:04:48', '2021-01-01T12:04:49', '2021-01-01T12:04:50', '2021-01-01T12:04:51', '2021-01-01T12:04:52', '2021-01-01T12:04:53', '2021-01-01T12:04:54', 
'2021-01-01T12:04:55', '2021-01-01T12:04:56', '2021-01-01T12:04:57', '2021-01-01T12:04:58', '2021-01-01T12:04:59', '2021-01-01T12:05:00', '2021-01-01T12:05:01', '2021-01-01T12:05:02', '2021-01-01T12:05:03', '2021-01-01T12:05:04', '2021-01-01T12:05:05', '2021-01-01T12:05:06', '2021-01-01T12:05:07', '2021-01-01T12:05:08', '2021-01-01T12:05:09', '2021-01-01T12:05:10', '2021-01-01T12:05:11', '2021-01-01T12:05:12', '2021-01-01T12:05:13', '2021-01-01T12:05:14', '2021-01-01T12:05:15', '2021-01-01T12:05:16', '2021-01-01T12:05:17', '2021-01-01T12:05:18', '2021-01-01T12:05:19', '2021-01-01T12:05:20', '2021-01-01T12:05:21', '2021-01-01T12:05:22', '2021-01-01T12:05:23', '2021-01-01T12:05:24', '2021-01-01T12:05:25', '2021-01-01T12:05:26', '2021-01-01T12:05:27', '2021-01-01T12:05:28', '2021-01-01T12:05:29', '2021-01-01T12:05:30', '2021-01-01T12:05:31', '2021-01-01T12:05:32', '2021-01-01T12:05:33', '2021-01-01T12:05:34', '2021-01-01T12:05:35', '2021-01-01T12:05:36', '2021-01-01T12:05:37', '2021-01-01T12:05:38', '2021-01-01T12:05:39', '2021-01-01T12:05:40', '2021-01-01T12:05:41', '2021-01-01T12:05:42', '2021-01-01T12:05:43', '2021-01-01T12:05:44', '2021-01-01T12:05:45', '2021-01-01T12:05:46', '2021-01-01T12:05:47', '2021-01-01T12:05:48', '2021-01-01T12:05:49', '2021-01-01T12:05:50', '2021-01-01T12:05:51', '2021-01-01T12:05:52', '2021-01-01T12:05:53', '2021-01-01T12:05:54', '2021-01-01T12:05:55', '2021-01-01T12:05:56', '2021-01-01T12:05:57', '2021-01-01T12:05:58', '2021-01-01T12:05:59', '2021-01-01T12:06:00', '2021-01-01T12:06:01', '2021-01-01T12:06:02'], dtype='datetime64[s]')

Or subtract datetimes to obtain a duration:

[21]:
var['time', 10] - var['time', 0]
[21]:
scipp.Variable (8 Bytes)
    • ()
      int64
      s
      10
      Values:
      array(10)
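
The arithmetic in the two cells above can be cross-checked with the standard datetime module. This sketch uses datetime.timedelta where scipp uses plain integers with a time unit, so it mirrors the values only, not the scipp data model.

```python
from datetime import datetime, timedelta

start = datetime(2021, 1, 1, 12, 0, 0)
# 240 timestamps at one-second spacing, as in the example above.
times = [start + timedelta(seconds=s) for s in range(240)]

# Offset every timestamp by 123 seconds.
shifted = [t + timedelta(seconds=123) for t in times]
print(shifted[0].isoformat(), shifted[-1].isoformat())

# Difference between two timestamps, expressed as an integer number of seconds.
duration_s = int((times[10] - times[0]).total_seconds())
print(duration_s)
```

The first and last shifted timestamps and the 10 s difference agree with the outputs shown above.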

to_unit can be used to convert to a different precision:

[22]:
sc.to_unit(var, 'ms')
[22]:
scipp.Variable (1.88 KB)
    • (time: 240)
      datetime64
      ms
      2021-01-01T12:00:00.000, 2021-01-01T12:00:01.000, ..., 2021-01-01T12:03:58.000, 2021-01-01T12:03:59.000
      Values:
      array(['2021-01-01T12:00:00.000', '2021-01-01T12:00:01.000', '2021-01-01T12:00:02.000', '2021-01-01T12:00:03.000', '2021-01-01T12:00:04.000', '2021-01-01T12:00:05.000', '2021-01-01T12:00:06.000', '2021-01-01T12:00:07.000', '2021-01-01T12:00:08.000', '2021-01-01T12:00:09.000', '2021-01-01T12:00:10.000', '2021-01-01T12:00:11.000', '2021-01-01T12:00:12.000', '2021-01-01T12:00:13.000', '2021-01-01T12:00:14.000', '2021-01-01T12:00:15.000', '2021-01-01T12:00:16.000', '2021-01-01T12:00:17.000', '2021-01-01T12:00:18.000', '2021-01-01T12:00:19.000', '2021-01-01T12:00:20.000', '2021-01-01T12:00:21.000', '2021-01-01T12:00:22.000', '2021-01-01T12:00:23.000', '2021-01-01T12:00:24.000', '2021-01-01T12:00:25.000', '2021-01-01T12:00:26.000', '2021-01-01T12:00:27.000', '2021-01-01T12:00:28.000', '2021-01-01T12:00:29.000', '2021-01-01T12:00:30.000', '2021-01-01T12:00:31.000', '2021-01-01T12:00:32.000', '2021-01-01T12:00:33.000', '2021-01-01T12:00:34.000', '2021-01-01T12:00:35.000', '2021-01-01T12:00:36.000', '2021-01-01T12:00:37.000', '2021-01-01T12:00:38.000', '2021-01-01T12:00:39.000', '2021-01-01T12:00:40.000', '2021-01-01T12:00:41.000', '2021-01-01T12:00:42.000', '2021-01-01T12:00:43.000', '2021-01-01T12:00:44.000', '2021-01-01T12:00:45.000', '2021-01-01T12:00:46.000', '2021-01-01T12:00:47.000', '2021-01-01T12:00:48.000', '2021-01-01T12:00:49.000', '2021-01-01T12:00:50.000', '2021-01-01T12:00:51.000', '2021-01-01T12:00:52.000', '2021-01-01T12:00:53.000', '2021-01-01T12:00:54.000', '2021-01-01T12:00:55.000', '2021-01-01T12:00:56.000', '2021-01-01T12:00:57.000', '2021-01-01T12:00:58.000', '2021-01-01T12:00:59.000', '2021-01-01T12:01:00.000', '2021-01-01T12:01:01.000', '2021-01-01T12:01:02.000', '2021-01-01T12:01:03.000', '2021-01-01T12:01:04.000', '2021-01-01T12:01:05.000', '2021-01-01T12:01:06.000', '2021-01-01T12:01:07.000', '2021-01-01T12:01:08.000', '2021-01-01T12:01:09.000', '2021-01-01T12:01:10.000', '2021-01-01T12:01:11.000', '2021-01-01T12:01:12.000', 
'2021-01-01T12:01:13.000', '2021-01-01T12:01:14.000', '2021-01-01T12:01:15.000', '2021-01-01T12:01:16.000', '2021-01-01T12:01:17.000', '2021-01-01T12:01:18.000', '2021-01-01T12:01:19.000', '2021-01-01T12:01:20.000', '2021-01-01T12:01:21.000', '2021-01-01T12:01:22.000', '2021-01-01T12:01:23.000', '2021-01-01T12:01:24.000', '2021-01-01T12:01:25.000', '2021-01-01T12:01:26.000', '2021-01-01T12:01:27.000', '2021-01-01T12:01:28.000', '2021-01-01T12:01:29.000', '2021-01-01T12:01:30.000', '2021-01-01T12:01:31.000', '2021-01-01T12:01:32.000', '2021-01-01T12:01:33.000', '2021-01-01T12:01:34.000', '2021-01-01T12:01:35.000', '2021-01-01T12:01:36.000', '2021-01-01T12:01:37.000', '2021-01-01T12:01:38.000', '2021-01-01T12:01:39.000', '2021-01-01T12:01:40.000', '2021-01-01T12:01:41.000', '2021-01-01T12:01:42.000', '2021-01-01T12:01:43.000', '2021-01-01T12:01:44.000', '2021-01-01T12:01:45.000', '2021-01-01T12:01:46.000', '2021-01-01T12:01:47.000', '2021-01-01T12:01:48.000', '2021-01-01T12:01:49.000', '2021-01-01T12:01:50.000', '2021-01-01T12:01:51.000', '2021-01-01T12:01:52.000', '2021-01-01T12:01:53.000', '2021-01-01T12:01:54.000', '2021-01-01T12:01:55.000', '2021-01-01T12:01:56.000', '2021-01-01T12:01:57.000', '2021-01-01T12:01:58.000', '2021-01-01T12:01:59.000', '2021-01-01T12:02:00.000', '2021-01-01T12:02:01.000', '2021-01-01T12:02:02.000', '2021-01-01T12:02:03.000', '2021-01-01T12:02:04.000', '2021-01-01T12:02:05.000', '2021-01-01T12:02:06.000', '2021-01-01T12:02:07.000', '2021-01-01T12:02:08.000', '2021-01-01T12:02:09.000', '2021-01-01T12:02:10.000', '2021-01-01T12:02:11.000', '2021-01-01T12:02:12.000', '2021-01-01T12:02:13.000', '2021-01-01T12:02:14.000', '2021-01-01T12:02:15.000', '2021-01-01T12:02:16.000', '2021-01-01T12:02:17.000', '2021-01-01T12:02:18.000', '2021-01-01T12:02:19.000', '2021-01-01T12:02:20.000', '2021-01-01T12:02:21.000', '2021-01-01T12:02:22.000', '2021-01-01T12:02:23.000', '2021-01-01T12:02:24.000', '2021-01-01T12:02:25.000', '2021-01-01T12:02:26.000', 
'2021-01-01T12:02:27.000', '2021-01-01T12:02:28.000', '2021-01-01T12:02:29.000', '2021-01-01T12:02:30.000', '2021-01-01T12:02:31.000', '2021-01-01T12:02:32.000', '2021-01-01T12:02:33.000', '2021-01-01T12:02:34.000', '2021-01-01T12:02:35.000', '2021-01-01T12:02:36.000', '2021-01-01T12:02:37.000', '2021-01-01T12:02:38.000', '2021-01-01T12:02:39.000', '2021-01-01T12:02:40.000', '2021-01-01T12:02:41.000', '2021-01-01T12:02:42.000', '2021-01-01T12:02:43.000', '2021-01-01T12:02:44.000', '2021-01-01T12:02:45.000', '2021-01-01T12:02:46.000', '2021-01-01T12:02:47.000', '2021-01-01T12:02:48.000', '2021-01-01T12:02:49.000', '2021-01-01T12:02:50.000', '2021-01-01T12:02:51.000', '2021-01-01T12:02:52.000', '2021-01-01T12:02:53.000', '2021-01-01T12:02:54.000', '2021-01-01T12:02:55.000', '2021-01-01T12:02:56.000', '2021-01-01T12:02:57.000', '2021-01-01T12:02:58.000', '2021-01-01T12:02:59.000', '2021-01-01T12:03:00.000', '2021-01-01T12:03:01.000', '2021-01-01T12:03:02.000', '2021-01-01T12:03:03.000', '2021-01-01T12:03:04.000', '2021-01-01T12:03:05.000', '2021-01-01T12:03:06.000', '2021-01-01T12:03:07.000', '2021-01-01T12:03:08.000', '2021-01-01T12:03:09.000', '2021-01-01T12:03:10.000', '2021-01-01T12:03:11.000', '2021-01-01T12:03:12.000', '2021-01-01T12:03:13.000', '2021-01-01T12:03:14.000', '2021-01-01T12:03:15.000', '2021-01-01T12:03:16.000', '2021-01-01T12:03:17.000', '2021-01-01T12:03:18.000', '2021-01-01T12:03:19.000', '2021-01-01T12:03:20.000', '2021-01-01T12:03:21.000', '2021-01-01T12:03:22.000', '2021-01-01T12:03:23.000', '2021-01-01T12:03:24.000', '2021-01-01T12:03:25.000', '2021-01-01T12:03:26.000', '2021-01-01T12:03:27.000', '2021-01-01T12:03:28.000', '2021-01-01T12:03:29.000', '2021-01-01T12:03:30.000', '2021-01-01T12:03:31.000', '2021-01-01T12:03:32.000', '2021-01-01T12:03:33.000', '2021-01-01T12:03:34.000', '2021-01-01T12:03:35.000', '2021-01-01T12:03:36.000', '2021-01-01T12:03:37.000', '2021-01-01T12:03:38.000', '2021-01-01T12:03:39.000', '2021-01-01T12:03:40.000', 
'2021-01-01T12:03:41.000', '2021-01-01T12:03:42.000', '2021-01-01T12:03:43.000', '2021-01-01T12:03:44.000', '2021-01-01T12:03:45.000', '2021-01-01T12:03:46.000', '2021-01-01T12:03:47.000', '2021-01-01T12:03:48.000', '2021-01-01T12:03:49.000', '2021-01-01T12:03:50.000', '2021-01-01T12:03:51.000', '2021-01-01T12:03:52.000', '2021-01-01T12:03:53.000', '2021-01-01T12:03:54.000', '2021-01-01T12:03:55.000', '2021-01-01T12:03:56.000', '2021-01-01T12:03:57.000', '2021-01-01T12:03:58.000', '2021-01-01T12:03:59.000'], dtype='datetime64[ms]')

Operations

Creation functions

New in 0.5

For convenience and similarity to numpy we added functions that create variables. Our intention is to fully replace the need to use sc.Variable directly, but at this point this has not been rolled out to our documentation pages.

Examples:

[23]:
sc.array(dims=['x'], values=np.array([1, 2, 3]))
[23]:
scipp.Variable (24 Bytes)
    • (x: 3)
      int64
      1, 2, 3
      Values:
      array([1, 2, 3])
[24]:
sc.zeros(dims=['x'], shape=[3])
[24]:
scipp.Variable (24 Bytes)
    • (x: 3)
      float64
      0.0, 0.0, 0.0
      Values:
      array([0., 0., 0.])
[25]:
sc.scalar(17)
[25]:
scipp.Variable (8 Bytes)
    • ()
      int64
      17
      Values:
      array(17)

All of these also take keyword arguments. Note that creating scalars by multiplying with a unit is still supported:

[26]:
1.2 * sc.units.m
[26]:
scipp.Variable (8 Bytes)
    • ()
      float64
      m
      1.2
      Values:
      array(1.2)

New in 0.7

More creation functions were added:

  • Added zeros_like, ones_like, and empty_like.

  • Added linspace, logspace, geomspace, and arange.

New in 0.8

More creation functions were added:

  • Added full and full_like.

Unit conversion

New in 0.6

Conversions between different unit scales (not to be confused with conversions provided by scippneutron) are now supported. to_unit provides conversion of variables between, e.g., mm and m.

New in 0.7

  • to_unit can now avoid making a copy if the input already has the desired unit. This can be used as a cheap way to ensure inputs have expected units.

  • to_unit now also works for binned data, converting the unit of the underlying events in the bins.

New in 0.8

  • to_unit now has a copy argument. By default, copy=True and to_unit makes a copy even if the input already has the desired unit. For a cheap way to ensure inputs have expected units use copy=False to avoid copies if possible.

Example:

[27]:
var = sc.array(dims=['x'], unit='mm', values=[3.2, 5.4, 7.6])
m = sc.to_unit(var, 'm')
m
[27]:
scipp.Variable (24 Bytes)
    • (x: 3)
      float64
      m
      0.0, 0.01, 0.01
      Values:
      array([0.0032, 0.0054, 0.0076])

No copy is made if the input has the requested unit when we specify copy=False:

[28]:
sc.to_unit(m, 'm', copy=False)  # no copy
[28]:
scipp.Variable (24 Bytes)
    • (x: 3)
      float64
      m
      0.0, 0.01, 0.01
      Values:
      array([0.0032, 0.0054, 0.0076])

Conversions also work for more specialized units such as electron-volt:

[29]:
sc.to_unit(sc.scalar(1.0, unit='nJ'), unit='meV')
[29]:
scipp.Variable (8 Bytes)
    • ()
      float64
      meV
      6241509074460.764
      Values:
      array(6.24150907e+12)
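
The value above can be verified by hand, since one electron-volt is defined as exactly 1.602176634e-19 J. The following is a worked check in plain Python with illustrative variable names. It does not use scipp.

```python
JOULE_PER_EV = 1.602176634e-19   # exact by definition of the electron-volt

energy_nJ = 1.0
energy_J = energy_nJ * 1e-9          # nano-joule to joule
energy_eV = energy_J / JOULE_PER_EV  # joule to electron-volt
energy_meV = energy_eV * 1e3         # electron-volt to milli-electron-volt
print(energy_meV)
```

The result is about 6.2415e12 meV, in agreement with the output of to_unit.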

from_pandas and from_xarray

New in 0.8

  • from_pandas for converting pandas.DataFrame to scipp.Dataset.

  • from_xarray for converting xarray.DataArray or xarray.Dataset to scipp.DataArray or scipp.Dataset, respectively.

Both functions are available in the compat submodule.

Shape operations

fold and flatten

New in 0.6

fold and flatten, which are similar to numpy.reshape, have been added. In contrast to reshape, fold and flatten support data arrays and also handle meta data such as coords, masks, and attrs.

New in 0.7

  • fold now always returns views of data and all meta data instead of making deep copies.

  • flatten also preserves reshaped data as a view, but unlike fold the same is not true for meta data in general, since it may require duplication in the flatten operation.

Example:

[30]:
var = sc.ones(dims=['pixel'], shape=[100])
xy = sc.fold(var, dim='pixel', sizes={'x': 10, 'y': 10})
xy = sc.DataArray(data=xy,
                  coords={
                      'x': sc.array(dims=['x'], values=np.arange(10)),
                      'y': sc.array(dims=['y'], values=np.arange(10))
                  })
xy
[30]:
scipp.DataArray (960 Bytes)
    • x: 10
    • y: 10
    • x
      (x)
      int64
      0, 1, ..., 8, 9
      Values:
      array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    • y
      (y)
      int64
      0, 1, ..., 8, 9
      Values:
      array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    • (x, y)
      float64
      1.0, 1.0, ..., 1.0, 1.0
      Values:
      array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

Folding does not copy data or meta data, for example:

[31]:
xy['y', 4] *= 0.0  # affects var (scipp-0.7 and higher)
var.plot()

The reverse of fold is flatten:

[32]:
flat = sc.flatten(xy, to='pixel')
flat
[32]:
scipp.DataArray (2.34 KB)
    • pixel: 100
    • x
      (pixel)
      int64
      0, 0, ..., 9, 9
      Values:
      array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9])
    • y
      (pixel)
      int64
      0, 1, ..., 8, 9
      Values:
      array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    • (pixel)
      float64
      1.0, 1.0, ..., 1.0, 1.0
      Values:
      array([1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1.])
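
The pattern of zeros and the repeated coordinate values in the flattened output follow from row-major index arithmetic. The sketch below reproduces it with plain lists. The helper flat_index is hypothetical, and the layout (x as the outer dimension, y as the inner one) is inferred from the output above.

```python
nx, ny = 10, 10
pixels = [1.0] * (nx * ny)


def flat_index(x, y):
    """Row-major position of element (x, y) in the flat 'pixel' dimension."""
    return x * ny + y


# Equivalent of xy['y', 4] *= 0.0 acting on the shared flat buffer.
for x in range(nx):
    pixels[flat_index(x, 4)] = 0.0

# Coordinates after flattening: x repeats each value ny times, y cycles.
x_coord = [p // ny for p in range(nx * ny)]
y_coord = [p % ny for p in range(nx * ny)]

zeros = [p for p, value in enumerate(pixels) if value == 0.0]
print(zeros)
print(x_coord[:12], y_coord[:12])
```

This is also why flatten has to duplicate coordinate values: each of the 100 pixels needs its own x and y entry, whereas the folded data array stored only 10 of each.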

Flattening does not copy data, but meta data may get copied if values need to be duplicated by the operation:

[33]:
flat['pixel', 0] = 22  # modifies var (scipp-0.7 and higher)
var.plot()

Vectors and matrices

General

New in 0.7

Several improvements for working with (3-D position) vectors and (3-D rotation) matrices are part of this release:

  • Creation functions were added:

    • vector (a single vector)

    • vectors (array of vectors)

    • matrix (a single matrix)

    • matrices (array of matrices)

  • Direct creation and initialization of 2-D (or higher) arrays of matrices and vectors is now possible from numpy arrays.

  • The values property now returns a numpy array with ndim+1 (vectors) or ndim+2 (matrices) axes, with the inner 1 (vectors) or 2 (matrices) axes corresponding to the vector or matrix axes.

  • Vector or matrix elements can now be accessed and modified directly using the new fields property of variables. fields provides access to vector elements x, y, and z or matrix elements xx, xy, …, zz.

New in 0.8

The fields property can now be iterated and behaves similarly to a dict with fixed keys.

[34]:
sc.vector(value=[1, 2, 3])
[34]:
scipp.Variable (24 Bytes)
    • ()
      vector_3_float64
      [1. 2. 3.]
      Values:
      array([1., 2., 3.])
[35]:
vecs = sc.vectors(dims=['x'], unit='m', values=np.arange(12).reshape(4, 3))
vecs
[35]:
scipp.Variable (96 Bytes)
    • (x: 4)
      vector_3_float64
      m
      [0. 1. 2.], [3. 4. 5.], [6. 7. 8.], [ 9. 10. 11.]
      Values:
      array([[ 0., 1., 2.], [ 3., 4., 5.], [ 6., 7., 8.], [ 9., 10., 11.]])
[36]:
vecs.values
[36]:
array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.],
       [ 6.,  7.,  8.],
       [ 9., 10., 11.]])
[37]:
vecs.fields.y
[37]:
scipp.Variable (32 Bytes out of 96 Bytes)
    • (x: 4)
      float64
      m
      1.0, 4.0, 7.0, 10.0
      Values:
      array([ 1., 4., 7., 10.])
[38]:
vecs.fields.z += 0.666 * sc.units.m
vecs
[38]:
scipp.Variable (96 Bytes)
    • (x: 4)
      vector_3_float64
      m
      [0. 1. 2.666], [3. 4. 5.666], [6. 7. 8.666], [ 9. 10. 11.666]
      Values:
      array([[ 0. , 1. , 2.666], [ 3. , 4. , 5.666], [ 6. , 7. , 8.666], [ 9. , 10. , 11.666]])

New in 0.8

The cross function to compute the cross-product of vectors was added.

[39]:
sc.cross(vecs, vecs['x', 0])
[39]:
scipp.Variable (96 Bytes)
    • (x: 4)
      vector_3_float64
      m^2
      [0. 0. 0.], [ 4.998 -7.998 3. ], [ 9.996 -15.996 6. ], [ 14.994 -23.994 9. ]
      Values:
      array([[ 0. , 0. , 0. ], [ 4.998, -7.998, 3. ], [ 9.996, -15.996, 6. ], [ 14.994, -23.994, 9. ]])
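
The values above can be checked against the component formula of the cross product. The following is a plain-Python sketch in which the helper cross is written for this illustration and is not scipp's function. The input vectors are copied from the earlier output.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


vecs = [
    (0.0, 1.0, 2.666),
    (3.0, 4.0, 5.666),
    (6.0, 7.0, 8.666),
    (9.0, 10.0, 11.666),
]
result = [cross(v, vecs[0]) for v in vecs]
for r in result:
    print([round(c, 3) for c in r])
```

The first entry is the zero vector because it is the cross product of a vector with itself. The unit m^2 in the scipp output arises from multiplying two lengths.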

scipp.spatial.transform

New in 0.8

The scipp.spatial.transform (in the style of scipy.spatial.transform) submodule was added. This now provides:

  • from_rotvec to create rotation matrices from rotation vectors.

  • as_rotvec to convert rotation matrices into rotation vectors.

As an example, the following creates a rotation matrix for rotation around the x-axis by 30 degrees:

[40]:
from scipp.spatial.transform import from_rotvec

rot = from_rotvec(sc.vector(value=[30.0, 0, 0], unit='deg'))
rot
[40]:
scipp.Variable (72 Bytes)
    • ()
      matrix_3_float64
      [[ 1. 0. 0. ] [ 0. 0.8660254 -0.5 ] [ 0. 0.5 0.8660254]]
      Values:
      array([[ 1. , 0. , 0. ], [ 0. , 0.8660254, -0.5 ], [ 0. , 0.5 , 0.8660254]])
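
For intuition: a rotation vector points along the rotation axis, and its length is the rotation angle. The sketch below is a plain-Python illustration based on Rodrigues' rotation formula, not scipp's implementation; the helper name `rotvec_to_matrix` is hypothetical. For the 30 degree rotation around the x-axis it should reproduce the matrix shown above up to rounding.

```python
import math


def rotvec_to_matrix(rotvec_deg):
    """Rotation matrix for a rotation vector given in degrees."""
    norm = math.sqrt(sum(v * v for v in rotvec_deg))
    if norm == 0.0:
        # Zero rotation: identity matrix
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    angle = math.radians(norm)
    # Unit vector along the rotation axis
    kx, ky, kz = (v / norm for v in rotvec_deg)
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    # Rodrigues' formula: R = c*I + s*[k]x + (1 - c)*k*k^T
    return [
        [c + kx * kx * t, kx * ky * t - kz * s, kx * kz * t + ky * s],
        [ky * kx * t + kz * s, c + ky * ky * t, ky * kz * t - kx * s],
        [kz * kx * t - ky * s, kz * ky * t + kx * s, c + kz * kz * t],
    ]


rot_matrix = rotvec_to_matrix([30.0, 0.0, 0.0])
for row in rot_matrix:
    print([round(v, 7) for v in row])
```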

Coordinate transformations

New in 0.8

The transform_coords function has been added (also available as a method of data arrays and datasets). It is a tool for transforming one or more input coordinates into one or more output coordinates. It automatically handles:

  • Renaming of dimensions, if dimension-coordinates are transformed.

  • Conversion of coordinates consumed by the transformation into attributes, so that they do not interfere with follow-up operations.

  • Conversion of event-coordinates of binned data, if present.

See Coordinate transformations for a full description.
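
The bookkeeping described above can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not scipp's implementation or signature: the function `transform`, the `graph` layout (output name mapped to a function whose parameter names are the input coordinates), and the example coordinates are all assumptions made for illustration. Renaming of dimensions and handling of binned data are left out.

```python
import inspect


def transform(coords, targets, graph):
    """Compute `targets` from `coords`; return (remaining, consumed) coordinates."""
    coords = dict(coords)
    consumed = {}

    def resolve(name):
        if name in coords:
            return coords[name]
        func = graph[name]
        # The parameter names of the function name the inputs it needs
        inputs = list(inspect.signature(func).parameters)
        args = {key: resolve(key) for key in inputs}
        coords[name] = func(**args)
        # Inputs used by this step are set aside, similar to becoming attributes
        for key in inputs:
            consumed[key] = coords[key]
        return coords[name]

    for target in targets:
        resolve(target)
    remaining = {k: v for k, v in coords.items() if k not in consumed}
    return remaining, consumed


graph = {'speed': lambda distance, time: distance / time}
remaining, consumed = transform({'distance': 10.0, 'time': 4.0}, ['speed'], graph)
print(remaining)
print(consumed)
```

In this toy model the new `speed` entry is the only remaining coordinate, while `distance` and `time` are kept separately so they cannot interfere with later operations.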

Physical constants

New in 0.8

The scipp.constants submodule (in the style of scipy.constants) was added, providing physical constants from CODATA 2018. For full details see the module’s documentation.

Examples:

[41]:
from scipp.constants import hbar, m_e, physical_constants
[42]:
hbar
[42]:
scipp.Variable (8 Bytes)
    • ()
      float64
      J*s
      1.0545718176461565e-34
      Values:
      array(1.05457182e-34)
[43]:
m_e
[43]:
scipp.Variable (8 Bytes)
    • ()
      float64
      kg
      9.1093837015e-31
      Values:
      array(9.1093837e-31)
[44]:
physical_constants('speed of light in vacuum')
[44]:
scipp.Variable (8 Bytes)
    • ()
      float64
      m/s
      299792458.0
      Values:
      array(2.99792458e+08)
[45]:
physical_constants('neutron mass', with_variance=True)
[45]:
scipp.Variable (16 Bytes)
    • ()
      float64
      kg
      1.67492749804e-27
      σ = 9.5e-37
      Values:
      array(1.6749275e-27)

      Variances (σ²):
      array(9.025e-73)
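
The output above lists both a standard deviation (σ) and a variance (σ²). The following plain-Python sketch, which only uses numbers copied from the outputs above and does not call scipp, shows how the two are related. As a second consistency check it derives the Planck constant from the reduced Planck constant.

```python
import math

# Values copied from the outputs above
hbar = 1.0545718176461565e-34  # J*s
neutron_mass_variance = 9.025e-73  # kg^2

# The standard deviation is the square root of the variance
sigma = math.sqrt(neutron_mass_variance)

# The Planck constant follows from the reduced one: h = 2*pi*hbar
h = 2 * math.pi * hbar
print(sigma, h)
```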

Plotting

New in 0.7

  • Plots now provide a redraw() method for updating an existing plot with new data, without recreating the plot.

New in 0.8

  • Plotting 1-D binned (event) data is now supported.

Binned data

Buffer and meta data access

New in 0.7

  • The internal buffer holding the “events” underlying binned data can now be accessed directly using the new events property. Update: This is deprecated as of 0.8.2.

  • The HTML view now works for binned meta data access such as binned.bins.coords['time'].

New in 0.8

The mean of bins can now be computed using binned.bins.mean(). This should generally be used instead of binned.bins.sum() if the data is not “summable”, i.e., typically anything that is not of unit “counts”.
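
To see why the distinction matters, here is a minimal plain-Python sketch that does not use scipp. The event values, the bin edges, and the helper `bin_index` are made up for illustration. Summing temperatures within a bin yields a number that mostly reflects how many events fell into the bin, whereas the mean is a meaningful per-bin temperature.

```python
from collections import defaultdict

# Hypothetical events: (x position in m, temperature in K)
events = [(0.1, 100.0), (0.2, 104.0), (0.6, 102.0), (0.7, 106.0), (0.9, 110.0)]
edges = [0.0, 0.5, 1.0]  # two bins along x


def bin_index(x, edges):
    """Index of the bin [edges[i], edges[i+1]) containing x."""
    for i in range(len(edges) - 1):
        if edges[i] <= x < edges[i + 1]:
            return i
    raise ValueError(f'{x} is outside the bin edges')


bins = defaultdict(list)
for x, temperature in events:
    bins[bin_index(x, edges)].append(temperature)

# The sum grows with the number of events per bin, the mean does not
sums = {i: sum(t) for i, t in sorted(bins.items())}
means = {i: sum(t) / len(t) for i, t in sorted(bins.items())}
print(sums)
print(means)
```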

Consider the following example, representing a time series of temperature measurements on an x-y plane:

[46]:
import numpy as np

N = int(800)
data = sc.DataArray(
    data=sc.Variable(dims=['time'], values=100 + np.random.rand(N) * 10, unit='K'),
    coords={
        'x': sc.Variable(dims=['time'], unit='m', values=np.random.rand(N)),
        'y': sc.Variable(dims=['time'], unit='m', values=np.random.rand(N)),
        'time': sc.Variable(dims=['time'], values=(10000 * np.random.rand(N)).astype('datetime64[s]')),
    })
binned = sc.bin(data,
                edges=[sc.linspace(dim='x', unit='m', start=0.0, stop=1.0, num=5),
                       sc.linspace(dim='y', unit='m', start=0.0, stop=1.0, num=5)])
binned
[46]:
scipp.DataArray (25.33 KB)
    • x: 4
    • y: 4
    • x
      (x [bin-edge])
      float64
      m
      0.0, 0.25, 0.5, 0.75, 1.0
      Values:
      array([0. , 0.25, 0.5 , 0.75, 1. ])
    • y
      (y [bin-edge])
      float64
      m
      0.0, 0.25, 0.5, 0.75, 1.0
      Values:
      array([0. , 0.25, 0.5 , 0.75, 1. ])
    • (x, y)
      DataArrayView
      binned data [len=52, len=50, ..., len=59, len=44]
      Values:
      [<scipp.DataArray> Dimensions: Sizes[time:52, ] Coordinates: time datetime64 [s] (time) [1970-01-01T00:35:50, 1970-01-01T02:08:26, ..., 1970-01-01T00:28:55, 1970-01-01T02:42:30] x float64 [m] (time) [0.186080, 0.054793, ..., 0.146708, 0.072271] y float64 [m] (time) [0.022003, 0.075396, ..., 0.115558, 0.210394] Data: float64 [K] (time) [109.794155, 100.553758, ..., 107.340572, 106.902720] , <scipp.DataArray> Dimensions: Sizes[time:50, ] Coordinates: time datetime64 [s] (time) [1970-01-01T02:11:44, 1970-01-01T01:12:52, ..., 1970-01-01T01:15:14, 1970-01-01T02:34:09] x float64 [m] (time) [0.081474, 0.210229, ..., 0.010766, 0.119580] y float64 [m] (time) [0.253279, 0.276667, ..., 0.378480, 0.439987] Data: float64 [K] (time) [101.748073, 108.030339, ..., 109.531299, 101.435841] , ..., <scipp.DataArray> Dimensions: Sizes[time:59, ] Coordinates: time datetime64 [s] (time) [1970-01-01T01:51:18, 1970-01-01T02:41:37, ..., 1970-01-01T02:43:02, 1970-01-01T00:30:26] x float64 [m] (time) [0.836701, 0.788889, ..., 0.971129, 0.850548] y float64 [m] (time) [0.514702, 0.707257, ..., 0.594856, 0.694594] Data: float64 [K] (time) [109.032462, 101.496697, ..., 104.652373, 103.083986] , <scipp.DataArray> Dimensions: Sizes[time:44, ] Coordinates: time datetime64 [s] (time) [1970-01-01T00:19:50, 1970-01-01T01:37:06, ..., 1970-01-01T00:21:42, 1970-01-01T00:28:06] x float64 [m] (time) [0.864923, 0.793226, ..., 0.900084, 0.862296] y float64 [m] (time) [0.851275, 0.791152, ..., 0.912784, 0.840306] Data: float64 [K] (time) [106.664423, 109.135451, ..., 107.185179, 105.569680] ]
[47]:
sc.show(binned)
[Figure: sc.show diagram of binned. A dimensionless (x, y) variable of shape [4, 4] references an event buffer with dims=['time'] and shape=[800], holding data in K and the event coordinates time (s), y (m), and x (m). The bin-edge coordinates x and y each have shape [5] and unit m.]

To allow for this, the bins property provides properties data, coords, masks, and attrs of the bins that behave like the properties of a data array while retaining the binned structure. That is, it can be used for computation involving information available on a per-bin basis:

[48]:
binned.bins.coords['time']
[48]:
scipp.Variable (6.50 KB)
    • (x: 4, y: 4)
      VariableView
      binned data [len=52, len=50, ..., len=59, len=44]
      Values:
      [<scipp.Variable> (time: 52) datetime64 [s] [1970-01-01T00:35:50, 1970-01-01T02:08:26, ..., 1970-01-01T00:28:55, 1970-01-01T02:42:30], <scipp.Variable> (time: 50) datetime64 [s] [1970-01-01T02:11:44, 1970-01-01T01:12:52, ..., 1970-01-01T01:15:14, 1970-01-01T02:34:09], ..., <scipp.Variable> (time: 59) datetime64 [s] [1970-01-01T01:51:18, 1970-01-01T02:41:37, ..., 1970-01-01T02:43:02, 1970-01-01T00:30:26], <scipp.Variable> (time: 44) datetime64 [s] [1970-01-01T00:19:50, 1970-01-01T01:37:06, ..., 1970-01-01T00:21:42, 1970-01-01T00:28:06]]
[49]:
sc.show(binned.bins.coords['time'])
[Figure: sc.show diagram of binned.bins.coords['time']. A dimensionless (x, y) variable of shape [4, 4] references a buffer with dims=['time'], shape=[800], and unit s.]

We can use this in our example to correct for a hypothetical clock error that depends on the x-y bin:

[50]:
clock_correction = sc.array(dims=['x', 'y'], unit='s', values=(100 * np.random.rand(4, 4)).astype('int64'))
clock_correction
[50]:
scipp.Variable (128 Bytes)
    • (x: 4, y: 4)
      int64
      s
      97, 82, ..., 95, 82
      Values:
      array([[97, 82, 1, 56], [69, 24, 28, 9], [35, 4, 46, 69], [70, 76, 95, 82]])
[51]:
binned.bins.coords['time'] += clock_correction
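
Conceptually, the in-place addition above applies each bin's correction to every event inside that bin. A minimal plain-Python sketch of this broadcast follows; it does not use scipp, and the dictionary layout, the bin keys, and the event times are assumptions made for illustration.

```python
# Hypothetical binned data: each (x, y) bin holds the times (in s) of its events
event_times = {
    (0, 0): [2150, 7706, 1735],
    (0, 1): [7904, 4372],
}
# One clock correction (in s) per bin
clock_correction = {(0, 0): 97, (0, 1): 82}

# In-place update: the bin's correction is added to every event in that bin
for key, times in event_times.items():
    times[:] = [t + clock_correction[key] for t in times]

print(event_times)
```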

The properties can also be used to add or delete meta data entries:

[52]:
del binned.bins.coords['x']

Performance

New in 0.7

  • sort is now considerably faster for data with many rows.

  • Reduction operations such as sum and mean are now multi-threaded and thus considerably faster.