What’s new in scipp¶
This page highlights feature additions and discusses major changes from recent releases. For a full list of changes see the Release Notes.
[1]:
import numpy as np
import scipp as sc
General¶
Bound method equivalents to many free functions¶
New in 0.8
Many functions that have been available as free functions can now also be used as methods of variables and data arrays. See the documentation for individual classes for a full list.
Example:
[2]:
var = sc.arange(dim='x', unit='m', start=0, stop=12)
var.sum() # Previously sc.sum(var)
[2]:
- ()int64m66
Values:
array(66)
Note that sc.sum(var) will continue to be supported as well.
Python-like shallow/deep copy mechanism¶
New in 0.7
The most significant change in the scipp 0.7 release is a fundamental rework of all scipp data structures (variables, data arrays, and datasets). These now behave mostly like nested Python objects, i.e., sub-objects are shared by default. Previously there was no sharing mechanism and scipp always made deep-copies. Some of the effects are exemplified in the following.
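The comparison with nested Python objects can be illustrated without scipp. The following is a minimal sketch using only the standard library; the dict layout and the names `buffer`, `da`, `shallow`, and `deep` are illustrative assumptions, not scipp's data structures:

```python
import copy

# A nested Python object standing in for a data array: a dict holding a
# "data" buffer and a "coords" dict. The layout is illustrative only.
buffer = [0, 1, 2, 3]
da = {"data": buffer, "coords": {"x": buffer}}

shallow = copy.copy(da)     # new outer dict, same inner objects
deep = copy.deepcopy(da)    # new outer dict, new inner objects

buffer[0] = 666             # modify the shared buffer in place

shared_with_shallow = shallow["data"][0]   # sees the modification
independent_in_deep = deep["data"][0]      # keeps the old value
print(shared_with_shallow, independent_in_deep)
```

The shallow copy observes the in-place change because it references the same list, while the deep copy does not. This is the behavior to keep in mind for the scipp examples in this section.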
Variables¶
For variables on their own, the new and old implementations mostly yield the same user experience. Previously, views of variables, such as those created when slicing a variable along a dimension, returned a different type, VariableView, which kept the original Variable alive. This asymmetry is now gone. Slices or other views of variables are now also of type Variable, and all views share ownership of the underlying data.
If a variable refers only to a section of the underlying data buffer this is now indicated in the HTML view in the title line as part of the size, here “16 Bytes out of 96 Bytes”. This allows for identification of “small” variables that keep alive potentially large buffers:
[3]:
var = sc.arange(dim='x', unit='m', start=0, stop=12)
var['x', 4:6]
[3]:
- (x: 2)int64m4, 5
Values:
array([4, 5])
To create a variable with sole ownership of a buffer, use the copy() method:
[4]:
var['x', 4:6].copy()
[4]:
- (x: 2)int64m4, 5
Values:
array([4, 5])
By default, copy() returns a deep copy. Shallow copies can be made by specifying deep=False, which preserves shared ownership of underlying buffers:
[5]:
shallow_copy = var['x', 4:6].copy(deep=False)
shallow_copy
[5]:
- (x: 2)int64m4, 5
Values:
array([4, 5])
Data arrays¶
The move away from the previous “always deep copy” mechanism avoids a number of critical issues. However, as a result of the new sharing mechanism extra care must now be taken in some cases, just like when working with any other Python library. Consider the following example, using the same variable for data and a coordinate:
[6]:
da = sc.DataArray(data=var, coords={'x': var})
da += 666 * sc.units.m
da
[6]:
- x: 12
- x(x)int64m666, 667, ..., 676, 677
Values:
array([666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677])
- (x)int64m666, 667, ..., 676, 677
Values:
array([666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677])
The modification unintentionally also affected the coordinate. However, if we think of data arrays and coordinate dicts as Python-like objects, the behavior should then not be surprising.
Note that the original var is also affected:
[7]:
var
[7]:
- (x: 12)int64m666, 667, ..., 676, 677
Values:
array([666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677])
To avoid this, use copy(), e.g.:
[8]:
da = sc.DataArray(data=var.copy(), coords={'x': var.copy()})
da += 666 * sc.units.m
da
[8]:
- x: 12
- x(x)int64m666, 667, ..., 676, 677
Values:
array([666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677])
- (x)int64m1332, 1333, ..., 1342, 1343
Values:
array([1332, 1333, 1334, 1335, 1336, 1337, 1338, 1339, 1340, 1341, 1342, 1343])
Apart from the more standard and pythonic behavior, one advantage of this is that creating data arrays from variables can now be cheap, without incurring copies of potentially large objects.
A related change is the introduction of read-only flags. Consider the following attempt to modify the data via a slice:
[9]:
try:
    da['x', 0].data = var['x', 2]
except sc.DataArrayError as e:
    print(e)
Read-only flag is set, cannot set new data.
Since da['x', 0] is itself a data array, assigning to the data property would repoint the data to whatever is given on the right-hand side. However, this would not affect da, and the attempt to change the data would silently do nothing, since the temporary da['x', 0] disappears immediately. The read-only flag protects us from this.
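The pitfall that the read-only flag guards against exists in plain Python as well. The sketch below uses two hypothetical classes, `Container` and `Slice`, which are stand-ins invented for illustration and do not reflect scipp's implementation. It shows that assigning to a property of a temporary object leaves the parent untouched, whereas item assignment modifies it:

```python
class Slice:
    """Hypothetical stand-in for the temporary object returned by slicing."""

    def __init__(self, data):
        self.data = data


class Container:
    """Hypothetical stand-in for a data array; not scipp's implementation."""

    def __init__(self, values):
        self.values = values

    def __getitem__(self, index):
        # Every call builds a fresh temporary object.
        return Slice(self.values[index])

    def __setitem__(self, index, value):
        # Writes into the container's own storage.
        self.values[index] = value


c = Container([10, 20, 30])

c[0].data = 99        # rebinds an attribute on a temporary; c is untouched
after_property_assignment = list(c.values)

c[0] = 99             # goes through __setitem__ and modifies c
after_setitem = list(c.values)
print(after_property_assignment, after_setitem)
```

In this toy model the first assignment fails silently. Raising an error instead, as the read-only flag does, turns the hidden bug into a visible one.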
To actually modify the slice, use __setitem__ instead:
[10]:
da['x', 0] = var['x', 2]
Read-only flags were also introduced for variables, meta-data dicts (coords, masks, and attrs properties), data arrays and datasets. The flags solve a number of conceptual issues and serve as a safeguard against hidden bugs.
Datasets¶
Just as creating data arrays from variables is now cheap (no deep copies), inserting items into datasets does not incur potentially expensive deep copies:
[11]:
ds = sc.Dataset()
ds['a'] = da # shallow copy
Note that while the buffers are shared, the meta-data dicts such as coords, masks, or attrs are not. Compare:
[12]:
ds['a'].attrs['attr'] = 1.2 * sc.units.m
'attr' in da.attrs # the attrs *dict* is copied
[12]:
False
with
[13]:
da.coords['x'] *= -1
ds.coords['x'] # the coords *dict* is copied, but the 'x' coordinate references same buffer
[13]:
- (x: 12)int64m-666, -667, ..., -676, -677
Values:
array([-666, -667, -668, -669, -670, -671, -672, -673, -674, -675, -676, -677])
Indexing¶
Ellipsis¶
New in 0.8
Indexing with ellipsis (...) is now supported. This can be used, e.g., to replace data in an existing object without re-pointing the underlying reference to the object given on the right-hand side.
Example
[14]:
var1 = sc.ones(dims=['x'], shape=[4])
var2 = var1 + var1
da = sc.DataArray(data=sc.zeros(dims=['x'], shape=[4]))
da.data = var1 # replace data variable
da.data[...] = var2 # assign to slice, copy into existing data variable
var1 # now holds values of var2
[14]:
- (x: 4)float642.0, 2.0, 2.0, 2.0
Values:
array([2., 2., 2., 2.])
Changing var2 has no effect on da.data:
[15]:
var2 += 2222.0
da
[15]:
- x: 4
- (x)float642.0, 2.0, 2.0, 2.0
Values:
array([2., 2., 2., 2.])
Label-based indexing¶
New in 0.5
Indexing based on coordinate values is now possible:
- Works just like position indexing (with integers).
- Use a scalar variable as index (instead of an integer) to use label-based indexing.
- Works with single values as well as slices (: notation).
See Label-based indexing for more details.
Example
[16]:
da = sc.DataArray(data=sc.zeros(dims=['x', 'day'], shape=(4, 3)))
da.coords['x'] = sc.linspace(dim='x', unit='m', start=0.1, stop=0.2, num=5)
da.coords['day'] = sc.array(dims=['day'], values=[1, 7, 31])
[17]:
da['day', sc.scalar(7)]
[17]:
- x: 4
- x(x [bin-edge])float64m0.1, 0.12, 0.15, 0.18, 0.2
Values:
array([0.1 , 0.125, 0.15 , 0.175, 0.2 ])
- (x)float640.0, 0.0, 0.0, 0.0
Values:
array([0., 0., 0., 0.])
- day()int647
Values:
array(7)
[18]:
da['x', 0.13 * sc.units.m] # selects bin containing this value
[18]:
- day: 3
- day(day)int641, 7, 31
Values:
array([ 1, 7, 31])
- (day)float640.0, 0.0, 0.0
Values:
array([0., 0., 0.])
- x(x)float64m0.12, 0.15
Values:
array([0.125, 0.15 ])
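The bin lookup performed for a bin-edge coordinate can be sketched in plain Python. The helper `find_bin` below is a hypothetical illustration, not scipp's implementation, and it assumes half-open bins (left edge included, right edge excluded):

```python
from bisect import bisect_right

edges = [0.1, 0.125, 0.15, 0.175, 0.2]  # bin edges of the 'x' coordinate above


def find_bin(edges, value):
    """Return the index i such that edges[i] <= value < edges[i + 1]."""
    if not edges[0] <= value < edges[-1]:
        raise IndexError(f"{value} lies outside the bin edges")
    return bisect_right(edges, value) - 1


i = find_bin(edges, 0.13)
selected_edges = (edges[i], edges[i + 1])
print(i, selected_edges)
```

For the value 0.13 this selects the second bin, bounded by 0.125 and 0.15, which agrees with the x coordinate of the slice shown above.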
Support for datetime64¶
New in 0.6
- Previously we stored time-related information, such as sample-temperature logs, as integers.
- Added support for datetime64, compatible with np.datetime64.
- Time differences (np.timedelta64) are not used. We simply use integers, since in combination with scipp’s units this provides everything we need.
Example:
[19]:
var = sc.array(dims=['time'],
               values=np.arange(np.datetime64('2021-01-01T12:00:00'),
                                np.datetime64('2021-01-01T12:04:00')))
Datetimes and integers with time units interoperate naturally. We can offset a datetime by adding a duration:
[20]:
var + 123 * sc.Unit('s')
[20]:
- (time: 240)datetime64s2021-01-01T12:02:03, 2021-01-01T12:02:04, ..., 2021-01-01T12:06:01, 2021-01-01T12:06:02
Values:
array(['2021-01-01T12:02:03', '2021-01-01T12:02:04', '2021-01-01T12:02:05', '2021-01-01T12:02:06', '2021-01-01T12:02:07', '2021-01-01T12:02:08', '2021-01-01T12:02:09', '2021-01-01T12:02:10', '2021-01-01T12:02:11', '2021-01-01T12:02:12', '2021-01-01T12:02:13', '2021-01-01T12:02:14', '2021-01-01T12:02:15', '2021-01-01T12:02:16', '2021-01-01T12:02:17', '2021-01-01T12:02:18', '2021-01-01T12:02:19', '2021-01-01T12:02:20', '2021-01-01T12:02:21', '2021-01-01T12:02:22', '2021-01-01T12:02:23', '2021-01-01T12:02:24', '2021-01-01T12:02:25', '2021-01-01T12:02:26', '2021-01-01T12:02:27', '2021-01-01T12:02:28', '2021-01-01T12:02:29', '2021-01-01T12:02:30', '2021-01-01T12:02:31', '2021-01-01T12:02:32', '2021-01-01T12:02:33', '2021-01-01T12:02:34', '2021-01-01T12:02:35', '2021-01-01T12:02:36', '2021-01-01T12:02:37', '2021-01-01T12:02:38', '2021-01-01T12:02:39', '2021-01-01T12:02:40', '2021-01-01T12:02:41', '2021-01-01T12:02:42', '2021-01-01T12:02:43', '2021-01-01T12:02:44', '2021-01-01T12:02:45', '2021-01-01T12:02:46', '2021-01-01T12:02:47', '2021-01-01T12:02:48', '2021-01-01T12:02:49', '2021-01-01T12:02:50', '2021-01-01T12:02:51', '2021-01-01T12:02:52', '2021-01-01T12:02:53', '2021-01-01T12:02:54', '2021-01-01T12:02:55', '2021-01-01T12:02:56', '2021-01-01T12:02:57', '2021-01-01T12:02:58', '2021-01-01T12:02:59', '2021-01-01T12:03:00', '2021-01-01T12:03:01', '2021-01-01T12:03:02', '2021-01-01T12:03:03', '2021-01-01T12:03:04', '2021-01-01T12:03:05', '2021-01-01T12:03:06', '2021-01-01T12:03:07', '2021-01-01T12:03:08', '2021-01-01T12:03:09', '2021-01-01T12:03:10', '2021-01-01T12:03:11', '2021-01-01T12:03:12', '2021-01-01T12:03:13', '2021-01-01T12:03:14', '2021-01-01T12:03:15', '2021-01-01T12:03:16', '2021-01-01T12:03:17', '2021-01-01T12:03:18', '2021-01-01T12:03:19', '2021-01-01T12:03:20', '2021-01-01T12:03:21', '2021-01-01T12:03:22', '2021-01-01T12:03:23', '2021-01-01T12:03:24', '2021-01-01T12:03:25', '2021-01-01T12:03:26', '2021-01-01T12:03:27', '2021-01-01T12:03:28', 
'2021-01-01T12:03:29', '2021-01-01T12:03:30', '2021-01-01T12:03:31', '2021-01-01T12:03:32', '2021-01-01T12:03:33', '2021-01-01T12:03:34', '2021-01-01T12:03:35', '2021-01-01T12:03:36', '2021-01-01T12:03:37', '2021-01-01T12:03:38', '2021-01-01T12:03:39', '2021-01-01T12:03:40', '2021-01-01T12:03:41', '2021-01-01T12:03:42', '2021-01-01T12:03:43', '2021-01-01T12:03:44', '2021-01-01T12:03:45', '2021-01-01T12:03:46', '2021-01-01T12:03:47', '2021-01-01T12:03:48', '2021-01-01T12:03:49', '2021-01-01T12:03:50', '2021-01-01T12:03:51', '2021-01-01T12:03:52', '2021-01-01T12:03:53', '2021-01-01T12:03:54', '2021-01-01T12:03:55', '2021-01-01T12:03:56', '2021-01-01T12:03:57', '2021-01-01T12:03:58', '2021-01-01T12:03:59', '2021-01-01T12:04:00', '2021-01-01T12:04:01', '2021-01-01T12:04:02', '2021-01-01T12:04:03', '2021-01-01T12:04:04', '2021-01-01T12:04:05', '2021-01-01T12:04:06', '2021-01-01T12:04:07', '2021-01-01T12:04:08', '2021-01-01T12:04:09', '2021-01-01T12:04:10', '2021-01-01T12:04:11', '2021-01-01T12:04:12', '2021-01-01T12:04:13', '2021-01-01T12:04:14', '2021-01-01T12:04:15', '2021-01-01T12:04:16', '2021-01-01T12:04:17', '2021-01-01T12:04:18', '2021-01-01T12:04:19', '2021-01-01T12:04:20', '2021-01-01T12:04:21', '2021-01-01T12:04:22', '2021-01-01T12:04:23', '2021-01-01T12:04:24', '2021-01-01T12:04:25', '2021-01-01T12:04:26', '2021-01-01T12:04:27', '2021-01-01T12:04:28', '2021-01-01T12:04:29', '2021-01-01T12:04:30', '2021-01-01T12:04:31', '2021-01-01T12:04:32', '2021-01-01T12:04:33', '2021-01-01T12:04:34', '2021-01-01T12:04:35', '2021-01-01T12:04:36', '2021-01-01T12:04:37', '2021-01-01T12:04:38', '2021-01-01T12:04:39', '2021-01-01T12:04:40', '2021-01-01T12:04:41', '2021-01-01T12:04:42', '2021-01-01T12:04:43', '2021-01-01T12:04:44', '2021-01-01T12:04:45', '2021-01-01T12:04:46', '2021-01-01T12:04:47', '2021-01-01T12:04:48', '2021-01-01T12:04:49', '2021-01-01T12:04:50', '2021-01-01T12:04:51', '2021-01-01T12:04:52', '2021-01-01T12:04:53', '2021-01-01T12:04:54', 
'2021-01-01T12:04:55', '2021-01-01T12:04:56', '2021-01-01T12:04:57', '2021-01-01T12:04:58', '2021-01-01T12:04:59', '2021-01-01T12:05:00', '2021-01-01T12:05:01', '2021-01-01T12:05:02', '2021-01-01T12:05:03', '2021-01-01T12:05:04', '2021-01-01T12:05:05', '2021-01-01T12:05:06', '2021-01-01T12:05:07', '2021-01-01T12:05:08', '2021-01-01T12:05:09', '2021-01-01T12:05:10', '2021-01-01T12:05:11', '2021-01-01T12:05:12', '2021-01-01T12:05:13', '2021-01-01T12:05:14', '2021-01-01T12:05:15', '2021-01-01T12:05:16', '2021-01-01T12:05:17', '2021-01-01T12:05:18', '2021-01-01T12:05:19', '2021-01-01T12:05:20', '2021-01-01T12:05:21', '2021-01-01T12:05:22', '2021-01-01T12:05:23', '2021-01-01T12:05:24', '2021-01-01T12:05:25', '2021-01-01T12:05:26', '2021-01-01T12:05:27', '2021-01-01T12:05:28', '2021-01-01T12:05:29', '2021-01-01T12:05:30', '2021-01-01T12:05:31', '2021-01-01T12:05:32', '2021-01-01T12:05:33', '2021-01-01T12:05:34', '2021-01-01T12:05:35', '2021-01-01T12:05:36', '2021-01-01T12:05:37', '2021-01-01T12:05:38', '2021-01-01T12:05:39', '2021-01-01T12:05:40', '2021-01-01T12:05:41', '2021-01-01T12:05:42', '2021-01-01T12:05:43', '2021-01-01T12:05:44', '2021-01-01T12:05:45', '2021-01-01T12:05:46', '2021-01-01T12:05:47', '2021-01-01T12:05:48', '2021-01-01T12:05:49', '2021-01-01T12:05:50', '2021-01-01T12:05:51', '2021-01-01T12:05:52', '2021-01-01T12:05:53', '2021-01-01T12:05:54', '2021-01-01T12:05:55', '2021-01-01T12:05:56', '2021-01-01T12:05:57', '2021-01-01T12:05:58', '2021-01-01T12:05:59', '2021-01-01T12:06:00', '2021-01-01T12:06:01', '2021-01-01T12:06:02'], dtype='datetime64[s]')
Or subtract datetimes to obtain a duration:
[21]:
var['time', 10] - var['time', 0]
[21]:
- ()int64s10
Values:
array(10)
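For comparison, here is the corresponding arithmetic in plain numpy, where the difference of two datetimes is a np.timedelta64 rather than an integer with a unit. This is a sketch for orientation only; the variable names are illustrative:

```python
import numpy as np

t0 = np.datetime64('2021-01-01T12:00:00')
t10 = np.datetime64('2021-01-01T12:00:10')

# Plain numpy yields a timedelta64 for a difference of datetimes ...
delta = t10 - t0
# ... which carries the same information as an integer plus a unit.
seconds = int(delta / np.timedelta64(1, 's'))

# Offsetting a datetime by a duration, as in `var + 123 * sc.Unit('s')`.
shifted = t0 + np.timedelta64(123, 's')
print(delta, seconds, shifted)
```

The integer 10 together with the unit s is what the scipp result above contains, and the shifted time matches the first element of the offset example.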
to_unit can be used to convert to a different precision:
[22]:
sc.to_unit(var, 'ms')
[22]:
- (time: 240)datetime64ms2021-01-01T12:00:00.000, 2021-01-01T12:00:01.000, ..., 2021-01-01T12:03:58.000, 2021-01-01T12:03:59.000
Values:
array(['2021-01-01T12:00:00.000', '2021-01-01T12:00:01.000', '2021-01-01T12:00:02.000', '2021-01-01T12:00:03.000', '2021-01-01T12:00:04.000', '2021-01-01T12:00:05.000', '2021-01-01T12:00:06.000', '2021-01-01T12:00:07.000', '2021-01-01T12:00:08.000', '2021-01-01T12:00:09.000', '2021-01-01T12:00:10.000', '2021-01-01T12:00:11.000', '2021-01-01T12:00:12.000', '2021-01-01T12:00:13.000', '2021-01-01T12:00:14.000', '2021-01-01T12:00:15.000', '2021-01-01T12:00:16.000', '2021-01-01T12:00:17.000', '2021-01-01T12:00:18.000', '2021-01-01T12:00:19.000', '2021-01-01T12:00:20.000', '2021-01-01T12:00:21.000', '2021-01-01T12:00:22.000', '2021-01-01T12:00:23.000', '2021-01-01T12:00:24.000', '2021-01-01T12:00:25.000', '2021-01-01T12:00:26.000', '2021-01-01T12:00:27.000', '2021-01-01T12:00:28.000', '2021-01-01T12:00:29.000', '2021-01-01T12:00:30.000', '2021-01-01T12:00:31.000', '2021-01-01T12:00:32.000', '2021-01-01T12:00:33.000', '2021-01-01T12:00:34.000', '2021-01-01T12:00:35.000', '2021-01-01T12:00:36.000', '2021-01-01T12:00:37.000', '2021-01-01T12:00:38.000', '2021-01-01T12:00:39.000', '2021-01-01T12:00:40.000', '2021-01-01T12:00:41.000', '2021-01-01T12:00:42.000', '2021-01-01T12:00:43.000', '2021-01-01T12:00:44.000', '2021-01-01T12:00:45.000', '2021-01-01T12:00:46.000', '2021-01-01T12:00:47.000', '2021-01-01T12:00:48.000', '2021-01-01T12:00:49.000', '2021-01-01T12:00:50.000', '2021-01-01T12:00:51.000', '2021-01-01T12:00:52.000', '2021-01-01T12:00:53.000', '2021-01-01T12:00:54.000', '2021-01-01T12:00:55.000', '2021-01-01T12:00:56.000', '2021-01-01T12:00:57.000', '2021-01-01T12:00:58.000', '2021-01-01T12:00:59.000', '2021-01-01T12:01:00.000', '2021-01-01T12:01:01.000', '2021-01-01T12:01:02.000', '2021-01-01T12:01:03.000', '2021-01-01T12:01:04.000', '2021-01-01T12:01:05.000', '2021-01-01T12:01:06.000', '2021-01-01T12:01:07.000', '2021-01-01T12:01:08.000', '2021-01-01T12:01:09.000', '2021-01-01T12:01:10.000', '2021-01-01T12:01:11.000', '2021-01-01T12:01:12.000', 
'2021-01-01T12:01:13.000', '2021-01-01T12:01:14.000', '2021-01-01T12:01:15.000', '2021-01-01T12:01:16.000', '2021-01-01T12:01:17.000', '2021-01-01T12:01:18.000', '2021-01-01T12:01:19.000', '2021-01-01T12:01:20.000', '2021-01-01T12:01:21.000', '2021-01-01T12:01:22.000', '2021-01-01T12:01:23.000', '2021-01-01T12:01:24.000', '2021-01-01T12:01:25.000', '2021-01-01T12:01:26.000', '2021-01-01T12:01:27.000', '2021-01-01T12:01:28.000', '2021-01-01T12:01:29.000', '2021-01-01T12:01:30.000', '2021-01-01T12:01:31.000', '2021-01-01T12:01:32.000', '2021-01-01T12:01:33.000', '2021-01-01T12:01:34.000', '2021-01-01T12:01:35.000', '2021-01-01T12:01:36.000', '2021-01-01T12:01:37.000', '2021-01-01T12:01:38.000', '2021-01-01T12:01:39.000', '2021-01-01T12:01:40.000', '2021-01-01T12:01:41.000', '2021-01-01T12:01:42.000', '2021-01-01T12:01:43.000', '2021-01-01T12:01:44.000', '2021-01-01T12:01:45.000', '2021-01-01T12:01:46.000', '2021-01-01T12:01:47.000', '2021-01-01T12:01:48.000', '2021-01-01T12:01:49.000', '2021-01-01T12:01:50.000', '2021-01-01T12:01:51.000', '2021-01-01T12:01:52.000', '2021-01-01T12:01:53.000', '2021-01-01T12:01:54.000', '2021-01-01T12:01:55.000', '2021-01-01T12:01:56.000', '2021-01-01T12:01:57.000', '2021-01-01T12:01:58.000', '2021-01-01T12:01:59.000', '2021-01-01T12:02:00.000', '2021-01-01T12:02:01.000', '2021-01-01T12:02:02.000', '2021-01-01T12:02:03.000', '2021-01-01T12:02:04.000', '2021-01-01T12:02:05.000', '2021-01-01T12:02:06.000', '2021-01-01T12:02:07.000', '2021-01-01T12:02:08.000', '2021-01-01T12:02:09.000', '2021-01-01T12:02:10.000', '2021-01-01T12:02:11.000', '2021-01-01T12:02:12.000', '2021-01-01T12:02:13.000', '2021-01-01T12:02:14.000', '2021-01-01T12:02:15.000', '2021-01-01T12:02:16.000', '2021-01-01T12:02:17.000', '2021-01-01T12:02:18.000', '2021-01-01T12:02:19.000', '2021-01-01T12:02:20.000', '2021-01-01T12:02:21.000', '2021-01-01T12:02:22.000', '2021-01-01T12:02:23.000', '2021-01-01T12:02:24.000', '2021-01-01T12:02:25.000', '2021-01-01T12:02:26.000', 
'2021-01-01T12:02:27.000', '2021-01-01T12:02:28.000', '2021-01-01T12:02:29.000', '2021-01-01T12:02:30.000', '2021-01-01T12:02:31.000', '2021-01-01T12:02:32.000', '2021-01-01T12:02:33.000', '2021-01-01T12:02:34.000', '2021-01-01T12:02:35.000', '2021-01-01T12:02:36.000', '2021-01-01T12:02:37.000', '2021-01-01T12:02:38.000', '2021-01-01T12:02:39.000', '2021-01-01T12:02:40.000', '2021-01-01T12:02:41.000', '2021-01-01T12:02:42.000', '2021-01-01T12:02:43.000', '2021-01-01T12:02:44.000', '2021-01-01T12:02:45.000', '2021-01-01T12:02:46.000', '2021-01-01T12:02:47.000', '2021-01-01T12:02:48.000', '2021-01-01T12:02:49.000', '2021-01-01T12:02:50.000', '2021-01-01T12:02:51.000', '2021-01-01T12:02:52.000', '2021-01-01T12:02:53.000', '2021-01-01T12:02:54.000', '2021-01-01T12:02:55.000', '2021-01-01T12:02:56.000', '2021-01-01T12:02:57.000', '2021-01-01T12:02:58.000', '2021-01-01T12:02:59.000', '2021-01-01T12:03:00.000', '2021-01-01T12:03:01.000', '2021-01-01T12:03:02.000', '2021-01-01T12:03:03.000', '2021-01-01T12:03:04.000', '2021-01-01T12:03:05.000', '2021-01-01T12:03:06.000', '2021-01-01T12:03:07.000', '2021-01-01T12:03:08.000', '2021-01-01T12:03:09.000', '2021-01-01T12:03:10.000', '2021-01-01T12:03:11.000', '2021-01-01T12:03:12.000', '2021-01-01T12:03:13.000', '2021-01-01T12:03:14.000', '2021-01-01T12:03:15.000', '2021-01-01T12:03:16.000', '2021-01-01T12:03:17.000', '2021-01-01T12:03:18.000', '2021-01-01T12:03:19.000', '2021-01-01T12:03:20.000', '2021-01-01T12:03:21.000', '2021-01-01T12:03:22.000', '2021-01-01T12:03:23.000', '2021-01-01T12:03:24.000', '2021-01-01T12:03:25.000', '2021-01-01T12:03:26.000', '2021-01-01T12:03:27.000', '2021-01-01T12:03:28.000', '2021-01-01T12:03:29.000', '2021-01-01T12:03:30.000', '2021-01-01T12:03:31.000', '2021-01-01T12:03:32.000', '2021-01-01T12:03:33.000', '2021-01-01T12:03:34.000', '2021-01-01T12:03:35.000', '2021-01-01T12:03:36.000', '2021-01-01T12:03:37.000', '2021-01-01T12:03:38.000', '2021-01-01T12:03:39.000', '2021-01-01T12:03:40.000', 
'2021-01-01T12:03:41.000', '2021-01-01T12:03:42.000', '2021-01-01T12:03:43.000', '2021-01-01T12:03:44.000', '2021-01-01T12:03:45.000', '2021-01-01T12:03:46.000', '2021-01-01T12:03:47.000', '2021-01-01T12:03:48.000', '2021-01-01T12:03:49.000', '2021-01-01T12:03:50.000', '2021-01-01T12:03:51.000', '2021-01-01T12:03:52.000', '2021-01-01T12:03:53.000', '2021-01-01T12:03:54.000', '2021-01-01T12:03:55.000', '2021-01-01T12:03:56.000', '2021-01-01T12:03:57.000', '2021-01-01T12:03:58.000', '2021-01-01T12:03:59.000'], dtype='datetime64[ms]')
Operations¶
Creation functions¶
New in 0.5
For convenience and similarity to numpy we added functions that create variables. Our intention is to fully replace the need to use sc.Variable directly, but at this point this has not been rolled out to our documentation pages.
Examples:
[23]:
sc.array(dims=['x'], values=np.array([1, 2, 3]))
[23]:
- (x: 3)int641, 2, 3
Values:
array([1, 2, 3])
[24]:
sc.zeros(dims=['x'], shape=[3])
[24]:
- (x: 3)float640.0, 0.0, 0.0
Values:
array([0., 0., 0.])
[25]:
sc.scalar(17)
[25]:
- ()int6417
Values:
array(17)
All of these also take keyword arguments. Note that creating scalars by multiplying with a unit is still supported:
[26]:
1.2 * sc.units.m
[26]:
- ()float64m1.2
Values:
array(1.2)
New in 0.7
More creation functions were added:
- Added zeros_like, ones_like, and empty_like.
- Added linspace, logspace, geomspace, and arange.
New in 0.8
More creation functions were added:
- Added full and full_like.
Unit conversion¶
New in 0.6
Conversions between different unit scales (not to be confused with conversions provided by scippneutron) are now supported. to_unit provides conversion of variables between, e.g., mm and m.
New in 0.7
- to_unit can now avoid making a copy if the input already has the desired unit. This can be used as a cheap way to ensure inputs have expected units.
- to_unit now also works for binned data, converting the unit of the underlying events in the bins.
New in 0.8
- to_unit now has a copy argument. By default, copy=True and to_unit makes a copy even if the input already has the desired unit. For a cheap way to ensure inputs have expected units, use copy=False to avoid copies if possible.
Example:
[27]:
var = sc.array(dims=['x'], unit='mm', values=[3.2, 5.4, 7.6])
m = sc.to_unit(var, 'm')
m
[27]:
- (x: 3)float64m0.0, 0.01, 0.01
Values:
array([0.0032, 0.0054, 0.0076])
No copy is made if the input has the requested unit when we specify copy=False:
[28]:
sc.to_unit(m, 'm', copy=False) # no copy
[28]:
- (x: 3)float64m0.0, 0.01, 0.01
Values:
array([0.0032, 0.0054, 0.0076])
Conversions also work for more specialized units such as electron-volt:
[29]:
sc.to_unit(sc.scalar(1.0, unit='nJ'), unit='meV')
[29]:
- ()float64meV6241509074460.764
Values:
array(6.24150907e+12)
from_pandas and from_xarray¶
New in 0.8
- from_pandas for converting pandas.DataFrame to scipp.Dataset.
- from_xarray for converting xarray.DataArray or xarray.Dataset to scipp.DataArray or scipp.Dataset, respectively.
Both functions are available in the compat submodule.
Shape operations¶
fold and flatten¶
New in 0.6
fold and flatten, which are similar to numpy.reshape, have been added. In contrast to reshape, fold and flatten support data arrays and also handle meta data such as coords, masks, and attrs.
New in 0.7
- fold now always returns views of data and all meta data instead of making deep copies.
- flatten also preserves reshaped data as a view, but unlike fold the same is not true for meta data in general, since it may require duplication in the flatten operation.
Example:
[30]:
var = sc.ones(dims=['pixel'], shape=[100])
xy = sc.fold(var, dim='pixel', sizes={'x': 10, 'y': 10})
xy = sc.DataArray(data=xy,
                  coords={
                      'x': sc.array(dims=['x'], values=np.arange(10)),
                      'y': sc.array(dims=['y'], values=np.arange(10))
                  })
xy
[30]:
- x: 10
- y: 10
- x(x)int640, 1, ..., 8, 9
Values:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) - y(y)int640, 1, ..., 8, 9
Values:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
- (x, y)float641.0, 1.0, ..., 1.0, 1.0
Values:
array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])
Folding does not copy data or meta data. For example:
[31]:
xy['y', 4] *= 0.0 # affects var (scipp-0.7 and higher)
var.plot()
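The view semantics of fold resemble what numpy.reshape does for a contiguous array. The following numpy-only sketch is an analogy and does not call scipp. It zeroes one column of a reshaped view and checks which entries of the original 1-D array changed:

```python
import numpy as np

flat = np.ones(100)
grid = flat.reshape(10, 10)   # a view for a contiguous array: no data is copied

grid[:, 4] *= 0.0             # modify one column through the view

shares_memory = np.shares_memory(flat, grid)
zeroed = np.flatnonzero(flat == 0.0)   # positions affected in the 1-D array
print(shares_memory, zeroed)
```

Every tenth element starting at index 4 is set to zero in the original array, the same pattern that modifying xy['y', 4] produces in var.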
The reverse of fold is flatten:
[32]:
flat = sc.flatten(xy, to='pixel')
flat
[32]:
- pixel: 100
- x(pixel)int640, 0, ..., 9, 9
Values:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9]) - y(pixel)int640, 1, ..., 8, 9
Values:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
- (pixel)float641.0, 1.0, ..., 1.0, 1.0
Values:
array([1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1.])
Flattening does not copy the data, but meta data may be copied if values need to be duplicated by the operation:
[33]:
flat['pixel', 0] = 22 # modifies var (scipp-0.7 and higher)
var.plot()
Vectors and matrices¶
General¶
New in 0.7
Several improvements for working with (3-D position) vectors and (3-D rotation) matrices are part of this release:
- Creation functions were added:
  - vector (a single vector)
  - vectors (array of vectors)
  - matrix (a single matrix)
  - matrices (array of matrices)
- Direct creation and initialization of 2-D (or higher) arrays of matrices and vectors is now possible from numpy arrays.
- The values property now returns a numpy array with ndim+1 (vectors) or ndim+2 (matrices) axes, with the inner 1 (vectors) or 2 (matrices) axes corresponding to the vector or matrix axes.
- Vector or matrix elements can now be accessed and modified directly using the new fields property of variables. fields provides access to vector elements x, y, and z or matrix elements xx, xy, …, zz.
New in 0.8
The fields property can now be iterated and behaves similarly to a dict with fixed keys.
[34]:
sc.vector(value=[1, 2, 3])
[34]:
- ()vector_3_float64[1. 2. 3.]
Values:
array([1., 2., 3.])
[35]:
vecs = sc.vectors(dims=['x'], unit='m', values=np.arange(12).reshape(4, 3))
vecs
[35]:
- (x: 4)vector_3_float64m[0. 1. 2.], [3. 4. 5.], [6. 7. 8.], [ 9. 10. 11.]
Values:
array([[ 0., 1., 2.], [ 3., 4., 5.], [ 6., 7., 8.], [ 9., 10., 11.]])
[36]:
vecs.values
[36]:
array([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.],
[ 9., 10., 11.]])
[37]:
vecs.fields.y
[37]:
- (x: 4)float64m1.0, 4.0, 7.0, 10.0
Values:
array([ 1., 4., 7., 10.])
[38]:
vecs.fields.z += 0.666 * sc.units.m
vecs
[38]:
- (x: 4)vector_3_float64m[0. 1. 2.666], [3. 4. 5.666], [6. 7. 8.666], [ 9. 10. 11.666]
Values:
array([[ 0. , 1. , 2.666], [ 3. , 4. , 5.666], [ 6. , 7. , 8.666], [ 9. , 10. , 11.666]])
New in 0.8
The cross function to compute the cross-product of vectors was added.
[39]:
sc.cross(vecs, vecs['x', 0])
[39]:
- (x: 4)vector_3_float64m^2[0. 0. 0.], [ 4.998 -7.998 3. ], [ 9.996 -15.996 6. ], [ 14.994 -23.994 9. ]
Values:
array([[ 0. , 0. , 0. ], [ 4.998, -7.998, 3. ], [ 9.996, -15.996, 6. ], [ 14.994, -23.994, 9. ]])
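The numbers in this output can be checked by hand. The sketch below implements the textbook cross-product formula in plain Python; the helper `cross` is written for illustration and is not the scipp function. It is applied to the first two vectors of vecs:

```python
def cross(a, b):
    """Cross product of two 3-vectors given as sequences of floats."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


first = (0.0, 1.0, 2.666)    # vecs['x', 0] from the cell above
second = (3.0, 4.0, 5.666)   # vecs['x', 1]

self_cross = cross(first, first)   # a vector crossed with itself vanishes
result = cross(second, first)
print(self_cross, result)
```

Up to floating-point rounding, the second result is (4.998, -7.998, 3), matching the second row of the output. The unit m^2 arises because both operands carry the unit m.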
scipp.spatial.transform¶
New in 0.8
The scipp.spatial.transform submodule (in the style of scipy.spatial.transform) was added. This now provides:
- from_rotvec to create rotation matrices from rotation vectors.
- as_rotvec to convert rotation matrices into rotation vectors.
As an example, the following creates a rotation matrix for rotation around the x-axis by 30 degrees:
[40]:
from scipp.spatial.transform import from_rotvec
rot = from_rotvec(sc.vector(value=[30.0, 0, 0], unit='deg'))
rot
[40]:
- ()matrix_3_float64[[ 1. 0. 0. ] [ 0. 0.8660254 -0.5 ] [ 0. 0.5 0.8660254]]
Values:
array([[ 1. , 0. , 0. ], [ 0. , 0.8660254, -0.5 ], [ 0. , 0.5 , 0.8660254]])
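The entries of this matrix follow from the standard formula for a rotation about the x-axis. As a cross-check, the sketch below builds the same matrix with the standard library only; the helper `rotation_about_x` is a hypothetical name and not part of scipp:

```python
import math


def rotation_about_x(angle_deg):
    """3x3 matrix for a right-handed rotation about the x-axis."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [
        [1.0, 0.0, 0.0],
        [0.0, c, -s],
        [0.0, s, c],
    ]


rot_matrix = rotation_about_x(30.0)
for row in rot_matrix:
    print([round(v, 7) for v in row])
```

Here cos(30°) ≈ 0.8660254 and sin(30°) = 0.5 appear in the lower-right 2×2 block, as in the scipp output above.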
Coordinate transformations¶
New in 0.8
The transform_coords function has been added (also available as a method of data arrays and datasets). It is a tool for transforming one or more input coordinates into one or more output coordinates. It automatically handles:
- Renaming of dimensions, if dimension-coordinates are transformed.
- Change of coordinates to attributes to avoid interference of coordinates consumed by the transformation in follow-up operations.
- Conversion of event-coordinates of binned data, if present.
See Coordinate transformations for a full description.
Physical constants¶
New in 0.8
The scipp.constants submodule (in the style of scipy.constants) was added, providing physical constants from CODATA 2018. For full details see the module’s documentation.
Examples:
[41]:
from scipp.constants import hbar, m_e, physical_constants
[42]:
hbar
[42]:
- ()float64J*s1.0545718176461565e-34
Values:
array(1.05457182e-34)
[43]:
m_e
[43]:
- ()float64kg9.1093837015e-31
Values:
array(9.1093837e-31)
[44]:
physical_constants('speed of light in vacuum')
[44]:
- ()float64m/s299792458.0
Values:
array(2.99792458e+08)
[45]:
physical_constants('neutron mass', with_variance=True)
[45]:
- ()float64kg1.67492749804e-27σ = 9.5e-37
Values:
array(1.6749275e-27)
Variances (σ²):
array(9.025e-73)
Plotting¶
New in 0.7
- Plotting now supports a redraw() method for updating existing plots with new data, without recreating the plot.
New in 0.8
Plotting 1-D binned (event) data is now supported.
Binned data¶
Buffer and meta data access¶
New in 0.7
- The internal buffer holding the “events” underlying binned data can now be accessed directly using the new events property. Update: This is deprecated as of 0.8.2.
- The HTML view now works for binned meta data access such as binned.bins.coords['time'].
New in 0.8
The mean of bins can now be computed using binned.bins.mean(). This should generally be used instead of binned.bins.sum() if the dtype is not “summable”, i.e., typically anything that is not of unit “counts”.
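Why the mean is the more meaningful reduction for such data can be seen in a small plain-Python sketch. The readings and bin names below are made-up illustrative values, not output of scipp:

```python
from statistics import mean

# Hypothetical temperature readings (in K), already grouped into two bins.
bins = {
    "bin 0": [100.0, 104.0, 108.0],
    "bin 1": [102.0, 106.0],
}

per_bin_sum = {name: sum(values) for name, values in bins.items()}
per_bin_mean = {name: mean(values) for name, values in bins.items()}
print(per_bin_sum)
print(per_bin_mean)
```

The sums differ only because the bins hold different numbers of readings, whereas the means are directly comparable temperatures.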
Consider the following example, representing a time series of temperature measurements on an x-y plane:
[46]:
import numpy as np
N = int(800)
data = sc.DataArray(
    data=sc.Variable(dims=['time'], values=100 + np.random.rand(N) * 10, unit='K'),
    coords={
        'x': sc.Variable(dims=['time'], unit='m', values=np.random.rand(N)),
        'y': sc.Variable(dims=['time'], unit='m', values=np.random.rand(N)),
        'time': sc.Variable(dims=['time'], values=(10000 * np.random.rand(N)).astype('datetime64[s]')),
    })
binned = sc.bin(data,
                edges=[sc.linspace(dim='x', unit='m', start=0.0, stop=1.0, num=5),
                       sc.linspace(dim='y', unit='m', start=0.0, stop=1.0, num=5)])
binned
[46]:
- x: 4
- y: 4
- x(x [bin-edge])float64m0.0, 0.25, 0.5, 0.75, 1.0
Values:
array([0. , 0.25, 0.5 , 0.75, 1. ]) - y(y [bin-edge])float64m0.0, 0.25, 0.5, 0.75, 1.0
Values:
array([0. , 0.25, 0.5 , 0.75, 1. ])
- (x, y)DataArrayViewbinned data [len=52, len=50, ..., len=59, len=44]
Values:
[<scipp.DataArray> Dimensions: Sizes[time:52, ] Coordinates: time datetime64 [s] (time) [1970-01-01T00:35:50, 1970-01-01T02:08:26, ..., 1970-01-01T00:28:55, 1970-01-01T02:42:30] x float64 [m] (time) [0.186080, 0.054793, ..., 0.146708, 0.072271] y float64 [m] (time) [0.022003, 0.075396, ..., 0.115558, 0.210394] Data: float64 [K] (time) [109.794155, 100.553758, ..., 107.340572, 106.902720] , <scipp.DataArray> Dimensions: Sizes[time:50, ] Coordinates: time datetime64 [s] (time) [1970-01-01T02:11:44, 1970-01-01T01:12:52, ..., 1970-01-01T01:15:14, 1970-01-01T02:34:09] x float64 [m] (time) [0.081474, 0.210229, ..., 0.010766, 0.119580] y float64 [m] (time) [0.253279, 0.276667, ..., 0.378480, 0.439987] Data: float64 [K] (time) [101.748073, 108.030339, ..., 109.531299, 101.435841] , ..., <scipp.DataArray> Dimensions: Sizes[time:59, ] Coordinates: time datetime64 [s] (time) [1970-01-01T01:51:18, 1970-01-01T02:41:37, ..., 1970-01-01T02:43:02, 1970-01-01T00:30:26] x float64 [m] (time) [0.836701, 0.788889, ..., 0.971129, 0.850548] y float64 [m] (time) [0.514702, 0.707257, ..., 0.594856, 0.694594] Data: float64 [K] (time) [109.032462, 101.496697, ..., 104.652373, 103.083986] , <scipp.DataArray> Dimensions: Sizes[time:44, ] Coordinates: time datetime64 [s] (time) [1970-01-01T00:19:50, 1970-01-01T01:37:06, ..., 1970-01-01T00:21:42, 1970-01-01T00:28:06] x float64 [m] (time) [0.864923, 0.793226, ..., 0.900084, 0.862296] y float64 [m] (time) [0.851275, 0.791152, ..., 0.912784, 0.840306] Data: float64 [K] (time) [106.664423, 109.135451, ..., 107.185179, 105.569680] ]
[47]:
sc.show(binned)
To allow for this, the bins property provides properties data, coords, masks, and attrs of the bins that behave like the properties of a data array while retaining the binned structure. That is, it can be used for computation involving information available on a per-bin basis:
[48]:
binned.bins.coords['time']
[48]:
- (x: 4, y: 4)VariableViewbinned data [len=52, len=50, ..., len=59, len=44]
Values:
[<scipp.Variable> (time: 52) datetime64 [s] [1970-01-01T00:35:50, 1970-01-01T02:08:26, ..., 1970-01-01T00:28:55, 1970-01-01T02:42:30], <scipp.Variable> (time: 50) datetime64 [s] [1970-01-01T02:11:44, 1970-01-01T01:12:52, ..., 1970-01-01T01:15:14, 1970-01-01T02:34:09], ..., <scipp.Variable> (time: 59) datetime64 [s] [1970-01-01T01:51:18, 1970-01-01T02:41:37, ..., 1970-01-01T02:43:02, 1970-01-01T00:30:26], <scipp.Variable> (time: 44) datetime64 [s] [1970-01-01T00:19:50, 1970-01-01T01:37:06, ..., 1970-01-01T00:21:42, 1970-01-01T00:28:06]]
[49]:
sc.show(binned.bins.coords['time'])
We can use this in our example to correct for a hypothetical clock error that depends on the x-y bin:
[50]:
clock_correction = sc.array(dims=['x', 'y'], unit='s', values=(100 * np.random.rand(4, 4)).astype('int64'))
clock_correction
[50]:
- (x: 4, y: 4)int64s97, 82, ..., 95, 82
Values:
array([[97, 82, 1, 56], [69, 24, 28, 9], [35, 4, 46, 69], [70, 76, 95, 82]])
[51]:
binned.bins.coords['time'] += clock_correction
The properties can also be used to add or delete meta data entries:
[52]:
del binned.bins.coords['x']
Performance¶
New in 0.7
- sort is now considerably faster for data with more rows.
- Reduction operations such as sum and mean are now also multi-threaded and thus considerably faster.