What’s new in scipp¶
This page highlights feature additions and discusses major changes from recent releases. For a full list of changes see the Release Notes.
[1]:
import numpy as np
import scipp as sc
General¶
Unique dimensions and slicing of 1-D objects¶
New in 0.9
The new dim property checks whether an object is 1-D, and returns the only dimension label. An exception is raised if the object is not 1-D.
Example:
[2]:
x = sc.linspace(dim='x', start=0, stop=1, num=4)
x.dim
[2]:
'x'
New in 0.11
1-D objects can now be sliced without specifying a dimension.
Example:
[3]:
x[-1]
[3]:
() float64 1.0
Values:
array(1.)
If an object is not 1-D then DimensionError is raised:
[4]:
var2d = sc.concat([x,x], 'y')
var2d[0]
---------------------------------------------------------------------------
DimensionError Traceback (most recent call last)
/tmp/ipykernel_9954/2472674120.py in <module>
1 var2d = sc.concat([x,x], 'y')
----> 2 var2d[0]
DimensionError: Slicing with implicit dimension label is only possible for 1-D objects. Got Sizes[y:2, x:4, ] with ndim=2. Provide an explicit dimension label, e.g., var['y', 0] instead of var[0].
Logging support¶
New in 0.9
Scipp now provides a logger, and a pre-configured logging widget for Jupyter notebooks. See Logging.
Bound method equivalents to many free functions¶
New in 0.8
Many functions that were previously available only as free functions can now also be used as methods of variables and data arrays. See the documentation for individual classes for a full list.
Example:
[5]:
var = sc.arange(dim="x", unit="m", start=0, stop=12)
var.sum() # Previously sc.sum(var)
[5]:
() int64 m 66
Values:
array(66)
Note that sc.sum(var) will continue to be supported as well.
Unified conversion of unit and dtype¶
New in 0.11
Variables and data arrays have a new method, to, for conversion of dtype, unit, or both. This can be used to replace uses of to_unit and astype.
Example:
[6]:
var = sc.arange(dim='x', start=0, stop=4, unit='m')
var
[6]:
(x: 4) int64 m 0, 1, 2, 3
Values:
array([0, 1, 2, 3])
Use the unit keyword argument to convert to a different unit:
[7]:
var.to(unit='mm')
[7]:
(x: 4) int64 mm 0, 1000, 2000, 3000
Values:
array([ 0, 1000, 2000, 3000])
Use the dtype keyword argument to convert to a different dtype:
[8]:
var.to(dtype='float64')
[8]:
(x: 4) float64 m 0.0, 1.0, 2.0, 3.0
Values:
array([0., 1., 2., 3.])
If both unit and dtype are provided, the implementation attempts to apply the two conversions in optimal order to reduce or avoid the effect of rounding/truncation errors:
[9]:
var.to(dtype='float64', unit='km')
[9]:
(x: 4) float64 km 0.0, 0.001, 0.002, 0.003
Values:
array([0. , 0.001, 0.002, 0.003])
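Why the order of the two conversions matters can be seen in a minimal NumPy sketch. This is not scipp's implementation. It assumes, for illustration only, that an integer unit conversion truncates toward zero, and all variable names are hypothetical.

```python
import numpy as np

# Integer lengths in metres, mirroring the values used above.
metres = np.array([0, 1, 2, 3], dtype=np.int64)

# Unit first, dtype second (assumed truncating integer conversion):
# every value below 1000 m becomes 0 km before the cast to float.
unit_first = (metres // 1000).astype(np.float64)

# Dtype first, unit second: the fractional kilometres survive.
dtype_first = metres.astype(np.float64) / 1000.0

print(unit_first)
print(dtype_first)
```

Converting the dtype before the unit is the lossless order in this case. For other combinations of source and target the opposite order can be preferable, which is why a single method that sees both requests can choose.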
Operations¶
Creation functions¶
New in 0.11
Creation functions for datetimes were added: epoch, datetime, and datetimes.
[10]:
sc.datetime('now', unit='ms')
[10]:
() datetime64 ms 2022-01-13T14:43:55.000
Values:
array('2022-01-13T14:43:55.000', dtype='datetime64[ms]')
[11]:
times = sc.datetimes(dims=['time'], values=['2022-01-11T10:24:03', '2022-01-11T10:24:03'])
times
[11]:
(time: 2) datetime64 s 2022-01-11T10:24:03, 2022-01-11T10:24:03
Values:
array(['2022-01-11T10:24:03', '2022-01-11T10:24:03'], dtype='datetime64[s]')
The new epoch function is useful for obtaining the time since epoch, i.e., a time difference (dtype='int64') instead of a time point (dtype='datetime64'):
[12]:
times - sc.epoch(unit=times.unit)
[12]:
(time: 2) int64 s 1641896643, 1641896643
Values:
array([1641896643, 1641896643])
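The distinction between a time point and a time difference can also be illustrated with plain NumPy, which scipp's datetime dtype mirrors. The following is a sketch of the same arithmetic, not of scipp internals. The variable names are hypothetical.

```python
import numpy as np

# A time point with second resolution, as in the example above.
times = np.array(["2022-01-11T10:24:03"], dtype="datetime64[s]")

# The Unix epoch expressed in the same unit as the time points.
epoch = np.datetime64(0, "s")

# Subtracting two time points yields a time difference (timedelta64),
# which can be viewed as a plain integer count of seconds.
since_epoch = (times - epoch).astype("int64")

print(since_epoch)
```

The integer obtained here is the same number of seconds shown in the scipp output above.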
from_pandas and from_xarray¶
New in 0.8
from_pandas for converting pandas.DataFrame to scipp.Dataset.
from_xarray for converting xarray.DataArray or xarray.Dataset to scipp.DataArray or scipp.Dataset, respectively.
Both functions are available in the compat submodule.
Reduction operations¶
Internal precision in summation operations¶
New in 0.9
Reduction operations such as sum of single-precision (float32) data now use double-precision (float64) internally to reduce the effects of rounding errors.
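The effect of the accumulator precision can be reproduced with a small, deliberately naive sketch. It does not reflect how scipp sums data internally. The helper sum_with_accumulator is hypothetical and only adds values one at a time in a chosen precision.

```python
import numpy as np


def sum_with_accumulator(values, acc_dtype):
    """Add values one by one, keeping the running total in acc_dtype."""
    total = acc_dtype(0)
    for value in values:
        total = acc_dtype(total + acc_dtype(value))
    return total


# float32 has a 24-bit significand, so 2**24 + 1 is not representable.
values = np.array([2**24, 1, 1, 1], dtype=np.float32)

# Each added 1 is rounded away when the running total is float32.
single = sum_with_accumulator(values, np.float32)

# With a float64 running total all three increments are kept.
double = sum_with_accumulator(values, np.float64)

print(float(single), float(double))
```

The inputs are identical in both calls. Only the precision of the running total differs, which is the quantity that changed in this release.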
Reductions over multiple inputs using reduce¶
New in 0.9
The new reduce function can be used for reduction operations that do not operate along a dimension of a scipp object but rather across a list or tuple of multiple scipp objects. The mechanism is a 2-step approach, with a syntax similar to groupby:
[13]:
a = sc.linspace(dim="x", start=0.0, stop=1.0, num=4)
b = sc.linspace(dim="x", start=0.2, stop=0.8, num=4)
c = sc.linspace(dim="x", start=0.2, stop=1.2, num=4)
sc.reduce([a, b, c]).sum()
[13]:
(x: 4) float64 0.4, 1.267, 2.133, 3.0
Values:
array([0.4 , 1.26666667, 2.13333333, 3. ])
[14]:
reducer = sc.reduce([a, b, c])
reducer.min()
[14]:
(x: 4) float64 0.0, 0.333, 0.600, 0.8
Values:
array([0. , 0.33333333, 0.6 , 0.8 ])
[15]:
reducer.max()
[15]:
(x: 4) float64 0.2, 0.533, 0.867, 1.2
Values:
array([0.2 , 0.53333333, 0.86666667, 1.2 ])
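For readers more familiar with NumPy, the same element-wise reductions across several arrays can be sketched with functools.reduce. This is an analogy for what "reducing across a list of objects" means, not a description of how sc.reduce is implemented, and it ignores units and dimension labels.

```python
from functools import reduce

import numpy as np

# The same three inputs as above, as plain arrays.
a = np.linspace(0.0, 1.0, 4)
b = np.linspace(0.2, 0.8, 4)
c = np.linspace(0.2, 1.2, 4)
inputs = [a, b, c]

# Combine the inputs pairwise, element by element.
total = reduce(np.add, inputs)
smallest = reduce(np.minimum, inputs)
largest = reduce(np.maximum, inputs)

print(total, smallest, largest)
```

Each result has the same shape as a single input, because the reduction runs across the list rather than along a dimension.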
Shape operations¶
concat replacing concatenate¶
New in 0.9
concat is replacing concatenate (which is now deprecated and will be removed in 0.10). It supports a list of inputs rather than just 2 inputs.
[16]:
a = sc.scalar(1.2)
b = sc.scalar(2.3)
c = sc.scalar(3.4)
sc.concat([a, b, c], "x")
[16]:
(x: 3) float64 1.2, 2.3, 3.4
Values:
array([1.2, 2.3, 3.4])
Vectors and matrices¶
General¶
New in 0.11
scipp.spatial has been restructured and extended:
New data types for spatial transforms were added:
vector3 (renamed from vector3_float64)
rotation3 (3-D rotation defined using quaternion coefficients)
translation3 (translation in 3-D)
linear_transform3 (previously matrix_3_float64, 3-D linear transform with, e.g., rotation and scaling)
affine_transform3 (affine transform in 3-D, combination of a linear transform and a translation, defined using a 4x4 matrix)
The scipp.spatial submodule was extended with a number of new creation functions, in particular for the new dtypes.
matrix and matrices for creating “matrices” have been deprecated. Use scipp.spatial.linear_transform and scipp.spatial.linear_transforms instead.
Note that the scipp.spatial subpackage must be imported explicitly:
[17]:
from scipp import spatial
linear = spatial.linear_transform(value=[[1,0,0],[0,2,0],[0,0,3]])
linear
[17]:
() linear_transform3 [[1. 0. 0.] [0. 2. 0.] [0. 0. 3.]]
Values:
array([[1., 0., 0.], [0., 2., 0.], [0., 0., 3.]])
[18]:
trans = spatial.translation(value=[1,2,3], unit='m')
trans
[18]:
() translation3 m [1. 2. 3.]
Values:
array([1., 2., 3.])
Multiplication can be used to combine the various transforms:
[19]:
linear * trans
[19]:
() affine_transform3 m [[1. 0. 0. 1.] [0. 2. 0. 4.] [0. 0. 3. 9.] [0. 0. 0. 1.]]
Values:
array([[1., 0., 0., 1.], [0., 2., 0., 4.], [0., 0., 3., 9.], [0., 0., 0., 1.]])
Note that in the case of affine_transform3 the unit refers to the translation part. A unit for the linear part is currently not supported.
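The 4x4 matrix shown for linear * trans can be reproduced with homogeneous coordinates in plain NumPy. The sketch below only illustrates the underlying linear algebra. It ignores units, and the helper to_homogeneous is hypothetical and not part of scipp.

```python
import numpy as np


def to_homogeneous(linear_part, offset):
    """Embed a 3x3 linear part and a 3-vector offset in a 4x4 matrix."""
    matrix = np.eye(4)
    matrix[:3, :3] = linear_part
    matrix[:3, 3] = offset
    return matrix


# The same scaling and translation as in the example above.
linear = to_homogeneous(np.diag([1.0, 2.0, 3.0]), np.zeros(3))
trans = to_homogeneous(np.eye(3), [1.0, 2.0, 3.0])

# The product applies the translation first, then the linear transform,
# so the offset in the last column is scaled by the linear part.
affine = linear @ trans

print(affine)
```

The last column holds the scaled offset (1, 4, 9), matching the affine_transform3 output above.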
Coordinate transformations¶
New in 0.8
The transform_coords function has been added (also available as a method of data arrays and datasets). It is a tool for transforming one or more input coordinates into one or more output coordinates. It automatically handles:
Renaming of dimensions, if dimension-coordinates are transformed.
Conversion of coordinates consumed by the transformation into attributes, so that they do not interfere with follow-up operations.
Conversion of event-coordinates of binned data, if present.
See Coordinate transformations for a full description.
Physical constants¶
New in 0.8
The scipp.constants submodule (in the style of scipy.constants) was added, providing physical constants from CODATA 2018. For full details see the module’s documentation.
Examples:
[20]:
from scipp.constants import hbar, m_e, physical_constants
[21]:
hbar
[21]:
() float64 J*s 1.0545718176461565e-34
Values:
array(1.05457182e-34)
[22]:
m_e
[22]:
() float64 kg 9.1093837015e-31
Values:
array(9.1093837e-31)
[23]:
physical_constants("speed of light in vacuum")
[23]:
() float64 m/s 299792458.0
Values:
array(2.99792458e+08)
[24]:
physical_constants("neutron mass", with_variance=True)
[24]:
() float64 kg 1.67492749804e-27 σ = 9.5e-37
Values:
array(1.6749275e-27)
Variances (σ²):
array(9.025e-73)
[25]:
import numpy as np
N = int(800)
data = sc.DataArray(
data=sc.Variable(dims=["time"], values=100 + np.random.rand(N) * 10, unit="K"),
coords={
"x": sc.Variable(dims=["time"], unit="m", values=np.random.rand(N)),
"y": sc.Variable(dims=["time"], unit="m", values=np.random.rand(N)),
"time": sc.Variable(
dims=["time"], values=(10000 * np.random.rand(N)).astype("datetime64[s]")
),
},
)
binned = sc.bin(
data,
edges=[
sc.linspace(dim="x", unit="m", start=0.0, stop=1.0, num=5),
sc.linspace(dim="y", unit="m", start=0.0, stop=1.0, num=5),
],
)
binned
[25]:
x: 4
y: 4
x (x [bin-edge]) float64 m 0.0, 0.25, 0.5, 0.75, 1.0
Values:
array([0. , 0.25, 0.5 , 0.75, 1. ])
y (y [bin-edge]) float64 m 0.0, 0.25, 0.5, 0.75, 1.0
Values:
array([0. , 0.25, 0.5 , 0.75, 1. ])
(x, y) DataArrayView binned data [len=48, len=48, ..., len=54, len=60]
Values:
[<scipp.DataArray> Dimensions: Sizes[time:48, ] Coordinates: time datetime64 [s] (time) [1970-01-01T02:08:29, 1970-01-01T01:14:35, ..., 1970-01-01T00:29:11, 1970-01-01T00:14:26] x float64 [m] (time) [0.185679, 0.154261, ..., 0.0481469, 0.248385] y float64 [m] (time) [0.157335, 0.0184621, ..., 0.0831641, 0.0758221] Data: float64 [K] (time) [104.196, 109.117, ..., 103.01, 102.546] , <scipp.DataArray> Dimensions: Sizes[time:48, ] Coordinates: time datetime64 [s] (time) [1970-01-01T02:24:02, 1970-01-01T01:59:13, ..., 1970-01-01T01:20:54, 1970-01-01T00:28:46] x float64 [m] (time) [0.185029, 0.226894, ..., 0.141988, 0.0169694] y float64 [m] (time) [0.297743, 0.252622, ..., 0.362635, 0.313435] Data: float64 [K] (time) [109.845, 108.114, ..., 101.075, 100.451] , ..., <scipp.DataArray> Dimensions: Sizes[time:54, ] Coordinates: time datetime64 [s] (time) [1970-01-01T01:25:35, 1970-01-01T00:11:23, ..., 1970-01-01T02:17:47, 1970-01-01T00:46:16] x float64 [m] (time) [0.862611, 0.961646, ..., 0.915319, 0.851259] y float64 [m] (time) [0.703346, 0.593832, ..., 0.510264, 0.650847] Data: float64 [K] (time) [104.647, 103.79, ..., 104.966, 103.122] , <scipp.DataArray> Dimensions: Sizes[time:60, ] Coordinates: time datetime64 [s] (time) [1970-01-01T01:26:06, 1970-01-01T02:21:58, ..., 1970-01-01T00:35:48, 1970-01-01T01:14:51] x float64 [m] (time) [0.863391, 0.891747, ..., 0.873954, 0.883454] y float64 [m] (time) [0.894954, 0.879536, ..., 0.764983, 0.798934] Data: float64 [K] (time) [103.117, 109.604, ..., 103.861, 107.711] ]
[26]:
sc.show(binned)
To allow for this, the bins property provides properties data, coords, masks, and attrs of the bins that behave like the properties of a data array while retaining the binned structure. That is, it can be used for computation involving information available on a per-bin basis:
[27]:
binned.bins.coords["time"]
[27]:
(x: 4, y: 4) VariableView binned data [len=48, len=48, ..., len=54, len=60]
Values:
[<scipp.Variable> (time: 48) datetime64 [s] [1970-01-01T02:08:29, 1970-01-01T01:14:35, ..., 1970-01-01T00:29:11, 1970-01-01T00:14:26], <scipp.Variable> (time: 48) datetime64 [s] [1970-01-01T02:24:02, 1970-01-01T01:59:13, ..., 1970-01-01T01:20:54, 1970-01-01T00:28:46], ..., <scipp.Variable> (time: 54) datetime64 [s] [1970-01-01T01:25:35, 1970-01-01T00:11:23, ..., 1970-01-01T02:17:47, 1970-01-01T00:46:16], <scipp.Variable> (time: 60) datetime64 [s] [1970-01-01T01:26:06, 1970-01-01T02:21:58, ..., 1970-01-01T00:35:48, 1970-01-01T01:14:51]]
[28]:
sc.show(binned.bins.coords["time"])
We can use this in our example to correct for a hypothetical clock error that depends on the x-y bin:
[29]:
clock_correction = sc.array(
dims=["x", "y"], unit="s", values=(100 * np.random.rand(4, 4)).astype("int64")
)
clock_correction
[29]:
(x: 4, y: 4) int64 s 23, 98, ..., 46, 61
Values:
array([[23, 98, 98, 24], [17, 13, 71, 92], [47, 42, 28, 2], [45, 16, 46, 61]])
[30]:
binned.bins.coords["time"] += clock_correction
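Conceptually, the in-place addition above applies one correction value per bin to every event inside that bin. A minimal NumPy sketch of that idea follows. It assumes a flat list of events with a known bin index each, which is a simplification of scipp's binned data layout, and all names are hypothetical.

```python
import numpy as np

# Event timestamps in seconds and the index of the bin each event is in.
event_time = np.array([10, 20, 30, 40, 50], dtype=np.int64)
event_bin = np.array([0, 0, 1, 2, 2])

# One clock correction per bin, in seconds.
correction = np.array([5, -3, 7], dtype=np.int64)

# Indexing with the bin index broadcasts the per-bin value to its events.
corrected = event_time + correction[event_bin]

print(corrected)
```

With the bins property this bookkeeping is not needed, since the correction is matched to the bins by dimension label.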
The properties can also be used to add or delete metadata entries:
[31]:
del binned.bins.coords["x"]
SciPy compatibility layer¶
New in 0.11
A number of subpackages providing wrappers for a subset of functions from the corresponding packages in SciPy were added:
scipp.integrate providing simpson and trapezoid.
scipp.interpolate providing interp1d.
scipp.optimize providing curve_fit.
scipp.signal providing butter and sosfiltfilt.
Please refer to the function documentation for working examples.
Performance¶
New in 0.9
sc.lookup(histogram, dim)[var] is now faster if histogram is very long and is integer-valued. This is relevant in a number of event-filtering operations.
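As a rough picture of what such a lookup computes, the following NumPy sketch maps each input value to the content of the histogram bin that contains it. It is not scipp's algorithm. It assumes sorted bin edges and left-closed bins, does not handle values outside the edge range, and uses hypothetical names.

```python
import numpy as np

# Bin edges and integer-valued bin contents of a small histogram.
edges = np.array([0.0, 1.0, 2.0, 3.0])
weights = np.array([10, 20, 30])

# Values for which the enclosing bin content should be looked up.
x = np.array([0.5, 2.5, 1.0, 2.999])

# Find, for each value, the index of the bin [edges[i], edges[i+1]).
index = np.searchsorted(edges, x, side="right") - 1
looked_up = weights[index]

print(looked_up)
```

In event filtering, the looked-up values typically act as a per-event flag or weight, which is why integer-valued histograms are a common case.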