What’s new in scipp#
This page highlights feature additions and discusses major changes from recent releases. For a full list of changes see the Release Notes.
[1]:
import numpy as np
import scipp as sc
General#
Unique dimensions and slicing of 1-D objects#
New in 0.9
The new dim property checks whether an object is 1-D, and returns the only dimension label. An exception is raised if the object is not 1-D.
Example:
[2]:
x = sc.linspace(dim='x', start=0, stop=1, num=4)
x.dim
[2]:
'x'
New in 0.11
1-D objects can now be sliced without specifying a dimension.
Example:
[3]:
x[-1]
[3]:
- ()float64𝟙1.0
Values:
array(1.)
If an object is not 1-D then DimensionError is raised:
[4]:
var2d = sc.concat([x,x], 'y')
var2d[0]
---------------------------------------------------------------------------
DimensionError Traceback (most recent call last)
Input In [4], in <cell line: 2>()
1 var2d = sc.concat([x,x], 'y')
----> 2 var2d[0]
DimensionError: Slicing with implicit dimension label is only possible for 1-D objects. Got Sizes[y:2, x:4, ] with ndim=2. Provide an explicit dimension label, e.g., var['y', 0] instead of var[0].
Slicing with stride#
New in 0.12
Positional slicing (slicing with integer indices, as opposed to slicing with a label matching a coordinate value) now supports strides.
Negative strides are currently not supported.
Examples:
[5]:
y = sc.arange('y', 10)
y[::2]
[5]:
- (y: 5)int64𝟙0, 2, 4, 6, 8
Values:
array([0, 2, 4, 6, 8])
[6]:
x = sc.linspace('x', 0.0, 1.0, num=5)
da = sc.DataArray(sc.ones(dims=['x', 'y'], shape=[4,10], unit='K'), coords={'x':x, 'y':y})
da['y', 1::2]
[6]:
- x: 4
- y: 5
- x(x [bin-edge])float64𝟙0.0, 0.25, 0.5, 0.75, 1.0
Values:
array([0. , 0.25, 0.5 , 0.75, 1. ])
- y(y)int64𝟙1, 3, 5, 7, 9
Values:
array([1, 3, 5, 7, 9])
- (x, y)float64K1.0, 1.0, ..., 1.0, 1.0
Values:
array([[1., 1., 1., 1., 1.], [1., 1., 1., 1., 1.], [1., 1., 1., 1., 1.], [1., 1., 1., 1., 1.]])
Slicing a dimension with a bin-edge coordinate with a stride is ill-defined and not supported:
[7]:
da['x', ::2]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 da['x', ::2]
IndexError: Object has bin-edges along dimension x so slicing with stride 2 != 1 is not valid.
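The reason this is ill-defined can be seen without scipp. The following is a plain-Python sketch, not scipp code; the values in data are made up for illustration. Taking every second bin leaves bins that no longer share boundaries, so no single edge array of length N+1 can describe them.

```python
# Plain-Python sketch (not scipp code): 4 bins described by 5 bin edges.
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
data = [10, 20, 30, 40]  # hypothetical values, one per bin

# Bin i spans edges[i] .. edges[i + 1].
bins = [(edges[i], edges[i + 1]) for i in range(len(data))]

# Taking every second bin keeps bins 0 and 2 ...
kept_bins = bins[::2]
# ... whereas striding the edge list yields boundaries of different, wider bins.
strided_edges = edges[::2]

# The kept bins do not touch, so they cannot be described by one edge array.
contiguous = all(
    kept_bins[i][1] == kept_bins[i + 1][0] for i in range(len(kept_bins) - 1)
)

print(kept_bins)      # [(0.0, 0.25), (0.5, 0.75)]
print(strided_edges)  # [0.0, 0.5, 1.0]
print(contiguous)     # False
```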
Slicing: Advanced indexing support with integer array or boolean variable#
New in 0.13
Added support for indexing with an integer array.
Added support for indexing with a boolean variable.
The Slicing documentation provides details and examples.
Units#
Unified conversion of unit and dtype#
New in 0.11
Variables and data arrays have a new method, to, for conversion of dtype, unit, or both. This can be used to replace uses of to_unit and astype.
Example:
[8]:
var = sc.arange(dim='x', start=0, stop=4, unit='m')
var
[8]:
- (x: 4)int64m0, 1, 2, 3
Values:
array([0, 1, 2, 3])
Use the unit keyword argument to convert to a different unit:
[9]:
var.to(unit='mm')
[9]:
- (x: 4)int64mm0, 1000, 2000, 3000
Values:
array([ 0, 1000, 2000, 3000])
Use the dtype keyword argument to convert to a different dtype:
[10]:
var.to(dtype='float64')
[10]:
- (x: 4)float64m0.0, 1.0, 2.0, 3.0
Values:
array([0., 1., 2., 3.])
If both unit and dtype are provided, the implementation attempts to apply the two conversions in optimal order to reduce or avoid the effect of rounding/truncation errors:
[11]:
var.to(dtype='float64', unit='km')
[11]:
- (x: 4)float64km0.0, 0.001, 0.002, 0.003
Values:
array([0. , 0.001, 0.002, 0.003])
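Why the order matters can be illustrated in plain Python. This is a sketch only, not scipp's implementation; it assumes that an integer unit conversion discards the fractional part, modelled here with floor division.

```python
# Plain-Python sketch (not scipp's implementation): convert integer metres
# to floating-point kilometres in two different orders.
values_m = [0, 1, 2, 3]

# Unit first, dtype second: the m -> km step runs in integer arithmetic
# (assumed to truncate), so everything below 1 km is lost before the cast.
unit_first = [float(v // 1000) for v in values_m]

# Dtype first, unit second: the division happens in floating point.
dtype_first = [float(v) / 1000 for v in values_m]

print(unit_first)   # [0.0, 0.0, 0.0, 0.0]
print(dtype_first)  # [0.0, 0.001, 0.002, 0.003]
```

Converting the dtype first reproduces the values shown in the output above, whereas converting the unit first would lose them.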
Support for unit=None#
New in 0.12
Previously scipp used unit=sc.units.dimensionless (or the alias unit=sc.units.one) for anything that does not have a unit, such as strings, booleans, or bins. To allow for distinction of actual physically dimensionless quantities from these cases, scipp now supports variables and, by extension, data arrays that have their unit set to None.
This change is accompanied by a number of related changes:
Creation functions use a default unit if none is given explicitly. The default for numbers (floating point or integer) is sc.units.dimensionless. The default for everything else, including bool, is None.
Comparison operations, which return variables with dtype=bool, have unit=None.
A new function index was added, to allow for creation of a 0-D variable with unit=None. This complements scalar, which uses the default unit (depending on the dtype).
Examples:
[12]:
print(sc.array(dims=['x'], values=[1.1,2.2,3.3]))
print(sc.array(dims=['x'], values=[1,2,3]))
print(sc.array(dims=['x'], values=[False, True, False]))
print(sc.array(dims=['x'], values=['a','b','c']))
<scipp.Variable> (x: 3) float64 [dimensionless] [1.1, 2.2, 3.3]
<scipp.Variable> (x: 3) int64 [dimensionless] [1, 2, 3]
<scipp.Variable> (x: 3) bool <no unit> [False, True, False]
<scipp.Variable> (x: 3) string <no unit> ["a", "b", "c"]
[13]:
a = sc.array(dims=['x'], values=[1,2,3])
b = sc.array(dims=['x'], values=[1,3,3])
print(a == b)
print(a < b)
<scipp.Variable> (x: 3) bool <no unit> [True, False, True]
<scipp.Variable> (x: 3) bool <no unit> [False, True, False]
[14]:
(a == b).unit is None
[14]:
True
For some purposes we may use a coordinate with unique integer-valued identifiers. Since the identifiers do not have a physical meaning, we use unit=None. Note that this has to be given explicitly since otherwise integers are treated as numbers, i.e., the unit would be dimensionless:
[15]:
da = sc.DataArray(a, coords={'id':sc.array(dims=['x'], unit=None, values=[34,21,14])})
da
[15]:
- x: 3
- id(x)int6434, 21, 14
Values:
array([34, 21, 14])
- (x)int64𝟙1, 2, 3
Values:
array([1, 2, 3])
The index function can now be used to conveniently look up data by its identifier:
[16]:
da['id', sc.index(21)]
[16]:
- ()int64𝟙2
Values:
array(2)
- id()int6421
Values:
array(21)
Reduced effect of rounding errors when converting units#
New in 0.14
sc.to_unit (and therefore also the to() method) now avoids rounding errors when converting from a large unit to a small unit, if the conversion factor is integral.
Example:
[17]:
sc.scalar(1.0, unit='m').to(unit='nm')
[17]:
- ()float64nm1000000000.0
Values:
array(1.e+09)
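One plausible source of such rounding errors can be shown in plain Python. This is an assumption for illustration and not a description of scipp's internals: if the conversion factor is derived as a ratio of unit multipliers, a multiplier such as 1e-9 is not exactly representable in floating point, so the factor may not equal the exact integer.

```python
from fractions import Fraction

value_m = 1.0

# Assumption: the factor is derived from the ratio of the two unit
# multipliers. 1e-9 has no exact float representation, so the resulting
# factor may differ slightly from the integer 10**9.
factor_from_ratio = 1.0 / 1e-9

# Using the integral factor directly keeps this multiplication exact.
factor_integral = float(10**9)

via_ratio = value_m * factor_from_ratio
via_integral = value_m * factor_integral

# Fraction converts floats exactly, which exposes any representation error.
print(Fraction(via_ratio) - 10**9)
print(Fraction(via_integral) - 10**9)  # 0
```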
Checking if coordinates are bin-edges#
New in 0.13
The coords property (and also the attrs, meta, and masks properties) now provide the is_edges method to check if an entry is a bin-edge coordinate.
Example:
[18]:
import scipp as sc
x = sc.arange('x', 3)
da = sc.DataArray(x, coords={'x1': x, 'x2': sc.arange('x', 4)})
print(f"{da.coords.is_edges('x1') = }")
print(f"{da.coords.is_edges('x2') = }")
da.coords.is_edges('x1') = False
da.coords.is_edges('x2') = True
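The result above is consistent with a simple length comparison. The sketch below is plain Python with a hypothetical helper named is_edges_sketch; it is not scipp's implementation and ignores details such as multi-dimensional coordinates.

```python
def is_edges_sketch(coord_length: int, data_length: int) -> bool:
    """Hypothetical helper: a bin-edge coordinate has one more entry than the data."""
    return coord_length == data_length + 1


data_length = 3  # the data in the example above has 3 elements along x

print(is_edges_sketch(3, data_length))  # False, like 'x1'
print(is_edges_sketch(4, data_length))  # True, like 'x2'
```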
Operations#
Creation functions#
New in 0.11
Creation functions for datetimes were added:
Added epoch, datetime, and datetimes.
[19]:
sc.datetime('now', unit='ms')
[19]:
- ()datetime64ms2022-07-19T12:37:46.000
Values:
array('2022-07-19T12:37:46.000', dtype='datetime64[ms]')
[20]:
times = sc.datetimes(dims=['time'], values=['2022-01-11T10:24:03', '2022-01-11T10:24:03'])
times
[20]:
- (time: 2)datetime64s2022-01-11T10:24:03, 2022-01-11T10:24:03
Values:
array(['2022-01-11T10:24:03', '2022-01-11T10:24:03'], dtype='datetime64[s]')
The new epoch function is useful for obtaining the time since epoch, i.e., a time difference (dtype='int64') instead of a time point (dtype='datetime64'):
[21]:
times - sc.epoch(unit=times.unit)
[21]:
- (time: 2)int64s1641896643, 1641896643
Values:
array([1641896643, 1641896643])
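The same number can be cross-checked with the Python standard library. This sketch assumes the time point is interpreted as UTC and that the epoch is the Unix epoch, 1970-01-01T00:00:00.

```python
from datetime import datetime, timezone

# Assumption: the time point is in UTC and the epoch is the Unix epoch.
time_point = datetime(2022, 1, 11, 10, 24, 3, tzinfo=timezone.utc)
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Subtracting two time points yields a time difference.
seconds_since_epoch = int((time_point - unix_epoch).total_seconds())
print(seconds_since_epoch)  # 1641896643
```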
New in 0.12
zeros_like, ones_like, empty_like, and full_like can now be used with data arrays.
Example:
[22]:
x = sc.linspace('x', 0.0, 1.0, num=5)
da = sc.DataArray(sc.ones(dims=['x', 'y'], shape=[4,6], unit='K'), coords={'x':x})
sc.zeros_like(da)
[22]:
- x: 4
- y: 6
- x(x [bin-edge])float64𝟙0.0, 0.25, 0.5, 0.75, 1.0
Values:
array([0. , 0.25, 0.5 , 0.75, 1. ])
- (x, y)float64K0.0, 0.0, ..., 0.0, 0.0
Values:
array([[0., 0., 0., 0., 0., 0.], [0., 0., 0., 0., 0., 0.], [0., 0., 0., 0., 0., 0.], [0., 0., 0., 0., 0., 0.]])
Utility methods and functions#
New in 0.12
Added squeeze method to remove length-1 dimensions from objects.
Added rename method to rename dimensions and associated dimension-coordinates (or attributes). This complements rename_dims, which only changes dimension labels but does not rename coordinates.
Added midpoints to compute bin-centers.
Example:
[23]:
x = sc.linspace('x', 0.0, 1.0, num=5)
da = sc.DataArray(sc.ones(dims=['x', 'y'], shape=[4,6], unit='K'), coords={'x':x})
A length-1 x-dimension…
[24]:
da['x', 0:1]
[24]:
- x: 1
- y: 6
- x(x [bin-edge])float64𝟙0.0, 0.25
Values:
array([0. , 0.25])
- (x, y)float64K1.0, 1.0, ..., 1.0, 1.0
Values:
array([[1., 1., 1., 1., 1., 1.]])
… can be removed with squeeze:
[25]:
da['x', 0:1].squeeze()
[25]:
- y: 6
- (y)float64K1.0, 1.0, ..., 1.0, 1.0
Values:
array([1., 1., 1., 1., 1., 1.])
- x(x)float64𝟙0.0, 0.25
Values:
array([0. , 0.25])
squeeze returns a new object and leaves the original unchanged.
Renaming is most convenient using keyword arguments:
[26]:
da.rename(x='xnew')
[26]:
- xnew: 4
- y: 6
- xnew(xnew [bin-edge])float64𝟙0.0, 0.25, 0.5, 0.75, 1.0
Values:
array([0. , 0.25, 0.5 , 0.75, 1. ])
- (xnew, y)float64K1.0, 1.0, ..., 1.0, 1.0
Values:
array([[1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.]])
rename returns a new object and leaves the original unchanged.
midpoints can be used to replace a bin-edge coordinate by bin centers:
[27]:
da.coords['x'] = sc.midpoints(da.coords['x'])
da
[27]:
- x: 4
- y: 6
- x(x)float64𝟙0.125, 0.375, 0.625, 0.875
Values:
array([0.125, 0.375, 0.625, 0.875])
- (x, y)float64K1.0, 1.0, ..., 1.0, 1.0
Values:
array([[1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.], [1., 1., 1., 1., 1., 1.]])
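The bin centers shown above are the averages of adjacent bin edges. A plain-Python sketch of that arithmetic, not scipp's implementation:

```python
# Bin edges of the 'x' coordinate from the example above.
edges = [0.0, 0.25, 0.5, 0.75, 1.0]

# Each center is the mean of a bin's lower and upper edge.
centers = [0.5 * (lower + upper) for lower, upper in zip(edges[:-1], edges[1:])]

print(centers)  # [0.125, 0.375, 0.625, 0.875]
```

The result has one element fewer than the edge array, which is why the coordinate is no longer flagged as bin-edge.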
Reduction operations#
More operations supported by data arrays and datasets#
New in 0.14
DataArray and Dataset now support more reduction operations, including sum, nansum, mean, nanmean, max, min, nanmax, nanmin, all, and any.
All of the above are now also supported for the bins property.
groupby now also supports all of these operations. Exception: nanmean.
Event-based masks are now supported in all reduction operations.
Example:
[28]:
da = sc.data.binned_x(nevent=100, nbin=3)
da
[28]:
- x: 3
- x(x [bin-edge])float64m0.0, 0.333, 0.667, 1.0
Values:
array([0. , 0.33333333, 0.66666667, 1. ])
- (x)DataArrayViewbinned data [len=37, len=26, len=37]
Values:
[<scipp.DataArray> Dimensions: Sizes[row:37, ] Coordinates: x float64 [m] (row) [0.261692, 0.319097, ..., 0.0338069, 0.0670407] y float64 [m] (row) [0.174846, 0.495505, ..., 0.00843672, 0.266437] z float64 [m] (row) [0.806809, 0.0545327, ..., 0.367852, 0.26365] Data: float64 [K] (row) [1.04638, 1.08511, ..., 1.0173, 1.05255] , <scipp.DataArray> Dimensions: Sizes[row:26, ] Coordinates: x float64 [m] (row) [0.380196, 0.441006, ..., 0.461613, 0.350609] y float64 [m] (row) [0.619161, 0.277216, ..., 0.221658, 0.220269] z float64 [m] (row) [0.00207108, 0.762591, ..., 0.351097, 0.00326209] Data: float64 [K] (row) [1.0491, 1.02263, ..., 1.08232, 1.08752] , <scipp.DataArray> Dimensions: Sizes[row:37, ] Coordinates: x float64 [m] (row) [0.9767, 0.923246, ..., 0.823521, 0.947199] y float64 [m] (row) [0.29784, 0.301757, ..., 0.741137, 0.941554] z float64 [m] (row) [0.747273, 0.809356, ..., 0.454869, 0.313959] Data: float64 [K] (row) [1.05724, 1.07197, ..., 1.01219, 1.03879] ]
The maximum value in each bin:
[29]:
da.bins.max()
[29]:
- x: 3
- x(x [bin-edge])float64m0.0, 0.333, 0.667, 1.0
Values:
array([0. , 0.33333333, 0.66666667, 1. ])
- (x)float64K1.095, 1.094, 1.100
Values:
array([1.09532596, 1.09356846, 1.09963128])
The maximum value in each bin of a binned variable, here a coordinate:
[30]:
da.bins.coords['x'].bins.max()
[30]:
- (x: 3)float64m0.326, 0.660, 0.992
Values:
array([0.3264945 , 0.65987435, 0.99225923])
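Conceptually, a per-bin reduction with event-based masks reduces the events of each bin while skipping masked ones. The following plain-Python sketch uses made-up events and a hypothetical helper bin_max; it assumes that masked events are excluded from the reduction and is not scipp's implementation.

```python
# Hypothetical events in two bins, stored as (value, masked) pairs.
bins = [
    [(1.0, False), (5.0, True), (2.0, False)],
    [(3.0, False), (4.0, False)],
]


def bin_max(events):
    """Maximum over the events of one bin, skipping masked events."""
    return max(value for value, masked in events if not masked)


maxima = [bin_max(events) for events in bins]
print(maxima)  # [2.0, 4.0]
```

In this sketch a bin whose events are all masked would raise a ValueError from max; how scipp treats that case is not covered here.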
Shape operations#
fold supports size -1#
New in 0.12
fold now accepts up to one size (or shape) entry with value -1. This indicates that the size should be computed automatically based on the input size and other provided sizes.
Example:
[31]:
var = sc.arange('xyz', 2448)
var.fold('xyz', sizes={'x':4, 'y':4, 'z':-1})
[31]:
- (x: 4, y: 4, z: 153)int64𝟙0, 1, ..., 2446, 2447
Values:
array([[[ 0, 1, 2, ..., 150, 151, 152], [ 153, 154, 155, ..., 303, 304, 305], [ 306, 307, 308, ..., 456, 457, 458], [ 459, 460, 461, ..., 609, 610, 611]], [[ 612, 613, 614, ..., 762, 763, 764], [ 765, 766, 767, ..., 915, 916, 917], [ 918, 919, 920, ..., 1068, 1069, 1070], [1071, 1072, 1073, ..., 1221, 1222, 1223]], [[1224, 1225, 1226, ..., 1374, 1375, 1376], [1377, 1378, 1379, ..., 1527, 1528, 1529], [1530, 1531, 1532, ..., 1680, 1681, 1682], [1683, 1684, 1685, ..., 1833, 1834, 1835]], [[1836, 1837, 1838, ..., 1986, 1987, 1988], [1989, 1990, 1991, ..., 2139, 2140, 2141], [2142, 2143, 2144, ..., 2292, 2293, 2294], [2295, 2296, 2297, ..., 2445, 2446, 2447]]])
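The inferred size follows from dividing the input length by the product of the given sizes. A minimal plain-Python sketch with a hypothetical helper resolve_sizes, not scipp's implementation and assuming all given sizes are positive:

```python
from math import prod


def resolve_sizes(total: int, sizes: dict) -> dict:
    """Hypothetical helper: replace a single -1 entry by the inferred size."""
    unknown = [dim for dim, size in sizes.items() if size == -1]
    if len(unknown) > 1:
        raise ValueError("At most one size may be -1.")
    # Product of all explicitly given sizes (assumed positive).
    known = prod(size for size in sizes.values() if size != -1)
    if not unknown:
        if known != total:
            raise ValueError("Sizes do not match the input length.")
        return dict(sizes)
    if total % known != 0:
        raise ValueError("Input length is not divisible by the given sizes.")
    resolved = dict(sizes)
    resolved[unknown[0]] = total // known
    return resolved


print(resolve_sizes(2448, {'x': 4, 'y': 4, 'z': -1}))  # {'x': 4, 'y': 4, 'z': 153}
```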
broadcast supports DataArray#
New in 0.13
broadcast now also supports data arrays.
Vectors and matrices#
General#
New in 0.11
scipp.spatial has been restructured and extended:
New data types for spatial transforms were added:
  vector3 (renamed from vector3_float64)
  rotation3 (3-D rotation defined using quaternion coefficients)
  translation3 (translation in 3-D)
  linear_transform3 (previously matrix_3_float64, 3-D linear transform with, e.g., rotation and scaling)
  affine_transform3 (affine transform in 3-D, combination of a linear transform and a translation, defined using 4x4 matrix)
The scipp.spatial submodule was extended with a number of new creation functions, in particular for the new dtypes.
matrix and matrices for creating “matrices” have been deprecated. Use scipp.spatial.linear_transform and scipp.spatial.linear_transforms instead.
Note that the scipp.spatial subpackage must be imported explicitly:
[32]:
from scipp import spatial
linear = spatial.linear_transform(value=[[1,0,0],[0,2,0],[0,0,3]])
linear
[32]:
- ()linear_transform3𝟙[[1. 0. 0.] [0. 2. 0.] [0. 0. 3.]]
Values:
array([[1., 0., 0.], [0., 2., 0.], [0., 0., 3.]])
[33]:
trans = spatial.translation(value=[1,2,3], unit='m')
trans
[33]:
- ()translation3m[1. 2. 3.]
Values:
array([1., 2., 3.])
Multiplication can be used to combine the various transforms:
[34]:
linear * trans
[34]:
- ()affine_transform3m[[1. 0. 0. 1.] [0. 2. 0. 4.] [0. 0. 3. 9.] [0. 0. 0. 1.]]
Values:
array([[1., 0., 0., 1.], [0., 2., 0., 4.], [0., 0., 3., 9.], [0., 0., 0., 1.]])
Note that in the case of affine_transform3 the unit refers to the translation part. A unit for the linear part is currently not supported.
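The combined 4x4 matrix shown above can be reproduced by writing both transforms as homogeneous 4x4 matrices and multiplying them. This is a plain-Python sketch of the arithmetic with a hypothetical helper matmul; it is not scipp's implementation and ignores units.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Linear transform diag(1, 2, 3) embedded in a homogeneous 4x4 matrix.
linear = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Translation by (1, 2, 3) as a homogeneous 4x4 matrix.
translation = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]

# linear * translation: the translation is applied first, then scaled.
combined = matmul(linear, translation)
for row in combined:
    print(row)
```

The last column of the product is the translation scaled by the linear part, (1, 4, 9), matching the affine_transform3 output above.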
SciPy compatibility layer#
New in 0.11
A number of subpackages providing wrappers for a subset of functions from the corresponding packages in SciPy were added:
scipp.integrate providing simpson and trapezoid.
scipp.interpolate providing interp1d.
scipp.optimize providing curve_fit.
scipp.signal providing butter and sosfiltfilt.
Please refer to the function documentation for working examples.
New in 0.14
scipp.ndimage providing gaussian_filter, median_filter, and more.
Performance#
New in 0.12
sc.bin() is now faster when binning or grouping into thousands of bins or more.
New in 0.14
Fixed slow import times of scipp.