What is scipp¶
Containers providing multi-dimensional arrays with associated dicts of coordinates, masks, and attributes
A Mantid evolution born out of an attempt to rethink data structures
Heavily influenced by the Python xarray project
C++ core with Python bindings. Python is a first-class element.
Development gathered pace in 2020
Feature Summary¶
Very flexible containers with good optimisation potential
Supports key features: Variances, Histograms, Masking, Events, Units, Bin-edges, Slicing, Sample-Environment
Can provide a good scientific representation of data; does not force users to work in Detector-Space
Emphasises use of built-in generic functions
Bundles its own plotting library
Dataset and DataArray are the main data containers
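To illustrate what carrying variances alongside values implies, here is a minimal sketch in plain Python. It assumes standard first-order propagation for uncorrelated quantities, which the summary above does not spell out. The `Measured` class is hypothetical and is not part of scipp's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measured:
    """Hypothetical value/variance pair; not a scipp type."""
    value: float
    variance: float

    def __add__(self, other: "Measured") -> "Measured":
        # Variances of uncorrelated quantities add under addition.
        return Measured(self.value + other.value,
                        self.variance + other.variance)

    def __mul__(self, other: "Measured") -> "Measured":
        # First-order propagation for a product of uncorrelated quantities:
        # var(a*b) ~ b^2 * var(a) + a^2 * var(b)
        return Measured(self.value * other.value,
                        other.value ** 2 * self.variance
                        + self.value ** 2 * other.variance)


# Counting statistics: variance equal to the count (Poisson assumption).
first = Measured(100.0, 100.0)
second = Measured(25.0, 25.0)
total = first + second
product = first * second
print(total)
print(product)
```

A container that stores variances next to values can apply rules like these in every arithmetic operation, so the user does not have to track uncertainties by hand.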
Feature Exhibit¶
There are many demos and tutorials in the scipp online documentation
N-d data¶
We take the example of a 2D numpy array with values between 1 and 100
[1]:
import numpy as np
data = np.arange(1.0, 101.0).reshape(10,10)
data
[1]:
array([[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.],
[ 11., 12., 13., 14., 15., 16., 17., 18., 19., 20.],
[ 21., 22., 23., 24., 25., 26., 27., 28., 29., 30.],
[ 31., 32., 33., 34., 35., 36., 37., 38., 39., 40.],
[ 41., 42., 43., 44., 45., 46., 47., 48., 49., 50.],
[ 51., 52., 53., 54., 55., 56., 57., 58., 59., 60.],
[ 61., 62., 63., 64., 65., 66., 67., 68., 69., 70.],
[ 71., 72., 73., 74., 75., 76., 77., 78., 79., 80.],
[ 81., 82., 83., 84., 85., 86., 87., 88., 89., 90.],
[ 91., 92., 93., 94., 95., 96., 97., 98., 99., 100.]])
In scipp we attach labels to the dimensions. This additional information helps with numerous things as we will see below.
[2]:
import scipp as sc
image_data = sc.array(dims=['y', 'x'], values=data)
[3]:
sc.plot(image_data)
Coordinates and Units¶
[4]:
# Let's give our image data the correct units
image_data.unit = sc.units.counts
x = sc.array(dims=['x'], values=np.arange(10), unit=sc.units.mm)
y = sc.array(dims=['y'], values=np.arange(10), unit=sc.units.mm)
image = sc.DataArray(data=image_data, coords={'x':x, 'y':y})
sc.plot(image, aspect='equal')
[5]:
sc.show(image)
Unit mismatch¶
Coordinates and units are not just about pretty labels: they provide safety against preventable, costly mistakes. Let's see.
[6]:
reference = image.copy()
normalized = image / reference
try:
    image + normalized  # Caught!
except RuntimeError as e:
    print(e)
Cannot add counts and dimensionless.
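The reasoning behind this error can be sketched with a toy model in plain Python: dividing counts by counts cancels the unit, and adding the dimensionless result back to counts is rejected. The `Quantity` class below is hypothetical and only mimics the behaviour seen above; it is not how scipp implements units.

```python
class Quantity:
    """Hypothetical value-with-unit pair used only for illustration."""

    def __init__(self, value: float, unit: str):
        self.value = value
        self.unit = unit

    def __add__(self, other: "Quantity") -> "Quantity":
        # Addition is only meaningful for identical units.
        if self.unit != other.unit:
            raise RuntimeError(f"Cannot add {self.unit} and {other.unit}.")
        return Quantity(self.value + other.value, self.unit)

    def __truediv__(self, other: "Quantity") -> "Quantity":
        # Dividing like units cancels them; other cases are kept symbolic.
        if self.unit == other.unit:
            unit = "dimensionless"
        else:
            unit = f"{self.unit}/{other.unit}"
        return Quantity(self.value / other.value, unit)


counts = Quantity(50.0, "counts")
ratio = counts / Quantity(100.0, "counts")  # unit becomes dimensionless

try:
    counts + ratio
    message = None
except RuntimeError as err:
    message = str(err)
print(message)  # Cannot add counts and dimensionless.
```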
Coordinate mismatch¶
[7]:
background_corrected = reference - image
sc.plot(background_corrected)
[8]:
reference.coords['x'] += 4 * sc.units.mm # Detector shifted along x
sc.plot(reference)
[9]:
try:
    reference - image
except RuntimeError as e:
    print(e)
Mismatch in coordinate 'x', expected
(x: 10) int64 [mm] [4, 5, ..., 12, 13], got
(x: 10) int64 [mm] [0, 1, ..., 8, 9]
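The idea behind this check can be sketched in a few lines of plain Python: before combining two data sets element by element, compare their coordinates and refuse to continue if they differ. The `subtract_aligned` helper and the `(coord, values)` tuples are hypothetical stand-ins, not scipp objects.

```python
def subtract_aligned(a, b):
    """Subtract b from a, refusing to proceed if their coordinates differ.

    Both arguments are hypothetical (coord, values) pairs of equal-length lists.
    """
    a_coord, a_values = a
    b_coord, b_values = b
    if a_coord != b_coord:
        raise RuntimeError(
            f"Mismatch in coordinate, expected {a_coord}, got {b_coord}")
    return a_coord, [p - q for p, q in zip(a_values, b_values)]


frame = ([0, 1, 2], [10.0, 20.0, 30.0])
ref = ([0, 1, 2], [1.0, 2.0, 3.0])
difference = subtract_aligned(frame, ref)
print(difference)

# Shift the reference coordinate by 4, as if the detector had moved.
shifted = ([c + 4 for c in ref[0]], ref[1])
try:
    subtract_aligned(shifted, frame)
    caught = False
except RuntimeError as err:
    caught = True
    print(err)
```

With bare arrays the second subtraction would succeed silently and produce a wrong result; the coordinate comparison turns that into an explicit error.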
Masking¶
[10]:
image2 = image.copy()
image.masks['lhs'] = image.coords['x'] < 5.0 * sc.units.mm
sc.plot(image)
Let's make more masks…
[11]:
image.masks['bad-pixel'] = image.data >= 99 * sc.units.counts
sc.plot(image)
[12]:
image2.masks['bad-row'] = image.coords['y'] == 6 * sc.units.mm
sc.plot(image2)
Masks are combined with a logical OR. The data itself is not zeroed until an operation means the mask has to be dropped.
[13]:
image += image2
sc.plot(image)
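The mask semantics can be sketched with plain numpy. This is an illustration of the idea only, assuming that masked elements are excluded when a reduction removes the masked dimensions; the variable names are made up for the example and nothing here is scipp's implementation.

```python
import numpy as np

values = np.arange(1.0, 13.0).reshape(3, 4)   # dims: (y, x)
lhs_mask = np.array([True, True, False, False])   # mask along x
row_mask = np.array([False, True, False])         # mask along y

# Combining masks is a logical OR, broadcast to the shape of the data.
combined = lhs_mask[np.newaxis, :] | row_mask[:, np.newaxis]

# The values are untouched so far; the masks only travel alongside them.
print(values.sum())

# Only when a reduction removes the masked dimensions does the mask have to
# be applied: masked elements are excluded from the sum.
masked_sum = np.where(combined, 0.0, values).sum()
print(masked_sum)
```

Keeping the data intact means a mask can still be removed or changed later without losing information.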
[14]:
sc.to_html(image)
sc.show(image)
- y: 10
- x: 10
- x (x) int64 mm 0, 1, ..., 8, 9
Values:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
- y (y) int64 mm 0, 1, ..., 8, 9
Values:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
- (y, x) float64 counts 2.0, 4.0, ..., 198.0, 200.0
Values:
array([[ 2., 4., 6., 8., 10., 12., 14., 16., 18., 20.],
       [ 22., 24., 26., 28., 30., 32., 34., 36., 38., 40.],
       [ 42., 44., 46., 48., 50., 52., 54., 56., 58., 60.],
       [ 62., 64., 66., 68., 70., 72., 74., 76., 78., 80.],
       [ 82., 84., 86., 88., 90., 92., 94., 96., 98., 100.],
       [102., 104., 106., 108., 110., 112., 114., 116., 118., 120.],
       [122., 124., 126., 128., 130., 132., 134., 136., 138., 140.],
       [142., 144., 146., 148., 150., 152., 154., 156., 158., 160.],
       [162., 164., 166., 168., 170., 172., 174., 176., 178., 180.],
       [182., 184., 186., 188., 190., 192., 194., 196., 198., 200.]])
- bad-pixel (y, x) bool False, False, ..., True, True
Values:
array([[False, False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False, False, True, True]])
- bad-row (y) bool False, False, ..., False, False
Values:
array([False, False, False, False, False, False, True, False, False, False])
- lhs (x) bool True, True, ..., False, False
Values:
array([ True, True, True, True, True, False, False, False, False, False])
Slicing¶
In numpy you are required to know your dimension order
[15]:
data[4:,:]
[15]:
array([[ 41., 42., 43., 44., 45., 46., 47., 48., 49., 50.],
[ 51., 52., 53., 54., 55., 56., 57., 58., 59., 60.],
[ 61., 62., 63., 64., 65., 66., 67., 68., 69., 70.],
[ 71., 72., 73., 74., 75., 76., 77., 78., 79., 80.],
[ 81., 82., 83., 84., 85., 86., 87., 88., 89., 90.],
[ 91., 92., 93., 94., 95., 96., 97., 98., 99., 100.]])
[16]:
data[:, 4:] # Or was it the other way round?
[16]:
array([[ 5., 6., 7., 8., 9., 10.],
[ 15., 16., 17., 18., 19., 20.],
[ 25., 26., 27., 28., 29., 30.],
[ 35., 36., 37., 38., 39., 40.],
[ 45., 46., 47., 48., 49., 50.],
[ 55., 56., 57., 58., 59., 60.],
[ 65., 66., 67., 68., 69., 70.],
[ 75., 76., 77., 78., 79., 80.],
[ 85., 86., 87., 88., 89., 90.],
[ 95., 96., 97., 98., 99., 100.]])
With scipp, you can “crop” along any dimension using the dimension label as a key.
[17]:
sc.plot(image['x', 4:], aspect='equal')
You can also chain the slicing operations.
[18]:
sc.plot(image['y', 1:]['x', 4:], aspect='equal')
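What label-based slicing buys you can be sketched on top of plain numpy: the label is looked up to find the axis position, so the caller never needs to remember the dimension order. The `slice_by_label` helper is hypothetical and is shown only to illustrate the idea; it is not how scipp implements slicing.

```python
import numpy as np


def slice_by_label(values, dims, dim, start=None, stop=None):
    """Slice `values` along the axis named `dim` (hypothetical helper)."""
    axis = dims.index(dim)                 # look up the axis position by label
    index = [slice(None)] * values.ndim    # full slice for every other axis
    index[axis] = slice(start, stop)
    return values[tuple(index)]


grid = np.arange(1.0, 101.0).reshape(10, 10)
labels = ['y', 'x']

cropped = slice_by_label(grid, labels, 'x', 4)
chained = slice_by_label(slice_by_label(grid, labels, 'y', 1), labels, 'x', 4)
print(cropped.shape, chained.shape)  # (10, 6) (9, 6)
```

Because slicing a range keeps the dimension, the same label list stays valid for the chained call.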
Dynamic type control¶
[19]:
image_data
[19]:
- (y: 10, x: 10) float64 counts 2.0, 4.0, ..., 198.0, 200.0
Values:
array([[ 2., 4., 6., 8., 10., 12., 14., 16., 18., 20.], [ 22., 24., 26., 28., 30., 32., 34., 36., 38., 40.], [ 42., 44., 46., 48., 50., 52., 54., 56., 58., 60.], [ 62., 64., 66., 68., 70., 72., 74., 76., 78., 80.], [ 82., 84., 86., 88., 90., 92., 94., 96., 98., 100.], [102., 104., 106., 108., 110., 112., 114., 116., 118., 120.], [122., 124., 126., 128., 130., 132., 134., 136., 138., 140.], [142., 144., 146., 148., 150., 152., 154., 156., 158., 160.], [162., 164., 166., 168., 170., 172., 174., 176., 178., 180.], [182., 184., 186., 188., 190., 192., 194., 196., 198., 200.]])
[20]:
image_data.astype('float32')
[20]:
- (y: 10, x: 10) float32 counts 2.0, 4.0, ..., 198.0, 200.0
Values:
array([[ 2., 4., 6., 8., 10., 12., 14., 16., 18., 20.], [ 22., 24., 26., 28., 30., 32., 34., 36., 38., 40.], [ 42., 44., 46., 48., 50., 52., 54., 56., 58., 60.], [ 62., 64., 66., 68., 70., 72., 74., 76., 78., 80.], [ 82., 84., 86., 88., 90., 92., 94., 96., 98., 100.], [102., 104., 106., 108., 110., 112., 114., 116., 118., 120.], [122., 124., 126., 128., 130., 132., 134., 136., 138., 140.], [142., 144., 146., 148., 150., 152., 154., 156., 158., 160.], [162., 164., 166., 168., 170., 172., 174., 176., 178., 180.], [182., 184., 186., 188., 190., 192., 194., 196., 198., 200.]], dtype=float32)
Compatibility¶
Mantid¶
scipp data structures are not API compatible with Mantid’s
scipp and Mantid data structures (workspaces) are convertible, in some cases as one-liners:
ds = sc.neutron.from_mantid(a_mantid)
scipp can load and use NeXus files, as Mantid does
ds = sc.neutron.load("experiment.nxs")
More on this topic in the docs
Numpy¶
scipp objects can expose their underlying arrays in a numpy compatible form. This makes it possible to use numpy operations directly on scipp variables.
[21]:
x = sc.array(dims=['x'], values=np.linspace(-np.pi, np.pi, 20))
y = x.copy() # copy of x, reused as the output container
np.sin(x.values, out=y.values)
sc.plot(y)
Packages¶
Conda packages for Linux, OSX, and Windows are available on Anaconda Cloud
Installation¶
Simply
conda install -c conda-forge -c scipp scipp
Interoperability with Mantid is achieved by installing the mantid-framework package, which is an optional dependency. It can be installed through the same channels.
conda install -c conda-forge -c scipp mantid-framework
Full installation notes here
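After installing, a quick way to confirm what is available in the active environment is to ask Python whether the modules can be found. This is a minimal sketch using only the standard library; it assumes the mantid-framework package exposes a top-level module named `mantid`, and the `is_installed` helper is made up for this example.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return find_spec(module_name) is not None


# "mantid" is optional; scipp works without it.
for name in ("scipp", "mantid"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```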
Lots more in scipp¶
IO
label-based slicing
events/binning
grouping and filtering operations
Future Plans¶
Across technique areas, issues and priorities are already being driven by Instrument Data Scientists, including Søren Schmidt.
Data driven development using reduction workflows
Priority is to support getting Day One instruments ready for Hot Commissioning
Aligned to the above, scipp is being supplemented by ess and neutron specific modules that provide bespoke tools.
A scipp-widgets library is also under development for building-block GUI additions. See docs
Technical short-term roadmap already available