Masking#

The purpose of masks in Scipp is to exclude regions of data from analysis, or to select regions of interest (ROIs). For example, we may mask data from sensors we know to be broken, or we may mask everything outside our ROI in an image.

In some cases direct removal of the bad data or data outside the ROI may be preferred. Masking provides an alternative solution, which lets us, e.g., modify the mask or remove it later.

NumPy provides support for masked arrays in the numpy.ma module. Scipp’s masking feature is conceptually similar, but is based on a dictionary of masks. Each mask is a Variable, i.e., comes with explicit dimensions. Scipp can therefore store masks in a very space-efficient manner. For example, given an image stack with dimensions ('image', 'pixel_y', 'pixel_x') we may have a mask for “images” with dimensions ('image', ) and a second mask defining the ROI with dimensions ('pixel_y', 'pixel_x'). The support for multiple masks also enables Scipp to selectively apply or preserve masks. For example, a sum over the ‘image’ dimension can preserve the ROI mask.

[1]:

import numpy as np
import scipp as sc

Creating and manipulating masks#

Masks are simply variables with dtype=bool:

[2]:

mask = sc.array(dims=['x'], values=[False, False, True])
mask

[2]:

scipp.Variable (259 Bytes)

(x: 3)

bool

False, False, True

Values:
array([False, False,  True])

Boolean operators can be used to manipulate such variables:

[3]:

print(~mask)
print(mask ^ mask)
print(mask & ~mask)
print(mask | ~mask)

<scipp.Variable> (x: 3)       bool        <no unit>  [True, True, False]
<scipp.Variable> (x: 3)       bool        <no unit>  [False, False, False]
<scipp.Variable> (x: 3)       bool        <no unit>  [False, False, False]
<scipp.Variable> (x: 3)       bool        <no unit>  [True, True, True]

Comparison operators such as ==, !=, <, or >= (see also the list of comparison functions) are a common method of defining masks:

[4]:

var = sc.array(dims=['x'], values=np.random.random(5), unit='m')
mask2 = var < 0.5 * sc.Unit('m')
mask2

[4]:

scipp.Variable (261 Bytes)

(x: 5)

bool

True, False, True, True, True

Values:
array([ True, False,  True,  True,  True])

Masks in data arrays and items of dataset#

Data arrays, and equivalently the items of a dataset, can store arbitrary masks. Datasets themselves do not support masks. Masks are set using the masks keyword argument and accessed using the masks property, which behaves in the same way as coords:

[5]:

a = sc.DataArray(
    data=sc.array(dims=['y', 'x'], values=np.arange(1.0, 7.0).reshape((2, 3))),
    coords={'y': sc.arange('y', 2.0, unit='m'), 'x': sc.arange('x', 3.0, unit='m')},
    masks={'x': sc.array(dims=['x'], values=[False, False, True])},
)
sc.show(a)

[6]:

b = a.copy()
b.masks['x'].values[1] = True
b.masks['y'] = sc.array(dims=['y'], values=[False, True])

A mask value of True means that the mask is on, i.e., the corresponding data value should be ignored. Note that setting a mask does not affect the data.

Masks of dataset items are accessed using the masks property of the item:

[7]:

ds = sc.Dataset(data={'a': a})
ds['a'].masks['x']

[7]:

scipp.Variable (259 Bytes)

(x: 3)

bool

False, False, True

Values:
array([False, False,  True])
