scipp.group
- scipp.group(x, /, *args)
Create binned data by grouping input by one or more coordinates.
Grouping can be specified in two ways: (1) When a string is provided, the unique values of the corresponding coordinate are used as groups. (2) When a scipp variable is provided, the variable's values are used as groups.
Note that option (1) may be very slow if the input is very large.
When grouping a dimension with an existing dimension-coord, the binning for the dimension is modified, i.e., the input and the output will have the same dimension labels.
When grouping by non-dimension-coords, the output will have new dimensions given by the names of these coordinates. These new dimensions replace the dimensions the input coordinates depend on.
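- Parameters
  - x – Input data to group.
  - *args – One or more groupings, each given either as a coordinate name (string) or as a scipp variable holding the group values.
- Returns
  Binned data grouped by the given coordinates.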
See also
scipp.bin
Creating binned data by binning based on edges, instead of grouping.
scipp.binning.make_binned
Lower level function that can bin and group, and does not automatically replace/erase dimensions.
Examples
Group a table by one of its coord columns, specifying (1) a coord name or (2) an actual grouping:
>>> import scipp as sc
>>> from numpy.random import default_rng
>>> rng = default_rng(seed=1234)
>>> x = sc.array(dims=['row'], unit='m', values=rng.random(100))
>>> y = sc.array(dims=['row'], unit='m', values=rng.random(100))
>>> data = sc.ones(dims=['row'], unit='K', shape=[100])
>>> table = sc.DataArray(data=data, coords={'x': x, 'y': y})
>>> table.coords['label'] = (table.coords['x'] * 10).to(dtype='int64')
>>> table.group('label').sizes
{'label': 10}
>>> groups = sc.array(dims=['label'], values=[1, 3, 5], unit='m')
>>> table.group(groups).sizes
{'label': 3}
Group a table by two of its coord columns:
>>> table.coords['a'] = (table.coords['x'] * 10).to(dtype='int64')
>>> table.coords['b'] = (table.coords['y'] * 10).to(dtype='int64')
>>> table.group('a', 'b').sizes
{'a': 10, 'b': 10}
>>> groups = sc.array(dims=['a'], values=[1, 3, 5], unit='m')
>>> table.group(groups, 'b').sizes
{'a': 3, 'b': 10}
Group binned data along an additional dimension:
>>> table.coords['a'] = (table.coords['y'] * 10).to(dtype='int64')
>>> binned = table.bin(x=10)
>>> binned.group('a').sizes
{'x': 10, 'a': 10}