{ "cells": [ { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "# What's new in scipp\n", "\n", "This page highlights feature additions and discusses major changes from recent releases.\n", "For a full list of changes see the [Release Notes](https://scipp.github.io/about/release-notes.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import scipp as sc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## General" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bound method equivalents to many free functions\n", "\n", "
\n", "\n", "**New in 0.8**\n", "\n", "Many functions that have been available as free functions can now be used also as methods of variables and data arrays.\n", "See the [documentation for individual classes](https://scipp.github.io/reference/api.html#classes) for a full list.\n", "\n", "
\n", "\n", "Example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var = sc.arange(dim='x', unit='m', start=0, stop=12)\n", "var.sum() # Previously sc.sum(var)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that `sc.sum(var)` will continue to be supported as well." ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "### Python-like shallow/deep copy mechanism\n", "\n", "
\n", "\n", "**New in 0.7**\n", "\n", "The most significant change in the scipp 0.7 release is a fundamental rework of all scipp data structures (variables, data arrays, and datasets).\n", "These now behave mostly like nested Python objects, i.e., sub-objects are shared by default.\n", "Previously there was no sharing mechanism and scipp always made deep-copies.\n", "Some of the effects are exemplified in the following.\n", "\n", "
\n", "\n", "#### Variables\n", "\n", "For variables on their own, the new and old implementations mostly yield the same user experience.\n", "Previously, views of variables, such as created when slicing a variable along a dimension, returned a different type – `VariableView` – which kept alive the original `Variable`.\n", "This asymmetry is now gone.\n", "Slices or other views of variables are now also of type `Variable`, and all views share ownership of the underlying data.\n", "\n", "If a variable refers only to a section of the underlying data buffer this is now indicated in the HTML view in the title line as part of the size, here *\"16 Bytes out of 96 Bytes\"*.\n", "This allows for identification of \"small\" variables that keep alive potentially large buffers:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var = sc.arange(dim='x', unit='m', start=0, stop=12)\n", "var['x', 4:6]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To create a variable with sole ownership of a buffer, use the `copy()` method:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var['x', 4:6].copy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, `copy()` returns a deep copy.\n", "Shallow copies can be made by specifying `deep=False`, which preserves shared ownership of underlying buffers:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "shallow_copy = var['x', 4:6].copy(deep=False)\n", "shallow_copy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Data arrays\n", "\n", "The move away from the previous \"always deep copy\" mechanism avoids a number of critical issues.\n", "However, as a result of the new sharing mechanism extra care must now be taken in some cases, just like when working with any other Python library.\n", "Consider the following example, using the same variable for data and a coordinate:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da = sc.DataArray(data=var, coords={'x': var})\n", "da += 666 * sc.units.m\n", "da" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The modification unintentionally also affected the coordinate.\n", "However, if we think of data arrays and coordinate dicts as Python-like objects, the behavior should then not be surprising.\n", "\n", "Note that the original `var` is also affected:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To avoid this, use `copy()`, e.g.,:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da = sc.DataArray(data=var.copy(), coords={'x': var.copy()})\n", "da += 666 * sc.units.m\n", "da" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Apart from the more standard and pythonic behavior, one advantage of this is that creating data arrays from variables can now be cheap, without inflicting copies of potentially large objects.\n", "\n", "A related change is the introduction of read-only flags.\n", "Consider the following attempt to modify the data via a slice:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "try:\n", " da['x', 0].data = var['x', 2]\n", "except sc.DataArrayError as e:\n", " print(e)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since `da['x',0]` is itself a data array, assigning to the `data` property would repoint the data to whatever is given on the right-hand side.\n", "However, this would not affect `da`, and the attempt to change the data would silently do nothing, since the temporary `da['x',0]` disappears immediately.\n", "The read-only flag protects us from this.\n", "\n", "To actually modify the slice, use `__setitem__` instead:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da['x', 0] = var['x', 2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read-only flags were also introduced for variables, meta-data dicts (`coords`, `masks`, and `attrs` properties), data arrays and datasets.\n", "The flags solve a number of conceptual issues and serve as a safeguard against hidden bugs.\n", "\n", "#### Datasets\n", "\n", "Just like creating data arrays from variables is now cheap (without deep-copies), inserting items into datasets does not inflict potentially expensive deep copies:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = sc.Dataset()\n", "ds['a'] = da # shallow copy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that while the buffers are shared, the meta-data dicts such as `coords`, `masks`, or `attrs` are not.\n", "Compare:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds['a'].attrs['attr'] = 1.2 * sc.units.m\n", "'attr' in da.attrs # the attrs *dict* is copied" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "with" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da.coords['x'] *= -1\n", "ds.coords['x'] # the coords *dict* is copied, but the 'x' coordinate references same buffer" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Indexing\n", "\n", "#### Ellipsis\n", "\n", "
\n", "\n", "**New in 0.8**\n", " \n", "Indexing with ellipsis (`...`) is now supported.\n", "This can be used, e.g., to replace data in an existing object without re-pointing the underlying reference to the object given on the right-hand side.\n", "\n", "
\n", "\n", "Example" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var1 = sc.ones(dims=['x'], shape=[4])\n", "var2 = var1 + var1\n", "da = sc.DataArray(data=sc.zeros(dims=['x'], shape=[4]))\n", "da.data = var1 # replace data variable\n", "da.data[...] = var2 # assign to slice, copy into existing data variable\n", "var1 # now holds values of var2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Changing `var2` has no effect on `da.data`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var2 += 2222.0\n", "da" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Label-based indexing\n", "\n", "
\n", "\n", "**New in 0.5**\n", " \n", "Indexing based on coordinate values is now possible:\n", "\n", "- Works just like position indexing (with integers).\n", "- Use a scalar variable as index (instead of integer) to use label-based indexing\n", "- Works with single values as well as slices (`:` notation)\n", "\n", "See [Label-based indexing](https://scipp.github.io/user-guide/slicing.html#Label-based-indexing) for more details.\n", " \n", "
\n", "\n", "Example" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da = sc.DataArray(data=sc.zeros(dims=['x', 'day'], shape=(4, 3)))\n", "da.coords['x'] = sc.linspace(dim='x', unit='m', start=0.1, stop=0.2, num=5)\n", "da.coords['day'] = sc.array(dims=['day'], values=[1, 7, 31])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da['day', sc.scalar(7)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da['x', 0.13 * sc.units.m] # selects bin containing this value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Support for datetime64\n", "\n", "
\n", "\n", "**New in 0.6**\n", " \n", "- Previously we stored time-related information such as, e.g., sample-temperature logs as integers.\n", "- Added support for datetime64 compatible with [np.datetime64](https://numpy.org/doc/stable/reference/arrays.datetime.html)\n", "- Time differences (`np.timedelta64`) are not used, we simply use integers since in combination with scipp's units this provides everything we need.\n", "\n", "
\n", "\n", "Example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var = sc.array(dims=['time'],\n", " values=np.arange(np.datetime64('2021-01-01T12:00:00'),\n", " np.datetime64('2021-01-01T12:04:00')))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Datetimes and intgers with time units interoperate naturally.\n", "We can offset a datetime by adding a duration:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var + 123 * sc.Unit('s')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or subtract datetimes to obtain a duration:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var['time', 10] - var['time', 0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`to_unit` can be used to convert to a different precision:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sc.to_unit(var, 'ms')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Operations\n", "\n", "#### Creation functions\n", "\n", "
\n", "\n", "**New in 0.5**\n", "\n", "For convenience and similarity to `numpy` we added [functions that create variables](../reference/api.rst#creation-functions).\n", "Our intention is to fully replace the need to use `sc.Variable` directly, but at this point this has not been rolled out to our documentation pages.\n", "\n", "
\n", "\n", "Examples:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sc.array(dims=['x'], values=np.array([1, 2, 3]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sc.zeros(dims=['x'], shape=[3])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sc.scalar(17)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All of these also take keyword arguments.\n", "Note that we can still support creating scalars by multiplying with a unit:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1.2 * sc.units.m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**New in 0.7**\n", " \n", "More creation functions were added:\n", "\n", "- Added `zeros_like`, `ones_like`, and `empty_like`.\n", "- Added `linspace`, `logspace`, `geomspace`, and `arange`.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**New in 0.8**\n", " \n", "More creation functions were added:\n", "\n", "- Added `full` and `full_like`.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Unit conversion\n", "\n", "
\n", "\n", "**New in 0.6**\n", "\n", "Conversions between different unit scales (not to be confused with [conversions provided by scippneutron](https://scipp.github.io/scippneutron/user-guide/unit-conversions.html)) are now supported.\n", "`to_unit` provides conversion of variables between, e.g., `mm` and `m`.\n", "\n", "
\n", "\n", "
\n", "\n", "**New in 0.7**\n", "\n", "- `to_unit` can now avoid making a copy if the input already has the desired unit.\n", " This can be used as a cheap way to ensure inputs have expected units.\n", "- `to_unit` now also works for binned data, converting the unit of the underlying events in the bins\n", " \n", "
\n", "\n", "
\n", "\n", "**New in 0.8**\n", "\n", "- `to_unit` now has a `copy` argument.\n", " By default, `copy=True` and `to_unit` makes a copy even if the input already has the desired unit.\n", " For a cheap way to ensure inputs have expected units use `copy=False` to avoid copies if possible.\n", " \n", "
\n", "\n", "Example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var = sc.array(dims=['x'], unit='mm', values=[3.2, 5.4, 7.6])\n", "m = sc.to_unit(var, 'm')\n", "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "No copy is made if the input has the requested unit when we specify `copy=False`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "sc.to_unit(m, 'm', copy=False) # no copy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Conversions also work for more specialized units such as electron-volt:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sc.to_unit(sc.scalar(1.0, unit='nJ'), unit='meV')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### `from_pandas` and `from_xarray`\n", "\n", "
\n", "\n", "**New in 0.8**\n", "\n", "- `from_pandas` for converting `pandas.Dataframe` to `scipp.Dataset`.\n", "- `from_xarray` for converting `xarray.DataArray` or `xarray.Dataset` to `scipp.DataAray` or `scipp.Dataset`, respectively.\n", "\n", "Both functions are available in the `compat` submodule.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Shape operations\n", "\n", "#### `fold` and `flatten`\n", "\n", "
\n", "\n", "**New in 0.6**\n", "\n", "`fold` and `flatten`, which are similar to [numpy.reshape](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html), have been added.\n", "In contrast to `reshape`, `fold` and `flatten` support data arrays and handle also meta data such as coord, masks, and attrs.\n", "\n", "
\n", "\n", "
\n", "\n", "**New in 0.7**\n", "\n", "- `fold` now always returns views of data and all meta data instead of making deep copies.\n", "- `flatten` also preserves reshaped data as a view, but unlike `fold` the same is not true for meta data in general, since it may require duplication in the flatten operation.\n", "\n", "
\n", "\n", "Example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var = sc.ones(dims=['pixel'], shape=[100])\n", "xy = sc.fold(var, dim='pixel', sizes={'x': 10, 'y': 10})\n", "xy = sc.DataArray(data=xy,\n", " coords={\n", " 'x': sc.array(dims=['x'], values=np.arange(10)),\n", " 'y': sc.array(dims=['y'], values=np.arange(10))\n", " })\n", "xy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Folding does not effect copies of either data or meta data, for example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xy['y', 4] *= 0.0 # affects var (scipp-0.7 and higher)\n", "var.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The reverse of `fold` is `flatten`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "flat = sc.flatten(xy, to='pixel')\n", "flat" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Flattening does not effect a copy of data, but meta data may get copied if values need to be duplicated by the operation:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "flat['pixel', 0] = 22 # modifies var (scipp-0.7 and higher)\n", "var.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Vectors and matrices\n", "\n", "#### General\n", "\n", "
\n", "\n", "**New in 0.7**\n", "\n", "Several improvements for working with (3-D position) vectors and (3-D rotation) matrices are part of this release:\n", "\n", "- Creation functions were added:\n", " - `vector` (a single vector)\n", " - `vectors` (array of vectors)\n", " - `matrix` (a single matrix),\n", " - `matrices` (array of matrices).\n", "- Direct creation and initialization of 2-D (or higher) arrays of matrices and vectors is now possible from numpy arrays.\n", "- The values property now returns a numpy array with ndim+1 (vectors) or ndim+2 (matrices) axes, with the inner 1 (vectors) or 2 (matrices) axes corresponding to the vector or matrix axes.\n", "- Vector or matrix elements can now be accessed and modified directly using the new `fields` property of variables.\n", " `fields` provides access to vector elements `x`, `y`, and `z` or matrix elements `xx`, `xy`, ..., `zz`.\n", " \n", "
\n", "\n", "
\n", "\n", "**New in 0.8**\n", "\n", "The `fields` property can now be iterated and behaves similar to a `dict` with fixed keys.\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sc.vector(value=[1, 2, 3])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vecs = sc.vectors(dims=['x'], unit='m', values=np.arange(12).reshape(4, 3))\n", "vecs" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vecs.values" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vecs.fields.y" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vecs.fields.z += 0.666 * sc.units.m\n", "vecs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**New in 0.8**\n", " \n", "The `cross` function to compute the cross-product of vectors as added.\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sc.cross(vecs, vecs['x', 0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### `scipp.spatial.transform`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**New in 0.8**\n", " \n", "The `scipp.spatial.transform` (in the style of `scipy.spatial.transform`) submodule was added.\n", "This now provides:\n", "\n", "- `from_rotvec` to create rotation matrices from rotation vectors.\n", "- `as_rotvec` to convert rotation matrices into rotation vectors.\n", "\n", "
\n", "\n", "As an example, the following creates a rotation matrix for rotation around the `x`-axis by 30 degrees:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipp.spatial.transform import from_rotvec\n", "\n", "rot = from_rotvec(sc.vector(value=[30.0, 0, 0], unit='deg'))\n", "rot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Coordinate transformations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**New in 0.8**\n", "\n", "The `transform_coords` function has been added (also available as method of data arrays and datasets).\n", "It is a tool for transforming one or more input coordinates into one or more output coordinates. It automatically handles:\n", "\n", "- Renaming of dimensions, if dimension-coordinates are transformed.\n", "- Change of coordinates to attributes to avoid interference of coordinates consumed by the transformation in follow-up operations.\n", "- Conversion of event-coordinates of binned data, if present.\n", "\n", "See [Coordinate transformations](../user-guide/coordinate-transformations.ipynb) for a full description.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Physical constants\n", "\n", "
\n", "\n", "**New in 0.8**\n", " \n", "The `scipp.constants` (in the style of `scipy.constants`) submodule was added, providing physical constants from CODATA 2018.\n", "For full details see the [module's documentation](../generated/modules/scipp.constants.rst).\n", "\n", "
\n", "\n", "Examples:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipp.constants import hbar, m_e, physical_constants" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hbar" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_e" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "physical_constants('speed of light in vacuum')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "physical_constants('neutron mass', with_variance=True)" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## Plotting\n", "\n", "
\n", "\n", "**New in 0.7**\n", "\n", "- Plotting supports `redraw()` method for updating existing plots with new data, without recreating the plot.\n", "\n", "
\n", "\n", "
\n", "\n", "**New in 0.8**\n", "\n", "- Plotting 1-D binned (event) data is now supported.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Binned data" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "### Buffer and meta data access\n", "\n", "
\n", "\n", "**New in 0.7**\n", "\n", "- The internal buffer holding the \"events\" underlying binned data can now be accessed directly using the new `events` property.\n", " **Update: This is deprecated as of 0.8.2.**\n", "- HTML view now works for binned meta data access such as `binned.bins.coords['time']`\n", "\n", "
\n", "\n", "
\n", "\n", "**New in 0.8**\n", "\n", "The mean of bins can now be computed using `binned.bins.mean()`.\n", "This should general be used instead of `binned.bins.sum()` the if dtype is not \"summable\", i.e., typically anything that is not of unit \"counts\".\n", "\n", "
\n", "\n", "Consider the following example, representing a time series of temperature measurements on an x-y plane:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "N = int(800)\n", "data = sc.DataArray(\n", " data=sc.Variable(dims=['time'], values=100 + np.random.rand(N) * 10, unit='K'),\n", " coords={\n", " 'x': sc.Variable(dims=['time'], unit='m', values=np.random.rand(N)),\n", " 'y': sc.Variable(dims=['time'], unit='m', values=np.random.rand(N)),\n", " 'time': sc.Variable(dims=['time'], values=(10000 * np.random.rand(N)).astype('datetime64[s]')),\n", " })\n", "binned = sc.bin(data,\n", " edges=[sc.linspace(dim='x', unit='m', start=0.0, stop=1.0, num=5),\n", " sc.linspace(dim='y', unit='m', start=0.0, stop=1.0, num=5)])\n", "binned" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sc.show(binned)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To allow for this, the `bins` property provides properties `data`, `coords`, `masks`, and `attrs` *of the bins* that behave like the properties of a data array *while retaining the binned structure*.\n", "That is, it can be used for computation involving information available on a per-bin basis:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "binned.bins.coords['time']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sc.show(binned.bins.coords['time'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use this in our example to correct for an hypothetical clock error that depends on the x-y bin:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "clock_correction = sc.array(dims=['x', 'y'], unit='s', values=(100 * np.random.rand(4, 4)).astype('int64'))\n", "clock_correction" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "binned.bins.coords['time'] += clock_correction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The properties can also be used to add or delete meta data entries:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "del binned.bins.coords['x']" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## Performance" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**New in 0.7**\n", "\n", "- `sort` is now considerably faster for data with more rows.\n", "- reduction operations such as `sum` and `mean` are now also multi-threaded and thus considerably faster.\n", "\n", "
" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 4 }