# What's new in scipp

This page highlights feature additions and discusses major changes from recent releases.
For a full list of changes see the [Release Notes](https://scipp.github.io/about/release-notes.html).

In [None]:
import numpy as np
import scipp as sc

## General

### Get unique dimension using `dim` property

<div class="alert alert-info">

**New in 0.9**

The new `dim` property checks whether an object is 1-D and returns its only dimension label.
An exception is raised if the object is not 1-D.
</div>

Example:

In [None]:
x = sc.linspace(dim='x', start=0, stop=1, num=4)
x.dim

### Logging support

<div class="alert alert-info">

**New in 0.9**

Scipp now provides a logger, and a pre-configured logging widget for Jupyter notebooks.
See [Logging](../reference/logging.ipynb).
    
</div>

### Bound method equivalents to many free functions

<div class="alert alert-info">

**New in 0.8**

Many functions that have been available as free functions can now also be used as methods of variables and data arrays.
See the [documentation for individual classes](../reference/classes.rst#classes) for a full list.

</div>

Example:

In [None]:
var = sc.arange(dim="x", unit="m", start=0, stop=12)
var.sum()  # Previously sc.sum(var)

Note that `sc.sum(var)` will continue to be supported as well.

### Indexing

#### Ellipsis

<div class="alert alert-info">

**New in 0.8**
    
Indexing with ellipsis (`...`) is now supported.
This can be used, e.g., to overwrite the values of an existing object in place, instead of rebinding it to the object given on the right-hand side.

</div>

Example:

In [None]:
var1 = sc.ones(dims=["x"], shape=[4])
var2 = var1 + var1
da = sc.DataArray(data=sc.zeros(dims=["x"], shape=[4]))
da.data = var1  # replace data variable
da.data[...] = var2  # assign to slice, copy into existing data variable
var1  # now holds values of var2

Changing `var2` has no effect on `da.data`:

In [None]:
var2 += 2222.0
da

### Operations

#### Creation functions

<div class="alert alert-info">

**New in 0.5**

For convenience and similarity to `numpy` we added [functions that create variables](../reference/creation-functions.rst#creation-functions).
Our intention is to fully replace the need to use `sc.Variable` directly, but at this point this has not been rolled out to our documentation pages.

</div>

Examples:

In [None]:
sc.array(dims=["x"], values=np.array([1, 2, 3]))

In [None]:
sc.zeros(dims=["x"], shape=[3])

In [None]:
sc.scalar(17)

All of these also take keyword arguments.
Note that creating scalars by multiplying with a unit is still supported:

In [None]:
1.2 * sc.units.m

<div class="alert alert-info">

**New in 0.7**
    
More creation functions were added:

- Added `zeros_like`, `ones_like`, and `empty_like`.
- Added `linspace`, `logspace`, `geomspace`, and `arange`.

</div>

<div class="alert alert-info">

**New in 0.8**
    
More creation functions were added:

- Added `full` and `full_like`.

</div>

#### Unit conversion

<div class="alert alert-info">

**New in 0.6**

Conversions between different unit scales are now supported.
`to_unit` provides conversion of variables between, e.g., `mm` and `m`.

</div>

<div class="alert alert-info">

**New in 0.7**

- `to_unit` can now avoid making a copy if the input already has the desired unit.
  This can be used as a cheap way to ensure inputs have expected units.
- `to_unit` now also works for binned data, converting the unit of the underlying events in the bins.
    
</div>

<div class="alert alert-info">

**New in 0.8**

- `to_unit` now has a `copy` argument.
   By default, `copy=True` and `to_unit` makes a copy even if the input already has the desired unit.
   For a cheap way to ensure inputs have expected units use `copy=False` to avoid copies if possible.
    
</div>

Example:

In [None]:
var = sc.array(dims=["x"], unit="mm", values=[3.2, 5.4, 7.6])
m = sc.to_unit(var, "m")
m

No copy is made if the input has the requested unit when we specify `copy=False`:

In [None]:
sc.to_unit(m, "m", copy=False)  # no copy

Conversions also work for more specialized units such as electron-volt:

In [None]:
sc.to_unit(sc.scalar(1.0, unit="nJ"), unit="meV")

#### `from_pandas` and `from_xarray`

<div class="alert alert-info">

**New in 0.8**

- `from_pandas` for converting `pandas.DataFrame` to `scipp.Dataset`.
- `from_xarray` for converting `xarray.DataArray` or `xarray.Dataset` to `scipp.DataArray` or `scipp.Dataset`, respectively.

Both functions are available in the `compat` submodule.

</div>

### Reduction operations

#### Internal precision in summation operations

<div class="alert alert-info">

**New in 0.9**

Reduction operations such as `sum` of single-precision (`float32`) data now use double-precision (`float64`) internally to reduce the effects of rounding errors.

</div>
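
To illustrate why the precision of the accumulator matters, the following NumPy-only sketch compares a sequential single-precision running sum with a double-precision sum of the same `float32` values. It does not call scipp and is not scipp's implementation; it only demonstrates the kind of rounding error that the change reduces.

```python
import numpy as np

n = 1_000_000
values = np.full(n, 0.1, dtype=np.float32)

# Reference: n times the float32 value nearest to 0.1, computed in double precision.
expected = n * float(values[0])

# np.cumsum keeps a float32 running total, so rounding errors accumulate.
sum_f32 = float(np.cumsum(values)[-1])

# Accumulating the same float32 inputs in float64 avoids that error growth.
sum_f64 = float(values.sum(dtype=np.float64))

err_f32 = abs(sum_f32 - expected)
err_f64 = abs(sum_f64 - expected)
print(f"float32 accumulator error: {err_f32:.3g}")
print(f"float64 accumulator error: {err_f64:.3g}")
```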

#### Reductions over multiple inputs using `reduce`

<div class="alert alert-info">

**New in 0.9**

The new `reduce` function can be used for reduction operations that do not operate along a dimension of a scipp object but rather across a list or tuple of multiple scipp objects.
The mechanism is a 2-step approach, with a syntax similar to `groupby`:

</div>

In [None]:
a = sc.linspace(dim="x", start=0.0, stop=1.0, num=4)
b = sc.linspace(dim="x", start=0.2, stop=0.8, num=4)
c = sc.linspace(dim="x", start=0.2, stop=1.2, num=4)
sc.reduce([a, b, c]).sum()

In [None]:
reducer = sc.reduce([a, b, c])
reducer.min()

In [None]:
reducer.max()

### Shape operations

#### `concat` replacing `concatenate`

<div class="alert alert-info">

**New in 0.9**

`concat` is replacing `concatenate` (which is deprecated now and will be removed in 0.10).
It supports a list of inputs rather than just 2 inputs.

</div>

In [None]:
a = sc.scalar(1.2)
b = sc.scalar(2.3)
c = sc.scalar(3.4)
sc.concat([a, b, c], "x")

#### `fold` and `flatten`

<div class="alert alert-info">

**New in 0.6**

`fold` and `flatten`, which are similar to [numpy.reshape](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html), have been added.
In contrast to `reshape`, `fold` and `flatten` support data arrays and also handle meta data such as coords, masks, and attrs.

</div>

<div class="alert alert-info">

**New in 0.7**

- `fold` now always returns views of data and all meta data instead of making deep copies.
- `flatten` also preserves reshaped data as a view, but unlike `fold` the same is not true for meta data in general, since it may require duplication in the flatten operation.

</div>

Example:

In [None]:
var = sc.ones(dims=["pixel"], shape=[100])
xy = sc.fold(var, dim="pixel", sizes={"x": 10, "y": 10})
xy = sc.DataArray(
    data=xy,
    coords={
        "x": sc.array(dims=["x"], values=np.arange(10)),
        "y": sc.array(dims=["y"], values=np.arange(10)),
    },
)
xy

Folding does not make copies of either data or meta data, for example:

In [None]:
xy["y", 4] *= 0.0  # affects var (scipp-0.7 and higher)
var.plot()

The reverse of `fold` is `flatten`:

In [None]:
flat = sc.flatten(xy, to="pixel")
flat

Flattening does not make a copy of data, but meta data may get copied if values need to be duplicated by the operation:

In [None]:
flat["pixel", 0] = 22  # modifies var (scipp-0.7 and higher)
var.plot()

### Vectors and matrices

#### General

<div class="alert alert-info">

**New in 0.7**

Several improvements for working with (3-D position) vectors and (3-D rotation) matrices are part of this release:

- Creation functions were added:
  - `vector` (a single vector)
  - `vectors` (array of vectors)
  - `matrix` (a single matrix)
  - `matrices` (array of matrices)
- Direct creation and initialization of 2-D (or higher) arrays of matrices and vectors is now possible from numpy arrays.
- The `values` property now returns a numpy array with ndim+1 (vectors) or ndim+2 (matrices) axes, with the inner 1 (vectors) or 2 (matrices) axes corresponding to the vector or matrix axes.
- Vector or matrix elements can now be accessed and modified directly using the new `fields` property of variables.
  `fields` provides access to vector elements `x`, `y`, and `z` or matrix elements `xx`, `xy`, ..., `zz`.
    
</div>

<div class="alert alert-info">

**New in 0.8**

The `fields` property can now be iterated and behaves similarly to a `dict` with fixed keys.

</div>

In [None]:
sc.vector(value=[1, 2, 3])

In [None]:
vecs = sc.vectors(dims=["x"], unit="m", values=np.arange(12).reshape(4, 3))
vecs

In [None]:
vecs.values

In [None]:
vecs.fields.y

In [None]:
vecs.fields.z += 0.666 * sc.units.m
vecs

<div class="alert alert-info">

**New in 0.8**
    
The `cross` function to compute the cross-product of vectors was added.

</div>

In [None]:
sc.cross(vecs, vecs["x", 0])

#### `scipp.spatial.transform`

<div class="alert alert-info">

**New in 0.8**
    
The `scipp.spatial.transform` (in the style of `scipy.spatial.transform`) submodule was added.
This now provides:

- `from_rotvec` to create rotation matrices from rotation vectors.
- `as_rotvec` to convert rotation matrices into rotation vectors.

</div>

As an example, the following creates a rotation matrix for rotation around the `x`-axis by 30 degrees:

In [None]:
from scipp.spatial.transform import from_rotvec

rot = from_rotvec(sc.vector(value=[30.0, 0, 0], unit="deg"))
rot

### Coordinate transformations

<div class="alert alert-info">

**New in 0.8**

The `transform_coords` function has been added (also available as a method of data arrays and datasets).
It is a tool for transforming one or more input coordinates into one or more output coordinates. It automatically handles:

- Renaming of dimensions, if dimension-coordinates are transformed.
- Change of coordinates to attributes to avoid interference of coordinates consumed by the transformation in follow-up operations.
- Conversion of event-coordinates of binned data, if present.

See [Coordinate transformations](../user-guide/coordinate-transformations.ipynb) for a full description.

</div>

### Physical constants

<div class="alert alert-info">

**New in 0.8**
    
The `scipp.constants` (in the style of `scipy.constants`) submodule was added, providing physical constants from CODATA 2018.
For full details see the [module's documentation](../generated/modules/scipp.constants.rst).

</div>

Examples:

In [None]:
from scipp.constants import hbar, m_e, physical_constants

In [None]:
hbar

In [None]:
m_e

In [None]:
physical_constants("speed of light in vacuum")

In [None]:
physical_constants("neutron mass", with_variance=True)

## Plotting

<div class="alert alert-info">

**New in 0.7**

- Plots now support a `redraw()` method for updating an existing plot with new data, without recreating the plot.

</div>

<div class="alert alert-info">

**New in 0.8**

- Plotting 1-D binned (event) data is now supported.

</div>

## Binned data

### Buffer and meta data access

<div class="alert alert-info">

**New in 0.7**

- The internal buffer holding the "events" underlying binned data can now be accessed directly using the new `events` property.
  **Update: This is deprecated as of 0.8.2.**
- HTML view now works for binned meta data access such as `binned.bins.coords['time']`.

</div>

<div class="alert alert-info">

**New in 0.8**

The mean of bins can now be computed using `binned.bins.mean()`.
This should generally be used instead of `binned.bins.sum()` if the dtype is not "summable", i.e., typically anything that is not of unit "counts".

</div>

Consider the following example, representing a time series of temperature measurements on an x-y plane:

In [None]:
import numpy as np

N = int(800)
data = sc.DataArray(
    data=sc.Variable(dims=["time"], values=100 + np.random.rand(N) * 10, unit="K"),
    coords={
        "x": sc.Variable(dims=["time"], unit="m", values=np.random.rand(N)),
        "y": sc.Variable(dims=["time"], unit="m", values=np.random.rand(N)),
        "time": sc.Variable(
            dims=["time"], values=(10000 * np.random.rand(N)).astype("datetime64[s]")
        ),
    },
)
binned = sc.bin(
    data,
    edges=[
        sc.linspace(dim="x", unit="m", start=0.0, stop=1.0, num=5),
        sc.linspace(dim="y", unit="m", start=0.0, stop=1.0, num=5),
    ],
)
binned

In [None]:
sc.show(binned)

The `bins` property provides properties `data`, `coords`, `masks`, and `attrs` *of the bins* that behave like the properties of a data array *while retaining the binned structure*.
That is, it can be used for computation involving information available on a per-bin basis:

In [None]:
binned.bins.coords["time"]

In [None]:
sc.show(binned.bins.coords["time"])

We can use this in our example to correct for a hypothetical clock error that depends on the x-y bin:

In [None]:
clock_correction = sc.array(
    dims=["x", "y"], unit="s", values=(100 * np.random.rand(4, 4)).astype("int64")
)
clock_correction

In [None]:
binned.bins.coords["time"] += clock_correction

The properties can also be used to add or delete meta data entries:

In [None]:
del binned.bins.coords["x"]

### Broadcasting dense variables to binned variables using `bins_like`

<div class="alert alert-info">

**New in 0.9**
    
- Added `bins_like`, for broadcasting dense variables to binned variables, e.g., for converting bin coordinates into event coordinates.

</div>

In [None]:
temperature = sc.array(dims=['x'], unit='K', values=[3.0, 4.0, 5.0, 6.0])
binned.bins.coords['temperature'] = sc.bins_like(binned, fill_value=temperature)
binned

## Performance

<div class="alert alert-info">

**New in 0.7**

- `sort` is now considerably faster for data with many rows.
- Reduction operations such as `sum` and `mean` are now also multi-threaded and thus considerably faster.

</div>

<div class="alert alert-info">

**New in 0.9**

- `sc.lookup(histogram, dim)[var]` is now faster if `histogram` is very long and is integer-valued.
  This is relevant in a number of event-filtering operations.

</div>