scipp.curve_fit#
- scipp.curve_fit(coords, f, da, *, p0=None, bounds=None, reduce_dims=(), unsafe_numpy_f=False, workers=1, **kwargs)#
Use non-linear least squares to fit a function, f, to data. The function interface is similar to that of
xarray.DataArray.curvefit().Added in version 23.12.0.
This is a wrapper around
scipy.optimize.curve_fit(). See there for in depth documentation and keyword arguments. The differences are:Instead of separate
xdata,ydata, andsigmaarguments, the input data array defines these,xdataby the coords on the data array,ydatabyda.data, andsigmais defined as the square root of the variances, if present, i.e., the standard deviations.The fit function
fmust work with scipp objects. This provides additional safety over the underlying scipy function by ensuring units are consistent.The initial guess in
p0must be provided as a dict, mapping from fit-function parameter names to initial guesses.The parameter bounds must also be provided as a dict, like
p0.If the fit parameters are not dimensionless the initial guess must be a scipp
Variablewith the correct unit.If the fit parameters are not dimensionless the bounds must be a variables with the correct unit.
The bounds and initial guesses may be scalars or arrays to allow the initial guesses or bounds to vary in different regions. If they are arrays they will be broadcasted to the shape of the output.
The returned optimal parameter values
poptand the covariance matrixpcovwill have units if the initial guess has units.poptandpcovareDataGroupand aDataGroupofDataGrouprespectively. They are indexed by the fit parameter names. The variances of the parameter values inpoptare set to the corresponding diagonal value in the covariance matrix.
- Parameters:
coords (
Sequence[str] |Mapping[str,str|Variable]) – The coords that act as predictor variables in the fit. If a mapping, the keys signify names of arguments tofand the values signify coordinate names inda.coords. If a sequence, the names of the arguments tofand the names of the coords are taken to be the same. To use a fit coordinate not present inda.coords, pass it as a Variable.f (
Callable[...,Variable]) – The model function,f(x, y..., a, b...). It must take all coordinates listed incoordsas arguments, otherwise aValueErrorwill be raised, all other arguments will be treated as parameters of the fit.da (
DataArray) – The values of the data array provide the dependent data. If the data array stores variances then the standard deviations (square root of the variances) are taken into account when fitting.p0 (
dict[str,Variable|float] |None, default:None) – An optional dict of initial guesses for the parameters. If None, then the initial values will all be dimensionless 1. If the fit function cannot handle initial values of 1, in particular for parameters that are not dimensionless, then typically a :py:class:scipp.UnitErroris raised, but details will depend on the function.bounds (
Mapping[str,tuple[Variable,Variable] |tuple[Variable,None] |tuple[None,Variable] |tuple[Real,Real] |tuple[Real,None] |tuple[None,Real] |tuple[None,None]] |None, default:None) – Lower and upper bounds on parameters. Defaults to no bounds. Bounds are given as a dict of 2-tuples of (lower, upper) for each parameter where lower and upper are either both Variables or plain numbers. Parameters omitted from theboundsdict are unbounded.reduce_dims (
Sequence[str], default:()) – Additional dimensions to aggregate while fitting. If a dimension is not inreduce_dims, or in the dimensions of the coords used in the fit, then the values of the optimal parameters will depend on that dimension. One fit will be performed for every slice, and the data arrays in the output will have the dimension in theirdims. If a dimension is passed toreduce_dimsall data in that dimension is instead aggregated in a single fit and the dimension will not be present in the output.unsafe_numpy_f (
bool, default:False) – By default the provided fit functionfis assumed to take scipp Variables as input and use scipp operations to produce a scipp Variable as output. This has the safety advantage of unit checking. However, in some cases it might be advantageous to implementfusing Numpy operations for performance reasons. This is particularly the case if the curve fit will make many small curve fits involving relatively few data points. In this case the pybind overhead on scipp operations might be considerable. Ifunsafe_numpy_fis set toTruethen the arguments passed tofwill be Numpy arrays instead of scipp Variables and the output offis expected to be a Numpy array.workers (
int, default:1) – Number of worker processes to use when fitting many curves. Defaults to1.
- Returns:
popt– Optimal values for the parameters.pcov– The estimated covariance of popt.
See also
scipp.scipy.optimize.curve_fitSimilar functionality for 1D fits.
Examples
A 1D example
>>> def round(a, d): ... 'Helper for the doctests' ... return sc.round(10**d * a) / 10**d
>>> def func(x, a, b): ... return a * sc.exp(-b * x)
>>> rng = np.random.default_rng(1234) >>> x = sc.linspace(dim='x', start=0.0, stop=0.4, num=50, unit='m') >>> y = func(x, a=5, b=17/sc.Unit('m')) >>> y.values += 0.01 * rng.normal(size=50) >>> da = sc.DataArray(y, coords={'x': x})
>>> from scipp import curve_fit >>> popt, _ = curve_fit(['x'], func, da, p0 = {'b': 1.0 / sc.Unit('m')}) >>> round(sc.values(popt['a']), 3), round(sc.stddevs(popt['a']), 4) (<scipp.DataArray> Dimensions: Sizes[] Data: float64 [dimensionless] () 4.999 , <scipp.DataArray> Dimensions: Sizes[] Data: float64 [dimensionless] () 0.0077 )
A 2D example where two coordinates participate in the fit
>>> def func(x, z, a, b): ... return a * z * sc.exp(-b * x)
>>> x = sc.linspace(dim='x', start=0.0, stop=0.4, num=50, unit='m') >>> z = sc.linspace(dim='z', start=0.0, stop=1, num=10) >>> y = func(x, z, a=5, b=17/sc.Unit('m')) >>> y.values += 0.01 * rng.normal(size=500).reshape(10, 50) >>> da = sc.DataArray(y, coords={'x': x, 'z': z})
>>> popt, _ = curve_fit(['x', 'z'], func, da, p0 = {'b': 1.0 / sc.Unit('m')}) >>> round(sc.values(popt['a']), 3), round(sc.stddevs(popt['a']), 3) (<scipp.DataArray> Dimensions: Sizes[] Data: float64 [dimensionless] () 5.004 , <scipp.DataArray> Dimensions: Sizes[] Data: float64 [dimensionless] () 0.004 )
A 2D example where only one coordinate participates in the fit and we map over the dimension of the other coordinate. Note that the value of one of the parameters is z-dependent and that the output has a z-dimension
>>> def func(x, a, b): ... return a * sc.exp(-b * x)
>>> x = sc.linspace(dim='xx', start=0.0, stop=0.4, num=50, unit='m') >>> z = sc.linspace(dim='zz', start=0.0, stop=1, num=10) >>> # Parameter `a` is `z`-dependent. >>> y = func(x, a=z, b=17/sc.Unit('m')) >>> y.values += 0.01 * rng.normal(size=500).reshape(10, 50) >>> da = sc.DataArray(y, coords={'x': x, 'z': z})
>>> popt, _ = curve_fit( ... ['x'], func, da, ... p0 = {'b': 1.0 / sc.Unit('m')}, ... workers=1) >>> # Note that a = z >>> round(sc.values(popt['a']), 2), (<scipp.DataArray> Dimensions: Sizes[zz:10, ] Coordinates: * z float64 [dimensionless] (zz) [0, 0.111111, ..., 0.888889, 1] Data: float64 [dimensionless] (zz) [-0.01, 0.11, ..., 0.89, 1.01] ,)
Lastly, a 2D example where only one coordinate participates in the fit and the other coordinate is reduced.
>>> def func(x, a, b): ... return a * sc.exp(-b * x)
>>> x = sc.linspace(dim='xx', start=0.0, stop=0.4, num=50, unit='m') >>> z = sc.linspace(dim='zz', start=0.0, stop=1, num=10) >>> y = z * 0 + func(x, a=5, b=17/sc.Unit('m')) >>> y.values += 0.01 * rng.normal(size=500).reshape(10, 50) >>> da = sc.DataArray(y, coords={'x': x, 'z': z})
>>> popt, _ = curve_fit( ... ['x'], func, da, ... p0 = {'b': 1.0 / sc.Unit('m')}, reduce_dims=['zz']) >>> round(sc.values(popt['a']), 3), round(sc.stddevs(popt['a']), 4) (<scipp.DataArray> Dimensions: Sizes[] Data: float64 [dimensionless] () 5 , <scipp.DataArray> Dimensions: Sizes[] Data: float64 [dimensionless] () 0.0021 )
The variance is about 10x lower in this example than in the first 1D example. That is because in this example 50x10 points are used in the fit while in the first example only 50 points were used in the fit.