The Scipp data structures (variables, data arrays, and datasets) behave mostly like nested Python objects, i.e., sub-objects are shared by default. Some of the effects are exemplified in the following.
Slices or other views of variables are also of type Variable and all views share ownership of the underlying data.
If a variable refers only to a section of the underlying data buffer, this is indicated in the title line of the HTML view as part of the size (“x Bytes out of y Bytes”). This makes it possible to identify “small” variables that keep potentially large buffers alive:
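The same situation can be sketched in plain Python, without Scipp, using a `memoryview` over a `bytearray`. This is only an analogy under the assumption that a Scipp slice behaves like a view onto a larger buffer; the names `buffer` and `view` are illustrative.

```python
# Plain-Python analogy (no Scipp involved): a small view onto a large buffer.
buffer = bytearray(1000)            # the "large" underlying buffer
view = memoryview(buffer)[10:20]    # a "small" view of 10 bytes

view_bytes = view.nbytes            # size of the section the view refers to
total_bytes = len(view.obj)         # size of the buffer that is kept alive

# Writing through the view changes the underlying buffer.
view[0] = 255
first_byte_in_buffer = buffer[10]
```

Here the view covers 10 bytes out of 1000, and the full buffer stays alive for as long as the view does.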
As a result of the sharing mechanism, extra care must be taken in some cases, just like when working with any other Python library. Consider the following example, using the same variable as data and as a coordinate:
The modification unintentionally also affected the coordinate. However, if we think of data arrays and coordinate dicts as Python-like objects, then the behavior should not be surprising.
Apart from being standard, Pythonic behavior, one advantage of this is that creating data arrays from variables is typically cheap, since it does not incur copies of potentially large objects.
Just as creating data arrays from variables is cheap (no deep copies), inserting items into datasets does not incur potentially expensive deep copies:
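As a rough plain-Python analogy (not Scipp code), the effect of using one object both as data and as a coordinate can be sketched with a hypothetical dict-based container. The names `values` and `container` are made up for illustration.

```python
# Plain-Python analogy (no Scipp involved): one list is used both as the
# "data" and as a "coordinate" of a hypothetical dict-based container.
values = [1.0, 2.0, 3.0]
container = {"data": values, "coords": {"x": values}}

# An in-place modification through the "data" entry ...
for i in range(len(container["data"])):
    container["data"][i] *= 2.0

# ... is visible through the "coordinate" too, because both entries
# refer to the same list object.
shared = container["data"] is container["coords"]["x"]
coord_after = container["coords"]["x"]

# An explicit copy made up front would have been independent.
independent = list(values)
independent[0] = -1.0
```

Building the container was cheap because nothing was copied, and the price is that both entries change together.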
[7]:
ds = sc.Dataset({'a': da})  # shallow copy
Note that while the buffers are shared, the meta-data dicts coords and masks are not. Compare:
[8]:
ds['a'].masks['m'] = da.coords['x'] < 670 * sc.Unit('m')
'm' in da.masks  # the masks *dict* is copied
[8]:
False
with
[9]:
da.coords['x'] *= -1  # the coords *dict* is copied,
# but the 'x' coordinate references same buffer
ds.coords['x']
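The distinction between a copied dict and shared buffers can be illustrated with ordinary Python dicts. This is a sketch of the general idea only, assuming the meta-data dicts behave like shallow copies; `original_coords` and `copied_coords` are illustrative names, not Scipp objects.

```python
# Plain-Python analogy: the dict is copied, the values it holds are not.
original_coords = {"x": [1.0, 2.0, 3.0]}
copied_coords = dict(original_coords)  # shallow copy of the dict only

# Adding an entry to the copy leaves the original dict untouched ...
copied_coords["m"] = [True, False, False]
has_new_key = "m" in original_coords

# ... but an in-place change to a shared value is seen through both dicts.
original_coords["x"][:] = [-v for v in original_coords["x"]]
x_seen_via_copy = copied_coords["x"]
```

The first half mirrors inserting a mask into `ds['a'].masks` without affecting `da.masks`; the second half mirrors the in-place change to the `'x'` coordinate.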
Since da['x',0] is itself a data array, assigning to the data property would repoint the data to whatever is given on the right-hand side. However, this would not affect da, and the attempt to change the data would silently do nothing, since the temporary da['x',0] disappears immediately. The read-only flag protects us from this.
To actually modify the slice, use __setitem__ instead:
[11]:
da['x', 0] = var['x', 2]
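The pitfall that the read-only flag guards against can be reproduced in plain Python. The classes below are hypothetical stand-ins, not Scipp classes, and unlike Scipp they have no read-only protection, so the ineffective assignment passes silently.

```python
class Slice:
    """Hypothetical temporary object returned by indexing."""

    def __init__(self, data):
        self.data = data


class Container:
    """Hypothetical stand-in for a data array; not a Scipp class."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, index):
        # A new temporary object is created on every access.
        return Slice(self.rows[index])

    def __setitem__(self, index, value):
        # Writes go to the container itself.
        self.rows[index] = value


c = Container([[1, 2], [3, 4]])

# Rebinds an attribute on a temporary that is discarded right away;
# the container is unchanged.
c[0].data = [9, 9]
after_attribute_assignment = list(c.rows[0])

# Goes through __setitem__ and modifies the container.
c[0] = [9, 9]
after_setitem = list(c.rows[0])
```

Without a safeguard, the first assignment looks like it should work but changes nothing, which is the kind of hidden bug a read-only flag turns into an explicit error.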
Variables, meta-data dicts (coords, masks, and attrs properties), data arrays, and datasets also have read-only flags. The flags solve a number of conceptual issues and serve as a safeguard against hidden bugs.