Representations and Tables#
Scipp provides a number of options for visualizing the structure and contents of variables, data arrays, and datasets:
scipp.to_html produces an HTML representation. This is also bound to _repr_html_, i.e., Jupyter will display this when the name of a Scipp object is typed at the end of a cell.
scipp.show draws an SVG representation of the contained items and their shapes.
scipp.table outputs a table representation of 1-D data.
str and repr produce a summary as a string.
String formatting is always possible, but the outputs of to_html, show, and table are designed for Jupyter notebooks.
While the outputs are mostly self-explanatory, we discuss some details below.
HTML representation#
scipp.to_html is used to define _repr_html_. Jupyter uses this special method in place of __repr__.
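As a minimal sketch of the mechanism, the following class defines both methods. The class `Point` is hypothetical and not part of Scipp; it only illustrates that a plain terminal falls back to `__repr__`, whereas Jupyter calls `_repr_html_` when the object is the last expression in a cell.

```python
from html import escape


class Point:
    """Hypothetical example class (not part of Scipp)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Used by the plain Python REPL and by print(repr(obj)).
        return f"Point(x={self.x}, y={self.y})"

    def _repr_html_(self):
        # Jupyter prefers this method for rich display if it exists.
        rows = "".join(
            f"<tr><th>{escape(name)}</th><td>{escape(str(value))}</td></tr>"
            for name, value in vars(self).items()
        )
        return f"<table>{rows}</table>"


p = Point(1, 2)
print(repr(p))
print(p._repr_html_())
```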
[1]:
import numpy as np
import scipp as sc
[2]:
x = sc.arange('x', 2.0)
y = sc.arange('y', 4.0, unit='m')
labels = sc.arange('y', start=7.0, stop=10.0)
ds = sc.Dataset(
    data={
        'a': sc.array(
            dims=['y', 'x'],
            values=np.random.random((3, 2)),
            variances=0.1 * np.random.random((3, 2)),
            unit='angstrom',
        )
    },
    coords={'x': x, 'y': y, 'y_label': labels},
)
ds['b'] = ds['a']
Simply typing the name of a variable, data array, or dataset will show the HTML representation:
[3]:
ds
[3]:
- y: 3
- x: 2
- x (x) float64 𝟙 0.0, 1.0
  Values: array([0., 1.])
- y (y [bin-edge]) float64 m 0.0, 1.0, 2.0, 3.0
  Values: array([0., 1., 2., 3.])
- y_label (y) float64 𝟙 7.0, 8.0, 9.0
  Values: array([7., 8., 9.])
- a (y, x) float64 Å 0.175, 0.675, ..., 0.887, 0.241 σ = 0.314, 0.201, ..., 0.302, 0.239
  Values: array([[0.17503413, 0.67461542], [0.8249171 , 0.23458381], [0.88747844, 0.24111765]])
  Variances (σ²): array([[0.09875485, 0.04023017], [0.04097682, 0.07812443], [0.09136517, 0.05695473]])
- b (y, x) float64 Å 0.175, 0.675, ..., 0.887, 0.241 σ = 0.314, 0.201, ..., 0.302, 0.239
  Values: array([[0.17503413, 0.67461542], [0.8249171 , 0.23458381], [0.88747844, 0.24111765]])
  Variances (σ²): array([[0.09875485, 0.04023017], [0.04097682, 0.07812443], [0.09136517, 0.05695473]])
The columns are:
Name of the data item, coordinate, etc. For coordinates, a bold font indicates that the coordinate is aligned.
Dimensions.
DType.
Unit.
Values and variances.
The reported size is only an estimate. It includes the actual arrays of values as well as (some of) the internal memory used by variables, etc. See, e.g., scipp.Variable.underlying_size.
WARNING:
IPython (and thus Jupyter) has an output caching system. By default, this keeps the last 1000 cell outputs. In the above case this is ds (not the displayed HTML, but the object itself). If such cell outputs are large, this output cache can consume enormous amounts of memory.
Note that del ds will not release the memory, since the IPython output cache still holds a reference to the same object. See this FAQ entry for clearing or disabling this caching.
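The effect can be reproduced without IPython. In the sketch below, a plain dict named `out_cache` is a stand-in for the output cache and `Big` is a hypothetical placeholder for a large object; neither name comes from Scipp or IPython. A weak reference is used to observe whether the object is still alive.

```python
import gc
import weakref


class Big:
    """Hypothetical stand-in for a large object such as a dataset."""


out_cache = {}  # plays the role of IPython's output cache

obj = Big()
alive = weakref.ref(obj)  # lets us observe the object without keeping it alive
out_cache[3] = obj        # what effectively happens when a cell ends with `obj`

del obj                   # removes only the name, not the cached reference
gc.collect()
still_alive_after_del = alive() is not None

out_cache.clear()         # dropping the cached reference releases the object
gc.collect()
freed_after_clear = alive() is None

print(still_alive_after_del, freed_after_clear)
```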
Note that (as usual) Jupyter only shows the value of the last expression in a cell:
[4]:
a = 1
ds
a
[4]:
1
In this case, to_html can be used to retain the HTML view, e.g., to show multiple objects in a single cell:
[5]:
sc.to_html(ds['a'])
sc.to_html(ds['b'])
- y: 3
- x: 2
- x (x) float64 𝟙 0.0, 1.0
  Values: array([0., 1.])
- y (y [bin-edge]) float64 m 0.0, 1.0, 2.0, 3.0
  Values: array([0., 1., 2., 3.])
- y_label (y) float64 𝟙 7.0, 8.0, 9.0
  Values: array([7., 8., 9.])
- (y, x) float64 Å 0.175, 0.675, ..., 0.887, 0.241 σ = 0.314, 0.201, ..., 0.302, 0.239
  Values: array([[0.17503413, 0.67461542], [0.8249171 , 0.23458381], [0.88747844, 0.24111765]])
  Variances (σ²): array([[0.09875485, 0.04023017], [0.04097682, 0.07812443], [0.09136517, 0.05695473]])

- y: 3
- x: 2
- x (x) float64 𝟙 0.0, 1.0
  Values: array([0., 1.])
- y (y [bin-edge]) float64 m 0.0, 1.0, 2.0, 3.0
  Values: array([0., 1., 2., 3.])
- y_label (y) float64 𝟙 7.0, 8.0, 9.0
  Values: array([7., 8., 9.])
- (y, x) float64 Å 0.175, 0.675, ..., 0.887, 0.241 σ = 0.314, 0.201, ..., 0.302, 0.239
  Values: array([[0.17503413, 0.67461542], [0.8249171 , 0.23458381], [0.88747844, 0.24111765]])
  Variances (σ²): array([[0.09875485, 0.04023017], [0.04097682, 0.07812443], [0.09136517, 0.05695473]])
Typing the Scipp module name at the end of a cell yields an HTML view of all Scipp objects (variables, data arrays, and datasets):
[6]:
sc
Variables: (3)
labels
- (y: 3) float64 𝟙 7.0, 8.0, 9.0
  Values: array([7., 8., 9.])
x
- (x: 2) float64 𝟙 0.0, 1.0
  Values: array([0., 1.])
y
- (y: 4) float64 m 0.0, 1.0, 2.0, 3.0
  Values: array([0., 1., 2., 3.])
DataArrays: (0)
Datasets: (1)
ds
- y: 3
- x: 2
- x (x) float64 𝟙 0.0, 1.0
  Values: array([0., 1.])
- y (y [bin-edge]) float64 m 0.0, 1.0, 2.0, 3.0
  Values: array([0., 1., 2., 3.])
- y_label (y) float64 𝟙 7.0, 8.0, 9.0
  Values: array([7., 8., 9.])
- a (y, x) float64 Å 0.175, 0.675, ..., 0.887, 0.241 σ = 0.314, 0.201, ..., 0.302, 0.239
  Values: array([[0.17503413, 0.67461542], [0.8249171 , 0.23458381], [0.88747844, 0.24111765]])
  Variances (σ²): array([[0.09875485, 0.04023017], [0.04097682, 0.07812443], [0.09136517, 0.05695473]])
- b (y, x) float64 Å 0.175, 0.675, ..., 0.887, 0.241 σ = 0.314, 0.201, ..., 0.302, 0.239
  Values: array([[0.17503413, 0.67461542], [0.8249171 , 0.23458381], [0.88747844, 0.24111765]])
  Variances (σ²): array([[0.09875485, 0.04023017], [0.04097682, 0.07812443], [0.09136517, 0.05695473]])
DataGroups: (0)
[6]:
<module 'scipp' from '/home/runner/work/scipp/scipp/.tox/docs/lib/python3.10/site-packages/scipp/__init__.py'>
SVG representation#
scipp.show renders Scipp objects to an image that shows the relationships between coordinates and data. If a dimension extent is large, show truncates it to avoid generating massive and unreadable SVGs. Objects with more than three dimensions are not supported and result in an error message.
Compare the image below with the HTML representation to see what the individual components represent. Names of dataset items and coordinates are shown in large letters, and dimension names are shown in smaller letters (rotated for y).
[7]:
sc.show(ds)
Note that y has four blocks, whereas y_label and the data have three in the y-dimension. This indicates that y is a bin-edge coordinate.
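The relation between a bin-edge coordinate and the data can be illustrated with plain Python lists. This is only a sketch: the helper `is_bin_edge` is hypothetical and not a Scipp function, and the data values are rounded numbers taken from the example above.

```python
# A bin-edge coordinate has one more entry than the data along that dimension.
edges = [0.0, 1.0, 2.0, 3.0]   # 4 edges along y
values = [0.175, 0.825, 0.887]  # 3 data points along y


def is_bin_edge(coord, data):
    """Hypothetical check: N bins are delimited by N + 1 edges."""
    return len(coord) == len(data) + 1


# Each data point belongs to the interval between two consecutive edges.
bins = list(zip(edges[:-1], edges[1:], values))

print(is_bin_edge(edges, values))
for left, right, value in bins:
    print(f"[{left}, {right}) -> {value}")
```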
scipp.show also works with binned data. Here, the smaller blocks to the right represent the events, i.e., the bin contents. Their length carries no meaning, since the size of bins can vary.
[8]:
sc.show(sc.data.binned_xy(100, 3, 2))
Table representation#
scipp.table arranges Scipp objects in a table. It only works with one-dimensional objects, so we have to use slicing to display our higher-dimensional example:
[9]:
sc.table(ds['y', 0])
[9]:
| | a | b |
|---|---|---|
| Coordinates | Data | Data |
| x [𝟙] | [Å] | [Å] |
| 0.000 | 0.175±0.314 | 0.175±0.314 |
| 1.000 | 0.675±0.201 | 0.675±0.201 |
In the following, the y column is longer than the other columns because y is a bin-edge coordinate.
[10]:
sc.table(ds['x', 0])
[10]:
| | | a | b |
|---|---|---|---|
| Coordinates | | Data | Data |
| y [m] | y_label [𝟙] | [Å] | [Å] |
| 0.000 | 7.000 | 0.175±0.314 | 0.175±0.314 |
| 1.000 | 8.000 | 0.825±0.202 | 0.825±0.202 |
| 2.000 | 9.000 | 0.887±0.302 | 0.887±0.302 |
| 3.000 | | | |
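The layout of this table can be imitated with the standard library alone. The sketch below is not how scipp.table is implemented; it only shows two things visible in the output: the data cells display the value together with the standard deviation (the square root of the variance), and the longer bin-edge column is padded with empty cells. The numbers are copied from the example dataset above.

```python
import math
from itertools import zip_longest

# Contents of the slice ds['x', 0]: 4 bin edges, 3 labels, 3 data points.
y_edges = [0.0, 1.0, 2.0, 3.0]
y_label = [7.0, 8.0, 9.0]
values = [0.17503413, 0.8249171, 0.88747844]
variances = [0.09875485, 0.04097682, 0.09136517]

# Format every column as strings; data cells show value±sigma.
columns = [
    [f"{edge:.3f}" for edge in y_edges],
    [f"{label:.3f}" for label in y_label],
    [f"{v:.3f}±{math.sqrt(var):.3f}" for v, var in zip(values, variances)],
]

# Shorter columns are padded with empty cells, so the extra bin edge
# gets a row of its own.
rows = list(zip_longest(*columns, fillvalue=""))

for row in rows:
    print(" | ".join(row))
```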
String representation#
All Scipp objects can be converted to strings:
[11]:
print(ds)
<scipp.Dataset>
Dimensions: Sizes[y:3, x:2, ]
Coordinates:
* x float64 [dimensionless] (x) [0, 1]
* y float64 [m] (y [bin-edge]) [0, 1, 2, 3]
* y_label float64 [dimensionless] (y) [7, 8, 9]
Data:
a float64 [Å] (y, x) [0.175034, 0.674615, ..., 0.887478, 0.241118] [0.0987548, 0.0402302, ..., 0.0913652, 0.0569547]
b float64 [Å] (y, x) [0.175034, 0.674615, ..., 0.887478, 0.241118] [0.0987548, 0.0402302, ..., 0.0913652, 0.0569547]
The format of variables can be controlled using f-strings or format. For example, the default format shows the first 2 and last 2 elements:
[12]:
var = sc.linspace('x', 0.0, 1.0, 11, unit='m')
f'{var}'
[12]:
'<scipp.Variable> (x: 11) float64 [m] [0, 0.1, ..., 0.9, 1]'
Use < to show the first 4 elements:
[13]:
f'{var:<}'
[13]:
'<scipp.Variable> (x: 11) float64 [m] [0, 0.1, 0.2, 0.3, ...]'
Use #n to show n elements instead of 4:
[14]:
f'{var:#5}'
[14]:
'<scipp.Variable> (x: 11) float64 [m] [0, 0.1, ..., 0.8, 0.9, 1]'
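The element selection seen in the outputs above can be mimicked with list slicing. The function `select` below is a hypothetical sketch, not Scipp code. It assumes that the default takes `length // 2` elements from the beginning and the rest from the end, which matches the outputs shown here for lengths 4 and 5. The behavior for `'>'` (elements from the end) is an assumption based on the format syntax and is not demonstrated above.

```python
def select(values, selection="^", length=4):
    """Hypothetical sketch of picking which elements to display."""
    values = list(values)
    if len(values) <= length:
        return values  # short arrays are shown in full
    if selection == "<":
        return values[:length] + ["..."]
    if selection == ">":
        return ["..."] + values[len(values) - length:]
    # Default ("^"): split between beginning and end.
    head = length // 2
    tail = length - head
    return values[:head] + ["..."] + values[len(values) - tail:]


vals = [i / 10 for i in range(11)]  # same values as the variable above
print(select(vals))
print(select(vals, "<"))
print(select(vals, "^", 5))
```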
Configure how elements are formatted. Note the double colon! The options after the first colon control how the variable itself is formatted. Options after the second are forwarded to the elements and can be anything that the element type (in this case float) supports.
[15]:
f'{var::.1e}'
[15]:
'<scipp.Variable> (x: 11) float64 [m] [0.0e+00, 1.0e-01, ..., 9.0e-01, 1.0e+00]'
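To see why the second colon is needed, it helps to look at what Python passes to an object's `__format__` method: everything after the first colon of the replacement field. The class `SpecProbe` below is a hypothetical helper (not part of Scipp) that simply returns the spec it receives, so the remaining colon is visible in the result.

```python
class SpecProbe:
    """Hypothetical helper that echoes the format spec it receives."""

    def __format__(self, spec):
        return repr(spec)


probe = SpecProbe()

single = f"{probe:<#5}"        # only options for the object itself
double = f"{probe::.1e}"       # empty object options, then element options
combined = f"{probe:<#5:.1e}"  # both parts, separated by the second colon

print(single, double, combined)
```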
Or combine all of the above:
[16]:
f'{var:<#5:.1e}'
[16]:
'<scipp.Variable> (x: 11) float64 [m] [0.0e+00, 1.0e-01, 2.0e-01, 3.0e-01, 4.0e-01, ...]'
In addition, Variables have a compact string format:
[17]:
var = sc.scalar(1.2345, variance=0.01, unit='kg')
f'{var:c}'
[17]:
'1.23(10) kg'
Note that this is primarily intended for scalar variables and may produce hard-to-read output otherwise.
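The notation 1.23(10) states the value followed by the uncertainty in units of the last displayed digits, i.e., 1.23 ± 0.10. The function below is a rough sketch of this notation, not Scipp's implementation. It assumes that the standard deviation (square root of the variance) is shown with two significant digits, which is consistent with the output above, and it only handles standard deviations below 1. The name `compact` is hypothetical.

```python
import math


def compact(value, variance, unit):
    """Hypothetical sketch of the value(uncertainty) notation."""
    std = math.sqrt(variance)
    if std <= 0:
        raise ValueError("this sketch requires a positive variance")
    # Round the standard deviation to two significant digits,
    # e.g. 0.1 -> '1.0e-01'.
    mantissa, exponent = f"{std:.1e}".split("e")
    exponent = int(exponent)
    if exponent >= 0:
        raise ValueError("this sketch only handles standard deviations below 1")
    digits = mantissa.replace(".", "")  # '1.0' -> '10'
    decimals = 1 - exponent             # decimal places shown for the value
    return f"{value:.{decimals}f}({digits}) {unit}"


print(compact(1.2345, 0.01, "kg"))
```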
Format string syntax#
The full syntax of format specifiers is:
format_spec ::= [scipp_spec] [":" nested_spec]
nested_spec ::= .*
scipp_spec ::= [selection]["#" length][type]
selection ::= "^" | "<" | ">"
length ::= digit+
type ::= "c"
``selection`` controls how the array is sliced:
selection | Meaning |
---|---|
^ | Use elements from the beginning and end. |
< | Use elements from the beginning. |
> | Use elements from the end. |
None | Same as ^. |
``length`` controls how many elements are shown. It defaults to 4.
``type`` selects between different formatters:
type | Meaning |
---|---|
c | Compact formatter. Does not support other options. |
None | Default formatter which shows the variable with all metadata and data as determined by the other options. |
``nested_spec`` is used to format the array elements. It can be anything that the dtype’s formatter supports. Note that it always requires an additional colon to separate it from the scipp_spec. See in particular the standard library specification.
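The grammar above can be turned into a small parser with a regular expression. This is an illustrative sketch and not the parser used by Scipp; the function name `parse_spec` and the returned dictionary layout are assumptions made for this example. The defaults (`^` and 4) follow the description above.

```python
import re

# One optional group per grammar element:
# [selection]["#" length][type][":" nested_spec]
_SPEC = re.compile(
    r"(?P<selection>[\^<>])?"
    r"(?:#(?P<length>\d+))?"
    r"(?P<type>c)?"
    r"(?::(?P<nested>.*))?",
    re.DOTALL,
)


def parse_spec(spec):
    """Hypothetical parser for the format_spec grammar shown above."""
    match = _SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid format spec: {spec!r}")
    return {
        "selection": match["selection"] or "^",
        "length": int(match["length"]) if match["length"] else 4,
        "type": match["type"],
        "nested": match["nested"] or "",
    }


print(parse_spec("<#5:.1e"))
print(parse_spec(""))
```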