Representations and Tables
Scipp provides a number of options for visualizing the structure and contents of variables, data arrays, and datasets:
scipp.to_html produces an HTML representation. This is also bound to _repr_html_, i.e., Jupyter will display this when the name of a scipp object is typed at the end of a cell.
scipp.show draws an SVG representation of the contained items and their shapes.
scipp.table outputs a table representation of 1-D data.
str and repr produce a summary as a string.
String formatting is always possible, but the outputs of to_html, show, and table are designed for Jupyter notebooks.
While the outputs are mostly self-explanatory, we discuss some details below.
HTML representation#
scipp.to_html is used to define _repr_html_. Jupyter uses this special method in place of __repr__.
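To illustrate the mechanism, here is a minimal sketch of the _repr_html_ protocol. The `Quantity` class is hypothetical and not part of scipp. It only shows that an object can offer both a plain-text and an HTML representation, and that a notebook front end picks the HTML one when it exists.

```python
class Quantity:
    """Hypothetical stand-in for an object with a rich notebook display."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __repr__(self):
        # Plain-text representation, used when no rich display is available.
        return f"Quantity({self.value!r}, {self.unit!r})"

    def _repr_html_(self):
        # Jupyter looks for this method when displaying the last
        # expression of a cell and renders the returned HTML.
        return f"<b>{self.value}</b> <i>{self.unit}</i>"


q = Quantity(1.2, "m")
text = repr(q)
html = q._repr_html_()
print(text)
print(html)
```

Outside a notebook only the plain-text form is shown, which is why the string representations remain useful.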
[1]:
import numpy as np
import scipp as sc
[2]:
x = sc.arange('x', 2.)
y = sc.arange('y', 4., unit='m')
labels = sc.arange('y', start=7., stop=10.)
d = sc.Dataset(
    data={'a': sc.array(dims=['y', 'x'],
                        values=np.random.random((3, 2)),
                        variances=0.1 * np.random.random((3, 2)),
                        unit='angstrom')},
    coords={'x': x, 'y': y, 'y_label': labels})
d['b'] = d['a']
d['c'] = 1.0 * sc.units.kg
d['a'].attrs['x_attr'] = sc.array(dims=['x'], values=[1.77, 3.32])
d['b'].attrs['x_attr'] = sc.array(dims=['x'], values=[55.7, 105.1])
d['b'].attrs['b_attr'] = 1.2 * sc.units.m
Simply typing the name of a variable, data array, or dataset will show the HTML representation:
[3]:
d
[3]:
- y: 3
- x: 2
- x (x) float64 𝟙 0.0, 1.0
  Values: array([0., 1.])
- y (y [bin-edge]) float64 m 0.0, 1.0, 2.0, 3.0
  Values: array([0., 1., 2., 3.])
- y_label (y) float64 𝟙 7.0, 8.0, 9.0
  Values: array([7., 8., 9.])
- a (y, x) float64 Å 0.367, 0.539, ..., 0.319, 0.545 σ = 0.223, 0.141, ..., 0.077, 0.113
  - x_attr (x) float64 𝟙 1.77, 3.32
    Values: array([1.77, 3.32])
  Values: array([[0.36698296, 0.53884768], [0.42914051, 0.73771041], [0.31926488, 0.54456542]])
  Variances (σ²): array([[0.04965686, 0.0197753 ], [0.04518583, 0.07240765], [0.00587706, 0.01275458]])
- b (y, x) float64 Å 0.367, 0.539, ..., 0.319, 0.545 σ = 0.223, 0.141, ..., 0.077, 0.113
  - b_attr () float64 m 1.2
    Values: array(1.2)
  - x_attr (x) float64 𝟙 55.7, 105.1
    Values: array([ 55.7, 105.1])
  Values: array([[0.36698296, 0.53884768], [0.42914051, 0.73771041], [0.31926488, 0.54456542]])
  Variances (σ²): array([[0.04965686, 0.0197753 ], [0.04518583, 0.07240765], [0.00587706, 0.01275458]])
- c () float64 kg 1.0
  Values: array(1.)
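The summary line of each data item reports the standard deviation σ, whereas the expanded view lists the variances σ². As a quick check using only the standard library, taking the square root of the variances printed above reproduces the σ values of the summary line. The numbers are copied from the output of this particular run; rounding to three decimals is an assumption about how the summary is formatted.

```python
import math

# First two and last two variances of item 'a', copied from the output above.
variances = [0.04965686, 0.0197753, 0.00587706, 0.01275458]

# The standard deviation is the square root of the variance.
sigmas = [round(math.sqrt(v), 3) for v in variances]
print(sigmas)
```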
Note that (as usual) Jupyter only shows the last variable mentioned in a cell:
[4]:
a = 1
d
a
[4]:
1
In this case, to_html can be used to retain the HTML view, e.g., to show multiple objects in a single cell:
[5]:
sc.to_html(d['a'])
sc.to_html(d['c'])
- y: 3
- x: 2
- x (x) float64 𝟙 0.0, 1.0
  Values: array([0., 1.])
- y (y [bin-edge]) float64 m 0.0, 1.0, 2.0, 3.0
  Values: array([0., 1., 2., 3.])
- y_label (y) float64 𝟙 7.0, 8.0, 9.0
  Values: array([7., 8., 9.])
- (y, x) float64 Å 0.367, 0.539, ..., 0.319, 0.545 σ = 0.223, 0.141, ..., 0.077, 0.113
  Values: array([[0.36698296, 0.53884768], [0.42914051, 0.73771041], [0.31926488, 0.54456542]])
  Variances (σ²): array([[0.04965686, 0.0197753 ], [0.04518583, 0.07240765], [0.00587706, 0.01275458]])
- x_attr (x) float64 𝟙 1.77, 3.32
  Values: array([1.77, 3.32])

- () float64 kg 1.0
  Values: array(1.)
Typing the scipp module name at the end of a cell yields an HTML view of all scipp objects (variables, data arrays, and datasets):
[6]:
sc
Variables: (3)
labels
- (y: 3) float64 𝟙 7.0, 8.0, 9.0
  Values: array([7., 8., 9.])
x
- (x: 2) float64 𝟙 0.0, 1.0
  Values: array([0., 1.])
y
- (y: 4) float64 m 0.0, 1.0, 2.0, 3.0
  Values: array([0., 1., 2., 3.])
DataArrays: (0)
Datasets: (1)
d
- y: 3
- x: 2
- x (x) float64 𝟙 0.0, 1.0
  Values: array([0., 1.])
- y (y [bin-edge]) float64 m 0.0, 1.0, 2.0, 3.0
  Values: array([0., 1., 2., 3.])
- y_label (y) float64 𝟙 7.0, 8.0, 9.0
  Values: array([7., 8., 9.])
- a (y, x) float64 Å 0.367, 0.539, ..., 0.319, 0.545 σ = 0.223, 0.141, ..., 0.077, 0.113
  - x_attr (x) float64 𝟙 1.77, 3.32
    Values: array([1.77, 3.32])
  Values: array([[0.36698296, 0.53884768], [0.42914051, 0.73771041], [0.31926488, 0.54456542]])
  Variances (σ²): array([[0.04965686, 0.0197753 ], [0.04518583, 0.07240765], [0.00587706, 0.01275458]])
- b (y, x) float64 Å 0.367, 0.539, ..., 0.319, 0.545 σ = 0.223, 0.141, ..., 0.077, 0.113
  - b_attr () float64 m 1.2
    Values: array(1.2)
  - x_attr (x) float64 𝟙 55.7, 105.1
    Values: array([ 55.7, 105.1])
  Values: array([[0.36698296, 0.53884768], [0.42914051, 0.73771041], [0.31926488, 0.54456542]])
  Variances (σ²): array([[0.04965686, 0.0197753 ], [0.04518583, 0.07240765], [0.00587706, 0.01275458]])
- c () float64 kg 1.0
  Values: array(1.)
[6]:
<module 'scipp' from '/home/runner/work/scipp/scipp/.tox/docs/lib/python3.8/site-packages/scipp/__init__.py'>
SVG representation#
scipp.show renders scipp objects to an image that shows the relationships between coordinates and data. If a dimension extent is large, show truncates it to avoid generating massive and unreadable SVGs. Objects with more than three dimensions are not supported and result in an error message.
Compare the image below with the HTML representation to see what the individual components represent. Names of dataset items and coordinates are shown in large letters, and dimension names are shown in smaller letters (rotated for y).
[7]:
sc.show(d)
Note that y has four blocks, whereas y_label and the data have three in the y dimension. This indicates that y is a bin-edge coordinate.
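A bin-edge coordinate has one more entry than the data along its dimension, because each bin is bounded by two consecutive edges. The following sketch uses plain Python lists with the values from the example above. It does not call scipp and only illustrates the counting.

```python
# Values taken from the example: 4 edges along y, 3 labels (and data rows) along y.
y_edges = [0.0, 1.0, 2.0, 3.0]
y_label = [7.0, 8.0, 9.0]

# Consecutive pairs of edges delimit the bins.
bins = list(zip(y_edges[:-1], y_edges[1:]))

# There is one bin per data row, hence one more edge than rows.
for label, (left, right) in zip(y_label, bins):
    print(f"label {label}: from {left} m to {right} m")
```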
scipp.show also works with binned data. Here, the smaller blocks to the right represent the events, i.e., the bin contents. Their drawn length carries no meaning, since the size of bins can vary.
[8]:
sc.show(sc.data.binned_xy(100, 3, 2))
Table representation#
scipp.table arranges scipp objects in a table. It only works with one-dimensional objects, so we have to use slicing to display our higher-dimensional example:
[9]:
sc.table(d['y', 0])
[9]:
Coordinates | a: Data | a: Attributes | b: Data | b: Attributes |
---|---|---|---|---|
x [𝟙] | [Å] | x_attr [𝟙] | [Å] | x_attr [𝟙] |
0.000 | 0.367±0.223 | 1.770 | 0.367±0.223 | 55.700 |
1.000 | 0.539±0.141 | 3.320 | 0.539±0.141 | 105.100 |
In the following, the y column is longer than the other columns because y is a bin-edge coordinate.
[10]:
sc.table(d['x', 0])
[10]:
Coordinates | Coordinates | a: Data | b: Data |
---|---|---|---|
y [m] | y_label [𝟙] | [Å] | [Å] |
0.000 | 7.000 | 0.367±0.223 | 0.367±0.223 |
1.000 | 8.000 | 0.429±0.213 | 0.429±0.213 |
2.000 | 9.000 | 0.319±0.077 | 0.319±0.077 |
3.000 | | | |
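Slicing at a single index removes that dimension, which is how the two-dimensional example becomes one-dimensional for the tables above. The sketch below mimics this with nested Python lists holding the values of item a in (y, x) order. It is an analogy under that layout assumption, not scipp's implementation.

```python
# Values of item 'a' with dims (y, x), copied from the output of this run.
a = [
    [0.36698296, 0.53884768],
    [0.42914051, 0.73771041],
    [0.31926488, 0.54456542],
]

# Analogue of d['y', 0]: fix y, keep all x, giving 2 entries.
row = a[0]

# Analogue of d['x', 0]: fix x, keep all y, giving 3 entries.
column = [r[0] for r in a]

row_text = [f"{v:.3f}" for v in row]
column_text = [f"{v:.3f}" for v in column]
print(row_text)
print(column_text)
```

The formatted values match the data columns of the two tables, without the uncertainties.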
String-representation#
All scipp objects can be converted to strings:
[11]:
print(d)
<scipp.Dataset>
Dimensions: Sizes[y:3, x:2, ]
Coordinates:
  x float64 [dimensionless] (x) [0, 1]
  y float64 [m] (y [bin-edge]) [0, 1, 2, 3]
  y_label float64 [dimensionless] (y) [7, 8, 9]
Data:
  a float64 [Å] (y, x) [0.366983, 0.538848, ..., 0.319265, 0.544565] [0.0496569, 0.0197753, ..., 0.00587706, 0.0127546]
    Attributes:
      x_attr float64 [dimensionless] (x) [1.77, 3.32]
  b float64 [Å] (y, x) [0.366983, 0.538848, ..., 0.319265, 0.544565] [0.0496569, 0.0197753, ..., 0.00587706, 0.0127546]
    Attributes:
      b_attr float64 [m] () [1.2]
      x_attr float64 [dimensionless] (x) [55.7, 105.1]
  c float64 [kg] () [1]
In addition, Variables have a compact string format:
[12]:
print('{:c}'.format(d['c'].data))
1.0 kg
Note that this is primarily intended for scalar variables and may produce hard-to-read output otherwise.
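Format specifiers such as :c are passed to an object's __format__ method. As a rough sketch of that mechanism, the hypothetical `Scalar` class below handles a `c` spec by joining value and unit. It is not scipp's implementation and supports nothing beyond this one case.

```python
class Scalar:
    """Hypothetical scalar with a unit, used only to illustrate __format__."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __str__(self):
        return f"Scalar({self.value}, {self.unit})"

    def __format__(self, spec):
        if spec == "c":
            # Compact form: the value followed by the unit.
            return f"{self.value} {self.unit}"
        # Any other spec is applied to the regular string form.
        return format(str(self), spec)


compact = "{:c}".format(Scalar(1.0, "kg"))
print(compact)
```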