# Representations and Tables

Scipp provides a number of options for visualizing the structure and contents of variables, data arrays, and datasets:

- [scipp.to_html](../generated/functions/scipp.to_html.rst) produces an HTML representation.
  This is also bound to `_repr_html_`, i.e., Jupyter will display this when the name of a Scipp object is typed at the end of a cell.
- [scipp.show](../generated/functions/scipp.show.rst) draws an SVG representation of the contained items and their shapes.
- [scipp.table](../generated/functions/scipp.table.rst) outputs a table representation of 1-D data.
- `str` and `repr` produce a summary as a string.

String formatting is always possible, but the outputs of `to_html`, `show`, and `table` are designed for Jupyter notebooks.

While the outputs are mostly self-explanatory, we discuss some details below.

## HTML representation

[scipp.to_html](../generated/functions/scipp.to_html.rst) is used to define `_repr_html_`.
This special method is used by Jupyter in place of `__repr__`.
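
As a minimal sketch of this mechanism, the hypothetical class below (not part of Scipp) defines both `__repr__` and `_repr_html_`. A plain Python session uses the former, whereas Jupyter picks up the latter when the object is the last expression in a cell.

```python
class Temperature:
    """Hypothetical example class; not part of Scipp."""

    def __init__(self, kelvin):
        self.kelvin = kelvin

    def __repr__(self):
        # Plain-text representation, used outside of rich frontends.
        return f'Temperature(kelvin={self.kelvin})'

    def _repr_html_(self):
        # Rich representation that Jupyter prefers when it is available.
        return f'<b>Temperature</b>: {self.kelvin} K'


t = Temperature(293.15)
print(repr(t))
print(t._repr_html_())
```

Scipp objects follow the same pattern, with `_repr_html_` producing the output of `to_html`.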

In [None]:
import numpy as np
import scipp as sc

In [None]:
x = sc.arange('x', 2.)
y = sc.arange('y', 4., unit='m')
labels = sc.arange('y', start=7., stop=10.)
ds = sc.Dataset(
    data={'a': sc.array(dims=['y', 'x'],
                        values=np.random.random((3, 2)),
                        variances=0.1 * np.random.random((3, 2)),
                        unit='angstrom')},
    coords={'x': x, 'y': y, 'y_label': labels})
ds['b'] = ds['a']

Simply typing the name of a variable, data array, or dataset will show the HTML representation:

In [None]:
ds

The columns are

1. Name of the data item, coordinate, etc. For coordinates, a bold font indicates that the coordinate is aligned.
2. Dimensions.
3. DType.
4. Unit.
5. Values and variances.

The reported size is only an estimate.
It includes the actual arrays of values as well as (some of) the internal memory used by variables, etc.
See, e.g., [scipp.Variable.underlying_size](https://scipp.github.io/generated/classes/scipp.Variable.html#scipp.Variable.underlying_size).

<div class="alert alert-warning">
    <b>WARNING:</b>

IPython (and thus Jupyter) has an [Output caching system](https://ipython.readthedocs.io/en/stable/interactive/reference.html?highlight=previous#output-caching-system).
By default this keeps the last 1000 cell outputs.
In the above case this is `ds` (not the displayed HTML, but the object itself).
If such cell outputs are large then this output cache can consume enormous amounts of memory.

Note that `del ds` will *not* release the memory, since the IPython output cache still holds a reference to the same object.
See [this FAQ entry](../getting-started/faq.rst#scipp-is-using-more-and-more-memory-the-jupyter-kernel-crashes) for clearing or disabling this caching.

</div>
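
The effect described in the warning can be illustrated without IPython. In the following sketch, a plain dict named `output_cache` is a hypothetical stand-in for the output cache and `Payload` is a hypothetical stand-in for a large Scipp object. A weak reference is used to observe whether the object is still alive.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for a large Scipp object."""


output_cache = {}            # hypothetical stand-in for the IPython output cache
ds_like = Payload()
output_cache[1] = ds_like    # conceptually what displaying the object does
alive = weakref.ref(ds_like)

del ds_like                  # removes the name, but the cache still holds a reference
still_cached = alive() is not None

output_cache.clear()         # dropping the cached reference releases the object
gc.collect()
freed = alive() is None

print(still_cached, freed)
```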

Note that (as usual) Jupyter only shows the value of the last expression in a cell:

In [None]:
a = 1
ds
a

In this case, `to_html` can be used to retain the HTML view, e.g., to show multiple objects in a single cell:

In [None]:
sc.to_html(ds['a'])
sc.to_html(ds['b'])

Typing the Scipp module name at the end of a cell yields an HTML view of all Scipp objects (variables, data arrays, and datasets):

In [None]:
sc

## SVG representation

[scipp.show](../generated/functions/scipp.show.rst) renders Scipp objects to an image that shows the relationships between coordinates and data.
If a dimension extent is large, `show` truncates it to avoid generating massive and unreadable SVGs.
Objects with more than three dimensions are not supported and will result in an error message.

Compare the image below with the HTML representation to see what the individual components represent.
Names of dataset items and coordinates are shown in large letters, and dimension names are shown in smaller letters (rotated for y).

In [None]:
sc.show(ds)

Note that `y` has four blocks, whereas `y_label` and the data have three in the y-dimension.
This indicates that `y` is a bin-edge coordinate.

`scipp.show` also works with binned data.
Here, the smaller blocks to the right represent the events, i.e., the bin contents.
The length of these blocks carries no meaning because the size of bins can vary.

In [None]:
sc.show(sc.data.binned_xy(100, 3, 2))

## Table representation

[scipp.table](../generated/functions/scipp.table.rst) arranges Scipp objects in a table.
It only works with one-dimensional objects, so we have to use slicing to display our higher-dimensional example:

In [None]:
sc.table(ds['y', 0])

In the following, the y column is longer than the other columns because `y` is a bin-edge coordinate.

In [None]:
sc.table(ds['x', 0])

## String representation

All Scipp objects can be converted to strings:

In [None]:
print(ds)

The format of variables can be controlled using f-strings or [format](https://docs.python.org/3/library/functions.html?highlight=format#format).
For example, the default format shows the first 2 and last 2 elements:

In [None]:
var = sc.linspace('x', 0.0, 1.0, 11, unit='m')
f'{var}'

Use `<` to show the first 4 elements:

In [None]:
f'{var:<}'

Use `#n` to show `n` elements instead of 4:

In [None]:
f'{var:#5}'

Configure how elements are formatted.
Note the double colon!
The options after the first colon control how the variable itself is formatted.
Options after the second are forwarded to the elements and can be anything that the element type (in this case `float`) supports.

In [None]:
f'{var::.1e}'

Or combine all of the above:

In [None]:
f'{var:<#5:.1e}'
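
The double-colon convention can be mimicked in plain Python. The sketch below is not Scipp's implementation: `Series` is a hypothetical container whose `__format__` splits the spec at the first colon, interprets the first part itself (here simply as a maximum number of elements, an assumption made for this example), and forwards the rest to each element.

```python
class Series:
    """Hypothetical container; not Scipp's implementation."""

    def __init__(self, values):
        self.values = list(values)

    def __format__(self, spec):
        # Text before the first colon configures the container itself,
        # text after it is forwarded unchanged to every element.
        own_spec, _, element_spec = spec.partition(':')
        limit = int(own_spec) if own_spec else len(self.values)
        shown = [format(v, element_spec) for v in self.values[:limit]]
        return '[' + ', '.join(shown) + ']'


s = Series([0.5, 1.0, 2.5])
print(f'{s}')         # no options at all
print(f'{s::.1e}')    # only element formatting, hence the double colon
print(f'{s:2:.2f}')   # container option and element formatting
```

Because Python passes everything after the first colon of a replacement field to `__format__`, an empty container spec leaves two adjacent colons in the f-string.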

In addition, variables have a compact string format:

In [None]:
var = sc.scalar(1.2345, variance=0.01, unit='kg')
f'{var:c}'

Note that this is primarily intended for scalar variables and may produce hard-to-read output otherwise.

## Format string syntax

The full syntax of format specifiers is:
```
format_spec ::= [scipp_spec] [":" nested_spec]
nested_spec ::= .*
scipp_spec  ::= [selection]["#" length][type]
selection   ::= "^" | "<" | ">"
length      ::= digit+
type        ::= "c"
```
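
To make the grammar concrete, here is a minimal sketch of a parser for it, written with a regular expression. It is not how Scipp parses format specifiers; `FormatSpec` and `parse_format_spec` are hypothetical names. The defaults (`^` and 4) are the ones stated in this section. The sketch does not enforce the restriction that the compact formatter rejects other options.

```python
import re
from dataclasses import dataclass
from typing import Optional

# One optional group per grammar element, in the order given by the grammar.
_SPEC = re.compile(
    r'(?P<selection>[\^<>])?'      # selection ::= "^" | "<" | ">"
    r'(?:#(?P<length>[0-9]+))?'    # "#" length
    r'(?P<type>c)?'                # type ::= "c"
    r'(?::(?P<nested>.*))?',       # ":" nested_spec
    re.DOTALL,
)


@dataclass
class FormatSpec:
    """Hypothetical container for the parsed pieces."""
    selection: str = '^'
    length: int = 4
    type: Optional[str] = None
    nested: str = ''


def parse_format_spec(spec: str) -> FormatSpec:
    match = _SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f'Invalid format spec: {spec!r}')
    return FormatSpec(
        selection=match['selection'] or '^',
        length=int(match['length']) if match['length'] else 4,
        type=match['type'],
        nested=match['nested'] or '',
    )


print(parse_format_spec('<#5:.1e'))
print(parse_format_spec('c'))
```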

*`selection`* controls how the array is sliced:

| selection | Meaning |
|-----------|---------|
| `^` | Use elements from the beginning and end as if by `var[:length//2]`, `...`, `var[-length//2:]`. |
| `<` | Use elements from the beginning as if by `var[:length]`, `...`. |
| `>` | Use elements from the end as if by `...`, `var[-length:]`. |
| None | Same as `^` |

*`length`* controls how many elements are shown.
It defaults to 4.
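
The slicing rules can be tried out on a plain Python list. The following sketch uses a hypothetical helper `select` that is not part of Scipp. Returning all elements without an ellipsis when the input is no longer than `length` is an assumption of this sketch.

```python
def select(values, selection='^', length=4):
    """Hypothetical helper mirroring the selection rules on a list."""
    if length < 1:
        raise ValueError('this sketch assumes length >= 1')
    if len(values) <= length:
        # Assumption: nothing needs to be elided for short inputs.
        return list(values)
    if selection == '^':
        # Note that -length // 2 is evaluated as (-length) // 2.
        return list(values[:length // 2]) + ['...'] + list(values[-length // 2:])
    if selection == '<':
        return list(values[:length]) + ['...']
    if selection == '>':
        return ['...'] + list(values[-length:])
    raise ValueError(f'Unknown selection: {selection!r}')


data = list(range(11))
print(select(data))             # default: beginning and end
print(select(data, '<'))        # beginning only
print(select(data, '>', 5))     # last five elements
```

For an odd `length`, the `^` rule places the extra element at the end because of the floor division.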

*`type`* selects between different formatters:

| type | Meaning |
|------|---------|
| `c` | Compact formatter. Does not support other options like `selection` or `nested_spec`. |
| None | Default formatter which shows the variable with all metadata and data as determined by the other options. |

*`nested_spec`* is used to format the array elements.
It can be anything that the dtype's formatter supports.
Note that it always requires an additional colon to separate it from the `scipp_spec`.
See in particular the [standard library specification](https://docs.python.org/3/library/string.html?highlight=string#format-specification-mini-language).