Numpy, Pandas, and Xarray#

While Plopp is primarily aimed at being used with Scipp data structures (and uses Scipp internally), it offers some compatibility with other libraries in the scientific Python ecosystem. Most of the high-level functions in Plopp will accept Numpy, Pandas and Xarray data structures as input.

We illustrate this here with the help of a few useful examples.

Numpy arrays#

The most commonly used function in Plopp is the high-level plot wrapper, which can accept a number of different inputs. For a one-dimensional ndarray, simply use

[1]:
import numpy as np
import plopp as pp

a1d = np.sin(0.15 * np.arange(50.0))
pp.plot(a1d)
[1]:
../../_images/user-guide_getting-started_numpy-pandas-xarray_1_0.svg

Scipp data arrays have dimensions and physical units, which Plopp uses for the axis labels. Numpy arrays do not have dimension labels, so the horizontal axis of the figure is simply labeled axis-0. Similarly, the array has no physical units, so the vertical axis is given the default dimensionless label (try using Scipp data arrays to have the axes labeled automatically).

Plotting two-dimensional arrays is equally simple with

[2]:
a2d = np.sin(0.15 * np.arange(50.0)).reshape(50, 1) * np.sin(0.2 * np.arange(30.0))
pp.plot(a2d)
[2]:
../../_images/user-guide_getting-started_numpy-pandas-xarray_3_0.svg

Just like with Scipp data arrays, plotting multiple arrays onto the same axes is achieved by supplying a dict to the plot function:

[3]:
b1d = 3 * a1d + np.random.random(50)
pp.plot({'a': a1d, 'b': b1d})
[3]:
../../_images/user-guide_getting-started_numpy-pandas-xarray_5_0.svg

Pandas Series and DataFrame#

New in version 23.05.0.

Plopp’s plot wrapper will accept a Pandas Series as input in the same way:

[4]:
import pandas as pd

N = 200
ts = pd.Series(
    np.random.randn(N), index=pd.date_range("1/1/2000", periods=N), name='Temperature'
)
ts = ts.cumsum()
pp.plot(ts, ls='-', marker=None)
[4]:
../../_images/user-guide_getting-started_numpy-pandas-xarray_7_0.svg

Supplying a DataFrame to plot will attempt to place all entries on the same axes. This is very useful for quick inspection, but it also means that if some data types are incompatible (e.g. some columns are floats, while others are strings), the call to plot will fail.

[5]:
df = pd.DataFrame(np.random.randn(N, 4), index=ts.index, columns=list("ABCD"))
df = df.cumsum()
pp.plot(df, ls='-', marker=None)
[5]:
../../_images/user-guide_getting-started_numpy-pandas-xarray_9_0.svg
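
If a DataFrame mixes numeric and non-numeric columns, one possible workaround for the failure mentioned above is to keep only the numeric columns before plotting. The sketch below relies on the standard Pandas method `select_dtypes`; the `numeric_only` helper and the column names are hypothetical, and the plotting call is shown as a comment so that the snippet depends only on Numpy and Pandas.

```python
import numpy as np
import pandas as pd


def numeric_only(df: pd.DataFrame) -> pd.DataFrame:
    """Return df restricted to its numeric columns (hypothetical helper)."""
    return df.select_dtypes(include="number")


# Illustrative data: two float columns and one string column.
mixed = pd.DataFrame(
    {
        "A": np.random.randn(5).cumsum(),
        "B": np.random.randn(5).cumsum(),
        "label": list("vwxyz"),
    },
    index=pd.date_range("1/1/2000", periods=5),
)

plottable = numeric_only(mixed)
print(list(plottable.columns))

# The filtered frame can then be passed to Plopp as before:
# pp.plot(plottable, ls='-', marker=None)
```

Filtering by dtype leaves the original DataFrame untouched, so the dropped columns remain available for other uses.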

Xarray#

New in version 23.05.0.

Xarray data structures are very similar to the ones Scipp provides, and the labeled dimensions allow us to automatically annotate the axes labels of a figure.

[6]:
import xarray as xr

air = xr.tutorial.open_dataset("air_temperature").air
# We modify a few entries which are not well handled by Scipp
del air.attrs['precision']
del air.attrs['GRIB_id']
del air.attrs['actual_range']
air.coords['lat'].attrs['units'] = 'degrees'
air
[6]:
<xarray.DataArray 'air' (time: 2920, lat: 25, lon: 53)> Size: 31MB
[3869000 values with dtype=float64]
Coordinates:
  * lat      (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
  * lon      (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
  * time     (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
Attributes:
    long_name:    4xDaily Air temperature at sigma level 995
    units:        degK
    GRIB_name:    TMP
    var_desc:     Air temperature
    dataset:      NMC Reanalysis
    level_desc:   Surface
    statistic:    Individual Obs
    parent_stat:  Other
[7]:
air1d = air.isel(lat=10, lon=10)
pp.plot(air1d)
/home/runner/work/plopp/plopp/.tox/docs/lib/python3.10/site-packages/plopp/plotting/common.py:44: UserWarning: Input data contains some attributes which have been dropped during the conversion.
  return sc.compat.from_xarray(obj)
[7]:
../../_images/user-guide_getting-started_numpy-pandas-xarray_12_1.svg
[8]:
air2d = air.isel(time=500)
pp.plot(air2d)
[8]:
../../_images/user-guide_getting-started_numpy-pandas-xarray_13_0.svg

Interactive tools#

While you can easily make these plots with Xarray itself, Plopp also provides additional tools to explore your data.

One example is the slicer plot, which can be used to navigate the additional dimensions of 1d or 2d data using an interactive slider.

[9]:
%matplotlib widget
pp.slicer(air)
[9]:

Another is the inspector plot, which allows you to pick points on the 2d map using the inspector tool and displays a time cut in a second plot below:

[10]:
inspect_plot = pp.inspector(air, dim='time', operation='mean', orientation='vertical')
[12]:
inspect_plot
[12]: