Numpy, Pandas, and Xarray#
While Plopp is primarily aimed at being used with Scipp data structures (and uses Scipp internally), it offers some compatibility with other libraries in the scientific Python ecosystem. Most of the high-level functions in Plopp will accept Numpy, Pandas and Xarray data structures as input.
We illustrate this here with the help of a few useful examples.
Numpy arrays#
The most commonly used function in Plopp is the high-level plot wrapper, which can accept a number of different inputs. For a one-dimensional ndarray, simply use
[1]:
import numpy as np
import plopp as pp
a1d = np.sin(0.15 * np.arange(50.0))
pp.plot(a1d)
[1]:
Scipp data arrays have dimensions and physical units, which are used for axis labels in Plopp. Numpy arrays do not have dimension labels, so the horizontal axis of the figure is just labeled axis-0. Similarly, the array does not have physical units, so the vertical axis is given the default dimensionless label (try using Scipp data arrays to have axes labeled automatically).
Plotting two-dimensional arrays is equally simple with
[2]:
a2d = np.sin(0.15 * np.arange(50.0)).reshape(50, 1) * np.sin(0.2 * np.arange(30.0))
pp.plot(a2d)
[2]:
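The two-dimensional array in the cell above is built with Numpy broadcasting: a column of shape (50, 1) multiplied by a row of shape (30,) gives an array of shape (50, 30). The short sketch below spells this out and checks it against an explicit outer product; the names col, row and outer are introduced here only for illustration.

```python
import numpy as np

col = np.sin(0.15 * np.arange(50.0)).reshape(50, 1)  # shape (50, 1)
row = np.sin(0.2 * np.arange(30.0))                  # shape (30,)

# Broadcasting stretches both operands to a common shape of (50, 30).
a2d = col * row
print(a2d.shape)

# The same array written as an explicit outer product.
outer = np.outer(col.ravel(), row)
print(np.allclose(a2d, outer))
```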
Just like with Scipp data arrays, plotting multiple arrays onto the same axes is achieved by supplying a dict to the plot function:
[3]:
b1d = 3 * a1d + np.random.random(50)
pp.plot({'a': a1d, 'b': b1d})
[3]:
Pandas Series and DataFrame#
New in version 23.05.0.
Plopp’s plot wrapper will accept a Pandas Series as input in the same way:
[4]:
import pandas as pd
N = 200
ts = pd.Series(
np.random.randn(N), index=pd.date_range("1/1/2000", periods=N), name='Temperature'
)
ts = ts.cumsum()
pp.plot(ts, ls='-', marker=None)
[4]:
Supplying a DataFrame to plot will attempt to place all entries on the same axes. This is very useful for quick inspection, but it also means that if some data types are incompatible (e.g. some columns are floats, while others are strings), the call to plot will fail.
[5]:
df = pd.DataFrame(np.random.randn(N, 4), index=ts.index, columns=list("ABCD"))
df = df.cumsum()
pp.plot(df, ls='-', marker=None)
[5]:
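One possible way around the mixed data types mentioned above is to keep only the numeric columns before plotting. The sketch below uses the Pandas select_dtypes method on a made-up frame; the frame mixed and its column names are hypothetical, and this is a suggestion rather than something Plopp does for you.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing two float columns with one string column.
mixed = pd.DataFrame(
    {
        "A": np.random.randn(5).cumsum(),
        "B": np.random.randn(5).cumsum(),
        "label": list("vwxyz"),
    },
    index=pd.date_range("1/1/2000", periods=5),
)

# Keep only the numeric columns; the result can then be given to pp.plot.
numeric = mixed.select_dtypes(include="number")
print(list(numeric.columns))
```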
Xarray#
New in version 23.05.0.
Xarray data structures are very similar to the ones Scipp provides, and the labeled dimensions allow us to automatically annotate the axis labels of a figure.
[6]:
import xarray as xr
air = xr.tutorial.open_dataset("air_temperature").air
# We modify a few entries which are not well handled by Scipp
del air.attrs['precision']
del air.attrs['GRIB_id']
del air.attrs['actual_range']
air.coords['lat'].attrs['units'] = 'degrees'
air
[6]:
<xarray.DataArray 'air' (time: 2920, lat: 25, lon: 53)> Size: 31MB
[3869000 values with dtype=float64]
Coordinates:
  * lat      (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
  * lon      (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
  * time     (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
Attributes:
    long_name:     4xDaily Air temperature at sigma level 995
    units:         degK
    GRIB_name:     TMP
    var_desc:      Air temperature
    dataset:       NMC Reanalysis
    level_desc:    Surface
    statistic:     Individual Obs
    parent_stat:   Other
[7]:
air1d = air.isel(lat=10, lon=10)
pp.plot(air1d)
/home/runner/work/plopp/plopp/.tox/docs/lib/python3.10/site-packages/plopp/plotting/common.py:44: UserWarning: Input data contains some attributes which have been dropped during the conversion.
return sc.compat.from_xarray(obj)
[7]:
[8]:
air2d = air.isel(time=500)
pp.plot(air2d)
[8]:
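The two cells above reduce the three-dimensional array with isel, which selects by integer position. The sketch below shows the same reduction on a small synthetic array, so the resulting dimensions are easy to check without downloading the tutorial data; the array demo and its coordinate values are made up for illustration.

```python
import numpy as np
import xarray as xr

# Small synthetic stand-in for the tutorial dataset (values are random).
demo = xr.DataArray(
    np.random.rand(4, 3, 5),
    dims=("time", "lat", "lon"),
    coords={
        "time": np.arange(4),
        "lat": [70.0, 60.0, 50.0],
        "lon": [200.0, 210.0, 220.0, 230.0, 240.0],
    },
    name="demo",
)

# Fixing lat and lon by integer position leaves a 1d array along time.
demo1d = demo.isel(lat=1, lon=2)
# Fixing time leaves a 2d array over lat and lon.
demo2d = demo.isel(time=0)
print(demo1d.dims, demo2d.dims)
```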
Interactive tools#
While you can easily make these plots with Xarray itself, Plopp also provides additional tools to explore your data.
One example is the slicer plot, which can be used to navigate the additional dimensions of 1d or 2d data using an interactive slider.
[9]:
%matplotlib widget
pp.slicer(air)
[9]:
Another is the inspector plot, which allows you to pick points on the 2d map with the inspector tool and display a time cut in a second plot below:
[10]:
inspect_plot = pp.inspector(air, dim='time', operation='mean', orientation='vertical')
[12]:
inspect_plot
[12]: