Data types

In most cases, the data type (dtype) of a Variable is derived from the data. For instance, when a NumPy array is passed to Scipp, Scipp uses the dtype provided by NumPy:

[1]:
import numpy as np
import scipp as sc

var = sc.Variable(dims=['x'], values=np.arange(4.0))
var.dtype
[1]:
DType('float64')
[2]:
var = sc.Variable(dims=['x'], values=np.arange(4))
var.dtype
[2]:
DType('int64')

The dtype may also be specified using a keyword argument to sc.Variable and most creation functions. It is possible to use Scipp’s own scipp.DType, numpy.dtype, or (where a NumPy equivalent exists) a string:

[3]:
var = sc.zeros(dims=['x'], shape=[2], dtype=sc.DType.float32)
var.dtype
[3]:
DType('float32')
[4]:
var = sc.zeros(dims=['x'], shape=[2], dtype=np.dtype(np.float32))
var.dtype
[4]:
DType('float32')
[5]:
var = sc.zeros(dims=['x'], shape=[2], dtype='float32')
var.dtype
[5]:
DType('float32')

Scipp supports common dtypes like

  • float32, float64

  • int32, int64

  • bool

  • string

  • datetime64

It is also possible to nest Variables, DataArrays, or Datasets inside of Variables. This is useful for storing attributes in DataArrays and Datasets, but interoperability with NumPy is limited in those cases.

[6]:
var = sc.scalar(sc.zeros(dims=['x'], shape=[2], dtype='float64'))
var
[6]:
scipp.Variable (528 Bytes)
    • ()
      Variable
      <scipp.Variable> (x: 2) float64 [dimensionless] [0, 0]
      Values:
      <scipp.Variable> (x: 2) float64 [dimensionless] [0, 0]

A full list of supported dtypes can be found in the documentation of the scipp.DType class.

Dates and Times

Scipp has a special dtype for time-points, sc.DType.datetime64. Variables can be constructed using scipp.datetime and scipp.datetimes:

[7]:
sc.datetime('2022-01-10T14:31:21')
[7]:
scipp.Variable (264 Bytes)
    • ()
      datetime64
      s
      2022-01-10T14:31:21
      Values:
      array('2022-01-10T14:31:21', dtype='datetime64[s]')
[8]:
sc.datetimes(dims=['t'], values=['2022-01-10T14:31:21', '2022-01-11T11:09:05'])
[8]:
scipp.Variable (272 Bytes)
    • (t: 2)
      datetime64
      s
      2022-01-10T14:31:21, 2022-01-11T11:09:05
      Values:
      array(['2022-01-10T14:31:21', '2022-01-11T11:09:05'], dtype='datetime64[s]')

Datetimes can also be constructed from integers that encode the time elapsed since the Scipp epoch, which is equal to the Unix epoch (1970-01-01T00:00:00). The unit argument determines the time scale that the integers represent.

[9]:
sc.datetime(0, unit='s')
[9]:
scipp.Variable (264 Bytes)
    • ()
      datetime64
      s
      1970-01-01T00:00:00
      Values:
      array('1970-01-01T00:00:00', dtype='datetime64[s]')
[10]:
sc.datetimes(dims=['t'], values=[123456789, 345678912], unit='us')
[10]:
scipp.Variable (272 Bytes)
    • (t: 2)
      datetime64
      µs
      1970-01-01T00:02:03.456789, 1970-01-01T00:05:45.678912
      Values:
      array(['1970-01-01T00:02:03.456789', '1970-01-01T00:05:45.678912'], dtype='datetime64[us]')

As a shorthand, scipp.epoch can be used to get a scalar containing Scipp’s epoch:

[11]:
sc.epoch(unit='s')
[11]:
scipp.Variable (264 Bytes)
    • ()
      datetime64
      s
      1970-01-01T00:00:00
      Values:
      array('1970-01-01T00:00:00', dtype='datetime64[s]')

The other creation functions also work with datetimes when the datetime64 dtype is specified explicitly. However, only integer inputs and NumPy arrays of numpy.datetime64 can be used in those cases.

[12]:
sc.scalar(value=24, unit='h', dtype=sc.DType.datetime64)
[12]:
scipp.Variable (264 Bytes)
    • ()
      datetime64
      h
      1970-01-02T00
      Values:
      array('1970-01-02T00', dtype='datetime64[h]')
[13]:
var = sc.scalar(value=681794055, unit=sc.units.s, dtype='datetime64')
var
[13]:
scipp.Variable (264 Bytes)
    • ()
      datetime64
      s
      1991-08-10T03:14:15
      Values:
      array('1991-08-10T03:14:15', dtype='datetime64[s]')

Scipp’s datetime variables can interoperate with numpy.datetime64 and arrays thereof:

[14]:
var.value
[14]:
numpy.datetime64('1991-08-10T03:14:15')
[15]:
sc.scalar(value=np.datetime64('now'))
[15]:
scipp.Variable (264 Bytes)
    • ()
      datetime64
      s
      2022-11-25T14:18:17
      Values:
      array('2022-11-25T14:18:17', dtype='datetime64[s]')

Or more succinctly:

[16]:
sc.datetime('now')
[16]:
scipp.Variable (264 Bytes)
    • ()
      datetime64
      s
      2022-11-25T14:18:17
      Values:
      array('2022-11-25T14:18:17', dtype='datetime64[s]')

Note that 'now' implies unit s even though we did not specify it. The unit was deduced from the numpy.datetime64 object which encodes a unit of its own.

Operations

Variables containing datetimes only support a limited set of operations as it makes no sense to, for instance, add two time points. In contrast to NumPy, Scipp does not have a separate type for time differences. Those are simply encoded by integer Variables with a temporal unit.

[17]:
a = sc.datetime('2021-03-14T00:00:00', unit='ms')
b = sc.datetime('2000-01-01T00:00:00', unit='ms')
a - b
[17]:
scipp.Variable (264 Bytes)
    • ()
      int64
      ms
      668995200000
      Values:
      array(668995200000)
[18]:
a + b
---------------------------------------------------------------------------
DTypeError                                Traceback (most recent call last)
Cell In [18], line 1
----> 1 a + b

DTypeError: 'add' does not support dtypes 'datetime64', 'datetime64',
[19]:
a + sc.scalar(value=123, unit='ms')
[19]:
scipp.Variable (264 Bytes)
    • ()
      datetime64
      ms
      2021-03-14T00:00:00.123
      Values:
      array('2021-03-14T00:00:00.123', dtype='datetime64[ms]')

Time zones

Scipp does not support manual handling of time zones. All datetime objects are assumed to be in UTC. Scipp does not look at your local time zone, so the following always produces 12:30 on 2021-09-03 UTC, no matter where you are when you run this code:

[20]:
sc.scalar(value=np.datetime64('2021-09-03T12:30:00'))
[20]:
scipp.Variable (264 Bytes)
    • ()
      datetime64
      s
      2021-09-03T12:30:00
      Values:
      array('2021-09-03T12:30:00', dtype='datetime64[s]')