scipp.io.csv.load_csv

scipp.io.csv.load_csv(filename, *, sep=',', data_columns=None, header_parser=None)

Load a CSV file as a dataset.

This function currently uses Pandas to load the file and converts the result into a scipp.Dataset. Pandas is an optional dependency of Scipp and is not installed automatically, so you need to install it yourself before calling this function.

load_csv is a convenience function for loading simple CSV files. If a file cannot be loaded with it, consider using Pandas directly: load the file into a data frame with pandas.read_csv() and convert the data frame into a dataset with scipp.compat.pandas_compat.from_pandas().
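The Pandas route can be sketched as follows. This is a minimal illustration and not part of Scipp: the helper name load_with_pandas and the sample text RAW are made up for this example, which assumes a file with leading '#' comment lines (the load_csv signature has no parameter for skipping them). Both libraries are imported inside the helper so that the optional Pandas dependency is only needed when it is called.

```python
from io import StringIO

# Hypothetical sample: two comment lines precede the header row.
RAW = """# exported by some instrument
# second comment line
a [m],b [s]
1,5
2,6
"""


def load_with_pandas(source, **read_csv_kwargs):
    """Read CSV data with pandas.read_csv, then convert it to a scipp.Dataset.

    Extra keyword arguments are forwarded to pandas.read_csv unchanged.
    """
    # Lazy imports: Pandas is an optional dependency of Scipp.
    import pandas as pd
    from scipp.compat.pandas_compat import from_pandas

    df = pd.read_csv(source, **read_csv_kwargs)
    # Parse headers such as 'a [m]' into a name and a unit.
    return from_pandas(df, header_parser='bracket')
```

Calling load_with_pandas(StringIO(RAW), comment='#') should then skip the comment lines and return a dataset with the data items a and b.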

Parameters:

filename (Union[str, PathLike, StringIO, BytesIO]) – Path or URL of the file to load, or a buffer to read from.
sep (Optional[str], default: ',') – Column separator. Automatically deduced if sep is None. See pandas.read_csv() for details.
data_columns (Union[str, Iterable[str], None], default: None) – Select which columns to assign as data. The rest are returned as coordinates. If None, all columns are assigned as data. Use an empty list to assign all columns as coordinates.
header_parser (Union[Literal['bracket'], Callable[[str], Tuple[str, Optional[Unit]]], None], default: None) – Parser for column headers. See scipp.compat.pandas_compat.from_pandas() for details.
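On separator deduction: the pandas.read_csv() documentation states that with sep=None Pandas uses its Python parsing engine and detects the separator with Python's csv.Sniffer. The sketch below only illustrates that kind of detection using the standard library. The function guess_separator and its candidate list are assumptions of this example, not what load_csv or Pandas run internally.

```python
import csv


def guess_separator(sample, candidates=",;\t|"):
    """Guess the column separator of CSV text using csv.Sniffer.

    `candidates` restricts the characters the sniffer may choose from.
    Raises csv.Error if no separator can be determined.
    """
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


# A semicolon-separated sample with a header row and two data rows.
separator = guess_separator("a;b;c\n1;2;3\n4;5;6\n")
```

A separator found this way could also be passed explicitly, for example sep=separator, instead of relying on sep=None.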

Returns:

Dataset – The loaded data as a dataset.

Examples

Given the following CSV ‘file’:

>>> from io import StringIO
>>> import scipp as sc
>>> csv_content = '''a [m],b [s],c
... 1,5,9
... 2,6,10
... 3,7,11
... 4,8,12'''

By default, the header text is kept verbatim as the column name and every column is dimensionless:

>>> sc.io.load_csv(StringIO(csv_content))
<scipp.Dataset>
Dimensions: Sizes[row:4, ]
Data:
  a [m]                       int64  [dimensionless]  (row)  [1, 2, 3, 4]
  b [s]                       int64  [dimensionless]  (row)  [5, 6, 7, 8]
  c                           int64  [dimensionless]  (row)  [9, 10, 11, 12]

In this example, the column headers encode units. They can be parsed into actual units:

>>> sc.io.load_csv(StringIO(csv_content), header_parser='bracket')
<scipp.Dataset>
Dimensions: Sizes[row:4, ]
Data:
  a                           int64              [m]  (row)  [1, 2, 3, 4]
  b                           int64              [s]  (row)  [5, 6, 7, 8]
  c                           int64        <no unit>  (row)  [9, 10, 11, 12]
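header_parser also accepts a callable that maps a header string to a (name, unit) pair, as its parameter type indicates. Below is a minimal sketch for headers written as 'name (unit)'. The function names and the parenthesis convention are assumptions of this example; see scipp.compat.pandas_compat.from_pandas() for the details a real parser should respect.

```python
import re

# Matches 'name (unit)'; the unit must be non-empty and contain no parentheses.
_PAREN_HEADER = re.compile(r"^(?P<name>.*?)\s*\((?P<unit>[^()]+)\)\s*$")


def split_paren_header(head):
    """Split 'name (unit)' into (name, unit_string).

    Headers without a trailing parenthesized unit are returned
    unchanged (apart from surrounding whitespace) with unit None.
    """
    match = _PAREN_HEADER.match(head)
    if match is None:
        return head.strip(), None
    return match.group("name").strip(), match.group("unit").strip()


def paren_header_parser(head):
    """Hypothetical header parser returning (name, scipp.Unit or None)."""
    # Imported lazily so that the splitting logic has no dependencies.
    import scipp as sc

    name, unit = split_paren_header(head)
    return name, (sc.Unit(unit) if unit is not None else None)
```

Such a parser could then be passed as header_parser=paren_header_parser when loading a file whose headers follow that convention.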

It is possible to select which columns are stored as data:

>>> sc.io.load_csv(
...     StringIO(csv_content),
...     header_parser='bracket',
...     data_columns='a',
... )
<scipp.Dataset>
Dimensions: Sizes[row:4, ]
Coordinates:
* b                           int64              [s]  (row)  [5, 6, 7, 8]
* c                           int64        <no unit>  (row)  [9, 10, 11, 12]
Data:
  a                           int64              [m]  (row)  [1, 2, 3, 4]