scipp.io.csv.load_csv
scipp.io.csv.load_csv(filename, *, sep=',', data_columns=None, header_parser=None)
Load a CSV file as a dataset.
This function currently uses Pandas to load the file and converts the result into a scipp.Dataset. Pandas is not a hard dependency of Scipp and is therefore not installed automatically, so you need to install it manually.
load_csv exists to conveniently load simple CSV files. If a file cannot be loaded with load_csv, consider using Pandas directly. For example, use pandas.read_csv() to load the file into a data frame and scipp.compat.pandas_compat.from_pandas() to convert the data frame into a dataset.
- Parameters:
filename (str | PathLike[str] | StringIO | BytesIO) – Path or URL of file to load or buffer to load from.
sep (str | None, default: ',') – Column separator. Automatically deduced if sep is None. See pandas.read_csv() for details.
data_columns (Union[str, Iterable[str], None], default: None) – Select which columns to assign as data. The rest are returned as coordinates. If None, all columns are assigned as data. Use an empty list to assign all columns as coordinates.
header_parser (Union[Literal['bracket'], Callable[[str], tuple[str, Unit | None]], None], default: None) – Parser for column headers. See scipp.compat.pandas_compat.from_pandas() for details.
- Returns:
Dataset – The loaded data as a dataset.
Examples
Given the following CSV ‘file’:
>>> from io import StringIO
>>> import scipp as sc
>>> csv_content = '''a [m],b [s],c
... 1,5,9
... 2,6,10
... 3,7,11
... 4,8,12'''
By default, it will be loaded as
>>> sc.io.load_csv(StringIO(csv_content))
<scipp.Dataset>
Dimensions: Sizes[row:4, ]
Data:
  a [m]                       int64  [dimensionless]  (row)  [1, 2, 3, 4]
  b [s]                       int64  [dimensionless]  (row)  [5, 6, 7, 8]
  c                           int64  [dimensionless]  (row)  [9, 10, 11, 12]
In this example, the column headers encode units. They can be parsed into actual units:
>>> sc.io.load_csv(StringIO(csv_content), header_parser='bracket')
<scipp.Dataset>
Dimensions: Sizes[row:4, ]
Data:
  a                           int64              [m]  (row)  [1, 2, 3, 4]
  b                           int64              [s]  (row)  [5, 6, 7, 8]
  c                           int64        <no unit>  (row)  [9, 10, 11, 12]
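Besides 'bracket', header_parser also accepts a callable that maps a header string to a (name, unit) pair. The following is a minimal sketch of the string-splitting part of such a callable. It assumes a hypothetical header convention with units in parentheses, such as a (m). The names split_paren_header and _PAREN_HEADER are illustrative and not part of Scipp, and the sketch does not reproduce how Scipp implements 'bracket'.

```python
import re
from typing import Optional, Tuple

# Hypothetical convention: "name (unit)", e.g. "a (m)". Not Scipp's 'bracket' format.
_PAREN_HEADER = re.compile(r"^(?P<name>.*?)\s*\((?P<unit>[^()]*)\)\s*$")


def split_paren_header(header: str) -> Tuple[str, Optional[str]]:
    """Split a column header into a name and a unit string (or None)."""
    match = _PAREN_HEADER.match(header)
    if match is None:
        # No trailing "(...)" group: treat the whole header as the name.
        return header.strip(), None
    unit = match["unit"].strip()
    # An empty "()" is treated as "no unit".
    return match["name"].strip(), unit or None


print(split_paren_header("a (m)"))
print(split_paren_header("c"))
```

To use this as a header_parser, the unit string would still need to be wrapped in a scipp.Unit, because the parameter expects a callable returning tuple[str, Unit | None]. That wrapping step is left out so the sketch runs without Scipp installed.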
It is possible to select which columns are stored as data:
>>> sc.io.load_csv(
...     StringIO(csv_content),
...     header_parser='bracket',
...     data_columns='a',
... )
<scipp.Dataset>
Dimensions: Sizes[row:4, ]
Coordinates:
* b                           int64              [s]  (row)  [5, 6, 7, 8]
* c                           int64        <no unit>  (row)  [9, 10, 11, 12]
Data:
  a                           int64              [m]  (row)  [1, 2, 3, 4]
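When a file needs options that load_csv does not expose, the description above suggests calling pandas.read_csv() and then scipp.compat.pandas_compat.from_pandas(). A rough sketch of that route follows. The helper name load_csv_via_pandas and the sample file with a leading comment line are assumptions made for illustration. The sketch also assumes that the installed version of from_pandas accepts header_parser, as the parameter description states. Both packages are imported inside the helper because Pandas is an optional dependency.

```python
from io import StringIO


def load_csv_via_pandas(source, **read_csv_kwargs):
    """Hypothetical helper: read with pandas, then convert to a scipp.Dataset."""
    try:
        import pandas as pd
        from scipp.compat.pandas_compat import from_pandas
    except ImportError as err:
        raise ImportError("This sketch needs both pandas and scipp installed") from err
    # Any pandas.read_csv option can be passed through here, not only `sep`.
    frame = pd.read_csv(source, **read_csv_kwargs)
    # Convert the data frame, parsing units from headers such as "a [m]".
    return from_pandas(frame, header_parser="bracket")


# Illustrative file with a comment line, which pandas skips via comment="#".
csv_with_comments = """# produced by a hypothetical instrument
a [m],b [s],c
1,5,9
2,6,10
"""

try:
    dataset = load_csv_via_pandas(StringIO(csv_with_comments), comment="#")
except ImportError:
    dataset = None  # pandas or scipp is not available in this environment
```

Forwarding keyword arguments keeps the helper small while leaving every pandas.read_csv option available to the caller.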