Data reduction for POWGEN#
This notebook shows a basic reduction workflow for powder diffraction for the SNS POWGEN instrument. It serves mainly to develop and present routines for powder diffraction and will eventually be removed in favor of a workflow for DREAM at ESS.
Note that we load functions from external
modules. These modules will be removed when their ESS counterparts exist.
[1]:
import scipp as sc
import scippneutron as scn
import plopp as pp
import ess
from ess.diffraction import powder
from ess import diffraction
from ess.external import powgen
Initialize the logging system to write logs to a file powgen.log. This also displays a widget which shows log messages emitted by the functions that we call.
[2]:
ess.logging.configure_workflow('powgen_reduction', filename='powgen.log')
[2]:
<Logger scipp.ess.powgen_reduction (INFO)>
Load data#
Load the sample data.
Note: We get the file name from powgen.data. This module provides access to managed example files. In the real world, we would need to find the file name in a different way. But here, the data has been converted to a Scipp HDF5 file.
[3]:
sample_full = sc.io.load_hdf5(powgen.data.sample_file())
Downloading file 'PG3_4844_event.zip' from 'https://public.esss.dk/groups/scipp/ess/powgen/1/PG3_4844_event.zip' to '/home/runner/.cache/ess/powgen/1'.
Unzipping contents of '/home/runner/.cache/ess/powgen/1/PG3_4844_event.zip' to '/home/runner/.cache/ess/powgen/1/PG3_4844_event.zip.unzip'
[4]:
sample_full
[4]:
[Output: a DataGroup with entries 'data' (DataArray, dims (spectrum: 24794, tof: 1), binned event data) and 'detector_info' (Dataset, dim (detector: 24794)).]
The loaded data group contains some auxiliary detector info that we need later. The events are stored in the 'data' entry:
[5]:
sample = sample_full['data']
sample
[5]:
[Output: DataArray with dims (spectrum: 24794, tof: 1) holding binned event data (17 926 980 events with float32 weights in counts and a 'tof' event coordinate in µs). Coordinates: gd_prtn_chrg = 1171.95 µAh, per-spectrum detector positions, sample_position = [0, 0, 0] m, source_position = [0, 0, -60] m, spectrum = 1, …, 24794, and tof bin edges [19.0, 16694.3] µs.]
Inspect the raw data#
We can plot the data array to get an idea of its contents.
[6]:
sample.hist(spectrum=500, tof=400).plot()
[6]:
We can see how that data maps onto the detector by using POWGEN’s instrument view.
[7]:
scn.instrument_view(sample.hist())
[7]:
Filter out invalid events#
The file contains events that cannot be associated with a specific pulse. We can get a range of valid time-of-flight values from the instrument characterization file associated with the current run. There is currently no mechanism in scippneutron or ess to load such a file, as it is not clear if ESS will use this approach. The values used below are taken from PG3_characterization_2011_08_31-HR.txt, which is part of the sample files of Mantid. See, e.g., PowderDiffractionReduction.
We remove all events that have a time-of-flight value outside the valid range:
[8]:
sample = sample.bin(tof=sc.array(dims=['tof'], values=[0.0, 16666.67], unit='us'))
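The effect of this rebinning can be illustrated with plain NumPy (a toy sketch with a hypothetical `filter_events` helper, not the scipp implementation): events with a time-of-flight outside the valid window are dropped.

```python
import numpy as np

def filter_events(tof, weights, lo=0.0, hi=16666.67):
    """Keep only events whose time-of-flight (µs) lies in [lo, hi)."""
    mask = (tof >= lo) & (tof < hi)
    return tof[mask], weights[mask]

# Toy event list: the last event lies outside the valid window.
tof = np.array([10.0, 5000.0, 16000.0, 16700.0])  # µs
weights = np.ones_like(tof)                        # counts
tof_ok, w_ok = filter_events(tof, weights)
# events at 10, 5000 and 16000 µs survive
```

In the notebook itself, `bin` does this per spectrum while keeping the result in binned (event) form.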
Normalize by proton charge#
Next, we normalize the data by the proton charge.
[9]:
sample /= sample.coords['gd_prtn_chrg']
We can check the unit of the event weight to see that the data was indeed divided by a charge.
[10]:
sample.data.values[0].unit
[10]:
counts/uAh
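Conceptually, this normalization just divides every event weight by the scalar accumulated charge, turning the weight unit from counts into counts/µAh. A minimal NumPy sketch (toy weights; the charge is the gd_prtn_chrg value shown above):

```python
import numpy as np

weights = np.array([1.0, 1.0, 2.0])   # toy event weights, in counts
gd_prtn_chrg = 1171.953902925         # µAh, accumulated proton charge
normalized_weights = weights / gd_prtn_chrg  # unit becomes counts/µAh
```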
Compute d-spacing#
Here, we compute d-spacing using calibration parameters provided in an example file. First, we load the calibration parameters.
Note: ESS instruments will use a different, yet to be determined way of encoding calibration parameters.
[11]:
cal = sc.io.load_hdf5(powgen.data.calibration_file())
Downloading file 'PG3_FERNS_d4832_2011_08_24.zip' from 'https://public.esss.dk/groups/scipp/ess/powgen/1/PG3_FERNS_d4832_2011_08_24.zip' to '/home/runner/.cache/ess/powgen/1'.
Unzipping contents of '/home/runner/.cache/ess/powgen/1/PG3_FERNS_d4832_2011_08_24.zip' to '/home/runner/.cache/ess/powgen/1/PG3_FERNS_d4832_2011_08_24.zip.unzip'
The calibration is loaded with a ‘detector’ dimension. Compute the corresponding spectrum indices using the detector info loaded as part of the sample data.
[12]:
cal = powgen.beamline.map_detector_to_spectrum(
cal, detector_info=sample_full['detector_info']
)
[13]:
cal
[13]:
[Output: Dataset with dim (spectrum: 24794) holding the calibration parameters per spectrum: difc (≈ 29971.3 down to 7125.4, in µs/Å), difa (all zero, µs/Å²), tzero (all zero, µs), and a boolean mask (all False), along with the spectrum coordinate 1, …, 24794.]
Now we can compute d-spacing for the sample using the calibration parameters.
[14]:
sample_dspacing = powder.to_dspacing_with_calibration(sample, calibration=cal)
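The difc, difa, and tzero parameters follow the standard GSAS/Mantid calibration convention TOF = DIFA·d² + DIFC·d + TZERO. As an illustrative sketch (a hypothetical NumPy helper, not the scipp-based implementation), this relation can be inverted for d; with DIFA = 0, as in this calibration file, it is simply linear:

```python
import numpy as np

def tof_to_dspacing(tof, difc, tzero=0.0, difa=0.0):
    """Invert TOF = DIFA*d**2 + DIFC*d + TZERO for d-spacing (Å)."""
    if difa == 0.0:
        return (tof - tzero) / difc
    # positive root of DIFA*d**2 + DIFC*d + (TZERO - TOF) = 0
    return (-difc + np.sqrt(difc**2 - 4.0 * difa * (tzero - tof))) / (2.0 * difa)

# DIFC of the first spectrum of this calibration file
d = tof_to_dspacing(np.array([5000.0, 10000.0]), difc=29971.33683513)
```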
Vanadium correction#
Before we can process the d-spacing distribution further, we need to normalize the data by a vanadium measurement.
[15]:
vana_full = sc.io.load_hdf5(powgen.data.vanadium_file())
Downloading file 'PG3_4866_event.zip' from 'https://public.esss.dk/groups/scipp/ess/powgen/1/PG3_4866_event.zip' to '/home/runner/.cache/ess/powgen/1'.
Unzipping contents of '/home/runner/.cache/ess/powgen/1/PG3_4866_event.zip' to '/home/runner/.cache/ess/powgen/1/PG3_4866_event.zip.unzip'
[16]:
vana_full
[16]:
[Output: a DataGroup with entries 'data' (DataArray, dims (spectrum: 24794, tof: 1), binned event data) and 'proton_charge' (DataArray, dim (pulse_time: 1068931), float64 charge values in pC).]
[17]:
vana = vana_full['data']
vana
[17]:
[Output: DataArray with dims (spectrum: 24794, tof: 1) holding binned event data (65 836 865 events with float32 weights in counts and 'tof' (µs) and 'pulse_time' event coordinates). Coordinates: gd_prtn_chrg = 3777.92 µAh, per-spectrum detector positions, sample_position = [0, 0, 0] m, source_position = [0, 0, -60] m, spectrum = 1, …, 24794, and tof bin edges [19.0, 16697.7] µs.]
Now we process the vanadium data in a similar way to the sample data.
[18]:
vana /= vana.coords['gd_prtn_chrg']
Removing the variances of the Vanadium data#
Warning
Heybrock et al. (2023) have shown that Scipp’s uncertainty propagation is not suited for broadcast operations such as normalizing event counts by a scalar value, which is the case when normalizing by Vanadium counts. These operations are forbidden in recent versions of Scipp. Until an alternative method is found to satisfactorily track the variances in this workflow, we remove the variances in the Vanadium data. The issue is tracked here.
[19]:
vana.bins.constituents['data'].variances = None
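The reason dropping the variances underestimates the uncertainties can be seen from first-order error propagation for a ratio r = s/v: Var(r) ≈ Var(s)/v² + s²·Var(v)/v⁴. Removing the vanadium variances drops the second term. A small numerical sketch (toy Poisson counts, hypothetical helper):

```python
import numpy as np

def ratio_variance(s, var_s, v, var_v):
    """First-order error propagation for the ratio r = s / v."""
    return var_s / v**2 + s**2 * var_v / v**4

s, var_s = 100.0, 100.0   # sample counts with Poisson variance
v, var_v = 400.0, 400.0   # vanadium counts with Poisson variance
full = ratio_variance(s, var_s, v, var_v)
dropped = ratio_variance(s, var_s, v, 0.0)  # vanadium variances removed
# dropped < full: the reported uncertainty is too small
```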
Conversion to d-spacing#
Now, we compute d-spacing using the same calibration parameters as before.
[20]:
vana_dspacing = powder.to_dspacing_with_calibration(vana, calibration=cal)
[21]:
vana_dspacing
[21]:
[Output: DataArray with dims (spectrum: 24794, dspacing: 1) holding the binned events, whose weights are now in counts/µAh and which carry a 'dspacing' event coordinate in Å in addition to 'tof' and 'pulse_time'. Coordinates include the calibration parameters difa, difc, and tzero, a two-dimensional dspacing bin-edge coordinate (ranging from ≈ 6.3e-4 Å to ≈ 2.35 Å, varying per spectrum), the detector geometry, gd_prtn_chrg = 3777.92 µAh, and a per-spectrum boolean 'calibration' mask (all False).]
Inspect d-spacing#
We need to histogram the events in order to normalize our sample data by vanadium. For consistency, we use these bin edges for both vanadium and the sample data.
[22]:
d = vana_dspacing.coords['dspacing']
dspacing_edges = sc.linspace('dspacing', d.min().value, d.max().value, 200, unit=d.unit)
All spectra combined#
We start simple by combining all spectra using data.bins.concat('spectrum'). Then, we can normalize the sample data by vanadium to get a d-spacing distribution.
Note that, because we removed the variances of the vanadium data, the standard deviations of the result computed below are underestimated.
[23]:
all_spectra = diffraction.normalize_by_vanadium(
sample_dspacing.bins.concat('spectrum'),
vanadium=vana_dspacing.bins.concat('spectrum'),
edges=dspacing_edges,
)
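Conceptually, normalize_by_vanadium histograms both event lists on the common d-spacing edges and divides the sample histogram by the vanadium histogram. A toy NumPy sketch of that idea (hypothetical helper and synthetic events, not the ess implementation):

```python
import numpy as np

def normalize_by_vanadium(sample_d, vana_d, edges):
    """Histogram both event lists on common edges and take the ratio."""
    s_hist, _ = np.histogram(sample_d, bins=edges)
    v_hist, _ = np.histogram(vana_d, bins=edges)
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(v_hist > 0, s_hist / v_hist, 0.0)

rng = np.random.default_rng(0)
edges = np.linspace(0.5, 2.5, 11)          # d-spacing bin edges (Å)
sample_d = rng.uniform(0.5, 2.5, 1000)     # synthetic sample events
vana_d = rng.uniform(0.5, 2.5, 4000)       # synthetic vanadium events
ratio = normalize_by_vanadium(sample_d, vana_d, edges)
```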
[24]:
all_spectra.hist(dspacing=dspacing_edges).plot()
[24]:
Group into \(2\theta\) bins#
For a better resolution, we now group the sample and vanadium data into a number of bins in the scattering angle \(2\theta\) (see here) and normalize each individually.
[25]:
two_theta = sc.linspace(dim='two_theta', unit='deg', start=25.0, stop=90.0, num=16)
sample_by_two_theta = diffraction.group_by_two_theta(sample_dspacing, edges=two_theta)
vana_by_two_theta = diffraction.group_by_two_theta(vana_dspacing, edges=two_theta)
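The scattering angle 2θ used for the grouping is the angle between the incident beam direction (sample_position − source_position) and the scattered direction (detector position − sample_position). A NumPy sketch (hypothetical helper, using the geometry of this file: source at z = −60 m, sample at the origin):

```python
import numpy as np

def two_theta_deg(position, sample_pos, source_pos):
    """Angle between incident and scattered beam directions, in degrees."""
    incident = sample_pos - source_pos
    scattered = position - sample_pos
    cos = np.dot(incident, scattered) / (
        np.linalg.norm(incident) * np.linalg.norm(scattered)
    )
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

sample_pos = np.zeros(3)
source_pos = np.array([0.0, 0.0, -60.0])   # beam travels along +z
angle = two_theta_deg(np.array([1.0, 0.0, 1.0]), sample_pos, source_pos)
# detector at 45° from the beam axis
```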
[26]:
normalized = diffraction.normalize_by_vanadium(
sample_by_two_theta, vanadium=vana_by_two_theta, edges=dspacing_edges
)
Histogram the results in order to get a useful binning in the following plots.
[27]:
normalized = normalized.hist(dspacing=dspacing_edges)
Now we can inspect the d-spacing distribution as a function of \(2\theta\).
[28]:
normalized.plot()
[28]:
In order to get 1-dimensional plots, we can select some ranges of scattering angles.
[29]:
angle = sc.midpoints(normalized.coords['two_theta'])
results = {
    f'{round(angle[group].value, 3)} {angle.unit}': normalized['two_theta', group]
    for group in range(2, 6)
}
sc.plot(results)
[29]:
Or interactively by plotting with a 1d projection.
[30]:
%matplotlib widget
pp.superplot(normalized)
[30]: