# Data reduction for POWGEN

This notebook shows a basic powder diffraction reduction workflow for the SNS [POWGEN](https://sns.gov/powgen) instrument.
It serves mainly to develop and present routines for powder diffraction and will eventually be removed in favor of a workflow for DREAM at ESS.

**Note** that we load functions from `external` modules.
These modules will be removed when their ESS counterparts exist.

In [None]:
import scipp as sc
import scippneutron as scn
import plopp as pp

import ess
from ess.diffraction import powder
from ess import diffraction
from ess.external import powgen

Initialize the logging system to write logs to a file `powgen.log`.
This also displays a widget which shows log messages emitted by the functions that we call.

In [None]:
ess.logging.configure_workflow('powgen_reduction', filename='powgen.log')

## Load data

Load the sample data.

**Note:** We get the file name from `powgen.data`.
This module provides access to managed example files.
In a real workflow, we would need to obtain the file name in a different way.
Here, the data has been converted to a [Scipp HDF5 file](https://scipp.github.io/user-guide/reading-and-writing-files.html#HDF5).

In [None]:
sample_full = sc.io.load_hdf5(powgen.data.sample_file())

In [None]:
sample_full

The loaded data group contains some auxiliary detector info that we need later.
The events are stored in the `'data'` entry:

In [None]:
sample = sample_full['data']
sample

## Inspect the raw data

We can plot the data array to get an idea of its contents.

In [None]:
sample.hist(spectrum=500, tof=400).plot()

We can see how that data maps onto the detector by using POWGEN's instrument view.

In [None]:
scn.instrument_view(sample.hist())

## Filter out invalid events

The file contains events that cannot be associated with a specific pulse.
We can get a range of valid time-of-flight values from the instrument characterization file associated with the current run.
There is currently no mechanism in `scippneutron` or `ess` to load such a file as it is not clear if ESS will use this approach.
The values used below are taken from `PG3_characterization_2011_08_31-HR.txt` which is part of the sample files of Mantid.
See, e.g., [PowderDiffractionReduction](https://www.mantidproject.org/PowderDiffractionReduction).

We remove all events that have a time-of-flight value outside the valid range:

In [None]:
sample = sample.bin(tof=sc.array(dims=['tof'], values=[0.0, 16666.67], unit='us'))
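
As a rough illustration of what binning into a single time-of-flight bin achieves, the following plain-Python sketch keeps only the events inside the valid range. It is not how Scipp implements `bin`; the event list and the helper `filter_tof` are hypothetical, and the half-open interval is an assumption of the sketch. The upper limit of 16666.67 µs corresponds to one period of a 60 Hz source.

```python
# Plain-Python illustration of range filtering; not the Scipp implementation.
TOF_MIN_US = 0.0
TOF_MAX_US = 16666.67  # one period of a 60 Hz source, in microseconds


def filter_tof(events, tof_min=TOF_MIN_US, tof_max=TOF_MAX_US):
    """Keep (tof, weight) pairs with tof_min <= tof < tof_max.

    The half-open interval is an assumption of this sketch.
    """
    return [(tof, w) for tof, w in events if tof_min <= tof < tof_max]


# Hypothetical events: (time-of-flight in microseconds, weight)
events = [(-5.0, 1.0), (120.5, 1.0), (16000.0, 1.0), (17500.0, 1.0)]
valid = filter_tof(events)
print(valid)
```

Events below the lower or at and above the upper limit are dropped, which mirrors the effect of the single-bin `bin` call on the event data.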

## Normalize by proton charge

Next, we normalize the data by the proton charge.

In [None]:
sample /= sample.coords['gd_prtn_chrg']

We can check the unit of the event weight to see that the data was indeed divided by a charge.

In [None]:
sample.data.values[0].unit

## Compute d-spacing

Here, we compute d-spacing using calibration parameters provided in an example file.
First, we load the calibration parameters.

**Note:** ESS instruments will use a different, yet to be determined way of encoding calibration parameters.

In [None]:
cal = sc.io.load_hdf5(powgen.data.calibration_file())

The calibration is loaded with a 'detector' dimension.
Compute the corresponding spectrum indices using the detector info loaded as part of the sample data.

In [None]:
cal = powgen.beamline.map_detector_to_spectrum(
    cal, detector_info=sample_full['detector_info']
)

In [None]:
cal

Now we can compute d-spacing for the sample using the calibration parameters.

In [None]:
sample_dspacing = powder.to_dspacing_with_calibration(sample, calibration=cal)
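
To make the conversion more concrete, here is a minimal sketch of a calibration-based conversion, assuming the common GSAS-style relation $t = \mathrm{DIFA}\,d^2 + \mathrm{DIFC}\,d + \mathrm{TZERO}$ between time-of-flight $t$ and d-spacing $d$. The function names and parameter values are hypothetical, and whether `to_dspacing_with_calibration` uses exactly this relation internally is an assumption.

```python
import math


def tof_from_dspacing(d, difc, difa=0.0, tzero=0.0):
    """Forward relation (assumed): tof = difa * d**2 + difc * d + tzero."""
    return difa * d**2 + difc * d + tzero


def dspacing_from_tof(tof, difc, difa=0.0, tzero=0.0):
    """Invert the assumed relation for d.

    For difa == 0 the relation is linear. Otherwise we solve the quadratic
    and pick the root that reduces to the linear solution as difa -> 0.
    """
    if difa == 0.0:
        return (tof - tzero) / difc
    discriminant = difc**2 + 4.0 * difa * (tof - tzero)
    return (-difc + math.sqrt(discriminant)) / (2.0 * difa)


# Hypothetical calibration values, chosen only for illustration.
tof = tof_from_dspacing(1.5, difc=5000.0, difa=2.0, tzero=10.0)
d = dspacing_from_tof(tof, difc=5000.0, difa=2.0, tzero=10.0)
print(tof, d)
```

The round trip recovers the input d-spacing, which is a quick consistency check for any such conversion.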

## Vanadium correction

Before we can process the d-spacing distribution further, we need to normalize the data by a vanadium measurement.

In [None]:
vana_full = sc.io.load_hdf5(powgen.data.vanadium_file())

In [None]:
vana_full

In [None]:
vana = vana_full['data']
vana

Now we process the vanadium data in a similar way as the sample data.

In [None]:
vana /= vana.coords['gd_prtn_chrg']

### Removing the variances of the Vanadium data

<div class="alert alert-warning">

**Warning**
    
Heybrock et al. (2023) have shown that Scipp's uncertainty propagation is not suited for broadcast operations
such as normalizing event counts by a scalar value, which is the case when normalizing by Vanadium counts.
These operations are forbidden in recent versions of Scipp.
Until an alternative method is found to satisfactorily track the variances in this workflow,
we remove the variances in the Vanadium data.
The issue is tracked [here](https://github.com/scipp/ess/issues/171).

</div>

In [None]:
vana.bins.constituents['data'].variances = None

### Conversion to d-spacing

Now, we compute d-spacing using the same calibration parameters as before.

In [None]:
vana_dspacing = powder.to_dspacing_with_calibration(vana, calibration=cal)

In [None]:
vana_dspacing

## Inspect d-spacing

We need to histogram the events in order to normalize our sample data by vanadium.
For consistency, we use the same bin edges for both the vanadium and the sample data.

In [None]:
d = vana_dspacing.coords['dspacing']
dspacing_edges = sc.linspace('dspacing', d.min().value, d.max().value, 200, unit=d.unit)
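
Note that the `200` above is the number of bin edges, so the resulting histogram has 199 bins. The following plain-Python sketch illustrates this with a hypothetical d-spacing range; the helper `linspace` is defined here only for illustration.

```python
def linspace(start, stop, num):
    """Return num evenly spaced values from start to stop, both included."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


# Hypothetical d-spacing range; the real range comes from the vanadium data.
edges = linspace(0.0, 2.0, 200)
n_bins = len(edges) - 1
print(len(edges), n_bins)
```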

### All spectra combined

We start simple by combining all spectra using `data.bins.concat('spectrum')`.
Then, we can normalize the sample data by vanadium to get a d-spacing distribution.

**Note that because we removed the variances on the Vanadium data, after the following cell, the standard deviations on the result are underestimated.**

In [None]:
all_spectra = diffraction.normalize_by_vanadium(
    sample_dspacing.bins.concat('spectrum'),
    vanadium=vana_dspacing.bins.concat('spectrum'),
    edges=dspacing_edges,
)
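
One plausible way to picture this normalization is sketched below in plain Python: the vanadium events are histogrammed with the given edges, and each sample event weight is divided by the vanadium count of the bin it falls into. This is an assumption about the behavior, not the code of `normalize_by_vanadium`; the helpers and event lists are hypothetical.

```python
from bisect import bisect_right


def histogram(events, edges):
    """Sum event weights into bins defined by sorted edges (half-open bins)."""
    counts = [0.0] * (len(edges) - 1)
    for d, weight in events:
        if edges[0] <= d < edges[-1]:
            counts[bisect_right(edges, d) - 1] += weight
    return counts


def normalize_events(sample_events, vana_events, edges):
    """Divide each sample weight by the vanadium count in the matching bin."""
    norm = histogram(vana_events, edges)
    out = []
    for d, weight in sample_events:
        if edges[0] <= d < edges[-1]:
            count = norm[bisect_right(edges, d) - 1]
            # Empty vanadium bins yield NaN instead of raising an error.
            scale = count if count != 0.0 else float('nan')
            out.append((d, weight / scale))
    return out


# Hypothetical events: (d-spacing, weight)
edges = [0.0, 1.0, 2.0]
vana_events = [(0.5, 1.0), (0.6, 1.0), (1.1, 1.0), (1.2, 1.0), (1.5, 1.0), (1.9, 1.0)]
sample_events = [(0.4, 1.0), (1.7, 1.0)]
result = normalize_events(sample_events, vana_events, edges)
print(result)
```

The result is still a list of events, which is why a histogramming step is needed before plotting.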

In [None]:
all_spectra.hist(dspacing=dspacing_edges).plot()

### Group into $2\theta$ bins

For better resolution, we now group the sample and vanadium data into a number of bins in the scattering angle $2\theta$ (see [here](https://scipp.github.io/scippneutron/user-guide/coordinate-transformations.html))
and normalize each group individually.

In [None]:
two_theta = sc.linspace(dim='two_theta', unit='deg', start=25.0, stop=90.0, num=16)
sample_by_two_theta = diffraction.group_by_two_theta(sample_dspacing, edges=two_theta)
vana_by_two_theta = diffraction.group_by_two_theta(vana_dspacing, edges=two_theta)
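
Conceptually, the grouping assigns every spectrum to an angle bin based on its scattering angle and merges the events of all spectra in the same bin. The plain-Python sketch below illustrates the idea with hypothetical spectra; it is not the implementation of `group_by_two_theta`, and the helper `group_by_angle` is made up for this example.

```python
from bisect import bisect_right


def group_by_angle(spectra, edges):
    """Merge the events of all spectra whose angle falls into the same bin.

    spectra: list of (two_theta, events) pairs.
    edges: sorted bin edges in the same unit as two_theta.
    """
    groups = [[] for _ in range(len(edges) - 1)]
    for two_theta, events in spectra:
        if edges[0] <= two_theta < edges[-1]:
            groups[bisect_right(edges, two_theta) - 1].extend(events)
    return groups


# 16 edges from 25 to 90 degrees give 15 angle bins, as in the cell above.
edges = [25.0 + i * (65.0 / 15) for i in range(16)]
# Hypothetical spectra: (two_theta in degrees, list of (d-spacing, weight) events)
spectra = [
    (30.0, [(1.1, 1.0), (1.3, 1.0)]),
    (31.0, [(0.9, 1.0)]),
    (60.0, [(1.6, 1.0)]),
    (95.0, [(2.0, 1.0)]),  # outside the angle range, dropped
]
groups = group_by_angle(spectra, edges)
print([len(g) for g in groups])
```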

In [None]:
normalized = diffraction.normalize_by_vanadium(
    sample_by_two_theta, vanadium=vana_by_two_theta, edges=dspacing_edges
)

Histogram the results in order to get a useful binning in the following plots.

In [None]:
normalized = normalized.hist(dspacing=dspacing_edges)

Now we can inspect the d-spacing distribution as a function of $2\theta$.

In [None]:
normalized.plot()

In order to get 1-dimensional plots, we can select some ranges of scattering angles.

In [None]:
angle = sc.midpoints(normalized.coords['two_theta']).values
results = {
    f'{round(angle[group], 3)} rad': normalized['two_theta', group]
    for group in range(2, 6)
}
sc.plot(results)

Alternatively, we can explore the data interactively by plotting with a 1d projection.

In [None]:
%matplotlib widget
pp.superplot(normalized)