McStas Workflow - Chunk by Chunk#
In this example, we will process McStas events chunk by chunk, panel by panel.
Build Base Workflow#
[1]:
from ess.nmx.mcstas import NMXMcStasWorkflow
from ess.nmx.data import get_small_mcstas
from ess.nmx.mcstas.types import *
wf = NMXMcStasWorkflow()
# Replace with the path to your own file
wf[FilePath] = get_small_mcstas()
wf[MaximumCounts] = 10_000
wf[TimeBinSteps] = 50
wf.visualize(NMXReducedDataGroup, graph_attr={"rankdir": "TD"}, compact=True)
[1]:
Compute Raw Data Metadata#
time-of-flight coordinate and McStasWeight2CountScaleFactor should not be different from chunk to chunk.
Therefore we need to compute TimeBinStep and McStasWeight2CoutScaleFactor before we compute NMXReducedData.
It can be done by ess.reduce.streaming.StreamProcessor.
In this example, MinimumTimeOfArrival, MaximumTimeOfArrival and MaximumProbability will be renewed every time a chunk is added to the streaming processor.
(Min/Max)Accumulator remembers the previous minimum/maximum value and compute new minimum/maximum value with the new chunk.
raw_event_data_chunk_generator yields a chunk of raw event probability from mcstas h5 file.
This example below process the data chunk by chunk with size: CHUNK_SIZE.
[2]:
from functools import partial
from ess.reduce.streaming import StreamProcessor, MaxAccumulator, MinAccumulator
# Stream processor building helper
scalefactor_stream_processor = partial(
StreamProcessor,
dynamic_keys=(RawEventProbability,),
target_keys=(NMXRawDataMetadata,),
accumulators={
MaximumProbability: MaxAccumulator,
MaximumTimeOfArrival: MaxAccumulator,
MinimumTimeOfArrival: MinAccumulator,
},
)
metadata_wf = wf.copy()
[3]:
metadata_wf.visualize(NMXRawDataMetadata, graph_attr={"rankdir": "TD"}, compact=True)
[3]:
[4]:
from ess.nmx.mcstas.load import (
raw_event_data_chunk_generator,
mcstas_weight_to_probability_scalefactor,
)
from ess.nmx.mcstas.streaming import calculate_number_of_chunks
from ipywidgets import IntProgress
CHUNK_SIZE = 10 # Number of event rows to process at once
# Increase this number to speed up the processing
NUM_DETECTORS = 3
# Loop over the detectors
file_path = metadata_wf.compute(FilePath)
raw_data_metadatas = {}
for detector_i in range(0, NUM_DETECTORS):
temp_wf = metadata_wf.copy()
temp_wf[DetectorIndex] = detector_i
detector_name = temp_wf.compute(DetectorName)
max_chunk_id = calculate_number_of_chunks(
temp_wf.compute(FilePath), detector_name=detector_name, chunk_size=CHUNK_SIZE
)
cur_detector_progress_bar = IntProgress(
min=0, max=max_chunk_id, description=f"Detector {detector_i}"
)
display(cur_detector_progress_bar)
# Build the stream processor
processor = scalefactor_stream_processor(temp_wf)
for da in raw_event_data_chunk_generator(
file_path=file_path, detector_name=detector_name, chunk_size=CHUNK_SIZE
):
if any(da.sizes.values()) == 0:
continue
else:
results = processor.add_chunk({RawEventProbability: da})
cur_detector_progress_bar.value += 1
display(results[NMXRawDataMetadata])
raw_data_metadatas[detector_i] = results[NMXRawDataMetadata]
# We take the min/maximum values of the scale factor
min_toa = min(meta.min_toa for meta in raw_data_metadatas.values())
max_toa = max(meta.max_toa for meta in raw_data_metadatas.values())
max_probability = max(meta.max_probability for meta in raw_data_metadatas.values())
toa_bin_edges = sc.linspace(dim='t', start=min_toa, stop=max_toa, num=51)
scale_factor = mcstas_weight_to_probability_scalefactor(
max_counts=wf.compute(MaximumCounts), max_probability=max_probability
)
/home/runner/work/essnmx/essnmx/.tox/docs/lib/python3.11/site-packages/ess/nmx/mcstas/load.py:188: UserWarning: The chunk size may be too small < 10_000_000.
Consider increasing the chunk size for better performance.
Hint: NMX typically expect ~10^8 bins as reduced data.
_check_chunk_size(chunk_size)
NMXRawDataMetadata(max_probability=<scipp.Variable> () float64 [counts] 14.665, min_toa=<scipp.Variable> () float64 [s] 0.0972552, max_toa=<scipp.Variable> () float64 [s] 0.142863)
NMXRawDataMetadata(max_probability=<scipp.Variable> () float64 [counts] 14.1429, min_toa=<scipp.Variable> () float64 [s] 0.0993471, max_toa=<scipp.Variable> () float64 [s] 0.143276)
NMXRawDataMetadata(max_probability=<scipp.Variable> () float64 [counts] 14.9342, min_toa=<scipp.Variable> () float64 [s] 0.0993372, max_toa=<scipp.Variable> () float64 [s] 0.145276)
Compute Metadata#
Other metadata does not require any chunk-based computation.
Therefore we export the metadata first and append detector data later.
Compute Final Output#
Now with all the metadata, we can compute the final output chunk by chunk.
We will also compute static parameters in advance so that stream processor does not compute them every time another chunk is added.
We will as well export the reduced data detector by detector.
[5]:
from ess.nmx.mcstas.xml import McStasInstrument
final_wf = wf.copy()
# Set the scale factor and time bin edges
final_wf[McStasWeight2CountScaleFactor] = scale_factor
final_wf[TimeBinSteps] = toa_bin_edges
# Set the crystal rotation manually for now ...
final_wf[CrystalRotation] = sc.vector([0, 0, 0.0], unit='deg')
# Set static info
final_wf[McStasInstrument] = wf.compute(McStasInstrument)
final_wf.visualize(NMXReducedDataGroup, compact=True)
[5]:
[6]:
from ess.nmx.mcstas.nexus import NXLauetofWriter
def temp_generator(file_path, detector_name):
max_chunk_id = calculate_number_of_chunks(
file_path, detector_name=detector_name, chunk_size=CHUNK_SIZE
)
cur_detector_progress_bar = IntProgress(
min=0, max=max_chunk_id, description=f"Detector {detector_i}"
)
display(cur_detector_progress_bar)
for da in raw_event_data_chunk_generator(
file_path=file_path, detector_name=detector_name, chunk_size=CHUNK_SIZE
):
yield da
cur_detector_progress_bar.value += 1
# When a panel is added to the writer,
# the writer will start processing the data from the generator
# and store the results in memory
# The writer will then write the data to the file
# when ``save`` is called
writer = NXLauetofWriter(
chunk_generator=temp_generator,
chunk_insert_key=RawEventProbability,
workflow=final_wf,
output_filename="test.h5",
overwrite=True,
extra_meta={"McStasWeight2CountScaleFactor": scale_factor},
)
[7]:
for detector_i in range(3):
display(writer.add_panel(detector_id=detector_i))
/home/runner/work/essnmx/essnmx/.tox/docs/lib/python3.11/site-packages/ess/nmx/mcstas/load.py:188: UserWarning: The chunk size may be too small < 10_000_000.
Consider increasing the chunk size for better performance.
Hint: NMX typically expect ~10^8 bins as reduced data.
_check_chunk_size(chunk_size)
- countsscippDataArray(id: 1638400, t: 50)float64counts0.0, 0.0, ..., 0.0, 0.0
- proton_chargescippVariable()float64counts10.316200911169643
- crystal_rotationscippVariable()vector3deg[0. 0. 0.]
- sample_positionscippVariable()vector3m[0. 0. 0.]
- source_positionscippVariable()vector3m[ -0.53123 0. -157.405 ]
- sample_namescippVariable()stringsampleMantid
- fast_axisscippVariable()vector3𝟙[ 0.999986 0. -0.00529148]
- slow_axisscippVariable()vector3𝟙[0. 1. 0.]
- origin_positionscippVariable()vector3m[-0.248454 -0.25 0.292 ]
- positionscippVariable(id: 1638400)vector3m[-0.248454 -0.25 0.292 ], [-0.24805401 -0.25 0.29199788], ..., [0.26273884 0.2616 0.28929499], [0.26313884 0.2616 0.28929288]
- detector_shapescippVariable()PyObject(1280, 1280)
- x_pixel_sizescippVariable()float64m0.0004
- y_pixel_sizescippVariable()float64m0.0004
- detector_namescippVariable()stringnD_Mantid_0
- countsscippDataArray(id: 1638400, t: 50)float64counts0.0, 0.0, ..., 0.0, 0.0
- proton_chargescippVariable()float64counts8.939925637100606
- crystal_rotationscippVariable()vector3deg[0. 0. 0.]
- sample_positionscippVariable()vector3m[0. 0. 0.]
- source_positionscippVariable()vector3m[ -0.53123 0. -157.405 ]
- sample_namescippVariable()stringsampleMantid
- fast_axisscippVariable()vector3𝟙[-0.00531614 0. -0.99998587]
- slow_axisscippVariable()vector3𝟙[0. 1. 0.]
- origin_positionscippVariable()vector3m[-0.288666 -0.25 0.252 ]
- positionscippVariable(id: 1638400)vector3m[-0.288666 -0.25 0.252 ], [-0.28866813 -0.25 0.25160001], ..., [-0.29138361 0.2616 -0.25919278], [-0.29138574 0.2616 -0.25959277]
- detector_shapescippVariable()PyObject(1280, 1280)
- x_pixel_sizescippVariable()float64m0.0004
- y_pixel_sizescippVariable()float64m0.0004
- detector_namescippVariable()stringnD_Mantid_1
- countsscippDataArray(id: 1638400, t: 50)float64counts0.0, 0.0, ..., 0.0, 0.0
- proton_chargescippVariable()float64counts9.322607619985897
- crystal_rotationscippVariable()vector3deg[0. 0. 0.]
- sample_positionscippVariable()vector3m[0. 0. 0.]
- source_positionscippVariable()vector3m[ -0.53123 0. -157.405 ]
- sample_namescippVariable()stringsampleMantid
- fast_axisscippVariable()vector3𝟙[0.00531614 0. 0.99998587]
- slow_axisscippVariable()vector3𝟙[0. 1. 0.]
- origin_positionscippVariable()vector3m[ 0.288667 -0.25 -0.251 ]
- positionscippVariable(id: 1638400)vector3m[ 0.288667 -0.25 -0.251 ], [ 0.28866913 -0.25 -0.25060001], ..., [0.29138461 0.2616 0.26019278], [0.29138674 0.2616 0.26059277]
- detector_shapescippVariable()PyObject(1280, 1280)
- x_pixel_sizescippVariable()float64m0.0004
- y_pixel_sizescippVariable()float64m0.0004
- detector_namescippVariable()stringnD_Mantid_2