Solid-state hybrid detectors with hexagonal sampling.
Documentation: https://lucabaldini.github.io/hexsample/
License: GNU General Public License v3.0
Now that we have all the necessary metadata, we can simulate with a generic geometry and pick it back up automatically at the recon stage.
At this point we do have a concept of trigger threshold, but most likely we are not doing the right thing when an event should not actually cause a trigger.
In readout.py, all the read() functions should handle the condition where a given event does not cause a trigger.
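A minimal sketch of what this could look like (the exact signature and the sample() helper are assumptions, not the actual hexsample API): read() returns None when nothing fires the trigger, and the event loop in hxsim.py simply skips those events.

def read(self, timestamp, x, y, trg_threshold, *args, **kwargs):
    """Readout hook returning None for non-triggering events (sketch only)."""
    signal, min_col, min_row = self.sample(x, y)
    # Sum the sampled signal into the 2 x 2 trigger miniclusters, as in trigger().
    if self.sum_miniclusters(signal).max() < trg_threshold:
        # No minicluster above threshold: no trigger, no DigiEvent.
        return None
    ...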
At this point hxsim runs at something between 200 and 1000 events per second, depending on the machine and on the particular setup. This implies roughly O(1 hour) for 1 M events, which is not outrageous, but still kind of annoying. It would be nice to speed things up by at least a factor of a few. A few possible lines of action might be:
I am sure there are others.
This has already been started on the analysis branch, and we can use this issue to discuss the path forward.
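In any case, before touching the code, it would probably pay off to profile a short run and see where the time actually goes; a minimal sketch with the standard cProfile module (the run_simulation() entry point is just a placeholder for however hxsim is actually launched):

import cProfile
import pstats

# Profile a short run and print the 20 most expensive calls by cumulative time.
cProfile.run('run_simulation()', 'hxsim.prof')
pstats.Stats('hxsim.prof').sort_stats('cumtime').print_stats(20)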
We should keep track of all the simulation settings in the simulation output, and carry this over to the recon files, adding the configuration used for the reconstruction.
Here is a link to a repo that we might be able to use for inspiration:
https://github.com/lmassach/tj-monopix2-daq
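A minimal sketch of the bookkeeping, assuming the output files are HDF5 files written through PyTables (the attribute layout is purely illustrative):

import tables

def write_config(file_path, **kwargs):
    """Copy all the simulation settings into the root-node attributes of the
    output file, so that they can be read back at the recon stage (sketch only)."""
    with tables.open_file(file_path, 'a') as output_file:
        for key, value in kwargs.items():
            output_file.root._v_attrs[key] = value

At the recon stage the same attributes would be read back, copied into the recon file, and complemented with the reconstruction configuration.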
Knowing whether a given event file is, e.g., a digi or a recon file would be extremely useful downstream in order to be able to tell what one can do with the file itself.
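A possible (hedged) approach, again assuming PyTables files, with 'filetype' as a purely illustrative attribute name rather than an existing keyword:

import tables

def peek_file_type(file_path):
    """Return the content type of an event file, e.g., 'digi' or 'recon',
    read from an illustrative 'filetype' header attribute (sketch only)."""
    with tables.open_file(file_path, 'r') as input_file:
        return input_file.root._v_attrs['filetype']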
FAILED tests/test_source.py::test_cu_k_forest - AssertionError: assert (33.67776280160456 - 5) <= (5.0 * 3.1622776601683795)
+ where 3.1622776601683795 = <ufunc 'sqrt'>((2.0 * 5))
+ where <ufunc 'sqrt'> = np.sqrt
e.g., when reading out in SPARSE mode, there's no point in bringing along the padding, which is only relevant for the RECTANGULAR readout.
More generally, we should have a definite strategy to make sure that all the keywords ending up in the digi header make sense for a specific simulation.
Sometimes fits with DoubleGaussian() flip the indices of mean and sigma for the alpha and beta peaks, which by default are 0 and 1.
It is not enough to assign the lowest one to alpha because, if the fit does not converge for one of the two peaks, it could return a random value (see plot below).
If it is not possible to fix the indices, we need to find some condition to assign the mean values to the right peak. For now, in analyze_grid.py, the condition for a failed fit consists of the following check: if 0 - FitStatus.parameter_error('mean0') >= 0 is True, then the fit has failed for at least one of the two gaussians.
(Attached plot: fig.pdf)
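As a stopgap, after checking that both gaussians converged, one could explicitly reorder the fitted parameters by mean rather than relying on the indices; a minimal sketch, where parameter_value() is assumed to exist alongside the parameter_error() used in the check above:

def ordered_peaks(fit_status):
    """Return the (mean, sigma) pairs sorted by mean, so that the alpha peak
    always comes first regardless of how the fit labeled the two gaussians."""
    # A non-positive parameter error flags a failed fit, as in the analyze_grid.py check.
    for i in (0, 1):
        if fit_status.parameter_error(f'mean{i}') <= 0.:
            raise RuntimeError(f'Fit failed for gaussian {i}')
    peaks = [(fit_status.parameter_value(f'mean{i}'),
              fit_status.parameter_value(f'sigma{i}')) for i in (0, 1)]
    return sorted(peaks)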
See
https://lmfit.github.io/lmfit-py/
https://github.com/lmfit/lmfit-py
It seems that we're going a long way toward re-implementing most of the stuff that is already available in LMFIT.
We should take a close look at the package and see if it makes sense to drop our own modeling Python module.
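For comparison, a double-gaussian fit in LMFIT would look roughly like the sketch below (synthetic data, with line energies only illustrative of the Cu K case); the explicit prefixes keep the two peaks labeled, which would also sidestep the index-flipping problem above.

import numpy as np
from lmfit.models import GaussianModel

# Synthetic two-peak spectrum standing in for the real histogram data.
x = np.linspace(7., 10., 200)
y = 1000. * np.exp(-0.5 * ((x - 8.04) / 0.15)**2) + \
    150. * np.exp(-0.5 * ((x - 8.9) / 0.15)**2)

# Composite model: two gaussians with distinct, explicit prefixes.
model = GaussianModel(prefix='alpha_') + GaussianModel(prefix='beta_')
params = model.make_params(alpha_center=8., alpha_sigma=0.2, alpha_amplitude=300.,
                           beta_center=9., beta_sigma=0.2, beta_amplitude=50.)
result = model.fit(y, params, x=x)
print(result.fit_report())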
At the moment, the sparse readout strategy is only needed for creating DigiEventSparse digi events, whose reconstruction is not even well defined.
In clustering.py, for the class DigiInputFileSparse, we simply take all the triggered pixels, zero-suppress them and cluster the result, without a real physical meaning, so recon files with readoutmode == SPARSE should not be used for physics analysis.
In the future, proper clustering criteria for DigiEventSparse events should be thought out and implemented if needed.
This would be useful in order to have reproducible results across different runs.
At the moment, in the DigiOutputFile* classes (where * stands for any of the three possible readout modes), the readoutmode attribute is not initialized in the file header; instead, it is copied into the header in hxsim.py through the readout_args variable, which defines which HexagonalReadout* class to use. The correct approach is to initialize the attribute right away, without writing it twice in the file header.
We are starting with a fairly simple NN clustering implementation, where the two basic handles that we have are the zero-suppression threshold and the number of neighbors.
(Note that pixels added to the cluster can be on opposite sides of the seed, which probably does not make much sense.)
I think the first piece of information that we need here is an understanding of how the charge is shared between adjacent pixels in the absence of noise.
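For reference, a toy version of the current logic that makes the two handles explicit (all names are illustrative, not the actual clustering.py interface):

import numpy as np

def nn_cluster(pha, neighbor_indices, zero_sup_threshold, num_neighbors):
    """Toy nearest-neighbor clustering: take the highest pixel as the seed and
    add up to num_neighbors adjacent pixels above the zero-suppression threshold.
    neighbor_indices maps each pixel index to the indices of its (up to 6)
    hexagonal neighbors. Note that nothing here prevents the selected neighbors
    from sitting on opposite sides of the seed."""
    seed = int(np.argmax(pha))
    candidates = [i for i in neighbor_indices[seed] if pha[i] > zero_sup_threshold]
    candidates.sort(key=lambda i: pha[i], reverse=True)
    cluster = [seed] + candidates[:num_neighbors]
    return cluster, pha[cluster].sum()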
We have been working on different readout strategies in the more_digi branch, and we do have three DigiEvent structures.
We need to do a few things in order for this to be actually used, particularly:
Note that the generation of the pixel noise in PHA space is now performed at the end of the digitization; therefore, if we need to simulate the noise at the trigger level, this must be done separately.
See the discussion at #12
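For clarity, a minimal sketch of what the noise generation amounts to (not the actual digitization code): the PHA array gets a gaussian smearing at the very end, so any trigger-level noise would need a separate draw on the minicluster sums.

import numpy as np

rng = np.random.default_rng()

def add_readout_noise(pha, enc):
    """Smear the PHA array with gaussian noise of rms equal to the equivalent
    noise charge, at the end of the digitization chain (sketch only)."""
    return pha + rng.normal(0., enc, size=pha.shape)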
In hxsim.py, the test beam is generated inside the simulation as a GaussianBeam. It would be useful to add an option for choosing between different types of beams, such as a UniformBeam (to be implemented, useful for the hardware simulation of the custom readout chip) or a PointBeam (less interesting, as it can be emulated with a narrow GaussianBeam), useful for testing.
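A minimal sketch of what a UniformBeam could look like (the rvs() interface is assumed to mirror that of GaussianBeam, and all the attribute names are illustrative):

import numpy as np
from dataclasses import dataclass

@dataclass
class UniformBeam:

    """Photons generated uniformly over a rectangle of half-sides
    (half_side_x, half_side_y) centered at (x0, y0), in the same units
    used by GaussianBeam (sketch only)."""

    x0: float = 0.
    y0: float = 0.
    half_side_x: float = 0.1
    half_side_y: float = 0.1

    def rvs(self, size=1):
        """Return two arrays with the x and y photon positions."""
        rng = np.random.default_rng()
        x = rng.uniform(self.x0 - self.half_side_x, self.x0 + self.half_side_x, size)
        y = rng.uniform(self.y0 - self.half_side_y, self.y0 + self.half_side_y, size)
        return x, y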
In the current implementation the three different readout classes have different read() signatures: HexagonalReadoutRectangular needs the padding, which the other two do not.
Under the reasonable assumption that the chip is configured once outside the event loop, it would make much more sense for the read() hook to have the same signature for all classes, e.g.
def read(self, timestamp: float, x: np.ndarray, y: np.ndarray)
and add a setup() hook for the configuration part (that is, all the stuff that does not change on an event-by-event basis).
If we do something along the lines of
def setup(**kwargs)
we might even get away with passing the argparse dictionary to the function. (Let's see how this looks.)
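A minimal sketch of the proposed split (the keyword names and cached attributes are purely illustrative):

import numpy as np

class HexagonalReadoutRectangular:

    def setup(self, **kwargs):
        """Digest the configuration once, outside the event loop; keyword
        arguments we do not care about (e.g., the rest of the argparse
        namespace) are simply ignored."""
        self.trg_threshold = kwargs['trgthreshold']
        self.zero_sup_threshold = kwargs['zsupthreshold']
        self.padding = kwargs['padding']

    def read(self, timestamp: float, x: np.ndarray, y: np.ndarray):
        """Event-by-event readout, using the configuration cached by setup()."""
        ...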
=================================== FAILURES ===================================
________________________________ test_diffusion ________________________________
diff_sigma = 40.0
def test_diffusion(diff_sigma=40.):
"""
"""
grid = HexagonalGrid(HexagonalLayout.ODD_R, 2, 2, 0.005)
evt = MonteCarloEvent(0., 8000., 0., 0., 0.05, 3000)
x, y = evt.propagate(diff_sigma)
readout = HexagonalReadout(HexagonalLayout.ODD_R, 10, 10, 0.005, 40., 1.)
padding = Padding(1)
> digi_event = readout.read(evt.timestamp, x, y, 500., padding, 80, 0)
tests/test_mc.py:35:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
hexsample/digi.py:412: in read
roi, pha = self.trigger(signal, trg_threshold, min_col, min_row, padding)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <hexsample.digi.HexagonalReadout object at 0x7f17a77ffe50>
signal = array([[ 72, 1430, 1, 0],
[1426, 71, 0, 0]])
trg_threshold = 500.0, min_col = 4, min_row = 4
padding = Padding(top=1, right=1, bottom=1, left=1)
def trigger(self, signal : np.ndarray, trg_threshold, min_col : int, min_row : int,
padding : Padding) -> Tuple[RegionOfInterest, np.ndarray]:
"""Apply the trigger, calculate the region of interest, and pad the
signal array to the proper dimension.
.. warning::
This is still incorrect at the edges of the readout chip, as we are
not trimming the ROI (and the corresponding arrays) to the physical
dimensions of the chip.
"""
# pylint: disable=too-many-arguments, too-many-locals
# Sum the sampled signal into the 2 x 2 trigger miniclusters.
trg_signal = self.sum_miniclusters(signal)
# Zero-suppress the trigger signal below the trigger threshold.
self.zero_suppress(trg_signal, trg_threshold)
# This is tricky, and needs to be documented properly---basically we
# build arrays with all the (minicluster) columns and rows for which
# at least one minicluster is above threshold. The multiplicative factor
# of 2 serves the purpose of converting minicluster to pixel coordinates.
trg_cols = 2 * np.nonzero(trg_signal.sum(axis=0))[0]
trg_rows = 2 * np.nonzero(trg_signal.sum(axis=1))[0]
# Build the actual ROI in chip coordinates and initialize the RegionOfInterest
# object.
roi_min_col = min_col + trg_cols.min() - padding.left
roi_max_col = min_col + trg_cols.max() + 1 + padding.right
roi_min_row = min_row + trg_rows.min() - padding.top
roi_max_row = min_row + trg_rows.max() + 1 + padding.bottom
roi = RegionOfInterest(roi_min_col, roi_max_col, roi_min_row, roi_max_row, padding)
# And now the actual PHA array: we start with all zeroes...
pha = np.full(roi.shape(), 0.)
# ...and then we patch the original signal array into the proper submask.
num_rows, num_cols = signal.shape
start_row = padding.bottom - trg_rows.min()
start_col = padding.left - trg_cols.min()
> pha[start_row:start_row + num_rows, start_col:start_col + num_cols] = signal
E ValueError: could not broadcast input array from shape (2,4) into shape (2,3)
hexsample/digi.py:328: ValueError
In recon.py, the ReconEvent class is implemented with an attribute roi_size. This attribute has no meaning for readout modes other than readoutmode == RECTANGULAR.
For now, the attribute has been commented out (line 101) in order to proceed with the closure of issue #50, but going forward we have two different paths: renaming roi_size to number_of_pixels, in order to give it a physical meaning for every readout mode, even if for the circular readout it is redundant, the number of pixels in the event being always fixed to 7 (1 + the 6 adjacent neighbors). The first one seems the right choice.
We just realized that, at the moment, a typo in a kwarg gets happily ignored.
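A cheap way to catch this (the set of valid keys below is just a placeholder) is to validate the keyword arguments against the known keys and raise instead of silently ignoring typos:

def check_kwargs(valid_keys, **kwargs):
    """Raise a TypeError if any keyword argument is not among the known keys."""
    unknown = set(kwargs) - set(valid_keys)
    if unknown:
        raise TypeError(f'Unknown keyword argument(s): {", ".join(sorted(unknown))}')

# e.g., a typo in trgthreshold would now fail loudly instead of being ignored.
check_kwargs(('trgthreshold', 'padding'), trgthresold=200.)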
We do have errors when the ROI hits the physical bounds of the readout.
2023-10-06 12:35:28.238 | ERROR | hexsample.digi:__post_init__:86 - Error in DigiEvent post-initializaion.
RegionOfInterest(min_col=160, max_col=304, min_row=0, max_row=189, padding=Padding(top=2, right=2, bottom=2, left=2))
ROI size: 27550
pha size: 27360
(The ROI spans 145 columns x 190 rows = 27550 pixels, while the pha array only holds 144 x 190 = 27360; one column appears to be clipped at the physical edge of the chip, which the ROI does not account for.)