Coder Social home page Coder Social logo

eic / edm4eic Goto Github PK

View Code? Open in Web Editor NEW
3.0 42.0 3.0 278 KB

A data model for EIC defined with podio and based on EDM4hep.

Home Page: https://eic.github.io/EDM4eic/

License: GNU Lesser General Public License v3.0

CMake 25.87% C++ 71.30% Python 2.83%
eic podio edm4hep

edm4eic's People

Contributors

c-dilks avatar dhevang avatar osbornjd avatar shujiel avatar sly2j avatar tylerkutz avatar veprbl avatar wdconinc avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

edm4eic's Issues

Make edm4eic_merge to work for more than tracks

Environment: (where does this bug occur, have you tried other environments)

main - eic container

Steps to reproduce: (give a step by step account of how to trigger the bug)

run edm4eic_merge on epic output files.

Expected Result: (what do you expect when you execute the steps above)

Merged output file

Actual Result: (what do you get when you execute the steps above)

Segfault when any branches other than SimTrackerHitCollection are included

Consistent storage and interfacing of covariances

Is your feature request related to a problem? Please describe.
We store Cov2f, Cov3f, Cov4f as a diagonal followed by upper triangular by row. We store Cov6f as full upper triangular by row, with diagonals distributed through the data block. The inconsistency may lead to errors.

Describe the solution you'd like
A consistent covariant treatment would be desirable. Diagonal-first storage is least common and least likely to allow code reuse in Jacobian multiplications.

Describe alternatives you've considered
Diagonal-first has advantages because with default constructor arguments this allows a concise specification of diagonal-only covariance matrices. The indexing logic is more convoluted, though.

adding volume and surface ID to trajectory

Is your feature request related to a problem? Please describe.
The acts multitrajectoryhelper provided volume and surface id for each hit. We already have chi2 for each hits in the trajectory data structure (see

edm4eic::Trajectory:
)
. Adding those geometry info helps the tracking performance analysis.

I propose to add the following four integer vectors:
( see https://github.com/acts-project/acts/blob/main/Core/include/Acts/EventData/MultiTrajectoryHelpers.hpp)
measurementVolume, measurementLayer, outlierVolume, outlierLayer

CMake version requirement mismatch

Environment: (where does this bug occur, have you tried other environments)

  • Which version: v1.0.0 tag
  • CMake - 3.16.3 (Default for ubuntu 20.04 LTS)

Steps to reproduce: (give a step by step account of how to trigger the bug)

Building with CMake older than 3.20 gives:

cmake -Wno-dev -DCMAKE_INSTALL_PREFIX=/home/romanov/eic/soft/edm4eic/edm4eic-v1.0.0 -DCMAKE_CXX_STANDARD=17 -DBUILD_DATA_MODEL=ON  /home/romanov/eic/soft/edm4eic/src/v1.0.0&& cmake --build . -- -j 28&& cmake --build . --target install

... 

CMake Error at CMakeLists.txt:71 (cmake_path):
  Unknown CMake command "cmake_path".

The problem is that minimum CMake required set to 3.12 but cmake_path is introduced in CMake 3.20

P.S. Question of CMake version was raised for EICrecon some time ago, so this link might be relevant

add association or relation for `RawPMTHit` to MC

Is your feature request related to a problem? Please describe.
For algorithm development it is useful to have the MC photon info, given digitized RawPMTHits. This is a specific need for RICH detectors, where we expect ~1 photon per digitized hit, but sometimes 2 or 3. We thus prefer a one-to-many mapping of hits to photons.

Describe the solution you'd like
This is not the kind of mapping we would have available from real data, and generally MCReco associations are preferred for these. We need a 1-N mapping:

  edm4eic::MCRecoPMTHitAssociation:
    Members:
     - float weight                  // weight of this association
    OneToOneRelations:
     - edm4eic::RawPMTHit rawHit      // reference to the digitized hit
    OneToManyRelations:
     - edm4hep::SimTrackerHit simHits // reference to the simulated hits

Note the MC photons are available in the SimTrackerHit::MCParticle 1-1 relation.

Describe alternatives you've considered
An alternative is a 1-N relation:

  edm4eic::RawPMTHit:
     Members:
      - uint64_t          cellID            // The detector specific (geometrical) cell id.
      - uint32_t          integral          // PMT signal integral [ADC]
      - uint32_t          timeStamp         // PMT signal time [TDC]
     OneToManyRelations:
      - edm4hep::SimTrackerHit simHits  // simulated hits

This is much more straightforward to handle downstream (and therefore somewhat preferable), but has the downside of introducing a Raw->Truth relation (but is that really so bad?)

Additional context
Ideally the RICH PID algorithms do not need MC truth photons; however, during development and testing, they are extremely useful to have.

Request to include Chi2 and number of clusters (hits) used in the fit

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

I am understanding the performance of fitter in ACTS, therefore, I need Chi2 and number of hits used in the fit in the output tree to get the more understanding about the fitter.
m_log->debug(" chi2 = {}", trajState.chi2Sum);
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

EDM4eic CMake config in non-canonical location

Environment: (where does this bug occur, have you tried other environments)

  • Which branch (often main for latest released): main
  • Which version (or HEAD for the most recent on git): 2.1.0
  • Any specific OS or system where the issue occurs? eic-shell
  • Any special versions of ROOT or Geant4? No

Steps to reproduce: (give a step by step account of how to trigger the bug)

  1. /usr/local/lib/cmake/EDM4HEP/EDM4HEPTargets.cmake
  2. /usr/local/lib/EDM4EIC/EDM4EICTargets.cmake

Expected Result: (what do you expect when you execute the steps above)

CMake config files should be in /usr/local/lib/cmake.

RawTrackerHit has signed data members

Environment: (where does this bug occur, have you tried other environments)

  • Which branch (often main for latest released): main
  • Which version (or HEAD for the most recent on git): HEAD
  • Any specific OS or system where the issue occurs? none
  • Any special versions of ROOT or Geant4? none

Steps to reproduce: (give a step by step account of how to trigger the bug)

https://github.com/eic/EDM4eic/blob/1556c6dd2331560efbd7bb39f8e83efdde859987/edm4eic.yaml#LL300-L302

Expected Result: (what do you expect when you execute the steps above)

uint32_t for data words.

Actual Result: (what do you get when you execute the steps above)

int32_t for data words.

Ambiguous overloads for Vector3f operators in edm4hep and edm4eic

Environment: (where does this bug occur, have you tried other environments)

  • Which branch (often main for latest released): 3.0.1
  • Which version (or HEAD for the most recent on git): 3.0.1
  • Any specific OS or system where the issue occurs? on my local system
  • Any special versions of ROOT or Geant4? nothing special, edm4hep-0.9, edm4eic-3.0.1, no modifications

Steps to reproduce: (give a step by step account of how to trigger the bug)

  1. Attempt to compile EICrecon

Expected Result: (what do you expect when you execute the steps above)

Compilation should not fail due to ambiguous overloads in the data models.

Actual Result: (what do you get when you execute the steps above)

/home/wdconinc/git/EICrecon/src/algorithms/pid/ParticlesWithPID.cc: In member function ‘bool eicrecon::ParticlesWithPID::linkCherenkovPID(edm4eic::MutableReconstructedParticle&, const edm4eic::CherenkovParticleIDCollection&, edm4hep::ParticleIDCollection&)’:
/home/wdconinc/git/EICrecon/src/algorithms/pid/ParticlesWithPID.cc:263:69: error: ambiguous overload for ‘operator/’ (operand types are ‘const edm4hep::Vector3f’ and ‘std::size_t’ {aka ‘long unsigned int’})
  263 |                 in_track_p = in_track_p + ( in_track_point.momentum / in_track.points_size() );
      |                                             ~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~
      |                                                            |                              |
      |                                                            const edm4hep::Vector3f        std::size_t {aka long unsigned int}
In file included from /home/wdconinc/git/EICrecon/src/algorithms/pid/ParticlesWithPID.cc:6:
/opt/local/include/edm4eic/vector_utils.h:153:34: note: candidate: ‘V operator/(const V&, double) [with V = edm4hep::Vector3f]’
  153 | template <edm4eic::VectorND V> V operator/(const V& v, const double d) {
      |                                  ^~~~~~~~
In file included from /home/wdconinc/git/EICrecon/src/algorithms/pid/ParticlesWithPID.cc:7:
/opt/local/include/edm4hep/utils/vector_utils.h:266:20: note: candidate: ‘constexpr V edm4hep::operator/(const V&, double) [with V = Vector3f]’
  266 | inline constexpr V operator/(const V& v, const double d) {
      |                    ^~~~~~~~

Additional Context

I consider this a bug on the edm4eic side, since we should be able to reuse the edm4hep vector utils. I also wonder why this suddenly doesn't work anymore (I'm experimenting with c++20 support, so maybe template resolution rules are slightly different such that constexpr is now treated as different from non-constexpr).

Trigger EICrecon CI and benchmarks

Is your feature request related to a problem? Please describe.
We are currently only triggering juggler, and not testing EICrecon.

Describe the solution you'd like
Trigger an EICrecon CI run (which should trigger benchmarks).

Describe alternatives you've considered
Not test changes and hope for the best.

Doxygen pages do not include EDM4hep...

Environment: (where does this bug occur, have you tried other environments)

The doxygen pages at https://eic.github.io/EDM4eic/ do not include the EDM4hep classes. This is because we build doxygen documentation inside an alpine container which does not have EDM4hep installed, so the modification to the Doxyfile don't take effect in the GitHub pipelines as they do on a local documentation build.

We could artifact up the EDM4hep include directory from eic-shell in GitHub actions and dump that into the alpine image before running doxygen.

We could also install doxygen into eic-shell and not do anything with alpine.

build fails outside of eic-shell

Environment: (where does this bug occur, have you tried other environments)

  • Which branch (often main for latest released): v1.0.0
  • Which version (or HEAD for the most recent on git):
  • Any specific OS or system where the issue occurs? RHEL7
  • Any special versions of ROOT or Geant4?

Steps to reproduce: (give a step by step account of how to trigger the bug)

This is done outside of eic-shell

  1. cmake build against PODIO v00-15 and edm4hep v00-06

Expected Result: (what do you expect when you execute the steps above)

Package builds

Actual Result: (what do you get when you execute the steps above)

The build fails with the message below indicating it is unable to find the edm4hep/Vector3f.h file. Messages during the cmake stage indicate it is finding the correct podio and edm4hep versions. I also confirmed that the the file exists in the edm4hep build.

I suspect this is due to development and testing being done entirely within eic-shell which has edm4hep include files installed (symlinked) in /usr/local which is in the include path by default. Building in an environment where edm4hep is installed outside of common directories like /usr/local will need to have the edm4hep include directory added.

[ 1%] Building CXX object CMakeFiles/edm4eic.dir/src/ReconstructedParticle.cc.o
/opt/rh/devtoolset-9/root/usr/bin/c++ -Dedm4eic_EXPORTS -I/home/davidl/work/2022.07.11.EICrecon/EICTOP/root/root-6.26.04/include -I/home/davidl/work/2022.07.11.EICrecon/EICTOP/edm4eic/v1.0.0/build -isystem /home/davidl/work/2022.07.11.EICrecon/EICTOP/PODIO/v00-15/install/include -std=c++17 -pipe -fsigned-char -pthread -fPIC -std=gnu++2a -MD -MT CMakeFiles/edm4eic.dir/src/ReconstructedParticle.cc.o -MF CMakeFiles/edm4eic.dir/src/ReconstructedParticle.cc.o.d -o CMakeFiles/edm4eic.dir/src/ReconstructedParticle.cc.o -c /home/davidl/work/2022.07.11.EICrecon/EICTOP/edm4eic/v1.0.0/build/src/ReconstructedParticle.cc
In file included from /home/davidl/work/2022.07.11.EICrecon/EICTOP/edm4eic/v1.0.0/build/edm4eic/ReconstructedParticleObj.h:7,
from /home/davidl/work/2022.07.11.EICrecon/EICTOP/edm4eic/v1.0.0/build/edm4eic/ReconstructedParticle.h:6,
from /home/davidl/work/2022.07.11.EICrecon/EICTOP/edm4eic/v1.0.0/build/src/ReconstructedParticle.cc:4:
/home/davidl/work/2022.07.11.EICrecon/EICTOP/edm4eic/v1.0.0/build/edm4eic/ReconstructedParticleData.h:7:10: fatal error: edm4hep/Vector3f.h: No such file or directory
7 | #include "edm4hep/Vector3f.h"
| ^~~~~~~~~~~~~~~~~~~~
compilation terminated.

Add a graph visualization

Is your feature request related to a problem? Please describe.
Using AIDASoft/podio#377 and following along with key4hep/EDM4hep#193

Describe the solution you'd like
A graph to show all datatypes, relations, and associations; this would be especially helpful for new users.

Describe alternatives you've considered
No graph, since the YAML file already tells everything.

EventStore used in merge utility is removed in podio 0.17.4

Environment: (where does this bug occur, have you tried other environments)

  • Any special versions of ROOT or Geant4? Podio-0.17.4

Steps to reproduce: (give a step by step account of how to trigger the bug)

  1. Attempt compilation with podio 0.17.4

Expected Result: (what do you expect when you execute the steps above)

utils/src/merge.cpp should compile by using frames.

Actual Result: (what do you get when you execute the steps above)

Include file not found.

Additional Context

AIDASoft/podio#429

Change the definition of edm4eic::TrackerHit to contain one to many relationship with edm4eic::RawTrackerHit

I would like to be able to cluster hits in pixel detectors to output a single central position and currently can't see a way to go about this within the data model or EICrecon.

In addition to changing the TrackerHit definition, the loop over RawTrackerHit in TrackerHitReconstruction_factory will need to be moved into the algorithm TrackerHitReconstruction and then an separate algorithm to include clustered hits can be added.

If this clustering of hits is already done elsewhere and is simple enough to use please point me in the direction. (e.g. Acts looks like it has the capabilities to do it presumably smarter but with additional complexities to integrate detectors outside of the central region)

Remove `ReconstructedParticle::PDG` member?

Is your feature request related to a problem? Please describe.

After some discussion in recent Software & Computing meetings, there may be some interest in removing the scalar member PDG from ReconstructedParticle, which in EICrecon is presently used for true PDG. Some points from these discussions:

  • MCRecoParticleAssociation::getSim::getPDG can be used to provide true PDG instead
    • not every out ReconstructedParticle object will be found in a MCRecoParticleAssociation, do those have a true PDG?
    • what should we do about members which may depend on PDG, such as ReconstructedParticle::mass? Or are these members filled without knowledge of the true PDG?
  • Removing member PDG brings us closer to EDM4hep model; it would be nice to switch to edm4hep::ReconstructedParticle, however, relations such as vertex, track, and cluster are to the EDM4eic objects, so this does not seem practical at this time
  • Removing member PDG will break:
    • Juggler
    • Any EICrecon reconstruction algorithms which use it
    • User analysis code
  • ReconstructedParticle::getParticleIDUsed and ReconstructedParticle::getParticleIDs will be used for PID objects, which in turn contain the PDG value(s) from PID

Let's use this Github issue to continue the discussion and eventually converge on a path forward for the future.
@sly2j @Chao1009 @wdconinc @mdiefent

C++20 standard should not be forced

set(CMAKE_CXX_STANDARD 20)

In reality we only require C++17 and allow C++20. By forcing C++20, we get in trouble with ROOT and nlohmann-json, when including both of these onto the same compilation unit. ROOT redefines its own span based on the C++17 standard at ROOT compile time, which conflicts with the std::span at EDM4eic compile time.

Add (mutable) type getters to edm4eic::vertex

The vertex object has a type object that is currently not rigorously defined. In the future it would be beneficial to have some concrete types that a vertex could be (e.g. primary, secondary, background, other...). Adding getters with bit shifting will allow for easier type identification, similar to that in edm4hep::MCParticle.

Describe the solution you'd like
Something similar to https://github.com/key4hep/EDM4hep/blob/main/edm4hep.yaml#L188-L220

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Refactor data model to better fit JANA workflow

One of the main differences in the paradigms between JANA and the existing Gaudi-informed data model is what they use as the primary key for connecting data and algorithms. Specifically, podio uses collection names (a string) as the primary key and emphasizes a small set of common data types. One may ask for a collection by name and then discover the type. JANA is designed around using typeid as the primary key and a factory tag (a string) as a secondary key (which is only needed for special cases).

The existing data model was designed with a Gaudi-like system in mind. For example, the edm4hep::SimCalorimeterHit is made to serve both the E/M barrel and Hardronic Endcap calorimeters. In order to use this data model in a JANA-like system, we must basically sacrifice the secondary key to use as the primary key. However, if the data model were extended to include both EEMCSimCalorimeterHit and HCALSimCalorimeterHit types (both inheriting from edm4hep::SimCalorimeterHit) then these could serve as primary keys in JANA. This would also result in a more complete data model. As it is, the data model does not contain any information about EEMC or HCAL. While the fields are the same, the nature of those is quite different.

Using a data model designed with a different framework in mind than the one it is being used with can lead to awkward code patterns as it is doing with JANA. This change should be relatively easy as it does not modify existing data types, only adds new, trivial ones. Another added benefit will be that if the EEMC decides it needs an additional field, we will already have a dedicated type to insert this change into without having to modify the edm4hep::SimCalorimeterHit base class.

One side effect of this will be that generic algorithms will be harder to implement. The inputs will not be an issue since upcasting to a core edm4hep type is trivial. The outputs though will present a different challenge since the algorithm itself will need to instantiate objects of the derived type. (e.g. it should produce objects of type EEMCSimCalorimeterHit when run for that detector but type HCALSimCalorimeterHit when producing objects for that detector.) This is one of multiple challenges though with the generic algorithms already. A major one being that Gaudi/podio works with _Collection_s of objects while JANA deals with vectors of const pointers. If we made all generic algorithms work with std::vectors of pointers, JANA would be happier, but Gaudi would not. Same is true the other way around. (Yes, we can force it one way or the other, but this will be the tail wagging the dog.)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.