Comments (11)
To clarify - that dump is from a single random event. I observe this on all events I tested, from two MC files: a Z'->tt (amc@nlo) and a WH(bb) (powheg). Both showered with pythia.
from pandatree.
Ok, this is interesting, but confusing. Here's a slightly modified snippet/output of the same event as above:
// Compare every pair of distinct final-state particles.
for (auto &p : genParticles) {
  if (!p.finalState)
    continue;
  for (auto &q : genParticles) {
    if (!q.finalState)
      continue;
    if (&p == &q)
      continue;
    // Report pairs that are (nearly) collinear.
    if (DeltaR2(q.eta(), q.phi(), p.eta(), p.phi()) < 0.00001) {
      // parentage: +1 for each valid parent, +1 if both point to the same parent,
      // so 2 means two valid but different parents and 3 means a shared parent.
      int parentage = 0;
      parentage += int(q.parent.isValid());
      parentage += int(p.parent.isValid());
      parentage += int(q.parent.get() == p.parent.get());
      int q_parent_pdgid = q.parent.isValid() ? q.parent->pdgid : 0;
      int p_parent_pdgid = p.parent.isValid() ? p.parent->pdgid : 0;
      // Columns: pt, eta, phi, pdgid, parent pdgid (for q, then for p).
      PDebug("",Form("%.3f,%.3f,%.3f,%i,%i matches with %.3f,%.3f,%.3f,%i,%i; parentage=%i",
                     q.pt(), q.eta(), q.phi(), q.pdgid, q_parent_pdgid,
                     p.pt(), p.eta(), p.phi(), p.pdgid, p_parent_pdgid,
                     parentage));
    }
  }
}
0.553,-3.488,1.761,2212,2212 matches with 0.553,-3.487,1.761,2212,21; parentage=2
0.294,-3.344,2.776,11,1 matches with 0.294,-3.344,2.776,11,-2; parentage=2
0.027,-3.230,2.886,-11,1 matches with 0.027,-3.230,2.886,-11,-2; parentage=2
7.059,0.346,-0.127,310,2212 matches with 7.059,0.346,-0.127,310,-1; parentage=2
33.750,0.375,-0.093,310,2212 matches with 33.750,0.375,-0.093,310,-1; parentage=2
67.000,0.418,-0.247,2212,2212 matches with 67.000,0.418,-0.247,2212,1; parentage=2
1.168,2.478,2.852,2212,2212 matches with 1.168,2.478,2.852,2212,1; parentage=2
0.425,-0.197,-2.129,2212,2212 matches with 0.425,-0.197,-2.129,2212,1; parentage=2
0.553,-3.487,1.761,2212,21 matches with 0.553,-3.488,1.761,2212,2212; parentage=2
1.168,2.478,2.852,2212,1 matches with 1.168,2.478,2.852,2212,2212; parentage=2
0.425,-0.197,-2.129,2212,1 matches with 0.425,-0.197,-2.129,2212,2212; parentage=2
67.000,0.418,-0.247,2212,1 matches with 67.000,0.418,-0.247,2212,2212; parentage=2
7.059,0.346,-0.127,310,-1 matches with 7.059,0.346,-0.127,310,2212; parentage=2
33.750,0.375,-0.093,310,-1 matches with 33.750,0.375,-0.093,310,2212; parentage=2
2.439,1.076,2.817,22,2212 matches with 9.750,1.074,2.817,321,2212; parentage=3
0.927,-3.398,-0.225,2212,5 matches with 0.927,-3.398,-0.225,2212,2212; parentage=2
0.173,-0.156,1.809,11,1 matches with 0.173,-0.156,1.809,11,2212; parentage=2
0.061,-0.224,1.881,-11,1 matches with 0.061,-0.224,1.881,-11,2212; parentage=2
9.750,1.074,2.817,321,2212 matches with 2.439,1.076,2.817,22,2212; parentage=3
0.173,-0.156,1.809,11,2212 matches with 0.173,-0.156,1.809,11,1; parentage=2
0.061,-0.224,1.881,-11,2212 matches with 0.061,-0.224,1.881,-11,1; parentage=2
0.927,-3.398,-0.225,2212,2212 matches with 0.927,-3.398,-0.225,2212,5; parentage=2
0.294,-3.344,2.776,11,-2 matches with 0.294,-3.344,2.776,11,1; parentage=2
0.027,-3.230,2.886,-11,-2 matches with 0.027,-3.230,2.886,-11,1; parentage=2
So the particles are "distinct" in the sense that they really do have different parents! It seems that the parents are always strongly interacting (either a bare quark or a hadron, frequently a proton), indicating this may have something to do with the showering, or there's a bug with the way we're filling these particles.
from pandatree.
Er, there are gluons too.
from pandatree.
Ugh, it is probably due to the way we fill the particles. We take both the prunedGenParticles collection (stores full reco::GenParticle objects, but only for a limited set of "interesting" particles like leptons and bosons) and the packedGenParticles collection (stores all final-state particles as pat::PackedGenParticle) and combine them. The overlap removal is defined here:
https://github.com/PandaPhysics/PandaProd/blob/master/Producer/src/GenParticlesFiller.cc#L122
So it's possible that in some cases we aren't cleaning the overlaps properly. In fact we only consider overlaps between a GenParticle and a PackedGenParticle that share the same parent, so if there are particles in the two collections that represent the same final-state particle but somehow have different parents, we will end up in a situation like this.
from pandatree.
Just to make sure I understand the logic: there's no explicit check for parentage in that line, but because we're looping over the daughters of a mother here [1], we'd only ever compare two particles if they happened to share a mother, right?
I suppose there are two solutions here:
- Apply the deltaR/pT/pdgid checks to all particles, independent of parentage, and prune the tree.
- Go back to the old behavior, where we didn't merge the two collections. IIRC, we used to just use the prunedGenParticle collection, right? And is it only "interesting" particles? I thought at the very least it included everything in the final state.
[1] https://github.com/PandaPhysics/PandaProd/blob/master/Producer/src/GenParticlesFiller.cc#L116
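A minimal sketch of the first option above, assuming a simplified particle record. `Particle`, `fromPruned`, `sameParticle`, `removeOverlaps`, and the tolerance values are hypothetical names and numbers chosen for illustration, not the actual GenParticlesFiller code. The point is only that the deltaR/pT/pdgid comparison runs over all pairs, with no reference to parentage.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical minimal particle record; the real panda/CMSSW classes differ.
struct Particle {
  double pt, eta, phi;
  int pdgid;
  bool fromPruned;  // true if the entry came from the pruned collection
};

// Squared angular distance, with the phi difference wrapped into [-pi, pi].
inline double deltaR2(double eta1, double phi1, double eta2, double phi2) {
  const double kPi = 3.14159265358979323846;
  double dphi = std::fmod(phi1 - phi2, 2.0 * kPi);
  if (dphi > kPi) dphi -= 2.0 * kPi;
  else if (dphi < -kPi) dphi += 2.0 * kPi;
  const double deta = eta1 - eta2;
  return deta * deta + dphi * dphi;
}

// Two entries are treated as the same physical particle if the pdgid matches
// and direction and pT agree within tolerances (assumed values).
// Parentage is deliberately not consulted.
inline bool sameParticle(const Particle& a, const Particle& b,
                         double maxDR2 = 1e-5, double maxRelDPt = 0.01) {
  if (a.pdgid != b.pdgid) return false;
  if (deltaR2(a.eta, a.phi, b.eta, b.phi) >= maxDR2) return false;
  const double ref = std::max(a.pt, b.pt);
  return std::fabs(a.pt - b.pt) <= maxRelDPt * ref;
}

// Returns the input with duplicates removed. When a pruned and a packed entry
// match, the pruned one is kept.
inline std::vector<Particle> removeOverlaps(const std::vector<Particle>& in) {
  std::vector<Particle> out;
  for (const Particle& cand : in) {
    bool merged = false;
    for (Particle& kept : out) {
      if (!sameParticle(cand, kept)) continue;
      if (cand.fromPruned && !kept.fromPruned) kept = cand;
      merged = true;
      break;
    }
    if (!merged) out.push_back(cand);
  }
  return out;
}
```

Applied to the dump above, the two 2212 entries at eta -3.488 and -3.487 would collapse into one, while the photon/kaon pair at (1.07, 2.817) would stay separate because the pdgids differ. The all-pairs loop is quadratic in the number of particles, which is the price of dropping the same-mother restriction.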
from pandatree.
It has to be only the interesting particles - if the pruned collection contained all final-state particles, there would be no need for the packed collection.
This is the definition of "interesting":
http://cmslxr.fnal.gov/source/PhysicsTools/PatAlgos/python/slimming/prunedGenParticles_cfi.py?v=CMSSW_9_3_0
Looks like someone already wrote a merger, although I'm not sure how foolproof this is. Looks better than ours. Maybe we should just insert this in our prod config and read the output in panda.
http://cmslxr.fnal.gov/source/GeneratorInterface/RivetInterface/plugins/MergedGenParticleProducer.cc?v=CMSSW_9_3_0
from pandatree.
It looks like this prefers packed gen particles to pruned (unpacked) particles. We lose some information doing that, but then I suppose we lose that information when saving things in panda anyway.
The logic seems simple enough, except I'm slightly concerned about the exact-p4 comparison. Can we be certain that there aren't any rounding errors from compression/packing? Maybe we should still do delta-R/pt comparisons.
from pandatree.
Actually I don't trust the exact-p4 comparison at all. The dump above clearly shows that packing/unpacking causes eta/phi to change slightly.
from pandatree.
I agree. Maybe we should let whoever wrote this know too.
Also, one important piece of information that exists only in the pruned collection is the status flags, so we should prefer pruned particles in overlap removal.
from pandatree.
Piggy-backing on this issue: I think the current final-state panda GenParticles are made from the packed candidates, and therefore don't have StatusFlags filled properly. We should fix that too.
from pandatree.
I think I have something that works for the merging of the pruned and packed collections. The method is to actually pack the pruned particle and check that its integral values are equal to those of the packed particle, read via the PackedValuesExposer.
Test on 1k LO DY events:
http://t3serv001.mit.edu/~dhsu/campaign010/dupRemoval.png
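The idea of comparing packed integers rather than floating-point values can be sketched as follows. This is an illustration only: `PackedCodes`, `toCode`, `pack`, `unpack`, and the fixed grid step are hypothetical stand-ins, and the real pat::PackedGenParticle encoding and the PackedValuesExposer interface are not reproduced here.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical packed record: each kinematic quantity is stored as an integer code.
struct PackedCodes {
  std::int32_t pt, eta, phi;
  int pdgid;
  bool operator==(const PackedCodes& o) const {
    return pt == o.pt && eta == o.eta && phi == o.phi && pdgid == o.pdgid;
  }
};

// Hypothetical lossy packing: snap a value to a grid of width `step`.
// The real packing uses a different encoding; only the lossy, deterministic
// nature matters for this sketch.
inline std::int32_t toCode(double x, double step) {
  assert(step > 0.0);
  return static_cast<std::int32_t>(std::lround(x / step));
}

// Recover an approximate floating-point value from its integer code.
inline double unpack(std::int32_t code, double step) { return code * step; }

// Pack a full-precision particle with the same (assumed) step for all fields.
inline PackedCodes pack(double pt, double eta, double phi, int pdgid) {
  const double step = 1e-3;  // assumed grid width
  return {toCode(pt, step), toCode(eta, step), toCode(phi, step), pdgid};
}
```

Under these assumptions, a full-precision eta of -3.4876 unpacks to roughly -3.488, so an exact floating-point comparison against the pruned value fails. Packing the pruned particle with the same function reproduces the stored integers exactly, so the integer comparison needs no hand-tuned deltaR or pT tolerance.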
from pandatree.