
kuewkinoanalysis's People

Contributors

anazario, caleb-james-smith, crogan, jphsx, kuhep, mlazarovits, zflowers


kuewkinoanalysis's Issues

MET Trigger Tool Seg Fault

I have observed a seg fault in MET Trigger Tool when making custom reduced ntuples. The seg fault only occurs in a small number of jobs (not in all events or jobs).

Branch: https://github.com/crogan/KUEWKinoAnalysis/tree/sv_debug

Location of seg fault: https://github.com/crogan/KUEWKinoAnalysis/blob/sv_debug/src/METTriggerTool.cc#L157

Number of failed jobs due to seg fault:

[caleb@login-el7 KUEWKinoAnalysis]$ grep -i seg Summer16_102X_v3/err/crab_TTJets-DiLept-FullSim-2016-v3_Summer16_102X/*.err | wc -l
5
[caleb@login-el7 KUEWKinoAnalysis]$ grep -i seg Summer16_102X_v3/err/crab_TTJets-DiLept-FastSim-2016-v3_Summer16_102X/*.err | wc -l
2
[caleb@login-el7 KUEWKinoAnalysis]$ grep -i seg Fall17_102X_v3/err/crab_TTJets-DiLept-FullSim-2017-v3_Fall17_102X/*.err | wc -l
25
[caleb@login-el7 KUEWKinoAnalysis]$ grep -i seg Fall17_102X_v3/err/crab_TTJets-DiLept-FastSim-2017-v3_Fall17_102X/*.err | wc -l
3
[caleb@login-el7 KUEWKinoAnalysis]$ grep -i seg Autumn18_102X_v3/err/crab_TTJets-DiLept-FullSim-2018-v3_Autumn18_102X/*.err | wc -l
25
[caleb@login-el7 KUEWKinoAnalysis]$ grep -i seg Autumn18_102X_v3/err/crab_TTJets-DiLept-FastSim-2018-v3_Autumn18_102X/*.err | wc -l
0

Condor setup directory:

/home/caleb/SUSY_analysis/production_v2/CMSSW_10_6_5/src/KUEWKinoAnalysis/Summer16_102X_v3

Example problem job:

crab_TTJets-DiLept-FastSim-2016-v3_Summer16_102X_20_9

Setup:

cd /home/caleb/SUSY_analysis/production_v2/CMSSW_10_6_5/src/KUEWKinoAnalysis
git checkout sv_debug
cmsenv
restenv
make clean && make cmssw -j8

Run:

./MakeReducedNtuple_NANO.x -ifile=root://cmseos.fnal.gov//store/group/lpcsusylep/NANO_SVSF_v2/TTJets_DiLept_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/crab_TTJets-DiLept-FastSim-2016-v2/220411_024729/0000/TTJets-DiLept-FastSim-2016_28.root -ofile=crab_TTJets-DiLept-FastSim-2016-v3_Summer16_102X_20_9.root -tree=Events -dataset=crab_TTJets-DiLept-FastSim-2016-v3 -filetag=Summer16_102X -eventcount=./Summer16_102X_v3/config/EventCount.root -filtereff=./Summer16_102X_v3/config/FilterEff.root -json=./Summer16_102X_v3/config/GRL_JSON.txt -pu=./Summer16_102X_v3/config/PU/ -btag=./Summer16_102X_v3/config/BtagSF/ -jme=./Summer16_102X_v3/config/JME/ -svfile=./Summer16_102X_v3/config/NNmodel.json -metfile=./Summer16_102X_v3/config/METTrigger/Parameters.csv -split=10,10

Output (shortened):

...

event: 104410
event: 104411
event: 104412
Start Get_EFF(): name = HT-Le600--SingleElectrontrigger-E1--Nele-E1_SingleElectron_2016_Electron, MET = 172.576, updown = 0
Get_EFF(): Before first if, else
Get_EFF(): case (2)
Get_EFF(): After first if, else
Get_EFF(): return (1)
Start Get_EFF(): name = HT-Le600--SingleElectrontrigger-E1--Nele-E1_Bkg_2016_Electron, MET = 172.576, updown = 0
Get_EFF(): Before first if, else
Get_EFF(): case (2)
Get_EFF(): After first if, else
Get_EFF(): return (1)
Start Get_EFF(): name = HT-Le600--SingleElectrontrigger-E1--Nele-E1_SingleElectron_2016_Electron, MET = 172.576, updown = 0
Get_EFF(): Before first if, else
Get_EFF(): case (2)
Get_EFF(): After first if, else
Get_EFF(): return (1)
Start Get_EFF(): name = HT-Le600--SingleElectrontrigger-E1--Nele-E1_Bkg_2016_Electron, MET = 172.576, updown = 0
Get_EFF(): Before first if, else
Get_EFF(): case (2)
Get_EFF(): After first if, else
Get_EFF(): return (1)
Start Get_EFF(): name = HT-Le600--SingleElectrontrigger-E1--Nele-E1_SingleElectron_2016_Electron, MET = 172.576, updown = 0
Get_EFF(): Before first if, else
Get_EFF(): case (2)
Get_EFF(): After first if, else
Get_EFF(): return (1)
Start Get_EFF(): name = HT-Le600--SingleElectrontrigger-E1--Nele-E1_Bkg_2016_Electron, MET = 172.576, updown = 0
Get_EFF(): Before first if, else
Get_EFF(): case (2)
Get_EFF(): After first if, else
Get_EFF(): return (1)
event: 104413
event: 104414
event: 104415
Start Get_EFF(): name = SingleElectrontrigger-E1--Nele-E1_SingleElectron_2016_Electron, MET = 156.716, updown = 0
Get_EFF(): Before first if, else

 *** Break *** segmentation violation

 ===========================================================
#6  0x00000000005291c2 in METTriggerTool::Get_EFF(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, double, int) ()
#7  0x000000000052ccbd in METTriggerTool::Get_SF(double, double, int, bool, bool, bool, int) ()
#8  0x000000000057860d in ReducedNtuple<SUSYNANOBase>::FillOutputTree(TTree*, Systematic const&) ()
#9  0x000000000053b9a4 in NtupleBase<SUSYNANOBase>::WriteNtuple(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, int) ()
#10 0x0000000000438054 in main ()
===========================================================
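
One thing visible in the output above: the name in the crashing call lacks the "HT-Le600--" prefix that the preceding calls have, and the crash comes right after the "Before first if, else" print. A possible cause (an assumption, not confirmed from the source) is that Get_EFF() looks up a curve name that does not exist in the parameter table and then uses the missing entry. A minimal sketch of a guarded lookup is below; `EffParams` and `find_params` are hypothetical names for illustration and are not taken from METTriggerTool.cc.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the per-curve parameters read from Parameters.csv.
struct EffParams {
    double norm;
    double mean;
    double sigma;
};

// Look up a curve by name without touching a missing entry.
// Returns a null pointer when the name is absent, so the caller can
// print the offending name and fall back instead of crashing.
const EffParams* find_params(const std::map<std::string, EffParams>& table,
                             const std::string& name) {
    const auto it = table.find(name);
    if (it == table.end()) {
        return nullptr;
    }
    return &it->second;
}
```

If the lookup returns a null pointer for the name printed just before the crash, that would point to a name-building problem rather than a problem with the MET value.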

Errors when running Build Fit Input

Last week, when running Build Fit Input on condor at CMS LPC, I hit a recurring issue that affected many condor jobs across multiple submissions. It is related to a server error when loading large signal ROOT files (e.g. over 30 GB in size). These files are loaded in all jobs; a large fraction of jobs hit the same server error, which then leads to a seg fault.

Running the following on CMS LPC to submit condor jobs:

./BuildFitInputCondor.x -maxN 1 ++bkg +proc T2bW ++cat -year 2017 -lumi 137 --connect -path root://xrootd.unl.edu//store/user/zflowers/crogan/ -o test_BuildFitInput/

Example error message:

tar: write error
WARNING: In non-interactive mode release checks e.g. deprecated releases, production architectures are disabled.
Error in <TNetXNGFile::Open>: [ERROR] Server responded with an error: [3011] No servers are available to read the file.


 *** Break *** segmentation violation



===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================

Example prints from log to check for open file:

is root://xrootd.unl.edu//store/user/zflowers/crogan/Summer16_102X_SMS/SMS-TSlepSlep_TuneCUETP8M1_13TeV-madgraphMLM-pythia8_Summer16_102X.root open? 1
is root://xrootd.unl.edu//store/user/zflowers/crogan/Summer16_102X_SMS/SMS-TSlepSlep_TuneCP2_13TeV-madgraphMLM-pythia8_ext_Summer16_102X.root open? 1
is root://xrootd.unl.edu//store/user/zflowers/crogan/Summer16_102X_SMS/SMS-TSlepSlep_mSlep-500To1300_TuneCUETP8M1_13TeV-madgraphMLM-pythia8_Summer16_102X.root open? 1
is root://xrootd.unl.edu//store/user/zflowers/crogan/Fall17_102X_SMS/SMS-T2bW_TuneCP2_13TeV-madgraphMLM-pythia8_Fall17_102X.root open? 1
is root://xrootd.unl.edu//store/user/zflowers/crogan/Fall17_102X_SMS/SMS-T2bW_X05_dM-10to80_genHT-160_genMET-80_mWMin-0p1_TuneCP2_13TeV-madgraphMLM-pythia8_Fall17_102X.root open? 1
is root://xrootd.unl.edu//store/user/zflowers/crogan/Fall17_102X_SMS/SMS-T2bW_X05_dM-10to80_2Lfilter_mWMin-0p1_TuneCP2_13TeV-madgraphMLM-pythia8_Fall17_102X.root open? 1
is root://xrootd.unl.edu//store/user/zflowers/crogan/Fall17_102X_SMS/SMS-T2tt_dM-10to80_genHT-160_genMET-80_mWMin-0p1_TuneCP2_13TeV-madgraphMLM-pythia8_Fall17_102X.root open? 1
is root://xrootd.unl.edu//store/user/zflowers/crogan/Fall17_102X_SMS/SMS-T2tt_dM-6to8_genHT-160_genMET-80_TuneCP2_13TeV-madgraphMLM-pythia8_Fall17_102X.root open? 1
is root://xrootd.unl.edu//store/user/zflowers/crogan/Fall17_102X_SMS/SMS-T2tt_mStop-400to1200_TuneCP2_13TeV-madgraphMLM-pythia8_Fall17_102X.root open? 1
is root://xrootd.unl.edu//store/user/zflowers/crogan/Fall17_102X_SMS/SMS-TChiWZ_ZToLL_TuneCP2_13TeV-madgraphMLM-pythia8_Fall17_102X.root open? 

In this case, the seg fault occurred on the last file in the list, whose "open?" print stops before any value appears.

I worked around this error by adding a "SKIP_SIGNAL" flag that skips the InitSMS() statements for all signals, so that only background is processed. This works for background-only running, but a different fix is needed to run signal.

https://github.com/crogan/KUEWKinoAnalysis/blob/friday/src/SampleTool.cc#L237

https://github.com/crogan/KUEWKinoAnalysis/blob/friday/src/SampleTool.cc#L353
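
One possible direction for the signal case, sketched under assumptions: since the "[3011] No servers are available" error may be transient, the open could be retried a few times, and the returned pointer checked before it is used. The helper below is generic and does not depend on ROOT; `open_with_retry` is a hypothetical name, not something in SampleTool.cc. As far as I know, ROOT's TFile::Open returns a null pointer when the open fails, so a lambda wrapping it could serve as the opener.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Call `open_once` up to `max_attempts` times, sleeping `delay` between
// failed attempts. `open_once` must return something pointer-like
// (raw pointer or smart pointer) that is null on failure.
// Returns the first non-null result, or null if every attempt fails.
template <typename Opener>
auto open_with_retry(Opener open_once, int max_attempts,
                     std::chrono::milliseconds delay) -> decltype(open_once()) {
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        auto file = open_once();
        if (file) {
            return file;
        }
        // Do not sleep after the final failed attempt.
        if (attempt < max_attempts) {
            std::this_thread::sleep_for(delay);
        }
    }
    return nullptr;
}
```

The caller still has to handle the null result (for example, skip that signal file and log its name) so that a failed open ends in a clear message instead of a seg fault.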
