Comments (3)
Hi Simone,
this is important, although the documentation does not mention it:
the normalization constant in the reweighters is not fixed.
This is because the final normalization constant may depend on factors external to the reweighter.
In many cases the normalization constant does not play a significant role (e.g. when computing efficiencies or ROC curves, or when training classifiers). When it does matter, you should compute it yourself.
Explanation: because the reweighters do not normalize, reweighter.predict_weights is guaranteed to be a deterministic mapping: the weight of an event depends only on that event, not on the rest of the sample passed in the same call.
E.g. whether you predict weights for a large sample at once, or predict the weight of each event separately and concatenate the predictions, the result is the same. If predict_weights normalized its output, the second approach would obviously give a wrong result.
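A minimal sketch of computing the normalization yourself, assuming you already have the raw weights as an array (here `raw_weights` is a made-up stand-in for the output of `reweighter.predict_weights`, and `normalize_weights` and `target_sum` are hypothetical names, not part of hep_ml):

```python
import numpy as np

def normalize_weights(weights, target_sum):
    """Scale weights by a single constant so that they sum to target_sum."""
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    return weights * (target_sum / total)

# Made-up values standing in for reweighter.predict_weights(original)
raw_weights = np.array([0.5, 1.5, 2.0, 4.0])

# Example choice: normalize to the number of events in the target sample
normalized = normalize_weights(raw_weights, target_sum=100.0)
```

Since every weight is multiplied by the same constant, the relative weights between events are unchanged; only the total changes.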
@jcob95, you should renormalize externally. As I understand your case, you should first compute the expected number of samples in each bin, and then, within each bin, normalize the weights so that their total matches the expected number.
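A rough sketch of that per-bin renormalization with plain NumPy. All names here (`renormalize_per_bin`, `bin_index`, `expected_totals`) are hypothetical and the numbers are made up; it assumes each event already has an integer bin index in the range 0 to n_bins - 1 and that the weights of all binned reweighters have been concatenated into one array:

```python
import numpy as np

def renormalize_per_bin(weights, bin_index, expected_totals):
    """Rescale weights so that the total weight in each bin equals the expected total."""
    weights = np.asarray(weights, dtype=float)
    bin_index = np.asarray(bin_index)
    expected_totals = np.asarray(expected_totals, dtype=float)

    # Current sum of weights in each bin
    bin_sums = np.bincount(bin_index, weights=weights,
                           minlength=len(expected_totals))

    # One scale factor per bin; bins with zero total weight get factor 0
    scale = np.divide(expected_totals, bin_sums,
                      out=np.zeros_like(expected_totals),
                      where=bin_sums > 0)

    # Apply each event's bin factor to its weight
    return weights * scale[bin_index]

# Made-up example: 5 events in 3 bins
weights = np.array([1.0, 3.0, 2.0, 2.0, 5.0])
bin_index = np.array([0, 0, 1, 1, 2])
expected_totals = np.array([10.0, 20.0, 5.0])

result = renormalize_per_bin(weights, bin_index, expected_totals)
```

After this step the per-bin totals of `result` match `expected_totals`, while the relative weights of events inside each bin stay as the reweighter predicted them.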
Hi, related to this question, I'm trying to compare a single reweighter trained and tested on the entire dataset with several reweighters, each trained on an individual bin of the data. What I'm trying to do is reconstruct the reweighted distributions over the whole data range from the binned reweighters.
So, is it possible to obtain the normalization constant that was used, or can I normalize the output of the reweighters externally?
Thanks