pineko's Introduction

PINEKO

PINEKO is a Python module to produce fktables from interpolation grids and EKOs.

Installation

PINEKO is available via

  • PyPI:
pip install pineko

Development

If you want to install from source you can run

git clone [email protected]:N3PDF/pineko.git
cd pineko
poetry install

To set up poetry and other tools, see the Contribution Guidelines.

Documentation

  • The documentation is available here: Docs
  • To build the documentation from source run
cd docs
poetry run make html

Tests and benchmarks

  • To run the unit tests you can do
poetry run pytest

Contributing

  • Your feedback is welcome! If you want to report a (possible) bug or want to ask for a new feature, please raise an issue: GitHub issues
  • Please follow our Code of Conduct and read the Contribution Guidelines

pineko's People

Contributors

alecandido, andreab1997, cschwan, felixhekhorn, giacomomagni, pre-commit-ci[bot], roystegeman, scarlehoff, scarrazza, t7phy

pineko's Issues

Refactor configs

After a brief experience with the current configs:

  • drop non-local paths:
    paths.append(pathlib.Path.home())
    paths.append(pathlib.Path(appdirs.user_config_dir()))
    paths.append(pathlib.Path(appdirs.site_config_dir()))
  • include git-like resolution, i.e. check parent folders for a pineko.toml file
  • test paths semantics
    • absolute paths are kept as they are
    • relative paths are relative to pineko.toml

Essentially, the best way to achieve the last one should be something like:

pinekopath = pathlib.Path(...).absolute()  # directory containing pineko.toml

def absolute_or_pineko(path: pathlib.Path):
    # absolute paths are kept as they are,
    # relative paths are resolved with respect to pineko.toml
    if path.is_absolute():
        return path
    return pinekopath / path
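
For illustration, assuming pineko.toml sits in a hypothetical /project directory, the intended semantics would then be:

# with pinekopath = pathlib.Path("/project")
# absolute_or_pineko(pathlib.Path("/data/grids"))  ->  /data/grids        (absolute, kept as-is)
# absolute_or_pineko(pathlib.Path("logs/eko"))     ->  /project/logs/eko  (resolved against pineko.toml)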

Factorization scale variations for grids.

At the moment we have only implemented renormalization scale variations for grids that are missing the scale dependence.
It would be good to add the factorization scale dependence as well.

This might become unnecessary if we get the last few grids with scale variations that we are missing, but I'm opening the issue just to keep it in mind.

We should already have the information necessary at NLO.

(I'm assigning myself to it since I don't want to inflict this pain onto others, but cc possible interested parties @cschwan @felixhekhorn @andreab1997)

FONLL-B DIS FK table need different PTO

I'm sorry to say again, but NLO FK tables are still wrong (only FK tables this time though - so it's not expensive):
https://github.com/N3PDF/pineko/blob/e12c6482a32c7c8f902bfa2d2504e4bfe451fd19/src/pineko/theory.py#L325

this is wrong in the case of NLO (so PTO=1) with FONLL-B, because FONLL-B also contains pieces at $O(a_s^2)$ which will be neglected by this statement ...

the solution is

  • find out (from the theory card) whether we're in FONLL-B
  • check whether we're looking at a DIS grid (from the lumi, I guess - we can cross check with NNPDF/nnpdf#1529)
  • if needed, correct max_as accordingly (see the sketch below)
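
A minimal sketch of such a correction (the helper name and the flags are hypothetical; it only encodes the reasoning above):

def correct_max_as(max_as, fns, is_dis_grid):
    # PTO=1 gives max_as=2, but a FONLL-B DIS FK table also needs the O(a_s^2) pieces,
    # so one extra order has to be kept
    if fns == "FONLL-B" and is_dis_grid and max_as == 2:
        return max_as + 1
    return max_as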

KeyError: 'nf0'

I'm trying to generate an FK table to test the performance of NNPDF/pineappl#103. However, when I run ./run.py, I get the following error message:

Traceback (most recent call last):
  File "/scratch/cschwan/pineko/./run.py", line 58, in <module>
    ensure_eko(pineappl_path, myoperator_path)
  File "/scratch/cschwan/pineko/./run.py", line 37, in ensure_eko
    ops = eko.run_dglap(theory_card=theory_card, operators_card=operators_card)
  File "/home/cschwan/projects/pineappl/pineappl_py/env/lib/python3.9/site-packages/eko/__init__.py", line 27, in run_dglap
    r = runner.Runner(theory_card, operators_card)
  File "/home/cschwan/projects/pineappl/pineappl_py/env/lib/python3.9/site-packages/eko/runner.py", line 84, in __init__
    tc = ThresholdsAtlas.from_dict(theory_card)
  File "/home/cschwan/projects/pineappl/pineappl_py/env/lib/python3.9/site-packages/eko/thresholds.py", line 179, in from_dict
    nf_ref = theory_card["nf0"]
KeyError: 'nf0'

FK logs

In pineko.toml I have specified, as suggested

[paths.logs]
eko = "logs/eko"
fk = "logs/fk"

The first time I run
$ pineko theory fks <theory_number> <dataset_name>
I get the error
FileNotFoundError: [Errno 2] No such file or directory: '/home/enocera/Documents/NNPDF/nnpdfgit/pineko/logs/fk/<theory_number>-<dataset_name>-None.log'
If I create the fk/ directory by hand, then everything works.

So it seems to me that the logs/eko dir is created (if not existing yet) when running
$ pineko theory ekos <theory_number> <dataset_name>
but the logs/fk dir is not created (if not existing) when running
$ pineko theory fks <theory_number> <dataset_name>
Is this correct?
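
A minimal sketch of a possible fix, creating the missing directory before opening the log file (fk_log_path is a hypothetical pathlib.Path pointing at the FK log file):

from pathlib import Path

fk_log_path = Path("logs/fk") / "some-run.log"  # hypothetical log file location
fk_log_path.parent.mkdir(parents=True, exist_ok=True)  # create logs/fk if it does not exist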

Road map for NNPDF40MHOU

Let me collect here the steps that we are missing in order to complete NNPDF40MHOU (so that we can keep track of progress)

  • Implement numerical FONLL (in charge: AB)
  • Compute jets FKtables (to be investigated)
  • Include NNLO k-factors in the FKtables (to be discussed...)
  • merge #75 (pineappl pre-release needed)
  • Use the automatic scale variations tool to add the sv orders to the hadronic grids (in charge: AB)
  • Recompute all the FKtables
  • Test thcovmat and fits
  • Do the fits

Let me also tag all the people that may be involved or just interested in this (just for you to know) @cschwan @felixhekhorn @alecandido @giacomomagni @scarlehoff

Integrability FK tables

When using pineko 0.3.3 there is a problem with integrability FKtables.
The EKOs, instead, are produced correctly.

Configurations loaded from '/data/theorie/gmagni/N3PDF/pineko/pineko.toml'
Analyze INTEGXT3
┌───────────────┐
│ Computing ... │
└───────────────┘
   /data/theorie/gmagni/N3PDF/pineko/data/grids/440/NNPDF_INTEG_XT3_40.pineappl.lz4
 + /data/theorie/gmagni/N3PDF/pineko/data/ekos/440/NNPDF_INTEG_XT3_40.tar
 = /data/theorie/gmagni/N3PDF/pineko/data/fktables/440/NNPDF_INTEG_XT3_40.pineappl.lz4
 with max_as=3, max_al=0, xir=1.0, xif=1.0
Traceback (most recent call last):
  File "/project/theorie/gmagni/miniconda3/envs/nnpdf/bin/pineko", line 5, in <module>
    command()
  File "/project/theorie/gmagni/miniconda3/envs/nnpdf/lib/python3.9/site-packages/click/core.py", line 1130, in call
    return self.main(*args, **kwargs)
  File "/project/theorie/gmagni/miniconda3/envs/nnpdf/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/project/theorie/gmagni/miniconda3/envs/nnpdf/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/project/theorie/gmagni/miniconda3/envs/nnpdf/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/project/theorie/gmagni/miniconda3/envs/nnpdf/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/project/theorie/gmagni/miniconda3/envs/nnpdf/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/data/theorie/gmagni/N3PDF/pineko/src/pineko/cli/theory_.py", line 93, in fks
    theory.TheoryBuilder(
  File "/data/theorie/gmagni/N3PDF/pineko/src/pineko/theory.py", line 448, in fks
    self.iterate(self.fk, tcard=tcard, pdf=pdf)
  File "/data/theorie/gmagni/N3PDF/pineko/src/pineko/theory.py", line 201, in iterate
    f(name, grid, **kwargs)
  File "/data/theorie/gmagni/N3PDF/pineko/src/pineko/theory.py", line 417, in fk
    _grid, _fk, comparison = evolve.evolve_grid(
  File "/data/theorie/gmagni/N3PDF/pineko/src/pineko/evolve.py", line 165, in evolve_grid
    operators, targetgrid=eko.interpolation.XGrid(x_grid)
  File "/data/theorie/gmagni/N3PDF/eko/src/eko/interpolation.py", line 430, in init
    raise ValueError(f"xgrid needs at least 2 points, received {len(xgrid)}")
ValueError: xgrid needs at least 2 points, received 1

Drop Q2 everywhere

We should update code and docs not to use $Q^2$ in place of $\mu_F^2$, in order to avoid ambiguity, and resulting confusion and bugs.

  • fix docs
  • fix code

Also $x$ would be better replaced by $z$ (but this is less crucial).

More details in #53 (comment)

Fktables for NLO theory

This is the NLO equivalent of NNPDF/fktables#5

We list here the Fktables which we cannot compute at the moment for the central NLO theory (208):

  • jets (so needs NNPDF/eko#105)
    • ATLAS_2JET_7TEV_R06
    • ATLAS_1JET_8TEV_R06
    • CMS_1JET_8TEV
    • CMS_2JET_7TEV
  • missing grids
    • ATLAS_WCHARM_WP_DIFF_7TEV (take from dom /media/Fk/fktables/data/appl_subgrids)
    • ATLAS_WCHARM_WM_DIFF_7TEV (take from dom /media/Fk/fktables/data/appl_subgrids)
    • CMSWCHARMTOT (Write the yamldb for the composition CMSWCHARM_WP and CMSWCHARM_WM with operand "add", take the two grids from /media/Fk/fktables/data/appl_subgrids )
    • CMSWCHARMRAT (same as the previous but with operand "ratio")
    • CMS_WCHARM_DIFF_UNNORM_13TEV (same as the previous but with operand "add" and different grids: CMS_WCHARM_13TEV_WMC and CMS_WCHARM_13TEV_WPCB)
  • We miss ren sv:
    • ATLAS_TOPDIFF_DILEPT_8TEV_TTRAPNORM (because we are missing ren sv for CMSTTBARTOT8TEV-TOPDIFF8TEVTOT, coming from Sherpa)
    • ATLAS_TTBARTOT_13TEV_FULLLUMI (because we are missing ren sv for ATLAS_TTBARTOT_13TEV_FULLLUMI-TOPDIFF13TEVTOT, coming from Sherpa)
    • ATLASTTBARTOT7TEV (because of ATLASTTBARTOT7TEV-TOPDIFF7TEVTOT, coming from Sherpa)
    • ATLASTTBARTOT8TEV (because of ATLASTTBARTOT8TEV-TOPDIFF8TEVTOT, this should be the same as CMSTTBARTOT8TEV-TOPDIFF8TEVTOT)
    • ATLAS_WM_JET_8TEV_PT ( because of ATLAS_WM_JET_8TEV_PT-atlas-atlas-wjets-arxiv-1711.03296-xsec003 which is coming from ploughshare)
    • ATLAS_WP_JET_8TEV_PT ( because of ATLAS_WP_JET_8TEV_PT-atlas-atlas-wjets-arxiv-1711.03296-xsec002 which is coming from ploughshare)
    • ATLASZPT8TEVMDIST (it is coming from MCFM)
    • ATLASZPT8TEVYDIST (it is coming from MCFM)
    • CMS_1JET_8TEV (coming from NLOjet++)
    • CMSTOPDIFF8TEVTTRAPNORM (because of CMSTOPDIFF8TEVTTRAPNORM-TOPDIFF8TEVTTRAP, coming from Sherpa and total cross-section see above)
    • CMSTTBARTOT13TEV (because of CMSTTBARTOT13TEV-TOPDIFF13TEVTOT, coming from Sherpa)
    • CMSTTBARTOT8TEV (because of CMSTTBARTOT8TEV-TOPDIFF8TEVTOT, coming from Sherpa)
    • CMSTTBARTOT7TEV (because of CMSTTBARTOT7TEV-TOPDIFF7TEVTOT, coming from Sherpa)
    • CMSZDIFF12 (coming from MCFM)
    • ATLAS_WCHARM_WP_DIFF_7TEV
    • ATLAS_WCHARM_WM_DIFF_7TEV
    • CMSWCHARMTOT
    • CMSWCHARMRAT
    • CMS_WCHARM_DIFF_UNNORM_13TEV
  • FTDY (needs VRAP)
    • DYE886R_dw_ite
    • DYE886P
    • DYE906R_dw_ite
  • Things to check:
    • D0ZRAP_40 (working for NNLO, check the bug)
    • D0WMASY
    • ATLASWZRAP36PB
    • ATLASDY2D8TEV
    • ATLAS_DY_2D_8TEV_LOWMASS
    • ATLASPHT15_SF (two bins of each FKtable are correct, the others are not)
    • CMSWEASY840PB
    • CMSWMASY47FB
    • CMSWMU8TEV
  • Things with a conjectured solution
    • ATLAS_TTB_DIFF_8TEV_LJ_TRAPNORM (maybe just a 2.0 conversion factor)
    • ATLAS_TTB_DIFF_8TEV_LJ_TTRAPNORM (same)
    • CDFZRAP_NEW (see issue NNPDF/fktables#17)
    • CMS_TTBAR_2D_DIFF_MTT_TRAP_NORM (this is actually correct NNPDF/fktables#5 )

Scaffolding

At the moment, we need a given hierarchy of folders to run.

Unfortunately, this may complicate the usage for a new user (or any non-developer in general), since you have to manage part of the structure manually, but then it has to respect some constraints.

To improve UX, I propose to implement the following features, keeping them simple but useful:

  • a single subcommand to manage the project structure, let's call it pineko scaffold (name to be decided, take it as a placeholder)
  • a command to set up a new project, pineko scaffold new (possibly aliased to pineko new for simplicity)
    • it should create all the directories, and possibly download some resources (e.g. yamldb)
  • a function to verify the correctness of the project structure, as declared in the config file pineko.toml: check_folders() (a sketch is given below)
    • it should be run by every subcommand of pineko scaffold before running (simple to implement in click, we just need to put the call in the group function)
    @command.group("scaffold")
    def subcommand():
        check_folders()
  • a subcommand pineko scaffold check to just run check_folders() (and print a simple and nice report if something goes wrong)

Since paths here are crucial, this is related to #51
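
A minimal sketch of check_folders(), assuming the folders declared in pineko.toml are available as a name-to-path mapping (the signature and return convention are illustrative, not the actual implementation):

import pathlib

def check_folders(paths):
    """Return the names of the declared folders that are missing on disk."""
    missing = []
    for name, folder in paths.items():
        if not pathlib.Path(folder).is_dir():
            missing.append(name)
    return missing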

Optionally:

  • a subcommand pineko scaffold update to download a new version of remote resources (e.g. yamldb)
    • in case, better to keep the old one as backup (put in a .old-data folder somewhere, and mangle with a timestamp)
  • a subcommand pineko scaffold repair, to fix the structure according to the current one spelled out in pineko.toml

Strong coupling constants are wrongly calculated

This was originally reported here: NNPDF/pineappl#226, but I'm quite confident that this is a bug in Pineko, specifically these lines:

pineko/src/pineko/evolve.py

Lines 200 to 207 in f8c3261

alphas_values = [
    4.0
    * np.pi
    * sc.a_s(
        xir * xir * muf2 / xif / xif,
    )
    for muf2 in muf2_grid
]

which evaluates the strong coupling at scales that do not correspond to the renormalization scales given here:

xir * xir * mur2_grid,

I believe the patch is to change these lines as follows, with which I get the correct evolution (tested with CMS_TTB_5TEV_TOT):

@@ -201,9 +201,9 @@ def evolve_grid(
         4.0
         * np.pi
         * sc.a_s(
-            xir * xir * muf2 / xif / xif,
+            xir * xir * mur2,
         )
-        for muf2 in muf2_grid
+        for mur2 in mur2_grid
     ]
     # We need to use ekompatibility in order to pass a dictionary to pineappl
     fktable = grid.evolve(

Pineko doesn't respect `_template.yaml`

I've noticed that pineko doesn't respect all options in _template.yaml. In particular, n_integration_cores gets filled as 0 regardless of what I put in the template.

This is with a fresh installation of pineline[full].

I've noticed this one because my computer crashed after eko took every possible resource, but there might be other options that are not respected. I'll have a look at whether there's something else (relevant for physics) being ignored or whether it is a problem of the pineline package on PyPI.

Fix click help - avoid running group code

The problem is here: this code is always run, even when the user is just asking for help.

I propose this should be solved in this PR, at least for this subcommand.

Originally posted by @alecandido in #69 (comment)

It's a complex task, and not critically relevant, so I now propose to postpone it.
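
A possible workaround, sketched under the assumption that the group callback just needs to be skipped when only the help text is requested (the command and function names are placeholders):

import sys

import click

def expensive_setup():
    """Placeholder for the group code that should not run for a bare --help."""

@click.group()
def cli():
    # skip the setup when the user is only asking for help
    if "--help" in sys.argv:
        return
    expensive_setup()

@cli.command()
def check():
    click.echo("ok")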

Default ren scale different from fact

@andreab1997 Neither of those two options, it will be evaluated at xir * xir * mur2, where xir is 1.0 in your example and mur2 is set internally and may differ w.r.t. Q2. As I said this usually isn't the case because in pinefarm we always set mur = muf, but for some grids this is the case, see for instance here NNPDF/pineappl#25 (comment).

Originally posted by @cschwan in #53 (comment)

We should take into account that sometimes we can not use the central fact scale (as coming from EKO) as the central ren scale.

Configuration database(s)

Current status

  • 1 theory database out of which a single record is selected for a fit
  • the record contains tentatively all information needed for evolution and DIS
  • (this is of course mainly due to historic reasons)

Problems

  • FNS actually should only matter for cross sections, but instead is also (ab-)used for evolution
    • FONLL is not involved in evolution and its threshold does not play any role in evolution
    • instead a dedicated threshold for cross section is needed, that can be chosen freely and independent of evolution thresholds
  • some of the settings are redundant: M_Z, M_W, sin(theta_w), G_F, alphaqed are not linearly independent
  • in principle the card could be divided into two parts (evolution <-> cross sections) with some few settings shared

Configurations ("o-cards")

  • both eko and yadism will be shipped default-less (in big contrast to APFEL)
  • both eko and yadism require some additional configurations:
    • in eko "operators": the target scale of the operators, the discretization and some numerical details
    • in yadism "observables": the discretization, DIS configurations (currents, hadron, lepton) and the target functions, e.g. F2total(x=0.1, Q2=90)
  • we require a mapping of the old settings to the new settings (which both programs already use at this point)
    • our current implementation of this remapping is given here
    • the current status of eko already ignores the FNS setting, but instead it has to be fed with the correct kcThr
    • yadism has to be fed with the correct kDIScThr (the name can of course be changed to kxscThr or similar) to determine the thresholds and FNS to determine the (re-)combination of heavy/light coefficient functions
    • in order to implement FONLL correctly (to our understanding) we need to deactivate the charm threshold in eko, but not in yadism

Proposed Workflow

  • pineko should determine a consistent configuration for both eko and yadism
  • ask them each in turn to compute their ingredients
  • join their respective outcome to provide what is needed: a mapping f_j(x,Q_0) -> theory prediction

Questions

  • how are the "observables" currently determined?
  • how can we organize and maintain the configurations?
  • how can we ensure the consistency between (experimental) dataset and theory?

Prepend `theory_` to theory card name

In order to have the possibility of storing all the theory cards in a single place I propose to prepend the string theory_ to the theory ID and change this line accordingly:

return configs.configs["paths"]["theory_cards"] / f"{theory_id}.yaml"
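
With the proposed prefix the line would become:

return configs.configs["paths"]["theory_cards"] / f"theory_{theory_id}.yaml"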

This should make the syntax compatible with the one used in pinefarm and in validphys.

Create the absolute theory

Rather than trimming as proposed in #61 (or together with a trimming of unnecessary information), let's define here in pineko the absolute theory. As discussed during this Wednesday's code meeting, there is nothing in n3fit/vp that uses the theory card that is not also used by pineko (PTO being the only thing that is actually used, but, as mentioned, the order needs to be known by pineko to create the fktables/ekos).

Once we have an absolute theory we just swap the one in the nnpdf repository for whatever we decide here.

Make opcard template theory dependent

A practical limitation is the current pineko design, because pineko currently assumes a global operator_card_template:

operator_card_template = "data/operator_cards/_template.yaml"

  • which defines the interpolation_xgrid, i.e. the internal grid used to compute all operators
  • but it could even hold an inputgrid, which instead defines the grid exposed to the fit (consider that for eko input = fitting scale)
  • I guess we should change this design to make the template theory dependent, i.e. scope it by the theory ID like all the other stuff

Originally posted by @felixhekhorn in #42 (comment)

Interpolation

This is something that is not strictly related to pineko, but it involves all three projects pineappl, eko, and yadism, simply from the opposite end.

                    yadism
                 /          \
interpolation  --  pineappl  --  pineko
                 \          /
                     eko

Issue

At the moment eko, yadism and pineappl are all making use of interpolation, and in theory it's the same one, but in practice:

  • pineappl uses its own implementation, which has to be separately maintained and kept consistent (e.g. the weight function is present in pineappl, but this does not mean that it is automatically used in the other two projects)
  • eko has its own implementation as well
  • yadism makes use of the implementation in eko, but this yields a dependency on eko that conceptually is not needed; interpolation has to be implemented somewhere, and by chance it was already implemented in eko

Main motivation

During the yadism-pineappl integration we are acquiring a new dependency in yadism, and so we are thinking about dropping the dependency on eko, since without alpha_s (stripped by the pineappl grid filling) and without interpolation the two are going to be very loosely coupled, if coupled at all (there is just one residual small bit we could take care of on our own).

Proposal

Should we join the three dependencies in a single package?

Downsides

  1. it might be implemented in rust, i.e. it should be the pineappl one, since we are able to write python bindings from rust (as for pineappl, so in particular they are already available) but not the opposite (if it even made sense at all...), but then we don't know whether numba could complain about having rust in the game
     • this may also be just a worry, because maybe numba does not interact enough to complain, but maybe it does...
  2. Christopher, please add yours
     • I don't know how much effort it would be for you to split interpolation into a separate crate

More pre-commit

Taking some inspiration from poetry https://github.com/python-poetry/poetry/blob/master/.pre-commit-config.yaml I suggest updating our pre-commit config

  1. by opting into
      - id: check-merge-conflict
      - id: check-case-conflict
      - id: check-ast
      - id: check-docstring-first
    
  2. adding
     - repo: https://github.com/pre-commit/pre-commit
       rev: v2.20.0
       hooks:
         - id: validate_manifest
    
  3. opting into https://pre-commit.ci/ (with autofix_prs: false) to enforce people running pre-commit (right @andreab1997 ? 🙃 )

what do you think? @alecandido @andreab1997 @niclaurenti @giacomomagni

Improve Couplings treatment

We can improve the computation of the necessary couplings for PineAPPL here

pineko/src/pineko/evolve.py

Lines 181 to 188 in 63af934

sc = eko.couplings.Couplings(
    tcard.couplings,
    tcard.order,
    evmod,
    quark_masses,
    hqm_scheme=tcard.quark_masses_scheme,
    thresholds_ratios=np.power(list(iter(tcard.matching)), 2.0),
)

by

  • using the physical masses, i.e. if we want MSbar masses actually use them
  • determine the $n_f$ of the central scale and pass that as nf_to

Grids used for new theories are ordered in the wrong way

With the help of @scarlehoff, I found the problem with the new theories. Doing for example pineappl convolute BCDMS_NC_EM_D_F2.pineappl.lz4 NNPDF40_nnlo_as_01180 for both the grids used for theory 405 (theory 4 in dom) and theory 424 (theory 24 in dom), what you get is

bin     Q2           x          F2d      scale uncertainty
---+-----+-----+-----+-----+------------+--------+--------
  0  8.75  8.75  0.07  0.07 3.8331501e-1  -10.65%    5.72%
  1 10.25 10.25  0.07  0.07 3.8615391e-1   -7.25%    5.42%
  2 10.25 10.25   0.1   0.1 3.6238832e-1   -6.00%    3.72%
  3 11.75 11.75   0.1   0.1 3.6313899e-1   -5.42%    3.55%
  4 11.75 11.75  0.14  0.14 3.3887777e-1   -4.40%    1.97%
  5 11.75 11.75  0.18  0.18 3.1553538e-1   -3.66%    1.25%
  6 11.75 11.75 0.225 0.225 2.8649654e-1   -2.97%    2.22%
  7 13.25 13.25   0.1   0.1 3.6377078e-1   -5.18%    3.40%
  8 13.25 13.25  0.14  0.14 3.3826829e-1   -4.19%    1.89%
  9 13.25 13.25  0.18  0.18 3.1408120e-1   -3.47%    1.18%
 10 13.25 13.25 0.225 0.225 2.8441184e-1   -2.80%    2.11%
bin     Q2           x          F2d      scale uncertainty
---+-----+-----+-----+-----+------------+--------+--------
  0  8.75  8.75  0.07  0.07 3.7227338e-1  -11.74%    5.16%
  1 10.25 10.25  0.07  0.07 3.7563029e-1  -10.90%    4.83%
  2 10.25 10.25   0.1   0.1 3.5371954e-1  -10.01%    4.38%
  3 11.75 11.75   0.1   0.1 3.5478369e-1   -9.34%    4.16%
  4 13.25 13.25   0.1   0.1 3.5576112e-1   -8.75%    3.96%
  5 11.75 11.75  0.14  0.14 3.3177244e-1   -8.04%    5.73%
  6 13.25 13.25  0.14  0.14 3.3149059e-1   -7.52%    5.51%
  7    15    15  0.14  0.14 3.3118411e-1   -7.02%    5.29%
  8    17    17  0.14  0.14 3.3076224e-1   -6.59%    5.09%
  9    19    19  0.14  0.14 3.3045852e-1   -6.21%    4.92%
 10 11.75 11.75  0.18  0.18 3.0924455e-1   -6.60%    7.35%

So, as you can see, while the grid for theory 424 is ordered by q2 (and this is the correct way, since the commondata are ordered in the same way), the grid for theory 405 is ordered by x value. This problem then propagates to the FKs and then to the predictions, causing the problems we have seen.
@felixhekhorn @alecandido @cschwan Do you have some idea of why the two grids are ordered in different ways? Maybe something changed in the runcards?

FK Spec

Background

How the fktable is loaded to be used in the fit can be seen here, while the actual convolution is performed here and here.
Some variables that I use there and that I'm going to use to define the stuff below:

  • ndata: number of data points
  • nbasis: number of channels that are different from 0
  • nx: number of points in the grid in x

Input

I will ask for the fktable by giving the following information:

theoryid: 200
dataset_inputs:
    - {dataset: ATLAS_WZ_TOT_13TEV, cfac: [NRM, QCD]}

If you apply the c-factors for me it is much appreciated. Otherwise validphys will do so.
The way it works right now is that validphys checks whether the theoryid exists in <prefix>/share/NNPDF/data and if it doesn't it downloads the json to see whether it exists in the server.

If it exists it downloads the theory and then loops over all datasets opening the file, reading the data, applying the c-factors if needed, etc...

Actual specs

Once the theory has been downloaded and opened I need to get the following information:

  • fktable
    A tensor of shape (ndata, nbasis, nx, nx) for hadronic processes or (ndata, nbasis, nx) for DIS processes.
  • xgrid
    The grid in x. If/when pineappl tables use the same grid for all fktables this can come from somewhere else or be a pointer to some common place.
  • basis
    List of channels that are different from 0. For DIS fits this is a list of PDF flavours like (1, 2, 3, -3) and anything more complicated than that I guess would be silly. For hadronic fits you can decide how to give me the information. The way I use it is by creating a PDF x PDF = Luminosity tensor and then masking it with a boolean tensor of size (flavours, flavours).

Then some metadata that would make my life easier (but the information is contained in the objects above):

  • ndata: number of experimental data points
  • nx: number of points in the x-grid
  • nbasis: number of non-zero channels
  • hadronic: boolean flag telling me whether the process is hadronic or not

Note that I don't put any constraints on the type of the different objects. To actually operate with them I will turn them into tensorflow tensors, so numpy arrays are the most natural choice, but I can work with anything.
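
For illustration, a minimal numpy sketch of how the shapes above would be contracted (the function names are hypothetical, and c-factors and the basis mask are ignored):

import numpy as np

def convolute_dis(fktable, pdf):
    # fktable: (ndata, nbasis, nx), pdf: (nbasis, nx) evaluated on xgrid
    return np.einsum("dbx,bx->d", fktable, pdf)

def convolute_hadronic(fktable, lumi):
    # fktable: (ndata, nbasis, nx, nx), lumi: (nbasis, nx, nx) = PDF x PDF luminosity
    return np.einsum("dbxy,bxy->d", fktable, lumi)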

Wishlist

I'm not entirely sure you can compress the current .dat files much more than they currently are (I don't know how much effort was put into compressing them back in the day) but it would make me much happier if instead of downloading 3 GB to do a fit I could download just 1.

Scaffold from the theories repository

One thing that might be useful is to add an option to scaffold to download some relevant data from the theories repo.

So one can do:

pineko scaffold --theory 400

and it will automatically download the theory runcard for 400 (and put it in the right place) and the operator template.

In favour/against?

Conflicting numpy requirements

Compiling PineAPPL's Python interface I get the following error:

   Compiling pineappl v0.5.0-beta.6 (/home/cschwan/projects/pineappl/pineappl)
   Compiling pineappl_py v0.5.0-beta.6 (/home/cschwan/projects/pineappl/pineappl_py)
    Finished release [optimized] target(s) in 15.83s
📦 Built wheel for CPython 3.9 to /tmp/.tmppj76Gp/pineappl-0.5.0_beta.6-cp39-cp39-linux_x86_64.whl
⚠️  Warning: pip raised a warning running ["-m", "pip", "--disable-pip-version-check", "install", "--force-reinstall", "/tmp/.tmppj76Gp/pineappl-0.5.0_beta.6-cp39-cp39-linux_x86_64.whl"]:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
numba 0.55.1 requires numpy<1.22,>=1.18, but you have numpy 1.22.1 which is incompatible.
🛠  Installed pineappl-0.5.0-beta.6

Allow explicit assumptions in the theory

Just the same as it is allowed with convolute.
We need it for the theory that allows for c-cbar.

Btw, convolute doesn't work for me.

pineko convolute data/grids/200/E906deut_bin_09.pineappl.lz4 data/ekos/200/E906deut_bin_09.tar test 0 2                                                                           
┌───────────────┐
│ Computing ... │
└───────────────┘
   data/grids/200/E906deut_bin_09.pineappl.lz4
 + data/ekos/200/E906deut_bin_09.tar
 = test
 with max_as=0, max_al=2, xir=1.0, xif=1.0
Traceback (most recent call last):
  File "/home/juacrumar/.cache/pypoetry/virtualenvs/pinefarm-HwI1B8mU-py3.10/bin/pineko", line 8, in <module>
    sys.exit(command())
  File "/home/juacrumar/.cache/pypoetry/virtualenvs/pinefarm-HwI1B8mU-py3.10/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/juacrumar/.cache/pypoetry/virtualenvs/pinefarm-HwI1B8mU-py3.10/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/juacrumar/.cache/pypoetry/virtualenvs/pinefarm-HwI1B8mU-py3.10/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/juacrumar/.cache/pypoetry/virtualenvs/pinefarm-HwI1B8mU-py3.10/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/juacrumar/.cache/pypoetry/virtualenvs/pinefarm-HwI1B8mU-py3.10/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/juacrumar/.cache/pypoetry/virtualenvs/pinefarm-HwI1B8mU-py3.10/lib/python3.10/site-packages/pineko/cli/convolute.py", line 52, in subcommand
    _grid, _fk, comp = evolve.evolve_grid(
  File "/home/juacrumar/.cache/pypoetry/virtualenvs/pinefarm-HwI1B8mU-py3.10/lib/python3.10/site-packages/pineko/evolve.py", line 133, in evolve_grid
    alphas_values = [op["alphas"] for op in operators["Q2grid"].values()]
  File "/home/juacrumar/.cache/pypoetry/virtualenvs/pinefarm-HwI1B8mU-py3.10/lib/python3.10/site-packages/pineko/evolve.py", line 133, in <listcomp>
    alphas_values = [op["alphas"] for op in operators["Q2grid"].values()]
KeyError: 'alphas'

More Unit Tests

I know this was almost the title of #25, but many of those tests relied on LHAPDF, so they have been moved to benchmarks (not a great name at this point: they are simply not unit tests).

But what is left is largely insufficient, so we definitely need more unit tests. Even mocking, if needed.

We need to keep track of cfactors names

Since we have different names for the datasets (and for the operands when the observable has an operation) we need either to:

  1. Keep track of the names from the compound in the yaml database for loading the cfactors.
  2. Change the names of the cfactors. <--- We might want to do this in the future (since the cfactors should eventually match the pineappl names) but at the moment I think we need option 1 (or create a new theory in which all names are changed).

One practical example, our yaml entry for D0WMASY is:

appl: true
conversion_factor: 1.0
operands:
- - D0WMASY-grid-40-6-15-3-Wplus_wly_pt25
- - D0WMASY-grid-40-6-15-3-Wminus_wly_pt25
operation: ASY
target_dataset: D0WMASY

but the cfactor files for those two tables are: CF_QCD_D0WMASY_WM.dat and CF_QCD_D0WMASY_WP.dat

When the name happens to be the same it works.

Simplify `max_as/l` application

When max_as/l are specified to use grids at a lower order than available (e.g. computing an NLO theory from NNLO grids), everything might be done consistently by just cutting the grids during loading: everything else (checks and operations) will then be done adaptively according to the loaded grids, which will always be used completely.

This will have the advantage of only loading (or keeping in memory) the required subgrids, lowering the memory impact of Pineko in these cases, and the even greater advantage of simplifying the code, since there will be no need to separately consider the case of grids containing orders higher than needed. A sketch of the idea is given below.
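
A deliberately simplified sketch of the idea, only along the alpha_s direction (this is not the pineappl API; it assumes each subgrid order carries its as power as the first entry of a tuple):

def cut_orders(orders, max_as):
    """Keep only the first max_as QCD orders, counted from the lowest one present in the grid."""
    lo_as = min(o[0] for o in orders)
    return [o for o in orders if o[0] < lo_as + max_as]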

MHOU implement scheme B+C

We should implement schemes B and C of https://inspirehep.net/literature/1741422

  • scheme B: use fact scale variation from evolution (of course ren scale still in processes)
    • fact scale variation should be resummed (exponentiated) or not (expanded) according to the chosen evolution mode, see EKO docs
  • pineko needs to request for scheme B the central scale (and not the shifted one, as it is doing at the time of writing); effectively eko takes full care of the SV
  • scheme C: both scale variations coming from processes

Both schemes do not involve refitting (because they are defined this way).

SV check does not take into account the PTO

Currently our SV check does not take into account the fact that renormalization sv only starts at NLO, where NLO actually means the next order with respect to the first non-zero order of the process.
Therefore, it does not allow the computation of a sv FKtable for DIS when PTO=1 (which in the DIS case is the first non-zero order).

Check available ekos

At this point we start having a few ekos computed, and as we found out with @felixhekhorn computing one of them is quite an expensive task (at least for development, since it's blocking us).

Moreover, I believe that a few ekos can really be recycled, e.g. I think that most of the LHC pineappl grids will have the same x_grid and Q2_grid.

However, it is a bit annoying to check which one of the available ekos is the correct one, but we have all the tools required to automate the process.

Proposal

Let's make a trivial function (and maybe provide a subcommand) that, given a folder, explores the files inside (or even all the files in the tree) for ekos, and outputs all the compatible ones that have been found; a sketch is given below.
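
A minimal sketch of such a function (load_eko_grids is a hypothetical helper: anything able to read the x-grid and Q2-grid stored in an eko tarball would do):

import pathlib

def find_compatible_ekos(folder, x_grid, q2_grid, load_eko_grids):
    """Return the eko files under `folder` whose x-grid and Q2-grid match the requested ones."""
    compatible = []
    for path in pathlib.Path(folder).rglob("*.tar"):
        xg, q2g = load_eko_grids(path)
        if list(xg) == list(x_grid) and list(q2g) == list(q2_grid):
            compatible.append(path)
    return compatible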

Note

This is a step forward towards toolchain automation and assets management. I'm considering that maybe we'll simply need to progressively improve the tasks that we're doing manually at the moment, instead of designing an actual automation pipeline all at once.

Logo

I would put pineapple slips on top of this:

[image: a pine cone]

  • we can remove the eyes
  • we can just show the shadow

We can even avoid putting slips, and scramble colors: we push the top to greenish and the rest to oranges, to have a pine cone colored like a pineapple :)

Whenever I have spare time I'll provide candidates ^^

Missing metadata?

The following lines look a bit fishy:

pineko/src/pineko/evolve.py

Lines 186 to 194 in 60bb344

fktable.write_lz4(str(fktable_path))
# compare before/after
comparison = None
if comparison_pdf is not None:
    comparison = comparator.compare(
        grid, fktable, max_as, max_al, comparison_pdf, xir, xif
    )
    fktable.set_key_value("results_fk", comparison.to_string())
    fktable.set_key_value("results_fk_pdfset", comparison_pdf)

You add two key-value-pairs, but the FK table is written before, and that's the only call to write_lz4 in the project. That probably means that the metadata is never propagated to the file on disk.
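
A possible fix, sketched directly from the snippet above: attach the metadata first and write the FK table only afterwards.

comparison = None
if comparison_pdf is not None:
    comparison = comparator.compare(
        grid, fktable, max_as, max_al, comparison_pdf, xir, xif
    )
    fktable.set_key_value("results_fk", comparison.to_string())
    fktable.set_key_value("results_fk_pdfset", comparison_pdf)
# write only after the key-value pairs have been set
fktable.write_lz4(str(fktable_path))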
