nnpdf / pinefarm
Generate PineAPPL grids from PineCards
Home Page: https://pinefarm.readthedocs.io
License: GNU General Public License v3.0
https://github.com/NNPDF/runcards/blob/9c1f06fb6c2e5a439e54b7e8af586d9c12d7ad6f/pyproject.toml#L30
This is connected to this issue NNPDF/pineline#15
I can install lz4 v4.0.2 perfectly fine, but version 3. errors out (I think because it tries to look for Python.h in the wrong place, so it is clearly their fault, also because they fixed it, even if I could've helped it a bit). But since I needed to install python3.10 somewhere else, it doesn't work.
Hi @cschwan, @felixhekhorn. After installing pinefarm with pip (pip install pinefarm) in a fresh conda environment, I tried to get the list of pinecards and theories. In both cases the command fails:
$ pinefarm list theories
/store/DAMTP/mnc33/miniconda3/envs/pinefarm/lib/python3.9/site-packages/pinefarm/cli/_base.py:26: UserWarning: No configuration file detected.
warnings.warn("No configuration file detected.")
/store/DAMTP/mnc33/miniconda3/envs/pinefarm/lib/python3.9/site-packages/pinefarm/configs.py:82: UserWarning: Using default minimal configuration ('root = $PWD').
warnings.warn("Using default minimal configuration ('root = $PWD').")
As we decided with @felixhekhorn, @cschwan and @scarlehoff, we will stop storing the pinecard version in the grid (or maybe make it optional?), and we will write the full tar-gzipped pinecard (the folder) in the metadata, encoded as a base64 string, which is a rather common encoding.
PineAPPL will provide support for extracting the tarball from the metadata (i.e. decode base64 to bytes; a redirect should do the rest of the job, I guess).
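A minimal sketch of the round trip described above, using only the Python standard library. The function names encode_pinecard and decode_pinecard are hypothetical, and how the string is actually stored in (or read from) the grid metadata is left out, since that part is up to PineAPPL.

```python
import base64
import io
import tarfile
from pathlib import Path


def encode_pinecard(folder: Path) -> str:
    """Pack a folder as tar.gz in memory and return it as a base64 ASCII string."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        # Store the folder under its own name, not its absolute path.
        tar.add(str(folder), arcname=folder.name)
    return base64.b64encode(buffer.getvalue()).decode("ascii")


def decode_pinecard(encoded: str, destination: Path) -> None:
    """Decode the base64 string back to bytes and unpack the tarball."""
    raw = base64.b64decode(encoded)
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:gz") as tar:
        tar.extractall(str(destination))
```

The encoded string is plain ASCII, so it can be stored wherever a metadata value accepts text.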
At least on dom, on the DIS-more branch, I get:
Error processing line 1 of /home/cschwan/.local/lib/python3.10/site-packages/zzz_poetry_dynamic_versioning.pth:
Traceback (most recent call last):
File "/usr/lib/python3.10/site.py", line 186, in addpackage
exec(line)
File "<string>", line 1, in <module>
File "/home/cschwan/.local/lib/python3.10/site-packages/poetry_dynamic_versioning/__init__.py", line 13, in <module>
import tomlkit
ModuleNotFoundError: No module named 'tomlkit'
See the pending review on https://github.com/NNPDF/pinefarm/pull/23/files/177c81e1ac686f608970ba1cb79cad93e42f8520.
Originally posted by @alecandido in NNPDF/pinecards#152 (comment)
In Madgraph5_aMC@NLO v3.3.1 the cutting routine was changed, so that pjet isn't calculated in the place where we use it anymore. We either have to 1) run the jet algorithm ourselves or 2) move jet-related custom cuts from passcuts_user to passcuts_jets in cuts.f.
I thought a bit about this idea, and I've come up with a proposal.
I want to split the pinefarm package into two different ones (but distributed together, like eko and ekobox), to put a boundary between the two.
One will be the current UI, with the CLI and all the tools (installation, configs, ...).
The other will contain mostly run.py and external/, and it will contain whatever is strictly related to the computation itself.
Of course, a new package needs a new name; the best I came up with is pinefarmer. Alternatives are welcome.
So, pinefarm will do everything it is doing now, plus managing the container as well.
pinefarmer, instead, will be installed inside the container; it will accept a minimal input from the outside and perform the actual grid computation.
Then, there is the problem of tooling for containers.
There are two main container engines for our purposes: Docker (by Docker) and Podman (by RedHat). Docker is more or less the first and most popular one, while Podman arrived later on.
There are a few more engines, and in general more complications of many kinds (orchestration, runtimes, ...), mainly because cloud computing is a big market. We are not really interested in cloud computing at the moment, we just want to take a tool from there, but in case you struggle with the vocabulary, RedHat provides a good summary.
Disclaimer: I dislike Docker, at this point also for historical reasons, a few of which still apply, but not all of them. I might be biased.
The main difference between Docker and Podman (besides the companies behind them) is that the first requires a daemon to run (dockerd), while the second is daemon-less. In the old days (i.e. a couple of years ago, at most) dockerd had to run as the root user; now they also provide a root-less option (but, if I understood correctly, it is not the default one).
More details on this RedHat page.
This reason for me was sufficient to choose Podman: I could simply do apt install podman, and then use the CLI:
podman pull <container-image>
podman run <container-image>
podman ps # show active containers
...
nothing more.
Now, I'd like not to rely on the CLI being available and, if possible, not to rely on anything else being installed either. I would prefer that everything is installed by the pinefarm package installation, as a Python dependency, but I'm not sure I can do that.
Docker would require at least the daemon installation; Podman might require nothing.
On the other hand, I'd like to have a ready-to-use Python package, and Docker has it -> docker-py. While Docker is not traditionally open source (even if it released and even "donated" some code after some time), this package is. Podman also has a Python package -> podman-py, but it is much less popular and less maintained. However, it mostly contains bindings for the Podman REST API, so we could skip the package and send requests to the API directly. But this would require a service to run (and so the Python package requires it as well), so in the end it is not much different from root-less Docker...
They both have a library that is directly accessible, to make use of the same functions of the CLI, so (at least for Podman) it wouldn't require a service to run. But, of course, they are Go libraries:
The Docker one most likely corresponds to the Python package (which I expect to be bindings to it), but it might require the daemon anyhow. I also expect you do not want to move pinefarm to Go...
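For comparison, this is roughly what the plain CLI fallback could look like from Python: look for an engine executable and build the run command. It is only a sketch of the option that still relies on the CLI. The helper names find_engine and run_command are hypothetical, and preferring Podman over Docker is just my assumption here.

```python
import shutil
from typing import Callable, List, Optional, Sequence


def find_engine(
    candidates: Sequence[str] = ("podman", "docker"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first container engine found on PATH, or None."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None


def run_command(engine: str, image: str, *args: str) -> List[str]:
    """Build the argument list to run an image and remove the container afterwards."""
    return [engine, "run", "--rm", image, *args]
```

The resulting list can be passed to subprocess.run; the which argument is injectable only to make the lookup easy to test.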
It seems that to clone a repository from GitHub now you always need to set credentials, whatever URL you are using.
So, I'm proposing to replace repositories (which in general are also overkill) with the master/main branch content, i.e. instead of cloning with Git we will just download the zip content of one branch from GitHub.
E.g. for this repository the corresponding URL is https://github.com/NNPDF/runcards/archive/refs/heads/master.zip
The other option is to use the latest release, e.g. for PineAPPL the URL to the zip of the last release can be found with a request to:
https://api.github.com/repos/n3pdf/pineappl/releases/latest
(the key in the response is actually zipball_url). Once we have the URL, one more GET and we'll have the zip as well (for the tarball, replace zip with tar in the key).
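Both options can be sketched with a few small helpers. The function names are hypothetical, and the actual download (e.g. with urllib.request.urlopen) is deliberately left out, so that nothing here touches the network.

```python
import json


def branch_zip_url(owner: str, repo: str, branch: str = "master") -> str:
    """URL of the zip archive for the head of a branch."""
    return f"https://github.com/{owner}/{repo}/archive/refs/heads/{branch}.zip"


def latest_release_api_url(owner: str, repo: str) -> str:
    """GitHub API endpoint describing the latest release."""
    return f"https://api.github.com/repos/{owner}/{repo}/releases/latest"


def archive_url_from_release(response_body: str, kind: str = "zip") -> str:
    """Pick zipball_url (kind='zip') or tarball_url (kind='tar') from the API response."""
    release = json.loads(response_body)
    return release[f"{kind}ball_url"]
```

With these, the first option is a single GET on branch_zip_url(...), while the second needs one GET on the API endpoint and a second one on the URL it returns.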
Which one do you prefer?
master/main: always updated
TODO:
pygit2 dependency, if not used for anything left
3.4.1, see NNPDF/pinecards#142)
At the moment, in runcardsrunner I'm printing (w/ or w/o rich, even inconsistently...), so I should move to logging (using rich as handler, it will be automatically consistent).
Most likely log.py will have to be updated accordingly (e.g. Tee).
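A possible shape for this, as a sketch only: one logger with a console handler (rich's RichHandler when rich is installed, a plain StreamHandler otherwise) plus a file handler that plays the role Tee has now. The function name setup_logger and the logger name "runner" are hypothetical.

```python
import logging
from pathlib import Path

try:
    # RichHandler gives consistent, formatted console output.
    from rich.logging import RichHandler

    console_handler: logging.Handler = RichHandler()
except ImportError:
    # Fall back to the standard library if rich is not installed.
    console_handler = logging.StreamHandler()


def setup_logger(logfile: Path, name: str = "runner") -> logging.Logger:
    """Send every record both to the console and to a log file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicated output on repeated calls
    logger.addHandler(console_handler)
    logger.addHandler(logging.FileHandler(str(logfile)))
    return logger
```

Every print call would then become logger.info(...) or similar, and the file copy comes for free.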
Thanks @scarlehoff for noticing
The CKM matrix should be a list of numbers; however, it is declared as a string.
Personally I'd put a list everywhere. Is there a reason why a string would be preferred?
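Until the cards are changed, a reader could accept both forms. This is a sketch under the assumption that the string form is a whitespace-separated sequence of the nine matrix elements; parse_ckm is a hypothetical helper and the values used in the usage note are placeholders, not physical ones.

```python
from typing import List, Sequence, Union


def parse_ckm(value: Union[str, Sequence[float]]) -> List[float]:
    """Return the CKM matrix as a flat list of 9 floats.

    Accepts either a whitespace-separated string (assumed format)
    or an already parsed sequence of numbers.
    """
    if isinstance(value, str):
        value = value.split()
    ckm = [float(element) for element in value]
    if len(ckm) != 9:
        raise ValueError(f"expected 9 CKM elements, got {len(ckm)}")
    return ckm
```

For instance, parse_ckm("1 0 0 0 1 0 0 0 1") and parse_ckm([1, 0, 0, 0, 1, 0, 0, 0, 1]) give the same list.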
The README isn't up-to-date and in particular it doesn't answer the important question: how do I run a process?
I'm reopening this PR as I think we should impose integrability on \Delta f(x) and not x \Delta f(x) as currently implemented.
See #58
Originally posted by @giacomomagni in #57 (comment)
@niclaurenti got hit by the following bug:
in https://github.com/NNPDF/runcards/blob/7f11afce4242791acad47d4c7be393e629b5121d/pinefarm/external/interface.py#L114 we require that we are inside an actual repo (as would be the case in development mode, i.e. when cloning this repo). When this is not the case, as can happen e.g. upon pip install pineline[full] and adding some standalone cards, an exception is raised (@niclaurenti, if you still have it, please paste the full error here).
I guess we should catch that error and, in that case, leave the field empty.
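The "catch and leave empty" behaviour could look like the sketch below. Note that it calls the git command line through subprocess, which is a swapped-in technique for illustration and not necessarily what the linked code does; repo_version is a hypothetical helper name.

```python
import subprocess
from pathlib import Path


def repo_version(path: Path) -> str:
    """Return the current commit hash of the repo at path, or '' if unavailable."""
    try:
        result = subprocess.run(
            ["git", "-C", str(path), "rev-parse", "HEAD"],
            check=True,
            capture_output=True,
            text=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        # git is not installed, or path is not inside a repository:
        # leave the field empty instead of failing the whole run.
        return ""
    return result.stdout.strip()
```

With this, standalone cards outside any repository would simply get an empty version field.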
Anything other than protons yields a wrong convolution:
https://github.com/NNPDF/runcards/blob/master/runcardsrunner/table.py#L33-L35
This should read out the metadata initial_state_1 and initial_state_2 and call Grid::convolute_with_two if the initial states are different.
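The decision itself can be sketched independently of PineAPPL. Here the metadata is assumed to be available as a dictionary of strings holding PDG ids, and a missing key is assumed to mean a proton (2212); both are assumptions, and the helper names are hypothetical. The actual convolution call is left out on purpose.

```python
from typing import Mapping, Tuple

PROTON = 2212  # PDG id of the proton, used as assumed default


def initial_states(metadata: Mapping[str, str]) -> Tuple[int, int]:
    """Read the PDG ids of the two initial states from the grid metadata."""
    first = int(metadata.get("initial_state_1", PROTON))
    second = int(metadata.get("initial_state_2", PROTON))
    return first, second


def needs_two_pdfs(metadata: Mapping[str, str]) -> bool:
    """True if the two initial states differ, so two PDFs are required."""
    first, second = initial_states(metadata)
    return first != second
```

For a proton-antiproton grid (2212 and -2212) needs_two_pdfs is true, so the table code should take the two-PDF branch.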
Hi @cschwan, I am trying to reproduce one of the pinecards using pinefarm. For ATLAS_TTB_13TEV_TOT (as well as for others) I get this error, which seems related to pineappl:
INFO:
INFO: Checking test output:
INFO: P0_gg_ttx
INFO: Result for test_ME:
Command "launch auto " interrupted with error:
FileNotFoundError : [Errno 2] No such file or directory: '/store/DAMTP/mnc33/Projects_store/PhD/nnpdf40_pheno/pinefarm_runs/results/200-ATLAS_TTB_13TEV_TOT--20231113103915/ATLAS_TTB_13TEV_TOT/SubProcesses/P0_gg_ttx/test_ME.log'
Please report this bug on https://bugs.launchpad.net/mg5amcnlo
More information is found in '/store/DAMTP/mnc33/Projects_store/PhD/nnpdf40_pheno/pinefarm_runs/results/200-ATLAS_TTB_13TEV_TOT--20231113103915/ATLAS_TTB_13TEV_TOT/run_01_tag_1_debug.log'.
Please attach this file to your report.
INFO:
quit
INFO:
quit
quit
Error calling StartServiceByName for org.freedesktop.Notifications: Timeout was reached
Traceback (most recent call last):
File "/store/DAMTP/mnc33/miniconda3/envs/pinefarm/bin/pinefarm", line 8, in <module>
sys.exit(command())
File "/store/DAMTP/mnc33/miniconda3/envs/pinefarm/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/store/DAMTP/mnc33/miniconda3/envs/pinefarm/lib/python3.9/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/store/DAMTP/mnc33/miniconda3/envs/pinefarm/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/store/DAMTP/mnc33/miniconda3/envs/pinefarm/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/store/DAMTP/mnc33/miniconda3/envs/pinefarm/lib/python3.9/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/store/DAMTP/mnc33/miniconda3/envs/pinefarm/lib/python3.9/site-packages/pinefarm/cli/run.py", line 31, in subcommand
main(dataset, theory_card, pdf)
File "/store/DAMTP/mnc33/miniconda3/envs/pinefarm/lib/python3.9/site-packages/pinefarm/cli/run.py", line 67, in main
run_dataset(runner)
File "/store/DAMTP/mnc33/miniconda3/envs/pinefarm/lib/python3.9/site-packages/pinefarm/cli/run.py", line 122, in run_dataset
runner.generate_pineappl()
File "/store/DAMTP/mnc33/miniconda3/envs/pinefarm/lib/python3.9/site-packages/pinefarm/external/mg5/__init__.py", line 184, in generate_pineappl
grid = pineappl.grid.Grid.read(mg5_grids[0])
IndexError: list index out of range
Thanks for using LHAPDF 6.4.0. Please make sure to cite the paper:
Eur.Phys.J. C75 (2015) 3, 132 (http://arxiv.org/abs/1412.7420)
Please comment with other problems you find.
As we have different MCs available as back-ends (at the moment mg5 and yadism), we should add a conversion back-end powered by pineappl conversion scripts.
Indeed, we are not able to produce all of the grids needed (and we won't be for quite some time), as some of them are the result of MC runs with some non-publicly available MC.
In these cases we're kindly given the runcards, so we should download them from somewhere else (or have the user running rr download them), and then convert them to pineappl.
(env) cschwan@montblanc ~ $ pinefarm run ATLAS_1JET_8TEV_R06 pinefarm/extras/theories/theory_200_1.yaml
/home/cschwan/runcards/env/lib/python3.8/site-packages/pinefarm/cli/_base.py:24: UserWarning: No configuration file detected.
warnings.warn("No configuration file detected.")
/home/cschwan/runcards/env/lib/python3.8/site-packages/pinefarm/configs.py:81: UserWarning: Using default minimal configuration ('root = $PWD').
warnings.warn("Using default minimal configuration ('root = $PWD').")
ATLAS_1JET_8TEV_R06
Computing ATLAS_1JET_8TEV_R06...
✓ Found pineappl
Installing...
Traceback (most recent call last):
File "/home/cschwan/runcards/env/bin/pinefarm", line 8, in <module>
sys.exit(command())
File "/home/cschwan/runcards/env/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/home/cschwan/runcards/env/lib/python3.8/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/home/cschwan/runcards/env/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/cschwan/runcards/env/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/cschwan/runcards/env/lib/python3.8/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/home/cschwan/runcards/env/lib/python3.8/site-packages/pinefarm/cli/run.py", line 30, in subcommand
main(dataset, theory_card, pdf)
File "/home/cschwan/runcards/env/lib/python3.8/site-packages/pinefarm/cli/run.py", line 65, in main
install_reqs(runner, pdf)
File "/home/cschwan/runcards/env/lib/python3.8/site-packages/pinefarm/cli/run.py", line 84, in install_reqs
runner.install()
File "/home/cschwan/runcards/env/lib/python3.8/site-packages/pinefarm/external/mg5/__init__.py", line 47, in install
install.mg5amc()
File "/home/cschwan/runcards/env/lib/python3.8/site-packages/pinefarm/install.py", line 75, in mg5amc
shutil.move(el, dest)
File "/usr/lib/python3.8/shutil.py", line 787, in move
real_dst = os.path.join(dst, _basename(src))
File "/usr/lib/python3.8/shutil.py", line 750, in _basename
return os.path.basename(path.rstrip(sep))
AttributeError: 'PosixPath' object has no attribute 'rstrip'
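The traceback ends in shutil.move being called with a PosixPath source. Before Python 3.9 shutil.move did not accept path-like objects there, which matches the Python 3.8 paths above. A possible workaround, sketched with a hypothetical helper move_into, is to convert both arguments to strings before the call:

```python
import shutil
from pathlib import Path


def move_into(src: Path, dest: Path) -> Path:
    """Move src to dest, converting to str so it also works on Python 3.8."""
    # shutil.move returns the final destination path.
    return Path(shutil.move(str(src), str(dest)))
```

On Python 3.9 and later the conversion is harmless, so the same call works everywhere.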
Starting with the Python implementation of the runner we have to specify a theory whenever we want to generate a grid, for instance
./rr run TEST_RUN_SH theories/theory_200.yaml
I think we should reflect this in the filename of the generated grid, so in this instance we should generate TEST_RUN_SH_T200.pineappl.lz4, which tells us the grid was generated with theory 200.
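The naming scheme is simple enough to sketch. The helper names are hypothetical, and extracting the theory id from the file name assumes that theory cards are always called theory_<id>.yaml, possibly followed by a suffix such as _1.

```python
import re
from pathlib import Path


def theory_id_from_path(theory_card: str) -> int:
    """Extract the numeric theory id from a path like theories/theory_200.yaml."""
    match = re.match(r"theory_(\d+)", Path(theory_card).stem)
    if match is None:
        raise ValueError(f"cannot infer theory id from {theory_card!r}")
    return int(match.group(1))


def grid_filename(dataset: str, theory_id: int) -> str:
    """Name of the generated grid, including the theory it was computed with."""
    return f"{dataset}_T{theory_id}.pineappl.lz4"
```

For the command above, grid_filename("TEST_RUN_SH", theory_id_from_path("theories/theory_200.yaml")) gives TEST_RUN_SH_T200.pineappl.lz4.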
Furthermore, we need to discuss
Further steps breakout:
I am running pinefarm (installed with pip) as follows:
pinefarm run runcards/ATLAS_TTB_13TEV_TOT theory_200_1.yaml
The command is executed in a fresh conda environment in which I have only installed lhapdf (but not pineappl).
The run crashes in the following way, so it seems that pinefarm is not installing PineAPPL on the fly as stated in the documentation (https://pinefarm.readthedocs.io/en/latest/run.html):
Error detected in "launch auto "
write debug file /store/DAMTP/mnc33/Projects_store/PhD/nnpdf40_pheno/pinefarm_runs/results/200-ATLAS_TTB_13TEV_TOT--20231115093751/ATLAS_TTB_13TEV_TOT/run_01_tag_1_debug.log
If you need help with this issue, please, contact us on https://answers.launchpad.net/mg5amcnlo
str : No valid pineappl installation found.
Please set the path to pineappl-config by using
MG5_aMC> set <absolute-path-to-pineappl>/bin/pineappl-config
Moreover, the .prefix/bin folder generated in the same folder in which I run pinefarm is empty.
Something that isn't possible with our toolchain is to generate two separate distributions, because all grids are merged together in the end. For the CDF W-boson mass grids that would be beneficial and also for some of our top-pair production grids. We could distinguish distributions that need to be merged together and those which are separate by looking at the names of each histogram: merge grids together if their histograms have the same name, leave them alone if their names differ. Another concern is the metadata, for which we'll need as many files as there are different histograms.
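The grouping rule can be sketched without touching the grids themselves: collect the grid files by histogram name, then merge only within each group. The input format (pairs of histogram name and grid path) and the helper name group_by_histogram are assumptions made for illustration; the merging step is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_histogram(grids: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group grid paths by histogram name; each group becomes one distribution."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for histogram_name, grid_path in grids:
        groups[histogram_name].append(grid_path)
    return dict(groups)
```

Each key of the result would then correspond to one output grid and one metadata file.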