
n3pdf / pdfflow


PDFflow is a parton distribution function interpolation library written in Python and based on the TensorFlow framework.

Home Page: https://pdfflow.readthedocs.io

License: Apache License 2.0

Python 96.89% CMake 0.92% Makefile 0.25% C 1.30% Fortran 0.64%
hep-ex hep-ph hep-th parton-distribution-functions pdf pdf-interpolation python tensorflow

pdfflow's People

Contributors

marcorossi5, scarlehoff, scarrazza

pdfflow's Issues

Add a point about disabling logs in the docs

As per issue #33, the docs should mention that the logs can be disabled when using pdfflow:

import logging
logger_pdfflow = logging.getLogger('pdfflow')
logger_pdfflow.setLevel(logging.WARNING)

It might make sense to also expose this as an environment variable, because when pdfflow is used from C, Fortran, etc., changing the log level won't be this easy.
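A minimal sketch of how the environment-variable idea could look. The variable name `PDFFLOW_LOG_LEVEL` and the digit-to-level mapping are assumptions made for illustration; nothing in this issue fixes either of them.

```python
import logging
import os

# Hypothetical variable name and value convention, chosen for illustration only.
ENV_VAR = "PDFFLOW_LOG_LEVEL"
LEVELS = {
    "0": logging.ERROR,
    "1": logging.WARNING,
    "2": logging.INFO,
    "3": logging.DEBUG,
}


def level_from_env(environ=os.environ, default=logging.INFO):
    """Translate the environment variable into a logging level."""
    return LEVELS.get(environ.get(ENV_VAR, ""), default)


def configure_pdfflow_logger(environ=os.environ):
    """Apply the level to the 'pdfflow' logger and return the logger."""
    logger = logging.getLogger("pdfflow")
    logger.setLevel(level_from_env(environ))
    return logger
```

With something like this in place, a C or Fortran caller would only need to export the variable before starting the program, without touching any Python code.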

PDF sets management

At the moment we rely on LHAPDF for the whole server/set/organization management. I don't think we really want to reimplement the whole thing, but it would be nice to have at least some feature that can download PDFs to /usr/share/pdfflow or even /usr/share/LHAPDF, so that the library can work stand-alone.
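As a rough sketch of the lookup half of such a feature, the helper below searches a list of directories for an installed set. It assumes the LHAPDF-style layout in which a set `NAME` lives in `NAME/NAME.info`; the override variable `PDFFLOW_DATA_PATH` and both function names are hypothetical. The download step is deliberately left out.

```python
import os
from pathlib import Path


def candidate_dirs(environ=os.environ):
    """Directories to search, in priority order (names are illustrative)."""
    dirs = []
    # Hypothetical override variable; entries are separated like PATH.
    for entry in environ.get("PDFFLOW_DATA_PATH", "").split(os.pathsep):
        if entry:
            dirs.append(Path(entry))
    # The two locations mentioned in the issue.
    dirs += [Path("/usr/share/pdfflow"), Path("/usr/share/LHAPDF")]
    return dirs


def find_pdf_set(name, dirs):
    """Return the first directory containing <name>/<name>.info, else None."""
    for base in dirs:
        if (base / name / f"{name}.info").is_file():
            return base / name
    return None
```

A stand-alone mode could call `find_pdf_set(name, candidate_dirs())` first and only fall back to downloading when it returns `None`.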

Issue running PDFflow in mac

Originally from: NNPDF/nnpdf#2033


When running for instance: pytest -v test_hyperopt.py::test_restart_from_pickle in nnpdf.

(n3fit runs fine though, on the basic runcard)

I get the error:

[INFO]: All requirements processed and checked successfully. Executing actions.
[INFO] (pdfflow.pflow) Loading member 0 from NNPDF40_nnlo_as_01180
[INFO]: Loading member 0 from NNPDF40_nnlo_as_01180
[CRITICAL]: Bug in n3fit ocurred. Please report it.
Traceback (most recent call last):
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/n3fit/src/n3fit/scripts/n3fit_exec.py", line 332, in run
    super().run()
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/app.py", line 151, in run
    super().run()
  File "/Users/aronjansen/.pyenv/versions/3.9.4/lib/python3.9/site-packages/reportengine/app.py", line 380, in run
    rb.execute_sequential()
  File "/Users/aronjansen/.pyenv/versions/3.9.4/lib/python3.9/site-packages/reportengine/resourcebuilder.py", line 166, in execute_sequential
    result = self.get_result(callspec.function,
  File "/Users/aronjansen/.pyenv/versions/3.9.4/lib/python3.9/site-packages/reportengine/resourcebuilder.py", line 175, in get_result
    fres =  function(**kwdict)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/covmats.py", line 253, in dataset_t0_predictions
    return central_predictions(dataset, t0set).to_numpy().reshape(-1)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/convolution.py", line 233, in central_predictions
    return _predictions(dataset, pdf, central_fk_predictions)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/convolution.py", line 166, in _predictions
    all_predictions.append(fkfunc(fk_w_cuts, pdf))
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/convolution.py", line 302, in central_fk_predictions
    return central_hadron_predictions(loaded_fk, pdf)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/convolution.py", line 412, in central_hadron_predictions
    return _gv_hadron_predictions(loaded_fk, gv)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/convolution.py", line 335, in _gv_hadron_predictions
    gv1 = gv1func(qmat=[Q], vmat=FK_FLAVOURS, xmat=xgrid).squeeze(-1)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/pdfbases.py", line 308, in central_grid_values
    return self.apply_grid_values(func, vmat, xmat, qmat)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/pdfbases.py", line 422, in apply_grid_values
    gv = func(flmat, xmat, qmat)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/gridvalues.py", line 114, in central_grid_values
    return _grid_values(pdf.load_t0(), flmat, xmat, qmat)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/gridvalues.py", line 57, in _grid_values
    return lpdf.grid_values(flmat, xmat, qmat)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/lhapdfset.py", line 116, in grid_values
    raw = np.array([member.xfxQ(flavors, xarr, qarr) for member in self.members]).swapaxes(1, 2)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/lhapdfset.py", line 116, in <listcomp>
    raw = np.array([member.xfxQ(flavors, xarr, qarr) for member in self.members]).swapaxes(1, 2)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/lhapdf_compatibility.py", line 84, in xfxQ
    ret_dict = self.xfxQ(b, c)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/lhapdf_compatibility.py", line 80, in xfxQ
    return self._xfxQ_all_pid(a, b)
  File "/Users/aronjansen/Dropbox/eScience/projects/protonStructure/nnpdfgit/nnpdf/validphys2/src/validphys/lhapdf_compatibility.py", line 66, in _xfxQ_all_pid
    res = self.pdf.py_xfxQ2_allpid(x, q**2).numpy()
  File "/Users/aronjansen/.pyenv/versions/3.9.4/lib/python3.9/site-packages/pdfflow/pflow.py", line 433, in py_xfxQ2_allpid
    return self.xfxQ2_allpid(a_x, a_q2)
  File "/Users/aronjansen/.pyenv/versions/3.9.4/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/aronjansen/.pyenv/versions/3.9.4/lib/python3.9/site-packages/pdfflow/pflow.py", line 394, in xfxQ2_allpid
    return self.xfxQ2(pid, a_x, a_q2)
  File "/Users/aronjansen/.pyenv/versions/3.9.4/lib/python3.9/site-packages/pdfflow/pflow.py", line 350, in xfxQ2
    f_f = self._xfxQ2(pid_idx, a_x, a_q2)
  File "/Users/aronjansen/.pyenv/versions/3.9.4/lib/python3.9/site-packages/pdfflow/pflow.py", line 266, in _xfxQ2
    res += subgrid(shape, a_q2, pids=u, arr_x=a_x)
  File "/Users/aronjansen/.pyenv/versions/3.9.4/lib/python3.9/site-packages/pdfflow/subgrid.py", line 181, in __call__
    result = self.fn_interpolation(
  File "/Users/aronjansen/.pyenv/versions/3.9.4/lib/python3.9/site-packages/pdfflow/functions.py", line 161, in first_subgrid
    if tf.size(f_idx) != 0:
tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: Using a symbolic `tf.Tensor` as a Python `bool` is not allowed. You can attempt the following resolutions to the problem: If you are running in Graph mode, use Eager execution mode or decorate this function with @tf.function. If you are using AutoGraph, you can try decorating this function with @tf.function. If that does not work, then you may be using an unsupported feature or your source code may not be visible to AutoGraph. See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/autograph/g3doc/reference/limitations.md#access-to-source-code for more information.

Benchmarks

Besides having a precision benchmark against LHAPDF, it would be interesting to know how it compares in CPU time (and, of course, GPU time).
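A timing comparison could start from a small harness like the one below. It is only a sketch: `time_call` is a hypothetical helper, and the callable passed to it stands in for whatever LHAPDF or pdfflow evaluation is being measured.

```python
import statistics
import time


def time_call(func, *args, repeats=10):
    """Run func(*args) `repeats` times; return (mean, stdev) of wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        samples.append(time.perf_counter() - start)
    # A sample standard deviation needs at least two measurements.
    spread = statistics.stdev(samples) if repeats > 1 else 0.0
    return statistics.mean(samples), spread


if __name__ == "__main__":
    # Stand-in workload; replace with the library call under test.
    mean, spread = time_call(sum, range(100_000), repeats=5)
    print(f"{mean:.6f} s +/- {spread:.6f} s")
```

For GPU measurements the callable should return only once the result has actually been computed, otherwise the timer may stop too early.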

Multimember execution is resource expensive

I'm not sure how to get around this because multi-member means, by nature, that many grids have to be loaded in memory at once (if we want to run the interpolation of all of them). As a result, the graph is humongous.

Maybe it actually makes sense to fall back to eager mode in this case, or to tell the user that it is an option.
Admittedly, the usual scenario for an MC where vectorization is important is to ask for many values of x (and maybe q) at once. Asking for many members (order 100) for a few values of x and q is quick even sequentially, so run_eager seems a reasonable option in this case.

Add example folder

With several usage examples

  • Tabulation example
  • Python integration example
  • Example with VegasFlow

Dependency issue

It seems that one of your dependencies has been updated in a non-backward-compatible way, namely cloudpickle through tensorflow_probability.

Traceback
ImportError while importing test module '/home/alessandro/projects/N3PDF/pdfflow/src/pdfflow/tests/test_pflow.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
src/pdfflow/tests/test_pflow.py:6: in <module>
    from pdfflow.pflow import mkPDF
src/pdfflow/pflow.py:18: in <module>
    from pdfflow.subgrid import Subgrid
src/pdfflow/subgrid.py:12: in <module>
    from pdfflow.functions import inner_subgrid
src/pdfflow/functions.py:32: in <module>
    from pdfflow.region_interpolator import interpolate
src/pdfflow/region_interpolator.py:7: in <module>
    from pdfflow.neighbour_knots import four_neighbour_knots
src/pdfflow/neighbour_knots.py:7: in <module>
    import tensorflow_probability as tfp
env/lib/python3.8/site-packages/tensorflow_probability/__init__.py:76: in <module>
    from tensorflow_probability.python import *  # pylint: disable=wildcard-import
env/lib/python3.8/site-packages/tensorflow_probability/python/__init__.py:23: in <module>
    from tensorflow_probability.python import distributions
env/lib/python3.8/site-packages/tensorflow_probability/python/distributions/__init__.py:88: in <module>
    from tensorflow_probability.python.distributions.pixel_cnn import PixelCNN
env/lib/python3.8/site-packages/tensorflow_probability/python/distributions/pixel_cnn.py:37: in <module>
    from tensorflow_probability.python.layers import weight_norm
env/lib/python3.8/site-packages/tensorflow_probability/python/layers/__init__.py:31: in <module>
    from tensorflow_probability.python.layers.distribution_layer import CategoricalMixtureOfOneHotCategorical
env/lib/python3.8/site-packages/tensorflow_probability/python/layers/distribution_layer.py:28: in <module>
    from cloudpickle.cloudpickle import CloudPickler
    ImportError: cannot import name 'CloudPickler' from 'cloudpickle.cloudpickle' (/home/alessandro/projects/N3PDF/pdfflow/env/lib/python3.8/site-packages/cloudpickle/cloudpickle.py)

I believe I have made all the required checks, and I hope this is not a mistake on my side, although that is still possible.

However, for the sake of reporting:

  • I freshly installed the bare tensorflow system-wide just before the installation (I had no previous installation at all)
  • everything else has been installed by setup.py into the environment

The minimal thing I would suggest is to specify in setup.py the version restrictions for the dependencies you are using. If you need any more info from my side, let me know.
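Before pinning anything, a quick diagnostic can confirm whether a given environment is affected. The helper below is a hypothetical sketch that checks whether a module imports cleanly and exposes a given name, which is exactly what fails in the traceback above.

```python
import importlib


def exposes_name(module_name, attribute):
    """Return True if `module_name` imports cleanly and defines `attribute`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers a missing package as well as a broken import chain.
        return False
    return hasattr(module, attribute)


if __name__ == "__main__":
    # The import that fails in the traceback above.
    print(exposes_name("cloudpickle.cloudpickle", "CloudPickler"))
```

If this prints `False` in an environment where cloudpickle is installed, the installed cloudpickle version is the incompatible one, and a version bound in setup.py would be the place to exclude it.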

Add badges, metadata, etc

Add badges to the readme.md for the tests, coverage (if added) and documentation.
Also add metadata to the repository, etc.

Switch on/off LogOutput && Multi-replicas computation

Apologies if I combine the following issues (requests) in one, but on the other hand, I do not think these require two separate issues.

  • It would be nice to have some kind of Verbose that switches on/off the log output here (for instance):

    logger.info("loading %s", self.fname)

    This would be helpful when loading multiple sets of replicas.

  • It would also be incredibly useful to have something like: (1) mkPDFs that loads all the PDF sets at once, and (2) xfxQ2s that computes a three-dimensional grid of (rep, pid, x)-points. This is because doing the following (see below) becomes expensive for a large number of replicas.

for rep in replicas:
    res = rep.xfxQ2(pids, xgrid, q2scale)
    ...

On my computer, I could not generate a grid of 1000 replicas, as the process gets terminated with a SIGKILL. I saw that there was some discussion in #19 but just wanted a separate issue 😅.
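Until something like mkPDFs exists, one possible workaround is to load and evaluate the replicas in batches so that only a few grids are held in memory at a time. The sketch below is generic: `load` and `evaluate` are placeholders for the user's own calls (for example, creating one member and calling `xfxQ2(pids, xgrid, q2scale)` on it), and whether this avoids the SIGKILL on a given machine has not been verified.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` holding at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def evaluate_in_batches(member_ids, load, evaluate, batch_size=50):
    """Load and evaluate members batch by batch, collecting one result per member.

    `load(member_id)` and `evaluate(pdf)` are user-supplied callables.
    """
    results = []
    for batch in batched(member_ids, batch_size):
        # Only the members of the current batch are kept in memory.
        loaded = [load(member) for member in batch]
        results.extend(evaluate(pdf) for pdf in loaded)
        # Drop the references before loading the next batch.
        del loaded
    return results
```

The returned list has one entry per replica, in the same order as `member_ids`, so it can be stacked into a (rep, pid, x) array afterwards.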

Roadmap for the paper

Following the development, here is my wish list for the paper:

  • create final performance benchmark plots
  • create accuracy benchmark plots for NNPDF and other PDF sets
  • prepare examples (single top, FK convolution)
  • finalize code and related tasks.

Python 3.9 support

As it stands, PDFFlow should work with Python 3.9 out of the box. The only thing stopping it from working is that TensorFlow (tensorflow/tensorflow#44485) won't provide official pip packages for Python 3.9 for a while.

The pip package works perfectly fine with Python 3.9, so if you have your own working installation of TensorFlow (for instance from your OS vendor) you can simply do

pip install pdfflow --no-deps

(see also N3PDF/vegasflow#62)

Wrong Uncertainties on performance benchmark

Description

There could be an issue with the performance benchmark statistics. The problem lies in the following lines of code.

Code example

avg_l = t_lha.mean(0)
avg_p0 = t_pdf0.mean(0)
avg_p1 = t_pdf1.mean(0)
std_l = t_lha.std(0)
std_p0 = t_pdf0.std(0)
std_p1 = t_pdf1.std(0)

Additional information

To quote the uncertainties on these numbers in the plots, I forgot to divide by the square root of the number of experiments.

We ran just 10 experiments, so the bars should be smaller by only a factor of about three. They were almost invisible already, so this is not a big issue in the plots.

I propose to include the correction factor from now on.
Let me know what you think, thanks.
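For reference, the corrected quantity is the standard error of the mean, i.e. the standard deviation divided by sqrt(N). The sketch below uses plain Python; the function name is hypothetical, and it assumes the arrays in the snippet above are NumPy arrays whose `.std(0)` uses the default population convention (ddof=0).

```python
import math
import statistics


def mean_and_standard_error(samples):
    """Return (mean, standard error of the mean) for repeated timings."""
    n = len(samples)
    # pstdev is the population standard deviation, matching NumPy's default std.
    return statistics.mean(samples), statistics.pstdev(samples) / math.sqrt(n)


if __name__ == "__main__":
    # With 10 experiments the error bars shrink by sqrt(10), roughly 3.16.
    print(math.sqrt(10))
```

In the NumPy code the same correction would amount to dividing each of the `std_*` arrays by the square root of the number of experiments along axis 0.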
