astroml_figures's Introduction

astroML Figures

Figures from the astroML book and paper

astroml_figures's People

Contributors

bsipocz, connolly2, ivezic, jakevdp, stephenportillo, suberlak

astroml_figures's Issues

Figure 10.3 inconsistencies

I just noticed that the top four and bottom four panels of fig. 10.3 are swapped in the 2nd edition of the web version, so the caption is now wrong. The two printed versions and the 1st edition of the web version are fine. See https://www.astroml.org/book_figures/chapter10/fig_FFT_aliasing.html

Note: we should also check the notebook version of this figure.

Figure 4.2

Hi,
When I ran the code for Book Figure 4.2 on Ubuntu, Python raised the error 'GMM' object has no attribute 'eval' at the line logprob, responsibilities = M_best.eval(x).
To solve the problem, I replaced M_best.eval(x) (line 85) with:
M_best.score_samples(x.reshape((-1,1)))
and M_best.predict_proba(x) (line 110) with:
p = M_best.predict_proba(x.reshape((-1,1)))

I'm using scikit-learn 0.17

(was astroML/astroML#82, more discussion is on that issue)
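
For readers on a current scikit-learn, here is a minimal sketch of the same reshape fix using `GaussianMixture` rather than the `GMM` class from the report. The synthetic data and the two-component model are assumptions for illustration only, not the figure's actual setup. In `GaussianMixture`, `score_samples` returns only the per-sample log-density, so the responsibilities have to come from `predict_proba`.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 1D sample standing in for the figure's data (an assumption).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-1.0, 0.5, 300), rng.normal(2.0, 1.0, 700)])

# scikit-learn estimators expect a 2D array of shape (n_samples, n_features),
# which is why the 1D array has to be reshaped.
X = x.reshape(-1, 1)

M_best = GaussianMixture(n_components=2, random_state=0).fit(X)

# Per-sample log-density of the fitted mixture.
logprob = M_best.score_samples(X)

# Posterior probability of each component for each sample.
responsibilities = M_best.predict_proba(X)

# Mixture density and the per-component contributions to it.
pdf = np.exp(logprob)
pdf_individual = responsibilities * pdf[:, np.newaxis]
```

The last two lines show how the total density can be split into per-component curves once both arrays are available.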

Wrong equation reference in Figure 5.9

In the comment in Figure 5.9, there is an incorrect reference in the equations.
For the probability p(b), instead of "eqn. 5.70", it should be "eqn. 5.71".
For the Gaussian approximation, the equation is not "eqn. 5.71".

RuntimeError triggered by pymc3 for figure 5.24

At the time of opening this issue I suspect this is a local problem on my laptop, but in either case having the issue on record doesn't hurt.

I have now run into pymc3 issues a few times with PyCharm, mostly when examples are embedded in notebooks, but this one now appears consistently on the command line, too. I only see the error with Python 3.8; with identical numpy and pymc3 versions on Python 3.7 it works as expected.

python book_figures/chapter5/fig_model_comparison_mcmc.py 
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [M1_log_sigma, M1_mu]
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [M1_log_sigma, M1_mu]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/runpy.py", line 262, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/runpy.py", line 95, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/bsipocz/munka/devel/worktrees/astroML_figures/giant_figure_generating_branch_ed2/book_figures/chapter5/fig_model_comparison_mcmc.py", line 87, in <module>
    trace1 = pm.sample(draws=2500, tune=100)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/sampling.py", line 469, in sample
    trace = _mp_sample(**sample_args)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/sampling.py", line 1053, in _mp_sample
    sampler = ps.ParallelSampler(
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/parallel_sampling.py", line 355, in __init__
    self._samplers = [
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/parallel_sampling.py", line 356, in <listcomp>
    ProcessAdapter(
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/parallel_sampling.py", line 242, in __init__
    self._process.start()
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
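
The traceback passes through `popen_spawn_posix.py`, so the child processes are being started with the "spawn" method, which re-imports the main script in every child. Python 3.8 made "spawn" the default start method on macOS, which may explain why 3.7 is unaffected, though that is a guess rather than something confirmed here. The idiom the error message asks for can be sketched with the standard library alone. `run_chain` below is a hypothetical stand-in for a sampling chain, not pymc3 code.

```python
import multiprocessing as mp


def run_chain(seed):
    """Stand-in for one sampling chain (hypothetical; not pymc3 code)."""
    # A tiny deterministic computation so the sketch needs no extra packages.
    return sum((seed * 31 + i) % 7 for i in range(1000))


def main():
    # Everything that starts worker processes lives inside main(), so a child
    # process that re-imports this module does not start workers of its own.
    with mp.Pool(processes=2) as pool:
        return pool.map(run_chain, [0, 1, 2, 3])


if __name__ == "__main__":
    # Only the process that was launched as the script reaches this point.
    print(main())
```

Applied to the figure script, this would mean moving the model definition and the `pm.sample(...)` calls under such a guard.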


Fig 3.19 is double Weibull

cross ref from https://github.com/astroML/text_errata:

Page 104: Figure 3.19 shows the positive part of a double Weibull distribution, not a Weibull distribution. In this case it means that the values on the y axis are half of what they should be. To get a Weibull distribution in scipy, use exponweib with a=1 rather than dweibull.
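
A quick numerical check of the factor of two, as a sketch: the shape and scale values below are illustrative and not taken from the figure script. On x > 0 the double Weibull density is half the Weibull density, and `exponweib` with `a=1` agrees with `weibull_min`.

```python
import numpy as np
from scipy import stats

k = 1.5    # shape parameter (illustrative value, not from the figure)
lam = 1.0  # scale parameter (illustrative value)
x = np.linspace(0.01, 4, 200)

# Double Weibull: symmetric about zero, so only half the mass sits at x > 0.
double_weibull = stats.dweibull.pdf(x, k, loc=0, scale=lam)

# Weibull via the exponentiated Weibull with a=1, as suggested in the erratum.
weibull = stats.exponweib.pdf(x, 1, k, loc=0, scale=lam)

# Ratio of the two densities on the positive axis.
ratio = weibull / double_weibull

# Cross-check against scipy's dedicated Weibull distribution.
weibull_check = stats.weibull_min.pdf(x, k, loc=0, scale=lam)
```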

chapter 9 "Star/Quasar Classification ROC Curves" example trains classifiers on the whole data set rather than the train split

In fig_star_quasar_ROC.py, inside compute_results(), classifiers are trained on X rather than X_train. This means the classifiers have already seen the test set during training, which is bad practice and makes the ROC curves look better than they should.

The way to fix this would be to change line 90 from

model.fit(X, y)

to

model.fit(X_train, y_train)

Additionally, the figure fig_star_quasar_ROC_1.png needs to be updated as it is the result of the execution of the script.
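
For illustration, a self-contained sketch of the intended pattern: fit on the training split, then compute the ROC curve on the held-out split only. The synthetic data, the `GaussianNB` classifier and `train_test_split` are stand-ins chosen for this example and are not necessarily what the figure script uses.

```python
import numpy as np
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic two-class data standing in for the star/quasar colors (assumption).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (500, 4)), rng.normal(1.0, 1.0, (500, 4))])
y = np.concatenate([np.zeros(500), np.ones(500)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = GaussianNB()
# Fit on the training split only, so the test split stays unseen.
model.fit(X_train, y_train)

# Score the held-out split and build the ROC curve from it.
y_prob = model.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
```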

Setting up CI

Some sort of CI testing here would be useful. Preferably it would include a cron job that runs regularly to double check that nothing has been broken.

Figure 9.12: sklearn.tree.DecisionTreeClassifier incompatibility

I am using:

sklearn.__version__: 0.16.1
astroML.__version__: 0.3

File "fig_rrlyrae_treevis.py", line 242, in <module>
random_state=0, criterion='entropy')
TypeError: __init__() got an unexpected keyword argument 'compute_importances'

I added these lines to fix my fork:

# In 0.14+, setting compute_importances=True is no longer required.
try:
    # version < 0.14
    clf = DecisionTreeClassifier(compute_importances=True,
                                 random_state=0, criterion='entropy')
except TypeError:
    # version 0.14+: the keyword no longer exists
    clf = DecisionTreeClassifier(random_state=0, criterion='entropy')

see also: astroML/astroML#77

(was astroML/astroML#78)

Figure 6.17

In figure 6.17, we should use the correlation from the full data rather than the mean of bootstrap samples as the best estimate.

(was: astroML/astroML#76)
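
A minimal sketch of the suggested change, assuming a synthetic correlated sample rather than the figure's data: the point estimate comes from the full sample, and the bootstrap is used only to estimate its scatter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic correlated sample (an assumption; the figure uses its own data).
n = 200
x = rng.normal(size=n)
y = 0.6 * x + 0.8 * rng.normal(size=n)

# Best estimate: the correlation coefficient of the full data set.
r_full = np.corrcoef(x, y)[0, 1]

# Bootstrap: resample (x, y) pairs with replacement to estimate the scatter.
n_boot = 2000
idx = rng.integers(0, n, size=(n_boot, n))
r_boot = np.array([np.corrcoef(x[i], y[i])[0, 1] for i in idx])

# Report the full-data value with the bootstrap standard deviation as its error.
r_err = r_boot.std(ddof=1)
r_boot_mean = r_boot.mean()  # shown for comparison only, not the best estimate
```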

Python 3.7 compatibility: issues with pymc (at least 10 figures)

pymc has a method called await. Since async and await are reserved keywords in Python 3.7, pymc cannot even be imported, which makes at least the following figures incompatible with Python 3.7 as well:

  • book_figures/chapter5/fig_cauchy_mcmc.py
  • book_figures/chapter5/fig_signal_background.py
  • book_figures/chapter5/fig_model_comparison_mcmc.py
    - [ ] book_figures/chapter1/fig_moving_objects_multicolor.py this was never problematic, not sure how it ended up on this list
  • book_figures/chapter10/fig_matchedfilt_chirp2.py
  • book_figures/chapter10/fig_matchedfilt_chirp.py
  • book_figures/chapter10/fig_arrival_time.py
  • book_figures/chapter10/fig_matchedfilt_burst.py
  • book_figures/chapter5/fig_gaussgauss_mcmc.py
  • book_figures/chapter8/fig_outlier_rejection.py
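
The failure mode can be reproduced with the standard library alone. This is a sketch, and the one-line `source` string is a hypothetical stand-in rather than actual pymc code: a reserved keyword used as an identifier is rejected when the module is compiled, so the import fails before any code runs.

```python
import keyword

# On Python 3.7 and later, both names are reserved keywords.
reserved = {name: keyword.iskeyword(name) for name in ("async", "await")}

# Hypothetical stand-in for a module that defines a function named `await`.
source = "def await():\n    pass\n"

try:
    compile(source, "<example>", "exec")
    compiled = True
except SyntaxError:
    # The whole module is rejected at compile time, so importing it fails.
    compiled = False

print(reserved, compiled)
```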

Plot regression with newer sklearn.decomposition.PCA

The PCA projection in book_figures/chapter7/fig_S_manifold_PCA.py changes depending on the sklearn version being used (y range should be flipped).

Investigate the cause of it, and report upstream if it looks like a bug.
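
One possible cause, which is an assumption to be checked rather than a diagnosis, is the sign ambiguity of principal components: flipping the sign of a component together with its scores gives an equally valid decomposition, and implementations differ in which sign they pick. The NumPy sketch below demonstrates the ambiguity and one hypothetical convention, `fix_signs`, that would make a plot independent of it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with three directions of different variance (assumption).
X = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.2])
Xc = X - X.mean(axis=0)

# PCA via SVD: rows of Vt are the principal components.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Flip the sign of the second component and of its scores.
Vt_flipped = Vt.copy()
Vt_flipped[1] *= -1
U_flipped = U.copy()
U_flipped[:, 1] *= -1

# Both decompositions reconstruct the data equally well.
same_reconstruction = np.allclose((U * s) @ Vt, (U_flipped * s) @ Vt_flipped)


def fix_signs(components):
    """Make the largest-magnitude entry of each component positive."""
    idx = np.argmax(np.abs(components), axis=1)
    signs = np.sign(components[np.arange(components.shape[0]), idx])
    return components * signs[:, np.newaxis]


# After applying the convention, the two sets of components agree.
comp_a = fix_signs(Vt)
comp_b = fix_signs(Vt_flipped)
```

If the flipped y range turns out to be this kind of sign change, it is not an upstream bug, and pinning a sign convention in the figure script would be enough.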

Avoid hacky way of setting up GaussianMixture dataset

Some of the current examples hack GaussianMixture() to set up the input dataset. In more recent versions of scikit-learn, sampling from a non-fitted GaussianMixture is not really a supported feature (discussion around scikit-learn/scikit-learn#7822 (comment)).

So while it is possible to hack around this, we should look into other ways to generate the input dataset for these user-facing examples.

examples are e.g.: book_figures/chapter6/fig_GMM_nclusters.py
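
One alternative, sketched here with NumPy only, is to draw the dataset directly from the mixture definition. The helper name `sample_gaussian_mixture` and the weights, means and covariances below are hypothetical, chosen for illustration, and not the values used in the figures.

```python
import numpy as np


def sample_gaussian_mixture(weights, means, covs, n_samples, rng):
    """Draw n_samples points from a Gaussian mixture (hypothetical helper)."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)

    # Decide how many points each component contributes.
    counts = rng.multinomial(n_samples, weights / weights.sum())

    # Draw each component's points from its own multivariate normal.
    parts = [
        rng.multivariate_normal(mean, cov, size=count)
        for mean, cov, count in zip(means, covs, counts)
    ]
    X = np.vstack(parts)
    labels = np.repeat(np.arange(len(counts)), counts)

    # Shuffle so the points are not grouped by component.
    perm = rng.permutation(n_samples)
    return X[perm], labels[perm]


rng = np.random.default_rng(0)
X, labels = sample_gaussian_mixture(
    weights=[0.3, 0.5, 0.2],
    means=[[0, 0], [3, 3], [-3, 2]],
    covs=[np.eye(2), 0.5 * np.eye(2), [[1.0, 0.6], [0.6, 1.0]]],
    n_samples=500,
    rng=rng,
)
```

This keeps the true component labels available as well, which the GaussianMixture hack does not expose as directly.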

CNN cartoon issues due to M51 picture

The M51 picture for the CNN cartoon brings up two low priority issues, one needs documentation only, the other a solution:

  • the jpeg file requires Pillow as a dependency. Maybe the best solution is to convert it to png (the only issue is to make sure the resulting image is the same as what went into the book).

  • the current test and pdf generating mechanism (which relies on extracting the code out to a temporary "somefile.py") does not work with the current solution for the file path of the image. Running the script directly works, so users shouldn't be affected by it.
    Copying a workaround from the astroML pickle_results mechanism is probably the easiest solution here.

Change default for use_latex to False

While we had it True to generate the figures for the books, this default regularly causes issues for users working with the figure files.

Therefore I think changing the default to False has more benefits. Adding a comment to all the code files saying that we used True for the book should provide the necessary information for reproducibility.
