Figures from the astroML book and paper
License: BSD 2-Clause "Simplified" License
Baseline figure for 1.15 is missing
I just noticed that, somehow, the top 4 and the bottom 4 panels in the 2nd edition of the web version of fig. 10.3 are swapped (and now the caption is wrong). The two printed versions and the 1st edition of the web version are fine. See https://www.astroml.org/book_figures/chapter10/fig_FFT_aliasing.html
Note: we should also check the notebook version of this figure.
Hi,
upon running the code for Book Figure 4.2 on Ubuntu, Python returned an error: 'GMM' object has no attribute 'eval' for logprob, responsibilities = M_best.eval(x).
To solve the problem, I replaced M_best.eval(x) (line 85) with:
M_best.score_samples(x.reshape((-1, 1)))
and M_best.predict_proba(x) (line 110) with:
p = M_best.predict_proba(x.reshape((-1, 1)))
I'm using scikit-learn 0.17
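For reference, in modern scikit-learn (where GaussianMixture replaced GMM in 0.18+) the same quantities can be obtained roughly like this; the toy dataset below is a stand-in for the Figure 4.2 data, not the actual script:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
# Toy two-component dataset standing in for the Figure 4.2 data
X = np.concatenate([rng.normal(-1, 1, 300),
                    rng.normal(4, 1, 200)]).reshape(-1, 1)

model = GaussianMixture(n_components=2, random_state=0).fit(X)

# score_samples now returns only log p(x); responsibilities come
# from predict_proba instead of a combined eval()/score_samples call.
x = np.linspace(-5, 8, 500).reshape(-1, 1)
logprob = model.score_samples(x)           # shape (500,)
responsibilities = model.predict_proba(x)  # shape (500, 2)
pdf = np.exp(logprob)
```

Note that the old GMM.score_samples returned a (logprob, responsibilities) tuple, while the modern API splits these into two calls.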
(was astroML/astroML#82, more discussion is on that issue)
This is a duplicate of astroML/astroML#96: somehow the SDSS Stripe 82 Moving Object Catalog snuck back into the HTML pages.
In the comment in Figure 5.9, there are incorrect equation references.
For the probability p(b), instead of "eqn. 5.70" it should be "eqn. 5.71".
For the Gaussian approximation, the equation is not "eqn. 5.71".
At the time of opening this issue I suspect this is a local problem on my laptop, but in either case reporting it doesn't hurt.
I have run into pymc3 issues a few times with PyCharm, mostly when examples are embedded in notebooks, but this now consistently appears on the command line, too. I only see the error with Python 3.8, while everything works as expected with identical numpy and pymc3 versions on Python 3.7.
python book_figures/chapter5/fig_model_comparison_mcmc.py
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [M1_log_sigma, M1_mu]
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [M1_log_sigma, M1_mu]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/runpy.py", line 262, in run_path
return _run_module_code(code, init_globals, run_name,
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/runpy.py", line 95, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/bsipocz/munka/devel/worktrees/astroML_figures/giant_figure_generating_branch_ed2/book_figures/chapter5/fig_model_comparison_mcmc.py", line 87, in <module>
trace1 = pm.sample(draws=2500, tune=100)
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/sampling.py", line 469, in sample
trace = _mp_sample(**sample_args)
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/sampling.py", line 1053, in _mp_sample
sampler = ps.ParallelSampler(
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/parallel_sampling.py", line 355, in __init__
self._samplers = [
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/parallel_sampling.py", line 356, in <listcomp>
ProcessAdapter(
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pymc3/parallel_sampling.py", line 242, in __init__
self._process.start()
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
return Popen(process_obj)
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch
prep_data = spawn.get_preparation_data(process_obj._name)
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "/Users/bsipocz/.pyenv/versions/3.8.0/lib/python3.8/multiprocessing/spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
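The idiom the error message asks for can be sketched as follows; run_sampling here is a hypothetical stand-in for the model setup and the pm.sample(...) call in fig_model_comparison_mcmc.py:

```python
import multiprocessing

def run_sampling():
    # Hypothetical stand-in for building the pymc3 model and calling
    # pm.sample(...). Under the "spawn" start method (the default on
    # macOS since Python 3.8), the main module is re-imported in each
    # worker, so anything that starts processes must live inside a
    # function called from the __main__ guard, not at module level.
    return "done"

if __name__ == "__main__":
    multiprocessing.freeze_support()  # no-op unless frozen into an executable
    result = run_sampling()
```

This would also explain why the script works on Python 3.7 but not 3.8 on macOS: the default multiprocessing start method changed from "fork" to "spawn" in 3.8.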
cross ref from https://github.com/astroML/text_errata:
Page 104: Figure 3.19 shows the positive part of a double Weibull distribution, not a Weibull distribution. In this case it means that the values on the y axis are half of what they should be. To get a Weibull distribution in scipy, use exponweib with a=1 rather than dweibull.
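The factor-of-two claim can be checked numerically with scipy.stats (a quick sketch with an arbitrary shape parameter, not code from the book):

```python
import numpy as np
from scipy import stats

x = np.linspace(0.1, 3.0, 50)
k = 1.5  # arbitrary Weibull shape parameter for illustration

# exponweib with a=1 reduces to the standard (minimum) Weibull distribution
pdf_weibull = stats.exponweib(a=1, c=k).pdf(x)
# dweibull is the *double* Weibull: symmetric about zero, so its pdf on
# the positive axis is exactly half the Weibull pdf
pdf_double = stats.dweibull(k).pdf(x)
```

On x > 0, pdf_double equals 0.5 * pdf_weibull, which is the halved y-axis described in the erratum.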
In fig_star_quasar_ROC.py, inside compute_results(), the classifiers are trained on X rather than X_train. This means the classifiers have seen the test set during training, which is of course bad practice.
The fix is to change line 90 from
model.fit(X, y)
to
model.fit(X_train, y_train)
Additionally, the figure fig_star_quasar_ROC_1.png needs to be regenerated, as it is produced by running the script.
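A minimal sketch of the corrected pattern, using synthetic data and GaussianNB as a stand-in for the classifiers in the script:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import roc_curve

rng = np.random.RandomState(0)
# Synthetic two-class data standing in for the star/quasar colors
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(2, 1, (200, 2))])
y = np.r_[np.zeros(200), np.ones(200)]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GaussianNB()
model.fit(X_train, y_train)               # fit on the training split only
scores = model.predict_proba(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, scores)   # ROC evaluated on held-out data
```

The key point is that the test split never enters fit(), so the ROC curve reflects genuine generalization performance.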
Some sort of CI testing here would be useful; preferably we would also add a cron job that runs regularly to double-check nothing has broken.
I am using:
sklearn.version: 0.16.1
astroML.version: 0.3
File "fig_rrlyrae_treevis.py", line 242, in <module>
    random_state=0, criterion='entropy')
TypeError: __init__() got an unexpected keyword argument 'compute_importances'
I added these lines to fix my fork:
# In scikit-learn 0.14+, setting compute_importances=True is no longer required (or accepted).
try:
    # version < 0.14
    clf = DecisionTreeClassifier(compute_importances=True,
                                 random_state=0, criterion='entropy')
except TypeError:
    # version 0.14+
    clf = DecisionTreeClassifier(random_state=0, criterion='entropy')
see also: astroML/astroML#77
(was astroML/astroML#78)
New figures for the 2nd edition: once they are finalized, update their captions.
In figure 6.17, we should use the correlation from the full data rather than the mean of bootstrap samples as the best estimate.
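The suggested convention can be sketched with synthetic data (variable names are illustrative, not from the figure code): the full-sample correlation is the point estimate, and the bootstrap supplies only the uncertainty:

```python
import numpy as np

rng = np.random.RandomState(42)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(scale=0.5, size=200)

# Best estimate: the correlation of the full sample
r_full = np.corrcoef(x, y)[0, 1]

# Bootstrap resampling is used only for the uncertainty,
# not for the point estimate itself
n_boot = 1000
r_boot = np.empty(n_boot)
for i in range(n_boot):
    idx = rng.randint(0, len(x), len(x))
    r_boot[i] = np.corrcoef(x[idx], y[idx])[0, 1]
r_err = r_boot.std()
```

Reporting r_full ± r_err avoids the small bias that the mean of the bootstrap replicates can introduce.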
(was: astroML/astroML#76)
pymc has a method called await. Given that async and await are reserved keywords in Python 3.7, pymc is not even importable, which makes at least the following figures incompatible with Python 3.7 as well:
The PCA projection in book_figures/chapter7/fig_S_manifold_PCA.py changes depending on the scikit-learn version being used (the y range is flipped).
Investigate the cause, and report upstream if it looks like a bug.
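One possible cause worth checking before reporting upstream: the sign of each principal component is mathematically arbitrary, so projections can legitimately flip between versions. A sketch of a sign-fixing convention (not from the figure code) that makes the projection deterministic:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# Synthetic correlated data standing in for the S-curve manifold points
X = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 3))

pca = PCA(n_components=2).fit(X)
proj = pca.transform(X)

# Convention: force the largest-magnitude loading of each component to be
# positive, so the projection no longer depends on the solver's sign choice.
max_idx = np.abs(pca.components_).argmax(axis=1)
signs = np.sign(pca.components_[np.arange(2), max_idx])
proj_fixed = proj * signs
```

If the flip persists after applying such a convention, it is more likely a genuine upstream change.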
cc @connolly
Newer versions enforce keyword-only arguments; the examples should therefore be checked and fixed.
The default has changed for this; we need to add back the black edges.
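Assuming this refers to the matplotlib 2.0 style change that dropped edges from filled histogram bars, restoring them is a one-keyword fix (a sketch, not the actual figure code):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripted figure generation
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)
fig, ax = plt.subplots()
# Since matplotlib 2.0, filled histograms are drawn without edges by default;
# passing edgecolor explicitly restores the pre-2.0 black outlines.
counts, bins, patches = ax.hist(rng.normal(size=1000), bins=30,
                                histtype='stepfilled', facecolor='gray',
                                edgecolor='black')
```

The same edgecolor keyword works for bar() and scatter() (as edgecolors) if those are the affected plot types.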
There is no need to point out the error in the 1st print edition; the old figure and the note should be removed.
Some of the current examples hack GaussianMixture() to set up the input dataset. In more recent versions of scikit-learn, sampling from an unfitted GaussianMixture is not a supported feature (see the discussion around scikit-learn/scikit-learn#7822 (comment)).
So while it is possible to hack around it, we should look into other ways to generate the input datasets for these user-facing examples.
Examples include book_figures/chapter6/fig_GMM_nclusters.py
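One alternative is to draw the mixture samples directly with numpy rather than injecting parameters into an unfitted GaussianMixture; a sketch with illustrative parameters (not the actual values from fig_GMM_nclusters.py):

```python
import numpy as np

rng = np.random.RandomState(0)

# Mixture parameters (illustrative, not from the figure script)
means = np.array([-1.0, 0.0, 3.0])
sigmas = np.array([1.5, 1.0, 0.5])
weights = np.array([0.3, 0.3, 0.4])

# Two-step ancestral sampling: pick a component for each point,
# then draw from that component's Gaussian.
n = 1000
labels = rng.choice(len(weights), size=n, p=weights)
x = rng.normal(means[labels], sigmas[labels])
```

This avoids any dependence on scikit-learn internals and keeps the example stable across versions.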
The M51 picture for the CNN cartoon brings up two low-priority issues, one needing only documentation, the other a solution:
the JPEG file requires Pillow as a dependency. Maybe the best solution is to convert it to PNG (the only concern is making sure the resulting image is identical to what went into the book).
the current test and PDF-generating mechanism (which relies on extracting the code to a temporary "somefile.py") does not work with the current solution for the image's file path. Running the script directly works, so users shouldn't be affected by it.
Copying a workaround from the astroML pickle_results mechanism is probably the easiest solution here.
While we had it set to True to generate the figures for the books, this default regularly causes issues for users working with the figure files.
Therefore I think changing the default to False has more benefits; adding a comment in all the code files noting that the book used True should provide the necessary information for reproducibility.