peterroelants / peterroelants.github.io
Blog
Home Page: http://peterroelants.github.io/
License: Mozilla Public License 2.0
In gaussian-process-kernel-fitting.ipynb, do we need a '-' for the negative log marginal likelihood?
https://peterroelants.github.io/posts/gaussian-process-kernel-fitting/
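For context, a minimal sketch of the sign convention in question, assuming tensorflow-probability's GaussianProcess (which the notebook uses); the toy data and variable names here are illustrative, not the notebook's:

import tensorflow as tf
import tensorflow_probability as tfp

# Hypothetical toy data, just to illustrate the sign.
x = tf.random.normal([20, 1])
y = tf.sin(x[:, 0])

kernel = tfp.math.psd_kernels.ExponentiatedQuadratic(
    amplitude=1.0, length_scale=1.0)
gp = tfp.distributions.GaussianProcess(kernel=kernel, index_points=x)

def loss_fn():
    # log_prob is the log marginal likelihood; the '-' flips it so that
    # minimizing the loss maximizes the likelihood.
    return -gp.log_prob(y)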
Thank you very much for the great repo!
When I try to run the code in the notebook "gaussian-process-kernel-fitting.ipynb" in the "Tuning the hyperparameters" section, I get the exception
"RuntimeError: loss passed to Optimizer.compute_gradients should be a function when eager execution is enabled."
It seems to be related to tensorflow version, but I could not solve it myself.
I tried the solution mentioned here:
https://stackoverflow.com/questions/57858219/loss-passed-to-optimizer-compute-gradients-should-be-a-function-when-eager-exe
However, that creates another problem.
My environment is running under:
python 3.6.9
tensorflow==2.1.0
tensorflow-estimator==2.1.0
tensorflow-probability==0.9.0
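For what it's worth, here is a sketch of the Stack Overflow fix applied in isolation, assuming TF 2.x eager mode; the hyperparameter names and the quadratic placeholder loss are made up, not the notebook's actual negative log marginal likelihood:

import tensorflow as tf

# Hypothetical stand-ins for the notebook's trainable kernel hyperparameters.
amplitude = tf.Variable(1.0)
length_scale = tf.Variable(1.0)

def neg_log_marginal_likelihood():
    # Placeholder quadratic loss; the notebook would compute the GP's
    # negative log marginal likelihood here instead.
    return (amplitude - 0.5)**2 + (length_scale - 2.0)**2

optimizer = tf.optimizers.Adam(learning_rate=0.1)

# Under eager execution (TF 2.x) the loss must be a zero-argument callable,
# not a tensor, which is what the RuntimeError is complaining about:
for _ in range(200):
    optimizer.minimize(neg_log_marginal_likelihood,
                       var_list=[amplitude, length_scale])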
Heya!
I tried to run the GP tutorial notebook on my local machine, but the following error popped up:
NotJSONError('Notebook does not appear to be JSON: \'{\\n "cells": [\\n {\\n "cell_type": "m...')
The other notebooks in the directory work just fine. Tried to run the JSON through an online validator and it passed. Any ideas?
Thanks!
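In case it helps with debugging, a small diagnostic sketch using nbformat (the library Jupyter itself parses notebooks with); the path is a placeholder:

import nbformat

path = "gaussian-process.ipynb"  # placeholder: point at the failing notebook
with open(path, encoding="utf-8") as f:
    raw = f.read()

print(repr(raw[:60]))  # check for a BOM or stray bytes before the JSON
nb = nbformat.reads(raw, as_version=4)  # raises if the JSON or schema is invalid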
In the blog post https://peterroelants.github.io/posts/rnn-implementation-part02/, the IPYNB link at the end redirects to the wrong URL; the actual notebook lives at a different address. This needs to be changed in the HTML file.
I believe there is an error in the GP example with noise. More precisely, in:
Σ11 = kernel_func(X1, X1) + σ_noise * np.eye(n1)
one should be adding σ_noise**2, because Σ11 is a covariance matrix (its diagonal holds variances, not standard deviations).
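A sketch of the suggested fix, assuming the post's exponentiated quadratic kernel and that σ_noise holds the noise standard deviation:

import numpy as np

def kernel_func(X1, X2):
    # Exponentiated quadratic kernel, as in the post's GP examples.
    sq_dist = np.sum(X1**2, 1).reshape(-1, 1) + np.sum(X2**2, 1) - 2 * X1 @ X2.T
    return np.exp(-0.5 * sq_dist)

X1 = np.random.randn(5, 1)  # 5 training inputs
n1 = X1.shape[0]
σ_noise = 0.3               # noise standard deviation

# Add the noise variance (σ_noise**2) to the diagonal, since Σ11 is a
# covariance matrix.
Σ11 = kernel_func(X1, X1) + σ_noise**2 * np.eye(n1)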
Not an issue, but I've put together a reproduction of the algorithm with some more OO and my favorite plotting library, at https://github.com/matanster/bandits. Just saying, and thanks for the original post on this!
Hi Peter, first thanks so much for putting your notes on machine learning online - I found the article "Understanding Gaussian processes" particularly rigorous and helpful.
Can I please clarify two things in that particular post?
\mu_{2|1} = \mu_{2} + \Sigma_{21} \Sigma_{11}^{-1} \left(\mathbf{y}_{1} - \mu_{1}\right)
\Sigma_{2|1} = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}
should be
\mu_{2|1} = \mu_{2} + \Sigma_{12} \Sigma_{22}^{-1} \left(\mathbf{y}_{1} - \mu_{1}\right)
\Sigma_{2|1} = \Sigma_{22} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}
I've derived the computation based on your post on conditional distribution here.
Thanks for your time!
First, thank you for these articles!
However, when playing with the code, if I change the number of input samples to 40 I get this result:
w(0): 0.1000 cost: 46.1816
w(1): 4.7754 cost: 92.1105
w(2): -1.8647 cost: 184.7509
w(3): 7.5657 cost: 371.6103
w(4): -5.8276 cost: 748.5129
I solved this by using a learning rate inversely proportional to the number of samples, i.e.
learning_rate = 2 / nb_of_samples
instead of a fixed 0.1.
I tested it with sample sizes from 5 to 10 million, and it seems to always converge now.
I don't know if this makes any mathematical sense; I just wanted to let you know.
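This does make mathematical sense if the cost is a sum (rather than a mean) of squared errors, because then the gradient magnitude grows linearly with the sample count, and dividing the learning rate by the number of samples keeps the update size roughly constant. A sketch under those assumptions (the data generation and variable names mimic the tutorial but are not copied from it):

import numpy as np

np.random.seed(1)
nb_of_samples = 40
x = np.random.uniform(0, 1, nb_of_samples)            # inputs
t = 2 * x + np.random.normal(0, 0.2, nb_of_samples)   # noisy targets around 2x

def cost(w):
    return ((w * x - t)**2).sum()        # sum of squared errors, scales with N

def gradient(w):
    return (2 * x * (w * x - t)).sum()   # also scales with N

learning_rate = 2 / nb_of_samples        # instead of a fixed 0.1

w = 0.1
for i in range(4):
    w = w - learning_rate * gradient(w)
    print('w({}): {:.4f} cost: {:.4f}'.format(i + 1, w, cost(w)))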
Edited: No error, my mistake
Hi, I think there is a mistake in the equation for ds_k/ds_{k-m}: in the last factor you have ds_{k-m+1}/ds_{k-1}, but I think it should be ds_{k-m+1}/ds_{k-m}.
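For reference, here is how I read the chain-rule expansion written out; the last factor should pair s_{k-m+1} with s_{k-m}:

\frac{\partial s_k}{\partial s_{k-m}} =
\frac{\partial s_k}{\partial s_{k-1}} \cdot
\frac{\partial s_{k-1}}{\partial s_{k-2}} \cdots
\frac{\partial s_{k-m+1}}{\partial s_{k-m}}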
Peter,
Thank you so much for the great RNN tutorial post. This might seem long, but it is very quick.
1 - For Part 1, you defined the states array S to be 1x1. How would your example change if one decided to use 2 hidden states, for example? The clear final solution is that one of them will be turned off, but how would you define it? In this case your wRec will be 2x1, right?
2 - In the same part, section "Compute the gradients with the backward step", you explain BPTT briefly, and it is not clear to me how you came up with the partial derivatives. I worked out a small 3-time-step example.
Questions:
My example:
dc/dwx = dc/dy * dy/dwx
dc/dy = 2(y - t)
but y in this example is nothing but the final state (S3 * 1), so:
y = S3
y = x3 * wx + S2 * Wrec                                          ... substitute for S2
y = x3 * wx + (x2 * wx + S1 * Wrec) * Wrec                       ... expand
y = x3 * wx + x2 * wx * Wrec + S1 * Wrec^2                       ... substitute for S1
y = x3 * wx + x2 * wx * Wrec + (x1 * wx + S0 * Wrec) * Wrec^2    ... expand
y = x3 * wx + x2 * wx * Wrec + x1 * wx * Wrec^2 + S0 * Wrec^3
then,
dy/dwx = x3 + x2 * Wrec + x1 * Wrec^2
       = sum(xi * Wrec^(3-i)) where i = {1, 2, 3}
Best,
-M
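If it helps, here is a quick numerical check of the derivation above; a sketch assuming the tutorial's linear RNN s_k = x_k * wx + s_{k-1} * wRec with y = s3, where the input values are arbitrary:

import numpy as np

x = np.array([0.5, -1.2, 2.0])  # x1, x2, x3 (arbitrary values)
wx, wRec, s0 = 1.3, 0.7, 0.0

def forward(wx):
    s = s0
    for xk in x:
        s = xk * wx + s * wRec
    return s  # y = s3

# Analytic gradient from the expansion: dy/dwx = x3 + x2*wRec + x1*wRec^2
analytic = x[2] + x[1] * wRec + x[0] * wRec**2

# Numerical gradient via central differences
eps = 1e-6
numeric = (forward(wx + eps) - forward(wx - eps)) / (2 * eps)

print(analytic, numeric)  # the two should agree to ~1e-9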
I am going through a NN tutorial from this website
I am confused about one particular paragraph on this page (screenshot below).
Is the choice of the intercept bias of -1 purely arbitrary? I don't quite understand his explanation.
The screenshot says that the RBF function maps all values to a range of [0, +infinity]. However, the RBF function only maps to a range of [0, 1]. Is this a mistake? And how does this positive range lead to a choice of -1 for the intercept bias?
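For what it's worth, a quick check of the RBF's range, assuming the Gaussian RBF exp(-z^2) that the tutorial's hidden layer uses:

import numpy as np

def rbf(z):
    return np.exp(-z**2)  # Gaussian RBF

z = np.linspace(-10, 10, 2001)
print(rbf(z).min(), rbf(z).max())  # approaches 0, peaks at 1: range is (0, 1]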
The Greek letter ς is only used as the last letter of words that end with an s sound (weird rule, I know). In math, to denote functions we use the regular sigma σ (in LaTeX, \sigma).
should be 1x2
Hi,
when running
np.expand_dims(np.linspace(*xlim, 25), 1)
I get
SyntaxError: only named arguments may follow *expression
I am running the same versions of the packages and Python as in the notebook.
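In case it is useful, a workaround sketch: that SyntaxError comes from Python versions before 3.5, where a positional argument may not follow a *expression in a call. Here, xlim is a placeholder for the notebook's plot limits:

import numpy as np

xlim = (-2, 2)  # placeholder plot limits

# Equivalent call without star-unpacking, valid on older Python versions:
X = np.expand_dims(np.linspace(xlim[0], xlim[1], 25), 1)
print(X.shape)  # (25, 1)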
This is not an issue per se, but I am wondering what method you used to convert the IPython notebooks to GitHub Pages. Can you briefly share your experience?
Thank you. This is a great project.