Comments (10)
Hi, you could take a look at the references inside https://cran.r-project.org/web/packages/dse/vignettes/Guide.pdf
from pydse.
@FlorianWilhelm
I've read the guide you sent, and also the references it contains. There's a part of the S code that is similar to your code, but I could not find (maybe I'm not a good searcher) a theoretical explanation of the log-likelihood calculation.
Probably there is some optimization/statistics theory I don't know, but Gilbert's publications don't seem to have an answer. A cost function built upon the eigenvalues of a matrix of correlations (between variables with no lag) puzzles me.
Thanks for your previous response; I hope you can help me.
@franchesoni, note that the product of all eigenvalues of a matrix equals its determinant. In the pdf of a multivariate normal distribution you need to calculate the determinant of the covariance matrix. Got it?
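This identity is easy to check numerically; a quick NumPy sketch (the matrix and variable names are purely illustrative):

```python
import numpy as np

# Any symmetric positive-definite matrix, e.g. a sample covariance.
rng = np.random.default_rng(0)
x = rng.standard_normal((100, 3))
cov = x.T @ x / len(x)

eigvals = np.linalg.eigvalsh(cov)  # eigenvalues of the covariance
det = np.linalg.det(cov)

# product of eigenvalues == determinant, and hence
# log|cov| == sum of log-eigenvalues (the form used in the Gaussian log-likelihood)
print(np.prod(eigvals), det)
print(np.sum(np.log(eigvals)), np.log(det))
```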
I got that. This is what I understood:
I should assume that the residuals are independent and identically distributed according to a multivariate normal distribution with zero mean.
Then I can write down the negative log-likelihood and identify the terms 'like1' and 'const' in your code.
It's the other term that I don't understand yet. Somehow (no LaTeX here), res_i^T cov^{-1} res_i is the same as len(s).
Could you clarify that last step?
thanks!
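For reference, a minimal sketch of the textbook negative log-likelihood decomposition being discussed here: the quadratic term, a T*log|cov| term, and a constant. The term names below are illustrative only; whether they match pydse's internal naming exactly is an assumption. The result is cross-checked against SciPy's multivariate-normal logpdf:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
T, p = 200, 2
cov_true = np.array([[2.0, 0.5], [0.5, 1.0]])
res = rng.multivariate_normal(np.zeros(p), cov_true, size=T)

cov = res.T @ res / T  # sample covariance of the residuals (zero mean assumed)

# Negative log-likelihood of T iid N(0, cov) vectors, split into three pieces
# (names 'like1'/'const' are illustrative, not necessarily pydse's variables):
quad = np.einsum('ti,ij,tj->', res, np.linalg.inv(cov), res)  # sum_i res_i^T cov^{-1} res_i
like1 = T * np.log(np.linalg.det(cov))                         # T * log|cov|
const = T * p * np.log(2 * np.pi)                              # constant term
nll = 0.5 * (quad + like1 + const)

# Cross-check against SciPy's logpdf summed over the sample
ref = -multivariate_normal(np.zeros(p), cov).logpdf(res).sum()
print(nll, ref)
```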
It's actually only a bit of linear algebra. Just keep in mind that cov = res^T * res, do some transformations, and you arrive at the number of non-zero eigenvalues, which is len(s). Got it?
Thanks for the response!
Unfortunately, it does not seem immediate at all to me (and I've tried!). I would appreciate it if you could write out that bit of linear algebra I'm lacking. Sorry for the nitpicking; it's the final step!
Maybe I underestimated this... it's been quite some time since I programmed it, and I also peeked a bit into the implementation of DSE. I would say that in general res_i^T cov^{-1} res_i does not equal len(s), as can easily be seen by constructing a counterexample. So maybe it's just an approximation if you sum over all of them.
Also, let's assume we only have a univariate time series, so cov^{-1} is just a scalar that only equals 1 = len(s) if that is the std of res. In this case the sum over all res_i^T cov^{-1} res_i could be approximated with sampleT * len(s) due to the definition of cov. In the general case where cov != 1 for p = 1, or if we have eigenvalues not equal to 1 for p > 1, this is not the case. At least those are my findings after putting an hour of thought into it.
Could you tell me a bit more about your motivation for knowing this? Is this some student work, or why are you so eager to know this technical detail? Have you also tried to contact the original author of DSE? In general, PyDSE is more of a prototype and is not actively developed anymore...
@FlorianWilhelm , thanks for taking the time. I appreciate it.
My main motivation, aside from learning ambitions, comes from my job. I work in the Solar Energy Lab on forecasting through time series analysis, and I'm exploring the performance of ARMA models (right now I'm comparing static digital filtering with statsmodels' Kalman-filtering approach).
As I don't have anyone double-checking my work, I can't copy or implement things without making sure that they are correct (not that I doubt you; it's just that I can't leave giant bugs in, or even think about publishing unfounded research). Nevertheless, your code is great, just what I need.
As I found analytically, when p = 1 and res is non-zero, the equality holds (the squared L2 norms cancel, leaving only sampleT). As you said, the general case is not that obvious (at least to me). I will continue asking elsewhere, as you suggest, confident in the correctness of the method (I've made a few tests).
Thanks for answering all of it! Cheers
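The p = 1 case mentioned above can be checked in a couple of lines (a sketch; the variable names follow the thread's notation):

```python
import numpy as np

# p = 1 case: cov is the scalar (1/T) * sum res_i^2, so the quadratic
# term telescopes to exactly T = sampleT, independent of the data.
rng = np.random.default_rng(3)
T = 100
res = rng.standard_normal(T)

cov = res @ res / T          # scalar sample variance (zero mean)
quad = np.sum(res**2) / cov  # sum_i res_i * cov^{-1} * res_i
print(quad, T)
```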
How about that @franchesoni? Best you copy it into some LaTeX-aware editor.
We have, for $\mathrm{cov} = \frac{1}{T}\,\mathrm{res}^\top \mathrm{res}$ with $T = \mathrm{sampleT}$,
$$\sum_{i=1}^{T} \mathrm{res}_i^\top \mathrm{cov}^{-1} \mathrm{res}_i = \operatorname{tr}\!\left(\mathrm{res}\,\mathrm{cov}^{-1}\,\mathrm{res}^\top\right) = \operatorname{tr}\!\left(\mathrm{cov}^{-1}\,\mathrm{res}^\top \mathrm{res}\right) = T \operatorname{tr}\!\left(\mathrm{cov}^{-1}\,\mathrm{cov}\right) = T \operatorname{tr}(I_p).$$
Putting everything together, we have that
$$\sum_{i=1}^{T} \mathrm{res}_i^\top \mathrm{cov}^{-1} \mathrm{res}_i = \mathrm{sampleT} \cdot p,$$
where $p = \operatorname{tr}(I_p)$ is the number of non-zero eigenvalues of $\mathrm{cov}$, i.e. len(s).
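The trace identity above can be verified numerically; a quick sketch (names illustrative):

```python
import numpy as np

# Numerical check: with cov = res^T res / T (full rank),
# sum_i res_i^T cov^{-1} res_i = tr(cov^{-1} res^T res) = T * tr(I_p) = T * p.
rng = np.random.default_rng(2)
T, p = 150, 4
res = rng.standard_normal((T, p))

cov = res.T @ res / T
quad = np.einsum('ti,ij,tj->', res, np.linalg.inv(cov), res)
print(quad, T * p)  # the two agree up to floating-point error
```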
Great @FlorianWilhelm ! It was just some linear algebra indeed.
About the degenerate case: I asked on Cross Validated, and there is an intuitive explanation that relates to PCA. Basically, when some eigenvalue of the covariance is negligible, we can use the principal-component representation of the data and arrive at the case you just nicely covered.
I've not done the math yet, but the intuition is there. Thanks for the very helpful responses!
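That PCA intuition can be sketched numerically too: with rank-deficient residuals, replacing the inverse by the pseudo-inverse (i.e. working in the principal-component basis) gives sampleT times the number of non-zero eigenvalues. A sketch under that assumption, with illustrative names:

```python
import numpy as np

# Degenerate case: residuals confined to an r-dimensional subspace of R^p,
# so cov is singular. With the pseudo-inverse, the quadratic term becomes
# T * rank(cov), i.e. sampleT times the number of non-zero eigenvalues (len(s)).
rng = np.random.default_rng(4)
T, p, r = 120, 5, 3
res = rng.standard_normal((T, r)) @ rng.standard_normal((r, p))  # rank r

cov = res.T @ res / T
quad = np.einsum('ti,ij,tj->', res, np.linalg.pinv(cov), res)
rank = np.linalg.matrix_rank(cov)
print(quad, T * rank)
```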