
esl_solution's Introduction

Welcome!

Please check out the webpage!

This is an unofficial solution manual for The Elements of Statistical Learning (ESL).

To Contribute

  1. git clone the repo into a local directory
  2. cd <YOUR_DIR>
  3. Make changes as needed
  4. Run mkdocs serve; see MkDocs for more details
  5. By default the site is served locally at http://127.0.0.1:8000/; visit it in a browser to view your changes
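The steps above can be sketched as a shell session. This is a minimal sketch: the repository URL is a placeholder, not the actual clone URL.

```shell
# Clone the repository (replace <REPO_URL> with the repo's actual clone URL)
git clone <REPO_URL> esl_solution
cd esl_solution

# Make changes, then preview the site locally
mkdocs serve
# MkDocs serves the site at http://127.0.0.1:8000/ by default;
# open that address in a browser to view your changes
```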

Current Status

As of late 2022: I'm less engaged in this project at the moment. There are some outstanding issues and some unfinished exercises (mainly in Chapter 14), and I would sincerely appreciate help with those items.

esl_solution's People

Contributors

seraphli, yuhangzhou88


esl_solution's Issues

Ex 6.3


$BB^T$ is not invertible.
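A quick numeric illustration of why such a Gram matrix can fail to be invertible. This is a sketch under an assumption: that $B$ has more rows than columns, in which case $\mathrm{rank}(BB^T) \le \mathrm{rank}(B) < m$, so $BB^T$ is singular.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3                      # assume B is tall: more rows than columns
B = rng.normal(size=(m, n))

G = B @ B.T                      # m x m, but rank(G) <= rank(B) <= n < m
print(np.linalg.matrix_rank(G))  # 3: rank-deficient, so G is singular
```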

Ex 8.2

Good work!
However, $\lambda$ is wrong.

Exercise 10.3

I'm somewhat confused by the conditional expectation in this exercise and what the expectation is taken with respect to. Also, could you explain how you went from $E[h_1(X_s)+h_2(X_c)|X_s]$ to $E[h_1(X_s)|X_c]+X_c$?

Ex 9.5a

Hi Yuhang,
Thanks for developing this site for ESL!

I learned how to solve 9.5 (a) from your solution. You can probably generalize the proof as follows.

[screenshot: generalized proof of 9.5 (a)]

Best,
Qichun

Ex 4.3

In the solution it reads

$$ \widehat{y} = \widehat{B}^T x. $$

However, $\widehat{B}$ has size $(p+1) \times K$, while $x$ has size $p \times 1$.
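One consistent reading (an assumption on my part, following the book's intercept convention) is that $x$ should be augmented with a leading $1$, so the shapes match:

```latex
\widehat{y} = \widehat{B}^{T}
\begin{pmatrix} 1 \\ x \end{pmatrix},
\qquad
\widehat{B} \in \mathbb{R}^{(p+1) \times K},
\quad
\begin{pmatrix} 1 \\ x \end{pmatrix} \in \mathbb{R}^{(p+1) \times 1},
\quad
\widehat{y} \in \mathbb{R}^{K \times 1}.
```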

Ex. 9.1: no universal S in local linear regression

I am concerned about the usage of $S$ in your solution:

[screenshot of the relevant step]

The "smoothing matrix" is different for each query point in the local linear regression. In other words, different weighted linear regressions are performed on different points.

Incidentally, I think the result follows straightforwardly from the definition (my thoughts: szcf-weiya/ESL-CN#167).

ESL Ex. 5.15 part (a) solution

Hi Yuhang, I think I may have spotted a typo in your Ex. 5.15 (a) answer, but I'm not sure.

Below the sentence "Therefore, by definition of $K$ we have", the first line contains a constant $\gamma_i$ that has not been defined. Should this $\gamma_i$ be $\lambda_i$ instead, as defined in the denominator of the inner product $H_K$? Thank you!

3.2

For the second part, deriving the confidence interval for $\beta$ shouldn't be this complicated. As I recall, $\beta$ is also normally distributed, and both its mean and standard deviation have closed-form solutions.

ex 5.11

Hello!
Could you further explain the formula right below the sentence "it is easy to show..."? I don't follow that step.
Thank you!

Solution to Ex. 4.3 is wrong

It is clear that

[screenshot of the equation in the solution]

is wrong, because $\hat{\mathsf{B}}$ is not a square matrix and thus cannot be inverted. The proper way is to expand $\hat{\Sigma}$ and prove

$$B(B^T\Sigma B)^{-1}B^T\mu_k = \Sigma^{-1}\mu_k.$$

Exercise 4.5

I think something is wrong in the solution to 4.5.

The log-likelihood is $\le 0$ for any $\beta$, but the solution claims it tends to infinity.

A small mistake in Q 3.21

Hi, thanks a lot for your wonderful work on the solutions!

I would like to point out that there's a small mistake in equation (1) of the Q 3.21.

It should be

$$ \begin{aligned} & \min_{rank (B) = r} Tr \left( (Y - XB) \Sigma^{-1} (Y - X B)^{\top} \right) \\ = & \min_{rank (B) = r} Tr\left( Y \Sigma^{-1} Y^{\top}\right) - Tr\left(\Sigma^{-\frac{1}{2}}Y^{\top}X(X^{\top}X)^{-1}X^{\top}Y - CC^{\top}\right), \end{aligned} $$

The first term is an $n \times n$ matrix while the second and third terms apparently are not, so the matrices themselves cannot be added together, but their traces can be. Also, the second term should be subtracted rather than added.

Although this does not affect the later derivation, I would be grateful if you could modify the answer a bit!
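The trace point above can be checked numerically. This is a minimal sketch: the shapes and random matrices are illustrative, not the ones from the exercise.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 6, 3
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, p))

# X @ Y.T is n x n while Y.T @ X is p x p: the two products cannot
# be added as matrices, but by the cyclic property of the trace,
# tr(X Y^T) = tr(Y^T X), so working with traces is legitimate.
t_big = np.trace(X @ Y.T)    # trace of a 6 x 6 matrix
t_small = np.trace(Y.T @ X)  # trace of a 3 x 3 matrix
print(np.isclose(t_big, t_small))  # True
```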

Mistake in Ex 14.8

In the $\partial L/\partial R$ calculation, the third term on the RHS should have a positive sign: $-2X_1^T \mathbf{1} \mu^T$ should be $+2X_1^T \mathbf{1} \mu^T$.
