nndl / solutions
Exercise solutions for "Neural Networks and Deep Learning" (《神经网络与深度学习》): sharing and discussion
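A larger learning rate can alleviate vanishing gradients, but it will not necessarily give good results: a bigger step may overshoot the optimum, and it may also cause exploding gradients.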
Why can't we handle vanishing gradient problem in neural nets using large step sizes?
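Initializing w to 0 makes the neurons within a layer indistinguishable during computation: they receive the same gradients and therefore undergo the same weight updates.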
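The symmetry argument can be checked numerically. The following is a minimal sketch, assuming a tiny network with one sigmoid hidden layer, a linear output, squared loss, and plain SGD; the function name `train_zero_init`, the toy data, and the hyperparameters are all made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_zero_init(data, hidden=3, lr=0.1, steps=20):
    """Train a tiny one-hidden-layer regressor whose parameters all start at 0."""
    dim = len(data[0][0])
    w1 = [[0.0] * dim for _ in range(hidden)]  # one row of weights per hidden neuron
    b1 = [0.0] * hidden
    w2 = [0.0] * hidden                        # output weights
    b2 = 0.0
    for _ in range(steps):
        for x, y in data:
            # Forward pass
            h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                 for row, b in zip(w1, b1)]
            y_hat = sum(w * hj for w, hj in zip(w2, h)) + b2
            # Backward pass for the loss 0.5 * (y_hat - y)^2
            d_out = y_hat - y
            d_hidden = [d_out * w2[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            # SGD update
            for j in range(hidden):
                w2[j] -= lr * d_out * h[j]
                b1[j] -= lr * d_hidden[j]
                for i in range(dim):
                    w1[j][i] -= lr * d_hidden[j] * x[i]
            b2 -= lr * d_out
    return w1, b1, w2

data = [([0.0, 1.0], 1.0), ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]
w1, b1, w2 = train_zero_init(data)
print(w1)  # every hidden neuron ends up with the same weight row
print(w2)  # every output weight is the same
```

Because each hidden neuron goes through exactly the same arithmetic, the rows of `w1` stay identical no matter how long training runs, so the layer behaves like a single neuron.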
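In a classification problem the labels have no notion of continuity, and the distance between two labels carries no real meaning. The squared difference between the prediction vector and the label vector therefore does not reflect how well the classification problem is being optimized.
For example, with classes 1, 2, 3 and true class 1, misclassifying a sample as 2 or as 3 should be equally wrong, yet the squared loss assigns the two cases different losses.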
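The example with classes 1, 2, 3 can be worked through directly. This is a small sketch; the probability vectors are made-up values, and cross-entropy is used here only as a point of comparison.

```python
import math

def squared_loss(pred_label, true_label):
    """Squared error computed on the integer labels themselves."""
    return (pred_label - true_label) ** 2

def cross_entropy(probs, true_index):
    """Cross-entropy against a one-hot target: -log of the true class probability."""
    return -math.log(probs[true_index])

true_label = 1
loss_pred_2 = squared_loss(2, true_label)
loss_pred_3 = squared_loss(3, true_label)
print(loss_pred_2, loss_pred_3)  # 1 4: the two mistakes are penalized differently

# Hypothetical predicted distributions over classes (1, 2, 3); the true class is index 0.
wrong_as_2 = [0.2, 0.7, 0.1]
wrong_as_3 = [0.2, 0.1, 0.7]
ce_2 = cross_entropy(wrong_as_2, 0)
ce_3 = cross_entropy(wrong_as_3, 0)
print(ce_2 == ce_3)  # True: cross-entropy only looks at the true class probability
```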
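Exercise 8-4: prove that the energy function of a Hopfield network decreases monotonically over time.
With asynchronous updates this is fairly easy to prove.
How can it be proved for synchronous updates? I could not find any relevant material online.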
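A quick numerical experiment may help frame the question. The sketch below assumes the usual energy E = -1/2 Σ w_ij s_i s_j - Σ b_i s_i with symmetric weights, a zero diagonal, and states in {-1, +1}; the network sizes, seed, and the two-neuron example are arbitrary choices for illustration. It checks that the energy never increases under asynchronous updates, and shows a small case where it does increase under synchronous updates, which suggests the statement cannot be proved for this energy without extra conditions.

```python
import random

def energy(w, b, s):
    n = len(s)
    pair = sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))
    return -0.5 * pair - sum(b[i] * s[i] for i in range(n))

def local_field(w, b, s, i):
    return sum(w[i][j] * s[j] for j in range(len(s))) + b[i]

def sign(h):
    return 1 if h >= 0 else -1

# Random symmetric network with zero diagonal.
rng = random.Random(0)
n = 8
w = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        w[i][j] = w[j][i] = rng.uniform(-1, 1)
b = [rng.uniform(-1, 1) for _ in range(n)]
s = [rng.choice([-1, 1]) for _ in range(n)]

# Asynchronous updates: one randomly chosen neuron at a time.
async_energies = [energy(w, b, s)]
for _ in range(200):
    i = rng.randrange(n)
    s[i] = sign(local_field(w, b, s, i))
    async_energies.append(energy(w, b, s))
async_non_increasing = all(
    later <= earlier + 1e-9
    for earlier, later in zip(async_energies, async_energies[1:])
)
print(async_non_increasing)

# Synchronous updates on a two-neuron network: all neurons change at once.
w_small = [[0.0, -1.0], [-1.0, 0.0]]
b_small = [0.5, 0.5]
state = [1, 1]
sync_energies = [energy(w_small, b_small, state)]
for _ in range(4):
    state = [sign(local_field(w_small, b_small, state, i)) for i in range(2)]
    sync_energies.append(energy(w_small, b_small, state))
print(sync_energies)  # the state flips between (1, 1) and (-1, -1)
```

In the two-neuron case the state oscillates in a cycle of length two and the energy rises from 0 to 2 on the first synchronous step.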
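Note that the C weight vectors used in Softmax regression are redundant: subtracting the same vector v from every weight vector does not change the output. Softmax regression therefore usually needs regularization to constrain its parameters. The same property can also be used to avoid numerical overflow when computing the Softmax function.
Without a regularization term limiting the size of the weight vectors, they may grow too large and cause overflow.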
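Subtracting the same vector v from every weight vector subtracts the same scalar (v applied to the input) from every logit, so the property can be illustrated on logits directly. The following is a minimal sketch; the function names `softmax_naive` and `softmax_stable` and the sample logits are illustrative, and subtracting the maximum logit is one common choice of shift.

```python
import math

def softmax_naive(z):
    exps = [math.exp(v) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_stable(z):
    # Shift by the largest logit so that every exponent is <= 0.
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

logits = [1.0, 2.0, 3.0]
shifted = [v - 100.0 for v in logits]  # same constant subtracted from every logit
same_output = all(
    abs(p - q) < 1e-12
    for p, q in zip(softmax_naive(logits), softmax_naive(shifted))
)
print(same_output)

large = [1000.0, 1001.0, 1002.0]
try:
    softmax_naive(large)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True  # math.exp(1000.0) is out of range for a float
print(naive_overflowed)
stable_probs = softmax_stable(large)
print(stable_probs)
```

The stable version returns the same distribution for `large` as for `logits`, since the two differ only by a constant shift.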
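from tensorflow import keras
from tensorflow.keras import layers

L = 3   # number of hidden layers
N = 18  # total number of hidden neurons
m = 3   # input dimension
width = N // L  # neurons per hidden layer; Dense requires an integer

network = keras.Sequential([])
for _ in range(L):
    network.add(layers.Dense(width))
    network.add(layers.ReLU())
network.add(layers.Dense(1))  # single output unit
network.build(input_shape=(None, m))

# N hidden biases + 1 output bias + hidden-to-hidden weights
# + input weights + output weights
expected = N + 1 + (L - 1) * width * width + m * width + width
print(network.count_params() == expected)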
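In a dilated convolution with kernel size K and dilation rate D, find the padding P that makes the convolution equal-width.
Answer:
For an equal-width convolution:
(M - K' + 2P)/S + 1 = M, where S = 1,
K' = K + (K-1)(D-1),
so 2P = K' - 1 = (K-1)D, which gives:
P = (K-1)D/2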
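The result can be sanity-checked by plugging it back into the output-width formula. This is a small sketch using the symbols from the derivation; the helper names and the test values of M, K, and D are arbitrary.

```python
def effective_kernel(k, d):
    """K' = K + (K - 1)(D - 1): kernel size once the dilation gaps are counted."""
    return k + (k - 1) * (d - 1)

def output_width(m, k, d, p, s=1):
    """(M - K' + 2P) / S + 1 for input width M, padding P and stride S."""
    return (m - effective_kernel(k, d) + 2 * p) // s + 1

def same_padding(k, d):
    """P = (K - 1) D / 2; only an integer when (K - 1) D is even."""
    total = (k - 1) * d
    if total % 2:
        raise ValueError("(K - 1) * D is odd, so no symmetric integer padding exists")
    return total // 2

M = 32
widths = []
for k, d in [(3, 1), (3, 2), (5, 3), (2, 2)]:
    p = same_padding(k, d)
    widths.append(output_width(M, k, d, p))
    print(k, d, p, widths[-1])  # the last column always equals M
```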
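The purpose of a regularization term is to prevent overfitting, that is, to keep the model from being overly sensitive to small changes in the input. A bias, however, has the same effect for every input, so regularizing the biases does little to prevent overfitting.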
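A theorem from linear algebra:
Rank(AB) <= min{Rank(A), Rank(B)}
Proof:
Partition B by columns: B = (b1, b2, ..., bn)
AB = (Ab1, Ab2, ..., Abn) = C
Each column Abj of C is a linear combination of the columns of A, so the column space of C is contained in the column space of A,
hence Rank(C) <= Rank(A).
Similarly, each row of C is a linear combination of the rows of B, hence Rank(C) <= Rank(B).
QED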
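The inequality can also be checked on concrete matrices. The sketch below assumes small integer matrices chosen for illustration and computes ranks by Gaussian elimination over exact fractions; `rank` and `matmul` are helper names introduced here, not part of any library.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]  # second row is twice the first
B = [[1, 0], [0, 1], [1, 1]]
print(rank(A), rank(B), rank(matmul(A, B)))

# The inequality can be strict: both factors have rank 1, the product has rank 0.
A2 = [[1, 0], [0, 0]]
B2 = [[0, 0], [0, 1]]
print(rank(A2), rank(B2), rank(matmul(A2, B2)))
```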
Request for help:
For a 2D convolution with a 3 x 3 input and a 2 x 2 kernel, written as an affine transformation:
z = w ⊗ x
where
w = [[w1, w2, 0],
     [0, w1, w2],
     [w2, 0, w1]]
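The matrix posted above uses only two weights, whereas a 2 x 2 kernel has four. For comparison, here is a sketch of how the affine form could be built under some assumptions that are not stated in the post: cross-correlation (no kernel flip), stride 1, no padding, and row-major flattening of the input. Under those assumptions the 3 x 3 input becomes a vector of length 9, the output has 4 entries, and the weight matrix is 4 x 9. The function names and the numeric kernel are illustrative.

```python
def conv_matrix(kernel, in_h, in_w):
    """Build W such that W @ flatten(x) equals the valid cross-correlation of x."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = in_h - k_h + 1, in_w - k_w + 1
    matrix = [[0] * (in_h * in_w) for _ in range(out_h * out_w)]
    for r in range(out_h):
        for c in range(out_w):
            for u in range(k_h):
                for v in range(k_w):
                    matrix[r * out_w + c][(r + u) * in_w + (c + v)] = kernel[u][v]
    return matrix

def cross_correlate(kernel, x):
    """Direct sliding-window computation, used only as a reference."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = len(x) - k_h + 1, len(x[0]) - k_w + 1
    return [[sum(kernel[u][v] * x[r + u][c + v]
                 for u in range(k_h) for v in range(k_w))
             for c in range(out_w)]
            for r in range(out_h)]

kernel = [[1, 2], [3, 4]]  # stands in for [[w1, w2], [w3, w4]]
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

W = conv_matrix(kernel, 3, 3)
for row in W:
    print(row)  # first row: [1, 2, 0, 3, 4, 0, 0, 0, 0]

flat_x = [v for row in x for v in row]
z = [sum(w * xv for w, xv in zip(row, flat_x)) for row in W]
direct = [v for row in cross_correlate(kernel, x) for v in row]
print(z, direct)  # both are [37, 47, 67, 77]
```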
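Compute the gradients of the function
y = max(x1, ..., xD)
and the function y = argmax(x1, ..., xD).
Take D = 2 for the analysis.
For y = max(x1, x2):
when x1 > x2, y = x1 and the gradient is (1, 0)
when x1 < x2, y = x2 and the gradient is (0, 1)
when x1 = x2, y is not differentiable
For y = argmax(x1, x2):
when x1 > x2, y = 1 and the gradient is (0, 0)
when x1 < x2, y = 2 and the gradient is (0, 0)
when x1 = x2, y is not differentiable (argmax is discontinuous there)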
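These cases can be confirmed with finite differences. The following is a minimal sketch; `numeric_grad` and `argmax_1based` are helper names introduced here, the sample points are arbitrary, and the step size 1e-6 is an assumption that works for these inputs.

```python
def numeric_grad(f, x, eps=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def argmax_1based(x):
    """Index of the largest entry, counted from 1 as in the text."""
    return max(range(len(x)), key=lambda i: x[i]) + 1

grad_max_a = numeric_grad(max, [3.0, 1.0])            # x1 > x2
grad_max_b = numeric_grad(max, [1.0, 3.0])            # x1 < x2
grad_argmax = numeric_grad(argmax_1based, [3.0, 1.0])
print(grad_max_a, grad_max_b, grad_argmax)

# At a tie the one-sided slopes of max in x1 disagree, so no derivative exists.
eps = 1e-6
tie = [2.0, 2.0]
forward = (max([tie[0] + eps, tie[1]]) - max(tie)) / eps
backward = (max(tie) - max([tie[0] - eps, tie[1]])) / eps
print(forward, backward)  # approximately 1 and exactly 0
```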