Posts Tagged ‘Neural Network’

A Summary of Tuning Neural Network Parameters

April 14th, 2017

When actually building neural networks, the same design choices come up again and again; here is a summary:

  • Getting more training examples: Fixes high variance
  • Trying smaller sets of features: Fixes high variance
  • Adding features: Fixes high bias
  • Adding polynomial features: Fixes high bias
  • Decreasing λ: Fixes high bias
  • Increasing λ: Fixes high variance

When you run into high variance, you can try adding more training examples or reducing the number of features. When you run into high bias, the training set likely has too few features, so you need to add more. λ acts as the regularization (penalty) coefficient: the larger λ is, the stronger the penalty, which corrects high variance; conversely, decreasing λ corrects high bias. The value of λ is usually chosen as whichever candidate performs best on the cross-validation set.
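The λ-selection rule above can be sketched with a toy closed-form regularized regression; the data, 60/20 split, and candidate λ values below are illustrative assumptions, not from the course:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 2x + noise
X = rng.uniform(-1, 1, size=(100, 1))
y = 2 * X[:, 0] + rng.normal(0, 0.1, size=100)

X_train, y_train = X[:60], y[:60]     # 60% training set
X_cv, y_cv = X[60:80], y[60:80]       # 20% cross-validation set

def ridge_fit(X, y, lam):
    # Closed-form regularized least squares: (X^T X + lam I)^{-1} X^T y
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def mse(X, y, theta):
    return np.mean((X @ theta - y) ** 2)

# Fit with each candidate lambda on the training set, score (without the
# penalty term) on the cross-validation set, and keep the best lambda
lambdas = [0.0, 0.01, 0.1, 1.0, 10.0]
cv_errors = [mse(X_cv, y_cv, ridge_fit(X_train, y_train, lam)) for lam in lambdas]
best_lam = lambdas[int(np.argmin(cv_errors))]
```

Note that the cross-validation error is measured without the λ penalty: the penalty is only part of the training objective, not of how we judge generalization.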

Diagnosing Neural Networks

  • A neural network with fewer parameters is prone to underfitting. It is also computationally cheaper.
  • A large neural network with more parameters is prone to overfitting. It is also computationally expensive. In this case you can use regularization (increase λ) to address the overfitting.

Using a single hidden layer is a good starting default. You can train your neural network with several different numbers of hidden layers, evaluate each on your cross-validation set, and then select the architecture that performs best. A single-hidden-layer network is the simplest, but it may cost accuracy, which is why we add hidden layers and features; a complex network, however, leads to overfitting and high computational cost, so the two must be balanced against each other.
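That selection loop can be sketched as follows; this assumes scikit-learn is available, uses its MLPClassifier as a stand-in network, and makes up the data and candidate depths for illustration:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic nonlinear labels: class depends on the sign of x1 * x2
X = rng.normal(size=(300, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

X_train, y_train = X[:180], y[:180]       # 60% training set
X_cv, y_cv = X[180:240], y[180:240]       # 20% cross-validation set

# Train one network per candidate depth, score each on the CV set
scores = {}
for n_layers in (1, 2, 3):
    clf = MLPClassifier(hidden_layer_sizes=(8,) * n_layers,
                        max_iter=2000, random_state=0)
    clf.fit(X_train, y_train)
    scores[n_layers] = clf.score(X_cv, y_cv)

best_depth = max(scores, key=scores.get)
```

The remaining 20% of the data stays untouched as a test set, so the CV score used for selection never leaks into the final evaluation.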

Model Complexity Effects:

  • Lower-order polynomials (low model complexity) have high bias and low variance. In this case, the model fits poorly consistently.
  • Higher-order polynomials (high model complexity) fit the training data extremely well and the test data extremely poorly. These have low bias on the training data, but very high variance.
  • In reality, we would want to choose a model somewhere in between, that can generalize well but also fits the data reasonably well.
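The bias/variance trade-off above can be seen directly by fitting polynomials of different degrees to the same noisy data; the quadratic ground truth, noise level, and degrees below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples from a quadratic, split into train and test halves
x = np.linspace(-1, 1, 30)
y = 1 + 2 * x - 3 * x**2 + rng.normal(0, 0.3, size=x.shape)
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

def poly_errors(degree):
    # Fit a polynomial of the given degree and report train/test MSE
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

train_lo, test_lo = poly_errors(1)   # low complexity: high bias, fits poorly everywhere
train_hi, test_hi = poly_errors(9)   # high complexity: fits the training data far better
```

A degree-1 fit cannot capture the curvature at all (high bias), while the high-degree fit drives training error down by chasing the noise, which is exactly the high-variance regime described above.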

By default, split the dataset into three parts: a 60% training set, a 20% cross-validation set, and a 20% test set.
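The 60/20/20 split can be sketched with shuffled indices (the dataset size here is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

m = 200                         # total number of examples (illustrative)
idx = rng.permutation(m)        # shuffle before splitting
n_train, n_cv = int(0.6 * m), int(0.2 * m)

train_idx = idx[:n_train]                    # 60% training set
cv_idx = idx[n_train:n_train + n_cv]         # 20% cross-validation set
test_idx = idx[n_train + n_cv:]              # 20% test set
```

Shuffling first matters: if the raw data is ordered (e.g. by class), a sequential split would give the three sets different distributions.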

References:

https://www.coursera.org/learn/machine-learning/supplement/llc5g/deciding-what-to-do-next-revisited

http://www.cnblogs.com/sddai/p/5696834.html

How to Design and Train a Neural Network

March 31st, 2017

Designing a Neural Network

First, pick a network architecture; choose the layout of your neural network, including how many hidden units in each layer and how many layers in total you want to have.

  • Number of input units = dimension of features x(i)

  • Number of output units = number of classes

  • Number of hidden units per layer = usually, the more the better (but balance this against the cost of computation, which increases with more hidden units)

  • Defaults: 1 hidden layer. If you have more than 1 hidden layer, then it is recommended that you have the same number of units in every hidden layer.

To build a neural network, first decide how many hidden layers to create and how many units each hidden layer should have.

The number of input units is simply the number of input features, and the number of output units is the number of classes; but how many units should each hidden layer have?

Generally, the more units in the hidden layers, the better the classification, but this must be weighed against the computational cost, which grows with the number of units. By default a network has a single hidden layer; when there is more than one, each hidden layer keeps the same number of units.
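Those sizing rules can be captured in a small helper; the hidden width of 25 is an arbitrary illustrative default, not a rule from the notes:

```python
def layer_sizes(n_features, n_classes, n_hidden_layers=1, n_hidden_units=25):
    """Return [input, hidden..., output] unit counts following the defaults
    above: one hidden layer by default, and the same width for every hidden
    layer. The width 25 is an illustrative choice."""
    return [n_features] + [n_hidden_units] * n_hidden_layers + [n_classes]

# e.g. 400 input features and 10 classes:
sizes = layer_sizes(400, 10)                       # [400, 25, 10]
deeper = layer_sizes(400, 10, n_hidden_layers=3)   # [400, 25, 25, 25, 10]
```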

Training a Neural Network

  • Randomly initialize the weights
  • Implement forward propagation to get hΘ(x(i)) for any x(i)
  • Implement the cost function
  • Implement backpropagation to compute partial derivatives
  • Use gradient checking to confirm that your backpropagation works. Then disable gradient checking.
  • Use gradient descent or a built-in optimization function to minimize the cost function with the weights in theta.

The input weights are initialized randomly. If every weight were set to the same constant, each hidden unit would receive the same value from the input layer and compute the same activation, making hΘ(x(i)) identical across units, which causes symmetry; different random initial weights exist precisely for symmetry breaking.
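Random initialization can be sketched as below; the interval half-width ε = 0.12 and the layer sizes are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_weights(n_in, n_out, epsilon=0.12):
    # Uniform in [-epsilon, epsilon]. If every weight were the same constant,
    # all hidden units would compute identical activations (symmetry);
    # random values break that symmetry. epsilon = 0.12 is illustrative.
    return rng.uniform(-epsilon, epsilon, size=(n_out, n_in + 1))  # +1 bias column

Theta1 = init_weights(400, 25)   # weights mapping 400 inputs into 25 hidden units
```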

Implement forward propagation to compute the activations of each layer, implement the cost function, and use backpropagation to compute the partial derivative with respect to each Θ. Then run gradient checking to verify that backpropagation is correct, and disable the check afterwards (gradient checking is far too computationally expensive to leave on).

Finally, use gradient descent to minimize the cost function; the resulting Θ is the trained set of weights.
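The six steps above can be sketched end-to-end on a toy problem; the XOR data, the 2-unit hidden layer, the ε = 1e-4 check step, and the learning rate are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(Theta1, Theta2, X):
    # Forward propagation through one hidden layer (the leading 1 is the bias unit)
    a1 = np.hstack([np.ones((X.shape[0], 1)), X])
    a2 = np.hstack([np.ones((X.shape[0], 1)), sigmoid(a1 @ Theta1.T)])
    h = sigmoid(a2 @ Theta2.T)
    return a1, a2, h

def cost(Theta1, Theta2, X, Y):
    _, _, h = forward(Theta1, Theta2, X)
    return -np.sum(Y * np.log(h) + (1 - Y) * np.log(1 - h)) / X.shape[0]

def backprop(Theta1, Theta2, X, Y):
    m = X.shape[0]
    a1, a2, h = forward(Theta1, Theta2, X)
    d3 = h - Y                                                # output-layer error
    d2 = (d3 @ Theta2)[:, 1:] * a2[:, 1:] * (1 - a2[:, 1:])  # hidden-layer error
    return d2.T @ a1 / m, d3.T @ a2 / m                       # grads for Theta1, Theta2

# Step 1: randomly initialize weights for a 2-input, 2-hidden-unit, 1-output net
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])
Theta1 = rng.uniform(-0.12, 0.12, (2, 3))
Theta2 = rng.uniform(-0.12, 0.12, (1, 3))

# Steps 2-5: forward prop / cost / backprop, verified by a numerical gradient check
G1, G2 = backprop(Theta1, Theta2, X, Y)
analytic = G1[0, 0]
eps = 1e-4
P, M = Theta1.copy(), Theta1.copy()
P[0, 0] += eps
M[0, 0] -= eps
numeric = (cost(P, Theta2, X, Y) - cost(M, Theta2, X, Y)) / (2 * eps)
# numeric should agree closely with analytic; now disable the check and train

# Step 6: plain gradient descent on the weights
c_before = cost(Theta1, Theta2, X, Y)
for _ in range(2000):
    G1, G2 = backprop(Theta1, Theta2, X, Y)
    Theta1 -= 1.0 * G1
    Theta2 -= 1.0 * G2
c_after = cost(Theta1, Theta2, X, Y)
```

The gradient check compares the backpropagation gradient against a two-sided finite difference; once they agree, the check is removed because each numerical gradient needs two full cost evaluations per parameter.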

References:

http://blog.csdn.net/jinlianjingd/article/details/50767743

https://www.coursera.org/learn/machine-learning/supplement/Uskwd/putting-it-together