Posts Tagged ‘Neural Network’

A Summary of Tuning Neural Network Parameters

April 14th, 2017

When actually building neural networks, the same design choices come up again and again; here is a summary:

  • Getting more training examples: Fixes high variance
  • Trying smaller sets of features: Fixes high variance
  • Adding features: Fixes high bias
  • Adding polynomial features: Fixes high bias
  • Decreasing λ: Fixes high bias
  • Increasing λ: Fixes high variance

When you run into high variance, you can try adding more training examples or reducing the number of features. When you run into high bias, the training set likely has too few features, so you need to add more. λ acts as the regularization (penalty) coefficient: the larger λ is, the stronger the penalty, which corrects high variance; conversely, decreasing λ corrects high bias. The value of λ is usually chosen as whichever candidate performs best on the cross-validation set.
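The λ-selection rule above can be sketched with a toy closed-form regularized regression; the data, 60/20 split, and candidate λ values below are illustrative assumptions, not from the course:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 2x + noise
X = rng.uniform(-1, 1, size=(100, 1))
y = 2 * X[:, 0] + rng.normal(0, 0.1, size=100)

X_train, y_train = X[:60], y[:60]     # 60% training set
X_cv, y_cv = X[60:80], y[60:80]       # 20% cross-validation set

def ridge_fit(X, y, lam):
    # Closed-form regularized least squares: (X^T X + lam I)^{-1} X^T y
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def mse(X, y, theta):
    return np.mean((X @ theta - y) ** 2)

# Fit with each candidate lambda on the training set, score (without the
# penalty term) on the cross-validation set, and keep the best lambda
lambdas = [0.0, 0.01, 0.1, 1.0, 10.0]
cv_errors = [mse(X_cv, y_cv, ridge_fit(X_train, y_train, lam)) for lam in lambdas]
best_lam = lambdas[int(np.argmin(cv_errors))]
```

Note that the cross-validation error is measured without the λ penalty: the penalty is only part of the training objective, not of how we judge generalization.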

Diagnosing Neural Networks

  • A neural network with fewer parameters is prone to underfitting. It is also computationally cheaper.
  • A large neural network with more parameters is prone to overfitting. It is also computationally expensive. In this case you can use regularization (increase λ) to address the overfitting.

Using a single hidden layer is a good starting default. You can train your neural network with several different numbers of hidden layers, evaluate each on your cross-validation set, and then select the architecture that performs best. A single-hidden-layer network is the simplest, but it may cost accuracy, which is why we add hidden layers and features; a complex network, however, leads to overfitting and high computational cost, so the two must be balanced against each other.
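That selection loop can be sketched as follows; this assumes scikit-learn is available, uses its MLPClassifier as a stand-in network, and makes up the data and candidate depths for illustration:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic nonlinear labels: class depends on the sign of x1 * x2
X = rng.normal(size=(300, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

X_train, y_train = X[:180], y[:180]       # 60% training set
X_cv, y_cv = X[180:240], y[180:240]       # 20% cross-validation set

# Train one network per candidate depth, score each on the CV set
scores = {}
for n_layers in (1, 2, 3):
    clf = MLPClassifier(hidden_layer_sizes=(8,) * n_layers,
                        max_iter=2000, random_state=0)
    clf.fit(X_train, y_train)
    scores[n_layers] = clf.score(X_cv, y_cv)

best_depth = max(scores, key=scores.get)
```

The remaining 20% of the data stays untouched as a test set, so the CV score used for selection never leaks into the final evaluation.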

Model Complexity Effects:

  • Lower-order polynomials (low model complexity) have high bias and low variance. In this case, the model fits poorly consistently.
  • Higher-order polynomials (high model complexity) fit the training data extremely well and the test data extremely poorly. These have low bias on the training data, but very high variance.
  • In reality, we would want to choose a model somewhere in between, that can generalize well but also fits the data reasonably well.
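The bias/variance trade-off above can be seen directly by fitting polynomials of different degrees to the same noisy data; the quadratic ground truth, noise level, and degrees below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples from a quadratic, split into train and test halves
x = np.linspace(-1, 1, 30)
y = 1 + 2 * x - 3 * x**2 + rng.normal(0, 0.3, size=x.shape)
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

def poly_errors(degree):
    # Fit a polynomial of the given degree and report train/test MSE
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

train_lo, test_lo = poly_errors(1)   # low complexity: high bias, fits poorly everywhere
train_hi, test_hi = poly_errors(9)   # high complexity: fits the training data far better
```

A degree-1 fit cannot capture the curvature at all (high bias), while the high-degree fit drives training error down by chasing the noise, which is exactly the high-variance regime described above.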

By default, split the dataset into three parts: a 60% training set, a 20% cross-validation set, and a 20% test set.
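The 60/20/20 split can be sketched with shuffled indices (the dataset size here is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

m = 200                         # total number of examples (illustrative)
idx = rng.permutation(m)        # shuffle before splitting
n_train, n_cv = int(0.6 * m), int(0.2 * m)

train_idx = idx[:n_train]                    # 60% training set
cv_idx = idx[n_train:n_train + n_cv]         # 20% cross-validation set
test_idx = idx[n_train + n_cv:]              # 20% test set
```

Shuffling first matters: if the raw data is ordered (e.g. by class), a sequential split would give the three sets different distributions.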

References:

https://www.coursera.org/learn/machine-learning/supplement/llc5g/deciding-what-to-do-next-revisited

http://www.cnblogs.com/sddai/p/5696834.html

How to Design and Train a Neural Network

March 31st, 2017

Designing a Neural Network

First, pick a network architecture; choose the layout of your neural network, including how many hidden units in each layer and how many layers in total you want to have.

  • Number of input units = dimension of features x(i)

  • Number of output units = number of classes

  • Number of hidden units per layer = usually, the more the better (but balance this against the cost of computation, which increases with more hidden units)

  • Defaults: 1 hidden layer. If you have more than 1 hidden layer, then it is recommended that you have the same number of units in every hidden layer.

To build a neural network, first decide how many hidden layers to create and how many units each hidden layer should have.

The number of input units is simply the number of input features, and the number of output units is the number of classes; but how many units should each hidden layer have?

Generally, the more units in the hidden layers, the better the classification, but this must be weighed against the computational cost, which grows with the number of units. By default a network has a single hidden layer; when there is more than one, each hidden layer keeps the same number of units.
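Those sizing rules can be captured in a small helper; the hidden width of 25 is an arbitrary illustrative default, not a rule from the notes:

```python
def layer_sizes(n_features, n_classes, n_hidden_layers=1, n_hidden_units=25):
    """Return [input, hidden..., output] unit counts following the defaults
    above: one hidden layer by default, and the same width for every hidden
    layer. The width 25 is an illustrative choice."""
    return [n_features] + [n_hidden_units] * n_hidden_layers + [n_classes]

# e.g. 400 input features and 10 classes:
sizes = layer_sizes(400, 10)                       # [400, 25, 10]
deeper = layer_sizes(400, 10, n_hidden_layers=3)   # [400, 25, 25, 25, 10]
```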

Training a Neural Network

  • Randomly initialize the weights
  • Implement forward propagation to get hΘ(x(i)) for any x(i)
  • Implement the cost function
  • Implement backpropagation to compute partial derivatives
  • Use gradient checking to confirm that your backpropagation works. Then disable gradient checking.
  • Use gradient descent or a built-in optimization function to minimize the cost function with the weights in theta.

The input weights are initialized randomly. If every weight were set to the same constant, each hidden unit would receive the same value from the input layer and compute the same activation, making hΘ(x(i)) identical across units, which causes symmetry; different random initial weights exist precisely for symmetry breaking.
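Random initialization can be sketched as below; the interval half-width ε = 0.12 and the layer sizes are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_weights(n_in, n_out, epsilon=0.12):
    # Uniform in [-epsilon, epsilon]. If every weight were the same constant,
    # all hidden units would compute identical activations (symmetry);
    # random values break that symmetry. epsilon = 0.12 is illustrative.
    return rng.uniform(-epsilon, epsilon, size=(n_out, n_in + 1))  # +1 bias column

Theta1 = init_weights(400, 25)   # weights mapping 400 inputs into 25 hidden units
```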

Implement forward propagation to compute the activations of each layer, implement the cost function, and use backpropagation to compute the partial derivative with respect to each Θ. Then run gradient checking to verify that backpropagation is correct, and disable the check afterwards (gradient checking is far too computationally expensive to leave on).

Finally, use gradient descent to minimize the cost function; the resulting Θ is the trained set of weights.
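The six steps above can be sketched end-to-end on a toy problem; the XOR data, the 2-unit hidden layer, the ε = 1e-4 check step, and the learning rate are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(Theta1, Theta2, X):
    # Forward propagation through one hidden layer (the leading 1 is the bias unit)
    a1 = np.hstack([np.ones((X.shape[0], 1)), X])
    a2 = np.hstack([np.ones((X.shape[0], 1)), sigmoid(a1 @ Theta1.T)])
    h = sigmoid(a2 @ Theta2.T)
    return a1, a2, h

def cost(Theta1, Theta2, X, Y):
    _, _, h = forward(Theta1, Theta2, X)
    return -np.sum(Y * np.log(h) + (1 - Y) * np.log(1 - h)) / X.shape[0]

def backprop(Theta1, Theta2, X, Y):
    m = X.shape[0]
    a1, a2, h = forward(Theta1, Theta2, X)
    d3 = h - Y                                                # output-layer error
    d2 = (d3 @ Theta2)[:, 1:] * a2[:, 1:] * (1 - a2[:, 1:])  # hidden-layer error
    return d2.T @ a1 / m, d3.T @ a2 / m                       # grads for Theta1, Theta2

# Step 1: randomly initialize weights for a 2-input, 2-hidden-unit, 1-output net
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])
Theta1 = rng.uniform(-0.12, 0.12, (2, 3))
Theta2 = rng.uniform(-0.12, 0.12, (1, 3))

# Steps 2-5: forward prop / cost / backprop, verified by a numerical gradient check
G1, G2 = backprop(Theta1, Theta2, X, Y)
analytic = G1[0, 0]
eps = 1e-4
P, M = Theta1.copy(), Theta1.copy()
P[0, 0] += eps
M[0, 0] -= eps
numeric = (cost(P, Theta2, X, Y) - cost(M, Theta2, X, Y)) / (2 * eps)
# numeric should agree closely with analytic; now disable the check and train

# Step 6: plain gradient descent on the weights
c_before = cost(Theta1, Theta2, X, Y)
for _ in range(2000):
    G1, G2 = backprop(Theta1, Theta2, X, Y)
    Theta1 -= 1.0 * G1
    Theta2 -= 1.0 * G2
c_after = cost(Theta1, Theta2, X, Y)
```

The gradient check compares the backpropagation gradient against a two-sided finite difference; once they agree, the check is removed because each numerical gradient needs two full cost evaluations per parameter.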

References:

http://blog.csdn.net/jinlianjingd/article/details/50767743

https://www.coursera.org/learn/machine-learning/supplement/Uskwd/putting-it-together