April » 2017 » JasonLe's TechBlog

Archive for April, 2017

The Computation Process of Principal Components Analysis (PCA)

April 28th, 2017

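Today I practiced computing PCA in Octave and realized I did not fully understand the calculation, so I am writing it down here.

First, load the data from the txt file, then read the matrix into X.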

octave:7> data = load('ex2data.txt')
data =

   2.50000   2.40000
   0.50000   0.70000
   2.20000   2.90000
   1.90000   2.20000
   3.10000   3.00000
   2.30000   2.70000
   2.00000   1.60000
   1.00000   1.10000
   1.50000   1.60000
   1.10000   0.90000
octave:8> X = data(:, [1, 2]);
octave:15> mu=mean(X)
mu =

   1.8100   1.9100

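Compute the mean of each column of X, then subtract the corresponding mean from every sample. Here the column means are 1.81 and 1.91, so the first sample (2.5, 2.4) becomes (0.69, 0.49) after the subtraction. This gives: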

octave:16> X_norm = bsxfun(@minus, X, mu);
octave:17> X_norm 
X_norm =

   0.690000   0.490000
  -1.310000  -1.210000
   0.390000   0.990000
   0.090000   0.290000
   1.290000   1.090000
   0.490000   0.790000
   0.190000  -0.310000
  -0.810000  -0.810000
  -0.310000  -0.310000
  -0.710000  -1.010000
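The centering step above can be sketched in plain Python as a cross-check of the Octave output. This is a minimal sketch, not the post's method: the helper names `column_means` and `center` are made up for illustration, and the data is copied by hand from the listing above.

```python
# Sample data copied from the post's ex2data.txt listing (10 samples, 2 features).
X = [
    [2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
    [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9],
]

def column_means(rows):
    """Return the mean of each column (hypothetical helper)."""
    m = len(rows)
    return [sum(col) / m for col in zip(*rows)]

def center(rows, mu):
    """Subtract the column means from every row (hypothetical helper)."""
    return [[value - mean for value, mean in zip(row, mu)] for row in rows]

mu = column_means(X)
X_norm = center(X, mu)

print([round(v, 2) for v in mu])         # column means
print([round(v, 2) for v in X_norm[0]])  # first centered sample
```

The rounded values should agree with `mu` and the first row of `X_norm` printed by Octave.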

We use this matrix to build the covariance matrix: sigma = (X_norm' * X_norm) / size(X_norm, 1), where size(X_norm, 1) is the number of samples (10 here).

octave:34> sigma = X_norm' * X_norm
sigma =

   5.5490   5.5390
   5.5390   6.4490

octave:35> sigma = sigma/10
sigma =

   0.55490   0.55390
   0.55390   0.64490
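As a rough cross-check of the covariance step, the same computation can be written in plain Python. This is a sketch under two assumptions: the function name `covariance_matrix` is made up for illustration, and it divides by the number of samples m, as the post does, rather than by m - 1.

```python
# Sample data copied from the post's ex2data.txt listing.
X = [
    [2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
    [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9],
]

def covariance_matrix(rows):
    """Return (X_c' * X_c) / m, where X_c is the mean-centered data (hypothetical helper)."""
    m = len(rows)
    mu = [sum(col) / m for col in zip(*rows)]
    centered = [[v - c for v, c in zip(row, mu)] for row in rows]
    n = len(mu)
    return [
        [sum(r[i] * r[j] for r in centered) / m for j in range(n)]
        for i in range(n)
    ]

sigma = covariance_matrix(X)
for row in sigma:
    print([round(v, 4) for v in row])
```

The entries should match the second `sigma` printed by Octave (after dividing by 10).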

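Use Octave's svd function to compute the eigenvalues and eigenvectors of the covariance matrix directly. (Because sigma is symmetric and positive semi-definite, its singular value decomposition coincides with its eigendecomposition.)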

octave:36> [U,S,V] = svd(sigma)
U =

  -0.67787  -0.73518
  -0.73518   0.67787

S =

Diagonal Matrix

   1.155625          0
          0   0.044175

V =

  -0.67787  -0.73518
  -0.73518   0.67787

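The columns of U are the eigenvectors and the diagonal of S holds the eigenvalues of sigma. The eigenvalues tell us how much variance is retained: keeping only the first component retains 1.155625 / (1.155625 + 0.044175) ≈ 96.3% of the variance.
To reduce the 2-D data to 1-D, use the first column of U, which corresponds to the largest eigenvalue (the second column would keep only the smaller share of the variance). For a single sample stored as a column vector x, the projection is z = U(:,1)' * x; for the whole matrix with one sample per row, it is z = X_norm * U(:,1).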

octave:39> z = X_norm*U(:,1)
z =

  -0.827970
   1.777580
  -0.992197
  -0.274210
  -1.675801
  -0.912949
   0.099109
   1.144572
   0.438046
   1.223821
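The eigenvalues, the retained variance and the projection can be reproduced without a linear algebra library, since a symmetric 2x2 matrix has a closed-form eigendecomposition. This is a minimal sketch: `eig_sym_2x2` is a hypothetical helper (not an Octave or library function), it assumes a nonzero off-diagonal entry, and the sign of each eigenvector is arbitrary, so the projected values may come out with the opposite sign to Octave's.

```python
import math

# Covariance matrix and centered data copied from the Octave output in the post.
sigma = [[0.5549, 0.5539], [0.5539, 0.6449]]
X_norm = [
    [0.69, 0.49], [-1.31, -1.21], [0.39, 0.99], [0.09, 0.29], [1.29, 1.09],
    [0.49, 0.79], [0.19, -0.31], [-0.81, -0.81], [-0.31, -0.31], [-0.71, -1.01],
]

def eig_sym_2x2(a, b, d):
    """Eigenvalues (descending) and unit eigenvectors of [[a, b], [b, d]], assuming b != 0."""
    half_trace = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    values = [half_trace + radius, half_trace - radius]
    vectors = []
    for lam in values:
        # (b, lam - a) solves (A - lam*I) v = 0 when b != 0.
        vx, vy = b, lam - a
        norm = math.hypot(vx, vy)
        vectors.append((vx / norm, vy / norm))
    return values, vectors

values, vectors = eig_sym_2x2(sigma[0][0], sigma[0][1], sigma[1][1])
u1 = vectors[0]                      # direction of largest variance
retained = values[0] / sum(values)   # fraction of variance kept with k = 1
z = [x * u1[0] + y * u1[1] for x, y in X_norm]  # 1-D projection

print([round(v, 6) for v in values])
print(round(retained, 4))
print([round(v, 6) for v in z])
```

Up to sign, `z` should agree with the Octave result above, and `values` with the diagonal of S.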

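Mathematically, PCA looks for the basis vectors that minimize the error of projecting the original data onto them, i.e. the directions that best represent the data.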
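To make the projection-error idea concrete, the sketch below measures the average squared distance between each centered sample and its reconstruction from a single direction. The function name `mean_projection_error` is made up for illustration, and the two directions are the columns of U as printed (rounded) in the Octave output, so the results are only approximate.

```python
# Centered data copied from the post's X_norm listing.
X_norm = [
    [0.69, 0.49], [-1.31, -1.21], [0.39, 0.99], [0.09, 0.29], [1.29, 1.09],
    [0.49, 0.79], [0.19, -0.31], [-0.81, -0.81], [-0.31, -0.31], [-0.71, -1.01],
]

# Columns of U as printed by Octave in the post (rounded to 5 decimals).
u1 = (-0.67787, -0.73518)
u2 = (-0.73518, 0.67787)

def mean_projection_error(rows, u):
    """Average squared distance between each row and its projection onto the unit vector u."""
    total = 0.0
    for x, y in rows:
        z = x * u[0] + y * u[1]              # 1-D coordinate along u
        rx, ry = x - z * u[0], y - z * u[1]  # residual after reconstruction
        total += rx * rx + ry * ry
    return total / len(rows)

err_u1 = mean_projection_error(X_norm, u1)
err_u2 = mean_projection_error(X_norm, u2)
print(err_u1, err_u2)
```

Projecting onto the first column should leave a much smaller error than projecting onto the second. With the 1/m normalization used in the post, each error should come out close to the eigenvalue of the discarded direction.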
http://www.cnblogs.com/jerrylead/archive/2011/04/18/2020209.html

http://www.cnblogs.com/LeftNotEasy/archive/2011/01/19/svd-and-applications.html


Posted in Algorithm, Machine Learning

Tags: PCA

Summary of Tuning Neural Network Parameters

April 14th, 2017

When building a neural network in practice, you often face a number of choices. Here is a summary:

Getting more training examples: Fixes high variance

Trying smaller sets of features: Fixes high variance

Adding features: Fixes high bias

Adding polynomial features: Fixes high bias

Decreasing λ: Fixes high bias

Increasing λ: Fixes high variance

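When you run into high variance, try adding training examples or reducing the number of features. High bias, on the other hand, suggests that the model has too few features, so more should be added. λ is the regularization (penalty) coefficient: the larger λ is, the stronger the penalty and the more it corrects high variance; conversely, decreasing λ corrects high bias. The value of λ is generally chosen as the one that performs best on the cross-validation set.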
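Choosing λ on the cross-validation set can be sketched as follows. To stay short, this swaps the neural network for a one-feature ridge regression with no intercept. The data, the candidate list and the helper names are all invented for illustration and do not come from the post.

```python
# Hypothetical 1-feature data, invented for illustration (not from the post).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.3, 0.8, 2.4, 2.7, 4.3]
cv_x = [0.5, 1.5, 2.5, 3.5]
cv_y = [0.4, 1.6, 2.4, 3.6]

def fit_ridge(xs, ys, lam):
    """Closed-form weight minimizing sum((y - w*x)^2) + lam * w^2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def mean_squared_error(xs, ys, w):
    """Unregularized error, used to compare models on the cross-validation set."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Assumed candidate values; any increasing grid would do.
candidate_lambdas = [0, 0.01, 0.03, 0.1, 0.3, 1, 3, 10]

cv_errors = {}
for lam in candidate_lambdas:
    w = fit_ridge(train_x, train_y, lam)               # train with the penalty
    cv_errors[lam] = mean_squared_error(cv_x, cv_y, w)  # evaluate without it

best_lambda = min(cv_errors, key=cv_errors.get)
print(best_lambda, cv_errors[best_lambda])
```

The penalty is applied only while fitting. The cross-validation error is measured without it, so models trained with different λ values are compared on the same scale.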

Diagnosing Neural Networks

A neural network with fewer parameters is prone to underfitting. It is also computationally cheaper.
A large neural network with more parameters is prone to overfitting. It is also computationally expensive. In this case you can use regularization (increase λ) to address the overfitting.

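Using a single hidden layer is a good starting default. You can train networks with different numbers of hidden layers and then select the one that performs best on your cross-validation set.

A network with only one hidden layer is the simplest, but it may cost some performance, so we add hidden layers and features. A more complex network, however, brings overfitting and high computational cost, so the two have to be balanced.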

Model Complexity Effects:

Lower-order polynomials (low model complexity) have high bias and low variance. In this case, the model fits poorly consistently.
Higher-order polynomials (high model complexity) fit the training data extremely well and the test data extremely poorly. These have low bias on the training data, but very high variance.
In reality, we would want to choose a model somewhere in between, that can generalize well but also fits the data reasonably well.

By default, split the dataset into three parts: 60% training set, 20% cross-validation set and 20% test set.
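A 60/20/20 split can be sketched like this. The function name `split_dataset`, its parameters and the fixed seed are assumptions made for illustration. Shuffling before splitting is also an assumption, which suits data whose order carries no meaning.

```python
import random

def split_dataset(samples, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle the samples and split them into train / cross-validation / test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # local RNG, so the split is reproducible
    n_train = round(len(shuffled) * train_frac)
    n_cv = round(len(shuffled) * cv_frac)
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]       # whatever remains goes to the test set
    return train, cv, test

train, cv, test = split_dataset(range(100))
print(len(train), len(cv), len(test))
```

The training set is used to fit the parameters, the cross-validation set to pick settings such as λ or the number of hidden layers, and the test set only for the final estimate of generalization error.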