To avoid significant noise amplification when the number of training data points is small, one approach is to add an extra term (an extra constraint) to the least-squares cost function.

  • The extra term penalises the norm of the coefficient vector.

Modifying cost functions to favour structured solutions is called regularisation. Least-squares regression combined with l2-norm regularisation is known as ridge regression in statistics and as Tikhonov regularisation in the literature on inverse problems.

 

In the simplest case, a positive multiple of the sum of squares of the variables is added to the cost function:

$$ \sum_{i=1}^{k}(a_i^Tx-b_i)^2+\rho \sum_{i=1}^{n}x_i^2 $$

where

$$ \rho>0 $$

  • The extra term results in a sensible solution in cases where minimising the first sum alone does not.

Adding such terms reshapes the objective landscape and refines the choice among Pareto optimal solutions; the weight ρ controls the trade-off between fitting the data and keeping the coefficients small.
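To make this concrete, here is a minimal NumPy sketch (the data A, b and the weight rho below are made-up placeholders, not from the original text) that minimises the regularised objective by solving the linear system (A^T A + rho I) x = A^T b:

```python
import numpy as np

# Made-up data: k = 5 observations, n = 3 unknowns.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
b = rng.normal(size=5)
rho = 0.1  # regularisation weight, rho > 0

# Minimising ||Ax - b||^2 + rho * ||x||^2 leads to the linear system
# (A^T A + rho * I) x = A^T b, which has a unique solution for any rho > 0.
x_ridge = np.linalg.solve(A.T @ A + rho * np.eye(A.shape[1]), A.T @ b)

# Plain least-squares solution, for comparison.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

print("ridge solution        :", x_ridge)
print("least-squares solution:", x_ls)
```

Increasing rho shrinks the norm of the ridge solution towards zero; as rho approaches zero, the ridge solution approaches the plain least-squares solution.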


A least-squares problem is an optimisation problem with no constraints and an objective of the following form:

minimise

$$ f_0(x)=\left\|Ax-b \right\|_2^2=\sum_{i=1}^{k}(a_i^Tx-b_i)^2 $$

The objective function is a sum of squares of terms of the form

$$ a_i^Tx-b_i $$

 

Finding the solution reduces to solving a set of linear equations. First, expand the objective:

$$ f(x)=\left\| Ax-b\right\|_2^2=(Ax-b)^T(Ax-b) $$

$$ =((Ax)^T-b^T)(Ax-b) $$
$$ =x^TA^TAx-b^TAx-x^TA^Tb+b^Tb $$
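A quick numerical check of this expansion (a sketch with arbitrary made-up data):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 2))
b = rng.normal(size=4)
x = rng.normal(size=2)

lhs = np.sum((A @ x - b) ** 2)                            # ||Ax - b||^2
rhs = x @ A.T @ A @ x - b @ A @ x - x @ A.T @ b + b @ b   # expanded quadratic form
print(np.isclose(lhs, rhs))  # True
```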

 

If x is a global minimum of the objective function, then the gradient of f at x is the zero vector.

$$ \triangledown f(x)=(\frac{\partial f}{\partial x_1},...,\frac{\partial f}{\partial x_n}) $$

The gradients are:

$$ \triangledown(x^TA^TAx)=2A^TAx, \triangledown(b^TAx)=A^Tb, \triangledown(x^TA^Tb)=A^Tb $$

These gradients are taken with respect to

$$ x_1,...,x_n $$

Thus, the gradient of the objective function is

$$ \triangledown f(x)=2A^TAx-A^Tb-A^Tb=2A^TAx-2A^Tb $$
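The formula can be sanity-checked against central finite differences (a sketch; the data and the step size h are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(6, 3))
b = rng.normal(size=6)
x = rng.normal(size=3)

def f(z):
    return np.sum((A @ z - b) ** 2)  # ||Az - b||^2

# Analytical gradient: 2 A^T A x - 2 A^T b
grad = 2 * A.T @ A @ x - 2 * A.T @ b

# Central finite differences, one coordinate at a time.
h = 1e-6
grad_fd = np.zeros_like(x)
for i in range(len(x)):
    e = np.zeros_like(x)
    e[i] = h
    grad_fd[i] = (f(x + e) - f(x - e)) / (2 * h)

print(np.allclose(grad, grad_fd, atol=1e-5))  # True
```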

To find the least squares solution, we can solve

$$ \triangledown f(x)=0 $$

Or, equivalently, the normal equations

$$ A^TAx=A^Tb $$

So, assuming A has full column rank (so that A^TA is invertible), we have the analytical solution:

$$ x=(A^TA)^{-1}A^Tb $$
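Putting the derivation together, a minimal sketch with arbitrary data (note that in practice routines such as np.linalg.lstsq, which avoid forming A^T A explicitly, are numerically preferable):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(8, 3))  # assumed to have full column rank
b = rng.normal(size=8)

# Solve the normal equations A^T A x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)

# At the solution, the gradient 2 A^T A x - 2 A^T b is (numerically) zero.
grad = 2 * A.T @ A @ x - 2 * A.T @ b
print(np.allclose(grad, 0))  # True

# Agrees with NumPy's built-in least-squares routine.
x_np, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_np))  # True
```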

 

To recognise an optimisation problem as a least-squares problem, we only need to verify that the objective is a quadratic function (and then check whether the associated quadratic form is positive semidefinite).


A mathematical optimisation problem, in its most basic form, is written as follows:

minimise $$  f_0(x) $$

subject to $$ f_i(x)\leq b_i, \quad i=1,...,m $$

 

The optimisation variable of the problem is the following vector:

$$ x=(x_1,...,x_n) $$

The objective function is

$$ f_0:R^n\to R $$

The constraint functions are

$$ f_i:R^n\to R $$

The constants are

$$ b_1,...,b_m $$

Note that the constants are the limits or bounds for the constraints.

 

A vector

$$ x^* $$

is called optimal, or a solution of the problem, if it has the smallest objective value among all vectors that satisfy the constraints.

 

For example,

The objective function (the red-coloured quadratic function in the figure) is

$$ f_0(x)=x^2/5 $$

and the constraint function (the black-coloured linear function) is

$$ f_1(x)=0.5x+7 $$

The feasible area (coloured yellow) consists of the points z satisfying

$$ f_1(z) \leq b_1,...,f_m(z) \leq b_m $$

and for any such z we have

$$ f_0(z) \geq f_0(x^*) $$
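To make the standard form concrete, here is a minimal sketch using SciPy (the objective f0, constraint f1, and bound b1 below are invented for illustration and are not the functions from the figure):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical instance of: minimise f0(x) subject to f1(x) <= b1.
def f0(x):
    return (x[0] - 3) ** 2 + (x[1] + 1) ** 2  # objective (invented)

def f1(x):
    return x[0] + 2 * x[1]                    # constraint function (invented)

b1 = 0.0                                      # bound (invented)

# SciPy's 'ineq' constraints require g(x) >= 0,
# so f1(x) <= b1 is written as b1 - f1(x) >= 0.
result = minimize(
    f0,
    x0=np.zeros(2),
    method="SLSQP",
    constraints=[{"type": "ineq", "fun": lambda x: b1 - f1(x)}],
)

x_star = result.x
print("optimal point x*    :", x_star)
print("optimal value f0(x*):", result.fun)
print("feasible            :", f1(x_star) <= b1 + 1e-8)
```

The returned x* is a feasible point with the smallest objective value, matching the definition above: every other feasible z has f0(z) >= f0(x*).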
