Modern Optimization Methods (English) > Chapter 2. Fundamentals of Unconstrained Optimization > 2.3 Optimality Conditions Ⅱ
Hello, everyone.
Welcome to Modern Optimization Methods.
I'm Qingna Li.
Today we will introduce
the third part of Chapter 2,
on optimality conditions.
In the previous lecture, we already introduced
the first order necessary condition
and the second order necessary condition.
So in this part,
we will mainly focus on the second order sufficient conditions.
So what is a second order sufficient condition?
It is defined as follows:
we assume that certain conditions hold,
and then we derive that x* is a local minimizer.
This is what a sufficient condition means.
And "second order" means that we
use second order information.
So we must assume that
f is twice differentiable, so that its Hessian matrix exists.
Based on this analysis,
we state the second order sufficient condition theorem as follows.
We assume that
the Hessian of f is continuous
in an open neighborhood of x*,
that the gradient of f is zero at x*,
and further that the Hessian matrix of f at x* is positive definite.
Note that it must be positive definite, not merely positive semidefinite.
Under these assumptions,
we can conclude that x* is a strict local minimizer.
So this is the second order sufficient condition.
You can see that
the assumptions are, in a sense,
the reverse of the second order necessary condition:
we assume that the gradient is zero
and that the Hessian matrix at x* is positive definite,
and the conclusion is that x* is a strict local minimizer.
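In symbols, the theorem we have just stated reads:

\[
\nabla^2 f \ \text{continuous near } x^*, \quad
\nabla f(x^*) = 0, \quad
\nabla^2 f(x^*) \succ 0
\;\;\Longrightarrow\;\;
x^* \ \text{is a strict local minimizer of } f .
\]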
From this point of view,
let's again take an example to illustrate the theorem.
We take a quadratic function as an example.
If x* satisfies that the gradient is zero,
and the Hessian matrix,
which for a quadratic function is a constant matrix,
is positive definite,
then we can conclude that x* is a strict local minimizer.
So this is an example to show
the second order sufficient condition.
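Since the particular quadratic from the slide is not reproduced in this transcript, here is an illustrative instance of the same argument; the matrix Q and vector b below are hypothetical choices.

\[
f(x) = \tfrac12 x^\top Q x - b^\top x, \qquad
\nabla f(x) = Qx - b, \qquad
\nabla^2 f(x) = Q .
\]

If \(Q \succ 0\), for example \(Q = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}\) and \(b = (2, 1)^\top\), then the stationary point \(x^* = Q^{-1} b = (1, 1)^\top\) satisfies both assumptions of the theorem and is therefore a strict local minimizer (in fact the global minimizer).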
Now, to prove the result,
we still need some techniques,
and the technique is the same as before.
We use the assumption that
the Hessian matrix is continuous
in an open neighborhood of x*,
together with the other assumption that
the Hessian matrix at x* is positive definite.
With these two assumptions,
we can move a little away from x* within its neighborhood.
We define this neighborhood as D,
a ball around x* with radius r,
chosen small enough that within this neighborhood
the Hessian matrix of f remains positive definite.
Therefore, if we do a Taylor expansion of f at x*+p,
we obtain three terms:
the zeroth order term f(x*),
the first order term involving the gradient,
and the second order term,
in which a point z appears;
z lies between x* and x*+p.
Notice that we choose p with norm smaller than r,
so x*+p lies within the neighborhood D,
and therefore the point z also lies within this neighborhood.
By our assumption,
the Hessian matrix of f at z is positive definite.
Since the gradient of f at x* is zero, the first order term vanishes,
and the second order term is strictly greater than zero.
It follows that
the function value at x*+p is strictly greater than the function value at x*.
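For reference, the chain of equalities and inequalities just described can be written as

\[
f(x^* + p)
= f(x^*) + \nabla f(x^*)^\top p + \tfrac12\, p^\top \nabla^2 f(z)\, p
= f(x^*) + \tfrac12\, p^\top \nabla^2 f(z)\, p
> f(x^*)
\]

for every \(p\) with \(0 < \|p\| < r\), where \(z = x^* + t p\) for some \(t \in (0,1)\) lies in \(D\), the term \(\nabla f(x^*)^\top p\) vanishes because \(\nabla f(x^*) = 0\), and the quadratic term is strictly positive because \(\nabla^2 f(z) \succ 0\).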
In other words,
we have found a neighborhood of x*
within which every point other than x* has a
strictly greater function value than f(x*).
That is, x* is a strict local minimizer.
In this way,
we have proved the second order sufficient condition.
Different from the other two theorems in the previous lecture,
in this proof we did not argue by contradiction.
So this is another way to prove a theorem.
Now we have several remarks about the
second order sufficient condition.
The first one is that
sufficient conditions are conditions
under which a point is guaranteed to be a local minimizer.
There may be different types of
second order sufficient conditions:
as long as a condition involves second order information
and guarantees a local minimizer,
it is called a second order sufficient condition.
In the theorem here,
we introduce only one form
of second order sufficient condition;
there may be others.
In other words,
if one particular second order sufficient condition fails,
it does not imply that x is not a local minimizer,
because there may be other ways
to check or to prove
whether x is a strict local minimizer.
The second remark is that
the second order sufficient condition we present
is actually stronger than the necessary condition.
Recall that in the second order necessary condition,
the condition is that
the Hessian matrix of f at x* is positive semidefinite,
whereas in the second order sufficient condition,
the Hessian matrix of f at x* must be positive definite.
So the assumption in the second order sufficient condition
is stronger than that in the second order necessary condition.
The conclusion is also different, and stronger:
it guarantees a strict local minimizer,
rather than just a local minimizer.
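In symbols, the two conditions read (in both cases assuming the Hessian is continuous near x*):

\[
\begin{aligned}
\text{necessary:}\quad & x^* \ \text{local minimizer}
\;\Longrightarrow\; \nabla f(x^*) = 0,\ \nabla^2 f(x^*) \succeq 0, \\
\text{sufficient:}\quad & \nabla f(x^*) = 0,\ \nabla^2 f(x^*) \succ 0
\;\Longrightarrow\; x^* \ \text{strict local minimizer}.
\end{aligned}
\]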
Now, the final remark is that
the second order sufficient condition is not necessary
for a point to be a strict local minimizer.
There may be other conditions.
I'll give an example here:
take the function f(x) = x^4.
Obviously, zero is a local minimizer,
and it is in fact a strict local minimizer.
However, as we can verify,
the second derivative of f at x* = 0 is zero,
so the second order sufficient condition does not hold,
even though x = 0 is a strict local minimizer.
This means that
even if the second order sufficient condition fails,
x* may still be a strict local minimizer.
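In formulas, the example reads

\[
f(x) = x^4, \qquad f'(x) = 4x^3, \qquad f''(x) = 12x^2,
\]

so \(f'(0) = 0\) and \(f''(0) = 0\): the second order sufficient condition fails at \(x^* = 0\). Nevertheless \(f(x) = x^4 > 0 = f(0)\) for every \(x \neq 0\), so \(x^* = 0\) is a strict (indeed global) minimizer.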
So these are some remarks
about second order sufficient conditions.
Now let's take a look at the convex case.
For the convex case,
we have some very special results,
actually some very beautiful results.
The first one is that
if the function f is convex,
then any local minimizer is also a global minimizer.
This is a beautiful result,
because in this case
you do not have to worry about local versus global:
every local minimizer is a global minimizer.
The second result is that
if, in addition, f is differentiable,
then any stationary point x*
is a global minimizer of the function f.
This is the second result.
I also put a quadratic function here to show you that,
in the convex case,
x = 2 is both a local and a global minimizer.
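The quadratic from the slide is not reproduced in this transcript; a function with the same behavior, used here purely for illustration, is

\[
f(x) = (x - 2)^2, \qquad f'(x) = 2(x - 2), \qquad f''(x) = 2 > 0,
\]

which is convex, has x = 2 as its only stationary point, and by the results above that point is both a local and the global minimizer.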
Now let's try to prove these results.
For the first part,
we again argue by contradiction.
We assume that x* is a local minimizer
but not a global minimizer,
and we try to derive a contradiction.
How do we derive the contradiction?
We must use the convexity of f;
it is a very important property.
Since x* is not a global minimizer,
there must exist another point
whose function value is strictly lower than the function value at x*.
We denote this point by z, so f(z)
is strictly smaller than f(x*).
Now we connect z and x*:
we draw the line segment between x* and z.
In other words,
we take convex combinations of the two points,
and we denote such a convex combination by x.
Now we consider the function value of f at x.
By the convexity of f,
the value of f at x is at most
the corresponding convex combination of the two function values.
In other words, we have the convexity inequality.
Moreover, we have assumed that
the function value of f at z is strictly smaller than f(x*).
Therefore, we get that f(x),
the function value along this line segment (away from x* itself),
is strictly smaller than f(x*).
In other words,
in any neighborhood of x*,
we can find a piece of this line segment
whose function values are strictly smaller than f(x*).
That means x* is not a local minimizer,
and in this way we have derived a contradiction.
Therefore, we conclude that if x* is a local minimizer,
then it must be a global minimizer.
This is the first part of the proof.
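Written out, the argument of this first part is, with \(\lambda \in (0,1]\) parametrizing the segment,

\[
x = \lambda z + (1-\lambda) x^*, \qquad
f(x) \le \lambda f(z) + (1-\lambda) f(x^*)
< \lambda f(x^*) + (1-\lambda) f(x^*) = f(x^*).
\]

Taking \(\lambda\) arbitrarily small places \(x\) arbitrarily close to \(x^*\), so every neighborhood of \(x^*\) contains points with strictly smaller function value, contradicting the local minimality of \(x^*\).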
To show the second part of the result,
we again argue by contradiction;
however, the proof is much simpler than the first part.
The claim is that
any stationary point is a global minimizer of f.
For contradiction,
we assume that
x* is a stationary point
but not a global minimizer.
Similar to the previous argument,
we can find a point z
whose function value is strictly smaller than f(x*).
Now we still use convexity,
but through an equivalent characterization of convexity,
namely the first order condition for convexity.
It says that the inner product of the gradient of f at x* with z - x*
is bounded above by f(z) - f(x*).
Since f(z) is strictly smaller than f(x*),
this inner product must be strictly smaller than zero.
However, since x* is a stationary point,
the gradient of f at x* is zero,
so this inner product must equal zero.
In this way, we get a contradiction.
So by the way of contradiction,
we have shown that x* must be a global minimizer,
and we finish the second part of the proof.
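In formulas, this second part is the one-line computation

\[
0 > f(z) - f(x^*) \;\ge\; \nabla f(x^*)^\top (z - x^*) \;=\; 0,
\]

where the middle inequality is the first order characterization of convexity and the last equality uses \(\nabla f(x^*) = 0\); this is the desired contradiction.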
Now let's make a brief summary of this part.
So in this part,
we have shown the first order necessary condition,
the second order necessary condition,
as well as the second order sufficient condition.
In particular,
we have discussed the convex case,
where we have some special and beautiful results.
Now let's stop here and see you next time.