
7.6 Second Order Necessary Condition: lecture notes and transcript

Hi. Welcome to modern optimization methods.

Today we will talk about the sixth part of chapter seven,

the second order optimality conditions.

Recall that we are considering constrained optimization problems

with both equality and inequality constraints.

The KKT conditions we introduced before require that

the gradient of the Lagrangian function with respect to x is zero,

together with the feasibility conditions

and the complementarity conditions.

Recall also that when we introduced the constraint qualifications,

we used the set of linearized feasible directions F(x),

which approximates,

and under a constraint qualification actually equals,

the tangent cone of Ω at x.
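The slide formulas are not reproduced in the transcript; for reference, here is a standard statement of this setup, consistent with the earlier sections of the course:

```latex
% Constrained problem and Lagrangian (standard notation):
\min_{x} f(x) \quad \text{s.t.} \quad c_i(x)=0,\ i\in\mathcal{E}, \qquad c_i(x)\ge 0,\ i\in\mathcal{I},
\qquad
\mathcal{L}(x,\lambda)=f(x)-\sum_{i\in\mathcal{E}\cup\mathcal{I}}\lambda_i c_i(x).

% KKT conditions at (x^*, \lambda^*):
\nabla_x\mathcal{L}(x^*,\lambda^*)=0, \quad
c_i(x^*)=0\ (i\in\mathcal{E}), \quad
c_i(x^*)\ge 0,\ \lambda_i^*\ge 0,\ \lambda_i^* c_i(x^*)=0\ (i\in\mathcal{I}).

% Linearized feasible directions at a feasible x, with active set \mathcal{A}(x):
\mathcal{F}(x)=\bigl\{\, d :\ d^{\top}\nabla c_i(x)=0,\ i\in\mathcal{E};\ \
d^{\top}\nabla c_i(x)\ge 0,\ i\in\mathcal{A}(x)\cap\mathcal{I} \,\bigr\}.
```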

Now let's take a look at the properties

of the directions in the linearized feasible set.

Take the inner product of a direction d

with the gradient of the Lagrangian function with respect to x.

Because at the optimal solution x*

the gradient of the Lagrangian function is zero,

the inner product is zero as well.

If we expand the Lagrangian function,

we see that the two resulting terms are equal to each other,

because their difference is zero.

In other words,

the inner product of the direction d

with the gradient of f at the optimal solution x*

can be expressed by the second term,

the weighted sum of constraint gradients.

Rearranging, and noting that only indices i in the active set contribute,

we separate the equality constraints

and the active inequality constraints

into two parts.
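The slide itself is not reproduced here, but the computation being described follows directly from the KKT conditions above:

```latex
% From \nabla_x\mathcal{L}(x^*,\lambda^*) = \nabla f(x^*) - \sum_i \lambda_i^*\nabla c_i(x^*) = 0,
% taking the inner product with a direction d gives
d^{\top}\nabla f(x^*)
 \;=\; \sum_{i\in\mathcal{E}} \lambda_i^*\, d^{\top}\nabla c_i(x^*)
 \;+\; \sum_{i\in\mathcal{A}(x^*)\cap\mathcal{I}} \lambda_i^*\, d^{\top}\nabla c_i(x^*),
% where the inactive inequality constraints drop out because their
% multipliers are zero by complementarity.
```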

So why do we do this analysis?

We answer this question on the next slide.

Now,

if we take a further look at this equation,

it says that the inner product of d

with the gradient of f

can be expressed as two terms.

By the definition of linearized feasible directions,

the first term is actually zero,

because every linearized feasible direction d

has zero inner product with the gradients of the equality constraints.

Also by the definition of linearized feasible directions,

the second term is nonnegative,

because each λi is nonnegative

and each inner product is nonnegative.

So overall,

the sum of the two terms is nonnegative,

meaning that along a linearized feasible direction d

the function value f does not decrease to first order:

the inner product of d with the gradient of f

is greater than or equal to zero.
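In symbols, for any d in the set of linearized feasible directions:

```latex
% For any d \in \mathcal{F}(x^*):
d^{\top}\nabla c_i(x^*)=0\ (i\in\mathcal{E}), \qquad
\lambda_i^*\ge 0,\ d^{\top}\nabla c_i(x^*)\ge 0\ (i\in\mathcal{A}(x^*)\cap\mathcal{I}),
% so the equality part of the sum vanishes, the inequality part is nonnegative, and
d^{\top}\nabla f(x^*) \;\ge\; 0.
```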

Now let's take a look at the result.

To summarize,

every linearized feasible direction d has this property:

either d is a strictly increasing direction for f,

or d is orthogonal to the gradient of f.

So there are two possible cases.

One is that d strictly increases the function value to first order;

the other is that d is orthogonal to the gradient.

In the second case,

first order information alone cannot tell us

whether the function value will increase or not along d,

because such a d is orthogonal to the gradient.

Therefore,

the second order conditions

take care of exactly these directions d.

They control and characterize the properties

of such undecided directions

in order to analyze second order optimality.

Having this in mind,

we will define the critical cone,

which is related to exactly these undecided directions.

To present the critical cone,

let's first make some assumptions.

We assume that

the objective function f and the constraint functions

are twice continuously differentiable,

because we need the second order information

of these functions.

We also fix the following setting:

x* is a local minimizer,

F(x*) is the set of linearized feasible directions at x*,

and λ* is a Lagrange multiplier

satisfying the KKT conditions together with x*.

In other words,

when we talk about second order information,

we already assume that x* is a local minimizer

and that there is a corresponding Lagrange multiplier

satisfying the KKT conditions.

Now we give the definition of the critical cone.

The critical cone is defined as the intersection

of the set of linearized feasible directions

with another set described by an extra property.

This property requires that,

for every active inequality constraint

whose Lagrange multiplier λi* is strictly positive,

the direction w is orthogonal

to the gradient of that constraint.

The directions satisfying both requirements form the critical cone.

In other words,

the critical cone, denoted C(x*, λ*),

is attached to a local optimal solution x*

and its corresponding Lagrange multiplier λ*.

Given x* and λ*, we take the linearized feasible directions

and pick out those directions

which also satisfy the second requirement.

The intersection of the two sets

makes up the critical cone.

In other words,

the critical cone is described by three conditions.

First, for all equality constraints,

the direction should be orthogonal

to the constraint gradient.

For the active inequality constraints,

we separate them into two cases.

If the multiplier λi* is strictly positive,

then the direction should be orthogonal

to the gradient of that constraint.

Otherwise,

if λi* is zero,

then we only require

a nonnegative inner product with the gradient.

These are the directions in the critical cone.
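Written out in the standard notation (the slide is assumed to match this form):

```latex
% Critical cone:
C(x^*,\lambda^*)=\bigl\{\, w\in\mathcal{F}(x^*) :\
\nabla c_i(x^*)^{\top}w=0 \ \text{for all } i\in\mathcal{A}(x^*)\cap\mathcal{I}
\ \text{with}\ \lambda_i^*>0 \,\bigr\}.

% Equivalently, w \in C(x^*,\lambda^*) if and only if
\nabla c_i(x^*)^{\top}w=0,\ i\in\mathcal{E}; \qquad
\nabla c_i(x^*)^{\top}w=0,\ i\in\mathcal{A}(x^*)\cap\mathcal{I},\ \lambda_i^*>0; \qquad
\nabla c_i(x^*)^{\top}w\ge 0,\ i\in\mathcal{A}(x^*)\cap\mathcal{I},\ \lambda_i^*=0.
```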

You may wonder where such a definition comes from,

since at first sight it looks a bit strange.

We explain this on the next slide.

As we mentioned before,

the critical cone characterizes

exactly those undecided directions

for the function value f.

Now let's take a look at why this is the case.

We have the following remark.

As we mentioned before,

for an inactive inequality constraint

the multiplier λi* is zero,

by complementarity.

Therefore the corresponding term,

λi* times the inner product,

is exactly zero.

Now take a direction in the critical cone

and form its inner product with the gradient of f.

From the first order optimality condition,

this inner product can again be represented

as a weighted sum of inner products with constraint gradients.

We can separate this sum into equality constraints,

active inequality constraints,

and inactive inequality constraints.

For the inactive inequality constraints,

we have just shown that the corresponding terms are zero.

For the equality constraints and the active inequality constraints,

the terms are also zero by the definition of the critical cone:

either the direction is orthogonal to the constraint gradient,

or the multiplier itself is zero.

In other words,

the directions in the critical cone

are exactly the undecided directions:

each of them is orthogonal to the gradient of f,

so first order information alone cannot tell

whether f will increase or not along such a direction.
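As a worked equation, reconstructing the argument on the slide:

```latex
% For any w \in C(x^*,\lambda^*):
w^{\top}\nabla f(x^*)
 \;=\; \sum_{i\in\mathcal{E}} \lambda_i^*\, w^{\top}\nabla c_i(x^*)
 \;+\; \sum_{\substack{i\in\mathcal{A}(x^*)\cap\mathcal{I}\\ \lambda_i^*>0}} \lambda_i^*\, w^{\top}\nabla c_i(x^*)
 \;+\; \sum_{\substack{i\in\mathcal{I}\\ \lambda_i^*=0}} \lambda_i^*\, w^{\top}\nabla c_i(x^*)
 \;=\; 0,
% because in the first two sums w^{\top}\nabla c_i(x^*)=0 by the definition of
% the critical cone, and in the last sum \lambda_i^*=0.
```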

Now let's take a look at an example

to show what the critical cone is.

We minimize a very simple function f,

which is just x1,

subject to two inequality constraints.

You can sketch this problem

in two-dimensional space

and find the optimal solution, which is zero.

At this point,

we can identify the active inequality constraints:

both constraints are actually active.

With this,

we can solve the KKT conditions

and find the Lagrange multiplier.

We can also verify the LICQ condition

by forming the active set

and taking the gradients of the active inequality constraints:

they are linearly independent, so LICQ holds.

In other words,

the Lagrange multiplier is in fact unique.

Next we can write down the linearized feasible set:

we calculate the gradients of the two constraints

and collect the directions satisfying the corresponding conditions.

This gives the set of linearized feasible directions.

Based on this,

we can calculate the critical cone.

The critical cone picks up directions from this set,

and then we look further

for the active inequality constraints with positive λi*,

which here is only the second constraint.

For the second constraint we need the direction w

to be orthogonal to its gradient.

By a simple calculation,

w transpose times the gradient of c2

must be exactly zero,

so w can only take the form (0, d2),

where d2 is any real number.

Therefore,

taking the intersection with the linearized feasible directions,

the final critical cone consists of directions

whose first component is zero

and whose second component is nonnegative;

a numerical sketch of this example follows below.
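The constraints themselves appear only on the slide, so the following is a minimal numerical sketch under an assumed instance with the structure described above (minimize x1 over the upper half of a disk, minimizer at the origin, both constraints active, positive multiplier on the second constraint):

```python
# Minimal numerical sketch of the example (assumed instance, see the note above).
import numpy as np

# Assumed problem data:
#   min  f(x) = x1
#   s.t. c1(x) = x2 >= 0
#        c2(x) = 1 - (x1 - 1)**2 - x2**2 >= 0
# Candidate minimizer described in the lecture: x* = (0, 0), both constraints active.
x_star = np.array([0.0, 0.0])

grad_f  = np.array([1.0, 0.0])                       # gradient of f
grad_c1 = np.array([0.0, 1.0])                       # gradient of c1 at x*
grad_c2 = np.array([-2.0 * (x_star[0] - 1.0),        # gradient of c2 at x*,
                    -2.0 * x_star[1]])               # equals (2, 0)

# LICQ: gradients of the active constraints are linearly independent.
A = np.column_stack([grad_c1, grad_c2])
print("rank of active constraint gradients:", np.linalg.matrix_rank(A))  # 2

# KKT: grad_f = lam1 * grad_c1 + lam2 * grad_c2, solved for the multipliers.
lam = np.linalg.solve(A, grad_f)
print("Lagrange multipliers (lam1, lam2):", lam)      # (0.0, 0.5): lam2 > 0

# Linearized feasible directions: d1 >= 0 and d2 >= 0.
# Critical cone: additionally w^T grad_c2 = 0 (since lam2 > 0), i.e. w1 = 0,
# so C(x*, lam*) = { (0, w2) : w2 >= 0 }, as stated in the lecture.
w = np.array([0.0, 1.0])                              # a representative direction
print("w^T grad_c2 =", w @ grad_c2, ", w^T grad_c1 =", w @ grad_c1)
```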

With the critical cone in hand,

we can now state the second order necessary condition.

Now let's state the assumptions.

We first assume that x* is a local minimizer

of the constrained problem,

and LICQ holds.

Then we have the corresponding Lagrange multiplier λ*,

which satisfies

the KKT condition.

Then the second order necessary condition

says the following:

if we take the Hessian of the Lagrangian function

with respect to x at (x*, λ*),

then this matrix is positive semidefinite

over the critical cone.

In other words,

for every direction w in the critical cone,

the quadratic form of w with this Hessian is nonnegative.
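In symbols, the conclusion of the theorem reads:

```latex
% Second order necessary condition (x^* a local minimizer at which LICQ holds,
% \lambda^* the corresponding KKT multiplier):
w^{\top}\,\nabla^2_{xx}\mathcal{L}(x^*,\lambda^*)\,w \;\ge\; 0
\qquad \text{for all } w\in C(x^*,\lambda^*).
```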

So this is the second order necessary condition.

In other words,

along all the undecided directions

the Lagrangian function has nonnegative curvature;

they are directions in which the Lagrangian does not curve downward.

Now we have some remarks here.

So this theorem means that

if x* is a local solution,

then the Hessian of the Lagrangian function

has nonnegative curvature

along all the undecided directions,

that is, along all the directions in the critical cone.

Now let's make a brief summary about this part.

In this part,

we have introduced the second order necessary conditions.

To derive the second order necessary condition,

we introduced a very important definition

called the critical cone.

The critical cone characterizes the undecided directions

for the function f,

and the second order necessary condition

imposes a nonnegative curvature requirement

on exactly these undecided directions.

Okay,

we stop here for this part and see you next time.
