Modern Optimization Methods > Chapter 7. Theory of Constrained Optimization > 7.2 Examples One
Hi, everyone. Welcome to Modern Optimization Methods.
In this part, we will introduce the second part of Chapter Seven: the first-order necessary condition. We begin with an example.
Recall that in this chapter we are considering the constrained optimization problem, which takes the following form: we minimize a function f(x), where x lies in n-dimensional space, subject to some equality constraints as well as some inequality constraints. This is the problem we are considering.
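Written out, the problem takes this standard form, using E and I to denote the equality and inequality index sets:

```latex
\min_{x \in \mathbb{R}^n} f(x)
\quad \text{subject to} \quad
c_i(x) = 0,\ i \in \mathcal{E},
\qquad
c_i(x) \ge 0,\ i \in \mathcal{I}.
```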
Now we will introduce the first-order optimality condition for the constrained optimization problem.
We begin with an example, and let's first define the so-called active set. The active set is defined at a given point x: it collects the indices of the constraints that hold with equality at x. Formally, it is composed of all the equality constraint indices, together with the indices of those inequality constraints that hold with equality at x. This is the active set. In other words, given a point x, the active constraints are exactly the ones satisfied with exact equality, which means that the point lies exactly on the boundary curve of those constraints.
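In symbols, the active set at a feasible point x can be written as:

```latex
\mathcal{A}(x) = \mathcal{E} \cup \{\, i \in \mathcal{I} : c_i(x) = 0 \,\}.
```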
We also have the notions of active and inactive for inequality constraints. Suppose we have an inequality constraint c_i(x) ≥ 0; its boundary is the curve c_i(x) = 0, and the feasible side is where c_i(x) > 0. If x lies exactly on the boundary, then this inequality is active at x; in other words, you cannot move freely because of this constraint. However, if x lies in the interior of this constraint, then in a small neighborhood of x we can move freely, without taking this inequality constraint into account. This is what active and inactive mean for x with respect to a constraint c_i.
Now let's give an example. We minimize the linear function x1 + x2 subject to a single equality constraint, which is actually a circle: x1^2 + x2^2 = 2. For this problem there are no inequality constraints, and we can reformulate the constraint as c1(x) = x1^2 + x2^2 - 2 = 0. With this reformulation, we illustrate the problem in a figure.
Look at the figure here. Our feasible region is the circle; all points on the circle form the feasible set of this problem. On this circle, we are trying to find the point whose function value is the smallest. Recall that our objective function is x1 + x2. If we write down the gradient of this function, it is a constant vector, which means the function value increases fastest along that constant direction. Looking at the circle, think about this problem: which point will be the optimal solution?
Let's take a look at some cases. Suppose we are currently standing at this point on the circle. The gradient of f points in the direction in which the function value increases fastest, and the gradient of the constraint is orthogonal to the tangent line of the circle at that point. You can see that from this point, moving along the circle in one direction will actually decrease the function value, so this point cannot be the optimal solution. Now suppose we are standing here, where the gradient of the constraint, ∇c1, points in the direction opposite to the gradient of f. At this point we cannot move any further to reach a smaller function value: this is the smallest function value on the circle that we can achieve. And at the opposite point, where the two gradients point in the same direction, we reach the largest function value on the circle.
If we look at the relationship at the optimal point, you can see that the gradient of the constraint lies on the same line as the gradient of f; in fact, here the two vectors point in opposite directions. In other words, if x* is an optimal solution, the gradient of the constraint and the gradient of f are parallel. That is, we can find a constant λ1, which may be positive or negative, such that the two vectors are parallel. This is the first explanation of the optimality condition.
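As a quick numerical sketch of this parallelism, we can check the gradients of f(x) = x1 + x2 and c1(x) = x1^2 + x2^2 - 2 at a few points on the circle (the point names below are illustrative, not from the lecture):

```python
import numpy as np

def grad_f(x):
    # gradient of f(x) = x1 + x2 is the constant vector (1, 1)
    return np.array([1.0, 1.0])

def grad_c1(x):
    # gradient of c1(x) = x1^2 + x2^2 - 2 is (2*x1, 2*x2)
    return np.array([2.0 * x[0], 2.0 * x[1]])

def is_parallel(u, v, tol=1e-12):
    # two 2-D vectors are parallel iff their cross product vanishes
    return abs(u[0] * v[1] - u[1] * v[0]) < tol

x_min = np.array([-1.0, -1.0])            # minimizer: gradients opposite
x_max = np.array([1.0, 1.0])              # maximizer: gradients aligned
x_other = np.array([np.sqrt(2.0), 0.0])   # a non-optimal feasible point

print(is_parallel(grad_f(x_min), grad_c1(x_min)))      # True
print(is_parallel(grad_f(x_max), grad_c1(x_max)))      # True
print(is_parallel(grad_f(x_other), grad_c1(x_other)))  # False
```

At both candidate extremes the two gradients are parallel, while at an arbitrary feasible point they are not.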
Now let's take a look at a further interpretation of why the gradient of f and the gradient of the constraint need to be parallel. Previously we just demonstrated it by a figure; now we analyze it with some mathematical formulae. We assume that x is feasible; in other words, the equality constraint c1(x) = 0 holds (we have only one equality constraint). To maintain feasibility when we move by a step s, the next point x + s has to be feasible as well, meaning that c1(x + s) needs to be zero. To derive what a feasible direction must satisfy, we apply Taylor's expansion to this function: expanding at x by a linear term and dropping the higher-order terms, c1(x + s) is approximately equal to c1(x) plus the gradient of the constraint at x transposed with the direction s. Recall that x is a feasible point, so c1(x) is zero. Hence the approximation reduces to the requirement that the gradient of c1 at x transposed with s is zero. In other words, if we try to move along a feasible direction, that direction has to approximately satisfy that the gradient of the constraint transposed with s is zero. This is the feasibility condition.
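Written as one chain, assuming x is feasible:

```latex
0 = c_1(x+s) \approx c_1(x) + \nabla c_1(x)^{\top} s = \nabla c_1(x)^{\top} s,
```

so to first order a feasible direction must satisfy \(\nabla c_1(x)^{\top} s = 0\).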
Now let's take a look at the second part. When we move along a feasible direction, we also need to consider the reduction in the function value f. So what condition must the direction satisfy if it is to produce a decrease? As we mentioned before, a reduction in f means that the gradient of f transposed with s is strictly negative; this condition has to be satisfied. Therefore, to summarize, if we try to move to the next iterate, the direction s needs to satisfy both the reduction condition and the feasibility condition. The feasibility condition says that s must be orthogonal to the gradient of the constraint, and the reduction condition says that s multiplied by the gradient of f must be strictly negative.
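For the concrete circle example, these two conditions can be tested numerically. On the circle, the (linearized) feasible directions at a point x are the tangent directions ±t, and we ask whether either one is also a descent direction for f. This is a sketch, assuming the example problem f(x) = x1 + x2, c1(x) = x1^2 + x2^2 - 2:

```python
import numpy as np

grad_f = np.array([1.0, 1.0])  # gradient of f(x) = x1 + x2

def tangent(x):
    # a direction orthogonal to grad c1(x) = (2*x1, 2*x2):
    # rotate the normal by 90 degrees
    return np.array([-x[1], x[0]])

def has_descent_tangent(x):
    # both +t and -t satisfy the linearized feasibility condition;
    # check whether either one is also a strict descent direction for f
    t = tangent(x)
    return min(grad_f @ t, grad_f @ (-t)) < -1e-12

print(has_descent_tangent(np.array([np.sqrt(2.0), 0.0])))  # True: not optimal
print(has_descent_tangent(np.array([-1.0, -1.0])))         # False: minimizer
```

At a non-optimal feasible point a feasible descent direction exists; at the minimizer every feasible direction gives zero first-order change, so no such s exists.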
So what does the existence of such a direction s mean? If we can find a direction s satisfying both conditions, it means that x is not an optimal solution, because we can still move to get a better function value while staying feasible. Conversely, if no direction s satisfies the two conditions, then x* is a candidate for a local solution. In other words, if we know that x* is a local solution, we cannot move to reach a better solution; that is, there is no direction s satisfying both the feasibility condition and the reduction condition. This is a very important fact. And this situation occurs exactly when the gradient of f and the gradient of c1 are parallel to each other. As we mentioned before, if x* is already a local minimizer, then we cannot find such an s satisfying the reduction condition and the feasibility condition; only when the two vectors are parallel does no such direction exist.
Therefore, for the one-equality-constraint case, to derive the first-order optimality condition, let's mathematically define the Lagrangian function. For one equality constraint, the Lagrangian function is the objective function f(x) minus a constant λ1 multiplied by c1(x). The parameter λ1 here is exactly the constant that appears when the two gradients are parallel.
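In symbols, for the one-equality-constraint case:

```latex
\mathcal{L}(x, \lambda_1) = f(x) - \lambda_1 c_1(x),
\qquad
\nabla_x \mathcal{L}(x, \lambda_1) = \nabla f(x) - \lambda_1 \nabla c_1(x).
```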
After we introduce the Lagrangian function, then at a local solution x* we can find a scalar λ1* such that the gradient of the Lagrangian function is zero. If we take the gradient of the Lagrangian with respect to x, it is exactly the gradient of the objective function f minus λ1 times the gradient of c1. Setting it to zero means exactly that the gradient of f is parallel to the gradient of the equality constraint. This tells us the first-order optimality condition for the constrained problem. Here we also require the equality constraint to hold; in other words, there is a second, implicit requirement that x be a feasible point. So this gives the first-order optimality condition for the one-equality-constraint case.
Now let's present it mathematically. If x* is a local minimizer, then there exists a scalar λ1* such that the following two conditions hold. The first is that the gradient of the Lagrangian function with respect to x is zero, meaning that the two gradient vectors are parallel. The second says that x* must be a feasible point. This is the first-order optimality condition.
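We can verify both conditions numerically on the circle example. For f(x) = x1 + x2 and c1(x) = x1^2 + x2^2 - 2, the minimizer is x* = (-1, -1), and solving 1 = 2 λ1 x1 with x1 = -1 gives λ1* = -1/2 (these concrete values come from this particular example):

```python
import numpy as np

def grad_lagrangian(x, lam):
    # grad_x L(x, lam) = grad f(x) - lam * grad c1(x)
    # for f = x1 + x2 and c1 = x1^2 + x2^2 - 2
    return np.array([1.0, 1.0]) - lam * np.array([2.0 * x[0], 2.0 * x[1]])

x_star = np.array([-1.0, -1.0])
lam_star = -0.5

print(grad_lagrangian(x_star, lam_star))  # stationarity: gradient is zero
print(x_star @ x_star - 2.0)              # feasibility: c1(x*) = 0
```

Both the stationarity condition and the feasibility condition hold at (x*, λ1*).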
We have several remarks about this first-order optimality condition. Notice that it is only a necessary condition, meaning that if x* is a local minimizer, then this relationship holds. Conversely, if the relationship holds, we cannot guarantee that x* is a local minimizer. This is an important fact, because we will show that a local maximizer also satisfies this condition, which confirms that it is merely necessary. Another important fact is that the scalar λ1 may be positive or negative; there is no sign restriction on λ1 for an equality constraint. We only require the two vectors to be parallel.
Now let's make a brief summary of this part. We investigated the first-order optimality condition in the case where there is only one equality constraint, and we also introduced the definition of the active set, which will be used later in this chapter. We stop here and see you next time.