3.3 Inexact Line Search II: Lecture Notes, Key Points, and Transcript

Hi. Welcome to Modern Optimization Methods.

Today we will introduce the third part of chapter three,

the inexact line search.

In the previous lecture,

we already introduced the Wolfe conditions,

which are one of the inexact line search strategies.

Today we will continue to discuss more details

about inexact line search,

including the strong Wolfe condition,

the Goldstein condition,

as well as the backtracking strategy.

Now let's first recall the Wolfe conditions.

As we mentioned in the previous lecture,

the Wolfe conditions contain two parts.

The first one guarantees a sufficient decrease

of the function value.

And the second one is the curvature condition,

which rules out step lengths that are too small.
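For reference, the slides are not reproduced in this transcript, but in the standard notation, with ∇f_k = ∇f(x_k) and Φ(α) = f(x_k + α p_k), the Wolfe conditions read:

\[
\begin{aligned}
f(x_k + \alpha p_k) &\le f(x_k) + c_1\,\alpha\,\nabla f_k^{T} p_k && \text{(sufficient decrease)},\\
\nabla f(x_k + \alpha p_k)^{T} p_k &\ge c_2\,\nabla f_k^{T} p_k && \text{(curvature)},
\end{aligned}
\qquad 0 < c_1 < c_2 < 1.
\]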

So based on the Wolfe conditions,

we now introduce the strong Wolfe conditions.

From the name you can see that

the strong Wolfe conditions actually have more restrictions.

So what kind of restrictions?

The strong Wolfe conditions also include two parts.

The first one is the same as before,

the sufficient decrease condition.

In the second one, we have more restrictions.

Basically,

we put absolute values on both sides of the inequality.

This means that we actually remove

more candidate step lengths.

So if we rewrite the second condition in terms of Φ(α),

then we can see that

it places a stronger restriction on the slope

of Φ(α) at the chosen step length.
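In the same notation as above, the standard form of the strong Wolfe conditions is:

\[
\begin{aligned}
f(x_k + \alpha p_k) &\le f(x_k) + c_1\,\alpha\,\nabla f_k^{T} p_k,\\
\bigl|\nabla f(x_k + \alpha p_k)^{T} p_k\bigr| &\le c_2\,\bigl|\nabla f_k^{T} p_k\bigr|,
\end{aligned}
\qquad 0 < c_1 < c_2 < 1.
\]

Written in terms of Φ, the second inequality is |Φ′(α)| ≤ c2 |Φ′(0)|, which is the stronger restriction on the slope mentioned above.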

Now we come to the Goldstein conditions,

which are another inexact line search criterion.

We can see that the Goldstein conditions

actually consist of two inequalities.

So if we read these two inequalities separately,

let's look first at the right-hand one.

The right-hand inequality actually means

the same as before:

it is the same as the sufficient decrease condition

in the Wolfe conditions.

So we require that

the function value at the next iterate must be smaller

than the current one

plus a negative term,

which is exactly the sufficient decrease condition

in the Wolfe conditions.

If we look at the left-hand inequality,

it actually imposes another restriction

on the function value.

The new function value must be greater

than the current one plus another negative term.

In other words,

we require the function value

to have a sufficient decrease,

but we do not want the decrease to be too large.

So there is a lower bound on the function value at the next iterate.
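Again the slides are not shown in this transcript, but in the same notation the standard form of the Goldstein conditions is:

\[
f(x_k) + (1-c)\,\alpha\,\nabla f_k^{T} p_k \;\le\; f(x_k + \alpha p_k) \;\le\; f(x_k) + c\,\alpha\,\nabla f_k^{T} p_k,
\qquad 0 < c < \tfrac{1}{2}.
\]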

All right. So as you can see,

the Goldstein conditions can actually be viewed

as a special case of the Wolfe conditions.

There is only one parameter

in the Goldstein conditions,

which is c. However,

there are two parameters, c1 and c2,

in the Wolfe conditions.

So by a special choice of c1 and c2,

we can recover the Goldstein conditions.

Now we translate the Goldstein conditions

into the language of Φ(α) and plot them

in the following figure.

Let's take a look at this figure

to see how the Goldstein conditions work.

The curve is again Φ(α).

And at this point, the value is Φ(0), right?

So we draw two straight lines here.

The first line is the first (left-hand) term of the inequalities,

a line with the slope shown here. All right?

And the second line here

is the right-hand side of these conditions,

another straight line with a different slope. Right?

So the Goldstein conditions are actually looking for those α

for which Φ(α) lies between the two straight lines.

Taking the intersections of the two straight lines with Φ(α),

you can see that there are three intervals

that provide us with candidate step lengths for the Goldstein conditions.

In other words,

as we can see,

the Goldstein conditions rule out small α as well.

Meanwhile,

they do not necessarily include the smallest function values,

so the local minimizer here

and the local minimizer there

are both excluded. In other words,

the Goldstein conditions only look for a favorable decrease

in function values.

They do not look for the largest possible decrease

in function values.

Therefore, only this part, this part,

and that part are included.

So this is the situation for the Goldstein conditions.

And you can see that they may exclude the local

and global minimizers.

Now here are some remarks about the Goldstein conditions,

some of which I have already mentioned.

As you can see, the Goldstein conditions

are very often used in Newton-type methods

to guarantee global convergence.

However, since they may rule out the global minimizers

of Φ(α),

we may lose the chance to get a very large decrease.

So this is a characteristic feature of the Goldstein conditions.

Also, you may wonder

whether a feasible region for this α exists.

Since the Goldstein conditions

can be viewed as a special case of the Wolfe conditions,

the existence result for the Wolfe conditions

guarantees that the set of feasible α

for the Goldstein conditions is always nonempty.

Now let's take a look at the backtracking strategy.

From its name,

we can see that backtracking

makes the step size smaller and smaller.

Backtracking actually only requires one condition to hold,

and this condition is exactly the sufficient decrease condition

in the Wolfe conditions.

So with the backtracking approach,

we just use this condition to guarantee

sufficient decrease.

However,

the condition itself cannot get rid of very small step sizes α.

Another advantage of the backtracking approach is that

it needs only a small number of function evaluations.

Now let's present the algorithm for

the backtracking strategy. The idea is very simple.

We are given a starting value ᾱ,

which is our first trial step length.

We choose a ratio ρ,

by which the step length is decreased in each trial.

Then we choose a parameter c,

which appears in the condition to be satisfied.

After choosing these three parameters,

we can do the calculation.

So at the current xk and pk, notice that here xk

and pk are given.

We calculate the function value at the trial point,

then compare it with the right-hand side

to check whether the condition is satisfied.

If the answer is yes,

then we return, and we get our step length.

Otherwise, the condition does not hold,

so we need to decrease the step length.

So suppose the trial step size is very large,

you jump too far, and the condition fails.

In that case, we shrink the step size.

We move back a little bit

and try a new step size,

a fraction ρ of the previous one.

Then we repeat the calculation and check

whether the condition holds or not.

Once we find a step length that satisfies the condition,

we stop the iteration.

So backtracking is very simple,

and it is easy to implement in any software.
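As a minimal sketch of this loop (not from the lecture; the names f, grad_f and the quadratic test problem are illustrative assumptions), the backtracking strategy can be written in Python as follows:

```python
import numpy as np

def backtracking_line_search(f, grad_f, x_k, p_k, alpha_bar=1.0, rho=0.5, c=1e-4):
    """Shrink the trial step by the factor rho until the sufficient
    decrease condition
        f(x_k + alpha*p_k) <= f(x_k) + c*alpha*grad_f(x_k)^T p_k
    is satisfied. p_k is assumed to be a descent direction."""
    alpha = alpha_bar
    f_k = f(x_k)
    slope = grad_f(x_k) @ p_k            # directional derivative; negative for a descent direction
    while f(x_k + alpha * p_k) > f_k + c * alpha * slope:
        alpha *= rho                     # backtrack: try a fraction of the previous trial
    return alpha

# Usage sketch: one steepest descent step on f(x) = 0.5*||x||^2
f = lambda x: 0.5 * x @ x
grad_f = lambda x: x
x_k = np.array([1.0, 2.0])
p_k = -grad_f(x_k)                       # steepest descent direction
alpha_k = backtracking_line_search(f, grad_f, x_k, p_k)
```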

Now, for the backtracking line search strategy,

another very important fact

is the following condition.

This condition is frequently used

in proving the global convergence of line search methods.

It goes like this.

Assume that αk is the step length returned

by the backtracking strategy,

and that it is not the initial trial value.

Then the previous trial step length,

which is ρ inverse times αk,

must make the condition fail.

Because αk satisfies the condition and the loop stopped there,

the previous trial must have made the condition fail.

So we obtain the condition shown here.

That means

if we put the previous step length into the inequality,

the inequality changes direction

to a strict greater than.

This fact is frequently used to estimate a lower bound

on the step length in the convergence analysis.
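In symbols, using the same notation as above, the failed condition for the previous trial ρ⁻¹αk reads:

\[
f\bigl(x_k + \rho^{-1}\alpha_k\, p_k\bigr) \;>\; f(x_k) + c\,\rho^{-1}\alpha_k\,\nabla f_k^{T} p_k .
\]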

Now we also have some other remarks

about the step length αk.

For the initial step length ᾱ, in Newton's method

we naturally choose the unit step length.

For other practical application problems,

the initial trial step length may vary,

and we may choose other suitable values.

As for the ratio ρ,

which is the factor by which the step length is decreased,

we can also vary it a little.

At each iteration, we can choose this ρ

not as a fixed value,

but with a lower bound and an upper bound:

it can be chosen from a reasonable interval.

However,

this interval must be a subinterval

of (0, 1).

As long as this requirement is satisfied,

the result above still holds.

So these are some remarks

about the backtracking strategy.

Now we have introduced three particular strategies

for inexact line search,

including the strong Wolfe conditions,

the Goldstein conditions,

as well as the backtracking strategy.

The strong Wolfe conditions

and the Goldstein conditions

are closely related to

the Wolfe conditions. However,

the backtracking strategy is a different one,

which only guarantees the sufficient decrease.

There is no mechanism in the backtracking strategy

to rule out small step lengths. In other words,

the backtracking strategy may lead to

very small step lengths.

Okay, so we stop here for this part

and see you next time.
