Current course topic: Modern Optimization Methods (English) > Chapter 3. Line Search Methods > 3.3 Inexact Line Search II
Hi. Welcome to modern optimization methods.
Today we will introduce the third part of chapter three,
the inexact line search.
Before this lecture, we already introduced the Wolfe conditions, which are one of the inexact line search strategies.
Today we will continue to discuss more details about inexact line search, including the strong Wolfe conditions, the Goldstein conditions, as well as the backtracking strategy.
Now let's first recall the Wolfe conditions.
As we mentioned in the previous lecture, the Wolfe conditions contain two parts.
The first one guarantees a sufficient decrease of the function value.
The second one is the curvature condition, which rules out step lengths that are too small.
Based on the Wolfe conditions, we now introduce the strong Wolfe conditions.
From the name you can see that the strong Wolfe conditions impose more restrictions.
So what kind of restrictions?
The strong Wolfe conditions also consist of two parts.
The first one is the same as before: the sufficient decrease condition.
In the second one, we add more restrictions: basically, we put absolute values on both sides of the curvature inequality.
This means that we remove more candidates for the step length.
If we rewrite the second condition in terms of Φ(α), we can see that it places a tighter restriction on the slope of Φ(α) at the chosen step length.
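To make the two inequalities concrete, here is a minimal sketch of a strong Wolfe check, written in terms of the one-dimensional function Φ(α) = f(x_k + α p_k) and its derivative. The function names and default parameters are illustrative, not taken from the lecture slides:

```python
def strong_wolfe(phi, dphi, alpha, c1=1e-4, c2=0.9):
    """Check the strong Wolfe conditions for a trial step length alpha.

    phi(a)  = f(x_k + a * p_k)               (function value along p_k)
    dphi(a) = grad f(x_k + a * p_k)^T p_k    (directional derivative)
    Requires 0 < c1 < c2 < 1.
    """
    # Part 1: sufficient decrease of the function value.
    sufficient = phi(alpha) <= phi(0.0) + c1 * alpha * dphi(0.0)
    # Part 2: strong curvature condition, with absolute values on
    # both sides, so only step lengths where the slope of phi is
    # already small (close to a stationary point of phi) are accepted.
    curvature = abs(dphi(alpha)) <= c2 * abs(dphi(0.0))
    return sufficient and curvature
```

For example, with Φ(α) = (α − 1)², the minimizer α = 1 passes both tests, while a tiny step like α = 0.01 is rejected by the curvature part, even though it decreases the function value.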
Now let's turn to the Goldstein conditions, which are another inexact line search criterion.
The Goldstein conditions consist of two inequalities.
If we write these conditions down separately, let's look at the right inequality first.
The right inequality means the same as before: it is exactly the sufficient decrease condition in the Wolfe conditions.
That is, we require that the function value at the next iterate must be smaller than the current one plus a negative term.
If we look at the left inequality, it places another restriction on the function value: it must be greater than the current value plus another negative term.
In other words, we require the function value to decrease sufficiently, but we do not wish the decrease to be too large.
So there is a lower bound for the next iterate.
All right. As you can see, the Goldstein conditions can be regarded as a special case of the Wolfe conditions.
There is only one parameter c in the Goldstein conditions, whereas there are two parameters c1 and c2 in the Wolfe conditions.
By a special choice of c1 and c2, we can reach the Goldstein conditions.
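The two inequalities can be sketched in a few lines of Python, again in terms of Φ(α); the names and the default value c = 0.25 are illustrative (the lecture only requires 0 < c < 1/2):

```python
def goldstein(phi, dphi, alpha, c=0.25):
    """Check the Goldstein conditions for a trial step length alpha.

    Left inequality (lower bound, decrease must not be too large):
        phi(0) + (1 - c) * alpha * dphi(0) <= phi(alpha)
    Right inequality (sufficient decrease, same as in Wolfe):
        phi(alpha) <= phi(0) + c * alpha * dphi(0)
    Requires 0 < c < 1/2.
    """
    lower = phi(0.0) + (1.0 - c) * alpha * dphi(0.0)
    upper = phi(0.0) + c * alpha * dphi(0.0)
    return lower <= phi(alpha) <= upper
```

With Φ(α) = (α − 1)² and c = 0.25, the accepted step lengths form the interval [0.5, 1.5]: a small step α = 0.1 violates the lower bound, and a large step α = 2 violates the sufficient decrease bound.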
Now we translate the Goldstein conditions into the language of Φ(α) and plot them in the following figure.
Let's take a look at this figure to see how the Goldstein conditions work.
As before, we have the curve of Φ(α), and at this point the value is Φ(0).
We draw two straight lines starting from Φ(0).
The first line corresponds to the left-hand side of the inequalities, the lower-bound line with slope (1 − c)Φ'(0).
The second line corresponds to the right-hand side of the conditions, the sufficient decrease line with slope cΦ'(0).
The Goldstein conditions look for those α whose values of Φ lie between these two straight lines.
Taking the intersections of the two lines with the curve of Φ(α), you can see that there are three intervals that provide candidate step lengths for the Goldstein conditions.
In other words, the Goldstein conditions rule out small α as well.
Meanwhile, they do not include the very smallest function values: the local minimizers in the figure are all excluded.
In other words, the Goldstein conditions only look for a favorable decrease in function values; they do not look for the largest possible decrease.
Therefore, only those three parts of the curve are included.
This is the situation for the Goldstein conditions, and you can see that they may exclude the local and even the global minimizers.
Now here are some remarks about the Goldstein conditions; some of them I already mentioned.
The Goldstein conditions are often used in Newton-type methods to guarantee global convergence.
However, since they may rule out the global minimizers of Φ(α), we may lose the chance to get a very large decrease.
This is a feature of the Goldstein conditions.
You may also wonder whether a feasible region for α exists at all.
Since the Goldstein conditions are a special case of the Wolfe conditions, by the results for the Wolfe conditions we can guarantee that the set of α satisfying the Goldstein conditions is always nonempty.
Now let's take a look at the backtracking strategy.
From its name, you can see that backtracking shrinks the step size smaller and smaller.
Backtracking only requires that one condition holds, and this condition is exactly the sufficient decrease condition in the Wolfe conditions.
So with the backtracking approach, we just use this condition to guarantee sufficient decrease; however, we cannot rule out very small step sizes α.
Another advantage of the backtracking approach is that it needs very few function evaluations in each iteration.
Now let's present the algorithm for the backtracking strategy. The idea is very simple.
We are given a starting point ᾱ, which is our first trial step length.
We choose a ratio ρ, by which the step length shrinks in each trial.
Then we choose a parameter c for the condition to be satisfied.
After choosing these three parameters, we can do the calculation.
At the current iterate x_k with search direction p_k (note that x_k and p_k are given), we calculate the function value f at the trial point, then compare it with the right-hand side to check whether the condition is satisfied.
If the answer is yes, then we return, and we get our step length.
Otherwise, the condition does not hold, so we need to decrease the step length.
Imagine that you jump far away with a very large step size and the condition fails.
In that case, we shrink the step size: we move back a little bit and try a new step size, a ratio ρ of the original one.
Then we repeat the calculation to check whether the condition holds or not.
Once we find a step that satisfies the condition, we stop the iteration.
So backtracking is very simple, and it is easy to program in any software.
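The loop just described can be sketched as follows, again in terms of Φ(α) = f(x_k + α p_k); the function name, the iteration cap, and the default parameters are illustrative choices, not prescribed by the lecture:

```python
def backtracking(phi, dphi0, alpha_bar=1.0, rho=0.5, c=1e-4, max_iter=50):
    """Backtracking line search (a sketch).

    phi(a)  = f(x_k + a * p_k)    (function value along p_k)
    dphi0   = grad f(x_k)^T p_k   (directional derivative at a = 0)
    Starting from the trial step alpha_bar, shrink by the ratio rho
    until the sufficient decrease condition holds:
        phi(alpha) <= phi(0) + c * alpha * dphi0
    """
    alpha = alpha_bar
    phi0 = phi(0.0)
    for _ in range(max_iter):
        if phi(alpha) <= phi0 + c * alpha * dphi0:
            return alpha          # condition holds: accept this step
        alpha *= rho              # condition fails: move back a little
    return alpha                  # safeguard cap on the number of trials
```

For Φ(α) = (α − 1)², the unit trial step is accepted immediately; for Φ(α) = (α − 0.1)², the trial steps 1, 0.5, and 0.25 all fail, and the search returns 0.125.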
For the backtracking line search strategy, another very important fact is the following condition, which is frequently used in proving the global convergence of line search methods.
It goes like this.
Assume that α_k is the step length returned by the backtracking line search.
Then the previous trial step length, which is ρ⁻¹α_k, must make the condition fail.
Because α_k is the first trial that satisfies the condition, the previous trial must have failed it.
So we get the following condition: if we put the previous trial step length into the inequality, the inequality changes direction to a strict "greater than".
This fact is frequently used to estimate a lower bound on the step length in convergence analysis.
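In symbols, this key fact reads as follows (a sketch in the lecture's notation, with Φ(α) = f(x_k + α p_k), assuming at least one shrink occurred before α_k was accepted):

```latex
% alpha_k was accepted, so sufficient decrease holds:
\phi(\alpha_k) \le \phi(0) + c\,\alpha_k\,\phi'(0),
% while the previous trial rho^{-1} alpha_k was rejected:
\phi(\rho^{-1}\alpha_k) > \phi(0) + c\,\rho^{-1}\alpha_k\,\phi'(0).
```

The second, strict inequality is the one used to bound α_k from below in convergence proofs.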
We also have some further remarks about the step length α_k.
For the initial trial step length, in Newton's method we naturally choose the unit step length.
For other real application problems, the initial trial step length may vary, and we may choose other proper values.
As for the ratio ρ, the shrinking ratio of the step length, we can also relax it a little bit.
In each iteration, ρ need not be a fixed value; it can be chosen from a reasonable interval with a lower bound and an upper bound.
However, this interval must be a subinterval of (0, 1).
As long as this condition is satisfied, the results above still hold.
So these are some remarks about the backtracking strategy.
Now we have introduced three strategies for inexact line search: the strong Wolfe conditions, the Goldstein conditions, as well as the backtracking strategy.
The strong Wolfe conditions and the Goldstein conditions are closely related to the Wolfe conditions.
However, the backtracking strategy is different: it only guarantees sufficient decrease.
There is no way for the backtracking strategy to rule out small step lengths; in other words, it may lead to very small step lengths.
Okay, we stop here for this part, and see you next time.