Current course topic: Modern Optimization Methods (in English) > Chapter 7. Theory of Constrained Optimization > 7.8 Duality
Hi. Welcome to modern optimization methods.
Today we will introduce the last part of
chapter seven,
the duality theorem.
Now,
note that duality is an important part
of optimization algorithms,
especially for the augmented Lagrangian method
introduced later.
The duality theorem also provides
another point of view on linear optimization,
and it offers important insights
into convex optimization problems.
So to address the duality theorem,
let's first take a look at the KKT conditions.
Recall that we are
considering a constrained optimization problem.
At this moment,
we consider only the inequality constraints;
the equality constraint case
is similar.
Here we write the constraints c
in vector form.
That means each component
must satisfy ci(x) ≥ 0,
i.e., greater than or equal to zero.
This is for the constraints.
From this point of view,
we can write down the Lagrangian function
in vector form.
Here we use λᵀc(x)
to denote the sum of the terms
λi multiplied by ci(x).
After introducing the Lagrangian function,
we can state the following results.
We first define a new function q of λ
by taking the infimum of the Lagrangian function
with respect to x.
Recall that the Lagrangian function
has two groups of variables:
x is the first group,
and λ is the second.
We take the infimum
of the Lagrangian function
with respect to x
over the whole space.
This yields a function of λ alone,
which we call q(λ).
This q is important:
it will be the dual objective function.
Notice that in many cases,
the value of q(λ) will be negative infinity.
Due to this observation,
we define the effective domain
as follows:
the domain D of q consists of those λ
for which the function value
is greater than negative infinity.
All the λ with finite function values
are collected in this set D.
The dual problem is then defined
as maximizing q(λ)
subject to λ being nonnegative.
The nonnegativity
comes from the fact that
the c(x) here are inequality constraints.
So this is the dual problem.
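Collecting the definitions above in one place, in the notation used here:

```latex
\begin{aligned}
&\text{Lagrangian:}       && \mathcal{L}(x,\lambda) = f(x) - \lambda^{T} c(x), \\
&\text{dual objective:}   && q(\lambda) = \inf_{x}\, \mathcal{L}(x,\lambda), \\
&\text{effective domain:} && D = \{\lambda : q(\lambda) > -\infty\}, \\
&\text{dual problem:}     && \max_{\lambda}\; q(\lambda) \quad \text{subject to } \lambda \ge 0.
\end{aligned}
```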
Let's look at an example
of how to derive the dual problem.
We have a
minimization of a quadratic function
subject to a linear
inequality constraint.
Following our recipe
for generating the dual problem,
we first write down the Lagrangian function,
which is the objective function minus
λ1 multiplied by the inequality constraint.
After that,
we take the infimum
of the Lagrangian function with respect to x,
which means that we try to find the
minimizer of this Lagrangian function
with respect to x1 and x2.
To do this,
we proceed
in several steps.
Throughout,
this λ is a fixed parameter,
which we assume is given.
Next,
observe that this Lagrangian function
is strongly convex
with respect to
x1 and x2.
Therefore,
to get the minimizer,
we can simply set the gradient of this
function with respect to x to zero,
which gives the optimal x1 and x2.
Doing so,
we find that
x1 = λ1 and x2 = 0.
Substituting
this minimizer back into the function,
we obtain the minimum value
of the Lagrangian function,
which is our q(λ1).
The dual objective function q
therefore depends only on the variable λ1,
with the formula
we have just calculated.
So this is the dual objective function.
After that,
the final step is that
we take the maximum of the dual function.
So the dual function is here.
Meanwhile,
we require this λ1 to be nonnegative.
Therefore, this is the dual problem.
We can easily see that
the optimal solution for the dual problem
is achieved at λ1=1.
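As a check, here is a minimal sketch in Python. The transcript does not reproduce the primal problem explicitly, so the code assumes the classic instance min ½(x1² + x2²) subject to x1 − 1 ≥ 0, which is consistent with everything derived above (minimizer x1 = λ1, x2 = 0, and optimal λ1 = 1):

```python
# Assumed primal (not stated explicitly in the transcript):
#     minimize 1/2 (x1^2 + x2^2)   subject to  x1 - 1 >= 0.

def lagrangian(x1, x2, lam):
    # L(x, lambda) = f(x) - lambda * c(x)
    return 0.5 * (x1 ** 2 + x2 ** 2) - lam * (x1 - 1.0)

def q(lam):
    # grad_x L = 0 gives the minimizer x1 = lam, x2 = 0; substitute back:
    return lagrangian(lam, 0.0, lam)  # equals lam - 0.5 * lam**2

# Maximize q over lam >= 0 on a grid (the dual problem).
grid = [i / 1000.0 for i in range(3001)]
lam_star = max(grid, key=q)
print(lam_star, q(lam_star))  # -> 1.0 0.5
```

The grid maximization lands exactly at λ1 = 1, matching the value obtained analytically in the lecture.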
This example demonstrates
how to derive the dual problem in a concrete case.
To summarize,
the objective of the dual problem
takes the following form:
we take the infimum
of the Lagrangian function with respect to x.
Given a particular problem,
it may not be easy
to find the dual objective,
because complicated functions
or complicated constraints
may prevent you from finding an easy solution
for this infimum.
On the other hand,
when f and the −ci
are all convex functions,
in other words,
when the problem is a convex problem,
then the Lagrangian itself
is a convex function of x.
In this situation,
all local minimizers are global minimizers.
Therefore,
the infimum is easier to compute,
as in the previous example.
So this is a very special case.
Now let's talk about the duality theorem.
We have two theorems about the duality.
The first one is that
the function q is always concave
and its domain D is convex.
This is a very interesting result:
no matter whether the primal problem
is convex or not,
the dual objective
is always concave
and its domain D is convex.
In other words,
if we convert the dual problem
into a minimization problem,
it is a convex problem.
The second result is that
there is some relationship
between the primal problem and the dual problem.
The relationship says that
if we take any feasible point
for the primal problem
and evaluate the objective function there,
and take any dual variable λ bar
satisfying the nonnegativity constraint,
then the dual objective value is always
smaller than or equal to the primal
objective value.
In other words,
the dual problem's
function values
never exceed the primal ones.
This is called the weak duality theorem.
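The weak duality statement can be written as a single chain of inequalities: for any primal-feasible x̄ (so c(x̄) ≥ 0) and any λ̄ ≥ 0,

```latex
q(\bar{\lambda}) = \inf_{x}\, \mathcal{L}(x,\bar{\lambda})
\;\le\; \mathcal{L}(\bar{x},\bar{\lambda})
\;=\; f(\bar{x}) - \bar{\lambda}^{T} c(\bar{x})
\;\le\; f(\bar{x}),
```

where the last inequality uses λ̄ ≥ 0 and c(x̄) ≥ 0.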
Of course, we also have a strong duality theorem
under certain constraint qualifications.
Now let's revisit the KKT conditions.
Recall that we have now
rewritten them in vector form.
The vector form means that
we collect the gradients of the constraints
as column vectors.
Then the first part of the KKT conditions
can be rewritten as the gradient of the objective
minus the matrix of constraint gradients
multiplied by the Lagrange multiplier λ bar.
The other parts of the KKT conditions
are the same as before.
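In this vector form, with ∇c(x) = [∇c₁(x), …, ∇cₘ(x)] collecting the constraint gradients as columns, the KKT conditions read:

```latex
\nabla f(\bar{x}) - \nabla c(\bar{x})\,\bar{\lambda} = 0, \qquad
c(\bar{x}) \ge 0, \qquad
\bar{\lambda} \ge 0, \qquad
\bar{\lambda}_i\, c_i(\bar{x}) = 0, \quad i = 1,\dots,m.
```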
With this formulation,
we can state
the following strong duality theorem.
Let's take a detailed look at
what this theorem says.
We suppose that x bar is the solution
of the primal problem.
And f and -ci are convex functions.
So this basically means that we are considering a
convex optimization problem.
Assume further that
the objective and the constraints
are all differentiable
at this optimal solution x bar.
Then for any λ bar
which satisfies the KKT conditions,
we can have the result that
this λ bar is also a solution of the dual problem.
In other words,
at an optimal solution x bar
of the primal problem,
any multiplier satisfying the KKT conditions
also corresponds to a solution
of the dual problem.
So if we can successfully solve the KKT conditions
in the convex case,
then on one hand we get
the optimal solution of the primal problem,
and meanwhile
we get an optimal solution
of the dual problem,
which is very nice.
Now let's take a look at another example,
which is about linear programming.
For linear programming,
we have a linear objective function
and linear constraints.
Here we again consider
linear inequality constraints.
Now,
following the recipe for writing down
the dual objective function,
the Lagrangian function
is as follows:
the primal objective function
minus the Lagrange multiplier
multiplied by the inequality constraints.
This is the Lagrangian function.
We minimize
this Lagrangian function over x
to get the dual objective.
So how can we compute this infimum?
Since we are minimizing
with respect to x,
we reorganize the function
in terms of x,
writing it as an affine function of x.
The linear term in x
has coefficient c − A^Tλ,
and the constant term is b^Tλ.
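Written out, assuming the standard primal form min_x cᵀx subject to Ax ≥ b (consistent with the constant term bᵀλ here), the regrouping is:

```latex
\mathcal{L}(x,\lambda) = c^{T}x - \lambda^{T}(Ax - b)
= (c - A^{T}\lambda)^{T} x + b^{T}\lambda .
```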
If we take the infimum
over the whole space,
how do we get it?
Notice that the coefficient here
is not fixed:
it depends on λ.
So we have to discuss
two different cases.
The first case is that
the coefficient is nonzero.
Then the Lagrangian is a nonconstant
affine function over the whole space,
so by moving x
far along the direction of the negative coefficient,
we can drive the function value
to negative infinity.
In this case,
case one,
q(λ)
is negative infinity.
In the second case,
the coefficient is zero;
in other words,
the Lagrangian is a constant function
with respect to x.
Then the infimum is
this constant itself.
Summarizing the two cases,
we see that when maximizing q(λ),
we can exclude
the negative-infinity part:
the maximum of q(λ)
will be taken over the finite part.
To attain this finite value,
we need an extra constraint,
namely that
the coefficient of x is zero.
In a word,
we maximize q.
The dual objective function is b^Tλ,
which is a constant with respect to x.
Meanwhile,
we require λ
to be nonnegative,
and we must also ensure that
the coefficient of x is zero,
which gives the equality constraint A^Tλ = c.
Deriving in this way,
we obtain
the dual objective function
and the dual problem for linear programming.
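As a numeric sanity check of this derivation, here is a small sketch on a hand-picked instance (the instance is my own, not from the lecture). For this LP pair, the dual value max bᵀλ subject to Aᵀλ = c, λ ≥ 0 matches the primal optimum:

```python
# Primal:  min c^T x  s.t.  Ax >= b.   Dual:  max b^T lam  s.t.  A^T lam = c, lam >= 0.
c = [1.0, 1.0]
A = [[1.0, 0.0],   # x1 >= 0
     [0.0, 1.0],   # x2 >= 0
     [1.0, 1.0]]   # x1 + x2 >= 1
b = [0.0, 0.0, 1.0]

# For this instance the primal optimum is 1: the last constraint gives
# c^T x = x1 + x2 >= 1, attained e.g. at x = (1, 0).
primal_opt = 1.0

# Candidate dual point lam = (0, 0, 1): check dual feasibility A^T lam = c, lam >= 0.
lam = [0.0, 0.0, 1.0]
At_lam = [sum(A[i][j] * lam[i] for i in range(3)) for j in range(2)]
assert At_lam == c and all(l >= 0 for l in lam)

dual_value = sum(bi * li for bi, li in zip(b, lam))
print(dual_value)  # -> 1.0, matching the primal optimum (strong duality for LP)
```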
Now let's make a brief summary
of this part.
In this part,
we introduced the dual problem,
including the dual objective
as well as the corresponding constraints.
We also demonstrated two examples
to show how to derive the dual problem
in practice.
Let's also make a brief summary of this chapter.
In this chapter,
we introduced
how to check the optimality conditions
for constrained optimization problems,
including the first-order necessary conditions,
the second-order necessary conditions,
as well as the second-order sufficient conditions.
Finally, we also discussed duality theory,
which is an important part
of constrained optimization.
Okay, so we stop here
and see you next time.