Current course topic: Modern Optimization Methods (in English) > Chapter 9. Penalty and Augmented Lagrangian Methods > 9.5 Quadratic Penalty Method for Hypergraph Matching: Ⅱ
Hi. Welcome to modern optimization methods.
Today we will introduce the fifth part
of chapter nine,
the hypergraph matching problem,
and how we apply quadratic penalty method to solve it.
Recall that in the previous part,
we have introduced the hypergraph matching problem,
as well as its mathematical formulation.
Now, recall that the mathematical formulation
for hypergraph matching takes the following form.
We minimize a multilinear function f
subject to some equality constraints,
as well as a 0-1 constraint.
Notice that the 0-1 constraint
means that
this problem is a discrete optimization problem.
Or more specifically,
it is a 0-1 integer programming problem,
which is in general NP-hard to solve.
Therefore, in this part,
we will mainly discuss
how we can design some algorithms
to solve this problem.
Now recall that the equality constraints
mean that for each block
of the vector x,
the sum of its components is 1.
Then we have the constraint
that each component of x is either 0 or 1.
In other words,
in each block
there can be only one component equal to 1,
and the other components must be 0.
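Putting these pieces together, and writing x in n1 blocks of length n2 as in the previous part, the problem can be sketched as follows; this is only a sketch in the notation above, and the exact multilinear objective f, built from the affinity tensor, is as given on the slides:

```latex
\min_{x}\; f(x)
\quad \text{s.t.} \quad
\sum_{j=1}^{n_2} x_{ij} = 1,\; i = 1,\dots,n_1,
\qquad x \in \{0,1\}^{n_1 n_2}
```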
So from this point of view,
we can reformulate this problem
as a sparse constrained optimization problem.
We write it as follows.
So the objective function is the same as before,
equality constraints are the same as before.
However,
for the 0-1 constraint,
we rewrite it as a lower bound constraint,
as well as a sparse constraint.
The zero norm here means
the number of nonzero elements in x.
In other words,
this constraint basically says that
the number of nonzero elements in the vector x
should not exceed n1.
Recall that we also have the constraint
that the sum of each block is 1.
Therefore,
combining the two conditions,
we see that each block
can have only one nonzero element,
which is 1,
and the rest of the components
in the block
must be 0.
So this is a sparse constrained formulation.
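In symbols, the sparse constrained reformulation reads roughly as follows, where the zero norm counts the nonzero entries; again a sketch in the notation above:

```latex
\min_{x}\; f(x)
\quad \text{s.t.} \quad
\sum_{j=1}^{n_2} x_{ij} = 1,\; i = 1,\dots,n_1,
\qquad x \ge 0,
\qquad \|x\|_0 \le n_1
```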
And from this point of view,
there are different ways
to solve this sparse constrained optimization problem.
For example,
hard-thresholding-type algorithms,
as well as l0-penalty-based algorithms.
We can also use lp regularization algorithms
to solve it approximately.
These are different algorithms
based on the zero norm constraint.
Now what we are going to do
is relax this problem
by simply dropping the zero norm constraint.
We get the following relaxation problem,
with the equality constraints
and the lower bound constraints.
This relaxation problem is actually the same
as the following one,
where we add extra upper bound constraints:
x is less than or equal to 1.
The reason is that, by the equality constraints
and the nonnegativity constraints,
no component of x
can be greater than 1.
So the two relaxations
are the same.
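So the relaxation we work with can be sketched as follows; by the argument just given, the upper bounds may be written explicitly or dropped:

```latex
\min_{x}\; f(x)
\quad \text{s.t.} \quad
\sum_{j=1}^{n_2} x_{ij} = 1,\; i = 1,\dots,n_1,
\qquad 0 \le x \le 1
```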
As we mentioned in the previous part,
the traditional relaxation is this one.
However,
there has been no theoretical investigation
of the relationship
between the relaxation problem
and the original problem.
Therefore,
the motivation of our method is that
we first study the relationship
between the relaxation problem
and the original problem.
Surprisingly,
we can prove the following result.
The relaxation problem actually admits
at least one global minimizer
of the original problem.
The mathematical statement is the following:
there exists a global minimizer x*
of the relaxation problem
such that the zero norm of x* is exactly n1.
Furthermore,
in this case, x* is a global minimizer
of the original problem.
In other words,
the optimal solutions of the original problem
actually form a subset of the optimal solutions
of the relaxation problem,
which is very nice.
That is,
the original problem and the relaxation problem
share part of their global minimizers.
Of course,
the optimal function value is the same.
Now, we also give an example
of how to obtain a global minimizer
of the original problem
by modifying a global minimizer
of the relaxation problem.
Suppose y is a global minimizer of the relaxation problem;
how can we get a global minimizer x
of the original problem?
In each block,
we just pick one of the nonzero elements
and set it to 1.
The rest of the elements we set to 0.
For example,
for this y1 block,
we can have three different choices of x*.
Each such x* is then a global optimal solution
of the original problem.
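The modification step above can be sketched in code. This is an illustrative sketch, not the exact routine from the paper: here we pick the largest entry of each block (by the theorem, any nonzero entry of the block would do), and the function name `round_blockwise` and the flat block layout of the vector are assumptions.

```python
import numpy as np

def round_blockwise(y, n1, n2):
    """Pick one nonzero entry per block of y, set it to 1, zero the rest.

    y is a flat vector of n1 blocks, each of length n2 (an assumed layout).
    Here we choose the largest entry of each block for concreteness.
    """
    x = np.zeros_like(y)
    for i in range(n1):
        block = y[i * n2:(i + 1) * n2]
        j = int(np.argmax(block))   # index of the largest (nonzero) entry
        x[i * n2 + j] = 1.0
    return x
```

For instance, a block (0.5, 0.5, 0) would be rounded to (1, 0, 0), choosing the first of its nonzero entries.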
This result is actually very nice.
So the remaining question is
how to solve the relaxation problem.
Now to solve the relaxation problem,
the idea of our algorithm
is different from other existing methods.
The idea here is that we try to identify the nonzero elements
of the global minimizer of the relaxation problem.
Why?
Because, as the theorem demonstrates,
if we can successfully identify the nonzero elements
of a global minimizer of the relaxation problem,
then we can use the procedure
we just demonstrated to get a global minimizer
of the original problem.
Therefore,
the set of nonzero elements,
which we call the support set,
is very important.
We try to identify the support set of the global minimizers
of the relaxation problem;
the actual magnitudes of the entries
are not that important.
Based on this analysis,
the equality constraints are not that important either.
It may not be necessary to restrict the elements of x
so tightly.
We may give them more freedom
to change in order to identify the support set.
Therefore,
we penalize the equality constraints
into the objective by the quadratic penalty method,
and solve the remaining subproblem,
keeping the nonnegativity constraints.
However,
if we just solve this subproblem,
it is not well-defined.
Because, as we may remember,
f(x) is a cubic objective function,
it can go to minus infinity
as x tends to infinity.
Therefore,
we need to add some bounds
to the feasible set in order to
keep the problem well-defined.
That is why we add an upper bound here
when solving the subproblem.
Actually,
this quadratic penalty function
is exactly the quadratic penalty function
of the original problem:
we can rearrange the original problem
by adding an upper bound M,
with M greater than or equal to 1.
Since the components of x lie between 0 and 1,
this bound does not change the original problem.
Therefore,
the quadratic penalty method applied
to this problem is exactly this one.
In other words,
when we solve the subproblem,
we solve this bounded subproblem instead,
which is well-defined.
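Writing the block-sum equality constraints in an assumed matrix form Ax = e, with σ the penalty parameter, the bounded penalized subproblem can be sketched as follows; the precise form is on the slides:

```latex
\min_{x}\; f(x) + \frac{\sigma}{2}\,\|Ax - e\|^2
\quad \text{s.t.} \quad 0 \le x \le M, \qquad M \ge 1
```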
Now we state the quadratic penalty method.
In the first step,
we choose an initial point
and an initial penalty parameter σ0.
Then we solve the subproblem
to get a global minimizer xk;
notice that here we assume
we can obtain a global minimizer xk.
Then we check the termination rule.
If it is not satisfied,
we increase the penalty parameter
and the iteration goes on.
Otherwise, we stop and project this xk
to the assignment matrix.
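The outer loop of these steps can be sketched as follows. Here `solve_sub` stands in for the assumed subproblem solver, and the multiplicative update factor `rho` and the feasibility-based termination rule are illustrative assumptions, not the exact choices in the lecture:

```python
import numpy as np

def quadratic_penalty(solve_sub, A, b, x0, sigma0=1.0, rho=10.0,
                      tol=1e-6, max_outer=30):
    # solve_sub(x, sigma) is assumed to return a (near-)global minimizer of
    #   f(x) + (sigma/2) * ||A x - b||^2   over the box 0 <= x <= M.
    x, sigma = x0, sigma0
    for _ in range(max_outer):
        x = solve_sub(x, sigma)                 # solve the penalized subproblem
        if np.linalg.norm(A @ x - b) <= tol:    # termination: near-feasibility
            break
        sigma *= rho                            # increase the penalty parameter
    return x
```

After termination, xk would then be projected to an assignment matrix, as in the algorithm above.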
Now you may wonder
what the convergence result is.
For this quadratic penalty method,
we have the traditional result
introduced before.
We assume that each xk is a global minimizer
of the subproblem.
Then, by driving the penalty parameter σk
to positive infinity,
we can prove that any accumulation point of {xk}
is a global minimizer of the relaxation problem.
In other words,
the quadratic penalty method can return
a global minimizer of the relaxation problem.
However,
a question is:
can we return a global minimizer of the original problem?
After all, we are trying to solve the original problem.
As we discussed before,
the quadratic penalty method has the property that
we have to drive σk to infinity
in order to make the limit point a global minimizer.
However,
notice that here we try to identify the support set
of a global minimizer of the relaxation problem.
Therefore,
can we get more interesting results
in terms of the support set?
The answer is yes.
We can actually prove that,
under reasonable assumptions,
when the quadratic penalty parameter σk
is large enough,
we can successfully identify
the support set of a global minimizer of the original problem.
This is what Theorems 3 and 4
are about.
Basically,
the two theorems tell us that
the quadratic penalty method
enjoys an exact recovery result
in terms of identifying the support set
of the global minimizer of the original problem.
This is a very interesting result.
Let's take a look at how we solve the subproblem.
The subproblem is a nonlinear problem
with box constraints:
x between 0 and M.
How do we solve it?
We use an active-set-based method.
The algorithm was originally proposed
by Bertsekas in 1982;
it is a classical method, and we modify it
into a projected gradient method.
This method is also well suited
to identifying the support set,
which is why we choose it
to solve the subproblem.
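A minimal sketch of the projected gradient idea for the box-constrained subproblem, assuming a fixed step size and iteration count for simplicity; the method in the lecture additionally uses active-set techniques and is not this bare loop:

```python
import numpy as np

def projected_gradient(grad, x0, lo, hi, step=0.1, iters=500):
    """Sketch of projected gradient for min g(x) s.t. lo <= x <= hi.

    grad is the gradient of the objective g; each iteration takes a
    gradient step and projects back onto the box via clipping.
    """
    x = x0.copy()
    for _ in range(iters):
        x = np.clip(x - step * grad(x), lo, hi)  # step, then project onto box
    return x
```

The projection onto a box is just componentwise clipping, which is one reason box constraints are cheap to handle here.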
Now finally,
let's present some numerical results.
Here we compare our methods QPPG and QPPG2
with four existing methods,
using three measurements.
The first one is ‘Accuracy’,
which measures
how many correct matchings we obtain
over all the matching pairs.
The ‘Matching Score’ is
the matching objective function value we achieve.
The third one is ‘Running Time’;
of course,
the shorter the running time,
the faster the algorithm.
For ‘Accuracy’ and ‘Matching Score’,
the larger they are, the better the algorithm.
Now let's first take a look at the first test,
which demonstrates whether we can successfully identify
the support set of the original problem.
This is one specific example
where n1 and n2 are both 30.
At the first iteration,
we cannot identify the nonzero elements,
because all of the components
are nonzero.
After 21 iterations,
we can see that there are some elements
which are significantly larger than zero.
Finally,
when the algorithm terminates,
we can see that we successfully identify
the correct support set of the original problem;
the circles here mark the nonzero elements.
Now the second example demonstrates
the comparison of our method with the other four methods.
The red line is our method.
As you can see, we achieve accuracy
comparable to the RRWHM method,
and we are better than the other three methods.
For the matching score,
our method also performs very well.
For the CPU time,
its performance is in the middle.
This figure shows
how we match the two houses.
In the third test,
we run similar experiments
on larger problems,
and the conclusions are basically the same.
This figure demonstrates
how we match a fish data set;
we successfully get the matching score to be one.
Now let's make a brief summary of this part.
In this part,
we discussed how to apply the quadratic penalty method
to solve the hypergraph matching problem,
and we demonstrated interesting results
showing the efficiency of the method.
If you are interested in this topic,
you can check the references here
for more details.
We stop here, and see you next time.