9.5 Quadratic Penalty Method for Hypergraph Matching: Ⅱ

Hi. Welcome to Modern Optimization Methods.

Today we will introduce the fifth part

of chapter nine,

the hypergraph matching problem,

and how we apply the quadratic penalty method to solve it.

Recall that in the previous part,

we have introduced the hypergraph matching problem,

as well as its mathematical formulation.

Now, recall that the mathematical formulation

for hypergraph matching takes the following form.

We minimize a multilinear function f

subject to some equality constraints

as well as 0-1 constraints.

Notice that the 0-1 constraints

actually mean that

this problem is a discrete optimization problem.

Or more specifically,

it is a 0-1 integer programming problem,

which is in general NP-hard to solve.

Therefore, in this part,

we will mainly discuss

how we can design some algorithms

to solve this problem.

Now recall that these equality constraints

actually mean that for each block

of the vector x,

the sum of its components is 1.

Then we have the constraint that

each component of x is either 0 or 1.

In other words,

in each block

only one component can be 1,

and the other components must be 0.
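To fix notation (this is just my own way of writing down the formulation described above; the block indexing is an assumption about how x is arranged), the problem can be stated as

\[
\min_{x}\; f(x) \quad \text{s.t.} \quad \sum_{j=1}^{n_2} x_{(i-1)n_2 + j} = 1,\;\; i = 1,\dots,n_1, \qquad x \in \{0,1\}^{n_1 n_2},
\]

where f is the multilinear matching objective and the i-th block of x contains the n_2 variables that assign node i of the first hypergraph to the nodes of the second one.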

So from this point of view,

we can reformulate this problem

as a sparse constrained optimization problem.

We write it in the following form.

So the objective function is the same as before,

and the equality constraints are the same as before.

However,

for the 0-1 constraint,

we rewrite it as a lower bound constraint,

as well as a sparse constraint.

The zero norm here actually means

the number of nonzero elements in x.

In other words,

this constraint basically says

the number of nonzero elements in the vector x

should not be greater than n1.

So recall that we also have the constraint

that the sum of each block is 1.

Therefore,

combining these two conditions,

we can actually see that for each block,

there can only be one nonzero element,

which is 1.

And the rest of the components

in this block

will always be 0.

So this is a sparse constrained formulation.
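In compact form (writing the block-sum constraints as Ax = e, which is shorthand I introduce here), the sparse constrained reformulation reads

\[
\min_{x}\; f(x) \quad \text{s.t.} \quad Ax = e,\quad x \ge 0,\quad \|x\|_0 \le n_1,
\]

where e is the all-ones vector and \(\|x\|_0\) counts the nonzero entries of x.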

And from this point of view,

we can have different ways

to solve this sparse constrained optimization problem.

For example,

hard-thresholding-based algorithms,

as well as l0-penalty-based algorithms.

We can also use lp regularization algorithms

to solve it approximately.

So these are different ways or different algorithms

based on the zero norm constraint.

Now what we are going to do

is to relax this problem

based on the zero norm constraint:

we just drop the zero norm constraint.

We get the following relaxation problem

where we keep the equality constraints

and the lower bound constraints.

And this

relaxation problem is actually the same

as the following one,

where we add extra upper bound constraints:

x is less than or equal to 1.

The reason is that by the equality constraints

and the nonnegativity constraints,

we can see that every component of x

cannot be greater than 1.

So the two relaxations,

they are the same.
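In the notation above, the two equivalent relaxations can therefore be written as

\[
\min_{x}\; f(x) \;\; \text{s.t.} \;\; Ax = e,\; x \ge 0
\qquad\Longleftrightarrow\qquad
\min_{x}\; f(x) \;\; \text{s.t.} \;\; Ax = e,\; 0 \le x \le 1.
\]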

As we mentioned in the previous part,

the traditional relaxation is this one.

However,

there has been no theoretical research

or theoretical investigation of the relationship

between the relaxation problem

and the original problem.

Therefore,

the motivation of our method is that

we first study the relationship

between the relaxation problem

and the original problem.

Surprisingly,

we can prove the following results.

The relaxation problem actually admits

at least one global minimizer

of the original problem.

So the mathematical statement is like this:

there exists a global minimizer x*

of the relaxation problem,

such that its zero norm is exactly n1.

Furthermore,

in this case, x* is a global minimizer

of the original problem.

In other words,

the optimal solution set of the original problem

is actually a subset of the optimal solution set

of the relaxation problem,

which is very nice.

In other words,

the original problem and the relaxation problem,

they share part of their global minimizers.

Of course,

the optimal function value is the same.
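Stated compactly (my own wording of the result just described): there exists a global minimizer \(x^*\) of the relaxation problem with \(\|x^*\|_0 = n_1\), and any such \(x^*\) is also a global minimizer of the original 0-1 problem; consequently, the two problems have the same optimal value.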

Now, we also give an example of

how we can obtain a global minimizer

of the original problem

by modifying a global minimizer

of the relaxation problem.

So here is an example:

if we have a vector y,

which is a global minimizer of the relaxation problem,

how can we get a global minimizer x

of the original problem?

Then, in each block, we just pick one of the nonzero elements

and set it to 1.

The remaining elements we set to 0.

For example,

for this y1 block,

we can have three different cases of x*.

Then x* is one of the global optimal solutions

of the original problem.

The result is actually very nice.
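As a rough illustration (a minimal sketch, not the lecture's code; the block layout of y and the helper name are my own), the rounding from a relaxation minimizer y to a 0-1 point x could look like this:

    import numpy as np

    def round_to_assignment(y, n1, n2):
        """Given y (length n1*n2, one block of length n2 per node), build a
        0-1 vector x by picking one nonzero entry per block and setting it to 1."""
        x = np.zeros(n1 * n2)
        for i in range(n1):
            block = y[i * n2:(i + 1) * n2]
            j = int(np.argmax(block))   # pick one nonzero entry (here: the largest)
            x[i * n2 + j] = 1.0         # set it to 1; the rest of the block stays 0
        return x

When a block of y has several nonzero entries, any of them may be chosen, as in the lecture's example with three different valid choices of x*; taking the largest entry is just one convenient rule.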

So the remaining problem is

how we can solve the relaxation problem.

Now to solve the relaxation problem,

the idea of our algorithm

is different from other existing methods.

The idea here is that we try to identify the nonzero elements

of the global minimizer of the relaxation problem.

Why?

Because, as we demonstrated in the theorem,

if we can successfully identify the nonzero elements

of the global minimizer of the relaxation problem,

then we can use the procedure

we just demonstrated to get a global minimizer

of the original problem.

Therefore,

the nonzero elements,

which form what we call the support set,

are very important.

We try to identify the support set of the global minimizers

of the relaxation problem.

So the scale of the global minimizer

is not that important.

We try to identify the support set.

So based on this analysis,

the equality constraint is not that important.

It may not be necessary to restrict the elements of x

to lie between 0 and 1.

We may give them more freedom

to vary in order to identify the support set.

Therefore,

we penalize the equality constraints

into the objective by the quadratic penalty method.

We solve the remaining subproblem here.

So for the nonnegativity part,

we keep it as a constraint.

However,

if we just solve this subproblem,

it is not well-defined.

Because, as we may remember,

this f(x) is a cubic (degree-three) objective function.

It can go to minus infinity

as x tends to infinity.

Therefore,

we need to add some bounds

to the feasible set in order to

keep this problem well-defined.

That's why we add an upper bound here

to solve the subproblem.

Actually,

this quadratic penalty term,

or this quadratic penalty function,

is exactly the quadratic penalty function

for the original problem.

We can just rearrange our original problem

by adding an upper bound here.

This M is greater than or equal to 1.

Therefore, together with the equality constraints,

the constraint 0 ≤ x ≤ M is equivalent to x between 0 and 1.

Therefore,

the quadratic penalty subproblem

for this problem is exactly this one.

In other words,

when we solve the subproblem,

we solve this subproblem instead.

This is well-defined.
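Written out (in the notation introduced earlier; the 1/2 factor in front of the penalty is just my convention), the subproblem solved at each iteration is

\[
\min_{x}\; f(x) + \frac{\sigma}{2}\,\|Ax - e\|^2 \quad \text{s.t.} \quad 0 \le x \le M, \qquad M \ge 1,
\]

where \(\sigma > 0\) is the penalty parameter.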

Now we give the quadratic penalty method.

In the first step,

we choose an initial point

and an initial quadratic penalty parameter σ0.

Then we solve the subproblem

to get a global minimizer xk,

notice here that we assume

we can get a global minimizer xk.

Then we check the termination rule.

If the termination rule is not satisfied,

then we increase the penalty parameter

and the iteration goes on.

Otherwise we stop and project this xk

onto the set of assignment matrices.
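A minimal sketch of this outer loop, assuming a user-supplied routine solve_subproblem(x, sigma) that returns a (global) minimizer of the penalized subproblem, and the round_to_assignment helper sketched earlier:

    import numpy as np

    def constraint_violation(x, n1, n2):
        """Norm of the equality-constraint residual: block sums minus 1."""
        return np.linalg.norm(x.reshape(n1, n2).sum(axis=1) - 1.0)

    def quadratic_penalty_method(x0, n1, n2, solve_subproblem,
                                 sigma0=1.0, rho=10.0, tol=1e-6, max_iter=50):
        """Outer quadratic penalty loop (sketch): solve the subproblem,
        check the termination rule, and increase sigma if needed."""
        x, sigma = x0, sigma0
        for _ in range(max_iter):
            x = solve_subproblem(x, sigma)              # assumed to return a global minimizer
            if constraint_violation(x, n1, n2) <= tol:  # termination rule
                break
            sigma *= rho                                # increase the penalty parameter
        return round_to_assignment(x, n1, n2)           # final rounding/projection step

The factor rho and the tolerance tol are illustrative choices; the lecture only specifies that the penalty parameter is increased when the termination rule is not met.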

Now you may wonder

what is the convergence result?

So for this quadratic penalty method,

we have the traditional result

as we introduced before.

For each xk,

we assume that this xk is a global minimizer

of the subproblem.

By driving the penalty parameter σk

to positive infinity,

we can prove that any accumulation point of {xk}

is a global minimizer of the relaxation problem.

In other words,

the quadratic penalty method can return us

a global minimizer of the relaxation problem.

However,

a question is,

can we return a global minimizer of the original problem?

Because we are ultimately trying to solve the original problem.

As we discussed before,

the quadratic penalty method has the property that

we have to drive σk to infinity

in order to make the limit point a global minimizer.

However,

notice here that we try to identify the support set

of the global minimizer of the relaxation problem.

Therefore,

can we get more interesting results

in terms of support set?

The answer is yes.

We can actually prove that

under reasonable assumptions,

when the quadratic penalty parameter σk

is large enough,

then we can successfully identify

the support set of a global minimizer of the original problem.

This is what Theorem 3

and Theorem 4 are about.

Basically,

the two theorems tell us that

the quadratic penalty method

enjoys an exact recovery property

in terms of identifying the support set

of the global minimizer of the original problem.

This is a very interesting result.

Let's take a look at how we solve the subproblem.

The subproblem is like this.

We try to solve a nonlinear problem

with box constraints here,

x is greater than or equal to 0 and less than or equal to M.

So how do we solve it?

We use an active-set-based method.

The algorithm was originally proposed

by Bertsekas in 1982,

which is a classical method, and we modify it

into a projected gradient method.

So this method is actually specially designed

to identify the support set as well.

So that's why we choose this method

to solve the subproblem.
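The lecture's solver is a modification of Bertsekas's 1982 active-set method; as a rough sketch of the underlying projected gradient idea only (with a fixed step size, not the actual algorithm used), one could write:

    import numpy as np

    def projected_gradient(x0, grad_phi, M, step=1e-2, tol=1e-6, max_iter=1000):
        """Projected gradient method for  min phi(x)  s.t.  0 <= x <= M.
        Each iteration takes a gradient step and projects back onto the box."""
        x = x0.copy()
        for _ in range(max_iter):
            x_new = np.clip(x - step * grad_phi(x), 0.0, M)  # gradient step + box projection
            if np.linalg.norm(x_new - x) <= tol:             # stop when the iterates stagnate
                return x_new
            x = x_new
        return x

Here grad_phi would be the gradient of the penalized objective f(x) + (σ/2)||Ax - e||²; the active-set flavor of the actual method comes from tracking which bound constraints are (nearly) active, which this simplified sketch omits.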

Now finally,

let's present some numerical results.

So here we compare our methods QPPG and QPPG2

with four existing methods,

and we use three measurements.

The first one is ‘Accuracy’,

which means

how many correct matches we can get

over all the matching pairs.

The ‘Matching Score’ actually means

the largest matching objective value we can achieve.

And the third one is 'Running Time',

of course,

the shorter the running time is,

the faster the algorithm is.

For the first two, ‘Accuracy’ and ‘Matching Score’,

the larger they are, the better the algorithm is.
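For concreteness (a small sketch with assumed inputs, not the authors' evaluation code), ‘Accuracy’ can be computed as the fraction of correctly matched pairs:

    def accuracy(predicted, ground_truth):
        """Fraction of nodes whose predicted match equals the ground-truth match.
        Both arguments map each node of the first hypergraph to a node of the second."""
        correct = sum(1 for i, j in ground_truth.items() if predicted.get(i) == j)
        return correct / len(ground_truth)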

Now let's first take a look at the first test.

We demonstrate whether we can successfully identify

the support set of the original problem.

This is for one specific example

where n1 and n2 are both 30.

At the first iteration,

we cannot identify the nonzero elements.

This is because all of the components

are nonzero.

And after 21 iterations,

we can see that there are some elements

which are significantly larger than zero.

Finally,

when the algorithm terminates,

we can see that we can successfully identify

the correct support set of the original problem.

So the circles here mark the true nonzero elements.

Now the second example basically demonstrates

the comparison of our method with the other four methods.

The red line is our method.

As you can see, we can get accuracy comparable

to the RRWHM method.

And we are better than the other three methods.

For the matching score,

our method also performs very well.

And as for the CPU time,

our method shows medium performance.

This figure basically shows

how we match the two houses.

Now in the third experiment,

for problems of larger size,

we also do similar tests.

And conclusions are basically the same.

This figure demonstrates

how we match a fish data set.

We can successfully get the matching score to be one.

Now let's make a brief summary about this part.

In this part,

we actually discussed how we apply the quadratic penalty method

to solve the hypergraph matching problem.

We also demonstrated interesting results

to show the efficiency of the method.

If you're interested in this talk,

you can check the reference here

to get more details.

We stop here and see you next time.
