Segmented Least Squares慕课视频播放-算法设计与分析-MOOC慕课视频教程-柠檬大学

下面我们来学习

分段最小二乘问题

最小二乘问题是统计和数值分析中

最基本的问题

在平面上给定n个点

(x_1, y_1), (x_2, y_2) , . . . , (x_n, y_n)

求一条直线y=ax+b

对这些点进行拟合

我们希望这些点尽量去接近这条直线

最常用的准则是使得误差的平方和最小

我们把误差记为SSE

定义为每个点的实际值y_i与

估计值ax_i+b的差值的平方和

通过基本的微积分运算

求这个函数的驻点

我们可以解出a和b

它们使得误差是最小的

在很多实际应用中

数据并不是落于一条直线的附近

而可能位于一系列直线的周围

比如下图中的例子

用3条直线进行拟合

直觉上更准确地描述了数据的特征

还是给定平面中的n个点

(x_1, y_1), (x_2, y_2) , . . . , (x_n, y_n)

并且x_1

我们希望找到一系列的直线

能够最小化某个目标f(x)

最小二乘问题中用一条直线进行拟合

所得到的结果可能不够准确

但是用过多的直线

又不能够凝练的给出数据的特征

如何确定优化的目标f(x)

能够平衡准确性和精炼性呢

在优化目标里

要考虑每一条直线

所对应的点所产生的误差E

以及所使用的直线的数目L

可以考虑一个组合的目标E+cL

这里c>0 是一个常数

下面我们用动态规划的方法来进行求解

我们定义OPT(j)是考虑点p_1, p_2 , . . ., p_j时

对应的最小费用

定义e(i, j)是对p_i, p_i+1 , . . . , p_j这些点

用一条直线拟合所产生的最小二乘费用

要计算OPT(j)

关键在于

我们把最后一条直线所对应的点分离出来

我们令这些点为p_i, p_i+1 , . . . , p_j

那么问题的目标值就等于最后一条直线

所对应的最小二乘费用e(i, j)

加上这条直线对应的费用c

以及处理之前的i-1点对应的目标值

我们就可以写出动态规划基本方程

边界条件是当j=0的时候

OPT(j)=0

在刚才的讨论中

我们没有确定

到底最后一条直线包含哪些点

我们可以枚举所有的可能

最优解一定是其中最好的结果

因此 i从1到j

OPT(j)=e(i, j)+c+OPT(i-1)的最小值

我们来看一下算法的伪代码

首先对于所有的i和j

计算e(i, j)

令M[0]=0

然后按照动态规划基本方程计算每个M[j]

计算e(i, j)

共有n2个

每个需要O(n)时间

总计O(n^3)

每个M[j]的计算

需要对i从1到j

计算e(i, j)+c+OPT(i-1)

这些值在之前的运算已经得到

需要O(1)的时间

因此每个M[j]需要O(n)的计算时间

总计O(n^2)

通过以上的讨论

我们可知

算法的运行时间是O(n^3)

算法设计与分析课程列表：

1 Introduction of Algorithm

-1.1 Introduction

--Introduction

-1.2 A First Problem: Stable Matching

--A First Problem: Stable Matching

-1.3 Gale-Shapley Algorithm

--Gale-Shapley Algorithm

-1.4 Understanding Gale-Shapley Algorithm

--Understanding Gale-Shapley Algorithm

-Homework1

-Lecture note 1

--Lecture note 1 Introduction of Algorithm

2 Basics of Algorithm Analysis

-2.1 Computational Tractability

--Computational Tractability

-2.2 Asymptotic Order of Growth

--Asymptotic Order of Growth

-2.3 A Survey of Common Running Times

--A Survey of Common Running Times

-Homework2

-Lecture note 2

--Lecture note 2 Basics of Algorithm Analysis

3 Graph

-3.1 Basic Definitions and Applications

--Basic Definitions and Applications

-3.2 Graph Traversal

--Graph Traversal

-3.3 Testing Bipartiteness

--Testing Bipartiteness

-3.4 Connectivity in Directed Graphs

--Connectivity in Directed Graphs

-3.5 DAG and Topological Ordering

--DAG and Topological Ordering

-Homework3

-Lecture note 3

--Lecture note 3 Graph

4 Greedy Algorithms

-4.1 Coin Changing

--Coin Changing

-4.2 Interval Scheduling

--Interval Scheduling

-4.3 Interval Partitioning

--Interval Partitioning

-4.4 Scheduling to Minimize Lateness

--Scheduling to Minimize Lateness

-4.5 Optimal Caching

--Optimal Caching

-4.6 Shortest Paths in a Graph

--Shortest Paths in a Graph

-4.7 Minimum Spanning Tree

--Minimum Spanning Tree

-4.8 Correctness of Algorithms

--Correctness of Algorithms

-4.9 Clustering

--Clustering

-Homework4

-Lecture note 4

--Lecture note 4 Greedy Algorithms

5 Divide and Conquer

-5.1 Mergesort

--Mergesort

-5.2 Counting Inversions

--Counting Inversions

-5.3 Closest Pair of Points

--Closest Pair of Points

-5.4 Integer Multiplication

--Integer Multiplication

-5.5 Matrix Multiplication

--Video

-5.6 Convolution and FFT

--Convolution and FFT

-5.7 FFT

--FFT

-5.8 Inverse DFT

--Inverse DFT

-Homework5

-Lecture note 5

--Lecture note 5 Divide and Conquer

6 Dynamic Programming

-6.1 Weighted Interval Scheduling

--Weighted Interval Scheduling

-6.2 Segmented Least Squares

--Segmented Least Squares

-6.3 Knapsack Problem

--Knapsack Problem

-6.4 RNA Secondary Structure

--RNA Secondary Structure

-6.5 Sequence Alignment

--Sequence Alignment

-6.6 Shortest Paths

--Shortest Paths

-Homework6

-Lecture note 6

--Lecture note 6 Dynamic Programming

7 Network Flow

-7.1 Flows and Cuts

--Flows and Cuts

-7.2 Minimum Cut and Maximum Flow

--Minimum Cut and Maximum Flow

-7.3 Ford-Fulkerson Algorithm

--Ford-Fulkerson Algorithm

-7.4 Choosing Good Augmenting Paths

--Choosing Good Augmenting Paths

-7.5 Bipartite Matching

--Bipartite Matching

-Homework7

-Lecture note 7

--Lecture note 7 Network Flow

8 NP and Computational Intractability

-8.1 Polynomial-Time Reductions

--Polynomial-Time Reductions

-8.2 Basic Reduction Strategies I

--Basic Reduction Strategies I

-8.3 Basic Reduction Strategies II

--Basic Reduction Strategies II

-8.4 Definition of NP

--Definition of NP

-8.5 Problems in NP

--Problems in NP

-8.6 NP-Completeness

--NP-Completeness

-8.7 Sequencing Problems

--Sequencing Problems

-8.8 Numerical Problems

--Numerical Problems

-8.9 co-NP and the Asymmetry of NP

--co-NP and the Asymmetry of NP

-Homework8

-Lecture note 8

--Lecture note 8 NP and Computational Intractability

9 Approximation Algorithms

-9.1 Load Balancing

--Load Balancing

-9.2 Center Selection

--Center Selection

-9.3 The Pricing Method: Vertex Cover

--The Pricing Method: Vertex Cover

-9.4 LP Rounding: Vertex Cover

--LP Rounding: Vertex Cover

-9.5 Knapsack Problem

--Knapsack Problem

-Homework9

-Lecture note 9

--Lecture note 9 Approximation Algorithms

10 Local Search

-10.1 Landscape of an Optimization Problem

--Landscape of an Optimization Problem

-10.2 Maximum Cut

--Maximum Cut

-10.3 Nash Equilibria

--Nash Equilibria

-10.4 Price of Stability

--Price of Stability

-Homework10

-Lecture note 10

--Lecture note 10 Local Search

11 Randomized Algorithms

-11.1 Contention Resolution

--Contention Resolution

-11.2 Linearity of Expectation

--Linearity of Expectation

-11.3 MAX 3-SAT

--MAX 3-SAT

-11.4 Chernoff Bounds

--Chernoff Bounds

-Homework11

-Lecture note 11

--Lecture note 11 Randomized Algorithms

Exam

-Exam

Segmented Least Squares在线视频