Current topic: Learn Statistics with Ease > Chapter 3 Descriptive Statistics: Numerical Methods > 3.2 Mean, Median, Mode > 3.2.1 Frequency distribution: the initial appearance of the overall distribution characteristics

3.2.1 Frequency distribution: the initial appearance of the overall distribution characteristics (lesson plan, key points and subtitles)

Statistical Series

A series formed by grouping on a qualitative characteristic is called a quality series.

A series formed by grouping on a numerical characteristic is called a variable series.

Of course, we can treat both kinds as variable series, because the result of a quality series can also be quantified as a variable.

A variable series has two basic elements: the variable and the number of occurrences, in other words the variable and the frequency.

When expressed in relative terms, the two elements are the variable and the ratio, that is, the variable and the relative frequency.

Depending on the form the variable values take, statistical grouping yields two kinds of variable series.

One is the single-valued series, in which a single numerical value represents each group.

The other is the class interval series, in which an interval represents each group.

For instance, ages can be listed as 16, 17, 18 and 19, with each value representing one group; that is a single-valued series.

Or we can use age brackets: 20 and under, 21 to 30, and 31 to 40.

These brackets form a class interval series.
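
The two kinds of series can be sketched in a few lines of Python. This is only an illustration: the ages are invented, the function names are hypothetical, and the first bracket is read as including age 20.

```python
from collections import Counter

# Hypothetical ages, invented for illustration only.
ages = [16, 17, 17, 18, 18, 18, 19, 25, 28, 33, 36, 40]

BRACKETS = ["20 and under", "21 to 30", "31 to 40"]


def single_valued_series(values):
    """Return {value: (frequency, relative frequency)}, sorted by value."""
    counts = Counter(values)
    n = len(values)
    return {v: (c, c / n) for v, c in sorted(counts.items())}


def bracket_of(age):
    """Map an age to one of the three brackets named in the lecture."""
    if age <= 20:
        return BRACKETS[0]
    if age <= 30:
        return BRACKETS[1]
    if age <= 40:
        return BRACKETS[2]
    raise ValueError(f"age {age} falls outside the brackets")


def class_interval_series(values):
    """Return {bracket: (frequency, relative frequency)} in bracket order."""
    counts = Counter(bracket_of(v) for v in values)
    n = len(values)
    return {b: (counts[b], counts[b] / n) for b in BRACKETS}
```

Both functions return the two elements of a variable series side by side: the variable (a value or a bracket) and its frequency, plus the relative frequency.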

Both kinds of series have advantages and drawbacks, and the choice depends on the analysis.

From the point of view of tabulation, when the units in the population differ widely, a table built from a single-valued series becomes as long as a hada (a long ceremonial scarf) and is unwieldy to read.

A table built from a class interval series is more compact.

However, results calculated from a class interval series are less accurate than results calculated from a single-valued series.

For example, a mean calculated from a class interval series is likely to be only an approximation.

These are the two forms of variable series.
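
A small sketch can show why the grouped mean is only approximate. It assumes invented scores, hypothetical class limits of 60-70, 70-80, 80-90 and 90-100, and the usual convention that each class is represented by its midpoint.

```python
# Hypothetical scores, invented for illustration only.
scores = [62, 65, 68, 71, 74, 77, 83, 85, 88, 91]

# Hypothetical closed classes as (lower limit, upper limit).
classes = [(60, 70), (70, 80), (80, 90), (90, 100)]


def exact_mean(values):
    """Mean computed from the individual values."""
    return sum(values) / len(values)


def grouped_mean(values, class_limits):
    """Mean computed from class midpoints weighted by class frequencies."""
    total = 0.0
    for lower, upper in class_limits:
        # Count values in [lower, upper); a value equal to the upper limit
        # is counted in the next class.
        frequency = sum(1 for v in values if lower <= v < upper)
        midpoint = (lower + upper) / 2
        total += midpoint * frequency
    return total / len(values)
```

With these numbers the exact mean is 76.4 and the grouped mean is 77.0: close, but not identical, because every score in a class is treated as if it sat at the midpoint.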

The most important and most widely used variable series is the class interval series.

Several components of a class interval series need attention: the class limits, the class interval and the number of classes.

Class limits and class interval are related: the class interval equals the upper limit minus the lower limit.

The upper limit is the maximum value in the class, and the lower limit is the minimum value.

The maximum value of the variable minus its minimum value is the range of the variable.

Each class has two values, the upper limit and the lower limit.

So which one represents the class?

We use the class midpoint to represent the average level of the class.

How do we calculate the class midpoint?

Add the upper limit and the lower limit and divide by two.

Here is an example.

The exam scores on a report card are a continuous variable with many distinct values, so interval grouping should be used.

For interval grouping we need to determine the range, the number of classes and the class interval.

Now let us group the overall scores on this report card.
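
Determining these three quantities could look like the sketch below. The scores are hypothetical, and deriving the class interval as the range divided by the number of classes, rounded up, is a common convention assumed here rather than something stated in the lecture.

```python
import math

# Hypothetical overall scores, invented for illustration only.
scores = [52, 58, 63, 67, 71, 74, 78, 82, 85, 89, 93, 97]


def plan_classes(values, n_classes):
    """Return (range, suggested class interval) for a chosen number of classes."""
    value_range = max(values) - min(values)
    width = math.ceil(value_range / n_classes)
    return value_range, width


def class_limits(start, width, n_classes):
    """Return (lower limit, upper limit) pairs for equal-width classes."""
    return [(start + i * width, start + (i + 1) * width) for i in range(n_classes)]
```

Here `plan_classes(scores, 5)` gives a range of 45 and a suggested interval of 9. In practice the interval is usually rounded to a convenient value, so `class_limits(50, 10, 5)` would produce the classes 50-60 through 90-100.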

Some classes are open-ended.

For example, the class "90 and above" is open at the top, and the class "below 60" is open at the bottom.

An open-ended class has only an upper limit or only a lower limit.

What do we do in that case?

For a class that is open at the bottom and has only an upper limit, subtract half the class interval of the adjacent class from the upper limit.

For a class that is open at the top and has only a lower limit, add half the class interval of the adjacent class to the lower limit.

Let us look at an example.

One point deserves special attention.

If the maximum and minimum values differ greatly from the rest of the data, a class interval series may contain empty classes, or individual extreme values may be left out.

To avoid this, the first and last classes can be set as open-ended classes of the form "below a certain value" or "above a certain value".

For the example above, the grouping with open-ended classes can be presented as follows.

As we can see, the classes "below 60" and "90 and above" are both open-ended.

Since they are open-ended, we use the formula given earlier for the midpoint of an open-ended class: the midpoint equals the lower limit plus half the interval of the adjacent class, or the upper limit minus half the interval of the adjacent class.

The midpoint of the class below 60 is 60-(70-60)/2=55.

The midpoint of the class 90 and above equals its lower limit plus half the interval of the adjacent class, that is, 90+(90-80)/2=95.

The midpoint of a closed class equals half the sum of its upper and lower limits.

Take the class 60 to 70: its midpoint is (60+70)/2=65.
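
The three midpoint rules can be collected into one small function. This is a sketch only: the function name and its parameters are hypothetical, chosen for this illustration.

```python
def class_midpoint(lower=None, upper=None, neighbor_width=None):
    """Midpoint of a closed class, or of an open-ended class.

    Closed class: both limits are given.
    Open at the bottom: only `upper` is given, plus the adjacent class interval.
    Open at the top: only `lower` is given, plus the adjacent class interval.
    """
    if lower is not None and upper is not None:
        return (lower + upper) / 2
    if neighbor_width is None:
        raise ValueError("an open-ended class needs the adjacent class interval")
    if upper is not None:
        return upper - neighbor_width / 2
    if lower is not None:
        return lower + neighbor_width / 2
    raise ValueError("at least one class limit is required")
```

Called with the lecture's numbers, `class_midpoint(upper=60, neighbor_width=70 - 60)` gives 55, `class_midpoint(lower=90, neighbor_width=90 - 80)` gives 95, and `class_midpoint(60, 70)` gives 65.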

In fact, the midpoint formula for a closed class can be transformed further.

The midpoint equals half the sum of the upper limit and the lower limit.

Since the lower limit equals the upper limit minus the class interval, this expands to half the upper limit plus half of the upper limit minus the class interval.

Simplifying gives the upper limit minus half the class interval.
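
Written out in symbols, the transformation reads as follows. The symbols are notation assumed for this sketch and do not appear in the lecture: U is the upper limit, L the lower limit, d the class interval and m the midpoint.

```latex
\[
m = \frac{U + L}{2}
  = \frac{U + (U - d)}{2}
  = \frac{U}{2} + \frac{U - d}{2}
  = U - \frac{d}{2},
\qquad \text{where } d = U - L .
\]
```

For the class 60 to 70 this gives 70 - 10/2 = 65, the same value as (60+70)/2.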

Now let us compare this transformed midpoint formula for a closed class with the midpoint formula for an open-ended class that we learned earlier.

One is the upper limit minus half the class interval; the other is the upper limit minus half the interval of the adjacent class.

From this we can see that for a closed class the interval used is the class's own interval, while for an open-ended class the interval used is that of the adjacent class.

Whether the classes are open-ended or closed, the variable in each class of a class interval series is continuous.

When compiling a class interval series for a continuous variable, we should also note that adjacent class limits overlap: the upper limit of one class and the lower limit of the next are the same value.

You may then ask: if a student's score is exactly 70, which class does it belong to?

One rule settles this question: the upper limit is excluded.

It means that units whose value equals the upper limit of a class are not counted in that class but in the next one.
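
The upper-limit-excluded rule maps naturally onto a right-bisection lookup. The sketch below assumes the score classes used in this lesson (below 60, 60-70, 70-80, 80-90, 90 and above); the function name is hypothetical.

```python
from bisect import bisect_right

# Shared class limits: each value is the upper limit of one class
# and the lower limit of the next.
LIMITS = [60, 70, 80, 90]
LABELS = ["below 60", "60-70", "70-80", "80-90", "90 and above"]


def assign_class(score):
    """Return the class label for a score, excluding the upper limit.

    bisect_right counts how many limits are <= score, so a score equal
    to a shared limit lands in the class that has it as its lower limit.
    """
    return LABELS[bisect_right(LIMITS, score)]
```

A score of exactly 70 is assigned to 70-80 rather than 60-70, and a score of 69.5 stays in 60-70.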

Grouping the statistics exam scores of Er Niu's class in this way, we can see that every class in the resulting class interval series has the same interval.

In statistics we call this an equal-interval series.

When the class intervals are not all equal, the resulting series is called an unequal-interval series.

In an unequal-interval series, the indicator that reflects how densely the frequencies are distributed across classes is the frequency density.

Frequency density is the ratio of a class's frequency to its class interval.

Let us look at the table below.

At first glance, the class from 100 to 120 has the most units, so the population seems to be most concentrated in this class.

But is that really the case?

Note that this class interval series is not an equal-interval series; the class intervals are not all equal.

Generally, the wider the class interval, the more units the class contains, just as a larger room can hold more students than a smaller one.

But we cannot conclude from this that the students in the larger room are more densely packed.

What we should look at is the number of students per unit of area.

In the same way, in the data here some class intervals are 20 and some are 10.

To see how the population is distributed across the classes, we need the frequency density.

For example, for the first class it is 3/20, which equals 0.15.

Calculate the frequency density of each class in turn and compare the results.

The class from 90 to 100 has the largest frequency density, 0.7.

This shows that units are most densely concentrated in this class, not in the class from 100 to 120.
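
The comparison can be sketched as follows. The table from the video is not reproduced in the transcript, so the rows below are stand-ins: only the densities 0.15 and 0.7 and the fact that 100 to 120 holds the most units come from the lecture. The first class is taken to have 3 units over an interval of 20, which is what a density of 0.15 implies; its limits and the frequency of the 100 to 120 class are hypothetical.

```python
# Each row is (lower limit, upper limit, frequency).
table = [
    (70, 90, 3),     # hypothetical limits; density 3/20 = 0.15
    (90, 100, 7),    # density 7/10 = 0.7
    (100, 120, 12),  # hypothetical frequency; density 12/20 = 0.6
]


def frequency_density(lower, upper, frequency):
    """Frequency of a class divided by its class interval."""
    return frequency / (upper - lower)


def most_frequent_class(rows):
    """Row with the largest raw frequency."""
    return max(rows, key=lambda row: row[2])


def densest_class(rows):
    """Row with the largest frequency density."""
    return max(rows, key=lambda row: frequency_density(*row))
```

With these rows the class 100 to 120 has the largest frequency, but the class 90 to 100 has the largest frequency density, which is the distinction the lecture draws.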

In general, equal-interval grouping suits phenomena whose values vary fairly evenly, while unequal-interval grouping suits phenomena whose values vary unevenly, with sharp rises or falls and large fluctuations.
