Learn Statistics with Ease > Chapter 3 Descriptive Statistics: Numerical Methods > 3.7 Standard Deviation > 3.7.1 Position average: a robust expression of central tendency 2 位置平均数:集中趋势之稳健表达2
这个就属于位置平均数
This is a positional average
大家注意了
We should keep in mind
使用这些平均数的时候
when we use these kinds of averages
一定要遵循一些基本原则
there are certain principles to follow
第一个原则就是
The first principle is
一 同质性
homogeneity
平均数的对象那个总体
The population being averaged
单位具有同质性
must consist of homogeneous units
前面我们讲过
As we have mentioned in earlier lectures
总体的第一个特点大量性
the first characteristic of a population is its large size
第二就是同质性
The second is homogeneity
同质性指的就是
Homogeneity means
它们属于这个同一个总体的
they should belong to the same population
这样的话
Only in this way
我们计算出来的平均数
can the calculated average
才有代表性
be representative
比如说平均收入
Take the average income for example
有些地方公布出来的平均收入
The average income released in some places
就不具有同质性
violates homogeneity
有些 比如说有些人置疑
In some cases, there are doubts
农民人均收入
about the average income of farmers
农民人均可支配收入
Per capita disposable income of farmers
它好多不是农村的收入
includes much that is not rural income
它好多是什么
What is it, then?
工业收入或者投资收入
Industrial income or investment income
这个能不能计算进去
Should this be counted?
再一个因为它
Moreover,
这些投资的人呢
these investors
他可能不是农民了
may not be farmers anymore
他已经是转换了
They have changed roles
他可能是金融家
and may be financiers
也可能是企业家
or entrepreneurs
但只是因为在计算的时候
But when the calculation is done
他的身份就是他的户口
their household registration
还属于农民
still lists them as farmers
所以它把他的收入也计算进来
so their income is counted as well
但是我们如果从统计的
But judging by the homogeneity principle
总体的同质性来讲
in statistics
这一部分人不应该属于农民
their data should not be counted
所以这个收入里面
as the income of farmers anymore
在计算的时候特别要注意
We should always keep in mind
平均数使用计算的时候
when we need to calculate the average
一定要注意总体的同质性
we must ensure the homogeneity of the population
不然的话它会夸大
Otherwise the representativeness of the average
或者是缩小平均数的那个代表性
will be overstated or understated
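The homogeneity point can be illustrated with a minimal sketch. The incomes below are hypothetical numbers chosen for illustration, not figures from the lecture; the names `farm_incomes` and `investor_incomes` are likewise assumptions.

```python
from statistics import mean

# Hypothetical annual incomes (illustrative only, not from the lecture).
farm_incomes = [8_000, 9_500, 10_000, 11_000, 12_500]  # households that actually farm
investor_incomes = [400_000, 900_000]                  # registered as farmers, but living on investments

# Average over the mixed (non-homogeneous) population.
mixed_mean = mean(farm_incomes + investor_incomes)

# Average over the homogeneous population of farming households only.
homogeneous_mean = mean(farm_incomes)

print(f"mean with investors included:     {mixed_mean:,.0f}")
print(f"mean for farming households only: {homogeneous_mean:,.0f}")
```

In this sketch the mixed average is higher than the income of every farming household, so it represents none of them.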
第二个问题呢
Another thing to remember is that
我们要用组平均数
we should use subgroup averages
来补充说明总平均数
to supplement the population average
因为我们可能看
Sometimes, we might find
总平均数并不高
the population average is not high
但是你看
But we might also discover
各组的组平均数都很高
the subgroup averages are high
这是什么原因呢
What is the reason for this?
这就是因为出现了权数
It is because of the weights
因为权数在这里
The weights
会起着不同的作用
play different roles in the calculation
这个在美国就出现过
This happened in the U.S.
美国劳务工组织就认为
A labor organization in the U.S.
男女就业方面没有性别歧视
claimed there was no sex discrimination in employment
并且可能还对女性更有优惠
and that hiring might even favor women
他们有一套数据
They had a set of data to show it
但是美国妇女组织
But a U.S. women's organization
拿出的数据就不是
presented data that said otherwise
她拿出的数据就是
Their data showed
美国妇女在各个部门
that in every department
她觉得就业方面受到歧视
women faced discrimination in employment
她就业率要低于男性
and their employment rate was lower than men's
这个时候就是组平均数
There was a contradiction
和总平均数发生了问题
between population average and subgroup average
所以我们在使用总平均数的时候
So when we use population average
用组平均数来进行补充说明
we should supplement it with subgroup average
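A contradiction of this kind between group rates and the overall rate is often called Simpson's paradox. The sketch below uses hypothetical hiring counts (not the data from the U.S. dispute mentioned in the lecture) to show how the weights, here the number of applicants per department, can reverse the comparison.

```python
# Hypothetical hiring counts as (hired, applicants); illustrative only.
data = {
    "Department A": {"men": (9, 10), "women": (85, 100)},
    "Department B": {"men": (30, 100), "women": (2, 10)},
}

def rate(hired, applicants):
    """Employment rate = hired / applicants."""
    return hired / applicants

# Group (department-level) rates.
group_rates = {
    name: {sex: rate(*counts) for sex, counts in dept.items()}
    for name, dept in data.items()
}

# Overall rates: pool hired and applicants across departments,
# which weights each department by its number of applicants.
overall = {}
for sex in ("men", "women"):
    hired = sum(dept[sex][0] for dept in data.values())
    applicants = sum(dept[sex][1] for dept in data.values())
    overall[sex] = rate(hired, applicants)

for name, rates in group_rates.items():
    print(name, {sex: f"{r:.1%}" for sex, r in rates.items()})
print("Overall", {sex: f"{r:.1%}" for sex, r in overall.items()})
```

In these made-up numbers women have the lower rate in each department yet the higher rate overall, because most women applied to the department with the high hiring rate. This is why the overall figure needs the group figures alongside it.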
这两类的分的情况
How the groups are divided
是根据具体的要求来确定
depends on the specific requirements
使用的要求对象来确定的
and on the purpose of the analysis
等一下我们会发现
Soon we will discover that
我们使用的平均数特别多
there are various averages, including
有算术平均数
arithmetic mean
调和平均数 几何平均数
harmonic mean, geometric mean
中位数 众数 四分位数
median, mode and quartiles
这么多平均数
There are so many types of average
它们的使用的时候一定要注意
that we must pay attention to their application
它们使用的场合
on different occasions
它有些地方只能用一种
In some cases, only one type of average is appropriate
比如说我们讲的人均收入
such as in calculating per-capita income
人均支出 消费支出
per capita expenditure and consumer expenditure
比如我举一个例子
Here is an example
它有一个农场
There is a farm
那个农场那个税务官员来问他
When the tax official asks the farmer
你们的收入是多少
what is your income
他说我这个农场的人均的
The farmer replies that the average annual income
年均收入是5000美元
per capita is $5,000
但是他跟朋友介绍的时候
But when the farmer talks to his friend
我们这个农场的人
he says the workers on this farm
年均收入是10万美元
have an average annual income of $100,000
这是什么原因造成的
Why is there such a difference
这两个平均数都没错
Both of the averages are correct
他前面用的是中位数
The figure the farmer gave the official is the median
后面用的是算术平均数
and the figure he gave his friend is the arithmetic mean
因为算术平均数受到
The arithmetic mean is affected
极端值的影响
by extreme values
受到极端值的影响就是
Here is how the extreme values arise
这个地方是一个农场
This is indeed a farm
农场就是养牛的
a cattle raising ranch
但是这个地方空气挺好
But it is located in a place with good air quality
住了两个亿万富翁
Two billionaires live in this area
就是这些穷人跟两个亿万富翁
And both the farmers and the billionaires
加起来算算平均每个人10万美元
are counted into the population, whose average is $100,000 per capita
但是如果由少
If we arrange the list
由收入最少的往最大的排队
from lowest to highest
一直排列
in this order
处于中间位置的那个人
the person in the middle position
加起来的收入就是5000美元
has an income of $5,000
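The farm example can be reproduced with a small sketch. The lecture gives only the two results ($5,000 and $100,000); the head count of 1,000 and the individual incomes below are assumptions chosen so that the numbers work out.

```python
from statistics import mean, median

# Hypothetical population: 998 farm workers and 2 very rich residents.
# These individual incomes are assumed for illustration only.
incomes = [5_000] * 998 + [47_505_000] * 2

farm_mean = mean(incomes)      # pulled upward by the two extreme values
farm_median = median(incomes)  # the middle of the sorted list, unaffected by them

print(f"arithmetic mean: ${farm_mean:,.0f}")
print(f"median:          ${farm_median:,.0f}")
```

Both figures are computed correctly from the same data; they simply answer different questions.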
所以说我们讲
Therefore, we can say
每个平均数
each type of average
它是要有一定的使用特殊场合
is used to suit certain purposes
比如我们讲的衣服 帽子 鞋
In the case of clothing, hat and shoes
这个他使用的平均数
the average is given
在工厂里面生产的时候
for the reference of the manufacturer
他是用平均数生产
The average is needed in production arrangement
因为他也不知道
because the manufacturer does not know
每一个顾客的要求尺寸
the size each individual customer needs
他用平均数算的话
So if an average is used
他用什么平均数呢
which average should the manufacturer choose?
他不能用算术平均数
Not the arithmetic mean
因为算术平均数算出来的话呢
because the size given by the arithmetic mean
没有人能穿
might fit no one
他用的是众数
The manufacturer uses the mode
就是大多数人能够穿的尺码
which is the size that the most people wear
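A minimal sketch of the shoe-size case, assuming a hypothetical list of customer sizes (the lecture gives no data):

```python
from statistics import mean, mode

# Hypothetical shoe sizes of sampled customers (illustrative only).
sizes = [38, 39, 40, 40, 41, 41, 41, 42, 42, 43, 45]

mean_size = mean(sizes)   # usually not a size that exists on the shelf
modal_size = mode(sizes)  # the most frequently occurring size

print(f"arithmetic mean: {mean_size:.2f}")
print(f"mode:            {modal_size}")
```

Here the arithmetic mean is not an actual shoe size, while the mode is the size worn by the largest number of customers.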
所以大家注意
So remember
等一下讲的这么多的平均数里面
among the many averages we are about to cover
使用场合一定要注意
pay attention to when each one applies
这是我们讲的平均数的有关情况
This is what we have covered about averages
平均数的有关情况里面
One more point to add
再要补充的就是中位数 众数
is that the median, the mode
和算术平均数它们之间
and the arithmetic mean
有一定的数量关系
have a certain quantitative relationship
这个关系它是
This relationship
我们通过现实的一些例子
was worked out from real-life examples
来测算出来的
by empirical measurement
但是这个数值不一定严格
so it holds only approximately, not exactly
大家可以看看相应的公式
We can look at the relevant formula
大家就看那相应的公式
Look at the relevant formula
那公式里面有
which shows
比如说中位数它等于
for example, the median expressed in terms of
算术平均数什么 什么
the arithmetic mean and so on
与它们之间的关系
Look at their relationship
众数跟算术平均数之间的什么关系
What is the relationship between mode and arithmetic mean
而算术平均数与众数和中位数
And what is the relation among arithmetic mean
之间的关系
mode and median
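The formula shown in the lecture is not reproduced in this transcript. A commonly cited empirical relation of this kind, attributed to Karl Pearson and valid only approximately for moderately skewed unimodal distributions, is given here on the assumption that it is the one meant. The symbols $\bar{x}$ (arithmetic mean), $M_e$ (median) and $M_o$ (mode) are notation chosen for this note.

```latex
\bar{x} - M_o \approx 3\,(\bar{x} - M_e)
\quad\Longleftrightarrow\quad
M_e \approx \frac{2\bar{x} + M_o}{3},
\qquad
M_o \approx 3M_e - 2\bar{x},
\qquad
\bar{x} \approx \frac{3M_e - M_o}{2}
```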
但是如果画出一个钟型图来的话
If we draw a bell-shaped curve
你就能看出来
you can see
中位数总在中间
that the median always lies in the middle
而只是说分布的时候要看
The rest depends on the shape of the distribution
如果是正态分布
In a normal distribution
中位数 众数 算术平均数
the median, mode and arithmetic mean
是在一个点上
fall at the same point
它们三者会相等
so the three are equal
但是如果右偏的话呢
In a right-skewed distribution
那中位数在中间
the median is still in the middle
众数呢是在左边
with the mode on its left
算术平均数在右边
and the arithmetic mean on its right
如果是左偏的话呢
In a left-skewed distribution
刚好反的
it is the opposite
中位数在中间
the median is in the middle
算术平均数在左边
the arithmetic mean is on its left
众数在右边
and the mode is on its right
大家可以看看相应的图形
These are the corresponding charts
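The ordering rule just described can be written as a small sketch. The function name `classify_skew`, the tolerance `tol`, and the sample scores are assumptions for illustration; the scores are not Er Niu's class data.

```python
from statistics import mean, median, mode

def classify_skew(mean_value, median_value, mode_value, tol=1e-9):
    """Classify a unimodal distribution by the order of mean, median and mode.

    This is a rule of thumb, not a formal test of skewness.
    """
    if abs(mean_value - median_value) <= tol and abs(median_value - mode_value) <= tol:
        return "symmetric"
    if mode_value <= median_value <= mean_value:
        return "right-skewed"   # mode on the left, mean on the right
    if mean_value <= median_value <= mode_value:
        return "left-skewed"    # mean on the left, mode on the right
    return "unclear"            # the three values do not follow either pattern

# Hypothetical exam scores (illustrative only).
scores = [60, 65, 65, 65, 70, 72, 75, 80, 90, 98]
shape = classify_skew(mean(scores), median(scores), mode(scores))
print(shape)
```

For these made-up scores the mode is 65, the median is 71 and the mean is 74, so the rule reports a right-skewed distribution.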
下面我们来试试
Now let us try
用刚刚对二妞班上的
using the results just calculated
平均成绩所计算的结果
for the average scores of Er Niu's class
来判断一下它们是属于
to judge whether they follow
对称分布 左偏分布
a symmetric distribution, a left-skewed distribution
还是右偏分布
or a right-skewed distribution
我们来看一看
Let us see
算术平均数(计算如上)
the arithmetic mean (see the calculation above)
中位数(计算如上)
median (see the calculation above)
众数(计算如上)
and mode (see the calculation above)
通过这三个指标大小的不同
From the difference in the three indicators
我们可以看到
we can see
(关系如上)
(the relationship above)
由此它显示的应该是右偏分布
so the distribution should be right-skewed
好
Ok
这就是我们讲的集中趋势指标
this is what we have covered about measures of central tendency