7.4.2 Interval estimation of the mean: small sample case 均值的区间估计:小样本情形
接下来我们介绍小样本情形下
Next, we are going to introduce the interval estimation process of
总体均值的区间估计过程
the population mean in the small sample case
在小样本的情况下
In the case of small samples
总体均值的抽样分布
the sampling distribution of the population mean
依赖于总体的分布
depends on the distribution of the population
我们仅讨论总体
We will only discuss the case
服从正态分布的情形
where the population obeys a normal distribution
根据前面抽样分布的定理
According to the sampling distribution theorems
我们知道总体
we know that when the population
是正态总体的情况下
is a normal population
如果标准差已知
If the standard deviation is known and
小样本出现了
the sample is small
X_bar还是会服从正态分布的
X-bar will still obey a normal distribution
但是如果总体是个正态总体
But if the population is a normal population
总体标准差未知的情况下
the population standard deviation is unknown
那如果样本容量小于30的话
and the sample size is less than 30
这个时候X_bar
X-bar at this time
就不再服从正态分布了
will no longer obey a normal distribution
由它所构造的t统计量
instead, we use the t statistic constructed from it
那t统计量的一般形式
The general form of the t statistic
大家还会写吗
can you still write it down?
我们前面在学抽样分布的时候
In learning the sampling distribution
学过t统计量
we learned about the t statistic
那t统计量在这里的表达形式是
The expression for the t statistic right here is
(公式如上)
(The formula is as above)
这个t统计量的形式
The form of this t-statistic
看起来非常类似于
looks very similar to
标准的正态分布
the standard normal distribution
但是这里我要说明的是
But what I want to point out here is that
它并不是由
it is not transformed from
标准正态分布转化而来的
the standard normal distribution
它是由t统计量的构造
It comes from the way the t statistic is constructed
那t统计量的分子
The numerator of the t statistic
是一个标准的正态分布
is a standard normal variable
它的分母是由卡方统计量
Its denominator is constructed by
除以它的自由度
dividing the chi-square statistic by its degrees of freedom
然后开根号来构造的
and then taking the square root
那么这个t统计量它服从的
This t statistic obeys
就是t分布
a t distribution
并且它有一个自由度和它对应
and it has an associated number of degrees of freedom
这个自由度的话
That number of degrees of freedom
是n-1和它对应的
is n-1
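The slide formulas are not captured in this transcript. As a reconstruction from the description above, in standard textbook notation that may differ from the lecturer's symbols, the construction can be written as follows. Here Z is a standard normal variable, χ² is an independent chi-square variable with n-1 degrees of freedom, s is the sample standard deviation, and σ is the unknown population standard deviation (these symbol choices are assumptions):

```latex
Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1), \qquad
\chi^2 = \frac{(n-1)\,s^2}{\sigma^2} \sim \chi^2(n-1)

t = \frac{Z}{\sqrt{\chi^2/(n-1)}}
  = \frac{\bar{X}-\mu}{s/\sqrt{n}} \sim t(n-1)
```

The unknown σ cancels between numerator and denominator, which is why the statistic can be computed from the sample alone.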
所以在小样本的情形下面
So in the small sample case
我们对总体均值的区间估计
for the interval estimation of the population mean
也要先了解它X一拔的抽样分布
we should first understand the sampling distribution of X-bar
根据前面的学习
According to the previous study
我们了解到t分布的图形特征
we know that the graphical characteristics of the t-distribution
和标准正态分布的图形特征
and those of the standard normal distribution
基本是类似的
are basically similar
比如它们都是对称的钟形分布
For example, they're both symmetric, bell-shaped distributions
它们都是关于Y轴对称的
they're both symmetric about the Y-axis
所以基于这样的一个基本知识
So based on this basic knowledge
我们前面的区间估计的原理
the principles of the previous interval estimation
可以再一次
can once again be
应用到t分布的图形上面
applied to the graph of the t-distribution
那以X_bar为中心所构建的区间
The intervals built around X_bar
仍然是在负无穷到正无穷
are still randomly fluctuating
这个区间上面随机地波动的
over the range from minus infinity to plus infinity
在它波动的过程里边
In the process of their fluctuation
有的区间能包含μ
some intervals can cover μ, and
有的区间不能包含μ
some can’t
我们依然希望
Still we hope
能包含μ的区间占大多数
a majority of intervals can cover μ
比如90% 比如95%
say 90% or 95%
如果有了这个置信水平
If we have this confidence level
同样的我们可以通过查表
similarly, we can, by looking up the table
得到对应的临界值
find the corresponding critical values
只不过这一次
Only this time
不再是查标准正态分布表
instead of looking up the standard normal distribution table
而是查t分布表而已
we look up the t distribution table
所以区间估计的
So the basic principles of interval estimation
基本原理还是有用的
are still useful
只不过抽样分布的形式
It's just that the sampling distribution form
从正态分布换成了t分布而已
has changed from a normal distribution to a t-distribution
在t分布里边
In the t distribution
对于给定的置信度
for a given confidence level
同样可以通过查表
by looking up the table
找到它对应的临界值
we can find its critical value
它的临界值的一般表达方法
The general expression of its critical value
用(公式如上)来表达
is as the formula above
利用临界值
By using the critical value
同样地可以把极限误差
the limiting error
Ex_bar计算出来
Ex_bar, can also be calculated
那这个时候它就等于
At this time, it equals
(公式如上)
(The formula is as above)
因此在正态总体条件下
So in the case of a normal population
总体均值的区间估计
the interval estimate of the population mean
在标准差未知的小样本情形下
when the standard deviation is unknown and the sample is small
可以采用下面的方法进行
can be carried out as follows
下端点的算法
The lower endpoint is computed as
(公式如上)
(The formula is as above)
上端点的算法仍然是(公式如上)
The upper endpoint is likewise computed as (formula above)
在这里(公式如上)
Here (the formula above)
是在自由度为n-1的t分布里边
is, in the t distribution with n-1 degrees of freedom,
右侧尾部面积为(公式如上)时
the point whose right-tail area is (the formula above)
所对应的临界值了
that is, the corresponding critical value
因为它和标准正态分布的
Because it has the same features
特点是一样的
as the standard normal distribution
也是左右对称的
It's also left-right symmetric
因此中间有1-α
So we have 1-α in the middle, and
左右两边就各分到有
the left and right sides each
(公式如上)的面积
have an area (as shown in the formula above)
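Since the on-screen formulas are missing here, the interval just described can be sketched in standard notation as below. The symbols (E for the limiting error, t_{α/2}(n-1) for the critical value) are assumed from the spoken description and common textbook usage, not copied from the slides:

```latex
E_{\bar{x}} = t_{\alpha/2}(n-1)\,\frac{s}{\sqrt{n}}

\left[\ \bar{x} - t_{\alpha/2}(n-1)\,\frac{s}{\sqrt{n}},\quad
        \bar{x} + t_{\alpha/2}(n-1)\,\frac{s}{\sqrt{n}}\ \right]

P\!\left(\,|t| \le t_{\alpha/2}(n-1)\,\right) = 1-\alpha
```

The last line restates the symmetry argument: an area of 1-α in the middle leaves α/2 in each tail.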
我们来看一个例子
Let's look at an example
在ABSPC.xls的数据里边
In the abspc.xls data
大妞妈妈同时记录了
the girl's mother recorded at the same time
所抽查到的拉杆箱的承重量
the load capacity of the sampled trolley cases
其中有28个拉杆箱
of which, 28 cases
提供了该项指标值
provided this indicator value
具体的数字在下面的表格里面
The numbers are in the table below
其中表中的大写的N
where a capital N in the table
表示没有这一项数据
means that this value is missing
接下来要我们根据上述资料
Next, based on the above information, we are asked to
建立置信度为90%的
construct a confidence interval for
总体均值的区间
the population mean at the 90% confidence level
假定承重量指标
It is assumed that the load capacity indicator
总体服从正态分布
follows a normal distribution in the population
那有了这些信息
Having such information
我们也同样地可以先分析
we can also first make an analysis
假定总体是个正态总体
Let's say the population is a normal population
n=28 是个小样本
n =28 is a small sample
并且没有信息直接告诉我们
and there's no information to tell us directly about
总体的标准差
the standard deviation of the population
所以我们刚才提到的
So we are in the t-distribution case
t分布的情形就出现了
we’ve just mentioned
正态总体 小样本
Normal population, small sample
并且标准差未知
and unknown standard deviation
那这个时候我们就可以判断
Now, we can judge that
X_bar是服从
X-bar obeys
自由度为27的t分布
a t distribution with 27 degrees of freedom
当然另外还告诉了
And, of course, we are also told that
我们置信度是90%
the confidence level is 90%
那根据1-α等于90%
Since 1-α equals 90%
等一下我们要查的表
the table that we're going to look up is
就不再是标准正态分布表
no longer the standard normal distribution table
而是t分布表了
but a t-distribution table
接下来我们的步骤
The following steps
和前面的例子基本类似
will be basically similar to those in the previous example
第一 计算点估计值
First, calculate the point estimate
第二 计算极限误差
Second, calculate the limiting error
解 依题意
Solution: according to the problem statement
总体服从正态分布 n=28
The population obeys a normal distribution, n=28
此时总体方差未知
The population variance is unknown
可以用自由度为27的t分布
We can use a t distribution with 27 degrees of freedom
进行总体均值的区间估计
to carry out the interval estimation of the population mean
首先第一步计算样本平均数
The first step is to calculate the sample average
(公式如上)
(The formula is as above)
将数据代入进来
Substitute the data into the formula
计算的结果是35.36公斤
The result is 35.36 kg
这是平均的承重量
This is the average load capacity
那为了计算极限误差
In order to calculate the limiting error
我们需要有一个准备
we need one thing prepared first
就是样本标准差
namely the sample standard deviation
这个我们在前面也分析过
We have discussed previously
样本标准差的算法
how the sample standard deviation is calculated
刚才也提醒过
and, as I reminded you just now
就是一定是无偏性处理以后的
it must be the version with the unbiasedness correction
一定是(公式如上)
that is, it must be (formula above)
把信息代入进来
Substitute the information into the formula
计算的结果是17.9
The result is 17.9
有了这个信息
With this information
接下来计算极限误差
we’ll calculate the limiting error
(公式如上)
(The formula is above)
那这个地方的话
Here
有一个(公式如上)出现了
there is a (formula as above)
它的查表的方法
Its table lookup method
和标准正态分布基本类似
is basically similar to that for the standard normal distribution
好 接下来我来教大家查一下
Now, let me show you how to look up
t分布表的临界值
the critical value of the t-distribution table
请大家拿出教材来
Please take out your textbooks
那我们拿到教材以后的话
Once you have the textbook
翻到教材后面的目录
turn to the table of contents at the back
目录里边你可以找到
In the table of contents you can find
有一个t分布双侧临界值表
a table of two-sided critical values for the t-distribution
那我们在进行区间估计的时候
So when we do interval estimation
这张表是最方便查临界值的
This table is very convenient for us to look up the critical values
那么在这个双侧临界值表里面
After getting this two-sided
拿到以后
critical value table
我们首先要观察一下图片
Let's look at the picture first
表格上面有一个图片
there is a picture above the table
那么这个图片里边显示
The picture shows that
两侧阴影部分面积加起来有α
the shaded areas on both sides add up to α
中间是1-α
In the middle is 1-α
那么在我们这个题目里边
In our question
中间1-α是90%
the middle area 1-α is 90%
那当然两侧的面积就是10%
so, of course, the two tails together have an area of 10%
那这个10%我们到哪里去找呢
Where do we find this 10%?
在我们表格里面的
In the table
第一行里边不是有一些数字
there are some numbers in the first row
有0.1 有0.05 有0.02
such as 0.1, 0.05, 0.02
有0.01这样的一些数字
and 0.01
那这一次我们对应的是0.1
This time the one we need is 0.1
因为是90%剩余的部分
because it is what remains after taking away 90%
所以我们现在就找到第一列
So we now locate the first column
是我们要查的这一列
which is the column we need to look in
那根据前面的分析
According to the previous analysis
我们的自由度是27
the degrees of freedom are 27
所以在表格里面
So, in the table
往下找
go down that column
找到自由度为27交叉的位置
to where it meets the row for 27 degrees of freedom
1.703就是我们这一个例题
1.703 is the critical value from the t-distribution table
t分布表的临界值了
for this example
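For readers without the printed table at hand, the lookup can be approximated numerically. The sketch below is an illustration rather than the lecturer's method: it integrates the t density with Simpson's rule and finds the critical value by bisection, using only the Python standard library. The function names `t_pdf`, `t_cdf` and `t_critical` are hypothetical helpers introduced here.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 for the left half plus Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df):
    """Two-sided critical value: the point whose right-tail area is alpha / 2."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Two tails totalling 0.10 with 27 degrees of freedom, as in the example.
t_value = t_critical(alpha=0.10, df=27)
print(round(t_value, 3))
```

The printed value should agree with the table entry of 1.703 read off in the lecture. In practice a statistics library would normally be used instead of hand-written integration.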
把这个数字查好以后
After finding the number
接下来我们就可以计算
we can calculate
极限误差的值
the value of the limiting error
那极限误差
For the limiting error
(公式如上)
(The formula is as above)
注意 这个地方是根号n
Notice this is the square root of n
不是根号n-1
not the square root of n-1
所以大家一定要注意一下
So please note carefully that
无偏性的调整
the unbiasedness adjustment
是调整在s本身的算法里面
is made inside the calculation of s itself
不是任意你看到有n的地方
it does not mean that wherever you see n
都要把它变成n-1
you have to turn it into n-1
好 回到我们刚才的算式里边
Now, go back to the calculation we started
(公式如上)
(The formula is as above)
计算的结果是5.76公斤
The result is 5.76 kg
这样的一个数字
that is the number we get
那有了点估计值
Now that we have the point estimate
有了极限误差
and the limiting error
90%的置信区间就可以算出来
we can work out the 90% confidence interval
下限是29.60公斤
The lower limit is 29.60 kg, and
上限是41.12公斤
the upper limit is 41.12 kg
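The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch that plugs in only the summary figures quoted in the lecture (n = 28, mean 35.36 kg, s = 17.9, table value 1.703). The raw data table is not reproduced in the transcript, so the mean and standard deviation are taken as given rather than recomputed, and the variable names are chosen here for illustration.

```python
import math

n = 28            # cases with a recorded load capacity
x_bar = 35.36     # sample mean in kg, as stated in the lecture
s = 17.9          # sample standard deviation (n-1 in its denominator)
t_crit = 1.703    # t table value: 27 degrees of freedom, two tails totalling 0.10

# Limiting error: divide by sqrt(n), not sqrt(n-1).
margin = t_crit * s / math.sqrt(n)
lower = x_bar - margin
upper = x_bar + margin

print(f"limiting error: {margin:.2f} kg")
print(f"90% interval: [{lower:.2f}, {upper:.2f}] kg")
```

Rounded to two decimals this reproduces the lecture's figures: a limiting error of 5.76 kg and an interval from 29.60 kg to 41.12 kg.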
那这样的话
Then
我们就可以帮助大妞妈妈
we can help the girl’s mother
把24寸ABS加PC材质的旅行箱的
estimate the average load capacity
平均承重量给它计算出来了
of the 24-inch ABS+PC suitcases
这就是在小样本的情形下
So that's the method for estimating
如何来估计总体均值的
the confidence interval for the population mean
置信区间的方法
in the case of small samples
它和大样本情形唯一的不同
The only difference from the large-sample case
就是这一次我们利用了t分布
is that this time we use the t-distribution
而原来我们利用的是正态分布
whereas previously we used the normal distribution
来帮助我们查表
to look up the critical value
好 接下来我们对总体均值的
Then, we are going to make a little summary of
区间估计过程
the process of interval estimation of
稍微做一个总结
the population mean
根据我们前面的例子
According to our previous example
我们可以总结单个总体均值
we can summarize the steps of interval estimation
区间估计的步骤
for the mean of a single population
第一步通常是计算样本平均数
The first step is usually to calculate the average of the sample
帮助我们获得点估计值
to help us get a point estimate
可以是简单算法
This can be the simple (unweighted) formula or
也可以是加权的算法
the weighted formula
第二步
The second step is
通常是要计算样本标准差
usually to calculate the sample standard deviation
来帮助我们计算极限误差
to help us calculate the limiting error
样本标准差
The formula for the sample standard deviation
和样本平均数的算法
and the formula for the sample mean
是配套的
must match
如果样本平均数用了简单算法
If the sample mean is computed with the simple formula
那样本标准差也是简单算法
the sample standard deviation also uses the simple formula
如果样本平均数用了加权算法
If the sample mean is computed with the weighted formula
那么样本标准差也是加权算法
the sample standard deviation also uses the weighted formula
另外 还要提醒大家的是
Another thing to remind you of is
一定要做无偏性的调整
to be sure to make the unbiasedness adjustment
第三步是计算极限误差
The third step is to calculate the limiting error
在大样本的时候
In the case of large samples
极限误差的算法
the calculation of the limiting error
依赖于正态分布
depends on the normal distribution
因为X_bar它的抽样分布
Because X_bar’s sampling distribution
是个正态分布
is a normal distribution
那你可以是
You can use
(公式如上)
(The formula is as above)
在小样本的情形下面
In the case of small samples
我们计算极限误差
the calculation of the limiting error
是依赖于自由度为n-1的t分布
depends on the t-distribution with n-1 degrees of freedom
这个时候是查t分布表
At this time, look up the t-distribution table
得到临界值(公式如上)
to get the critical value (formula above)
所以(公式如上)
Therefore, (the formula is as above)
第四步则是构造
The fourth step is to construct
置信水平为1-α的区间
the interval with a confidence level of 1-α
下端点(公式如上)
with the lower endpoint of (formula above)
上端点(公式如上)
and the upper endpoint of (formula above)
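The four steps can be sketched as one small Python function. This is an illustration under stated assumptions, not the course's own code: it covers only the simple (unweighted) case, the caller supplies the critical value read from the appropriate table, and the function name `mean_confidence_interval` and the sample data are made up for the example.

```python
import math
import statistics

def mean_confidence_interval(data, critical_value):
    """Return (lower, upper) for the population mean.

    critical_value is supplied by the caller: a t-table value with
    len(data) - 1 degrees of freedom for a small sample, or a standard
    normal value for a large sample.
    """
    n = len(data)
    x_bar = statistics.mean(data)                 # step 1: point estimate
    s = statistics.stdev(data)                    # step 2: sample std, divides by n-1
    margin = critical_value * s / math.sqrt(n)    # step 3: limiting error, uses sqrt(n)
    return x_bar - margin, x_bar + margin         # step 4: interval endpoints

# Made-up data, only to show the call; 2.132 is the two-sided 90% t value for 4 df.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
lower, upper = mean_confidence_interval(sample, critical_value=2.132)
print(round(lower, 3), round(upper, 3))
```

Passing the critical value in, rather than computing it inside, keeps the large-sample and small-sample cases in one function: only the table consulted changes.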
这样的话我们就基本上完成了总体均值的
So we're done with the process of
完成了总体均值的
the interval estimation
区间估计的过程
of the population mean
好 这一讲就是这样
So much for this lecture
谢谢大家
Thank you