2.2.1 Scheme design: a prelude to statistical survey 方案设计:统计调查的前奏
第二讲 主要讲的就是调查方案的设计
Part 2 focuses on designing a statistical survey plan
调查方案的设计
Designing a survey plan
里面牵涉到许多内容
covers a lot of ground
为什么我们要讲这么一个内容
Why do we need to cover this topic
因为统计调查涉及的面非常广
Because a statistical survey covers a very wide scope
花费的人力 物力 财力非常多
and consumes a great deal of manpower, material and financial resources
我们不能说
We should avoid
因为准备不足
failing a survey
而使得某一次调查失败
because of inadequate preparation
那造成的损失是非常大的 没法弥补
The loss caused by the failure will be too large to make up for
那么
Therefore
在统计调查方案设计里面
we should get to know what is to be included
包括哪些内容呢
in the design of a statistical survey plan
大家从这些方面来考虑问题就好办
We should address this problem from the following aspects
第一
First,
我们要知道
we need to know
你这次调查的调查目的是什么
the purpose of this survey
在我们统计里面(讲) 就是确定调查目的
In statistics, to define the objective of a survey
就是为什么调查
is to figure out why we conduct this survey
你只有搞清楚
Only by figuring out
我这里的调查目的
the objective of survey
比如
for instance
人口普查是要了解国情国力的基本情况
the population census aims to capture the basic national conditions and national strength
了解我们的总体人口 这些情况
and the overall state of the population
所以
Consequently
了解了调查目的
understanding the purpose of the survey
才知道向谁调查 调查什么
helps to clarify who and what should be surveyed
怎么调查 什么时候调查
and when and how to conduct the survey
所以第一个
So, first of all,
调查方案里面最首要的问题
the top priority in survey design
是确定 明确调查目的
is to determine and clarify the purpose of the survey
第二个 向谁调查
Second, who should be surveyed
也就是我们讲的确定调查对象
that is, determining the respondents
和调查单位
and units of survey
调查对象 它是一个总体
The respondents form a population
就是我们上一章讲的统计总体
which we have discussed in our last chapter
它由总体单位构成
The population is made of units
那些单位就是我们要调查的单位
and these units are what we should investigate
那调查单位是什么呢
What are investigation units
就是我们想搜集的信息
They are the bearers of the information
或者项目的承担者
or survey items we want to collect
比如人口普查
In population census
人口普查的调查总体是
the population of the survey
中华人民共和国的所有的
is all the permanent residents
具有国籍的常住人口
with Chinese nationality
它的调查单位是每一个
Its survey unit is each individual
中华人民共和国的国民(公民)
person (citizen) of the PRC
当然 指的是常住人口
Of course, it refers to the permanent residents
调查单位它负担 承担着调查项目
The survey unit is the bearer of the survey items
这些调查项目
The survey items include
既包括我们前面讲的品质标志
not only the qualitative attributes discussed earlier
也包括我们前面讲的数量标志
but also the quantitative attributes discussed earlier
也就是所有的变量
that is, all the variables
第三个 调查什么
Thirdly, what should we investigate
就是我们确定调查内容和调查表
This is when we determine the survey content and the questionnaire
调查内容也就是调查项目
Survey content means survey items
一般来讲 大家都这么认为
Most people assume that
我进行一次调查
in a survey
我希望我调查内容
the more the survey content includes
越丰富越好
the better
实际情况可不是这样
But this is not necessarily true
因为你要求 想要许多信息
If you want to collect a lot of information
但是你要调查单位配合
you need the survey units to cooperate
或者调查单位理解
and to understand what is asked
比如说 我们国家
For example, in China’s second
在第二次人口普查的时候
national population census
曾经想调查文化程度
there was a plan to ask about education level
那时候
But at that time
我们国家经济文化水平
the country’s economic and cultural development
没有达到那个程度
had not reached that level
所以取消了这个调查项目
so the item was dropped
第二 你还要了解 我们那些
Second, you also need to know
当时的国民的文化水平到底有多高
how educated the people were at that time
理解能力到底有多强
and how well they could understand the questions
不然的话
Otherwise
你设置的调查项目
the survey items and survey content
调查内容也无法满足 搜集到
that you set cannot be answered or collected
第三
Third
有些隐私性问题
when there are privacy issues
你也不能够想调查就调查的到
it is not easy to collect the information
要采用一些特殊的调查方法
Special investigation methods should be adopted
来进行调查
in that case
比如说
For example, questions like
你多长时间洗床单 多长时间洗头
how often you wash your bed sheets or your hair
这关系到个人的隐私
touch on personal privacy
大家都不会 都不愿意告诉别人
People are often reluctant to share this information
那这样的话
In that case
我们在(选择)调查方法的时候
when we (choose) investigation methods
使用的时候
or use these methods
就要设计一些
we should design some
相关的 模糊的 混淆的一些问题
related, indirect, deliberately less specific questions
来得到真实的有关情况
to get at the true situation
特别是 黄 赌 毒此类的调查
Especially in surveys regarding pornography, gambling, and drug abuse
更是要采用一些比较好的方法
we should devise better ways
比如 我举一个例子
For example
比如说 家里面没有买电视
suppose there is no TV in a household
为什么没买电视
Why don’t they buy a TV set
有两种可能
There are two possible reasons
一种可能是因为家里没钱
One is that they cannot afford it
(第)二种可能是因为怕影响小孩的学习
the other is that they do not want the TV to affect the child’s learning
当然 现在电视普及了
Of course, TVs are now common
你(可以)把它改成是计算机
so you (can) replace the TV with a computer
那这样的话呢
So
你调查的时候
when you conduct the survey
你如果直接问
if you ask directly
你家里为什么不买电脑
why don’t you buy a computer
那家长可能说 我怕影响小孩学习
the respondent might say, I do not want it to affect my child’s learning
那这样调查就没有用了
and then the survey is useless
那怎么办呢
So, what do we do
我们就要设一个漏洞
We need to set a small trap
分析调查方法
by thinking through the survey method
电脑对小孩的学习有利 有弊
A computer has both benefits and drawbacks for a child’s learning
利 比如说 开拓小孩的视野
A benefit is, say, that it broadens the child’s horizons
能增加他的那个知识面等等
and widens their knowledge
弊 在于会分散他的学习时间等等
A drawback is that it takes time away from study
问家长 你赞同电脑对小孩的学习
So ask the parent: for a child’s learning, do you think
是利大于弊还是弊大于利
a computer’s benefits outweigh its drawbacks, or the reverse
如果他觉得 利大于弊
If they say the benefits outweigh the drawbacks
那接着问 那为什么我们家不买呢
then follow up: so why hasn’t your family bought one
没买呢
In that case
那只有另外一个回答
only one answer is left
家里暂时没钱
the family cannot afford it for now
所以说
Therefore
调查内容设置的时候
when setting the survey content
一定要注意
be sure to
要根据它的调查方法
take the survey method into account
和调查人员的素质 培训情况
along with the competence and training of the investigators
调查方案的第三个问题
And now it comes to
里面的第二个小问题就是调查表
preparing the questionnaire
你(的)调查资料要在调查表上体现
Your survey data are recorded on the questionnaire
那调查表有两类
There are two kinds of questionnaires
一类叫单一表 一类叫一览表
the single-unit form and the multi-unit form
单一表和一览表的区分就是
The difference between the two is
调查单位的多少
the number of survey units per form
一份表填一个调查单位叫作单一表
A form that records one survey unit is a single-unit form
我们学校里的学生情况登记表
for instance, the student registration form at our university
你个人的家庭 社会等等信息
Your personal, family and social information
这就登记在一张表上
is all recorded on one form
那这张表就属于单一表
so that form is a single-unit form
一览表是一份表里面
A multi-unit form is one form
填多个调查单位
that records several survey units
比如说 人口普查表
For example, the population census form
它一份表里按家庭
is filled in per household
一个家庭里面有多个调查单位
and one household contains several survey units
一个家庭成员都在里头
Every family member is on it
这是调查方案里面的第三个
This is the third element in survey design
调查内容和调查表
the survey content and the questionnaire
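As a rough illustration of the two kinds of form described above, here is a minimal Python sketch. The class and field names (`SurveyUnit`, `sex`, `age`) and the helper `units_on_form` are hypothetical, chosen only for this example. A form that records one survey unit is modeled as a single record, and a form that records several units, such as a household census form, as a list of records.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SurveyUnit:
    """One survey unit, the bearer of the survey items (hypothetical fields)."""
    name: str
    sex: str  # a qualitative item
    age: int  # a quantitative item


def units_on_form(form) -> int:
    """Return how many survey units a form records."""
    return len(form) if isinstance(form, list) else 1


# One form, one unit: for example a student registration form.
student_form = SurveyUnit(name="Student A", sex="F", age=19)

# One form, several units: for example a household census form.
household_form: List[SurveyUnit] = [
    SurveyUnit(name="Parent", sex="M", age=45),
    SurveyUnit(name="Child", sex="F", age=12),
]
```

The only structural difference between the two forms is the number of units each one carries, which is the distinction the lecture draws.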
第四个 什么时候调查
Fourth, when to conduct the survey
就确定调查时间
That is, determining the survey time
调查时间分两类
There are two types of survey time
一类是我们讲的
One is what we call
调查工作的起止时间
starting and ending times of the survey
也就是
In other words
什么时候开始调查
when to start the survey
什么时候结束调查
when to end the survey
中间时间多长
and how long it lasts in between
第二个调查时间指的是
The second type of survey time
调查资料所属的时间
is the time to which the survey data refer
比如说 人口普查
For example, in the population census
人口普查规定的标准时点
the census specifies a standard reference time
标准时点
The standard reference time
比如在11月1号零时
for example, 00:00 on November 1
属于标准时点
is a standard reference time
那指的就是
It refers to
这个人在11月1号零时
the people who are still alive
他还活着的人
at 00:00 on November 1
这是资料所属的时间
This is the time to which the data refer
那11月1号零时之前
If, before 00:00 on November 1,
进行的统计调查的登记
a person was registered in the survey
但是 他没有坚持到11月1号零时
but did not live until 00:00 on November 1
这时候 这个人要剔除
then that person must be removed
如果11月1号以后出生的人
Likewise, anyone born after 00:00 on November 1
他没赶上
missed the reference time
也不能加进去
and must not be added
所以 这就叫调查资料所属时间
So this is what we call the reference time of the survey data
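The rule just described (count only people alive at the standard time point) can be sketched in a few lines of Python. This is only an illustration: the function name `counted_in_census`, the year in `REFERENCE_TIME`, and the handling of events that fall exactly on the reference moment are assumptions, not something stated in the lecture.

```python
from datetime import datetime
from typing import Optional

# Illustrative reference time; the year is an assumption for this sketch.
REFERENCE_TIME = datetime(2020, 11, 1, 0, 0)


def counted_in_census(born: datetime,
                      died: Optional[datetime] = None,
                      reference: datetime = REFERENCE_TIME) -> bool:
    """Return True if the person was alive at the reference time.

    Assumption: a person born exactly at the reference time counts,
    and a person who died exactly at it does not.
    """
    was_born = born <= reference
    still_alive = died is None or died > reference
    return was_born and still_alive
```

For instance, a person registered in late October who died at 23:00 on October 31 is removed, and a baby born at 08:00 on November 1 is not added, no matter when the enumerator actually visits.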
统计调查方案的最后一个工作
The last task in a statistical survey plan
那就是组织方式的实施
is organizing its implementation
什么叫组织方式的实施
What does organizing the implementation mean
因为一项大型的统计调查
A large-scale statistical survey
它牵涉的人 财 物特别多
involves a great many people, funds and materials
量特别大
in very large quantities
那这样的话 首先要
In that case, we should first
组成一个调查领导机构
set up a leading body for the survey
比如说 我们国家
For example, in China
每次进行的国家大型调查
every large national survey
都由国务院总理或者副总理
has the Premier or a Vice Premier of the State Council
牵头当调查组长
serve as head of the survey group
省里面由省长或者副省长
and in each province the governor or a deputy governor
当调查组长
serves as head of the survey group
第二呢
Secondly
(对)调查经费的来源进行预算
budget for the survey and its sources of funding
第三 调查人员的培训
Third, the training of investigators
第四 调查工作的宣传
Fourth, publicity for the survey
第五 调查资料的整理 储存 运输
Fifth, the sorting, storage, transport,
处理 发布等等
processing and publication of the survey data
这就是我们第五个
This is the fifth and also the last job of statistical survey design
确定调查组织实施的方式
to organize the implementation of the survey
前面讲了 调查方案的内容
We have now covered what a survey plan contains
调查方案按照这些内容进行设计
Designing the plan around these elements
就能够进行具体调查
lets us carry out an actual survey
那大妞二妞她也了解
The two girls also know
她妈妈 堵车那条线路的有关情况
about the traffic jams on their mother’s route
她就应该按照这个来进行设计
so they should design
她的具体问卷
their specific questionnaire along these lines
她要进行实地考察 实地观察
They need to carry out on-the-spot investigation and observation
比如 我曾经看过一篇论文
For example, I once read a paper
一个本科生写的
written by an undergraduate student
他就是利用40多天
who spent more than 40 days
在中国人民大学西门
at the west gate of Renmin University of China
对那个红绿灯进行实地调查
conducting an on-the-spot survey of the traffic lights there
写出了(对)北京市交通的
and wrote an article with some proposals
一些建议的情况 一篇文章
on traffic in Beijing
写得非常好
It was very well written
大妞二妞可以采取这种方法
The two girls can also adopt this method
来进行调查
in their investigation
并且分析这里边的堵车的有关情况
and analyze the traffic jam
并且为交警更好的管理提供建议
and offer suggestions to help the traffic police manage it better
这是统计调查(这一章)的第二个问题
This is the second topic in (the chapter on) statistical surveys
统计(调查)方案的设计
the design of the statistical (survey) plan
第三个问题 统计调查会产生误差
The third topic: statistical surveys produce errors
不管哪种调查都会产生误差
Whatever the type of survey, errors will occur
它误差分两大类
There are two kinds of errors
第一大类 就是我们讲的
One is what we call
可消除误差
eliminable errors
第二大类 是不可消除的误差
The other is errors that cannot be eliminated
可消除的误差它可以分两类
The eliminable errors can be divided into two types
一类就是登记误差
One is data recording error
登记误差指的是什么呢
What is data recording error
就是我们大家因为工作疲劳
It happens when investigators are too tired
或者是注意力不强
or too distracted
使得我们的真实数据没有登上
to record the true data
写出了错误的资料 错误的信息
and so write down wrong data or information
第二个可以消除误差是系统性误差
The other type of eliminable error is systematic error
系统性误差是可能(在)调查的时候
Systematic error may happen (when)
选择的调查方法
there is something wrong with the choice
或者是选择的调查单位有问题
of survey method or survey units
比如说 比较典型的(例子)
For example, a typical (case)
就是美国1936年的总统竞选
is the U.S. presidential election of 1936
文学文摘它调查了240万人
where the Literary Digest polled 2.4 million people
就是按照家里的车牌号码
selected by license plate numbers
和电话号码进行调查
and home phone numbers
得出的结论是
and reached the conclusion that
罗斯福不可能当总统
Roosevelt would lose
而盖洛普只调查了几千人
Meanwhile Gallup polled only a few thousand people
它采取的是随机抽样
using random sampling
就分层随机抽
specifically, stratified random sampling
得出的结论是罗斯福当总统
and concluded that Roosevelt would be elected
最终的结果是 盖洛普调查
In the end, the Gallup
民意调查得出的结果准确
opinion poll proved accurate
实际情况是罗斯福当了总统
Roosevelt was in fact elected president
那前面调查了240万人的民意测验
So how did the poll
调查结果怎么就错的呢
of 2.4 million people get it wrong
这个就出现了系统偏差
It suffered from systematic bias
1936年美国有车有电话的家庭
In 1936, American families that owned a car and a telephone
都属于富裕的
were all affluent
他们反对罗斯福的新政
They opposed Roosevelt’s New Deal
所以他们投票不会投罗斯福
so they would not vote for him
所以这个呢 就属于系统性误差
So this is a systematic error
系统性误差 只要认真 负责
Systematic error can be eliminated
方法正确 是可以消除的
if you are careful and responsible and use the right method
所以前面两类叫作可消除误差
That is why these first two types are called eliminable errors
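To make the systematic error in the 1936 example concrete, here is a small simulation sketch in Python. All numbers in it are made up for illustration (the share of affluent households, the support rates, the sample sizes), and it uses a plain simple random sample rather than the stratified design mentioned in the lecture. It compares a very large sample drawn only from affluent households with a much smaller sample drawn at random from everyone.

```python
import random


def simulate_polls(seed: int = 0):
    """Return (true_share, biased_share, fair_share) for a toy electorate."""
    rng = random.Random(seed)

    # Hypothetical population: about 30% affluent (car and phone owners).
    population = []
    for _ in range(100_000):
        affluent = rng.random() < 0.30
        # Assumed support rates: 30% among the affluent, 75% among the rest.
        supports = rng.random() < (0.30 if affluent else 0.75)
        population.append((affluent, supports))

    true_share = sum(s for _, s in population) / len(population)

    # Large sample, but drawn only from affluent households (biased frame).
    affluent_only = [p for p in population if p[0]]
    biased = rng.sample(affluent_only, 20_000)
    biased_share = sum(s for _, s in biased) / len(biased)

    # Much smaller simple random sample from the whole population.
    fair = rng.sample(population, 2_000)
    fair_share = sum(s for _, s in fair) / len(fair)

    return true_share, biased_share, fair_share


true_share, biased_share, fair_share = simulate_polls()
```

Under these assumptions the large biased sample predicts a loss for the candidate while the small random sample lands close to the true share, which is the point of the example: a bigger sample does not repair a biased choice of survey units.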
第二类不可消除误差
The second kind is error that cannot be eliminated
不可消除误差(的) 第一个
The first type of error that cannot be eliminated
就是抽样误差
is sampling error
(当)你是从总体中
When you draw a sample to represent the population
按随机原则抽取的样本来代表总体
according to the random principle
就是你样本与总体的分布
the distributions of the sample and of the population
非常接近 但是也是有差别的
are very close, but they still differ
那这个误差是没法消除
This error cannot be eliminated
但是可以控制并加以计算
but can be controlled and calculated
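One common way to "calculate and control" sampling error, shown here only as a hedged sketch, is the standard error of a sample proportion under simple random sampling, sqrt(p(1-p)/n). The function names below are hypothetical, and the 1.96 multiplier assumes a normal approximation at roughly the 95% level.

```python
import math


def standard_error_of_proportion(p: float, n: int) -> float:
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error (normal approximation, about 95% for z=1.96)."""
    return z * standard_error_of_proportion(p, n)
```

Because n sits in the denominator, the error shrinks as the sample grows but never reaches zero for a finite sample, which matches the lecture's point that this error can be controlled but not eliminated.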
还有一个误差不可以消除
The other error that cannot be eliminated
就是度量衡误差
is measurement error
度量衡误差就是我们去度量
Measurement error happens when
比如重量 长度等等东西
we measure things such as weight and length
时间 我们度量时间用的是钟
We measure time with a clock
度量长度用的是尺
length with a ruler
度量重量用的是秤
and weight with a scale
但是这些度量衡
But these measuring instruments
它是一定存在误差的
always carry some error
就最准的 它叫原子钟
Even the most accurate one, the atomic clock
它多少万年也会慢一秒
still loses a second over many tens of thousands of years
所以这个误差也是无法消除
So this error cannot be eliminated either
那在现实生活里面怎么办
What can we do about this in real life
我们一般 比如量身高
For example, when measuring height, we usually
那就多量几次 加起来平均
measure several times and take the average
让接近于 或者叫逼近于真实
so that the result approaches the true value
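The "measure several times and average" remedy is simple enough to sketch directly. The readings below are hypothetical values in centimeters, and `estimate_true_value` is an illustrative name, not a standard function.

```python
from statistics import mean


def estimate_true_value(readings):
    """Average repeated readings of the same quantity."""
    if not readings:
        raise ValueError("need at least one reading")
    return mean(readings)


# Hypothetical repeated height readings, in centimeters.
height_readings_cm = [170.2, 169.8, 170.1, 169.9]
estimated_height_cm = estimate_true_value(height_readings_cm)
```

Averaging lets readings that are slightly too high and slightly too low offset each other, so the estimate tends to sit closer to the true value than any single reading does.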
好 统计调查的内容
Ok, these are the three parts
就这三个部分
of statistical surveys
那现在我们就可以反过来知道
Reviewing the case, we know
大妞二妞她(们)把我们
the two girls can use
前面学过的内容进行运用
what we have learned in this chapter
设计出合格的 科学的
to design a valid and scientific questionnaire
但是千万别内容太多的调查表
But it must not contain too much
因为内容太多
because if it does
调查的单位他是不会填
the survey units will not fill it in
要填也可能会真实性会下降
or, if they do, the answers may be less truthful
用她们设计的调查表进行调查
Surveying with the questionnaire they design
所获得的那些原始资料
yields raw data
有助于(她们)分析她的妈妈所坐的
that help them analyze the traffic congestion
回家的那条线路的交通拥挤情况
on the route their mother takes home
好 谢谢大家
That is all, thank you.