Current course topic: Learn Statistics with Ease > Chapter 1 Data and Statistics > 1.4 Statistical Inference > 1.4.3 Basic concepts: the cornerstone of statistics
统计的基本概念
Basic Concepts of Statistics
统计学的基本概念
The basic concepts of statistics
是一对一对地提出来的
are always introduced in pairs
第一对概念呢
The first pair we are going to discuss
就是总体与总体单位
is the population and the unit
什么叫总体呢
What is a population
第一次接触的人可能不清楚
Someone meeting the term for the first time may find it unclear
如果学过数学集合的话呢
If you have studied sets in mathematics
总体就相当于集合
a population is much like a set
它是由许多性质相同的元素
A set is a whole made up of
所组成的一个整体
many elements of the same nature
那就是集合
That is a set
那总体是什么呢
So what is a population
它是由许多性质相同的
A population is a whole made up of
总体单位构成的一个整体
many units of the same nature
那就是统计总体
That is the statistical population
它呢 有三个特点
It has three characteristics
第一 大量性
First, large size
数量特别大
The number is very large
里面的总体单位数量特别大
that is, the number of units it contains is very large
第二 同质性
Second, homogeneity
你要成为一个总体
To form a population
你必须 总体单位某些性质是要相同的
the units must share certain properties
第三 它必须具有差异性
Third, it must exhibit variability
总体单位是在某些性质上
The units are homogeneous
或某些标志上它是同质的
in some properties or characters
但是在大部分标志表现上
but in the expressions of most characters
是不相同的
they differ
这也是统计存在的前提条件
This variability is also the precondition for statistics to exist
总体呢
The population
这里讲的总体是经济统计的总体
we discuss here is the population of economic statistics
还有个总体
There is another notion of population
在我们的教科书
in our textbooks
就是中学的教科书里面
that is, in the high school textbooks
它里面写的总体
The population described there
它特别指明 说
is singled out, and the textbook says
我们这个总体是不对的
that our notion of population is wrong
它那个总体是对的
and that its own notion is right
它的那个总体是怎么解释的呢
How does the textbook define its population
它是由一组性质相同的数
It is a population made up of a set of
构成的总体
numbers of the same nature
它这个总体和数理统计的总体是相同的
That population is the same as the population of mathematical statistics
那我们现在来分析
Now let us analyze
这两个总体的差别
the difference between these two notions of population
比如说
For example
按照经济统计学来研究高血压
suppose we study high blood pressure from the standpoint of economic statistics
比如说 我有一种药 A
say I have a drug, A
这种药 A药能治高血压病
and drug A can treat high blood pressure
那我们的研究对象
Then what is our object of study
研究总体是什么
what is our population
经济统计的总体就是说
In economic statistics, the population is
患高血压的所有患者构成的整体
the whole formed by all patients with high blood pressure
就是我们研究的总体
That is the population we study
而数理统计
But mathematical statistics
也就是中学的那个统计它说不是
that is, the statistics of the high school textbook, says otherwise
它说是什么呢
What does it say instead
它说是(把那些)
It says the population is formed by (taking)
患高血压患者的血压数
the blood pressure readings of those patients
构成一个整体
as a whole
它是把血压数
It takes the blood pressure readings
就是一组数据构成一个整体
that is, a set of data, and forms a whole from them
这样的话呢
In that case
就跟我们经济统计总体不同
its population differs from the population of economic statistics
其实我们经济总体跟
In fact, the population of economic statistics
那个数理的总体
and the population of mathematical statistics
或者叫中学里面的那个总体呢
or the population of the high school textbook
只是一个层次不同
differ only in level
我们高血压患者构成一个总体
Patients with high blood pressure form one population
高血压患者他有好多标志
and these patients have many characters
第一 血压是一个标志
First, blood pressure is a character
年龄是一个标志
age is a character
性别是一个标志
and gender is a character
按照他们的(观点)来构成总体的话
If populations were formed their way
那经济统计的总体可以
a single population of economic statistics
构成比多他N倍的总体出来
would yield N times as many populations
所以 我们的总体比他的总体范围大
So our population is broader in scope than theirs
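The contrast between the two notions of population can be sketched in Python. This is only an illustration: the `Patient` class, its field names, and the three records are hypothetical and do not come from the lecture. The idea shown is that one collection of units gives one population in the economic-statistics sense, while each character of those units gives a separate population of numbers in the mathematical-statistics sense.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """One unit of the population; each field is a character of that unit."""
    name: str
    systolic_mmhg: float
    age: int
    gender: str


# Hypothetical records, invented purely for illustration.
patients = [
    Patient("P1", 152.0, 61, "male"),
    Patient("P2", 147.5, 58, "female"),
    Patient("P3", 160.0, 70, "female"),
]

# Economic-statistics view: the population is the collection of units.
economic_population = patients


def values_of(character: str) -> list:
    """Mathematical-statistics view: a population of values for one character."""
    return [getattr(p, character) for p in economic_population]


blood_pressure_population = values_of("systolic_mmhg")
age_population = values_of("age")

print(len(economic_population), "units")
print("blood pressure values:", blood_pressure_population)
print("age values:", age_population)
```

Each additional character of the units (age, gender, and so on) would produce one more population of values from the same collection of patients.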
当然
Of course
总体还可以分为有限总体和无限总体
populations can be further divided into finite populations and infinite populations
有限总体就是总体单位可穷尽
A finite population is one whose units can be counted exhaustively
无限总体呢 总体单位不可穷尽
An infinite population is one whose units cannot
为什么要区别呢
Why do we distinguish them
因为等下统计调查里面
Because later, in statistical surveys
普查 全面调查针对的是有限总体
a census, or complete enumeration, applies to finite populations
无限总体不能进行全面调查
An infinite population cannot be completely enumerated
在这里
Here
大家理解总体的时候
when you think about a population
一定要注意
be sure to keep in mind
总体一定是有单位
that a population must have units
一定跟研究目的联系在一起
and must be tied to the purpose of the research
大家理解了这些才好理解
Once this is understood, it is easier to answer questions such as
比如说
for example
中国是不是总体
whether China is a population
中国不是总体
China is not a population
你要讲清楚
You have to be specific
你是研究中国的人口
If you are studying the people of China
那么中国的所有的常住的
then all permanent residents
具有中国国籍的人构成的整体才是总体
with Chinese nationality form the population
这个总体里面
In this population
一定有每一个中华人民共和国国民
every national of the People's Republic of China
这个作为总体单位
is a unit
如果你笼统讲中国
If you just say "China" in general
那总体单位是什么
then what are the units
可能是固定资产
They could be fixed assets
可能是树
trees
可能是铁路
or railways
他必须指研究目的和里面的总体单位
A population must specify the research purpose and the units it contains
标志和指标
Characters and indicators
标志指的是反映总体单位
A character is the name of
属性或特征的名称
an attribute or feature of a unit
这就叫标志
That is what we call a character
标志就是个名称
A character is simply a name
是什么名称呢
A name of what
是总体单位的名称
A name that belongs to the unit
总体单位什么的名称呢
A name of what aspect of the unit
属性或者特征的名称
Of its attributes or features
为了大家好理解
To make this easier to understand
我举一下例子
let me give an example
比如 我要了解江西财经大学学生的
Suppose I want to know how students at Jiangxi University of Finance and Economics (JUFE)
四六级考试情况
performed in the CET-4 and CET-6 exams
总体就是江西财经大学
The population is then all the JUFE students
这个学期参加四六级考试
who take the CET-4 or CET-6
的所有学生
this semester
每一个学生就是一个总体单位
and each student is a unit
你说明每个学生 他的属性或特征
What are the names of the attributes or features
的名称有哪些呢
that describe each student
姓名 性别 民族
Name, gender, ethnicity
身高 体重 年龄
height, weight, age
这些都属于我们讲的属性或者特征的名称
These are all names of attributes or features
那说明属性的名称
The names of attributes
我们一般叫它品质标志
are generally called qualitative characters
说明特征的名称
and the names of features
一般把它叫作数量标志
are generally called quantitative characters
因为特征的名称
because features
想年龄 身高 体重
such as age, height and weight
它是 都有数值表示的
are all expressed as numerical values
那标志的表现 就是我们
The expression of a character is
刚才讲的具体的数值
the specific numerical value
和具体的属性
or specific attribute just mentioned
比如说
For example
性别是男 民族(是)汉
gender is male, ethnicity is Han
这就是标志的表现
These are expressions of characters
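As a rough illustration of characters and their expressions, the sketch below stores one hypothetical student as a mapping from character names to expressions, then labels each character by whether its expression is a number or an attribute. The record, the key names, and the helper `kind_of_character` are assumptions made for this example only; classifying by Python value type is a convenience, not a rule stated in the lecture.

```python
# One unit (a student): keys are characters, values are their expressions.
# All values are hypothetical.
student = {
    "name": "Student X",
    "gender": "male",
    "ethnicity": "Han",
    "height_cm": 172.0,
    "weight_kg": 65.5,
    "age": 20,
}


def kind_of_character(expression) -> str:
    """Label a character by how it is expressed: as a number or as an attribute."""
    is_number = isinstance(expression, (int, float)) and not isinstance(expression, bool)
    return "quantitative" if is_number else "qualitative"


kinds = {character: kind_of_character(value) for character, value in student.items()}

for character, kind in kinds.items():
    print(f"{character}: {student[character]!r} -> {kind}")
```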
那指标它指的是什么
So what is an indicator
指标是说明
An indicator is
总体数量特征的名称和数值
the name and numerical value of a quantitative feature of the population
比如说
For example
要说明中华人民共和国
to describe the People's Republic of China
他的经济总量 经济状况
its total economic output, the state of its economy
那他的总体是我国所有的常住单位
the population is all the resident units of the country
把所有的常住单位的标志值
Take the character values of all resident units
它的标志值是什么呢
What is the character value here
比如说 它的增加值
Say, the value added of each unit
加起来就属于它的指标 GDP
Added together they give the indicator GDP
所以GDP呢 既有GDP的名称
So GDP has both a name, GDP
也有GDP的数值
and a numerical value
比如说
For example
GDP 74万亿人民币
GDP of RMB 74 trillion
那就是一个统计指标
is a statistical indicator
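The step from character values to an indicator can be sketched as a simple aggregation. The unit names and value-added figures below are made up, and summing value added is a deliberate simplification of how national accounts are actually compiled; the point is only that an indicator pairs a name with a value computed over the whole population.

```python
# Hypothetical resident units and their value added (arbitrary monetary units).
value_added = {"Unit A": 120.0, "Unit B": 75.5, "Unit C": 204.5}


def make_indicator(name, character_values):
    """Return an indicator as a (name, value) pair by summing character values."""
    return name, sum(character_values)


indicator = make_indicator("GDP", value_added.values())
print(indicator)
```

The character (value added) belongs to each unit; the indicator (GDP) belongs to the population as a whole.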
统计指标分类比较多
Statistical indicators can be classified in several ways
按它的表现形式分
By form of expression
它有绝对数 绝对指标
there are absolute numbers, or absolute indicators
相对数 相对指标
relative numbers, or relative indicators
有平均数 平均指标
and averages, or average indicators
按它的性质分呢
By nature
有数量指标
there are quantity indicators
有质量指标
and quality indicators
这里把它们分开来的目的呢
The reason for separating them here
是等一下后面我们在讲
is that we will need the distinction later
指数的时候会用到
when we discuss index numbers
像价格指数
such as price indexes
上海证券交易所的价格指数
The price index of the Shanghai Stock Exchange
深圳成份指数
and the Shenzhen Component Index
它(们)就属于质量指标指数
are indexes of quality indicators
而成交量指数 它属于数量指标指数
while a trading volume index is an index of a quantity indicator
那什么叫质量指标
What is a quality indicator
什么叫数量指标呢
and what is a quantity indicator
反映总体深度的指标
An indicator that reflects the depth of the population
那就属于质量指标
is a quality indicator
它跟总体的范围大小没多大关系
It has little to do with the scope of the population
反映总体的广度的指标
An indicator that reflects the breadth of the population
那就属于数量指标
is a quantity indicator
它跟总体的范围大小有直接关系
It is directly tied to the scope of the population
这是第二对概念
This is the second pair of concepts
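A small numerical sketch may help separate indicators of breadth from indicators of depth. It uses invented exam scores and takes a count as the breadth indicator and a mean as the depth indicator; both choices are assumptions made for illustration. Doubling the scope of the population doubles the count but leaves the mean unchanged.

```python
from statistics import mean

# Hypothetical scores for one class, then for two classes of identical composition.
scores_one_class = [60, 70, 80, 90]
scores_two_classes = scores_one_class * 2


def breadth_indicator(values):
    """Breadth of the population: how many units it contains."""
    return len(values)


def depth_indicator(values):
    """Depth of the population: the average level of its units."""
    return mean(values)


for label, scores in [("one class", scores_one_class), ("two classes", scores_two_classes)]:
    print(label, "count =", breadth_indicator(scores), "mean =", depth_indicator(scores))
```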
第三对概念是变量和变异
The third pair of concepts is variable and variation
变异指的是 所有的变动
Variation refers to any kind of change
指标会变 标志也会变
Indicators vary, and so do characters
那变量呢
As for variables
我们原来一般认为
the traditional view is that
变量指的是可变的数量标志
a variable is a quantitative character that can vary
及统计指标
or a statistical indicator
但是你也可以讲
But you can also say
变量是所有的标志和指标
that every character and every indicator
它都是变量
is a variable
因为数量标志会变
because quantitative characters vary
品质标志也会变
and qualitative characters vary too
比如性别 可以变为男 女
Gender, for example, can take the values male and female
我们的单位的属性也会变
and the attributes of our units vary as well
这些都属于变量
They are all variables
变量(按其取值是否连续)可以分为几类
Variables (by whether their values are continuous) fall into
有连续变量和离散变量
continuous variables and discrete variables
这个在统计分组的时候
In statistical grouping
特别是(整理)组距数列的时候
especially when (compiling) class-interval series
特别重要
this distinction is very important
连续变量它是不可分割的
The values of a continuous variable cannot be separated from one another
而离散变量可以一一列举
whereas the values of a discrete variable can be listed one by one
所以大家在这里一定要分清楚
So be sure to keep the two apart here
就是不要受我们的习惯所约束
and do not be bound by habit
大家一般认为
People generally think
年龄是离散变量
age is a discrete variable
体重是离散变量
body weight is a discrete variable
分数是离散变量
and test scores are discrete variables
这个都是不正确的
None of this is correct
比如年龄 它应该是连续变量
Age, for example, is a continuous variable
只是我们在现实生活中
It is just that in everyday life
我们习惯用周岁来讲
we usually state in completed years
某某人的年龄
a person's age
就是说没满一岁为零岁
that is, someone not yet one year old is aged zero
没满两岁为一岁
and someone not yet two is aged one
其实年龄是一个连续的
In fact, age is a continuous
不(间)断的一个变量
uninterrupted variable
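The point about age can be made concrete with a short sketch that computes both a continuous age and the everyday age in completed years. The function names and dates are hypothetical, and the continuous version assumes an average year of 365.25 days, so it is an approximation.

```python
from datetime import date


def exact_age_years(birth: date, today: date) -> float:
    """Approximate continuous age in years (assumes 365.25 days per year)."""
    return (today - birth).days / 365.25


def completed_years(birth: date, today: date) -> int:
    """Age in completed years: a year counts only once the birthday has been reached."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)


birth = date(2020, 3, 15)  # hypothetical date of birth
for today in [date(2021, 3, 14), date(2021, 3, 15), date(2022, 3, 14)]:
    print(today, round(exact_age_years(birth, today), 3), completed_years(birth, today))
```

The continuous age changes every day, while the age in completed years stays fixed between birthdays. The underlying variable is continuous; only the recorded value is rounded down.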
变量还可以进行分类
Variables can also be classified
按它的类型(计量尺度)分为
by type (scale of measurement)
定类(变量) 定序(变量)
into nominal (variables), ordinal (variables)
定距(变量) 定比(变量)
interval (variables) and ratio (variables)
这个分(类)呢
This classification
对以后我们的数据处理非常有用
is very useful for data processing later on
定类(变量)呢
Nominal (variables)
就像我们(前面)讲的
are like what we said (earlier)
人按性别分 最简单的分为男女两类
People classified by sex fall, most simply, into male and female
据说还有一种分成五类
It is said that one scheme uses five categories
这是定类分析(变量)
That is a nominal (variable)
像我们的生物 动物还有植物
Living things, animals and plants
也进行分类
are classified in the same way
还有定序(变量)
Then there are ordinal (variables)
像我们的产品按等级分
for instance products classified by grade
一级 二级 三级
first grade, second grade, third grade
这些属于定序变量
These are ordinal variables
还有定距(变量)
Then there are interval (variables)
定距变量
Interval variables
像我们讲的
are things like
温度 华氏 摄氏
temperature, in Fahrenheit or Celsius
这些属于定距变量
These are interval variables
还有一个叫定比变量
And then there are ratio variables
像我们刚才讲的
like the ones we just mentioned
GDP等等
GDP and so on
大部分(变量)属于定比变量
Most (variables) are ratio variables
定比变量跟前面的那些变量比
Comparing ratio variables with the earlier types
你们(可以)发现
you (can) see that
定比变量可以进行四则运算
ratio variables support all four arithmetic operations
而前面那些变量有些不行
while some of the earlier types do not
大部分不行
in fact most do not
就像男女 你不可能进行四则运算
Male and female, for example, cannot be put through arithmetic
它是定类变量 好 这个是第三对概念
because sex is a nominal variable. Good, that is the third pair of concepts
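One way to see why full arithmetic suits ratio variables but not interval variables is to change the unit of measurement and check which comparisons survive. The sketch below is an illustration with arbitrary temperatures and a hypothetical second GDP figure. A ratio of two temperatures changes when Celsius is converted to Fahrenheit, whereas a comparison of temperature differences does not, and a ratio of two GDP figures is unaffected by the choice of monetary unit.

```python
def celsius_to_fahrenheit(c: float) -> float:
    """Standard conversion between the two temperature scales."""
    return c * 9 / 5 + 32


# Interval scale: arbitrary example temperatures in Celsius.
c_low, c_mid, c_high = 10.0, 20.0, 30.0
f_low, f_mid, f_high = (celsius_to_fahrenheit(c) for c in (c_low, c_mid, c_high))

# "Twice as hot" depends on the scale, so the ratio carries no real meaning.
ratio_celsius = c_mid / c_low
ratio_fahrenheit = f_mid / f_low

# Comparing differences gives the same answer on both scales.
diff_comparison_celsius = (c_high - c_mid) / (c_mid - c_low)
diff_comparison_fahrenheit = (f_high - f_mid) / (f_mid - f_low)

# Ratio scale: GDP in trillions of RMB (the second figure is hypothetical).
gdp_a, gdp_b = 74.0, 37.0
ratio_trillions = gdp_a / gdp_b
ratio_billions = (gdp_a * 1000) / (gdp_b * 1000)

print(ratio_celsius, ratio_fahrenheit)
print(diff_comparison_celsius, diff_comparison_fahrenheit)
print(ratio_trillions, ratio_billions)
```

Nominal and ordinal variables support even less: categories such as male and female can be counted, and grades can be ranked, but neither can meaningfully be added or divided.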
这三对概念之间
These three pairs of concepts
有个关系 大家搞清楚
are related, and you should be clear about how
就是我们的目的是说明总体情况
Our aim is to describe the population
所以一切是围绕着总体来的
so everything revolves around the population
那总体是由什么构成的呢
What is a population made of
是由总体单位构成的
It is made of units
那说明总体单位是什么呢
What describes the units
是标志
Characters
那说明总体的是指标
What describes the population is indicators
那大部分标志汇总就变成了指标
Aggregating most characters yields indicators
指标和标志都属于变量
Both indicators and characters are variables
它既可以分为连续变量
which can be continuous variables
也可以分为离散变量
or discrete variables
所以这三对概念是联系在一起的
So the three pairs of concepts are linked together
好 今天就讲到这里
OK, that is all for today