Dr. Ji Son

Table of Contents

Section 1: Introduction
Descriptive Statistics vs. Inferential Statistics

25m 31s

Intro
0:00
Roadmap
0:10
Roadmap
0:11
Statistics
0:35
Statistics
0:36
Let's Think About High School Science
1:12
Measure and Find Patterns (Mathematical Formula)
1:13
Statistics = Math of Distributions
4:58
Distributions
4:59
Problematic… but also GREAT
5:58
Statistics
7:33
How is It Different from Other Specializations in Mathematics?
7:34
Statistics is Fundamental in Natural and Social Sciences
7:53
Two Skills of Statistics
8:20
Description (Exploration)
8:21
Inference
9:13
Descriptive Statistics vs. Inferential Statistics: Apply to Distributions
9:58
Descriptive Statistics
9:59
Inferential Statistics
11:05
Populations vs. Samples
12:19
Populations vs. Samples: Is it the Truth?
12:20
Populations vs. Samples: Pros & Cons
13:36
Populations vs. Samples: Descriptive Values
16:12
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:10
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:11
Example 1: Descriptive Statistics vs. Inferential Statistics
19:09
Example 2: Descriptive Statistics vs. Inferential Statistics
20:47
Example 3: Sample, Parameter, Population, and Statistic
21:40
Example 4: Sample, Parameter, Population, and Statistic
23:28
Section 2: About Samples: Cases, Variables, Measurements
About Samples: Cases, Variables, Measurements

32m 14s

Intro
0:00
Data
0:09
Data, Cases, Variables, and Values
0:10
Rows, Columns, and Cells
2:03
Example: Aircraft
3:52
How Do We Get Data?
5:38
Research: Question and Hypothesis
5:39
Research Design
7:11
Measurement
7:29
Research Analysis
8:33
Research Conclusion
9:30
Types of Variables
10:03
Discrete Variables
10:04
Continuous Variables
12:07
Types of Measurements
14:17
Types of Measurements
14:18
Types of Measurements (Scales)
17:22
Nominal
17:23
Ordinal
19:11
Interval
21:33
Ratio
24:24
Example 1: Cases, Variables, Measurements
25:20
Example 2: Which Scale of Measurement is Used?
26:55
Example 3: What Kind of a Scale of Measurement is This?
27:26
Example 4: Discrete vs. Continuous Variables
30:31
Section 3: Visualizing Distributions
Introduction to Excel

8m 9s

Intro
0:00
Before Visualizing Distribution
0:10
Excel
0:11
Excel: Organization
0:45
Workbook
0:46
Column x Rows
1:50
Tools: Menu Bar, Standard Toolbar, and Formula Bar
3:00
Excel + Data
6:07
Excel and Data
6:08
Frequency Distributions in Excel

39m 10s

Intro
0:00
Roadmap
0:08
Data in Excel and Frequency Distributions
0:09
Raw Data to Frequency Tables
0:42
Raw Data to Frequency Tables
0:43
Frequency Tables: Using Formulas and Pivot Tables
1:28
Example 1: Number of Births
7:17
Example 2: Age Distribution
20:41
Example 3: Height Distribution
27:45
Example 4: Height Distribution of Males
32:19
Frequency Distributions and Features

25m 29s

Intro
0:00
Roadmap
0:10
Data in Excel, Frequency Distributions, and Features of Frequency Distributions
0:11
Example #1
1:35
Uniform
1:36
Example #2
2:58
Unimodal, Skewed Right, and Asymmetric
2:59
Example #3
6:29
Bimodal
6:30
Example #4a
8:29
Symmetric, Unimodal, and Normal
8:30
Point of Inflection and Standard Deviation
11:13
Example #4b
12:43
Normal Distribution
12:44
Summary
13:56
Uniform, Skewed, Bimodal, and Normal
13:57
Sketch Problem 1: Driver's License
17:34
Sketch Problem 2: Life Expectancy
20:01
Sketch Problem 3: Telephone Numbers
22:01
Sketch Problem 4: Length of Time Used to Complete a Final Exam
23:43
Dotplots and Histograms in Excel

42m 42s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Previously
1:02
Data, Frequency Table, and Visualization
1:03
Dotplots
1:22
Dotplots Excel Example
1:23
Dotplots: Pros and Cons
7:22
Pros and Cons of Dotplots
7:23
Dotplots Excel Example Cont.
9:07
Histograms
12:47
Histograms Overview
12:48
Example of Histograms
15:29
Histograms: Pros and Cons
31:39
Pros
31:40
Cons
32:31
Frequency vs. Relative Frequency
32:53
Frequency
32:54
Relative Frequency
33:36
Example 1: Dotplots vs. Histograms
34:36
Example 2: Age of Pennies Dotplot
36:21
Example 3: Histogram of Mammal Speeds
38:27
Example 4: Histogram of Life Expectancy
40:30
Stemplots

12m 23s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
What Sets Stemplots Apart?
0:46
Data Sets, Dotplots, Histograms, and Stemplots
0:47
Example 1: What Do Stemplots Look Like?
1:58
Example 2: Back-to-Back Stemplots
5:00
Example 3: Quiz Grade Stemplot
7:46
Example 4: Quiz Grade & Afterschool Tutoring Stemplot
9:56
Bar Graphs

22m 49s

Intro
0:00
Roadmap
0:05
Roadmap
0:08
Review of Frequency Distributions
0:44
Y-axis and X-axis
0:45
Types of Frequency Visualizations Covered so Far
2:16
Introduction to Bar Graphs
4:07
Example 1: Bar Graph
5:32
Example 1: Bar Graph
5:33
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:07
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:08
Example 2: Create a Frequency Visualization for Gender
14:02
Example 3: Cases, Variables, and Frequency Visualization
16:34
Example 4: What Kind of Graphs are Shown Below?
19:29
Section 4: Summarizing Distributions
Central Tendency: Mean, Median, Mode

38m 50s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
Central Tendency 1
0:56
Way to Summarize a Distribution of Scores
0:57
Mode
1:32
Median
2:02
Mean
2:36
Central Tendency 2
3:47
Mode
3:48
Median
4:20
Mean
5:25
Summation Symbol
6:11
Summation Symbol
6:12
Population vs. Sample
10:46
Population vs. Sample
10:47
Excel Examples
15:08
Finding Mode, Median, and Mean in Excel
15:09
Median vs. Mean
21:45
Effect of Outliers
21:46
Relationship Between Parameter and Statistic
22:44
Type of Measurements
24:00
Which Distributions to Use With
24:55
Example 1: Mean
25:30
Example 2: Using Summation Symbol
29:50
Example 3: Average Calorie Count
32:50
Example 4: Creating an Example Set
35:46
Variability

42m 40s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Variability (or Spread)
0:45
Variability (or Spread)
0:46
Things to Think About
5:45
Things to Think About
5:46
Range, Quartiles and Interquartile Range
6:37
Range
6:38
Interquartile Range
8:42
Interquartile Range Example
10:58
Interquartile Range Example
10:59
Variance and Standard Deviation
12:27
Deviations
12:28
Sum of Squares
14:35
Variance
16:55
Standard Deviation
17:44
Sum of Squares (SS)
18:34
Sum of Squares (SS)
18:35
Population vs. Sample SD
22:00
Population vs. Sample SD
22:01
Population vs. Sample
23:20
Mean
23:21
SD
23:51
Example 1: Find the Mean and Standard Deviation of the Variable Friends in the Excel File
27:21
Example 2: Find the Mean and Standard Deviation of the Tagged Photos in the Excel File
35:25
Example 3: Sum of Squares
38:58
Example 4: Standard Deviation
41:48
Five Number Summary & Boxplots

57m 15s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Summarizing Distributions
0:37
Shape, Center, and Spread
0:38
5 Number Summary
1:14
Boxplot: Visualizing 5 Number Summary
3:37
Boxplot: Visualizing 5 Number Summary
3:38
Boxplots on Excel
9:01
Using 'Stocks' and Using Stacked Columns
9:02
Boxplots on Excel Example
10:14
When are Boxplots Useful?
32:14
Pros
32:15
Cons
32:59
How to Determine Outlier Status
33:24
Rule of Thumb: Upper Limit
33:25
Rule of Thumb: Lower Limit
34:16
Signal Outliers in an Excel Data File Using Conditional Formatting
34:52
Modified Boxplot
48:38
Modified Boxplot
48:39
Example 1: Percentage Values & Lower and Upper Whisker
49:10
Example 2: Boxplot
50:10
Example 3: Estimating IQR From Boxplot
53:46
Example 4: Boxplot and Missing Whisker
54:35
Shape: Calculating Skewness & Kurtosis

41m 51s

Intro
0:00
Roadmap
0:16
Roadmap
0:17
Skewness Concept
1:09
Skewness Concept
1:10
Calculating Skewness
3:26
Calculating Skewness
3:27
Interpreting Skewness
7:36
Interpreting Skewness
7:37
Excel Example
8:49
Kurtosis Concept
20:29
Kurtosis Concept
20:30
Calculating Kurtosis
24:17
Calculating Kurtosis
24:18
Interpreting Kurtosis
29:01
Leptokurtic
29:35
Mesokurtic
30:10
Platykurtic
31:06
Excel Example
32:04
Example 1: Shape of Distribution
38:28
Example 2: Shape of Distribution
39:29
Example 3: Shape of Distribution
40:14
Example 4: Kurtosis
41:10
Normal Distribution

34m 33s

Intro
0:00
Roadmap
0:13
Roadmap
0:14
What is a Normal Distribution
0:44
The Normal Distribution As a Theoretical Model
0:45
Possible Range of Probabilities
3:05
Possible Range of Probabilities
3:06
What is a Normal Distribution
5:07
Can Be Described By
5:08
Properties
5:49
'Same' Shape: Illusion of Different Shape!
7:35
'Same' Shape: Illusion of Different Shape!
7:36
Types of Problems
13:45
Example: Distribution of SAT Scores
13:46
Shape Analogy
19:48
Shape Analogy
19:49
Example 1: The Standard Normal Distribution and Z-Scores
22:34
Example 2: The Standard Normal Distribution and Z-Scores
25:54
Example 3: Sketching a Normal Distribution
28:55
Example 4: Sketching a Normal Distribution
32:32
Standard Normal Distributions & Z-Scores

41m 44s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
A Family of Distributions
0:28
Infinite Set of Distributions
0:29
Transforming Normal Distributions to 'Standard' Normal Distribution
1:04
Normal Distribution vs. Standard Normal Distribution
2:58
Normal Distribution vs. Standard Normal Distribution
2:59
Z-Score, Raw Score, Mean, & SD
4:08
Z-Score, Raw Score, Mean, & SD
4:09
Weird Z-Scores
9:40
Weird Z-Scores
9:41
Excel
16:45
For Normal Distributions
16:46
For Standard Normal Distributions
19:11
Excel Example
20:24
Types of Problems
25:18
Percentage Problem: P(x)
25:19
Raw Score and Z-Score Problems
26:28
Standard Deviation Problems
27:01
Shape Analogy
27:44
Shape Analogy
27:45
Example 1: Deaths Due to Heart Disease vs. Deaths Due to Cancer
28:24
Example 2: Heights of Male College Students
33:15
Example 3: Mean and Standard Deviation
37:14
Example 4: Finding Percentage of Values in a Standard Normal Distribution
37:49
Normal Distribution: PDF vs. CDF

55m 44s

Intro
0:00
Roadmap
0:15
Roadmap
0:16
Frequency vs. Cumulative Frequency
0:56
Frequency vs. Cumulative Frequency
0:57
Frequency vs. Cumulative Frequency
4:32
Frequency vs. Cumulative Frequency Cont.
4:33
Calculus in Brief
6:21
Derivative-Integral Continuum
6:22
PDF
10:08
PDF for Standard Normal Distribution
10:09
PDF for Normal Distribution
14:32
Integral of PDF = CDF
21:27
Integral of PDF = CDF
21:28
Example 1: Cumulative Frequency Graph
23:31
Example 2: Mean, Standard Deviation, and Probability
24:43
Example 3: Mean and Standard Deviation
35:50
Example 4: Age of Cars
49:32
Section 5: Linear Regression
Scatterplots

47m 19s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
Previous Visualizations
0:30
Frequency Distributions
0:31
Compare & Contrast
2:26
Frequency Distributions Vs. Scatterplots
2:27
Summary Values
4:53
Shape
4:54
Center & Trend
6:41
Spread & Strength
8:22
Univariate & Bivariate
10:25
Example Scatterplot
10:48
Shape, Trend, and Strength
10:49
Positive and Negative Association
14:05
Positive and Negative Association
14:06
Linearity, Strength, and Consistency
18:30
Linearity
18:31
Strength
19:14
Consistency
20:40
Summarizing a Scatterplot
22:58
Summarizing a Scatterplot
22:59
Example 1: Gapminder.org, Income x Life Expectancy
26:32
Example 2: Gapminder.org, Income x Infant Mortality
36:12
Example 3: Trend and Strength of Variables
40:14
Example 4: Trend, Strength and Shape for Scatterplots
43:27
Regression

32m 2s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Linear Equations
0:34
Linear Equations: y = mx + b
0:35
Rough Line
5:16
Rough Line
5:17
Regression - A 'Center' Line
7:41
Reasons for Summarizing with a Regression Line
7:42
Predictor and Response Variable
10:04
Goal of Regression
12:29
Goal of Regression
12:30
Prediction
14:50
Example: Servings of Milk Per Year Shown By Age
14:51
Interpolation
17:06
Extrapolation
17:58
Error in Prediction
20:34
Prediction Error
20:35
Residual
21:40
Example 1: Residual
23:34
Example 2: Large and Negative Residual
26:30
Example 3: Positive Residual
28:13
Example 4: Interpret Regression Line & Extrapolate
29:40
Least Squares Regression

56m 36s

Intro
0:00
Roadmap
0:13
Roadmap
0:14
Best Fit
0:47
Best Fit
0:48
Sum of Squared Errors (SSE)
1:50
Sum of Squared Errors (SSE)
1:51
Why Squared?
3:38
Why Squared?
3:39
Quantitative Properties of Regression Line
4:51
Quantitative Properties of Regression Line
4:52
So How do we Find Such a Line?
6:49
SSEs of Different Line Equations & Lowest SSE
6:50
Carl Gauss' Method
8:01
How Do We Find Slope (b1)
11:00
How Do We Find Slope (b1)
11:01
How Do We Find Intercept
15:11
How Do We Find Intercept
15:12
Example 1: Which of These Equations Fit the Above Data Best?
17:18
Example 2: Find the Regression Line for These Data Points and Interpret It
26:31
Example 3: Summarize the Scatterplot and Find the Regression Line
34:31
Example 4: Examine the Mean of Residuals
43:52
Correlation

43m 58s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Summarizing a Scatterplot Quantitatively
0:47
Shape
0:48
Trend
1:11
Strength: Correlation (r)
1:45
Correlation Coefficient ( r )
2:30
Correlation Coefficient ( r )
2:31
Trees vs. Forest
11:59
Trees vs. Forest
12:00
Calculating r
15:07
Average Product of z-scores for x and y
15:08
Relationship between Correlation and Slope
21:10
Relationship between Correlation and Slope
21:11
Example 1: Find the Correlation between Grams of Fat and Cost
24:11
Example 2: Relationship between r and b1
30:24
Example 3: Find the Regression Line
33:35
Example 4: Find the Correlation Coefficient for this Set of Data
37:37
Correlation: r vs. r-squared

52m 52s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
R-squared
0:44
What is the Meaning of It? Why Squared?
0:45
Parsing Sum of Squares (Parsing Variability)
2:25
SST = SSR + SSE
2:26
What is SST and SSE?
7:46
What is SST and SSE?
7:47
r-squared
18:33
Coefficient of Determination
18:34
If the Correlation is Strong…
20:25
If the Correlation is Strong…
20:26
If the Correlation is Weak…
22:36
If the Correlation is Weak…
22:37
Example 1: Find r-squared for this Set of Data
23:56
Example 2: What Does it Mean that the Simple Linear Regression is a 'Model' of Variance?
33:54
Example 3: Why Does r-squared Only Range from 0 to 1
37:29
Example 4: Find the r-squared for This Set of Data
39:55
Transformations of Data

27m 8s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Why Transform?
0:26
Why Transform?
0:27
Shape-preserving vs. Shape-changing Transformations
5:14
Shape-preserving = Linear Transformations
5:15
Shape-changing Transformations = Non-linear Transformations
6:20
Common Shape-Preserving Transformations
7:08
Common Shape-Preserving Transformations
7:09
Common Shape-Changing Transformations
8:59
Powers
9:00
Logarithms
9:39
Change Just One Variable? Both?
10:38
Log-log Transformations
10:39
Log Transformations
14:38
Example 1: Create, Graph, and Transform the Data Set
15:19
Example 2: Create, Graph, and Transform the Data Set
20:08
Example 3: What Kind of Model would You Choose for this Data?
22:44
Example 4: Transformation of Data
25:46
Section 6: Collecting Data in an Experiment
Sampling & Bias

54m 44s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Descriptive vs. Inferential Statistics
1:04
Descriptive Statistics: Data Exploration
1:05
Example
2:03
To Tackle Generalization…
4:31
Generalization
4:32
Sampling
6:06
'Good' Sample
6:40
Defining Samples and Populations
8:55
Population
8:56
Sample
11:16
Why Use Sampling?
13:09
Why Use Sampling?
13:10
Goal of Sampling: Avoiding Bias
15:04
What is Bias?
15:05
Where does Bias Come from: Sampling Bias
17:53
Where does Bias Come from: Response Bias
18:27
Sampling Bias: Bias from 'Bad' Sampling Methods
19:34
Size Bias
19:35
Voluntary Response Bias
21:13
Convenience Sample
22:22
Judgment Sample
23:58
Inadequate Sample Frame
25:40
Response Bias: Bias from 'Bad' Data Collection Methods
28:00
Nonresponse Bias
29:31
Questionnaire Bias
31:10
Incorrect Response or Measurement Bias
37:32
Example 1: What Kind of Biases?
40:29
Example 2: What Biases Might Arise?
44:46
Example 3: What Kind of Biases?
48:34
Example 4: What Kind of Biases?
51:43
Sampling Methods

14m 25s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Biased vs. Unbiased Sampling Methods
0:32
Biased Sampling
0:33
Unbiased Sampling
1:13
Probability Sampling Methods
2:31
Simple Random
2:54
Stratified Random Sampling
4:06
Cluster Sampling
5:24
Two-staged Sampling
6:22
Systematic Sampling
7:25
Example 1: Which Type(s) of Sampling was this?
8:33
Example 2: Describe How to Take a Two-Stage Sample from this Book
10:16
Example 3: Sampling Methods
11:58
Example 4: Cluster Sample Plan
12:48
Research Design

53m 54s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Descriptive vs. Inferential Statistics
0:51
Descriptive Statistics: Data Exploration
0:52
Inferential Statistics
1:02
Variables and Relationships
1:44
Variables
1:45
Relationships
2:49
Not Every Type of Study is an Experiment…
4:16
Category I - Descriptive Study
4:54
Category II - Correlational Study
5:50
Category III - Experimental, Quasi-experimental, Non-experimental
6:33
Category III
7:42
Experimental, Quasi-experimental, and Non-experimental
7:43
Why CAN'T the Other Strategies Determine Causation?
10:18
Third-variable Problem
10:19
Directionality Problem
15:49
What Makes Experiments Special?
17:54
Manipulation
17:55
Control (and Comparison)
21:58
Methods of Control
26:38
Holding Constant
26:39
Matching
29:11
Random Assignment
31:48
Experiment Terminology
34:09
'true' Experiment vs. Study
34:10
Independent Variable (IV)
35:16
Dependent Variable (DV)
35:45
Factors
36:07
Treatment Conditions
36:23
Levels
37:43
Confounds or Extraneous Variables
38:04
Blind
38:38
Blind Experiments
38:39
Double-blind Experiments
39:29
How Categories Relate to Statistics
41:35
Category I - Descriptive Study
41:36
Category II - Correlational Study
42:05
Category III - Experimental, Quasi-experimental, Non-experimental
42:43
Example 1: Research Design
43:50
Example 2: Research Design
47:37
Example 3: Research Design
50:12
Example 4: Research Design
52:00
Between and Within Treatment Variability

41m 31s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Experimental Designs
0:51
Experimental Designs: Manipulation & Control
0:52
Two Types of Variability
2:09
Between Treatment Variability
2:10
Within Treatment Variability
3:31
Updated Goal of Experimental Design
5:47
Updated Goal of Experimental Design
5:48
Example: Drugs and Driving
6:56
Example: Drugs and Driving
6:57
Different Types of Random Assignment
11:27
All Experiments
11:28
Completely Random Design
12:02
Randomized Block Design
13:19
Randomized Block Design
15:48
Matched Pairs Design
15:49
Repeated Measures Design
19:47
Between-subject Variable vs. Within-subject Variable
22:43
Completely Randomized Design
22:44
Repeated Measures Design
25:03
Example 1: Design a Completely Random, Matched Pair, and Repeated Measures Experiment
26:16
Example 2: Block Design
31:41
Example 3: Completely Randomized Designs
35:11
Example 4: Completely Random, Matched Pairs, or Repeated Measures Experiments?
39:01
Section 7: Review of Probability Axioms
Sample Spaces

37m 52s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
Why is Probability Involved in Statistics?
0:48
Probability
0:49
Can People Tell the Difference between Cheap and Gourmet Coffee?
2:08
Taste Test with Coffee Drinkers
3:37
If No One can Actually Taste the Difference
3:38
If Everyone can Actually Taste the Difference
5:36
Creating a Probability Model
7:09
Creating a Probability Model
7:10
D'Alembert vs. Necker
9:41
D'Alembert vs. Necker
9:42
Problem with D'Alembert's Model
13:29
Problem with D'Alembert's Model
13:30
Covering Entire Sample Space
15:08
Fundamental Principle of Counting
15:09
Where Do Probabilities Come From?
22:54
Observed Data, Symmetry, and Subjective Estimates
22:55
Checking whether Model Matches Real World
24:27
Law of Large Numbers
24:28
Example 1: Law of Large Numbers
27:46
Example 2: Possible Outcomes
30:43
Example 3: Brands of Coffee and Taste
33:25
Example 4: How Many Different Treatments are there?
35:33
Addition Rule for Disjoint Events

20m 29s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Disjoint Events
0:41
Disjoint Events
0:42
Meaning of 'or'
2:39
In Regular Life
2:40
In Math/Statistics/Computer Science
3:10
Addition Rule for Disjoint Events
3:55
If A and B are Disjoint: P (A and B)
3:56
If A and B are Disjoint: P (A or B)
5:15
General Addition Rule
5:41
General Addition Rule
5:42
Generalized Addition Rule
8:31
If A and B are not Disjoint: P (A or B)
8:32
Example 1: Which of These are Mutually Exclusive?
10:50
Example 2: What is the Probability that You will Have a Combination of One Heads and Two Tails?
12:57
Example 3: Engagement Party
15:17
Example 4: Home Owner's Insurance
18:30
Conditional Probability

57m 19s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
'or' vs. 'and' vs. Conditional Probability
1:07
'or' vs. 'and' vs. Conditional Probability
1:08
'and' vs. Conditional Probability
5:57
P (M or L)
5:58
P (M and L)
8:41
P (M|L)
11:04
P (L|M)
12:24
Tree Diagram
15:02
Tree Diagram
15:03
Defining Conditional Probability
22:42
Defining Conditional Probability
22:43
Common Contexts for Conditional Probability
30:56
Medical Testing: Positive Predictive Value
30:57
Medical Testing: Sensitivity
33:03
Statistical Tests
34:27
Example 1: Drug and Disease
36:41
Example 2: Marbles and Conditional Probability
40:04
Example 3: Cards and Conditional Probability
45:59
Example 4: Votes and Conditional Probability
50:21
Independent Events

24m 27s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Independent Events & Conditional Probability
0:26
Non-independent Events
0:27
Independent Events
2:00
Non-independent and Independent Events
3:08
Non-independent and Independent Events
3:09
Defining Independent Events
5:52
Defining Independent Events
5:53
Multiplication Rule
7:29
Previously…
7:30
But with Independent Events
8:53
Example 1: Which of These Pairs of Events are Independent?
11:12
Example 2: Health Insurance and Probability
15:12
Example 3: Independent Events
17:42
Example 4: Independent Events
20:03
Section 8: Probability Distributions
Introduction to Probability Distributions

56m 45s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Sampling vs. Probability
0:57
Sampling
0:58
Missing
1:30
What is Missing?
3:06
Insight: Probability Distributions
5:26
Insight: Probability Distributions
5:27
What is a Probability Distribution?
7:29
From Sample Spaces to Probability Distributions
8:44
Sample Space
8:45
Probability Distribution of the Sum of Two Dice
11:16
The Random Variable
17:43
The Random Variable
17:44
Expected Value
21:52
Expected Value
21:53
Example 1: Probability Distributions
28:45
Example 2: Probability Distributions
35:30
Example 3: Probability Distributions
43:37
Example 4: Probability Distributions
47:20
Expected Value & Variance of Probability Distributions

53m 41s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Discrete vs. Continuous Random Variables
1:04
Discrete vs. Continuous Random Variables
1:05
Mean and Variance Review
4:44
Mean: Sample, Population, and Probability Distribution
4:45
Variance: Sample, Population, and Probability Distribution
9:12
Example Situation
14:10
Example Situation
14:11
Some Special Cases…
16:13
Some Special Cases…
16:14
Linear Transformations
19:22
Linear Transformations
19:23
What Happens to Mean and Variance of the Probability Distribution?
20:12
n Independent Values of X
25:38
n Independent Values of X
25:39
Compare These Two Situations
30:56
Compare These Two Situations
30:57
Two Random Variables, X and Y
32:02
Two Random Variables, X and Y
32:03
Example 1: Expected Value & Variance of Probability Distributions
35:35
Example 2: Expected Values & Standard Deviation
44:17
Example 3: Expected Winnings and Standard Deviation
48:18
Binomial Distribution

55m 15s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Discrete Probability Distributions
1:42
Discrete Probability Distributions
1:43
Binomial Distribution
2:36
Binomial Distribution
2:37
Multiplicative Rule Review
6:54
Multiplicative Rule Review
6:55
How Many Outcomes with k 'Successes'
10:23
Adults and Bachelor's Degree: Manual List of Outcomes
10:24
P (X=k)
19:37
Putting Together # of Outcomes with the Multiplicative Rule
19:38
Expected Value and Standard Deviation in a Binomial Distribution
25:22
Expected Value and Standard Deviation in a Binomial Distribution
25:23
Example 1: Coin Toss
33:42
Example 2: College Graduates
38:03
Example 3: Types of Blood and Probability
45:39
Example 4: Expected Number and Standard Deviation
51:11
Section 9: Sampling Distributions of Statistics
Introduction to Sampling Distributions

48m 17s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Probability Distributions vs. Sampling Distributions
0:55
Probability Distributions vs. Sampling Distributions
0:56
Same Logic
3:55
Logic of Probability Distribution
3:56
Example: Rolling Two Dice
6:56
Simulating Samples
9:53
To Come Up with Probability Distributions
9:54
In Sampling Distributions
11:12
Connecting Sampling and Research Methods with Sampling Distributions
12:11
Connecting Sampling and Research Methods with Sampling Distributions
12:12
Simulating a Sampling Distribution
14:14
Experimental Design: Regular Sleep vs. Less Sleep
14:15
Logic of Sampling Distributions
23:08
Logic of Sampling Distributions
23:09
General Method of Simulating Sampling Distributions
25:38
General Method of Simulating Sampling Distributions
25:39
Questions that Remain
28:45
Questions that Remain
28:46
Example 1: Mean and Standard Error of Sampling Distribution
30:57
Example 2: What is the Best Way to Describe Sampling Distributions?
37:12
Example 3: Matching Sampling Distributions
38:21
Example 4: Mean and Standard Error of Sampling Distribution
41:51
Sampling Distribution of the Mean

1h 8m 48s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Special Case of General Method for Simulating a Sampling Distribution
1:53
Special Case of General Method for Simulating a Sampling Distribution
1:54
Computer Simulation
3:43
Using Simulations to See Principles behind Shape of SDoM
15:50
Using Simulations to See Principles behind Shape of SDoM
15:51
Conditions
17:38
Using Simulations to See Principles behind Center (Mean) of SDoM
20:15
Using Simulations to See Principles behind Center (Mean) of SDoM
20:16
Conditions: Does n Matter?
21:31
Conditions: Does Number of Simulations Matter?
24:37
Using Simulations to See Principles behind Standard Deviation of SDoM
27:13
Using Simulations to See Principles behind Standard Deviation of SDoM
27:14
Conditions: Does n Matter?
34:45
Conditions: Does Number of Simulations Matter?
36:24
Central Limit Theorem
37:13
SHAPE
38:08
CENTER
39:34
SPREAD
39:52
Comparing Population, Sample, and SDoM
43:10
Comparing Population, Sample, and SDoM
43:11
Answering the 'Questions that Remain'
48:24
What Happens When We Don't Know What the Population Looks Like?
48:25
Can We Have Sampling Distributions for Summary Statistics Other than the Mean?
49:42
How Do We Know whether a Sample is Sufficiently Unlikely?
53:36
Do We Always Have to Simulate a Large Number of Samples in Order to get a Sampling Distribution?
54:40
Example 1: Mean Batting Average
55:25
Example 2: Mean Sampling Distribution and Standard Error
59:07
Example 3: Sampling Distribution of the Mean
1:01:04
Sampling Distribution of Sample Proportions

54m 37s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Intro to Sampling Distribution of Sample Proportions (SDoSP)
0:51
Categorical Data (Examples)
0:52
Wish to Estimate Proportion of Population from Sample…
2:00
Notation
3:34
Population Proportion and Sample Proportion Notations
3:35
What's the Difference?
9:19
SDoM vs. SDoSP: Type of Data
9:20
SDoM vs. SDoSP: Shape
11:24
SDoM vs. SDoSP: Center
12:30
SDoM vs. SDoSP: Spread
15:34
Binomial Distribution vs. Sampling Distribution of Sample Proportions
19:14
Binomial Distribution vs. SDoSP: Type of Data
19:17
Binomial Distribution vs. SDoSP: Shape
21:07
Binomial Distribution vs. SDoSP: Center
21:43
Binomial Distribution vs. SDoSP: Spread
24:08
Example 1: Sampling Distribution of Sample Proportions
26:07
Example 2: Sampling Distribution of Sample Proportions
37:58
Example 3: Sampling Distribution of Sample Proportions
44:42
Example 4: Sampling Distribution of Sample Proportions
45:57
Section 10: Inferential Statistics
Introduction to Confidence Intervals

42m 53s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Inferential Statistics
0:50
Inferential Statistics
0:51
Two Problems with This Picture…
3:20
Two Problems with This Picture…
3:21
Solution: Confidence Intervals (CI)
4:59
Solution: Hypothesis Testing (HT)
5:49
Which Parameters are Known?
6:45
Which Parameters are Known?
6:46
Confidence Interval - Goal
7:56
When We Don't Know μ but Know σ
7:57
When We Don't Know μ nor σ
18:27
When We Don't Know μ nor σ
18:28
Example 1: Confidence Intervals
26:18
Example 2: Confidence Intervals
29:46
Example 3: Confidence Intervals
32:18
Example 4: Confidence Intervals
38:31
t Distributions

1h 2m 6s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
When to Use z vs. t?
1:07
When to Use z vs. t?
1:08
What is z and t?
3:02
z-score and t-score: Commonality
3:03
z-score and t-score: Formulas
3:34
z-score and t-score: Difference
5:22
Why not z? (Why t?)
7:24
Why not z? (Why t?)
7:25
But Don't Worry!
15:13
Gossett and t-distributions
15:14
Rules of t Distributions
17:05
t-distributions are More Normal as n Gets Bigger
17:06
t-distributions are a Family of Distributions
18:55
Degrees of Freedom (df)
20:02
Degrees of Freedom (df)
20:03
t Family of Distributions
24:07
t Family of Distributions: df = 2, 4, and 60
24:08
df = 60
29:16
df = 2
29:59
How to Find It?
31:01
'Student's t-distribution' or 't-distribution'
31:02
Excel Example
33:06
Example 1: Which Distribution Do You Use? Z or t?
45:26
Example 2: Friends on Facebook
47:41
Example 3: t Distributions
52:15
Example 4: t Distributions, Confidence Interval, and Mean
55:59
Introduction to Hypothesis Testing

1h 6m 33s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Issues to Overcome in Inferential Statistics
1:35
Issues to Overcome in Inferential Statistics
1:36
What Happens When We Don't Know What the Population Looks Like?
2:57
How Do We Know Whether a Sample is Sufficiently Unlikely?
3:43
Hypothesizing a Population
6:44
Hypothesizing a Population
6:45
Null Hypothesis
8:07
Alternative Hypothesis
8:56
Hypotheses
11:58
Hypotheses
11:59
Errors in Hypothesis Testing
14:22
Errors in Hypothesis Testing
14:23
Steps of Hypothesis Testing
21:15
Steps of Hypothesis Testing
21:16
Single Sample HT (When Sigma Available)
26:08
Example: Average Facebook Friends
26:09
Step 1
27:08
Step 2
27:58
Step 3
28:17
Step 4
32:18
Single Sample HT (When Sigma Not Available)
36:33
Example: Average Facebook Friends
36:34
Step 1: Hypothesis Testing
36:58
Step 2: Significance Level
37:25
Step 3: Decision Stage
37:40
Step 4: Sample
41:36
Sigma and p-value
45:04
Sigma and p-value
45:05
One-tailed vs. Two-tailed Hypotheses
45:51
Example 1: Hypothesis Testing
48:37
Example 2: Heights of Women in the US
57:43
Example 3: Select the Best Way to Complete This Sentence
1:03:23
Confidence Intervals for the Difference of Two Independent Means

55m 14s

Intro
0:00
Roadmap
0:14
Roadmap
0:15
One Mean vs. Two Means
1:17
One Mean vs. Two Means
1:18
Notation
2:41
A Sample! A Set!
2:42
Mean of X, Mean of Y, and Difference of Two Means
3:56
SE of X
4:34
SE of Y
6:28
Sampling Distribution of the Difference between Two Means (SDoD)
7:48
Sampling Distribution of the Difference between Two Means (SDoD)
7:49
Rules of the SDoD (similar to CLT!)
15:00
Mean for the SDoD Null Hypothesis
15:01
Standard Error
17:39
When can We Construct a CI for the Difference between Two Means?
21:28
Three Conditions
21:29
Finding CI
23:56
One Mean CI
23:57
Two Means CI
25:45
Finding t
29:16
Finding t
29:17
Interpreting CI
30:25
Interpreting CI
30:26
Better Estimate of s (s pool)
34:15
Better Estimate of s (s pool)
34:16
Example 1: Confidence Intervals
42:32
Example 2: SE of the Difference
52:36
Hypothesis Testing for the Difference of Two Independent Means

50m

Intro
0:00
Roadmap
0:06
Roadmap
0:07
The Goal of Hypothesis Testing
0:56
One Sample and Two Samples
0:57
Sampling Distribution of the Difference between Two Means (SDoD)
3:42
Sampling Distribution of the Difference between Two Means (SDoD)
3:43
Rules of the SDoD (Similar to CLT!)
6:46
Shape
6:47
Mean for the Null Hypothesis
7:26
Standard Error for Independent Samples (When Variance is Homogenous)
8:18
Standard Error for Independent Samples (When Variance is not Homogenous)
9:25
Same Conditions for HT as for CI
10:08
Three Conditions
10:09
Steps of Hypothesis Testing
11:04
Steps of Hypothesis Testing
11:05
Formulas that Go with Steps of Hypothesis Testing
13:21
Step 1
13:25
Step 2
14:18
Step 3
15:00
Step 4
16:57
Example 1: Hypothesis Testing for the Difference of Two Independent Means
18:47
Example 2: Hypothesis Testing for the Difference of Two Independent Means
33:55
Example 3: Hypothesis Testing for the Difference of Two Independent Means
44:22
Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

1h 14m 11s

Intro
0:00
Roadmap
0:09
Roadmap
0:10
The Goal of Hypothesis Testing
1:27
One Sample and Two Samples
1:28
Independent Samples vs. Paired Samples
3:16
Independent Samples vs. Paired Samples
3:17
Which is Which?
5:20
Independent SAMPLES vs. Independent VARIABLES
7:43
Independent SAMPLES vs. Independent VARIABLES
7:44
T-tests Always…
10:48
T-tests Always…
10:49
Notation for Paired Samples
12:59
Notation for Paired Samples
13:00
Steps of Hypothesis Testing for Paired Samples
16:13
Steps of Hypothesis Testing for Paired Samples
16:14
Rules of the SDoD (Adding on Paired Samples)
18:03
Shape
18:04
Mean for the Null Hypothesis
18:31
Standard Error for Independent Samples (When Variance is Homogenous)
19:25
Standard Error for Paired Samples
20:39
Formulas that go with Steps of Hypothesis Testing
22:59
Formulas that go with Steps of Hypothesis Testing
23:00
Confidence Intervals for Paired Samples
30:32
Confidence Intervals for Paired Samples
30:33
Example 1: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
32:28
Example 2: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
44:02
Example 3: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
52:23
Type I and Type II Errors

31m 27s

Intro
0:00
Roadmap
0:18
Roadmap
0:19
Errors and Relationship to HT and the Sample Statistic?
1:11
Errors and Relationship to HT and the Sample Statistic?
1:12
Instead of a Box…Distributions!
7:00
One Sample t-test: Friends on Facebook
7:01
Two Sample t-test: Friends on Facebook
13:46
Usually, Lots of Overlap between Null and Alternative Distributions
16:59
Overlap between Null and Alternative Distributions
17:00
How Distributions and 'Box' Fit Together
22:45
How Distributions and 'Box' Fit Together
22:46
Example 1: Types of Errors
25:54
Example 2: Types of Errors
27:30
Example 3: What is the Danger of the Type I Error?
29:38
Effect Size & Power

44m 41s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Distance between Distributions: Sample t
0:49
Distance between Distributions: Sample t
0:50
Problem with Distance in Terms of Standard Error
2:56
Problem with Distance in Terms of Standard Error
2:57
Test Statistic (t) vs. Effect Size (d or g)
4:38
Test Statistic (t) vs. Effect Size (d or g)
4:39
Rules of Effect Size
6:09
Rules of Effect Size
6:10
Why Do We Need Effect Size?
8:21
Tells You the Practical Significance
8:22
HT can be Deceiving…
10:25
Important Note
10:42
What is Power?
11:20
What is Power?
11:21
Why Do We Need Power?
14:19
Conditional Probability and Power
14:20
Power is:
16:27
Can We Calculate Power?
19:00
Can We Calculate Power?
19:01
How Does Alpha Affect Power?
20:36
How Does Alpha Affect Power?
20:37
How Does Effect Size Affect Power?
25:38
How Does Effect Size Affect Power?
25:39
How Does Variability and Sample Size Affect Power?
27:56
How Does Variability and Sample Size Affect Power?
27:57
How Do We Increase Power?
32:47
Increasing Power
32:48
Example 1: Effect Size & Power
35:40
Example 2: Effect Size & Power
37:38
Example 3: Effect Size & Power
40:55
Section 11: Analysis of Variance
F-distributions

24m 46s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
Z- & T-statistic and Their Distribution
0:34
Z- & T-statistic and Their Distribution
0:35
F-statistic
4:55
The F Ratio (the Variance Ratio)
4:56
F-distribution
12:29
F-distribution
12:30
s and p-value
15:00
s and p-value
15:01
Example 1: Why Does F-distribution Stop At 0 But Go On Until Infinity?
18:33
Example 2: F-distributions
19:29
Example 3: F-distributions and Heights
21:29
ANOVA with Independent Samples

1h 9m 25s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
The Limitations of t-tests
1:12
The Limitations of t-tests
1:13
Two Major Limitations of Many t-tests
3:26
Two Major Limitations of Many t-tests
3:27
Ronald Fisher's Solution… F-test! New Null Hypothesis
4:43
Ronald Fisher's Solution… F-test! New Null Hypothesis (Omnibus Test - One Test to Rule Them All!)
4:44
Analysis of Variance (ANOVA) Notation
7:47
Analysis of Variance (ANOVA) Notation
7:48
Partitioning (Analyzing) Variance
9:58
Total Variance
9:59
Within-group Variation
14:00
Between-group Variation
16:22
Time out: Review Variance & SS
17:05
Time out: Review Variance & SS
17:06
F-statistic
19:22
The F Ratio (the Variance Ratio)
19:23
S²bet = SSbet / dfbet
22:13
What is This?
22:14
How Many Means?
23:20
So What is the dfbet?
23:38
So What is SSbet?
24:15
S²w = SSw / dfw
26:05
What is This?
26:06
How Many Means?
27:20
So What is the dfw?
27:36
So What is SSw?
28:18
Chart of Independent Samples ANOVA
29:25
Chart of Independent Samples ANOVA
29:26
Example 1: Who Uploads More Photos: Unknown Ethnicity, Latino, Asian, Black, or White Facebook Users?
35:52
Hypotheses
35:53
Significance Level
39:40
Decision Stage
40:05
Calculate Samples' Statistic and p-Value
44:10
Reject or Fail to Reject H0
55:54
Example 2: ANOVA with Independent Samples
58:21
Repeated Measures ANOVA

1h 15m 13s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
The Limitations of t-tests
0:36
Who Uploads More Pictures and Which Photo-Type is Most Frequently Used on Facebook?
0:37
ANOVA (F-test) to the Rescue!
5:49
Omnibus Hypothesis
5:50
Analyze Variance
7:27
Independent Samples vs. Repeated Measures
9:12
Same Start
9:13
Independent Samples ANOVA
10:43
Repeated Measures ANOVA
12:00
Independent Samples ANOVA
16:00
Same Start: All the Variance Around Grand Mean
16:01
Independent Samples
16:23
Repeated Measures ANOVA
18:18
Same Start: All the Variance Around Grand Mean
18:19
Repeated Measures
18:33
Repeated Measures F-statistic
21:22
The F Ratio (The Variance Ratio)
21:23
S²bet = SSbet / dfbet
23:07
What is This?
23:08
How Many Means?
23:39
So What is the dfbet?
23:54
So What is SSbet?
24:32
S² resid = SS resid / df resid
25:46
What is This?
25:47
So What is SS resid?
26:44
So What is the df resid?
27:36
SS subj and df subj
28:11
What is This?
28:12
How Many Subject Means?
29:43
So What is df subj?
30:01
So What is SS subj?
30:09
SS total and df total
31:42
What is This?
31:43
What is the Total Number of Data Points?
32:02
So What is df total?
32:34
So What is SS total?
32:47
Chart of Repeated Measures ANOVA
33:19
Chart of Repeated Measures ANOVA: F and Between-samples Variability
33:20
Chart of Repeated Measures ANOVA: Total Variability, Within-subject (case) Variability, Residual Variability
35:50
Example 1: Which is More Prevalent on Facebook: Tagged, Uploaded, Mobile, or Profile Photos?
40:25
Hypotheses
40:26
Significance Level
41:46
Decision Stage
42:09
Calculate Samples' Statistic and p-Value
46:18
Reject or Fail to Reject H0
57:55
Example 2: Repeated Measures ANOVA
58:57
Example 3: What's the Problem with a Bunch of Tiny t-tests?
1:13:59
Section 12: Chi-square Test
Chi-Square Goodness-of-Fit Test

58m 23s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Where Does the Chi-Square Test Belong?
0:50
Where Does the Chi-Square Test Belong?
0:51
A New Twist on HT: Goodness-of-Fit
7:23
HT in General
7:24
Goodness-of-Fit HT
8:26
Hypotheses about Proportions
12:17
Null Hypothesis
12:18
Alternative Hypothesis
13:23
Example
14:38
Chi-Square Statistic
17:52
Chi-Square Statistic
17:53
Chi-Square Distributions
24:31
Chi-Square Distributions
24:32
Conditions for Chi-Square
28:58
Condition 1
28:59
Condition 2
30:20
Condition 3
30:32
Condition 4
31:47
Example 1: Chi-Square Goodness-of-Fit Test
32:23
Example 2: Chi-Square Goodness-of-Fit Test
44:34
Example 3: Which of These Statements Describe Properties of the Chi-Square Goodness-of-Fit Test?
56:06
Chi-Square Test of Homogeneity

51m 36s

Intro
0:00
Roadmap
0:09
Roadmap
0:10
Goodness-of-Fit vs. Homogeneity
1:13
Goodness-of-Fit HT
1:14
Homogeneity
2:00
Analogy
2:38
Hypotheses About Proportions
5:00
Null Hypothesis
5:01
Alternative Hypothesis
6:11
Example
6:33
Chi-Square Statistic
10:12
Same as Goodness-of-Fit Test
10:13
Set Up Data
12:28
Setting Up Data Example
12:29
Expected Frequency
16:53
Expected Frequency
16:54
Chi-Square Distributions & df
19:26
Chi-Square Distributions & df
19:27
Conditions for Test of Homogeneity
20:54
Condition 1
20:55
Condition 2
21:39
Condition 3
22:05
Condition 4
22:23
Example 1: Chi-Square Test of Homogeneity
22:52
Example 2: Chi-Square Test of Homogeneity
32:10
Section 13: Overview of Statistics
Overview of Statistics

18m 11s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
The Statistical Tests (HT) We've Covered
0:28
The Statistical Tests (HT) We've Covered
0:29
Organizing the Tests We've Covered…
1:08
One Sample: Continuous DV and Categorical DV
1:09
Two Samples: Continuous DV and Categorical DV
5:41
More Than Two Samples: Continuous DV and Categorical DV
8:21
The Following Data: OK Cupid
10:10
The Following Data: OK Cupid
10:11
Example 1: Weird-MySpace-Angle Profile Photo
10:38
Example 2: Geniuses
12:30
Example 3: Promiscuous iPhone Users
13:37
Example 4: Women, Aging, and Messaging
16:07

Sampling Distribution of the Mean

Lecture Slides are screen-captured images of important points in the lecture. Students can download and print out these lecture slide images to do practice problems as well as take notes while watching the lecture.

  • Intro 0:00
  • Roadmap 0:05
    • Roadmap
  • Special Case of General Method for Simulating a Sampling Distribution 1:53
    • Special Case of General Method for Simulating a Sampling Distribution
    • Computer Simulation
  • Using Simulations to See Principles behind Shape of SDoM 15:50
    • Using Simulations to See Principles behind Shape of SDoM
    • Conditions
  • Using Simulations to See Principles behind Center (Mean) of SDoM 20:15
    • Using Simulations to See Principles behind Center (Mean) of SDoM
    • Conditions: Does n Matter?
    • Conditions: Does Number of Simulation Matter?
  • Using Simulations to See Principles behind Standard Deviation of SDoM 27:13
    • Using Simulations to See Principles behind Standard Deviation of SDoM
    • Conditions: Does n Matter?
    • Conditions: Does Number of Simulation Matter?
  • Central Limit Theorem 37:13
    • SHAPE
    • CENTER
    • SPREAD
  • Comparing Population, Sample, and SDoM 43:10
    • Comparing Population, Sample, and SDoM
  • Answering the 'Questions that Remain' 48:24
    • What Happens When We Don't Know What the Population Looks Like?
    • Can We Have Sampling Distributions for Summary Statistics Other than the Mean?
    • How Do We Know whether a Sample is Sufficiently Unlikely?
    • Do We Always Have to Simulate a Large Number of Samples in Order to get a Sampling Distribution?
  • Example 1: Mean Batting Average 55:25
  • Example 2: Mean Sampling Distribution and Standard Error 59:07
  • Example 3: Sampling Distribution of the Mean 1:01:04

Transcription: Sampling Distribution of the Mean

Hi and welcome to www.educator.com. Today we are going to talk about the sampling distribution of the mean. We have been learning about sampling distributions in general, but the sampling distribution of the mean, sometimes called the sampling distribution of the sample mean, is a special case with a lot of really interesting properties that will come in handy over and over again, so it is worth looking into in detail.

This is just my shorthand, not standard across all of statistics, but I am going to call it the SDOM, short for sampling distribution of the mean, so that you will know what I am talking about without me having to say "sampling distribution of the mean" every single time.

We are going to use simulations online to see some principles and regularities that arise about the shape, mean, and standard deviation of the SDOM. Basically, the idea is that shape, center, and spread really summarize the sampling distribution of the mean. We are also going to talk about how the principles of the SDOM that we observe through the simulations have also been proven in the central limit theorem. We are not going to go over the big formal proofs, although you can find those online, but I want you to see how the two connect to each other. We will also compare the population distributions and sample distributions we have looked at before with the sampling distribution of the mean, which is a new kind of distribution. Finally, we will recap and see whether we have answered some of the questions that remained from last time.

Remember, this is a special case of the general method for simulating a sampling distribution, so let us go over that method and fill in how the SDOM is a special case of a regular old sampling distribution. First, take a random sample of size n from the population; that step does not change for the SDOM, it is the same for all sampling distributions. The real difference is in step number 2, where we compute the summary statistic: the thing that makes the sampling distribution of the mean special is that the particular summary statistic you compute is the mean, and you plot that mean. We actually looked at SDOMs before, but we just called them sampling distributions; now we are going to look at all the different properties of sampling distributions of the mean. Then you repeat steps 1 and 2 many, many times; that step does not change. Finally, display and examine the distribution of the summary statistic; that does not change either. The only thing we have really nailed down in the SDOM is step 2: because it is the sampling distribution of the mean, all you do is find the means and plot them. It is a distribution entirely made up of means.
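The four steps above can be sketched in a few lines of code. This is a minimal illustration, not the applet used in the lecture: the population here is an assumed toy example (the integers 0 through 31), and the function name `simulate_sdom` is mine.

```python
import random
import statistics

def simulate_sdom(population, n, num_samples, seed=0):
    """Simulate the sampling distribution of the mean (SDOM).

    Steps 1-4 from the lecture: draw a random sample of size n,
    compute its mean, repeat many times, and keep all the means.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        sample = rng.choices(population, k=n)   # step 1: random sample of size n
        means.append(statistics.mean(sample))   # step 2: compute the mean
    return means                                # steps 3-4: the distribution of means

# A toy uniform "parent population" of the values 0..31
population = list(range(32))
sdom = simulate_sdom(population, n=16, num_samples=10_000)

print(round(statistics.mean(population), 2))  # 15.5
print(round(statistics.mean(sdom), 2))        # very close to 15.5
```

Plotting `sdom` as a histogram would reproduce the bottom panel of the applet: a distribution made entirely of sample means.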

We are going to look at some simulations on the computer, so if you want to follow along, type this into your browser: onlinestuffdata.com/_sampling_dist. That site actually has some cool statistics simulations you might want to explore as well. If you go to the web site and hit Begin, it should start a Java applet. Let us go over what the different parts mean.

First things first: up on top it says parent population. The simulation shows you a potential parent population, but if you have a mouse, you can draw anything you want; it does not have to look like the default, and it can look as different as you want it to. You can make any parent population. The default is a five-mode distribution, and if you click on the drop-down bar you can see other distributions that are already preprogrammed for you: the normal distribution, the uniform distribution, and a skewed distribution. Custom just gives you a blank canvas so you can draw whatever you want. Whatever you paint here is the parent distribution from which the simulation will pull random samples.

Whenever you draw a distribution, the applet shows you its mean (in blue, here 17.58), its median (in pink), and its standard deviation, drawn as one standard deviation out to the negative side and the positive side of the mean. It also shows how much skew there is, and the kurtosis, which we learned to calculate back when we talked about the normal distribution.

Let us use the normal distribution for now. Here we can see that the skew and kurtosis are exactly 0, the mean and median are the same value, right on top of each other, and the standard deviation is 5 on each side. If you go out 5, 5, and 5, you have about 99% of the distribution, just as a normal distribution should.

Let us go through the steps outlined for simulating a sampling distribution. The first step was to pull out a random sample of size n from the parent population. We can tell the applet what our n should be; let us say n = 16, so it will pull out 16 data points. We can also ask for a variety of different summary statistics, but because we are talking about the SDOM, let us choose the mean. If you hit Animated, it shows you what it looks like to pull out one sample of size 16 and find its mean: the applet pulls out 16 randomly selected data points from the population, finds their mean (the little blue notch), and then drops that mean down into the bottom panel. We keep track of just the means. Let us do that again: pull out 16 data points, find the mean, drop it down. And one more time: pull out 16, find the mean, drop it down.

So far our sampling distribution is really small, but the nice thing about this simulation is that it will run 5 samples for you at once and just drop down the 5 means, without showing each draw. In fact, you can do 1,000 at a time, so the applet pulls out a sample of 16 a thousand times and drops down the thousand means. You can even do 10,000 at a time, and keep hitting 10,000. It might seem as though the picture is not changing; the vertical axis is frequency, that is, how frequent a particular mean is, so even though the frequencies keep going up, the shape is really not changing very much.

Let us look at this distribution of means. It has a skew very close to 0 and a kurtosis very close to 0; in fact, it looks very much like a normal distribution. If you imagine going out three of those little standard-deviation steps, that is about 99% of the entire distribution. We also see that the mean of the SDOM is very similar to the mean of the parent population: here the mean is 16.01 and there the mean is 16. We are already starting to see, in this simulation, some of the things we talked about earlier. That is good.

The question is this: maybe it makes sense that your sampling distribution of means would be normal if your parent population is normal. But what if your parent population is not normal? For instance, what if it is uniform; what would the distribution of means look like then? Let us pull out samples of 16 and find their means, 10,000 times and then another 10,000. Now we have 20,001 simulated little experiments, and what do we see? The means are still very similar, 16 and 15.98. Not only that, but the distribution of means is approximately normal: imagine that little standard-deviation step going out three times, and that is about 99% of the distribution; the skew is very close to 0, and so is the kurtosis. So for normal parent populations as well as uniform parent populations, the sampling distribution of the mean is very close to normal, which is handy because we know a lot about normal distributions.

What about a skewed parent population? Would we expect the sampling distribution of means to be normal, or maybe a little bit skewed? Let us animate one draw just to see: the applet pulls down 16, drops down the mean, and that mean is sort of close to the parent mean up top, but will the distribution be normal or skewed-looking? Let us do it 5 times, another 5, another 5, then 10,000 and another 10,000. Although the skew is a little greater than in the previous ones, this really looks more normal than anything else. Another 10,000, and another 10,000: it does not change very much, and it looks pretty normal. Take that standard-deviation step, go out about three times, and that is about 99% of the distribution. Not only that, but this mean, 8.07, is very similar to the mean of the parent population, 8.08. So even though the original parent population is not normal, the SDOM tends to look very normal, especially if you have simulated a lot of means.

Now let us do a custom one; we can draw some crazy ones here. What about something like this? Do we expect this to have a normal sampling distribution of the mean? I animate one draw: here are 16 data points, it finds the mean, drops it down; do that 10,000 times and another 10,000, and we get 16.01 versus 16 for the means, very close to 0 for skew, very close to 0 for kurtosis. This is looking very, very normal: take the standard-deviation step, go out 3, and you have about 99%. Very interesting. You could try a whole bunch of different crazy shapes, and I dare you to draw one that would not give you a normally distributed sampling distribution of the mean.

What about this one, like the shadow of a parabola or something? Let us clear the bottom panels and go directly to 10,000. Even for this crazy distribution, the sampling distribution of the mean looks fairly normal, and its mean is almost right on top of the mean of the parent population. Another thing I want you to notice: where the parent's standard deviation is 11.66, the standard deviation down here is much smaller. So there are a couple of things we have already learned from these simulations.

Let us think about using these simulations to see the principles behind the shape of the SDOM. One thing we saw is that no matter what the shape of the parent population, the SDOM tends to be normal. But wait a second: we have only tested that for n = 16. Maybe 16 is somehow special. What about n = 2; will that also be normal? Let us try it. Look at what we see: when n is very small, the SDOM does not tend to be normal. What about n = 5? That looks a little more normal, but not really the nice normal distribution we saw with 16. What about 10? Now it is starting to look a little more normal. What we see is that as long as n is reasonably large, and I do not know if there is a magic number, you tend to get a normal sampling distribution of the mean.

That points to a couple of conditions that have to be fulfilled before the principle holds. First, n, the sample size, must be reasonably large: 2 is too small, and even 5 is a little too small; it starts looking better from there, but whatever "reasonably large" is, it should be reasonably large. A lot of times people use a rule like n = 40 as "reasonably large," but that is just a rule of thumb.

What else might matter? Does it matter how many times we sample? Let us clear the bottom panels and run the simulation just 5 times. With only 5 simulations we do not get a normal distribution; only when we start doing it 1,000 times, or 10,000 or 20,000 times, does it start looking more and more normal. So it also seems that the more simulations we have, the better: if you are using simulations, you must have a large number of them. Although this principle is really helpful, these conditions have to be met before you can invoke it. So we have learned some things from the simulations about the shape of the SDOM.
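The shape principle can also be checked numerically instead of by eye: one simple diagnostic is the skewness of the simulated means, which should shrink toward 0 (symmetry) as n grows. This is a sketch, not the applet's method; the right-skewed toy population, the function names, and the moment-based skew formula are all my choices for illustration.

```python
import random
import statistics

def sample_means(population, n, num_samples, seed=1):
    """Draw num_samples random samples of size n and return their means."""
    rng = random.Random(seed)
    return [statistics.mean(rng.choices(population, k=n))
            for _ in range(num_samples)]

def skewness(xs):
    """Moment-based skew: the average cubed z-score (0 for a symmetric shape)."""
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

# A strongly right-skewed parent population: many small values, a few large ones
population = [1] * 70 + [2] * 20 + [10] * 10

skew_small = abs(skewness(sample_means(population, n=2, num_samples=5_000)))
skew_large = abs(skewness(sample_means(population, n=40, num_samples=5_000)))

# The SDOM from tiny samples inherits the parent's skew;
# with a reasonably large n it comes out close to symmetric.
print(skew_small > skew_large)
```

This mirrors what the applet showed: n = 2 produces a visibly skewed SDOM, while larger n produces one that looks normal.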

What about the principles behind the center, or mean, of the SDOM? One thing we found was that generally the mean of the SDOM, written with a new notation, mu sub x-bar, because it is the mu of a bunch of means, tends to equal the mean of the population, symbolized by plain mu. Sometimes I will say parent population, but that means the same thing as the population. Notice that this uses mu: remember, the SDOM, like all sampling distributions, is a theoretical distribution, and theoretical distributions tend to be notated like populations.

Does it have to be that way; are there any conditions? First, does n matter? We want to know whether the size of our sample matters. Let me clear the lower panels, pick the skewed distribution, and try n = 2, 10,000 times. With n = 2 the SDOM does not look very normal, but its mean is still very close to the parent mean; that is interesting to note. What about the uniform distribution with n = 2? Is the mean very similar to the parent mean? It tends to be. So for small n, the means tend to be equal to each other. What about n = 5, which is still pretty small? 10,000 simulations: 16 and 16, still pretty good. The skewed one, 10,000 times: we keep seeing the same thing. What about a very large n? 8.07 versus 8.08. What about some crazy custom distribution? 15.06 versus 15.05, pretty good. Let us try that crazy distribution with n = 2: 15.04 versus 15.05. So no matter the size of n, the means tend to be equal: for both smaller and larger n, mu sub x-bar = mu. That is nice; we do not have to make sure we have a large n in order to invoke this principle.

What about the number of simulations; does that matter? Let us clear the bottom panels and do only 5 simulations. With 5 simulations we get 15.61, which is not that far off from 15.05. Clear it and do another 5: here we get 11.22 versus 15.05, so maybe the number of simulations does matter. Again: 15.79, not so bad. Again: 16.47. We are seeing that with a small number of simulations you are not really sure whether the SDOM's mean will be close to mu or not; it is usually in the right range, but having more simulations gives you that assurance. Let us try with a larger n: 15.05 versus 15.03; clear and run 5 simulations again, 15.03 versus 15.05; clear again, 16.51, which is off; one more time, 16. So the number of simulations does matter: having a large number of simulations gives more accuracy in mu sub x-bar, and by accuracy I mean more of a match between the mean of your sampling distribution and your population mean. For the center of the SDOM, then, n does not matter, but the number of simulations does.
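The center principle is easy to verify in code: with enough simulated samples, mu sub x-bar lands on mu even when n is tiny. A sketch with an assumed toy population (the integers 1 through 30, so mu = 15.5); `sdom_mean` is my name, not the applet's.

```python
import random
import statistics

def sdom_mean(population, n, num_samples, seed=2):
    """Mean of a simulated SDOM: the mean of many sample means."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.choices(population, k=n))
             for _ in range(num_samples)]
    return statistics.mean(means)

population = list(range(1, 31))       # toy parent population
mu = statistics.mean(population)

# With a large number of simulations, mu_xbar tracks mu for small AND large n
print(round(sdom_mean(population, n=2, num_samples=20_000), 1))
print(round(sdom_mean(population, n=25, num_samples=20_000), 1))
print(mu)  # 15.5
```

Dropping `num_samples` to 5 reproduces the lecture's other observation: the estimate wanders (11-ish, 16-ish) run to run, which is why many simulations are needed.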

What about the standard deviation of the SDOM? We have touched on it briefly in a problem before, but saying "standard deviation of the sampling distribution of the mean" is long, and we end up using this standard deviation a lot; it is going to come up over and over again. Because of that, we give it a special name as a shortcut: the standard error. Deviation is the word for the distance between the mean and some data point; error is often used interchangeably with that term, meaning how far off you are from some target, the target here being the mean. It does not mean we made an error, like an actual mistake; it just means how far off these data points are. That is why it is called the standard error, and whenever we say standard error, what we really mean is the whole phrase: the standard deviation of the sampling distribution of the mean. We do not use the words standard error for any other standard deviation; for instance, if we calculate the standard deviation of a sample or a population, we would not call it a standard error. The term is reserved for this concept.

What if we learn about the standard error when we did some of our simulations?1734

Let us look again with a special focus on standard deviation.1742

Let us go to the normal distribution and let us start with the n(5).1751

Let us take 10,000 simulations and we see that it is fairly normal and it has a very similar mean has a skew and kurtosis close to 0.1759

This is looking very normal to us but what is one thing that you notice between this and this?1775

This one seems a lot sort of like cornier or sharper.1782

The standard error seems to be smaller than the population standard deviation.1787

Let us see if that is true for different values of n.1797

Let us try n=10 and I will just keep straight to 10,000 simulations and another 10,000.1801

We see once again, it is nice and normal skew and kurtosis are close to 0, similar mean1810

but we see that the standard deviation here is still bigger than the standard deviation down here.1818

Not only that, but we see that this is even pointier than it was before.1826

It was before even sharper.1831

Let us try for n(16).1832

I think it is hard to notice but it is even pointier like narrow in terms of its standard deviation.1838

If you notice the numbers, these numbers keep going lower and lower.1851

Before where n=5, the standard deviation was 2.23.1856

For n=10, the standard deviation is 1.61.1863

Obviously the standard deviation has not changed at all.1871

Let us look at n = 25.

Here the standard error is 1, so it is really small.

One thing we see is that the standard error is basically always smaller than the standard deviation of the parent population; not only that, how small it is seems to be related to n.

Let us see if that is true with other distributions.

Maybe we will draw a custom one. Here is a tri-modal distribution.

Maybe this time we will start off with the biggest n, n = 25, and see.

The standard error is 1.71, and this is looking pretty normal; but what about n = 10?

Is the standard error going to be smaller than 1.7?

It is not; it is bigger, and that is what we saw before: as n gets smaller and smaller, the standard error gets bigger and bigger.

There is a bigger spread.

Let us try n = 5.

We should predict that it will be even bigger, and that is what we see: with n = 5 the standard error is really big.

Maybe we will try one other distribution, the uniform distribution.

Let us go from small to big n: as n starts small and gets bigger and bigger, what should happen to the standard error?

The standard error should get smaller and smaller.

They have an inverse relationship.

Here we start off with a pretty wide-looking standard deviation; look at the red part right here, it is pretty wide, 6.79.

Remember, 6.79 is still smaller than 9.52.

It is always smaller than the parent population's, but n always matters.

What about n = 5?

Now it is 4-something; at n = 10 it is 3-something; at n = 25 it is 1.91.

We see a couple of things here.

The standard error (to write standard error we write sigma sub x-bar, because the sampling distribution is a theoretical distribution, and the subscript x-bar indicates that it is the standard deviation of a bunch of means) is always smaller than the standard deviation of the population, which we call just sigma.

This seems to be true.

Does n matter?

Yes, it does.

How does it matter?

The larger the n, the smaller the standard error.

Obviously you could also state this the other way: the smaller the n, the larger the standard error.

It is the same idea.
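The pattern we just observed can be sketched in code. This is a minimal simulation, assuming a normal parent population with a standard deviation of 5 (roughly matching the applet runs above); the function name and the population values are illustrative, not from the lesson.

```python
import random
import statistics

def standard_error_sim(pop_sd, n, num_sims=10_000, seed=42):
    """Estimate the standard error: draw num_sims samples of size n
    from a normal population and take the SD of the sample means."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.gauss(0, pop_sd) for _ in range(n))
             for _ in range(num_sims)]
    return statistics.stdev(means)

# The standard error shrinks as n grows, and it stays smaller
# than the population standard deviation (5 here).
for n in (5, 10, 25):
    print(n, round(standard_error_sim(5, n), 2))
```

With a population standard deviation of 5, these come out near 2.2, 1.6, and 1.0, the same shrinking pattern we saw in the applet.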

Does the number of simulations matter?

Let us see.

Let us say we only had 5 simulations; what is the standard deviation?

Well, it is still smaller.

The standard error is almost always smaller, that is true, but that is rough; we do not know exactly how much smaller.

So does the number of simulations matter?

Not really: for the big idea that the standard error is smaller than the population standard deviation, the number of simulations does not really matter.

But we want to get more precise than that.

It is pretty general to just say that the standard error is smaller than the standard deviation of the population.

It would be nice to know exactly how much smaller.

That is where the central limit theorem comes in.

So far we have looked at the properties of the sampling distribution of the mean through simulation; that is the empirical method.

What the central limit theorem did was formalize those observations: people observed that the shape tends to be normal when n is large, that the center tends to be similar between the SDOM and the population, and that the spread tends to be smaller in the SDOM than in the population, particularly as n goes up. The central limit theorem is the formal proof of those ideas.

I am not going to go through the proof, but I will go over what it concludes.

The central limit theorem ends with this: as sample size increases, the shape of the SDOM becomes more normal.

I should say it approximates normality; the SDOM is not literally transforming into the normal distribution, but it approximates it.

As n goes up, the shape becomes more normal.

This is just the formal way of saying it, and it has actually been proven mathematically.

To note, although it is not part of the central limit theorem: the population is not necessarily the same shape as the SDOM.

The SDOM is always approximately normal when the sample size is large, but that does not mean the population is normal.

That is helpful, because we then know the shape of the SDOM even when we do not know anything about the shape of the population.

What about the center?

The principle for the center is that the mean of the SDOM is equal to the mean of the population; and does n matter?

No, it does not.

What about the spread?

Here the standard error is equal to the standard deviation of the population divided by the square root of n: sigma sub x-bar = sigma / sqrt(n).

An easier way to remember this is that the variance of the SDOM is equal to the variance of the population divided by n.

When we take the square root of both sides, we get the standard error: the standard deviation of the population divided by the square root of n.

This gives us that nice inverse relationship: as n becomes bigger and bigger, you are dividing the population standard deviation by a bigger and bigger number, resulting in a smaller and smaller standard error.
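You can check the formula exactly, without any simulation, by enumerating every possible sample from a tiny toy population; the population [1, 2, 3, 4] below is an assumption chosen only for illustration.

```python
import itertools
import math

population = [1, 2, 3, 4]   # toy population, chosen for illustration
N = len(population)
mu = sum(population) / N
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / N)

n = 2
# Every possible sample of size n drawn with replacement: this is the
# exact sampling distribution of the mean, not a simulation.
means = [sum(s) / n for s in itertools.product(population, repeat=n)]
mu_xbar = sum(means) / len(means)
se = math.sqrt(sum((m - mu_xbar) ** 2 for m in means) / len(means))

print(mu_xbar == mu)                            # True: center matches
print(math.isclose(se, sigma / math.sqrt(n)))   # True: sigma / sqrt(n)
```

Because every sample is enumerated, the mean of the SDOM equals mu exactly and the standard error equals sigma divided by the square root of n exactly, which is what the theorem promises.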

The opposite holds as well: as n gets smaller and smaller, let us consider n = 1.

When n is 1, the standard error precisely equals the standard deviation of the population.

Think about that: if we are taking samples of just 1 from the population, then we end up sampling the population exactly as it is.

It would be like drawing the same distribution again, and because of that the standard deviation would be the same and the mean would be the same.

Not only that, but the shape would be the same as the shape of the population; it would not necessarily be normal.

It is helpful to think about the special case when n = 1.

In statistics you would never actually sample with n = 1; that is not helpful to us.

We really get more out of the central limit theorem when n is larger and larger.

This is the central limit theorem in a nutshell, and it is going to be helpful to us, because now we do not need a large number of simulations; in a lot of the simulations we were looking at, the number of simulations mattered.

What is nice about the fact that the central limit theorem (we may also call it the CLT) has actually been proven is that now we can get directly to these principles without having to do computer-based simulations.

Still, the computer-based simulations are helpful just to be able to see where the central limit theorem comes from empirically.

Let us compare the population, the sample, and the SDOM.

The population and the sample are two things we have looked at a lot before, but I want you to see how these all fit together.

When we talked about the population before, we called it the truth, and this is the thing we really want to know.

We do not actually want to know about samples; we really want to know about the population.

The problem is that it is really hard to get empirical data on the population, to actually go out and get data from the world. Often it is impossible.

Largely the population is either theoretical, or we just do not know what it is like; when we do have known populations, they tend to be small or very well-studied populations.

The summary values for the population are called parameters.

In the same way, the sample is not the truth; we are not really interested in the sample itself, but we are using the sample as a window to the truth.

What is nice about the sample is that, unlike the population, it is empirical.

We can actually go out there and get data on it.

The summary values for the sample are called statistics.

For the population we symbolize the mean as mu; for the sample we call it x-bar. The population size is capital N, but the sample size is lowercase n.

The population variance is sigma squared, but for the sample it is s squared, the n - 1 version.

For standard deviation, it is sigma versus s.

You might also see capital S, but when you use capital S you are not trying to approximate the population standard deviation; you are just interested in the actual standard deviation of the sample.
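The s-versus-S distinction is just a divisor of n - 1 versus n. A quick sketch with made-up data (the numbers are only for illustration):

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample values
n = len(data)
xbar = sum(data) / n               # the sample mean, x-bar

ss = sum((x - xbar) ** 2 for x in data)
s = math.sqrt(ss / (n - 1))   # s: the n - 1 version, estimates sigma
S = math.sqrt(ss / n)         # S: describes only this sample's spread

print(xbar)              # 5.0
print(round(S, 3))       # 2.0
print(round(s, 3))       # 2.138
```

The n - 1 version always comes out a bit larger, which compensates for a sample tending to understate the population's spread.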

Now that we know this, how does the SDOM fit into all of it?

Is the SDOM the truth, the thing we want to know?

No, not really; nor is it just a window to the truth.

In essence, what it really helps us do is get from this to this.

It is sort of a middleman, because the SDOM is like a whole distribution of windows to the truth.

Although it is not the truth itself, it helps us interpret the sample, because it is a whole bunch of windows to the truth, and you can see where the sample fits into the sampling distribution of the mean.

We do not get data on it.

It is not empirical.

It is theoretical.

It is still a theoretical distribution, but we can easily generate it because of the CLT.

Those principles tell us what the SDOM looks like, and instead of calling its summary values parameters or statistics, we call them expected values.

Just like probability distributions, these are theoretical distributions of samples.

Instead of mu or x-bar, because the SDOM is the distribution of windows to the truth, its mean is mu sub x-bar: the mean of a whole bunch of x-bars.

Instead of capital N and lowercase n, the count here is N sub x-bar, and that applies only in the case where we use simulations.

Now that we derive it through the central limit theorem, the variance is sigma squared sub x-bar, and the standard deviation, instead of just sigma, is sigma sub x-bar.

Notice these little x-bars everywhere; that is what really sets the sampling distribution of the mean apart, because it is about the mean of means, the n of means, the variance of means, and the standard deviation of means.

It is always about means of sample means, and because of that you always see the sub x-bar on all of these expected values.

That is how these things fit together; now let us answer some of the questions that remain.

These questions were from the previous lesson, where we went over sampling distributions in general.

We had some questions that remained, and perhaps the SDOM can help us answer some of them.

What happens when we do not know what the population looks like?

In the case of the SDOM, we do not have to know what the population looks like.

If you use the SDOM, the sampling distribution of sample means, then you do not need to know what the population looks like.

We do not have to know whether it is uniform or skewed or anything like that, because we know the shape, mean, and spread of the SDOM.

We can use the SDOM as our middleman instead of having to know the population.

Can we have sampling distributions for summary statistics other than the mean?

Why, yes you can; if you want, you can play with the simulation further. I am just going to clear this and go with the normal distribution again.

We do not have to use the mean: we can use the median and look at what the sampling distribution of the median looks like, or we can look at the standard deviation.

We can also look at the range or the interquartile range.

We can look at variance, and we can look at all of these for a whole bunch of different n as well as different summary statistics.

If we look at that, we can see what the sampling distribution of the standard deviation looks like.

What about the median?

We can look at that too, but when we look at medians, standard deviations, or variances, the CLT does not necessarily apply in full force.

For instance, the median of the population is not necessarily going to be the median of the sampling distribution.

Let me show you a custom example: notice that the median here is similar to the median here, and even for small sample sizes the median is pretty similar.

Okay, so the median at least works for this one; but there are cases where it does not necessarily work.

Like this one: here the median is 17, but here the median is 16.

The population median does not always equal the median of the sampling distribution of medians.

For these other sample statistics, such as medians, variances, and standard deviations, you do not necessarily have all of the properties of the CLT.

There are some exceptions.

For instance, with the standard deviation you very often do get roughly normally distributed distributions.

But it is not quite as regular as the sampling distribution of the mean; there are exceptions here and there, so it is sort of a hodgepodge. Only the sampling distribution of the mean really fits all three properties of the CLT.

So here we can say: yes, but the CLT does not necessarily apply.
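Here is one way to see the median exception concretely. Using a small skewed toy population (an assumption for illustration, not from the lesson) and enumerating every sample of size 3, the mean of the sample means recovers the population mean exactly, while the center of the sampling distribution of medians drifts away from the population median:

```python
import itertools
import statistics

population = [1, 1, 2, 3, 13]     # small right-skewed toy population
pop_mean = statistics.mean(population)       # 4.0
pop_median = statistics.median(population)   # 2

n = 3
means, medians = [], []
# Enumerate every possible sample of size n with replacement.
for s in itertools.product(population, repeat=n):
    means.append(statistics.mean(s))
    medians.append(statistics.median(s))

# The CLT-style center property holds for the mean...
print(round(statistics.mean(means), 6))      # 4.0, the population mean
# ...but the sampling distribution of medians centers elsewhere.
print(statistics.mean(medians))              # about 3.04, not 2
```

The more skewed the population, the further the center of the median's sampling distribution can sit from the population median, which is why the CLT's guarantees are stated for the mean.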

How do we know whether a sample is sufficiently unlikely?

We do not really know for sure, but one thing we talked about with normal distributions is that, given a normal distribution, we can tell where 99%, 95%, or 90% of the means should fall.

We can actually make these cutoff points once we know that it is a normal distribution.

We can decide that as long as a mean is more extreme than 99% of what is expected, it is sufficiently unlikely.

We can set these somewhat arbitrary marks.

Although the CLT does not say exactly how different a sample has to be, we can set marks given that the distribution is normal.

Do we always have to simulate a large number of samples in order to get a sampling distribution?

No, not necessarily, because of the CLT.

The CLT does not rely on simulations.

Simulations are a way of doing it empirically, pretending to sample many, many times, but because of the CLT we can go directly to the sampling distribution.

We do not actually have to do the simulations.

Let us go into some examples.

Example 1: from 1910 to 1919 the batting averages of major league baseball players were approximately normal.

Here are the mean and standard deviation.

Construct a sampling distribution of the mean batting average for random samples of 15 players.

What would be the resulting shape, center, and spread?

For me it helps, just like in that simulation, to have the population up here and the sampling distribution down here.

What I like to do is draw the population and the SDOM, just to give myself a sense of what is going on.

The batting averages are approximately normal with a mean of .266 and a standard deviation of .037.

So mu is .266 and sigma is .037.

What are the resulting shape, center, and spread of the sampling distribution of the mean batting average for random samples of 15 players?

We know that whatever this is, it should be skinnier, but it should be approximately normal.

n is 15; we use lowercase n because it is the size of our sample.

We would use N sub x-bar if we were talking about the number of simulations we had done, but we are not doing any simulations.

What would be the resulting shape?

I would say approximately normal.

What would be the center?

That is mu sub x-bar, and we know that mu sub x-bar = mu, so that equals .266.

There we have the center; what about spread?

Spread would be the standard deviation of the SDOM, sigma sub x-bar.

We know that sigma sub x-bar equals sigma divided by the square root of n: .037 / sqrt(15).

I am just going to pull up an Excel file to do my calculations; feel free to use a calculator.

.037 divided by sqrt(15) gives about .00955.

We see that this is much smaller than this, so a skinnier shape, a skinnier spread.

Not as much spread.

There you have it.

We have the shape, center, and spread.
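Example 1 can be checked in a couple of lines (Python standing in for the Excel calculation; the numbers are the ones given in the problem):

```python
import math

mu, sigma, n = 0.266, 0.037, 15   # values given in Example 1

mu_xbar = mu                      # center of the SDOM equals mu
se = sigma / math.sqrt(n)         # spread: the standard error

print(mu_xbar)          # 0.266
print(round(se, 5))     # 0.00955
```

The standard error is far smaller than the population standard deviation of .037, which is the skinnier spread described above.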

Example 2: consider the sampling distribution of the mean (I like to write SDOM) of a random sample of size n taken from a population of size N, with a mean of mu and a standard deviation of sigma.

For a fixed N, how does the mean of the sampling distribution change as n increases?

And how does the standard error change as n increases?

Whether n increases or not, mu sub x-bar = mu.

n does not matter here, and that is one of the things we established.

For a fixed N, how does the standard error change as n increases?

We know that sigma sub x-bar, the standard error, is the standard deviation of the population divided by the square root of n, so it is inversely proportional to sqrt(n).

So how does the standard error change as n increases?

The standard error becomes smaller; that is the inverse relationship: as one goes up, the other comes down.

Whether n increases or decreases, this relationship stays the same.

Here, as n increases, the standard error decreases.

Example 3: for most farms in the Midwest, each one-thousandth of an acre produces on average 15,000 kernels of corn with a standard deviation of 2,000.

Suppose 25 of these mini plots are randomly chosen on a typical Midwestern farm.

What is the probability that the mean number of kernels per plot will exceed 15,000? 16,000? 17,000?

Let us think about what this is asking us.

Imagine an acre split up into 1,000 tiny little plots, and pretend that this is one of those thousandths.

I know it does not look quite right, but pretend that it is.

For that little plot, this is the mean and this is the standard deviation for most farms.

Suppose 25 of these mini plots are randomly chosen on a typical Midwestern farm.

What it is telling us is that we do not know what this population, what that distribution, looks like.

We only know that on average the mu is 15,000 and the standard deviation is 2,000.

We want to know: if we take plots and look at how many kernels there are, what is the probability that the mean number will exceed 15,000?

We cannot just look at the population, because we do not know whether the population is normal; but we do know about the sampling distribution of the mean.

That distribution is approximately normal.

We will use the SDOM, and although we cannot draw the population, because we do not know what it looks like, we do know what the SDOM looks like.

It looks like this.

We know the mean: mu sub x-bar is 15,000, because it is the same as the population mean.

We also know the standard error of this distribution.

The standard error is the standard deviation divided by the square root of n, and n is 25.

That is 2,000 / sqrt(25) = 2,000 / 5 = 400.

We know that this is 400, so we can ask: what is the probability that the mean number of kernels per plot will exceed 15,000?

We take 25 of these mini plots and find the mean.

What is the likelihood that the mean will exceed 15,000?

Let me use a different color.

15,000 is right here; what is that probability?

The probability that that mean exceeds 15,000?

We know that it is 50%, because normal distributions are symmetrical.

What about 16,000?

One thing that is helpful to know is how many standard deviations away it is.

You have to think back to all that normal distribution material.

We need to know where 16,000 is.

Each jump is 400, and we need to know how many 400s away 16,000 is.

To do that we can just use a z-score.

The z-score for 16,000 is (16,000 - 15,000) / 400.

That is 2.5, so 16,000 sits at a z-score of 2.5.

I am running out of rows, so I am just going to delete that row.

You can look up in the back of your book how much this area is, using the z-score of 2.5, given that the mean sits at a z-score of 0.

That little tail area gives us the probability that x-bar exceeds 16,000.

Instead of looking it up in my book, I am going to use the normal distribution function in Excel.

NORMSDIST gives me the area underneath the curve up to a particular z-score, 2.5.

That gives me this area, and I need 1 minus this, which is .0062.

The probability that x-bar will exceed 15,000 is 50%, but the probability that x-bar will exceed 16,000 is only .0062.

Most of our x-bars will be below 16,000.
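The whole of Example 3 can be reproduced without Excel. math.erf gives the standard normal CDF (the same area NORMSDIST computes), so the tail probabilities for all three cutoffs fall out directly:

```python
import math

def normal_cdf(z):
    """Standard normal CDF, the area below z (like Excel's NORMSDIST)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma, n = 15_000, 2_000, 25
se = sigma / math.sqrt(n)          # 2,000 / 5 = 400

for cutoff in (15_000, 16_000, 17_000):
    z = (cutoff - mu) / se
    tail = 1 - normal_cdf(z)       # P(x-bar exceeds the cutoff)
    print(cutoff, round(tail, 4))
```

This matches the lesson: 50% for 15,000 and about .0062 for 16,000; for 17,000, which sits at z = 5, the probability is vanishingly small.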
