  Dr. Ji Son

Sampling Distribution of Sample Proportions

Slide Duration:

Section 1: Introduction
Descriptive Statistics vs. Inferential Statistics

25m 31s

Intro
0:00
0:10
0:11
Statistics
0:35
Statistics
0:36
Let's Think About High School Science
1:12
Measurement and Find Patterns (Mathematical Formula)
1:13
Statistics = Math of Distributions
4:58
Distributions
4:59
Problematic… but also GREAT
5:58
Statistics
7:33
How is It Different from Other Specializations in Mathematics?
7:34
Statistics is Fundamental in Natural and Social Sciences
7:53
Two Skills of Statistics
8:20
Description (Exploration)
8:21
Inference
9:13
Descriptive Statistics vs. Inferential Statistics: Apply to Distributions
9:58
Descriptive Statistics
9:59
Inferential Statistics
11:05
Populations vs. Samples
12:19
Populations vs. Samples: Is it the Truth?
12:20
Populations vs. Samples: Pros & Cons
13:36
Populations vs. Samples: Descriptive Values
16:12
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:10
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:11
Example 1: Descriptive Statistics vs. Inferential Statistics
19:09
Example 2: Descriptive Statistics vs. Inferential Statistics
20:47
Example 3: Sample, Parameter, Population, and Statistic
21:40
Example 4: Sample, Parameter, Population, and Statistic
23:28
Section 2: About Samples: Cases, Variables, Measurements

32m 14s

Intro
0:00
Data
0:09
Data, Cases, Variables, and Values
0:10
Rows, Columns, and Cells
2:03
Example: Aircraft
3:52
How Do We Get Data?
5:38
Research: Question and Hypothesis
5:39
Research Design
7:11
Measurement
7:29
Research Analysis
8:33
Research Conclusion
9:30
Types of Variables
10:03
Discrete Variables
10:04
Continuous Variables
12:07
Types of Measurements
14:17
Types of Measurements
14:18
Types of Measurements (Scales)
17:22
Nominal
17:23
Ordinal
19:11
Interval
21:33
Ratio
24:24
Example 1: Cases, Variables, Measurements
25:20
Example 2: Which Scale of Measurement is Used?
26:55
Example 3: What Kind of a Scale of Measurement is This?
27:26
Example 4: Discrete vs. Continuous Variables.
30:31
Section 3: Visualizing Distributions
Introduction to Excel

8m 9s

Intro
0:00
Before Visualizing Distribution
0:10
Excel
0:11
Excel: Organization
0:45
Workbook
0:46
Column x Rows
1:50
Tools: Menu Bar, Standard Toolbar, and Formula Bar
3:00
Excel + Data
6:07
Excel and Data
6:08
Frequency Distributions in Excel

39m 10s

Intro
0:00
0:08
Data in Excel and Frequency Distributions
0:09
Raw Data to Frequency Tables
0:42
Raw Data to Frequency Tables
0:43
Frequency Tables: Using Formulas and Pivot Tables
1:28
Example 1: Number of Births
7:17
Example 2: Age Distribution
20:41
Example 3: Height Distribution
27:45
Example 4: Height Distribution of Males
32:19
Frequency Distributions and Features

25m 29s

Intro
0:00
0:10
Data in Excel, Frequency Distributions, and Features of Frequency Distributions
0:11
Example #1
1:35
Uniform
1:36
Example #2
2:58
Unimodal, Skewed Right, and Asymmetric
2:59
Example #3
6:29
Bimodal
6:30
Example #4a
8:29
Symmetric, Unimodal, and Normal
8:30
Point of Inflection and Standard Deviation
11:13
Example #4b
12:43
Normal Distribution
12:44
Summary
13:56
Uniform, Skewed, Bimodal, and Normal
13:57
17:34
Sketch Problem 2: Life Expectancy
20:01
Sketch Problem 3: Telephone Numbers
22:01
Sketch Problem 4: Length of Time Used to Complete a Final Exam
23:43
Dotplots and Histograms in Excel

42m 42s

Intro
0:00
0:06
0:07
Previously
1:02
Data, Frequency Table, and Visualization
1:03
Dotplots
1:22
Dotplots Excel Example
1:23
Dotplots: Pros and Cons
7:22
Pros and Cons of Dotplots
7:23
Dotplots Excel Example Cont.
9:07
Histograms
12:47
Histograms Overview
12:48
Example of Histograms
15:29
Histograms: Pros and Cons
31:39
Pros
31:40
Cons
32:31
Frequency vs. Relative Frequency
32:53
Frequency
32:54
Relative Frequency
33:36
Example 1: Dotplots vs. Histograms
34:36
Example 2: Age of Pennies Dotplot
36:21
Example 3: Histogram of Mammal Speeds
38:27
Example 4: Histogram of Life Expectancy
40:30
Stemplots

12m 23s

Intro
0:00
0:05
0:06
What Sets Stemplots Apart?
0:46
Data Sets, Dotplots, Histograms, and Stemplots
0:47
Example 1: What Do Stemplots Look Like?
1:58
Example 2: Back-to-Back Stemplots
5:00
7:46
Example 4: Quiz Grade & Afterschool Tutoring Stemplot
9:56
Bar Graphs

22m 49s

Intro
0:00
0:05
0:08
Review of Frequency Distributions
0:44
Y-axis and X-axis
0:45
Types of Frequency Visualizations Covered so Far
2:16
Introduction to Bar Graphs
4:07
Example 1: Bar Graph
5:32
Example 1: Bar Graph
5:33
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:07
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:08
Example 2: Create a Frequency Visualization for Gender
14:02
Example 3: Cases, Variables, and Frequency Visualization
16:34
Example 4: What Kind of Graphs are Shown Below?
19:29
Section 4: Summarizing Distributions
Central Tendency: Mean, Median, Mode

38m 50s

Intro
0:00
0:07
0:08
Central Tendency 1
0:56
Way to Summarize a Distribution of Scores
0:57
Mode
1:32
Median
2:02
Mean
2:36
Central Tendency 2
3:47
Mode
3:48
Median
4:20
Mean
5:25
Summation Symbol
6:11
Summation Symbol
6:12
Population vs. Sample
10:46
Population vs. Sample
10:47
Excel Examples
15:08
Finding Mode, Median, and Mean in Excel
15:09
Median vs. Mean
21:45
Effect of Outliers
21:46
Relationship Between Parameter and Statistic
22:44
Type of Measurements
24:00
Which Distributions to Use With
24:55
Example 1: Mean
25:30
Example 2: Using Summation Symbol
29:50
Example 3: Average Calorie Count
32:50
Example 4: Creating an Example Set
35:46
Variability

42m 40s

Intro
0:00
0:05
0:06
0:45
0:46
5:45
5:46
Range, Quartiles and Interquartile Range
6:37
Range
6:38
Interquartile Range
8:42
Interquartile Range Example
10:58
Interquartile Range Example
10:59
Variance and Standard Deviation
12:27
Deviations
12:28
Sum of Squares
14:35
Variance
16:55
Standard Deviation
17:44
Sum of Squares (SS)
18:34
Sum of Squares (SS)
18:35
Population vs. Sample SD
22:00
Population vs. Sample SD
22:01
Population vs. Sample
23:20
Mean
23:21
SD
23:51
Example 1: Find the Mean and Standard Deviation of the Variable Friends in the Excel File
27:21
Example 2: Find the Mean and Standard Deviation of the Tagged Photos in the Excel File
35:25
Example 3: Sum of Squares
38:58
Example 4: Standard Deviation
41:48
Five Number Summary & Boxplots

57m 15s

Intro
0:00
0:06
0:07
Summarizing Distributions
0:37
0:38
5 Number Summary
1:14
Boxplot: Visualizing 5 Number Summary
3:37
Boxplot: Visualizing 5 Number Summary
3:38
Boxplots on Excel
9:01
Using 'Stocks' and Using Stacked Columns
9:02
Boxplots on Excel Example
10:14
When are Boxplots Useful?
32:14
Pros
32:15
Cons
32:59
How to Determine Outlier Status
33:24
Rule of Thumb: Upper Limit
33:25
Rule of Thumb: Lower Limit
34:16
Signal Outliers in an Excel Data File Using Conditional Formatting
34:52
Modified Boxplot
48:38
Modified Boxplot
48:39
Example 1: Percentage Values & Lower and Upper Whisker
49:10
Example 2: Boxplot
50:10
Example 3: Estimating IQR From Boxplot
53:46
Example 4: Boxplot and Missing Whisker
54:35
Shape: Calculating Skewness & Kurtosis

41m 51s

Intro
0:00
0:16
0:17
Skewness Concept
1:09
Skewness Concept
1:10
Calculating Skewness
3:26
Calculating Skewness
3:27
Interpreting Skewness
7:36
Interpreting Skewness
7:37
Excel Example
8:49
Kurtosis Concept
20:29
Kurtosis Concept
20:30
Calculating Kurtosis
24:17
Calculating Kurtosis
24:18
Interpreting Kurtosis
29:01
Leptokurtic
29:35
Mesokurtic
30:10
Platykurtic
31:06
Excel Example
32:04
Example 1: Shape of Distribution
38:28
Example 2: Shape of Distribution
39:29
Example 3: Shape of Distribution
40:14
Example 4: Kurtosis
41:10
Normal Distribution

34m 33s

Intro
0:00
0:13
0:14
What is a Normal Distribution
0:44
The Normal Distribution As a Theoretical Model
0:45
Possible Range of Probabilities
3:05
Possible Range of Probabilities
3:06
What is a Normal Distribution
5:07
Can Be Described By
5:08
Properties
5:49
'Same' Shape: Illusion of Different Shape!
7:35
'Same' Shape: Illusion of Different Shape!
7:36
Types of Problems
13:45
Example: Distribution of SAT Scores
13:46
Shape Analogy
19:48
Shape Analogy
19:49
Example 1: The Standard Normal Distribution and Z-Scores
22:34
Example 2: The Standard Normal Distribution and Z-Scores
25:54
Example 3: Sketching and Normal Distribution
28:55
Example 4: Sketching and Normal Distribution
32:32
Standard Normal Distributions & Z-Scores

41m 44s

Intro
0:00
0:06
0:07
A Family of Distributions
0:28
Infinite Set of Distributions
0:29
Transforming Normal Distributions to 'Standard' Normal Distribution
1:04
Normal Distribution vs. Standard Normal Distribution
2:58
Normal Distribution vs. Standard Normal Distribution
2:59
Z-Score, Raw Score, Mean, & SD
4:08
Z-Score, Raw Score, Mean, & SD
4:09
Weird Z-Scores
9:40
Weird Z-Scores
9:41
Excel
16:45
For Normal Distributions
16:46
For Standard Normal Distributions
19:11
Excel Example
20:24
Types of Problems
25:18
Percentage Problem: P(x)
25:19
Raw Score and Z-Score Problems
26:28
Standard Deviation Problems
27:01
Shape Analogy
27:44
Shape Analogy
27:45
Example 1: Deaths Due to Heart Disease vs. Deaths Due to Cancer
28:24
Example 2: Heights of Male College Students
33:15
Example 3: Mean and Standard Deviation
37:14
Example 4: Finding Percentage of Values in a Standard Normal Distribution
37:49
Normal Distribution: PDF vs. CDF

55m 44s

Intro
0:00
0:15
0:16
Frequency vs. Cumulative Frequency
0:56
Frequency vs. Cumulative Frequency
0:57
Frequency vs. Cumulative Frequency
4:32
Frequency vs. Cumulative Frequency Cont.
4:33
Calculus in Brief
6:21
Derivative-Integral Continuum
6:22
PDF
10:08
PDF for Standard Normal Distribution
10:09
PDF for Normal Distribution
14:32
Integral of PDF = CDF
21:27
Integral of PDF = CDF
21:28
Example 1: Cumulative Frequency Graph
23:31
Example 2: Mean, Standard Deviation, and Probability
24:43
Example 3: Mean and Standard Deviation
35:50
Example 4: Age of Cars
49:32
Section 5: Linear Regression
Scatterplots

47m 19s

Intro
0:00
0:04
0:05
Previous Visualizations
0:30
Frequency Distributions
0:31
Compare & Contrast
2:26
Frequency Distributions Vs. Scatterplots
2:27
Summary Values
4:53
Shape
4:54
Center & Trend
6:41
8:22
Univariate & Bivariate
10:25
Example Scatterplot
10:48
Shape, Trend, and Strength
10:49
Positive and Negative Association
14:05
Positive and Negative Association
14:06
Linearity, Strength, and Consistency
18:30
Linearity
18:31
Strength
19:14
Consistency
20:40
Summarizing a Scatterplot
22:58
Summarizing a Scatterplot
22:59
Example 1: Gapminder.org, Income x Life Expectancy
26:32
Example 2: Gapminder.org, Income x Infant Mortality
36:12
Example 3: Trend and Strength of Variables
40:14
Example 4: Trend, Strength and Shape for Scatterplots
43:27
Regression

32m 2s

Intro
0:00
0:05
0:06
Linear Equations
0:34
Linear Equations: y = mx + b
0:35
Rough Line
5:16
Rough Line
5:17
Regression - A 'Center' Line
7:41
Reasons for Summarizing with a Regression Line
7:42
Predictor and Response Variable
10:04
Goal of Regression
12:29
Goal of Regression
12:30
Prediction
14:50
Example: Servings of Milk Per Year Shown By Age
14:51
Interpolation
17:06
Extrapolation
17:58
Error in Prediction
20:34
Prediction Error
20:35
Residual
21:40
Example 1: Residual
23:34
Example 2: Large and Negative Residual
26:30
Example 3: Positive Residual
28:13
Example 4: Interpret Regression Line & Extrapolate
29:40
Least Squares Regression

56m 36s

Intro
0:00
0:13
0:14
Best Fit
0:47
Best Fit
0:48
Sum of Squared Errors (SSE)
1:50
Sum of Squared Errors (SSE)
1:51
Why Squared?
3:38
Why Squared?
3:39
Quantitative Properties of Regression Line
4:51
Quantitative Properties of Regression Line
4:52
So How do we Find Such a Line?
6:49
SSEs of Different Line Equations & Lowest SSE
6:50
Carl Gauss' Method
8:01
How Do We Find Slope (b1)
11:00
How Do We Find Slope (b1)
11:01
How Do We Find Intercept
15:11
How Do We Find Intercept
15:12
Example 1: Which of These Equations Fit the Above Data Best?
17:18
Example 2: Find the Regression Line for These Data Points and Interpret It
26:31
Example 3: Summarize the Scatterplot and Find the Regression Line.
34:31
Example 4: Examine the Mean of Residuals
43:52
Correlation

43m 58s

Intro
0:00
0:05
0:06
Summarizing a Scatterplot Quantitatively
0:47
Shape
0:48
Trend
1:11
Strength: Correlation (r)
1:45
Correlation Coefficient (r)
2:30
Correlation Coefficient (r)
2:31
Trees vs. Forest
11:59
Trees vs. Forest
12:00
Calculating r
15:07
Average Product of z-scores for x and y
15:08
Relationship between Correlation and Slope
21:10
Relationship between Correlation and Slope
21:11
Example 1: Find the Correlation between Grams of Fat and Cost
24:11
Example 2: Relationship between r and b1
30:24
Example 3: Find the Regression Line
33:35
Example 4: Find the Correlation Coefficient for this Set of Data
37:37
Correlation: r vs. r-squared

52m 52s

Intro
0:00
0:07
0:08
R-squared
0:44
What is the Meaning of It? Why Squared?
0:45
Parsing Sum of Squares (Parsing Variability)
2:25
SST = SSR + SSE
2:26
What is SST and SSE?
7:46
What is SST and SSE?
7:47
r-squared
18:33
Coefficient of Determination
18:34
If the Correlation is Strong…
20:25
If the Correlation is Strong…
20:26
If the Correlation is Weak…
22:36
If the Correlation is Weak…
22:37
Example 1: Find r-squared for this Set of Data
23:56
Example 2: What Does it Mean that the Simple Linear Regression is a 'Model' of Variance?
33:54
Example 3: Why Does r-squared Only Range from 0 to 1
37:29
Example 4: Find the r-squared for This Set of Data
39:55
Transformations of Data

27m 8s

Intro
0:00
0:05
0:06
Why Transform?
0:26
Why Transform?
0:27
Shape-preserving vs. Shape-changing Transformations
5:14
Shape-preserving = Linear Transformations
5:15
Shape-changing Transformations = Non-linear Transformations
6:20
Common Shape-Preserving Transformations
7:08
Common Shape-Preserving Transformations
7:09
Common Shape-Changing Transformations
8:59
Powers
9:00
Logarithms
9:39
Change Just One Variable? Both?
10:38
Log-log Transformations
10:39
Log Transformations
14:38
Example 1: Create, Graph, and Transform the Data Set
15:19
Example 2: Create, Graph, and Transform the Data Set
20:08
Example 3: What Kind of Model would You Choose for this Data?
22:44
Example 4: Transformation of Data
25:46
Section 6: Collecting Data in an Experiment
Sampling & Bias

54m 44s

Intro
0:00
0:05
0:06
Descriptive vs. Inferential Statistics
1:04
Descriptive Statistics: Data Exploration
1:05
Example
2:03
To tackle Generalization…
4:31
Generalization
4:32
Sampling
6:06
'Good' Sample
6:40
Defining Samples and Populations
8:55
Population
8:56
Sample
11:16
Why Use Sampling?
13:09
Why Use Sampling?
13:10
Goal of Sampling: Avoiding Bias
15:04
What is Bias?
15:05
Where does Bias Come from: Sampling Bias
17:53
Where does Bias Come from: Response Bias
18:27
Sampling Bias: Bias from 'Bad' Sampling Methods
19:34
Size Bias
19:35
Voluntary Response Bias
21:13
Convenience Sample
22:22
Judgment Sample
23:58
25:40
Response Bias: Bias from 'Bad' Data Collection Methods
28:00
Nonresponse Bias
29:31
Questionnaire Bias
31:10
Incorrect Response or Measurement Bias
37:32
Example 1: What Kind of Biases?
40:29
Example 2: What Biases Might Arise?
44:46
Example 3: What Kind of Biases?
48:34
Example 4: What Kind of Biases?
51:43
Sampling Methods

14m 25s

Intro
0:00
0:05
0:06
Biased vs. Unbiased Sampling Methods
0:32
Biased Sampling
0:33
Unbiased Sampling
1:13
Probability Sampling Methods
2:31
Simple Random
2:54
Stratified Random Sampling
4:06
Cluster Sampling
5:24
Two-stage Sampling
6:22
Systematic Sampling
7:25
8:33
Example 2: Describe How to Take a Two-Stage Sample from this Book
10:16
Example 3: Sampling Methods
11:58
Example 4: Cluster Sample Plan
12:48
Research Design

53m 54s

Intro
0:00
0:06
0:07
Descriptive vs. Inferential Statistics
0:51
Descriptive Statistics: Data Exploration
0:52
Inferential Statistics
1:02
Variables and Relationships
1:44
Variables
1:45
Relationships
2:49
Not Every Type of Study is an Experiment…
4:16
Category I - Descriptive Study
4:54
Category II - Correlational Study
5:50
Category III - Experimental, Quasi-experimental, Non-experimental
6:33
Category III
7:42
Experimental, Quasi-experimental, and Non-experimental
7:43
Why CAN'T the Other Strategies Determine Causation?
10:18
Third-variable Problem
10:19
Directionality Problem
15:49
What Makes Experiments Special?
17:54
Manipulation
17:55
Control (and Comparison)
21:58
Methods of Control
26:38
Holding Constant
26:39
Matching
29:11
Random Assignment
31:48
Experiment Terminology
34:09
'true' Experiment vs. Study
34:10
Independent Variable (IV)
35:16
Dependent Variable (DV)
35:45
Factors
36:07
Treatment Conditions
36:23
Levels
37:43
Confounds or Extraneous Variables
38:04
Blind
38:38
Blind Experiments
38:39
Double-blind Experiments
39:29
How Categories Relate to Statistics
41:35
Category I - Descriptive Study
41:36
Category II - Correlational Study
42:05
Category III - Experimental, Quasi-experimental, Non-experimental
42:43
Example 1: Research Design
43:50
Example 2: Research Design
47:37
Example 3: Research Design
50:12
Example 4: Research Design
52:00
Between and Within Treatment Variability

41m 31s

Intro
0:00
0:06
0:07
Experimental Designs
0:51
Experimental Designs: Manipulation & Control
0:52
Two Types of Variability
2:09
Between Treatment Variability
2:10
Within Treatment Variability
3:31
Updated Goal of Experimental Design
5:47
Updated Goal of Experimental Design
5:48
Example: Drugs and Driving
6:56
Example: Drugs and Driving
6:57
Different Types of Random Assignment
11:27
All Experiments
11:28
Completely Random Design
12:02
Randomized Block Design
13:19
Randomized Block Design
15:48
Matched Pairs Design
15:49
Repeated Measures Design
19:47
Between-subject Variable vs. Within-subject Variable
22:43
Completely Randomized Design
22:44
Repeated Measures Design
25:03
Example 1: Design a Completely Random, Matched Pair, and Repeated Measures Experiment
26:16
Example 2: Block Design
31:41
Example 3: Completely Randomized Designs
35:11
Example 4: Completely Random, Matched Pairs, or Repeated Measures Experiments?
39:01
Section 7: Review of Probability Axioms
Sample Spaces

37m 52s

Intro
0:00
0:07
0:08
Why is Probability Involved in Statistics
0:48
Probability
0:49
Can People Tell the Difference between Cheap and Gourmet Coffee?
2:08
Taste Test with Coffee Drinkers
3:37
If No One can Actually Taste the Difference
3:38
If Everyone can Actually Taste the Difference
5:36
Creating a Probability Model
7:09
Creating a Probability Model
7:10
D'Alembert vs. Necker
9:41
D'Alembert vs. Necker
9:42
Problem with D'Alembert's Model
13:29
Problem with D'Alembert's Model
13:30
Covering Entire Sample Space
15:08
Fundamental Principle of Counting
15:09
Where Do Probabilities Come From?
22:54
Observed Data, Symmetry, and Subjective Estimates
22:55
Checking whether Model Matches Real World
24:27
Law of Large Numbers
24:28
Example 1: Law of Large Numbers
27:46
Example 2: Possible Outcomes
30:43
Example 3: Brands of Coffee and Taste
33:25
Example 4: How Many Different Treatments are there?
35:33

20m 29s

Intro
0:00
0:08
0:09
Disjoint Events
0:41
Disjoint Events
0:42
Meaning of 'or'
2:39
In Regular Life
2:40
In Math/Statistics/Computer Science
3:10
3:55
If A and B are Disjoint: P (A and B)
3:56
If A and B are Disjoint: P (A or B)
5:15
5:41
5:42
8:31
If A and B are not Disjoint: P (A or B)
8:32
Example 1: Which of These are Mutually Exclusive?
10:50
Example 2: What is the Probability that You will Have a Combination of One Heads and Two Tails?
12:57
Example 3: Engagement Party
15:17
Example 4: Home Owner's Insurance
18:30
Conditional Probability

57m 19s

Intro
0:00
0:05
0:06
'or' vs. 'and' vs. Conditional Probability
1:07
'or' vs. 'and' vs. Conditional Probability
1:08
'and' vs. Conditional Probability
5:57
P (M or L)
5:58
P (M and L)
8:41
P (M|L)
11:04
P (L|M)
12:24
Tree Diagram
15:02
Tree Diagram
15:03
Defining Conditional Probability
22:42
Defining Conditional Probability
22:43
Common Contexts for Conditional Probability
30:56
Medical Testing: Positive Predictive Value
30:57
Medical Testing: Sensitivity
33:03
Statistical Tests
34:27
Example 1: Drug and Disease
36:41
Example 2: Marbles and Conditional Probability
40:04
Example 3: Cards and Conditional Probability
45:59
Example 4: Votes and Conditional Probability
50:21
Independent Events

24m 27s

Intro
0:00
0:05
0:06
Independent Events & Conditional Probability
0:26
Non-independent Events
0:27
Independent Events
2:00
Non-independent and Independent Events
3:08
Non-independent and Independent Events
3:09
Defining Independent Events
5:52
Defining Independent Events
5:53
Multiplication Rule
7:29
Previously…
7:30
But with Independent Events
8:53
Example 1: Which of These Pairs of Events are Independent?
11:12
Example 2: Health Insurance and Probability
15:12
Example 3: Independent Events
17:42
Example 4: Independent Events
20:03
Section 8: Probability Distributions
Introduction to Probability Distributions

56m 45s

Intro
0:00
0:08
0:09
Sampling vs. Probability
0:57
Sampling
0:58
Missing
1:30
What is Missing?
3:06
Insight: Probability Distributions
5:26
Insight: Probability Distributions
5:27
What is a Probability Distribution?
7:29
From Sample Spaces to Probability Distributions
8:44
Sample Space
8:45
Probability Distribution of the Sum of Two Dice
11:16
The Random Variable
17:43
The Random Variable
17:44
Expected Value
21:52
Expected Value
21:53
Example 1: Probability Distributions
28:45
Example 2: Probability Distributions
35:30
Example 3: Probability Distributions
43:37
Example 4: Probability Distributions
47:20
Expected Value & Variance of Probability Distributions

53m 41s

Intro
0:00
0:06
0:07
Discrete vs. Continuous Random Variables
1:04
Discrete vs. Continuous Random Variables
1:05
Mean and Variance Review
4:44
Mean: Sample, Population, and Probability Distribution
4:45
Variance: Sample, Population, and Probability Distribution
9:12
Example Situation
14:10
Example Situation
14:11
Some Special Cases…
16:13
Some Special Cases…
16:14
Linear Transformations
19:22
Linear Transformations
19:23
What Happens to Mean and Variance of the Probability Distribution?
20:12
n Independent Values of X
25:38
n Independent Values of X
25:39
Compare These Two Situations
30:56
Compare These Two Situations
30:57
Two Random Variables, X and Y
32:02
Two Random Variables, X and Y
32:03
Example 1: Expected Value & Variance of Probability Distributions
35:35
Example 2: Expected Values & Standard Deviation
44:17
Example 3: Expected Winnings and Standard Deviation
48:18
Binomial Distribution

55m 15s

Intro
0:00
0:05
0:06
Discrete Probability Distributions
1:42
Discrete Probability Distributions
1:43
Binomial Distribution
2:36
Binomial Distribution
2:37
Multiplicative Rule Review
6:54
Multiplicative Rule Review
6:55
How Many Outcomes with k 'Successes'
10:23
Adults and Bachelor's Degree: Manual List of Outcomes
10:24
P (X=k)
19:37
Putting Together # of Outcomes with the Multiplicative Rule
19:38
Expected Value and Standard Deviation in a Binomial Distribution
25:22
Expected Value and Standard Deviation in a Binomial Distribution
25:23
Example 1: Coin Toss
33:42
38:03
Example 3: Types of Blood and Probability
45:39
Example 4: Expected Number and Standard Deviation
51:11
Section 9: Sampling Distributions of Statistics
Introduction to Sampling Distributions

48m 17s

Intro
0:00
0:08
0:09
Probability Distributions vs. Sampling Distributions
0:55
Probability Distributions vs. Sampling Distributions
0:56
Same Logic
3:55
Logic of Probability Distribution
3:56
Example: Rolling Two Dice
6:56
Simulating Samples
9:53
To Come Up with Probability Distributions
9:54
In Sampling Distributions
11:12
Connecting Sampling and Research Methods with Sampling Distributions
12:11
Connecting Sampling and Research Methods with Sampling Distributions
12:12
Simulating a Sampling Distribution
14:14
Experimental Design: Regular Sleep vs. Less Sleep
14:15
Logic of Sampling Distributions
23:08
Logic of Sampling Distributions
23:09
General Method of Simulating Sampling Distributions
25:38
General Method of Simulating Sampling Distributions
25:39
Questions that Remain
28:45
Questions that Remain
28:46
Example 1: Mean and Standard Error of Sampling Distribution
30:57
Example 2: What is the Best Way to Describe Sampling Distributions?
37:12
Example 3: Matching Sampling Distributions
38:21
Example 4: Mean and Standard Error of Sampling Distribution
41:51
Sampling Distribution of the Mean

1h 8m 48s

Intro
0:00
0:05
0:06
Special Case of General Method for Simulating a Sampling Distribution
1:53
Special Case of General Method for Simulating a Sampling Distribution
1:54
Computer Simulation
3:43
Using Simulations to See Principles behind Shape of SDoM
15:50
Using Simulations to See Principles behind Shape of SDoM
15:51
Conditions
17:38
Using Simulations to See Principles behind Center (Mean) of SDoM
20:15
Using Simulations to See Principles behind Center (Mean) of SDoM
20:16
Conditions: Does n Matter?
21:31
Conditions: Does Number of Simulations Matter?
24:37
Using Simulations to See Principles behind Standard Deviation of SDoM
27:13
Using Simulations to See Principles behind Standard Deviation of SDoM
27:14
Conditions: Does n Matter?
34:45
Conditions: Does Number of Simulations Matter?
36:24
Central Limit Theorem
37:13
SHAPE
38:08
CENTER
39:34
39:52
Comparing Population, Sample, and SDoM
43:10
Comparing Population, Sample, and SDoM
43:11
48:24
What Happens When We Don't Know What the Population Looks Like?
48:25
Can We Have Sampling Distributions for Summary Statistics Other than the Mean?
49:42
How Do We Know whether a Sample is Sufficiently Unlikely?
53:36
Do We Always Have to Simulate a Large Number of Samples in Order to get a Sampling Distribution?
54:40
Example 1: Mean Batting Average
55:25
Example 2: Mean Sampling Distribution and Standard Error
59:07
Example 3: Sampling Distribution of the Mean
1:01:04
Sampling Distribution of Sample Proportions

54m 37s

Intro
0:00
0:06
0:07
Intro to Sampling Distribution of Sample Proportions (SDoSP)
0:51
Categorical Data (Examples)
0:52
Wish to Estimate Proportion of Population from Sample…
2:00
Notation
3:34
Population Proportion and Sample Proportion Notations
3:35
What's the Difference?
9:19
SDoM vs. SDoSP: Type of Data
9:20
SDoM vs. SDoSP: Shape
11:24
SDoM vs. SDoSP: Center
12:30
15:34
Binomial Distribution vs. Sampling Distribution of Sample Proportions
19:14
Binomial Distribution vs. SDoSP: Type of Data
19:17
Binomial Distribution vs. SDoSP: Shape
21:07
Binomial Distribution vs. SDoSP: Center
21:43
24:08
Example 1: Sampling Distribution of Sample Proportions
26:07
Example 2: Sampling Distribution of Sample Proportions
37:58
Example 3: Sampling Distribution of Sample Proportions
44:42
Example 4: Sampling Distribution of Sample Proportions
45:57
Section 10: Inferential Statistics
Introduction to Confidence Intervals

42m 53s

Intro
0:00
0:06
0:07
Inferential Statistics
0:50
Inferential Statistics
0:51
Two Problems with This Picture…
3:20
Two Problems with This Picture…
3:21
Solution: Confidence Intervals (CI)
4:59
Solution: Hypothesis Testing (HT)
5:49
Which Parameters are Known?
6:45
Which Parameters are Known?
6:46
Confidence Interval - Goal
7:56
When We Don't Know μ but Know σ
7:57
When We Don't Know
18:27
When We Don't Know μ nor σ
18:28
Example 1: Confidence Intervals
26:18
Example 2: Confidence Intervals
29:46
Example 3: Confidence Intervals
32:18
Example 4: Confidence Intervals
38:31
t Distributions

1h 2m 6s

Intro
0:00
0:04
0:05
When to Use z vs. t?
1:07
When to Use z vs. t?
1:08
What is z and t?
3:02
z-score and t-score: Commonality
3:03
z-score and t-score: Formulas
3:34
z-score and t-score: Difference
5:22
Why not z? (Why t?)
7:24
Why not z? (Why t?)
7:25
But Don't Worry!
15:13
Gosset and t-distributions
15:14
Rules of t Distributions
17:05
t-distributions are More Normal as n Gets Bigger
17:06
t-distributions are a Family of Distributions
18:55
Degrees of Freedom (df)
20:02
Degrees of Freedom (df)
20:03
t Family of Distributions
24:07
t Family of Distributions: df = 2, 4, and 60
24:08
df = 60
29:16
df = 2
29:59
How to Find It?
31:01
'Student's t-distribution' or 't-distribution'
31:02
Excel Example
33:06
Example 1: Which Distribution Do You Use? Z or t?
45:26
47:41
Example 3: t Distributions
52:15
Example 4: t Distributions, Confidence Interval, and Mean
55:59
Introduction to Hypothesis Testing

1h 6m 33s

Intro
0:00
0:06
0:07
Issues to Overcome in Inferential Statistics
1:35
Issues to Overcome in Inferential Statistics
1:36
What Happens When We Don't Know What the Population Looks Like?
2:57
How Do We Know whether a Sample is Sufficiently Unlikely?
3:43
Hypothesizing a Population
6:44
Hypothesizing a Population
6:45
Null Hypothesis
8:07
Alternative Hypothesis
8:56
Hypotheses
11:58
Hypotheses
11:59
Errors in Hypothesis Testing
14:22
Errors in Hypothesis Testing
14:23
Steps of Hypothesis Testing
21:15
Steps of Hypothesis Testing
21:16
Single Sample HT (When Sigma Available)
26:08
26:09
Step 1
27:08
Step 2
27:58
Step 3
28:17
Step 4
32:18
Single Sample HT (When Sigma Not Available)
36:33
36:34
Step 1: Hypothesis Testing
36:58
Step 2: Significance Level
37:25
Step 3: Decision Stage
37:40
Step 4: Sample
41:36
Sigma and p-value
45:04
Sigma and p-value
45:05
One-Tailed vs. Two-Tailed Hypotheses
45:51
Example 1: Hypothesis Testing
48:37
Example 2: Heights of Women in the US
57:43
Example 3: Select the Best Way to Complete This Sentence
1:03:23
Confidence Intervals for the Difference of Two Independent Means

55m 14s

Intro
0:00
0:14
0:15
One Mean vs. Two Means
1:17
One Mean vs. Two Means
1:18
Notation
2:41
A Sample! A Set!
2:42
Mean of X, Mean of Y, and Difference of Two Means
3:56
SE of X
4:34
SE of Y
6:28
Sampling Distribution of the Difference between Two Means (SDoD)
7:48
Sampling Distribution of the Difference between Two Means (SDoD)
7:49
Rules of the SDoD (similar to CLT!)
15:00
Mean for the SDoD Null Hypothesis
15:01
Standard Error
17:39
When can We Construct a CI for the Difference between Two Means?
21:28
Three Conditions
21:29
Finding CI
23:56
One Mean CI
23:57
Two Means CI
25:45
Finding t
29:16
Finding t
29:17
Interpreting CI
30:25
Interpreting CI
30:26
Better Estimate of s (s pool)
34:15
Better Estimate of s (s pool)
34:16
Example 1: Confidence Intervals
42:32
Example 2: SE of the Difference
52:36
Hypothesis Testing for the Difference of Two Independent Means

50m

Intro
0:00
0:06
0:07
The Goal of Hypothesis Testing
0:56
One Sample and Two Samples
0:57
Sampling Distribution of the Difference between Two Means (SDoD)
3:42
Sampling Distribution of the Difference between Two Means (SDoD)
3:43
Rules of the SDoD (Similar to CLT!)
6:46
Shape
6:47
Mean for the Null Hypothesis
7:26
Standard Error for Independent Samples (When Variance is Homogeneous)
8:18
Standard Error for Independent Samples (When Variance is not Homogeneous)
9:25
Same Conditions for HT as for CI
10:08
Three Conditions
10:09
Steps of Hypothesis Testing
11:04
Steps of Hypothesis Testing
11:05
Formulas that Go with Steps of Hypothesis Testing
13:21
Step 1
13:25
Step 2
14:18
Step 3
15:00
Step 4
16:57
Example 1: Hypothesis Testing for the Difference of Two Independent Means
18:47
Example 2: Hypothesis Testing for the Difference of Two Independent Means
33:55
Example 3: Hypothesis Testing for the Difference of Two Independent Means
44:22
Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

1h 14m 11s

Intro
0:00
0:09
0:10
The Goal of Hypothesis Testing
1:27
One Sample and Two Samples
1:28
Independent Samples vs. Paired Samples
3:16
Independent Samples vs. Paired Samples
3:17
Which is Which?
5:20
Independent SAMPLES vs. Independent VARIABLES
7:43
Independent SAMPLES vs. Independent VARIABLES
7:44
T-tests Always…
10:48
T-tests Always…
10:49
Notation for Paired Samples
12:59
Notation for Paired Samples
13:00
Steps of Hypothesis Testing for Paired Samples
16:13
Steps of Hypothesis Testing for Paired Samples
16:14
Rules of the SDoD (Adding on Paired Samples)
18:03
Shape
18:04
Mean for the Null Hypothesis
18:31
Standard Error for Independent Samples (When Variance is Homogenous)
19:25
Standard Error for Paired Samples
20:39
Formulas that go with Steps of Hypothesis Testing
22:59
Formulas that go with Steps of Hypothesis Testing
23:00
Confidence Intervals for Paired Samples
30:32
Confidence Intervals for Paired Samples
30:33
Example 1: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
32:28
Example 2: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
44:02
Example 3: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
52:23
Type I and Type II Errors

31m 27s

Intro
0:00
0:18
0:19
Errors and Relationship to HT and the Sample Statistic?
1:11
Errors and Relationship to HT and the Sample Statistic?
1:12
7:00
One Sample t-test: Friends on Facebook
7:01
Two Sample t-test: Friends on Facebook
13:46
Usually, Lots of Overlap between Null and Alternative Distributions
16:59
Overlap between Null and Alternative Distributions
17:00
How Distributions and 'Box' Fit Together
22:45
How Distributions and 'Box' Fit Together
22:46
Example 1: Types of Errors
25:54
Example 2: Types of Errors
27:30
Example 3: What is the Danger of the Type I Error?
29:38
Effect Size & Power

44m 41s

Intro
0:00
0:05
0:06
Distance between Distributions: Sample t
0:49
Distance between Distributions: Sample t
0:50
Problem with Distance in Terms of Standard Error
2:56
Problem with Distance in Terms of Standard Error
2:57
Test Statistic (t) vs. Effect Size (d or g)
4:38
Test Statistic (t) vs. Effect Size (d or g)
4:39
Rules of Effect Size
6:09
Rules of Effect Size
6:10
Why Do We Need Effect Size?
8:21
Tells You the Practical Significance
8:22
HT can be Deceiving…
10:25
Important Note
10:42
What is Power?
11:20
What is Power?
11:21
Why Do We Need Power?
14:19
Conditional Probability and Power
14:20
Power is:
16:27
Can We Calculate Power?
19:00
Can We Calculate Power?
19:01
How Does Alpha Affect Power?
20:36
How Does Alpha Affect Power?
20:37
How Does Effect Size Affect Power?
25:38
How Does Effect Size Affect Power?
25:39
How Does Variability and Sample Size Affect Power?
27:56
How Does Variability and Sample Size Affect Power?
27:57
How Do We Increase Power?
32:47
Increasing Power
32:48
Example 1: Effect Size & Power
35:40
Example 2: Effect Size & Power
37:38
Example 3: Effect Size & Power
40:55
Section 11: Analysis of Variance
F-distributions

24m 46s

Intro
0:00
0:04
0:05
Z- & T-statistic and Their Distribution
0:34
Z- & T-statistic and Their Distribution
0:35
F-statistic
4:55
The F Ratio (the Variance Ratio)
4:56
F-distribution
12:29
F-distribution
12:30
s and p-value
15:00
s and p-value
15:01
Example 1: Why Does F-distribution Stop At 0 But Go On Until Infinity?
18:33
Example 2: F-distributions
19:29
Example 3: F-distributions and Heights
21:29
ANOVA with Independent Samples

1h 9m 25s

Intro
0:00
0:05
0:06
The Limitations of t-tests
1:12
The Limitations of t-tests
1:13
Two Major Limitations of Many t-tests
3:26
Two Major Limitations of Many t-tests
3:27
Ronald Fisher's Solution… F-test! New Null Hypothesis
4:43
Ronald Fisher's Solution… F-test! New Null Hypothesis (Omnibus Test - One Test to Rule Them All!)
4:44
Analysis of Variance (ANoVA) Notation
7:47
Analysis of Variance (ANoVA) Notation
7:48
Partitioning (Analyzing) Variance
9:58
Total Variance
9:59
Within-group Variation
14:00
Between-group Variation
16:22
Time out: Review Variance & SS
17:05
Time out: Review Variance & SS
17:06
F-statistic
19:22
The F Ratio (the Variance Ratio)
19:23
S²bet = SSbet / dfbet
22:13
What is This?
22:14
How Many Means?
23:20
So What is the dfbet?
23:38
So What is SSbet?
24:15
S²w = SSw / dfw
26:05
What is This?
26:06
How Many Means?
27:20
So What is the dfw?
27:36
So What is SSw?
28:18
Chart of Independent Samples ANOVA
29:25
Chart of Independent Samples ANOVA
29:26
Example 1: Who Uploads More Photos: Unknown Ethnicity, Latino, Asian, Black, or White Facebook Users?
35:52
Hypotheses
35:53
Significance Level
39:40
Decision Stage
40:05
Calculate Samples' Statistic and p-Value
44:10
Reject or Fail to Reject H0
55:54
Example 2: ANOVA with Independent Samples
58:21
Repeated Measures ANOVA

1h 15m 13s

Intro
0:00
0:05
0:06
The Limitations of t-tests
0:36
Who Uploads more Pictures and Which Photo-Type is Most Frequently Used on Facebook?
0:37
ANOVA (F-test) to the Rescue!
5:49
Omnibus Hypothesis
5:50
Analyze Variance
7:27
Independent Samples vs. Repeated Measures
9:12
Same Start
9:13
Independent Samples ANOVA
10:43
Repeated Measures ANOVA
12:00
Independent Samples ANOVA
16:00
Same Start: All the Variance Around Grand Mean
16:01
Independent Samples
16:23
Repeated Measures ANOVA
18:18
Same Start: All the Variance Around Grand Mean
18:19
Repeated Measures
18:33
Repeated Measures F-statistic
21:22
The F Ratio (The Variance Ratio)
21:23
S²bet = SSbet / dfbet
23:07
What is This?
23:08
How Many Means?
23:39
So What is the dfbet?
23:54
So What is SSbet?
24:32
S² resid = SS resid / df resid
25:46
What is This?
25:47
So What is SS resid?
26:44
So What is the df resid?
27:36
SS subj and df subj
28:11
What is This?
28:12
How Many Subject Means?
29:43
So What is df subj?
30:01
So What is SS subj?
30:09
SS total and df total
31:42
What is This?
31:43
What is the Total Number of Data Points?
32:02
So What is df total?
32:34
So What is SS total?
32:47
Chart of Repeated Measures ANOVA
33:19
Chart of Repeated Measures ANOVA: F and Between-samples Variability
33:20
Chart of Repeated Measures ANOVA: Total Variability, Within-subject (case) Variability, Residual Variability
35:50
Example 1: Which is More Prevalent on Facebook: Tagged, Uploaded, Mobile, or Profile Photos?
40:25
Hypotheses
40:26
Significance Level
41:46
Decision Stage
42:09
Calculate Samples' Statistic and p-Value
46:18
Reject or Fail to Reject H0
57:55
Example 2: Repeated Measures ANOVA
58:57
Example 3: What's the Problem with a Bunch of Tiny t-tests?
1:13:59
Section 12: Chi-square Test
Chi-Square Goodness-of-Fit Test

58m 23s

Intro
0:00
0:05
0:06
Where Does the Chi-Square Test Belong?
0:50
Where Does the Chi-Square Test Belong?
0:51
A New Twist on HT: Goodness-of-Fit
7:23
HT in General
7:24
Goodness-of-Fit HT
8:26
12:17
Null Hypothesis
12:18
Alternative Hypothesis
13:23
Example
14:38
Chi-Square Statistic
17:52
Chi-Square Statistic
17:53
Chi-Square Distributions
24:31
Chi-Square Distributions
24:32
Conditions for Chi-Square
28:58
Condition 1
28:59
Condition 2
30:20
Condition 3
30:32
Condition 4
31:47
Example 1: Chi-Square Goodness-of-Fit Test
32:23
Example 2: Chi-Square Goodness-of-Fit Test
44:34
Example 3: Which of These Statements Describe Properties of the Chi-Square Goodness-of-Fit Test?
56:06
Chi-Square Test of Homogeneity

51m 36s

Intro
0:00
0:09
0:10
Goodness-of-Fit vs. Homogeneity
1:13
Goodness-of-Fit HT
1:14
Homogeneity
2:00
Analogy
2:38
5:00
Null Hypothesis
5:01
Alternative Hypothesis
6:11
Example
6:33
Chi-Square Statistic
10:12
Same as Goodness-of-Fit Test
10:13
Set Up Data
12:28
Setting Up Data Example
12:29
Expected Frequency
16:53
Expected Frequency
16:54
Chi-Square Distributions & df
19:26
Chi-Square Distributions & df
19:27
Conditions for Test of Homogeneity
20:54
Condition 1
20:55
Condition 2
21:39
Condition 3
22:05
Condition 4
22:23
Example 1: Chi-Square Test of Homogeneity
22:52
Example 2: Chi-Square Test of Homogeneity
32:10
Section 13: Overview of Statistics
Overview of Statistics

18m 11s

Intro
0:00
0:07
0:08
The Statistical Tests (HT) We've Covered
0:28
The Statistical Tests (HT) We've Covered
0:29
Organizing the Tests We've Covered…
1:08
One Sample: Continuous DV and Categorical DV
1:09
Two Samples: Continuous DV and Categorical DV
5:41
More Than Two Samples: Continuous DV and Categorical DV
8:21
The Following Data: OK Cupid
10:10
The Following Data: OK Cupid
10:11
Example 1: Weird-MySpace-Angle Profile Photo
10:38
Example 2: Geniuses
12:30
Example 3: Promiscuous iPhone Users
13:37
Example 4: Women, Aging, and Messaging
16:07
1 answer. Last reply by: Professor Son, Mon Jul 11, 2016 3:56 PM
Post by Martin Lau on June 25, 2014
Hi Dr. Son, could you please explain why, for the binomial distribution, as n increases, the standard deviation increases, not decreases? Thank you in advance. Martin

0 answers
Post by Alexandra Vazquez on December 4, 2012
You're a lifesaver. Best teacher ever.

### Sampling Distribution of Sample Proportions

Lecture Slides are screen-captured images of important points in the lecture. Students can download and print out these lecture slide images to do practice problems as well as take notes while watching the lecture.

• Intro 0:00
• Intro to Sampling Distribution of Sample Proportions (SDoSP) 0:51
• Categorical Data (Examples)
• Wish to Estimate Proportion of Population from Sample…
• Notation 3:34
• Population Proportion and Sample Proportion Notations
• What's the Difference? 9:19
• SDoM vs. SDoSP: Type of Data
• SDoM vs. SDoSP: Shape
• SDoM vs. SDoSP: Center
• Binomial Distribution vs. Sampling Distribution of Sample Proportions 19:14
• Binomial Distribution vs. SDoSP: Type of Data
• Binomial Distribution vs. SDoSP: Shape
• Binomial Distribution vs. SDoSP: Center
• Binomial Distribution vs. SDoSP: Spread
• Example 1: Sampling Distribution of Sample Proportions 26:07
• Example 2: Sampling Distribution of Sample Proportions 37:58
• Example 3: Sampling Distribution of Sample Proportions 44:42
• Example 4: Sampling Distribution of Sample Proportions 45:57

### Transcription: Sampling Distribution of Sample Proportions

Hi and welcome to www.educator.com.0000

Today we are going to be talking about sampling distribution of sample proportions.0001

First thing, we are going to do is just introduce ourselves to the concept of sampling distribution of sample proportions.0006

This is just me this is not like everybody in statistics but I am going to call it SDOS for short.0014

We do not have to keep writing out sampling distribution of sample proportions.0019

Then we are going to go through some notation and then finally we are going to compare and contrast the SDOS to the SDOM.0024

They are both sampling distributions, but one is of the mean and the other is of sample proportions.0034

Then we are going to compare and contrast the SDOS with the binomial distribution,0040

the probability distribution that we looked at a couple of lessons back.0045

What is this thing?0050

What is the SDOS?0054

First this concept is going to come into play whenever we collect some sort of categorical data.0056

For instance, we might ask a sample of citizens, "Do you approve or disapprove of the president?"0063

That would be a categorical response.0070

They are not saying I approve this much, but they are just saying I approve or disapprove.0073

At the end of that data collection what you get is not a mean or a median0079

but you get something like a proportion of citizens who believe the president is doing a good job.0086

Something like 43%, 64%, 29%.0091

Or another example might be proportion of students who plagiarized on a paper before.0097

Here we are getting proportions.0104

They are not means or medians.0107

They are just percentages of the entire sample.0109

Finally, another one that has been talked about a lot these days is the proportion of people covered under healthcare.0112

When we collect this categorical data oftentimes we want to use that in order to estimate the proportion0119

of the population that actually is covered by health care, or who have plagiarized before, or who believe the president is doing a good job.0129

We want to estimate the population level parameter.0137

However, samples are very variable.0142

Samples are variable and that means that our estimate would not always be very good.0147

It will be good sometimes, but it would not always be very good.0157

It would help us out if we knew the entire distribution of potential samples.0161

Wouldn't that be handy?0175

That is called the sampling distribution.0179

And because we are not sampling and finding a mean, instead we are sampling0182

and finding a proportion it is called a sampling distribution of sample proportions.0187

This is the idea: we find the entire distribution of potential samples, but once we get each sample, what do we do to it?0196

We do not find the mean, we find the sample proportion and we plot those on a distribution.0206

Some things that are helpful for us to get straight.0213

When we talk about the population proportion that parameter we will just call it p.0220

When we talk about a sample proportion we are going to call it p hat.0226

We have seen that notation before when we talk about regression when we had y for the actual data but we had y hat for the predicted data.0232

You can think about it like this.0243

The real one from the world is not going to have the hat.0245

This is sort of the truth that we are trying to find and this is how we are going to estimate that truth.0254

We are going to use this to estimate that but we want to know how good is our estimate?0262

Is it any good?0270

And is it reliable or not?0270

Here the distribution of the population is binary.0275

There is 1 or the other.0281

I am just going to draw the entire population as just a bar and pretend this bar had the value of 1.0, 100%.0282

Some proportion of this is p and the other is not p.0292

Some proportion of these people approve of the job he's doing as a president and the other proportion does not.0299

How should we represent that in a picture and algebraic form?0312

Well I'm just going to draw a line here and say this part is going to be my proportion p.0316

That is my proportion of those that agree that the president is doing a good job.0324

Then what would this little area here be?0330

Well, how would we represent that algebraically?0335

That would simply be 1 – p because the whole thing is 1.0338

This segment is p so this segment must be 1 – p.0344

When you add p + (1 – p), what you get is 1.0350

Notice that we did not draw like a normal distribution or anything because it is not that people have different values.0355

It is not that some people are low, some people are high.0363

It is just yes or no?0365

Have you plagiarized or not?0367

Have you gone bungee jumping or not?0370

Are you covered by health care or not?0375

They are just these binary characteristics that we are interested in.0378

That is what the population is like.0383

Now from population we draw a sample of size n just like always.0385

We are always drawing sample size n.0402

When we look at that little sample of the population what does it look like?0405

Instead of the whole thing you drew a little sample, you drew a subset of those people0412

and presumably this little sample should most likely reflect the population that it came from.0421

These should not be radically different from this.0430

It can be sometimes but for the most part this sample should reflect the population that it came from.0434

The entire sample this thing =1 and in this entire sample we have some probability p hat and that is the proportion in our sample that agree.0442

This would be represented by 1 – p hat those are the people in our sample that disagree.0466

You might be thinking how is this whole thing 1 and how is this whole thing 1 because this one looks smaller?0473

When we say 1 we are talking about proportions.0480

We are really thinking 100%.0484

When we are saying 1 here it represents 100% of the population.0486

When we say 1 down here we are saying 100% of the sample.0490

That is the distinction we want to make.0496

Once you do this then you get this p hat.0499

And once you have that p hat then you can plot it on a sampling distribution.0504

Here is what the sampling distribution of sample proportions looks like.0512

The lower bound and upper bound on this have to be 0 and 1.0516

You can never have a p hat that is less than 0 and you can never have a p hat that is greater than 100%.0522

You are inevitably stuck between 0 and 1.0530

Those are the only sample proportions you could possibly get.0534

Whatever we get here we plot here.0537

Soon we will build up a sampling distribution of sample proportions.0540

Whatever it looks like, that will be our sampling distribution of sample proportions.0544
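The procedure just described (draw a sample of size n, compute p hat, plot it, repeat) can be imitated with a short simulation. This is only a minimal sketch under stated assumptions, not something from the lecture: the function name `simulate_p_hats`, the population proportion 0.6, the sample size 50, and the number of repeated samples are all made up for illustration.

```python
import random

def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n from a yes/no population with
    true proportion p, and return the sample proportion (p hat) of each."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(num_samples):
        # Each case is a "yes" with probability p; count the yeses.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

# Hypothetical values: 60% of the population says yes, samples of 50 people.
p_hats = simulate_p_hats(p=0.6, n=50, num_samples=2000)

# Every p hat is a proportion, so it is stuck between 0 and 1.
print(min(p_hats), max(p_hats))
```

A histogram of `p_hats` would be one picture of the sampling distribution of sample proportions for this particular p and n.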

Let us contrast the SDOM versus the SDOS.0556

There is one key difference between these two and that is the biggest thing you really need to keep in mind.0573

When we are talking about the SDOM we are finding a mean.0579

You cannot find a mean between agree and disagree.0582

Those are categorical data.0585

What we do know is that we need data where you can find the mean, and those data are continuous data.0589

Sometimes these are also called measurement data because you actually got this by measuring something.0599

When you have continuous data for instance how many miles do you drive per day?0609

Say we get a sample to find out the average number of miles people in California drive each day.0619

That would be a continuous measure if you get data like that you can actually average it together.0624

But if we ask the question like do you drive every day?0633

Yes or no?0638

That would be categorical data.0639

The type of data we are talking about here happens to be binary because it is either you are in one category or you are in the other.0642

There are not, like, three categories.0656

There might be three options: you agree with the president, you disagree with the president, or you feel neutral.0659

What we would do in order to look at it as an SDOS is to lump people together.0666

It might be agree versus disagree or do not care.0674

We might lump those two groups together and just call them not agrees.0678

The shape of the SDOM what is nice about it is that as n increases, what happens to the shape?0683

The shape approximates normal.0694

As our sample size increases the shape is more and more reliably normal.0703

The nice thing about the SDOS is that the same principle applies.0710

You could just draw a little link there: as n increases, shape approximates normal.0715

Because as we draw sample sizes of size n, as n gets bigger even for SDOS we are actually seeing normal like distributions.0732

If you remember the central limit theorem, that is where the shape, center, and spread stuff comes from.0752

Center if you remember, the population mu equals the center of the SDOM which is mu sub x bar.0759

It is a whole bunch of little x bar.0771

There is a similar idea here, but there is a difference.0774

Basically when we talk about center here remember that we do not have the population mu.0786

We do not have a population mu.0794

We do not have a population mean.0795

Instead what we have is more like a population proportion.0796

We want to know what is the relationship between p and p hat?0806

In this case, what we see is that the mu that we want, mu sub p hat, is going to be equal to the population proportion p.0814

And if you think about it, let us say you have 60% of your population is approving of the president.0831

If you draw just 1 person from that population, what is the chance that that 1 person approves of the president?0845

That 1 person has a 60% chance of approving of the president.0851

When you draw 2 people, each of those 2 people also has a 60% chance of approving of the president.0859

If you draw 3 people, each one still has a 60% chance of approving of the president.0867

The population p is equal to mu sub p hat; remember, now we have a mean because we have a distribution of p hats.0872

Here is the idea.0888

Get all these p hats, the entire distribution of p hats.0890

Once you have those, if you find the mean of them, that is equal to the population proportion p.0894

That is the nice thing about the center.0902

Remember this number is between 0 and 1 because you cannot have lower than 0 higher than 1.0906

This value is also between 0 and 1.0915

Another way to think about it is that the rate in the population will be the mean of all the rates that you get in your samples.0921
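The claim that the mean of all the p hats equals the population p can be checked roughly by simulation. The sketch below is an illustration only: the helper `simulate_p_hats`, the seed, and the number of repeated samples are assumptions, while the 60% approval rate and the sample sizes 1, 2, and 3 echo the example above.

```python
import random
import statistics

def simulate_p_hats(p, n, num_samples, seed=1):
    """Return the sample proportion from each of num_samples samples of size n."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n
            for _ in range(num_samples)]

p = 0.60  # population proportion from the 60% approval example
means = {}
for n in (1, 2, 3, 100):
    # Mean of the simulated p hats for this sample size.
    means[n] = statistics.mean(simulate_p_hats(p, n, num_samples=10000))
    print(f"n = {n:3d}: mean of p hats = {means[n]:.3f}")
```

Whatever the sample size, the mean of the simulated p hats should land close to 0.60; only the spread around that center changes with n.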

When we talk about spread before we often look at standard deviation.0938

Obviously you can also look at variance, which is the standard deviation squared.0943

Here when we talked about standard deviation of the SDOM we called it sigma sub x bar because it is the standard deviation of a bunch of x-bars,0949

a bunch of means and that is equal to sigma.0961

The real population standard deviation divided by √n your sample size.0969

Here as n increases what happens to the standard error?0976

We should also call it, standard error.0988

What happens to standard error?0990

Standard error goes down, decreases.0992

As n goes up standard error goes down because as n gets bigger and bigger and bigger, this whole thing gets smaller and smaller.1002

Just like here we did not have a population mean.1016

We do not have a population standard deviation.1024

There is no variability there.1027

Instead we use a different formula.1031

First, let us talk about variance here.1034

In order to write variance you call it sigma and instead of sigma sub x bar you call it sigma sub p hat.1038

Just like mu sub p hat.1050

You are constantly saying this is the sigma of the whole bunch of sample proportions.1052

And because we are talking about variance you want to square that.1059

That is going to be p × (1 – p) ÷ n.1064

When you look at this you see that this still holds for both of these.1082

As n increases what happens to the value of the spread?1088

As n increases spread goes down.1099

Imagine squeezing it.1102

If you wanted to find the standard deviation, what you would see is sigma sub p hat = √(p × (1 – p) / n).1104

We will talk a little bit more about where this comes from in the next segment.1120

But what I want you to see here is that there is this principle as n increases,1125

It becomes less variable.1143
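To make the "as n increases, spread goes down" principle concrete, here is a small sketch that evaluates the standard error formula for a few sample sizes. The function name `se_of_p_hat`, the value p = 0.6, and the particular sample sizes are assumptions chosen for illustration.

```python
import math

def se_of_p_hat(p, n):
    """Standard error of the sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

p = 0.6  # hypothetical population proportion
for n in (10, 100, 1000, 10000):
    # The same p gives a smaller and smaller standard error as n grows.
    print(n, round(se_of_p_hat(p, n), 4))
```

Because n sits in the denominator under the square root, multiplying the sample size by 100 cuts the standard error by a factor of 10.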

We see a lot of similarities across the SDOM and SDOS.1146

Let us talk about the binomial distribution and SDOS.1154

Hopefully you remember the binomial distribution from a few lessons ago; there we were also talking about categorical data.1163

Not only that we are talking about binary categorical data.1173

Remember we are talking about how many successes, K number of successes out of n.1180

You take a sample of size n and you are counting the number of successes and plotting all of that on a distribution.1197

Here is also categorical and we are also looking at binary choices 1 or the other.1206

Here we are not looking at k number of successes we are looking at sample proportions.1215

I want to stop here to briefly remind you that when we are talking about the SDOS, the lowest number1223

that p hat could be is 0 and the highest number is 1.1235

Those are the limits.1241

When we talk about a binomial distribution the lowest number that this could be is 0 and the highest number over here is n.1243

It is because we are plotting k on this distribution.1256

0 number of successes 1, 2, 3, 4, 5 all the way up to n number of successes and out of n.1260

What is the shape here? We do not necessarily know.1267

It does not have to be normal.1273

It could be different kinds of shapes.1274

It might be skewed; it could be different shapes.1279

We do not necessarily know.1282

We do not know the shape.1284

Here we know as n increases more normal.1286

Here we do know the shape as long as we have a large enough n.1293

Here when we talk about center, we looked at how many successes we would normally see.1303

What would be the average k?1319

Before in the binomial distribution our notion of center was largely guided by the probability of success × n.1323

You can think of it like this: here is our little sample of n, and some proportion p1342

of this sample is going to be a success, whatever the success is.1354

Some proportion is going to be the success, and how many is that?1363

To get the raw value, I do not want it in terms of percent.1369

I want it in raw value here.1373

To get that raw value, what I would say is that the center of the binomial distribution is p × n because this whole thing is size n.1375

It is only p × n.1393

If it was 100% then it would be 1 × n, all of them.1396

If it was 75% it would be .75 × n and that will give you only 75% of n.1401

If it was 10% of n it would be .1 × n.1409

This is our definition of center.1414

Here, for the SDOS, we saw the definition of center.1417

All we did is basically divide this by n because we no longer want k number of successes; we want to know what is that proportion?1421

We do not care what the n is.1431

We do not care what the actual k is.1433

We just want the proportion.1436

Life becomes easier and mu sub p hat is actually just p.1438
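The relationship between the two centers can be sketched in a few lines. The function names `binomial_center` and `p_hat_center` and the sample size of 200 are hypothetical, introduced only for this illustration; the proportions 100%, 75%, and 10% are the ones mentioned above.

```python
def binomial_center(p, n):
    """Center of the binomial distribution: expected number of successes k."""
    return p * n

def p_hat_center(p, n):
    """Center of the SDOS: the binomial center divided by n."""
    return binomial_center(p, n) / n

n = 200  # hypothetical sample size, not from the lecture
for p in (1.0, 0.75, 0.10):
    # Raw count of expected successes versus the expected proportion.
    print(p, binomial_center(p, n), p_hat_center(p, n))
```

Dividing by n undoes the multiplication by n, which is why mu sub p hat comes out as p no matter what sample size is used.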

Life is simple.1447

If you remember spread way back in the day here, this is standard deviation so you know why I am square rooting.1452

The standard deviation is √(n × p × (1 – p)).1469

You could see the similarity between this and the standard deviation sigma sub p hat, where we have √(p × (1 – p)).1476

But instead of multiplying by √n we are dividing by √n.1491

Let us think about the implications of that.1499

Here as n increases, what happens to the standard deviation?1502

It gets wider and wider and wider because remember if n increases we are stretching out the space.1509

There is more room for variation.1517

Standard deviation increases.1521

However, here you are always limited to 0 and 1.1530

You can never go beyond that, even if you increase your n and try to get more and more people in a sample.1536

It does not matter.1544

You are always stuck between 0 and 1.1545

As n increases the standard deviation decreases.1548
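The contrast between the two spreads can be seen side by side in a small sketch. The function names `binomial_sd` and `p_hat_sd`, the value p = 0.6, and the sample sizes are assumptions for illustration. The sketch also checks that dividing the binomial standard deviation by n gives the standard deviation of p hat, since p hat is just k divided by n.

```python
import math

def binomial_sd(p, n):
    """Standard deviation of the number of successes k: sqrt(n * p * (1 - p))."""
    return math.sqrt(n * p * (1 - p))

def p_hat_sd(p, n):
    """Standard deviation of the sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

p = 0.6  # hypothetical population proportion
for n in (10, 100, 1000):
    # The count k spreads out as n grows; the proportion k / n tightens up.
    print(n, round(binomial_sd(p, n), 3), round(p_hat_sd(p, n), 4),
          round(binomial_sd(p, n) / n, 4))
```

The last two printed columns should agree, while the first spread column grows with n and the second shrinks.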

Here there are some definite similarities but there are moments of contrast that are important.1554

Let us go on to some examples.1565

The ethnicity of about 92% of the population of China is Han Chinese, so there are other ethnic minorities in China, but they make up only 8%.1570

Suppose you take a random sample of 1,000 Chinese; what is the probability of getting 90% or fewer Han Chinese in your sample?1582

What is the probability of getting 925 Han Chinese or more?1592

Well, one thing that helps is for us to realize that if we wanted to, we could use the binomial distribution1597

because we can easily translate from 90% to 900 Han Chinese.1611

But we can also use the sampling distribution of sample proportions because we can easily change 925 into 92.5%.1617

We can choose either path we want.1630

I am going to go with the SDOS because that is what the lesson is about.1633

First we know that the population I am just going to draw a fake population here, just so that we can remember.1636

Here is my population of China and 92%.1645

My real p= 92% and so my 1 - p =.08.1649

8% is non-Han Chinese, 92% is Han Chinese.1663

Now, given this let us say I sample a whole bunch of times and every time I sample I get a sample proportion and I plot that.1669

Because we have a fairly large sample size I can assume that we have a normal distribution.1677

I know that my limits are 0 and 1 and this whole thing this is really p hat.1684

The question is what is the probability of getting 90% or fewer Han Chinese in your sample?1696

First, it would be helpful to know what this middle is.1705

Actually, the middle is not going to be at 50%.1710

Here it should really be 92% because the mu(p hat) = p and that is 92%.1717

The upper limit here is 1.0 and the limit down here is 0.1733

What is the probability of getting 90% or fewer Han Chinese?1742

In order to figure out where 90% is, it would be helpful for us to know the standard error1752

or the standard deviation of the sampling distribution.1761

What is my standard error?1765

This is sigma sub p hat, and in order to find that, it is going to be √(p × (1 – p) / n).1767

That would be 92% × .08 ÷ 1,000 and take the square root of all of that.1781

Feel free to do it on a calculator; I am just going to show it to you on Excel.1794

We have √(92% × 8% / 1,000).1800

Remember order of operations does not really matter for multiplying and dividing.1818

They can be done simultaneously, so it does not matter if they do this first or this first.1825

We see that we have a tiny standard deviation .0086.1830

Even though 90% does not seem like that far away actually is quite far away.1848

How do we find how far away .90 is?1856

You have to think and say this is the normal distribution and there is something we know about normal distribution.1862

We could find these areas in terms of z score.1875

If we knew the z score, we could find that area.1878

These are my p hats but I'm going to start a row for z scores.1882

Z scores I know the middle is going to be 0 and 1 standard deviation out this is the .0086 distance that is -1.1890

How many of these .0086 jumps away am I?1904

I could use my notion of z scores.1911

My z score for .90 looks something like this.1915

What is the distance between the middle and the score that I'm interested in?1920

That is just .90 – .92.1927

That is going to give me that distance but I do not want that distance in terms of percentages.1931

I want it in terms of my standard error in terms of these little jumps.1937

I'm going to divide by .0086 to give me how many of these .0086 jumps away I am if I am at .90.1942

Let us put back into our calculators.1952

I need parentheses because, by order of operations, we need to do the subtraction before the division, and Excel will not know that.1959

(.9 – .92) / .0086 = -2.33.1970

Here is -2 and apparently this is -2.33.1991

Okay, now that we have that z score are we done?1999

No, we need to know what is the probability of getting 90% or fewer Han Chinese in your sample?2013

What we want to know is this area here.2020

This is 90% or fewer are Han Chinese in sample.2022

That is the area we want to know.2032

At this point because you have the z score you could look it up in the back of the book using your z tables.2035

Just to show you I am going to use Excel to find this and I will leave my z score there because it will come in handy.2042

Remember normsdist and it asks me to put in the z and once it does that I know that this proportion should be very small and its only 1% of this.2057

We should expect what is the probability of getting 90% or fewer Han Chinese in our sample.2080

It is 1%.2086
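The same three steps (standard error, z score, area below the z score) can be reproduced outside Excel. The sketch below uses Python's standard library, which is a swapped-in tool rather than what the lecture uses; `NormalDist().cdf(z)` plays the role of Excel's `normsdist`, and the variable names are assumptions.

```python
import math
from statistics import NormalDist

p = 0.92   # population proportion of Han Chinese
n = 1000   # sample size

# Standard error of the sample proportion.
se = math.sqrt(p * (1 - p) / n)

# z score for a sample proportion of 0.90: distance from the center in
# units of standard error. Parentheses force the subtraction first.
z = (0.90 - p) / se

# Area to the left of z under the standard normal curve.
prob_90_or_fewer = NormalDist().cdf(z)

print(round(se, 4), round(z, 2), round(prob_90_or_fewer, 3))
```

The results should come out near .0086, -2.33, and about 1%, in line with the numbers worked out above.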

We want to find out what is the probability of getting 925 Han Chinese or more.2088

In this case, we do the same thing but with 92.5%, so that would be somewhere past here.2102

.925, where is that?2115

Let us find the z score so that we could be exact.2123

The z score of .925 is the distance between .925 and .92 divided by the little jumps, the standard errors, .0086.2125

When I do that what do I get .925 -.9 /.0086 = 2.990.2141

That did not look right to me because this should be a smaller z score than this one because this should be farther out.2161

I should have subtracted .92, not .9: (.925 – .92) / .0086 = .58 is our z score.2190

I wrote this one at a wrong place.2196

.925 is somewhere here.2200

That is .58 that is our z score.2206

Let me shade that in so you know which area we want. Because2216

what we are looking for is the probability of getting this score or more, that area should be less than 50%.2228

It should be much more than this and actually I have put it in my normal distribution but remember this will give you what is on this left side or the negative side.2235

We need to look at 1 - that normal distribution.2253

This is 28%.2258

The probability of getting 92.5% Han Chinese or more is going to be .28.2262
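
The two tail areas in this example can also be checked outside Excel. Here is a minimal Python sketch, assuming only the values quoted in the lecture (p = .92 and a standard error of .0086). `NormalDist().cdf` from the standard library plays the role of normsdist; the names `z_score`, `p_low`, and `p_high` are illustrative and not from the lecture.

```python
from statistics import NormalDist

P = 0.92     # population proportion, as quoted in the lecture
SE = 0.0086  # standard error of p hat, as quoted in the lecture

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_score(p_hat, mean=P, se=SE):
    """Number of standard errors ("little jumps") between p_hat and the mean."""
    return (p_hat - mean) / se


# 90% or fewer: the area to the LEFT of z, which is what normsdist returns.
p_low = std_normal.cdf(z_score(0.90))

# 92.5% or more: the area to the RIGHT of z, so take 1 minus the left area.
p_high = 1 - std_normal.cdf(z_score(0.925))

print(round(z_score(0.90), 2), round(p_low, 2))
print(round(z_score(0.925), 2), round(p_high, 2))
```

The rounded results should agree with the lecture's values: z scores of -2.33 and .58, and areas of roughly 1% and 28%.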

Example 2, college freshmen from a wide variety of colleges across the US participate in a survey2274

where 61% reported that they are attending college that was their first choice.2285

If you took a random sample of 100 freshmen how likely is it that at least 50 of those students are attending their first choice college?2291

Saying at least 50 that is a good thing to keep in mind for later.2299

Let us try this population.2305

Here is my population of college freshmen and 61% a little more than half.2307

61% is our p, and 1 - p is not quite 40%; it is 39%.2315

The other 39% are not attending their first choice college.2324

Imagine taking out of that population a random sample of 100 freshmen and looking at2328

the sample proportion and plotting that on the SDOS.2340

100 is still a pretty large n so I am going to go with that normal distribution.2344

I know the mean of my SDOS.2356

Mu sub p hat should equal p, and that is 61%.2363

What is my standard deviation of this SDOS because I'm not just looking at who is in here.2372

I am looking at it if I took a sample of 100 students how good is my sample?2380

Whenever you hear that, like how good is the sample then you know you need a sampling distribution.2386

I should probably find my standard error, because it is a sampling distribution.2393

Here it is √(p × (1 - p) / n), and that is going to be √(.61 × .39 ÷ 100).2408

I will just work that out here, so √(.61 × .39 ÷ 100) = .0488.2428

This little jump here is .0488; that is how big those little jumps are.2456

I'm looking for how likely is it that at least 50 of these students are attending their first choice college.2481

I can turn this into a percentage by looking at 50/100.2493

My p hat that I have been given is 50/100 and that is .5 and I want to know how likely is this p hat.2499

It is nice to find out where the p hat is and this is the raw proportion.2512

It would be nice to find the z score and the z score of .5 should be the distance between .5 and the mean divided by the little jumps.2520

We need to know how big the jumps are in order to find how many jumps away it is.2540

Let us put that in our calculator: (.5 - .61) ÷ .0488 = -2.25.2545

Here we are, somewhere like this at -2.25, and this is .5.2569

We want to know how likely is it that at least 50 of those students are attending that first choice college.2583

When we say at least this is the lower limit.2592

We are looking for this whole thing.2598

You can look that up in the back of your book, or you could write it as the probability that p hat will be greater than or equal to .5.2603

I do not know if you remember this notation here, but remember normsdist will give us the negative side,2618

so we need 1 - this little piece.2634

That is 1 - normsdist, where the s stands for standardized, which is why we needed that z; we put in our z and we should get .9879.2639

That is very close to 99%: .9879.2660

Almost 99% of our samples should have at least 50% of those students attending their first choice college.2667
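
The whole chain for this example (standard error, z score, upper-tail area) can be sketched in a few lines of Python. This is only a sketch under the lecture's assumptions: p = .61, n = 100, and the normal approximation for the SDOS. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p = 0.61  # population proportion attending their first choice college
n = 100   # sample size

# Standard error of p hat: sqrt(p * (1 - p) / n)
standard_error = sqrt(p * (1 - p) / n)

# "At least 50 of 100" as a sample proportion
p_hat = 50 / n

# Distance from the mean, measured in standard errors
z = (p_hat - p) / standard_error

# normsdist gives the area below z, so "at least" needs 1 minus that area
prob_at_least_50 = 1 - NormalDist().cdf(z)

print(round(standard_error, 4), round(prob_at_least_50, 4))
```

Because this keeps the unrounded standard error, the z score comes out near -2.26 rather than the lecture's -2.25, which was computed from the rounded .0488. The final area still rounds to about .9879.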

Third example, about 75% of the US population owns a cell phone and that is growing.2681

On average, what proportion of people would you expect to have a cell phone in a sample of 10, 20 or 40?2691

This is talking about the average proportion.2698

We are looking at the mu sub p hat on average, what proportion of people would you expect?2702

For 10 people, n=10, it should be 75%.2711

Even for n=20 the sampling distribution's mean should be 75%.2725

For n=40 this should be 75% as well.2735

What it is getting at is that no matter how big or little your sample size, the mean of the sampling distribution2740

does not really change and that is similar to what we saw in the sampling distribution of the mean as well.2751
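
One way to convince yourself of this is to compute the mean of p hat exactly for each sample size. The sketch below assumes each person is an independent draw with p = .75, so the count of cell phone owners is binomial. The function names are illustrative, and the standard error is printed only for comparison, using the same √(p(1 - p)/n) formula from the earlier examples.

```python
from math import comb, sqrt

P = 0.75  # population proportion who own a cell phone


def mean_of_p_hat(n, p=P):
    """Exact expected value of p hat = k/n, with k successes out of n binomial trials."""
    return sum(
        (k / n) * comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )


def standard_error(n, p=P):
    """Standard deviation of the sampling distribution of p hat."""
    return sqrt(p * (1 - p) / n)


for n in (10, 20, 40):
    print(n, round(mean_of_p_hat(n), 4), round(standard_error(n), 4))
```

The mean stays at .75 for every n; what changes with sample size is the standard error, which gets smaller as n grows.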

Final example: suppose that 60% of married women are employed.2757

If you select 75 married women, what is the probability that between 30 and 40 women are employed?2763

Here we need to know our actual population: these are all married ladies, and 60% are employed.2770

That is our p and 1 - p is 40%.2793

Imagine now taking samples of 75 so this is SDOS for n=75.2798

75 is still fairly large so I will assume normal distribution.2808

What is the probability that between 30 and 40 women are employed?2815

We know that mu sub p hat is 60%; we also know that the standard deviation of p hat is .60 × .40 / n, where n is 75,2822

and all of that under the square root sign.2842

I will just quickly put this into my calculator.2845

√(.6 × .4 ÷ 75) = .0566.2850

What is the probability that between 30 and 40 women are employed?2873

First of all, it helps me to figure out what percentage is 30 women out of 75 and what percentage is 40 women out of 75.2883

Let us call that p hat sub 30, which is 30 ÷ 75, and also p hat sub 40, which is 40 ÷ 75.2892

If I want to get it in decimals, 40 ÷ 75 is .53 repeating and 30 ÷ 75 is .4.2905

I am going to mark these two values on the picture.2922

Each jump is about .06, so one jump down from .6 is roughly .54, and another jump down is roughly .48.2942

Let us actually find the z scores of this.2957

Start with Z(.4); the values in parentheses are the p hats.2966

The z score is .4 - .6, all divided by the little jumps.2976

And these little jumps are .0566.2992

(.4 - .6), and we need the parentheses here, divided by .0566.3001

That is a z score of -3.5.3025

What about the z score of .53? I am just going to forget about the repeating part.3048

It will just be (.53 - .6) ÷ .0566, and that is about -1.2.3061

Here is my big problem, first we need to know this area but there is no table that will tell us just that area.3081

Here is what we will have to do, we will have to take everything below this and then subtract out everything below that3109

because then we will get this entire area including this infinite tail and then take out a tiny little bit of it to lop that part off3131

to get it in between this part and this part.3141

In order to do that, I will use my normsdist and remember that will give me the negative side.3144

Let me put in my bigger number first and then subtract, that is my entire area below z= -1.2.3161

That is the entire area.3181

I am going to subtract out the tiny sliver way over here.3184

Area below z= -3.5.3188

I can just normsdist -3.5 and that should be a really tiny, tiny, tiny number.3196

I need to subtract this area out of this.3208

I take this whole thing and subtract this little sliver, and I get a very similar number.3212

.119, that is my area.3221

This area here we can call it the probability where p hat is greater than or equal to .4 and less than or equal to .53 repeating is roughly .119.3227

So about 11.9% is the probability that between 30 and 40 women are employed.3257
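
The "everything below the upper cutoff minus everything below the lower cutoff" step can be sketched in Python as well. This is a minimal sketch assuming the lecture's setup (p = .60, n = 75, normal approximation); `NormalDist().cdf` stands in for normsdist, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p = 0.60  # population proportion of married women who are employed
n = 75    # sample size

standard_error = sqrt(p * (1 - p) / n)

# Convert the two counts to sample proportions, then to z scores
z_low = (30 / n - p) / standard_error   # 30 of 75 is .4
z_high = (40 / n - p) / standard_error  # 40 of 75 is .53 repeating

cdf = NormalDist().cdf  # area to the left of a z score

# Area below the upper cutoff minus the tiny sliver below the lower cutoff
prob_between = cdf(z_high) - cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(prob_between, 3))
```

Keeping the repeating decimal for 40 ÷ 75 gives an upper z score a little closer to -1.18 than to the rounded -1.2, and the area between the two cutoffs comes out at about .119, in line with the Excel result.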

That is the end of sampling distribution of sample proportions.3269

Thanks for using www.educator.com.3275
