Dr. Ji Son

Frequency Distributions in Excel

Table of Contents

Section 1: Introduction
Descriptive Statistics vs. Inferential Statistics

25m 31s

Intro
0:00
Roadmap
0:10
Roadmap
0:11
Statistics
0:35
Statistics
0:36
Let's Think About High School Science
1:12
Measurement and Find Patterns (Mathematical Formula)
1:13
Statistics = Math of Distributions
4:58
Distributions
4:59
Problematic… but also GREAT
5:58
Statistics
7:33
How is It Different from Other Specializations in Mathematics?
7:34
Statistics is Fundamental in Natural and Social Sciences
7:53
Two Skills of Statistics
8:20
Description (Exploration)
8:21
Inference
9:13
Descriptive Statistics vs. Inferential Statistics: Apply to Distributions
9:58
Descriptive Statistics
9:59
Inferential Statistics
11:05
Populations vs. Samples
12:19
Populations vs. Samples: Is it the Truth?
12:20
Populations vs. Samples: Pros & Cons
13:36
Populations vs. Samples: Descriptive Values
16:12
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:10
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:11
Example 1: Descriptive Statistics vs. Inferential Statistics
19:09
Example 2: Descriptive Statistics vs. Inferential Statistics
20:47
Example 3: Sample, Parameter, Population, and Statistic
21:40
Example 4: Sample, Parameter, Population, and Statistic
23:28
Section 2: About Samples: Cases, Variables, Measurements
About Samples: Cases, Variables, Measurements

32m 14s

Intro
0:00
Data
0:09
Data, Cases, Variables, and Values
0:10
Rows, Columns, and Cells
2:03
Example: Aircraft
3:52
How Do We Get Data?
5:38
Research: Question and Hypothesis
5:39
Research Design
7:11
Measurement
7:29
Research Analysis
8:33
Research Conclusion
9:30
Types of Variables
10:03
Discrete Variables
10:04
Continuous Variables
12:07
Types of Measurements
14:17
Types of Measurements
14:18
Types of Measurements (Scales)
17:22
Nominal
17:23
Ordinal
19:11
Interval
21:33
Ratio
24:24
Example 1: Cases, Variables, Measurements
25:20
Example 2: Which Scale of Measurement is Used?
26:55
Example 3: What Kind of a Scale of Measurement is This?
27:26
Example 4: Discrete vs. Continuous Variables
30:31
Section 3: Visualizing Distributions
Introduction to Excel

8m 9s

Intro
0:00
Before Visualizing Distribution
0:10
Excel
0:11
Excel: Organization
0:45
Workbook
0:46
Column x Rows
1:50
Tools: Menu Bar, Standard Toolbar, and Formula Bar
3:00
Excel + Data
6:07
Excel and Data
6:08
Frequency Distributions in Excel

39m 10s

Intro
0:00
Roadmap
0:08
Data in Excel and Frequency Distributions
0:09
Raw Data to Frequency Tables
0:42
Raw Data to Frequency Tables
0:43
Frequency Tables: Using Formulas and Pivot Tables
1:28
Example 1: Number of Births
7:17
Example 2: Age Distribution
20:41
Example 3: Height Distribution
27:45
Example 4: Height Distribution of Males
32:19
Frequency Distributions and Features

25m 29s

Intro
0:00
Roadmap
0:10
Data in Excel, Frequency Distributions, and Features of Frequency Distributions
0:11
Example #1
1:35
Uniform
1:36
Example #2
2:58
Unimodal, Skewed Right, and Asymmetric
2:59
Example #3
6:29
Bimodal
6:30
Example #4a
8:29
Symmetric, Unimodal, and Normal
8:30
Point of Inflection and Standard Deviation
11:13
Example #4b
12:43
Normal Distribution
12:44
Summary
13:56
Uniform, Skewed, Bimodal, and Normal
13:57
Sketch Problem 1: Driver's License
17:34
Sketch Problem 2: Life Expectancy
20:01
Sketch Problem 3: Telephone Numbers
22:01
Sketch Problem 4: Length of Time Used to Complete a Final Exam
23:43
Dotplots and Histograms in Excel

42m 42s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Previously
1:02
Data, Frequency Table, and Visualization
1:03
Dotplots
1:22
Dotplots Excel Example
1:23
Dotplots: Pros and Cons
7:22
Pros and Cons of Dotplots
7:23
Dotplots Excel Example Cont.
9:07
Histograms
12:47
Histograms Overview
12:48
Example of Histograms
15:29
Histograms: Pros and Cons
31:39
Pros
31:40
Cons
32:31
Frequency vs. Relative Frequency
32:53
Frequency
32:54
Relative Frequency
33:36
Example 1: Dotplots vs. Histograms
34:36
Example 2: Age of Pennies Dotplot
36:21
Example 3: Histogram of Mammal Speeds
38:27
Example 4: Histogram of Life Expectancy
40:30
Stemplots

12m 23s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
What Sets Stemplots Apart?
0:46
Data Sets, Dotplots, Histograms, and Stemplots
0:47
Example 1: What Do Stemplots Look Like?
1:58
Example 2: Back-to-Back Stemplots
5:00
Example 3: Quiz Grade Stemplot
7:46
Example 4: Quiz Grade & Afterschool Tutoring Stemplot
9:56
Bar Graphs

22m 49s

Intro
0:00
Roadmap
0:05
Roadmap
0:08
Review of Frequency Distributions
0:44
Y-axis and X-axis
0:45
Types of Frequency Visualizations Covered so Far
2:16
Introduction to Bar Graphs
4:07
Example 1: Bar Graph
5:32
Example 1: Bar Graph
5:33
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:07
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:08
Example 2: Create a Frequency Visualization for Gender
14:02
Example 3: Cases, Variables, and Frequency Visualization
16:34
Example 4: What Kind of Graphs are Shown Below?
19:29
Section 4: Summarizing Distributions
Central Tendency: Mean, Median, Mode

38m 50s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
Central Tendency 1
0:56
Way to Summarize a Distribution of Scores
0:57
Mode
1:32
Median
2:02
Mean
2:36
Central Tendency 2
3:47
Mode
3:48
Median
4:20
Mean
5:25
Summation Symbol
6:11
Summation Symbol
6:12
Population vs. Sample
10:46
Population vs. Sample
10:47
Excel Examples
15:08
Finding Mode, Median, and Mean in Excel
15:09
Median vs. Mean
21:45
Effect of Outliers
21:46
Relationship Between Parameter and Statistic
22:44
Type of Measurements
24:00
Which Distributions to Use With
24:55
Example 1: Mean
25:30
Example 2: Using Summation Symbol
29:50
Example 3: Average Calorie Count
32:50
Example 4: Creating an Example Set
35:46
Variability

42m 40s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Variability (or Spread)
0:45
Variability (or Spread)
0:46
Things to Think About
5:45
Things to Think About
5:46
Range, Quartiles and Interquartile Range
6:37
Range
6:38
Interquartile Range
8:42
Interquartile Range Example
10:58
Interquartile Range Example
10:59
Variance and Standard Deviation
12:27
Deviations
12:28
Sum of Squares
14:35
Variance
16:55
Standard Deviation
17:44
Sum of Squares (SS)
18:34
Sum of Squares (SS)
18:35
Population vs. Sample SD
22:00
Population vs. Sample SD
22:01
Population vs. Sample
23:20
Mean
23:21
SD
23:51
Example 1: Find the Mean and Standard Deviation of the Variable Friends in the Excel File
27:21
Example 2: Find the Mean and Standard Deviation of the Tagged Photos in the Excel File
35:25
Example 3: Sum of Squares
38:58
Example 4: Standard Deviation
41:48
Five Number Summary & Boxplots

57m 15s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Summarizing Distributions
0:37
Shape, Center, and Spread
0:38
5 Number Summary
1:14
Boxplot: Visualizing 5 Number Summary
3:37
Boxplot: Visualizing 5 Number Summary
3:38
Boxplots on Excel
9:01
Using 'Stocks' and Using Stacked Columns
9:02
Boxplots on Excel Example
10:14
When are Boxplots Useful?
32:14
Pros
32:15
Cons
32:59
How to Determine Outlier Status
33:24
Rule of Thumb: Upper Limit
33:25
Rule of Thumb: Lower Limit
34:16
Signal Outliers in an Excel Data File Using Conditional Formatting
34:52
Modified Boxplot
48:38
Modified Boxplot
48:39
Example 1: Percentage Values & Lower and Upper Whisker
49:10
Example 2: Boxplot
50:10
Example 3: Estimating IQR From Boxplot
53:46
Example 4: Boxplot and Missing Whisker
54:35
Shape: Calculating Skewness & Kurtosis

41m 51s

Intro
0:00
Roadmap
0:16
Roadmap
0:17
Skewness Concept
1:09
Skewness Concept
1:10
Calculating Skewness
3:26
Calculating Skewness
3:27
Interpreting Skewness
7:36
Interpreting Skewness
7:37
Excel Example
8:49
Kurtosis Concept
20:29
Kurtosis Concept
20:30
Calculating Kurtosis
24:17
Calculating Kurtosis
24:18
Interpreting Kurtosis
29:01
Leptokurtic
29:35
Mesokurtic
30:10
Platykurtic
31:06
Excel Example
32:04
Example 1: Shape of Distribution
38:28
Example 2: Shape of Distribution
39:29
Example 3: Shape of Distribution
40:14
Example 4: Kurtosis
41:10
Normal Distribution

34m 33s

Intro
0:00
Roadmap
0:13
Roadmap
0:14
What is a Normal Distribution
0:44
The Normal Distribution As a Theoretical Model
0:45
Possible Range of Probabilities
3:05
Possible Range of Probabilities
3:06
What is a Normal Distribution
5:07
Can Be Described By
5:08
Properties
5:49
'Same' Shape: Illusion of Different Shape!
7:35
'Same' Shape: Illusion of Different Shape!
7:36
Types of Problems
13:45
Example: Distribution of SAT Scores
13:46
Shape Analogy
19:48
Shape Analogy
19:49
Example 1: The Standard Normal Distribution and Z-Scores
22:34
Example 2: The Standard Normal Distribution and Z-Scores
25:54
Example 3: Sketching and Normal Distribution
28:55
Example 4: Sketching and Normal Distribution
32:32
Standard Normal Distributions & Z-Scores

41m 44s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
A Family of Distributions
0:28
Infinite Set of Distributions
0:29
Transforming Normal Distributions to 'Standard' Normal Distribution
1:04
Normal Distribution vs. Standard Normal Distribution
2:58
Normal Distribution vs. Standard Normal Distribution
2:59
Z-Score, Raw Score, Mean, & SD
4:08
Z-Score, Raw Score, Mean, & SD
4:09
Weird Z-Scores
9:40
Weird Z-Scores
9:41
Excel
16:45
For Normal Distributions
16:46
For Standard Normal Distributions
19:11
Excel Example
20:24
Types of Problems
25:18
Percentage Problem: P(x)
25:19
Raw Score and Z-Score Problems
26:28
Standard Deviation Problems
27:01
Shape Analogy
27:44
Shape Analogy
27:45
Example 1: Deaths Due to Heart Disease vs. Deaths Due to Cancer
28:24
Example 2: Heights of Male College Students
33:15
Example 3: Mean and Standard Deviation
37:14
Example 4: Finding Percentage of Values in a Standard Normal Distribution
37:49
Normal Distribution: PDF vs. CDF

55m 44s

Intro
0:00
Roadmap
0:15
Roadmap
0:16
Frequency vs. Cumulative Frequency
0:56
Frequency vs. Cumulative Frequency
0:57
Frequency vs. Cumulative Frequency
4:32
Frequency vs. Cumulative Frequency Cont.
4:33
Calculus in Brief
6:21
Derivative-Integral Continuum
6:22
PDF
10:08
PDF for Standard Normal Distribution
10:09
PDF for Normal Distribution
14:32
Integral of PDF = CDF
21:27
Integral of PDF = CDF
21:28
Example 1: Cumulative Frequency Graph
23:31
Example 2: Mean, Standard Deviation, and Probability
24:43
Example 3: Mean and Standard Deviation
35:50
Example 4: Age of Cars
49:32
Section 5: Linear Regression
Scatterplots

47m 19s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
Previous Visualizations
0:30
Frequency Distributions
0:31
Compare & Contrast
2:26
Frequency Distributions Vs. Scatterplots
2:27
Summary Values
4:53
Shape
4:54
Center & Trend
6:41
Spread & Strength
8:22
Univariate & Bivariate
10:25
Example Scatterplot
10:48
Shape, Trend, and Strength
10:49
Positive and Negative Association
14:05
Positive and Negative Association
14:06
Linearity, Strength, and Consistency
18:30
Linearity
18:31
Strength
19:14
Consistency
20:40
Summarizing a Scatterplot
22:58
Summarizing a Scatterplot
22:59
Example 1: Gapminder.org, Income x Life Expectancy
26:32
Example 2: Gapminder.org, Income x Infant Mortality
36:12
Example 3: Trend and Strength of Variables
40:14
Example 4: Trend, Strength and Shape for Scatterplots
43:27
Regression

32m 2s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Linear Equations
0:34
Linear Equations: y = mx + b
0:35
Rough Line
5:16
Rough Line
5:17
Regression - A 'Center' Line
7:41
Reasons for Summarizing with a Regression Line
7:42
Predictor and Response Variable
10:04
Goal of Regression
12:29
Goal of Regression
12:30
Prediction
14:50
Example: Servings of Milk Per Year Shown By Age
14:51
Interpolation
17:06
Extrapolation
17:58
Error in Prediction
20:34
Prediction Error
20:35
Residual
21:40
Example 1: Residual
23:34
Example 2: Large and Negative Residual
26:30
Example 3: Positive Residual
28:13
Example 4: Interpret Regression Line & Extrapolate
29:40
Least Squares Regression

56m 36s

Intro
0:00
Roadmap
0:13
Roadmap
0:14
Best Fit
0:47
Best Fit
0:48
Sum of Squared Errors (SSE)
1:50
Sum of Squared Errors (SSE)
1:51
Why Squared?
3:38
Why Squared?
3:39
Quantitative Properties of Regression Line
4:51
Quantitative Properties of Regression Line
4:52
So How do we Find Such a Line?
6:49
SSEs of Different Line Equations & Lowest SSE
6:50
Carl Gauss' Method
8:01
How Do We Find Slope (b1)
11:00
How Do We Find Slope (b1)
11:01
How Do We Find Intercept
15:11
How Do We Find Intercept
15:12
Example 1: Which of These Equations Fit the Above Data Best?
17:18
Example 2: Find the Regression Line for These Data Points and Interpret It
26:31
Example 3: Summarize the Scatterplot and Find the Regression Line.
34:31
Example 4: Examine the Mean of Residuals
43:52
Correlation

43m 58s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Summarizing a Scatterplot Quantitatively
0:47
Shape
0:48
Trend
1:11
Strength: Correlation (r)
1:45
Correlation Coefficient ( r )
2:30
Correlation Coefficient ( r )
2:31
Trees vs. Forest
11:59
Trees vs. Forest
12:00
Calculating r
15:07
Average Product of z-scores for x and y
15:08
Relationship between Correlation and Slope
21:10
Relationship between Correlation and Slope
21:11
Example 1: Find the Correlation between Grams of Fat and Cost
24:11
Example 2: Relationship between r and b1
30:24
Example 3: Find the Regression Line
33:35
Example 4: Find the Correlation Coefficient for this Set of Data
37:37
Correlation: r vs. r-squared

52m 52s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
R-squared
0:44
What is the Meaning of It? Why Squared?
0:45
Parsing Sum of Squares (Parsing Variability)
2:25
SST = SSR + SSE
2:26
What is SST and SSE?
7:46
What is SST and SSE?
7:47
r-squared
18:33
Coefficient of Determination
18:34
If the Correlation is Strong…
20:25
If the Correlation is Strong…
20:26
If the Correlation is Weak…
22:36
If the Correlation is Weak…
22:37
Example 1: Find r-squared for this Set of Data
23:56
Example 2: What Does it Mean that the Simple Linear Regression is a 'Model' of Variance?
33:54
Example 3: Why Does r-squared Only Range from 0 to 1
37:29
Example 4: Find the r-squared for This Set of Data
39:55
Transformations of Data

27m 8s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Why Transform?
0:26
Why Transform?
0:27
Shape-preserving vs. Shape-changing Transformations
5:14
Shape-preserving = Linear Transformations
5:15
Shape-changing Transformations = Non-linear Transformations
6:20
Common Shape-Preserving Transformations
7:08
Common Shape-Preserving Transformations
7:09
Common Shape-Changing Transformations
8:59
Powers
9:00
Logarithms
9:39
Change Just One Variable? Both?
10:38
Log-log Transformations
10:39
Log Transformations
14:38
Example 1: Create, Graph, and Transform the Data Set
15:19
Example 2: Create, Graph, and Transform the Data Set
20:08
Example 3: What Kind of Model would You Choose for this Data?
22:44
Example 4: Transformation of Data
25:46
Section 6: Collecting Data in an Experiment
Sampling & Bias

54m 44s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Descriptive vs. Inferential Statistics
1:04
Descriptive Statistics: Data Exploration
1:05
Example
2:03
To Tackle Generalization…
4:31
Generalization
4:32
Sampling
6:06
'Good' Sample
6:40
Defining Samples and Populations
8:55
Population
8:56
Sample
11:16
Why Use Sampling?
13:09
Why Use Sampling?
13:10
Goal of Sampling: Avoiding Bias
15:04
What is Bias?
15:05
Where does Bias Come from: Sampling Bias
17:53
Where does Bias Come from: Response Bias
18:27
Sampling Bias: Bias from 'Bad' Sampling Methods
19:34
Size Bias
19:35
Voluntary Response Bias
21:13
Convenience Sample
22:22
Judgment Sample
23:58
Inadequate Sample Frame
25:40
Response Bias: Bias from 'Bad' Data Collection Methods
28:00
Nonresponse Bias
29:31
Questionnaire Bias
31:10
Incorrect Response or Measurement Bias
37:32
Example 1: What Kind of Biases?
40:29
Example 2: What Biases Might Arise?
44:46
Example 3: What Kind of Biases?
48:34
Example 4: What Kind of Biases?
51:43
Sampling Methods

14m 25s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Biased vs. Unbiased Sampling Methods
0:32
Biased Sampling
0:33
Unbiased Sampling
1:13
Probability Sampling Methods
2:31
Simple Random
2:54
Stratified Random Sampling
4:06
Cluster Sampling
5:24
Two-Stage Sampling
6:22
Systematic Sampling
7:25
Example 1: Which Type(s) of Sampling was this?
8:33
Example 2: Describe How to Take a Two-Stage Sample from this Book
10:16
Example 3: Sampling Methods
11:58
Example 4: Cluster Sample Plan
12:48
Research Design

53m 54s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Descriptive vs. Inferential Statistics
0:51
Descriptive Statistics: Data Exploration
0:52
Inferential Statistics
1:02
Variables and Relationships
1:44
Variables
1:45
Relationships
2:49
Not Every Type of Study is an Experiment…
4:16
Category I - Descriptive Study
4:54
Category II - Correlational Study
5:50
Category III - Experimental, Quasi-experimental, Non-experimental
6:33
Category III
7:42
Experimental, Quasi-experimental, and Non-experimental
7:43
Why CAN'T the Other Strategies Determine Causation?
10:18
Third-variable Problem
10:19
Directionality Problem
15:49
What Makes Experiments Special?
17:54
Manipulation
17:55
Control (and Comparison)
21:58
Methods of Control
26:38
Holding Constant
26:39
Matching
29:11
Random Assignment
31:48
Experiment Terminology
34:09
'True' Experiment vs. Study
34:10
Independent Variable (IV)
35:16
Dependent Variable (DV)
35:45
Factors
36:07
Treatment Conditions
36:23
Levels
37:43
Confounds or Extraneous Variables
38:04
Blind
38:38
Blind Experiments
38:39
Double-blind Experiments
39:29
How Categories Relate to Statistics
41:35
Category I - Descriptive Study
41:36
Category II - Correlational Study
42:05
Category III - Experimental, Quasi-experimental, Non-experimental
42:43
Example 1: Research Design
43:50
Example 2: Research Design
47:37
Example 3: Research Design
50:12
Example 4: Research Design
52:00
Between and Within Treatment Variability

41m 31s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Experimental Designs
0:51
Experimental Designs: Manipulation & Control
0:52
Two Types of Variability
2:09
Between Treatment Variability
2:10
Within Treatment Variability
3:31
Updated Goal of Experimental Design
5:47
Updated Goal of Experimental Design
5:48
Example: Drugs and Driving
6:56
Example: Drugs and Driving
6:57
Different Types of Random Assignment
11:27
All Experiments
11:28
Completely Random Design
12:02
Randomized Block Design
13:19
Randomized Block Design
15:48
Matched Pairs Design
15:49
Repeated Measures Design
19:47
Between-subject Variable vs. Within-subject Variable
22:43
Completely Randomized Design
22:44
Repeated Measures Design
25:03
Example 1: Design a Completely Random, Matched Pair, and Repeated Measures Experiment
26:16
Example 2: Block Design
31:41
Example 3: Completely Randomized Designs
35:11
Example 4: Completely Random, Matched Pairs, or Repeated Measures Experiments?
39:01
Section 7: Review of Probability Axioms
Sample Spaces

37m 52s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
Why is Probability Involved in Statistics
0:48
Probability
0:49
Can People Tell the Difference between Cheap and Gourmet Coffee?
2:08
Taste Test with Coffee Drinkers
3:37
If No One can Actually Taste the Difference
3:38
If Everyone can Actually Taste the Difference
5:36
Creating a Probability Model
7:09
Creating a Probability Model
7:10
D'Alembert vs. Necker
9:41
D'Alembert vs. Necker
9:42
Problem with D'Alembert's Model
13:29
Problem with D'Alembert's Model
13:30
Covering Entire Sample Space
15:08
Fundamental Principle of Counting
15:09
Where Do Probabilities Come From?
22:54
Observed Data, Symmetry, and Subjective Estimates
22:55
Checking whether Model Matches Real World
24:27
Law of Large Numbers
24:28
Example 1: Law of Large Numbers
27:46
Example 2: Possible Outcomes
30:43
Example 3: Brands of Coffee and Taste
33:25
Example 4: How Many Different Treatments are there?
35:33
Addition Rule for Disjoint Events

20m 29s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Disjoint Events
0:41
Disjoint Events
0:42
Meaning of 'or'
2:39
In Regular Life
2:40
In Math/Statistics/Computer Science
3:10
Addition Rule for Disjoint Events
3:55
If A and B are Disjoint: P (A and B)
3:56
If A and B are Disjoint: P (A or B)
5:15
General Addition Rule
5:41
General Addition Rule
5:42
Generalized Addition Rule
8:31
If A and B are not Disjoint: P (A or B)
8:32
Example 1: Which of These are Mutually Exclusive?
10:50
Example 2: What is the Probability that You will Have a Combination of One Heads and Two Tails?
12:57
Example 3: Engagement Party
15:17
Example 4: Home Owner's Insurance
18:30
Conditional Probability

57m 19s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
'or' vs. 'and' vs. Conditional Probability
1:07
'or' vs. 'and' vs. Conditional Probability
1:08
'and' vs. Conditional Probability
5:57
P (M or L)
5:58
P (M and L)
8:41
P (M|L)
11:04
P (L|M)
12:24
Tree Diagram
15:02
Tree Diagram
15:03
Defining Conditional Probability
22:42
Defining Conditional Probability
22:43
Common Contexts for Conditional Probability
30:56
Medical Testing: Positive Predictive Value
30:57
Medical Testing: Sensitivity
33:03
Statistical Tests
34:27
Example 1: Drug and Disease
36:41
Example 2: Marbles and Conditional Probability
40:04
Example 3: Cards and Conditional Probability
45:59
Example 4: Votes and Conditional Probability
50:21
Independent Events

24m 27s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Independent Events & Conditional Probability
0:26
Non-independent Events
0:27
Independent Events
2:00
Non-independent and Independent Events
3:08
Non-independent and Independent Events
3:09
Defining Independent Events
5:52
Defining Independent Events
5:53
Multiplication Rule
7:29
Previously…
7:30
But with Independent Events
8:53
Example 1: Which of These Pairs of Events are Independent?
11:12
Example 2: Health Insurance and Probability
15:12
Example 3: Independent Events
17:42
Example 4: Independent Events
20:03
Section 8: Probability Distributions
Introduction to Probability Distributions

56m 45s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Sampling vs. Probability
0:57
Sampling
0:58
Missing
1:30
What is Missing?
3:06
Insight: Probability Distributions
5:26
Insight: Probability Distributions
5:27
What is a Probability Distribution?
7:29
From Sample Spaces to Probability Distributions
8:44
Sample Space
8:45
Probability Distribution of the Sum of Two Dice
11:16
The Random Variable
17:43
The Random Variable
17:44
Expected Value
21:52
Expected Value
21:53
Example 1: Probability Distributions
28:45
Example 2: Probability Distributions
35:30
Example 3: Probability Distributions
43:37
Example 4: Probability Distributions
47:20
Expected Value & Variance of Probability Distributions

53m 41s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Discrete vs. Continuous Random Variables
1:04
Discrete vs. Continuous Random Variables
1:05
Mean and Variance Review
4:44
Mean: Sample, Population, and Probability Distribution
4:45
Variance: Sample, Population, and Probability Distribution
9:12
Example Situation
14:10
Example Situation
14:11
Some Special Cases…
16:13
Some Special Cases…
16:14
Linear Transformations
19:22
Linear Transformations
19:23
What Happens to Mean and Variance of the Probability Distribution?
20:12
n Independent Values of X
25:38
n Independent Values of X
25:39
Compare These Two Situations
30:56
Compare These Two Situations
30:57
Two Random Variables, X and Y
32:02
Two Random Variables, X and Y
32:03
Example 1: Expected Value & Variance of Probability Distributions
35:35
Example 2: Expected Values & Standard Deviation
44:17
Example 3: Expected Winnings and Standard Deviation
48:18
Binomial Distribution

55m 15s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Discrete Probability Distributions
1:42
Discrete Probability Distributions
1:43
Binomial Distribution
2:36
Binomial Distribution
2:37
Multiplicative Rule Review
6:54
Multiplicative Rule Review
6:55
How Many Outcomes with k 'Successes'
10:23
Adults and Bachelor's Degree: Manual List of Outcomes
10:24
P (X=k)
19:37
Putting Together # of Outcomes with the Multiplicative Rule
19:38
Expected Value and Standard Deviation in a Binomial Distribution
25:22
Expected Value and Standard Deviation in a Binomial Distribution
25:23
Example 1: Coin Toss
33:42
Example 2: College Graduates
38:03
Example 3: Types of Blood and Probability
45:39
Example 4: Expected Number and Standard Deviation
51:11
Section 9: Sampling Distributions of Statistics
Introduction to Sampling Distributions

48m 17s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Probability Distributions vs. Sampling Distributions
0:55
Probability Distributions vs. Sampling Distributions
0:56
Same Logic
3:55
Logic of Probability Distribution
3:56
Example: Rolling Two Dice
6:56
Simulating Samples
9:53
To Come Up with Probability Distributions
9:54
In Sampling Distributions
11:12
Connecting Sampling and Research Methods with Sampling Distributions
12:11
Connecting Sampling and Research Methods with Sampling Distributions
12:12
Simulating a Sampling Distribution
14:14
Experimental Design: Regular Sleep vs. Less Sleep
14:15
Logic of Sampling Distributions
23:08
Logic of Sampling Distributions
23:09
General Method of Simulating Sampling Distributions
25:38
General Method of Simulating Sampling Distributions
25:39
Questions that Remain
28:45
Questions that Remain
28:46
Example 1: Mean and Standard Error of Sampling Distribution
30:57
Example 2: What is the Best Way to Describe Sampling Distributions?
37:12
Example 3: Matching Sampling Distributions
38:21
Example 4: Mean and Standard Error of Sampling Distribution
41:51
Sampling Distribution of the Mean

1h 8m 48s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Special Case of General Method for Simulating a Sampling Distribution
1:53
Special Case of General Method for Simulating a Sampling Distribution
1:54
Computer Simulation
3:43
Using Simulations to See Principles behind Shape of SDoM
15:50
Using Simulations to See Principles behind Shape of SDoM
15:51
Conditions
17:38
Using Simulations to See Principles behind Center (Mean) of SDoM
20:15
Using Simulations to See Principles behind Center (Mean) of SDoM
20:16
Conditions: Does n Matter?
21:31
Conditions: Does Number of Simulations Matter?
24:37
Using Simulations to See Principles behind Standard Deviation of SDoM
27:13
Using Simulations to See Principles behind Standard Deviation of SDoM
27:14
Conditions: Does n Matter?
34:45
Conditions: Does Number of Simulations Matter?
36:24
Central Limit Theorem
37:13
SHAPE
38:08
CENTER
39:34
SPREAD
39:52
Comparing Population, Sample, and SDoM
43:10
Comparing Population, Sample, and SDoM
43:11
Answering the 'Questions that Remain'
48:24
What Happens When We Don't Know What the Population Looks Like?
48:25
Can We Have Sampling Distributions for Summary Statistics Other than the Mean?
49:42
How Do We Know whether a Sample is Sufficiently Unlikely?
53:36
Do We Always Have to Simulate a Large Number of Samples in Order to get a Sampling Distribution?
54:40
Example 1: Mean Batting Average
55:25
Example 2: Mean Sampling Distribution and Standard Error
59:07
Example 3: Sampling Distribution of the Mean
1:01:04
Sampling Distribution of Sample Proportions

54m 37s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Intro to Sampling Distribution of Sample Proportions (SDoSP)
0:51
Categorical Data (Examples)
0:52
Wish to Estimate Proportion of Population from Sample…
2:00
Notation
3:34
Population Proportion and Sample Proportion Notations
3:35
What's the Difference?
9:19
SDoM vs. SDoSP: Type of Data
9:20
SDoM vs. SDoSP: Shape
11:24
SDoM vs. SDoSP: Center
12:30
SDoM vs. SDoSP: Spread
15:34
Binomial Distribution vs. Sampling Distribution of Sample Proportions
19:14
Binomial Distribution vs. SDoSP: Type of Data
19:17
Binomial Distribution vs. SDoSP: Shape
21:07
Binomial Distribution vs. SDoSP: Center
21:43
Binomial Distribution vs. SDoSP: Spread
24:08
Example 1: Sampling Distribution of Sample Proportions
26:07
Example 2: Sampling Distribution of Sample Proportions
37:58
Example 3: Sampling Distribution of Sample Proportions
44:42
Example 4: Sampling Distribution of Sample Proportions
45:57
Section 10: Inferential Statistics
Introduction to Confidence Intervals

42m 53s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Inferential Statistics
0:50
Inferential Statistics
0:51
Two Problems with This Picture…
3:20
Two Problems with This Picture…
3:21
Solution: Confidence Intervals (CI)
4:59
Solution: Hypothesis Testing (HT)
5:49
Which Parameters are Known?
6:45
Which Parameters are Known?
6:46
Confidence Interval - Goal
7:56
When We Don't Know μ but Know σ
7:57
When We Don't Know
18:27
When We Don't Know μ nor σ
18:28
Example 1: Confidence Intervals
26:18
Example 2: Confidence Intervals
29:46
Example 3: Confidence Intervals
32:18
Example 4: Confidence Intervals
38:31
t Distributions

1h 2m 6s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
When to Use z vs. t?
1:07
When to Use z vs. t?
1:08
What is z and t?
3:02
z-score and t-score: Commonality
3:03
z-score and t-score: Formulas
3:34
z-score and t-score: Difference
5:22
Why not z? (Why t?)
7:24
Why not z? (Why t?)
7:25
But Don't Worry!
15:13
Gosset and t-distributions
15:14
Rules of t Distributions
17:05
t-distributions are More Normal as n Gets Bigger
17:06
t-distributions are a Family of Distributions
18:55
Degrees of Freedom (df)
20:02
Degrees of Freedom (df)
20:03
t Family of Distributions
24:07
t Family of Distributions: df = 2, 4, and 60
24:08
df = 60
29:16
df = 2
29:59
How to Find It?
31:01
'Student's t-distribution' or 't-distribution'
31:02
Excel Example
33:06
Example 1: Which Distribution Do You Use? Z or t?
45:26
Example 2: Friends on Facebook
47:41
Example 3: t Distributions
52:15
Example 4: t Distributions, Confidence Interval, and Mean
55:59
Introduction to Hypothesis Testing

1h 6m 33s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Issues to Overcome in Inferential Statistics
1:35
Issues to Overcome in Inferential Statistics
1:36
What Happens When We Don't Know What the Population Looks Like?
2:57
How Do We Know whether a Sample is Sufficiently Unlikely?
3:43
Hypothesizing a Population
6:44
Hypothesizing a Population
6:45
Null Hypothesis
8:07
Alternative Hypothesis
8:56
Hypotheses
11:58
Hypotheses
11:59
Errors in Hypothesis Testing
14:22
Errors in Hypothesis Testing
14:23
Steps of Hypothesis Testing
21:15
Steps of Hypothesis Testing
21:16
Single Sample HT (When Sigma Available)
26:08
Example: Average Facebook Friends
26:09
Step 1
27:08
Step 2
27:58
Step 3
28:17
Step 4
32:18
Single Sample HT (When Sigma Not Available)
36:33
Example: Average Facebook Friends
36:34
Step 1: Hypothesis Testing
36:58
Step 2: Significance Level
37:25
Step 3: Decision Stage
37:40
Step 4: Sample
41:36
Sigma and p-value
45:04
Sigma and p-value
45:05
One-Tailed vs. Two-Tailed Hypotheses
45:51
Example 1: Hypothesis Testing
48:37
Example 2: Heights of Women in the US
57:43
Example 3: Select the Best Way to Complete This Sentence
1:03:23
Confidence Intervals for the Difference of Two Independent Means

55m 14s

Intro
0:00
Roadmap
0:14
Roadmap
0:15
One Mean vs. Two Means
1:17
One Mean vs. Two Means
1:18
Notation
2:41
A Sample! A Set!
2:42
Mean of X, Mean of Y, and Difference of Two Means
3:56
SE of X
4:34
SE of Y
6:28
Sampling Distribution of the Difference between Two Means (SDoD)
7:48
Sampling Distribution of the Difference between Two Means (SDoD)
7:49
Rules of the SDoD (similar to CLT!)
15:00
Mean for the SDoD Null Hypothesis
15:01
Standard Error
17:39
When can We Construct a CI for the Difference between Two Means?
21:28
Three Conditions
21:29
Finding CI
23:56
One Mean CI
23:57
Two Means CI
25:45
Finding t
29:16
Finding t
29:17
Interpreting CI
30:25
Interpreting CI
30:26
Better Estimate of s (s pool)
34:15
Better Estimate of s (s pool)
34:16
Example 1: Confidence Intervals
42:32
Example 2: SE of the Difference
52:36
Hypothesis Testing for the Difference of Two Independent Means

50m

Intro
0:00
Roadmap
0:06
Roadmap
0:07
The Goal of Hypothesis Testing
0:56
One Sample and Two Samples
0:57
Sampling Distribution of the Difference between Two Means (SDoD)
3:42
Sampling Distribution of the Difference between Two Means (SDoD)
3:43
Rules of the SDoD (Similar to CLT!)
6:46
Shape
6:47
Mean for the Null Hypothesis
7:26
Standard Error for Independent Samples (When Variance is Homogenous)
8:18
Standard Error for Independent Samples (When Variance is not Homogenous)
9:25
Same Conditions for HT as for CI
10:08
Three Conditions
10:09
Steps of Hypothesis Testing
11:04
Steps of Hypothesis Testing
11:05
Formulas that Go with Steps of Hypothesis Testing
13:21
Step 1
13:25
Step 2
14:18
Step 3
15:00
Step 4
16:57
Example 1: Hypothesis Testing for the Difference of Two Independent Means
18:47
Example 2: Hypothesis Testing for the Difference of Two Independent Means
33:55
Example 3: Hypothesis Testing for the Difference of Two Independent Means
44:22
Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

1h 14m 11s

Intro
0:00
Roadmap
0:09
Roadmap
0:10
The Goal of Hypothesis Testing
1:27
One Sample and Two Samples
1:28
Independent Samples vs. Paired Samples
3:16
Independent Samples vs. Paired Samples
3:17
Which is Which?
5:20
Independent SAMPLES vs. Independent VARIABLES
7:43
Independent SAMPLES vs. Independent VARIABLES
7:44
T-tests Always…
10:48
T-tests Always…
10:49
Notation for Paired Samples
12:59
Notation for Paired Samples
13:00
Steps of Hypothesis Testing for Paired Samples
16:13
Steps of Hypothesis Testing for Paired Samples
16:14
Rules of the SDoD (Adding on Paired Samples)
18:03
Shape
18:04
Mean for the Null Hypothesis
18:31
Standard Error for Independent Samples (When Variance is Homogenous)
19:25
Standard Error for Paired Samples
20:39
Formulas that go with Steps of Hypothesis Testing
22:59
Formulas that go with Steps of Hypothesis Testing
23:00
Confidence Intervals for Paired Samples
30:32
Confidence Intervals for Paired Samples
30:33
Example 1: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
32:28
Example 2: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
44:02
Example 3: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
52:23
Type I and Type II Errors

31m 27s

Intro
0:00
Roadmap
0:18
Roadmap
0:19
Errors and Relationship to HT and the Sample Statistic?
1:11
Errors and Relationship to HT and the Sample Statistic?
1:12
Instead of a Box…Distributions!
7:00
One Sample t-test: Friends on Facebook
7:01
Two Sample t-test: Friends on Facebook
13:46
Usually, Lots of Overlap between Null and Alternative Distributions
16:59
Overlap between Null and Alternative Distributions
17:00
How Distributions and 'Box' Fit Together
22:45
How Distributions and 'Box' Fit Together
22:46
Example 1: Types of Errors
25:54
Example 2: Types of Errors
27:30
Example 3: What is the Danger of the Type I Error?
29:38
Effect Size & Power

44m 41s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Distance between Distributions: Sample t
0:49
Distance between Distributions: Sample t
0:50
Problem with Distance in Terms of Standard Error
2:56
Problem with Distance in Terms of Standard Error
2:57
Test Statistic (t) vs. Effect Size (d or g)
4:38
Test Statistic (t) vs. Effect Size (d or g)
4:39
Rules of Effect Size
6:09
Rules of Effect Size
6:10
Why Do We Need Effect Size?
8:21
Tells You the Practical Significance
8:22
HT can be Deceiving…
10:25
Important Note
10:42
What is Power?
11:20
What is Power?
11:21
Why Do We Need Power?
14:19
Conditional Probability and Power
14:20
Power is:
16:27
Can We Calculate Power?
19:00
Can We Calculate Power?
19:01
How Does Alpha Affect Power?
20:36
How Does Alpha Affect Power?
20:37
How Does Effect Size Affect Power?
25:38
How Does Effect Size Affect Power?
25:39
How Does Variability and Sample Size Affect Power?
27:56
How Does Variability and Sample Size Affect Power?
27:57
How Do We Increase Power?
32:47
Increasing Power
32:48
Example 1: Effect Size & Power
35:40
Example 2: Effect Size & Power
37:38
Example 3: Effect Size & Power
40:55
Section 11: Analysis of Variance
F-distributions

24m 46s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
Z- & T-statistic and Their Distribution
0:34
Z- & T-statistic and Their Distribution
0:35
F-statistic
4:55
The F Ratio (the Variance Ratio)
4:56
F-distribution
12:29
F-distribution
12:30
s and p-value
15:00
s and p-value
15:01
Example 1: Why Does F-distribution Stop At 0 But Go On Until Infinity?
18:33
Example 2: F-distributions
19:29
Example 3: F-distributions and Heights
21:29
ANOVA with Independent Samples

1h 9m 25s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
The Limitations of t-tests
1:12
The Limitations of t-tests
1:13
Two Major Limitations of Many t-tests
3:26
Two Major Limitations of Many t-tests
3:27
Ronald Fisher's Solution… F-test! New Null Hypothesis
4:43
Ronald Fisher's Solution… F-test! New Null Hypothesis (Omnibus Test - One Test to Rule Them All!)
4:44
Analysis of Variance (ANoVA) Notation
7:47
Analysis of Variance (ANoVA) Notation
7:48
Partitioning (Analyzing) Variance
9:58
Total Variance
9:59
Within-group Variation
14:00
Between-group Variation
16:22
Time out: Review Variance & SS
17:05
Time out: Review Variance & SS
17:06
F-statistic
19:22
The F Ratio (the Variance Ratio)
19:23
S²bet = SSbet / dfbet
22:13
What is This?
22:14
How Many Means?
23:20
So What is the dfbet?
23:38
So What is SSbet?
24:15
S²w = SSw / dfw
26:05
What is This?
26:06
How Many Means?
27:20
So What is the dfw?
27:36
So What is SSw?
28:18
Chart of Independent Samples ANOVA
29:25
Chart of Independent Samples ANOVA
29:26
Example 1: Who Uploads More Photos: Unknown Ethnicity, Latino, Asian, Black, or White Facebook Users?
35:52
Hypotheses
35:53
Significance Level
39:40
Decision Stage
40:05
Calculate Samples' Statistic and p-Value
44:10
Reject or Fail to Reject H0
55:54
Example 2: ANOVA with Independent Samples
58:21
Repeated Measures ANOVA

1h 15m 13s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
The Limitations of t-tests
0:36
Who Uploads more Pictures and Which Photo-Type is Most Frequently Used on Facebook?
0:37
ANOVA (F-test) to the Rescue!
5:49
Omnibus Hypothesis
5:50
Analyze Variance
7:27
Independent Samples vs. Repeated Measures
9:12
Same Start
9:13
Independent Samples ANOVA
10:43
Repeated Measures ANOVA
12:00
Independent Samples ANOVA
16:00
Same Start: All the Variance Around Grand Mean
16:01
Independent Samples
16:23
Repeated Measures ANOVA
18:18
Same Start: All the Variance Around Grand Mean
18:19
Repeated Measures
18:33
Repeated Measures F-statistic
21:22
The F Ratio (The Variance Ratio)
21:23
S²bet = SSbet / dfbet
23:07
What is This?
23:08
How Many Means?
23:39
So What is the dfbet?
23:54
So What is SSbet?
24:32
S² resid = SS resid / df resid
25:46
What is This?
25:47
So What is SS resid?
26:44
So What is the df resid?
27:36
SS subj and df subj
28:11
What is This?
28:12
How Many Subject Means?
29:43
So What is df subj?
30:01
So What is SS subj?
30:09
SS total and df total
31:42
What is This?
31:43
What is the Total Number of Data Points?
32:02
So What is df total?
32:34
So What is SS total?
32:47
Chart of Repeated Measures ANOVA
33:19
Chart of Repeated Measures ANOVA: F and Between-samples Variability
33:20
Chart of Repeated Measures ANOVA: Total Variability, Within-subject (case) Variability, Residual Variability
35:50
Example 1: Which is More Prevalent on Facebook: Tagged, Uploaded, Mobile, or Profile Photos?
40:25
Hypotheses
40:26
Significance Level
41:46
Decision Stage
42:09
Calculate Samples' Statistic and p-Value
46:18
Reject or Fail to Reject H0
57:55
Example 2: Repeated Measures ANOVA
58:57
Example 3: What's the Problem with a Bunch of Tiny t-tests?
1:13:59
Section 12: Chi-square Test
Chi-Square Goodness-of-Fit Test

58m 23s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Where Does the Chi-Square Test Belong?
0:50
Where Does the Chi-Square Test Belong?
0:51
A New Twist on HT: Goodness-of-Fit
7:23
HT in General
7:24
Goodness-of-Fit HT
8:26
Hypotheses about Proportions
12:17
Null Hypothesis
12:18
Alternative Hypothesis
13:23
Example
14:38
Chi-Square Statistic
17:52
Chi-Square Statistic
17:53
Chi-Square Distributions
24:31
Chi-Square Distributions
24:32
Conditions for Chi-Square
28:58
Condition 1
28:59
Condition 2
30:20
Condition 3
30:32
Condition 4
31:47
Example 1: Chi-Square Goodness-of-Fit Test
32:23
Example 2: Chi-Square Goodness-of-Fit Test
44:34
Example 3: Which of These Statements Describe Properties of the Chi-Square Goodness-of-Fit Test?
56:06
Chi-Square Test of Homogeneity

51m 36s

Intro
0:00
Roadmap
0:09
Roadmap
0:10
Goodness-of-Fit vs. Homogeneity
1:13
Goodness-of-Fit HT
1:14
Homogeneity
2:00
Analogy
2:38
Hypotheses About Proportions
5:00
Null Hypothesis
5:01
Alternative Hypothesis
6:11
Example
6:33
Chi-Square Statistic
10:12
Same as Goodness-of-Fit Test
10:13
Set Up Data
12:28
Setting Up Data Example
12:29
Expected Frequency
16:53
Expected Frequency
16:54
Chi-Square Distributions & df
19:26
Chi-Square Distributions & df
19:27
Conditions for Test of Homogeneity
20:54
Condition 1
20:55
Condition 2
21:39
Condition 3
22:05
Condition 4
22:23
Example 1: Chi-Square Test of Homogeneity
22:52
Example 2: Chi-Square Test of Homogeneity
32:10
Section 13: Overview of Statistics
Overview of Statistics

18m 11s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
The Statistical Tests (HT) We've Covered
0:28
The Statistical Tests (HT) We've Covered
0:29
Organizing the Tests We've Covered…
1:08
One Sample: Continuous DV and Categorical DV
1:09
Two Samples: Continuous DV and Categorical DV
5:41
More Than Two Samples: Continuous DV and Categorical DV
8:21
The Following Data: OK Cupid
10:10
The Following Data: OK Cupid
10:11
Example 1: Weird-MySpace-Angle Profile Photo
10:38
Example 2: Geniuses
12:30
Example 3: Promiscuous iPhone Users
13:37
Example 4: Women, Aging, and Messaging
16:07

Lecture Comments (11)

0 answers

Post by Joseph Berk on January 10, 2017

The variable "friends" should have a "discrete" variable type not a "continuous". The reason is that you can't use all real numbers to represent the number of friends, but only integer values. For example - you can't have 1.5 friends so the data is "discontinuous" between the values of 1 & 2.

0 answers

Post by Michael Onizak on March 10, 2014

Just a heads up; those using MS Excel 2007 the pivot table button is located under the "Insert" tab not the "Data" tab. Also, as you saw the Apple version having three separate windows to create the pivot table where as the MS 2007 will be in one window. Hope this helps

0 answers

Post by steven bain on December 2, 2013

The files appear to the wrong files for this lesson waste of time!

0 answers

Post by Christopher Hu on November 17, 2013

hi Dr.  do you cover p-value?  thanks

0 answers

Post by Patrick Manuel on June 28, 2013

Where are the exercise files? I can't find them. Please post the exercise files. Thanks.

0 answers

Post by Monica Ballard on February 12, 2013

For the first exercise, my formula appears right in Excel but I keep getting the answer of 0 rather than 7

0 answers

Post by Brijesh Bolar on August 12, 2012

Very nice sessions. It is a pleasant surprise to learn excel as a part of stats course. I always wanted to learn excel for stats and this fulfills my wish.

2 answers

Last reply by: Jorge Delgado
Fri Nov 2, 2012 10:21 AM

Post by Robert Hsiao on January 27, 2012

I would like to download this data file, so it is easier to follow the lesson.

Frequency Distributions in Excel

  • Intro 0:00
  • Roadmap 0:08
    • Data in Excel and Frequency Distributions
  • Raw Data to Frequency Tables 0:42
    • Raw Data to Frequency Tables
    • Frequency Tables: Using Formulas and Pivot Tables
  • Example 1: Number of Births 7:17
  • Example 2: Age Distribution 20:41
  • Example 3: Height Distribution 27:45
  • Example 4: Height Distribution of Males 32:19

Transcription: Frequency Distributions in Excel

Hi welcome to www.educator.com.0000

We are going to be talking about how to create frequency distributions in Excel from raw data.0003

We are just going to overview one sample data set that is in Excel already; you can download it from one of the links below.0012

Then we are going to talk about how to create frequency distributions from that data,0022

but in order to create these visualizable, seeable distributions,0027

we need to go first from the data to frequency tables, then from the tables we will go to the visualizations.0034

First, going from raw data to frequency tables.0046

The reason we want to do this is oftentimes when we look at raw data it is really hard to make sense of.0050

It is just rows and rows and rows of data.0055

It would be nice if somebody could summarize that data for us so that we can visualize it.0059

When we summarize and visualize that data we get a sense of what the data looks like.0066

We are going to be talking later about actual shapes of distributions.0071

There are two ways to go and do frequency tables in Excel.0076

One is by using formulas.0080

Here we are going to be using the formula COUNTIF, and the other way is to use pivot tables.0083

I’m going to show you one example of using pivot tables but we are going to be using mostly the formulas.0090

If you want to open up your Excel file that has all of our data in it, this is a sample data set of 100 friends from www.facebook.com.0100

Notice that they all have this CID which is their case ID and each column shows some sort of characteristic or variable.0110

Each cell for each person has a value for that variable.0124

Let us look at example 1, CID 1, case number 1.0131

For this person they have 4 tagged photos, not a lot of tagged photos.0137

They seem to have 0 mobile uploads, again not a lot of mobile uploads; maybe they do not have a smartphone, right?0143

If we go down the line we could see that there are lots and lots and lots of variables here.0150

There are tagged photos, mobile photos, uploaded photos, profile pictures, then number of friends, number of siblings, relationship status right?0154

There is a whole bunch of these.0167

Here is one that we are going to be focusing on today, birth month.0169

Birth month is going to be important for us today.0172

We are going to be looking at age and height.0176

If I asked you if you see these 100 people and I will show them to you all at once so you could see them.0184

Here are these 100 people; what can you tell me about their age?0193

What can you tell me about their height?0197

It will be hard to do because it is just lines and lines and lines of data.0199

It will be nice if there was one way where we could just easily see all the data at once in a way where it was a little more tangible to us.0204

That is where we are going to be talking about how to visualize these and how to create frequency tables.0216

In the files that I provided for you, I put in little tabs already.0219

One of the sheets has all of our data in it and one of the sheets talks about the variables.0225

Here we have a whole bunch of different variable names like the case ID number, the tagged photos,0235

how many photos they are tagged in, mobile uploads, how many mobile photos uploaded, relationship status, birth month, birth year, gender.0239

These are a whole bunch of different variables that are already in this data set.0249

I also have a column that tells you what kind of measure it is.0254

Is it a nominal measure where it is just a number but it really stands for a name?0259

Relationship status is one of those where there is a number there like 1, 2, 3 or 4 but it does not mean0264

that the relationship status is literally like the number 1; it actually means, if you scroll over, that if they have a 0 their relationship status is blank.0271

If they have a 1 it means that they are single.0283

If they have a 2, that means they are in a relationship.0286

If it is 3, they are engaged.0289

If it is a 4, they are married and if it is a 5, it is complicated.0291

And if it is a 6, it is other, right?0295

That is an example of what we call a nominal type of measure.0297

Just so you can see all of these things at the same time, if you look down here there are these two little blue rectangles.0302

If you drag that over then you could sort of keep this column just static and locked while you move these columns.0311

We can also see that birth month is what we call an interval, it can also be seen as ordinal.0324

It is not quite interval because it is technically like 30 or 31 days, it is not exactly the same interval but you could sometimes call it interval.0332

Each of the numbers represent one of the months.0344

Birth year is also interval, there is an interval of exactly one year.0350

Gender is obviously nominal because even though there is a 1 or 2 it does not mean that their gender is 1 or 2.0354

It means that if they have a 1 they are male.0362

If they have a 2, they are female.0364

Some things like friends are really easy to understand though, because friends is a ratio measure.0366

It is the count of how many friends they have so that is continuous type of variable and if they have a 0 means they have no friends.0372

That is very rare on www.facebook.com but it could happen.0382

I’m going to move this locked piece over.0387

The next tab you could see there it says birth month on it.0391

So far I have created a little set up so that we could begin our frequency table.0397

A frequency table is just a count of how many people are born in January.0402

How many people are born in February and so on and so forth.0408

Now if we have to do that by hand it would be hard.0412

We have to go to our data, click on data.0414

Go to birth month and we have to count up how many people have one, 1, 2.0418

But this is a very error prone process so we are going to use Excel to help us do that really efficiently.0426

First, let us go to our first example.0436

We have here a data set with data from 100 www.facebook.com friends.0440

Are more of these friends born in a particular month, or is the number of births fairly uniform across the year?0444

Well is there reason to believe that one month is more popular for having babies than another month?0452

We are not sure but it is hard to see the answer to this question literally like see the answer to this question0458

by looking at the data because the data just look like this giant list.0464

That is why we are going to create frequency tables.0470

In order to create frequency tables we can start off with the formula.0474

In order to do a formula remember we always start off with the equal sign (=) to tell Excel “hey I’m doing a formula here”.0479

In order to count how many ones we have we could use the count formula.0486

It is a formula that is already prewritten in Excel.0493

Excel will just do it for us.0496

If we just stopped at the word count, it would just count how many things you have.0498

It would not count how many ones you have, right.0504

We want to use the formula COUNTIF; that is the function that we want to use.0508

What is handy about Excel is that once you type in something then it will tell you what inputs you need.0514

Here it says you need the range.0521

The range of cells that you want Excel to look at as well as the criteria.0523

Here I’m going to tell Excel we will look over at my data.0529

I’m going to click on data and click from this one all the way down to the very very last row.0535

And if I go back to birth month then it should say data from I2 all the way to I101, but it has it twice, so I’m going to delete this part.0547

That is the data that I wanted to look at.0567

This little colon right here is telling you the range.0570

It says go from I2 all the way to I101.0574

That is the range I want, and before I put in my criteria Excel reminds me I need a little comma in between.0579

I’m going to put a little comma.0589

What is my criteria? I wanted to count it if it is a 1.0591

I’m going to say count it if it is equal to whatever is in this cell.0598

Excel will automatically put in that this is part of the birth month sheet.0602

It actually does not need this one either but it will put it in automatically for you.0611

I’m going to delete that one just so you could see but you could have it there as well.0616

It does not matter.0620

Let me finish my little function and let us look at what it says.0623

It says COUNTIF: count the data in this range if it is equal to whatever is in A2, this one.0626

Let me hit enter and it should say 7.0636

7 people out of my 100 www.facebook.com friends are born in the month of January.0639

The great thing about Excel is that it is a relative program.0645

If I copy and paste this cell, one cell down it will take everything in my formula and sort of calibrate it one cell down, right.0649

Let me look at this: do I want it to bring everything one row down?0665

That means my data would go from I3 to I102.0671

That is not what I want.0677

I want the data part to stay the same but I want this part to move and moved down.0678

So that then it will say count if this data is equal to 2.0684

Here is what I’m going to do, to tell Excel keep this part the same.0691

I’m going to put in a dollar sign ($) right in front of the I and right in front of the 2.0695

This says freeze the row and freeze the column.0702

I’m going to put that also in front of this one, as well as that one.0706

That means this data set will never move but this A2 will move.0712
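
The effect of the dollar signs can be sketched outside Excel. The short Python function below is only an illustration, not how Excel works internally: `fill_down` is a hypothetical helper that mimics copying a formula one row down, shifting every unlocked row number and leaving `$`-locked rows alone. It only handles simple A1-style references written in capital letters.

```python
import re

# One A1-style reference: optional $, column letters, optional $, row digits.
_REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")

def fill_down(formula, rows=1):
    """Mimic copying a formula down `rows` rows.

    Row numbers without a $ in front shift down; $-locked rows stay put.
    Columns never change when copying straight down.
    """
    def shift(match):
        col_lock, col, row_lock, row = match.groups()
        new_row = int(row) if row_lock else int(row) + rows
        return f"{col_lock}{col}{row_lock}{new_row}"
    return _REF.sub(shift, formula)

# Without dollar signs the data range drifts to I3:I102.
print(fill_down("=COUNTIF(I2:I101,A2)"))
# With dollar signs the range is frozen and only the criteria cell moves.
print(fill_down("=COUNTIF($I$2:$I$101,A2)"))
```

The first call shows the problem from the lecture (the range slides to I3:I102), and the second shows the fix: the range stays at $I$2:$I$101 while A2 becomes A3.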

Notice that doing that does not change anything from my first row but I’m going to take this and copy it.0718

I’m just hitting either command c if you are on a mac and control c if you are on a pc and then pasting it one cell underneath.0724

Let us double click on this to see what it says.0736

It says COUNTIF, and my data stays exactly the same, from I2 to I101.0739

That is exactly what I wanted to do.0745

Notice that now my criteria has changed.0747

My criteria has moved one row down because I have copied and pasted my formula one row down.0750

Excel is relative.0757

It will move everything one row down.0759

Let us try it with the next one.0762

I’m just copying and pasting this one, one row down.0764

Let us double click on it to see what it says.0769

It says count if.0771

Data stays exactly the same from row 2 to 101 but now it is comparing it to whatever is in A4 which is March.0774

The nice thing about Excel is that if you look right at the corner here, there is this little box in the lower right hand corner.0785

If you put your mouse over that it will turn into a little cross.0794

If I drag that all the way down, it will copy and paste my formula again and again all the way down.0800

We could just check one of these down here once again my data set has stayed the same because I put those dollar signs ($) in there.0807

My criteria has moved down to A10 now.0816

I have my frequency table now.0820
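
For readers who want to check the idea outside Excel, here is a minimal Python sketch of the same frequency table. The list `birth_months` is made-up illustration data, not the lecture's 100-row file, and `countif` is a hypothetical helper that only imitates COUNTIF with a plain value as the criteria.

```python
from collections import Counter

# Hypothetical birth-month codes (1 = January ... 12 = December).
birth_months = [1, 7, 3, 7, 12, 1, 2, 7, 3, 1]

def countif(values, criterion):
    """Count how many entries equal the criterion."""
    return sum(1 for v in values if v == criterion)

# One row per month, including months nobody was born in.
frequency_table = {month: countif(birth_months, month) for month in range(1, 13)}

# Counter produces the same counts in a single pass over the data.
tally = Counter(birth_months)
assert all(frequency_table[m] == tally[m] for m in range(1, 13))

for month, count in frequency_table.items():
    print(month, count)
```

Looping over `range(1, 13)` plays the role of the month column in the sheet: every month gets a row, even when its count is 0.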

Frequency tables are nice because they just give you the raw numbers: in the month of January there are 7 people who have birthdays.0824

In the month of July there are 10 people who have birthdays then.0833

We could look at our data.0837

We could stop here but I want to show you another way that we could create frequency tables.0839

I’m going to go back on my data and show you a second way.0848

The second way is less common but I still want to show it to you because we may use it once in a while.0853

We are going to use what is called pivot tables.0857

What I’m going to do is just put my cursor anywhere and open my Excel toolbar.0862

Unfortunately, you cannot see it on this screen.0870

Open my Excel toolbar.0872

There is a little tab called data.0873

Seldom used.0877

If you scroll down there should be something that says pivot table or pivot table report.0880

I’m going to click on that.0888

Once that comes up, you should have a little pivot table wizard that pops up and you will say “where is this data you want to analyze?”0893

It is on my Microsoft Excel data base.0903

Is this the data you want to use?0907

Yes, I want from A all the way to N and from A1 all the way to 101.0908

That is next.0924

I want to put my pivot table on a new sheet, just so I can show you.0925

I’m just going to hit finish.0930

A new sheet should pop up; it will probably be called Sheet 1.0934

I’m just going to make this a little bigger for you.0939

A little pivot table should pop up.0945

You should also have a little pivot table tool bar that also pops up.0949

Let me drag it in for you.0955

Here we go.0966

This is the little pivot table tool bar that comes up.0967

This pivot table tool bar has all of my variables in it.0970

I could drag these variables into this pivot table down here.0975

It actually shows why it is called a pivot table.0979

I assume it is because you could move these variables from one corner to another and that is where we get the pivot.0982

What we want is a bunch of months on this side and then I want it to tell me how many people are born in that month.0990

I’m going to look for birth month and put it in my row fields because each row is going to be a birth month.1000

I’m going to take that birth month and drag it into my data as well.1006

What it does is sum up the birth month values.1012

For January it sums up 1, seven times but for 2 I do not want it to sum up.1018

Instead I’m going to tell my pivot table to count how many there are, not sum them up.1026

Go to pivot table and go to field settings and I will hit count instead of sum.1032

Then hit Ok.1039

When that happens you can see we basically get the same numbers that we have when we use the formula.1040

In the month of January we have 7.1045

In the month of July we have 10.1048

This is another way that you could look for frequency tables.1050
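
The difference between Sum and Count in the field settings is easy to see in a small sketch. The Python below uses made-up month codes, not the lecture's data, and simply groups by month the way the pivot table does.

```python
from collections import defaultdict

# Hypothetical month codes: three people born in February (2), two in January (1).
birth_months = [2, 1, 2, 1, 2]

sums = defaultdict(int)
counts = defaultdict(int)
for month in birth_months:
    sums[month] += month   # "Sum" adds up the codes themselves
    counts[month] += 1     # "Count" adds one per row

print(dict(sums))    # totals of the codes, not a frequency
print(dict(counts))  # how many people fall in each month
```

For January the two settings happen to agree because the code is 1, which is why the problem only shows up from February onward.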

Notice that this one is pretty fast.1056

Pivot tables do require a little bit of work on the front end; there is a bit of a learning curve.1059

Once you do understand that, they are really handy.1066

We may be using them again in the future.1068

If you do not feel comfortable with them, feel free to also use the formulas.1072

I will be using the formulas for the rest of this lesson.1076

Let us go back to my birth month.1079

My birth month frequency table was created just through Excel formulas by themselves.1083

I have this nice frequency table but it will be nice if I could visualize it.1089

Here I have to read each row, and although for 12 months it is not so bad, there might be times when this is less helpful to us.1097

What I’m going to do is highlight the data that I want to visualize and then hit chart.1105

It should be one of the tabs up here or you could go get it through one of your Excel tabs.1115

I’m going to say give it to me in columns, or you could use bar as well.1122

In Excel, bar just means it is on its side.1133

I’m going to use columns for now.1135

I will just pick the first one.1138

It seems the simplest.1140

I’m just going to delete that legend, it is redundant.1144

Here is my frequency table and we could literally see our data.1149

It also tells me what each of these bars stands for.1157

It stands for 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.1162

What Excel will do is it will automatically seed your X axis with just those numbers starting from 1 and it will go up.1166

With months that is handy, but it is the same thing that Excel always does.1174

In another example we will need to put in our own X axis.1179

Notice that here these are not means, they are not averages, these are frequencies.1185

This means that 7 people were born in January, 10 people are born in July and 7 people are born in December.1191

And so that is what our birth month frequency visualization looks like.1205

This is our frequency distribution for birth month.1210

Let me minimize this.1215

Are more of our friends born in a particular month? Is one month particularly popular for when our friends are born?1217

It does not seem to be the case, the months all tend to be from something like 7 – 10 people per month.1225

It seems that the numbers are fairly uniform.1233

Let us go into our second example.1241

Here is another example and now let us take our same data, the data from 100 www.facebook.com friends1243

and we are going to look at what is the age distribution in this sample.1249

Here is my Excel data I'm just going to click on the data sheet and here when we go up and look for age we could see here is a whole bunch of ages.1255

It seems like there is a lot of people in their 20’s.1268

A few people in their late teens but here we see some people who are 0 years old.1274

In this data set, if they have 0 it means that they do not list their year of birth or they do not list their age.1281

Maybe they are embarrassed, maybe they are too young.1289

I do not know.1291

We do not learn a lot by just scrolling up and down on this data.1296

That is why it will be nice if we could look at a frequency table or look at a distribution visually.1300

I'm going to click on my age sheet, and here I have already set it up so that we could just1306

do our frequency table really easily from the lowest age in our sample which is 17 I have ignored the 0 obviously1314

to the oldest age in our sample which is 38 and there is all the ages in between.1323

Let us go ahead and put in our formula to find out how many people in our sample are 17 years old.1329

To start a formula we start with the equal sign (=).1335

We use count if because we do not want to count everybody, we just want to count the people who are 17.1338

Let us tell Excel where it should find our data, what is the range of data.1346

I’m going to click on data and click from this cell all the way down to row 101.1352

I know I need a comma after that.1364

I’m going to delete that part.1371

Here is our data range and I wanted to count it if this person is 17.1373

My inputs are there. Remember we want this data to stay the same all the time.1385

We do not want it to move because Excel will move it if it has the chance to.1389

I’m going to put dollar sign ($) in front of the L and the 2 and in front of the column indicator1394

and the row indicator to tell it to lock this data in place.1400

Always use this data, do not change.1404

Once I have that formula, I'm just going to drag it all the way down so that it counts up the frequencies for 18, 19, 20 year olds.1409

Let us back check. Let us look at 21 year olds.1424

It says count if our data set has stayed the same because we have locked it in with our dollar sign ($).1426

Now it is saying I will count these people if they are 21 years old, that is our criteria.1433

It looks like our formula has copied and pasted quite well.1438

Notice that for some of these ages, the frequency is 0.1443

There are 0 people who are 26 years old in our sample.1449

Now why do I want to keep that 26 in there?1454

If we skipped 26 and 28, 29, 30, 31, 32, and we looked and saw that there is one 27 year old and one 33 year old,1457

we might mistakenly assume that from 27 to 33 there is equal chance of having1471

at least one person from our sample being sort of in that range.1478

You could see that is actually not true.1483

In between there, there is like a big desert of nobody and we want our distribution to reflect that.1486

Age is a continuous variable and so we do not want to skip any ages.1494

We want to show how the distribution looks as we look at age continuously.1500
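
Here is a small Python sketch of that point, using invented ages. Tallying only the ages that actually occur hides the gap, while listing every age from the minimum to the maximum makes the run of zeros visible. The variable names are illustrative only.

```python
from collections import Counter

# Hypothetical ages with a gap between the early 20s and the older cases.
ages = [20, 21, 21, 22, 27, 33]
counts = Counter(ages)

# Table of observed ages only: 27 and 33 look like neighbors of 22.
observed_only = sorted(counts.items())

# Table with every age in the range: a Counter returns 0 for missing ages.
full_table = [(age, counts[age]) for age in range(min(ages), max(ages) + 1)]

# The ages that the first table silently drops.
empty_ages = [age for age, freq in full_table if freq == 0]

print(observed_only)
print(full_table)
print(empty_ages)
```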

This is nice because we can already see that the ages are clumped or clustered around age maybe 20 – 22, early 20’s.1507

It will be nice if we could really look at this.1518

One thing you might want to do is select both age and frequency.1520

Go to charts and we are going to do an X, Y scatter.1530

For those of you who have Microsoft Excel later than 2008, like 2009 and later you can go directly to column1538

but here we are going to start with 2008 Excel.1549

We are going to need to do a little fix.1555

First I’m going to click on a scatter.1557

A scatter is nice because it shows you both the age and the frequency.1560

This point, for example, is age 17 and its frequency.1566

Once we have that then I'm going to go to column and then it will show me 17 through 38.1572

If I had gone directly to column, here is what will happen.1584

If I did not go through scatter first, here is what will happen.1589

Let us say I just selected the frequency and went directly to column. It will not give me the proper ages on my X axis;1595

it will only give me Excel’s default setting for the X axis, which is just labeling it from 1 all the way to 22,1604

or however many there are. That is not what we want.1614

Instead we would rather have Excel label the correct ages for us.1618

Just so that we will know that this is a frequency distribution of ages later.1625
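
The labeling problem can be sketched in Python. The ages 17 through 38 come from the lecture, but the frequencies below are invented placeholders, and the variable names are illustrative. When only the frequencies are charted, each bar is labeled by its position (1, 2, 3, ...); pairing each frequency with its age keeps the real values on the axis.

```python
# Ages 17 through 38 inclusive, as in the lecture's table.
ages = list(range(17, 39))

# Invented frequencies, one per age (NOT the lecture's actual counts).
frequencies = [2, 9, 14, 16, 18, 15, 10, 6, 3, 0, 1,
               0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]

# What a chart built from the frequencies alone would show on the X axis.
default_labels = list(range(1, len(frequencies) + 1))

# What we actually want: each bar labeled with the age it belongs to.
labeled_bars = list(zip(ages, frequencies))

print(default_labels[0], "to", default_labels[-1])      # positions only
print(labeled_bars[0][0], "to", labeled_bars[-1][0])    # real ages
```

There are 38 - 17 + 1 = 22 ages, which is why the default labels run from 1 to 22.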

We should go and label our horizontal X axis; we can label that age.1632

That way we will know it is a frequency table, and specifically a frequency table of ages; that is what the 17 stands for.1641

What is the age distribution? The sample is largely young.1654

They are mostly on the young side with a few people sort of in their 30’s.1658

Example 3, again from our same www.facebook.com data, what is the height distribution in this data?1666

What do their heights look like?1673

Let us see.1675

If we click on data and we look at their heights, their heights are listed in inches.1679

Remember that 5 × 12 is 60, 60 inches is about 5 feet tall and then 68 is 5’8.1685

It is a quick way to think of it.1696

72 is 6 feet tall; that person is pretty tall.1698

Once again, if we just look at these row by row, it is just a bunch of numbers.1703

We do not need that, we would rather have a nice frequency table.1709

Let us go to height.1713

I have already seeded it for you with the minimum height in our data set as well as the maximum height in our data set.1716

The minimum height happens to be just shy of 5 feet, 4’10.1724

This one is a little bit more than 6 feet tall, 6’3.1734
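
The inch-to-feet conversions mentioned in this example can be checked with a few lines of Python. The function name `feet_and_inches` is made up for this sketch; it simply divides by 12 and keeps the remainder.

```python
def feet_and_inches(total_inches):
    """Convert a height in inches to a feet'inches" string."""
    feet, inches = divmod(total_inches, 12)  # 12 inches per foot
    return f"{feet}'{inches}\""


# The heights mentioned in the lecture: minimum, 5 feet, 5'8, 6 feet, maximum.
for height in (58, 60, 68, 72, 75):
    print(height, "inches =", feet_and_inches(height))
```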

Let us put in our frequency function.1740

COUNTIF, and let us go ahead and select the data that we want to use.1746

Now that we know we basically need to lock it in place, let us do that right here.1759

Let us lock it in place.1766

We already locked our data in and what is our criteria?1774

I want you to count it if they are 58 inches tall.1779

It seems that there is only one person in our data set of 100 that has that height.1787

I’m just going to copy and paste that all the way down.1792

Once again I'm just going to spot check 69 inches tall: COUNTIF, and this is the correct data.1796

It is locked in and this is the correct criteria that I wanted to use for that row.1806

Good.1811

When we look at this, it seems that there is not just one cluster.1814

It seems like there is this sort of giant spread out cluster.1818

It will be nice if we can look at this visually.1825

Let us go ahead and select both columns.1829

Go to chart and go ahead and select XY scatter.1833

This is going to give us both, it is going to use the height as the x coordinate and the frequency as the y coordinate.1839

Here we see that all our frequencies are up here because all of our heights are from 58 to 75 inches.1850

Let us change that into a column chart.1862

Here is what our distribution looks like.1870

Just in case we come back to this later it will be nice to know what these numbers down here represent.1874

I'm going to go to my formatting palette, I’m going to close that.1879

I’m going to go to my formatting palette and tell my horizontal axis that it should be labeled height in inches.1884

That is what our distribution of heights looks like.1902

It looks like these over here, this one seems pretty popular and these seem sort of popular.1907

These are less likely and this one a little bit less likely.1915

This is sort of what our shape looks like, and it is nice and really easy to see when we see it in a visualization.1921

It is harder to see when we just look at the list of numbers.1928
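
As a rough stand-in for the column chart, a frequency table can be drawn as text. This Python sketch is only an illustration: `text_chart` is a hypothetical helper, and apart from the single person at 58 inches the frequencies are invented.

```python
def text_chart(table, symbol="#"):
    """Draw one row per value, with one symbol per case at that value."""
    lines = []
    for value, freq in table:
        lines.append(f"{value:>3} | {symbol * freq}")
    return "\n".join(lines)


# (height in inches, frequency); only the 58-inch count comes from the lecture.
heights_table = [(58, 1), (59, 0), (60, 3), (61, 2)]

chart = text_chart(heights_table)
print(chart)
```

Even in this crude form the shape is easier to read than the column of counts, and the empty row at 59 stays visible.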

Let us move on to our next example.1934

Example 4, now that is the height distribution of everybody in our 100 person www.facebook.com example.1940

But it is a mix of males and females.1948

What if we just wanted the height distribution of males?1951

After all males tend to be taller than females.1954

Their distributions might look different.1956

Let us look at the height distribution only of males.1958

We could also look at only the height distribution of females.1962

Feel free to do that if you want.1965

Here I'm going to use my height by gender and there is a male frequency column and a female frequency column.1970

Once again here are my heights, but we will have to figure out in our data set which rows belong to males and which rows belong to females.1982

Let us go back to our data set.1993

Here is my column for gender, my variable of gender.1998

Some people are gender number 2 and some people are gender number 1.2003

If we look at our variables we could see that gender has been dummy coded because it is a nominal measure.2008

They got 0 if their gender was blank or unavailable.2019

They got 1 if their gender was male and 2 if their gender was female.2024
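
The coding scheme just described can be written down as a small lookup table. The Python below is a sketch with made-up codes for five people; `GENDER_LABELS` is an illustrative name. Because gender is a nominal measure, the numbers are only labels, so the code 2 does not mean "twice" the code 1.

```python
# Coding scheme from the lecture: 0 = blank/unavailable, 1 = male, 2 = female.
GENDER_LABELS = {0: "blank or unavailable", 1: "male", 2: "female"}

# Hypothetical codes for five people, not the lecture's data.
codes = [2, 1, 1, 0, 2]

# Translate each numeric code back into its category label.
labels = [GENDER_LABELS[code] for code in codes]
print(labels)
```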

Here is what we will do, we will take all of our data and sort it by gender2030

so that all the 1s are clumped together and all the 2s are clumped together.2035

I'm going to use sort.2041

Sorry about that.2054

I think I did it and ended it, alright.2059

I’m going to use gender and I’m just going to sort it by clicking in this column.2060

I just want to make sure that these guys all moved with each other.2070

Now it is sorted so that all of my data for males is up on top and then all of my data for females is at the bottom.2077
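
The important part of that sort is that whole rows move together, so each height stays with its own gender code. A minimal Python sketch, assuming a few invented rows with hypothetical `case`, `gender`, and `height` fields:

```python
# Hypothetical rows; gender uses the lecture's coding (1 = male, 2 = female).
rows = [
    {"case": 1, "gender": 2, "height": 64},
    {"case": 2, "gender": 1, "height": 69},
    {"case": 3, "gender": 2, "height": 58},
    {"case": 4, "gender": 1, "height": 72},
]

# Sort whole rows by gender; sorted() is stable, so ties keep their order.
sorted_rows = sorted(rows, key=lambda row: row["gender"])

for row in sorted_rows:
    print(row["case"], row["gender"], row["height"])

# Each case still carries its original height after sorting.
heights_by_case = {row["case"]: row["height"] for row in sorted_rows}
```

Sorting only the gender column on its own would scramble which height belongs to which person, which is the thing to check for after sorting.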

Just to keep it straight for myself, I’m going to just color all the heights of males, all the values for height of males,2088

I'm going to color that with the blue font color.2098

Just to help myself keep it straight I’m going to color all the females height values with the sort of pinkish font color.2106

What does my distribution of only males look like?2119

We need to start off with the frequency table again.2123

Let us go to height by gender and here I will put in COUNTIF.2126

And let us put in my range.2136

Now my range is only going to be those that I have already colored blue2138

because I only want my range to be those that are already identified as males.2143

Here I’m going to select all these blue guys and put a comma.2150

And then tell it: if a male is 58 inches tall, then I definitely want you to count him.2164

It turns out there are 0 males that are that tall or that short for that matter.2178

We want to lock that data set in place because we know that this is not going to need to move for this column at least.2184

I’m going to go ahead and copy and paste that all the way down and we see that2195

for the males the heights are sort of clustered up here rather than down here.2200

I wonder if that is the same for females.2208

Even though our question was really about males, why do we not do females too, just to see.2210

I’m going to start with my COUNTIF.2217

The range for females needs to be all the data that has been already identified as females.2221

Here are these pink women and I’m going to go ahead and put in a comma because I know I will need one.2227

Go back here and I will say check if the female is that height.2235

Once again I want to lock in my data.2243

I do not want that to move when I copy and paste.2248

And then it turns out that our one person who is 58 inches tall before happened to be female and I’m going to drag that all the way down.2253

We see something different in females than we saw in males.2263

Females tend to be clustered around here, with the most frequent height being about 64 inches.2267

For males, the heights are sort of clustered up here with the most frequent height being 69 inches.2275
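
The two frequency columns can also be sketched in Python. This is an illustration under assumptions, not the lecture's method or data: the rows are invented (chosen so the pattern resembles the one just described), `countif` is a hypothetical helper, and filtering by gender code replaces the sort-and-color step.

```python
def countif(values, criterion):
    """Count how many entries equal the criterion, like Excel's COUNTIF."""
    return sum(1 for v in values if v == criterion)


# Hypothetical (gender code, height in inches) rows; 1 = male, 2 = female.
rows = [(1, 69), (2, 64), (1, 70), (2, 64), (2, 58), (1, 69), (1, 68), (2, 66)]

# Split the heights by gender code instead of sorting and coloring.
male_heights = [height for gender, height in rows if gender == 1]
female_heights = [height for gender, height in rows if gender == 2]

# Use the same height column for both groups, with no heights skipped.
all_heights = [height for _, height in rows]
height_column = range(min(all_heights), max(all_heights) + 1)

male_freq = {h: countif(male_heights, h) for h in height_column}
female_freq = {h: countif(female_heights, h) for h in height_column}

# The most frequent height in each group (the tallest bar in each chart).
most_common_male = max(male_freq, key=male_freq.get)
most_common_female = max(female_freq, key=female_freq.get)

print("58 inches:", male_freq[58], "males,", female_freq[58], "females")
print("most frequent heights:", most_common_male, most_common_female)
```

Using one shared height column for both groups keeps the two distributions directly comparable, row by row.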

Let us look at this now in a visualization.2283

I’m just going to look at the heights of males for now.2288

Hit chart and go to XY scatter because I want to know both the height and the frequency of that height.2293

We see that males are clustered up here.2305

Let me change that into a column and what do we see?2309

We see that it is like a pile.2318

The males are sort of piled up around 68 – 70 and it falls off closer to 5 feet tall.2321

There are not as many people who are way taller than 6 feet.2330

That is the chart for males.2337

Feel free to go ahead and do the chart for females.2340

That is the end of our examples today.2346

Thanks for using www.educator.com.2348
