  Dr. Ji Son

Dotplots and Histograms in Excel

Section 1: Introduction
Descriptive Statistics vs. Inferential Statistics

25m 31s

Intro
0:00
0:10
0:11
Statistics
0:35
Statistics
0:36
Let's Think About High School Science
1:12
Measurement and Find Patterns (Mathematical Formula)
1:13
Statistics = Math of Distributions
4:58
Distributions
4:59
Problematic… but also GREAT
5:58
Statistics
7:33
How is It Different from Other Specializations in Mathematics?
7:34
Statistics is Fundamental in Natural and Social Sciences
7:53
Two Skills of Statistics
8:20
Description (Exploration)
8:21
Inference
9:13
Descriptive Statistics vs. Inferential Statistics: Apply to Distributions
9:58
Descriptive Statistics
9:59
Inferential Statistics
11:05
Populations vs. Samples
12:19
Populations vs. Samples: Is it the Truth?
12:20
Populations vs. Samples: Pros & Cons
13:36
Populations vs. Samples: Descriptive Values
16:12
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:10
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:11
Example 1: Descriptive Statistics vs. Inferential Statistics
19:09
Example 2: Descriptive Statistics vs. Inferential Statistics
20:47
Example 3: Sample, Parameter, Population, and Statistic
21:40
Example 4: Sample, Parameter, Population, and Statistic
23:28
Section 2: About Samples: Cases, Variables, Measurements

32m 14s

Intro
0:00
Data
0:09
Data, Cases, Variables, and Values
0:10
Rows, Columns, and Cells
2:03
Example: Aircraft
3:52
How Do We Get Data?
5:38
Research: Question and Hypothesis
5:39
Research Design
7:11
Measurement
7:29
Research Analysis
8:33
Research Conclusion
9:30
Types of Variables
10:03
Discrete Variables
10:04
Continuous Variables
12:07
Types of Measurements
14:17
Types of Measurements
14:18
Types of Measurements (Scales)
17:22
Nominal
17:23
Ordinal
19:11
Interval
21:33
Ratio
24:24
Example 1: Cases, Variables, Measurements
25:20
Example 2: Which Scale of Measurement is Used?
26:55
Example 3: What Kind of a Scale of Measurement is This?
27:26
Example 4: Discrete vs. Continuous Variables.
30:31
Section 3: Visualizing Distributions
Introduction to Excel

8m 9s

Intro
0:00
Before Visualizing Distribution
0:10
Excel
0:11
Excel: Organization
0:45
Workbook
0:46
Column x Rows
1:50
Tools: Menu Bar, Standard Toolbar, and Formula Bar
3:00
Excel + Data
6:07
Excel and Data
6:08
Frequency Distributions in Excel

39m 10s

Intro
0:00
0:08
Data in Excel and Frequency Distributions
0:09
Raw Data to Frequency Tables
0:42
Raw Data to Frequency Tables
0:43
Frequency Tables: Using Formulas and Pivot Tables
1:28
Example 1: Number of Births
7:17
Example 2: Age Distribution
20:41
Example 3: Height Distribution
27:45
Example 4: Height Distribution of Males
32:19
Frequency Distributions and Features

25m 29s

Intro
0:00
0:10
Data in Excel, Frequency Distributions, and Features of Frequency Distributions
0:11
Example #1
1:35
Uniform
1:36
Example #2
2:58
Unimodal, Skewed Right, and Asymmetric
2:59
Example #3
6:29
Bimodal
6:30
Example #4a
8:29
Symmetric, Unimodal, and Normal
8:30
Point of Inflection and Standard Deviation
11:13
Example #4b
12:43
Normal Distribution
12:44
Summary
13:56
Uniform, Skewed, Bimodal, and Normal
13:57
17:34
Sketch Problem 2: Life Expectancy
20:01
Sketch Problem 3: Telephone Numbers
22:01
Sketch Problem 4: Length of Time Used to Complete a Final Exam
23:43
Dotplots and Histograms in Excel

42m 42s

Intro
0:00
0:06
0:07
Previously
1:02
Data, Frequency Table, and Visualization
1:03
Dotplots
1:22
Dotplots Excel Example
1:23
Dotplots: Pros and Cons
7:22
Pros and Cons of Dotplots
7:23
Dotplots Excel Example Cont.
9:07
Histograms
12:47
Histograms Overview
12:48
Example of Histograms
15:29
Histograms: Pros and Cons
31:39
Pros
31:40
Cons
32:31
Frequency vs. Relative Frequency
32:53
Frequency
32:54
Relative Frequency
33:36
Example 1: Dotplots vs. Histograms
34:36
Example 2: Age of Pennies Dotplot
36:21
Example 3: Histogram of Mammal Speeds
38:27
Example 4: Histogram of Life Expectancy
40:30
Stemplots

12m 23s

Intro
0:00
0:05
0:06
What Sets Stemplots Apart?
0:46
Data Sets, Dotplots, Histograms, and Stemplots
0:47
Example 1: What Do Stemplots Look Like?
1:58
Example 2: Back-to-Back Stemplots
5:00
7:46
Example 4: Quiz Grade & Afterschool Tutoring Stemplot
9:56
Bar Graphs

22m 49s

Intro
0:00
0:05
0:08
Review of Frequency Distributions
0:44
Y-axis and X-axis
0:45
Types of Frequency Visualizations Covered so Far
2:16
Introduction to Bar Graphs
4:07
Example 1: Bar Graph
5:32
Example 1: Bar Graph
5:33
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:07
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:08
Example 2: Create a Frequency Visualization for Gender
14:02
Example 3: Cases, Variables, and Frequency Visualization
16:34
Example 4: What Kind of Graphs are Shown Below?
19:29
Section 4: Summarizing Distributions
Central Tendency: Mean, Median, Mode

38m 50s

Intro
0:00
0:07
0:08
Central Tendency 1
0:56
Way to Summarize a Distribution of Scores
0:57
Mode
1:32
Median
2:02
Mean
2:36
Central Tendency 2
3:47
Mode
3:48
Median
4:20
Mean
5:25
Summation Symbol
6:11
Summation Symbol
6:12
Population vs. Sample
10:46
Population vs. Sample
10:47
Excel Examples
15:08
Finding Mode, Median, and Mean in Excel
15:09
Median vs. Mean
21:45
Effect of Outliers
21:46
Relationship Between Parameter and Statistic
22:44
Type of Measurements
24:00
Which Distributions to Use With
24:55
Example 1: Mean
25:30
Example 2: Using Summation Symbol
29:50
Example 3: Average Calorie Count
32:50
Example 4: Creating an Example Set
35:46
Variability

42m 40s

Intro
0:00
0:05
0:06
0:45
0:46
5:45
5:46
Range, Quartiles and Interquartile Range
6:37
Range
6:38
Interquartile Range
8:42
Interquartile Range Example
10:58
Interquartile Range Example
10:59
Variance and Standard Deviation
12:27
Deviations
12:28
Sum of Squares
14:35
Variance
16:55
Standard Deviation
17:44
Sum of Squares (SS)
18:34
Sum of Squares (SS)
18:35
Population vs. Sample SD
22:00
Population vs. Sample SD
22:01
Population vs. Sample
23:20
Mean
23:21
SD
23:51
Example 1: Find the Mean and Standard Deviation of the Variable Friends in the Excel File
27:21
Example 2: Find the Mean and Standard Deviation of the Tagged Photos in the Excel File
35:25
Example 3: Sum of Squares
38:58
Example 4: Standard Deviation
41:48
Five Number Summary & Boxplots

57m 15s

Intro
0:00
0:06
0:07
Summarizing Distributions
0:37
0:38
5 Number Summary
1:14
Boxplot: Visualizing 5 Number Summary
3:37
Boxplot: Visualizing 5 Number Summary
3:38
Boxplots on Excel
9:01
Using 'Stocks' and Using Stacked Columns
9:02
Boxplots on Excel Example
10:14
When are Boxplots Useful?
32:14
Pros
32:15
Cons
32:59
How to Determine Outlier Status
33:24
Rule of Thumb: Upper Limit
33:25
Rule of Thumb: Lower Limit
34:16
Signal Outliers in an Excel Data File Using Conditional Formatting
34:52
Modified Boxplot
48:38
Modified Boxplot
48:39
Example 1: Percentage Values & Lower and Upper Whisker
49:10
Example 2: Boxplot
50:10
Example 3: Estimating IQR From Boxplot
53:46
Example 4: Boxplot and Missing Whisker
54:35
Shape: Calculating Skewness & Kurtosis

41m 51s

Intro
0:00
0:16
0:17
Skewness Concept
1:09
Skewness Concept
1:10
Calculating Skewness
3:26
Calculating Skewness
3:27
Interpreting Skewness
7:36
Interpreting Skewness
7:37
Excel Example
8:49
Kurtosis Concept
20:29
Kurtosis Concept
20:30
Calculating Kurtosis
24:17
Calculating Kurtosis
24:18
Interpreting Kurtosis
29:01
Leptokurtic
29:35
Mesokurtic
30:10
Platykurtic
31:06
Excel Example
32:04
Example 1: Shape of Distribution
38:28
Example 2: Shape of Distribution
39:29
Example 3: Shape of Distribution
40:14
Example 4: Kurtosis
41:10
Normal Distribution

34m 33s

Intro
0:00
0:13
0:14
What is a Normal Distribution
0:44
The Normal Distribution As a Theoretical Model
0:45
Possible Range of Probabilities
3:05
Possible Range of Probabilities
3:06
What is a Normal Distribution
5:07
Can Be Described By
5:08
Properties
5:49
'Same' Shape: Illusion of Different Shape!
7:35
'Same' Shape: Illusion of Different Shape!
7:36
Types of Problems
13:45
Example: Distribution of SAT Scores
13:46
Shape Analogy
19:48
Shape Analogy
19:49
Example 1: The Standard Normal Distribution and Z-Scores
22:34
Example 2: The Standard Normal Distribution and Z-Scores
25:54
Example 3: Sketching and Normal Distribution
28:55
Example 4: Sketching and Normal Distribution
32:32
Standard Normal Distributions & Z-Scores

41m 44s

Intro
0:00
0:06
0:07
A Family of Distributions
0:28
Infinite Set of Distributions
0:29
Transforming Normal Distributions to 'Standard' Normal Distribution
1:04
Normal Distribution vs. Standard Normal Distribution
2:58
Normal Distribution vs. Standard Normal Distribution
2:59
Z-Score, Raw Score, Mean, & SD
4:08
Z-Score, Raw Score, Mean, & SD
4:09
Weird Z-Scores
9:40
Weird Z-Scores
9:41
Excel
16:45
For Normal Distributions
16:46
For Standard Normal Distributions
19:11
Excel Example
20:24
Types of Problems
25:18
Percentage Problem: P(x)
25:19
Raw Score and Z-Score Problems
26:28
Standard Deviation Problems
27:01
Shape Analogy
27:44
Shape Analogy
27:45
Example 1: Deaths Due to Heart Disease vs. Deaths Due to Cancer
28:24
Example 2: Heights of Male College Students
33:15
Example 3: Mean and Standard Deviation
37:14
Example 4: Finding Percentage of Values in a Standard Normal Distribution
37:49
Normal Distribution: PDF vs. CDF

55m 44s

Intro
0:00
0:15
0:16
Frequency vs. Cumulative Frequency
0:56
Frequency vs. Cumulative Frequency
0:57
Frequency vs. Cumulative Frequency
4:32
Frequency vs. Cumulative Frequency Cont.
4:33
Calculus in Brief
6:21
Derivative-Integral Continuum
6:22
PDF
10:08
PDF for Standard Normal Distribution
10:09
PDF for Normal Distribution
14:32
Integral of PDF = CDF
21:27
Integral of PDF = CDF
21:28
Example 1: Cumulative Frequency Graph
23:31
Example 2: Mean, Standard Deviation, and Probability
24:43
Example 3: Mean and Standard Deviation
35:50
Example 4: Age of Cars
49:32
Section 5: Linear Regression
Scatterplots

47m 19s

Intro
0:00
0:04
0:05
Previous Visualizations
0:30
Frequency Distributions
0:31
Compare & Contrast
2:26
Frequency Distributions Vs. Scatterplots
2:27
Summary Values
4:53
Shape
4:54
Center & Trend
6:41
8:22
Univariate & Bivariate
10:25
Example Scatterplot
10:48
Shape, Trend, and Strength
10:49
Positive and Negative Association
14:05
Positive and Negative Association
14:06
Linearity, Strength, and Consistency
18:30
Linearity
18:31
Strength
19:14
Consistency
20:40
Summarizing a Scatterplot
22:58
Summarizing a Scatterplot
22:59
Example 1: Gapminder.org, Income x Life Expectancy
26:32
Example 2: Gapminder.org, Income x Infant Mortality
36:12
Example 3: Trend and Strength of Variables
40:14
Example 4: Trend, Strength and Shape for Scatterplots
43:27
Regression

32m 2s

Intro
0:00
0:05
0:06
Linear Equations
0:34
Linear Equations: y = mx + b
0:35
Rough Line
5:16
Rough Line
5:17
Regression - A 'Center' Line
7:41
Reasons for Summarizing with a Regression Line
7:42
Predictor and Response Variable
10:04
Goal of Regression
12:29
Goal of Regression
12:30
Prediction
14:50
Example: Servings of Milk Per Year Shown By Age
14:51
Interpolation
17:06
Extrapolation
17:58
Error in Prediction
20:34
Prediction Error
20:35
Residual
21:40
Example 1: Residual
23:34
Example 2: Large and Negative Residual
26:30
Example 3: Positive Residual
28:13
Example 4: Interpret Regression Line & Extrapolate
29:40
Least Squares Regression

56m 36s

Intro
0:00
0:13
0:14
Best Fit
0:47
Best Fit
0:48
Sum of Squared Errors (SSE)
1:50
Sum of Squared Errors (SSE)
1:51
Why Squared?
3:38
Why Squared?
3:39
Quantitative Properties of Regression Line
4:51
Quantitative Properties of Regression Line
4:52
So How do we Find Such a Line?
6:49
SSEs of Different Line Equations & Lowest SSE
6:50
Carl Gauss' Method
8:01
How Do We Find Slope (b1)
11:00
How Do We Find Slope (b1)
11:01
How Do We Find Intercept
15:11
How Do We Find Intercept
15:12
Example 1: Which of These Equations Fit the Above Data Best?
17:18
Example 2: Find the Regression Line for These Data Points and Interpret It
26:31
Example 3: Summarize the Scatterplot and Find the Regression Line.
34:31
Example 4: Examine the Mean of Residuals
43:52
Correlation

43m 58s

Intro
0:00
0:05
0:06
Summarizing a Scatterplot Quantitatively
0:47
Shape
0:48
Trend
1:11
Strength: Correlation (r)
1:45
Correlation Coefficient ( r )
2:30
Correlation Coefficient ( r )
2:31
Trees vs. Forest
11:59
Trees vs. Forest
12:00
Calculating r
15:07
Average Product of z-scores for x and y
15:08
Relationship between Correlation and Slope
21:10
Relationship between Correlation and Slope
21:11
Example 1: Find the Correlation between Grams of Fat and Cost
24:11
Example 2: Relationship between r and b1
30:24
Example 3: Find the Regression Line
33:35
Example 4: Find the Correlation Coefficient for this Set of Data
37:37
Correlation: r vs. r-squared

52m 52s

Intro
0:00
0:07
0:08
R-squared
0:44
What is the Meaning of It? Why Squared?
0:45
Parsing Sum of Squared (Parsing Variability)
2:25
SST = SSR + SSE
2:26
What is SST and SSE?
7:46
What is SST and SSE?
7:47
r-squared
18:33
Coefficient of Determination
18:34
If the Correlation is Strong…
20:25
If the Correlation is Strong…
20:26
If the Correlation is Weak…
22:36
If the Correlation is Weak…
22:37
Example 1: Find r-squared for this Set of Data
23:56
Example 2: What Does it Mean that the Simple Linear Regression is a 'Model' of Variance?
33:54
Example 3: Why Does r-squared Only Range from 0 to 1
37:29
Example 4: Find the r-squared for This Set of Data
39:55
Transformations of Data

27m 8s

Intro
0:00
0:05
0:06
Why Transform?
0:26
Why Transform?
0:27
Shape-preserving vs. Shape-changing Transformations
5:14
Shape-preserving = Linear Transformations
5:15
Shape-changing Transformations = Non-linear Transformations
6:20
Common Shape-Preserving Transformations
7:08
Common Shape-Preserving Transformations
7:09
Common Shape-Changing Transformations
8:59
Powers
9:00
Logarithms
9:39
Change Just One Variable? Both?
10:38
Log-log Transformations
10:39
Log Transformations
14:38
Example 1: Create, Graph, and Transform the Data Set
15:19
Example 2: Create, Graph, and Transform the Data Set
20:08
Example 3: What Kind of Model would You Choose for this Data?
22:44
Example 4: Transformation of Data
25:46
Section 6: Collecting Data in an Experiment
Sampling & Bias

54m 44s

Intro
0:00
0:05
0:06
Descriptive vs. Inferential Statistics
1:04
Descriptive Statistics: Data Exploration
1:05
Example
2:03
To Tackle Generalization…
4:31
Generalization
4:32
Sampling
6:06
'Good' Sample
6:40
Defining Samples and Populations
8:55
Population
8:56
Sample
11:16
Why Use Sampling?
13:09
Why Use Sampling?
13:10
Goal of Sampling: Avoiding Bias
15:04
What is Bias?
15:05
Where does Bias Come from: Sampling Bias
17:53
Where does Bias Come from: Response Bias
18:27
Sampling Bias: Bias from 'Bad' Sampling Methods
19:34
Size Bias
19:35
Voluntary Response Bias
21:13
Convenience Sample
22:22
Judgment Sample
23:58
25:40
Response Bias: Bias from 'Bad' Data Collection Methods
28:00
Nonresponse Bias
29:31
Questionnaire Bias
31:10
Incorrect Response or Measurement Bias
37:32
Example 1: What Kind of Biases?
40:29
Example 2: What Biases Might Arise?
44:46
Example 3: What Kind of Biases?
48:34
Example 4: What Kind of Biases?
51:43
Sampling Methods

14m 25s

Intro
0:00
0:05
0:06
Biased vs. Unbiased Sampling Methods
0:32
Biased Sampling
0:33
Unbiased Sampling
1:13
Probability Sampling Methods
2:31
Simple Random
2:54
Stratified Random Sampling
4:06
Cluster Sampling
5:24
Two-stage Sampling
6:22
Systematic Sampling
7:25
8:33
Example 2: Describe How to Take a Two-Stage Sample from this Book
10:16
Example 3: Sampling Methods
11:58
Example 4: Cluster Sample Plan
12:48
Research Design

53m 54s

Intro
0:00
0:06
0:07
Descriptive vs. Inferential Statistics
0:51
Descriptive Statistics: Data Exploration
0:52
Inferential Statistics
1:02
Variables and Relationships
1:44
Variables
1:45
Relationships
2:49
Not Every Type of Study is an Experiment…
4:16
Category I - Descriptive Study
4:54
Category II - Correlational Study
5:50
Category III - Experimental, Quasi-experimental, Non-experimental
6:33
Category III
7:42
Experimental, Quasi-experimental, and Non-experimental
7:43
Why CAN'T the Other Strategies Determine Causation?
10:18
Third-variable Problem
10:19
Directionality Problem
15:49
What Makes Experiments Special?
17:54
Manipulation
17:55
Control (and Comparison)
21:58
Methods of Control
26:38
Holding Constant
26:39
Matching
29:11
Random Assignment
31:48
Experiment Terminology
34:09
'True' Experiment vs. Study
34:10
Independent Variable (IV)
35:16
Dependent Variable (DV)
35:45
Factors
36:07
Treatment Conditions
36:23
Levels
37:43
Confounds or Extraneous Variables
38:04
Blind
38:38
Blind Experiments
38:39
Double-blind Experiments
39:29
How Categories Relate to Statistics
41:35
Category I - Descriptive Study
41:36
Category II - Correlational Study
42:05
Category III - Experimental, Quasi-experimental, Non-experimental
42:43
Example 1: Research Design
43:50
Example 2: Research Design
47:37
Example 3: Research Design
50:12
Example 4: Research Design
52:00
Between and Within Treatment Variability

41m 31s

Intro
0:00
0:06
0:07
Experimental Designs
0:51
Experimental Designs: Manipulation & Control
0:52
Two Types of Variability
2:09
Between Treatment Variability
2:10
Within Treatment Variability
3:31
Updated Goal of Experimental Design
5:47
Updated Goal of Experimental Design
5:48
Example: Drugs and Driving
6:56
Example: Drugs and Driving
6:57
Different Types of Random Assignment
11:27
All Experiments
11:28
Completely Random Design
12:02
Randomized Block Design
13:19
Randomized Block Design
15:48
Matched Pairs Design
15:49
Repeated Measures Design
19:47
Between-subject Variable vs. Within-subject Variable
22:43
Completely Randomized Design
22:44
Repeated Measures Design
25:03
Example 1: Design a Completely Random, Matched Pair, and Repeated Measures Experiment
26:16
Example 2: Block Design
31:41
Example 3: Completely Randomized Designs
35:11
Example 4: Completely Random, Matched Pairs, or Repeated Measures Experiments?
39:01
Section 7: Review of Probability Axioms
Sample Spaces

37m 52s

Intro
0:00
0:07
0:08
Why is Probability Involved in Statistics
0:48
Probability
0:49
Can People Tell the Difference between Cheap and Gourmet Coffee?
2:08
Taste Test with Coffee Drinkers
3:37
If No One can Actually Taste the Difference
3:38
If Everyone can Actually Taste the Difference
5:36
Creating a Probability Model
7:09
Creating a Probability Model
7:10
D'Alembert vs. Necker
9:41
D'Alembert vs. Necker
9:42
Problem with D'Alembert's Model
13:29
Problem with D'Alembert's Model
13:30
Covering Entire Sample Space
15:08
Fundamental Principle of Counting
15:09
Where Do Probabilities Come From?
22:54
Observed Data, Symmetry, and Subjective Estimates
22:55
Checking whether Model Matches Real World
24:27
Law of Large Numbers
24:28
Example 1: Law of Large Numbers
27:46
Example 2: Possible Outcomes
30:43
Example 3: Brands of Coffee and Taste
33:25
Example 4: How Many Different Treatments are there?
35:33

20m 29s

Intro
0:00
0:08
0:09
Disjoint Events
0:41
Disjoint Events
0:42
Meaning of 'or'
2:39
In Regular Life
2:40
In Math/Statistics/Computer Science
3:10
3:55
If A and B are Disjoint: P (A and B)
3:56
If A and B are Disjoint: P (A or B)
5:15
5:41
5:42
8:31
If A and B are not Disjoint: P (A or B)
8:32
Example 1: Which of These are Mutually Exclusive?
10:50
Example 2: What is the Probability that You will Have a Combination of One Heads and Two Tails?
12:57
Example 3: Engagement Party
15:17
Example 4: Home Owner's Insurance
18:30
Conditional Probability

57m 19s

Intro
0:00
0:05
0:06
'or' vs. 'and' vs. Conditional Probability
1:07
'or' vs. 'and' vs. Conditional Probability
1:08
'and' vs. Conditional Probability
5:57
P (M or L)
5:58
P (M and L)
8:41
P (M|L)
11:04
P (L|M)
12:24
Tree Diagram
15:02
Tree Diagram
15:03
Defining Conditional Probability
22:42
Defining Conditional Probability
22:43
Common Contexts for Conditional Probability
30:56
Medical Testing: Positive Predictive Value
30:57
Medical Testing: Sensitivity
33:03
Statistical Tests
34:27
Example 1: Drug and Disease
36:41
Example 2: Marbles and Conditional Probability
40:04
Example 3: Cards and Conditional Probability
45:59
Example 4: Votes and Conditional Probability
50:21
Independent Events

24m 27s

Intro
0:00
0:05
0:06
Independent Events & Conditional Probability
0:26
Non-independent Events
0:27
Independent Events
2:00
Non-independent and Independent Events
3:08
Non-independent and Independent Events
3:09
Defining Independent Events
5:52
Defining Independent Events
5:53
Multiplication Rule
7:29
Previously…
7:30
But with Independent Events
8:53
Example 1: Which of These Pairs of Events are Independent?
11:12
Example 2: Health Insurance and Probability
15:12
Example 3: Independent Events
17:42
Example 4: Independent Events
20:03
Section 8: Probability Distributions
Introduction to Probability Distributions

56m 45s

Intro
0:00
0:08
0:09
Sampling vs. Probability
0:57
Sampling
0:58
Missing
1:30
What is Missing?
3:06
Insight: Probability Distributions
5:26
Insight: Probability Distributions
5:27
What is a Probability Distribution?
7:29
From Sample Spaces to Probability Distributions
8:44
Sample Space
8:45
Probability Distribution of the Sum of Two Dice
11:16
The Random Variable
17:43
The Random Variable
17:44
Expected Value
21:52
Expected Value
21:53
Example 1: Probability Distributions
28:45
Example 2: Probability Distributions
35:30
Example 3: Probability Distributions
43:37
Example 4: Probability Distributions
47:20
Expected Value & Variance of Probability Distributions

53m 41s

Intro
0:00
0:06
0:07
Discrete vs. Continuous Random Variables
1:04
Discrete vs. Continuous Random Variables
1:05
Mean and Variance Review
4:44
Mean: Sample, Population, and Probability Distribution
4:45
Variance: Sample, Population, and Probability Distribution
9:12
Example Situation
14:10
Example Situation
14:11
Some Special Cases…
16:13
Some Special Cases…
16:14
Linear Transformations
19:22
Linear Transformations
19:23
What Happens to Mean and Variance of the Probability Distribution?
20:12
n Independent Values of X
25:38
n Independent Values of X
25:39
Compare These Two Situations
30:56
Compare These Two Situations
30:57
Two Random Variables, X and Y
32:02
Two Random Variables, X and Y
32:03
Example 1: Expected Value & Variance of Probability Distributions
35:35
Example 2: Expected Values & Standard Deviation
44:17
Example 3: Expected Winnings and Standard Deviation
48:18
Binomial Distribution

55m 15s

Intro
0:00
0:05
0:06
Discrete Probability Distributions
1:42
Discrete Probability Distributions
1:43
Binomial Distribution
2:36
Binomial Distribution
2:37
Multiplicative Rule Review
6:54
Multiplicative Rule Review
6:55
How Many Outcomes with k 'Successes'
10:23
Adults and Bachelor's Degree: Manual List of Outcomes
10:24
P (X=k)
19:37
Putting Together # of Outcomes with the Multiplicative Rule
19:38
Expected Value and Standard Deviation in a Binomial Distribution
25:22
Expected Value and Standard Deviation in a Binomial Distribution
25:23
Example 1: Coin Toss
33:42
38:03
Example 3: Types of Blood and Probability
45:39
Example 4: Expected Number and Standard Deviation
51:11
Section 9: Sampling Distributions of Statistics
Introduction to Sampling Distributions

48m 17s

Intro
0:00
0:08
0:09
Probability Distributions vs. Sampling Distributions
0:55
Probability Distributions vs. Sampling Distributions
0:56
Same Logic
3:55
Logic of Probability Distribution
3:56
Example: Rolling Two Dice
6:56
Simulating Samples
9:53
To Come Up with Probability Distributions
9:54
In Sampling Distributions
11:12
Connecting Sampling and Research Methods with Sampling Distributions
12:11
Connecting Sampling and Research Methods with Sampling Distributions
12:12
Simulating a Sampling Distribution
14:14
Experimental Design: Regular Sleep vs. Less Sleep
14:15
Logic of Sampling Distributions
23:08
Logic of Sampling Distributions
23:09
General Method of Simulating Sampling Distributions
25:38
General Method of Simulating Sampling Distributions
25:39
Questions that Remain
28:45
Questions that Remain
28:46
Example 1: Mean and Standard Error of Sampling Distribution
30:57
Example 2: What is the Best Way to Describe Sampling Distributions?
37:12
Example 3: Matching Sampling Distributions
38:21
Example 4: Mean and Standard Error of Sampling Distribution
41:51
Sampling Distribution of the Mean

1h 8m 48s

Intro
0:00
0:05
0:06
Special Case of General Method for Simulating a Sampling Distribution
1:53
Special Case of General Method for Simulating a Sampling Distribution
1:54
Computer Simulation
3:43
Using Simulations to See Principles behind Shape of SDoM
15:50
Using Simulations to See Principles behind Shape of SDoM
15:51
Conditions
17:38
Using Simulations to See Principles behind Center (Mean) of SDoM
20:15
Using Simulations to See Principles behind Center (Mean) of SDoM
20:16
Conditions: Does n Matter?
21:31
Conditions: Does Number of Simulation Matter?
24:37
Using Simulations to See Principles behind Standard Deviation of SDoM
27:13
Using Simulations to See Principles behind Standard Deviation of SDoM
27:14
Conditions: Does n Matter?
34:45
Conditions: Does Number of Simulation Matter?
36:24
Central Limit Theorem
37:13
SHAPE
38:08
CENTER
39:34
39:52
Comparing Population, Sample, and SDoM
43:10
Comparing Population, Sample, and SDoM
43:11
48:24
What Happens When We Don't Know What the Population Looks Like?
48:25
Can We Have Sampling Distributions for Summary Statistics Other than the Mean?
49:42
How Do We Know whether a Sample is Sufficiently Unlikely?
53:36
Do We Always Have to Simulate a Large Number of Samples in Order to get a Sampling Distribution?
54:40
Example 1: Mean Batting Average
55:25
Example 2: Mean Sampling Distribution and Standard Error
59:07
Example 3: Sampling Distribution of the Mean
1:01:04
Sampling Distribution of Sample Proportions

54m 37s

Intro
0:00
0:06
0:07
Intro to Sampling Distribution of Sample Proportions (SDoSP)
0:51
Categorical Data (Examples)
0:52
Wish to Estimate Proportion of Population from Sample…
2:00
Notation
3:34
Population Proportion and Sample Proportion Notations
3:35
What's the Difference?
9:19
SDoM vs. SDoSP: Type of Data
9:20
SDoM vs. SDoSP: Shape
11:24
SDoM vs. SDoSP: Center
12:30
15:34
Binomial Distribution vs. Sampling Distribution of Sample Proportions
19:14
Binomial Distribution vs. SDoSP: Type of Data
19:17
Binomial Distribution vs. SDoSP: Shape
21:07
Binomial Distribution vs. SDoSP: Center
21:43
24:08
Example 1: Sampling Distribution of Sample Proportions
26:07
Example 2: Sampling Distribution of Sample Proportions
37:58
Example 3: Sampling Distribution of Sample Proportions
44:42
Example 4: Sampling Distribution of Sample Proportions
45:57
Section 10: Inferential Statistics
Introduction to Confidence Intervals

42m 53s

Intro
0:00
0:06
0:07
Inferential Statistics
0:50
Inferential Statistics
0:51
Two Problems with This Picture…
3:20
Two Problems with This Picture…
3:21
Solution: Confidence Intervals (CI)
4:59
Solution: Hypothesis Testing (HT)
5:49
Which Parameters are Known?
6:45
Which Parameters are Known?
6:46
Confidence Interval - Goal
7:56
When We Don't Know μ but Know σ
7:57
When We Don't Know
18:27
When We Don't Know μ nor σ
18:28
Example 1: Confidence Intervals
26:18
Example 2: Confidence Intervals
29:46
Example 3: Confidence Intervals
32:18
Example 4: Confidence Intervals
38:31
t Distributions

1h 2m 6s

Intro
0:00
0:04
0:05
When to Use z vs. t?
1:07
When to Use z vs. t?
1:08
What is z and t?
3:02
z-score and t-score: Commonality
3:03
z-score and t-score: Formulas
3:34
z-score and t-score: Difference
5:22
Why not z? (Why t?)
7:24
Why not z? (Why t?)
7:25
But Don't Worry!
15:13
Gosset and t-distributions
15:14
Rules of t Distributions
17:05
t-distributions are More Normal as n Gets Bigger
17:06
t-distributions are a Family of Distributions
18:55
Degrees of Freedom (df)
20:02
Degrees of Freedom (df)
20:03
t Family of Distributions
24:07
t Family of Distributions: df = 2, 4, and 60
24:08
df = 60
29:16
df = 2
29:59
How to Find It?
31:01
'Student's t-distribution' or 't-distribution'
31:02
Excel Example
33:06
Example 1: Which Distribution Do You Use? Z or t?
45:26
47:41
Example 3: t Distributions
52:15
Example 4: t Distributions, Confidence Interval, and Mean
55:59
Introduction to Hypothesis Testing

1h 6m 33s

Intro
0:00
0:06
0:07
Issues to Overcome in Inferential Statistics
1:35
Issues to Overcome in Inferential Statistics
1:36
What Happens When We Don't Know What the Population Looks Like?
2:57
How Do We Know whether a Sample is Sufficiently Unlikely?
3:43
Hypothesizing a Population
6:44
Hypothesizing a Population
6:45
Null Hypothesis
8:07
Alternative Hypothesis
8:56
Hypotheses
11:58
Hypotheses
11:59
Errors in Hypothesis Testing
14:22
Errors in Hypothesis Testing
14:23
Steps of Hypothesis Testing
21:15
Steps of Hypothesis Testing
21:16
Single Sample HT (When Sigma Available)
26:08
26:09
Step 1
27:08
Step 2
27:58
Step 3
28:17
Step 4
32:18
Single Sample HT (When Sigma Not Available)
36:33
36:34
Step 1: Hypothesis Testing
36:58
Step 2: Significance Level
37:25
Step 3: Decision Stage
37:40
Step 4: Sample
41:36
Sigma and p-value
45:04
Sigma and p-value
45:05
One Tailed vs. Two Tailed Hypotheses
45:51
Example 1: Hypothesis Testing
48:37
Example 2: Heights of Women in the US
57:43
Example 3: Select the Best Way to Complete This Sentence
1:03:23
Confidence Intervals for the Difference of Two Independent Means

55m 14s

Intro
0:00
0:14
0:15
One Mean vs. Two Means
1:17
One Mean vs. Two Means
1:18
Notation
2:41
A Sample! A Set!
2:42
Mean of X, Mean of Y, and Difference of Two Means
3:56
SE of X
4:34
SE of Y
6:28
Sampling Distribution of the Difference between Two Means (SDoD)
7:48
Sampling Distribution of the Difference between Two Means (SDoD)
7:49
Rules of the SDoD (similar to CLT!)
15:00
Mean for the SDoD Null Hypothesis
15:01
Standard Error
17:39
When can We Construct a CI for the Difference between Two Means?
21:28
Three Conditions
21:29
Finding CI
23:56
One Mean CI
23:57
Two Means CI
25:45
Finding t
29:16
Finding t
29:17
Interpreting CI
30:25
Interpreting CI
30:26
Better Estimate of s (s pool)
34:15
Better Estimate of s (s pool)
34:16
Example 1: Confidence Intervals
42:32
Example 2: SE of the Difference
52:36
Hypothesis Testing for the Difference of Two Independent Means

50m

Intro
0:00
0:06
0:07
The Goal of Hypothesis Testing
0:56
One Sample and Two Samples
0:57
Sampling Distribution of the Difference between Two Means (SDoD)
3:42
Sampling Distribution of the Difference between Two Means (SDoD)
3:43
Rules of the SDoD (Similar to CLT!)
6:46
Shape
6:47
Mean for the Null Hypothesis
7:26
Standard Error for Independent Samples (When Variance is Homogeneous)
8:18
Standard Error for Independent Samples (When Variance is not Homogeneous)
9:25
Same Conditions for HT as for CI
10:08
Three Conditions
10:09
Steps of Hypothesis Testing
11:04
Steps of Hypothesis Testing
11:05
Formulas that Go with Steps of Hypothesis Testing
13:21
Step 1
13:25
Step 2
14:18
Step 3
15:00
Step 4
16:57
Example 1: Hypothesis Testing for the Difference of Two Independent Means
18:47
Example 2: Hypothesis Testing for the Difference of Two Independent Means
33:55
Example 3: Hypothesis Testing for the Difference of Two Independent Means
44:22
Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

1h 14m 11s

Intro
0:00
0:09
0:10
The Goal of Hypothesis Testing
1:27
One Sample and Two Samples
1:28
Independent Samples vs. Paired Samples
3:16
Independent Samples vs. Paired Samples
3:17
Which is Which?
5:20
Independent SAMPLES vs. Independent VARIABLES
7:43
Independent SAMPLES vs. Independent VARIABLES
7:44
T-tests Always…
10:48
T-tests Always…
10:49
Notation for Paired Samples
12:59
Notation for Paired Samples
13:00
Steps of Hypothesis Testing for Paired Samples
16:13
Steps of Hypothesis Testing for Paired Samples
16:14
Rules of the SDoD (Adding on Paired Samples)
18:03
Shape
18:04
Mean for the Null Hypothesis
18:31
Standard Error for Independent Samples (When Variance is Homogenous)
19:25
Standard Error for Paired Samples
20:39
Formulas that go with Steps of Hypothesis Testing
22:59
Formulas that go with Steps of Hypothesis Testing
23:00
Confidence Intervals for Paired Samples
30:32
Confidence Intervals for Paired Samples
30:33
Example 1: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
32:28
Example 2: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
44:02
Example 3: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
52:23
Type I and Type II Errors

31m 27s

Intro
0:00
0:18
0:19
Errors and Relationship to HT and the Sample Statistic?
1:11
Errors and Relationship to HT and the Sample Statistic?
1:12
7:00
One Sample t-test: Friends on Facebook
7:01
Two Sample t-test: Friends on Facebook
13:46
Usually, Lots of Overlap between Null and Alternative Distributions
16:59
Overlap between Null and Alternative Distributions
17:00
How Distributions and 'Box' Fit Together
22:45
How Distributions and 'Box' Fit Together
22:46
Example 1: Types of Errors
25:54
Example 2: Types of Errors
27:30
Example 3: What is the Danger of the Type I Error?
29:38
Effect Size & Power

44m 41s

Intro
0:00
0:05
0:06
Distance between Distributions: Sample t
0:49
Distance between Distributions: Sample t
0:50
Problem with Distance in Terms of Standard Error
2:56
Problem with Distance in Terms of Standard Error
2:57
Test Statistic (t) vs. Effect Size (d or g)
4:38
Test Statistic (t) vs. Effect Size (d or g)
4:39
Rules of Effect Size
6:09
Rules of Effect Size
6:10
Why Do We Need Effect Size?
8:21
Tells You the Practical Significance
8:22
HT can be Deceiving…
10:25
Important Note
10:42
What is Power?
11:20
What is Power?
11:21
Why Do We Need Power?
14:19
Conditional Probability and Power
14:20
Power is:
16:27
Can We Calculate Power?
19:00
Can We Calculate Power?
19:01
How Does Alpha Affect Power?
20:36
How Does Alpha Affect Power?
20:37
How Does Effect Size Affect Power?
25:38
How Does Effect Size Affect Power?
25:39
How Does Variability and Sample Size Affect Power?
27:56
How Does Variability and Sample Size Affect Power?
27:57
How Do We Increase Power?
32:47
Increasing Power
32:48
Example 1: Effect Size & Power
35:40
Example 2: Effect Size & Power
37:38
Example 3: Effect Size & Power
40:55
Section 11: Analysis of Variance
F-distributions

24m 46s

Intro
0:00
0:04
0:05
Z- & T-statistic and Their Distribution
0:34
Z- & T-statistic and Their Distribution
0:35
F-statistic
4:55
The F Ratio (the Variance Ratio)
4:56
F-distribution
12:29
F-distribution
12:30
s and p-value
15:00
s and p-value
15:01
Example 1: Why Does F-distribution Stop At 0 But Go On Until Infinity?
18:33
Example 2: F-distributions
19:29
Example 3: F-distributions and Heights
21:29
ANOVA with Independent Samples

1h 9m 25s

Intro
0:00
0:05
0:06
The Limitations of t-tests
1:12
The Limitations of t-tests
1:13
Two Major Limitations of Many t-tests
3:26
Two Major Limitations of Many t-tests
3:27
Ronald Fisher's Solution… F-test! New Null Hypothesis
4:43
Ronald Fisher's Solution… F-test! New Null Hypothesis (Omnibus Test - One Test to Rule Them All!)
4:44
Analysis of Variance (ANoVA) Notation
7:47
Analysis of Variance (ANoVA) Notation
7:48
Partitioning (Analyzing) Variance
9:58
Total Variance
9:59
Within-group Variation
14:00
Between-group Variation
16:22
Time out: Review Variance & SS
17:05
Time out: Review Variance & SS
17:06
F-statistic
19:22
The F Ratio (the Variance Ratio)
19:23
S²bet = SSbet / dfbet
22:13
What is This?
22:14
How Many Means?
23:20
So What is the dfbet?
23:38
So What is SSbet?
24:15
S²w = SSw / dfw
26:05
What is This?
26:06
How Many Means?
27:20
So What is the dfw?
27:36
So What is SSw?
28:18
Chart of Independent Samples ANOVA
29:25
Chart of Independent Samples ANOVA
29:26
Example 1: Who Uploads More Photos: Unknown Ethnicity, Latino, Asian, Black, or White Facebook Users?
35:52
Hypotheses
35:53
Significance Level
39:40
Decision Stage
40:05
Calculate Samples' Statistic and p-Value
44:10
Reject or Fail to Reject H0
55:54
Example 2: ANOVA with Independent Samples
58:21
Repeated Measures ANOVA

1h 15m 13s

Intro
0:00
0:05
0:06
The Limitations of t-tests
0:36
Who Uploads more Pictures and Which Photo-Type is Most Frequently Used on Facebook?
0:37
ANOVA (F-test) to the Rescue!
5:49
Omnibus Hypothesis
5:50
Analyze Variance
7:27
Independent Samples vs. Repeated Measures
9:12
Same Start
9:13
Independent Samples ANOVA
10:43
Repeated Measures ANOVA
12:00
Independent Samples ANOVA
16:00
Same Start: All the Variance Around Grand Mean
16:01
Independent Samples
16:23
Repeated Measures ANOVA
18:18
Same Start: All the Variance Around Grand Mean
18:19
Repeated Measures
18:33
Repeated Measures F-statistic
21:22
The F Ratio (The Variance Ratio)
21:23
S²bet = SSbet / dfbet
23:07
What is This?
23:08
How Many Means?
23:39
So What is the dfbet?
23:54
So What is SSbet?
24:32
S² resid = SS resid / df resid
25:46
What is This?
25:47
So What is SS resid?
26:44
So What is the df resid?
27:36
SS subj and df subj
28:11
What is This?
28:12
How Many Subject Means?
29:43
So What is df subj?
30:01
So What is SS subj?
30:09
SS total and df total
31:42
What is This?
31:43
What is the Total Number of Data Points?
32:02
So What is df total?
32:34
So What is SS total?
32:47
Chart of Repeated Measures ANOVA
33:19
Chart of Repeated Measures ANOVA: F and Between-samples Variability
33:20
Chart of Repeated Measures ANOVA: Total Variability, Within-subject (case) Variability, Residual Variability
35:50
Example 1: Which is More Prevalent on Facebook: Tagged, Uploaded, Mobile, or Profile Photos?
40:25
Hypotheses
40:26
Significance Level
41:46
Decision Stage
42:09
Calculate Samples' Statistic and p-Value
46:18
Reject or Fail to Reject H0
57:55
Example 2: Repeated Measures ANOVA
58:57
Example 3: What's the Problem with a Bunch of Tiny t-tests?
1:13:59
Section 12: Chi-square Test
Chi-Square Goodness-of-Fit Test

58m 23s

Intro
0:00
0:05
0:06
Where Does the Chi-Square Test Belong?
0:50
Where Does the Chi-Square Test Belong?
0:51
A New Twist on HT: Goodness-of-Fit
7:23
HT in General
7:24
Goodness-of-Fit HT
8:26
12:17
Null Hypothesis
12:18
Alternative Hypothesis
13:23
Example
14:38
Chi-Square Statistic
17:52
Chi-Square Statistic
17:53
Chi-Square Distributions
24:31
Chi-Square Distributions
24:32
Conditions for Chi-Square
28:58
Condition 1
28:59
Condition 2
30:20
Condition 3
30:32
Condition 4
31:47
Example 1: Chi-Square Goodness-of-Fit Test
32:23
Example 2: Chi-Square Goodness-of-Fit Test
44:34
Example 3: Which of These Statements Describe Properties of the Chi-Square Goodness-of-Fit Test?
56:06
Chi-Square Test of Homogeneity

51m 36s

Intro
0:00
0:09
0:10
Goodness-of-Fit vs. Homogeneity
1:13
Goodness-of-Fit HT
1:14
Homogeneity
2:00
Analogy
2:38
5:00
Null Hypothesis
5:01
Alternative Hypothesis
6:11
Example
6:33
Chi-Square Statistic
10:12
Same as Goodness-of-Fit Test
10:13
Set Up Data
12:28
Setting Up Data Example
12:29
Expected Frequency
16:53
Expected Frequency
16:54
Chi-Square Distributions & df
19:26
Chi-Square Distributions & df
19:27
Conditions for Test of Homogeneity
20:54
Condition 1
20:55
Condition 2
21:39
Condition 3
22:05
Condition 4
22:23
Example 1: Chi-Square Test of Homogeneity
22:52
Example 2: Chi-Square Test of Homogeneity
32:10
Section 13: Overview of Statistics
Overview of Statistics

18m 11s

Intro
0:00
0:07
0:08
The Statistical Tests (HT) We've Covered
0:28
The Statistical Tests (HT) We've Covered
0:29
Organizing the Tests We've Covered…
1:08
One Sample: Continuous DV and Categorical DV
1:09
Two Samples: Continuous DV and Categorical DV
5:41
More Than Two Samples: Continuous DV and Categorical DV
8:21
The Following Data: OK Cupid
10:10
The Following Data: OK Cupid
10:11
Example 1: Weird-MySpace-Angle Profile Photo
10:38
Example 2: Geniuses
12:30
Example 3: Promiscuous iPhone Users
13:37
Example 4: Women, Aging, and Messaging
16:07

### Dotplots and Histograms in Excel

Lecture Slides are screen-captured images of important points in the lecture. Students can download and print out these lecture slide images to do practice problems as well as take notes while watching the lecture.

• Intro 0:00
• Previously 1:02
• Data, Frequency Table, and visualization
• Dotplots 1:22
• Dotplots Excel Example
• Dotplots: Pros and Cons 7:22
• Pros and Cons of Dotplots
• Dotplots Excel Example Cont.
• Histograms 12:47
• Histograms Overview
• Example of Histograms
• Histograms: Pros and Cons 31:39
• Pros
• Cons
• Frequency vs. Relative Frequency 32:53
• Frequency
• Relative Frequency
• Example 1: Dotplots vs. Histograms 34:36
• Example 2: Age of Pennies Dotplot 36:21
• Example 3: Histogram of Mammal Speeds 38:27
• Example 4: Histogram of Life Expectancy 40:30

### Transcription: Dotplots and Histograms in Excel

Hi and welcome back to www.educator.com.0000

We are going to be talking about dot plots and histograms in Excel today.0003

First we are going to talk about going from data to dot plots.0009

Remember, before we always had to go from data to frequency tables and then to some visualization.0013

Dot plots are nice because they can let you go straight from data directly to the visualization.0019

We are going to talk about going from data to histograms.0024

Histograms are going to be really helpful to us especially because a lot of times we are going to have variables that have many values.0028

We are going to talk about ungrouped values, which we have looked at before, and grouped values.0039

Finally we are just going to talk a little bit about plotting frequencies versus relative frequency.0046

Relative frequency is just a fancy way of saying it is frequency but divided by how many cases you have.0052

It is really like a percentage.0060
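As a rough illustration of that definition, here is a minimal Python sketch of turning frequencies into relative frequencies. It is not part of the lecture's Excel file, and the month counts in it are made up purely for the example.

```python
# Illustrative only: these counts are made up, not the lecture's data.
frequencies = {"Jan": 7, "Feb": 9, "Mar": 4}


def relative_frequencies(freqs):
    """Divide each frequency by the total number of cases."""
    total = sum(freqs.values())
    return {value: count / total for value, count in freqs.items()}


rel = relative_frequencies(frequencies)
for value, share in rel.items():
    # Format each share as a percentage, since that is how it is usually read.
    print(f"{value}: {share:.0%}")
```

The relative frequencies always add up to 1 (that is, 100%), which is a quick way to check the arithmetic.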

Previously we always had to go from data, stop over at making a frequency table, and then go to the visualization.0065

But now with dot plots we cut out the middleman and we can go directly from data to the visualization.0073

That is a really handy thing there.0079

If you look in your Excel file, the data is going to be the same as the data we have been working with, the 100 Facebook friends.0085

Here is what we are going to do, before we had created a nice little graph0094

using the Excel tools to create visualization but now we are going to use dot plots.0100

Excel will not create dot plots for you directly.0108

Instead we have to sort of fudge it, and the fudge actually comes in handy sometimes.0111

Let us go to birth month.0119

We already know that birth month should have a uniform distribution.0121

We already looked at this data before.0125

What we are going to do is just look at how to transform it directly0127

from data into a visualization without having to use the Excel graphs or chart.0131

If you go to your birth month sheet here I have just put up the months, 1 through 12.0140

It just looks sort of like a frequency table but if you watch carefully we are going to transform it.0148

Let us go ahead and put in our regular formula for how to find frequency.0156

That is equal sign (=) because we are starting off with a function.0163

It is COUNTIF, because we want to count a person if they were born in the month of January.0167

Let us put in our data.0177

If we click on data and we scroll down to months, here is birth month.0180

I'm going to select all of these rows.0188

So far it seems like we are just making a frequency table.0192

I’m going to put in a comma because I know I’m going to need that.0197

Let us go back to birth month.0201

I want you to count it if the birth month is January.0203

If we just hit enter here, that would mean we are just counting how many people are born in January.0210

We are going to do something a little bit different.0217

I want Excel to visualize for me how many people there are.0219

Not give me a number but actually show me a picture.0224

Here is what we are going to use.0229

We are going to use the repetition function, and that is REPT; you do not have to put it in capitals.0230

I just wanted to distinguish it from the COUNTIF. Then put in an open parenthesis, because that is how you are going to put in the inputs.0238

Here Excel reminds us that we need text, whatever text you want to repeat over and over again and the number of times.0247

The text, you can pick your favorite text.0256

You just have to make sure that it is in quotation marks.0259

I'm going to put in an at symbol, that is my favorite one.0263

I’m going to close the quotation marks and put in a comma.0271

The beautiful thing about COUNTIF is that it is going to return a number to me.0274

The output is going to be a number.0279

If I just leave this here it will just output to me 7.0282

This function will actually read "repeat this at symbol 7 times."0287

At the end of this I’m going to put a close parentheses.0296

That way my parentheses match up.0302

And then I’m going to hit enter.0305

Great.0308

Let me just make these rows a little bit larger so you can see everything.0311

Here instead of having the number 7, I have 7 little symbols.0317

And you do not have to use the at sign (@) if you do not want to.0323

You could use a star, an asterisk.0327

You could use anything you want.0332

You could use an o, anything to help you see there is this many people born in the month of January.0333

This is a direct way of going into the visualization.0347

We can actually copy and paste this just like we did before.0351

All we have to do is make sure that our data is locked with our dollar signs.0357

I’m going to hit enter and all I’m going to do is take that, drag it all the way down and we have a visualization right there.0367

We do not have to make Excel do any extra work with the charts or anything like that.0377

That is the handy thing about a dot plot.0382
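In Excel terms, the formula built above has the shape `=REPT("@", COUNTIF(range, value))`, where the range is the locked birth month column. The same logic can be sketched in Python as below. This is only an illustration under assumptions: the `birth_months` list is made-up sample data and `dot_plot_rows` is a hypothetical helper name, neither comes from the lecture's file.

```python
from collections import Counter

# Made-up sample of birth months (1 = January, ..., 12 = December).
birth_months = [1, 3, 1, 12, 7, 3, 1]


def dot_plot_rows(values, categories, symbol="@"):
    """Return one text row per category: the label, then one symbol per case."""
    counts = Counter(values)  # plays the role of COUNTIF for every category
    # symbol * n repeats the symbol n times, like REPT(text, number_times)
    return [f"{c:>2} {symbol * counts[c]}" for c in categories]


rows = dot_plot_rows(birth_months, range(1, 13))
for row in rows:
    print(row)
```

Months with no cases simply get an empty row, which is what dragging the formula down the column would give as well.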

We are making a dot plot in Excel, but a lot of times you may be asked to make dot plots with pen and paper.0385

No actual statistician does that anymore, but in a statistics class you may be asked to do that.0392

The nice thing about dot plots is that you do not have to put the data in order or anything.0399

You could just go and just sort of go down the line and put a dot where 5 is,0404

put a dot where 3 is, put a dot where 6 is, put a dot where 9 is.0412

You can see that it is a really easy way to visualize the distribution and you do not have to do anything to your data before hand.0416

But once again, doing it by hand is pretty tedious and people do not really do it that way.0426

At least not when they are doing real statistics.0431

Alright, that is dot plots but let us talk about the pros and cons.0435

Dot plots, the pros and cons.0445

The pro is that it is nice and quick.0447

It is quick and dirty, right.0451

You could go directly from data to dot plots, no middleman, no frequency tables.0453

You do not have to know anything even about statistics and you could do it.0458

You can only do this with small data sets.0461

It is not useful with giant data sets.0465

Even your data set of 100 cases is considered a relatively small data set.0467

Imagine if you had a thousand cases, I mean, just to even see your dots go all the way to the side.0473

I mean it is going to be pretty crazy, right?0481

That is sort of a pro and a con, when you have a small data set this is a really nice thing to do.0485

Con is it cannot handle a lot of data points.0494

Another pro is that it shows the actual individual values so0499

you will know exactly what each dot stands for, and it does not group cases together.0503

Everything is sort of spread out for you and it shows you the shape of the distribution.0509

With our example we saw once again that it is a uniform distribution and you could see that right away.0514

The cons are sort of the opposite: they are really the same features as the pros, but considered from the perspective of a different goal.0521

One of the pros is that you can handle small data sets really well but the con is that it cannot handle a lot of data points.0533

The other con is that because it shows these individual values, it cannot group the cases together.0540

I forgot to show you one thing.0549

If you go back to the birth month sheet, one thing to notice is that this dot plot is horizontal:0551

here each row represents one of the values of this variable.0563

Now sometimes you may be asked to create a dot plot where the dots are going up and down.0571

I will call that the vertical version.0580

It is really easy to make that once you have created this.0582

Here is how to do it and this is a nice trick.0584

It is just something nice to know with Excel as well.0590

If you copy this, I'm using command C here and I'm just going to paste it over here on this side.0595

I guess I will paste it here for now.0605

We are going to have to change this, I will paste this down here so that you could see it is going up.0609

Here is what I’m going to do, I’m going to paste it special.0616

If you go to Edit, you should be able to see something that says paste special.0621

Not a lot of people use this but if you click on paste special something like this will come out.0633

What I want Excel to do is transpose it: what used to be columns, I want it to put in rows.0639

I’m going to hit transpose.0645

It is just going to paste it, but it is going to transpose it as it pastes.0649

I will hit okay.0653

Here what we find is that it has put all of my rows into columns and columns into rows.0656

What I’m going to do is take these months here and0664

I’m going to cut them, Command-X, and I’m going to paste them below my frequencies, Command-V.0672

I’m going to select these dots from my dot plot and0681

then I’m going to use my formatting palette to tell it to orient the text in a vertical manner rather than horizontal.0684

Let us look at what this looks like.0698

I’m going to take this row and make it a little bigger so you could see.0701

I’m going to take all of these and shrink my column a little bit so you could see it all at the same time.0713

Hopefully you are able to follow along with that.0722

You will see that what I have done is I have taken my previously horizontal dots and I have made my dots vertical.0725

It is just formatting.0734

It is nothing special.0735

Honestly you could also do this from the very beginning if you know that is the way you need to do your dot plot.0737

Basically the information is the same though: you can see that it is a roughly uniform distribution.0743

There is not one month that is particularly popular.0751

There just seems to be a repeating pattern.0754

That is transposing dot plots.0758
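To make the horizontal versus vertical idea concrete, here is a small Python sketch that prints the same counts as columns of symbols growing upward. It is only an illustration: the counts are made up, and `vertical_dot_plot` is a hypothetical helper, not something from Excel or from the lecture's file.

```python
def vertical_dot_plot(counts, symbol="@"):
    """Return text lines, top row first, with one column of symbols per value."""
    height = max(counts, default=0)
    lines = []
    # Walk from the tallest level down to 1; a column gets a symbol on a level
    # only if its count reaches that level.
    for level in range(height, 0, -1):
        lines.append(" ".join(symbol if c >= level else " " for c in counts))
    return lines


counts = [3, 1, 2]  # made-up frequencies for three values
lines = vertical_dot_plot(counts)
print("\n".join(lines))
# Labels go underneath, like the months pasted below the dots in the sheet.
print(" ".join(str(i) for i in range(1, len(counts) + 1)))
```

The information is identical to the horizontal version; only the orientation changes.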

Let us go on.0766

Let us go on to histograms.0769

Histograms will probably be the most frequent frequency distribution you use in your statistics course and in real-life statistics.0771

Histograms are pretty easy to understand.0780

All you have to do is imagine drawing a rectangle around the dots0783

so that you do not see individual dots anymore but you will see rectangles.0787

What is nice about histograms is that you could group cases into bins.0793

For instance, we could put everybody who is born in the winter months together into one rectangle.0798

We can have December, January, February, in one rectangle and then we could bin other months into another rectangle.0806

This is going to be useful to us when we have variables that have a lot of values.0818

For instance, take the example of a variable like friends.0824

Friends is going to be an interesting variable to look at.0828

We can look at what the distribution of friends is like in our Facebook sample, but people can have 1 friend.0831

People can have 15 friends, people can have 500 friends or 600 friends or 1000 friends.0838

The question is are we going to have a column for 1, 2, 3, 4, 5, all the way up to 1000?0847

That seems pretty crazy, right?0857

If we had an ungrouped frequency table it would look insane.0859

We probably have from like 1 all the way up to, let us be conservative and say 500.0864

We have one all the way up to 500 and then maybe we have one person that has 10879

and then another person that has 3 and a couple people that have 10, 12 and 15.0884

Probably not one particular value of friends like 273 is going to be that popular, right.0890

It will be like every value just has 1 person in its bin, right?0897

Every value would not have very many cases.0902

An ungrouped frequency table would be unhelpful to us.0905

In this case we want to use a grouped frequency table.0912

We probably want a bin of everybody who has from 0 to 100 friends together and then from 101 to 200 friends together.0915

And from 201 to 300 friends together something like that.0924

That will probably be a lot more useful for us.0927

Let us go back to our examples in the Excel file and let us go to friends.0929

In friends I have prepared what looks like a regular frequency table so far.0940

It has frequency here, but here I have min of bin.0946

That is my minimum value of the bin and my maximum value of the bin.0951

For instance, if I want my bin to go from 0 to 100 then I would put in frequency all the people who are in between 0 to 100.0955

In order to establish the bin size that you want, one thing that is helpful to know in advance0969

is what is the minimum value in your data set?0974

And what is the maximum value in your data set?0979

Here I'm going to be using Excel functions to help me do that.0982

The Excel function is pretty easy to remember because it is just MIN, in order to find the minimum value.0988

I will put in MIN and my open parenthesis and I'm going to go to my data.0994

I'm going to my friends variable.1001

See we have people who have like thousands of friends, 700 friends.1005

I'm going to close my parentheses there.1024

Here this is just saying look at all of this data and find the minimum value.1027

It turns out that somebody in our data set has 0 friends. It is sad for that person, but c'est la vie, life is okay.1035

It will go on even if you do not have any Facebook friends.1043

Let us look at the maximum value.1047

The maximum value is pretty easy because it is just MAX and a parenthesis and putting in that same data.1049

Obviously you could also copy and paste.1066

I have put in a close parentheses and I just hit enter.1069

And I see that somebody in our data set has over 2000 friends.1072

Can you imagine having an ungrouped frequency table where you have to number from 0 to 2257?1078

And then look at every single value in between?1085

I mean a lot of them will probably be 0 and even a really frequent one will only be like 2.1088

This is a pretty useless thing to look at in an ungrouped way.1094

That is why we want to look at it in a grouped histogram.1098

What kind of bin size would you like?1102

Maybe 100, but instead of 100 I’m going to put in 99 because of the formula that I'm going to create.1105

You could subtract one later, but I am going to make it inclusive; I will explain it in just a second.1116

I'm lazy and I do not want to type in 101 to 200, and 201 to 300.1123

I’m just going to use Excel functions in order to help me do that.1130

Now I know that my min of bin here is 0.1135

My first bin is from 0 to 100.1139

My next bin obviously is going to start where this bin leaves off.1141

I’m going to take my max of bin number and then just add 1, for now I have 101.1146

And because it is starting from 101 already, I know that if I added 100 it would be 201 and then the next one would start at 202.1155

That is why I made my bin size 99: I know it already starts off at 101, and that way I can make it nice and even.1162

When I tell my max of bin to be this thing, my min of bin plus bin size, then I will get 200.1175

Sometimes it may require little bit of playing around with the formulas.1187

You might have to add 1 or subtract 1 or something here.1192

I’m sure you can figure it out as you go.1194

I’m just going to copy and paste this all the way down.1199

I know what I forgot, I’m just going to undo that.1204

What I know is that this cell, I do not want it to ever change.1209

I want this to be locked in.1214

What I'm going to do is take that, now that it should be locked in, and then copy and paste it.1221

A few places down, you can see it goes from 101 to 200, 201 to 300.1230

I’m going to go all the way up until I capture 2257, at least up to 2300.1236

That is like twice as much.1246

That is pretty good.1253

I have all my bins and even though I have how many rows, I have 23 rows but that is a lot fewer than having 2200 rows.1255

That is why grouped bins are better.1270
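The bin-building logic just described (type the first bin by hand, start each next bin one past the previous maximum, then add the locked "bin size" value of 99 to get that bin's maximum) can be sketched in Python as follows. This is only an illustration of the arithmetic; `make_bins` and its parameter names are hypothetical, not Excel features.

```python
def make_bins(first_max, step, data_max):
    """Build inclusive (min, max) bins starting at 0 until data_max is covered.

    first_max: maximum of the hand-typed first bin (100 in the lecture).
    step:      the locked "bin size" cell (99), added to each bin's minimum.
    """
    bins = [(0, first_max)]
    while bins[-1][1] < data_max:
        lo = bins[-1][1] + 1       # next bin starts where the last one leaves off
        bins.append((lo, lo + step))
    return bins


bins = make_bins(first_max=100, step=99, data_max=2257)
print(len(bins), bins[:3], bins[-1])
```

Calling `make_bins(250, 249, 2257)` gives the wider bins instead, ending at 2500, so changing the bin size is a matter of changing two numbers.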

Now let us put in a range.1274

The range is just telling me what is my min of bin and what is my max of bin.1276

This is just to help me so that when I graph it, when I make a chart out of it,1282

this range will help us put something on the X axis that is really useful for us.1288

Let me put in a range, and basically I'm going to make my range out of the min of bin and the max of bin.1295

I will take these two borders and I’m just going to put a dash in the middle.1302

The way to do that is by using the concatenate function.1306

Concatenate just means to put things together.1309

I'm putting in an equal sign (=) and the word concatenate.1313

And I will put in a parenthesis and I’m going to put in this value, that is my first text, and a comma (,)1320

to separate my texts; whatever is in these cells, Excel will just concatenate together, put it all together.1331

And then I want to put in a dash (-), in quotation marks, and then I want to put in my max border1339

and close that parenthesis and let us see what happens.1347

Now I will get 0 to 100 and this is going to be helpful for us later when we want to label our X axis in a useful way.1352
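In Excel this label formula has the shape `=CONCATENATE(A2, "-", B2)`, where `A2` and `B2` are hypothetical stand-ins for the min of bin and max of bin cells (the actual cell addresses are not given in the lecture). A minimal Python sketch of the same idea, with a hypothetical `range_label` helper:

```python
def range_label(lo, hi):
    """Join the two bin borders with a dash, like CONCATENATE(min, "-", max)."""
    return f"{lo}-{hi}"


# The first few bins from the lecture's friends sheet.
bins = [(0, 100), (101, 200), (201, 300)]
labels = [range_label(lo, hi) for lo, hi in bins]
print(labels)
```

These text labels are what end up along the X axis of the chart.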

Now that we have that, what we are going to ask Excel to do is find me the count of all1363

the people who have a friend count between 0 and 100.1374

Before we have only learned how to count if it is one particular value.1379

How do I tell Excel to count it if it is within a certain range?1385

We are going to use what we call COUNTIFS; it is not just COUNTIF, it is COUNTIFS,1391

because we are going to ask Excel to find everybody above 0 but also under 100.1397

I’m going to put in an equal sign (=) and put in COUNTIFS.1406

Whenever I put in COUNTIFS, I treat it almost like two separate COUNTIF statements that I have just put into one set of parentheses.1414

I will put in the range for the first one and then the criteria.1423

I will put in the range for the second thing and then put in the criteria.1427

Here is what I mean, my range is going to be my friends data and I will put in a comma there1431

and here I know that this data is not going to change.1451

I’m going to lock it in place.1454

It is already been locked in, I’m going to delete this part for now.1461

We want to tell it do not just count it if it is equal to 0 but count it if they have a friend count that is greater than 0.1467

How do I do that?1478

I use my quotation marks, see if you could see that.1479

I use my quotation marks and I will tell Excel count it if their friend value is greater than or equal to,1486

and I’m going to close my quotation marks, because what I want to say is count it if it is greater than or equal to whatever is in here.1496

For Excel to know that you want to treat this as a number, you want to put in an ampersand (&) and then you want to click on that min of bin cell.1505

So far it is going to be counting everybody who has a friend count greater than or equal to 0, but that is everybody in your data set.1518

That is not what you want.1525

You also at the same time want Excel to tell you if the friend count is under 100.1527

I’m going to put in a comma and now we need to put in my criteria range number 2 and my criteria number 2.1537

I know that my criteria range is actually the same as this.1544

I’m just going to copy and paste this.1548

I’m using Command-C and I’m putting it here with Command-V, so that now I just have that same range that is never going to change.1550

I'm going to tell Excel now tell me if these values are less than or equal to the max of bin.1565

So, less than or equal to, then close the quotation marks.1574

Put in my ampersand (&) and click on that max of bin cell.1581

It looks a little crazy, but it just means count it if it is greater than or equal to 0 and less than or equal to 100, and then I'm going to close that.1586

And I find out we have 8 people who fit that profile.1597

The beautiful thing about Excel is that although it is hard to do that one cell, because it is pretty complicated,1602

once you put in your dollar signs and lock everything in place that needs to be locked in place,1609

you could just drag that all the way down and you will get this beautiful frequency table that has been binned for you.1615
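The formula built here has the shape `=COUNTIFS(range, ">="&min_cell, range, "<="&max_cell)`: both conditions must hold for a case to be counted. A short Python sketch of the same grouped count is below. It is an illustration only; the `friends` list is made-up data (not the lecture's 100 cases) and `count_in_range` is a hypothetical helper.

```python
# Made-up friend counts, chosen to include values on the bin borders.
friends = [0, 15, 99, 100, 101, 273, 500, 2257]
bins = [(0, 100), (101, 200), (201, 300)]


def count_in_range(values, lo, hi):
    """Count values with lo <= value <= hi (both criteria must be true)."""
    return sum(1 for v in values if lo <= v <= hi)


# One row per bin, like dragging the COUNTIFS formula down the column.
table = {f"{lo}-{hi}": count_in_range(friends, lo, hi) for lo, hi in bins}
for label, frequency in table.items():
    print(label, frequency)
```

Because the bins are inclusive on both ends and do not overlap, a border value such as 100 lands in exactly one bin.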

And now that you have this what I would do is select both the range and frequency, hit charts and hit XY scatter.1624

This is what sort of your frequency table looks like but let us put it into columns so that the range will show up well.1641

I’m just going to cover all my frequency table stuff, so you could see this a little bit better.1653

Now you will see that our histogram has the whole range from 0 all the way to 2300 and it has binned all my people together.1659

It has binned people who have 0, 1, 2, 3, 4, 5, 99 friends, all into one bin.1674

In histograms, one thing you will learn is that because this axis is continuous you want your histogram1684

to also reflect that you have not skipped any values down here.1691

What you want to do is double-click on your bars and a format data series window should pop up.1696

What you are going to do is click on options and I’m going to tell Excel do not put any gaps in between my bars.1712

In this way histograms tell you there are no values that are skipped along here, every value has been accounted for.1722

That is our first binned histogram.1733

Obviously you could change your bin size, and we are going to do that just so that you can see.1737

I’m going to make this real tiny and leave it over here and we are going to do another example where we are going to change the bin size.1744

Maybe instead of 100, we will want really big bins like 250.1754

Instead of 100, I want my bin size to be 250.1768

Because of that I should change this to 249 and all the work has basically been done for us already because we put in those formulas.1777

Let us see, I only needed to go up to 2257, 2500 is the earliest I can close it off.1790

I’m just going to delete these rows, they are not useful to me.1804

Let me delete all of these guys.1815

Let us look at our bins, I’m just going to make this histogram large and so we could see wow like Excel1828

has already done all the work for us, making the graphs and everything.1837

All we had to do is change a few numbers here and there and we see that we basically have that same distribution.1840
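The binning done by the spreadsheet formulas can also be sketched in code. This is a minimal Python sketch, not the Excel procedure from the lecture: the function name `grouped_frequencies` and the `friends` values are made up for illustration, and it assumes non-negative whole-number counts with bins that run from 0 to 99, 100 to 199, and so on.

```python
from collections import Counter

def grouped_frequencies(values, bin_size):
    """Return (low, high, count) for each bin of width bin_size, starting at 0."""
    # Map every value to the lower edge of its bin, then count per bin.
    counts = Counter((v // bin_size) * bin_size for v in values)
    top = (max(values) // bin_size) * bin_size
    # Keep empty bins too, so no stretch of the axis is skipped.
    return [(low, low + bin_size - 1, counts.get(low, 0))
            for low in range(0, top + 1, bin_size)]

# Hypothetical friend counts, invented for this example.
friends = [0, 5, 99, 100, 130, 250, 260, 499, 510, 2257]

table_100 = grouped_frequencies(friends, 100)  # bins 0-99, 100-199, ...
table_250 = grouped_frequencies(friends, 250)  # bins 0-249, 250-499, ...

for low, high, count in table_250:
    print(f"{low}-{high}: {count}")
```

Changing the bin size is a one-argument change here, in the same way that changing a few cells was enough in the spreadsheet.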

Remember this distribution is called a skewed distribution and we have this tail that goes out to the right side where the values get larger.1847

That is what we call a skewed right tail and that is because we have a couple of people1858

who have an inordinately large number of friends but most people are between 0 and 500 friends.1864

www.facebook.com actually says that their average number of friends is 130 or something like that1873

which is pretty low in comparison with our data set where it seems like a lot of people have even 250 to 500 friends.1879

And so now let us go back to our slides.1893

So that was the grouped frequency table.1896

Let us look at histograms, pros and cons.1902

On the pro side, histograms are nice when you have a large number of values to plot, just like with friends;1904

that is certainly a variable where you have a large number of values.1912

There are also things like a number of tagged photos, that is going to be a large number of values.1916

Some people have just a few photos.1924

Some people have a ton of photos right?1926

These kinds of variables have a large number of values to plot and in that case a histogram is really nice, particularly a grouped histogram.1927

Also a histogram is good if you do not need to see individual values1937

and a histogram will definitely show you the general shape of the distribution.1943

For example in the case of friends we see that it is a right skewed distribution.1947

One con though is that if you have many distributions that you want to compare simultaneously, histograms are going to be a little bit hard to do.1953

You can probably compare up to two separate distributions on one histogram1961

and see how the distributions flow, but with more than two it gets a little bit clunky.1966

Let us talk a little bit about frequency versus relative frequency.1975

Frequency is simply how many cases have a particular value so how many people have between 0 and 100 friends.1980

Well maybe it is something like 250 people.1991

How many cases have a particular value between 101 and 200.1998

Well maybe that is 240 people.2009

This is basically what frequency looks like, it is just counting.2014

Now relative frequency is what proportion of cases from the sample have that particular value.2018

If you had 250 people having friends between 0 and 100 friends, what is that in relation to your entire sample?2023

Well if your sample, if your (n) is 1000 people in your sample then it would be 25% or .25.2038

And then between 101 and 200 instead of being 240 people it would be 24% or .24.2054

A pretty simple idea but it will come up again in the future.2069
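The arithmetic can be written out as a small Python sketch. The counts of 250 and 240 and the sample size of 1000 come from the example above; the bin labels and the third bin that holds the remaining 510 people are assumptions added only so the counts sum to n.

```python
def relative_frequencies(freqs):
    """Divide each bin's count by the total number of cases n."""
    n = sum(freqs.values())
    return {label: count / n for label, count in freqs.items()}

# 250 and 240 are from the lecture example; "201+" is a made-up
# catch-all bin so that the sample size comes out to 1000.
freqs = {"0-100": 250, "101-200": 240, "201+": 510}

rel = relative_frequencies(freqs)
for label, proportion in rel.items():
    print(f"{label}: {freqs[label]} people, relative frequency {proportion:.2f}")
```

Relative frequencies always add up to 1 across all bins, which is a quick way to check the table.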

Let us go into some problems.2075

First example, if we want to visualize the distribution of uploaded photos should we use dot plots or histograms?2077

Let us think about uploaded photos: is that like months, where there is a set, finite number of values?2085

Or is it more like the variable friends, where there are a ton of potential values?2094

Uploaded photos I’m pretty sure that people have up to like thousands of uploaded photos sometimes.2102

I'm thinking that it looks more like the distribution of friends than it looks like months.2108

Uploaded photos because there are so many values we would probably want to use grouped histograms as well.2118

We probably do not want to look at every single value.2129

How many people have one photo?2132

How many people have two photos?2133

How many people have three photos? All the way up to like thousands.2135

Instead we would probably want to look at them in a grouped way.2137

How many people have between 0 and 500 photos?2141

How many people have between 500 and 1,000 photos?2145

And why? Mostly because there are going to be a lot of potential values.2150

Values is what we always plot on our X axis and whenever you could sort of imagine these values2166

being like cut up into like thousands of pieces, you probably want to use grouped histograms.2175

Example number 2, consider this dot plot with the age of pennies in a sample collected by a statistics class.2182

What are the cases and the variables that you see here?2189

It is really asking you what does each little dot in my dot plot represent?2193

Well I know that each dot probably represents a penny but not just a penny, it has to be some sort of number about the penny.2200

So the case is the penny and the variable that it has is the age, probably in years.2211

Because that was written on the penny it says like this penny was made in 2010.2229

What is the shape of this distribution?2238

If I just draw a little outline here, this looks to me like a tail.2241

I'm guessing this is a right skewed distribution.2254

And why does it have this distribution?2262

That is going to take a little bit of thought.2266

It looks to me as though lots of pennies that are in circulation are pretty new and there are not as many old ones.2268

Maybe it is because there is this wall right here.2280

A penny cannot be any newer than 0 years old.2284

You cannot quite go past that and probably a lot of pennies that are really old sort of fall out of circulation2291

or maybe the US Mint takes them out of circulation.2298

That is what I would answer.2305

Let us go onto example 3.2309

Considering this histogram of mammal speeds, is this a grouped or ungrouped frequency distribution?2310

Well to me it looks like there are bins here, this is probably a bin that goes from 0 to 10 and this is between 10 and 20.2319

This is between 20 and 30.2331

In here there might be animals that are able to run at 23 mph or 28 mph.2334

I would say that this is a grouped frequency distribution.2343

This is a good question, if we used relative frequency will this change the shape of our distribution?2349

Well let us reason about that a little bit.2356

These peaks right here are animals running at speeds of between 30 and 50 mph and that seems to be the mode,2360

the most frequent type of mammal, and that is probably not going to change if you take 6 and divide it by however many animals are here.2371

This should still be the peak and this should still be the peak.2384

This should still be one of the lower values.2388

This should still be one of the lower values.2391

I’m going to say that this would not change.2394

No it would not change the shape of our distribution.2397

I have made the relative frequency distribution down here and you could see2401

that even when we put it in percentages that the shape of it looks largely the same.2411

That is one handy thing to find out: when you use relative frequency, the shape of the distribution does not change.2422
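The reasoning here is that every bar gets divided by the same number n, so the bars keep their heights relative to one another. A minimal Python sketch of that check follows; the bin counts are hypothetical (only the peak count of 6 echoes the example) and are not read off the actual histogram.

```python
import math

# Hypothetical counts per speed bin (0-10, 10-20, ... mph), with a peak of 6.
counts = [2, 3, 4, 6, 6, 2, 1]
n = sum(counts)
relative = [c / n for c in counts]

def shape(heights):
    """Rescale bar heights so the tallest bar equals 1."""
    peak = max(heights)
    return [h / peak for h in heights]

# If the shapes match bar for bar, the histogram looks the same;
# only the labels on the vertical axis differ.
same_shape = all(math.isclose(a, b)
                 for a, b in zip(shape(counts), shape(relative)))
print(same_shape)
```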

We are going to be using that in the future.2428

Last example, in this histogram of life expectancy which do you think is bigger?2431

The mean (remember, it is like the balancing point of this distribution) or the mode, the most frequent value? Explain.2437

In this histogram of life expectancy the mode is probably the easiest thing to figure out at first.2447

We could see right here the biggest peak is at around 70 in life expectancy.2454

Each of these little bins look like they are 4 years.2463

This is probably between 68 and 72 years.2469

That is the mode but where do you think the mean might be?2477

We have to imagine taking this whole thing, cutting it out with cardboard and trying to balance it on your finger.2483

I think that we probably have to balance it somewhere here but not too far down, not quite in the middle but somewhere right here.2490

That is what I’m going to guess and that might be the mean.2503

This is the mode.2509

Because if I put my balance right here, I think it would probably tip on this side.2512

I would probably shift it over a little bit, and so it looks like my mode is a bigger value than my mean.2522

Explain.2535

I think I have already explained it right?2537

My mode is right here but my mean is a little bit over here because2542

I think that if I balanced it right underneath the mode, it would tip on that side.2543

There would be too many people here on this side.2551

I’m going to move over my mean and because of that I can clearly see that my mode is bigger.2554
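The balancing argument can be checked numerically. This is a small Python sketch using the standard `statistics` module; the life expectancy values are invented for illustration, chosen to have a peak at 70 and a longer tail toward lower values, and are not the data behind the histogram.

```python
import statistics

# Hypothetical life expectancies in years: peak at 70, tail toward low values.
life_expectancy = [45, 50, 55, 60, 64, 68, 70, 70, 70, 72, 74, 76, 78]

mode_value = statistics.mode(life_expectancy)  # the most frequent value
mean_value = statistics.mean(life_expectancy)  # the balancing point

print(f"mode = {mode_value}, mean = {mean_value:.1f}")
print("mode is bigger" if mode_value > mean_value else "mean is bigger")
```

The few low values pull the balancing point down without moving the peak, which is why the mode ends up larger than the mean in this sketch.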

There you go.2560

Thanks for using www.educator.com.2561
