Dr. Ji Son

Shape: Calculating Skewness & Kurtosis

Table of Contents

Section 1: Introduction
Descriptive Statistics vs. Inferential Statistics

25m 31s

Intro
0:00
Roadmap
0:10
Roadmap
0:11
Statistics
0:35
Statistics
0:36
Let's Think About High School Science
1:12
Measurement and Find Patterns (Mathematical Formula)
1:13
Statistics = Math of Distributions
4:58
Distributions
4:59
Problematic… but also GREAT
5:58
Statistics
7:33
How is It Different from Other Specializations in Mathematics?
7:34
Statistics is Fundamental in Natural and Social Sciences
7:53
Two Skills of Statistics
8:20
Description (Exploration)
8:21
Inference
9:13
Descriptive Statistics vs. Inferential Statistics: Apply to Distributions
9:58
Descriptive Statistics
9:59
Inferential Statistics
11:05
Populations vs. Samples
12:19
Populations vs. Samples: Is it the Truth?
12:20
Populations vs. Samples: Pros & Cons
13:36
Populations vs. Samples: Descriptive Values
16:12
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:10
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:11
Example 1: Descriptive Statistics vs. Inferential Statistics
19:09
Example 2: Descriptive Statistics vs. Inferential Statistics
20:47
Example 3: Sample, Parameter, Population, and Statistic
21:40
Example 4: Sample, Parameter, Population, and Statistic
23:28
Section 2: About Samples: Cases, Variables, Measurements
About Samples: Cases, Variables, Measurements

32m 14s

Intro
0:00
Data
0:09
Data, Cases, Variables, and Values
0:10
Rows, Columns, and Cells
2:03
Example: Aircraft
3:52
How Do We Get Data?
5:38
Research: Question and Hypothesis
5:39
Research Design
7:11
Measurement
7:29
Research Analysis
8:33
Research Conclusion
9:30
Types of Variables
10:03
Discrete Variables
10:04
Continuous Variables
12:07
Types of Measurements
14:17
Types of Measurements
14:18
Types of Measurements (Scales)
17:22
Nominal
17:23
Ordinal
19:11
Interval
21:33
Ratio
24:24
Example 1: Cases, Variables, Measurements
25:20
Example 2: Which Scale of Measurement is Used?
26:55
Example 3: What Kind of a Scale of Measurement is This?
27:26
Example 4: Discrete vs. Continuous Variables.
30:31
Section 3: Visualizing Distributions
Introduction to Excel

8m 9s

Intro
0:00
Before Visualizing Distribution
0:10
Excel
0:11
Excel: Organization
0:45
Workbook
0:46
Column x Rows
1:50
Tools: Menu Bar, Standard Toolbar, and Formula Bar
3:00
Excel + Data
6:07
Excel and Data
6:08
Frequency Distributions in Excel

39m 10s

Intro
0:00
Roadmap
0:08
Data in Excel and Frequency Distributions
0:09
Raw Data to Frequency Tables
0:42
Raw Data to Frequency Tables
0:43
Frequency Tables: Using Formulas and Pivot Tables
1:28
Example 1: Number of Births
7:17
Example 2: Age Distribution
20:41
Example 3: Height Distribution
27:45
Example 4: Height Distribution of Males
32:19
Frequency Distributions and Features

25m 29s

Intro
0:00
Roadmap
0:10
Data in Excel, Frequency Distributions, and Features of Frequency Distributions
0:11
Example #1
1:35
Uniform
1:36
Example #2
2:58
Unimodal, Skewed Right, and Asymmetric
2:59
Example #3
6:29
Bimodal
6:30
Example #4a
8:29
Symmetric, Unimodal, and Normal
8:30
Point of Inflection and Standard Deviation
11:13
Example #4b
12:43
Normal Distribution
12:44
Summary
13:56
Uniform, Skewed, Bimodal, and Normal
13:57
Sketch Problem 1: Driver's License
17:34
Sketch Problem 2: Life Expectancy
20:01
Sketch Problem 3: Telephone Numbers
22:01
Sketch Problem 4: Length of Time Used to Complete a Final Exam
23:43
Dotplots and Histograms in Excel

42m 42s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Previously
1:02
Data, Frequency Table, and Visualization
1:03
Dotplots
1:22
Dotplots Excel Example
1:23
Dotplots: Pros and Cons
7:22
Pros and Cons of Dotplots
7:23
Dotplots Excel Example Cont.
9:07
Histograms
12:47
Histograms Overview
12:48
Example of Histograms
15:29
Histograms: Pros and Cons
31:39
Pros
31:40
Cons
32:31
Frequency vs. Relative Frequency
32:53
Frequency
32:54
Relative Frequency
33:36
Example 1: Dotplots vs. Histograms
34:36
Example 2: Age of Pennies Dotplot
36:21
Example 3: Histogram of Mammal Speeds
38:27
Example 4: Histogram of Life Expectancy
40:30
Stemplots

12m 23s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
What Sets Stemplots Apart?
0:46
Data Sets, Dotplots, Histograms, and Stemplots
0:47
Example 1: What Do Stemplots Look Like?
1:58
Example 2: Back-to-Back Stemplots
5:00
Example 3: Quiz Grade Stemplot
7:46
Example 4: Quiz Grade & Afterschool Tutoring Stemplot
9:56
Bar Graphs

22m 49s

Intro
0:00
Roadmap
0:05
Roadmap
0:08
Review of Frequency Distributions
0:44
Y-axis and X-axis
0:45
Types of Frequency Visualizations Covered so Far
2:16
Introduction to Bar Graphs
4:07
Example 1: Bar Graph
5:32
Example 1: Bar Graph
5:33
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:07
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:08
Example 2: Create a Frequency Visualization for Gender
14:02
Example 3: Cases, Variables, and Frequency Visualization
16:34
Example 4: What Kind of Graphs are Shown Below?
19:29
Section 4: Summarizing Distributions
Central Tendency: Mean, Median, Mode

38m 50s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
Central Tendency 1
0:56
Way to Summarize a Distribution of Scores
0:57
Mode
1:32
Median
2:02
Mean
2:36
Central Tendency 2
3:47
Mode
3:48
Median
4:20
Mean
5:25
Summation Symbol
6:11
Summation Symbol
6:12
Population vs. Sample
10:46
Population vs. Sample
10:47
Excel Examples
15:08
Finding Mode, Median, and Mean in Excel
15:09
Median vs. Mean
21:45
Effect of Outliers
21:46
Relationship Between Parameter and Statistic
22:44
Type of Measurements
24:00
Which Distributions to Use With
24:55
Example 1: Mean
25:30
Example 2: Using Summation Symbol
29:50
Example 3: Average Calorie Count
32:50
Example 4: Creating an Example Set
35:46
Variability

42m 40s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Variability (or Spread)
0:45
Variability (or Spread)
0:46
Things to Think About
5:45
Things to Think About
5:46
Range, Quartiles and Interquartile Range
6:37
Range
6:38
Interquartile Range
8:42
Interquartile Range Example
10:58
Interquartile Range Example
10:59
Variance and Standard Deviation
12:27
Deviations
12:28
Sum of Squares
14:35
Variance
16:55
Standard Deviation
17:44
Sum of Squares (SS)
18:34
Sum of Squares (SS)
18:35
Population vs. Sample SD
22:00
Population vs. Sample SD
22:01
Population vs. Sample
23:20
Mean
23:21
SD
23:51
Example 1: Find the Mean and Standard Deviation of the Variable Friends in the Excel File
27:21
Example 2: Find the Mean and Standard Deviation of the Tagged Photos in the Excel File
35:25
Example 3: Sum of Squares
38:58
Example 4: Standard Deviation
41:48
Five Number Summary & Boxplots

57m 15s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Summarizing Distributions
0:37
Shape, Center, and Spread
0:38
5 Number Summary
1:14
Boxplot: Visualizing 5 Number Summary
3:37
Boxplot: Visualizing 5 Number Summary
3:38
Boxplots on Excel
9:01
Using 'Stocks' and Using Stacked Columns
9:02
Boxplots on Excel Example
10:14
When are Boxplots Useful?
32:14
Pros
32:15
Cons
32:59
How to Determine Outlier Status
33:24
Rule of Thumb: Upper Limit
33:25
Rule of Thumb: Lower Limit
34:16
Signal Outliers in an Excel Data File Using Conditional Formatting
34:52
Modified Boxplot
48:38
Modified Boxplot
48:39
Example 1: Percentage Values & Lower and Upper Whisker
49:10
Example 2: Boxplot
50:10
Example 3: Estimating IQR From Boxplot
53:46
Example 4: Boxplot and Missing Whisker
54:35
Shape: Calculating Skewness & Kurtosis

41m 51s

Intro
0:00
Roadmap
0:16
Roadmap
0:17
Skewness Concept
1:09
Skewness Concept
1:10
Calculating Skewness
3:26
Calculating Skewness
3:27
Interpreting Skewness
7:36
Interpreting Skewness
7:37
Excel Example
8:49
Kurtosis Concept
20:29
Kurtosis Concept
20:30
Calculating Kurtosis
24:17
Calculating Kurtosis
24:18
Interpreting Kurtosis
29:01
Leptokurtic
29:35
Mesokurtic
30:10
Platykurtic
31:06
Excel Example
32:04
Example 1: Shape of Distribution
38:28
Example 2: Shape of Distribution
39:29
Example 3: Shape of Distribution
40:14
Example 4: Kurtosis
41:10
Normal Distribution

34m 33s

Intro
0:00
Roadmap
0:13
Roadmap
0:14
What is a Normal Distribution
0:44
The Normal Distribution As a Theoretical Model
0:45
Possible Range of Probabilities
3:05
Possible Range of Probabilities
3:06
What is a Normal Distribution
5:07
Can Be Described By
5:08
Properties
5:49
'Same' Shape: Illusion of Different Shape!
7:35
'Same' Shape: Illusion of Different Shape!
7:36
Types of Problems
13:45
Example: Distribution of SAT Scores
13:46
Shape Analogy
19:48
Shape Analogy
19:49
Example 1: The Standard Normal Distribution and Z-Scores
22:34
Example 2: The Standard Normal Distribution and Z-Scores
25:54
Example 3: Sketching and Normal Distribution
28:55
Example 4: Sketching and Normal Distribution
32:32
Standard Normal Distributions & Z-Scores

41m 44s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
A Family of Distributions
0:28
Infinite Set of Distributions
0:29
Transforming Normal Distributions to 'Standard' Normal Distribution
1:04
Normal Distribution vs. Standard Normal Distribution
2:58
Normal Distribution vs. Standard Normal Distribution
2:59
Z-Score, Raw Score, Mean, & SD
4:08
Z-Score, Raw Score, Mean, & SD
4:09
Weird Z-Scores
9:40
Weird Z-Scores
9:41
Excel
16:45
For Normal Distributions
16:46
For Standard Normal Distributions
19:11
Excel Example
20:24
Types of Problems
25:18
Percentage Problem: P(x)
25:19
Raw Score and Z-Score Problems
26:28
Standard Deviation Problems
27:01
Shape Analogy
27:44
Shape Analogy
27:45
Example 1: Deaths Due to Heart Disease vs. Deaths Due to Cancer
28:24
Example 2: Heights of Male College Students
33:15
Example 3: Mean and Standard Deviation
37:14
Example 4: Finding Percentage of Values in a Standard Normal Distribution
37:49
Normal Distribution: PDF vs. CDF

55m 44s

Intro
0:00
Roadmap
0:15
Roadmap
0:16
Frequency vs. Cumulative Frequency
0:56
Frequency vs. Cumulative Frequency
0:57
Frequency vs. Cumulative Frequency
4:32
Frequency vs. Cumulative Frequency Cont.
4:33
Calculus in Brief
6:21
Derivative-Integral Continuum
6:22
PDF
10:08
PDF for Standard Normal Distribution
10:09
PDF for Normal Distribution
14:32
Integral of PDF = CDF
21:27
Integral of PDF = CDF
21:28
Example 1: Cumulative Frequency Graph
23:31
Example 2: Mean, Standard Deviation, and Probability
24:43
Example 3: Mean and Standard Deviation
35:50
Example 4: Age of Cars
49:32
Section 5: Linear Regression
Scatterplots

47m 19s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
Previous Visualizations
0:30
Frequency Distributions
0:31
Compare & Contrast
2:26
Frequency Distributions Vs. Scatterplots
2:27
Summary Values
4:53
Shape
4:54
Center & Trend
6:41
Spread & Strength
8:22
Univariate & Bivariate
10:25
Example Scatterplot
10:48
Shape, Trend, and Strength
10:49
Positive and Negative Association
14:05
Positive and Negative Association
14:06
Linearity, Strength, and Consistency
18:30
Linearity
18:31
Strength
19:14
Consistency
20:40
Summarizing a Scatterplot
22:58
Summarizing a Scatterplot
22:59
Example 1: Gapminder.org, Income x Life Expectancy
26:32
Example 2: Gapminder.org, Income x Infant Mortality
36:12
Example 3: Trend and Strength of Variables
40:14
Example 4: Trend, Strength and Shape for Scatterplots
43:27
Regression

32m 2s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Linear Equations
0:34
Linear Equations: y = mx + b
0:35
Rough Line
5:16
Rough Line
5:17
Regression - A 'Center' Line
7:41
Reasons for Summarizing with a Regression Line
7:42
Predictor and Response Variable
10:04
Goal of Regression
12:29
Goal of Regression
12:30
Prediction
14:50
Example: Servings of Milk Per Year Shown By Age
14:51
Interpolation
17:06
Extrapolation
17:58
Error in Prediction
20:34
Prediction Error
20:35
Residual
21:40
Example 1: Residual
23:34
Example 2: Large and Negative Residual
26:30
Example 3: Positive Residual
28:13
Example 4: Interpret Regression Line & Extrapolate
29:40
Least Squares Regression

56m 36s

Intro
0:00
Roadmap
0:13
Roadmap
0:14
Best Fit
0:47
Best Fit
0:48
Sum of Squared Errors (SSE)
1:50
Sum of Squared Errors (SSE)
1:51
Why Squared?
3:38
Why Squared?
3:39
Quantitative Properties of Regression Line
4:51
Quantitative Properties of Regression Line
4:52
So How do we Find Such a Line?
6:49
SSEs of Different Line Equations & Lowest SSE
6:50
Carl Gauss' Method
8:01
How Do We Find Slope (b1)
11:00
How Do We Find Slope (b1)
11:01
How Do We Find Intercept
15:11
How Do We Find Intercept
15:12
Example 1: Which of These Equations Fit the Above Data Best?
17:18
Example 2: Find the Regression Line for These Data Points and Interpret It
26:31
Example 3: Summarize the Scatterplot and Find the Regression Line.
34:31
Example 4: Examine the Mean of Residuals
43:52
Correlation

43m 58s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Summarizing a Scatterplot Quantitatively
0:47
Shape
0:48
Trend
1:11
Strength: Correlation (r)
1:45
Correlation Coefficient ( r )
2:30
Correlation Coefficient ( r )
2:31
Trees vs. Forest
11:59
Trees vs. Forest
12:00
Calculating r
15:07
Average Product of z-scores for x and y
15:08
Relationship between Correlation and Slope
21:10
Relationship between Correlation and Slope
21:11
Example 1: Find the Correlation between Grams of Fat and Cost
24:11
Example 2: Relationship between r and b1
30:24
Example 3: Find the Regression Line
33:35
Example 4: Find the Correlation Coefficient for this Set of Data
37:37
Correlation: r vs. r-squared

52m 52s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
R-squared
0:44
What is the Meaning of It? Why Squared?
0:45
Parsing Sum of Squares (Parsing Variability)
2:25
SST = SSR + SSE
2:26
What is SST and SSE?
7:46
What is SST and SSE?
7:47
r-squared
18:33
Coefficient of Determination
18:34
If the Correlation is Strong…
20:25
If the Correlation is Strong…
20:26
If the Correlation is Weak…
22:36
If the Correlation is Weak…
22:37
Example 1: Find r-squared for this Set of Data
23:56
Example 2: What Does it Mean that the Simple Linear Regression is a 'Model' of Variance?
33:54
Example 3: Why Does r-squared Only Range from 0 to 1
37:29
Example 4: Find the r-squared for This Set of Data
39:55
Transformations of Data

27m 8s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Why Transform?
0:26
Why Transform?
0:27
Shape-preserving vs. Shape-changing Transformations
5:14
Shape-preserving = Linear Transformations
5:15
Shape-changing Transformations = Non-linear Transformations
6:20
Common Shape-Preserving Transformations
7:08
Common Shape-Preserving Transformations
7:09
Common Shape-Changing Transformations
8:59
Powers
9:00
Logarithms
9:39
Change Just One Variable? Both?
10:38
Log-log Transformations
10:39
Log Transformations
14:38
Example 1: Create, Graph, and Transform the Data Set
15:19
Example 2: Create, Graph, and Transform the Data Set
20:08
Example 3: What Kind of Model would You Choose for this Data?
22:44
Example 4: Transformation of Data
25:46
Section 6: Collecting Data in an Experiment
Sampling & Bias

54m 44s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Descriptive vs. Inferential Statistics
1:04
Descriptive Statistics: Data Exploration
1:05
Example
2:03
To Tackle Generalization…
4:31
Generalization
4:32
Sampling
6:06
'Good' Sample
6:40
Defining Samples and Populations
8:55
Population
8:56
Sample
11:16
Why Use Sampling?
13:09
Why Use Sampling?
13:10
Goal of Sampling: Avoiding Bias
15:04
What is Bias?
15:05
Where does Bias Come from: Sampling Bias
17:53
Where does Bias Come from: Response Bias
18:27
Sampling Bias: Bias from 'Bad' Sampling Methods
19:34
Size Bias
19:35
Voluntary Response Bias
21:13
Convenience Sample
22:22
Judgment Sample
23:58
Inadequate Sample Frame
25:40
Response Bias: Bias from 'Bad' Data Collection Methods
28:00
Nonresponse Bias
29:31
Questionnaire Bias
31:10
Incorrect Response or Measurement Bias
37:32
Example 1: What Kind of Biases?
40:29
Example 2: What Biases Might Arise?
44:46
Example 3: What Kind of Biases?
48:34
Example 4: What Kind of Biases?
51:43
Sampling Methods

14m 25s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Biased vs. Unbiased Sampling Methods
0:32
Biased Sampling
0:33
Unbiased Sampling
1:13
Probability Sampling Methods
2:31
Simple Random
2:54
Stratified Random Sampling
4:06
Cluster Sampling
5:24
Two-Stage Sampling
6:22
Systematic Sampling
7:25
Example 1: Which Type(s) of Sampling was this?
8:33
Example 2: Describe How to Take a Two-Stage Sample from this Book
10:16
Example 3: Sampling Methods
11:58
Example 4: Cluster Sample Plan
12:48
Research Design

53m 54s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Descriptive vs. Inferential Statistics
0:51
Descriptive Statistics: Data Exploration
0:52
Inferential Statistics
1:02
Variables and Relationships
1:44
Variables
1:45
Relationships
2:49
Not Every Type of Study is an Experiment…
4:16
Category I - Descriptive Study
4:54
Category II - Correlational Study
5:50
Category III - Experimental, Quasi-experimental, Non-experimental
6:33
Category III
7:42
Experimental, Quasi-experimental, and Non-experimental
7:43
Why CAN'T the Other Strategies Determine Causation?
10:18
Third-variable Problem
10:19
Directionality Problem
15:49
What Makes Experiments Special?
17:54
Manipulation
17:55
Control (and Comparison)
21:58
Methods of Control
26:38
Holding Constant
26:39
Matching
29:11
Random Assignment
31:48
Experiment Terminology
34:09
'True' Experiment vs. Study
34:10
Independent Variable (IV)
35:16
Dependent Variable (DV)
35:45
Factors
36:07
Treatment Conditions
36:23
Levels
37:43
Confounds or Extraneous Variables
38:04
Blind
38:38
Blind Experiments
38:39
Double-blind Experiments
39:29
How Categories Relate to Statistics
41:35
Category I - Descriptive Study
41:36
Category II - Correlational Study
42:05
Category III - Experimental, Quasi-experimental, Non-experimental
42:43
Example 1: Research Design
43:50
Example 2: Research Design
47:37
Example 3: Research Design
50:12
Example 4: Research Design
52:00
Between and Within Treatment Variability

41m 31s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Experimental Designs
0:51
Experimental Designs: Manipulation & Control
0:52
Two Types of Variability
2:09
Between Treatment Variability
2:10
Within Treatment Variability
3:31
Updated Goal of Experimental Design
5:47
Updated Goal of Experimental Design
5:48
Example: Drugs and Driving
6:56
Example: Drugs and Driving
6:57
Different Types of Random Assignment
11:27
All Experiments
11:28
Completely Random Design
12:02
Randomized Block Design
13:19
Randomized Block Design
15:48
Matched Pairs Design
15:49
Repeated Measures Design
19:47
Between-subject Variable vs. Within-subject Variable
22:43
Completely Randomized Design
22:44
Repeated Measures Design
25:03
Example 1: Design a Completely Random, Matched Pair, and Repeated Measures Experiment
26:16
Example 2: Block Design
31:41
Example 3: Completely Randomized Designs
35:11
Example 4: Completely Random, Matched Pairs, or Repeated Measures Experiments?
39:01
Section 7: Review of Probability Axioms
Sample Spaces

37m 52s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
Why is Probability Involved in Statistics
0:48
Probability
0:49
Can People Tell the Difference between Cheap and Gourmet Coffee?
2:08
Taste Test with Coffee Drinkers
3:37
If No One can Actually Taste the Difference
3:38
If Everyone can Actually Taste the Difference
5:36
Creating a Probability Model
7:09
Creating a Probability Model
7:10
D'Alembert vs. Necker
9:41
D'Alembert vs. Necker
9:42
Problem with D'Alembert's Model
13:29
Problem with D'Alembert's Model
13:30
Covering Entire Sample Space
15:08
Fundamental Principle of Counting
15:09
Where Do Probabilities Come From?
22:54
Observed Data, Symmetry, and Subjective Estimates
22:55
Checking whether Model Matches Real World
24:27
Law of Large Numbers
24:28
Example 1: Law of Large Numbers
27:46
Example 2: Possible Outcomes
30:43
Example 3: Brands of Coffee and Taste
33:25
Example 4: How Many Different Treatments are there?
35:33
Addition Rule for Disjoint Events

20m 29s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Disjoint Events
0:41
Disjoint Events
0:42
Meaning of 'or'
2:39
In Regular Life
2:40
In Math/Statistics/Computer Science
3:10
Addition Rule for Disjoint Events
3:55
If A and B are Disjoint: P (A and B)
3:56
If A and B are Disjoint: P (A or B)
5:15
General Addition Rule
5:41
General Addition Rule
5:42
Generalized Addition Rule
8:31
If A and B are not Disjoint: P (A or B)
8:32
Example 1: Which of These are Mutually Exclusive?
10:50
Example 2: What is the Probability that You will Have a Combination of One Heads and Two Tails?
12:57
Example 3: Engagement Party
15:17
Example 4: Home Owner's Insurance
18:30
Conditional Probability

57m 19s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
'or' vs. 'and' vs. Conditional Probability
1:07
'or' vs. 'and' vs. Conditional Probability
1:08
'and' vs. Conditional Probability
5:57
P (M or L)
5:58
P (M and L)
8:41
P (M|L)
11:04
P (L|M)
12:24
Tree Diagram
15:02
Tree Diagram
15:03
Defining Conditional Probability
22:42
Defining Conditional Probability
22:43
Common Contexts for Conditional Probability
30:56
Medical Testing: Positive Predictive Value
30:57
Medical Testing: Sensitivity
33:03
Statistical Tests
34:27
Example 1: Drug and Disease
36:41
Example 2: Marbles and Conditional Probability
40:04
Example 3: Cards and Conditional Probability
45:59
Example 4: Votes and Conditional Probability
50:21
Independent Events

24m 27s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Independent Events & Conditional Probability
0:26
Non-independent Events
0:27
Independent Events
2:00
Non-independent and Independent Events
3:08
Non-independent and Independent Events
3:09
Defining Independent Events
5:52
Defining Independent Events
5:53
Multiplication Rule
7:29
Previously…
7:30
But with Independent Events
8:53
Example 1: Which of These Pairs of Events are Independent?
11:12
Example 2: Health Insurance and Probability
15:12
Example 3: Independent Events
17:42
Example 4: Independent Events
20:03
Section 8: Probability Distributions
Introduction to Probability Distributions

56m 45s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Sampling vs. Probability
0:57
Sampling
0:58
Missing
1:30
What is Missing?
3:06
Insight: Probability Distributions
5:26
Insight: Probability Distributions
5:27
What is a Probability Distribution?
7:29
From Sample Spaces to Probability Distributions
8:44
Sample Space
8:45
Probability Distribution of the Sum of Two Dice
11:16
The Random Variable
17:43
The Random Variable
17:44
Expected Value
21:52
Expected Value
21:53
Example 1: Probability Distributions
28:45
Example 2: Probability Distributions
35:30
Example 3: Probability Distributions
43:37
Example 4: Probability Distributions
47:20
Expected Value & Variance of Probability Distributions

53m 41s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Discrete vs. Continuous Random Variables
1:04
Discrete vs. Continuous Random Variables
1:05
Mean and Variance Review
4:44
Mean: Sample, Population, and Probability Distribution
4:45
Variance: Sample, Population, and Probability Distribution
9:12
Example Situation
14:10
Example Situation
14:11
Some Special Cases…
16:13
Some Special Cases…
16:14
Linear Transformations
19:22
Linear Transformations
19:23
What Happens to Mean and Variance of the Probability Distribution?
20:12
n Independent Values of X
25:38
n Independent Values of X
25:39
Compare These Two Situations
30:56
Compare These Two Situations
30:57
Two Random Variables, X and Y
32:02
Two Random Variables, X and Y
32:03
Example 1: Expected Value & Variance of Probability Distributions
35:35
Example 2: Expected Values & Standard Deviation
44:17
Example 3: Expected Winnings and Standard Deviation
48:18
Binomial Distribution

55m 15s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Discrete Probability Distributions
1:42
Discrete Probability Distributions
1:43
Binomial Distribution
2:36
Binomial Distribution
2:37
Multiplicative Rule Review
6:54
Multiplicative Rule Review
6:55
How Many Outcomes with k 'Successes'
10:23
Adults and Bachelor's Degree: Manual List of Outcomes
10:24
P (X=k)
19:37
Putting Together # of Outcomes with the Multiplicative Rule
19:38
Expected Value and Standard Deviation in a Binomial Distribution
25:22
Expected Value and Standard Deviation in a Binomial Distribution
25:23
Example 1: Coin Toss
33:42
Example 2: College Graduates
38:03
Example 3: Types of Blood and Probability
45:39
Example 4: Expected Number and Standard Deviation
51:11
Section 9: Sampling Distributions of Statistics
Introduction to Sampling Distributions

48m 17s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Probability Distributions vs. Sampling Distributions
0:55
Probability Distributions vs. Sampling Distributions
0:56
Same Logic
3:55
Logic of Probability Distribution
3:56
Example: Rolling Two Dice
6:56
Simulating Samples
9:53
To Come Up with Probability Distributions
9:54
In Sampling Distributions
11:12
Connecting Sampling and Research Methods with Sampling Distributions
12:11
Connecting Sampling and Research Methods with Sampling Distributions
12:12
Simulating a Sampling Distribution
14:14
Experimental Design: Regular Sleep vs. Less Sleep
14:15
Logic of Sampling Distributions
23:08
Logic of Sampling Distributions
23:09
General Method of Simulating Sampling Distributions
25:38
General Method of Simulating Sampling Distributions
25:39
Questions that Remain
28:45
Questions that Remain
28:46
Example 1: Mean and Standard Error of Sampling Distribution
30:57
Example 2: What is the Best Way to Describe Sampling Distributions?
37:12
Example 3: Matching Sampling Distributions
38:21
Example 4: Mean and Standard Error of Sampling Distribution
41:51
Sampling Distribution of the Mean

1h 8m 48s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Special Case of General Method for Simulating a Sampling Distribution
1:53
Special Case of General Method for Simulating a Sampling Distribution
1:54
Computer Simulation
3:43
Using Simulations to See Principles behind Shape of SDoM
15:50
Using Simulations to See Principles behind Shape of SDoM
15:51
Conditions
17:38
Using Simulations to See Principles behind Center (Mean) of SDoM
20:15
Using Simulations to See Principles behind Center (Mean) of SDoM
20:16
Conditions: Does n Matter?
21:31
Conditions: Does Number of Simulation Matter?
24:37
Using Simulations to See Principles behind Standard Deviation of SDoM
27:13
Using Simulations to See Principles behind Standard Deviation of SDoM
27:14
Conditions: Does n Matter?
34:45
Conditions: Does Number of Simulation Matter?
36:24
Central Limit Theorem
37:13
SHAPE
38:08
CENTER
39:34
SPREAD
39:52
Comparing Population, Sample, and SDoM
43:10
Comparing Population, Sample, and SDoM
43:11
Answering the 'Questions that Remain'
48:24
What Happens When We Don't Know What the Population Looks Like?
48:25
Can We Have Sampling Distributions for Summary Statistics Other than the Mean?
49:42
How Do We Know whether a Sample is Sufficiently Unlikely?
53:36
Do We Always Have to Simulate a Large Number of Samples in Order to get a Sampling Distribution?
54:40
Example 1: Mean Batting Average
55:25
Example 2: Mean Sampling Distribution and Standard Error
59:07
Example 3: Sampling Distribution of the Mean
1:01:04
Sampling Distribution of Sample Proportions

54m 37s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Intro to Sampling Distribution of Sample Proportions (SDoSP)
0:51
Categorical Data (Examples)
0:52
Wish to Estimate Proportion of Population from Sample…
2:00
Notation
3:34
Population Proportion and Sample Proportion Notations
3:35
What's the Difference?
9:19
SDoM vs. SDoSP: Type of Data
9:20
SDoM vs. SDoSP: Shape
11:24
SDoM vs. SDoSP: Center
12:30
SDoM vs. SDoSP: Spread
15:34
Binomial Distribution vs. Sampling Distribution of Sample Proportions
19:14
Binomial Distribution vs. SDoSP: Type of Data
19:17
Binomial Distribution vs. SDoSP: Shape
21:07
Binomial Distribution vs. SDoSP: Center
21:43
Binomial Distribution vs. SDoSP: Spread
24:08
Example 1: Sampling Distribution of Sample Proportions
26:07
Example 2: Sampling Distribution of Sample Proportions
37:58
Example 3: Sampling Distribution of Sample Proportions
44:42
Example 4: Sampling Distribution of Sample Proportions
45:57
Section 10: Inferential Statistics
Introduction to Confidence Intervals

42m 53s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Inferential Statistics
0:50
Inferential Statistics
0:51
Two Problems with This Picture…
3:20
Two Problems with This Picture…
3:21
Solution: Confidence Intervals (CI)
4:59
Solution: Hypothesis Testing (HT)
5:49
Which Parameters are Known?
6:45
Which Parameters are Known?
6:46
Confidence Interval - Goal
7:56
When We Don't Know μ but Know σ
7:57
When We Don't Know
18:27
When We Don't Know μ nor σ
18:28
Example 1: Confidence Intervals
26:18
Example 2: Confidence Intervals
29:46
Example 3: Confidence Intervals
32:18
Example 4: Confidence Intervals
38:31
t Distributions

1h 2m 6s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
When to Use z vs. t?
1:07
When to Use z vs. t?
1:08
What is z and t?
3:02
z-score and t-score: Commonality
3:03
z-score and t-score: Formulas
3:34
z-score and t-score: Difference
5:22
Why not z? (Why t?)
7:24
Why not z? (Why t?)
7:25
But Don't Worry!
15:13
Gosset and t-distributions
15:14
Rules of t Distributions
17:05
t-distributions are More Normal as n Gets Bigger
17:06
t-distributions are a Family of Distributions
18:55
Degrees of Freedom (df)
20:02
Degrees of Freedom (df)
20:03
t Family of Distributions
24:07
t Family of Distributions: df = 2, 4, and 60
24:08
df = 60
29:16
df = 2
29:59
How to Find It?
31:01
'Student's t-distribution' or 't-distribution'
31:02
Excel Example
33:06
Example 1: Which Distribution Do You Use? Z or t?
45:26
Example 2: Friends on Facebook
47:41
Example 3: t Distributions
52:15
Example 4: t Distributions, Confidence Interval, and Mean
55:59
Introduction to Hypothesis Testing

1h 6m 33s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Issues to Overcome in Inferential Statistics
1:35
Issues to Overcome in Inferential Statistics
1:36
What Happens When We Don't Know What the Population Looks Like?
2:57
How Do We Know whether a Sample is Sufficiently Unlikely?
3:43
Hypothesizing a Population
6:44
Hypothesizing a Population
6:45
Null Hypothesis
8:07
Alternative Hypothesis
8:56
Hypotheses
11:58
Hypotheses
11:59
Errors in Hypothesis Testing
14:22
Errors in Hypothesis Testing
14:23
Steps of Hypothesis Testing
21:15
Steps of Hypothesis Testing
21:16
Single Sample HT (When Sigma Available)
26:08
Example: Average Facebook Friends
26:09
Step 1
27:08
Step 2
27:58
Step 3
28:17
Step 4
32:18
Single Sample HT (When Sigma Not Available)
36:33
Example: Average Facebook Friends
36:34
Step 1: Hypothesis Testing
36:58
Step 2: Significance Level
37:25
Step 3: Decision Stage
37:40
Step 4: Sample
41:36
Sigma and p-value
45:04
Sigma and p-value
45:05
One-Tailed vs. Two-Tailed Hypotheses
45:51
Example 1: Hypothesis Testing
48:37
Example 2: Heights of Women in the US
57:43
Example 3: Select the Best Way to Complete This Sentence
1:03:23
Confidence Intervals for the Difference of Two Independent Means

55m 14s

Intro
0:00
Roadmap
0:14
Roadmap
0:15
One Mean vs. Two Means
1:17
One Mean vs. Two Means
1:18
Notation
2:41
A Sample! A Set!
2:42
Mean of X, Mean of Y, and Difference of Two Means
3:56
SE of X
4:34
SE of Y
6:28
Sampling Distribution of the Difference between Two Means (SDoD)
7:48
Sampling Distribution of the Difference between Two Means (SDoD)
7:49
Rules of the SDoD (similar to CLT!)
15:00
Mean for the SDoD Null Hypothesis
15:01
Standard Error
17:39
When can We Construct a CI for the Difference between Two Means?
21:28
Three Conditions
21:29
Finding CI
23:56
One Mean CI
23:57
Two Means CI
25:45
Finding t
29:16
Finding t
29:17
Interpreting CI
30:25
Interpreting CI
30:26
Better Estimate of s (s pool)
34:15
Better Estimate of s (s pool)
34:16
Example 1: Confidence Intervals
42:32
Example 2: SE of the Difference
52:36
Hypothesis Testing for the Difference of Two Independent Means

50m

Intro
0:00
Roadmap
0:06
Roadmap
0:07
The Goal of Hypothesis Testing
0:56
One Sample and Two Samples
0:57
Sampling Distribution of the Difference between Two Means (SDoD)
3:42
Sampling Distribution of the Difference between Two Means (SDoD)
3:43
Rules of the SDoD (Similar to CLT!)
6:46
Shape
6:47
Mean for the Null Hypothesis
7:26
Standard Error for Independent Samples (When Variance is Homogenous)
8:18
Standard Error for Independent Samples (When Variance is not Homogenous)
9:25
Same Conditions for HT as for CI
10:08
Three Conditions
10:09
Steps of Hypothesis Testing
11:04
Steps of Hypothesis Testing
11:05
Formulas that Go with Steps of Hypothesis Testing
13:21
Step 1
13:25
Step 2
14:18
Step 3
15:00
Step 4
16:57
Example 1: Hypothesis Testing for the Difference of Two Independent Means
18:47
Example 2: Hypothesis Testing for the Difference of Two Independent Means
33:55
Example 3: Hypothesis Testing for the Difference of Two Independent Means
44:22
Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

1h 14m 11s

Intro
0:00
Roadmap
0:09
Roadmap
0:10
The Goal of Hypothesis Testing
1:27
One Sample and Two Samples
1:28
Independent Samples vs. Paired Samples
3:16
Independent Samples vs. Paired Samples
3:17
Which is Which?
5:20
Independent SAMPLES vs. Independent VARIABLES
7:43
Independent SAMPLES vs. Independent VARIABLES
7:44
T-tests Always…
10:48
T-tests Always…
10:49
Notation for Paired Samples
12:59
Notation for Paired Samples
13:00
Steps of Hypothesis Testing for Paired Samples
16:13
Steps of Hypothesis Testing for Paired Samples
16:14
Rules of the SDoD (Adding on Paired Samples)
18:03
Shape
18:04
Mean for the Null Hypothesis
18:31
Standard Error for Independent Samples (When Variance is Homogenous)
19:25
Standard Error for Paired Samples
20:39
Formulas that go with Steps of Hypothesis Testing
22:59
Formulas that go with Steps of Hypothesis Testing
23:00
Confidence Intervals for Paired Samples
30:32
Confidence Intervals for Paired Samples
30:33
Example 1: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
32:28
Example 2: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
44:02
Example 3: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
52:23
Type I and Type II Errors

31m 27s

Intro
0:00
Roadmap
0:18
Roadmap
0:19
Errors and Relationship to HT and the Sample Statistic?
1:11
Errors and Relationship to HT and the Sample Statistic?
1:12
Instead of a Box…Distributions!
7:00
One Sample t-test: Friends on Facebook
7:01
Two Sample t-test: Friends on Facebook
13:46
Usually, Lots of Overlap between Null and Alternative Distributions
16:59
Overlap between Null and Alternative Distributions
17:00
How Distributions and 'Box' Fit Together
22:45
How Distributions and 'Box' Fit Together
22:46
Example 1: Types of Errors
25:54
Example 2: Types of Errors
27:30
Example 3: What is the Danger of the Type I Error?
29:38
Effect Size & Power

44m 41s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Distance between Distributions: Sample t
0:49
Distance between Distributions: Sample t
0:50
Problem with Distance in Terms of Standard Error
2:56
Problem with Distance in Terms of Standard Error
2:57
Test Statistic (t) vs. Effect Size (d or g)
4:38
Test Statistic (t) vs. Effect Size (d or g)
4:39
Rules of Effect Size
6:09
Rules of Effect Size
6:10
Why Do We Need Effect Size?
8:21
Tells You the Practical Significance
8:22
HT can be Deceiving…
10:25
Important Note
10:42
What is Power?
11:20
What is Power?
11:21
Why Do We Need Power?
14:19
Conditional Probability and Power
14:20
Power is:
16:27
Can We Calculate Power?
19:00
Can We Calculate Power?
19:01
How Does Alpha Affect Power?
20:36
How Does Alpha Affect Power?
20:37
How Does Effect Size Affect Power?
25:38
How Does Effect Size Affect Power?
25:39
How Does Variability and Sample Size Affect Power?
27:56
How Does Variability and Sample Size Affect Power?
27:57
How Do We Increase Power?
32:47
Increasing Power
32:48
Example 1: Effect Size & Power
35:40
Example 2: Effect Size & Power
37:38
Example 3: Effect Size & Power
40:55
Section 11: Analysis of Variance
F-distributions

24m 46s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
Z- & T-statistic and Their Distribution
0:34
Z- & T-statistic and Their Distribution
0:35
F-statistic
4:55
The F Ratio (the Variance Ratio)
4:56
F-distribution
12:29
F-distribution
12:30
s and p-value
15:00
s and p-value
15:01
Example 1: Why Does F-distribution Stop At 0 But Go On Until Infinity?
18:33
Example 2: F-distributions
19:29
Example 3: F-distributions and Heights
21:29
ANOVA with Independent Samples

1h 9m 25s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
The Limitations of t-tests
1:12
The Limitations of t-tests
1:13
Two Major Limitations of Many t-tests
3:26
Two Major Limitations of Many t-tests
3:27
Ronald Fisher's Solution… F-test! New Null Hypothesis
4:43
Ronald Fisher's Solution… F-test! New Null Hypothesis (Omnibus Test - One Test to Rule Them All!)
4:44
Analysis of Variance (ANoVA) Notation
7:47
Analysis of Variance (ANoVA) Notation
7:48
Partitioning (Analyzing) Variance
9:58
Total Variance
9:59
Within-group Variation
14:00
Between-group Variation
16:22
Time out: Review Variance & SS
17:05
Time out: Review Variance & SS
17:06
F-statistic
19:22
The F Ratio (the Variance Ratio)
19:23
S²bet = SSbet / dfbet
22:13
What is This?
22:14
How Many Means?
23:20
So What is the dfbet?
23:38
So What is SSbet?
24:15
S²w = SSw / dfw
26:05
What is This?
26:06
How Many Means?
27:20
So What is the dfw?
27:36
So What is SSw?
28:18
Chart of Independent Samples ANOVA
29:25
Chart of Independent Samples ANOVA
29:26
Example 1: Who Uploads More Photos: Unknown Ethnicity, Latino, Asian, Black, or White Facebook Users?
35:52
Hypotheses
35:53
Significance Level
39:40
Decision Stage
40:05
Calculate Samples' Statistic and p-Value
44:10
Reject or Fail to Reject H0
55:54
Example 2: ANOVA with Independent Samples
58:21
Repeated Measures ANOVA

1h 15m 13s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
The Limitations of t-tests
0:36
Who Uploads more Pictures and Which Photo-Type is Most Frequently Used on Facebook?
0:37
ANOVA (F-test) to the Rescue!
5:49
Omnibus Hypothesis
5:50
Analyze Variance
7:27
Independent Samples vs. Repeated Measures
9:12
Same Start
9:13
Independent Samples ANOVA
10:43
Repeated Measures ANOVA
12:00
Independent Samples ANOVA
16:00
Same Start: All the Variance Around Grand Mean
16:01
Independent Samples
16:23
Repeated Measures ANOVA
18:18
Same Start: All the Variance Around Grand Mean
18:19
Repeated Measures
18:33
Repeated Measures F-statistic
21:22
The F Ratio (The Variance Ratio)
21:23
S²bet = SSbet / dfbet
23:07
What is This?
23:08
How Many Means?
23:39
So What is the dfbet?
23:54
So What is SSbet?
24:32
S² resid = SS resid / df resid
25:46
What is This?
25:47
So What is SS resid?
26:44
So What is the df resid?
27:36
SS subj and df subj
28:11
What is This?
28:12
How Many Subject Means?
29:43
So What is df subj?
30:01
So What is SS subj?
30:09
SS total and df total
31:42
What is This?
31:43
What is the Total Number of Data Points?
32:02
So What is df total?
32:34
So What is SS total?
32:47
Chart of Repeated Measures ANOVA
33:19
Chart of Repeated Measures ANOVA: F and Between-samples Variability
33:20
Chart of Repeated Measures ANOVA: Total Variability, Within-subject (case) Variability, Residual Variability
35:50
Example 1: Which is More Prevalent on Facebook: Tagged, Uploaded, Mobile, or Profile Photos?
40:25
Hypotheses
40:26
Significance Level
41:46
Decision Stage
42:09
Calculate Samples' Statistic and p-Value
46:18
Reject or Fail to Reject H0
57:55
Example 2: Repeated Measures ANOVA
58:57
Example 3: What's the Problem with a Bunch of Tiny t-tests?
1:13:59
Section 12: Chi-square Test
Chi-Square Goodness-of-Fit Test

58m 23s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Where Does the Chi-Square Test Belong?
0:50
Where Does the Chi-Square Test Belong?
0:51
A New Twist on HT: Goodness-of-Fit
7:23
HT in General
7:24
Goodness-of-Fit HT
8:26
Hypotheses about Proportions
12:17
Null Hypothesis
12:18
Alternative Hypothesis
13:23
Example
14:38
Chi-Square Statistic
17:52
Chi-Square Statistic
17:53
Chi-Square Distributions
24:31
Chi-Square Distributions
24:32
Conditions for Chi-Square
28:58
Condition 1
28:59
Condition 2
30:20
Condition 3
30:32
Condition 4
31:47
Example 1: Chi-Square Goodness-of-Fit Test
32:23
Example 2: Chi-Square Goodness-of-Fit Test
44:34
Example 3: Which of These Statements Describe Properties of the Chi-Square Goodness-of-Fit Test?
56:06
Chi-Square Test of Homogeneity

51m 36s

Intro
0:00
Roadmap
0:09
Roadmap
0:10
Goodness-of-Fit vs. Homogeneity
1:13
Goodness-of-Fit HT
1:14
Homogeneity
2:00
Analogy
2:38
Hypotheses About Proportions
5:00
Null Hypothesis
5:01
Alternative Hypothesis
6:11
Example
6:33
Chi-Square Statistic
10:12
Same as Goodness-of-Fit Test
10:13
Set Up Data
12:28
Setting Up Data Example
12:29
Expected Frequency
16:53
Expected Frequency
16:54
Chi-Square Distributions & df
19:26
Chi-Square Distributions & df
19:27
Conditions for Test of Homogeneity
20:54
Condition 1
20:55
Condition 2
21:39
Condition 3
22:05
Condition 4
22:23
Example 1: Chi-Square Test of Homogeneity
22:52
Example 2: Chi-Square Test of Homogeneity
32:10
Section 13: Overview of Statistics
Overview of Statistics

18m 11s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
The Statistical Tests (HT) We've Covered
0:28
The Statistical Tests (HT) We've Covered
0:29
Organizing the Tests We've Covered…
1:08
One Sample: Continuous DV and Categorical DV
1:09
Two Samples: Continuous DV and Categorical DV
5:41
More Than Two Samples: Continuous DV and Categorical DV
8:21
The Following Data: OK Cupid
10:10
The Following Data: OK Cupid
10:11
Example 1: Weird-MySpace-Angle Profile Photo
10:38
Example 2: Geniuses
12:30
Example 3: Promiscuous iPhone Users
13:37
Example 4: Women, Aging, and Messaging
16:07

Lecture Comments (6)

0 answers

Post by Srikanth C on April 22, 2015

I couldn't get the point where we can use any number to divide the heart of skewness. If (n-1) is 2, then we are literally dividing the skewness value from the formula by half and if (n-1) is 3 we are making it one-third, so are we not reducing the overall value of skewness by increasing n? So, will it be correct to say we can choose any number for n-1 to divide while calculating skewness?

1 answer

Last reply by: Manoj Joseph
Wed May 1, 2013 11:23 PM

Post by Manoj Joseph on May 1, 2013

I think you are making a wrong mistake to SPSS file instead of Excel shell

0 answers

Post by Kambiz Khosrowshahi on March 28, 2013

In the equation for skewness, I dont understand why it's ok to multiply it by 1/n-1. Isnt the correct equation that "heart"? If so, then by multiplying it by 1/n-1, wont it produce the wrong skewness?

1 answer

Last reply by: Professor Son
Wed Aug 15, 2012 2:15 PM

Post by KIM CARTER on April 27, 2012

Love all your work!!!!!!!!The correct spelling of (you wrote)asymtotic it is actually asymptotic. (Smile)

Shape: Calculating Skewness & Kurtosis

  • Intro 0:00
  • Roadmap 0:16
    • Roadmap
  • Skewness Concept 1:09
    • Skewness Concept
  • Calculating Skewness 3:26
    • Calculating Skewness
  • Interpreting Skewness 7:36
    • Interpreting Skewness
    • Excel Example
  • Kurtosis Concept 20:29
    • Kurtosis Concept
  • Calculating Kurtosis 24:17
    • Calculating Kurtosis
  • Interpreting Kurtosis 29:01
    • Leptokurtic
    • Mesokurtic
    • Platykurtic
    • Excel Example
  • Example 1: Shape of Distribution 38:28
  • Example 2: Shape of Distribution 39:29
  • Example 3: Shape of Distribution 40:14
  • Example 4: Kurtosis 41:10

Transcription: Shape: Calculating Skewness & Kurtosis

Hi and welcome to www.educator.com.

We are going to be looking at shapes again, but now we are going to be calculating skewness and kurtosis.

We could not do this before when we covered shapes because we needed to know something about variability first.

This is the roadmap: basically, we are going to be covering the concept of skewness.

I’m going to try and connect that to calculating skew.

Then we are going to be talking about how to interpret the number that you have calculated, and also what that number will tell you about the relationships among the measures of central tendency.

Obviously, because the number directly relates to the concept of skewness, those central tendency relationships will also hold for the concept of skewness.

Then we are going to be talking about kurtosis.

Kurtosis is something that people do not focus on a lot because it is a difficult shape to understand.

We are going to be talking about calculating and how to interpret kurtosis.

First let us start with the concept of skewness.

We have been over this again and again.

We know that there are skewed right distributions, skewed left distributions, and ones that are not skewed at all.

There are lots of these distributions.

This one is not skewed.

Anything that has a tail is skewed.

These are also called J shapes, or formally they are called asymptotic shapes, because there is an asymptote right here: the curve approaches the x axis without touching it.

Skewness means there is a tail somewhere.

One side is longer than the other.

One side has values that go on and on, and those values are not clustered around the other values.

That is basically the concept of skewness.

It would be nice if we could get a number that would tell us exactly how skewed a distribution is.

A distribution like this might be somewhat skewed, but a distribution like this might be really skewed.

It would be nice if we could say how much more skewed one is than the other.

The skewness that we are going to calculate is going to tell us exactly how skewed something is.

If the number is positive, it means that the tail is on the right.

If the number is negative, it tells us that the tail is on the left.

The greater the positive number, the more skewed it is to the right.

Normal distributions have 0 skew.

Things that are symmetrical generally have a skew of about 0.

Let us talk about calculating skewness.

Some people get a little bit freaked out at this, but I want to boil everything in here down to what is important.

Here is the main heart of the skewness function.

Basically it is going to be the sum of cubed distances from the mean.

We will talk in a little bit about why it is cubed.

It is the sum of all the cubed distances of each point from the mean of the sample.

Instead of the sum of squares, it will be the sum of cubes, divided by the standard deviation cubed (s³), where s is the standard deviation that we calculate from the sample to estimate the standard deviation of the population.

That is basically the crux of the idea.

There are going to be some more frills on this, but this is basically the heart of the idea.

Let us think about why it is cubed.

It is cubed because it is going to matter whether each distance is positive or negative.

Before, when we squared things, we did not care which direction away from the mean it was.

But now we care.

How far are you away from the mean, but also what direction are you away from the mean?

We are having things that are stacking up.

It is like each point is having a vote.

They are saying I’m on the left side, I’m on the right side.

By adding them up we see who wins, the left side guys or the right side guys.

When the guys on the negative end win, we get a negative number.

When the guys on the positive end win, we get a positive number.

By cubing it, we make those that are out on the ends matter more than those that are close to the center.

For far-away points, cubing makes the distance even bigger than squaring it does.

That is the main heart of the idea.

Just to review, if you double click on this s, what would be inside?

Think about clicking on that s; what would s be equal to?

s is the sum of squares ÷ (n – 1), and because it is the standard deviation, we would take the square root of that.

Standard deviation is always going to be positive.

So dividing by it is not going to change the sign.

On top of the heart of this function, there are the frills.

Sometimes people divide by n as well, or n – 1.

Excel is going to multiply by n ÷ ((n – 1) × (n – 2)).

But that junk does not matter much, because it will not have as great an impact on skewness as the heart of this function.

That is what I want you to know, this is the heart.

That is the heart of the function and the other stuff is just frills.
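
As a rough illustration of the heart of the function described above, here is a minimal Python sketch. It assumes the 1 ÷ (n – 1) constant that the lecture ends up using; the function name `skewness_core` and the three small data sets are made up for illustration and are not the lecture's data.

```python
from statistics import mean, stdev


def skewness_core(data):
    """Sum of cubed distances from the mean, divided by (n - 1) * s**3."""
    n = len(data)
    m = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1), always positive
    # Cubing keeps the sign, so each point "votes" for the left or the right side.
    return sum((x - m) ** 3 for x in data) / ((n - 1) * s ** 3)


# Hypothetical data sets, chosen only to show the sign of the result.
right_tail = [1, 2, 2, 3, 3, 3, 4, 12]    # long tail on the right
left_tail = [-x for x in right_tail]      # mirror image: long tail on the left
symmetric = [1, 2, 3, 4, 5]               # no tail on either side

for name, values in [("right", right_tail), ("left", left_tail), ("symmetric", symmetric)]:
    print(name, skewness_core(values))
```

The right-tailed list gives a positive value, its mirror image gives the same size of value with a negative sign, and the symmetric list gives 0.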

This is one type of skewness, but actually there are tons of different types of skewness.

There is Pearson skewness; there are lots of types of skewness whose names I cannot remember.

There is minimal or distance skewness.

There is momental skewness; there are a ton of them.

If you want to know more about skewness, you can check out this link: www.wolframalpha.com.

I frequently use it and refer my students to it.

Let us talk about interpreting skewness.

Imagine this is all of the skewness values that are possible, and we find out that something has negative skewness, like a skewness of –1.

We have found out that our distribution has it, but I have not shown you what our distribution is.

Can you guess what kind of distribution it is?

Yes you can.

You know that it has a tail that goes to the left.

On the other hand, if it is a positive 1, you know that it has a tail that goes to the right.

If the skewness is 0, we are not sure exactly what it is, but we basically know that it looks symmetrical.

It could look uniform, or like approximately a normal distribution.

It could be some crazy thing that is symmetrical.

It could be any number of distributions as long as they are approximately symmetrical.

Let us talk a little bit about how to look at skewness in Excel.

Go ahead and download the example Excel file, and if you look on the first sheet, you should see skew data.

I have shown you 3 different little data sets.

Data set A is skewed to the right; notice that it has a tail to the right.

Data set B is skewed to the left; notice that it has a tail to the left.

Data set C is more normal; it is symmetrical.

It has a little peak here but it is roughly symmetrical on both sides.

The nice thing about Excel is that it does have a skew function, so you can automatically generate skewness.

The skew function is just SKEW, and you put in the data that you would like it to calculate on.

This is showing you a skew of 2.08, 2.09; it is saying that it is very positively skewed.

I could just drag this across and it will calculate the skews for the data sets above.

This one is for this data set, and we find that it is a negative skew.

It is almost as negative as this one is positive, right?

This is about 2 and –2, and it is because these two are largely mirror opposites of each other, so they have similar skews.

Here we see that the skew is closer to 0.

It is still on the positive side but it is closer to 0 than it is to 2 or 3.

In Excel, if you are interested in the particular formula that they use to calculate skew, you could double click, and when this little help window comes up you will notice that SKEW is a hyperlink.

If you click on that then you should get this little window that comes up, that is an Excel help window.

Let me make this bigger.

Here they show you the exact equation that they use for skew.

Notice that this part is exactly what we talked about.

It is that distance away from the mean, cubed, divided by stdev³.

That is the heart of the function, but here we have this extra stuff, n ÷ ((n – 1) × (n – 2)).

You could use that or you could use something else.

You could use 1 ÷ (n – 1); that is also very common.

I will be using 1 ÷ (n – 1).

Before we calculate skew for ourselves, let us look a little bit at what it would mean for our measures of central tendency to be skewed right or left.

Let us calculate the mean.

The mean is AVERAGE in Excel; sometimes I type in mean.

I’m going to close my parentheses.

Let us also get the median while we are at it, and let us get the mode.

Once we have those 3 values, we could just copy and paste straight across, and it will find the respective mean, median, and mode for each of these other distributions.

When it is skewed to the right and there is a very positive skew, what we find is that the mean is greater than the median, which is greater than the mode.

That makes sense.

The mode is probably going to be on the very left side of the distribution.

The mean is highly affected by the outliers.

The outliers are on the right, so the mean is going to be greater.

For negatively skewed distributions, we have the exact opposite pattern.

This time the mode is the greatest, followed by the median, then the mean.

Remember the mean is highly affected by outliers; this time the outliers are small.

The mean is being pulled to the smaller side.

The most frequent numbers are up on the higher end and that is why the median and mode tend to be bigger.

When you know the skewness, you will usually know the relationship between mean, median, and mode.

On the other hand, when distributions are largely symmetrical, when there is very little skew on either side, then what we see is that the mean, median, and mode are often very close to each other.

Sometimes they will be exactly the same; this time they are just very close to each other.

That makes sense because, with so little skew, they sit almost on top of each other.
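
The ordering of mean, median, and mode described above can be checked with a small Python sketch. The two data sets and the helper name `center_summary` are hypothetical, picked so that the pattern is easy to see; with real data the ordering is a tendency rather than a guarantee.

```python
from statistics import mean, median, mode


def center_summary(data):
    """Return (mean, median, mode) for a list of numbers."""
    return mean(data), median(data), mode(data)


# Hypothetical data: a long tail to the right, and its mirror image with a tail to the left.
right_skewed = [1, 1, 1, 2, 2, 3, 4, 6, 9, 15]
left_skewed = [16 - x for x in right_skewed]

print("right skewed:", center_summary(right_skewed))  # mean > median > mode
print("left skewed: ", center_summary(left_skewed))   # mode > median > mean
```

In the right-skewed list the few large values pull the mean above the median, while the mode stays at the low end where most values pile up; the mirrored list reverses the order.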

Let us talk about how to calculate skewness.

Here I have put the formula for skewness.

Sometimes people are not comfortable with moving around that sigma sign; it is like, where does it go?

That sigma will affect some things and not other things.

That sigma will affect anything that has an i next to it.

Because of that, you could put the divided-by-stdev³ inside the ∑ and it would not change the result.

You could either do this part after you add them all up or you could do it before you add them all up.

It actually does not matter because of the distributive property.

You could either put this inside or you could take it out.

It does not matter.

Let me show you a way that you could write this.

Actually, it might not be helpful to write it in Excel; I will write it when we go back to our PowerPoint slides.

Let us start with this.

We know that we need to get all of these distances away from the mean.

Let us start with that and cube it.

I need to start with the equal sign (=), and I’m going to put in a parenthesis to say take my value and subtract the mean, which would be the average.

I’m going to put in all of these values.

I do not want that average to jiggle around, so I’m going to lock that data in place.

Because I want to copy and paste it later on, I’m going to leave the A to vary but I’m going to say lock the 4 down.

When I copy down this column it will not change the A column, but when I copy across it will change it.

I only want to lock down the row number.

Then I’m going to close that parenthesis and cube that value.

Here I could also divide by this as well but I’m just going to do this for now.

This is just the part where I’m doing (x – x̄)³; that is all I’m doing in this column.

I’m going to go ahead and copy and paste that all the way down.

Here what we have is just this part, but now I need to sum them all up and divide by s³ as well as n – 1.

I’m going to modify this to make my formula a little bit simpler.

I will have n – 1 down here.

Now I need to add these up and divide by (n – 1) × stdev³.

Here let us do that.

Let us sum these guys all up and divide by the count of my data – 1, multiplied by the stdev of my data, cubed.

It is a lot of parentheses.

Because it is color coded I know that I need another blue one to close my denominator, and hit enter.

There we get a skewness of 1.84.

That is pretty close to the 2, the value that we got using the Excel function.

The Excel function just multiplies by a slightly different constant, so that is the only difference.

But once we have this, we can actually copy this over here.

Here if I double click on this it shows that it is using the C column, because I let the column vary; it is being relative about the columns but it is locking the rows in place.

I could also check down here; it is using the C column to count those data and get the standard deviation.

What we find is –1.7, which is pretty close to the –1.9 that we found with Excel’s function.

I’m just going to copy and paste all of that again to our approximately symmetrical one.

We got .39, something close to the .44 that Excel gave us.

That is how we are going to calculate shape here.

One of the features is skewness.

That is skewness.
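
To see why the hand-built number and the Excel number differ only slightly, here is a hedged Python sketch that computes the same heart with both constants. It assumes the two constants mentioned in the lecture, 1 ÷ (n – 1) for the hand-built version and n ÷ ((n – 1) × (n – 2)) for the Excel-style version; the function names and the data set are hypothetical and are not the values from the lecture's spreadsheet.

```python
from statistics import mean, stdev


def skew_simple(data):
    """Hand-built version: sum of cubed deviations / ((n - 1) * s**3)."""
    n = len(data)
    m = mean(data)
    s = stdev(data)
    return sum((x - m) ** 3 for x in data) / ((n - 1) * s ** 3)


def skew_excel_style(data):
    """Same heart, multiplied by the constant n / ((n - 1) * (n - 2))."""
    n = len(data)
    m = mean(data)
    s = stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)


# Hypothetical right-skewed data set (not the lecture's spreadsheet values).
data = [1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 6, 8, 11, 15]
n = len(data)
simple = skew_simple(data)
excel_style = skew_excel_style(data)

print("hand-built: ", simple)
print("Excel-style:", excel_style)
print("ratio:      ", excel_style / simple, "vs n / (n - 2) =", n / (n - 2))
```

The two versions differ only by the factor n ÷ (n – 2), which is consistent with the hand-built value (1.84) coming out a little smaller than the Excel value (about 2.08) in the lecture.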

Now let us talk about kurtosis.

I actually said that I would talk a little bit about where the skewness stuff goes.

Let me just draw it over in this corner.

I’m not going to need all this room for kurtosis.

Let me just talk a little bit more about the skewness function here.

The formula for skewness, as I said before, is going to be the sum of all the cubed distances over some constant times s³.

I have said that is the idea.

Sometimes you might see this formula written more like this: ∑ of (xᵢ – x̄)³ / s³.

This and this mean the same thing.

It is just different ways of writing it.

The ∑ affects things that are affected by i, and because s³ is not affected by the i at all, s³ will be the same whether it is the first value or the last value.

It does not matter whether you write it inside, included in that ∑, or not.

Another way you could write it is 1/s³ × ∑(xᵢ – x̄)³.

This is another way you could write it, and so all 3 of these ways of writing it are equivalent.

It does not change anything.

Do not get tripped up by one of these other options.

They are not trying to be tricky.
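
Written out, the three equivalent forms described above look like this, assuming the 1 ÷ (n – 1) constant used in this lecture. Because s³ does not depend on i, it can sit inside the sum, under the whole sum, or out in front as a factor:

```latex
\frac{1}{n-1}\sum_{i=1}^{n}\frac{(x_i-\bar{x})^{3}}{s^{3}}
\;=\;
\frac{\sum_{i=1}^{n}(x_i-\bar{x})^{3}}{(n-1)\,s^{3}}
\;=\;
\frac{1}{(n-1)\,s^{3}}\sum_{i=1}^{n}(x_i-\bar{x})^{3}
```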

Let us talk about kurtosis.

Kurtosis is a concept that is weird for people because this is not something that we are used to dealing with in the regular shapes that we know.

We know roundness and squareness.

Kurtosis is something a little bit different.

Kurtosis is about two things that are bundled into one.

One is pointiness or peakedness.

Very kurtotic shapes might look something like that.

Super non-kurtotic shapes look something like that.

One important aspect of this is the pointiness.

At the same time that peakedness is going up, what you are seeing is the tails becoming thinner.

Peakedness and thin tails usually go together, and kurtosis is about both of these things, because here we not only have no peakedness but we also have fat tails.

The tails are just as fat as that lack of a peak in the middle.

Something in the middle might look something more like that.

This is a kurtotic dimension where you are getting increasing kurtosis.

When you have increasing kurtosis, it means two things simultaneously.

Having more of a peak but also having thinner tails.

Let us talk about calculating kurtosis.

With kurtosis, let us talk about the main idea before we get into things.

Like skew, you can multiply constants onto it.

It does not matter.

This is the heart of kurtosis.

Here is my sigma; it is going to be that same distance of each point away from my mean.

Instead of cubing it, we are going to raise it to the 4th power.

When we raise it to the 4th power, we know that we do not care about whether it is on the left side or the right side.

That is one thing to know already about kurtosis.

It is not counting how many are on one side versus the other side of the mean.

Already we know that this part will be positive, because it raises everything to the 4th power, and when you raise something to an even power it is going to be positive.

We are going to divide that by stdev⁴.

Once again, this will always be positive; stdev is already positive.

We are going to raise it to the 4th power.

So the heart of kurtosis is going to be a positive number.

The only difference between different values of kurtosis might be whatever constant they decide to multiply by.

Frequently, 1/(n – 1) is one of the constants.

I forget what Excel does; Excel does something crazy.

We will figure it out when we get there.

Once again, even with kurtosis you could write it in very different ways.

You could write it as 1/((n – 1) × s⁴) and then put your sum of 4th powers here.

Instead of the sum of squares, we are raising it to the 4th power.

That is one way of writing it.

Another way of writing it is ∑(xᵢ – x̄)⁴ / ((n – 1) × s⁴).

All of these things are the same thing.

Once again, this is the heart of the idea of kurtosis.

One of the reasons that it is raised to the 4th power is, let us think about it.

Remember, kurtosis is very concerned with whether you are on the outside or the inside of the distribution.

Are you on the tails or at that peak?

By raising it to the 4th power, it makes everybody matter a lot, especially if you are on the outside.

You matter way more than if you are on the inside.

One more thing about the idea of kurtosis.

Typically, for a normal distribution, let me try here.

For an approximately normal looking distribution, the kurtosis, if you calculate it with some function like this, is going to be about 3.

That is so arbitrary.

What they have done is they made the kurtosis function so that you subtract 3 from it, so that the normal distribution has a kurtosis of 0.

Like 3 – 3 = 0.

That is actually how you will get negative kurtosis.

It is not because of this function, but because you subtract 3.

Since this function is never negative, the kurtosis can never go below –3.

It is an odd, bizarre sort of thing.

I’m not sure when they decided to normalize it to 0, but my theory is that it would be hard for people to remember that 3 is the value for normal.

They just make you do it in the formula.
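
The subtract-3 correction described above can be sketched in Python as follows. This is a minimal sketch assuming the 1/(n – 1) constant mentioned in the lecture, not Excel's own formula; the function names, the seeded random sample, and the flat data set are all hypothetical.

```python
import random
from statistics import mean, stdev


def kurtosis_core(data):
    """Sum of 4th-power distances from the mean, divided by (n - 1) * s**4."""
    n = len(data)
    m = mean(data)
    s = stdev(data)
    # An even power throws away the sign, so this value is never negative.
    return sum((x - m) ** 4 for x in data) / ((n - 1) * s ** 4)


def excess_kurtosis(data):
    """Subtract 3 so that a normal distribution lands at about 0."""
    return kurtosis_core(data) - 3


# Hypothetical samples: one drawn from a normal distribution, one flat (uniform-like).
rng = random.Random(0)
normal_sample = [rng.gauss(0, 1) for _ in range(20_000)]
flat_sample = list(range(1, 21))

print("normal, before subtracting 3:", kurtosis_core(normal_sample))
print("normal, after subtracting 3: ", excess_kurtosis(normal_sample))
print("flat, after subtracting 3:   ", excess_kurtosis(flat_sample))
```

The normal sample comes out near 3 before the correction and near 0 after it, while the flat sample only becomes negative because of the subtraction.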

Now that we have this weird correction of subtracting 3, we could talk about interpreting kurtosis.1744

You already know that the kurtosis above 0 would mean that you have an approximately way normal distribution.1752

That is what a kurtosis of 0 would be.1766

A kurtosis of less than 0 and greater than 0 is going to be more peaked, more pointy than normal.1769

Something like this.1784

That is a kurtosis that is greater than 0 and we call that leptokurtic.1788

It means more peaked than normal.1799

We could call other things that have a kurtosis that is similar to the normal distribution.1812

We could just say similarly kurtotic to the normal distribution but that would be long.1819

We say they are mesokurtic because you do not have to be a normal distribution to have a kurtosis of 0.1825

Mesokurtic just means it is about the same peakedness as normal.1837

That is mesokurtic.1853

We need to have another name for something that is less peaked or flatter than normal.1858

Mesokurtic, leptokurtic, that sounds crazy, but for meso I just remember it is like the in-the-middle kurtosis.1868

Lepto is hard for me to remember which it is.1877

That is why this last one helps me because this one I could always remember, it is called platykurtic.1881

I think of a – and how it has a flat peak.1887

Platykurtic means that this is flatter than normal, with less peakedness than normal.1892

That would be something that looks more like this.1913

That is platykurtic there.1918

Those are our 3 interpretations of kurtosis.1922
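The three interpretations can be sketched as a small Python function. The name `describe_kurtosis` and the `tolerance` cutoff are assumptions made for illustration; the lecture does not give a number for how close to 0 counts as "about the same as normal".

```python
def describe_kurtosis(kurtosis, tolerance=0.5):
    """Label a kurtosis value that already has 3 subtracted from it.

    `tolerance` is a hypothetical cutoff for "close to 0", not a standard value.
    """
    if kurtosis > tolerance:
        return "leptokurtic"   # more peaked than normal
    if kurtosis < -tolerance:
        return "platykurtic"   # flatter than normal
    return "mesokurtic"        # about as peaked as normal

for value in (1.0, 0.1, -1.0):
    print(value, describe_kurtosis(value))
```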

Let us go to our Excel examples and look at kurtosis there.1925

Here let us click on the kurtosis data, that is the 3rd sheet, and we could look at 3 distributions1931

that are already put in there that might be good for us to look at regarding this idea of kurtosis.1940

In the uniform distribution, obviously the tails are just as fat as the peak; it is simply not peaked.1946

The tails are super fat.1952

Here we have a more normal-looking peak, but the tails still look pretty fat.1954

Here we have the thinnest tails, and the peak is much higher than the tails are.1960

There is a bigger difference between the peak and tails.1967

Handily, Excel has a kurtosis function, so we could put in KURT and then put in our data.1972

Hit enter.1985

What we have here is negative kurtosis where it is flatter than the normal distribution.1989

I’m just going to drag all of this over here.1997

Here we still have a negative kurtosis because it is not as peaked or pointy as the normal distribution.2001

Here it is not normal either, but it is more pointy, more peaked than the normal distribution would be.2012

If you want to know the precise formula that Excel uses in order to calculate kurtosis, go ahead and click on KURT.2022

Here it shows you that this is the formula for kurtosis.2032

What they do is they multiply by this crazy looking stuff, but the crux of the formula is still there.2037

It is the distances, the deviations, to the 4th power, over stdev^4, minus 3 × crazy stuff.2046

That is the heart of that function.2061
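For reference, here is a Python sketch of that help-page formula, assuming it is the one Microsoft documents for KURT: n(n + 1) / ((n – 1)(n – 2)(n – 3)) times the sum of ((x – mean) / stdev)^4, minus 3(n – 1)^2 / ((n – 2)(n – 3)). The function name `excel_style_kurt` is made up for this sketch, and the data values are illustrative.

```python
from statistics import mean, stdev

def excel_style_kurt(data):
    """Sample kurtosis with the small-sample correction factors (needs n >= 4)."""
    n = len(data)
    avg = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    # The heart of the formula: deviations over stdev, raised to the 4th power.
    core = sum(((x - avg) / s) ** 4 for x in data)
    # The "crazy looking stuff" that multiplies the core and the 3.
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * core - correction

example = [3, 4, 5, 2, 3, 4, 5, 6, 4, 7]  # illustrative data
print(round(excel_style_kurt(example), 4))
```

As n gets large the scale factor approaches 1/n and the correction approaches 3, so this converges to the simpler "average 4th-power deviation over stdev^4, minus 3" form.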

You can see that we use that.2065

If we click on the kurtosis, we will calculate it on our own.2069

I use this n – 1 constant; other people use other things.2074

What I’m going to do is I’m going to put in n – 1 down here instead.2083

All of this, this whole thing, and you subtract 3.2097

Bizarre but true.2107

Let us start with this part right here.2110

The deviation to the 4th power.2112

I just put in this value – average of my data and all of that raised to the 4th.2117

I do not want my mean to jiggle around so I’m going to lock my rows.2132

We are not going to lock the columns.2138

As long as I copy and paste it down here, we can just use column A.2141

I’m just going to copy and paste all of that down here.2147

Notice that all of these values are positive.2152

Down here, what I’m going to do is sum them all up.2157

That is one thing I know I need to do.2164

I know I need to divide by n – 1, count all of these guys and subtract 1.2168

That is within my green parentheses there, multiplied by stdev^4.2184

That is stdev of my data raised to the 4th power.2193

Because Excel knows order of operations it is going to do that power before it does the multiplying.2202

Then I’m going to close that and that is my blue parentheses closing there.2212

Here we have the sum ÷ ((n – 1) × stdev^4).2220

I need to take all of that and subtract 3.2235

I’m going to put on another set of parentheses around this whole thing and subtract 3.2239

Hit enter.2248

Here I get negative kurtosis.2249

That means it is flatter than normal.2258

Notice that it is not more negative than -3; it can never go below that.2260
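The spreadsheet steps above can be sketched in Python as follows. This follows the lecture's own version, the sum of 4th-power deviations ÷ ((n – 1) × stdev^4), minus 3; the function name `lecture_kurtosis` and the data values are assumptions, since the actual spreadsheet numbers are not in the transcript.

```python
from statistics import mean, stdev

def lecture_kurtosis(data):
    """sum((x - mean)^4) / ((n - 1) * stdev^4) - 3, as built up in the spreadsheet."""
    n = len(data)
    avg = mean(data)                                  # the locked mean
    fourth_powers = [(x - avg) ** 4 for x in data]    # one column of deviations^4
    total = sum(fourth_powers)                        # sum them all up
    return total / ((n - 1) * stdev(data) ** 4) - 3   # divide, then subtract 3

flat_data = list(range(1, 11))  # hypothetical, evenly spread values
print(round(lecture_kurtosis(flat_data), 3))
```

Evenly spread data like this gives a negative result (flatter than normal), and the result stays above -3 because the part before the subtraction is never negative.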

I’m going to take this whole thing right here and paste it right here.2267

What we see is that this one is less flat than the one we just saw; it is still not quite normal, but it is closer to normal.2274

If we copy and paste all of that over here, we find here this is more sharply peaked than normal.2293

That is our kurtosis on Excel.2302

Let us move on to some examples.2310

Here is example 1.2311

Given that, in a particular sample, the mean is less than the median, which is less than the mode.2314

What is the likely shape of this distribution?2320

This means that the mode, or the peak, is here and the mean is somewhere on this side of it.2324

Here is the mode and here is the mean; the median is somewhere in between.2335

Since this guy, the mean, is highly affected by outliers, there must be outliers on this side.2341

I’m going to guess that this is a negative skew, left skewed.2352

That means the skewness number should be negative.2366

What about a sample where the mean is greater than the median, which is greater than the mode?2373

What is the likely shape?2379

We just have to reason backwards.2380

The mode is the smallest, the mean is the largest, and the median is somewhere in between.2383

The mean is pulled by the outliers, so my outliers must be here.2395

I’m going to say this is a right skewed distribution.2400

That would mean that the skewness is greater than 0.2406

It is a positive number.2413
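The reasoning in these two examples can be checked with a short Python sketch. The two data sets are made up so that one has a few low outliers and the other a few high outliers, and `likely_skew` and `cubed_deviation_sum` are hypothetical helper names. The ordering rule is a rule of thumb, so the sketch also computes the sum of cubed deviations, whose sign matches the sign of the skewness.

```python
from statistics import mean, median, mode

def likely_skew(values):
    """Guess the skew direction from the ordering of mean, median, and mode."""
    avg, mid, peak = mean(values), median(values), mode(values)
    if avg < mid < peak:
        return "left (negative) skew"
    if avg > mid > peak:
        return "right (positive) skew"
    return "no clear skew from this rule"

def cubed_deviation_sum(values):
    """Sum of cubed deviations; its sign is the sign of the skewness."""
    avg = mean(values)
    return sum((x - avg) ** 3 for x in values)

left_tailed = [1, 5, 7, 8, 8, 9, 9, 9, 10, 10]   # a few low outliers
right_tailed = [0, 0, 1, 1, 1, 2, 2, 3, 5, 9]    # a few high outliers

for sample in (left_tailed, right_tailed):
    print(likely_skew(sample), cubed_deviation_sum(sample) > 0)
```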

Example 3.2417

If a distribution has a kurtosis close to 0 and skewness close to 0, what is the likely shape of the distribution?2418

We know that skewness close to 0 means that it is basically symmetric, but it could be symmetric in lots of ways.2428

It does not have to be normally distributed.2438

With a kurtosis that is also 0, we know that must mean the tails are not too fat, not too skinny.2440

The peak is not too pointy or too dull either.2451

If both skewness and kurtosis are 0, we could very likely think of this as approximately normal.2455

That is probably a good way to guess.2468
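Here is a hedged Python sketch of why both numbers are needed. It simulates a normal sample and a uniform sample with a fixed seed; both are symmetric, so skewness is near 0 for both, but only the normal one also has kurtosis near 0. The helper names, the divide-by-n formulas, and the `tolerance` value are assumptions for illustration.

```python
import random

def skew_and_kurtosis(data):
    """Return (skewness, kurtosis minus 3) using divide-by-n moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

def looks_roughly_normal(data, tolerance=0.25):
    """True when both skewness and kurtosis are within `tolerance` of 0."""
    skew, kurt = skew_and_kurtosis(data)
    return abs(skew) < tolerance and abs(kurt) < tolerance

rng = random.Random(1)  # fixed seed so the sketch is repeatable
normal_sample = [rng.gauss(0, 1) for _ in range(50_000)]
uniform_sample = [rng.uniform(0, 1) for _ in range(50_000)]

print(looks_roughly_normal(normal_sample))   # symmetric, normal-like tails
print(looks_roughly_normal(uniform_sample))  # symmetric, but much flatter
```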

Finally example 4.2473

Sketch a potential distribution that can have a kurtosis of 1, then sketch another distribution that can have a kurtosis of -1.2474

I find the positive 1 easier because I always think that when it is positive it means it is pointy.2485

The other one is a sketch of a distribution that is less pointy.2496

It is always something like that.2504

That is it for skewness and kurtosis.2507

Thanks for using www.educator.com.2509
