Dr. Ji Son

Correlation: r vs. r-squared

Table of Contents

Section 1: Introduction

Descriptive Statistics vs. Inferential Statistics

25m 31s

Intro

0:00

Roadmap

0:10

Roadmap

0:11

Statistics

0:35

Statistics

0:36

Let's Think About High School Science

1:12

Measurement and Find Patterns (Mathematical Formula)

1:13

Statistics = Math of Distributions

4:58

Distributions

4:59

Problematic… but also GREAT

5:58

Statistics

7:33

How is It Different from Other Specializations in Mathematics?

7:34

Statistics is Fundamental in Natural and Social Sciences

7:53

Two Skills of Statistics

8:20

Description (Exploration)

8:21

Inference

9:13

Descriptive Statistics vs. Inferential Statistics: Apply to Distributions

9:58

Descriptive Statistics

9:59

Inferential Statistics

11:05

Populations vs. Samples

12:19

Populations vs. Samples: Is it the Truth?

12:20

Populations vs. Samples: Pros & Cons

13:36

Populations vs. Samples: Descriptive Values

16:12

Putting Together Descriptive/Inferential Stats & Populations/Samples

17:10

Putting Together Descriptive/Inferential Stats & Populations/Samples

17:11

Example 1: Descriptive Statistics vs. Inferential Statistics

19:09

Example 2: Descriptive Statistics vs. Inferential Statistics

20:47

Example 3: Sample, Parameter, Population, and Statistic

21:40

Example 4: Sample, Parameter, Population, and Statistic

23:28

Section 2: About Samples: Cases, Variables, Measurements

About Samples: Cases, Variables, Measurements

32m 14s

Intro

0:00

Data

0:09

Data, Cases, Variables, and Values

0:10

Rows, Columns, and Cells

2:03

Example: Aircrafts

3:52

How Do We Get Data?

5:38

Research: Question and Hypothesis

5:39

Research Design

7:11

Measurement

7:29

Research Analysis

8:33

Research Conclusion

9:30

Types of Variables

10:03

Discrete Variables

10:04

Continuous Variables

12:07

Types of Measurements

14:17

Types of Measurements

14:18

Types of Measurements (Scales)

17:22

Nominal

17:23

Ordinal

19:11

Interval

21:33

Ratio

24:24

Example 1: Cases, Variables, Measurements

25:20

Example 2: Which Scale of Measurement is Used?

26:55

Example 3: What Kind of a Scale of Measurement is This?

27:26

Example 4: Discrete vs. Continuous Variables.

30:31

Section 3: Visualizing Distributions

Introduction to Excel

8m 9s

Intro

0:00

Before Visualizing Distribution

0:10

Excel

0:11

Excel: Organization

0:45

Workbook

0:46

Column x Rows

1:50

Tools: Menu Bar, Standard Toolbar, and Formula Bar

3:00

Excel + Data

6:07

Excel and Data

6:08

Frequency Distributions in Excel

39m 10s

Intro

0:00

Roadmap

0:08

Data in Excel and Frequency Distributions

0:09

Raw Data to Frequency Tables

0:42

Raw Data to Frequency Tables

0:43

Frequency Tables: Using Formulas and Pivot Tables

1:28

Example 1: Number of Births

7:17

Example 2: Age Distribution

20:41

Example 3: Height Distribution

27:45

Example 4: Height Distribution of Males

32:19

Frequency Distributions and Features

25m 29s

Intro

0:00

Roadmap

0:10

Data in Excel, Frequency Distributions, and Features of Frequency Distributions

0:11

Example #1

1:35

Uniform

1:36

Example #2

2:58

Unimodal, Skewed Right, and Asymmetric

2:59

Example #3

6:29

Bimodal

6:30

Example #4a

8:29

Symmetric, Unimodal, and Normal

8:30

Point of Inflection and Standard Deviation

11:13

Example #4b

12:43

Normal Distribution

12:44

Summary

13:56

Uniform, Skewed, Bimodal, and Normal

13:57

Sketch Problem 1: Driver's License

17:34

Sketch Problem 2: Life Expectancy

20:01

Sketch Problem 3: Telephone Numbers

22:01

Sketch Problem 4: Length of Time Used to Complete a Final Exam

23:43

Dotplots and Histograms in Excel

42m 42s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Previously

1:02

Data, Frequency Table, and Visualization

1:03

Dotplots

1:22

Dotplots Excel Example

1:23

Dotplots: Pros and Cons

7:22

Pros and Cons of Dotplots

7:23

Dotplots Excel Example Cont.

9:07

Histograms

12:47

Histograms Overview

12:48

Example of Histograms

15:29

Histograms: Pros and Cons

31:39

Pros

31:40

Cons

32:31

Frequency vs. Relative Frequency

32:53

Frequency

32:54

Relative Frequency

33:36

Example 1: Dotplots vs. Histograms

34:36

Example 2: Age of Pennies Dotplot

36:21

Example 3: Histogram of Mammal Speeds

38:27

Example 4: Histogram of Life Expectancy

40:30

Stemplots

12m 23s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

What Sets Stemplots Apart?

0:46

Data Sets, Dotplots, Histograms, and Stemplots

0:47

Example 1: What Do Stemplots Look Like?

1:58

Example 2: Back-to-Back Stemplots

5:00

Example 3: Quiz Grade Stemplot

7:46

Example 4: Quiz Grade & Afterschool Tutoring Stemplot

9:56

Bar Graphs

22m 49s

Intro

0:00

Roadmap

0:05

Roadmap

0:08

Review of Frequency Distributions

0:44

Y-axis and X-axis

0:45

Types of Frequency Visualizations Covered so Far

2:16

Introduction to Bar Graphs

4:07

Example 1: Bar Graph

5:32

Example 1: Bar Graph

5:33

Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?

11:07

Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?

11:08

Example 2: Create a Frequency Visualization for Gender

14:02

Example 3: Cases, Variables, and Frequency Visualization

16:34

Example 4: What Kind of Graphs are Shown Below?

19:29

Section 4: Summarizing Distributions

Central Tendency: Mean, Median, Mode

38m 50s

Intro

0:00

Roadmap

0:07

Roadmap

0:08

Central Tendency 1

0:56

Way to Summarize a Distribution of Scores

0:57

Mode

1:32

Median

2:02

Mean

2:36

Central Tendency 2

3:47

Mode

3:48

Median

4:20

Mean

5:25

Summation Symbol

6:11

Summation Symbol

6:12

Population vs. Sample

10:46

Population vs. Sample

10:47

Excel Examples

15:08

Finding Mode, Median, and Mean in Excel

15:09

Median vs. Mean

21:45

Effect of Outliers

21:46

Relationship Between Parameter and Statistic

22:44

Type of Measurements

24:00

Which Distributions to Use With

24:55

Example 1: Mean

25:30

Example 2: Using Summation Symbol

29:50

Example 3: Average Calorie Count

32:50

Example 4: Creating an Example Set

35:46

Variability

42m 40s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Variability (or Spread)

0:45

Variability (or Spread)

0:46

Things to Think About

5:45

Things to Think About

5:46

Range, Quartiles and Interquartile Range

6:37

Range

6:38

Interquartile Range

8:42

Interquartile Range Example

10:58

Interquartile Range Example

10:59

Variance and Standard Deviation

12:27

Deviations

12:28

Sum of Squares

14:35

Variance

16:55

Standard Deviation

17:44

Sum of Squares (SS)

18:34

Sum of Squares (SS)

18:35

Population vs. Sample SD

22:00

Population vs. Sample SD

22:01

Population vs. Sample

23:20

Mean

23:21

23:51

Example 1: Find the Mean and Standard Deviation of the Variable Friends in the Excel File

27:21

Example 2: Find the Mean and Standard Deviation of the Tagged Photos in the Excel File

35:25

Example 3: Sum of Squares

38:58

Example 4: Standard Deviation

41:48

Five Number Summary & Boxplots

57m 15s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Summarizing Distributions

0:37

Shape, Center, and Spread

0:38

5 Number Summary

1:14

Boxplot: Visualizing 5 Number Summary

3:37

Boxplot: Visualizing 5 Number Summary

3:38

Boxplots on Excel

9:01

Using 'Stocks' and Using Stacked Columns

9:02

Boxplots on Excel Example

10:14

When are Boxplots Useful?

32:14

Pros

32:15

Cons

32:59

How to Determine Outlier Status

33:24

Rule of Thumb: Upper Limit

33:25

Rule of Thumb: Lower Limit

34:16

Signal Outliers in an Excel Data File Using Conditional Formatting

34:52

Modified Boxplot

48:38

Modified Boxplot

48:39

Example 1: Percentage Values & Lower and Upper Whisker

49:10

Example 2: Boxplot

50:10

Example 3: Estimating IQR From Boxplot

53:46

Example 4: Boxplot and Missing Whisker

54:35

Shape: Calculating Skewness & Kurtosis

41m 51s

Intro

0:00

Roadmap

0:16

Roadmap

0:17

Skewness Concept

1:09

Skewness Concept

1:10

Calculating Skewness

3:26

Calculating Skewness

3:27

Interpreting Skewness

7:36

Interpreting Skewness

7:37

Excel Example

8:49

Kurtosis Concept

20:29

Kurtosis Concept

20:30

Calculating Kurtosis

24:17

Calculating Kurtosis

24:18

Interpreting Kurtosis

29:01

Leptokurtic

29:35

Mesokurtic

30:10

Platykurtic

31:06

Excel Example

32:04

Example 1: Shape of Distribution

38:28

Example 2: Shape of Distribution

39:29

Example 3: Shape of Distribution

40:14

Example 4: Kurtosis

41:10

Normal Distribution

34m 33s

Intro

0:00

Roadmap

0:13

Roadmap

0:14

What is a Normal Distribution

0:44

The Normal Distribution As a Theoretical Model

0:45

Possible Range of Probabilities

3:05

Possible Range of Probabilities

3:06

What is a Normal Distribution

5:07

Can Be Described By

5:08

Properties

5:49

'Same' Shape: Illusion of Different Shape!

7:35

'Same' Shape: Illusion of Different Shape!

7:36

Types of Problems

13:45

Example: Distribution of SAT Scores

13:46

Shape Analogy

19:48

Shape Analogy

19:49

Example 1: The Standard Normal Distribution and Z-Scores

22:34

Example 2: The Standard Normal Distribution and Z-Scores

25:54

Example 3: Sketching and Normal Distribution

28:55

Example 4: Sketching and Normal Distribution

32:32

Standard Normal Distributions & Z-Scores

41m 44s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

A Family of Distributions

0:28

Infinite Set of Distributions

0:29

Transforming Normal Distributions to 'Standard' Normal Distribution

1:04

Normal Distribution vs. Standard Normal Distribution

2:58

Normal Distribution vs. Standard Normal Distribution

2:59

Z-Score, Raw Score, Mean, & SD

4:08

Z-Score, Raw Score, Mean, & SD

4:09

Weird Z-Scores

9:40

Weird Z-Scores

9:41

Excel

16:45

For Normal Distributions

16:46

For Standard Normal Distributions

19:11

Excel Example

20:24

Types of Problems

25:18

Percentage Problem: P(x)

25:19

Raw Score and Z-Score Problems

26:28

Standard Deviation Problems

27:01

Shape Analogy

27:44

Shape Analogy

27:45

Example 1: Deaths Due to Heart Disease vs. Deaths Due to Cancer

28:24

Example 2: Heights of Male College Students

33:15

Example 3: Mean and Standard Deviation

37:14

Example 4: Finding Percentage of Values in a Standard Normal Distribution

37:49

Normal Distribution: PDF vs. CDF

55m 44s

Intro

0:00

Roadmap

0:15

Roadmap

0:16

Frequency vs. Cumulative Frequency

0:56

Frequency vs. Cumulative Frequency

0:57

Frequency vs. Cumulative Frequency

4:32

Frequency vs. Cumulative Frequency Cont.

4:33

Calculus in Brief

6:21

Derivative-Integral Continuum

6:22

PDF

10:08

PDF for Standard Normal Distribution

10:09

PDF for Normal Distribution

14:32

Integral of PDF = CDF

21:27

Integral of PDF = CDF

21:28

Example 1: Cumulative Frequency Graph

23:31

Example 2: Mean, Standard Deviation, and Probability

24:43

Example 3: Mean and Standard Deviation

35:50

Example 4: Age of Cars

49:32

Section 5: Linear Regression

Scatterplots

47m 19s

Intro

0:00

Roadmap

0:04

Roadmap

0:05

Previous Visualizations

0:30

Frequency Distributions

0:31

Compare & Contrast

2:26

Frequency Distributions Vs. Scatterplots

2:27

Summary Values

4:53

Shape

4:54

Center & Trend

6:41

Spread & Strength

8:22

Univariate & Bivariate

10:25

Example Scatterplot

10:48

Shape, Trend, and Strength

10:49

Positive and Negative Association

14:05

Positive and Negative Association

14:06

Linearity, Strength, and Consistency

18:30

Linearity

18:31

Strength

19:14

Consistency

20:40

Summarizing a Scatterplot

22:58

Summarizing a Scatterplot

22:59

Example 1: Gapminder.org, Income x Life Expectancy

26:32

Example 2: Gapminder.org, Income x Infant Mortality

36:12

Example 3: Trend and Strength of Variables

40:14

Example 4: Trend, Strength and Shape for Scatterplots

43:27

Regression

32m 2s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Linear Equations

0:34

Linear Equations: y = mx + b

0:35

Rough Line

5:16

Rough Line

5:17

Regression - A 'Center' Line

7:41

Reasons for Summarizing with a Regression Line

7:42

Predictor and Response Variable

10:04

Goal of Regression

12:29

Goal of Regression

12:30

Prediction

14:50

Example: Servings of Milk Per Year Shown By Age

14:51

Interpolation

17:06

Extrapolation

17:58

Error in Prediction

20:34

Prediction Error

20:35

Residual

21:40

Example 1: Residual

23:34

Example 2: Large and Negative Residual

26:30

Example 3: Positive Residual

28:13

Example 4: Interpret Regression Line & Extrapolate

29:40

Least Squares Regression

56m 36s

Intro

0:00

Roadmap

0:13

Roadmap

0:14

Best Fit

0:47

Best Fit

0:48

Sum of Squared Errors (SSE)

1:50

Sum of Squared Errors (SSE)

1:51

Why Squared?

3:38

Why Squared?

3:39

Quantitative Properties of Regression Line

4:51

Quantitative Properties of Regression Line

4:52

So How do we Find Such a Line?

6:49

SSEs of Different Line Equations & Lowest SSE

6:50

Carl Gauss' Method

8:01

How Do We Find Slope (b1)

11:00

How Do We Find Slope (b1)

11:01

How Do We Find Intercept

15:11

How Do We Find Intercept

15:12

Example 1: Which of These Equations Fit the Above Data Best?

17:18

Example 2: Find the Regression Line for These Data Points and Interpret It

26:31

Example 3: Summarize the Scatterplot and Find the Regression Line.

34:31

Example 4: Examine the Mean of Residuals

43:52

Correlation

43m 58s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Summarizing a Scatterplot Quantitatively

0:47

Shape

0:48

Trend

1:11

Strength: Correlation (r)

1:45

Correlation Coefficient ( r )

2:30

Correlation Coefficient ( r )

2:31

Trees vs. Forest

11:59

Trees vs. Forest

12:00

Calculating r

15:07

Average Product of z-scores for x and y

15:08

Relationship between Correlation and Slope

21:10

Relationship between Correlation and Slope

21:11

Example 1: Find the Correlation between Grams of Fat and Cost

24:11

Example 2: Relationship between r and b1

30:24

Example 3: Find the Regression Line

33:35

Example 4: Find the Correlation Coefficient for this Set of Data

37:37

Correlation: r vs. r-squared

52m 52s

Intro

0:00

Roadmap

0:07

Roadmap

0:08

R-squared

0:44

What is the Meaning of It? Why Squared?

0:45

Parsing Sum of Squares (Parsing Variability)

2:25

SST = SSR + SSE

2:26

What is SST and SSE?

7:46

What is SST and SSE?

7:47

r-squared

18:33

Coefficient of Determination

18:34

If the Correlation is Strong…

20:25

If the Correlation is Strong…

20:26

If the Correlation is Weak…

22:36

If the Correlation is Weak…

22:37

Example 1: Find r-squared for this Set of Data

23:56

Example 2: What Does it Mean that the Simple Linear Regression is a 'Model' of Variance?

33:54

Example 3: Why Does r-squared Only Range from 0 to 1

37:29

Example 4: Find the r-squared for This Set of Data

39:55

Transformations of Data

27m 8s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Why Transform?

0:26

Why Transform?

0:27

Shape-preserving vs. Shape-changing Transformations

5:14

Shape-preserving = Linear Transformations

5:15

Shape-changing Transformations = Non-linear Transformations

6:20

Common Shape-Preserving Transformations

7:08

Common Shape-Preserving Transformations

7:09

Common Shape-Changing Transformations

8:59

Powers

9:00

Logarithms

9:39

Change Just One Variable? Both?

10:38

Log-log Transformations

10:39

Log Transformations

14:38

Example 1: Create, Graph, and Transform the Data Set

15:19

Example 2: Create, Graph, and Transform the Data Set

20:08

Example 3: What Kind of Model would You Choose for this Data?

22:44

Example 4: Transformation of Data

25:46

Section 6: Collecting Data in an Experiment

Sampling & Bias

54m 44s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Descriptive vs. Inferential Statistics

1:04

Descriptive Statistics: Data Exploration

1:05

Example

2:03

To tackle Generalization…

4:31

Generalization

4:32

Sampling

6:06

'Good' Sample

6:40

Defining Samples and Populations

8:55

Population

8:56

Sample

11:16

Why Use Sampling?

13:09

Why Use Sampling?

13:10

Goal of Sampling: Avoiding Bias

15:04

What is Bias?

15:05

Where does Bias Come from: Sampling Bias

17:53

Where does Bias Come from: Response Bias

18:27

Sampling Bias: Bias from Bad Sampling Methods

19:34

Size Bias

19:35

Voluntary Response Bias

21:13

Convenience Sample

22:22

Judgment Sample

23:58

Inadequate Sample Frame

25:40

Response Bias: Bias from 'Bad' Data Collection Methods

28:00

Nonresponse Bias

29:31

Questionnaire Bias

31:10

Incorrect Response or Measurement Bias

37:32

Example 1: What Kind of Biases?

40:29

Example 2: What Biases Might Arise?

44:46

Example 3: What Kind of Biases?

48:34

Example 4: What Kind of Biases?

51:43

Sampling Methods

14m 25s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Biased vs. Unbiased Sampling Methods

0:32

Biased Sampling

0:33

Unbiased Sampling

1:13

Probability Sampling Methods

2:31

Simple Random

2:54

Stratified Random Sampling

4:06

Cluster Sampling

5:24

Two-stage Sampling

6:22

Systematic Sampling

7:25

Example 1: Which Type(s) of Sampling was this?

8:33

Example 2: Describe How to Take a Two-Stage Sample from this Book

10:16

Example 3: Sampling Methods

11:58

Example 4: Cluster Sample Plan

12:48

Research Design

53m 54s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Descriptive vs. Inferential Statistics

0:51

Descriptive Statistics: Data Exploration

0:52

Inferential Statistics

1:02

Variables and Relationships

1:44

Variables

1:45

Relationships

2:49

Not Every Type of Study is an Experiment…

4:16

Category I - Descriptive Study

4:54

Category II - Correlational Study

5:50

Category III - Experimental, Quasi-experimental, Non-experimental

6:33

Category III

7:42

Experimental, Quasi-experimental, and Non-experimental

7:43

Why CAN'T the Other Strategies Determine Causation?

10:18

Third-variable Problem

10:19

Directionality Problem

15:49

What Makes Experiments Special?

17:54

Manipulation

17:55

Control (and Comparison)

21:58

Methods of Control

26:38

Holding Constant

26:39

Matching

29:11

Random Assignment

31:48

Experiment Terminology

34:09

'true' Experiment vs. Study

34:10

Independent Variable (IV)

35:16

Dependent Variable (DV)

35:45

Factors

36:07

Treatment Conditions

36:23

Levels

37:43

Confounds or Extraneous Variables

38:04

Blind

38:38

Blind Experiments

38:39

Double-blind Experiments

39:29

How Categories Relate to Statistics

41:35

Category I - Descriptive Study

41:36

Category II - Correlational Study

42:05

Category III - Experimental, Quasi-experimental, Non-experimental

42:43

Example 1: Research Design

43:50

Example 2: Research Design

47:37

Example 3: Research Design

50:12

Example 4: Research Design

52:00

Between and Within Treatment Variability

41m 31s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Experimental Designs

0:51

Experimental Designs: Manipulation & Control

0:52

Two Types of Variability

2:09

Between Treatment Variability

2:10

Within Treatment Variability

3:31

Updated Goal of Experimental Design

5:47

Updated Goal of Experimental Design

5:48

Example: Drugs and Driving

6:56

Example: Drugs and Driving

6:57

Different Types of Random Assignment

11:27

All Experiments

11:28

Completely Random Design

12:02

Randomized Block Design

13:19

Randomized Block Design

15:48

Matched Pairs Design

15:49

Repeated Measures Design

19:47

Between-subject Variable vs. Within-subject Variable

22:43

Completely Randomized Design

22:44

Repeated Measures Design

25:03

Example 1: Design a Completely Random, Matched Pair, and Repeated Measures Experiment

26:16

Example 2: Block Design

31:41

Example 3: Completely Randomized Designs

35:11

Example 4: Completely Random, Matched Pairs, or Repeated Measures Experiments?

39:01

Section 7: Review of Probability Axioms

Sample Spaces

37m 52s

Intro

0:00

Roadmap

0:07

Roadmap

0:08

Why is Probability Involved in Statistics

0:48

Probability

0:49

Can People Tell the Difference between Cheap and Gourmet Coffee?

2:08

Taste Test with Coffee Drinkers

3:37

If No One can Actually Taste the Difference

3:38

If Everyone can Actually Taste the Difference

5:36

Creating a Probability Model

7:09

Creating a Probability Model

7:10

D'Alembert vs. Necker

9:41

D'Alembert vs. Necker

9:42

Problem with D'Alembert's Model

13:29

Problem with D'Alembert's Model

13:30

Covering Entire Sample Space

15:08

Fundamental Principle of Counting

15:09

Where Do Probabilities Come From?

22:54

Observed Data, Symmetry, and Subjective Estimates

22:55

Checking whether Model Matches Real World

24:27

Law of Large Numbers

24:28

Example 1: Law of Large Numbers

27:46

Example 2: Possible Outcomes

30:43

Example 3: Brands of Coffee and Taste

33:25

Example 4: How Many Different Treatments are there?

35:33

Addition Rule for Disjoint Events

20m 29s

Intro

0:00

Roadmap

0:08

Roadmap

0:09

Disjoint Events

0:41

Disjoint Events

0:42

Meaning of 'or'

2:39

In Regular Life

2:40

In Math/Statistics/Computer Science

3:10

Addition Rule for Disjoint Events

3:55

If A and B are Disjoint: P (A and B)

3:56

If A and B are Disjoint: P (A or B)

5:15

General Addition Rule

5:41

General Addition Rule

5:42

Generalized Addition Rule

8:31

If A and B are not Disjoint: P (A or B)

8:32

Example 1: Which of These are Mutually Exclusive?

10:50

Example 2: What is the Probability that You will Have a Combination of One Heads and Two Tails?

12:57

Example 3: Engagement Party

15:17

Example 4: Home Owner's Insurance

18:30

Conditional Probability

57m 19s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

'or' vs. 'and' vs. Conditional Probability

1:07

'or' vs. 'and' vs. Conditional Probability

1:08

'and' vs. Conditional Probability

5:57

P (M or L)

5:58

P (M and L)

8:41

P (M|L)

11:04

P (L|M)

12:24

Tree Diagram

15:02

Tree Diagram

15:03

Defining Conditional Probability

22:42

Defining Conditional Probability

22:43

Common Contexts for Conditional Probability

30:56

Medical Testing: Positive Predictive Value

30:57

Medical Testing: Sensitivity

33:03

Statistical Tests

34:27

Example 1: Drug and Disease

36:41

Example 2: Marbles and Conditional Probability

40:04

Example 3: Cards and Conditional Probability

45:59

Example 4: Votes and Conditional Probability

50:21

Independent Events

24m 27s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Independent Events & Conditional Probability

0:26

Non-independent Events

0:27

Independent Events

2:00

Non-independent and Independent Events

3:08

Non-independent and Independent Events

3:09

Defining Independent Events

5:52

Defining Independent Events

5:53

Multiplication Rule

7:29

Previously…

7:30

But with Independent Events

8:53

Example 1: Which of These Pairs of Events are Independent?

11:12

Example 2: Health Insurance and Probability

15:12

Example 3: Independent Events

17:42

Example 4: Independent Events

20:03

Section 8: Probability Distributions

Introduction to Probability Distributions

56m 45s

Intro

0:00

Roadmap

0:08

Roadmap

0:09

Sampling vs. Probability

0:57

Sampling

0:58

Missing

1:30

What is Missing?

3:06

Insight: Probability Distributions

5:26

Insight: Probability Distributions

5:27

What is a Probability Distribution?

7:29

From Sample Spaces to Probability Distributions

8:44

Sample Space

8:45

Probability Distribution of the Sum of Two Dice

11:16

The Random Variable

17:43

The Random Variable

17:44

Expected Value

21:52

Expected Value

21:53

Example 1: Probability Distributions

28:45

Example 2: Probability Distributions

35:30

Example 3: Probability Distributions

43:37

Example 4: Probability Distributions

47:20

Expected Value & Variance of Probability Distributions

53m 41s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Discrete vs. Continuous Random Variables

1:04

Discrete vs. Continuous Random Variables

1:05

Mean and Variance Review

4:44

Mean: Sample, Population, and Probability Distribution

4:45

Variance: Sample, Population, and Probability Distribution

9:12

Example Situation

14:10

Example Situation

14:11

Some Special Cases…

16:13

Some Special Cases…

16:14

Linear Transformations

19:22

Linear Transformations

19:23

What Happens to Mean and Variance of the Probability Distribution?

20:12

n Independent Values of X

25:38

n Independent Values of X

25:39

Compare These Two Situations

30:56

Compare These Two Situations

30:57

Two Random Variables, X and Y

32:02

Two Random Variables, X and Y

32:03

Example 1: Expected Value & Variance of Probability Distributions

35:35

Example 2: Expected Values & Standard Deviation

44:17

Example 3: Expected Winnings and Standard Deviation

48:18

Binomial Distribution

55m 15s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Discrete Probability Distributions

1:42

Discrete Probability Distributions

1:43

Binomial Distribution

2:36

Binomial Distribution

2:37

Multiplicative Rule Review

6:54

Multiplicative Rule Review

6:55

How Many Outcomes with k 'Successes'

10:23

Adults and Bachelor's Degree: Manual List of Outcomes

10:24

P (X=k)

19:37

Putting Together # of Outcomes with the Multiplicative Rule

19:38

Expected Value and Standard Deviation in a Binomial Distribution

25:22

Expected Value and Standard Deviation in a Binomial Distribution

25:23

Example 1: Coin Toss

33:42

Example 2: College Graduates

38:03

Example 3: Types of Blood and Probability

45:39

Example 4: Expected Number and Standard Deviation

51:11

Section 9: Sampling Distributions of Statistics

Introduction to Sampling Distributions

48m 17s

Intro

0:00

Roadmap

0:08

Roadmap

0:09

Probability Distributions vs. Sampling Distributions

0:55

Probability Distributions vs. Sampling Distributions

0:56

Same Logic

3:55

Logic of Probability Distribution

3:56

Example: Rolling Two Dice

6:56

Simulating Samples

9:53

To Come Up with Probability Distributions

9:54

In Sampling Distributions

11:12

Connecting Sampling and Research Methods with Sampling Distributions

12:11

Connecting Sampling and Research Methods with Sampling Distributions

12:12

Simulating a Sampling Distribution

14:14

Experimental Design: Regular Sleep vs. Less Sleep

14:15

Logic of Sampling Distributions

23:08

Logic of Sampling Distributions

23:09

General Method of Simulating Sampling Distributions

25:38

General Method of Simulating Sampling Distributions

25:39

Questions that Remain

28:45

Questions that Remain

28:46

Example 1: Mean and Standard Error of Sampling Distribution

30:57

Example 2: What is the Best Way to Describe Sampling Distributions?

37:12

Example 3: Matching Sampling Distributions

38:21

Example 4: Mean and Standard Error of Sampling Distribution

41:51

Sampling Distribution of the Mean

1h 8m 48s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Special Case of General Method for Simulating a Sampling Distribution

1:53

Special Case of General Method for Simulating a Sampling Distribution

1:54

Computer Simulation

3:43

Using Simulations to See Principles behind Shape of SDoM

15:50

Using Simulations to See Principles behind Shape of SDoM

15:51

Conditions

17:38

Using Simulations to See Principles behind Center (Mean) of SDoM

20:15

Using Simulations to See Principles behind Center (Mean) of SDoM

20:16

Conditions: Does n Matter?

21:31

Conditions: Does Number of Simulation Matter?

24:37

Using Simulations to See Principles behind Standard Deviation of SDoM

27:13

Using Simulations to See Principles behind Standard Deviation of SDoM

27:14

Conditions: Does n Matter?

34:45

Conditions: Does Number of Simulation Matter?

36:24

Central Limit Theorem

37:13

SHAPE

38:08

CENTER

39:34

SPREAD

39:52

Comparing Population, Sample, and SDoM

43:10

Comparing Population, Sample, and SDoM

43:11

Answering the 'Questions that Remain'

48:24

What Happens When We Don't Know What the Population Looks Like?

48:25

Can We Have Sampling Distributions for Summary Statistics Other than the Mean?

49:42

How Do We Know whether a Sample is Sufficiently Unlikely?

53:36

Do We Always Have to Simulate a Large Number of Samples in Order to get a Sampling Distribution?

54:40

Example 1: Mean Batting Average

55:25

Example 2: Mean Sampling Distribution and Standard Error

59:07

Example 3: Sampling Distribution of the Mean

1:01:04

Sampling Distribution of Sample Proportions

54m 37s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Intro to Sampling Distribution of Sample Proportions (SDoSP)

0:51

Categorical Data (Examples)

0:52

Wish to Estimate Proportion of Population from Sample…

2:00

Notation

3:34

Population Proportion and Sample Proportion Notations

3:35

What's the Difference?

9:19

SDoM vs. SDoSP: Type of Data

9:20

SDoM vs. SDoSP: Shape

11:24

SDoM vs. SDoSP: Center

12:30

SDoM vs. SDoSP: Spread

15:34

Binomial Distribution vs. Sampling Distribution of Sample Proportions

19:14

Binomial Distribution vs. SDoSP: Type of Data

19:17

Binomial Distribution vs. SDoSP: Shape

21:07

Binomial Distribution vs. SDoSP: Center

21:43

Binomial Distribution vs. SDoSP: Spread

24:08

Example 1: Sampling Distribution of Sample Proportions

26:07

Example 2: Sampling Distribution of Sample Proportions

37:58

Example 3: Sampling Distribution of Sample Proportions

44:42

Example 4: Sampling Distribution of Sample Proportions

45:57

Section 10: Inferential Statistics

Introduction to Confidence Intervals

42m 53s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Inferential Statistics

0:50

Inferential Statistics

0:51

Two Problems with This Picture…

3:20

Two Problems with This Picture…

3:21

Solution: Confidence Intervals (CI)

4:59

Solution: Hypothesis Testing (HT)

5:49

Which Parameters are Known?

6:45

Which Parameters are Known?

6:46

Confidence Interval - Goal

7:56

When We Don't Know μ but Know σ

7:57

When We Don't Know

18:27

When We Don't Know μ nor σ

18:28

Example 1: Confidence Intervals

26:18

Example 2: Confidence Intervals

29:46

Example 3: Confidence Intervals

32:18

Example 4: Confidence Intervals

38:31

t Distributions

1h 2m 6s

Intro

0:00

Roadmap

0:04

Roadmap

0:05

When to Use z vs. t?

1:07

When to Use z vs. t?

1:08

What is z and t?

3:02

z-score and t-score: Commonality

3:03

z-score and t-score: Formulas

3:34

z-score and t-score: Difference

5:22

Why not z? (Why t?)

7:24

Why not z? (Why t?)

7:25

But Don't Worry!

15:13

Gosset and t-distributions

15:14

Rules of t Distributions

17:05

t-distributions are More Normal as n Gets Bigger

17:06

t-distributions are a Family of Distributions

18:55

Degrees of Freedom (df)

20:02

Degrees of Freedom (df)

20:03

t Family of Distributions

24:07

t Family of Distributions: df = 2, 4, and 60

24:08

df = 60

29:16

df = 2

29:59

How to Find It?

31:01

'Student's t-distribution' or 't-distribution'

31:02

Excel Example

33:06

Example 1: Which Distribution Do You Use? Z or t?

45:26

Example 2: Friends on Facebook

47:41

Example 3: t Distributions

52:15

Example 4: t Distributions, Confidence Interval, and Mean

55:59

Introduction to Hypothesis Testing

1h 6m 33s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Issues to Overcome in Inferential Statistics

1:35

Issues to Overcome in Inferential Statistics

1:36

What Happens When We Don't Know What the Population Looks Like?

2:57

How Do We Know whether a Sample is Sufficiently Unlikely?

3:43

Hypothesizing a Population

6:44

Hypothesizing a Population

6:45

Null Hypothesis

8:07

Alternative Hypothesis

8:56

Hypotheses

11:58

Hypotheses

11:59

Errors in Hypothesis Testing

14:22

Errors in Hypothesis Testing

14:23

Steps of Hypothesis Testing

21:15

Steps of Hypothesis Testing

21:16

Single Sample HT (When Sigma Available)

26:08

Example: Average Facebook Friends

26:09

Step 1

27:08

Step 2

27:58

Step 3

28:17

Step 4

32:18

Single Sample HT (When Sigma Not Available)

36:33

Example: Average Facebook Friends

36:34

Step 1: Hypothesis Testing

36:58

Step 2: Significance Level

37:25

Step 3: Decision Stage

37:40

Step 4: Sample

41:36

Sigma and p-value

45:04

Sigma and p-value

45:05

One Tailed vs. Two Tailed Hypotheses

45:51

Example 1: Hypothesis Testing

48:37

Example 2: Heights of Women in the US

57:43

Example 3: Select the Best Way to Complete This Sentence

1:03:23

Confidence Intervals for the Difference of Two Independent Means

55m 14s

Intro

0:00

Roadmap

0:14

Roadmap

0:15

One Mean vs. Two Means

1:17

One Mean vs. Two Means

1:18

Notation

2:41

A Sample! A Set!

2:42

Mean of X, Mean of Y, and Difference of Two Means

3:56

SE of X

4:34

SE of Y

6:28

Sampling Distribution of the Difference between Two Means (SDoD)

7:48

Sampling Distribution of the Difference between Two Means (SDoD)

7:49

Rules of the SDoD (similar to CLT!)

15:00

Mean for the SDoD Null Hypothesis

15:01

Standard Error

17:39

When can We Construct a CI for the Difference between Two Means?

21:28

Three Conditions

21:29

Finding CI

23:56

One Mean CI

23:57

Two Means CI

25:45

Finding t

29:16

Finding t

29:17

Interpreting CI

30:25

Interpreting CI

30:26

Better Estimate of s (s pool)

34:15

Better Estimate of s (s pool)

34:16

Example 1: Confidence Intervals

42:32

Example 2: SE of the Difference

52:36

Hypothesis Testing for the Difference of Two Independent Means

50m

Intro

0:00

Roadmap

0:06

Roadmap

0:07

The Goal of Hypothesis Testing

0:56

One Sample and Two Samples

0:57

Sampling Distribution of the Difference between Two Means (SDoD)

3:42

Sampling Distribution of the Difference between Two Means (SDoD)

3:43

Rules of the SDoD (Similar to CLT!)

6:46

Shape

6:47

Mean for the Null Hypothesis

7:26

Standard Error for Independent Samples (When Variance is Homogeneous)

8:18

Standard Error for Independent Samples (When Variance is not Homogeneous)

9:25

Same Conditions for HT as for CI

10:08

Three Conditions

10:09

Steps of Hypothesis Testing

11:04

Steps of Hypothesis Testing

11:05

Formulas that Go with Steps of Hypothesis Testing

13:21

Step 1

13:25

Step 2

14:18

Step 3

15:00

Step 4

16:57

Example 1: Hypothesis Testing for the Difference of Two Independent Means

18:47

Example 2: Hypothesis Testing for the Difference of Two Independent Means

33:55

Example 3: Hypothesis Testing for the Difference of Two Independent Means

44:22

Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

1h 14m 11s

Intro

0:00

Roadmap

0:09

Roadmap

0:10

The Goal of Hypothesis Testing

1:27

One Sample and Two Samples

1:28

Independent Samples vs. Paired Samples

3:16

Independent Samples vs. Paired Samples

3:17

Which is Which?

5:20

Independent SAMPLES vs. Independent VARIABLES

7:43

Independent SAMPLES vs. Independent VARIABLES

7:44

T-tests Always…

10:48

T-tests Always…

10:49

Notation for Paired Samples

12:59

Notation for Paired Samples

13:00

Steps of Hypothesis Testing for Paired Samples

16:13

Steps of Hypothesis Testing for Paired Samples

16:14

Rules of the SDoD (Adding on Paired Samples)

18:03

Shape

18:04

Mean for the Null Hypothesis

18:31

Standard Error for Independent Samples (When Variance is Homogeneous)

19:25

Standard Error for Paired Samples

20:39

Formulas that go with Steps of Hypothesis Testing

22:59

Formulas that go with Steps of Hypothesis Testing

23:00

Confidence Intervals for Paired Samples

30:32

Confidence Intervals for Paired Samples

30:33

Example 1: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

32:28

Example 2: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

44:02

Example 3: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

52:23

Type I and Type II Errors

31m 27s

Intro

0:00

Roadmap

0:18

Roadmap

0:19

Errors and Relationship to HT and the Sample Statistic?

1:11

Errors and Relationship to HT and the Sample Statistic?

1:12

Instead of a Box…Distributions!

7:00

One Sample t-test: Friends on Facebook

7:01

Two Sample t-test: Friends on Facebook

13:46

Usually, Lots of Overlap between Null and Alternative Distributions

16:59

Overlap between Null and Alternative Distributions

17:00

How Distributions and 'Box' Fit Together

22:45

How Distributions and 'Box' Fit Together

22:46

Example 1: Types of Errors

25:54

Example 2: Types of Errors

27:30

Example 3: What is the Danger of the Type I Error?

29:38

Effect Size & Power

44m 41s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Distance between Distributions: Sample t

0:49

Distance between Distributions: Sample t

0:50

Problem with Distance in Terms of Standard Error

2:56

Problem with Distance in Terms of Standard Error

2:57

Test Statistic (t) vs. Effect Size (d or g)

4:38

Test Statistic (t) vs. Effect Size (d or g)

4:39

Rules of Effect Size

6:09

Rules of Effect Size

6:10

Why Do We Need Effect Size?

8:21

Tells You the Practical Significance

8:22

HT can be Deceiving…

10:25

Important Note

10:42

What is Power?

11:20

What is Power?

11:21

Why Do We Need Power?

14:19

Conditional Probability and Power

14:20

Power is:

16:27

Can We Calculate Power?

19:00

Can We Calculate Power?

19:01

How Does Alpha Affect Power?

20:36

How Does Alpha Affect Power?

20:37

How Does Effect Size Affect Power?

25:38

How Does Effect Size Affect Power?

25:39

How Does Variability and Sample Size Affect Power?

27:56

How Does Variability and Sample Size Affect Power?

27:57

How Do We Increase Power?

32:47

Increasing Power

32:48

Example 1: Effect Size & Power

35:40

Example 2: Effect Size & Power

37:38

Example 3: Effect Size & Power

40:55

Section 11: Analysis of Variance

F-distributions

24m 46s

Intro

0:00

Roadmap

0:04

Roadmap

0:05

Z- & T-statistic and Their Distribution

0:34

Z- & T-statistic and Their Distribution

0:35

F-statistic

4:55

The F Ratio (the Variance Ratio)

4:56

F-distribution

12:29

F-distribution

12:30

s and p-value

15:00

s and p-value

15:01

Example 1: Why Does F-distribution Stop At 0 But Go On Until Infinity?

18:33

Example 2: F-distributions

19:29

Example 3: F-distributions and Heights

21:29

ANOVA with Independent Samples

1h 9m 25s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

The Limitations of t-tests

1:12

The Limitations of t-tests

1:13

Two Major Limitations of Many t-tests

3:26

Two Major Limitations of Many t-tests

3:27

Ronald Fisher's Solution… F-test! New Null Hypothesis

4:43

Ronald Fisher's Solution… F-test! New Null Hypothesis (Omnibus Test - One Test to Rule Them All!)

4:44

Analysis of Variance (ANoVA) Notation

7:47

Analysis of Variance (ANoVA) Notation

7:48

Partitioning (Analyzing) Variance

9:58

Total Variance

9:59

Within-group Variation

14:00

Between-group Variation

16:22

Time out: Review Variance & SS

17:05

Time out: Review Variance & SS

17:06

F-statistic

19:22

The F Ratio (the Variance Ratio)

19:23

S²bet = SSbet / dfbet

22:13

What is This?

22:14

How Many Means?

23:20

So What is the dfbet?

23:38

So What is SSbet?

24:15

S²w = SSw / dfw

26:05

What is This?

26:06

How Many Means?

27:20

So What is the dfw?

27:36

So What is SSw?

28:18

Chart of Independent Samples ANOVA

29:25

Chart of Independent Samples ANOVA

29:26

Example 1: Who Uploads More Photos: Unknown Ethnicity, Latino, Asian, Black, or White Facebook Users?

35:52

Hypotheses

35:53

Significance Level

39:40

Decision Stage

40:05

Calculate Samples' Statistic and p-Value

44:10

Reject or Fail to Reject H0

55:54

Example 2: ANOVA with Independent Samples

58:21

Repeated Measures ANOVA

1h 15m 13s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

The Limitations of t-tests

0:36

Who Uploads more Pictures and Which Photo-Type is Most Frequently Used on Facebook?

0:37

ANOVA (F-test) to the Rescue!

5:49

Omnibus Hypothesis

5:50

Analyze Variance

7:27

Independent Samples vs. Repeated Measures

9:12

Same Start

9:13

Independent Samples ANOVA

10:43

Repeated Measures ANOVA

12:00

Independent Samples ANOVA

16:00

Same Start: All the Variance Around Grand Mean

16:01

Independent Samples

16:23

Repeated Measures ANOVA

18:18

Same Start: All the Variance Around Grand Mean

18:19

Repeated Measures

18:33

Repeated Measures F-statistic

21:22

The F Ratio (The Variance Ratio)

21:23

S²bet = SSbet / dfbet

23:07

What is This?

23:08

How Many Means?

23:39

So What is the dfbet?

23:54

So What is SSbet?

24:32

S² resid = SS resid / df resid

25:46

What is This?

25:47

So What is SS resid?

26:44

So What is the df resid?

27:36

SS subj and df subj

28:11

What is This?

28:12

How Many Subject Means?

29:43

So What is df subj?

30:01

So What is SS subj?

30:09

SS total and df total

31:42

What is This?

31:43

What is the Total Number of Data Points?

32:02

So What is df total?

32:34

So What is SS total?

32:47

Chart of Repeated Measures ANOVA

33:19

Chart of Repeated Measures ANOVA: F and Between-samples Variability

33:20

Chart of Repeated Measures ANOVA: Total Variability, Within-subject (case) Variability, Residual Variability

35:50

Example 1: Which is More Prevalent on Facebook: Tagged, Uploaded, Mobile, or Profile Photos?

40:25

Hypotheses

40:26

Significance Level

41:46

Decision Stage

42:09

Calculate Samples' Statistic and p-Value

46:18

Reject or Fail to Reject H0

57:55

Example 2: Repeated Measures ANOVA

58:57

Example 3: What's the Problem with a Bunch of Tiny t-tests?

1:13:59

Section 12: Chi-square Test

Chi-Square Goodness-of-Fit Test

58m 23s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Where Does the Chi-Square Test Belong?

0:50

Where Does the Chi-Square Test Belong?

0:51

A New Twist on HT: Goodness-of-Fit

7:23

HT in General

7:24

Goodness-of-Fit HT

8:26

Hypotheses about Proportions

12:17

Null Hypothesis

12:18

Alternative Hypothesis

13:23

Example

14:38

Chi-Square Statistic

17:52

Chi-Square Statistic

17:53

Chi-Square Distributions

24:31

Chi-Square Distributions

24:32

Conditions for Chi-Square

28:58

Condition 1

28:59

Condition 2

30:20

Condition 3

30:32

Condition 4

31:47

Example 1: Chi-Square Goodness-of-Fit Test

32:23

Example 2: Chi-Square Goodness-of-Fit Test

44:34

Example 3: Which of These Statements Describe Properties of the Chi-Square Goodness-of-Fit Test?

56:06

Chi-Square Test of Homogeneity

51m 36s

Intro

0:00

Roadmap

0:09

Roadmap

0:10

Goodness-of-Fit vs. Homogeneity

1:13

Goodness-of-Fit HT

1:14

Homogeneity

2:00

Analogy

2:38

Hypotheses About Proportions

5:00

Null Hypothesis

5:01

Alternative Hypothesis

6:11

Example

6:33

Chi-Square Statistic

10:12

Same as Goodness-of-Fit Test

10:13

Set Up Data

12:28

Setting Up Data Example

12:29

Expected Frequency

16:53

Expected Frequency

16:54

Chi-Square Distributions & df

19:26

Chi-Square Distributions & df

19:27

Conditions for Test of Homogeneity

20:54

Condition 1

20:55

Condition 2

21:39

Condition 3

22:05

Condition 4

22:23

Example 1: Chi-Square Test of Homogeneity

22:52

Example 2: Chi-Square Test of Homogeneity

32:10

Section 13: Overview of Statistics

Overview of Statistics

18m 11s

Intro

0:00

Roadmap

0:07

Roadmap

0:08

The Statistical Tests (HT) We've Covered

0:28

The Statistical Tests (HT) We've Covered

0:29

Organizing the Tests We've Covered…

1:08

One Sample: Continuous DV and Categorical DV

1:09

Two Samples: Continuous DV and Categorical DV

5:41

More Than Two Samples: Continuous DV and Categorical DV

8:21

The Following Data: OK Cupid

10:10

The Following Data: OK Cupid

10:11

Example 1: Weird-MySpace-Angle Profile Photo

10:38

Example 2: Geniuses

12:30

Example 3: Promiscuous iPhone Users

13:37

Example 4: Women, Aging, and Messaging

16:07



Section 5: Linear Regression: Lecture 5 | 52:52 min


Lecture Comments (2)

0 answers

Post by Elias Tessema on April 5, 2014

I am having hard time understanding about concordant rate...can you please explain what concordant pair means

0 answers

Post by George Kumar on May 11, 2012

Model planes are a good analogy. However, model houses are not a good analogy. Model houses are real. They are sometimes sought after houses.


Correlation: r vs. r-squared


Intro 0:00
Roadmap 0:07

Roadmap

R-squared 0:44

What is the Meaning of It? Why Squared?

Parsing Sum of Squared (Parsing Variability) 2:25

SST = SSR + SSE

What is SST and SSE? 7:46

What is SST and SSE?

r-squared 18:33

Coefficient of Determination

If the Correlation is Strong… 20:25

If the Correlation is Strong…

If the Correlation is Weak… 22:36

If the Correlation is Weak…

Example 1: Find r-squared for this Set of Data 23:56
Example 2: What Does it Mean that the Simple Linear Regression is a 'Model' of Variance? 33:54
Example 3: Why Does r-squared Only Range from 0 to 1 37:29
Example 4: Find the r-squared for This Set of Data 39:55

Transcription: Correlation: r vs. r-squared

Hi, and welcome to www.educator.com.

We are going to talk about the difference between r and r².

First I’m going to introduce the quantity r² and why we need to understand it.

Why can't we just square r and say, that is r²?

We want to know what the meaning of r² is.

In order to get to the meaning of r², we have to understand that the sum of squared differences can be split apart in different ways.

We are going to learn how to parse the different parts of the sum of squared differences.

Then we are going to talk about what r² means for a very strong correlation

and what r² means for a very weak correlation.

One of the reasons why, practically, you will need to understand r² is that often when you do regression on the computer,

either in SPSS or Stata or any of these statistics packages, they will give you r²

as one of the outputs, and you might be looking at it and be like, why are we getting r²?

We want to know, what is the meaning of it?

Why r²? Why not just have r?

Often if you just find the correlation you will just get r, but if you find the regression you will get r².

It is like, what is the deal?

r² really is just r squared, but there is a meaning behind it.

I want to just stop and say it is like the difference between feet and feet².

They mean different things.

It is not just that you can square the number and say it is just the number squared.

It is not just about the number; it is also about the actual unit.

You have to understand what the unit is, because feet is a measurement of length, but square feet now gives you area.

Those are different things.

They are obviously related to each other, but they are very different ideas.

Because of that, you need to know not only how to calculate r², but also the meaning of r².

Again, in order to understand the meaning of r² we will need to parse the sum of squares.

Remember, the sum of squares that we have been talking about is something like this: for x or y, take

the difference between x and x bar, or the difference between y and y bar,

square all of those, and then add them up: sum of squares.

When we say sum of squares, you might hear that this is about variability.

Sum of squares talks about variability because you are always getting that deviation between your data and the mean.

Sum of squares is an idea that is highly associated with variability.

Another way of thinking about parsing sum of squares is parsing variability, because variability comes from a variety of sources.

Here we are going to talk about a couple of those sources and how to figure out that this variability comes from here and that variability comes from there.

When you put it together you have total variability.

Now, total variability is going to be indicated by SST, or sum of squares total.

This idea is all the variability in the system.

All of the variability.

We are going to take that and parse it, split it apart into two pieces that are just two different places that the variability comes from.

One of the sources of the variability is the relationship between X and Y, and that can be explained by the regression line.

This is the sum of squares from the regression, SSR.

This other one is going to be the leftover sum of squares.

There is going to be some variability left over that is not explained by the regression line, and that is the sum of squares error, SSE.

When we say error, we do not necessarily mean that we made a mistake.

It is not that we made a mistake.

Error often just means variability that is unexplained.

We do not know where it came from.

We do not know if it is because there was some measurement error.

We do not know if there is just noise in the system.

We do not know if there is another variable that is causing this variation.

Sum of squares error just means variability that we cannot explain.

That does not necessarily mean that we made a mistake.

Oftentimes statistics uses that word error, and it does not mean that we made a mistake;

it means that it is just variability that we do not know the source of.

There is no explanation for it.

To break this down, you can see that this is sum of squares total, and that is usually what we get from looking at the difference between y and just the mean.

That is like the classic sum of squares, because the mean should give us some information about where y is.

It is like guessing that every single point is going to be at the mean.

That is like error, but that is the total error.

Some of that error, some of that variation away from the mean, can be accounted for by the regression: here, the points are farther and farther up from the mean.

The numbers are bigger than the mean, and then here the numbers are smaller than the mean.

Here, this is the residual, and this we have already looked at before.

You can also think of it as residual error, where it is the rest of the variation that is not accounted for by that nice regression line that we found.

We could think of this as the explained variability.

This is explained, and what explains the variability?

The regression line.

The regression line says the data vary systematically, like this.

The residual is what we call unexplained variability.

We do not know whether it comes from real error, from noise in the system, or from another variable.

When you put the explained variability and unexplained variability together you will get total variability.
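As a rough sketch of the split described above, the following fits a least-squares line to a small data set and checks that the explained and unexplained pieces add back up to the total. The numbers in `xs` and `ys` are invented for illustration and are not the lecture's data; the variable names are assumptions as well.

```python
# Hypothetical data, invented purely for illustration (not the lecture's data set).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept for the line y_hat = intercept + slope * x.
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar
y_hats = [intercept + slope * x for x in xs]

# Total variability: squared distances from each data point to the mean.
sst = sum((y - y_bar) ** 2 for y in ys)
# Explained variability: squared distances from the line to the mean.
ssr = sum((y_hat - y_bar) ** 2 for y_hat in y_hats)
# Unexplained variability: squared distances from each data point to the line.
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))

print(f"SST       = {sst:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
```

The two printed values agree up to floating-point rounding. That equality relies on the line being the least-squares line with an intercept; for an arbitrary line drawn through the data, the pieces need not add up.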

Let us break down specifically and mathematically what sum of squares total, sum of squares regression, and sum of squares error are.

I will give you a picture of what these things are.

First let us talk about sum of squares total.

One thing we probably want to do is give a rough idea of what the mean is.

Let us say the mean is something like this.

I’m just going to call that y bar because that might be the mean of y, roughly.

It is closer to these points, but these guys further down pull it down.

I’m going to call that y bar, and I want to know the sum of squares total.

What is the total variability that you see here?

Because we are squaring all these differences, we are not just interested in that residual idea.

We are interested in the area of little squares.

It is not only the distance down; imagine that distance squared, as this area.

That is the squared variation of one point.

Imagine doing that with all of these.

You create these squares.

Some are big squares, some are little squares, and you add up all those different areas.

That is sum of squares total.

That is the total variation in our data away from the mean.

Wouldn't it be nice if all our data looked something like the mean?

Then I could predict this data well, but this data has more variation.

Let us move over to sum of squares error, because that is the one we already know.

In order to find sum of squared error I need the regression line.

I’m just going to draw a regression line like this.

It might not be perfect, but something like that.

Remember how we found the residual?

To find a residual, it is just the difference between my y and, not y bar, but my predicted y, y hat.

These are my y hats, and I want to know the difference between them and my data, but we are squaring that difference.

Instead of just drawing a line, we draw a square and imagine getting that area.

That is the squared residual, or error, for one point.

We are going to do that with all of the points.

Find that area, that area, that area, add up all those areas, and then we get the sum of squared error.

The variation away from the regression line.

This is our unexplained variation.

This is our total variation.

Now what is this part?

This is the variability that is already accounted for by the regression line.

This is the difference between the predicted y and y bar.

Here is the idea.

If we just have y bar, we do not have a lot of predictive power.

We are just saying our guess for y is y bar, the average.

It is just the average, and we only have one guess.

The average.

If we have the regression line, we have a closer guess.

If I know what x is, I can tell you more closely what y might be.

I will try to redraw my regression line and pretend that is a nice regression.

Here is my y hat.

Also, here is my y bar.

Here, what I want to know is how much of the variability is simply accounted for by having this line.

Having this line gives us more predictive power; how much more?

We want to know, for this point, this is now my difference, and then I'm just going to square that difference.

Here is another point, and here is the difference.

The difference is almost nothing.

Here is the difference.

It is right here, this difference.

Let me give another example: right here, for this point, this would be the difference.

Looking at all of these, you can think of it as the squared spaces in between my regression line and my mean line.

I'm looking at that, and that gives me how much of the variance in my data is accounted for by the regression line.

That is roughly the idea.

Let us think about actual formulas, and to help us out with that I have a more nicely drawn version than my crappy dots.

Now you can see the squared differences between my actual data points and my mean.

Here are my squared differences.

Here is that same data.

It is the same data from before, except now we are looking at differences from the regression line, not the mean line.

Here we are looking at differences between the mean line and the regression line.

Let us write these things down in terms of formulas.

In order to find the sum of squares total, let us think about what this is as an idea.

Okay, we want the sum of squares, so I know it is going to be a sum of squares.

All of these guys are going to be like this, so I can already write that down.

Each of them is a sum of squares, so each is going to be the sum of something squared.

We already know that part is going to be the same.

Here we have, for every y, give me the difference between that y and the mean, and then square it and get that area.

Get all these areas and add them up.

That is just y - y bar.

If we want to fill this out, this means for every single point that we have, get y - y bar, then square it, and add them up.

That is the idea.

That is sum of squares total.

Now let us go over to sum of squares error.

I sometimes also call it sum of squares residual, because this is the idea of the residual.

Remember, the residual was y - y hat.

And so we are squaring the difference between y and y hat.

That is really easy.

y - y hat.

If you want to fill it out, you could put in the subscript i as well, just so you know you have to do that for every single point.

For the sum of squares for the regression: I know it is confusing that sum of squares regression and sum of squares residual both start with r.

This one is sum of squares regression.

I want you to think of this guy as the good guy.

It is like you want to be able to predict Y from X, and this guy helps you because he sucks up some of the variance.

This guy is the leftover that I do not know what to do about.

When we talk about the regression, we are talking about the difference between y hat and y bar.

That is y hat - y bar.

You could obviously do that for each point.
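Written out, the three sums of squares described above look like this. The notation is an assumption chosen to match the spoken description: $y_i$ is an observed value, $\bar{y}$ is the mean of y, $\hat{y}_i$ is the value the regression line predicts for the i-th point, and n is the number of points.

```latex
\begin{align*}
SST &= \sum_{i=1}^{n} (y_i - \bar{y})^2       && \text{(data vs. mean: total)} \\
SSE &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2     && \text{(data vs. line: unexplained)} \\
SSR &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 && \text{(line vs. mean: explained)}
\end{align*}
```

For a least-squares regression line these satisfy SST = SSR + SSE.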

There you have it, the formulas for these, but if you understand the ideas you can always ask: what is this a picture of?

This is a picture of the difference between the data points and y bar.

Here is a picture of the difference between the data points and y hat.

It may be confusing, though, which one uses y hat.

All you have to do is go back to the picture and ask yourself whether we are talking about total variance or the variance left after we have the regression line.

Okay, so now that you know what SST, SSR, and SSE are, we can talk about r², because you need those components.

r² is often called the coefficient of determination, not the coefficient of correlation squared.

One of the reasons that r² is important is that it has an interpretation.

It is actually talking about a proportion of total variance.

Remember, variance is standard deviation².

Because we are talking about sums of squares, r² is the proportion of the total variance of y explained by the simple regression model.

Here is the idea.

It is like, here is all that variance, and we do not know where that variance comes from.

I do not know why the points are all varying.

We have the regression line.

The regression line explains where some of the variation away from the mean comes from.

It comes from this relationship with x.

If that regression line is doing a good job, then a lot of the total variance is explained by the regression line, by that predicted y.

If the line is not doing a very good job, then it does not explain a lot of the variation; there is extra variation above and beyond that.

r² would be very low, because only a small portion of that variance is accounted for.

Given that, let us talk about what a strong r might be and what a weak r might be.1215

If the correlation is very strong let us think about this.1226

Whatever your sum of squares total is they are all variance.1231

Whatever that is this is going to account for a lot of it.1238

Let us say this is like 100% of the variance this accounts for 85% and so this would be small to be 15%.1244

This is of how this works.1258

These two added up, give you the total.1260

If that is true, if the correlation is very strong this should be small and this should be large.1264

If this is small then the proportion of error over the total would be a small number.1275

Here is the formula for r².1284

R² is 1 – that proportion of error / the total.1286

This is the unaccounted for error, that leftover error / the total variation.1291

This is the unexplained variation / the total variation.1298

This number should be very, very small and when that number is very small 1 - a very small number is a number very close to 1.1303

R² is very strong because the maximum r² could be this 1.1312

This means that if r² is large this means close to 1 and this means that much of the variation is accounted for by the regression line.1318

The regression line did a great job of explaining variation.1343

If we know the regression line, I can predict y for you given x.1346

It is doing a good job.1353

On the other hand, if a correlation is weak.1358

If it is weak, remember that the correlation is about how line-y the data are.1362

Even if we have a line it does not explain all the variation.1369

There is a lot of leftover variation.1374

That should be low compared to that one.1378

Say the total is 100% and the line is not doing a very good job of explaining variation.1383

It only explains 15% of the variation then we have 85% of the variation leftover.1387

If we put the sum of squared error over the total this number should be large.1395

A large proportion of that total variance is still unaccounted for, unexplained.1400

1 minus a larger number, one that is closer to 1, will give a very small number for r².1407

If r² is small, this means that not a lot of the variation was accounted for by the regression line.1416

The regression line did not do a very good job of explaining the variation in our data.1429
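As a minimal sketch of the two cases just described, the snippet below plugs the lecture's illustrative percentages (15% left over for a strong fit, 85% left over for a weak fit) into r² = 1 - SSE / SST. The function name `r_squared` is an assumption made for this illustration.

```python
def r_squared(sse: float, sst: float) -> float:
    """Return 1 minus the unexplained share of the total variation."""
    if sst <= 0:
        raise ValueError("SST must be positive")
    return 1 - sse / sst


# Strong fit: only 15 of 100 units of variation are left unexplained.
strong = r_squared(sse=15, sst=100)

# Weak fit: 85 of 100 units of variation are left unexplained.
weak = r_squared(sse=85, sst=100)

print(f"strong fit r^2 = {strong:.2f}, weak fit r^2 = {weak:.2f}")
```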

Let us do some examples.1438

We have worked with this data before; for the above example data the regression line and the correlation have already been found, and they are given to us.1440

We could look at this and it has a negative slope and there is more rise than run.1451

Because the cost goes up really fast.1465

For every one that you go if you go up a little bit here.1469

It makes sense that the correlation is negative and strong: it is -.869, which is pretty strong, very line-y, but with a negative slope.1474

It only gets us so far.1489

It is giving us the correlation coefficient, not the coefficient of determination, r².1492

Find r² for the set of data in a different way, by using r² = 1 - the sum of squared error / the sum of squares total.1497

Once we find that examine whether this is also r × r.1514

If you download the examples provided for you below and go to example 1, here is our data and I just provided the graph for you so you could see.1523

I’m just going to move it over to the side because we are not going to need it.1534

Remember that we have this, we are going to need to calculate something in order to find the sum of squared error and the sum of squares total.1546

One thing that I like to do is remind myself if I looked at sum of squared error, if I double clicked on that what would I see inside?1557

Well, we know that the sum of squared error is whatever regression line we have and we need this distance away squared.1568

That is going to be the sum of (y - y hat)², because the regression line gives us y hat.1580

I know I’m going to need y hat.1592

What else are we going to need?1596

Sum of squares total is whatever my mean is.1597

Whatever my mean is I’m going to need to know the difference between my data and my mean squared.1603

My data and my mean squared, that is sum of squares total.1612

That I could easily find; I should try to find y hat as well.1619

Y hat will be easy to find because we have the regression line.1626

We could just plug-in a whole bunch of x and get each y for all those x.1630

Why do we not start there?1638

Let us find the predicted and then I'm just going to call cost per unit as my y because that was on my y axis.1643

I will talk about predicted cost per unit, predicted CPU.1650

In order to find that I need to put in my regression formula, so that is 795.207 and then subtract 21.514 times x, and Excel will automatically do the order of operations for you.1655

Multiplication comes before subtraction.1676

I’m just going to just click in x.1679

Whatever x is this is going to find me the predicted y value.1683

Once I have that I’m just going to drag down this to find all of my predicted CPU.1695
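The same predicted-value column can be sketched outside of Excel. The intercept 795.207 and slope -21.514 are the values quoted in the lecture; the x values below are hypothetical placeholders, since the actual data live in the downloadable example file.

```python
# Regression line quoted in the lecture: predicted CPU = 795.207 - 21.514 * x
B0 = 795.207   # intercept
B1 = -21.514   # slope (negative, so predicted cost falls as x rises)


def predicted_cpu(x: float) -> float:
    """Predicted cost per unit (y hat) for a given x."""
    return B0 + B1 * x


# Hypothetical x values, standing in for the spreadsheet column.
xs = [5, 10, 15, 20]

# Equivalent of dragging the formula down the column.
predicted = [predicted_cpu(x) for x in xs]

for x, y_hat in zip(xs, predicted):
    print(f"x = {x:>2}  predicted CPU = {y_hat:.3f}")
```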

It might actually be helpful to us to find the sums and averages of all of these.1708

I’m just going to color these in red so that I know they are not part of my data.1722

I probably do not need the sum for that.1728

I need the average for these.1730

I’m also going to need the average for these.1738

We have our predicted CPU (cost per unit).1746

That is my y hat.1752

I also found my y bar, my average cost per unit.1754

Let us find the error terms squared and also these deviations squared.1760

Here I’m just going to write it down for myself as (y - the predicted y)² and also (y - y bar)².1773

We could also write (CPU - predicted CPU)² or (CPU - average CPU)².1796

I am just writing y to save space.1805

Let me get my y - the predicted y and all of these squared.1808

Let me also do that for y and y bar.1822

Let me get the parentheses.1825

Y - y bar and all of that squared.1827

Now y bar is never going to change, so I'm just going to lock that down.1841

Once I have that I could just copy and paste these 2 cells all the way down.1850

Once I have that now I could find the sum of the residual squared as well as the sum of these deviations squared.1863

Sum of all these guys and sum of these guys.1884

I have almost everything I need in order to find r².1897

I have my sum here, my sum here.1902

Let us find r².1905

R² is going to be 1 - the sum of squared error ÷ by sum of squares total, that ratio.1910
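The spreadsheet columns built up in this example can be sketched as plain lists. The four data points below are hypothetical (the lecture's data are in the example file), and the line y hat = 2.0 + 0.7x is the least-squares line for those made-up points, not the lecture's regression line.

```python
# Hypothetical data and its least-squares line (illustration only).
x = [1, 2, 3, 4]
y = [2, 4, 5, 4]
b0, b1 = 2.0, 0.7

# Column of predicted values (y hat) and the locked-down mean (y bar).
y_hat = [b0 + b1 * xi for xi in x]
y_bar = sum(y) / len(y)

# Column of squared residuals: (y - y hat)^2
sq_resid = [(yi - yh) ** 2 for yi, yh in zip(y, y_hat)]

# Column of squared deviations from the mean: (y - y bar)^2
sq_dev = [(yi - y_bar) ** 2 for yi in y]

# Sums at the bottom of each column, then the ratio.
sse = sum(sq_resid)
sst = sum(sq_dev)
r2 = 1 - sse / sst

print(f"SSE = {sse:.2f}, SST = {sst:.2f}, r^2 = {r2:.4f}")
```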

Let us first just look at the data that we have clicked.1925

This value is smaller than this value.1928

This is 1/6.1932

Because it is about 1/6, that is pretty good, so 1 minus that should be about 5/6; it should be closer to 1 than to 0.1935

We get .76 and so we get a pretty good r².1947

Notice that r² is positive even though our slope is negative because r² does not actually talk about slope.1955

It is just the proportion of variance accounted for by the regression line.1964

It is saying that 76% of the total variance is accounted for by that regression line, the majority.1970

And so that is good.1977

Now let us try to put in r × r so we already know what r is.1979

Let us see if r² will give us .76.1986

So with (-.869)² we get something very close; this r is probably rounded, and because of that it does not give us precise numbers.1991

We do not have that precision, but it is pretty close; it is still about 76%.2009

If you have the actual r that you computed and you squared it, you would get r² exactly.2015

We found r² for the set of data and examined whether it is r × r, and it indeed is.2025
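For reference, the arithmetic behind squaring the rounded correlation coefficient from this example is:

```latex
\[
r \times r = (-0.869)^2 = 0.755161 \approx 0.76
\]
```

The sign disappears when r is squared, which is why r² is positive even though the slope is negative.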

Example 2, the conceptual explanation of r² is that it is the proportion of total variance of y explained by the simple regression model.2035

By a simple regression model we just mean one that only has the form y = b naught + b1 × x.2045

It can only be a line; it cannot be a curve.2060

That is what we mean by a simple linear regression.2065

What does it mean that the simple linear regression is a model of variance, that variance is explained by a simple regression model?2069

Let us think about this idea.2086

Here we have our data set.2089

I’m just going to draw some points here.2092

These points do not exactly fall in a line.2099

That line that we made up the regression line, the regression line is really a model.2103

It is not actual data it is a theoretical model that we created from the data.2112

By model I mean just like a model airplane or a model house; it is not the real house.2118

It is like a shining example.2133

But not only is it an example, it is idealized.2139

It is the perfect version of the world.2144

If the world were perfect and there was no error, that would be the model.2146

When we say we are modeling variance, remember there is always variance.2152

Where does it come from?2157

When we create a model, we have a little theory of where that variance comes from and in our model here this is our theory that explains the variance.2160

Our theory is that there is a relationship between x and y, and it is a very simple explanation.2185

But it is this relationship between x and y that is where the variation comes from.2192

That is what we mean by the regression line as a model of the variance.2197

Now the idea behind r² is how good is this theory.2204

How good is this model?2211

Does it explain a lot of the total variation or is it a theory that does not really help us out a lot?2213

If we have a big r², if it is fairly large, this means that our theory is pretty good.2224

Our theory explains a lot of the total variance; it accounts for much of the total variance.2231

If our r² is very small it means our theory was not that great.2237

We had a theory, here is a model but it is not that good.2240

It only explains a little bit of the variance.2244

Example 3, why does r² only range from 0 to 1?2251

It might be helpful here to start off with what r² is.2256

It is 1 - the sum of squared error / the total sum of squares, the total variance.2262

Now let us think can SSE ever be greater than SST?2272

No, it cannot, because SST by definition equals the sum of squares from the regression plus the sum of squared error.2280

SSE by definition has to be no larger than SST, and none of these can be negative because they are squared.2292

They have to be non-negative numbers, and if you add 2 non-negative numbers together you get another non-negative sum,2299

and that sum has to be greater than or equal to each of them.2307

Either this is greater than each of these or it is equal to one of them because it could be like this is 0 and this is 100%.2317

There is just actually no way that this could be bigger than 1.2325

Not bigger than 1, bigger than SST?2336

No, cannot be.2348

This proportion has to range between 0 and 1.2351

It has got to be 1 or smaller; they could be equal.2360

This could be 0 and this could be 1.2367

There is no way that this could be bigger than this.2372

Because this value only ranges from 0 to 1, 1 - something that ranges from 0 – 1, this whole thing could only range from 0 to 1.2376

Because of that r² can only range from 0 to 1.2389
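The argument in this example can be laid out as a short chain of inequalities. This is a sketch that assumes a least-squares regression line (so that the decomposition holds) and SST > 0; SSR is shorthand introduced here for the sum of squares from the regression.

```latex
\[
SST = SSR + SSE, \qquad SSR \ge 0, \qquad SSE \ge 0
\]
\[
\Rightarrow\; 0 \le SSE \le SST
\;\Rightarrow\; 0 \le \frac{SSE}{SST} \le 1
\;\Rightarrow\; 0 \le 1 - \frac{SSE}{SST} \le 1
\;\Rightarrow\; 0 \le r^2 \le 1
\]
```

The two extremes correspond to SSE = SST (the line explains nothing, r² = 0) and SSE = 0 (the line explains everything, r² = 1).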

Example 4, and this is going to be a doozy.2397

Find r² for this set of data and examine whether this is also r × r.2400

Let us think about what we are going to do.2408

In order to find r × r we need r, the correlation coefficient, and that is the sum of the products of z scores, z sub x × z sub y, divided by n - 1: the average product of z scores.2411

We are going to find that.2436

We also have to find r².2439

In order to find r² that is 1 - sum of squared error / sum of squared total.2443

In order to find this, we need y hat.2449

In order to find y hat we need the regression line.2454

To find the regression line one thing we could do is once we find a correlation coefficient we could use that in order to find b1.2465

Or obviously we can also just find b1 in other ways too.2482

But this one is a shortcut, and once we find b1 we can find the intercept, y bar – b1 × x bar.2488

We will have a whole bunch of data.2501

We have all this data.2504

Let us get started.2508

If you go to your examples and example 4, here is our data and I’m just going to move this over to the side because we are not going to be needing it for a while.2509

We can already see that it is probably going to be a positive correlation, if anything.2522

Let us just start by finding the correlation coefficient because it is pretty easy for us to find and once we have that we can find other things.2528

In order to get started on that it often helps to have the sum, the average, and the standard deviation.2539

I’m just going to make these all bold and red so we know that they are different.2552

I’m going to find the sum for these.2558

We do not need the sum here, but I figured I might as well.2564

It is not too hard.2569

There is the average and let us get the standard deviation because we are going to need that for the z score anyway.2570

Great.2580

We have all of that; now let us find the z scores for TV watching and also the z scores for junk food.2583

It makes sense that there is a positive correlation.2599

The more TV watched per week, perhaps the more junk food calories are consumed.2606

Is the correlation strong?2616

I do not know.2619

In order to find the z score we need to have the TV watching data and subtract from that the mean and I want that distance,2620

not in terms of the raw distance, but in terms of standard deviation.2638

How many standard deviations away?2642

All divided by standard deviation.2644

Here I'm just going to lock down the row.2649

I always use the same mean and standard deviation.2655

Once I have that I could just drag it all the way down and then drag it across.2667

We forgot to find these for junk food calories.2680

Let us just double click on one of these and test it out.2687

Let us see.2692

It gives me the junk food calories - the average / the standard deviation.2693

Perfect.2701

Let us just eyeball this data for a second.2704

We see that roughly half of the z scores are negative and roughly half are positive.2707

Here too roughly half are negative and roughly half are positive.2713

We know that we did a good job at finding z scores.2717

In order to find the average product we are going to need to find the product z(TV) × z(junk food).2719

This times this, and once we have all of that we could sum these and we could find the average.2733

This sum is divided by the count of how many data points there are, minus 1.2750

We found the average and that is r.2767
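The z-score route to r described above can be sketched as follows. The TV and junk food numbers are hypothetical stand-ins for the spreadsheet data, and the sketch assumes the sample standard deviation (dividing by n - 1), which matches dividing the sum of products by n - 1.

```python
from statistics import mean, stdev


def z_scores(values):
    """Distance from the mean, measured in sample standard deviations."""
    m, s = mean(values), stdev(values)  # stdev divides by n - 1
    return [(v - m) / s for v in values]


def correlation(xs, ys):
    """Average product of z scores: sum of z_x * z_y divided by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)


# Hypothetical data: hours of TV per week and junk food calories.
tv_hours = [2, 5, 8, 11, 14]
junk_calories = [200, 260, 250, 330, 310]

r = correlation(tv_hours, junk_calories)
print(f"r = {r:.3f}")
```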

Just regular old r.2770

That r is .58, so it is not super duper weak but it is not really strong either.2773

I’m just labeling it so that I know where it is later on.2782

Once we have r we could find b1, b sub 1.2785

In order to find b sub 1 that will be r × the ratio between the standard deviation for y and standard deviation of x.2804

We have that right over here.2817

Standard deviation for y ÷ standard deviation for x, that ratio.2820

And so we get that b1 is 10.75, and once we have b1 we could find b sub 0.2830

Remember, we have the point of averages, but we also have all these points.2844

You can substitute any one of the points on the line.2851

Any one of the points pairing x with predicted y.2852

You cannot substitute the raw data points, because they do not fall exactly on the line.2858

Using the point of averages, we get b sub 0 as y bar – b1 × x bar.2869

Here we get that the intercept, b sub naught or b sub 0, is 186.2881
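In symbols, the shortcut used here for the slope and intercept is as follows, where s sub x and s sub y denote the standard deviations of x and y (notation introduced here for compactness):

```latex
\[
b_1 = r \cdot \frac{s_y}{s_x}, \qquad
b_0 = \bar{y} - b_1 \bar{x}
\]
```

The second formula comes from the fact that the least-squares line always passes through the point of averages, so substituting that point into y = b₀ + b₁x and solving for b₀ gives the intercept.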

Now that we have b1 and b0 we can now find predicted y.2891

Let us go up here.2899

To help us out I am just going to color these some color so that we know that this part is all about finding the correlation coefficient.2904

We found the correlation coefficient.2921

Now what we want to do is find r².2923

And so in order to find r² let us think about what we need.2929

We need predicted y, predicted junk food, and we could easily find that, and once we have that we know we are going to need (y - predicted y)².2933

That is our sum of squared error. But we are also going to need (y - y bar)².2958

That is going to be our total error.2967

Let us start with predicted y.2970

Predicted y is always going to be b sub 0 + slope × x, which is TV watching.2973

We will lock down b sub naught and the slope b sub 1 because we do not want those to move.2992

Once we have that we could find (y - the predicted y)² .3006

And then finally we want to find (y - the average y)².3033

We want this average to be locked in place so that it does not move.3052

Once we have all of those 3 pieces we could just do the easy job of copying and pasting all the way down.3062

Once we do that, we could sum these up because we are going to need to have3074

the sum of squared residuals, and I’m going to need the sum of squared deviations from the mean.3081

In order to find the sum I could just copy and paste that.3093

Once we have the sum I can now find r².3100

I can just put in 1 – SSE / SST.3115

Let us see.3127

I will get .3377.3129

The regression line accounts for about 34% of the variation.3133

Let us see.3142

Is this r × r?3144

Is that going to be the same thing?3148

We have r; we can just square it, and we get exactly 34%.3151
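The whole chain in this example (r from z scores, then b1 and b0, then predicted y, then SSE and SST) can be checked end to end in a few lines. This is a sketch on hypothetical TV and junk food numbers, not the lecture's spreadsheet data, so the resulting values will differ from .58 and .3377; the point is only that the two routes to r² agree.

```python
from statistics import mean, stdev

# Hypothetical data: hours of TV per week (x), junk food calories (y).
x = [2, 5, 8, 11, 14]
y = [200, 260, 250, 330, 310]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (n - 1)

# Step 1: r as the sum of z-score products divided by n - 1.
r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
        for xi, yi in zip(x, y)) / (n - 1)

# Step 2: slope and intercept from r and the point of averages.
b1 = r * s_y / s_x
b0 = y_bar - b1 * x_bar

# Step 3: predicted y, then the two sums of squares.
y_hat = [b0 + b1 * xi for xi in x]
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
sst = sum((yi - y_bar) ** 2 for yi in y)

# Step 4: compare 1 - SSE / SST with r * r.
r_squared = 1 - sse / sst
print(f"1 - SSE/SST = {r_squared:.6f}")
print(f"r * r       = {r * r:.6f}")
```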

If we get a question like this, Excel can help.3160

Thanks for watching www.educator.com.3169

Related Books

Statistics by Witte, 10th Edition

Authors: Robert S. Witte, John S. Witte

ISBN: 1118450531

Publisher: Wiley

Year: 2013

This book provides a clear and methodical approach to essential statistical procedures. It clearly explains the basic concepts and procedures of descriptive and inferential statistical analysis. This book features a new emphasis on expressions involving sums of squares and degrees of freedom as well as a stronger stress on the importance of variability.
