Dr. Ji Son

Sampling & Bias

Table of Contents

Section 1: Introduction

Descriptive Statistics vs. Inferential Statistics

25m 31s

Intro

0:00

Roadmap

0:10

Roadmap

0:11

Statistics

0:35

Statistics

0:36

Let's Think About High School Science

1:12

Measurement and Find Patterns (Mathematical Formula)

1:13

Statistics = Math of Distributions

4:58

Distributions

4:59

Problematic… but also GREAT

5:58

Statistics

7:33

How is It Different from Other Specializations in Mathematics?

7:34

Statistics is Fundamental in Natural and Social Sciences

7:53

Two Skills of Statistics

8:20

Description (Exploration)

8:21

Inference

9:13

Descriptive Statistics vs. Inferential Statistics: Apply to Distributions

9:58

Descriptive Statistics

9:59

Inferential Statistics

11:05

Populations vs. Samples

12:19

Populations vs. Samples: Is it the Truth?

12:20

Populations vs. Samples: Pros & Cons

13:36

Populations vs. Samples: Descriptive Values

16:12

Putting Together Descriptive/Inferential Stats & Populations/Samples

17:10

Putting Together Descriptive/Inferential Stats & Populations/Samples

17:11

Example 1: Descriptive Statistics vs. Inferential Statistics

19:09

Example 2: Descriptive Statistics vs. Inferential Statistics

20:47

Example 3: Sample, Parameter, Population, and Statistic

21:40

Example 4: Sample, Parameter, Population, and Statistic

23:28

Section 2: About Samples: Cases, Variables, Measurements

About Samples: Cases, Variables, Measurements

32m 14s

Intro

0:00

Data

0:09

Data, Cases, Variables, and Values

0:10

Rows, Columns, and Cells

2:03

Example: Aircraft

3:52

How Do We Get Data?

5:38

Research: Question and Hypothesis

5:39

Research Design

7:11

Measurement

7:29

Research Analysis

8:33

Research Conclusion

9:30

Types of Variables

10:03

Discrete Variables

10:04

Continuous Variables

12:07

Types of Measurements

14:17

Types of Measurements

14:18

Types of Measurements (Scales)

17:22

Nominal

17:23

Ordinal

19:11

Interval

21:33

Ratio

24:24

Example 1: Cases, Variables, Measurements

25:20

Example 2: Which Scale of Measurement is Used?

26:55

Example 3: What Kind of a Scale of Measurement is This?

27:26

Example 4: Discrete vs. Continuous Variables

30:31

Section 3: Visualizing Distributions

Introduction to Excel

8m 9s

Intro

0:00

Before Visualizing Distribution

0:10

Excel

0:11

Excel: Organization

0:45

Workbook

0:46

Column x Rows

1:50

Tools: Menu Bar, Standard Toolbar, and Formula Bar

3:00

Excel + Data

6:07

Excel and Data

6:08

Frequency Distributions in Excel

39m 10s

Intro

0:00

Roadmap

0:08

Data in Excel and Frequency Distributions

0:09

Raw Data to Frequency Tables

0:42

Raw Data to Frequency Tables

0:43

Frequency Tables: Using Formulas and Pivot Tables

1:28

Example 1: Number of Births

7:17

Example 2: Age Distribution

20:41

Example 3: Height Distribution

27:45

Example 4: Height Distribution of Males

32:19

Frequency Distributions and Features

25m 29s

Intro

0:00

Roadmap

0:10

Data in Excel, Frequency Distributions, and Features of Frequency Distributions

0:11

Example #1

1:35

Uniform

1:36

Example #2

2:58

Unimodal, Skewed Right, and Asymmetric

2:59

Example #3

6:29

Bimodal

6:30

Example #4a

8:29

Symmetric, Unimodal, and Normal

8:30

Point of Inflection and Standard Deviation

11:13

Example #4b

12:43

Normal Distribution

12:44

Summary

13:56

Uniform, Skewed, Bimodal, and Normal

13:57

Sketch Problem 1: Driver's License

17:34

Sketch Problem 2: Life Expectancy

20:01

Sketch Problem 3: Telephone Numbers

22:01

Sketch Problem 4: Length of Time Used to Complete a Final Exam

23:43

Dotplots and Histograms in Excel

42m 42s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Previously

1:02

Data, Frequency Table, and Visualization

1:03

Dotplots

1:22

Dotplots Excel Example

1:23

Dotplots: Pros and Cons

7:22

Pros and Cons of Dotplots

7:23

Dotplots Excel Example Cont.

9:07

Histograms

12:47

Histograms Overview

12:48

Example of Histograms

15:29

Histograms: Pros and Cons

31:39

Pros

31:40

Cons

32:31

Frequency vs. Relative Frequency

32:53

Frequency

32:54

Relative Frequency

33:36

Example 1: Dotplots vs. Histograms

34:36

Example 2: Age of Pennies Dotplot

36:21

Example 3: Histogram of Mammal Speeds

38:27

Example 4: Histogram of Life Expectancy

40:30

Stemplots

12m 23s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

What Sets Stemplots Apart?

0:46

Data Sets, Dotplots, Histograms, and Stemplots

0:47

Example 1: What Do Stemplots Look Like?

1:58

Example 2: Back-to-Back Stemplots

5:00

Example 3: Quiz Grade Stemplot

7:46

Example 4: Quiz Grade & Afterschool Tutoring Stemplot

9:56

Bar Graphs

22m 49s

Intro

0:00

Roadmap

0:05

Roadmap

0:08

Review of Frequency Distributions

0:44

Y-axis and X-axis

0:45

Types of Frequency Visualizations Covered so Far

2:16

Introduction to Bar Graphs

4:07

Example 1: Bar Graph

5:32

Example 1: Bar Graph

5:33

Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?

11:07

Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?

11:08

Example 2: Create a Frequency Visualization for Gender

14:02

Example 3: Cases, Variables, and Frequency Visualization

16:34

Example 4: What Kind of Graphs are Shown Below?

19:29

Section 4: Summarizing Distributions

Central Tendency: Mean, Median, Mode

38m 50s

Intro

0:00

Roadmap

0:07

Roadmap

0:08

Central Tendency 1

0:56

Way to Summarize a Distribution of Scores

0:57

Mode

1:32

Median

2:02

Mean

2:36

Central Tendency 2

3:47

Mode

3:48

Median

4:20

Mean

5:25

Summation Symbol

6:11

Summation Symbol

6:12

Population vs. Sample

10:46

Population vs. Sample

10:47

Excel Examples

15:08

Finding Mode, Median, and Mean in Excel

15:09

Median vs. Mean

21:45

Effect of Outliers

21:46

Relationship Between Parameter and Statistic

22:44

Type of Measurements

24:00

Which Distributions to Use With

24:55

Example 1: Mean

25:30

Example 2: Using Summation Symbol

29:50

Example 3: Average Calorie Count

32:50

Example 4: Creating an Example Set

35:46

Variability

42m 40s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Variability (or Spread)

0:45

Variability (or Spread)

0:46

Things to Think About

5:45

Things to Think About

5:46

Range, Quartiles and Interquartile Range

6:37

Range

6:38

Interquartile Range

8:42

Interquartile Range Example

10:58

Interquartile Range Example

10:59

Variance and Standard Deviation

12:27

Deviations

12:28

Sum of Squares

14:35

Variance

16:55

Standard Deviation

17:44

Sum of Squares (SS)

18:34

Sum of Squares (SS)

18:35

Population vs. Sample SD

22:00

Population vs. Sample SD

22:01

Population vs. Sample

23:20

Mean

23:21

23:51

Example 1: Find the Mean and Standard Deviation of the Variable Friends in the Excel File

27:21

Example 2: Find the Mean and Standard Deviation of the Tagged Photos in the Excel File

35:25

Example 3: Sum of Squares

38:58

Example 4: Standard Deviation

41:48

Five Number Summary & Boxplots

57m 15s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Summarizing Distributions

0:37

Shape, Center, and Spread

0:38

5 Number Summary

1:14

Boxplot: Visualizing 5 Number Summary

3:37

Boxplot: Visualizing 5 Number Summary

3:38

Boxplots on Excel

9:01

Using 'Stocks' and Using Stacked Columns

9:02

Boxplots on Excel Example

10:14

When are Boxplots Useful?

32:14

Pros

32:15

Cons

32:59

How to Determine Outlier Status

33:24

Rule of Thumb: Upper Limit

33:25

Rule of Thumb: Lower Limit

34:16

Signal Outliers in an Excel Data File Using Conditional Formatting

34:52

Modified Boxplot

48:38

Modified Boxplot

48:39

Example 1: Percentage Values & Lower and Upper Whisker

49:10

Example 2: Boxplot

50:10

Example 3: Estimating IQR From Boxplot

53:46

Example 4: Boxplot and Missing Whisker

54:35

Shape: Calculating Skewness & Kurtosis

41m 51s

Intro

0:00

Roadmap

0:16

Roadmap

0:17

Skewness Concept

1:09

Skewness Concept

1:10

Calculating Skewness

3:26

Calculating Skewness

3:27

Interpreting Skewness

7:36

Interpreting Skewness

7:37

Excel Example

8:49

Kurtosis Concept

20:29

Kurtosis Concept

20:30

Calculating Kurtosis

24:17

Calculating Kurtosis

24:18

Interpreting Kurtosis

29:01

Leptokurtic

29:35

Mesokurtic

30:10

Platykurtic

31:06

Excel Example

32:04

Example 1: Shape of Distribution

38:28

Example 2: Shape of Distribution

39:29

Example 3: Shape of Distribution

40:14

Example 4: Kurtosis

41:10

Normal Distribution

34m 33s

Intro

0:00

Roadmap

0:13

Roadmap

0:14

What is a Normal Distribution

0:44

The Normal Distribution As a Theoretical Model

0:45

Possible Range of Probabilities

3:05

Possible Range of Probabilities

3:06

What is a Normal Distribution

5:07

Can Be Described By

5:08

Properties

5:49

'Same' Shape: Illusion of Different Shape!

7:35

'Same' Shape: Illusion of Different Shape!

7:36

Types of Problems

13:45

Example: Distribution of SAT Scores

13:46

Shape Analogy

19:48

Shape Analogy

19:49

Example 1: The Standard Normal Distribution and Z-Scores

22:34

Example 2: The Standard Normal Distribution and Z-Scores

25:54

Example 3: Sketching and Normal Distribution

28:55

Example 4: Sketching and Normal Distribution

32:32

Standard Normal Distributions & Z-Scores

41m 44s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

A Family of Distributions

0:28

Infinite Set of Distributions

0:29

Transforming Normal Distributions to 'Standard' Normal Distribution

1:04

Normal Distribution vs. Standard Normal Distribution

2:58

Normal Distribution vs. Standard Normal Distribution

2:59

Z-Score, Raw Score, Mean, & SD

4:08

Z-Score, Raw Score, Mean, & SD

4:09

Weird Z-Scores

9:40

Weird Z-Scores

9:41

Excel

16:45

For Normal Distributions

16:46

For Standard Normal Distributions

19:11

Excel Example

20:24

Types of Problems

25:18

Percentage Problem: P(x)

25:19

Raw Score and Z-Score Problems

26:28

Standard Deviation Problems

27:01

Shape Analogy

27:44

Shape Analogy

27:45

Example 1: Deaths Due to Heart Disease vs. Deaths Due to Cancer

28:24

Example 2: Heights of Male College Students

33:15

Example 3: Mean and Standard Deviation

37:14

Example 4: Finding Percentage of Values in a Standard Normal Distribution

37:49

Normal Distribution: PDF vs. CDF

55m 44s

Intro

0:00

Roadmap

0:15

Roadmap

0:16

Frequency vs. Cumulative Frequency

0:56

Frequency vs. Cumulative Frequency

0:57

Frequency vs. Cumulative Frequency

4:32

Frequency vs. Cumulative Frequency Cont.

4:33

Calculus in Brief

6:21

Derivative-Integral Continuum

6:22

PDF

10:08

PDF for Standard Normal Distribution

10:09

PDF for Normal Distribution

14:32

Integral of PDF = CDF

21:27

Integral of PDF = CDF

21:28

Example 1: Cumulative Frequency Graph

23:31

Example 2: Mean, Standard Deviation, and Probability

24:43

Example 3: Mean and Standard Deviation

35:50

Example 4: Age of Cars

49:32

Section 5: Linear Regression

Scatterplots

47m 19s

Intro

0:00

Roadmap

0:04

Roadmap

0:05

Previous Visualizations

0:30

Frequency Distributions

0:31

Compare & Contrast

2:26

Frequency Distributions Vs. Scatterplots

2:27

Summary Values

4:53

Shape

4:54

Center & Trend

6:41

Spread & Strength

8:22

Univariate & Bivariate

10:25

Example Scatterplot

10:48

Shape, Trend, and Strength

10:49

Positive and Negative Association

14:05

Positive and Negative Association

14:06

Linearity, Strength, and Consistency

18:30

Linearity

18:31

Strength

19:14

Consistency

20:40

Summarizing a Scatterplot

22:58

Summarizing a Scatterplot

22:59

Example 1: Gapminder.org, Income x Life Expectancy

26:32

Example 2: Gapminder.org, Income x Infant Mortality

36:12

Example 3: Trend and Strength of Variables

40:14

Example 4: Trend, Strength and Shape for Scatterplots

43:27

Regression

32m 2s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Linear Equations

0:34

Linear Equations: y = mx + b

0:35

Rough Line

5:16

Rough Line

5:17

Regression - A 'Center' Line

7:41

Reasons for Summarizing with a Regression Line

7:42

Predictor and Response Variable

10:04

Goal of Regression

12:29

Goal of Regression

12:30

Prediction

14:50

Example: Servings of Milk Per Year Shown By Age

14:51

Interpolation

17:06

Extrapolation

17:58

Error in Prediction

20:34

Prediction Error

20:35

Residual

21:40

Example 1: Residual

23:34

Example 2: Large and Negative Residual

26:30

Example 3: Positive Residual

28:13

Example 4: Interpret Regression Line & Extrapolate

29:40

Least Squares Regression

56m 36s

Intro

0:00

Roadmap

0:13

Roadmap

0:14

Best Fit

0:47

Best Fit

0:48

Sum of Squared Errors (SSE)

1:50

Sum of Squared Errors (SSE)

1:51

Why Squared?

3:38

Why Squared?

3:39

Quantitative Properties of Regression Line

4:51

Quantitative Properties of Regression Line

4:52

So How do we Find Such a Line?

6:49

SSEs of Different Line Equations & Lowest SSE

6:50

Carl Gauss' Method

8:01

How Do We Find Slope (b1)

11:00

How Do We Find Slope (b1)

11:01

How Do We Find Intercept

15:11

How Do We Find Intercept

15:12

Example 1: Which of These Equations Fit the Above Data Best?

17:18

Example 2: Find the Regression Line for These Data Points and Interpret It

26:31

Example 3: Summarize the Scatterplot and Find the Regression Line

34:31

Example 4: Examine the Mean of Residuals

43:52

Correlation

43m 58s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Summarizing a Scatterplot Quantitatively

0:47

Shape

0:48

Trend

1:11

Strength: Correlation (r)

1:45

Correlation Coefficient (r)

2:30

Correlation Coefficient (r)

2:31

Trees vs. Forest

11:59

Trees vs. Forest

12:00

Calculating r

15:07

Average Product of z-scores for x and y

15:08

Relationship between Correlation and Slope

21:10

Relationship between Correlation and Slope

21:11

Example 1: Find the Correlation between Grams of Fat and Cost

24:11

Example 2: Relationship between r and b1

30:24

Example 3: Find the Regression Line

33:35

Example 4: Find the Correlation Coefficient for this Set of Data

37:37

Correlation: r vs. r-squared

52m 52s

Intro

0:00

Roadmap

0:07

Roadmap

0:08

R-squared

0:44

What is the Meaning of It? Why Squared?

0:45

Parsing Sum of Squares (Parsing Variability)

2:25

SST = SSR + SSE

2:26

What is SST and SSE?

7:46

What is SST and SSE?

7:47

r-squared

18:33

Coefficient of Determination

18:34

If the Correlation is Strong…

20:25

If the Correlation is Strong…

20:26

If the Correlation is Weak…

22:36

If the Correlation is Weak…

22:37

Example 1: Find r-squared for this Set of Data

23:56

Example 2: What Does it Mean that the Simple Linear Regression is a 'Model' of Variance?

33:54

Example 3: Why Does r-squared Only Range from 0 to 1

37:29

Example 4: Find the r-squared for This Set of Data

39:55

Transformations of Data

27m 8s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Why Transform?

0:26

Why Transform?

0:27

Shape-preserving vs. Shape-changing Transformations

5:14

Shape-preserving = Linear Transformations

5:15

Shape-changing Transformations = Non-linear Transformations

6:20

Common Shape-Preserving Transformations

7:08

Common Shape-Preserving Transformations

7:09

Common Shape-Changing Transformations

8:59

Powers

9:00

Logarithms

9:39

Change Just One Variable? Both?

10:38

Log-log Transformations

10:39

Log Transformations

14:38

Example 1: Create, Graph, and Transform the Data Set

15:19

Example 2: Create, Graph, and Transform the Data Set

20:08

Example 3: What Kind of Model would You Choose for this Data?

22:44

Example 4: Transformation of Data

25:46

Section 6: Collecting Data in an Experiment

Sampling & Bias

54m 44s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Descriptive vs. Inferential Statistics

1:04

Descriptive Statistics: Data Exploration

1:05

Example

2:03

To Tackle Generalization…

4:31

Generalization

4:32

Sampling

6:06

'Good' Sample

6:40

Defining Samples and Populations

8:55

Population

8:56

Sample

11:16

Why Use Sampling?

13:09

Why Use Sampling?

13:10

Goal of Sampling: Avoiding Bias

15:04

What is Bias?

15:05

Where does Bias Come from: Sampling Bias

17:53

Where does Bias Come from: Response Bias

18:27

Sampling Bias: Bias from 'Bad' Sampling Methods

19:34

Size Bias

19:35

Voluntary Response Bias

21:13

Convenience Sample

22:22

Judgment Sample

23:58

Inadequate Sample Frame

25:40

Response Bias: Bias from 'Bad' Data Collection Methods

28:00

Nonresponse Bias

29:31

Questionnaire Bias

31:10

Incorrect Response or Measurement Bias

37:32

Example 1: What Kind of Biases?

40:29

Example 2: What Biases Might Arise?

44:46

Example 3: What Kind of Biases?

48:34

Example 4: What Kind of Biases?

51:43

Sampling Methods

14m 25s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Biased vs. Unbiased Sampling Methods

0:32

Biased Sampling

0:33

Unbiased Sampling

1:13

Probability Sampling Methods

2:31

Simple Random

2:54

Stratified Random Sampling

4:06

Cluster Sampling

5:24

Two-stage Sampling

6:22

Systematic Sampling

7:25

Example 1: Which Type(s) of Sampling was this?

8:33

Example 2: Describe How to Take a Two-Stage Sample from this Book

10:16

Example 3: Sampling Methods

11:58

Example 4: Cluster Sample Plan

12:48

Research Design

53m 54s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Descriptive vs. Inferential Statistics

0:51

Descriptive Statistics: Data Exploration

0:52

Inferential Statistics

1:02

Variables and Relationships

1:44

Variables

1:45

Relationships

2:49

Not Every Type of Study is an Experiment…

4:16

Category I - Descriptive Study

4:54

Category II - Correlational Study

5:50

Category III - Experimental, Quasi-experimental, Non-experimental

6:33

Category III

7:42

Experimental, Quasi-experimental, and Non-experimental

7:43

Why CAN'T the Other Strategies Determine Causation?

10:18

Third-variable Problem

10:19

Directionality Problem

15:49

What Makes Experiments Special?

17:54

Manipulation

17:55

Control (and Comparison)

21:58

Methods of Control

26:38

Holding Constant

26:39

Matching

29:11

Random Assignment

31:48

Experiment Terminology

34:09

'True' Experiment vs. Study

34:10

Independent Variable (IV)

35:16

Dependent Variable (DV)

35:45

Factors

36:07

Treatment Conditions

36:23

Levels

37:43

Confounds or Extraneous Variables

38:04

Blind

38:38

Blind Experiments

38:39

Double-blind Experiments

39:29

How Categories Relate to Statistics

41:35

Category I - Descriptive Study

41:36

Category II - Correlational Study

42:05

Category III - Experimental, Quasi-experimental, Non-experimental

42:43

Example 1: Research Design

43:50

Example 2: Research Design

47:37

Example 3: Research Design

50:12

Example 4: Research Design

52:00

Between and Within Treatment Variability

41m 31s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Experimental Designs

0:51

Experimental Designs: Manipulation & Control

0:52

Two Types of Variability

2:09

Between Treatment Variability

2:10

Within Treatment Variability

3:31

Updated Goal of Experimental Design

5:47

Updated Goal of Experimental Design

5:48

Example: Drugs and Driving

6:56

Example: Drugs and Driving

6:57

Different Types of Random Assignment

11:27

All Experiments

11:28

Completely Random Design

12:02

Randomized Block Design

13:19

Randomized Block Design

15:48

Matched Pairs Design

15:49

Repeated Measures Design

19:47

Between-subject Variable vs. Within-subject Variable

22:43

Completely Randomized Design

22:44

Repeated Measures Design

25:03

Example 1: Design a Completely Random, Matched Pair, and Repeated Measures Experiment

26:16

Example 2: Block Design

31:41

Example 3: Completely Randomized Designs

35:11

Example 4: Completely Random, Matched Pairs, or Repeated Measures Experiments?

39:01

Section 7: Review of Probability Axioms

Sample Spaces

37m 52s

Intro

0:00

Roadmap

0:07

Roadmap

0:08

Why is Probability Involved in Statistics

0:48

Probability

0:49

Can People Tell the Difference between Cheap and Gourmet Coffee?

2:08

Taste Test with Coffee Drinkers

3:37

If No One can Actually Taste the Difference

3:38

If Everyone can Actually Taste the Difference

5:36

Creating a Probability Model

7:09

Creating a Probability Model

7:10

D'Alembert vs. Necker

9:41

D'Alembert vs. Necker

9:42

Problem with D'Alembert's Model

13:29

Problem with D'Alembert's Model

13:30

Covering Entire Sample Space

15:08

Fundamental Principle of Counting

15:09

Where Do Probabilities Come From?

22:54

Observed Data, Symmetry, and Subjective Estimates

22:55

Checking whether Model Matches Real World

24:27

Law of Large Numbers

24:28

Example 1: Law of Large Numbers

27:46

Example 2: Possible Outcomes

30:43

Example 3: Brands of Coffee and Taste

33:25

Example 4: How Many Different Treatments are there?

35:33

Addition Rule for Disjoint Events

20m 29s

Intro

0:00

Roadmap

0:08

Roadmap

0:09

Disjoint Events

0:41

Disjoint Events

0:42

Meaning of 'or'

2:39

In Regular Life

2:40

In Math/Statistics/Computer Science

3:10

Addition Rule for Disjoint Events

3:55

If A and B are Disjoint: P (A and B)

3:56

If A and B are Disjoint: P (A or B)

5:15

General Addition Rule

5:41

General Addition Rule

5:42

Generalized Addition Rule

8:31

If A and B are not Disjoint: P (A or B)

8:32

Example 1: Which of These are Mutually Exclusive?

10:50

Example 2: What is the Probability that You will Have a Combination of One Heads and Two Tails?

12:57

Example 3: Engagement Party

15:17

Example 4: Home Owner's Insurance

18:30

Conditional Probability

57m 19s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

'or' vs. 'and' vs. Conditional Probability

1:07

'or' vs. 'and' vs. Conditional Probability

1:08

'and' vs. Conditional Probability

5:57

P (M or L)

5:58

P (M and L)

8:41

P (M|L)

11:04

P (L|M)

12:24

Tree Diagram

15:02

Tree Diagram

15:03

Defining Conditional Probability

22:42

Defining Conditional Probability

22:43

Common Contexts for Conditional Probability

30:56

Medical Testing: Positive Predictive Value

30:57

Medical Testing: Sensitivity

33:03

Statistical Tests

34:27

Example 1: Drug and Disease

36:41

Example 2: Marbles and Conditional Probability

40:04

Example 3: Cards and Conditional Probability

45:59

Example 4: Votes and Conditional Probability

50:21

Independent Events

24m 27s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Independent Events & Conditional Probability

0:26

Non-independent Events

0:27

Independent Events

2:00

Non-independent and Independent Events

3:08

Non-independent and Independent Events

3:09

Defining Independent Events

5:52

Defining Independent Events

5:53

Multiplication Rule

7:29

Previously…

7:30

But with Independent Events

8:53

Example 1: Which of These Pairs of Events are Independent?

11:12

Example 2: Health Insurance and Probability

15:12

Example 3: Independent Events

17:42

Example 4: Independent Events

20:03

Section 8: Probability Distributions

Introduction to Probability Distributions

56m 45s

Intro

0:00

Roadmap

0:08

Roadmap

0:09

Sampling vs. Probability

0:57

Sampling

0:58

Missing

1:30

What is Missing?

3:06

Insight: Probability Distributions

5:26

Insight: Probability Distributions

5:27

What is a Probability Distribution?

7:29

From Sample Spaces to Probability Distributions

8:44

Sample Space

8:45

Probability Distribution of the Sum of Two Dice

11:16

The Random Variable

17:43

The Random Variable

17:44

Expected Value

21:52

Expected Value

21:53

Example 1: Probability Distributions

28:45

Example 2: Probability Distributions

35:30

Example 3: Probability Distributions

43:37

Example 4: Probability Distributions

47:20

Expected Value & Variance of Probability Distributions

53m 41s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Discrete vs. Continuous Random Variables

1:04

Discrete vs. Continuous Random Variables

1:05

Mean and Variance Review

4:44

Mean: Sample, Population, and Probability Distribution

4:45

Variance: Sample, Population, and Probability Distribution

9:12

Example Situation

14:10

Example Situation

14:11

Some Special Cases…

16:13

Some Special Cases…

16:14

Linear Transformations

19:22

Linear Transformations

19:23

What Happens to Mean and Variance of the Probability Distribution?

20:12

n Independent Values of X

25:38

n Independent Values of X

25:39

Compare These Two Situations

30:56

Compare These Two Situations

30:57

Two Random Variables, X and Y

32:02

Two Random Variables, X and Y

32:03

Example 1: Expected Value & Variance of Probability Distributions

35:35

Example 2: Expected Values & Standard Deviation

44:17

Example 3: Expected Winnings and Standard Deviation

48:18

Binomial Distribution

55m 15s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Discrete Probability Distributions

1:42

Discrete Probability Distributions

1:43

Binomial Distribution

2:36

Binomial Distribution

2:37

Multiplicative Rule Review

6:54

Multiplicative Rule Review

6:55

How Many Outcomes with k 'Successes'

10:23

Adults and Bachelor's Degree: Manual List of Outcomes

10:24

P (X=k)

19:37

Putting Together # of Outcomes with the Multiplicative Rule

19:38

Expected Value and Standard Deviation in a Binomial Distribution

25:22

Expected Value and Standard Deviation in a Binomial Distribution

25:23

Example 1: Coin Toss

33:42

Example 2: College Graduates

38:03

Example 3: Types of Blood and Probability

45:39

Example 4: Expected Number and Standard Deviation

51:11

Section 9: Sampling Distributions of Statistics

Introduction to Sampling Distributions

48m 17s

Intro

0:00

Roadmap

0:08

Roadmap

0:09

Probability Distributions vs. Sampling Distributions

0:55

Probability Distributions vs. Sampling Distributions

0:56

Same Logic

3:55

Logic of Probability Distribution

3:56

Example: Rolling Two Dice

6:56

Simulating Samples

9:53

To Come Up with Probability Distributions

9:54

In Sampling Distributions

11:12

Connecting Sampling and Research Methods with Sampling Distributions

12:11

Connecting Sampling and Research Methods with Sampling Distributions

12:12

Simulating a Sampling Distribution

14:14

Experimental Design: Regular Sleep vs. Less Sleep

14:15

Logic of Sampling Distributions

23:08

Logic of Sampling Distributions

23:09

General Method of Simulating Sampling Distributions

25:38

General Method of Simulating Sampling Distributions

25:39

Questions that Remain

28:45

Questions that Remain

28:46

Example 1: Mean and Standard Error of Sampling Distribution

30:57

Example 2: What is the Best Way to Describe Sampling Distributions?

37:12

Example 3: Matching Sampling Distributions

38:21

Example 4: Mean and Standard Error of Sampling Distribution

41:51

Sampling Distribution of the Mean

1h 8m 48s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Special Case of General Method for Simulating a Sampling Distribution

1:53

Special Case of General Method for Simulating a Sampling Distribution

1:54

Computer Simulation

3:43

Using Simulations to See Principles behind Shape of SDoM

15:50

Using Simulations to See Principles behind Shape of SDoM

15:51

Conditions

17:38

Using Simulations to See Principles behind Center (Mean) of SDoM

20:15

Using Simulations to See Principles behind Center (Mean) of SDoM

20:16

Conditions: Does n Matter?

21:31

Conditions: Does Number of Simulation Matter?

24:37

Using Simulations to See Principles behind Standard Deviation of SDoM

27:13

Using Simulations to See Principles behind Standard Deviation of SDoM

27:14

Conditions: Does n Matter?

34:45

Conditions: Does Number of Simulation Matter?

36:24

Central Limit Theorem

37:13

SHAPE

38:08

CENTER

39:34

SPREAD

39:52

Comparing Population, Sample, and SDoM

43:10

Comparing Population, Sample, and SDoM

43:11

Answering the 'Questions that Remain'

48:24

What Happens When We Don't Know What the Population Looks Like?

48:25

Can We Have Sampling Distributions for Summary Statistics Other than the Mean?

49:42

How Do We Know whether a Sample is Sufficiently Unlikely?

53:36

Do We Always Have to Simulate a Large Number of Samples in Order to get a Sampling Distribution?

54:40

Example 1: Mean Batting Average

55:25

Example 2: Mean Sampling Distribution and Standard Error

59:07

Example 3: Sampling Distribution of the Mean

1:01:04

Sampling Distribution of Sample Proportions

54m 37s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Intro to Sampling Distribution of Sample Proportions (SDoSP)

0:51

Categorical Data (Examples)

0:52

Wish to Estimate Proportion of Population from Sample…

2:00

Notation

3:34

Population Proportion and Sample Proportion Notations

3:35

What's the Difference?

9:19

SDoM vs. SDoSP: Type of Data

9:20

SDoM vs. SDoSP: Shape

11:24

SDoM vs. SDoSP: Center

12:30

SDoM vs. SDoSP: Spread

15:34

Binomial Distribution vs. Sampling Distribution of Sample Proportions

19:14

Binomial Distribution vs. SDoSP: Type of Data

19:17

Binomial Distribution vs. SDoSP: Shape

21:07

Binomial Distribution vs. SDoSP: Center

21:43

Binomial Distribution vs. SDoSP: Spread

24:08

Example 1: Sampling Distribution of Sample Proportions

26:07

Example 2: Sampling Distribution of Sample Proportions

37:58

Example 3: Sampling Distribution of Sample Proportions

44:42

Example 4: Sampling Distribution of Sample Proportions

45:57

Section 10: Inferential Statistics

Introduction to Confidence Intervals

42m 53s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Inferential Statistics

0:50

Inferential Statistics

0:51

Two Problems with This Picture…

3:20

Two Problems with This Picture…

3:21

Solution: Confidence Intervals (CI)

4:59

Solution: Hypothesis Testing (HT)

5:49

Which Parameters are Known?

6:45

Which Parameters are Known?

6:46

Confidence Interval - Goal

7:56

When We Don't Know μ but Know σ

7:57

When We Don't Know

18:27

When We Don't Know μ nor σ

18:28

Example 1: Confidence Intervals

26:18

Example 2: Confidence Intervals

29:46

Example 3: Confidence Intervals

32:18

Example 4: Confidence Intervals

38:31

t Distributions

1h 2m 6s

Intro

0:00

Roadmap

0:04

Roadmap

0:05

When to Use z vs. t?

1:07

When to Use z vs. t?

1:08

What is z and t?

3:02

z-score and t-score: Commonality

3:03

z-score and t-score: Formulas

3:34

z-score and t-score: Difference

5:22

Why not z? (Why t?)

7:24

Why not z? (Why t?)

7:25

But Don't Worry!

15:13

Gosset and t-distributions

15:14

Rules of t Distributions

17:05

t-distributions are More Normal as n Gets Bigger

17:06

t-distributions are a Family of Distributions

18:55

Degrees of Freedom (df)

20:02

Degrees of Freedom (df)

20:03

t Family of Distributions

24:07

t Family of Distributions: df = 2, 4, and 60

24:08

df = 60

29:16

df = 2

29:59

How to Find It?

31:01

'Student's t-distribution' or 't-distribution'

31:02

Excel Example

33:06

Example 1: Which Distribution Do You Use? Z or t?

45:26

Example 2: Friends on Facebook

47:41

Example 3: t Distributions

52:15

Example 4: t Distributions, Confidence Interval, and Mean

55:59

Introduction to Hypothesis Testing

1h 6m 33s

Intro

0:00

Roadmap

0:06

Roadmap

0:07

Issues to Overcome in Inferential Statistics

1:35

Issues to Overcome in Inferential Statistics

1:36

What Happens When We Don't Know What the Population Looks Like?

2:57

How Do We Know whether a Sample is Sufficiently Unlikely?

3:43

Hypothesizing a Population

6:44

Hypothesizing a Population

6:45

Null Hypothesis

8:07

Alternative Hypothesis

8:56

Hypotheses

11:58

Hypotheses

11:59

Errors in Hypothesis Testing

14:22

Errors in Hypothesis Testing

14:23

Steps of Hypothesis Testing

21:15

Steps of Hypothesis Testing

21:16

Single Sample HT (When Sigma Available)

26:08

Example: Average Facebook Friends

26:09

Step 1

27:08

Step 2

27:58

Step 3

28:17

Step 4

32:18

Single Sample HT (When Sigma Not Available)

36:33

Example: Average Facebook Friends

36:34

Step 1: Hypothesis Testing

36:58

Step 2: Significance Level

37:25

Step 3: Decision Stage

37:40

Step 4: Sample

41:36

Sigma and p-value

45:04

Sigma and p-value

45:05

One Tailed vs. Two Tailed Hypotheses

45:51

Example 1: Hypothesis Testing

48:37

Example 2: Heights of Women in the US

57:43

Example 3: Select the Best Way to Complete This Sentence

1:03:23

Confidence Intervals for the Difference of Two Independent Means

55m 14s

Intro

0:00

Roadmap

0:14

Roadmap

0:15

One Mean vs. Two Means

1:17

One Mean vs. Two Means

1:18

Notation

2:41

A Sample! A Set!

2:42

Mean of X, Mean of Y, and Difference of Two Means

3:56

SE of X

4:34

SE of Y

6:28

Sampling Distribution of the Difference between Two Means (SDoD)

7:48

Sampling Distribution of the Difference between Two Means (SDoD)

7:49

Rules of the SDoD (similar to CLT!)

15:00

Mean for the SDoD Null Hypothesis

15:01

Standard Error

17:39

When can We Construct a CI for the Difference between Two Means?

21:28

Three Conditions

21:29

Finding CI

23:56

One Mean CI

23:57

Two Means CI

25:45

Finding t

29:16

Finding t

29:17

Interpreting CI

30:25

Interpreting CI

30:26

Better Estimate of s (s pool)

34:15

Better Estimate of s (s pool)

34:16

Example 1: Confidence Intervals

42:32

Example 2: SE of the Difference

52:36

Hypothesis Testing for the Difference of Two Independent Means

50m

Intro

0:00

Roadmap

0:06

Roadmap

0:07

The Goal of Hypothesis Testing

0:56

One Sample and Two Samples

0:57

Sampling Distribution of the Difference between Two Means (SDoD)

3:42

Sampling Distribution of the Difference between Two Means (SDoD)

3:43

Rules of the SDoD (Similar to CLT!)

6:46

Shape

6:47

Mean for the Null Hypothesis

7:26

Standard Error for Independent Samples (When Variance is Homogenous)

8:18

Standard Error for Independent Samples (When Variance is not Homogenous)

9:25

Same Conditions for HT as for CI

10:08

Three Conditions

10:09

Steps of Hypothesis Testing

11:04

Steps of Hypothesis Testing

11:05

Formulas that Go with Steps of Hypothesis Testing

13:21

Step 1

13:25

Step 2

14:18

Step 3

15:00

Step 4

16:57

Example 1: Hypothesis Testing for the Difference of Two Independent Means

18:47

Example 2: Hypothesis Testing for the Difference of Two Independent Means

33:55

Example 3: Hypothesis Testing for the Difference of Two Independent Means

44:22

Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

1h 14m 11s

Intro

0:00

Roadmap

0:09

Roadmap

0:10

The Goal of Hypothesis Testing

1:27

One Sample and Two Samples

1:28

Independent Samples vs. Paired Samples

3:16

Independent Samples vs. Paired Samples

3:17

Which is Which?

5:20

Independent SAMPLES vs. Independent VARIABLES

7:43

Independent SAMPLES vs. Independent VARIABLES

7:44

T-tests Always…

10:48

T-tests Always…

10:49

Notation for Paired Samples

12:59

Notation for Paired Samples

13:00

Steps of Hypothesis Testing for Paired Samples

16:13

Steps of Hypothesis Testing for Paired Samples

16:14

Rules of the SDoD (Adding on Paired Samples)

18:03

Shape

18:04

Mean for the Null Hypothesis

18:31

Standard Error for Independent Samples (When Variance is Homogenous)

19:25

Standard Error for Paired Samples

20:39

Formulas that go with Steps of Hypothesis Testing

22:59

Formulas that go with Steps of Hypothesis Testing

23:00

Confidence Intervals for Paired Samples

30:32

Confidence Intervals for Paired Samples

30:33

Example 1: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

32:28

Example 2: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

44:02

Example 3: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

52:23

Type I and Type II Errors

31m 27s

Intro

0:00

Roadmap

0:18

Roadmap

0:19

Errors and Relationship to HT and the Sample Statistic?

1:11

Errors and Relationship to HT and the Sample Statistic?

1:12

Instead of a Box…Distributions!

7:00

One Sample t-test: Friends on Facebook

7:01

Two Sample t-test: Friends on Facebook

13:46

Usually, Lots of Overlap between Null and Alternative Distributions

16:59

Overlap between Null and Alternative Distributions

17:00

How Distributions and 'Box' Fit Together

22:45

How Distributions and 'Box' Fit Together

22:46

Example 1: Types of Errors

25:54

Example 2: Types of Errors

27:30

Example 3: What is the Danger of the Type I Error?

29:38

Effect Size & Power

44m 41s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Distance between Distributions: Sample t

0:49

Distance between Distributions: Sample t

0:50

Problem with Distance in Terms of Standard Error

2:56

Problem with Distance in Terms of Standard Error

2:57

Test Statistic (t) vs. Effect Size (d or g)

4:38

Test Statistic (t) vs. Effect Size (d or g)

4:39

Rules of Effect Size

6:09

Rules of Effect Size

6:10

Why Do We Need Effect Size?

8:21

Tells You the Practical Significance

8:22

HT can be Deceiving…

10:25

Important Note

10:42

What is Power?

11:20

What is Power?

11:21

Why Do We Need Power?

14:19

Conditional Probability and Power

14:20

Power is:

16:27

Can We Calculate Power?

19:00

Can We Calculate Power?

19:01

How Does Alpha Affect Power?

20:36

How Does Alpha Affect Power?

20:37

How Does Effect Size Affect Power?

25:38

How Does Effect Size Affect Power?

25:39

How Does Variability and Sample Size Affect Power?

27:56

How Does Variability and Sample Size Affect Power?

27:57

How Do We Increase Power?

32:47

Increasing Power

32:48

Example 1: Effect Size & Power

35:40

Example 2: Effect Size & Power

37:38

Example 3: Effect Size & Power

40:55

Section 11: Analysis of Variance

F-distributions

24m 46s

Intro

0:00

Roadmap

0:04

Roadmap

0:05

Z- & T-statistic and Their Distribution

0:34

Z- & T-statistic and Their Distribution

0:35

F-statistic

4:55

The F Ratio (the Variance Ratio)

4:56

F-distribution

12:29

F-distribution

12:30

s and p-value

15:00

s and p-value

15:01

Example 1: Why Does F-distribution Stop At 0 But Go On Until Infinity?

18:33

Example 2: F-distributions

19:29

Example 3: F-distributions and Heights

21:29

ANOVA with Independent Samples

1h 9m 25s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

The Limitations of t-tests

1:12

The Limitations of t-tests

1:13

Two Major Limitations of Many t-tests

3:26

Two Major Limitations of Many t-tests

3:27

Ronald Fisher's Solution… F-test! New Null Hypothesis

4:43

Ronald Fisher's Solution… F-test! New Null Hypothesis (Omnibus Test - One Test to Rule Them All!)

4:44

Analysis of Variance (ANoVA) Notation

7:47

Analysis of Variance (ANoVA) Notation

7:48

Partitioning (Analyzing) Variance

9:58

Total Variance

9:59

Within-group Variation

14:00

Between-group Variation

16:22

Time out: Review Variance & SS

17:05

Time out: Review Variance & SS

17:06

F-statistic

19:22

The F Ratio (the Variance Ratio)

19:23

S²bet = SSbet / dfbet

22:13

What is This?

22:14

How Many Means?

23:20

So What is the dfbet?

23:38

So What is SSbet?

24:15

S²w = SSw / dfw

26:05

What is This?

26:06

How Many Means?

27:20

So What is the dfw?

27:36

So What is SSw?

28:18

Chart of Independent Samples ANOVA

29:25

Chart of Independent Samples ANOVA

29:26

Example 1: Who Uploads More Photos: Unknown Ethnicity, Latino, Asian, Black, or White Facebook Users?

35:52

Hypotheses

35:53

Significance Level

39:40

Decision Stage

40:05

Calculate Samples' Statistic and p-Value

44:10

Reject or Fail to Reject H0

55:54

Example 2: ANOVA with Independent Samples

58:21

Repeated Measures ANOVA

1h 15m 13s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

The Limitations of t-tests

0:36

Who Uploads more Pictures and Which Photo-Type is Most Frequently Used on Facebook?

0:37

ANOVA (F-test) to the Rescue!

5:49

Omnibus Hypothesis

5:50

Analyze Variance

7:27

Independent Samples vs. Repeated Measures

9:12

Same Start

9:13

Independent Samples ANOVA

10:43

Repeated Measures ANOVA

12:00

Independent Samples ANOVA

16:00

Same Start: All the Variance Around Grand Mean

16:01

Independent Samples

16:23

Repeated Measures ANOVA

18:18

Same Start: All the Variance Around Grand Mean

18:19

Repeated Measures

18:33

Repeated Measures F-statistic

21:22

The F Ratio (The Variance Ratio)

21:23

S²bet = SSbet / dfbet

23:07

What is This?

23:08

How Many Means?

23:39

So What is the dfbet?

23:54

So What is SSbet?

24:32

S² resid = SS resid / df resid

25:46

What is This?

25:47

So What is SS resid?

26:44

So What is the df resid?

27:36

SS subj and df subj

28:11

What is This?

28:12

How Many Subject Means?

29:43

So What is df subj?

30:01

So What is SS subj?

30:09

SS total and df total

31:42

What is This?

31:43

What is the Total Number of Data Points?

32:02

So What is df total?

32:34

So What is SS total?

32:47

Chart of Repeated Measures ANOVA

33:19

Chart of Repeated Measures ANOVA: F and Between-samples Variability

33:20

Chart of Repeated Measures ANOVA: Total Variability, Within-subject (case) Variability, Residual Variability

35:50

Example 1: Which is More Prevalent on Facebook: Tagged, Uploaded, Mobile, or Profile Photos?

40:25

Hypotheses

40:26

Significance Level

41:46

Decision Stage

42:09

Calculate Samples' Statistic and p-Value

46:18

Reject or Fail to Reject H0

57:55

Example 2: Repeated Measures ANOVA

58:57

Example 3: What's the Problem with a Bunch of Tiny t-tests?

1:13:59

Section 12: Chi-square Test

Chi-Square Goodness-of-Fit Test

58m 23s

Intro

0:00

Roadmap

0:05

Roadmap

0:06

Where Does the Chi-Square Test Belong?

0:50

Where Does the Chi-Square Test Belong?

0:51

A New Twist on HT: Goodness-of-Fit

7:23

HT in General

7:24

Goodness-of-Fit HT

8:26

Hypotheses about Proportions

12:17

Null Hypothesis

12:18

Alternative Hypothesis

13:23

Example

14:38

Chi-Square Statistic

17:52

Chi-Square Statistic

17:53

Chi-Square Distributions

24:31

Chi-Square Distributions

24:32

Conditions for Chi-Square

28:58

Condition 1

28:59

Condition 2

30:20

Condition 3

30:32

Condition 4

31:47

Example 1: Chi-Square Goodness-of-Fit Test

32:23

Example 2: Chi-Square Goodness-of-Fit Test

44:34

Example 3: Which of These Statements Describe Properties of the Chi-Square Goodness-of-Fit Test?

56:06

Chi-Square Test of Homogeneity

51m 36s

Intro

0:00

Roadmap

0:09

Roadmap

0:10

Goodness-of-Fit vs. Homogeneity

1:13

Goodness-of-Fit HT

1:14

Homogeneity

2:00

Analogy

2:38

Hypotheses About Proportions

5:00

Null Hypothesis

5:01

Alternative Hypothesis

6:11

Example

6:33

Chi-Square Statistic

10:12

Same as Goodness-of-Fit Test

10:13

Set Up Data

12:28

Setting Up Data Example

12:29

Expected Frequency

16:53

Expected Frequency

16:54

Chi-Square Distributions & df

19:26

Chi-Square Distributions & df

19:27

Conditions for Test of Homogeneity

20:54

Condition 1

20:55

Condition 2

21:39

Condition 3

22:05

Condition 4

22:23

Example 1: Chi-Square Test of Homogeneity

22:52

Example 2: Chi-Square Test of Homogeneity

32:10

Section 13: Overview of Statistics

Overview of Statistics

18m 11s

Intro

0:00

Roadmap

0:07

Roadmap

0:08

The Statistical Tests (HT) We've Covered

0:28

The Statistical Tests (HT) We've Covered

0:29

Organizing the Tests We've Covered…

1:08

One Sample: Continuous DV and Categorical DV

1:09

Two Samples: Continuous DV and Categorical DV

5:41

More Than Two Samples: Continuous DV and Categorical DV

8:21

The Following Data: OK Cupid

10:10

The Following Data: OK Cupid

10:11

Example 1: Weird-MySpace-Angle Profile Photo

10:38

Example 2: Geniuses

12:30

Example 3: Promiscuous iPhone Users

13:37

Example 4: Women, Aging, and Messaging

16:07


Section 6: Collecting Data in an Experiment: Lecture 1 | 54:44 min


Sampling & Bias

Lecture Slides are screen-captured images of important points in the lecture. Students can download and print out these lecture slide images to do practice problems as well as take notes while watching the lecture.

Intro 0:00
Roadmap 0:05

Roadmap

Descriptive vs. Inferential Statistics 1:04

Descriptive Statistics: Data Exploration
Example

To tackle Generalization… 4:31

Generalization
Sampling
'Good' Sample

Defining Samples and Populations 8:55

Population
Sample

Why Use Sampling? 13:09

Why Use Sampling?

Goal of Sampling: Avoiding Bias 15:04

What is Bias?
Where does Bias Come from: Sampling Bias
Where does Bias Come from: Response Bias

Sampling Bias: Bias from Bad Sampling Methods 19:34

Size Bias
Voluntary Response Bias
Convenience Sample
Judgment Sample
Inadequate Sample Frame

Response Bias: Bias from 'Bad' Data Collection Methods 28:00

Nonresponse Bias
Questionnaire Bias
Incorrect Response or Measurement Bias

Example 1: What Kind of Biases? 40:29
Example 2: What Biases Might Arise? 44:46
Example 3: What Kind of Biases? 48:34
Example 4: What Kind of Biases? 51:43


Transcription: Sampling & Bias

Hi and welcome to www.educator.com.

Today we are going to be talking about sampling and bias.

The next couple of lectures are going to be about the ways that you collect data.

That is because we are going into inferential statistics, so we are going to be talking about the difference between descriptive statistics and inferential statistics.

After that introduction, we are going to start off with how you collect the data in order to do inferential statistics.

Because of that, we are going to be talking about the two methods involved: sampling to collect data and experiments to collect data.

The bulk of this lecture is going to be about sampling.

It is going to be about why we use sampling, the goal of sampling, which is to avoid bias, and the types of biases that might occur because of different kinds of sampling.

First let us start off with the difference between descriptive statistics and inferential statistics.

You may not have noticed, but so far what we have been doing is largely descriptive statistics.

Everything that we have been doing so far, like visualizing data, summarizing data, and getting things like the mean, median, interquartile range, and standard deviation, is just about summarizing the data.

We have looked at doing that for one variable or two variables.

We have looked at getting the mean, median, and mode and visualizing one variable at a time on bar graphs or histograms.

We also looked at things like regression and correlation and looking at scatterplots, so visualizing and summarizing two variables at a time.

Even though we can do all of that so far, there are still lots of important things we cannot do.

Here is one example of what we can say so far.

In a sample, 53 out of 100 voters will vote for candidate A. But here is what we do not know: will candidate A win the election?

How likely is that?

We can summarize our sample and we can visualize it, but we cannot draw any conclusions from it about the actual population.

We cannot do generalization.

That is what this skill or ability is called.

We cannot generalize, going from the sample to understand something about the population.

That is something we cannot do yet.

Here is another example of something that we cannot do.

A sample of 20 preschoolers who were raised with pets scored an average of 5.6 out of 10 on a biology test, but a sample of those without pets scored 4.5.

Does this mean that having pets makes children better at biology?

We have no idea.

I could summarize for you the samples of preschoolers with pets and without pets.

I could visualize that for you, and I could do that with these two variables.

Whether they were raised with pets is a categorical variable, and their score on the biology test is continuous.

I could do that, but I still cannot tell you whether having pets actually causes children to be better at biology.

This skill is to compare and find causation.

We cannot figure out causation so far, and that is really going from the data to a theory about causation.

This is something that we are also not able to do.

The number one thing we cannot do is generalization.

The number two thing we cannot do is say anything about one variable causing changes in another variable.

First, in order to tackle generalization, we are going to take on sampling.

Sampling is going to help solve this problem.

In order to take on the problem of not being able to determine causation, we are going to use experiments.

So, first, let us talk about sampling.

This is a common generalization problem.

We have a sample where 53 out of 100 voters will vote for candidate A, but the question is, will candidate A win the actual election?

Here our little sample is 100 potential voters.

We have these 100 potential voters and we know what their opinion is, but the population is every voter in the election; we want to predict their behavior, and that is the problem.

We only know about these 100 potential voters, which is a subset, and we do not know if they reflect every voter in the election.
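To make the generalization problem concrete, here is a minimal simulation sketch. It assumes a hypothetical population in which exactly half of all voters support candidate A (that figure, the function name `sample_support`, and the number of repetitions are illustrative assumptions, not part of the lecture) and counts how often a random sample of 100 still shows 53 or more supporters.

```python
import random

def sample_support(population_support, sample_size, rng):
    """Count supporters in one random sample of voters."""
    return sum(rng.random() < population_support for _ in range(sample_size))

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: in the whole population, exactly 50% support candidate A.
counts = [sample_support(0.50, 100, rng) for _ in range(10_000)]

# How often does a sample of 100 show 53 or more supporters anyway?
share_53_or_more = sum(c >= 53 for c in counts) / len(counts)
print(f"Samples with 53+ supporters out of 100: {share_53_or_more:.1%}")
```

If a sizable share of samples reach 53 even when the population is evenly split, then a single sample of 53 out of 100 cannot by itself tell us who will win.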

And so what we are really looking for is a way to get a sample of this population that actually represents this population well.

Sampling is the method of choosing the items or cases in your study.

And so that is the whole idea behind sampling.

It is a method of choosing your particular sample.

Here is the thing: we want to select a good sample, so what is a good sample?

A good sample is a representative sample, and representative means that the sample accurately reflects the population.

It is hard to guarantee a good sample because there is just no way I could ever really know, from looking at a tiny subset, whether it actually reflects the population.

The only way I could really know is if I knew everything about the population; then I could compare it to my samples.

Then I could tell whether my samples are representative, but that is exactly the problem: we often cannot get the whole population.

How do I know if my sample is any good?

The bad news is we cannot really know, but the good news is we can judge a good sample by a good sampling method.

We are not going to judge your recipe by how good the food actually turns out.

We are going to judge the recipe by looking at how good the recipe or the techniques actually are.

In the same way, we do not judge a sample by actually knowing how representative it is; we think it is a good sample if it came from a good sampling method.

And by good, we mean unbiased.

That is what we are really looking for.

Ultimately, our goal is a representative sample, but in order to get there the only tool we have at hand is an unbiased sampling method.

And so as long as you have an unbiased sampling method, you are as safe as you can be, because there is no other way to check whether that sample is actually representative or not.

Before we go on, let us take a moment to define samples and populations.

A population is the entire universe of cases that you are interested in, the entire set of cases you are interested in.

Sometimes populations are manageable, for instance, if your population is all the people in your school.

So it is a manageable size.

Maybe your school has 10,000 people, which is a lot, but it is still manageable and very well defined.

It is easy to say all your students, or maybe you have a company and your population is all your employees.

That is another population: your employees.

That is sort of smaller and manageable, but oftentimes populations are quite large and unmanageable and even undefinable.

For instance, Coca-Cola might be interested in potential customers.

It is really hard to say who the potential customers of Coca-Cola are.

They might be as diverse as someone in India or a businessman in New York City.

People in a McDonald's might be customers or potential customers for Coca-Cola.

The entire set of cases is whatever you are actually interested in.

Oftentimes in psychological, sociological, or other social science studies, the entire set of cases you are interested in is something like all adolescents, all American adolescents, or all adults.

Those are some pretty wide-ranging populations that are very, very difficult to define and pinpoint.

So it would be impossible to collect data on all of those people.

Now the sample is drawn from that population.

It is a subset of the population.

The sample is a subset of the population; not only that, but the sample is also the people who are actually in your study, or the cases that are actually in your study.

It does not have to be people.

For instance, it could be pills made in your factory.

You are not going to look at every single pill made in your factory, which would be the population, but maybe you will take a sample.

You will take one bottle from a case and that will be your sample.

The sample is the subset that you collect data on.

These are presumably the ones in your study or data set.

That is the difference between populations and samples.

Collecting data from an entire population is very rare, but it does sometimes happen, and that is called a census.

That is when you collect data from every single case that you are interested in.

That is very, very thorough.

That is often not what we do.

Instead we use sampling methods and collect data on a sample.

Why use sampling?

Why not always do a census?

Well, a census is very, very difficult.

If your population is manageable, for instance, let us say you go to a high school and your high school has 1,000 people in it, it might not be impossible to collect data on everybody.

But usually, for something like wanting to know about adolescents living in the United States, doing a census is very, very hard.

Obviously sampling can save you money and time.

Sampling also makes it possible to collect more information on each individual case.

Basically the idea there is that when you collect census information, you can usually only collect very little or shallow information about every case you are interested in.

But if you do sampling, presumably you can go in depth with each case.

Finally, testing or collecting data will sometimes destroy the item of interest.

For instance, Ben & Jerry's might want to use sampling to maintain quality control of their ice cream products.

In order to do that, one of the things they do is cut the entire carton in half to make sure that the ice cream is mixed evenly, that the ice cream on top tastes the same as the ice cream on the bottom, and that the flavors are consistent all the way through.

In order to do that, you have to destroy a pint of ice cream, and that pint of ice cream cannot be sold.

If you did a census, you would destroy every single one of your products, right?

You do not want to do a census in order to do quality control.

Let us talk about the goal of sampling.

The big goal whenever you are trying to do sampling is to avoid bias.

The whole reason is we want our sample to be representative of our population.

Bias means that our sample is not representative, that somehow our sample is giving us values that are too small or too large or skewed in some way away from the population.

Maybe the population mean is 5, but our sampling method gives us samples that are consistently too low or maybe too high.

What we want is a sampling method that gives us samples spread evenly around the population value.

Bias is getting samples that are consistently too low or too high for the population.

Just because a sampling method results in a sample that is a little bit different from the population does not mean it is a bad sampling method.

If it consistently gives you bad samples, that is a bad sampling method.

That is bias.

But because of chance, whenever we draw from a population randomly, we will get some spread around the mean.

What we want from a good sampling method is that the samples it produces reflect the actual distribution of the population.

For instance, let us say we had this box of marbles, and it is an even mix of red and white.

But let us say we have a sampling method that only takes from this side.

What we really want is a sampling method that goes around randomly and takes samples from everywhere instead of just from one side.

That is what we mean by bias: the actual method itself is biased.
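The difference between chance spread and bias can be sketched with a small simulation of the marble box. Everything specific here is an illustrative assumption: the box holds 500 red and 500 white marbles, it has not been stirred well so most reds sit on the left, and the helper names are hypothetical.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Assumption: an even mix overall (500 red, 500 white), but the box is
# poorly stirred, so the left half holds most of the red marbles.
left = ["red"] * 400 + ["white"] * 100
right = ["red"] * 100 + ["white"] * 400
rng.shuffle(left)
rng.shuffle(right)
box = left + right

def red_share(marbles):
    """Proportion of red marbles in a sample."""
    return sum(m == "red" for m in marbles) / len(marbles)

def sample_from_left(n):
    """Biased method: only ever reach into the left side."""
    return rng.sample(left, n)

def sample_from_everywhere(n):
    """Unbiased method: every marble has the same chance."""
    return rng.sample(box, n)

# Repeat each method many times and record the red share of each sample.
one_side = [red_share(sample_from_left(50)) for _ in range(1000)]
everywhere = [red_share(sample_from_everywhere(50)) for _ in range(1000)]

print("True red share in the box:", red_share(box))
print("Average from one side only:", round(statistics.mean(one_side), 3))
print("Average from everywhere:", round(statistics.mean(everywhere), 3))
print("Range from everywhere:", min(everywhere), "to", max(everywhere))
```

Individual random samples still land above or below the true share, which is the chance spread; only the one-sided method is off in the same direction again and again.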

Where does bias come from?

Well, the one that I mentioned is that bias can come from the actual sampling method.

What we mean by that is that your actual method of picking the sample is somehow biased.

I will give you some examples in the next slide.

The other way that you could get bias is actually sort of bad news, because even when you have a good sample you might still have bias, and that might be because of something like response bias.

This would be because something is going wrong in the way that you are collecting the data or in the way that the data is being measured.

It is bias from the method of collecting or measuring your cases or variables.

Those kinds of biases are also dangerous because even when you have an unbiased sample, you might still have bias that comes from response bias.

We will first talk about sampling bias and then we will talk about response bias.

Sampling bias comes from bad sampling methods, already skewed sampling methods.

One kind of bias that you want to avoid is size bias.

Size bias is whenever you have a method that gives larger individuals a better chance of being in the sample.

For instance, let us say you wanted to collect data on insects from various lakes in your state.

You have a map of the state and you decide to drop a handful of rice from up above, and wherever the rice lands, you go to that pond or lake to collect some insects.

Now, the problem with that is that bigger lakes will have a greater chance of being landed on by the rice, and so those bigger lakes will be overrepresented in your sample and the smaller lakes will be underrepresented.

That is a poor method of sampling.
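The rice-on-the-map example can be sketched as a weighted draw. The lake names and areas below are made up for illustration, and the sketch only models grains that land on some lake, with each lake's chance proportional to its area.

```python
import random
from collections import Counter

rng = random.Random(2)  # fixed seed so the run is repeatable

# Hypothetical lakes and surface areas (square kilometers).
lake_areas = {"Big Lake": 90.0, "Medium Lake": 9.0, "Small Pond": 1.0}
names = list(lake_areas)

def rice_drop(n_grains):
    """Size-biased method: a grain hits a lake in proportion to its area."""
    weights = [lake_areas[name] for name in names]
    return Counter(rng.choices(names, weights=weights, k=n_grains))

def equal_chance(n_picks):
    """Alternative: every lake is equally likely to be picked."""
    return Counter(rng.choices(names, k=n_picks))

hits_by_rice = rice_drop(10_000)
hits_equal = equal_chance(10_000)

for name in names:
    print(f"{name}: rice {hits_by_rice[name]}, equal chance {hits_equal[name]}")
```

Under the rice method the big lake dominates the sample simply because it is big, which is the size bias described above.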

Another biased method of sampling is when you ask people to volunteer to be in a sample, letting people volunteer to be in the sample.

Sometimes this sampling method is used, but it should always be noted that it has a bias in it.

For instance, you might only collect data from people who care more or are more passionate about the issue, or maybe from people who have more time on their hands.

One example might be when people send out an email survey and ask, would you take some time to fill it out?

People who check their email more and people who are online at work all the time might be the people who are overrepresented in your sample, when perhaps you want a sample of regular people from all walks of life.

Voluntary response bias is something that often comes about.
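Here is a minimal sketch of the email survey example, under made-up assumptions: each person spends between 0 and 10 hours online per day, and the chance that a person answers the emailed survey grows in proportion to those hours. The function name `volunteers_for_survey` and all numbers are hypothetical.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the run is repeatable

# Hypothetical population: hours spent online per day for 20,000 people.
population = [rng.uniform(0, 10) for _ in range(20_000)]

def volunteers_for_survey(hours_online):
    """Assumption: the more time online, the likelier someone responds."""
    return rng.random() < hours_online / 10

# Voluntary response sample: whoever chooses to answer.
volunteers = [h for h in population if volunteers_for_survey(h)]

# Random sample of the same size, for comparison.
random_sample = rng.sample(population, len(volunteers))

print("Population mean hours:", round(statistics.mean(population), 2))
print("Volunteer mean hours:", round(statistics.mean(volunteers), 2))
print("Random sample mean hours:", round(statistics.mean(random_sample), 2))
```

The volunteers overstate how much time people spend online, while a random sample of the same size stays close to the population mean.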

Now another very, very common bias or biased sampling method rather is a convenient sample.1342

Basically the idea is take whatever is handy, so maybe you want to know about adolescent behavior,1353

but you just sample from people in your high school because you go to that high school.1365

This is a really frequent sampling method in a lot of social science studies, for better or for worse.1374

A lot of universities have a subject pool where people can participate in studies, and those are just university students.1385

Even though these social science studies want to know about human behavior, what they really get information on is college students taking Psych 101.1394

So that is a pretty biased sample.1407

Really, it is just using whatever is handy: the people around you, the people in your school, the people who happen to be at a particular location.1411

Maybe if you are a biologist, you would sample ponds in a 10 mile radius around your institution.1423

All of those methods are biased because of convenience.1431

They are based purely on convenience.1437

Another biased sampling method is a judgment sample.1440

A judgment sample is one where you are trying to be representative.1449

You are really trying hard, and because you are trying to be representative, you actually use an expert1453

or expert-like reasoning in order to pick certain people, institutions, lakes, or whatever it is for your study.1458

You use expert judgment to select the sample.1469

For instance, maybe a professor wants to do research on elementary schools.1483

How would he pick the elementary schools?1488

Well, one thing he might do is pick them based on the socioeconomic status of the community,1490

so that he would have an even mixture of the different strata of SES in his community.1497

There he is trying to use his expert judgment and using expert knowledge in order to create a representative sample.1508

Now the problem with this is that experts do not know everything.1518

Even with the best of intentions and the best and most updated knowledge, we might still have a biased sample1522

because maybe we did not know that some factor was really important.1529

Or maybe because of the way we picked it there was systematic bias introduced.1533

The final thing is when you have what is called an inadequate sampling frame.1543

Sampling frame is the pool of potential cases.1549

The pool of potential cases that you are going to include in your study.1560

If you have a small sampling frame, if you say this is my pool of people that I am going to draw from,1566

then that would be worse than if you had a larger and more representative sampling frame.1574

Often this is the case when people use directories such as the telephone directory. Not everybody is listed in the telephone directory,1579

and once you try to use the telephone directory to pick the people who will be in your study,1589

that might be an inadequate sampling frame because you would be systematically biased against people1595

who do not like to list their number publicly, or who just recently moved, or something like that.1601

Those are cases of an inadequate sampling frame, where your original pool of potential cases was not big enough.1610
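
The telephone directory problem can be sketched in a few lines of Python. Everything below is an invented illustration: the population of 100 people, the "years at current address" variable, and who is listed are all assumptions, chosen only to show that even a perfect random draw from the directory targets the directory's average, not the population's.

```python
from statistics import mean

# Hypothetical population of 100 people: (years at current address, listed in directory?).
# The numbers are invented for illustration only.
population = (
    [(12, True)] * 50      # long-time residents, listed
    + [(8, True)] * 30     # settled residents, listed
    + [(1, False)] * 15    # recently moved, not yet listed
    + [(6, False)] * 5     # prefer an unlisted number
)

# The sampling frame is only the people who appear in the directory.
frame = [person for person in population if person[1]]

population_mean = mean(years for years, _ in population)
frame_mean = mean(years for years, _ in frame)
coverage = len(frame) / len(population)

print(f"Frame covers {coverage:.0%} of the population")
print(f"Population average: {population_mean:.2f} years")
print(f"Frame average:      {frame_mean:.2f} years")
```

Because the people missing from the frame differ systematically from the people in it, the gap between the two averages does not shrink no matter how many names you draw from the directory.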

Let me say a final thing about these five different types of biased sampling methods.1622

They are bad, and I will put bad in quotes, because they produce bias.1631

They are bad because they do not produce representative samples.1634

That is what I mean by bad.1644

It does not mean that they are not used.1646

They are in fact used quite frequently: voluntary response samples are quite frequent, and convenience sampling is a very common method.1648

Judgment samples are also very common.1659

Although they are bad, it is not that they are not used.1664

It is just that when you read a study and it says that it used a convenience sample,1666

it is good to keep that in mind and take the results of the study with a grain of salt.1672

Here is the bad news. Let us say you somehow figured out a great way to sample: you avoid size bias1683

and voluntary response bias, you are not convenience sampling, you did not rely on expert judgment,1692

and you have a big enough sampling frame. You can still have bias in your data.1699

It is crazy, because even though you might try your hardest to avoid all of those things, you might still have bias because of a bad data collection method.1710

Once again, I am going to put bad in quotes: these methods make the data systematically biased, but it is not that they are uncommon.1722

They are actually quite common.1731

The first kind of bias from bad data collection is what is called nonresponse bias.1734

Remember, a lot of studies are voluntary and so some people may just not want to participate, they might not volunteer.1744

They might decline being part of your sample.1754

And so those people who choose not to respond, you do not have data on them but they might be very important to the population.1757

They are systematically not represented in the sample.1767

You could think of nonresponse bias as the who bias.1772

The sample is systematically missing the people who do not want to respond, for whatever reason.1781

For instance, in that case where you email out a survey, some people do not respond,1805

even though you emailed the survey to a perfectly representative group of people.1811

The people who choose not to finish the survey, you are not going to have data for them.1818

Let us say you want to do a study on whether teachers and their teaching methods change over time1824

and you want to go into teachers' classrooms, but you have to have their permission.1837

Now, for the teachers who choose not to participate in your study, we do not know about their behavior.1843

We do not know about their teaching methods and who knows why they do not want to participate.1849

Maybe they are such awesome teachers and they just want to focus on their students.1853

Or maybe they are terrible and they are ashamed, and so they would not want anybody going in there watching them.1857

You never know, and those nonresponses are actually skewing your data.1864
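
Here is a rough Python sketch of how nonresponse can skew a result even when the invitations go to a perfectly representative group. The three groups, their sizes, their scores, and their response rates are all hypothetical numbers picked for illustration. The sketch computes the average you would expect to see among respondents and compares it with the true average.

```python
# Hypothetical groups: (label, number of people, true score, chance of responding).
# All values are invented for illustration.
groups = [
    ("enthusiastic", 300, 9, 0.6),
    ("neutral", 500, 5, 0.2),
    ("reluctant", 200, 2, 0.1),
]


def population_mean(groups):
    """True average score across everyone who was invited."""
    total_people = sum(n for _, n, _, _ in groups)
    return sum(n * score for _, n, score, _ in groups) / total_people


def expected_respondent_mean(groups):
    """Average score we expect among the people who actually respond."""
    expected_respondents = sum(n * rate for _, n, _, rate in groups)
    weighted_scores = sum(n * rate * score for _, n, score, rate in groups)
    return weighted_scores / expected_respondents


true_mean = population_mean(groups)
observed_mean = expected_respondent_mean(groups)

print(f"True average score:           {true_mean:.2f}")
print(f"Expected average in the data: {observed_mean:.2f}")
```

The gap appears only because the chance of responding is related to the score itself. If every group responded at the same rate, the two averages would match.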

The other kind of bias from bad data collection methods is what we call questionnaire bias.1871

Questionnaire bias is very broad, and this is what I think of as the how bias, because the way you word something,1881

the way you word a question, the tone of voice that you ask, the context in which you ask, the order of questions.1893

All of these things affect the answer.1913

Because of that, it is very hard to avoid the questionnaire bias.1926

All of these tiny little variables actually might affect the way that your data is collected, the kind of data that gets collected.1933

Let me just give you an example.1942

This is a famous kind of example from Amos Tversky, a cognitive psychologist and one of the founders of behavioral economics.1947

He asked people these two questions.1960

Basically, in one situation he said, okay, I want you to tell me which one of these things you are going to choose.1968

First I am going to give you $100, then I am going to give you a risky choice versus a safe choice.1978

I want you to tell me which of these two choices you want to take with your $100.1997

One is to take a risk: you flip a coin, and if you get heads you lose nothing, but if you get tails you lose that $100.2003

That is the risky choice.2027

The safe choice is this: I will just give you $50.2029

People obviously will choose this option because it is like, what, I can lose money, or you just give me $50?2049

For sure I’m just going to get the $50.2057

Sounds reasonable.2063

But let us say I ask the question slightly differently.2065

Here is $200, the risky choice remains the same.2068

Heads you will lose nothing, tails you will lose $100.2077

It does not sound like a great choice, right?2086

What is the safe option this time?2090

Now the safe option is you lose $50.2093

Let us think about this for a second.2106

A lot of people would say, I am going to go with this option.2108

Let us think.2114

In the first game, you have $100 and I give you $50, so you end up with $150 here.2117

That is your choice.2126

Here you have a chance of losing 0 or losing 100.2129

Here, if I pick safe, I will still end up with $150.2135

All of a sudden, this option looks terrible compared to this one, while in the first game the same outcome looked good compared to this one.2148

One of the things they found is that people are really sensitive to this framing in terms of losses: starting2163

with $200 and going down to $150 feels a lot worse than starting with $100 and going up to $150.2169

Even though the $150 is exactly the same.2175
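
The arithmetic behind the two safe options can be checked with a tiny Python sketch. The dollar amounts are the ones from the example above; the labels "gain frame" and "loss frame" are just names chosen here for illustration.

```python
# The two framings of the safe option, using the dollar amounts from the example.
# The labels are illustrative names, not terms from the lecture.
framings = [
    {"label": "gain frame", "start": 100, "safe_change": +50},
    {"label": "loss frame", "start": 200, "safe_change": -50},
]


def final_amount(framing):
    """Money in hand after taking the safe option."""
    return framing["start"] + framing["safe_change"]


final_amounts = {f["label"]: final_amount(f) for f in framings}

for label, amount in final_amounts.items():
    print(f"{label}: end with ${amount}")
```

Both framings leave you with the same amount, so any difference in how people choose has to come from the wording and not from the money.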

People are very loss averse.2179

The second thing is that people are more likely to take risks when you frame things in terms of losses.2183

All of a sudden, risk sounds pretty good to us.2190

But if we frame things in terms of gains, then risk sounds terrible and we would rather have the safer option.2193

They have shown this with saving lives, saving the environment, all kinds of things.2201

People make choices differently when you frame the question in terms of losses or in terms of gains, in terms of risk or safety.2208

All of those things play a role.2217

This is just a small example, but you can see that for every question that we word, we have all kinds of choices in how we ask it.2220

We could ask people, are you feeling anxious, rate it on a scale of 1 to 4, or we can ask the exact opposite, are you feeling calm?2229

We can ask either one of those and get at the same idea that we wanted to get at.2239

So how do you choose?2245

So those are questionnaire biases, when we have bad data collection methods.2248

The final bias from bad data collection is when we have incorrect responses or measurement bias.2254

Incorrect responses are pretty straightforward, but it is also sort of unavoidable.2262

These are things like maybe you ask people questions and they have incorrect memories, or maybe sometimes they straight up lie.2268

Nothing says that people cannot lie to you.2281

Maybe the question is, did you vote in the last election?2284

Maybe somebody asked that on a survey.2290

Maybe people think, I did not, but I had thoughts about it, so that is sort of like voting, or, my candidate won anyway, so it is like I voted.2292

They might just check off yes. Or maybe you ask them, did you follow the doctor's instructions since your last doctor's visit?2299

People might look back and say I did not do it every day, but I did it like twice.2316

Sure, I followed all the doctor's instructions.2322

People might be well-meaning and still end up giving an incorrect response, or they may be genuinely trying to remember: did I do that?2327

I do not know.2335

Maybe I think I did, and they just convince themselves that they did do it.2336

Incorrect responses have a lot of different sources but they are unavoidable.2341

They are very hard to avoid anyway.2347

The other thing is measurement bias.2350

Something about the way you are measuring is wrong.2352

For instance, let us say you wanted to measure people's heights, but your ruler is worn down on one side.2356

Then your measurement tool is already off.2366

Maybe you are recording people's times on a stopwatch, but your stopwatch is off.2370

Things like that make it problematic for you.2376

The actual measuring tool is off.2379
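
A short Python sketch can show what a measuring tool that is off does to your data. The heights and the half-centimeter offset below are hypothetical; the sketch simply assumes the worn ruler adds the same fixed error to every reading.

```python
from statistics import mean

# Hypothetical true heights in centimeters (made-up values).
true_heights_cm = [150.0, 160.0, 165.0, 170.0, 180.0]

# Assumed fixed error of the worn ruler; the size and direction are invented.
RULER_OFFSET_CM = 0.5


def measure(true_height_cm):
    """Reading from the faulty ruler: the true value plus the same fixed offset."""
    return true_height_cm + RULER_OFFSET_CM


measured_heights_cm = [measure(h) for h in true_heights_cm]
bias = mean(measured_heights_cm) - mean(true_heights_cm)

print(f"True average:     {mean(true_heights_cm):.1f} cm")
print(f"Measured average: {mean(measured_heights_cm):.1f} cm")
print(f"Bias:             {bias:.1f} cm")
```

Since every reading is shifted the same way, measuring more people does not help: the average stays off by the full offset.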

These are all kinds of biases that might come from bad data collection methods.2389

By bad I just mean that they result in systematic bias.2395

There are going to be ways that we can try to mitigate some of these things; we could try to build in some checks and balances.2398

We could do a lot of other things.2409

But a lot of the time, even though we try as best we can, these biases do not completely go away.2412

They are always sort of lurking in the background, and whenever you read a study you want to watch out for these different biases.2420

Let us start with some examples.2433

People generally want to appear knowledgeable and agreeable, even to strangers. How might that affect2436

the result of a telephone survey conducted by a school on the satisfaction of graduates with their education?2441

What biases should we worry about?2448

Well, one thing we want to start off with is the biases that come from sampling methods.2451

Sampling method bias.2459

Would people's desire to appear knowledgeable and agreeable somehow bias this?2461

I think mostly it probably would not affect the sampling method, because the sampling method is about choosing whom to call.2475

It probably would not affect that, but it may affect the response biases.2484

First off, sampling or selection bias versus response bias: those are the two different kinds of biases that we could choose from.2492

The desire to appear knowledgeable and agreeable would most likely affect the response biases more than the sampling bias.2515

In sampling, the desires of the participants have less to do with it than the desires of the experimenter or researcher.2526

We are going to have to worry about response bias.2543

One question is whether this desire would affect nonresponse bias.2545

Perhaps, because the people who choose to actually take part in the telephone survey might be the ones who really want to please people.2556

You might get people pleasers overrepresented, and people who are less social or less agreeable would be underrepresented.2570

As for questionnaire bias, perhaps there might be some of that bias but that would depend mostly on the question and less on the participant themselves.2591

The other thing that we probably do have to worry about is the incorrect responses.2604

People might try to think back when somebody asks them, do you think college really was valuable to you,2616

or how satisfied are you with your educational experience.2626

They might think back and just remember all the good times or their favorite professors2630

and they might forget all their useless professors or classes they thought were so useless.2638

They might incorrectly remember or have a bias in their memories.2648

Or maybe it might just be as simple as, you know, maybe some of them went to a game2656

or something recently for their school, and some of that school spirit gets activated. It may have nothing to do with their educational experience,2664

but it still colors their thoughts about their educational experience.2672

I would say that the two biases that we should worry about the most are nonresponse bias and incorrect responses.2679

Example 2: at a meeting of local Democrats, the organizers wanted to estimate how well the party will do in the next election.2689

They use the people at the meeting for their sample.2696

What biases might arise?2699

Already we know that this is going to entail sampling bias.2701

The sampling bias is definitely going to be part of it.2708

Out of the different types of sampling bias which of these might be involved?2712

Size bias is not really a thing here, because it is not that bigger people are going to be more represented,2717

although maybe bigger donors are more represented.2726

That might be a possibility, but really the problem is going to be voluntary response bias.2739

The reason is that these people volunteered to come to this meeting, so it is definitely the people who care.2752

Maybe they are extreme Democrats, maybe they hate Republicans; whatever it is, these people are there not randomly,2760

but because they have chosen to be there that night.2770

Another thing that we see is that this is definitely a convenience sample.2775

They are just taking the people who happen to be there.2780

It is certainly a convenience sample. It is not a judgment sample, because these people were not picked for some reason.2783

They definitely had an inadequate sampling frame, but that sort of goes along with the convenience sample.2791

Because of these sampling biases, we know that the sample is already biased, apart from any response biases2799

that might be working here in the background.2810

Certainly, there is the nonresponse bias.2814

Think of all the people who did not come to the meeting, for whatever reason: maybe they are Republicans, maybe they are Democrats but they are not that into it.2820

Maybe they are independents and they did not want to go to a meeting for Democrats, even though they are interested in this candidate.2821

The nonresponse bias is important to note.2838

There might be incorrect responses because of this biased sample.2845

The organizers might ask, do your friends vote Democrat? And a lot of people, when they think back,2856

will think about their Democratic friends, because they are in a Democratic context.2867

They might give incorrect responses that stem from the context.2873

Context falls under questionnaire bias.2888

Questionnaire bias is when something about the way that it is asked changes the answer.2894

If an organizer of a Democratic meeting asks the question, maybe these people will say, yes, I am very Democratic, or I am very interested.2902

Example 3: a news magazine conducted a poll of Americans, asking them to agree or disagree with one of two statements.2917

Statement 1: the government should try to balance its budget with both spending cuts and taxes.2924

Statement 2: I would be disappointed if taxes were raised this year.2931

What kind of biases should we worry about?2935

Let us break this down real quick.2939

Does it say anything about sampling?2944

It did not say anything about sampling.2948

We have no idea how they are sampling, so we can pretty much set sampling bias aside, because the problem says nothing about the sample.2949

We are probably going to be focusing on the response biases.2957

We do not know how they are asking, so we cannot really say anything about nonresponse,2962

but it definitely seems like there are questionnaire biases at work here.2967

Because maybe people say government should definitely try to balance their budget.2978

They might say that they agree but they might also agree with this statement that they will be disappointed if taxes are raised this year.2984

People might use this statement to say Americans are in favor of raising taxes as well as spending cuts,2996

and then somebody might use this one to say Americans are very much against raising taxes3004

because they would be disappointed if taxes were raised this year.3010

Even though people might well agree with both of these statements, it does not necessarily mean that the statements contradict each other.3015

It is not that they are being internally inconsistent or hypocritical.3030

It is mostly because of the way these things are worded.3033

Of course we might say the government should try this.3039

It is pretty meek; it is not saying that the government has to. But this statement is pretty weak too.3043

It is just saying I would be disappointed if taxes were raised this year.3052

It is not saying I would move to Canada if taxes were raised this year.3056

Or, I would hate my politicians if taxes were raised this year.3062

Because of the way that these statements are worded, you might be able to push people to agree3064

or disagree with the idea that taxes are bad or good.3072

Questionnaire bias is definitely one.3081

Would there be an incorrect answer bias?3083

Not necessarily. If they had worded these differently, people would probably have given a different opinion.3089

That is still questionnaire bias more than incorrect response bias.3096

Let us say questionnaire bias is the only one here.3101

And finally, Example 4: to estimate how many students in the city have passed the AP Statistics test,3106

a researcher posts a message on a website asking local teachers to report how many students took the exam and how many students passed.3113

What kind of biases should we worry about?3120

We definitely know that there is sampling being mentioned here: the way that the researcher is sampling3123

is by posting a message on a website and asking local teachers to respond.3128

There is definitely sampling bias.3136

First of all, it is probably true that these people are only going to do this if they are volunteering.3144

There is voluntary response bias.3153

Also, it is skewed toward people who visit this website, so that is sort of a convenience sample.3160

You are only taking people who happen to come to this website anyway.3175

Let us see.3183

These are really the two big ones being perpetrated here, but in addition to sampling bias there are also some response biases at work here.3188

Definitely there is the nonresponse bias.3207

That is because there are going to be teachers who are not reporting anything.3215

Maybe they had 0 students take the AP exam, or they had 0 students actually pass, and maybe they are ashamed.3218

Or maybe some of the teachers who were quite influential in teaching AP Statistics are just not on the website.3227

Or maybe they do not have time to respond.3237

That is one definite problem.3241

Another problem might be incorrect responses.3244

Maybe some teachers will just misremember how many of their students actually took or passed the exam, or they might be incorrect for other reasons.3256

They might just have been off for other reasons.3267

Nonresponse bias and incorrect response bias are probably both there.3270

That is it for sampling and biases.3279

Thanks for using www.educator.com.3282

