Lecture Comments (2)

0 answers

Post by Michelle Greene on October 15, 2013

Again, you are using Excel when we cannot use Excel on exams. Please show us with at scientific calculator... excel is not helpful but the calculator is very helpful.

0 answers

Post by Brijesh Bolar on August 20, 2012

What book are you referring to for these sessions.. Or what book do we refer.

Introduction to Confidence Intervals

Lecture Slides are screen-captured images of important points in the lecture. Students can download and print out these lecture slide images to do practice problems as well as take notes while watching the lecture.

  • Intro 0:00
  • Roadmap 0:06
    • Roadmap
  • Inferential Statistics 0:50
    • Inferential Statistics
  • Two Problems with This Picture… 3:20
    • Two Problems with This Picture…
    • Solution: Confidence Intervals (CI)
    • Solution: Hypothesis Testing (HT)
  • Which Parameters are Known? 6:45
    • Which Parameters are Known?
  • Confidence Interval - Goal 7:56
    • When We Don't Know μ but Know σ
  • When We Don't Know 18:27
    • When We Don't Know μ nor σ
  • Example 1: Confidence Intervals 26:18
  • Example 2: Confidence Intervals 29:46
  • Example 3: Confidence Intervals 32:18
  • Example 4: Confidence Intervals 38:31

Transcription: Introduction to Confidence Intervals

Hi and welcome to

Today we are going to be introduced to confidence intervals.0002

Here is the roadmap for today, first we are going to do a brief overview of inferential statistics.0005

We have been trying to do some inferential statistics but there have been a couple of problems we keep running into.0013

So far I have fudged it.0022

We will address some of those problems head on and come up with 2 solutions.0024

One of those solutions is the confidence interval and we are going to talk about confidence intervals 0031

when sigma, the population standard deviation, is known and when sigma is unknown.0039

Those are the two situations we are going to be focused on.0046

Let us go over inferential statistics.0049

We know the big picture idea there is some population represented by X and we wish we could know the population but we do not.0055

But instead what we can know is little samples.0065

We could know that but the problem is samples are biased. 0071

Whenever we have samples and we summarize them using these mathematical summaries we call them statistics.0074

Just to give you an example of some statistics, there are things like x bar or s; those are all statistics.0084

What we would like to do is use these samples to understand something about the population.0093

Statistics, the field is about using these statistics to estimate parameters and 0100

to give you ideas about parameters there are things like mu or sigma.0108

That is our whole goal.0112

Here we realize in order to jump from things like x bar and s to mu and sigma we are going to need more than just wishful thinking.0114

And that is where the sampling distributions come in.0132

When we talk about a sampling distribution, we are often talking about some sort of statistic.0135

When we talk about sampling distribution of the mean we are talking about a whole bunch of x bars.0142

Here we have a whole bunch of x.0148

Here we have a whole bunch of x bar and that is the distribution.0150

When we summarize these statistics in the sampling distribution we call them expected values. 0155

So it is not just mu, it is mu sub x bar. 0164

It is not just sigma it is sigma sub x bar. 0168

What we want to do is go from this to understand this but what we have learned 0172

so far is how to see the relationship between parameters and expected values.0177

We know that these things have a relationship to each other.0186

And from doing that we could then make this jump.0189

It is like we use this to say something like this.0195

There are two problems with this picture. Although it seems rosy, there are still two nagging questions.0199

We have looked at them a little bit before, but we need to solve them more rigorously than we had before. 0210

One question is this, what happens when we do not know what the population looks like?0217

Of course we could use the central limit theorem when we know mu and sigma from the population.0222

What if we do not know mu?0229

What if we do not know sigma?0231

Then what happens?0233

Also, how do we know whether a sample is sufficiently unlikely because remember the whole point 0234

of the sampling distribution is for us to take sampling distributions from a known population and compare it to an unknown population. 0240

If this sample does not match the sampling distribution enough, so that it is very unlikely to come from the sampling distribution,0254

we could say this is probably not the population that the sample came from.0261

How do we know when it is sufficiently weird?0266

To answer these two questions there are going to be two solutions.0269

You can think of it as this one.0275

Roughly, both questions are actually answered by each of these solutions, but this first question goes along better with that one.0281

This one goes along better with that one.0287

The two solutions are these, one is the confidence interval.0291

When we talk about confidence intervals here is what we are doing, we are going to figure out where mu might be from the sample.0302

We are going to try to figure out the population mu from the sample and 0306

that is what we do when we do not know what the population looks like.0336

We try to figure it out from the sample.0342

Hypothesis testing actually takes another view.0344

The hypothesis testing, we come up with a hypothesis for what the population is like.0349

Hypothesize a population mu first.0355

In this case we are saying we are going to pull from something and figure out and pick a potential population mu.0363

And then we are going to test how weird the sample is.0376

We are going to come up with a number to tell us this is how weird the sample is.0387

We are going to decide is that weirdness weird enough?0393

That is going to be hypothesis testing.0398

But we are going to focus here on confidence intervals. 0401

Okay, when we talk about confidence intervals we need to get an inventory of what we know so far.0404

Basically that is asking the question, which parameters are known or given to us?0413

What happens when we do not know what the population looks like?0418

Well we may not know what 0422

the population looks like because we do not know anything about the population, or we know 0424

only a little bit about the population.0428

This is the case where we know a little.0431

Here we do not know mu but we do know sigma.0434

For some reason we have some partial information and that helps us out.0444

Here we know nothing.0450

Here nothing is helping us: we do not know mu and we are trying to figure it out, but we do not know sigma either.0454

It is like nothing is helping us out here.0464

We just have to pull ourselves up by our own bootstraps.0466

These are the two situations that we are going to talk about.0471

Here is the goal of the confidence interval. 0475

The basic idea of the confidence interval is going to be this.0480

We are going to try to figure out where mu might be but we do know x bar.0484

We know everything about the sample but we do not know anything about the population.0497

But in this case I am going to show you what happens when we already know sigma.0503

So we have a leg up. 0508

We know sigma, so life is a little easier for us today.0509

Here is the thing, we do not know what the population looks like so we cannot draw it as normal or skewed or anything. 0513

We have no idea what the population looks like and we have no idea what the population mu is.0524

But for some reason we know what sigma is.0531

Sigma is given to us.0533

From there can we construct an SDOM?0534

Given that n is sufficiently large we can assume that it is normal. 0540

We have no idea what mu is and so we do not know what mu sub x bar is. 0548

We do not know it at all but we can figure out sigma sub x bar.0553

We could figure out the standard error because we have sigma and we could divide that by √n.0559
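As a minimal sketch of that calculation, here is the standard error in Python. The function name `standard_error` is just a label chosen for this illustration, and the numbers plugged in are the sigma and n that appear later in Example 4 of this lesson.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean when the population sigma is known."""
    return sigma / math.sqrt(n)

# Illustrative values: sigma = 360 and n = 1,000, as in Example 4.
se = standard_error(sigma=360, n=1000)
print(round(se, 2))
```

Notice that the only inputs are sigma and n; mu is not needed to get the width of the SDOM.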

We have a little bit of information about the SDOM.0566

Here is what we do in confidence intervals.0570

First assume that the x bar is the mu sub x bar.0574

Whatever your sample x bar is we are going to put back here.0586

We are going to assume it.0591

Here is why, because we always assume one thing to figure out the other, 0595

here we are going to assume things about the x bar to figure out mu.0601

In hypothesis testing, we assume something about the population to figure out how 0605

weird x bar is.0609

Here because we know that the SDOM tends to be normal given a sufficiently 0612

large n what we know is that we can find out with reasonable confidence what some 0621

significant borders are.0632

For instance, let us say we are one standard deviation away.0634

This is raw score and this is z score so we know at one standard deviation away 0642

this space right here, we know that is 68% of the SDOM.0650

Let us think about what this might mean.0660

When we get these borders what we might end up saying is that these are the borders in which 68% of our values will fall in the SDOM.0663
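To check the 68% figure, here is a small sketch using the standard library's `statistics.NormalDist`. The helper name `central_area` is made up for this illustration; it just measures the middle area between -z and +z on a standard normal curve.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def central_area(z: float) -> float:
    """Proportion of a normal distribution within z standard errors of its center."""
    return std_normal.cdf(z) - std_normal.cdf(-z)

within_one = central_area(1.0)    # borders at one standard error on each side
within_196 = central_area(1.96)   # borders at 1.96 standard errors on each side
print(round(within_one, 4), round(within_196, 4))
```

Moving the borders further out makes the middle area bigger, which is how you get from a 68% interval to a 95% one.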

And here is what we could say, we could also say that there is a 68% chance that our 0679

population mu will fall in that zone.0686

That is a 68% confidence interval.0691

Now 68% is higher than half, but it is not that high.0697

But here is the thing, we can have a higher confidence interval.0702

We can have a 95% confidence interval or we can have a 99% confidence interval.0707

That is what we can do. 0713

Here is my x bar, here is 0, and what we can do is figure out 0716

these borders such that we are sure there is a 95% chance of having our 0730

population mean fall in this interval.0744

We can know that.0748

That is called the confidence interval.0750

That is pretty hypothetically and you can even go to 99%.0753

And we could easily figure out these borders. 0756

Here is how.0759

Because we easily figure out the border we could figure out what the z scores are.0761

This is what we call a two-tailed confidence interval because even though the middle part is 95% that does not mean that each tail is 5%.0772

Then you would have 105%, so that means this part is .025, or 2.5%, and this part is .025.0785

And those are the only parts where we are not sure.0796

There is a small chance that the population mean will fall somewhere out here but it is a very small chance. 0798

We are trying to reduce it as much as possible.0809

Let us think about how we could find the z score out here.0812

We could use our tables in the back of the book, our z tables and we can look up and usually z tables will give you like one side.0817

We can look up .025 and look at the z score or we could do it in Excel.0830

Instead of using NORMSDIST, which will give you the proportion of the distribution,0837

we are going to put in NORMSINV, the inverse, and here we want to put in the probability.0848

Now this is going to be my probability.0855

I am going to put in this probability .025 and we get -1.96.0870

This value here is -1.96 and because the normal distribution is symmetric we know that this part is also 1.96 0884

but now a positive instead of negative.0892
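If you do not have Excel or a z table handy, the same lookup can be sketched in Python. `NormalDist().inv_cdf` plays the role of the inverse normal function here; the helper name `z_borders` is an assumption made for this illustration.

```python
from statistics import NormalDist

def z_borders(capture_rate: float) -> tuple[float, float]:
    """Lower and upper z scores that leave equal tails outside the capture rate."""
    tail = (1 - capture_rate) / 2          # .025 per tail for a 95% interval
    z_low = NormalDist().inv_cdf(tail)     # inverse lookup: probability -> z score
    return z_low, -z_low                   # symmetry gives the upper border

z_low, z_high = z_borders(0.95)
print(round(z_low, 2), round(z_high, 2))
```

Because the tail area is split in two, a 95% interval looks up .025, not .05.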

We know our z values on the end and if we know the z values what is our raw score here?0896

Tell me what this value is and also tell me what that value is.0908

Well the z score tells you how many standard errors away you are.0915

How many jumps away and each jump is worth that much.0921

We are away 1.96 of these jumps.0926

We are going to multiply this by this and then 0931

either subtract it from x bar or add it to x bar.0934

Step two in finding a confidence interval: let us say you want to find a 95% confidence interval, so find the z scores.0938

It is all in the case where you know sigma.0953

Step 3 is this, now you want to actually find the actual scores and that is going to be x bar + or - the z score × standard error.0957

That is what you are going to do. 0984

And we know what the standard error is.0986

I am going to rewrite this to be x bar + or - z score × sigma / √n.0989

When we do that we could find these confidence intervals.1003

Once you have these confidence intervals then you can say with 95% confidence that 1009

your population mean will fall in this interval between these two numbers.1019
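Written out as a formula, the steps for the known-sigma case look like this. The symbol z* for the z score at the border is notation chosen for this summary, not something from the slides.

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}},
\qquad
\text{CI} = \bar{x} \pm z^{*}\,\sigma_{\bar{x}}
          = \bar{x} \pm z^{*}\,\frac{\sigma}{\sqrt{n}},
\qquad
z^{*} = 1.96 \ \text{for a 95\% interval}
```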

Now the 95% is actually called the capture rate; that could be 95%, 99%, whatever. 1028

What would the confidence interval be for 100%?1042

It would go from –infinity to infinity because that is how far the normal distribution goes.1047

But the capture rate is this: the proportion of random samples for which this interval captures mu.1053

Let us imagine taking a whole bunch of random samples; it is going to be that 95% of the 1080

time the intervals from those random samples contain mu.1091

They somehow overlap with mu.1097

That is what we mean by 95% capture rate.1099
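One way to see the capture rate is to simulate it. This is only a sketch under made-up assumptions: a normal population with mu = 50 and sigma = 10, samples of n = 25, and 2,000 repetitions. None of those numbers come from the lecture.

```python
import math
import random

def capture_rate(mu, sigma, n, trials, z=1.96, seed=0):
    """Fraction of simulated samples whose known-sigma interval contains mu."""
    rng = random.Random(seed)                # fixed seed so the run is repeatable
    se = sigma / math.sqrt(n)                # standard error, sigma is known
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - z * se <= mu <= x_bar + z * se:
            hits += 1                        # this interval captured mu
    return hits / trials

# Hypothetical population and sample size, chosen only for illustration.
rate = capture_rate(mu=50, sigma=10, n=25, trials=2000)
print(rate)
```

Each simulated sample gets its own interval, and the proportion that land on mu should come out near .95.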

That is when you know sigma but now we do not know sigma.1103

We are in trouble: we do not know mu.1113

We do not know sigma either.1115

Still our goal remains the same, we try to figure out mu from x bar.1116

But now we are a little hobbled.1128

I do not have a tool that I used to have.1132

The beginning part of the story stays the same. 1135

About the population we have no idea, and from there we want to find the SDOM because 1139

we are going to figure out how good our sample is.1146

We know the shape of our SDOM as long as our n is sufficiently big.1151

Can we figure out sigma sub x bar anymore?1157

No we cannot because we do not have sigma so how can we figure out sigma sub x bar?1161

We cannot figure out that standard error.1170

Here is where another idea comes in.1171

There is another way we can estimate the standard error of the sampling distribution that is going to be s sub x bar.1175

Because we are going to use the sample standard deviation s instead of sigma.1186

Remember s is more variable, not quite right and because of that we corrected already a little bit by using n -1 instead of n.1200

Here we are going to divide that by √n.1214

If you double click on this you would see the square root of the sum of squares ÷ √(n - 1).1218

You would see this inside of that.1231
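Here is a small sketch of what is inside s and inside the estimated standard error. The five data values are hypothetical, made up just to have something to compute with; `statistics.stdev` is included only as a cross-check because it also divides by n - 1.

```python
import math
import statistics

data = [4.0, 7.0, 6.0, 9.0, 5.0]      # hypothetical sample, not from the lecture
n = len(data)
x_bar = sum(data) / n

# Sum of squared deviations from the sample mean.
sum_of_squares = sum((x - x_bar) ** 2 for x in data)

# Sample standard deviation: divide by n - 1, not n.
s = math.sqrt(sum_of_squares / (n - 1))

# Estimated standard error of the mean: s over the square root of n.
s_x_bar = s / math.sqrt(n)

print(round(s, 4), round(s_x_bar, 4), round(statistics.stdev(data), 4))
```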

We already tried to correct it a little bit, but s is still variable.1234

It is not quite as good as having sigma.1242

And there can be other problems that we run into.1245

This is pretty good though and it is a pretty good estimate but you always have 1249

to keep in mind that our standard error is not as good as the one we used to have.1254

We have to account for that.1262

But the steps remain the same. 1265

First assume x bar for mu sub x bar. 1267

Two, find z for your capture rate.1275

If your capture rate is, for example, 95% then you would find the z scores.1287

It is helpful to memorize that for this capture rate the z scores are going to be + or -1.96. 1297

It is going to come up a lot.1305

Find the z scores for your capture rate.1306

Here we run into a problem. 1310

I wish we could use z scores but here is an issue, we actually cannot because s is too variable for us to assume perfect normality.1314

And because of that we cannot use the z and instead we have to use the t which is very similar to z.1330

Find the t score for your capture rate.1348

Instead of having raw score and z score we are going to find t score.1352

For now you just need to know that you can find your t score in the back of the book but in 1366

the next lesson we are going to go over why you use t and why you cannot use z.1372

That is a big story.1377

You are going to find t.1380

Once you find the t for your capture rate and that will also be + or -, t is going to be very similar to z score.1383

We are going to use this formula.1390

You are going to use a very similar idea to the z score confidence interval where you want to know x bar + or -.1396

Here a t score is also going to tell you how many standard errors away you are.1407

T × standard error. 1411

But remember, you use t when you estimate this from the sample.1417

If we unpack this, this is what it can look like x bar + or - t × this is that estimated standard error s/√n.1426

It is still the same idea.1443

It is how many jumps away, figuring that out and then multiplying that by the length of the jump 1446

and adding that to x bar for the high-value and then subtracting that from the x bar for the low value.1451
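As a formula, the unknown-sigma version swaps in s for sigma and t for z. The symbol t* for the t score at the border is notation chosen for this summary.

```latex
s_{\bar{x}} = \frac{s}{\sqrt{n}},
\qquad
\text{CI} = \bar{x} \pm t^{*}\, s_{\bar{x}}
          = \bar{x} \pm t^{*}\,\frac{s}{\sqrt{n}}
```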

In order to find t here is what you need to know for now.1458

You need to know whether it is a 1 or 2 tailed distribution.1465

If your confidence interval is two-tailed then remember these are .025 1470

because you would split the remaining 5% between both sides.1478

But sometimes t tables only give you one side.1482

They might give you a one-sided 5% or a one-sided 2.5%. 1487

You have to just keep in mind whether it is one tailed or two tailed and also the t distributions are a whole bunch of different distributions.1493

They are a whole bunch of different tables basically.1502

You have to also know what degrees of freedom.1508

For now you could remember degrees of freedom as n -1.1514

There are reasons for all of these things why we use t, why we use degrees of freedom all that stuff.1521

That will be covered in the next lesson. 1528

For now, here is what you need to know.1529

You need to know whether it is one tailed or two tailed.1532

You also need to know degrees of freedom. 1534

Once you have that you could actually look it up in a t table, usually found in the back of your book. 1536

It might also be called the Student's t distribution because its inventor was actually contracted to work for Guinness.1542

That is why he could not publish it under his actual name.1553

He published it under the pseudonym Student, and that is why it is called Student's t.1556

You can look up your degrees of freedom and then look for the area that you need and then go down and find the t score.1560

Very similar to z score.1573

Let us go on to some examples.1574

Example 1, consider two extreme situations n=10 and n=1,000.1582

If you use s in the formula for CI given sigma, here is the actual formula for when you have sigma.1591

We use 1.96 because we use the z score.1609

Which of these situations would you expect to give a capture rate closer to 95%?1614

Here is what this question is really asking.1621

When you know sigma, for a 95% confidence interval you use 1.96, that is my z, × sigma / √n.1624

What it is asking you is what if you substituted in s?1649

Here we do not know sigma but we are going to just take this formula and use the z value × s/√n.1656

In order to answer this question you really only need to keep in mind one thing, when is s more like sigma?1676

S is more like sigma when n is very large.1687

This situation, n = 1,000, would give you a capture rate very close to 95%.1708

This would be very, very similar. 1721

However, when n is 10 you have more uncertainty and because of that the t distribution is not as tight.1724

It is actually more spread out and because of that, when n=10 you do not capture 95% just by being about 2 standard deviations out this way. 1733

That would not capture 95% of those samples.1748

In fact you have to go out further to capture 95%.1753

So n = 1,000 is going to be much closer to a 95% capture rate.1758

And n = 10 is going to give you a smaller capture rate.1763

That is because your s is going to be more variable and because of that your t distribution 1766

is going to be more dispersed because more variable means sort of wider.1778
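To see Example 1 in action, here is a simulation sketch that plugs s into the z formula for both sample sizes. The population (normal, mu = 0, sigma = 1), the number of repetitions, and the function name `capture_rate_with_s` are all assumptions made for this illustration.

```python
import math
import random

def capture_rate_with_s(n, trials, mu=0.0, sigma=1.0, z=1.96, seed=1):
    """Capture rate when s replaces sigma but z = 1.96 is still used."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Sample standard deviation with n - 1 in the denominator.
        s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        margin = z * s / math.sqrt(n)        # z used where t really belongs
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

rate_small = capture_rate_with_s(n=10, trials=2000)
rate_large = capture_rate_with_s(n=1000, trials=2000)
print(rate_small, rate_large)
```

With n = 1,000 the rate should sit near .95, while with n = 10 it should fall short, because 1.96 estimated standard errors is not far enough out when s is that variable.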

Example 2, a 95% CI for a population mean is calculated for a random sample of weights and the resulting CI is from 42 to 48 pounds.1785

For each statement indicate whether it is a true or false interpretation of the CI.1798

This question is asking you do you understand what the confidence interval means?1807

Do you understand what it is for?1811

Let us see, 95% of the weights in the population are between 42 and 48.1813

Does the confidence interval tell us about the actual population numbers?1821

No, it only tells us about the population mean.1830

This is actually not true.1833

We do not know anything about the actual numbers of the population. 1836

We do not know whether it is skewed, whether it is uniform distribution.1840

We do not know any of those things. 1847

The 95% thing would only be reasonable if the population was normal and its mu was exactly equal to x bar.1848

That would be the case.1862

That is not true.1864

What about number 2?1866

95% of weights in the sample are between 42 and 48, does the CI tell us anything about this sample?1868

No, we are using the sample to estimate the population mean.1878

We are using the SDOM.1882

We do not know anything about the sample itself.1884

That is also not true. 1888

What about number 3?1890

The probability that the interval includes the population mean is 95%. 1893

This is actually true. 1899

There is only a 5% chance that this interval does not contain the population mean.1902

What about number 4?1916

The sample mean might not be in the confidence interval.1919

That does not make sense if you look at the picture because we use the sample mean in order to construct the confidence interval.1924

Of course it is in the confidence interval, and this is just ridiculous. 1932

Example 3, a random sample of 22 men had a mean body temperature of 98.1° and a standard deviation of .73.1936

Construct a 95% confidence interval for the mean of the population that the sample was drawn from.1950

Interpret the CI. Is 98.6° included in it?1956

This is the average human body temperature.1963

We have body temperatures in the world and we do not know what that population looks like.1965

We are asking can we construct a 95% confidence interval such that whatever 1975

the population mean is there is a 95% chance that we have covered it.1989

We start by assuming that the mean of the sample x bar is the mean of the sampling distribution of the mean.1994

We have done step one.2004

Step two is we have to construct the CI and so here they give us x bar, but do we have sigma? No, we only have s.2008

We know that we cannot use the z score.2025

We have to use the t score. 2029

Let us find the t for this.2031

There is a .025 chance that we would not find it on this side and a .025 chance that we would not find it on that side. 2033

What are the t scores?2043

This is the raw score or the temperature. 2046

What is the t score for .025 when the degrees of freedom, and that is n - 1, there are 22 men so 22 - 1 = 21 degrees of freedom?2049

If you look in your book at your Student's t distribution table, I am going to go down to where df = 21.2065

I am going to go across to where it says .025.2074

My table actually gives me this area so I am going to look at .025 on the side.2080

And it says 2.08 is my t score.2086

That makes sense.2093

That is around 1.96.2095

You will see that as degrees of freedom get greater and greater this value becomes more and more close to 1.96.2098
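Here are a few rows of the kind you would read off a standard t table, for .025 in each tail. The particular degrees of freedom shown are just a selection picked for this illustration; your own table will have many more rows.

```python
# Two-tailed 95% critical values (.025 in each tail) as printed in standard t tables.
T_CRITICAL_95 = {5: 2.571, 10: 2.228, 21: 2.080, 30: 2.042, 60: 2.000, 120: 1.980}
Z_CRITICAL_95 = 1.960  # the z score the t values shrink toward

for df, t in T_CRITICAL_95.items():
    # The gap between t and z shrinks as the degrees of freedom grow.
    print(f"df={df:>3}  t={t:.3f}  gap over z={t - Z_CRITICAL_95:.3f}")
```

The row for df = 21 is the one used in this example.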

On this side we know that it is symmetrical so I know it is -2.08.2108

From here I can construct my CI.2114

The CI is going to be the x bar + or – the t value × my standard error.2118

My estimated standard error here is s sub x bar because we do not have sigma.2129

That is going to be s ÷ √n.2137

Let us put in numbers here, so that is 98.1, our sample mean, ± the t value 2.08 × s, which is .73, ÷ √22.2141

I am just going to calculate this on a calculator so that is going to be 98.1, and I will do the + side first, + 2.08.2167

Excel does order of operations.2182

It does the multiplication before the addition, and that is × .73 ÷ √22.2185

The high end of my confidence interval is 98.4 and the low end is going to be 97.8.2195

98.4 and 97.8 are my CI.2217
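The same arithmetic can be sketched in Python. The t value 2.08 is typed in from the table lookup in this example rather than computed, and the variable names are just labels for this illustration.

```python
import math

x_bar, s, n = 98.1, 0.73, 22     # sample mean, sample standard deviation, sample size
t_star = 2.08                    # table value for df = 21 with .025 in each tail

se = s / math.sqrt(n)            # estimated standard error, s sub x bar
margin = t_star * se             # how far the borders sit from x bar

low, high = x_bar - margin, x_bar + margin
print(round(low, 1), round(high, 1))

# Check whether the commonly quoted 98.6 falls inside the interval.
contains_98_6 = low <= 98.6 <= high
print(contains_98_6)
```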

When we interpret the confidence interval we want to say something like 2229

there is a 95% chance that the mean of the population lies between these two values.2239

Or another way we could say it is that if we draw samples at random, 95% of the intervals from those samples will include the population mean.2250

That is, 95% of the intervals constructed this way will include the population mean.2264

Let us think about this confidence interval, is it reasonable?2271

Is 98.6° included? That is supposed to be the mean for everybody.2280

We see that it is not actually.2286

Maybe this sample is odd because our confidence interval does not actually include the mean 2288

that we secretly know for the body temperature of people.2297

That is when confidence intervals are helpful. 2307

Here is example 4, in a random sample of 1000 community college students, their mean score on a quantitative literacy test was 310.2310

The standard deviation on this test of all the community college students who have taken it is 360.2324

Construct a 95% confidence interval for the mean of all community college students who have ever taken this test.2331

Here is our random sample and their mean or x bar is 310 but the standard deviation 2338

of all the students who have taken this test that is the sigma is 360.2351

Construct a 95% confidence interval. 2358

Well, the first part: the population we do not know, but we are given the population standard deviation.2361

And from that, let us construct the SDOM.2374

Well given that this n is quite large let us assume normality.2377

Here we could find out the standard error by putting 360 ÷ √ 1000.2382

Now going to the steps of our confidence interval, first we assume that x bar is the mean of our sampling distribution of the mean.2395

Here we could use the z instead of t because we have sigma and because of that we know that this is normal. 2412

That is going to be +1.96 and -1.96 in order to construct a 95% confidence interval.2425

Our CI is going to look something like this x bar + or – z × standard error.2436

If you sort of double click on standard error what you will find is sigma / √n.2446

Let us put in numbers here.2464

310 is our x bar.2467

Our z score is 1.96.2471

Our sigma is 360.2475

Our n is 1,000.2479

Let us put these in our calculators.2483

I will do the high end first 310 + 1.96 × 360 ÷√1,000.2487

Order of operations says it does not matter in which order you multiply and divide here.2508

That is my high end, 332.3, as the high scoring end.2516

The low scoring end, the lower bound of my 95% CI is 287.7.2524

That is going to be 287.7 as well as 332.3.2537
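As a check on the calculator work, here is the same interval sketched in Python. Looking up z with `NormalDist().inv_cdf(0.975)` instead of typing 1.96 is a choice made for this illustration; it gives essentially the same value.

```python
import math
from statistics import NormalDist

x_bar, sigma, n = 310, 360, 1000          # sample mean, population sigma, sample size
z_star = NormalDist().inv_cdf(0.975)      # about 1.96, leaves .025 in the upper tail

se = sigma / math.sqrt(n)                 # standard error, sigma is known
margin = z_star * se

low, high = x_bar - margin, x_bar + margin
print(round(low, 1), round(high, 1))
```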

With 95% confidence, the mean of the population should fall within this interval.2547

That is the end for our confidence intervals.2558

That is part one of confidence intervals.2561

Hope you join me for t distributions to find out why we use t instead of z sometimes.2566

Thank you for using