Lecture Comments (12)

0 answers

Post by Saadman Elman on June 1, 2015

It was helpful.

0 answers

Post by Bongani Makhathini on July 3, 2014

Thank Dr Ji Son. Love the material and am also realising how applicable it is to my work in Data Analysis.

0 answers

Post by Utomo Pratama on July 2, 2014

Dear Dr. Ji Son,

Most of the slide don't contain with your notes, particularly in summary part which is blank. Please provide with complete your handwriting.

Much appreciate.

1 answer

Last reply by: Angela Patrick
Thu May 8, 2014 3:30 PM

Post by Robert Putnam on January 13, 2014

Skewed distribution:  Why is it called skewed "right" or "left," whereas it looks like the data peaks at the left or right, respectively?

1 answer

Last reply by: Angela Patrick
Thu May 8, 2014 3:29 PM

Post by devraj1 on April 27, 2013

hey guys random question will watching all these videos on ap stats effectively prepare me for a 5 on
the ap exam? thanks

1 answer

Last reply by: Professor Needham
Thu Sep 12, 2013 6:20 PM

Post by Alex Pate on August 11, 2012

This video is missing 5 minutes and 20 seconds.
Please retrieve, but the guy is doing a great job!

1 answer

Last reply by: Professor Needham
Thu Sep 12, 2013 6:20 PM

Post by jesus pulido on July 18, 2012

this video for me cuts off before i'm able to see the extra examples is there any way to get the whole video to play?

0 answers

Post by Leah Clark on May 3, 2012

Sketch Problem 3-- Uniform, I think is what you meant-- not unimodal.

Frequency Distributions and Features


  • Intro 0:00
  • Roadmap 0:10
    • Data in Excel, Frequency Distributions, and Features of Frequency Distributions
  • Example #1 1:35
    • Uniform
  • Example #2 2:58
    • Unimodal, Skewed Right, and Asymmetric
  • Example #3 6:29
    • Bimodal
  • Example #4a 8:29
    • Symmetric, Unimodal, and Normal
    • Point of Inflection and Standard Deviation
  • Example #4b 12:43
    • Normal Distribution
  • Summary 13:56
    • Uniform, Skewed, Bimodal, and Normal
  • Sketch Problem 1: Driver's License 17:34
  • Sketch Problem 2: Life Expectancy 20:01
  • Sketch Problem 3: Telephone Numbers 22:01
  • Sketch Problem 4: Length of Time Used to Complete a Final Exam 23:43

Transcription: Frequency Distributions and Features

Hi and welcome back to

We are going to be talking about frequency distributions again but now we are going to be going a little more into detail about their features.0003

In the last lesson we covered how to look at the data in Excel.0013

There is a checkmark on top of that one, and we talked about how to go from data to frequency tables using our COUNTIF function. 0017

From frequency table to visualization.0026
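As a reminder of that data-to-frequency-table step, here is a minimal sketch in Python rather than Excel. The birth months are made-up values, and `frequency_table` is a hypothetical helper name, not something from the lesson; it simply does the same job as one COUNTIF per distinct value.

```python
from collections import Counter

# Hypothetical birth months for a handful of friends (made-up data).
birth_months = ["Jan", "Mar", "Jan", "Jul", "Mar", "Jan", "Dec"]

def frequency_table(values):
    """Count how often each distinct value occurs, like one COUNTIF per value."""
    return dict(Counter(values))

table = frequency_table(birth_months)

# A crude text visualization: one star per observation.
for value, count in table.items():
    print(f"{value:>3} | {'*' * count}")
```

The printed rows of stars are a rough stand-in for the bar charts used in the lesson.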

We are going to take another look at those same examples that we looked 0029

at before except now we are going to be talking about the features of these distributions.0033

In particular we are going to be looking at their shape.0037

There are a couple of shapes you should know after this.0040

One is uniform distributions.0043

Another one is going to be called unimodal.0047

Yet another is called bimodal.0054

In particular, we are going to be looking at one called normal.0059

We are also going to be talking about center.0065

We are not going to be talking about how to calculate the center of the distribution.0068

We are going to be talking about how to think about the center conceptually in three different ways, mean, median and mode.0072

We are not going to talk about how to calculate it yet.0080

We are also going to be talking a little bit about spread.0084

How spread out is this distribution.0086

Finally we will also mention outliers, gaps and clusters whenever they are relevant.0090

Recall example 1: here we looked at a data set of 100 friends 0098

and we looked at whether more of these friends are born in a particular month or another.0103

Note here that it really seems to be that no particular month is super popular.0108

This is what we call the uniform distribution.0113

If you sort of squint and blur your vision a little bit, it is almost like there is a flat line here.0116

Everybody is hovering close to that line.0125

No one month is more frequent in births than any of the other months by a lot.0130

Some of these months are a little more frequent but only by a little bit.0136

You could see there is relatively little change from month to month here.0143

Other uniform distributions also look like this sort of rectangle or flat shape 0147

and these distributions might be anything from deaths occurring on days of the week.0155

Is there any reason to believe that one particular day is more favorable to die on than the other?0160

Or in rolls of a six-sided die, is there a particular reason to believe that one side might come up more frequently than another?0165

Not if it is a fair die.0174
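The "squint and see a flat line" idea can be turned into a rough check. This is only a sketch: the counts are made up, `is_roughly_uniform` is a hypothetical helper, and the 50% tolerance is an arbitrary illustrative choice, not a statistical test.

```python
def is_roughly_uniform(counts, tolerance=0.5):
    """Return True if every count is within `tolerance` (as a fraction)
    of the average count. The 0.5 default is an arbitrary choice."""
    average = sum(counts) / len(counts)
    return all(abs(c - average) <= tolerance * average for c in counts)

# Made-up birth-month counts for 100 friends (January to December).
month_counts = [9, 7, 8, 10, 8, 9, 7, 8, 9, 8, 9, 8]
print(is_roughly_uniform(month_counts))   # True

# Made-up counts with one clearly dominant category.
peaked_counts = [2, 3, 40, 3, 2]
print(is_roughly_uniform(peaked_counts))  # False
```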

Now for example 2. In example 2 we looked at the same data set again, this time at the age distribution in the sample.0181

Here we do not have a uniform distribution.0191

No matter how much you squint your eyes you are not going to see sort of a flat shape.0193

You will see a peak right here, and this peak is often called a mode, the most frequent value.0197

This peak makes this a unimodal distribution.0208

I’m not going to call it example 2 anymore, I’m going to call it a unimodal distribution.0213

We will add on to that.0220

Not only that but this shape is what we call skewed.0222

If I decide to just draw a light little sketch over this guy, we see that it has this long what we call tail.0226

This tail goes out towards the right side, the larger values. Because it is skewed and the tail is to the right, we call it skewed right.0242

It is not only unimodal but it is also skewed right.0256

You often have a skewed distribution when you have some sort of minimum or maximum value that these values are all bumping up against.0263

In I think you have to be 13 years old to sign up and maybe a lot of 14 and 15 year olds.0273

Their parents are not letting them sign up.0279

At the bottom end there is sort of a wall, like an imaginary wall there.0282

The most popular at least in our sample seems to be in the 20’s and some of the older people use it.0290

There is no limit on that.0296

You could be 100 years old and still use

Since there is no limit on that, that tail can go on for a really long time.0300

These outliers out here, you could think of them as oddballs but we call them outliers.0306

Tails are often made up of outliers.0320

Note also that because this is skewed right, if we drew a line of symmetry from the mode and 0324

we imagine folding this distribution on itself, we would not have two sides that match up.0333

We call this asymmetric as well.0342

We learned a lot here, it is unimodal, it is skewed and it is asymmetric.0346
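One rough way to put "which side has the long tail" into code is to compare how far the data stretch on each side of the mode. This is a sketch under assumptions: the ages and heights are made up, `tail_direction` is a hypothetical helper, and the `ratio` cutoff for calling something skewed is arbitrary.

```python
from collections import Counter

def tail_direction(values, ratio=2.0):
    """Compare how far the data stretch on each side of the mode.
    `ratio` (how lopsided counts as skewed) is an arbitrary choice."""
    mode = Counter(values).most_common(1)[0][0]
    left_tail = mode - min(values)
    right_tail = max(values) - mode
    if right_tail > 0 and right_tail >= ratio * left_tail:
        return "skewed right"
    if left_tail > 0 and left_tail >= ratio * right_tail:
        return "skewed left"
    return "roughly symmetric"

# Made-up ages: a wall near 13, a peak in the 20s, a long tail of older users.
ages = [13, 18, 19, 20, 21, 21, 22, 23, 24, 25, 27, 30, 38, 52, 71]
print(tail_direction(ages))     # skewed right

# Made-up heights that stretch about equally far on both sides of the mode.
heights = [66, 68, 69, 69, 69, 70, 72]
print(tail_direction(heights))  # roughly symmetric
```

A single far-out outlier can flip the answer, so treat this as a way to think about tails rather than a definition of skew.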

Here we will learn yet another term, we see there are these gaps.0356

These are called gaps, nice and easy.0365

If we had a couple of people clustered in a group we call that a cluster.0369

A lot of these terms are pretty normal words that you use in everyday life.0377

Let us move on to example 3.0391

In example 3, we are interested in what the height distribution was in this sample.0393

Compare this distribution to our previous skewed distribution.0401

Is it skewed to the right or skewed to the left?0408

Is there some sort of tail here?0411

Not really.0414

There is no real tail that I can see but we do see that there are a couple of places that are popular modes, most frequent values.0416

These are 64 and 69, these seem to be the popular peaks.0428

Because we have one mode here and another mode here this is no longer a unimodal distribution.0436

This is what we call a bimodal distribution.0443

Instead of calling it example 3 I’m going to call it bimodal.0447

Is it symmetric? We could see it as almost having 2 bumps like that.0453

It is sort of symmetric but not perfectly symmetric.0464

There is no tail, and there are not very many gaps.0469

There is maybe a little bit of a gap here but not very much.0476

This is what we call a bimodal distribution.0480
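Spotting the peaks can be sketched as looking for bars that are taller than both of their neighbors. The counts below are made up so that the peaks land at 64 and 69 inches, and `find_peaks` is a hypothetical helper. Real data are bumpier, so a strict rule like this would usually flag extra small peaks.

```python
def find_peaks(counts):
    """Return the indexes of bars strictly taller than both neighbors."""
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i < len(counts) - 1 else 0
        if c > left and c > right:
            peaks.append(i)
    return peaks

heights = list(range(60, 76))  # 60 to 75 inches
counts = [1, 2, 4, 7, 10, 7, 5, 6, 8, 11, 8, 5, 3, 2, 1, 1]  # made-up frequencies

peak_heights = [heights[i] for i in find_peaks(counts)]
print(peak_heights)  # [64, 69]
```

Two peaks suggests a bimodal shape, one peak a unimodal shape.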

Let us think about this, height distributions.0483

Well our friends are both males and females.0488

and since males tend to be taller than females on average it might be that there is a cluster of males up here 0492

and a cluster of females down here that we cannot see right now.0499

Let us look at these two distributions, males and females separately.0504

Here is just the distribution of male heights from our sample.0511

Notice that here it is roughly symmetric, because when you look at this mode, there is our mode right here, and you draw a line of symmetry,0517

and you imagine folding it on itself, then you will get a pretty even looking hill right there.0531

You will get a pretty even looking hill with roughly similar numbers of people on this side as on this side.0547

This is what we would call a roughly symmetric distribution instead of example 4a.0558

It is also what we call unimodal because we only have one mode right here.0568

What else do we notice about this?0582

We do not really see a tail and further more it seems that this distribution seems to have a lot of people piled up around 69 inches.0584

With a lot more people close to 69 and fewer people farther away from 69 like at 75 or around 64.0596

This is what we call a normal distribution.0610

You could think of a normal distribution like a pile.0617

By definition a normal distribution is both unimodal and symmetrical.0620

In a normal distribution the mode and the mean, or the average, are going to be the same.0632

To think about the word average you might want to think of it like this in terms of distributions.0648

Imagine cutting out this distribution, like out of cardboard, and then trying to balance it on your finger.0653

Where the distribution would balance, that point, that is the mean.0662

Although we will learn to calculate this later, that is the image I want you to think of when you think of the mean.0668
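The balancing picture can be checked with a small sketch. The heights are made up, and `balance_point` and `net_tilt` are hypothetical helper names. The idea is that the pushes to the left and right of the mean cancel out exactly.

```python
def balance_point(values):
    """The mean: the point where the cardboard cutout would balance."""
    return sum(values) / len(values)

def net_tilt(values, pivot):
    """Sum of signed distances from the pivot; zero means balanced."""
    return sum(v - pivot for v in values)

heights = [64, 67, 68, 69, 69, 69, 70, 71, 74]  # made-up heights in inches

center = balance_point(heights)
print(center)                     # 69.0
print(net_tilt(heights, center))  # 0.0, balanced
print(net_tilt(heights, 68))      # 9, tips toward the larger values
```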

If we draw a smooth line around this distribution and look on either side of the mode, halfway up the peak is about 50%.0675

Around here at about 60% of the height of this peak you will have what is called the point of inflection.0692

Here is what is so important about the point of inflection.0709

Although you cannot see it very well from my picture, I will exaggerate it.0713

The point of inflection is where the distribution goes from being concave to being convex.0716

That is about right and this point of inflection is going to be important later because this distance right here, 0726

this distance is going to be called the standard deviation.0736

Later we will learn exactly how to calculate that but that point of inflection and the standard deviation 0752

is going to be really critical to our understanding of other distributions as well.0758
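For an idealized normal curve, the "about 60% of the height" figure and the concave-to-convex switch can be checked numerically. This is a sketch that uses the standard bell curve formula. The center of 69 and standard deviation of 3 are hypothetical values, and `second_difference` is a simple numerical stand-in for curvature.

```python
import math

def normal_curve(x, mu, sigma):
    """Height of the normal bell curve at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def second_difference(f, x, h=1e-3):
    """Numerical curvature: negative means concave, positive means convex."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

mu, sigma = 69.0, 3.0  # hypothetical center and standard deviation, in inches

def curve(x):
    return normal_curve(x, mu, sigma)

# Height one standard deviation from the center, relative to the peak.
ratio = curve(mu + sigma) / curve(mu)
print(round(ratio, 3))  # 0.607

# Curvature just inside and just outside one standard deviation.
print(second_difference(curve, mu + 0.5 * sigma) < 0)  # True: concave
print(second_difference(curve, mu + 1.5 * sigma) > 0)  # True: convex
```

The switch from concave to convex happens one standard deviation from the center, where the curve sits at roughly 60% of its peak height.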

Here we see both males and females heights plotted on this frequency distribution.0765

You could see here is our female distribution and here is our male distribution.0777

Here you could see there is roughly a normal distribution for the females as well as the males.0792

We are going to say two normal distributions.0799

They are both unimodal and they both are roughly symmetric on both sides.0806

There is no tail.0814

No big gaps.0816

There is a big cluster in the middle and that is about it.0817

Here, what we thought before was a bimodal distribution, we now see is actually two normal unimodal distributions instead.0823
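Splitting one pile of data into its groups can be sketched like this. The records are made up, and `counts_by_group` is a hypothetical helper. The combined counts show two equally popular heights, while each group on its own has a single most frequent value.

```python
from collections import Counter

# Made-up (sex, height in inches) records.
friends = [
    ("F", 62), ("F", 63), ("F", 64), ("F", 64), ("F", 64), ("F", 65), ("F", 66),
    ("M", 67), ("M", 68), ("M", 69), ("M", 69), ("M", 69), ("M", 70), ("M", 71),
]

def counts_by_group(records):
    """Build one frequency table per group label."""
    groups = {}
    for group, value in records:
        groups.setdefault(group, Counter())[value] += 1
    return groups

combined = Counter(height for _, height in friends)
by_sex = counts_by_group(friends)

print(combined[64], combined[69])            # 3 3, two peaks when combined
print(by_sex["F"].most_common(1)[0][0])      # 64, one mode for females
print(by_sex["M"].most_common(1)[0][0])      # 69, one mode for males
```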

Let us summarize what we have learned so far. 0839

We have learned four different shapes, uniform, skewed to the right or to the left, bimodal and normal.0843

We have also learned asymmetric and symmetric.0850

Here I’m asking is this one symmetric or this one asymmetric?0853

The uniform one, yes it is largely symmetric because rectangles are symmetric.0858

Skewed, are they symmetric?0863

No, because either the right tail is long or the left tail is long.0866

Bimodal, are they symmetric?0878

This one is sort of a "sometimes."0881

There can be times when these are symmetric.0883

For instance if you have two that look like that, it is roughly symmetric but you may also have bimodal distributions that look like this.0886

Then that one does not look as symmetric.0897

Normal distributions, yes they are symmetric, always.0901

I will just draw this here just so that you know.0907

Let us talk about the centers.0910

Does it have a clear mode?0913

Here it does not have a mode, there is not one most frequent value.0917

In fact all the values are roughly similarly frequent.0923

We will say no, it does not have a clear mode.0926

Typically the skewed distributions are unimodal.0929

Yes, unimodal.0933

What about bimodal distributions?0940

Do they have a mode?0942

Yes, they are overflowing with modes.0944

They have two modes in fact sometimes more.0944

You could have trimodal right? Yes.0949

What about normal distributions.0954

Well of course it has a mode because it is also unimodal.0956

Let us talk about spread.0966

What does the spread look like here?0968

The spread is roughly even, and it goes as far as the values go.0970

Does it use the point of inflection? No.0976

What about in a skewed distribution?0980

Do we use point of inflection there?0982

In a skewed distribution the point of inflection is weird because the point of inflection is going to cut it up 0985

at different places depending on whether you look at the right side of the mode or the left side of the mode.0990

Point of inflection is not quite as useful here.0995

In a bimodal distribution sometimes you can use the point of inflection but it gets complicated.1000

We will write in "it is complicated."1005

It is only for the normal distribution that the point of inflection comes in really handy.1014

So we will write in yes.1019

The distance from the mode, or the center, to that point of inflection is called the standard deviation.1021

Let us go on to some examples that you might frequently see in text books, AP statistics, as well as a lot of general reasoning questions.1042

These are what I like to call sketch problems.1056

They will give you some sort of data set that you only know a little bit about and they ask you what kind of distribution do you think it might have.1059

We can answer these questions now.1069

Here is sketch problem number 1.1071

What if you are asked to imagine the age of each person who got his or her first driver's license in your state last year?1074

That is going to be a distribution.1081

It is a whole bunch of numbers, whole bunch of different ages.1083

Let us think about this.1088

On the X axis we will probably put age.1091

Here we are going to put frequency but I'm not going to write that in.1094

Actually I will write that in.1106

Here is the Y axis.1108

Let us think about this.1110

Is there some sort of minimum or maximum age at which you can get your driver's license?1112

Yes, probably 16 in most states.1116

We will put 16 as the minimum age and probably a lot of people get their driver's license sort of early on, from 16 to 20.1122

There are probably very few people getting their first driver's license ever by the time they are 25, 30, or 40, even fewer people.1131

That is already starting to sound like maybe somewhat of a skewed distribution.1143

Probably lots of people in their early 20’s, maybe late teens, getting their driver's license 1149

and very few outliers getting their first driver's license when they are 40 or 50.1158

Even though you might not know very much about people getting their first driver's license you can already tell the shape of this distribution.1168

It is skewed but not only skewed but the tail is to the right.1178

We call that skewed right, it is probably unimodal or there is probably some cluster up here.1182

It is probably asymmetric because it is skewed.1191

Next example, here is sketch problem number 2.1200

Let us think about the life expectancy of females in Africa and Europe.1205

When we think about life expectancy that is considering how long are females in Africa and Europe going to live.1211

Age or years should probably be on the X axis.1217

On the Y axis once again we are going to be looking at frequency.1224

I will just say freq.1228

Let us consider the life expectancy in Africa and Europe.1232

Africa has a lot of diseases and malnutrition and other factors that are going to affect life expectancy of females.1236

Also Europe on the other side of the spectrum is going to have a lot fewer of those same issues.1245

The life expectancy of females in Africa might be shorter than life expectancy of those in Europe.1255

We might see something like a bimodal distribution that is actually caused by two unimodal distributions.1261

Let us put Africa in red and European females maybe in blue.1269

Maybe most European females die when they are older, like 70.1281

Maybe in Africa the life expectancy is less, maybe 50.1288

Here we see two unimodal distributions but it did not ask us to plot this separately.1298

When we combine these, we see a bimodal distribution.1305

Let us go into the next problem.1319

Sketch problem number 3 says well what about the distribution of the last two digits of the telephone numbers in the town or city where you live?1323

Do we have any reason to believe that some pairs of digits are going to be more popular than the others?1332

Let us think the last two digits of the telephone numbers.1341

If we put that on the X axis basically we can go from 00.1345

We can go from 00 all the way up to 99.1362

That is our range of possibility.1366

Let us see what might the frequency be.1371

Do we have a reason to believe that 00 is more or less popular than 99?1375

We do not really have a reason to think 99 is more or less popular than 62 or 47 or 35.1381

We might be thinking about a roughly uniform distribution where each of these are roughly equivalently popular.1390

You can continue that on, so this is probably one of those uniform distributions where none of the numbers is way more frequent than another one.1405
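If you had a real list of phone numbers, pulling out the last two digits and tallying them could be sketched like this. The numbers below are made-up placeholders, and `last_two_digits` is a hypothetical helper. With a whole town's numbers you would expect 100 bars, 00 through 99, of roughly similar height.

```python
from collections import Counter

def last_two_digits(phone_number):
    """Keep only the digits, then return the final two as a string like '07'."""
    digits = [ch for ch in phone_number if ch.isdigit()]
    return "".join(digits[-2:])

# Made-up placeholder numbers in a few different formats.
phone_numbers = ["555-0142", "555-0199", "(555) 010-0100", "555 0142"]

tally = Counter(last_two_digits(p) for p in phone_numbers)
print(tally["42"], tally["99"], tally["00"])  # 2 1 1
```

Keeping the result as a string preserves the leading zero, so 07 and 7 are not confused.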

Let us move on to sketch problem number 4.1425

What about the length of time students used to complete a final exam within a 50 minute class period?1428

Let us put minutes on our X axis and the frequency over here on the Y axis.1434

Now since it is a 50 minute limit, 50 is going to be the max value; people should not be allowed to use 51 or 52 minutes.1448

The numbers are probably bunched up against that wall.1460

Remember skewed distributions usually happen when there is some sort of imaginary wall, as in this case.1466

Probably most students might take a little less time, a little more time.1473

Maybe somewhere close to 50 and maybe some students will take all the way up to 50 minutes.1480

Maybe the students will be clustered around there and probably very, very few students will finish it in like 10 minutes or 20 minutes.1488

Maybe it will look something like this.1499

Fewer students are finishing it in 10 minutes but maybe there is one fast guy who does.1513

Maybe just a few more finishing it in 20 minutes but most of the students finishing around 40 or 50 minutes.1518

That is the last example problem, thanks for using