For more information, please see full course syllabus of Statistics

### Frequency Distributions and Features

- Intro 0:00
- Roadmap 0:10
- Data in Excel, Frequency Distributions, and Features of Frequency Distributions
- Example #1 1:35
- Uniform
- Example #2 2:58
- Unimodal, Skewed Right, and Asymmetric
- Example #3 6:29
- Bimodal
- Example #4a 8:29
- Symmetric, Unimodal, and Normal
- Point of Inflection and Standard Deviation
- Example #4b 12:43
- Normal Distribution
- Summary 13:56
- Uniform, Skewed, Bimodal, and Normal
- Sketch Problem 1: Driver's License 17:34
- Sketch Problem 2: Life Expectancy 20:01
- Sketch Problem 3: Telephone Numbers 22:01
- Sketch Problem 4: Length of Time Used to Complete a Final Exam 23:43

### Transcription: Frequency Distributions and Features

*Hi and welcome back to www.educator.com.*0000

*We are going to be talking about frequency distributions again but now we are going to be going a little more into detail about their features.*0003

*In the last lesson we covered how to look at the data in Excel.*0013

*There is a checkmark on top of that one, and we talked about how to go from data to frequency tables using the COUNTIF function.*0017

*From frequency table to visualization.*0026
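The data-to-frequency-table step recapped above can also be sketched outside Excel. This is a minimal Python illustration, not the lecture's own method: the `birth_months` list is made-up data, and `Counter` stands in for one COUNTIF per category.

```python
from collections import Counter

# Hypothetical birth months for a handful of friends (made-up data).
birth_months = ["Jan", "Mar", "Mar", "Jul", "Jan", "Dec", "Mar"]

# Counter tallies how often each value occurs, much like one COUNTIF per month.
frequency_table = Counter(birth_months)

for month, count in frequency_table.items():
    # A row of '#' marks gives a rough text version of the bar chart.
    print(f"{month:>3} {count} {'#' * count}")
```

Each printed row is one line of the frequency table, and the row of marks is a crude stand-in for the visualization.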

*We are going to take another look at those same examples that we looked*0029

*at before except now we are going to be talking about the features of these distributions.*0033

*In particular we are going to be looking at their shape.*0037

*There are a couple of shapes you should know after this.*0040

*One is uniform distributions.*0043

*Another one is going to be called unimodal.*0047

*Yet another is called bimodal.*0054

*In particular, we are going to be looking at one called normal.*0059

*We are also going to be talking about center.*0065

*We are not going to be talking about how to calculate the center of the distribution.*0068

*We are going to be talking about how to think about the center conceptually in three different ways, mean, median and mode.*0072

*We are not going to talk about how to calculate it yet.*0080

*We are also going to be talking a little bit about spread.*0084

*How spread out is this distribution.*0086

*Finally we will also mention outliers, gaps and clusters whenever they are relevant.*0090

*Recall example 1. Here we looked at a data set of 100 Facebook friends*0098

*and we looked at whether more of these friends are born in a particular month or another.*0103

*Note here that it really seems to be that no particular month is super popular.*0108

*This is what we call the uniform distribution.*0113

*If you sort of squint and blur your vision a little bit, it is almost like there is a flat line here.*0116

*Everybody is hovering close to that line.*0125

*No one month is more frequent in births than any of the other months by a lot.*0130

*Some of these months are a little more frequent but only by a little bit.*0136

*You could see there is relatively little change from month to month here.*0143

*Other uniform distributions also look like this sort of rectangle or flat shape*0147

*and these distributions might be anything from deaths occurring on days of the week.*0155

*Is there any reason to believe that one particular day is more favorable to die on than the other?*0160

*Or in rolls of a six-sided die, is there a particular reason to believe that one side might come up more frequently than another?*0165

*Not if it is a fair die.*0174
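To make the fair-die case concrete, here is a small simulation sketch. The roll count of 6000 and the fixed seed are arbitrary choices for illustration, not values from the lecture.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the run is repeatable

# Simulate 6000 rolls of a fair six-sided die.
rolls = [random.randint(1, 6) for _ in range(6000)]
counts = Counter(rolls)

# With a fair die, each face is expected about 1000 times.
expected = len(rolls) / 6
for face in range(1, 7):
    # A ratio near 1.0 means the face is about as frequent as expected.
    print(face, counts[face], round(counts[face] / expected, 2))
```

The counts should all hover near the same value, which is the flat, rectangle-like shape described for uniform distributions.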

*Remember this is now example 2, in example 2 we look at the same data set again and we looked at the age distribution in the sample.*0181

*Here we do not have a uniform distribution.*0191

*No matter how much you squint your eyes you are not going to see sort of a flat shape.*0193

*You will see a peak right here, and this peak is often called a mode, the most frequent value.*0197

*This peak makes this a unimodal distribution.*0208

*I’m not going to call it example 2 anymore, I’m going to call it a unimodal distribution.*0213

*We will add on to that.*0220

*Not only that but this shape is what we call skewed.*0222

*If I decide to just draw a light little sketch over this guy, we see that it has this long what we call tail.*0226

*This tail goes out towards the right side, the larger values. Because it is skewed and the tail is to the right, we call it skewed right.*0242

*It is not only unimodal but it is also skewed right.*0256

*You often have a skewed distribution when you have some sort of minimum or maximum value that these values are all bumping up against.*0263

*On Facebook I think you have to be 13 years old to sign up, and maybe for a lot of 14 and 15 year olds*0273

*their parents are not letting them sign up.*0279

*At the bottom end there is sort of an imaginary wall.*0282

*The most popular age, at least in our sample, seems to be in the 20’s, and some older people use it too.*0290

*There is no limit on that.*0296

*You could be 100 years old and still use Facebook.*0297

*Since there is no limit on that, that tail can go on for a really long time.*0300

*These outliers out here, you could think of them as oddballs but we call them outliers.*0306

*Tails are often made up of outliers.*0320

*Note also that because this is skewed right, if we drew a line of symmetry from the mode and*0324

*we imagine folding this distribution on itself, we would not have two sides that match up.*0333

*We call this asymmetric as well.*0342

*We learned a lot here, it is unimodal, it is skewed and it is asymmetric.*0346
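The right-skew idea can be checked on numbers as well as by eye. The sketch below uses a made-up list of ages with a floor at 13 and a few older outliers. It compares how far the data stretch on each side of the mode, and it also applies a common rule of thumb that the lecture has not introduced at this point: in a right-skewed distribution the mean is usually pulled above the median.

```python
import statistics

# Made-up ages: a floor at 13, a peak in the early 20s, a few older outliers.
ages = [13, 18, 19, 20, 21, 21, 22, 22, 22, 23, 24, 25, 27, 30, 34, 41, 52, 67]

mode = statistics.mode(ages)      # most frequent value (the peak)
median = statistics.median(ages)  # middle value
mean = statistics.mean(ages)      # average

# How far the data stretch on each side of the peak.
left_tail = mode - min(ages)
right_tail = max(ages) - mode

print("mode:", mode, "median:", median, "mean:", round(mean, 2))
print("left tail:", left_tail, "right tail:", right_tail)
print("skewed right:", right_tail > left_tail and mean > median)
```

The long right tail also explains the asymmetry: folding at the mode leaves far more distance on one side than on the other.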

*Here we will learn yet another term, we see there are these gaps.*0356

*These are called gaps, nice and easy.*0365

*If we had a couple of people clustered in a group we call that a cluster.*0369

*A lot of these terms are pretty normal words that you use in everyday life.*0377

*Let us move on to example 3.*0391

*In example 3, we are interested in what the height distribution was in this sample.*0393

*Compare this distribution to our previous skewed distribution.*0401

*Is it skewed to the right or skewed to the left?*0408

*Is there some sort of tail here?*0411

*Not really.*0414

*There is no real tail that I can see but we do see that there are a couple of places that are popular modes, most frequent values.*0416

*These are 64 and 69, these seem to be the popular peaks.*0428

*Because we have one mode here and another mode here this is no longer a unimodal distribution.*0436

*This is what we call a bimodal distribution.*0443

*Instead of calling it example 3 I’m going to call it bimodal.*0447

*Is it symmetric? We could see it as almost having 2 bumps like that.*0453

*It is sort of symmetric but not perfectly symmetric.*0464

*There is no tail, and there are not very many gaps.*0469

*There is maybe a little bit of a gap here but not very much.*0476

*This is what we call a bimodal distribution.*0480

*Let us think about this, height distributions.*0483

*Well, our Facebook friends are both males and females,*0488

*and since males tend to be taller than females on average it might be that there is a cluster of males up here*0492

*and a cluster of females down here that we cannot see right now.*0499

*Let us look at these two distributions, males and females separately.*0504

*Here is just the distribution of male heights from our sample.*0511

*Notice that here it is not really asymmetric, because when you look at this mode, right here, and draw a line of symmetry,*0517

*and imagine folding it on itself, you will get a pretty even looking hill right there.*0531

*You will get roughly similar numbers of people on this side as on this side.*0547

*This is what we would call a roughly symmetric distribution instead of example 4a.*0558

*It is also what we call unimodal because we only have one mode right here.*0568

*What else do we notice about this?*0582

*We do not really see a tail, and furthermore this distribution seems to have a lot of people piled up around 69 inches,*0584

*with a lot more people close to 69 and fewer people farther away from 69, like at 75 or around 64.*0596

*This is what we call a normal distribution.*0610

*You could think of a normal distribution like a pile.*0617

*By definition, a normal distribution is both unimodal and symmetrical.*0620

*In a normal distribution, the mode and the mean (the average) are typically going to be the same.*0632

*To think about the word average you might want to think of it like this in terms of distributions.*0648

*Imagine cutting out this distribution, like out of cardboard, and then trying to balance it on your finger.*0653

*Where the distribution would balance, that point, that is the mean.*0662

*Although we will learn to calculate this later, that is the image I want you to think of when you think of the mean.*0668
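The balance-point picture can be sketched numerically. The heights below are made up for illustration. The idea is that at the mean, the distances of the values to the left and to the right cancel out, like weights balancing on a finger.

```python
# Made-up heights in inches, piled up around 69.
heights = [64, 67, 68, 69, 69, 69, 70, 71, 74]

# Candidate balance point: the mean (sum divided by count).
balance_point = sum(heights) / len(heights)

# Signed distance of each value from the balance point.
# Values to the left pull one way, values to the right pull the other.
net_pull = sum(h - balance_point for h in heights)

print("balance point:", balance_point)
print("net pull around it:", net_pull)

# Any other point leaves the cardboard tipping to one side.
print("net pull around 66:", sum(h - 66 for h in heights))
```

A net pull of zero is what "balanced" means here; moving the finger to 66 leaves a positive pull, so the cutout would tip to the right.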

*Suppose we draw a smooth line around this distribution. On either side of the mode, find about 60% of the height of this peak; this mark here is about 50%.*0675

*Around here, at about 60% of the height of this peak, you will have what is called the point of inflection.*0692

*Here is what is so important about the point of inflection.*0709

*Although you cannot see it very well from my picture, I will exaggerate it.*0713

*The point of inflection is where the distribution goes from being concave to being convex.*0716

*That is about right and this point of inflection is going to be important later because this distance right here,*0726

*this distance is going to be called the standard deviation.*0736

*Later we will learn exactly how to calculate that but that point of inflection and the standard deviation*0752

*is going to be really critical to our understanding of other distributions as well.*0758
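For readers who want to see where the "about 60% of the height" figure comes from, here is a short derivation. It assumes the idealized normal curve with mean $\mu$ and standard deviation $\sigma$, symbols the lecture has not introduced yet.

$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad
f''(x) = \frac{(x-\mu)^2 - \sigma^2}{\sigma^4}\, f(x).
$$

The curvature $f''(x)$ changes sign exactly where $(x-\mu)^2 = \sigma^2$, that is, at $x = \mu \pm \sigma$. So the points of inflection sit one standard deviation away from the center, and the height there relative to the peak is

$$
\frac{f(\mu \pm \sigma)}{f(\mu)} = e^{-1/2} \approx 0.607 .
$$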

*Here we see both male and female heights plotted on this frequency distribution.*0765

*Just so you can see, here is sort of our female distribution and here is our male distribution.*0777

*Here you could see there is roughly a normal distribution for the females as well as the males.*0792

*We are going to say two normal distributions.*0799

*They are both unimodal and they both are roughly symmetric on both sides.*0806

*There is no tail.*0814

*No big gaps.*0816

*There is a big cluster in the middle and that is about it.*0817

*What we thought before was a bimodal distribution, we now see is actually two normal, unimodal distributions instead.*0823
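The way two unimodal groups can combine into one bimodal distribution can be sketched with two small frequency tables. The counts below are made up (chosen so the peaks land at 64 and 69 inches, as in the example), and `peaks` is a hypothetical helper that simply looks for values more frequent than both neighbors.

```python
from collections import Counter

# Made-up frequency tables: height in inches -> number of people.
female = Counter({61: 1, 62: 2, 63: 4, 64: 6, 65: 4, 66: 2, 67: 1})
male = Counter({66: 1, 67: 2, 68: 4, 69: 6, 70: 4, 71: 2, 72: 1})

def peaks(freq):
    """Return the values that are more frequent than both neighbors."""
    return [
        value
        for value in sorted(freq)
        if freq[value] > freq.get(value - 1, 0)
        and freq[value] > freq.get(value + 1, 0)
    ]

# Adding Counters adds the counts height by height.
combined = female + male

print("female peaks:", peaks(female))
print("male peaks:", peaks(male))
print("combined peaks:", peaks(combined))
```

Each group alone has a single peak, while the combined table has two, which is the bimodal shape seen before the groups were separated.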

*Let us summarize what we have learned so far.*0839

*We have learned four different shapes, uniform, skewed to the right or to the left, bimodal and normal.*0843

*We have also learned asymmetric and symmetric.*0850

*Here I’m asking is this one symmetric or this one asymmetric?*0853

*The uniform one, yes it is largely symmetric because rectangles are symmetric.*0858

*Skewed, are they symmetric?*0863

*No, because either the right tail is long or the left tail is long.*0866

*Bimodal, are they symmetric?*0878

*This one is sort of a "sometimes."*0881

*There can be times when these are symmetric.*0883

*For instance if you have two that look like that, it is roughly symmetric but you may also have bimodal distributions that look like this.*0886

*Then that one does not look as symmetric.*0897

*Normal distributions, yes they are symmetric, always.*0901

*I will just draw this here just so that you know.*0907

*Let us talk about the centers.*0910

*Does it have a clear mode?*0913

*Here it does not have a mode, there is not one most frequent value.*0917

*In fact all the values are roughly similarly frequent.*0923

*We will say no, it does not have a clear mode.*0926

*Typically the skewed distributions are unimodal.*0929

*Yes, unimodal.*0933

*What about bimodal distributions?*0940

*Do they have a mode?*0942

*Yes, they are overflowing with modes.*0944

*They have two modes in fact sometimes more.*0944

*You could have trimodal right? Yes.*0949

*What about normal distributions.*0954

*Well of course it has a mode because it is also unimodal.*0956

*Let us talk about spread.*0966

*What does the spread look like here?*0968

*The spread is roughly even, and it goes as far as the values go.*0970

*Does it use the point of inflection? No.*0976

*What about in a skewed distribution?*0980

*Do we use point of inflection there?*0982

*In a skewed distribution the point of inflection is weird because the point of inflection is going to cut it up*0985

*at different places depending on whether you look at the right side of the mode or the left side of the mode.*0990

*Point of inflection is not quite as useful here.*0995

*In a bimodal distribution sometimes you can use the point of inflection but it gets complicated.*1000

*We will write in "it is complicated."*1005

*It is only for the normal distribution that the point of inflection comes in really handy.*1014

*So for that one, yes.*1019

*The distance from the mode or the center to that point of inflection is called the standard deviation.*1021
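As a numeric sanity check on the link between the point of inflection and the standard deviation, the sketch below evaluates an idealized normal curve. The center of 69 and standard deviation of 3 are assumed values for illustration, and `normal_pdf` and `curvature` are hypothetical helper names.

```python
import math

def normal_pdf(x, mu=69.0, sigma=3.0):
    """Height of the idealized normal curve at x."""
    scale = sigma * math.sqrt(2 * math.pi)
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / scale

def curvature(x, h=1e-3):
    """Central second difference, an estimate of how the curve bends at x."""
    return (normal_pdf(x + h) - 2 * normal_pdf(x) + normal_pdf(x - h)) / h ** 2

# Height one standard deviation from the center, relative to the peak.
ratio = normal_pdf(69.0 + 3.0) / normal_pdf(69.0)
print("height at center + 1 SD, as a share of the peak:", round(ratio, 3))

# Inside one SD the curve bends downward; outside it bends upward.
print("bends down at 71:", curvature(71.0) < 0)
print("bends up at 73:", curvature(73.0) > 0)
```

The bend flips between 71 and 73, on either side of 72, which is one standard deviation from the assumed center of 69.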

*Let us go on to some examples that you might frequently see in text books, AP statistics, as well as a lot of general reasoning questions.*1042

*These are what I like to call sketch problems.*1056

*They will give you some sort of data set that you only know a little bit about and they ask you what kind of distribution do you think it might have.*1059

*We can answer these questions now.*1069

*Here is sketch problem number 1.*1071

*What if you are asked to imagine the age of each person who got his or her first driver's license in your state last year?*1074

*That is going to be a distribution.*1081

*It is a whole bunch of numbers, whole bunch of different ages.*1083

*Let us think about this.*1088

*On the X axis we will probably put age.*1091

*Here we are going to put frequency, but I'm not going to write that in.*1094

*Actually I will write that in.*1106

*Here is the Y axis.*1108

*Let us think about this.*1110

*Is there some sort of minimum or maximum age at which you can get your driver's license?*1112

*Yes, probably 16 in most states.*1116

*We will put 16 as the minimum age, and probably a lot of people get their driver's license sort of early on, from 16 to 20.*1122

*There are probably very few people getting their first driver's license ever by the time they are 25 or 30, and at 40, even fewer people.*1131

*That is already starting to sound like maybe somewhat of a skewed distribution.*1143

*Probably lots of people in their early 20’s, maybe late teens, are getting their driver's license*1149

*and very few outliers are getting their first driver's license when they are 40 or 50.*1158

*Even though you might not know very much about people getting their first driver's license, you can already tell the shape of this distribution.*1168

*It is skewed but not only skewed but the tail is to the right.*1178

*We call that skewed right, it is probably unimodal or there is probably some cluster up here.*1182

*It is probably asymmetric because it is skewed.*1191

*Next example, here is sketch problem number 2.*1200

*Let us think about the life expectancy of females in Africa and Europe.*1205

*When we think about life expectancy that is considering how long are females in Africa and Europe going to live.*1211

*Age or years should probably be on the X axis.*1217

*On the Y axis once again we are going to be looking at frequency.*1224

*I will just say freq.*1228

*Let us consider the life expectancy in Africa and Europe.*1232

*Africa has a lot of diseases and malnutrition and other factors that are going to affect life expectancy of females.*1236

*Also Europe on the other side of the spectrum is going to have a lot fewer of those same issues.*1245

*The life expectancy of females in Africa might be shorter than the life expectancy of those in Europe.*1255

*We might see something like a bimodal distribution that is actually caused by two unimodal distributions.*1261

*Let us put Africa in red and European females maybe in blue.*1269

*Maybe most European females die when they are older, like 70.*1281

*Maybe in Africa the life expectancy is less, maybe 50.*1288

*Here we see two unimodal distributions but it did not ask us to plot this separately.*1298

*When we combine these, we see a bimodal distribution.*1305

*Let us go into the next problem.*1319

*Sketch problem number 3 says well what about the distribution of the last two digits of the telephone numbers in the town or city where you live?*1323

*Do we have any reason to believe that any two digits are going to be more favored than the others?*1332

*Let us think the last two digits of the telephone numbers.*1341

*If we put that on the X axis basically we can go from 00.*1345

*We can go from 00 all the way up to 99.*1362

*That is our range of possibility.*1366

*Let us see what might the frequency be.*1371

*Do we have a reason to believe that 00 is more or less popular than 99?*1375

*We do not really have a reason to think 99 is more or less popular than 62 or 47 or 35.*1381

*We might be thinking about a roughly uniform distribution where each of these is roughly equally popular.*1390

*You can continue that on, so this is probably one of those uniform distributions where no one number is way more frequent than another.*1405

*Let us move on to sketch problem number 4.*1425

*What about the length of time students used to complete a final exam within a 50 minute class period?*1428

*Let us put minutes on our X axis and the frequency over here on the Y axis.*1434

*Now since it is a 50 minute limit, 50 is going to be the max value; people should not be allowed to use 51 or 52 minutes.*1448

*The numbers are probably bunched up against that wall.*1460

*Remember, skewed distributions usually happen when there is some sort of imaginary wall, as in this case.*1466

*Probably most students might take a little less time, a little more time.*1473

*Maybe somewhere close to 50 and maybe some students will take all the way up to 50 minutes.*1480

*Maybe the students will be clustered around there and probably very, very few students will finish it in like 10 minutes or 20 minutes.*1488

*Maybe it will look something like this.*1499

*Fewer students are finishing it in 10 minutes but maybe there is one fast guy who does.*1513

*Maybe just a few more finishing it in 20 minutes but most of the students finishing around 40 or 50 minutes.*1518
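A sketch like this one can also be drawn as a quick text histogram. The finishing times below are made up to match the description (a hard wall at 50 minutes, most students near it, a few fast finishers), and the 10-minute bin width is an arbitrary choice.

```python
from collections import Counter

# Made-up finishing times in minutes, capped at the 50 minute limit.
times = [12, 24, 31, 36, 39, 41, 43, 44, 45, 46, 47, 47, 48, 48, 48, 49, 50, 50]

# Group times into 10-minute bins: 10-19, 20-29, and so on.
# The bin labeled 50 holds only students who used the full 50 minutes.
bins = Counter((t // 10) * 10 for t in times)

for start in sorted(bins):
    # One '#' per student gives a rough sideways sketch of the shape.
    print(f"{start:>2}+ {'#' * bins[start]}")
```

The bars pile up against the 50 minute wall and thin out toward the smaller values, so the tail points left, the mirror image of the driver's license sketch.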

*That is the last example problem, thanks for using www.educator.com.*1525

0 answers

Post by Saadman Elman on June 1, 2015

It was helpful.

Thanks.

0 answers

Post by Bongani Makhathini on July 3, 2014

Thank Dr Ji Son. Love the material and am also realising how applicable it is to my work in Data Analysis.

0 answers

Post by Utomo Pratama on July 2, 2014

Dear Dr. Ji Son,

Most of the slide don't contain with your notes, particularly in summary part which is blank. Please provide with complete your handwriting.

Much appreciate.

1 answer

Last reply by: Angela Patrick

Thu May 8, 2014 3:30 PM

Post by Robert Putnam on January 13, 2014

Skewed distribution: Why is it called skewed "right" or "left," whereas it looks like the data peaks at the left or right, respectively?

1 answer

Last reply by: Angela Patrick

Thu May 8, 2014 3:29 PM

Post by devraj1 on April 27, 2013

hey guys random question will watching all these videos on ap stats effectively prepare me for a 5 on the ap exam? thanks

1 answer

Last reply by: Professor Needham

Thu Sep 12, 2013 6:20 PM

Post by Alex Pate on August 11, 2012

This video is missing 5 minutes and 20 seconds.

Please retrieve, but the guy is doing a great job!

1 answer

Last reply by: Professor Needham

Thu Sep 12, 2013 6:20 PM

Post by jesus pulido on July 18, 2012

this video for me cuts off before i'm able to see the extra examples is there any way to get the whole video to play?

0 answers

Post by Leah Clark on May 3, 2012

Sketch Problem 3-- Uniform, I think is what you meant-- not unimodal.