Sign In | Subscribe
Start learning today, and be successful in your academic & professional career. Start Today!
Loading video...
This is a quick preview of the lesson. For full access, please Log In or Sign up.
For more information, please see full course syllabus of Statistics
  • Discussion

  • Download Lecture Slides

  • Table of Contents

  • Transcription

  • Related Books

Bookmark and Share
Lecture Comments (10)

0 answers

Post by Michael Onizak on March 10, 2014

Just a heads up; those using MS Excel 2007 the pivot table button is located under the "Insert" tab not the "Data" tab. Also, as you saw the Apple version having three separate windows to create the pivot table where as the MS 2007 will be in one window. Hope this helps

0 answers

Post by steven bain on December 2, 2013

The files appear to the wrong files for this lesson waste of time!

0 answers

Post by Christopher Hu on November 17, 2013

hi Dr.  do you cover p-value?  thanks

0 answers

Post by Patrick Manuel on June 28, 2013

Where are the exercise files? I can't find them. Please post the exercise files. Thanks.

0 answers

Post by Monica Ballard on February 12, 2013




For the first expercise, my formula appears right in excel but I keep getting the answer of 0 rather than 7

0 answers

Post by Monica Ballard on February 12, 2013

For the first expercise, my formula appears right in excel but I keep getting the answer of 0 rather than 7

0 answers

Post by Brijesh Bolar on August 12, 2012

Very nice sessions. It is a pleasant surprise to learn excel as a part of stats course. I always wanted to learn excel for stats and this fulfills my wish.

2 answers

Last reply by: Jorge Delgado
Fri Nov 2, 2012 10:21 AM

Post by Robert Hsiao on January 27, 2012

I would like to download this data file, so it is easier to follow the lesson.

Frequency Distributions in Excel

Lecture Slides are screen-captured images of important points in the lecture. Students can download and print out these lecture slide images to do practice problems as well as take notes while watching the lecture.

  • Intro 0:00
  • Roadmap 0:08
    • Data in Excel and Frequency Distributions
  • Raw Data to Frequency Tables 0:42
    • Raw Data to Frequency Tables
    • Frequency Tables: Using Formulas and Pivot Tables
  • Example 1: Number of Births 7:17
  • Example 2: Age Distribution 20:41
  • Example 3: Height Distribution 27:45
  • Example 4: Height Distribution of Males 32:19

Transcription: Frequency Distributions in Excel

Hi welcome to www.educator.com.0000

We are going to be talking about how to create frequency distributions in Excel from raw data.0003

We are just going to overview when sample data set in Excel already, you can download it from one of the links below.0012

When we are going to talk about how to create frequency distributions from that data 0022

but in order to create these distributions visualize a bowl of seeable distributions.0027

We need to go first from the data to frequency tables, then from the tables we will go to the visualizations.0034

First, going from raw data to frequency tables.0046

The reason we want to do this is oftentimes when we look at raw data it is really hard to make sense of.0050

It is just rows and rows and rows of data.0055

It would be nice if somebody could summarize that data for us so that we can visualize it.0059

When we summarize and visualize that data we get a sense of what the data looks like.0066

We are going to be talking later about actual shapes of distributions.0071

There are two ways to go and do frequency tables in Excel.0076

One is by using formulas.0080

Here we are going to be using the formula count F and the other way is to use pivot tables.0083

I’m going to show you one example of using pivot tables but we are going to be using mostly the formulas.0090

If you want to open up your Excel file that has all of our data in it, this is a sample data set of 100 friends from www.facebook.com.0100

Notice that they all have this CID which is their case ID and each column shows some sort of characteristic or variable.0110

Each cell for each person has a value for that variable.0124

Let us look at example 1, CID 1, case number1.0131

For this person they have 4 tagged photos, not a lot of tagged photos.0137

They have to seem 0 mobile uploads, again not a lot of mobile uploads, maybe they do Not have a smart phone right?0143

If we go down the line we could see that there are lots and lots and lots of variables here.0150

There are tagged photos, mobile photos, uploaded photos, profile pictures, then number of friends, number of siblings, relationship status right?0154

There is a whole bunch of these.0167

Here is one that we are going to be focusing on today, birth month.0169

Birth month is going to be important for us today.0172

We are going to be looking at age and height.0176

If I asked you if you see these 100 people and I will show them to you all at once so you could see them.0184

Here is this 100 people what can you tell me about their age.0193

What can you tell me about their height?0197

It will be hard to do because it is just lines and lines and lines of data.0199

It will be nice if there was one way where we could just easily see all the data at once in a way where it was a little more tangible to us.0204

That is where we are going to be talking about how to visualize these and how to create frequency tables.0216

In the files that I provided for you, I put in little tabs already.0219

One of the sheets has all of our data in it and one of the sheets talks about the variables.0225

Here we have a whole bunch of different variable names like the case ID number, the tagged photos, 0235

how many photos they are tagged in, mobile uploads, how many mobile photos uploaded, relationship status, birth month, birth year, gender.0239

These are a whole bunch of different variables that are already in this data set.0249

I also have a column that tells you what kind of measure it is.0254

Is it a nominal measure where it is just a number but it really stands for a name?0259

Relationship status is one of those where there is a number there like 1, 2, 3 or 4 but it does not mean 0264

that the relationship status is literally like the number 1, it actually means if you scroll over, if they have a zero it means that their 0440.9 relationship status is blank.0271

If they have a 1 it means that their single.0283

If they have 2, that means there in a relationship.0286

If it is 3, they are engaged.0289

If it is a 4, they are married and if it is a 5, it is complicated.0291

And 6 if it is other right?0295

That is an example of what we call a nominal type of measure.0297

Just so you can see all of these things at the same time, if you look down here there is this two little blue rectangles.0302

If you drag that over then you could sort of keep this column just static and locked while you move these columns.0311

We can also see that birth month is what we call an interval, it can also be seen as ordinal.0324

It is not quite interval because it is technically like 30 or 31 days, it is not exactly the same interval but you could sometimes call it interval.0332

Each of the numbers represent one of the months.0344

Birth year is also interval, there is an interval of exactly one year.0350

Gender is obviously nominal because even though there is a 1 or 2 it does not mean that their gender is 1 or 2.0354

It means that if they have a 1 they are male.0362

If they have a 2, they are female.0364

Some things like friends is really to understand though because friends is a ratio measure.0366

It is the count of how many friends they have so that is continuous type of variable and if they have a 0 means they have no friends.0372

That is very rare on www.facebook.com but it could happen.0382

I’m going to move this locked piece over.0387

The next tab you could see there it says birth month on it.0391

So far I have created a little set up so that we could begin our frequency table.0397

A frequency table is just a count of how many people are born in January.0402

How many people are born in February and so on and so forth. 0408

Now if we have to do that by hand it would be hard.0412

We have to go to our data, click on data.0414

Go to birth month and we have to count up how many people have one, 1, 2.0418

But this is a very error prone process so we are going to use Excel to help us do that really efficiently.0426

First, let us go to our first example.0436

We have here a data set with data from 100 www.facebook.com friends.0440

More of these friends born in a particular month or is the number of births fairly uniform across the year.0444

Well is there reason to believe that one month is more popular for having babies than another month?0452

We are not sure but it is hard to see the answer to this question literally like see the answer to this question 0458

by looking at the data because the data just look like this giant list.0464

That is why we are going to create frequency tables.0470

In order to create frequency tables we can start off with the formula.0474

In order to do a formula remember we always start off with the equal sign (=) to tell Excel “hey I’m doing a formula here”.0479

In order to count how many ones we have we could use the count formula.0486

It is a formula that is already prewritten in Excel.0493

Excel will just do it for us.0496

If we just stopped at the word count, it would just count how many things you have.0498

It would not count how many ones you have, right.0504

We want to use the formula count if, that is the function that we want to use0508

What is handy about Excel is that once you type in something then it will tell you what inputs you need.0514

Here it says you need the range.0521

The range of cells that you want Excel to look at as well as the criteria.0523

Here I’m going to tell Excel we will look over at my data.0529

I’m going to click on data and click from this one all the way down to the very very last row. 0535

And if I go back to birth month then it should say date from row I2 all the way to I101 but it has it twice, I’m going to delete this part.0547

That is the data that I wanted to look at.0567

This little column right here is telling you the range.0570

It says go from I2 all the way to I101.0574

That is the criteria I want and before I put my criteria Excel tells me, it reminds me I need a little comma in between.0579

I’m going to put a little comma.0589

What is my criteria? I wanted to count it if it is a 1.0591

I’m going to say if is equal to whatever is in this cell.0598

Excel will automatically put in that this is part of the birth month sheet.0602

It actually does not need this one either but it will put it in automatically for you.0611

I’m going to delete that one just so you could see but you could have it there as well.0616

It does not matter.0620

Let me finish my little function and let us look at what it says.0623

It says count if the data in this range is basically equal to whatever is in a2, this one.0626

Let me hit enter and it should say 7.0636

7 people out of my 100 www.facebook.com friends are born in the month of January.0639

The great thing about Excel is that it is a relative program.0645

If I copy and paste this cell, one cell down it will take everything in my formula and sort of calibrate it one cell down, right.0649

Let me look at this, do I wanted to bring everything one row down?0665

That means my data would go from I3 to I102.0671

That is not what I want.0677

I want the data part to stay the same but I want this part to move and moved down.0678

So that then it will say count if this data is equal to 2.0684

Here is what I’m going to do, to tell Excel keep this part the same.0691

I’m going to tell I’m going to put in a dollar sign ($) right in front of the I and right in front of 2.0695

This says freeze the row and freeze the column.0702

I’m going to put that also in front of this one, as well as that one.0706

That means this data set will never move but this A2 will move.0712

Notice that doing that does not change anything from my first row but I’m going to take this and copy it.0718

I’m just hitting either command c if you are on a mac and control c if you are on a pc and then pasting it one cell underneath.0724

Let us double click on this to see what it says.0736

It says count if data and my data states exactly the same from I2 to I101.0739

That is exactly what I wanted to do.0745

Notice that now my criteria has changed.0747

My criteria has moved one row down because I have copied and paste in my formula, one row down.0750

Excel it is relative.0757

It will move everything one row down.0759

Let us try it with the next one.0762

I’m just copying and pasting this one, one row down.0764

Let us double click on it to see what it says.0769

It says count if.0771

Data stays exactly the same from row 2 to 101 but now it is comparing it to whatever is in A4 which is March.0774

The nice thing about Excel is that if you look right at the corner here, there is this little box in the lower right hand corner.0785

If you put your mouse over that it will turn into a little cross.0794

If I drag that all the way down, it will copy and paste my formula again and again all the way down.0800

We could just check one of these down here once again my data set has stayed the same because I put those dollar signs ($) in there.0807

My criteria has moved down to A10 now.0816

I have my frequency table now.0820

Frequency tables are nice because they just give you the raw numbers in the month of January there are 7 people who have birthdays then.0824

In the month of July there are 10 people who have birthdays then.0833

We could look at our data.0837

We could stop here but I want to show you another way that we could create frequency tables.0839

I’m going to go back on my data and show you a second way.0848

The second way is less common but I still want to show it to you because we may use it once in a while.0853

We are going to use what is called pivot tables.0857

What I’m going to do is just put my cursor anywhere and open my Excel toolbar.0862

Unfortunately, you cannot see it on this screen.0870

Open my Excel toolbar.0872

There is a little tab called data.0873

Seldom used.0877

If you scroll down there should be something that says pivot table or pivot table report.0880

I’m going to click on that.0888

Once that comes up, you should have a little pivot table wizard that pops up and you will say “where is this data you want to analyze?”0893

It is on my Microsoft Excel data base.0903

Is this the data you want to use?0907

Yes, I want from A all the way to N and from A1 all the way to 101.0908

That is next.0924

I want to put my pivot table on a new sheet, just so I can show you.0925

I’m just going to hit finish.0930

A new sheet should pop up, it is probably be called sheet 1.0934

I’m just going to make this a little bigger for you.0939

A little pivot table should pop up.0945

You should also have a little pivot table tool bar that also pops up.0949

Let me drag it in for you.0955

Here we go.0966

This is the little pivot table tool bar that comes up.0967

This pivot table tool bar has all of my variables in it.0970

I could drag these variables into this pivot table down here.0975

It actually shows why it is called a pivot table.0979

I assume it is because you could move these variables from one corner to another and that is where we get the pivot.0982

What we want is a bunch of months on this side and then I want it to tell me how many people are born in that month.0990

I’m going to look for birth month and put it in my row fields because each row is going to be a birth month.1000

I’m going to take that birth month and drag it into my data as well.1006

What is does is it sums up how many of those birth months there are.1012

For January it sums up 1, seven times but for 2 I do not want it to sum up.1018

Instead I’m going to tell my pivot table count how many they are, do not sum them up.1026

Go to pivot table and go to field settings and I will hit count instead of sum.1032

Then hit Ok.1039

When that happens you can see we basically get the same numbers that we have when we use the formula.1040

In the month of January we have 7.1045

In the month of July we have 10.1048

This is another way that you could look for frequency tables.1050

Notice that this one is pretty fast.1056

Pivot tables do require a little bit of work but on the front in there is a little bit of learning curve.1059

Once you do understand that, they are really handy.1066

We maybe using them again in the future.1068

If you do not feel comfortable with them, feel free to also use the formulas.1072

I will be using the formulas for the rest of this lesson.1076

Let us go back to my birth month.1079

My birth month pivot table created just through Excel formulas by themselves.1083

I have this nice frequency table but it will be nice if I could visualize it.1089

Here I have to read each row and although for 12 months it is not so bad, they might be times when this is less helpful to us.1097

What I’m going to do is highlight the data that I want to visualize and then hit chart.1105

It should be one of the tabs up here or you could go get it through one of your Excel tabs.1115

I’m going to say give it to me in columns or you could use borrow as well.1122

In Excel it just means it is on this side.1133

I’m going to use columns for now.1135

I will just pick the first one.1138

It seems the simplest.1140

I’m just going to delete that legend, it is redundant.1144

Here is my frequency table and we could literally see our data.1149

It is also tells me what each of these bars stand for.1157

It stands for 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.1162

What Excel will do is it will automatically seed your X axis with just those numbers starting from 1 and it will go up.1166

With months, handle it but is the same thing that Excel is doing.1174

In another example we will need to put in our own X axis.1179

Notice that here these are not means, they are not averages, these are frequencies.1185

This means that 7 people were born in January, 10 people are born in July and 7 people are born in December.1191

And so that is what our birth month frequency visualization looks like.1205

This is our frequency distribution for birth month.1210

Let me minimize this.1215

If the number of friends born in a particular, is one month particularly popular for one of our friends are born.1217

It does not seem to be the case, the months all tend to be from something like 7 – 10 people per month.1225

It seems that the numbers are fairly uniform.1233

Let us go into our second example.1241

Here is another example and now let us take our same data, the data from 100 www.facebook.com friends 1243

and we are going to look at what is the age distribution in this sample.1249

Here is my Excel data I'm just going to click on the data sheet and here when we go up and look for age we could see here is a whole bunch of ages.1255

It seems like there is a lot of people in their 20’s.1268

A few people in their late teens but here we see some people who are 0 years old.1274

In this data set, if they have 0 it means that they do not list their year of birth or do they do not list their age.1281

Maybe they are embarrassed, maybe they are too young.1289

I do not know.1291

We do not learn a lot by just scrolling up and down on this data.1296

That is why it will be nice if we could look at a frequency table or look at a distribution visually.1300

I'm going to click on my age sheets and here I have already made set it up so that we could just 1306

do our frequency table really easily from the lowest age in our sample which is 17 I have ignored the 0 obviously 1314

to the oldest age in our sample which is 38 and there is all the ages in between.1323

Let us go ahead and put in our formula to find out how many people in our sample are 17 years old.1329

To start a formula we start with the equal sign (=).1335

We use count if because we do not want to count everybody, we just want to count the people who are 17.1338

Let us tell Excel where it should find our data, what is the range of data.1346

I’m going to click on data and click from this cell all the way down to row 101.1352

I know I need a comma after that.1364

I’m going to delete that part.1371

Here is our data range and I wanted to count it if this person is 17.1373

My inputs are there. Remember we want this data to stay the same all the time.1385

We do not want it to move because Excel will move it if it has the chance to.1389

I’m going to put dollar sign ($) in front of the L and the 2 and in front of the column indicator 1394

and the row indicator to tell it to lock this data in place.1400

Always use this data, do not change.1404

Once I have that formula, I'm just going to drag it all the way down so that it counts at the frequencies for 18, 19, 20 year olds.1409

Let us back check. Let us look at 21 year olds.1424

It says count if our data set has stayed the same because we have locked it in with our dollar sign ($).1426

Now it is saying I will count these people if they are 21 years old, that is our criteria.1433

It looks like our formula has copied and pasted quite well.1438

Notice that for some of these some of ages, the frequency is 0.1443

There are 0 people who are 26 years old in our sample.1449

Now why do I want to keep that 26 in there?1454

If we skipped down on 26 and 28, 29, 30, 31, 32 and we looked and there is 127 year old in 133 year old, 1457

we might mistakenly assume that from 27 to 33 there is equal chance of having 1471

at least one person from our sample being sort of in that range.1478

You could see that is actually not true.1483

In between there, there is like a big desert of nobody and we want our distribution to reflect that.1486

Age is a continuous variable and so we do not want to skip any ages.1494

We want to show how the distribution looks as we look at age continuously.1500

This is nice because we can already see that the ages are clamped or clustered around age maybe 20 – 22, early 20’s.1507

It will be nice if we could really look at this.1518

One thing you might want to do is click on select both age and frequency.1520

Go to charts and we are going to do an X, Y scatter.1530

For those of you who have Microsoft Excel later than 2008, like 2009 and later you can go directly to column 1538

but here we are going to start with 2008 Excel.1549

We are going to need to do a little fix.1555

First I’m going to click on a scatter.1557

A scatter is nice because it shows you both the age.1560

This is age 17 and the frequency.1566

Once we have that then I'm going to go to column and then it will show me 17 through 38.1572

If I had gone directly to column, here is what will happen.1584

If I did not go through scatter first, here is what will happen.1589

Let us say I just wanted the frequency, they will go directly to column it will not give me the proper ages on my X axis, 1595

it will only give me Excel’s default setting for the X axis which is just labeling it from 1 all the way to 22.1604

However many there are that is not what we want.1614

Instead we would rather have Excel label the correct ages for us.1618

Just so that we will know that this is a frequency distribution of ages later.1625

We should go and label are horizontal X axis, we can label that age.1632

In that way we will know it is a frequency table but it is a frequency table of ages that is what the 17 stands for.1641

What is the age distribution in the sample is largely young.1654

They are mostly on the young side with a few people sort of in their 30’s.1658

Example 3, again from our same www.facebook.com data, what is the height distribution in this data?1666

What did their heights looked like.1673

Let us see.1675

If we click on data and we look at their heights, their heights are listed in inches.1679

Remember that 5 × 12 is 60, 60 inches is about 5 feet tall and then 68 is 5’8.1685

It is a quick way to think of it.1696

72 is 6 feet tall, that person is pretty taller.1698

Once again if we just look at these row by row, it is a just bunch of numbers.1703

We do not need that, we would rather have a nice frequency table.1709

Let us go to height.1713

I have already seated it for you with the height that is the minimum height in our data set as well as the maximum height in our data set.1716

The minimum height happens to be a little bit just shy of 5 feet, 4’10.1724

This one is a little bit more than 6 feet tall, 6’3.1734

Let us put in our frequency function.1740

Count if and let us go ahead and select the data that we want to use.1746

Now that we know we basically need to lock it in place, let us do that right here.1759

Let us lock it in place.1766

We already locked our data in and what is our criteria?1774

I want you to count it if they are 58 inches tall.1779

It seems that there is only one person in our data set of 100 that has that height.1787

I’m just going to copy and paste that all the way down.1792

Once again I'm just going to spot check, 69 inches tall count if this is the correct data.1796

It is locked in and this is the correct criteria that I wanted to use for that row.1806

Good.1811

When we look at this, it seems that it is not that there is one cluster.1814

It seems like there is this sort of giant spread out cluster.1818

It will be nice if we can look at this visually.1825

Let us go ahead and select both columns.1829

Go to chart and go ahead and select XY scatter.1833

This is going to give us both, it is going to use the height as the x coordinate and the frequency as the y coordinate.1839

Here we see that all our frequencies are up here because all of our heights are from 58 to 75 inches.1850

Let us change that into a column chart.1862

Here is how our distribution looks like.1870

Just in case we come back to this later it will be nice to know what these numbers down here represent.1874

I'm going to go to my formatting palette, I’m going to close that.1879

I’m going to go to my formatting palette and tell my horizontal axis that it should be labeled height in inches.1884

That is what our distribution of heights looks like.1902

It looks like these over here, this one seems pretty popular and these seem sort of popular.1907

These are less likely and this one a little bit less likely.1915

This is a sort of what our shape looks like and it is nice and it is really easy to see when we see it in a visualization.1921

It is harder to see when we just look at the list of numbers.1928

Let us move on to our next example.1934

Example 4, now that is the height distribution of everybody in our 100 person www.facebook.com example.1940

But it is a mix of males and females.1948

What if we just wanted the height distribution of males?1951

After all males tend to be taller than females.1954

Their distributions might look different.1956

Let us look at the height distribution only of males.1958

We could also look at only the height distribution of females.1962

Feel free to do that if you want.1965

Here I'm going to use my height by gender and there is a male frequency column and a female frequency column.1970

Once again here are my heights but we will have to figure out in our data set which rows belong to males and which rows belongs to females.1982

Let us go back to our data set.1993

Here is my column for gender, my variable of gender.1998

Some people are gender number 2 and some people are gender number 1.2003

If we look at our variables we could see that gender has been dummy coded because it is a nominal measure.2008

We will get 0 if gender is blank or unavailable.2019

They got 1 if their gender was male and 2 if their gender was female.2024

Here is what we will do, we will take all of our data and sort it by gender 2030

so that all the 1 are clumped together and all the 2 are clumped together.2035

I'm going to use sort.2041

Sorry about that.2054

I think I did it and ended it, alright.2059

I’m going to use gender and I’m just going to sort it by clicking in this column.2060

I just want to make sure that these guys all moved with each other.2070

Now it is sorted so that all of my data for males is up on top and then all of my data for females is at the bottom.2077

Just to keep it straight for myself, I’m going to just color all the heights of males, all the values for height of males,2088

I'm going to color that with the blue font color.2098

Just to help myself keep it straight I’m going to color all the females height values with the sort of pinkish font color.2106

What does my distribution of only males look like?2119

We need to start off with the frequency table again.2123

Let us go to height by gender and here I will put in count if.2126

And let us put in my range.2136

Now my range is only going to be those that I have already colored blue 2138

because they only want my range to be those that are already identified as males.2143

Here I’m going to select all these blue guys and put a comma.2150

And then tell if a male is 58 inches tall then I definitely want you to count him.2164

It turns out there are 0 males that are that tall or that short for that matter.2178

We want to lock that data set in place because we know that this is not going to need to move for this column at least.2184

I’m going to go ahead and copy and paste that all the way down and we see that 2195

from the males the heights are sort of clustered up here rather than down here.2200

I wonder if that is the same for females.2208

Even though our question was really about males why do not we females too just to see.2210

I’m going to start with my count if.2217

The range for females needs to be all the data that has been already identified as females.2221

Here are these pink women and I’m going to go ahead and put in a comma because I know I will need one.2227

Go back here and I will say check if the female is that height.2235

Once again I want to lock in my data.2243

I do not want that to move when I copy and paste.2248

And then it turns out that our one person who is 58 inches tall before happened to be female and I’m going to drag that all the way down.2253

We see something different in females than we saw in males.2263

Females tend to be clustered around here and the most frequent height being about 64 inches.2267

For males, the heights are sort of clustered up here with the most frequent height being 69 inches.2275

Let us look at this now and visualization.2283

I’m just going to look at the heights of males for now.2288

Hit chart and go to XY scatter because I want to know both the height and the frequency of that height.2293

We see that males are clustered up here.2305

Let me change that into a column and what do we see?2309

We see that it is like a pile.2318

The males are sort of piled up around 68 – 70 and it falls off closer to 5 feet tall.2321

There is not as many people who are way taller than 6 feet.2330

That is the chart for males.2337

Feel free to go ahead and do the chart for females.2340

That is the end of our examples today.2346

Thanks for using www.educator.com.2348