Lecture Comments (1)

0 answers

Post by Brijesh Bolar on August 12, 2012

I am a bit confused between continuous and discrete variable? Why is no. of siblings a continuous variable and not discrete variable in the example 3 above.

Bar Graphs


  • Intro 0:00
  • Roadmap 0:05
    • Roadmap
  • Review of Frequency Distributions 0:44
    • Y-axis and X-axis
    • Types of Frequency Visualizations Covered so Far
    • Introduction to Bar Graphs
  • Example 1: Bar Graph 5:32
    • Example 1: Bar Graph
  • Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs? 11:07
    • Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
  • Example 2: Create a Frequency Visualization for Gender 14:02
  • Example 3: Cases, Variables, and Frequency Visualization 16:34
  • Example 4: What Kind of Graphs are Shown Below? 19:29

Transcription: Bar Graphs

Welcome back to 0000

Today we are going to be covering bar graphs.0002

Today’s roadmap looks like this: first we are going to review all the frequency distributions that we have done so far.0010

Then we are going to talk a little bit about bar graphs and how they are different.0017

Mainly, they involve categorical variables and looking at the frequency of each value.0020

Then we are going to contrast bar graphs and histograms because they are going to 0026

look very similar to each other but there are very different ideas underlying them.0030

But they look superficially similar.0036

We are going to talk about whether shape applies to bar graphs, and central tendency and spread as well.0038

First let us review all the frequency distributions we have done so far.0048

What was on the Y axis?0052

Well since they are frequency distributions, largely it is something like frequency. 0054

Sometimes you will see frequency distributions that have relative frequency and that does not change the shape, center, or spread.0060

Because you are basically just dividing by a constant. 0072
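As a minimal Python sketch of that point (not part of the lecture, which works in Excel), the counts below are made up for illustration. Dividing every frequency by the same total rescales the bars without changing which one is tallest.

```python
# Hypothetical frequency counts for a small made-up variable (not the lecture's data).
counts = {"a": 2, "b": 6, "c": 4}

total = sum(counts.values())  # the constant we divide by

# Relative frequency = frequency / total number of cases.
relative = {value: n / total for value, n in counts.items()}

# Every bar is scaled by the same factor, so the tallest bar stays
# the tallest and the overall shape is unchanged.
tallest_by_count = max(counts, key=counts.get)
tallest_by_relative = max(relative, key=relative.get)
```

The relative frequencies always sum to 1, which is the only thing that changes on the Y axis.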

What was on that X axis?0075

Remember, in all of our histograms and dot plots so far, the values of the variable are on the X axis.0078

In the case of stem plots, it is that sort of Y axis, that central column, that holds the values of the variable, whatever your variable is.0086

So far we have looked at variables such as height, number of friends, photos, number of photos, those kind of variables. 0105

We have each of those variables and we have all the values for that variable on the X axis.0128

Let us look at the different types of frequency visualizations we have looked at so far.0137

You could think of them as graphs or charts.0146

We have covered dot plots, remember what those look like?0154

They look like little dots or stars.0158

We have looked at histograms which look like bars, those are pretty frequent.0163

We have looked at stem and leaf plots that have the actual numbers in them like 2, 3, 4 and then they have 0 – 5, 0 – 2.0171

We are going to look at bar graphs.0184

When I draw a bar graph, which one does this look like?0186

It looks very similar to the histograms, right.0194

Here is the difference, so far in all three of these kinds here the variable of interest has always been continuous.0198

There is something in between 62 inches and 63 inches, right?0220

Like having 100 friends and 101 friends. 0227

It is meaningful having 101 friends, having one more friend.0233

These have been interval or ratio values, and so these variables have largely been continuous.0238

In bar graphs, I will color this red so that you know it is different.0248

In bar graphs, for the first time we are going to be looking at variables that are categorical.0258

If you recall categorical means that these variables are going to be like little bins, right?0268

There is nothing in between 1 and 2, in the case of categorical variables 0276

and these are going to be largely useful for things that are nominal measures.0283

Things like gender, being male or female.0288

There are some things in between male and female but for the purposes of statistics we treat it as a categorical variable.0291

There is nothing in between hair colors: you either have black hair, brownish hair, blondish hair, reddish hair, or other colors.0301

There is nothing in between eye color. 0315

It is not a continuum necessarily.0318

Because of that categorical variables are going to be visualized in a very different way.0321

These visualizations are going to be called bar graphs.0328

Let us go ahead and look at an example of the bar graph.0335

They basically look like histograms superficially but the way you could tell is by looking at the X axis.0338

This is how you will be able to tell whether it is a histogram or bar graph, because on the X axis of a bar graph you should see a categorical variable.0349

Let us look at an example.0361

If you pull up the Excel file that you could download, here we have all of these variables in these columns.0365

Each of our cases is one of our people from, these are our 100 friends from

Let us go down to the column that says relationship status, here it is.0381

Here is relationship status. 0387

This is something that really big for, because all of a sudden you could internet stalk people 0389

and figure out what their relationship status is.0395

But here what we see is relationship status is a whole bunch of numbers. 0398

If you click on the variables sheet, it will tell you what those numbers stand for.0404

Relationship status is in column H and we are going to color this in red fonts so you could clearly follow along.0410

If you scroll to the right it tells you that it is a nominal kind of measure and it is a categorical variable.0419

Perfect for doing bar graphs. 0425

Here is what the dummy coding looks like.0427

Although they are numbers there, those numbers do not actually stand for numbers.0430

They are just dummy codes because they are stand ins for nominal names, nominal categories.0434

Here it says if there is a 0 in that column, it just means that the relationship status is left blank or unfilled in.0443

If they have a 1 it means that they are single.0451

If it is a 2, they are in a relationship.0455

If it is a 3, they are engaged.0458

If it is a 4, they are married.0461

If it is a 5, it is complicated.0463

If it is a 6, they just put something else.0466

We know that they have 0 through 6 as the potential values that could be in that variable. 0470
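The dummy coding just described can be sketched in Python as a simple lookup table. The code-to-label mapping follows the lecture; the `decode` helper and the sample codes are hypothetical, added only for illustration.

```python
# Dummy codes for relationship status, as listed in the lecture.
RELATIONSHIP_STATUS = {
    0: "blank",
    1: "single",
    2: "in a relationship",
    3: "engaged",
    4: "married",
    5: "it's complicated",
    6: "other",
}

def decode(codes):
    """Translate a list of dummy codes into their category labels."""
    return [RELATIONSHIP_STATUS[code] for code in codes]

# Hypothetical column of coded responses (made up for illustration).
sample_codes = [1, 2, 1, 0, 4]
labels = decode(sample_codes)
```

The numbers are only keys into the table; nothing about 4 being larger than 1 carries meaning.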

Let us go to the relationship status sheet.0478

Here I have already filled in the category labels for you and the statuses that they would have.0482

Let us make a frequency table, this is going to look the same as before. 0488

Let us go ahead and put in our formula: the equal sign (=) first, then the function COUNTIF.0492

I want Excel to count a person if they have a blank in their relationship status.0501

I will put in a comma because I know I’m going to need that.0515

Here is my data and I know this is going to stay put if I put in dollar sign ($).0520

I’m going to lock it in place and that just tells me what sheet we are in.0529

Count in this data if it meets this criterion of 0.0538

I’m going to close my parentheses.0547

There are 13 people out of our 100 that have left their relationship status blank.0550

I’m just going to copy and paste all of that down and we see that just from looking at our frequency table, 0557

We could see that most of the people in our sample are single or in a relationship, not too serious. 0561
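For readers following along outside Excel, here is a rough Python equivalent of building the frequency table one count-if at a time. The `count_if` helper is a hypothetical stand-in for an exact-match COUNTIF, and the coded column is made up; the lecture's 100-row spreadsheet is not reproduced here.

```python
from collections import Counter

# Hypothetical coded column; the lecture's spreadsheet has 100 rows, not shown here.
status_codes = [1, 2, 1, 0, 4, 1, 2, 5, 1, 2]

def count_if(data, criterion):
    """Count how many entries equal the criterion, like an exact-match COUNTIF."""
    return sum(1 for value in data if value == criterion)

# One count per category code 0..6 gives the frequency table.
frequency_table = {code: count_if(status_codes, code) for code in range(7)}

# Counter tallies the same column in a single pass; codes that never
# occur simply report a count of 0 when looked up.
tally = Counter(status_codes)
```

Locking the data range with dollar signs in Excel plays the same role as passing the same `status_codes` list to every call.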

Here we will select all of these before we make the bar graph.0573

The reason for selecting these category labels is that when I select them, Excel will just fill it in for me.0581

It will fill in that X axis for me.0588

I’m going to click on charts and go ahead and click on column.0592

And here we go.0598

I’m going to delete that because it is redundant. 0603

Here we have a nice bar graph.0607

Notice that it looks almost exactly like a histogram but one of the differences is that in bar graphs 0612

there are spaces in between to indicate that these are separate bins that cannot be continuously looked at.0618

Because of that we see that here we see the same information as we saw in the frequency table that being single is the most frequent category.0627

Being in a relationship is the second most frequent.0640

These are all much less frequent and it is a little common to leave it blank but not too common.0643

That is one example of a bar graph.0653
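A bar graph can also be sketched as plain text, with one separate row per category. This is only an illustration using the Python standard library; the `text_bar_graph` helper and the counts are hypothetical and are not the lecture's frequency table.

```python
def text_bar_graph(frequencies):
    """Return one line per category: a label followed by a bar of '#' characters."""
    width = max(len(label) for label in frequencies)
    lines = []
    for label, count in frequencies.items():
        lines.append(f"{label.ljust(width)} | {'#' * count} {count}")
    return "\n".join(lines)

# Hypothetical counts, not the lecture's actual frequency table.
example = {"single": 5, "in a relationship": 3, "married": 1}
chart = text_bar_graph(example)
print(chart)
```

Each category gets its own row, which mirrors the gaps between bars: the categories are separate bins, not points along a number line.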

It looks just like a histogram but the difference was we used a categorical variable.0660

Let us take our example that I have just copied and pasted on here and let us look at whether shape, center, 0670

or spread of distributions that we have looked at before apply to bar graphs.0678

Now let us think about this, can we really say that this is a skewed right shape? That it has a tail?0683

Let us think about this. 0695

Well, in order to answer that question we might want to think about this idea, will it matter if the order of the bars were reversed?0697

Let us say I decided to put the people who left it blank over here and people who said it is complicated.0705

I have decided to dummy code that as number 1 right.0712

And let us say I decided to switch single, married, and in a relationship with engaged.0717

I will switch all of those.0724

It would look like a left skewed distribution.0726

The ordering down here is largely arbitrary.0734

There is no rule that says that we have to put these values in these order because being blank, single, in a relationship, 0742

engaged, married, complicated, and other, it does not have a set order that corresponds to the numbers.0751

We might have some order, such as being married is the most committed.0758

Being engaged is second most committed or something.0764

We might have some order that we want to put on it, but largely this is an arbitrary ordering.0767

We could switch up these bars and that will be okay.0772

However, in a histogram we cannot arbitrarily rearrange the bars for values 1 and 2, because in that case 1 and 2 actually mean something.0778

It means something numerical.0792

Here 1 and 2 do not actually mean anything numerical.0794

It means something nominal, it is just a stand in for a name.0798

Because of that, shape does not quite apply, and neither does center, except for mode.0804

Mode is the one that we would use for categorical variables.0813

How would you have a mean here?0819

Spread does not quite make sense here either.0823

In a bar graph, we cannot quite use the same concepts that we have been using for the rest of the frequency visualizations.0832
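To make the argument concrete, here is a small Python sketch with made-up responses. It assumes two hypothetical dummy codings of the same categories: the mode is the same under both, while the "mean" of the codes changes with the arbitrary numbering, which is why it is not meaningful.

```python
from collections import Counter
from statistics import mean

# Hypothetical responses, stored as category labels.
responses = ["single", "single", "married", "in a relationship", "single", "married"]

def modal_category(values):
    """Return the most frequent category."""
    return Counter(values).most_common(1)[0][0]

# Two equally valid dummy codings of the same categories.
coding_a = {"single": 1, "in a relationship": 2, "married": 3}
coding_b = {"married": 1, "in a relationship": 2, "single": 3}

# The "mean" depends entirely on which arbitrary coding was chosen.
mean_a = mean(coding_a[r] for r in responses)
mean_b = mean(coding_b[r] for r in responses)

# The mode does not depend on the coding at all.
mode = modal_category(responses)
```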

Let us move on to example 2.0845

Let us create a frequency visualization of gender and what would that be called or let us answer this question first.0847

What would that be called?0853

What kind of variable is gender?0855

For gender you could have values such as male, female, and blank, and we could consider that a categorical variable.0859

When you have a categorical variable, we would be making a bar graph.0869

Let us go back to our examples.0877

If you move on to the gender sheet, here we have gender values 0, 1, 2.0881

1 means they are male, 2 means they are female, and 0 means they did not put their gender down.0889

Let us put in our formula, COUNTIF.0896

I’m going to go to my data and let us find the gender column.0901

Here is the beginning, gender is right here.0908

I’m going to put in a comma because I know I will need that.0923

Let us put in this gender.0929

Count if it is 0 and let us lock this data in place so that we could easily copy and paste later.0932

It turns out that 0 people left it blank.0949

Then male and female should add up to 100.0954

52 + 48 adds up to 100.0957

We can skip that blank one because no one left it blank; let us just select male and female.0960

Go to charts and hit column if it is not already selected.0968

Go and create a bar graph.0975

Here on the x axis we have a categorical variable gender and here we have frequency just like we did before.0979

Let us move on.0993

Example number 3: suppose we collect the following information from each student 0995

in a class, age, hair color, number of siblings, miles away from school.1001

What are the cases in this data set and what kind of variables are here?1007

Are they categorical or continuous?1014

What kind of frequency visualization would we create for each variable?1017

Let us start with the first question.1023

What are the cases?1025

What is the thing that unites all of these 4 variables together?1028

That is going to be each student.1036

Each student is a case.1038

What kind of variables do we have here?1045

I will put this in blue.1047

What kind of variable is age, continuous or categorical?1048

Age is a number that actually means something, right?1053

Being 10, 11, 12, and there are always values in between.1056

Age is continuous.1061

What about something like hair color?1067

Hair color we usually code as something categorical.1069

What about number of siblings?1080

Number of siblings is tricky because the number means something, definitely.1082

There is 1, 2, 3 and 3 is definitely more than 2 but there is no such thing as having like 2.5 siblings.1089

This is kind of discrete, but I’m going to list it as continuous for now.1100

One reason is that later, when we create an average, 1106

the number of children in a family might be something like 1.75.1113

We know that average means something and because of that I’m going to list that as continuous.1120

Miles from school that is also going to be continuous.1127

Because of that, what kind of frequency visualization will we use for each variable?1134

For age, for every continuous variable we will use a histogram.1141

Why don't we just fill that in for all of these?1152

For our 1 categorical variable we would use bar graphs.1160
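The decision rule used in this example can be written out as a short Python sketch. The variable classifications follow the lecture, including its choice to treat number of siblings as continuous; the function name `visualization_for` is hypothetical.

```python
# Variable types as classified in the lecture's Example 3.
VARIABLE_TYPES = {
    "age": "continuous",
    "hair color": "categorical",
    "number of siblings": "continuous",  # the lecture's choice; arguably discrete
    "miles from school": "continuous",
}

def visualization_for(variable_type):
    """Pick the frequency visualization the lecture recommends for a variable type."""
    if variable_type == "categorical":
        return "bar graph"
    if variable_type == "continuous":
        return "histogram"
    raise ValueError(f"unknown variable type: {variable_type!r}")

choices = {name: visualization_for(kind) for name, kind in VARIABLE_TYPES.items()}
```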

Here is example 4, what kinds of graphs are shown below?1172

Let us see.1175

Let us look at the first one on top.1178

It says this is the graph of a fast food industry.1182

It seems to have number of stores plotted on the Y axis, frequency number of stores.1185

They are McDonald’s, Burger King, and Taco Bell, 1992 and 1996.1193

It seems like each of these have more stores in 1996 than in 1992, something like that.1199

Here what kind of graph is this?1209

Is this a dot plot? No.1219

A stem plot? No, we could rule this out.1223

It basically comes down to whether it is a bar graph or a histogram.1226

That is going to depend on whether the X axis has categorical or continuous variables.1230

Here you could see that these are grouped into McDonald’s, Burger King, and Taco Bell.1238

Those are names of fast food restaurants.1246

Fast food restaurant is on our X axis.1250

Fast food restaurant is a categorical variable, so we know that this one is, in red, a bar graph.1254

Even though year is continuous, it is not listed in a continuous way.1267

They only picked 2 random years, right?1274

Fast food restaurants this is categorical.1288

Even year, they are treating it as if it is categorical here.1294

Let us look at this one, here we have McDonald’s and this must be years that McDonald’s 1298

has been offering and here is the net income in billions of dollars.1306

Even though it says like something like 1.5, this is $1.5 billion, it is the net income.1312

Look at this in 2008, they are making $4.8 billion.1321

By the way, I took the data from the Wall Street Journal.1327

Here year is on the X axis.1333

Year is continuous, or they are treating it as continuous here, because they do not skip any number.1339

It is 3, 4, 5, 6, 7, 8, right?1347

This is the number of dollars made.1349

Here we consider this a histogram.1356

That is it for bar graphs.1366

Thanks for using