Lecture Comments (6)

0 answers

Post by Srikanth C on April 22, 2015

I couldn't get the point where we can use any number to divide the heart of skewness. If (n-1) is 2, then we are literally dividing the skewness value from the formula by half and if (n-1) is 3 we are making it one-third, so are we not reducing the overall value of skewness by increasing n? So, will it be correct to say we can choose any number for n-1 to divide while calculating skewness?

1 answer

Last reply by: Manoj Joseph
Wed May 1, 2013 11:23 PM

Post by Manoj Joseph on May 1, 2013

I think you are making a wrong mistake to SPSS file instead of Excel shell

0 answers

Post by Kambiz Khosrowshahi on March 28, 2013

In the equation for skewness, I dont understand why it's ok to multiply it by 1/n-1. Isnt the correct equation that "heart"? If so, then by multiplying it by 1/n-1, wont it produce the wrong skewness?

1 answer

Last reply by: Professor Son
Wed Aug 15, 2012 2:15 PM

Post by KIM CARTER on April 27, 2012

Love all your work!!!!!!!!The correct spelling of (you wrote)asymtotic it is actually asymptotic. (Smile)

Shape: Calculating Skewness & Kurtosis


  • Intro 0:00
  • Roadmap 0:16
    • Roadmap
  • Skewness Concept 1:09
    • Skewness Concept
  • Calculating Skewness 3:26
    • Calculating Skewness
  • Interpreting Skewness 7:36
    • Interpreting Skewness
    • Excel Example
  • Kurtosis Concept 20:29
    • Kurtosis Concept
  • Calculating Kurtosis 24:17
    • Calculating Kurtosis
  • Interpreting Kurtosis 29:01
    • Leptokurtic
    • Mesokurtic
    • Platykurtic
    • Excel Example
  • Example 1: Shape of Distribution 38:28
  • Example 2: Shape of Distribution 39:29
  • Example 3: Shape of Distribution 40:14
  • Example 4: Kurtosis 41:10

Transcription: Shape: Calculating Skewness & Kurtosis

Hi and welcome to

We are going to be looking at shapes again, but now we are going to be calculating skewness and kurtosis.0002

We could not do this before when we covered shapes because we needed to know something about variability first.0008

This is a road map, and basically we are going to be covering the concept of skewness.0019

I’m going to try and connect that to calculating skew.0025

Then we are going to be talking about how to interpret the number that you have calculated 0030

and also what that number will tell you about the relationship of central tendency.0034

Obviously, because the number directly relates to the actual concept of skewness, 0039

these relationships among the measures of central tendency will also hold for the concept of skewness.0044

Then we are going to be talking about kurtosis.0054

Kurtosis is something that people do not focus a lot on because it is a difficult shape to understand.0055

We are going to be talking about calculating kurtosis and how to interpret it.0063

First let us start with the concept of skewness.0071

We have been over this again and again.0075

We know that there are skewed right distributions, skewed left distributions, and ones that are not skewed at all.0077

There are lots of these distributions.0087

This one is not skewed.0091

Anything that has a tail is skewed.0093

These are also called J shapes, or formally they are called asymptotic shapes, 0101

because there is this asymptote right here; the curve approaches the x-axis asymptotically.0115

Skewness means there is a tail somewhere.0125

One side is longer than the other.0131

One side has values that go on and on, and those values are not clustered around the other values. 0135

That is basically the concept of skewness.0143

It would be nice if we could get a number that would tell us this is exactly how skewed your distribution is.0147

Distribution like this might be somewhat skewed but a distribution like this might be really skewed.0155

It would be nice if we could say how much more skewed one is than the other.0166

The skewness that we are going to calculate is going to tell us exactly how skewed something is.0171

On the positive end it would mean that the tail is on the right, if the number is positive.0177

If the number is negative it will tell us that the tail is on the left.0187

The greater the positive number it means it is more skewed on the right.0192

Normal distributions often have 0 skew.0198

Things that are symmetrical often have 0 skew.0201

Let us talk about calculating skewness.0209

Some people get a little bit freaked out at this, but I want to boil everything in here down to what is important.0210

Here is the main heart of the skewness function.0218

Basically it is going to be the sum of cubed distances from the mean.0227

We will talk in a little bit about why it is cubed.0239

The sum of all the cubed distances of each point from the mean of the sample.0243

Instead of sum of squares, it will be the sum of cubes divided by the standard deviation (s) cubed, 0252

where we use the sample to estimate the standard deviation of the population.0264

That is basically the crux of the idea.0272

There is going to be some more frills on this but this is basically the heart of the idea.0275

Let us think about why it is cubed.0280

It is cubed because it is going to matter whether it is going to be positive or negative.0282

Before, when we squared things, we did not care which direction away from the mean it was.0287

But now we care.0293

How far are you away from the mean, but also what direction are you away from the mean?0295

We are having things that are stacking up.0304

It is like each point is having a vote.0306

They are saying I’m on the left side, I’m on the right side.0310

By adding them up we see who wins, the left side guys or the right side guys.0314

When there are more guys on the negative end we get a smaller number.0319

When there are more guys on the positive end we get a large number.0325

By cubing it, what we do is we make those that are on the ends matter more than those who are close to the center.0332

Cubing is going to make it bigger than squaring it.0340

That is the main heart of the idea.0346
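The voting idea can be sketched in a few lines of Python. The sample below is made up for illustration (it is not from the lecture's Excel file), and the variable names are assumptions.

```python
# Hypothetical right-skewed sample: most values are small, one sits far out on the right.
data = [1, 2, 2, 3, 12]

mean = sum(data) / len(data)

deviations = [x - mean for x in data]
squared = [d ** 2 for d in deviations]  # squaring throws the sign away
cubed = [d ** 3 for d in deviations]    # cubing keeps the sign, so each point "votes" left or right

print("deviations:  ", deviations)
print("squared:     ", squared)
print("cubed:       ", cubed)
print("sum of cubes:", sum(cubed))
```

Four of the five points sit to the left of the mean, but the one point far out on the right has the biggest cube, so the sum of cubes comes out positive.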

Just to review, if you double click on this s, what would be inside?0349

Think about clicking on that s, what would be s equal to?0356

s is the sum of squares ÷ (n – 1), and because it is the standard deviation, we would take the square root of that.0361

Standard deviation is always going to be positive.0372

This is not going to change its sign.0378

Then there is what goes on top of the heart of this function.0384

Sometimes people divide by n as well, or by n – 1.0388

In Excel it is going to multiply by n ÷ ((n – 1) × (n – 2)).0395

But that junk does not matter because it will not have as great an impact on skewness as the heart of this function.0403

That is what I want you to know, this is the heart.0413

That is the heart of the function and the other stuff are just frills.0418

This is one type of skewness but actually there are tons of different types of skewness.0425

There is Pearson skewness, and there are lots of skewness measures whose names I cannot remember.0430

There is minimal or distance skewness.0437

There is momentum skewness, there are a ton of them.0442

If you want to know more about skewness, you can check out this link

I frequently use it and refer my students to it.0452

Let us talk about interpreting skewness.0459

Imagine this is all of the skewness values that are possible, and we find something that has negative skewness, like -1 skewness.0464

We have found out that our distribution has it, but I have not shown you what our distribution is.0478

Can you guess what kind of distribution it is?0482

Yes you can.0484

You know that it has a tail that goes to the left.0485

On the other hand if it is a positive one, you know that it has a tail that goes to the right.0491

If the skewness is 0, we are not sure exactly what it is but we basically know that it looks symmetrical.0497

It could look uniform, or it could look like approximately a normal distribution.0508

It could be some crazy thing that is symmetrical.0515

It could be any number of distributions as long as they are approximately symmetrical.0524

Let us talk a little bit about how to look at skewness in Excel.0531

Go ahead and download the example Excel file and if you look on the first sheet, you should see skew data.0540

I have shown you 3 different little data sets.0550

Data set A is skewed to the right, notice that it has a tail to the right.0556

Data set B is skewed to the left, notice that it has a tail to the left.0561

Data set c is more normal, it is symmetrical.0566

It has a little peak here but it is roughly symmetrical on both sides.0572

The nice thing about Excel is that it does have a skew function so you could automatically generate skewness.0579

The skew function is just skew and you put in the data that you like it to calculate.0587

This is showing you the skew of 2.08, 2.09, it is saying that it is very positively skewed.0596

I could just drag this across and it will calculate the skews for the data sets above.0609

This one is for this data set and we find that it is a negative skew.0616

It is almost as negative as this one is positive, right?0622

This is about 2, and this is about -2, and it is because these two are largely mirror opposites of each other, so they have skews of similar size.0627

Here we see that the skew is closer to 0.0637

It is still on the positive side but it is closer to 0 than it is to 2 or 3.0641

In Excel, if you are interested in the particular formula that they use to calculate skew,0650

you could double click, and when this little help window comes up you will notice that skew is a hyperlink.0657

If you click on that then you should get this little window that comes up, that is an Excel help window.0667

Let me make this bigger.0677

Here they show you the exact equation that they use for skew.0681

Notice that this part is exactly what we talk about.0685

It is that distance away from the mean, cubed, divided by stdev^3.0689

That is the whole heart of the function, but here we have this extra stuff, n / ((n – 1) × (n – 2)).0699

You could use that or you could use something else.0711

You could use 1 / (n – 1), that is also very common.0714

I will be using 1 / (n – 1).0718

If you hit enter, let us look a little bit first before we calculate skew for our cells, 0723

let us look a little bit on what it would mean to our measures of central tendency to be skewed right or left.0730

Let us calculate the mean.0739

Mean is AVERAGE; sometimes I type in mean in Excel.0741

I’m going to close my parentheses.0750

Let us also get the median while we are at it and let us get the mode.0752

Once we have those 3 values, we could just copy and paste straight across and 0767

it will find the respective mean, median and mode for each of these other distributions.0777

When it is skewed to the right and there is a very positive skew, 0784

what we find is that the mean is greater than the median which is greater than the mode.0789

That makes sense.0796

The mode is probably going to be on the very left side of the distribution.0798

The mean is highly affected by the outliers.0802

The outliers are on the right so the mean is going to be greater.0806

For negative skewed distributions, we have the exact opposite pattern.0810

This time the mode is the greatest, followed by the median, then the mean.0814

Remember the mean is highly affected by outliers, and this time the outliers are small.0819

The mean is being pulled to the smaller side.0825

The most frequent numbers are up on the higher end and that is why median and mode tend to be bigger.0829

When you know the skewness, you will automatically know the relationship between mean, median, and mode.0835

On the other hand, when distributions are largely symmetrical, when there is very little skew on either side 0844

then what we see is that the mean, median, and mode are often very close to each other.0851

Sometimes they will be exactly the same, this time they are just very close to each other.0858
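Here is a small Python sketch of that ordering, using the standard `statistics` module. The two samples are invented for illustration and are not the data sets from the Excel file.

```python
import statistics

# Hypothetical samples: one with a tail to the right, one with a tail to the left.
samples = {
    "right-skewed": [1, 1, 1, 2, 2, 3, 4, 10],
    "left-skewed": [1, 7, 8, 9, 9, 10, 10, 10],
}

summary = {}
for name, data in samples.items():
    summary[name] = (
        statistics.mean(data),
        statistics.median(data),
        statistics.mode(data),
    )
    mean, median, mode = summary[name]
    print(f"{name}: mean={mean}, median={median}, mode={mode}")
```

In the right-skewed sample the large value pulls the mean above the median and the mode; in the left-skewed sample the small value pulls the mean below them.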

That makes sense: because there is very little skew, they sit almost on top of each other.0863

Let us talk about how to calculate skewness.0874

Here I have put the formula for skewness.0879

Sometimes people are not comfortable with moving around that sigma sign, it is like where does it go?0883

That sigma will affect some things and not other things.0899

That sigma will affect anything that has an i next to it.0902

Because of that you could put the divided by stdev^3 inside the ∑ and it would not change the result.0906

You could either do this part after you add them all up or you could do it before you add them all up.0917

It actually does not matter because of the distributive property.0919

You could either put this inside the sum or you could take it out.0930

It does not matter.0935

Let me show you a way that you could write this.0936

Actually this might not be helpful to write it on Excel, I will write it when we go back to our power point slides.0941

Let us start with this.0950

We know that we need to get all of these distances away from the mean.0952

Let us start with that and then cube it.0958

I need to start with the equal sign (=) and I’m going to put in a parenthesis to say take my value, subtract the mean, which would be the average.0962

I’m going to put in all of these values.0975

I do not want that average to jiggle around so I’m going to lock that data in place.0979

Because I want to copy and paste it later on, I’m going to leave the A to vary but I’m going to say lock the 4 down.0985

When I copy down this column it would not change the A column, but when I copy across it will change it.0997

I only want to lock down the number row.1004

Then I’m going to close that parentheses and cube that value.1009

Here I could also divide by this as well but I’m just going to do this for now.1016

This is just the part where I’m doing (x – x bar)^3, that is all I’m doing in this column.1025

I’m going to go ahead and copy and paste that all the way down.1037

Here what we have is just this part, but now I need to sum them all up and divide by s^3 as well as n – 1.1045

I’m going to modify this to make my formula a little bit simpler.1061

I will have n – 1 down here.1070

Now I need to add these up and divide by (n – 1) × stdev^3.1078

Here let us do that.1091

Let us sum these guys all up and divide by the count of my data – 1, multiplied by the stdev of my data cubed.1093

It is a lot of parentheses.1127

Because it is color coded I know that I need another blue one to signal my denominator and hit enter.1131

There we get a skewness of 1.84.1139

That is fairly close to the 2 or so 1148

that we got using the Excel function.1151

The Excel function will just multiply by a slightly different constant so that is the only difference.1153
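The spreadsheet steps above can be sketched in Python as follows. This is a minimal sketch, not the lecture's workbook: the sample data and the function names are assumptions, and the "Excel-style" version follows the constant n / ((n – 1) × (n – 2)) shown in Excel's help for SKEW.

```python
import statistics


def skew_heart(data):
    """Sum of cubed deviations from the mean, divided by s cubed."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    return sum((x - mean) ** 3 for x in data) / s ** 3


def skew_lecture(data):
    """Heart of the function times the 1 / (n - 1) constant used in the lecture."""
    n = len(data)
    return skew_heart(data) / (n - 1)


def skew_excel_style(data):
    """Heart of the function times n / ((n - 1)(n - 2)), as in Excel's SKEW help."""
    n = len(data)
    return skew_heart(data) * n / ((n - 1) * (n - 2))


data = [1, 2, 2, 3, 12]  # hypothetical right-skewed sample
print("lecture constant:", skew_lecture(data))
print("Excel-style:     ", skew_excel_style(data))
```

The two results always differ by the factor n / (n – 2), so they agree on the sign and get closer to each other as the sample gets larger.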

But once we have this, we can actually copy this over here.1158

Here if I double click on this it shows that it is using the C column because I let C vary 1168

and it is being relative about the columns but it is locking the rows in place.1175

I could also check down here, it is using the C column to count those data and get the standard deviation.1182

What we find is -1.7, which is pretty close to the -1.9 that we found with Excel’s function.1192

I’m just going to copy and paste all of that again to our approximately symmetrical one.1201

We got something close to the .44 that Excel gave us, .39.1209

That is how we are going to calculate shape here.1216

One of the features is skewness.1222

That is skewness.1226

Now let us talk about interpreting kurtosis.1233

I actually said that I will talk a little bit about where the skewness stuff goes.1236

Let me just draw it over this corner.1242

I’m not going to need all this room for kurtosis.1244

Let me just talk a little bit more about the skewness function here.1247

The formula for skewness, as I said before, is going to be the sum of all the cubed distances over some constant and s^3.1254

I have said that is the idea.1275

Sometimes you might see this formula written more like this: ∑ [(x sub i – mean)^3 / s^3].1277

This and this they mean the same thing.1294

It is just different ways of writing it.1301

The ∑ affects things that are affected by i, and because s^3 is not affected by the i at all, 1303

s^3 will be the same whether it is the first value or the last value.1312

It does not matter whether you write it inside that ∑ or not.1318

Another way you could write it is also 1/s^3 × ∑ (x sub i – x bar)^3.1323

This is another way you could write it and so all of these 3 ways of writing it are equivalent.1338

It does not change anything.1344
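A quick Python check of that equivalence, with a made-up sample (the data and variable names are assumptions for illustration):

```python
import math
import statistics

data = [2, 4, 4, 5, 9]  # hypothetical sample
mean = statistics.mean(data)
s = statistics.stdev(data)

# 1) Sum the cubed deviations first, then divide by s cubed.
outside = sum((x - mean) ** 3 for x in data) / s ** 3

# 2) Divide each cubed deviation by s cubed inside the sum.
inside = sum((x - mean) ** 3 / s ** 3 for x in data)

# 3) Pull 1 / s cubed out in front of the sum.
factored = (1 / s ** 3) * sum((x - mean) ** 3 for x in data)

print(outside, inside, factored)
print(math.isclose(outside, inside), math.isclose(outside, factored))
```

The three numbers agree up to floating-point rounding, because s does not depend on i.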

Do not get tripped up by one of these other options.1348

They are not trying to be tricky.1352

Let us talk about kurtosis.1356

Kurtosis is a concept that is weird for people because this is not something that we are used to dealing with in regular shapes that we know.1358

Roundness and squareness.1368

Kurtosis is something a little bit different.1371

Kurtosis is about two things that are bundled into one.1373

One is pointiness or peakness.1378

Very kurtotic shapes might look something like that.1382

Super non kurtotic shapes looks something like that.1388

One important aspect of this is the pointiness.1392

At the same time that peakness is going up, what you are seeing is the tails becoming thinner.1403

Peakness and thin tails usually go together and kurtosis is about both of these things 1413

because here we not only have no peakness but we also have fat tails.1421

The tails are just as fat as that lack of a peak in the middle.1427

Something in the middle might look something more like that.1431

This is a kurtotic dimension where you are getting increasing kurtosis.1436

When you have increasing kurtosis, it means two things simultaneously.1447

Having more of a peak but also having thinner tails.1451

Let us talk about calculating kurtosis.1458

Kurtosis let us talk about the main idea before we get into things.1464

Like skew, you can multiply constants to it.1472

It does not matter.1475

This is the heart of kurtosis.1477

Here is my sigma, it is going to be that same distance for each point away from my mean.1480

Instead of cubing it, we are going to raise it to the 4th power.1488

When we raise it to the 4th power, we know that we do not care about whether it is on the left side or the right side.1498

That is one thing to know already about kurtosis.1507

It is not counting how many are on one side versus the other side of the mean.1510

Kurtosis already we know it will probably be positive because it is going to raise everything to the 4th power 1516

and when you raise something to an even number power it is going to be positive.1525

We are going to divide that by stdev^4.1530

Once again, this will always be positive, stdev is already positive.1536

We are going to raise it to the 4th power.1543

Kurtosis is largely going to be a positive number.1545

The only difference between different values of kurtosis might be whatever constant they decide to multiply by.1553

Frequently, 1/(n – 1) is one of the constants.1564

I forget what Excel does, Excel does something crazy.1572

We will figure it out when we get there.1575

Once again, even with kurtosis you could write it in very different ways.1577

You could write it as 1/((n – 1) × s^4) and then put your sum of 4th powers here.1582

Instead of a sum of squares, we are raising each distance to the 4th power.1600

That is one way of writing it.1606

Another way of writing it is ∑ (x sub i – mean)^4 / ((n – 1) × s^4).1608

All of these things are the same thing.1624

Once again, this is the heart of idea of kurtosis.1627

As for one of the reasons that it is raised to the 4th power, let us think about it.1634

Remember, kurtosis is very concerned about whether you are on the outside or the inside of the distribution.1638

Are you on the tails or at that peak? 1643

By raising it to the 4th power it makes everybody matter a lot, especially if you are on the outside.1645

You matter way more than if you are on the inside.1652

One more thing to the idea of kurtosis.1659

Typically kurtosis is going to be for normally distributed function, let me try here.1663

For approximately normal looking distribution the kurtosis if you calculate it with some function like this, it is going to be 3.1673

That is so arbitrary.1686

What they have done is they made the kurtosis function so that you subtract 3 from it so that the normal distribution has a kurtosis of 0.1691

Like 3 – 3 =0.1705

That is actually how you will get negative kurtosis.1706

It is not because of this function, but it is because you subtract by 3.1710

Because the first part is never negative, the kurtosis can never go below -3.1714

It is an odd, bizarre thing.1720

I’m not sure why they decided to normalize it to 0, but my theory is that it would be hard for people to remember that 3 is the value for normal.1722

They just make you do it in the formula.1738
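Here is a minimal Python sketch of that calculation, using the 1/(n – 1) constant from this lecture and then subtracting 3. The two samples and the function name are made up for illustration.

```python
import statistics


def kurtosis_lecture(data):
    """Sum of 4th-power deviations / ((n - 1) * s^4), minus 3."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation
    heart = sum((x - mean) ** 4 for x in data) / s ** 4  # never negative
    return heart / (n - 1) - 3


# Hypothetical flat sample: every value from 1 to 20 appears exactly once.
flat = list(range(1, 21))

# Hypothetical sharply peaked sample: most values at the center, a few far away.
peaked = [10] * 14 + [9, 11, 8, 12, 0, 20]

print("flat:  ", kurtosis_lecture(flat))
print("peaked:", kurtosis_lecture(peaked))
```

The flat sample comes out negative and the peaked sample comes out positive, and neither can fall below -3 because the part before the subtraction is a ratio of non-negative numbers.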

Now that we have this weird correction of subtracting 3, we could talk about interpreting kurtosis.1744

You already know that a kurtosis of about 0 would mean that you have an approximately normal distribution.1752

That is what a kurtosis of 0 would be.1766

A kurtosis greater than 0 is going to be more peaked, more pointy than normal.1769

Something like this.1784

That is a kurtosis that is greater than 0 and we call that leptokurtic.1788

It means more peaked than normal.1799

We could call other things that have a kurtosis that is similar to the normal distribution.1812

We could just say similarly kurtotic to the normal distribution but that would be long.1819

We say they are mesokurtic because you do not have to be a normal distribution to have a kurtosis of 0.1825

Mesokurtic just means it is about the same peakness as normal.1837

That is mesokurtic.1853

We need to have another name for something that is less peaked or flatter than normal.1858

I remember this because mesokurtic, leptokurtic, that sounds crazy but meso I just remember it is like in the middle kurtosis.1868

Lepto is hard for me to remember which it is.1877

That is why this last one helps me because this one I could always remember, it is called platykurtic.1881

I think of a – and how it has a flat peak.1887

Platykurtic means that this is flatter than normal, smaller peakness than normal.1892

That would be something that looks more like this.1913

That is platykurtic there.1918

Those are our 3 interpretations of kurtosis.1922

Let us go to our Excel examples and look at kurtosis there.1925

Here let us click on a kurtosis data, that is the 3rd sheet and we could look at 3 distributions 1931

that are already put in there that might be good for us to look at regarding this idea of kurtosis.1940

In the uniform distribution, obviously, the tails are just as fat as the peak; it is simply not peaked.1946

The tails are super fat.1952

Here we have normal peak but the tails do not look pretty fat.1954

Here we have the thinnest tails and the peaks are higher than the tails are.1960

There is a bigger difference between the peak and tails.1967

Handily Excel has a kurtosis function so we could put in kurt and then put in our data.1972

Hit enter.1985

What we have here is negative kurtosis where it is flatter than the normal distribution.1989

I’m just going to drag all of this over here.1997

Here we still have a negative kurtosis because it is not as peaked or pointy as the normal distribution.2001

Here we have it is not normal but is more pointy, more peaked than the normal distribution would be.2012

If you want to know the precise formula that Excel uses in order to calculate kurtosis, go ahead and click on kurt.2022

Here it shows you that this is the formula for kurtosis.2032

What they do is they multiply by this crazy looking stuff, but the crux of the formula is still there.2037

It is the distances, the deviations, to the 4th power, over stdev^4, and then – 3 × crazy stuff.2046

That is the heart of that function.2061

You can see that we use that.2065

If we click on the kurtosis, we will calculate it on our own.2069

I use this n – 1 constant, other people use other things.2074

What I’m going to do is I’m going to put in n – 1 down here instead.2083

All of this, this whole thing, and you subtract 3.2097

Bizarre but true.2107

Let us start with this part right here.2110

The deviation to the 4th power.2112

I just put in this value – average of my data and all of that raised to the 4th.2117

I do not want my mean to jiggle around so I’m going to lock my rows.2132

We are not going to lock the columns.2138

As long as I copy and paste it down here, we can just use column A.2141

I’m just going to copy and paste all of that down here.2147

Notice that all of these values are positive.2152

Down here, what I’m going to do is sum them all up.2157

That is one thing I know I need to do.2164

I know I need to divide by n – 1, count all of these guys and subtract 1.2168

That is within my green parentheses there, multiply by stdev^4.2184

That is stdev of my data raised to the 4th power.2193

Because Excel knows order of operations it is going to do that power before it does the multiplying.2202

Then I’m going to close that and that is my blue parentheses closing there.2212

Here we have the sum ÷ ((n – 1) × stdev^4).2220

I need to take all of that and subtract 3.2235

I’m going to put on another set of parentheses around this whole thing and subtract 3.2239

Hit enter.2248

Here I get negative kurtosis.2249

That means it is flatter than normal.2258

Notice that it is not more negative than -3, that is the lowest it can go.2260

I’m going to take this whole thing right here and paste it right here.2267

What we see is that this one is less flat than the one we just saw; it is not quite normal, but it is closer to normal.2274

If we copy and paste all of that over here, we find here this is more sharply peaked than normal.2293

That is our kurtosis on Excel.2302
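For comparison, here is a Python sketch of the hand calculation next to an Excel-style one. The Excel-style function follows the formula shown in Excel's help for KURT, which multiplies the sum by n(n + 1) / ((n – 1)(n – 2)(n – 3)) and subtracts 3(n – 1)^2 / ((n – 2)(n – 3)). The function names and the sample are assumptions for illustration.

```python
import statistics


def fourth_power_heart(data):
    """Sum of 4th-power deviations from the mean, divided by s^4."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    return sum((x - mean) ** 4 for x in data) / s ** 4


def kurt_lecture(data):
    """The version built by hand in the lecture: heart / (n - 1) - 3."""
    n = len(data)
    return fourth_power_heart(data) / (n - 1) - 3


def kurt_excel_style(data):
    """The version from Excel's KURT help, with its own constants."""
    n = len(data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth_power_heart(data) - correction


flat = list(range(1, 21))  # hypothetical uniform-looking sample
print("lecture constant:", kurt_lecture(flat))
print("Excel-style:     ", kurt_excel_style(flat))
```

Both versions are built around the same sum of 4th powers over stdev^4, so they agree on the sign here; only the constants around the heart differ.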

Let us move on to some examples.2310

Here is example 1.2311

Given that in a particular sample the mean is less than the median, which is less than the mode.2314

What is the likely shape of this distribution?2320

This means that the mode, the peak, is here, and the mean is somewhere on the lower side of it.2324

Here is the mode and here is the mean; the median is somewhere in between.2335

That would mean that since this guy, the mean is highly affected by outliers there must be outliers on this side.2341

I’m going to guess that this is a negative skew, left skewed.2352

That means the skewness number should be negative.2366

What about in a sample where the mean is greater than the median which is greater than the mode.2373

What is the likely shape?2379

We just have to reason backwards.2380

The mode is the smallest, the mean is the largest, and the median is somewhere in between.2383

The mean is pulled by the outliers, my outliers must be here.2395

I’m going to say this is a right skewed distribution.2400

That would mean that the skewness is greater than 0.2406

It is a positive number.2413

Example 3.2417

If a distribution has a kurtosis close to 0 and skewness close to 0, what is the likely shape of the distribution?2418

We know that skewness close to 0 means that it is basically symmetric, but it could be symmetric in lots of ways.2428

It does not have to be normally distributed.2438

With a kurtosis that is also 0, we know that means that the tails are not too fat, not too skinny.2440

The peak is not too pointy and not too dull either.2451

If both skewness and kurtosis are 0, we could very likely think of this as approximately normal.2455

That is probably a good way to guess.2468

Finally example 4.2473

Sketch a potential distribution that can have a kurtosis of 1, then sketch another distribution that can have a kurtosis of -1.2474

I find the positive 1 easier, because I always think that when it is positive it means it is pointy.2485

The other sketch is of a distribution that is less pointy.2496

It is always something like that.2504

That is it for skewness and kurtosis.2507

Thanks for using