For more information, please see full course syllabus of Statistics

### Shape: Calculating Skewness & Kurtosis

Lecture Slides are screen-captured images of important points in the lecture. Students can download and print out these lecture slide images to do practice problems as well as take notes while watching the lecture.

- Intro 0:00
- Roadmap 0:16
  - Roadmap
- Skewness Concept 1:09
  - Skewness Concept
- Calculating Skewness 3:26
  - Calculating Skewness
- Interpreting Skewness 7:36
  - Interpreting Skewness
  - Excel Example
- Kurtosis Concept 20:29
  - Kurtosis Concept
- Calculating Kurtosis 24:17
  - Calculating Kurtosis
- Interpreting Kurtosis 29:01
  - Leptokurtic
  - Mesokurtic
  - Platykurtic
  - Excel Example
- Example 1: Shape of Distribution 38:28
- Example 2: Shape of Distribution 39:29
- Example 3: Shape of Distribution 40:14
- Example 4: Kurtosis 41:10

### General Statistics Online Course

### Transcription: Shape: Calculating Skewness & Kurtosis

*Hi and welcome to www.educator.com.*0000

*We are going to be looking at shapes again, but now we are going to be calculating skewness and kurtosis.*0002

*We could not do this before when we covered shapes because we needed to know something about variability first.*0008

*This is a road map and basically we are going to be covering the concept of skewness.*0019

*I’m going to try and connect that to calculating skew.*0025

*Then we are going to be talking about how to interpret the number that you have calculated*0030

*and also what that number will tell you about the relationship of central tendency.*0034

*Obviously because the number directly relates to the actual concept of skewness,*0039

*these relationships among the measures of central tendency will also hold for the concept of skewness.*0044

*Then we are going to be talking about kurtosis.*0054

*Kurtosis is something that people do not focus a lot on because it is a difficult shape to understand.*0055

*We are going to be talking about calculating and how to interpret kurtosis.*0063

*First let us start with the concept of skewness.*0071

*We have been over this again and again.*0075

*We know that there are skewed right distributions, skewed left distributions, and ones that are not skewed at all.*0077

*There are lots of these distributions.*0087

*This one is not skewed.*0091

*Anything that has a tail is skewed.*0093

*These are also called J shapes, or formally they are called asymptotic shapes*0101

*because there is this asymptote right here; the curve approaches the x axis asymptotically.*0115

*Skewness means there is a tail somewhere.*0125

*One side is longer than the other.*0131

*One side has values that go on and on, and those values are not clustered around other values.*0135

*That is basically the concept of skewness.*0143

*It would be nice if we could get a number that would tell us this is exactly how skewed your distribution is.*0147

*Distribution like this might be somewhat skewed but a distribution like this might be really skewed.*0155

*It would be nice if we could say how much more skewed one is than the other.*0166

*The skewness that we are going to calculate is going to tell us exactly how skewed something is.*0171

*On the positive end it would mean that the tail is on the right, if the number is positive.*0177

*If the number is negative it will tell us that the tail is on the left.*0187

*The greater the positive number, the more skewed it is to the right.*0192

*Normal distributions often have 0 skew.*0198

*Things that are symmetrical often have 0 skew.*0201

*Let us talk about calculating skewness.*0209

*Some people get a little bit freaked out at this, but I want to boil everything in here down to what is important.*0210

*Here is the main heart of the skewness function.*0218

*Basically it is going to be the sum of cubed distances from the mean.*0227

*We will talk in a little why it is cubed.*0239

*The sum of all the cubed distances of each point from the mean of the sample.*0243

*Instead of the sum of squares, it will be the sum of cubes divided by the standard deviation (s) cubed,*0252

*where we use the sample to calculate s as an estimate of the standard deviation of the population.*0264

*That is basically the crux of the idea.*0272

*There is going to be some more frills on this but this is basically the heart of the idea.*0275

*Let us think about why it is cubed.*0280

*It is cubed because it is going to matter whether it is going to be positive or negative.*0282

*Before, when we squared things, we did not care which direction away from the mean it was.*0287

*But now we care.*0293

*How far are you away from the mean, but also in what direction are you away from the mean?*0295

*We are having things that are stacking up.*0304

*It is like each point is having a vote.*0306

*They are saying I’m on the left side, I’m on the right side.*0310

*By adding them up we see who wins, the left side guys or the right side guys.*0314

*When there are more guys on the negative end we get a smaller number.*0319

*When there are more guys on the positive end we get a large number.*0325

*By cubing it, what we do is we make those that are on the ends matter more than those who are close to the center.*0332

*Cubing is going to make it bigger than squaring it.*0340

*That is the main heart of the idea.*0346
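
As a rough illustration of the "voting" idea above, here is a minimal Python sketch (the lecture itself uses Excel). The data set is made up for this example and is not from the lecture's file. It shows that squaring throws away the sign of each deviation while cubing keeps it, so points left and right of the mean pull the total in opposite directions.

```python
# Hypothetical right-skewed sample, invented for illustration only.
data = [1, 2, 2, 3, 3, 3, 4, 12]

mean = sum(data) / len(data)
deviations = [x - mean for x in data]

# Squaring loses the direction: every term is >= 0.
squared = [d ** 2 for d in deviations]

# Cubing keeps the direction: points left of the mean give negative terms,
# points right of the mean give positive terms, and far points count more.
cubed = [d ** 3 for d in deviations]

left_votes = sum(c for c in cubed if c < 0)
right_votes = sum(c for c in cubed if c > 0)

print(f"left-side total  = {left_votes:.3f}")
print(f"right-side total = {right_votes:.3f}")
print(f"sum of cubes     = {sum(cubed):.3f}")
```

In this sample the single far-right value outweighs all of the left-side points combined, so the sum of cubes comes out positive.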

*Just to review, if you double click on this s, what would be inside?*0349

*Think about clicking on that s, what would be s equal to?*0356

*s is the sum of squares ÷ (n – 1), and because it is the standard deviation, we would take the square root of that.*0361

*Standard deviation is always going to be positive.*0372

*This is not going to change its sign.*0378

*There are some frills on top of the heart of this function.*0384

*Sometimes people divide by n as well, or by n – 1.*0388

*In Excel it is going to multiply by n ÷ ((n – 1) × (n – 2)).*0395

*But that junk does not matter as much, because it will not have as great an impact on skewness as the heart of this function.*0403

*That is what I want you to know, this is the heart.*0413

*That is the heart of the function and the other stuff are just frills.*0418
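
Written out, the heart and the frills described above look roughly like this. The symbol c is a placeholder introduced here (it is not the lecture's notation) for whichever constant is chosen, for example 1/n, 1/(n – 1), or Excel's n/((n – 1)(n – 2)).

```latex
\[
\text{skewness} \;=\; c \cdot \frac{\sum_{i=1}^{n} (x_i - \bar{x})^{3}}{s^{3}},
\qquad
s \;=\; \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}{n - 1}}
\]
```

Because both c and s are positive, the sign of the result comes only from the sum of cubes. The choice of c changes the size of the number, so values are only comparable when they were computed with the same constant.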

*This is one type of skewness, but actually there are tons of different types of skewness.*0425

*There is Pearson skewness, and there are lots of types of skewness whose names I cannot remember.*0430

*There is minimal or distance skewness.*0437

*There is momental skewness; there are a ton of them.*0442

*If you want to know more about skewness, you can check out this link www.wolframalpha.com.*0444

*I frequently use it and refer my students to it.*0452

*Let us talk about interpreting skewness.*0459

*Imagine this is all of the skewness values that are possible. Say we find something that has negative skewness, like a skewness of -1.*0464

*We have found out that our distribution has it, but I have not shown you what our distribution is.*0478

*Can you guess what kind of distribution it is?*0482

*Yes you can.*0484

*You know that it has a tail that goes to the left.*0485

*On the other hand if it is a positive one, you know that it has a tail that goes to the right.*0491

*If the skewness is 0, we are not sure exactly what it is but we basically know that it looks symmetrical.*0497

*It could look uniform, or it could look like approximately a normal distribution.*0508

*It could be some crazy thing that is symmetrical.*0515

*It could be any number of distributions as long as they are approximately symmetrical.*0524

*Let us talk a little bit about how to look at skewness in Excel.*0531

*Go ahead and download the example Excel file and if you look on the first sheet, you should see skew data.*0540

*I have shown you 3 different little data sets.*0550

*Data set A is skewed to the right, notice that it has a tail to the right.*0556

*Data set B is skewed to the left, notice that it has a tail to the left.*0561

*Data set C is more normal, it is symmetrical.*0566

*It has a little peak here but it is roughly symmetrical on both sides.*0572

*The nice thing about Excel is that it does have a skew function so you could automatically generate skewness.*0579

*The skew function is just skew and you put in the data that you like it to calculate.*0587

*This is showing you the skew of 2.08, 2.09, it is saying that it is very positively skewed.*0596

*I could just drag this across and it will calculate the skews for the data sets above.*0609

*This one is for this data set and we find that it is a negative skew.*0616

*The extent is almost as negative as this one is positive, right?*0622

*This is about 2 and this is about -2, and it is because these two are largely mirror images of each other, so they have similar skews.*0627

*Here we see that the skew is closer to 0.*0637

*It is still on the positive side but it is closer to 0 than it is to 2 or 3.*0641

*In Excel, if you are interested in the particular formula that they use to calculate skew,*0650

*you could double click, and when this little help window comes up you will notice that skew is a hyperlink.*0657

*If you click on that then you should get this little window that comes up, that is an Excel help window.*0667

*Let me make this bigger.*0677

*Here they show you the exact equation that they use for skew.*0681

*Notice that this part is exactly what we talked about.*0685

*It is that distance away from the mean, cubed, over stdev cubed: (x – mean)^{3} / stdev^{3}.*0689

*That is the heart of the function, but here we have this extra stuff: n / ((n – 1) × (n – 2)).*0699

*You could use that or you could use something else.*0711

*You could use 1 / (n – 1); that is also very common.*0714

*I will be using 1 / (n – 1).*0718

*Before we hit enter and before we calculate skew for ourselves,*0723

*let us look a little bit at what being skewed right or left would mean for our measures of central tendency.*0730

*Let us calculate the mean.*0739

*The mean is average; sometimes I type in mean in Excel.*0741

*I’m going to close my parentheses.*0750

*Let us also get the median while we are at it and let us get the mode.*0752

*Once we have those 3 values, we could just copy and paste straight across and*0767

*it will find the respective mean, median and mode for each of these other distributions.*0777

*When it is skewed to the right and there is a very positive skew,*0784

*what we find is that the mean is greater than the median which is greater than the mode.*0789

*That makes sense.*0796

*The mode is probably going to be on the very left side of the distribution.*0798

*The mean is highly affected by the outliers.*0802

*The outliers are on the right so the mean is going to be greater.*0806

*For negatively skewed distributions, we have the exact opposite pattern.*0810

*This time the mode is the greatest, followed by the median, then the mean.*0814

*Remember the mean is highly affected by outliers, this time the outliers are small.*0819

*The mean is being pulled to the smaller side.*0825

*The most frequent numbers are up on the higher end and that is why median and mode tend to be bigger.*0829

*When you know the skewness, you will typically know the relationship between mean, median, and mode.*0835

*On the other hand, when distributions are largely symmetrical, when there is very little skew on either side*0844

*then what we see is that the mean, median, and mode are often very close to each other.*0851

*Sometimes they will be exactly the same, this time they are just very close to each other.*0858

*That makes sense because there is very little skew, so they sit almost on top of each other.*0863
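
The ordering of mean, median, and mode described above can be checked with a small Python sketch using the standard `statistics` module. The two data sets are hypothetical, invented for illustration, and are not the ones in the lecture's Excel file.

```python
from statistics import mean, median, mode

# Hypothetical data sets, invented for illustration only.
right_skewed = [1, 1, 1, 2, 2, 3, 4, 6, 10]    # tail on the right
left_skewed = [1, 5, 7, 8, 9, 9, 10, 10, 10]   # tail on the left

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(
        name,
        "mean:", round(mean(data), 2),
        "median:", median(data),
        "mode:", mode(data),
    )
```

For the right-skewed set the mean is the largest of the three and the mode the smallest; for the left-skewed set the order is reversed.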

*Let us talk about how to calculate skewness.*0874

*Here I have put the formula for skewness.*0879

*Sometimes people are not comfortable with moving around that sigma sign, it is like where does it go?*0883

*That sigma will affect some things and not other things.*0899

*That sigma will affect anything that has an i next to it.*0902

*Because stdev ^{3} has no i, you could put the divided by stdev ^{3} inside the ∑ and it would not change the result.*0906

*You could either do this part after you add them all up or you could do it before you add them all up.*0917

*It actually does not matter because of the distributive property.*0919

*You could either put this inside the sum or you could take it out.*0930

*It does not matter.*0935

*Let me show you a way that you could write this.*0936

*Actually it might not be helpful to write it in Excel; I will write it when we go back to our PowerPoint slides.*0941

*Let us start with this.*0950

*We know that we need to get all of these distances away from the mean.*0952

*Let us start with that and then cube it.*0958

*I need to start with the equal sign (=) and I’m going to put in a parenthesis to say take my value, subtract the mean which would be the average.*0962

*I’m going to put in all of these values.*0975

*I do not want that average to jiggle around so I’m going to lock that data in place.*0979

*Because I want to copy and paste it later on, I’m going to leave the A to vary but I’m going to say lock the 4 down.*0985

*When I copy down this column it would not change the A column, but when I copy across it will change it.*0997

*I only want to lock down the number row.*1004

*Then I’m going to close that parentheses and cube that value.*1009

*Here I could also divide by this as well but I’m just going to do this for now.*1016

*This is just the part where I’m doing (x – x bar)^{3}, that is all I’m doing in this column.*1025

*I’m going to go ahead and copy and paste that all the way down.*1037

*Here what we have is just this part but now I need to sum them all up and divide by s ^{3} as well as n – 1.*1045

*I’m going to modify this to make my formula a little bit simpler.*1061

*I will have n – 1 down here.*1070

*Now I need to add these up and divide by (n – 1) × stdev^{3}.*1078

*Here let us do that.*1091

*Let us sum these guys all up and divide by the count of my data – 1, multiplied by the stdev of my data cubed.*1093

*It is a lot of parentheses.*1127

*Because it is color coded I know that I need another blue one to signal my denominator and hit enter.*1131

*There we get a skewness of 1.84.*1139

*That is fairly close to the 2,*1148

*which is what we got using the Excel function.*1151

*The Excel function will just multiply by a slightly different constant so that is the only difference.*1153

*But once we have this, we can actually copy this over here.*1158

*Here if I double click on this it shows that it is using the C column because I let C vary*1168

*and it is being relative about the columns but it is locking the rows in place.*1175

*I could also check down here, it is using the C column to count those data and get the standard deviation.*1182

*What we find is -1.7 which is pretty close to the -1.9 that we found with Excel’s function.*1192

*I’m just going to copy and paste all of that again to our approximately symmetrical one.*1201

*We got something close to the .44 that Excel gave us, .39.*1209
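
The spreadsheet steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the lecture's worksheet: the data set is hypothetical, the function names are invented here, and `statistics.stdev` is used as the counterpart of Excel's sample standard deviation. The second function uses the n / ((n – 1)(n – 2)) constant shown on Excel's SKEW help page.

```python
from statistics import mean, stdev


def skew_manual(data):
    """Sum of cubed deviations divided by (n - 1) * s**3, as in the walkthrough."""
    n = len(data)
    m = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    cubes = [(x - m) ** 3 for x in data]  # the helper column of cubed deviations
    return sum(cubes) / ((n - 1) * s ** 3)


def skew_excel_style(data):
    """Same heart, but with the n / ((n - 1)(n - 2)) constant."""
    n = len(data)
    m = mean(data)
    s = stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)


data = [1, 2, 2, 3, 3, 3, 4, 12]  # hypothetical, invented for illustration
manual = skew_manual(data)
excel_style = skew_excel_style(data)

print("manual:     ", round(manual, 3))
print("Excel-style:", round(excel_style, 3))
print("ratio:      ", round(excel_style / manual, 3))
```

The two results always have the same sign and differ only by the factor n / (n – 2), which is why the hand-calculated values land a little closer to 0 than the ones from the built-in function.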

*That is how we are going to calculate shape here.*1216

*One of the features is skewness.*1222

*That is skewness.*1226

*Now let us talk about interpreting kurtosis.*1233

*I actually said that I will talk a little bit about where the skewness stuff goes.*1236

*Let me just draw it over this corner.*1242

*I’m not going to need all this room for kurtosis.*1244

*Let me just talk a little bit more about the skewness function here.*1247

*The formula for skewness as I said before is going to be the sum of all the cubed distances over some constant and s ^{3}.*1254

*I have said that is the idea.*1275

*Sometimes you might see this formula written more like this: ∑ of (x sub i – mean)^{3} / s^{3}.*1277

*This and this they mean the same thing.*1294

*It is just different ways of writing it.*1301

*The ∑ affects things that are affected by i, and because s ^{3} is not affected by the i at all,*1303

*s ^{3} will be the same whether it is the first value or the last value.*1312

*It does not matter whether you write it inside included in that ∑ or not.*1318

*Another way you could write it is also 1/s^{3} × ∑ (x sub i – x bar)^{3}.*1323

*This is another way you could write it and so all of these 3 ways of writing it are equivalent.*1338

*It does not change anything.*1344
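
For reference, the three equivalent ways of writing the heart of the skewness formula described above can be set side by side. The notation (x with a bar for the sample mean, s for the sample standard deviation) is the usual convention and is assumed here.

```latex
\[
\frac{\sum_{i=1}^{n} (x_i - \bar{x})^{3}}{s^{3}}
\;=\;
\sum_{i=1}^{n} \frac{(x_i - \bar{x})^{3}}{s^{3}}
\;=\;
\frac{1}{s^{3}} \sum_{i=1}^{n} (x_i - \bar{x})^{3}
\]
```

The equalities hold because s^{3} does not depend on i, so it can be factored out of the sum.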

*Do not get tripped up by one of these other options.*1348

*They are not trying to be tricky.*1352

*Let us talk about kurtosis.*1356

*Kurtosis is a concept that is weird for people because this is not something that we are used to dealing with in regular shapes that we know.*1358

*Roundness and squareness.*1368

*Kurtosis is something a little bit different.*1371

*Kurtosis is about two things that are bundled into one.*1373

*One is pointiness or peakedness.*1378

*Very kurtotic shapes might look something like that.*1382

*Super non kurtotic shapes looks something like that.*1388

*One important aspect of this is the pointiness.*1392

*At the same time that peakedness is going up, what you are seeing is the tails becoming thinner.*1403

*Peakedness and thin tails usually go together and kurtosis is about both of these things*1413

*because here we not only have no peakedness but we also have fat tails.*1421

*The tails are just as fat as that lack of peak in the middle.*1427

*Something in the middle might look something more like that.*1431

*This is a kurtotic dimension where you are getting increasing kurtosis.*1436

*When you have increasing kurtosis, it means two things simultaneously.*1447

*Being more peaked but also having thinner tails.*1451

*Let us talk about calculating kurtosis.*1458

*With kurtosis, let us talk about the main idea before we get into things.*1464

*Like skew, you can multiply it by constants.*1472

*It does not matter.*1475

*This is the heart of kurtosis.*1477

*Here is my sigma, it is going to be that same distance for each point away from my mean.*1480

*Instead of cubing it, we are going to raise it to the 4th power.*1488

*When we raise it to the 4th power, we know that we do not care about whether it is on the left side or the right side.*1498

*That is one thing to know already about kurtosis.*1507

*It is not counting how many are on one side versus the other side of the mean.*1510

*Kurtosis already we know it will probably be positive because it is going to raise everything to the 4th power*1516

*and when you raise something to an even number power it is going to be positive.*1525

*We are going to divide that by stdev ^{4}.*1530

*Once again, this will always be positive, stdev is already positive.*1536

*We are going to raise it to the 4th power.*1543

*Kurtosis is largely going to be a positive number.*1545

*The only difference between different values of kurtosis might be whatever constant they decide to multiply by.*1553

*Frequently, 1/(n – 1) is one of the constants.*1564

*I forget what Excel does, Excel does something crazy.*1572

*We will figure it out when we get there.*1575

*Once again, even with kurtosis you could write it in very different ways.*1577

*You could write it as 1/((n – 1) × s^{4}) and then put your sum of 4th powers here.*1582

*Instead of the sum of squares, we are raising each distance to the 4th power.*1600

*That is one way of writing it.*1606

*Another way of writing it is ∑ (x sub i – mean)^{4} / ((n – 1) × s^{4}).*1608

*All of these things are the same thing.*1624

*Once again, this is the heart of idea of kurtosis.*1627
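
The two ways of writing the heart of kurtosis mentioned above, with the 1/(n – 1) constant, can be set side by side. The notation (x with a bar for the sample mean, s for the sample standard deviation) is assumed here.

```latex
\[
\frac{1}{(n-1)\,s^{4}} \sum_{i=1}^{n} (x_i - \bar{x})^{4}
\;=\;
\frac{\sum_{i=1}^{n} (x_i - \bar{x})^{4}}{(n-1)\,s^{4}}
\]
```

Every term in the sum is raised to an even power, so this quantity can never be negative.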

*As for one of the reasons that it is raised to the 4th power, let us think about it.*1634

*Remember it is very concerned about being either on the outside or the inside of the distribution.*1638

*Are you on the tails or at that peak?*1643

*By raising it to the 4th power it makes everybody matter a lot especially if you are on the outside.*1645

*You matter way more than if you are on the inside.*1652

*One more thing to the idea of kurtosis.*1659

*Typically kurtosis is going to be for normally distributed function, let me try here.*1663

*For approximately normal looking distribution the kurtosis if you calculate it with some function like this, it is going to be 3.*1673

*That is so arbitrary.*1686

*What they have done is they made the kurtosis function so that you subtract 3 from it so that the normal distribution has a kurtosis of 0.*1691

*Like 3 – 3 =0.*1705

*That is actually how you will get negative kurtosis.*1706

*It is not because of this function, but it is because you subtract by 3.*1710

*The lowest kurtosis you can get is -3.*1714

*It is an odd, bizarre sort of thing.*1720

*I’m not sure when they decided to normalize it to 0, but my theory is that it would be hard for people to remember that about 3 is what is normal.*1722

*They just make you do it in the formula.*1738
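
To see the "subtract 3" correction in action, here is a minimal Python sketch. It assumes the 1/(n – 1) constant used in this lecture; the function name, the random seed, and the sample size are arbitrary choices made for this example.

```python
import random
from statistics import mean, stdev


def excess_kurtosis(data):
    """Heart of kurtosis with the 1 / (n - 1) constant, then subtract 3."""
    n = len(data)
    m = mean(data)
    s = stdev(data)  # sample standard deviation
    heart = sum((x - m) ** 4 for x in data) / ((n - 1) * s ** 4)
    return heart - 3


random.seed(0)  # arbitrary seed so the run is repeatable
normal_sample = [random.gauss(0, 1) for _ in range(20000)]

k = excess_kurtosis(normal_sample)
print("kurtosis of a large normal-looking sample:", round(k, 2))
```

For a large sample drawn from a normal distribution the printed value should land close to 0, and because the heart is never negative the result can never drop below -3.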

*Now that we have this weird correction of subtracting 3, we could talk about interpreting kurtosis.*1744

*You already know that a kurtosis of about 0 would mean that you have an approximately normal distribution.*1752

*That is what a kurtosis of 0 would be.*1766

*A kurtosis greater than 0 is going to be more peaked, more pointy than normal.*1769

*Something like this.*1784

*That is a kurtosis that is greater than 0 and we call that leptokurtic.*1788

*It means more peaked than normal.*1799

*We could call other things that have a kurtosis that is similar to the normal distribution.*1812

*We could just say similarly kurtotic to the normal distribution but that would be long.*1819

*We say they are mesokurtic because you do not have to be a normal distribution to have a kurtosis of 0.*1825

*Mesokurtic just means it is about the same peakedness as normal.*1837

*That is mesokurtic.*1853

*We need to have another name for something that is less peaked or flatter than normal.*1858

*I remember this because mesokurtic, leptokurtic, that sounds crazy but meso I just remember it is like in the middle kurtosis.*1868

*Lepto is hard for me to remember which it is.*1877

*That is why this last one helps me because this one I could always remember, it is called platykurtic.*1881

*I think of a – and how it has a flat peak.*1887

*Platykurtic means that this is flatter than normal, with less peakedness than normal.*1892

*That would be something that looks more like this.*1913

*That is platykurtic there.*1918

*Those are our 3 interpretations of kurtosis.*1922
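
The three labels above can be turned into a tiny helper. This is only a sketch: the function name and the `tolerance` cut-off of 0.5 are assumptions made for this example, since the lecture does not say how close to 0 counts as mesokurtic.

```python
def describe_kurtosis(kurtosis, tolerance=0.5):
    """Label a kurtosis value (after the subtract-3 correction).

    `tolerance` is an arbitrary cut-off chosen for this sketch, not a standard.
    """
    if kurtosis > tolerance:
        return "leptokurtic"   # more peaked than normal
    if kurtosis < -tolerance:
        return "platykurtic"   # flatter than normal
    return "mesokurtic"        # about as peaked as normal


for value in (1.0, 0.1, -1.0):
    print(value, "->", describe_kurtosis(value))
```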

*Let us go to our Excel examples and look at kurtosis there.*1925

*Here let us click on a kurtosis data, that is the 3rd sheet and we could look at 3 distributions*1931

*that are already put in there that might be good for us to look at regarding this idea of kurtosis.*1940

*With the uniform distribution, obviously the tails are just as fat as the peak; it is simply not peaked.*1946

*The tails are super fat.*1952

*Here we have normal peak but the tails do not look pretty fat.*1954

*Here we have the thinnest tails and the peaks are higher than the tails are.*1960

*There is a bigger difference between the peak and tails.*1967

*Handily Excel has a kurtosis function so we could put in kurt and then put in our data.*1972

*Hit enter.*1985

*What we have here is negative kurtosis where it is flatter than the normal distribution.*1989

*I’m just going to drag all of this over here.*1997

*Here we still have a negative kurtosis because it is not as peaked or pointy as the normal distribution.*2001

*Here we have it is not normal but is more pointy, more peaked than the normal distribution would be.*2012

*If you want to know the precise formula that Excel uses in order to calculate kurtosis, go ahead and click on kurt.*2022

*Here it shows you that this is the formula for kurtosis.*2032

*What they do is they multiply by this crazy looking stuff but the crux of the formula is still there.*2037

*It is the deviations to the 4th power over stdev ^{4}, then – 3 × crazy stuff.*2046

*That is the heart of that function.*2061

*You can see that we use that.*2065

*If we click on the kurtosis, we will calculate it on our own.*2069

*I use this n – 1 constant; other people use other things.*2074

*What I’m going to do is I’m going to put in n – 1 down here instead.*2083

*All of this, this whole thing, and you subtract 3.*2097

*Bizarre but true.*2107

*Let us start with this part right here.*2110

*The deviation to the 4th power.*2112

*I just put in this value – average of my data and all of that raised to the 4th.*2117

*I do not want my mean to jiggle around so I’m going to lock my rows.*2132

*We are not going to lock the columns.*2138

*As long as I copy and paste it down here, we can just use column A.*2141

*I’m just going to copy and paste all of that down here.*2147

*Notice that all of these values are positive.*2152

*Down here, what I’m going to do is sum them all up.*2157

*That is one thing I know I need to do.*2164

*I know I need to divide by n – 1, count all of these guys and subtract 1.*2168

*That is within my green parentheses there, multiply by stdev ^{4}.*2184

*That is stdev of my data raised to the 4th power.*2193

*Because Excel knows order of operations it is going to do that power before it does the multiplying.*2202

*Then I’m going to close that and that is my blue parentheses closing there.*2212

*Here we have the sum ÷ n – 1 × stdev ^{4}.*2220

*I need to take all of that and subtract 3.*2235

*I’m going to put on another set of parentheses around this whole thing and subtract 3.*2239

*Hit enter.*2248

*Here I get negative kurtosis.*2249

*That means it is flatter than normal.*2258

*Notice that it is not more negative than -3; that is the lowest it can go.*2260

*I’m going to take this whole thing right here and paste it right here.*2267

*What we see is similarly this is less flat than the one we just saw but it is not quite close to normal but is more normal.*2274

*If we copy and paste all of that over here, we find here this is more sharply peaked than normal.*2293

*That is our kurtosis on Excel.*2302
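
The hand calculation above can be sketched in Python next to the formula from Excel's KURT help page. Both data sets are hypothetical, invented for illustration, the function names are made up for this example, and `statistics.stdev` stands in for Excel's sample standard deviation.

```python
from statistics import mean, stdev


def kurt_manual(data):
    """Sum of 4th-power deviations over (n - 1) * s**4, minus 3."""
    n = len(data)
    m = mean(data)
    s = stdev(data)
    fourths = [(x - m) ** 4 for x in data]  # the helper column
    return sum(fourths) / ((n - 1) * s ** 4) - 3


def kurt_excel_style(data):
    """Same heart with the constants shown on Excel's KURT help page."""
    n = len(data)
    m = mean(data)
    s = stdev(data)
    z4 = sum(((x - m) / s) ** 4 for x in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * z4 - correction


flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]     # hypothetical, uniform-looking
peaked = [1, 5, 5, 5, 5, 5, 5, 5, 5, 9]    # hypothetical, sharply peaked

for name, data in [("flat", flat), ("peaked", peaked)]:
    print(name, round(kurt_manual(data), 2), round(kurt_excel_style(data), 2))
```

The two versions give different numbers because of the constants, but for these samples they agree on the sign: negative for the flat data and positive for the peaked data.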

*Let us move on to some examples.*2310

*Here is example 1.*2311

*Given that on a particular sample, mean is less than the median is less than the mode.*2314

*What is the likely shape of this distribution?*2320

*This means that somehow the mode or the peak and the mean is somewhere on this side of it.*2324

*Here is the mode and the mean, median is somewhere in between.*2335

*That would mean that since this guy, the mean is highly affected by outliers there must be outliers on this side.*2341

*I’m going to guess that this is a negative skew, left skewed.*2352

*That means the skewness number should be negative.*2366

*What about in a sample where the mean is greater than the median which is greater than the mode.*2373

*What is the likely shape?*2379

*We just have to reason backwards.*2380

*The mode is the smallest and the mean, the median is somewhere in between.*2383

*The mean is pulled by the outliers, my outliers must be here.*2395

*I’m going to say this is a right skewed distribution.*2400

*That would mean that the skewness is greater than 0.*2406

*It is a positive number.*2413
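
The reasoning in Examples 1 and 2 can be summarized as a small Python helper. The function name and the wording of the labels are invented for this sketch, and the result is only a likely shape, as in the examples, not a guarantee.

```python
def likely_skew(mean, median, mode):
    """Guess the direction of skew from the ordering of mean, median, and mode."""
    if mean < median < mode:
        # The mean is pulled toward small outliers: tail on the left.
        return "left skewed (negative skewness)"
    if mean > median > mode:
        # The mean is pulled toward large outliers: tail on the right.
        return "right skewed (positive skewness)"
    return "no clear skew from this ordering"


print(likely_skew(mean=4, median=6, mode=8))   # Example 1 pattern
print(likely_skew(mean=8, median=6, mode=4))   # Example 2 pattern
```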

*Example 3.*2417

*If a distribution has a kurtosis close to 0 and skewness close to 0, what is the likely shape of the distribution?*2418

*We know that skewness close to 0 means that it is basically symmetric, but it could be symmetric in lots of ways.*2428

*It does not have to be normally distributed.*2438

*With a kurtosis that is also 0, we know that must mean that the tails are not too fat, not too skinny.*2440

*The peak is not too pointy and not too dull either.*2451

*If both skewness and kurtosis are 0, we could very likely think of this as approximately normal.*2455

*That is probably a good way to guess.*2468

*Finally example 4.*2473

*Sketch a potential distribution that can have a kurtosis of 1, then sketch another distribution that can have a kurtosis of -1.*2474

*I find the positive 1 to be easier because I always think that when it is positive it means it is pointy.*2485

*The other one is a sketch of a distribution that is less pointy.*2496

*It is always something like that.*2504

*That is it for skewness and kurtosis.*2507

*Thanks for using www.educator.com.*2509

0 answers

Post by Srikanth C on April 22, 2015

I couldn't get the point where we can use any number to divide the heart of skewness. If (n-1) is 2, then we are literally dividing the skewness value from the formula by half and if (n-1) is 3 we are making it one-third, so are we not reducing the overall value of skewness by increasing n? So, will it be correct to say we can choose any number for n-1 to divide while calculating skewness?

1 answer

Last reply by: Manoj Joseph

Wed May 1, 2013 11:23 PM

Post by Manoj Joseph on May 1, 2013

I think you are making a wrong mistake to SPSS file instead of Excel shell

0 answers

Post by Kambiz Khosrowshahi on March 28, 2013

In the equation for skewness, I dont understand why it's ok to multiply it by 1/n-1. Isnt the correct equation that "heart"? If so, then by multiplying it by 1/n-1, wont it produce the wrong skewness?

1 answer

Last reply by: Professor Son

Wed Aug 15, 2012 2:15 PM

Post by KIM CARTER on April 27, 2012

Love all your work!!!!!!!!The correct spelling of (you wrote)asymtotic it is actually asymptotic. (Smile)