For more information, please see full course syllabus of Statistics

### Repeated Measures ANOVA

Lecture Slides are screen-captured images of important points in the lecture. Students can download and print out these lecture slide images to do practice problems as well as take notes while watching the lecture.

- Intro 0:00
- Roadmap 0:05
- Roadmap
- The Limitations of t-tests 0:36
- Who Uploads more Pictures and Which Photo-Type is Most Frequently Used on Facebook?
- ANOVA (F-test) to the Rescue! 5:49
- Omnibus Hypothesis
- Analyze Variance
- Independent Samples vs. Repeated Measures 9:12
- Same Start
- Independent Samples ANOVA
- Repeated Measures ANOVA
- Independent Samples ANOVA 16:00
- Same Start: All the Variance Around Grand Mean
- Independent Samples
- Repeated Measures ANOVA 18:18
- Same Start: All the Variance Around Grand Mean
- Repeated Measures
- Repeated Measures F-statistic 21:22
- The F Ratio (The Variance Ratio)
- S²bet = SSbet / dfbet 23:07
- What is This?
- How Many Means?
- So What is the dfbet?
- So What is SSbet?
- S² resid = SS resid / df resid 25:46
- What is This?
- So What is SS resid?
- So What is the df resid?
- SS subj and df subj 28:11
- What is This?
- How Many Subject Means?
- So What is df subj?
- So What is SS subj?
- SS total and df total 31:42
- What is This?
- What is the Total Number of Data Points?
- So What is df total?
- So What is SS total?
- Chart of Repeated Measures ANOVA 33:19
- Chart of Repeated Measures ANOVA: F and Between-samples Variability
- Chart of Repeated Measures ANOVA: Total Variability, Within-subject (case) Variability, Residual Variability
- Example 1: Which is More Prevalent on Facebook: Tagged, Uploaded, Mobile, or Profile Photos? 40:25
- Hypotheses
- Significance Level
- Decision Stage
- Calculate Samples' Statistic and p-Value
- Reject or Fail to Reject H0
- Example 2: Repeated Measures ANOVA 58:57
- Example 3: What's the Problem with a Bunch of Tiny t-tests? 1:13:59

### General Statistics Online Course

### Transcription: Repeated Measures ANOVA

*Hi, welcome to educator.com.*0000

*Today we are going to talk about repeated measures ANOVA.*0002

*So the repeated measures ANOVA is a lot like the regular one-way independent samples ANOVA that we have been talking about.*0004

*But it is also a lot like the paired samples t-test, and so we are going to talk about why we need the repeated measures ANOVA.*0014

*And we are going to contrast the independent samples ANOVA with the repeated measures*0022

*ANOVA, and finally we are going to break down that repeated measures F-statistic into its component variance parts.*0027

*Okay, so previously, when we talked about one-way ANOVA, we talked initially about why we*0035

*needed it, and the reason why we need ANOVA is that the t-test is limited.*0044

*So previously we talked about this example: who uploads more pictures, Latino, white, Asian, or black Facebook users?*0049

*When we saw this problem and thought about maybe doing independent samples t-tests, we realized we would have to do a whole bunch of little t-tests.*0058

*Well, now let us look at this problem.*0066

*It is similar in some ways, but it is also a little bit different, so here is the question.*0070

*Which photo type is most frequently used on Facebook?*0076

*Tagged photos, uploaded photos, mobile uploads, or profile pictures?*0079

*Now, in the same way that the earlier problem had many groups, this problem also has many groups,*0083

*so one thing you could immediately tell is that if we tried to use t-tests, we would also have to use a bunch of little t-tests here.*0091

*But here is another thing.*0099

*These variables are actually linked to one another.*0102

*Often, people who have some number of tagged photos also have some number of uploaded photos, some number of*0104

*mobile uploads, and some number of profile pictures.*0110

*So unlike before, where we had four separate groups of users and no user in one group was linked*0114

*to any of the users in the other Latino, white, Asian, and black groups, here we have these four sets of data.*0121

*Tagged, uploaded, mobile, or profile pictures; and the number of tagged photos is linked to some*0135

*number of uploaded photos, probably because they come from the same person, and maybe this*0146

*person owns a digital camera that they really love to carry around everywhere.*0153

*So the scores in these different groups are actually linked to each other, and these are what we*0158

*have previously called dependent samples, or paired samples, because*0167

*there were only two groups of them at that time; now we have four groups, but we can see that the linking principle still holds.*0173

*So here we are talking about multiple samples,*0181

*more than two, but these samples are also linked to each other in some way.*0188

*And because of that, these are called repeated measures, because we are repeatedly measuring something over and over again.*0194

*We are measuring photos here, here, here, and here, and because of that it is called repeated measures.*0204

*It is very similar to the idea of paired samples, except we are now talking about more than two.*0211

*So 3, 4, 5: we call those repeated measures, and we have the same problem here as we did before.*0217

*If our solution is a bunch of t-tests, we have two problems, whether they are paired t-tests or independent samples t-tests.*0225

*So in this case they would be paired.*0243

*Even in the case of paired t-tests we have the same problems as before; the first problem*0247

*is that with so many t-tests, the probability of false alarms goes up.*0254

*So this is going to be a problem.*0258

*And it is because we reject more null hypotheses; every time you reject a null hypothesis you have a .05 chance of error.*0264

*So we are compounding that problem.*0273

*The second thing that is wrong when we do a whole bunch of little t-tests instead of one giant test is*0277

*that we are ignoring some of the data when we are estimating the population standard deviation.*0284

*When we estimate that population standard deviation, the more data the better; but if we*0291

*only look at two of the samples at a time, then we are ignoring the other two perfectly good sets*0297

*of data, and not using them to help us estimate the population standard deviation more accurately.*0303

*So we get a poorer estimate of s because we are not using all the data at our disposal.*0311

*So that is the problem, and we need to find a way around it; thankfully, Ronald Fisher comes to the rescue with his F-test.*0338
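The false-alarm problem the lecture describes can be sketched numerically. This is not from the lecture; it is a minimal illustration that assumes the pairwise tests are independent, which is only an approximation:

```python
# P(at least one Type I error) across m independent tests, each at level alpha.
# With four photo types there are 6 pairwise t-tests (4 choose 2).
def familywise_error(m, alpha=0.05):
    return 1 - (1 - alpha) ** m

print(familywise_error(1))   # a single test: .05
print(familywise_error(6))   # six tests: roughly a 1-in-4 chance of a false alarm
```

So running six little t-tests more than quintuples the chance of rejecting at least one true null hypothesis, which is exactly the compounding the lecture warns about.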

*Okay, so the ANOVA is our general solution to the problem of too many tiny t-tests.*0349

*But so far we have only talked about ANOVAs for independent samples.*0356

*Now we need an ANOVA for repeated measures, and the ANOVA is always going to start the same way, with the omnibus hypothesis.*0360

*One hypothesis to rule them all; and the omnibus hypothesis always says that all the samples come from the same population.*0370

*So the mu of the first group of photos equals the mu of the second group of photos equals the mu of the*0378

*third group of photos equals the mu of the fourth group.*0389

*And the alternative hypothesis is not that they are all unequal to each other, but that at least one is different.*0392

*And so the way we say that is that all the mu sub P's, the mu's of the different photo types, are not the same.*0404

*Now we have to keep in mind the logic: saying all of these mu's are not the same is not the same as*0421

*saying all of the mu's are different from each other.*0430

*When we say they are not all the same, if even one of them is different, then this alternative hypothesis is true.*0433

*So this starts off much the same way as independent samples; from there we go on to analyze variance.*0442

*And here we are going to use that F statistic again.*0450

*And Ronald Fisher's big idea is that the F is a*0458

*ratio of variances, and really one way of thinking about it is the ratio of between-sample (or between-group) variability over the within-sample variability.*0469

*And another way of thinking about this is the variability we are interested in, and I do not just*0495

*mean that we are passionate about it or find it very curious; I really mean the variability*0510

*that we are making a hypothesis about, over the variability that we cannot explain.*0515

*We do not know where that other variability comes from; it just exists and we have to deal with it.*0523

*Okay, and so this F statistic is going to be the same concept; the same concept will come*0536

*up again when we talk about the repeated measures version of F.*0544

*There are going to be some subtle differences though.*0548

*Okay, so let us talk about the independent samples ANOVA versus the repeated measures ANOVA.*0551

*Both have the same start: they have the same hypotheses, and not only that, they both have the*0561

*same idea of taking all the variance in our sample and breaking it down into component parts.*0567

*Now, when we talk about all the variance in our sample, we really mean: what is our sum of squares total?*0572

*What is the total amount of variability away from the grand mean in our entire data set?*0581

*And just from that sentence, we could figure out what the formula for this would be.*0590

*It should be something like this: every single one of our data points minus the*0597

*grand mean, which we signify with two bars, the double bar, squared; and for the sigma to do this*0606

*for every single data point, not just the data points in one sample, the way it knows to do that is that the index should say N total.*0616

*So this is going to go through every single data point in every single sample, subtract to get the*0625

*distance from the grand mean, square that distance, and add up all those squared distances.*0634

*Okay so that is the same idea to begin with.*0643
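As a sketch of that SS total idea, here it is computed on a small made-up data set; the photo counts are hypothetical, not the lecture's:

```python
from statistics import mean

# Hypothetical counts: 3 subjects (rows) x 4 photo types (columns).
data = [[10, 12, 14, 12],
        [ 2,  5,  6,  3],
        [ 6,  7, 10,  9]]

all_points = [x for row in data for x in row]    # N total = 12 data points
grand_mean = mean(all_points)                    # the "double-bar" grand mean

# SS total: squared distance of every single data point from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_points)
```

For these numbers the grand mean is 8 and SS total comes out to 156.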

*Now we will take this SS total and break it down into its component parts.*0647

*Now, in independent samples, what we see is that all of the variability that we are unable to*0651

*explain lies within the groups, and all of the variability that we are very interested in is between*0664

*the groups; and so for independent samples the story becomes: this SS total is a conglomeration, and*0671

*when you split it up into its parts, you see that it is made up of the sum of squares within the*0681

*groups, inside of the groups, plus the sum of squares between the groups.*0690

*And because of that, the F statistic here becomes the variance between over the variance within,*0696

*and each of these variances corresponds to its own sum of squares.*0711

*Now, in the repeated measures ANOVA we are going to be talking about something slightly different, because now we have these linked data.*0720

*So here the data are independent; these samples are not linked to each other in any way.*0730

*Here, these samples are actually linked to each other,*0738

*either by virtue of being made from the same subject or the same class, or something about these scores links them to each other.*0744

*So not only is there variability across the groups just like before, between the groups, and variability within the groups.*0753

*But now we have a new kind of variability.*0770

*We have the variability caused by these different linkages; the subjects are all different from each other, but each may be similar across measures.*0774

*So the person who owns a digital camera might just have an enormous number of photos all across the board.*0785

*The person who does not have a digital camera, or not even a smartphone, might have a low number of photos across the board.*0792

*So there are those things that are often called individual differences.*0799

*Those are differences that we can actually mathematically quantify; we can explain where*0806

*that variability comes from, but it is not what we are interested in in the study; we are really interested in the between-groups difference.*0813

*But this is not all.*0822

*Once you have taken out this individual variability, there is still some residual within-group variability left over.*0825

*And that is really stuff we cannot explain: it is not caused by the individual differences,*0834

*it is not because of between-group differences, it is just within-group differences.*0842

*So in repeated measures, the sum of squares total actually breaks down slightly differently; even*0847

*though it is still this idea of breaking down the sum of squares total, it now splits*0856

*up into the sum of squares subjects, these individual linkages, the yellow part, plus the sum of squares*0865

*within, just like before, except now we call it residual, because we have taken out the*0878

*variability that comes from the individual differences, and so only the leftover*0889

*remains; and because of that we call it residual, which is just like the word leftover.*0896

*And of course the sum of squares between, which is what we are actually very interested in.*0902

*So just to recap: this is something that we can explain but are not interested in, this is*0907

*something we cannot explain, and this is something we are very interested in.*0915

*So our F statistic will actually become our variability between divided by our variability residual;*0920

*in fact, we just want to take this guy, the subject variability, out of the equation for F.*0931

*So F does not count the variability from the subjects, the individual differences; we are not interested in that.*0940
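The partition the lecture describes, SS total = SS between + SS subjects + SS residual, can be checked numerically. A minimal sketch on a small hypothetical data set (the counts are made up for illustration):

```python
from statistics import mean

# Hypothetical counts: 3 subjects (rows) x 4 photo types (columns).
data = [[10, 12, 14, 12],
        [ 2,  5,  6,  3],
        [ 6,  7, 10,  9]]
n, k = len(data), len(data[0])          # subjects, repeated measures
grand = mean(x for row in data for x in row)

ss_total = sum((x - grand) ** 2 for row in data for x in row)

# Between groups: each column (photo type) mean vs. the grand mean.
col_means = [mean(row[j] for row in data) for j in range(k)]
ss_bet = n * sum((m - grand) ** 2 for m in col_means)

# Within subject: each subject's own mean vs. the grand mean.
subj_means = [mean(row) for row in data]
ss_subj = k * sum((m - grand) ** 2 for m in subj_means)

# Residual: the leftover spread that we cannot explain.
ss_resid = ss_total - (ss_bet + ss_subj)

assert ss_total == ss_bet + ss_subj + ss_resid   # the partition holds
```

Note how much of the total (128 of 156 here) is subject-level variability; that is exactly the piece the repeated measures F leaves out of its denominator.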

*Okay, so I wanted to show you this in a picture; here is what I would show you.*0953

*Here is what we mean by independent samples; remember, the independent samples ANOVA*0960

*always starts off with that same idea: SS total, the difference of each data point*0966

*from the grand mean, squared, and then add all of those up; that is the total sum of squares.*0975

*In independent samples, what we are going to do is take all of this variability, that SS total.*0983

*That SS total is the total variance, and we are going to break it up into between-group variance.*0994

*So think of this: this is just to signify the difference of all of these guys from the grand mean.*1012

*So the between-group differences, SS between; and add to that the within-group variability.*1020

*The variability that we have no explanation for.*1038

*So that is the within-group variability.*1041

*So it only makes sense that the variability between divided by the variability within is what*1046

*we would use in order to figure out the ratio of the variability we are interested in, or*1067

*hypothesizing about, divided by the variability we cannot account for.*1077

*So this becomes the F statistic that we are very much interested in.*1090

*Now, when we talk about the repeated measures ANOVA, once again we start off similarly: for*1096

*every single data point we want its squared distance away from the grand mean, and we add them all up.*1106

*To see this as a picture, you want to see this whole idea here: the*1113

*distance of all of these away from the grand mean, that is SS total.*1123

*However, what we want to do is then break it up into its component parts, and just like before*1129

*we have these differences between the groups, so that is SS between.*1138

*And that SS between is the stuff that we are really interested in, so that is also going to be a factor here.*1147

*We take the variability between, but then we want to break up the rest of the variance*1156

*into one part that we can actually explain, that we can account for, and into the rest, the residual that we cannot explain.*1165

*So even though we are not interested in it, we can actually account for that variability.*1174

*You can think of it across these rows, because notice that person one has fewer photos*1182

*across the board, person three just has more photos across the board; and so those are the kinds*1194

*of individual differences, subject-level differences, that we do not actually want in our F statistic.*1204

*It is variability we know the source of, but we are not interested in it in terms of our hypothesis testing.*1211

*So we have this SS subject; I put a little yellow highlight here so that you know what it stands for,*1217

*and that is the variability that we can explain but that is not part of my hypothesis testing.*1232

*And so what variability are we left with? We are left with any leftover variability; there is some*1240

*leftover variability, and we call that residual variability; that is going to be SS residual.*1248

*And if we want to look at the variability that we are interested in over the variability we cannot*1257

*explain, we are not going to include this variability; we are only going to use this one.*1264

*So the variability between groups divided by the residual variability.*1269

*Once we have that now let us break it down even further.*1281

*So now you basically know what the repeated measures F statistic is.*1289

*It is the variability between groups divided by the residual variability.*1294

*Now we can break these up into their component parts, so it is going to be SS between,*1307

*the sum of squares between, divided by the degrees of freedom between, all divided by the sum of*1316

*squares residual over the degrees of freedom residual.*1327

*So far it just looks like what we have always been doing with variability: sum of squares over degrees of*1338

*freedom; but now we have to figure out how we actually find these things.*1348

*And in fact, this is something you already know, because this is actually exactly the same as the independent samples ANOVA.*1355

*The only thing that can really be different is this one.*1371

*Okay so let us start off here.*1387

*This is what we are really looking for; when we double-click on this guy, on that variability,*1390

*what we find inside is something like this; and then we double-click on each of these things and figure out what is inside.*1398

*So conceptually, the whole idea of the variability between groups is the difference of each*1405

*sample mean from the grand mean, because we want to know how each sample differs from that grand mean.*1413

*Now let us think about how many means we have, because that is going to determine our degrees of freedom.*1419

*How many means we have is usually K,*1426

*however many samples we have; and so the degrees of freedom between is going to be K − 1.*1430

*And the way you can think about this is: how many means do we have? As many means as groups,*1441

*so if we have four groups it would be four means; if we had three groups it would be three means.*1448

*And if we had three means and knew two of them, and we know the grand mean, we could actually figure out the third; and*1456

*because of that, our degrees of freedom is K − 1, the number of means minus 1.*1465

*Okay so what is sum of squares between? *1471

*Well, it is this whole idea of the difference of the sample means from the grand mean, and we could*1479

*say it is each sample mean's distance away from the grand mean; we have a whole bunch of sample means,*1484

*so I am going to put my index there.*1489

*And we are going to square that, because this is a sum of squares; and each mean's distance*1491

*should count more if that sample has a lot of members, it should get more votes, so we are*1498

*going to multiply that by n sub i, how many are in that sample.*1504

*And to figure out what I mean by i: i is going to stand*1511

*for each group, so this sigma is going to have to go from i = 1 through K, how many groups.*1518

*And then it is going to cycle through group 1, group 2, group 3, group 4, and this is the sum of squares between.*1527
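A minimal sketch of that between-groups calculation, on hypothetical data. In repeated measures, every group has the same n sub i, because each subject appears in every photo type:

```python
from statistics import mean

data = [[10, 12, 14, 12],   # hypothetical: 3 subjects x 4 photo types
        [ 2,  5,  6,  3],
        [ 6,  7, 10,  9]]
n, k = len(data), len(data[0])
grand = mean(x for row in data for x in row)

# SS between = sum over groups i of n_i * (xbar_i - grand mean)^2,
# cycling through group 1, group 2, ..., group k.
group_means = [mean(row[j] for row in data) for j in range(k)]
ss_bet = sum(n * (m - grand) ** 2 for m in group_means)

df_bet = k - 1            # number of group means minus 1
s2_bet = ss_bet / df_bet  # the variance between
```

For these made-up numbers, SS between is 24 on 3 degrees of freedom.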

*And this should look very familiar, because it is actually the same thing from the independent samples ANOVA.*1536

*So now we have to figure out how to find the other sum of squares, the new one:*1544

*sum of squares residual and degrees of freedom residual; and the whole reason we want to do*1551

*that is because we want to find the variability residual, the leftover variability.*1555

*Any leftover spread within the groups that is not accounted for by within-subject variation.*1562

*Now, within-subject might mean within each person, but it might mean within each hamster*1570

*or each company that is being measured here repeatedly; so whatever your case is,*1578

*whether animal or human or an entity of some sort, that is considered your within-subject variability.*1585

*And those subjects are all slightly different from each other.*1595

*But that is not something that we are actually interested in, so we want to take that out and take the leftover variability.*1598

*And because it is the idea of leftover, we actually cannot find this sum of*1604

*squares directly; we have to find the leftover.*1614

*And so the way we do that is take the total sum of squares and then subtract out the stuff we do*1617

*not need, which is namely the sum of squares between as well as the sum of squares within subject, the variability within subjects.*1624

*And so here we see that we are going to have to find the sum of squares total*1637

*as well as for the within subject; we already knew we had to find this one,*1642

*and that is how we can find the sum of squares residual: literally whatever is left over.*1650

*In the same way, to find degrees of freedom residual, we have to know something about*1656

*the other degrees of freedom in order to find this sort of "whatever is left."*1661

*And so in order to find degrees of freedom residual, what we do is multiply together the*1667

*degrees of freedom between times the degrees of freedom within subject, and when we do this,*1674

*we are going to account for all the degrees of freedom that are left over.*1683
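That subtraction, and the df product, look like this in a sketch (same hypothetical data set as an assumption, not the lecture's numbers):

```python
from statistics import mean

data = [[10, 12, 14, 12],   # hypothetical: 3 subjects x 4 photo types
        [ 2,  5,  6,  3],
        [ 6,  7, 10,  9]]
n, k = len(data), len(data[0])
grand = mean(x for row in data for x in row)

ss_total = sum((x - grand) ** 2 for row in data for x in row)
ss_bet = n * sum((mean(r[j] for r in data) - grand) ** 2 for j in range(k))
ss_subj = k * sum((mean(row) - grand) ** 2 for row in data)

# The leftover: total minus the two pieces we can name.
ss_resid = ss_total - (ss_bet + ss_subj)
df_resid = (k - 1) * (n - 1)     # df between times df within subject
```

Here the leftover is small, SS residual = 4 on (4 − 1)(3 − 1) = 6 degrees of freedom, because most of the spread was between groups or between subjects.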

*Okay, so we realized that in order to find the sum of squares residual we have to find all these other sums of squares, so here is the sum of squares within subject.*1689

*The way to think about this notion is that we are really talking about subject-level, or case-level, variation.*1702

*So each case differs a little bit from the other cases, for who knows what reason, but we*1716

*can actually account for it here; it is not totally unexplained. We do not*1723

*know why it exists, but we know it exists because the subjects are all slightly different from each other;*1729

*we do not know why, but we know what it is and we can calculate it.*1734

*Okay, so conceptually you want to think about this as how far each subject's mean is away from the grand mean.*1738

*Remember, in repeated measures we are repeatedly measuring each subject or case; we are*1748

*measuring them multiple times, so if I am a Facebook user I will be contributing four different scores to this problem.*1754

*Now, what you could do is get a little mean just for me, right?*1763

*The little mean of my four scores, and that is my subject mean.*1770

*So each subject has their own little mean, and we want to find the distance of those little means away from the grand mean.*1774

*So let us think: how many subject means do we have?*1783

*We have n of them, the number of subjects; each subject contributes one mean, however many measures each of them has.*1787

*So that is our sample size.*1800

*So what is the degrees of freedom for within subjects?*1803

*Well, that is going to be n − 1.*1806

*So what is the sum of squares for each subject?*1808

*Well, one of the things you have to do is figure out a way to talk about the subject-level mean.*1814

*So here I am just going to say mean and put an index on it for now, and*1820

*this sigma will tell us what i is: i will go from 1 up to n, the sample size.*1835

*These are really the subject means, and I want to get the squared distance from each*1843

*subject mean to the grand mean; and we should also take into*1853

*account how many times each subject is being measured, and that is going to be K number of times.*1860

*However many samples are taken, however many measures are taken; in repeated measures, it is how many times the measure is repeated.*1867

*And the more times a subject participates, the more this variation will count.*1878
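A sketch of the subject-level piece, again on hypothetical data: each subject's "little mean," its squared distance from the grand mean, weighted by k because each subject is measured k times:

```python
from statistics import mean

data = [[10, 12, 14, 12],   # hypothetical: 3 subjects x 4 measures each
        [ 2,  5,  6,  3],
        [ 6,  7, 10,  9]]
n, k = len(data), len(data[0])
grand = mean(x for row in data for x in row)

# One little mean per subject (per row), then SS subject weighted by k.
subj_means = [mean(row) for row in data]
ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
df_subj = n - 1
```

For these numbers the subject means are 12, 4, and 8 around a grand mean of 8, giving SS subject = 128 on n − 1 = 2 degrees of freedom.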

*So there we have subject-level variation; we are really only finding it so that we can find SS*1889

*residual, so we had better do it; and we also have to find the sum of squares total and degrees of freedom total.*1899

*These are things we have gone over, but just to drive it home, remember that the reason we*1908

*want to find this is just so we can find the sum of squares residual.*1913

*So conceptually this is just the total variation of all the data points away from the grand mean.*1916

*What is the total number of data points?*1922

*That is going to be N total.*1925

*So every single data point counted up; and the way we find that is sample size n times the*1928

*number of samples we have, so if we had 30 people participating in four different measures, it is*1937

*30 times 4; and the number of samples is called K, so N total is n times K.*1944

*So what is the degrees of freedom total?*1954

*Well, it is going to be N total minus 1, or the same exact numerical value, nK − 1, either way.*1957

*And so what is the sum of squares total?*1967

*Well, we have already been through it; this is what we always start off with, at least conceptually:*1970

*for every single data point, and notice that there are no bars on it, no means, literally every single data point,*1975

*the distance from the grand mean, squared; and we could put nK here just to say go and do this*1984

*for every single data point, do not leave one behind.*1994
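The degrees of freedom do the same bookkeeping as the sums of squares. A quick sketch using the lecture's 30-people, 4-measures illustration:

```python
# With n subjects each measured k times, the df partition adds up exactly.
n, k = 30, 4                 # e.g. 30 people, 4 photo types
df_total = n * k - 1         # N total - 1 = 119
df_bet = k - 1               # 3
df_subj = n - 1              # 29
df_resid = df_bet * df_subj  # 87

assert df_total == df_bet + df_subj + df_resid
```

This identity is a handy sanity check when filling out the ANOVA chart by hand.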

*So we have all of our different components; now let us put them together in this chart so that you will know how they fit together.*1997

*Remember, the idea of the F is the variation we are interested in over the variation we cannot*2009

*explain, cannot account for, do not know where it comes from; it is a mystery.*2027

*The formula for this is going to be F equals, and remember this is for repeated measures, the*2033

*between-samples variability over the residual variability.*2043

*And in order to find that, we are going to need the between-samples variability.*2051

*The idea is always going to be the sample means' difference from the grand mean.*2057

*So basically the centers of each sample, their distance away from the grand mean; and so the*2072

*formula for that is going to be s squared between equals SS between over df between, and*2084

*we can find each of those component parts: SS between is going to be the sum, that sigma, the*2094

*sum of all of my x-bars minus the grand mean, that distance, squared; and when I say all, I mean one at a time.*2103

*Each as individuals, one at a time; and this distance should count more if you have more people*2118

*or more data points in your sample; and i does not go from 1 to n, it goes from 1 to K; I am*2125

*going to do this for each sample, and K is my number of samples or number of groups.*2134

*So my degrees of freedom between is really going to be K − 1, the number of groups minus 1.*2139

*Okay, so now let us try to get the residual variability.*2148

*Now, residual variability is that leftover within-groups, within-sample variability; and in order to*2155

*get the leftover, the formula for this is going to be the variability residual.*2169

*To get that you take the residual sum of squares and divide by the residual degrees of*2181

*freedom; and the residual sum of squares is literally going to be the leftover:*2190

*SS total minus (SS subject plus SS between).*2197

*And my degrees of freedom residual is going to be a conglomeration of other degrees of*2209

*freedom: the degrees of freedom subject times the degrees of freedom between, okay.*2222

*So we know that in order to find these, let us start with the total variability; we know this one*2230

*pretty well: all the data points in all of our samples away from the grand mean; and we actually do*2242

*not need the variance here, and we do not need this variance either.*2269

*What we really need is the sum of squares total, and that is going to be, for each data point, no x-*2274

*bar or anything, get the squared distance from the grand mean.*2283

*Now that we have that, we do not really need this, but we can find it anyway: the degrees of*2288

*freedom total is going to be nK − 1, the total number of data points minus 1.*2299

*Now let us talk about within-subject variability; this is the spread of each case away from the grand*2308

*mean, and when you talk about each case, each case can be represented by a point*2324

*estimate, its own mean; so each case's mean, that is how I want you to think of it.*2331

*Each case is represented by its own little mean, and that is why we are using means to calculate the distance.*2337

*So that SS subject is going to be the distance of each subject-level mean away from the grand*2345

*mean, squared; and in order to say subject level, you have to put that n here so that it knows to do*2360

*this for each subject, not for each data point; if we put a K there, it would*2368

*be do this for each group; and we want it to count more if they participate in more measures, if the measures are repeated over and over again.*2386

*So we want to put in the number K, and that gives us our sum of squares for each*2387

*subject; and once we have those two, plus the sum of squares between, we can find this, and*2394

*then we also need the degrees of freedom for within subjects, just because we need that to find the degrees of freedom residual.*2400

*For this guy we do all this jumping through hoops.*2411

*So the degrees of freedom for subject-level variance is going to be n − 1, the number of subjects minus 1.*2414
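Putting the whole chart together, here is a sketch of the repeated measures F as one function; the function name and the data are illustrative assumptions, not the lecture's:

```python
from statistics import mean

def repeated_measures_f(data):
    """Repeated measures F for data[subject][measure]; a teaching sketch."""
    n, k = len(data), len(data[0])
    grand = mean(x for row in data for x in row)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_bet = n * sum((mean(r[j] for r in data) - grand) ** 2 for j in range(k))
    ss_subj = k * sum((mean(row) - grand) ** 2 for row in data)
    ss_resid = ss_total - ss_bet - ss_subj          # the leftover
    df_bet, df_resid = k - 1, (k - 1) * (n - 1)
    return (ss_bet / df_bet) / (ss_resid / df_resid), df_bet, df_resid

data = [[10, 12, 14, 12],   # hypothetical: 3 subjects x 4 photo types
        [ 2,  5,  6,  3],
        [ 6,  7, 10,  9]]
f, df1, df2 = repeated_measures_f(data)
```

For these made-up numbers, F works out to 12 on (3, 6) degrees of freedom; the subject variability (128 of the 156 total) never enters the ratio, which is the whole point of the repeated measures design.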

*Okay so here is example 1 which is more prevalent uploaded mobile profile photos and so these *2423

*are all different kinds of photos but one person or one Facebook user presumably 1 person, they *2434

*are sort of the linking factor of all four of those measures.*2441

*So what is the null hypothesis? *2448

*Well it is that all of these groups really come from the same population.*2451

*I use this subscript notation for the different types of photos, so I will call them 1, 2, 3, and 4.*2457

*It also makes it easier to write my alternative hypothesis: the μ's are not all equal.*2472

*For the significance level we can just set alpha equals .05 by convention. Because we *2494

*are going to be using an F test, we do not have to decide between one-tailed and two-tailed.*2513

*The F test is always one-tailed: the cutoff is on one side, and since the F distribution is skewed to the right, it is always going to be on *2517

*the positive side. So let us draw our decision stage with the F distribution.*2528

*We know that it starts at zero and alpha equals .05; what is the critical F here?*2535

*Well remember, in order to find F we need to know the denominator's df as well as the numerator's df.*2548

*And here we know that the df in the numerator is going to be the degrees of freedom between groups, and that is K - 1.*2557

*There are 4 groups, so it is going to be 3, and the degrees of freedom residual is going to be degrees of freedom between times degrees of freedom subject.*2570

*So we are going to need to find degrees of freedom subject, and degrees of freedom subject is going to be N - 1.*2587

*Now let us look at our data set in order to figure out how many we have in our sample.*2597

*So I have made it nice and pretty here: first tagged photos, then mobile uploads, uploaded photos, and *2602

*profile photos. As you look at this row, it has all of the data from one subject, so this *2609

*person has zero photos of any kind, whereas let us look at this person.*2619

*This person has zero mobile uploads and zero profile photos, but they have 79 uploaded photos *2625

*and 37 tagged photos. So for each subject we can see that there is some variation, but *2631

*across the different samples we also see some variation.*2639

*Down here I put step one: the μ's are all equal versus not all equal, and alpha equals .05. Here is the *2643

*decision stage: our K is 4 — 4 groups, 4 samples — so our degrees of freedom between is 4 - 1. We have already *2655

*done that, but we might as well fill it in. The degrees of freedom for subject is there *2663

*so that we can find the degrees of freedom residual, and once we have that, we can find our critical F.*2670

*For the degrees of freedom for subject, we should count how many subjects we actually have *2692

*here — we can just count the rows. I just picked the profile photos column: we actually have 29 *2707

*cases, so our degrees of freedom for subject is 28.*2719

*Now the degrees of freedom residual is those two degrees of freedom multiplied by each other, so 3 × 28, and *2724

*that is going to be 84, and that is our denominator degrees of freedom.*2735

*So now we can find our critical F.*2740

*In order to do that we use FINV: the probability is .05, our first degrees of freedom is the *2742

*numerator one, and our second degrees of freedom is the denominator, and our critical F is 2.71.*2750
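The lecture uses Excel's FINV here; the same critical value can be checked in Python — a minimal sketch, assuming scipy is available.

```python
from scipy.stats import f

alpha = 0.05
df_num, df_den = 3, 84  # df between and df residual from the example

# Upper-tail cutoff of the F distribution, equivalent to Excel's FINV(0.05, 3, 84)
f_crit = f.ppf(1 - alpha, df_num, df_den)
print(round(f_crit, 2))  # ≈ 2.71, matching the lecture
```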

*So once we have that, we can go on and calculate our sample statistic.*2767

*We will have to find the sample statistic; before, I stated it generically because you *2777

*might have to find a t statistic or some other statistic, but in this case we know, because we have an *2787

*omnibus hypothesis, that we need the F statistic, and we have to find the p value afterwards. So let us find the F statistic.*2794

*Go to your example again, this is example 1, let us put in all the different things you need.*2803

*You need the variance between over the residual variance, so let us start off *2814

*with variance between. It is something we already know: it can be split up into sum of *2822

*squares between and degrees of freedom between.*2827

*We actually have degrees of freedom between already so let us just fill that in, in order to find *2830

*the sum of squares between, you have to find the means for each of these groups.*2834

*We are also going to need to find out what is the N.*2841

*That is actually quite simple because we know that it is 29 for each of these groups so that makes life a little bit simpler.*2845

*Now let us find the averages for each of these samples. For the first sample — I believe this is tagged photos — the mean is 9.93; *2861

*I believe this is mobile uploads, at 12.45; uploaded photos averages 68; and finally profile photos averages 1.5.*2874

*Okay so now we going to have to calculate the grand mean.*2905

*The grand mean is quite easy to do in Excel because you just take all your data points — every single one — and calculate the average.*2909

*The average is 23.*2919

*I am just going to copy and paste that here; what I did was put a pointer here so that it *2921

*just points to that top value, because the grand mean shouldn't change — the grand mean is always the same.*2929

*Now that we have all of these values, we can find N times (X bar minus the grand mean) squared.*2935

*We can find that for each group, and then when we add them up, we end up getting our sum of *2948

*squares between, and we get this giant number, 82,700.*2968

*I am just going to put a pointer to that cell, and then I am going to find my variance between.*2973

*My variance between is still quite large, about 27,600.*2986
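The SS between computation can be sketched in Python. Note that the means below are the rounded values read off in the lecture, so the sum comes out near 80,300 rather than the 82,700 the lecture gets from the unrounded raw data.

```python
# SS_between = sum over groups of n * (group mean - grand mean)^2
n = 29                                  # subjects per group (same for all four)
group_means = [9.93, 12.45, 68, 1.5]    # tagged, mobile, uploaded, profile (rounded)
grand_mean = 23                         # rounded grand mean from the lecture

ss_between = sum(n * (m - grand_mean) ** 2 for m in group_means)
var_between = ss_between / (len(group_means) - 1)  # divide by df_between = 3
print(round(ss_between))  # ≈ 80312 with these rounded inputs
```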

*Okay, now that we have that, we need to find the residual variance.*2993

*In order to find residual variance, I know I am going to need to find all this other stuff that I did not necessarily plan on.*3000

*So one of the things I do need to find is my SS total as well as my SS subject.*3007

*I am going to start with SS total because, although the idea is simple, in Excel it looks a little crazy *3014

*just because it takes up a lot of space — we are going to need to find the squared distance *3023

*away from the grand mean for every single data point.*3028

*So here, all my data points are here.*3033

*Now I am going to need to find the square distance of this guy away from the grand mean, and then add them all up.*3040

*What is helpful in Excel is to create separate rows and then add them up, so I am just *3055

*going to save these for later. I have already put in the formulas here; this one is *3067

*for the tagged photos — it is my partial way to find SS total, just for the tagged photos — and I am *3075

*going to do it for the mobile photos, for the uploaded photos, then for the profile photos, and then add them all together.*3085

*So these are sort of subtotals.*3091

*So what I need to find is that data points minus the grand mean and I will just use this grand mean that I found down here.*3094

*But what I need to do is I need to lock that down I need to say always use this grand mean do not use any other one.*3105

*I put that in parentheses so that I can square it.*3113

*So here I am going to do that all the way down for tagged photos and just take this across for mobile, *3118

*uploaded, and profile photos, and that is the nice thing about Excel: it will give you all of these values very, very easily.*3133

*I am just going to zoom in on this for a second, to show you what each of these is talking about.*3144

*So click on this one: this cell gives me this value minus my grand mean (which is locked down), *3152

*squared. So I now have every single data point's squared distance away from the grand mean.*3160

*Now I need to add them all up.*3171

*So put sum, and I am not just going to add up this column I am literally going to add all this up.*3174

*So our total sum of squares is 257,000.*3184

*So I am going to go down to my sum of squares total and just put a pointer here and say that is it.*3192

*So how do I find my sum of squares for the subject level variation? *3201

*Well, for this I know I need to find the mean for every subject, then find the distance *3212

*between that mean and the grand mean, square it, and multiply by how many groups I have.*3219

*The nice thing is that the number of groups is constant — it is always four for everybody — so let us go ahead and find the subject-level means.*3226

*The subject means are found by averaging one person's measures across all 4 samples, *3235

*so that guy's average is zero. Just copy and paste that down; if you want to check, this one takes the average of these four measures.*3248

*So this is subject-level variation, and it shows you that this guy has far fewer photos, period, than this guy.*3259

*He just has, on average, a lot more photos than this guy.*3268

*And this guy is sort of in the middle of those two.*3273

*Once we have these subject-level means, we can find this idea: K times the squared difference *3276

*for each subject. I know my K is going to be 4, times my subject-level mean minus the *3286

*grand mean — I will just use my already calculated grand mean down here, and I need to lock *3302

*that grand mean down because it is never going to change — squared.*3309

*Once I have that then I probably want to add them all up in order to get my sum of squares for within subject variation.*3316

*I will just put this little sum sign so that I know that this is not just another data point — it is a *3335

*totally different thing, a sum. Once I have that, it is 56,600, and I know my sum of squares within subject.*3345

*Once I know all those things, I can finally calculate sum of squares residual, because I have my ingredients.*3360

*I take my sum of squares total minus (the sum of squares per subject plus the sum of squares *3369

*between); I could obviously distribute out that negative sign, but I will just use the parentheses.*3380

*So here is my leftover sum of squares — whatever is left over, unaccounted for — and I already *3390

*figured out my df residual, so here I am going to put my sum of squares residual divided by *3399

*degrees of freedom residual, and there I get about 1,400.*3410

*So now we can finally, finally calculate our F by taking the variance between and dividing it by the residual variance.*3416

*There I get 19.69, which is quite a bit above the critical F of 2.71.*3427

*Now once I have that, I can find my p value.*3435

*To find my p value, I use FDIST: I put in my F value, my numerator degrees of freedom, as *3441

*well as my denominator degrees of freedom, and I get 9.3 × 10 to the negative 10th power. That *3456

*means there are a lot of decimal places before you get to that 9, so it is a very, very, very small p value.*3468
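The whole Example 1 calculation can be assembled in a short Python sketch, using the rounded sums of squares from the worksheet (so the final F differs slightly from the lecture's 19.69).

```python
from scipy.stats import f

# Rounded sums of squares read off the Example 1 worksheet
ss_total, ss_subject, ss_between = 257_000, 56_600, 82_700
df_between, df_residual = 3, 84

ss_residual = ss_total - (ss_subject + ss_between)  # leftover variability
var_between = ss_between / df_between
var_residual = ss_residual / df_residual            # ≈ 1,400
F = var_between / var_residual                      # ≈ 19.7
p = f.sf(F, df_between, df_residual)                # upper-tail p, like FDIST

print(round(F, 2), p < 0.001)
```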

*So what do we do? *3475

*We reject the null.*3478

*Also remember that in an F test, all we do is reject the omnibus null hypothesis; that does not *3480

*mean we know which groups are actually different from each other. So when you do reject the *3490

*null after doing an F test, you want to follow up and do post hoc tests.*3495

*There are lots of different post hoc tests you might learn — Tukey post hoc tests or Bonferroni corrections — *3500

*and those all help us make the pairwise comparisons, to figure out which means are actually *3507

*different from which other means. You probably also want to find effect size, and in an F test the *3515

*effect size is not d or g; instead it is eta squared (η²).*3524
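Eta squared is just the between-groups share of the total variability; a minimal sketch using the rounded sums of squares from this example:

```python
# Eta squared: proportion of total variability explained by the treatment
ss_between, ss_total = 82_700, 257_000
eta_squared = ss_between / ss_total
print(round(eta_squared, 2))  # ≈ 0.32
```

(Some textbooks instead report partial eta squared for repeated measures, SS between divided by SS between plus SS residual; the lecture does not specify which variant it intends.)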

*So we reject the null.*3527

*Example 2: a weight-loss boot camp is trying out three different exercise programs to help their clients shed some extra pounds.*3537

*All participants are assigned to teams of 4 people, and each week the entire team is weighed *3546

*together to see how many pounds they were able to take off.*3552

*The data shows their weekly weight loss as a team.*3554

*Were the exercise programs all equally effective in helping them lose weight? Note that all teams *3558

*tried all three exercise regimes, but they received the treatments in random order.*3564

*So this is definitely a case where we have three different treatments.*3569

*Treatment 1, 2, and 3, and we have data points which are going to be pounds lost — *3574

*how many pounds they were able to take off per week, pounds lost per week. But these are not independent samples.*3581

*They are actually linked to each other.*3590

*What's the link? *3592

*It is the team of four that lost that weight: this team lost this much under this exercise *3594

*regime, this much under this one, and this much under this one.*3602

*Now each team got these three exercise regimes in a different order.*3608

*Some teams went 3, 2, 1, so they have all been balanced in that way. If you pull up your examples and go to example 2, you will see this data set.*3612

*So here are the different teams or squads.*3627

*Here are the three different types of exercise program, in the different orders that they *3630

*did these exercises, and each exercise was done for a week.*3635

*So let us think about this.*3642

*To begin with we need a hypothesis, so step one is the null hypothesis: all are equal.*3644

*So all the μ's — exercise 1, exercise 2, exercise 3 — are all equal.*3660

*The alternative hypothesis is that they are not all equal.*3667

*Step 2 is our significance level: we can just set alpha equal to .05. Once again, because it is an *3674

*omnibus hypothesis, we know we are going to do an F test, so it does not need to be two-tailed.*3686

*Step three is the decision stage: you can imagine that F distribution and color in that part — *3692

*what is the critical F?*3703

*Well, in order to find the critical F, we are going to need to find the DF between as well as the DF *3706

*residual because that is the numerator and the denominator degrees of freedom.*3715

*In order to find df residual we also need to find df subject, and remember, here subject does not *3722

*mean each individual person — subject really means case.*3730

*And each case here is a squad.*3733

*So count how many squads there are, minus 1.*3736

*So there are 11 degrees of freedom for subject.*3746

*For degrees of freedom between, what we are going to need is the number of different samples, which is three, minus 1 — so 3 - 1 is 2.*3760

*And so my df residual is the df between times the df subject, and that is 22. So let us find the critical F.*3774

*We need FINV: the probability that we need is .05, the degrees of freedom for the *3784

*numerator is 2, the degrees of freedom for the denominator is 22, and our critical F is 3.44.*3792
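As in Example 1, the Excel FINV lookup can be cross-checked with scipy (an assumption; any F-distribution table gives the same value):

```python
from scipy.stats import f

df_between = 3 - 1            # 3 exercise regimes minus 1
df_subject = 12 - 1           # 12 squads minus 1
df_residual = df_between * df_subject  # 2 * 11 = 22

f_crit = f.ppf(1 - 0.05, df_between, df_residual)
print(round(f_crit, 2))  # ≈ 3.44, matching the lecture
```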

*Step 4, here we are going to need the F statistic and in order to find F, we need the variance between divided by the variance residual.*3802

*In order to find variance between we are going to need the SS between divided by DF between, *3823

*we already have DF between thankfully, so we do need SS between.*3833

*The concept of SS between is the whole idea of each sample's X bar, its distance away from *3838

*the grand mean, squared; and then, depending on how many data points you had *3849

*in your sample, it gets weighted more or less.*3856

*Now the nice thing is all of these have the same number of subjects.*3860

*But let us go ahead and try to do this.*3864

*So first we need the different samples — exercise 1, exercise 2, exercise 3 — and we need their N; *3869

*their N is going to be 12: there are 12 data points in each sample.*3879

*We also need each exercise regime's average weight loss, so we need X bar, and we also *3888

*need the grand mean, because ultimately we are going to find N times (X bar minus the grand *3901

*mean) squared, and add all of those up.*3908

*So let us find X bars for exercise regime number 1.*3912

*Excel makes it nice and easy for us to find all those averages very quickly, and *3918

*then once we have that, we can find the grand mean.*3931

*The grand mean is also very easy to find here.*3938

*We just want to select all the data points.*3941

*I think I selected one of them twice, be careful about that.*3944

*So make sure everybody is selected just one time so this is the average weight loss per week *3950

*regardless of which team you were on regardless of which exercise you did.*3960

*And now let us find N times the X bar minus the grand mean squared and let us do that for each for each exercise regime.*3965

*Once we have that done we could find the sum, and the sum is 23.63.*3983

*So here in SS between I would put that number there.*3997

*So once we have that now we could actually find this because we already have calculated the DF between, was not too hard.*4006

*Now we have to work on variance residual, now in order to find variance residual, let me just add *4018

*in a couple of rows here just to give me a little more space, variance residual, in order to find *4031

*variance residual I am going to need to find SS residual divided by DF residual.*4049

*We already have df residual, so we just need to find SS residual; in order to find that, I need SS *4054

*total minus (SS between plus SS subject).*4062

*So I already have my SS between so I need to find SS total and SS for each subject.*4071

*So for SS total: for every single exercise regime — for every single one of these data *4080

*points — I need to find its distance away from the grand mean, square it, and add them all up, and that is going to be my SS total.*4092

*So for E1 here is my subtotal for SS total, for E2, my subtotal for SS total, for E3, my subtotal for SS total.*4104

*So that is X minus the grand mean — lock that grand mean down — squared, and make sure you do *4120

*that for every single data point in E1. If I check that last data point and just *4141

*copy and paste it all the way through — let us just check this one — it is taking this *4151

*value, subtracting the grand mean from it, and then squaring that distance.*4157

*So once I have this, I could sum them all up and get my SS total, my total sum of squared distances.*4164

*So I am just going to put a pointer here so that I do not have to rewrite the number.*4180

*Once I have that, all I have left to find is the SS subject.*4187

*Now remember, for SS subject, each subject has its own little mean, because the measure is made repeatedly. *4190

*So we have to find each subject's mean, then get the distance *4195

*between that mean and the grand mean, square it, and multiply by the number of measures, K.*4201

*So let us do that here. First we need to find the subjects' X bars; that is going to be each squad's *4211

*average weight loss. Some squads probably lost more weight than others, so this is the average *4226

*weight loss for each squad, and it looks like this squad lost a bit — a little bit of variation *4242

*in subjects' success. Now we are going to look at K times (the subject's X bar minus the grand *4259

*mean) squared. We already know K: K is going to be 3, times the subject's X bar minus the grand *4272

*mean — I am just going to use the one we have already calculated down here, and of course lock that down — squared. Copy and paste this.*4284

*So copy and paste that all the way down and I could find the summary here and this is going to be my sum of squares for subject.*4298

*That is the sum of a bunch of squares.*4312

*So that is 34-something.*4318

*I am just going to put a pointer there so I do not have to retype that but I could just see it nice and clearly right here.*4321

*So now I have everything I need in order to find SS residual: SS total minus (sum of squares between plus sum of squares subject).*4332

*Once I have that, I can find my residual variance: SS residual divided by degrees of freedom residual. Okay, *4344

*so here it looks like my residual variance is much smaller than my between-samples variance, so *4356

*I can predict my F value will be pretty big: 11-point-something divided by 2-point-*4372

*something, and that gives me 5.219, which is a little bit bigger than my critical F.*4381

*So if I find my p value with FDIST — putting in my F, my numerator degrees of freedom, and my *4391

*denominator degrees of freedom — I find about .01. That is smaller than *4403

*.05, so I am going to be rejecting my null.*4414
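The Example 2 decision can be double-checked in Python, using the F value computed on the worksheet:

```python
from scipy.stats import f

F = 5.219                       # variance between / residual variance from the worksheet
df_between, df_residual = 2, 22

p = f.sf(F, df_between, df_residual)   # upper-tail probability, like Excel's FDIST
print(round(p, 3), p < 0.05)           # small p, so reject the null
```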

*So step five down here, reject the null.*4418

*And we know that once you reject the null, you are going to need to also do post hoc tests, as well as find eta squared.*4425

*So that brings us to example 3: what is the problem with a bunch of tiny t-tests? *4432

*Well, with so many t-tests, the probability of Type I error increases. Increasing the cutoff? Actually, *4445

*we are not increasing the cutoff — we are keeping it at .05 — but the Type I error rate increases because *4458

*we have the possibility of rejecting the null multiple times.*4466

*"With so many t-tests the probability of Type I error increases" — here it is because we may be rejecting more null hypotheses.*4470

*This is actually a correct answer, but we might not be done yet.*4478
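The inflation of Type I error can be made concrete with a quick back-of-the-envelope calculation. With k groups there are k(k-1)/2 pairwise t-tests; treating them as independent (a simplifying assumption for illustration), the chance of at least one false rejection grows well past .05.

```python
# Familywise Type I error when running every pairwise t-test at alpha = .05
alpha = 0.05
k = 4                          # e.g. the four photo types from Example 1
n_tests = k * (k - 1) // 2     # 6 pairwise comparisons

# P(at least one false rejection) if all nulls are true and tests are independent
familywise = 1 - (1 - alpha) ** n_tests
print(n_tests, round(familywise, 3))  # 6 tests → ≈ 0.265, far above .05
```

This is exactly why the omnibus F test, followed by corrected post hoc comparisons, is preferred over a bunch of tiny t-tests.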

*"With so many paired-samples t-tests we have a better estimate of S because we have been estimating S several times" — no.*4484

*With so many paired-samples t-tests we have a poorer estimate of S, because we are not using all of *4491

*the data to estimate one S; in fact, we are just using subsets of the data to estimate S several times. That is a good answer.*4500

*So that is it for repeated measures ANOVA, thanks for using educator.com.*4508
