Order Statistics

  • Intro 0:00
  • Premise 0:11
    • Example Question: How Tall Will the Tallest Student in My Next Semester's Probability Class Be?
    • Setting
    • Definition 1
    • Definition 2
    • Question: What are the Distributions & Densities?
  • Formulas 4:47
    • Distribution of Max
    • Density of Max
    • Distribution of Min
    • Density of Min
  • Example I: Distribution & Density Functions 8:29
    • Example I: Distribution
    • Example I: Density
    • Example I: Summary
  • Example II: Distribution & Density Functions 14:25
    • Example II: Distribution
    • Example II: Density
    • Example II: Summary
  • Example III: Mean & Variance 20:32
    • Example III: Mean
    • Example III: Variance
    • Example III: Summary
  • Example IV: Distribution & Density Functions 35:43
    • Example IV: Distribution
    • Example IV: Density
    • Example IV: Summary
  • Example V: Find the Expected Time Until the Team's First Injury 51:14
    • Example V: Solution
    • Example V: Summary

Transcription: Order Statistics

Hi, welcome back to the probability lectures. My name is Will Murray.0000

Today, we are going to talk about order statistics.0005

I have to tell you what order statistics means, let us jump into that.0009

I’m going to start out with an example question.0013

I teach at a university and I'm wondering next semester, I know that I’m going to be teaching probability.0015

I look at my class and I see that 30 people are enrolled next semester, and I have never met any of them.0022

I just got 30 random names and never met any of them.0029

I wonder, how tall will the tallest student in that class be?0032

That is the kind of question that we are going to be answering using order statistics.0038

I might wonder how tall the tallest student in my class will be, or how tall the shortest student in my class will be.0047

Let me try to connect that with some variables.0053

The idea here, we have N independent random variables with identical distributions.0058

Their distribution function, we are going to call F of Y.0065

Their density, we are going to call f of Y.0068

Of course, that is always the derivative of the distribution.0071

It is always the density, f is always F prime of Y.0074

To connect that back up with the example, imagine I'm looking at my class for next semester, 0080

I’m saying I have 30 students in my class next semester.0086

Each student represents a random variable and that random variable represents how tall that student is.0090

I’m going to have 30 students in my class, that means I have Y1 through Y30, 30 different heights.0097

I’m wondering, how tall will the tallest student in the class be and how tall will the shortest student in the class be?0103

In order to study that, I'm going to define Y sub 1 in parentheses, $Y_{(1)}$, to be the smallest or the minimum of Y1 through YN.0111

Y sub N in parentheses, $Y_{(N)}$, is the largest or the maximum of Y1 through YN.0122

This notation is not very good, I’m following one of the standard textbooks in the field0127

but this notation can be quite confusing for students, because there are two different Y1’s here.0133

Let me quickly identify the difference here.0139

This Y1 with no parentheses, that is just the first random variable.0143

It is the first student that walks in my door, I’m going to call that person Y1.0148

Y1 with parentheses, $Y_{(1)}$, means that I look at all the random variables and I select out the smallest one.0153

I call that one $Y_{(1)}$, that is the smallest one.0165

It is like, if I'm talking about my students in my class, Y1 with no parentheses0170

is just the first student who walks in the door on the first day of the semester.0177

Y sub 1 with parentheses means I wait till all my students come into the class.0181

I ask them to all stand up, I look around, I find the shortest student in the class and I say you are $Y_{(1)}$.0187

Y sub N without parentheses, that is the last random variable that you look at.0194

That is the last student who walks in the door of my classroom on the first day.0202

Y sub N in parentheses is the largest of all of them.0208

You look at all the variables and you pick whichever one is biggest.0214

In the classroom example, that means that I wait for all of my students to file in.0220

I ask them all to stand up, I see who is the tallest in the class and then I label that person as $Y_{(N)}$.0226

It is the largest of all the variables.0234

Make sure you do not get those mixed up.0238

The Y sub 1 without parenthesis and Y sub 1 with parentheses,0242

the Y sub N without parenthesis and Y sub N with parentheses.0246

The question that we are going to try to solve today is, 0249

what are the distributions and densities of the minimum and maximum, $Y_{(1)}$ and $Y_{(N)}$?0252

Remember, we know the distributions and densities of the individual variables, F and f.0260

F and f are known. In terms of homework problems, they should be given to you.0266

You should know the F and f already.0271

We are going to try to find the distributions and densities of $Y_{(1)}$ and $Y_{(N)}$.0277

Let me give you some formulas for this.0287

These formulas actually take a bit of work to derive.0288

I’m not showing you the whole derivation here but the distribution,0291

it turns out that the simpler ones are for the maximum values.0295

These first ones that I'm going to teach you are for the maximum values.0299

And then, we will do the minimum one later because they are more complicated.0305

I’m not showing you all the derivation but the distribution function here, 0309

the distribution is the probability that the maximum is less than or equal to some cutoff value y.0315

What is the probability, maybe, that your tallest student will still be less than 7 feet tall?0322

The way you get it is, you take F of Y to the Nth power: $F_{Y_{(N)}}(y) = F(y)^N$.0327

F of Y is the distribution of the individual Yi’s.0331

Little f of Y is the density function of the Y sub i.0344

F of Y and f of Y should be given to us.0354

And then, we just drop them into these formulas. f of Y is the derivative of F of Y.0360

We can either remember the formula for it or we can work it out from the function for F.0369

If you take the derivative of this, then you get $N \cdot F(y)^{N-1}$ times its derivative; that is the chain rule kicking in there.0378

The derivative of F is f.0390

That is why we get that f that you have to tack on the end there.0392

We are using the power rule and the chain rule there.0399

For the minimum, it is a little more complicated.0402

These formulas correspond to the minimum value, the smallest or the shortest student in my class. 0405

Right here, I forgot to mention that this is the density of the max, $f_{Y_{(N)}}(y) = N \cdot F(y)^{N-1} f(y)$.0417

Here we are going to find the distribution of the minimum.0429

The way you find it, I’m skipping the derivation here, but it is the probability that the minimum will be less than or equal to y.0439

What is the probability that I will have a student below 5 feet tall in my class?0446

What is the chance of having a student less than 5 feet tall?0450

The way you do it is you use the distribution from your original Y and you drop it into this formula: $F_{Y_{(1)}}(y) = 1 - (1 - F(y))^N$.0453

And then, you take the derivative of that to get the density of the minimum.0461

If we take the derivative of this, this initial 1 drops out because it is a constant.0472

And then, we have by the power rule $N (1 - F(y))^{N-1}$, with a negative sign in front.0479

And then, we have the derivative of the inside stuff, which is $-f(y)$, and the two negative signs cancel.0486

That is the formula that we are going to use to find the density function for the minimum variable: $f_{Y_{(1)}}(y) = N (1 - F(y))^{N-1} f(y)$.0494
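
The lecture skips the derivation, but a short sketch shows where the distribution formulas come from. It uses only the independence and identical distribution of Y1 through YN assumed above; the subscripts in parentheses follow the lecture's notation for the minimum and maximum.

```latex
\begin{align*}
F_{Y_{(N)}}(y) &= P(Y_{(N)} \le y) = P(Y_1 \le y, \dots, Y_N \le y)
               = \prod_{i=1}^{N} P(Y_i \le y) = F(y)^N, \\
f_{Y_{(N)}}(y) &= \frac{d}{dy}\, F(y)^N = N\, F(y)^{N-1} f(y), \\
F_{Y_{(1)}}(y) &= 1 - P(Y_{(1)} > y) = 1 - P(Y_1 > y, \dots, Y_N > y)
               = 1 - \bigl(1 - F(y)\bigr)^N, \\
f_{Y_{(1)}}(y) &= \frac{d}{dy}\Bigl[\, 1 - \bigl(1 - F(y)\bigr)^N \Bigr]
               = N \bigl(1 - F(y)\bigr)^{N-1} f(y).
\end{align*}
```

The maximum is at most y exactly when every variable is at most y, and the minimum is greater than y exactly when every variable is greater than y; independence turns each of those joint events into a product.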

That is a lot of formulas, I think it is good if we jump right into some examples and0500

we practice those formulas and see how they work out.0506

Let us start with example 1, we have 24 students in a class.0511

Each one is writing a term paper and I guess the teacher said that paper can be any length0514

from 0 to 7 pages, which is a little bit artificial.0519

Usually, a teacher will say it has got to be 5 to 7 pages, but we are going to keep it simple.0522

We are going to go 0 to 7 pages long.0527

We want to find the density and the distribution functions for the length of the longest paper.0530

Maybe, you are the teacher and you are wondering how much you are going to have to grade.0536

You are wondering, what is the longest paper that I’m going to read here?0540

Let me start out by identifying the fact that we have a uniform distribution here.0545

I need to find the density and distribution functions for the uniform distribution.0551

We had a whole lecture on the uniform distribution.0559

I will remind you what formulas we got for that.0565

The density function for a uniform distribution on the interval from θ1 to θ2 is just $1/(\theta_2 - \theta_1)$.0569

In this case, it is $1/(7 - 0)$, just 1/7.0588

The distribution function is F of Y, you integrate that from 0 to Y.0593

The integral of that is just Y/7.0601

That is, as Y goes from 0 to 7.0604

I now have an f and an F.0610

It is easy now to use the formulas from the previous slide to solve the rest of it.0615

We are looking at the length of the longest paper, that is $F_{Y_{(N)}}(y)$, 0621

that is the maximum one, the max length of the paper.0628

I will use the formula from the previous slide, it is $F(y)^N$.0635

In this case, N is 24 and F is Y/7, so we get $(y/7)^{24}$.0641

That is it for the distribution function, that is the distribution.0659

The density function is $f_{Y_{(N)}}(y)$.0668

You can take the derivative of the term above but you can also use the formula.0675

I’m going to use the formula just to practice that.0681

$N \cdot F(y)^{N-1} \cdot f(y)$.0684

N is 24, F of Y is Y/7, N – 1 is 23, and f of Y is 1/7, so we get $24 (y/7)^{23} (1/7)$.0690

It looks like this could simplify, combine the terms a little bit.0703

I have $24 y^{23}$, and now I have $7^{23}$ in the denominator and one more factor of 7.0707

I get $24 y^{23}/7^{24}$, which you could also have gotten by taking the derivative of the distribution function.0717

My range is still the same as before, Y goes from 0 to 7.0726

That is what I have for my density function.0735

Those are the distribution and the density functions for the length of the longest term paper,0746

that will be submitted.0752
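
As a sanity check on Example 1, here is a minimal Python sketch that codes the two answers and compares the distribution function against a simulation. The function names, the seed, and the number of trials are illustrative assumptions, not part of the lecture.

```python
import random


def max_cdf(y, n=24, upper=7.0):
    """Distribution function of the longest of n Uniform(0, upper) lengths: (y/upper)^n."""
    if y <= 0:
        return 0.0
    if y >= upper:
        return 1.0
    return (y / upper) ** n


def max_pdf(y, n=24, upper=7.0):
    """Density of the longest length: n * y^(n-1) / upper^n on [0, upper]."""
    if y < 0 or y > upper:
        return 0.0
    return n * y ** (n - 1) / upper ** n


def simulate_max_cdf(y, n=24, upper=7.0, trials=20000, seed=0):
    """Estimate P(longest paper <= y) by drawing n uniform lengths many times."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        longest = max(rng.uniform(0.0, upper) for _ in range(n))
        if longest <= y:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    y = 6.5
    print("formula:   ", max_cdf(y))
    print("simulation:", simulate_max_cdf(y))
```

The two printed values should agree to about two decimal places; the simulation is only a rough check, not a proof of the formula.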

Let me review the steps there.0754

I first identified that we had a uniform distribution, I looked up my density function0757

and my distribution functions for the uniform distribution.0764

We have a whole lecture on the uniform distribution, earlier in this series.0767

You can scroll back up and see it.0771

I think it was the first continuous distribution that we studied, it is the easiest one.0773

The uniform distribution, the density function is always constant that is why it is uniform.0778

It is 1/7 because our range is from 0 to 7.0784

The distribution function, you integrate that from 0 to Y.0788

If you integrate that from 0 to Y, you get Y/7.0793

I'm using the two formulas for the distribution and density of the max value.0797

That was on the previous slide where I gave you the formulas for order statistics.0804

$F(y)^N$ is $(y/7)^N$, and our N is 24. I forgot to put a nice box around that, because that was my final answer for the distribution.0811

For my f of Y, I just drop in N = 24, my F of Y and my f of Y,0825

simplify that down, and I get the density function for the maximum value there.0834

That is the end of example 1, I am going to reuse the setting for examples 2 and 3.0842

I want to make sure that you understand this,0849

in example 2 we are going to find the density and distribution for the length of the shortest term paper.0852

Make sure that you remember what we came up with here, we will reuse these in examples 2 and 3.0858

In example 2 here, this is a follow-up to example 1, the setting is the same.0867

With 24 students in a class, they are each going to write a term paper.0873

The paper's length can be anywhere from 0 to 7 pages.0877

Maybe, if a student has a really profound thought, the student could express her amazing thought in half of the page.0880

It could be potentially 1/2 page term paper or can run as long as 7 pages.0889

We want to find the distribution and density functions for the length of the shortest paper.0894

We did already start figuring out some useful facts about this problem back in example 1.0899

We figured out that this was a uniform distribution.0905

I went back and looked up the density and distribution functions for the uniform distribution.0908

What I figured out was that my F of Y, my distribution function, was Y/7.0917

My f of Y was just a constant value 1/7.0926

That is going to be very useful. Now I'm going to invoke the formulas for the minimum, because the shortest paper means $Y_{(1)}$.0933

$F_{Y_{(1)}}(y)$, it is a little more complicated than what we had for the max values.0943

The min value, its distribution function is $1 - (1 - F(y))^N$.0952

N is the number of variables that we are looking at.0968

Here we have 24 students, N is 24.0971

I'm going to plug in what I know here.0976

By the way, this formula came from the formula in the introductory slide.0978

I think it was the second slide of this lecture.0984

Just go back and check to see the formulas, that is where this formula is coming from.0986

That is $1 - (1 - y/7)^{24}$, since my F of Y is Y/7 and N is 24 here.0991

If I want to put those over a common denominator, it is not absolutely necessary, but it will be $1 - ((7 - y)/7)^{24}$.1009

I do not think that is going to get any better.1023

That is my distribution function that I just found, for the minimum value.1026

Let me box that up because I'm going to submit that as my answer to the first part of the problem.1035

Let us figure out the density, my $f_{Y_{(1)}}(y)$.1042

I could take the derivative of what I just figured out or I could use the formula from earlier on in the lecture.1049

I think I’m going to practice using the formula.1055

Let me remind you what that was.1057

It was $N (1 - F(y))^{N-1} f(y)$.1059

Let me just plug in everything here: N is 24, 1 - F of Y is 1 – Y/7, raised to the 23rd power, and my f of Y is 1/7.1071

This will simplify a little bit, I can put it over a common denominator.1085

I get $24 ((7 - y)/7)^{23} \cdot (1/7)$.1092

Now, I can write this with 24 in the numerator, and in the denominator I have $7^{24}$.1099

I will have a $(7 - y)^{23}$, so the density is $24 (7 - y)^{23}/7^{24}$.1107

What is my range, I forgot to mention the range on this.1113

The Y is still going to go from 0 to 7, that is because the original distribution was Y going from 0 through 7 .1117

That is the density function, I’m done with that problem.1129
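
To see that the two answers in Example 2 fit together, here is a minimal Python sketch that integrates the density numerically and compares it with the distribution function. The function names and the midpoint-rule helper are illustrative assumptions, not something from the lecture.

```python
def min_cdf(y, n=24, upper=7.0):
    """Distribution function of the shortest of n Uniform(0, upper) lengths."""
    if y <= 0:
        return 0.0
    if y >= upper:
        return 1.0
    return 1.0 - ((upper - y) / upper) ** n


def min_pdf(y, n=24, upper=7.0):
    """Density of the shortest length: n * (upper - y)^(n-1) / upper^n on [0, upper]."""
    if y < 0 or y > upper:
        return 0.0
    return n * (upper - y) ** (n - 1) / upper ** n


def integrate(func, a, b, steps=100000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width


if __name__ == "__main__":
    # The density should integrate to 1 over the whole range from 0 to 7.
    print("total area:", integrate(min_pdf, 0.0, 7.0))
    # Chance that the shortest paper is at most half a page, computed two ways.
    print("from density:     ", integrate(min_pdf, 0.0, 0.5))
    print("from distribution:", min_cdf(0.5))
```

The last two printed values should match closely, since the distribution function is the integral of the density from 0 up to the cutoff.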

Let me review the steps and then I can move on.1141

We have a uniform distribution, we already identified that back in example 1.1144

We identify the distribution function F of Y is Y/7.1149

Our density function was 1/7 and N = 24 because we are talking about 24 students here.1153

And then, I used the two formulas that I gave you back on the second slide of this lecture.1161

You can scroll back, you can see the distribution function in terms of F of Y and the density function for $Y_{(1)}$,1167

because we are talking about the shortest paper, that is why we are looking at $Y_{(1)}$.1178

Those are the two formulas and then I just dropped in what my F of Y is, Y/7, dropped that into the formulas.1183

My f of Y is 1/7, and then I simplified it down to get a distribution function and a density function.1190

These represent the distribution function and the density function for $Y_{(1)}$, which is the shortest paper.1199

If I'm wondering ahead of time, how long will the shortest paper be that this teacher is going to have to grade, 1207

then I would follow these density and distribution functions to calculate those probabilities.1214

We are going to use this term paper example for one more problem here.1221

Make sure you still understand the uniform distribution, as we move on to example 3.1227

In example 3, just following up from example 1.1233

We got 24 students in a class, each writing a term paper.1237

The term papers' lengths are uniformly distributed from 0 to 7 pages.1241

Not a totally realistic example, but maybe some students are kind of lazy and turning in very short term papers.1245

Some students are very industrious and they are writing 7 good pages.1251

We want to find the mean and variance for the length of the longest paper.1255

The longest paper means the maximum value.1259

We are going to be looking at $Y_{(N)}$; with the N in parentheses, that is the maximum value.1262

We are going to use some of the answers that we derived in example 1.1272

Let me remind you of what we figured out in example 1.1277

In example 1, we figured out the density and distribution functions.1284

In particular, the density, that is what we are going to use.1288

Little f sub Y, that is not a Y sub 1, that is $f_{Y_{(N)}}$.1291

I will remind you what we found in example 1, that was $24 y^{23}/7^{24}$.1298

That is coming from example 1, if you have not watched example 1 in the past few minutes,1312

you might want to go back and just review that.1318

Make sure you know where that is coming from, so it is not a total mystery in solving example 3.1319

That is the density function for the longest paper.1326

We want to find its mean and variance.1332

Let me walk you through that.1335

The mean is the expected value of Y sub N.1336

What we do to find the expected value is, you just do the integral of Y × the density function DY.1341

We integrate over the full range of Y.1355

In this case, this is the integral of Y × that density function we just figured out.1358

That gives $24 y^{24}/7^{24}$ DY; I bumped up the power by 1 because I had that one extra Y, with $7^{24}$ in the denominator.1368

I have to integrate this from 0 to 7, Y = 0 to 7.1377

That integral is actually not that bad, the numbers are a little ugly: I can pull out $24/7^{24}$.1382

I want to integrate $y^{24}$, this is just the power rule.1390

I get $y^{25}/25$, and if I evaluate that from Y = 0 to Y = 7, then what I get is $24 \cdot 7^{25}/(7^{24} \cdot 25)$.1393

A lot of that is going to cancel. My mean or expected value is $7 \cdot 24/25$.1415

24 of those 7’s are going to cancel, leaving you with just one 7 in the numerator, along with the 24, and 25 in the denominator.1425

It looks like I'm done, but there are going to be a lot of 24s and 25s in the variance, and it is going to get even worse.1439

Let me write this in terms of N, I think it will be a little more meaningful if I just use generic N here.1445

This is $7N/(N + 1)$, because remember the N here was the number of students in the class, that is N = 24.1451

$7N/(N + 1)$ is the expected length of the longest paper that is going to be turned in.1461

Notice by the way, if N is very large then the limit of that, as N gets very large is 7.1469

It gets closer and closer to 7, as N gets larger and larger.1477

That means that, the more students we have in this class, the more likely it is 1481

that someone is going to turn in the paper that is exactly or very close to 7 pages.1486

The more students we have, the more likely it is that the longest paper will be about 7 pages.1492

That kind of makes sense, if you have more students in a class, meaning N gets bigger, 1498

it is more likely that the longest paper is close to 7 pages.1513

That really makes sense with your intuition.1527

If you have just two students in the class, it might not be very likely that one is going to produce a 7 page paper.1530

But, if you have 100,000 students in a class and their lengths are uniformly distributed from 0 to 7 pages,1536

chances are you are going to get one that is pretty close to 7 just because there are so many students.1544
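
The way the mean creeps up toward 7 can be tabulated directly from the formula 7N/(N + 1). This is a minimal Python sketch; the function name and the particular class sizes are illustrative assumptions.

```python
from fractions import Fraction


def expected_longest(n, upper=7):
    """Mean of the maximum of n Uniform(0, upper) lengths: upper * n / (n + 1)."""
    return Fraction(upper * n, n + 1)


if __name__ == "__main__":
    # Small class, the class from the example, and two very large classes.
    for n in (2, 24, 5000, 100000):
        print(n, float(expected_longest(n)))
```

Using exact fractions keeps the value for N = 24 as 168/25, which is the 7 × 24/25 worked out above.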

Variance is a lot messier here, let us calculate out the variance.1549

Remember to find the variance of a variable, what you do is you do 1554

the expected value of Y², minus the square of the expected value of Y: $E(Y^2) - (E(Y))^2$.1560

What I'm going to do here is find the expected value of Y² first.1568

E of Y² is the integral of Y² × f of Y DY.1573

I’m going to use Y² where I used Y before.1583

I get the integral of $24 y^{25}/7^{24}$, because I had $y^{23}$ and I bumped it up by 2 powers.1589

We still have $7^{24}$ in the denominator, and then I have my DY here.1601

If I integrate that, I get $24/7^{24}$, and now $y^{26}/26$ by the power rule.1610

This is just good old-fashioned calculus here.1620

I evaluate from Y = 0 to Y = 7 and I get $(24/7^{24}) \cdot 7^{26}/26$, and that simplifies,1624

because I can cancel a lot of the 7’s, to $24 \cdot 7^2/26$.1644

Again, we have lots of 25, 24, 26, I think they are all coming from that original 24.1653

I think it is useful to write this in terms of N.1659

What I’m getting here is $7^2 N/(N + 2)$.1663

I’m going to bring in the expected value of Y, that part of the formula.1676

My sigma² or my variance is E of Y², so $7^2 N/(N + 2)$, minus the square of the expected value of Y.1681

The expected value of Y, I already worked out here.1694

That is $(7N/(N + 1))^2$, which is going to get a little messy.1697

$7^2 N/(N + 2) - 7^2 N^2/(N + 1)^2$.1705

I think I can put those over a common denominator.1717

It actually gets a little messy and then it simplifies really nicely.1720

Stick with me and I will show you how it works out because it is kind of fun when it simplifies.1723

I have a 7² everywhere and then, I'm going to have $N (N + 1)^2$ for the first term.1728

My common denominator, I’m planning ahead, is going to be $(N + 2)(N + 1)^2$.1736

I multiplied that first term by $(N + 1)^2$; the second term I have to multiply by N + 2, giving $N^2 (N + 2)$.1743

That looks pretty horrific but that numerator actually simplifies nicely.1751

It did, when I worked this out before.1756

In fact, I can factor an N out of everything here.1758

$7^2 N$, and $(N + 1)^2$ is $N^2 + 2N + 1$, minus $N (N + 2)$, which is $N^2 + 2N$.1760

That is because I factored one of the N's outside, there is one N outside.1775

I still have that same denominator, maybe I will write that down.1780

Miracles happen, lots of terms cancel; in fact the $N^2 + 2N$ cancels.1787

We are just left with the 1 in that numerator there.1794

That simplifies down to 7² × N divided by, I still have that denominator, 1797

$7^2 N/((N + 1)^2 (N + 2))$.1804

Of course, I was shooting for a number here, let me go ahead and fill in the number there.1813

That 7² ×, my N was 24, I’m not going to multiply this out.1817

$(N + 1)^2$ will be 25² and N + 2 is 26, so my variance is $7^2 \cdot 24/(25^2 \cdot 26)$.1825

Not the most illuminating answer in the world, not nearly as easily verified, 1835

or it does not easily conform to your intuition the same way the mean did.1844

But it is a number, we have solved the problem.1849

There is the mean and there is the variance of the longest paper.1854
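
The algebra for the variance is easy to get wrong, so here is a minimal Python sketch that recomputes it with exact fractions. It takes the two moments from the power-rule integrals and compares their difference with the simplified formula. The function names are illustrative assumptions.

```python
from fractions import Fraction


def max_moment(k, n=24, upper=7):
    """E[Y_(n)^k] for the max of n Uniform(0, upper) variables.

    Integrating y^k * n * y^(n-1) / upper^n from 0 to upper with the power rule
    gives n * upper^(n+k) / ((n + k) * upper^n).
    """
    return Fraction(n * upper ** (n + k), (n + k) * upper ** n)


def max_variance(n=24, upper=7):
    """Variance as E[Y^2] - (E[Y])^2, straight from the two moments."""
    return max_moment(2, n, upper) - max_moment(1, n, upper) ** 2


def closed_form_variance(n=24, upper=7):
    """The simplified formula: upper^2 * n / ((n + 1)^2 * (n + 2))."""
    return Fraction(upper ** 2 * n, (n + 1) ** 2 * (n + 2))


if __name__ == "__main__":
    print("mean:    ", max_moment(1), "=", float(max_moment(1)))
    print("variance:", max_variance(), "=", float(max_variance()))
    print("matches closed form:", max_variance() == closed_form_variance())
```

For N = 24 the mean comes out as 168/25 and the variance as 588/8125, which is 7² × 24/(25² × 26) in lowest terms.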

Let me review the steps there, I'm using the density function for the longest paper,1857

for the maximum value that I figured out in example 1.1863

If this part of the solution came as a total surprise to you, go back and watch example 1 and1866

you will see where this comes from, $24 y^{23}/7^{24}$.1873

And then, to find the mean, remember our original definition of the mean many lectures ago,1878

was you just integrate Y × the density function.1885

We are integrating Y × the density function, and Y × $y^{23}$ gives me $y^{24}$, that is where that 24 came from.1890

I bumped that power by 1 because of that extra Y there.1899

When I did that integral, it turned out to be a pretty easy integral, just use the power rule.1903

I dropped in my values, my range was Y goes from 0 to 7 that is1908

because we are given this uniform distribution from 0 to 7 pages.1915

That is where those limits came in, and when I dropped in the range of values Y goes from 0 to 7, I get 7 × 24/25.1919

I noticed that that was $7N/(N + 1)$. I like thinking of it in terms of N because, 1931

if you notice, when N goes to infinity, that gets very close to 7.1937

The limit is 7 but when you plug in bigger and bigger values of N, it gets closer and closer to 7.1943

That is not surprising, because if you think about it, if I have a very small class, if there are 5 people in the class,1949

and I say we are all going to pick a length between 0 and 7 pages, 1957

it is not that likely that anybody is going to get that close to 7.1962

But if you have a huge number of people in the class, if you have 5,000 people in your class1966

Maybe it is some huge online class and they all write papers, and you look at the longest one.1972

The chances are that the longest one, if I pick the longest term paper out of 5000 people,1978

it is going to get pretty close to 7 pages.1981

The more people you throw into the mix, the more likely it is that the longest one will get closer and closer to 7 pages.1985

That is very reassuring.1992

The variance does not work out quite as nicely.1996

First, I want to find E of Y² and that is because I'm remembering this old formula for the variance, E of Y² - the mean².1999

To find E of Y², I did the same thing as with the expected value except we have Y², 2008

instead of Y, which means we are integrating $y^{25}$ instead of $y^{24}$ before.2014

It is still an easy integral; plugging in Y = 7, I got $24 \cdot 7^2/26$.2021

That is anticipating, all these different values of 24 and 25 and 26.2029

I think it is easier to think of them in terms of N.2034

I just wrote that as $7^2 N/(N + 2)$, remember my N was 24, the number of students in the class.2037

Using this formula, my variance is that E of Y² minus the square of that $7N/(N + 1)$ that is coming from here.2045

That is what we got dropped in here.2056

And then, I got into some messy algebra.2058

I was just putting these two terms over a common denominator, I can factor out a 7² 2063

and I could also factor an N from both terms.2070

My common denominator was $(N + 2)(N + 1)^2$.2073

I multiplied the first term by $(N + 1)^2$ and the second term by N + 2.2077

After I factored out that N from both terms and I expanded both terms, 2082

I got something really cool because $(N + 1)^2$ gave me $N^2 + 2N + 1$.2087

$N (N + 2)$ gave me $N^2 + 2N$.2093

All the terms dropped out and I’m just left with 1 there.2097

It simplified down to $7^2 N$ over that common denominator.2101

That is really nice, I translated it back from N into 24 so I got an actual number for the variance.2104

I did not bother to put that into a calculator, it is not the most revealing of numbers.2114

You will notice when N gets very big, it goes to 0 which is kind of reassuring because with more and2121

more students in the class, there is going to be less variance in the length of the longest paper.2126

But beyond that, there is not too much insight to be gained from that expression.2132

It is just a number and we know that it is right.2137

Let us move on, in example 4, we have got a basketball team.2142

The team has 10 players, it is a women's basketball team.2148

Unfortunately, what happens with professional sports teams is that every so often somebody gets injured, 2154

it is just a fact of life2163

If you are a coach, you do have to plan for the fact that there will be injuries from time to time.2164

If you have 10 players, you never know, sometime during the season one of them might get injured.2170

You have to worry about that and you have to plan for that.2176

That is what this example is all about.2180

We got 10 players, 5 of them will play at any given time, and then we got 5 reserves.2182

We have been studying the sport of basketball for a long time.2188

You have noticed that over the long run, each player gets injured now and then.2192

It is very hard to predict but on average, each player gets injured about once every 5 years.2198

That is just an average, that certainly does not reflect individual player statistics.2204

There might be unlucky players who get injured almost every year.2209

There might be players who can go their whole careers without a major injury.2214

We certainly hope to find players in the latter category but it is very hard to predict.2219

What you are trying to do as the coach of the team is, you got 10 healthy players right now, 2226

you are worried about when the first injury is going to come.2233

What will be the first time that you have a player getting injured.2237

We want to find the distribution and density functions for that.2242

Then the following slide, in example 5, we are going to find the expected time,2245

how long will it be until we see our first injury.2249

Let us puzzle this out here.2254

First thing we notice here is that this is an exponential distribution.2256

That is actually a very realistic description of real life here because a basketball injury2260

is not something that happens with any regularity.2267

It is every so often, it happens and it might happen twice in a month or it might never happen over the course of 10 years.2274

There is no predicting it.2283

If you are safe for 5 years, it does not mean that you will not get injured tomorrow, unfortunately.2285

Let me remind you what the exponential distribution is all about.2292

It has density function $f(y) = (1/\beta) e^{-y/\beta}$, where Y goes from 0 to infinity.2296

By the way, the exponential distribution is part of the family of gamma distributions.2310

We have a lecture on the gamma distribution earlier on in the series,2316

it was in the chapter on continuous distributions.2320

If you do not remember the exponential distribution at all, it is a very good time now to go back and check it out,2323

and get yourself understanding the exponential distribution again.2329

One thing you learn is that the mean of the exponential distribution is this β.2334

That means, we have been given that the mean is 5 years, β here is 5.2338

Our f of Y is $(1/5) e^{-y/5}$.2346

Our F, that is the distribution function; what we just wrote down was the density function.2353

To get F, you integrate the density function.2358

I’m going to integrate from 0 to Y of $(1/5) e^{-t/5}$ DT.2363

I have to call it T, since my Y is used elsewhere.2369

The integral of $e^{-t/5}$ is just $-5 e^{-t/5}$.2375

But the 5 and 1/5 cancel, so it is $-e^{-t/5}$.2381

I did a u substitution there in my integration.2387

If you are not comfortable with that, you may work it out yourself.2389

I did u substitution, u = -T/5.2392

And then, I have worked out my DU as well.2399

You might want to fill in that step yourself.2403

I’m integrating this from T = 0 to T = Y.2406

If I plug in T = Y, I get $-e^{-y/5}$, and if I plug in T = 0, I get $e^0$.2412

It is minus a negative, so it is a plus.2424

$e^0$ is 1, so my distribution function, my F of Y, is $1 - e^{-y/5}$.2430

That is kind of describing the individual times until injury for each one of the players on this team.2437

That is kind of describing the density and distribution functions for Y1 up to, I guess we got 10 players on this team.2448

What we are worried about as a coach is, how long it will be until we see the first injury on this team?2457

Right now, we got 10 healthy players, we would certainly like to keep them all healthy2464

but we know that sooner or later, we might have an injury.2468

We are worried about when that first injury will come.2473

The first injury will be the minimum value of the Y.2477

The first injury will be the minimum of the Yi, which remember is what we are calling Y sub 1 in parentheses.2481

We want to figure out the density and distribution functions for Y sub 1.2498

We have a formula for that, that formula I will remind you what it is.2504

But if you do not remember where you saw it before, it was on the second slide of this lecture.2511

Just scroll back and you will see the second slide of this lecture.2515

The distribution function for the minimum value is 1 - (1 - F(Y))^N.2519

This works out fairly nicely, although it is going to be a little messy at first.2531

F(Y) itself is 1 - e^(-Y/5), so this is 1 - (1 - (1 - e^(-Y/5)))^N.2537

Those 1 - on the inside cancel, and we just get 1 - (e^(-Y/5))^N.2548

N is the number of variables we have, that is 10.2561

This is 1 - e to a power, and I can multiply those exponents there: 10 × 1/5 is 2.2568

1 - e^(-2Y), that is my distribution function for the minimum of those variables.2573
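As a sanity check on that simplification, here is a minimal Python sketch. The function names are my own, chosen for illustration; the numbers β = 5 and N = 10 come from the problem. It plugs the single-player distribution function into the formula 1 - (1 - F(Y))^N and compares the result with the simplified form 1 - e^(-2Y).

```python
import math

BETA = 5.0  # mean time until injury for one player, in years (from the problem)
N = 10      # number of players on the team (from the problem)

def single_cdf(y, beta=BETA):
    """Distribution function of one exponential variable: 1 - e^(-y/beta)."""
    return 1.0 - math.exp(-y / beta)

def min_cdf(y, n=N, beta=BETA):
    """Distribution function of the minimum of n variables: 1 - (1 - F(y))^n."""
    return 1.0 - (1.0 - single_cdf(y, beta)) ** n

def simplified_min_cdf(y):
    """The simplified form worked out in the lecture: 1 - e^(-2y)."""
    return 1.0 - math.exp(-2.0 * y)

# The two columns should agree up to floating-point rounding.
for y in (0.1, 0.5, 1.0, 2.0):
    print(y, min_cdf(y), simplified_min_cdf(y))
```

The two printed columns should match to many decimal places, which is just the cancellation of the inner 1 - (1 - ...) showing up numerically.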

Lowercase f of Y sub 1 is the density function; let me write down that what I found there was the distribution function.2584

The density function, I could just take the derivative of the distribution function, that is what I just found.2599

It would be quite easy to do that.2605

What I like to do is practice the formula that I gave you earlier on in this lecture.2608

Probably, it would be faster to take the derivative.2612

I just want to remind you of the formula.2614

It is N × (1 - F(Y))^(N-1) × f(Y).2617

If I fill in what those are, N is 10, and 1 - F(Y) is the same thing that I figured out before.2628

(1 - (1 - e^(-Y/5)))^9.2635

That f(Y) is (1/5) e^(-Y/5).2642

I see that I have a 10 × 1/5, that is 2.2651

Here, 1 - F(Y) is e^(-Y/5); it is getting a little messy there.2655

I can do better than that: here is (e^(-Y/5))^10.2664

It is to the 9th power, but then I have one more e^(-Y/5), and that 1/5 combines with the 10.2669

This simplifies down to 2 (e^(-Y/5))^10, because we have (e^(-Y/5))^9 times one more e^(-Y/5) to the 1st power.2679

10/5 is 2, so this is 2e^(-2Y), and that is my density function.2697
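For reference, the same computation can be sketched in one chain in LaTeX notation. Here I write f_{(1)} for the density of the minimum, which is my own label for what the lecture calls the density of Y sub 1:

```latex
f_{(1)}(y) = n\,\bigl[1 - F(y)\bigr]^{n-1} f(y)
           = 10\,\bigl(e^{-y/5}\bigr)^{9} \cdot \frac{1}{5}\, e^{-y/5}
           = 2\,\bigl(e^{-y/5}\bigr)^{10}
           = 2 e^{-2y}, \qquad 0 < y < \infty .
```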

As I mentioned before, the much quicker way to find that would be to have taken the derivative here.2711

This is just, if we had done d/dY of the distribution function, the derivative of -e^(-2Y) is just positive 2e^(-2Y).2720

That would be much quicker way to do it.2730

I just want to practice the formula that I gave you on the opening slides of this lecture.2732

In case you want to find the density function immediately, without sort of detouring through the distribution function.2738

That answers our two questions here, we got the distribution and density functions for the time until injury.2746

I did not really tell you the range on that, but it is the same range on your initial Y.2752

Let me throw that in here, 0 less than Y less than infinity.2757

In the worst of luck, it could happen right there on the first day, somebody gets injured.2763

If we are very lucky, it could go forever without an injury.2768

Let me review the steps here. First thing I did was focus on the word exponential distribution.2774

The exponential distribution, as I learned when I studied the gamma distribution in the earlier lecture is, 2779

its density function is (1/β) e^(-Y/β).2787

Its range is from 0 to infinity. The mean is always β, and in this case we are given a mean of 5.2792

Our β is 5, in this case.2799

I just dropped in β = 5 here.2802

The exponential distribution is a very good model of this physical situation,2804

because an injury is something that you absolutely cannot predict.2810

It is not the kind of thing that, if you stay healthy all through this month, 2814

it does not mean that you are more or less likely to get injured next month.2821

That is exactly the kind of behavior that the exponential distribution models.2825

In fact, I think that when we learned about the exponential distribution, I called it the memoryless distribution.2831

Meaning that, if you stay healthy all through 1 month, next month you sort of do not remember 2836

that you were healthy last month, you just get a fresh start in the second month.2841
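The memoryless idea can be checked with a few lines of Python. This is only a sketch under the problem's assumption that β = 5 years; the function names are hypothetical. It compares the chance of staying healthy for one more month, given that you already stayed healthy for a month, against the chance of staying healthy for a month starting fresh.

```python
import math

def survival(y, beta=5.0):
    """P(Y > y) for an exponential variable with mean beta: e^(-y/beta)."""
    return math.exp(-y / beta)

def conditional_survival(s, t, beta=5.0):
    """P(Y > s + t | Y > s): stay healthy t more years after s healthy years."""
    return survival(s + t, beta) / survival(s, beta)

one_month = 1.0 / 12.0  # in years

# Healthy through month 1, what about month 2? Compare with a fresh start.
print(conditional_survival(one_month, one_month))
print(survival(one_month))
```

The two printed values should agree up to rounding, since e^(-(s+t)/β) divided by e^(-s/β) is exactly e^(-t/β).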

That was the density function for the exponential distribution.2847

The distribution function there is, you take the integral of the density function.2852

I changed my variable to T because I want to integrate from 0 to Y.2858

Integrating that, I did a u substitution here, actually I did it in my head.2862

I integrated that and I got 1 - e^(-Y/5).2869

Remember here that you cannot ignore the T = 0 terms.2874

You got to include that T = 0 term because when you plug in 0, you did get the number 1 there.2878

You cannot drop that out, that is where that 1 came from.2888

The T = Y term gave me the -e^(-Y/5), and together that is my distribution function.2891

Those all represented the density or distribution functions for individual players, 2899

that represents the waiting time for an individual player to experience an injury. 2905

What the problem was actually asking is, we have 10 of those players, 2913

there are 10 different players that are all running around, hopefully, none of them gets injured.2920

What we are worried about is that each one of them has a possibility of getting injured.2924

I’m wondering about how long we are going to have to wait until the first one gets injured?2931

As soon as we have one injured player, it is really is going to change our coaching strategy on the team.2936

The first injury means the shortest time: which one of those players is going to get injured in the shortest time?2942

That is the minimum of the Yi, that is exactly Y sub 1, that is the one in parentheses.2950

I’m going to use my formulas to calculate the distribution and density for Y sub 1.2955

My formulas tell me, these are the formulas from the second slide of this lecture.2961

Just scroll back, there is the formula for the distribution and there is the formula for the density.2966

Then, I just go through and wherever I see a F, I fill in this.2972

Wherever I see a f, there is a f right there, I fill in the density function for the exponential distribution.2978

The nice thing is there are a lot of 1 - F terms, and 1 - F actually simplifies down really nicely into e^(-Y/5).2987

Both times I had a 1 - F, it simplified down into e^(-Y/5).3001

When I put in my exponent N = 10, that was because there were 10 players right there.3008

We got a simpler form for the distribution function.3014

After some work, I got a simpler form for the density function.3018

Of course, if I wanted to save time and really use every resource, I would not have used this formula for the density function.3021

I would have just started with the distribution function and taken its derivative.3030

If you take its derivative, you get the density function very quickly to be 2e^(-2Y).3034

I want you to really make sure you understand these answers from example 4,3042

because in example 5, we are going to revisit this example.3048

We are going to take it a little farther, we are going to calculate the expected time until the first injury.3052

We will use these answers, I will not derive them again from scratch.3059

We will figure out the expected time until the first injury.3062

If you understand these answers, it will make example 5 make a lot more sense.3067

Let us go ahead and take a look at that.3073

Example 5 here is a follow-up to example 4.3076

If you have not just watched example 4, you really need to understand example 4 3080

before you can make some sense of example 5.3084

The situation back then, you have a basketball team with 10 players.3087

Each player, we are worried about how long it would be until she experiences some kind of injury.3093

Because that is unfortunately what happens with basketball team, every now and then a player gets injured.3098

We are interested in finding the expected time until the team's first injury.3104

Let me remind you what we figured out in example 4.3111

We were looking at the time until the team's first injury.3115

The first injury means we are looking at Y sub 1, the minimum time until an injury among all those players.3121

We figured out in example 4 the density function f of Y sub 1, which I will remind you was 2e^(-2Y).3132

2e^(-2Y), that is the density function for Y sub 1, and that came from quite a bit of work in example 4.3148

I’m not going to repeat that but if you think that that is coming out of left field, 3158

I’m mixing my sports metaphors, this is a basketball team and nothing should come out of the field.3164

Go back and look at example 4, that should all make sense to you.3170

I want to find the expected value of Y1.3174

What I do, there is a quick way to do this and I’m going to hold off from that.3180

I’m going to do it using the definition, and then we will go back and 3184

see how we could have found the expected value very quickly.3188

Let me find the expected value, sort of not using any special cleverness here.3192

Remember, it is the integral of Y × f(Y) dY.3197

In this case, it is the integral of Y × 2e^(-2Y) dY, since my f(Y) is 2e^(-2Y).3204

I need to do a little integration by parts to make that work.3214

Let me go ahead and integrate by parts.3217

Here is tabular integration because I'm feeling lazy.3222

I write 2Y on the left and e^(-2Y) on the right. Taking derivatives on the left, the derivative of 2Y is 2, and the derivative of the constant is 0.3226

The integral of e^(-2Y) is -1/2 e^(-2Y).3237

The integral of that is +1/4 e^(-2Y). With the signs alternating +, -, the answer is 2Y × (-1/2 e^(-2Y)), which is -Y e^(-2Y), minus 2 × 1/4 e^(-2Y), which is -1/2 e^(-2Y).3243

I’m evaluating this over the whole range of Y, which I neglected to mention before: Y goes from 0 to infinity.3270

Of course, you can figure that out by looking at the exponential distribution or 3278

by sort of understanding the physical setup of the problem.3282

The time until the first injury could be as small as 0, if you get an injury right away.3287

Or it could be arbitrarily long: if you are lucky, your basketball team will play for 50 years without ever getting injured.3292

It is unlikely but you can certainly have that.3299

I’m evaluating this from Y = 0 to infinity.3302

If we plug in Y = infinity, these exponential terms are going to drag everything to 0.3309

You can do L'Hôpital's rule on that, or you can just know that exponential terms in the denominator will always beat polynomials.3317

I'm not even going to worry about my exponential terms, they are both 0.3325

If I plug in Y = 0, I get 0 for the first term.3331

In the second term, I get + because it is - a negative, so + ½ e^0, which is ½.3335
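Here is the whole expected value computation collected in one place, as a sketch in LaTeX notation. I write Y_{(1)} for the minimum, which is the lecture's Y sub 1:

```latex
E\bigl(Y_{(1)}\bigr) = \int_0^\infty y \cdot 2 e^{-2y}\,dy
  = \Bigl[-y\,e^{-2y} - \tfrac{1}{2} e^{-2y}\Bigr]_{y=0}^{y \to \infty}
  = (0 - 0) - \Bigl(0 - \tfrac{1}{2}\Bigr)
  = \tfrac{1}{2} .
```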

What that means is that, if you are this basketball team coach and 3347

you are wondering how long is it going to be until I see an injury in one of my players.3353

The expected time, you hope not to injure anyone ever, 3360

but the expected time until your first player gets injured is 1/2 a year, 6 months, that is not so good.3369

Dangerous sport, stay away from it.3377

If you got 10 people on the court playing at the same time and you keep them playing hard,3381

chances are in about 6 months, you are going to see your first injury.3386

That is the unfortunate consequence of your probability here of having 10 people play basketball at once.3392

I mention that there was actually a quick way to figure this out.3400

Let me show you now how we could have done that, it could have saved a lot of work here, 3405

which is to look back at that density function that we had there.3409

Notice that, this is another exponential distribution.3420

This is exponential, let me remind you of the form of an exponential distribution.3425

It has density function f(Y) equal to (1/β) e^(-Y/β).3432

That is the density function for an exponential distribution.3440

What we have is something that exactly matches that, if we take our β equal to ½ because 1/½ is exactly 2.3445

We have an exponential distribution here.3455

The expected value of the exponential distribution, the mean of the exponential distribution is β.3464

This is a property that we learned about the exponential distribution long ago, 3471

when we are studying the gamma distribution.3475

That was because the exponential distribution was a special case of the gamma family.3477

μ = β is ½, and that is really all we needed to do, if we had noticed that. 3482

We could have saved ourselves from walking through that long integration by parts.3491

I guess it was not that bad, but it is really useful to recognize a distribution, if it does fall into one of your known families.3495

And, to be able to draw some conclusions about it right away.3502

Let me generalize this a little bit, we already have our answer here.3508

We know that it is going to be about 6 months until our team's first injury, that is our expected time.3512

Let me mention where that 2 came from.3518

If you look back in example 4, where that 2 came from was, it just came from the 10 divided by the 5.3521

10 was the number of players on the team and 5 was the mean of the original distribution.3528

That ½ in turn came from the 5 divided by the 10, which in turn the 5 was the β,3539

the mean of the original distribution.3549

And, 10 was the number of players so β /N.3551

I'm going to try to write this in general here, without making reference to specific numbers.3556

In general, if Y1 through YN are exponential with mean β then, let me write that β a little more clear.3571

Then, Y sub 1 in parentheses, the minimum of Y1 through YN, is exponential.3585

We can say what the mean of that is: the new mean is going to be β/N.3607

And that is not too surprising, if you kind of think about this basketball player example.3617

If I have 10 players, on the average each one gets injured every 5 years.3624

If I’m sitting around waiting for one of them to get injured, which I hope I'm not.3630

Maybe, I’m the team doctor and I'm wondering when I'm going to have a job to do.3634

If there are 10 people, each one getting injured every 5 years, 3639

on the average you are going to have one getting injured every half year.3644

That is really not surprising, that is just 5 divided by 10.3647

If you start out with an exponential distribution with mean β and cobble together N of them,3653

and look at the minimum then it is exponential with mean β/N.3660
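If you want to see that general principle in action, here is a small simulation sketch in Python. It is not part of the lecture: the function name, the number of trials, and the seed are my own choices for illustration, while β = 5 and N = 10 are the numbers from the basketball example. It draws N exponential waiting times with mean β, records the smallest one, and averages over many repetitions.

```python
import random

def simulate_first_injury_mean(n=10, beta=5.0, trials=20000, seed=0):
    """Estimate the mean of the minimum of n exponential variables with mean beta."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        # expovariate takes the rate, which is 1 / mean
        first_injury = min(rng.expovariate(1.0 / beta) for _ in range(n))
        total += first_injury
    return total / trials

estimate = simulate_first_injury_mean()
print(estimate)    # should land close to beta / n
print(5.0 / 10)    # the value predicted by the lecture: 0.5
```

With this many trials, the estimate should typically fall within a few hundredths of β/N = 0.5 years.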

That is not too surprising, if you can think about the basketball players.3666

That wraps up example 5, let me review the steps here.3672

The key step here came from example 4, where we identified Y1 is the minimum of these Yi’s.3676

Yi is the time for each player to get injured, which hopefully is long.3685

Y1 is the first player to get injured, Y1 is what you are worried about.3690

As a coach, you do not want any of your players to get injured.3695

You worry about when it will be until your first player gets injured and forces you to change your team strategy.3698

We calculated the minimum, and back in example 4, 3704

we calculated the density function for the minimum as 2e^(-2Y).3708

The long way to find the expected value of that is to use the definition of expected value.3715

The definition of expected value says you integrate Y × the density function.3722

That is what I did there, I dropped in the density function.3727

I did the integral using integration by parts.3730

If that tabular integration was unfamiliar to you, I covered that in my calculus 2 lectures here on

You can look at the calculus 2 lectures and you can find a section on integration by parts.3740

It shows you a little short hand trick there.3749

When I plugged in infinity, the terms all dropped out.3751

When I plugged in 0, I got exactly ½; my expected time is ½ a year, which I translated into 6 months.3754

That was the long way, the short way is to look back at this density function and recognize that as an exponential distribution.3763

It is just an exponential distribution with a new β, β is ½.3774

If you know your exponential distribution, you know the mean is just β.3781

I could have immediately jumped to my answer there of 1/2 a year, 3786

without ever having to do that integral and all that integration by parts.3793

And then, I extrapolated that a little bit because I figured out that that ½ came from 5/10.3797

If you go back and look at example 4, you will see that that 2 came from 10/5.3805

And then, the ½ is 1 over that 2, which in turn comes from 5/10.3814

The 5 and 10 come from the original problem.3816

The 5 was β and the 10 was the N.3821

There is the N and there is the β.3824

The ½ came from β/N.3827

We have an exponential distribution with the new β is the old β divided by the old N.3830

That is a general principle there, if you have N exponential variables with the mean of β,3838

then their minimum will be exponential with mean β/N.3845

That is a very useful property and it kind of explains this idea, 3850

if you are wondering when your first player will get injured, every player gets injured, on average once every 5 years.3855

If you got 10 players, you are going to have people getting injured once every 6 months on average.3862

That is just 5 years divided by 10, gives you the 6 months, that gives you 1/2 year.3867

That kind of explains and sort of justifies and reassures that all of our mathematics is correct there.3873

That wraps up our lecture here on order statistics.3882

This is part of the larger series on probability here on

I'm your host along the way, my name is Will Murray.3891

I thank you very much for joining me today, bye now.3893