WEBVTT mathematics/probability/murray
00:00:00.000 --> 00:00:04.400
Hi, welcome back to www.educator.com lectures on probability, my name is Will Murray.
00:00:04.400 --> 00:00:09.800
Today, we are going to talk about the first of several continuous distributions.
00:00:09.800 --> 00:00:13.500
This is probably the easiest one, it is called the uniform distribution.
00:00:13.500 --> 00:00:16.000
We will see very soon why it is called that.
00:00:16.000 --> 00:00:17.500
Let us jump right on in.
00:00:17.500 --> 00:00:23.200
The idea of the uniform distribution is you have a finite range from θ1 to θ2.
00:00:23.200 --> 00:00:27.900
Let me go ahead and draw a graph of this, as I'm talking about this.
00:00:27.900 --> 00:00:37.200
You have 2 constant values, here is θ1 and then you have θ2, somewhere a bit bigger than θ1.
00:00:37.200 --> 00:00:44.400
And then, you just divide your density evenly, you distribute it evenly over that range
00:00:44.400 --> 00:00:51.100
which means you just take a completely horizontal line over that range.
00:00:51.100 --> 00:00:55.300
What that means is, remember the total density always has to be 1,
00:00:55.300 --> 00:00:59.700
that the total area under a density function always has to be 1.
00:00:59.700 --> 00:01:10.500
In order to have that area be 1, since the width is θ2 – θ1, the height has to be the constant 1/(θ2 – θ1).
00:01:10.500 --> 00:01:18.900
By the way, this triple equal sign means "is always equal to."
00:01:18.900 --> 00:01:22.300
It means that the density function is constant.
00:01:22.300 --> 00:01:26.700
That is much different from all of the other density functions that we will be studying later.
00:01:26.700 --> 00:01:30.700
That is what makes the uniform distribution a lot easier than some of the later ones.
00:01:30.700 --> 00:01:34.100
It is that the density function is always equal to a constant.
00:01:34.100 --> 00:01:41.000
That constant has to be 1/(θ2 – θ1), in order to give your total area 1.
00:01:41.000 --> 00:01:46.300
Each part of the region is equally possible or equally probable.
00:01:46.300 --> 00:01:54.600
It is very easy to calculate probabilities with the uniform distribution, if you have two values A and B here.
00:01:54.600 --> 00:01:57.900
Let me go ahead and draw them in on my graph A and B.
00:01:57.900 --> 00:02:03.900
If you have two values A and B, and we want to find the probability that your random variable
00:02:03.900 --> 00:02:06.100
lands somewhere between A and B.
00:02:06.100 --> 00:02:11.400
It is very simple, you just have to calculate the distance between A and B.
00:02:11.400 --> 00:02:15.900
Essentially, you are calculating this area right here.
00:02:15.900 --> 00:02:29.300
And that black area is just going to be (B – A)/(θ2 – θ1), because it has width B – A and it has height of 1/(θ2 – θ1).
00:02:29.300 --> 00:02:33.300
It is very easy to calculate probabilities using uniform distribution.
00:02:33.300 --> 00:02:36.900
You just look at the two endpoints of the range that you are interested in, subtract them,
00:02:36.900 --> 00:02:44.200
and then divide that by the appropriate constant which is always θ2 – θ1.
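A quick way to check this rule is a small Python sketch. The function name uniform_probability is a hypothetical label chosen for illustration, not anything from the slides. It assumes θ1 is less than θ2, and it clips A and B to the range from θ1 to θ2 so that anything outside the range contributes no probability.

```python
from fractions import Fraction


def uniform_probability(a, b, theta1, theta2):
    """Return P(a <= Y <= b) for Y uniform on [theta1, theta2].

    Hypothetical helper for illustration; assumes theta1 < theta2.
    """
    if theta2 <= theta1:
        raise ValueError("theta2 must be greater than theta1")
    # Keep only the part of [a, b] that lies inside [theta1, theta2].
    low = max(a, theta1)
    high = min(b, theta2)
    width = max(high - low, 0)
    # Probability = width of the interval of interest / total width.
    return Fraction(width) / Fraction(theta2 - theta1)


print(uniform_probability(2, 3, 0, 4))  # 1/4
```

Using Fraction keeps the answers exact whenever the endpoints are whole numbers, which matches the fractions we get by hand.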
00:02:44.200 --> 00:02:47.300
Let us see how that plays out.
00:02:47.300 --> 00:02:53.600
Here are the key properties of the uniform distribution. The mean should be kind of intuitively obvious.
00:02:53.600 --> 00:02:58.000
Let me draw again the graph of the uniform distribution.
00:02:58.000 --> 00:03:01.200
There is θ1 and there is θ2.
00:03:01.200 --> 00:03:07.600
Remember that, we are distributing the density completely evenly between θ1 and θ2.
00:03:07.600 --> 00:03:10.700
You would expect the mean to be halfway between them.
00:03:10.700 --> 00:03:13.400
In fact, that is where it turns out to be.
00:03:13.400 --> 00:03:18.300
The mean is exactly the average of θ1 and θ2.
00:03:18.300 --> 00:03:22.300
You just get (θ1 + θ2)/2 for the μ.
00:03:22.300 --> 00:03:27.600
That is really intuitively clear, it should not be hard to remember because it should be obvious.
00:03:27.600 --> 00:03:35.600
The variance is less obvious, the variance turns out to be (θ2 – θ1)²/12.
00:03:35.600 --> 00:03:38.200
I think that is not something that you would guess.
00:03:38.200 --> 00:03:41.300
You probably would guess the mean if you gave it a little bit of thought.
00:03:41.300 --> 00:03:44.900
The variance is not something that you would probably guess.
00:03:44.900 --> 00:03:48.900
You probably have to calculate it out or just memorize it.
00:03:48.900 --> 00:03:53.100
The standard deviation, remember, it is always the square root of the variance.
00:03:53.100 --> 00:04:03.600
If you take the square root of the variance here, you get (θ2 – θ1) ÷ √12, and √12 is 2 × √3.
00:04:03.600 --> 00:04:08.700
That is the standard deviation of the uniform distribution.
00:04:08.700 --> 00:04:17.100
It should still make a rough intuitive sense because it is really a measure of how spread out the interval is.
00:04:17.100 --> 00:04:24.400
Remember that, variance and standard deviation measure how spread out your dataset is.
00:04:24.400 --> 00:04:30.200
In this case, since we got a uniform distribution, if it is spread out over a wider area
00:04:30.200 --> 00:04:34.200
then you should have a higher variance or a higher standard deviation.
00:04:34.200 --> 00:04:42.500
If it is squished into a smaller area then you should have a smaller variance or a smaller standard deviation.
00:04:42.500 --> 00:04:51.200
In this case, we have the term θ2 – θ1, and that is the width of the interval there, θ2 – θ1.
00:04:51.200 --> 00:04:57.200
What we are saying here is that the standard deviation is proportional to the width of the interval.
00:04:57.200 --> 00:05:03.000
If you have a wider interval, if you spread your data out more then you have a larger standard deviation.
00:05:03.000 --> 00:05:07.400
If you compress your interval, then you will have a smaller standard deviation.
00:05:07.400 --> 00:05:10.100
It should be intuitively plausible.
00:05:10.100 --> 00:05:17.200
What is not so obvious I think is, the constant 12 for the variance, 2√ 3 for the standard deviation.
00:05:17.200 --> 00:05:22.200
I think that part would not be obvious unless, you actually calculated them out.
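For anyone who does want to calculate them out, here is a sketch of the standard computation. It writes Y for the uniform random variable with density 1/(θ2 – θ1) on the interval, and it uses the usual identity that the variance is E(Y²) – μ², which is assumed here rather than taken from this lecture.

```latex
\begin{align*}
\mu = E(Y) &= \int_{\theta_1}^{\theta_2} \frac{y}{\theta_2 - \theta_1}\,dy
   = \frac{\theta_2^2 - \theta_1^2}{2(\theta_2 - \theta_1)}
   = \frac{\theta_1 + \theta_2}{2}, \\
E(Y^2) &= \int_{\theta_1}^{\theta_2} \frac{y^2}{\theta_2 - \theta_1}\,dy
   = \frac{\theta_2^3 - \theta_1^3}{3(\theta_2 - \theta_1)}
   = \frac{\theta_1^2 + \theta_1\theta_2 + \theta_2^2}{3}, \\
\sigma^2 &= E(Y^2) - \mu^2
   = \frac{\theta_1^2 + \theta_1\theta_2 + \theta_2^2}{3}
     - \frac{(\theta_1 + \theta_2)^2}{4}
   = \frac{\theta_1^2 - 2\theta_1\theta_2 + \theta_2^2}{12}
   = \frac{(\theta_2 - \theta_1)^2}{12}, \\
\sigma &= \frac{\theta_2 - \theta_1}{\sqrt{12}}
   = \frac{\theta_2 - \theta_1}{2\sqrt{3}}.
\end{align*}
```

The 12 comes from putting the thirds and the fourths over a common denominator.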
00:05:22.200 --> 00:05:27.800
Let us go ahead and look at some problems involving the uniform distribution.
00:05:27.800 --> 00:05:30.500
They generally tend to be fairly easy to calculate.
00:05:30.500 --> 00:05:37.400
The first example here is, you are sitting on your front doorstep, waiting for your morning newspaper to arrive.
00:05:37.400 --> 00:05:42.500
It always arrives sometime between 7:00 AM and noon.
00:05:42.500 --> 00:05:47.400
The time at which it arrives follows a uniform distribution.
00:05:47.400 --> 00:05:51.900
We want to find the probability that it will arrive during an odd-numbered hour,
00:05:51.900 --> 00:06:00.300
which means we want it to arrive between 7 and 8, not between 8 and 9 because that would be an even-numbered hour.
00:06:00.300 --> 00:06:10.800
Between 9 and 10 would qualify, 10 to 11 would not qualify, and 11 to 12 would also qualify as an odd-numbered hour.
00:06:10.800 --> 00:06:14.100
It looks like there are 3 hours here that would qualify.
00:06:14.100 --> 00:06:25.500
Our total range here is 7 to noon, 7 to 12, that is θ1 is 7 and θ2 is 12.
00:06:25.500 --> 00:06:41.300
Θ2- θ1 is 12 – 7 is 5, that is our denominator here.
00:06:41.300 --> 00:06:44.400
We want to talk about what range we are interested in.
00:06:44.400 --> 00:07:02.700
We have all these odd-numbered hours, (8 – 7) + (10 – 9) + (12 – 11), which of course is just 3 separate hours on that 5 hour interval.
00:07:02.700 --> 00:07:10.700
We have a total probability of 3/5.
00:07:10.700 --> 00:07:20.500
If your newspaper is going to arrive sometime between 7:00 AM and 12, and it is uniformly distributed in that interval,
00:07:20.500 --> 00:07:27.600
then there is a 60% chance that it will be an odd numbered hour.
00:07:27.600 --> 00:07:29.300
Let me recap that.
00:07:29.300 --> 00:07:36.900
We got this newspaper arriving in a 5 hour interval, that is where I got that denominator of 5 because it is a 5 hour interval.
00:07:36.900 --> 00:07:43.600
If you want to think about it in terms of θ2 – θ1, that is 12 – 7, that is where that 5 comes from.
00:07:43.600 --> 00:07:49.600
The 3 comes from the 3 hours that have odd numbers.
00:07:49.600 --> 00:07:56.700
The 7:00 hour, the 9:00 hour, and 11:00 hour, gives you 3 different hours that have odd numbers.
00:07:56.700 --> 00:08:04.500
Our total fraction is 3/5, the probability that it will arrive during an odd numbered hour is exactly 60%.
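Here is a minimal Python sketch of that same bookkeeping, adding up the widths of the qualifying hours and dividing by the width of the whole range. The variable names are my own choices for illustration.

```python
from fractions import Fraction

# The newspaper arrives uniformly between 7 and 12 (noon).
theta1, theta2 = 7, 12

# The odd-numbered hours: the 7:00 hour, the 9:00 hour, and the 11:00 hour.
odd_hours = [(7, 8), (9, 10), (11, 12)]

# Total width of the region we are interested in.
favorable = sum(end - start for start, end in odd_hours)

probability = Fraction(favorable, theta2 - theta1)
print(probability)  # 3/5
```

Because the three hours do not overlap, their widths can simply be added before dividing by 5.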
00:08:04.500 --> 00:08:09.400
Remember, I told you that the uniform distribution is one of the easiest to deal with.
00:08:09.400 --> 00:08:14.300
Problems involving uniform distribution are often very easy computationally.
00:08:14.300 --> 00:08:20.600
This certainly qualifies as an easy one computationally but if you stick around, I have a couple of harder ones coming up.
00:08:20.600 --> 00:08:24.400
We will see something a little harder coming up.
00:08:24.400 --> 00:08:30.800
Example 2 is also going to be an easy one, trust me, example 3 will be a little more tricky.
00:08:30.800 --> 00:08:38.800
But example 2, we are going to pick a real number Y from the uniform distribution on the interval from 5 to 12.
00:08:38.800 --> 00:08:41.100
Let me go ahead and graph this out.
00:08:41.100 --> 00:08:50.900
There is 5, there is 6, 7, 8, 9, 10, 11, there is 12.
00:08:50.900 --> 00:08:56.700
We are going to pick a real number Y from somewhere in this interval on the uniform distribution.
00:08:56.700 --> 00:09:03.800
We want to find the probability that that value of Y will turn out to be bigger than 9.
00:09:03.800 --> 00:09:15.400
The probability that Y would be bigger than or equal to 9, is equal to, on the denominator we have θ2 – θ1.
00:09:15.400 --> 00:09:20.800
In the numerator, we have B - A being the interval that we are interested in.
00:09:20.800 --> 00:09:23.300
Let me go ahead and draw that in.
00:09:23.300 --> 00:09:33.300
Θ1 is 5, θ2 is 12, the interval that we are interested in is from 9 to 12 because we want Y to be bigger than 9.
00:09:33.300 --> 00:09:37.300
6, 7, 8, 9, there is 9 right there.
00:09:37.300 --> 00:09:42.900
There is A is 9, and B is the same as θ2, B is 12.
00:09:42.900 --> 00:09:51.800
B - A is 12 – 9, θ2 – θ1 is 12 -5.
00:09:51.800 --> 00:09:58.600
What we have there is 12 – 9 is 3, 12 -5 is 7.
00:09:58.600 --> 00:10:04.100
The probability that our number will be bigger than 9 is exactly 3/7.
00:10:04.100 --> 00:10:11.900
Again, very easy computations for the uniform distribution, it is just a matter of saying how wide is your interval.
00:10:11.900 --> 00:10:20.800
In this case, our interval is 7 units wide, the interval from 5 to 12, that is where we got that 7 in the denominator.
00:10:20.800 --> 00:10:26.000
How wide is the region that you are interested in, the region that you might call success?
00:10:26.000 --> 00:10:30.900
In this case, success is defined as Y is bigger than 9.
00:10:30.900 --> 00:10:36.700
That region of success will be the interval from 9 to 12.
00:10:36.700 --> 00:10:43.600
The width of that interval from 9 to 12 is just 12 - 9 is 3, that is where we got that 3.
00:10:43.600 --> 00:10:48.900
Our total answer, our total probability there is just 3/7.
00:10:48.900 --> 00:10:54.900
I guess you could convert that into a decimal, I think that would turn out to be about 43%.
00:10:54.900 --> 00:11:03.200
But that does not come out neatly, I’m going to leave it as a fraction as 3/7 there.
00:11:03.200 --> 00:11:10.100
Example 3 is a bit trickier, what is happening here is that you have a dinner date with your best friend.
00:11:10.100 --> 00:11:18.500
You are planning to meet at 6 pm at the restaurant but the problem is that, both of you tend to run a little late.
00:11:18.500 --> 00:11:27.000
In fact, even though you are planning to meet at 6:00 pm, you might be a little bit late, your friend might be a little bit late.
00:11:27.000 --> 00:11:30.600
One of you is probably going to end up waiting a little bit for the other one.
00:11:30.600 --> 00:11:36.600
The way it works is you tend to arrive between 0 and 15 minutes late.
00:11:36.600 --> 00:11:40.100
You are never later than 15 minutes and you are never early.
00:11:40.100 --> 00:11:45.700
You are always maybe 7 minutes, 8 minutes, 11 minutes late, somewhere between 0 and 15.
00:11:45.700 --> 00:11:49.300
Your friend is always between 0 and 10 minutes late.
00:11:49.300 --> 00:11:53.100
Your friend is a little bit more prompt than you are.
00:11:53.100 --> 00:11:58.000
The question we are asking here is the probability that you will arrive before your friend.
00:11:58.000 --> 00:12:04.400
In other words, will you be the one who gets there first and has to find the table at dinner
00:12:04.400 --> 00:12:07.700
and will be sitting around waiting for your friend, or will it be the other way around?
00:12:07.700 --> 00:12:13.500
Your friend arrives first and has to deal with the waiter, and your friend would be waiting for you.
00:12:13.500 --> 00:12:15.700
Let me show you how to solve this one.
00:12:15.700 --> 00:12:19.900
We are going to set up two variables here because there are two independent things going on.
00:12:19.900 --> 00:12:25.900
There is you arriving to the restaurant and there is your friend arriving to the restaurant.
00:12:25.900 --> 00:12:39.500
I set up the variable for X which is your arrival time which could be anywhere from 0 to 15 minutes late.
00:12:39.500 --> 00:12:49.800
Y is going to be your friend's arrival time which could be anywhere from 0 to 10 minutes late.
00:12:49.800 --> 00:12:54.200
A really useful way to think about this problem is to graph it.
00:12:54.200 --> 00:12:59.800
Let me go ahead and draw a graph of the possibilities here.
00:12:59.800 --> 00:13:05.100
I will put X on the X axis and Y on the Y axis.
00:13:05.100 --> 00:13:14.300
Your arrival time can be anywhere from 0 to, there is you being 5 minutes late, here you are being 10 minutes late,
00:13:14.300 --> 00:13:17.600
and here you are coming in 15 minutes late.
00:13:17.600 --> 00:13:20.100
We know you are not going to be later than that.
00:13:20.100 --> 00:13:26.300
There is your friend arriving 5 minutes late and here is your friend arriving 10 minutes late.
00:13:26.300 --> 00:13:29.600
We know that your friend would not be later than 10 minutes.
00:13:29.600 --> 00:13:36.800
What that means is your combined arrival time, the combination of arrival times
00:13:36.800 --> 00:13:44.800
is going to be somewhere in this rectangle, depending on when you arrive and depending on when your friend arrives.
00:13:44.800 --> 00:13:50.600
Somewhere, you will arrive a certain number of minutes late and your friend will arrive a certain number of minutes late.
00:13:50.600 --> 00:13:54.400
That will give us a point somewhere in this rectangle.
00:13:54.400 --> 00:14:01.000
Then, we will look at that and say did you arrive first or did your friend arrive first.
00:14:01.000 --> 00:14:08.600
The way we want to think about that is we want to calculate the probability that you will arrive before your friend.
00:14:08.600 --> 00:14:18.800
In other words, we want the probability that X is less than or equal to Y, that is you arriving before your friend.
00:14:18.800 --> 00:14:23.700
Let me just turn those variables around because I think it will be easier to graph that way.
00:14:23.700 --> 00:14:29.300
The probability that Y is greater than or equal to X, that is saying the same thing.
00:14:29.300 --> 00:14:33.300
Let me graph the region in which Y is greater than X.
00:14:33.300 --> 00:14:37.000
A little bit of algebra review here.
00:14:37.000 --> 00:14:40.800
Maybe, I will do this in black.
00:14:40.800 --> 00:14:45.000
The line Y equals X is that line right there.
00:14:45.000 --> 00:14:47.900
That is the line Y is equal to X.
00:14:47.900 --> 00:14:53.700
Y greater than X means you go above the line.
00:14:53.700 --> 00:15:00.600
It is all this triangular region above the line here.
00:15:00.600 --> 00:15:07.100
That is the region where you arrive before your friend.
00:15:07.100 --> 00:15:14.700
Anywhere below the line, that means your friend arrives first and you arrived afterwards.
00:15:14.700 --> 00:15:20.700
Let us try to calculate the probability of being in that black shaded region.
00:15:20.700 --> 00:15:31.800
It is the total shaded area ÷ the total area in the rectangle.
00:15:31.800 --> 00:15:34.000
Let us try to figure out what those areas are.
00:15:34.000 --> 00:15:42.400
The shaded area, I see I have a triangle with 10 units on a side here.
00:15:42.400 --> 00:15:50.400
That is base × height/2, that is 10 × 10/2, 100/2 is 50.
00:15:50.400 --> 00:15:54.900
50 units in your shaded triangle.
00:15:54.900 --> 00:16:03.600
The total area is a rectangle, it is 10 by 15 on the side, that is 10 × 15 is 150.
00:16:03.600 --> 00:16:09.800
That is very nice, 50/150 simplifies neatly to 1/3.
00:16:09.800 --> 00:16:17.600
The probability that you will arrive before your friend is exactly 1/3.
00:16:17.600 --> 00:16:25.400
If you make lots and lots of dates with the same friend, and you guys both follow the same habits over the years,
00:16:25.400 --> 00:16:30.300
what will happen is 1/3 of the time you will be sitting around waiting for your friend.
00:16:30.300 --> 00:16:34.700
2/3 of the time your friend will be sitting around waiting for you.
00:16:34.700 --> 00:16:42.700
That is really the result of the fact that your friend is a little bit more prompt than you.
00:16:42.700 --> 00:16:47.600
Most of the time your friend will end up waiting for you, and some of the time, 1/3 of the time,
00:16:47.600 --> 00:16:49.400
you will wait for your friend.
00:16:49.400 --> 00:16:51.200
Let me recap that.
00:16:51.200 --> 00:16:57.400
The way we approach this is, we noticed first that we really had two independent uniform distributions.
00:16:57.400 --> 00:17:03.100
There is one for your arrival time and there is one for your friend’s arrival time.
00:17:03.100 --> 00:17:10.300
The first thing I did was to set up variables to indicate your arrival time and your friend’s arrival time.
00:17:10.300 --> 00:17:20.000
Your arrival time, I put on the X axis that goes from 0 to 15 because you can be anywhere from 0 to 15 minutes late.
00:17:20.000 --> 00:17:23.200
Your friend is anywhere from 0 to 10 minutes late.
00:17:23.200 --> 00:17:27.900
I got those from the stem of the problem here.
00:17:27.900 --> 00:17:36.700
I graphed those here and I got this nice rectangle of possible combinations of arrival times.
00:17:36.700 --> 00:17:43.600
Once I have this rectangle, I know that you will arrive at a particular time and your friend will arrive at a particular time,
00:17:43.600 --> 00:17:49.200
that means essentially we are choosing a point at random in this rectangle.
00:17:49.200 --> 00:17:54.300
And then, we have to ask whether you will arrive before your friend?
00:17:54.300 --> 00:18:01.000
You arriving before your friend means your arrival time is less than or equal to your friend’s arrival time,
00:18:01.000 --> 00:18:03.900
which can be rewritten as Y greater than X.
00:18:03.900 --> 00:18:10.800
The shaded region we graphed represents the region where Y is greater than or equal to X.
00:18:10.800 --> 00:18:14.300
And then, it was just a matter of calculating the areas which turned into
00:18:14.300 --> 00:18:19.000
a little old-fashioned geometry of calculating the area of a triangle.
00:18:19.000 --> 00:18:25.300
The triangle was ½ base × height, ½ × 10 × 10.
00:18:25.300 --> 00:18:33.300
The rectangle was base × height, that is 15 × 10.
00:18:33.300 --> 00:18:39.400
We get 50/150, which simplifies down to a probability of 1/3.
00:18:39.400 --> 00:18:45.600
That represents the chance that you have arrived at the restaurant before your friend.
00:18:45.600 --> 00:18:51.800
You will be the one who has to sit around and wait.
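If you would like to double-check that 1/3, here is a rough Python sketch that computes the area ratio exactly and also estimates it by picking random points in the rectangle. The function names exact_probability and simulate are hypothetical labels, and the simulation assumes, as in the problem, that the two arrival times are independent and uniform.

```python
import random
from fractions import Fraction


def exact_probability():
    """Area of the triangle where X <= Y divided by the area of the rectangle."""
    triangle = Fraction(10 * 10, 2)   # half of base times height
    rectangle = Fraction(15 * 10)     # base times height
    return triangle / rectangle


def simulate(trials=100_000, seed=0):
    """Estimate P(X <= Y) by choosing random points in the rectangle."""
    rng = random.Random(seed)
    you_first = 0
    for _ in range(trials):
        x = rng.uniform(0, 15)   # your lateness in minutes
        y = rng.uniform(0, 10)   # your friend's lateness in minutes
        if x <= y:
            you_first += 1
    return you_first / trials


print(exact_probability())  # 1/3
print(simulate())           # an estimate that should land near 0.333
```

The simulated value will wobble a little from run to run if you change the seed, but with this many trials it should stay close to the exact answer.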
00:18:51.800 --> 00:18:59.000
In example 4, we have a problem that is of great interest to computer programmers.
00:18:59.000 --> 00:19:06.100
The reason is that most computer languages have a random number function.
00:19:06.100 --> 00:19:10.800
It is usually called rand or random, or something like that.
00:19:10.800 --> 00:19:21.900
If you type rand into a computer program in the appropriate language, it will give you a number between 0 and 1.
00:19:21.900 --> 00:19:28.600
We usually try to arrange it so that the random numbers are uniformly distributed between 0 and 1,
00:19:28.600 --> 00:19:32.600
which is very good if you need a number between 0 and 1.
00:19:32.600 --> 00:19:37.100
In lots of cases, when you need a random number in a computer program,
00:19:37.100 --> 00:19:44.600
you need a random number on another range which might be from θ1 to θ2.
00:19:44.600 --> 00:19:52.900
What this problem really does is, it shows you how to convert a uniform distribution on 0-1
00:19:52.900 --> 00:19:57.100
into a uniform distribution on the interval from θ1 to θ2.
00:19:57.100 --> 00:20:02.700
That is the point of this problem, it is very useful for computer programmers
00:20:02.700 --> 00:20:05.000
but that is not actually what we are doing here.
00:20:05.000 --> 00:20:10.100
What we are doing is we are making a little transformation and we are starting with Y
00:20:10.100 --> 00:20:14.800
being uniformly distributed on the interval from 0 to 1.
00:20:14.800 --> 00:20:30.600
We are looking at another variable X which is defined to be, and by the way, that colon means "defined to be."
00:20:30.600 --> 00:20:35.900
X is defined to be (θ2 – θ1) × Y + θ1.
00:20:35.900 --> 00:20:43.100
We want to show that that is a uniform distribution on the interval from θ1 to θ2.
00:20:43.100 --> 00:20:50.400
To show that, let me first find the range of values for X.
00:20:50.400 --> 00:21:00.400
Notice that, the range for Y is from 0 to 1, I have Y =0 to Y =1, if I plug those values into X,
00:21:00.400 --> 00:21:08.400
if I plug Y = 0 into X, I get X = (θ2 – θ1) × 0 + θ1, and the first term drops out.
00:21:08.400 --> 00:21:10.600
I just get X = θ1.
00:21:10.600 --> 00:21:22.000
If I plug Y = 1 into X, I get X = (θ2 – θ1) × 1 + θ1.
00:21:22.000 --> 00:21:26.700
That simplifies down to θ2.
00:21:26.700 --> 00:21:33.800
That tells me the range for X, X goes from θ1 to θ2.
00:21:33.800 --> 00:21:41.800
That is hopeful, at least I know that X is distributed somehow on the range from θ1 to θ2.
00:21:41.800 --> 00:21:45.600
But I want to really show that X is a uniform distribution.
00:21:45.600 --> 00:21:52.900
I want to calculate the probability of X lying between any two values.
00:21:52.900 --> 00:21:56.700
Let me find that probability here.
00:21:56.700 --> 00:22:10.000
The probability that X is between any two values A and B, I can calculate that as the probability that,
00:22:10.000 --> 00:22:23.600
X, which just by definition is (θ2 – θ1) × Y + θ1, is between A and B.
00:22:23.600 --> 00:22:29.800
What I want to do is to convert this into a probability statement about Y.
00:22:29.800 --> 00:22:33.800
First, subtract off θ1 from all 3 sides there.
00:22:33.800 --> 00:22:48.100
The probability of A – θ1 being less than or equal to (θ2 – θ1) × Y, less than or equal to B – θ1.
00:22:48.100 --> 00:22:52.100
I’m trying to solve for Y, I'm trying to get Y by itself.
00:22:52.100 --> 00:22:57.000
Next, I’m going to divide by θ2 – θ1.
00:22:57.000 --> 00:23:04.800
This is the probability of (A – θ1)/(θ2 – θ1).
00:23:04.800 --> 00:23:17.000
Less than or equal to just Y by itself now, less than or equal to (B – θ1)/(θ2 – θ1).
00:23:17.000 --> 00:23:24.600
I'm remembering that Y itself is a uniformly distributed random variable.
00:23:24.600 --> 00:23:31.200
The probability that Y would be between any two bounds is just the difference between those two bounds.
00:23:31.200 --> 00:23:33.500
You just subtract those two bounds.
00:23:33.500 --> 00:23:45.500
I will just do (B – θ1)/(θ2 – θ1) – (A – θ1)/(θ2 – θ1).
00:23:45.500 --> 00:23:53.900
I see now that I have a common denominator, θ2- θ1.
00:23:53.900 --> 00:24:00.900
I have –θ1 in the first term and, since minus a minus is a plus, +θ1 in the second term.
00:24:00.900 --> 00:24:08.200
Those θ1's cancel with each other and I just get down to (B – A)/(θ2 – θ1).
00:24:08.200 --> 00:24:18.200
If you look at that, what I did was I started out with the probability that X is going to be between A and B.
00:24:18.200 --> 00:24:27.800
What I came up with is, the probability is equal to exactly (B – A) ÷ (θ2 – θ1).
00:24:27.800 --> 00:24:34.900
That is exactly the formula for a uniform distribution.
00:24:34.900 --> 00:25:01.400
X has a uniform distribution, X is uniformly distributed on the interval from θ1 to θ2.
00:25:01.400 --> 00:25:06.800
I should start with θ1 and go to θ2.
00:25:06.800 --> 00:25:19.500
X has a uniform distribution because the probability of X falling in any interval is exactly the width of that interval, B – A, divided by θ2 – θ1.
00:25:19.500 --> 00:25:26.300
To recap what I did there, first, I looked at the range for Y, Y goes from 0 to 1.
00:25:26.300 --> 00:25:29.500
Based on that, I calculated the range for X.
00:25:29.500 --> 00:25:40.800
I plugged in those values of Y 0 and 1 into the formula for X here, and calculated the bounds for X being θ1 to θ2.
00:25:40.800 --> 00:25:45.100
I know that X takes on the right range of values.
00:25:45.100 --> 00:25:56.000
And then, I found the probability of any particular sub interval from A to B by converting X into terms of Y.
00:25:56.000 --> 00:26:03.500
I solved to isolate the variable Y, and then I used the fact that Y is uniformly distributed.
00:26:03.500 --> 00:26:12.000
The probability of Y being between any two limits is just the width of those limits, the difference between those two limits.
00:26:12.000 --> 00:26:20.500
That gave (B – θ1)/(θ2 – θ1) – (A – θ1)/(θ2 – θ1).
00:26:20.500 --> 00:26:33.800
That simplifies down to (B – A)/(θ2 – θ1), which is exactly the formula for probability with a uniform distribution.
00:26:33.800 --> 00:26:41.400
That tells me that X has a uniform distribution on the interval from θ1 to θ2.
00:26:41.400 --> 00:26:45.200
That is very useful if you are a computer programmer because that means
00:26:45.200 --> 00:26:53.000
you can use the random number generator given by most computer programming languages.
00:26:53.000 --> 00:27:00.100
And then, you can use this formula to convert it into a uniform distribution on whatever range you want.
00:27:00.100 --> 00:27:10.800
If you want, for example, a random number between 80 and 100, you just plug in θ1 = 80 and θ2 = 100.
00:27:10.800 --> 00:27:18.200
You can use this formula to generate a random number between 80 and 100,
00:27:18.200 --> 00:27:23.800
that will be uniformly distributed between 80 and 100.
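In Python, for example, that conversion might be sketched like this. The function name uniform_on is a hypothetical label for illustration. Python's random.random() returns a number from 0 up to, but not including, 1, and the sketch applies the formula X = (θ2 – θ1) × Y + θ1 to it.

```python
import random


def uniform_on(theta1, theta2, rng=random):
    """Scale a uniform number on [0, 1) to the interval from theta1 to theta2.

    Hypothetical helper; rng can be the random module or a random.Random instance.
    """
    y = rng.random()                      # Y is uniform on [0, 1)
    return (theta2 - theta1) * y + theta1  # X = (theta2 - theta1) * Y + theta1


# A random number between 80 and 100, as in the example.
print(uniform_on(80, 100))
```

Python's standard library also offers random.uniform(a, b), which does this kind of scaling for you, so the sketch is mainly a way to see the formula in action.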
00:27:23.800 --> 00:27:28.700
In example 5, we are going to leave the world of the random numbers behind.
00:27:28.700 --> 00:27:33.600
We are going to look at the rough and tumble world of ice cream dispensing.
00:27:33.600 --> 00:27:38.900
We have an ice cream machine which gives you servings of ice cream.
00:27:38.900 --> 00:27:48.500
The servings vary a little bit, if you are unlucky the machine will be stingy with you, and it will give you just 206 ml of ice cream.
00:27:48.500 --> 00:27:55.900
If it is a good day for you, if the machine is feeling generous, it will give you up to 230 ml.
00:27:55.900 --> 00:28:03.700
Essentially, it picks a random amount in between 206 and 230, and they are uniformly distributed.
00:28:03.700 --> 00:28:11.000
The question we are trying to answer is, if you go up there with your bowl and
00:28:11.000 --> 00:28:18.600
you want to predict how much ice cream you will get, you want to describe what the expected amount of ice cream is in a serving,
00:28:18.600 --> 00:28:24.800
and also the standard deviation in that quantity.
00:28:24.800 --> 00:28:30.200
This is really asking for the mean, since the expected value and mean are the same thing.
00:28:30.200 --> 00:28:37.300
We are trying to calculate the expected value or the mean of this uniform distribution, and also the standard deviation.
00:28:37.300 --> 00:28:48.100
I gave you formulas for those a few slides back, in a slide called key properties of the uniform distribution.
00:28:48.100 --> 00:28:51.700
You can go back and look those up, I will remind you what they are here.
00:28:51.700 --> 00:28:58.100
The mean which is always the same as the expected value, by definition those are the same thing.
00:28:58.100 --> 00:29:05.700
For the uniform distribution, it is (θ1 + θ2) ÷ 2.
00:29:05.700 --> 00:29:12.200
The θ1 and θ2 are the endpoints of the interval.
00:29:12.200 --> 00:29:27.500
In this case that is (206 + 230) ÷ 2, that is 436 ÷ 2 which is 218.
00:29:27.500 --> 00:29:35.500
Our units here are ml, the average amount you expect to get when you fill up your bowl
00:29:35.500 --> 00:29:40.600
at this ice cream machine is 218 ml of ice cream.
00:29:40.600 --> 00:29:49.300
Of course that is not at all surprising, if you are going to get a random amount between 206 and 230,
00:29:49.300 --> 00:29:58.400
it is not surprising that in the long run, you will get about halfway between 206 and 230 which is 218.
00:29:58.400 --> 00:30:00.900
That is really not surprising at all.
00:30:00.900 --> 00:30:05.500
The standard deviation, I also gave you on that slide, several slides ago.
00:30:05.500 --> 00:30:12.400
It is (θ2 – θ1) ÷ (2√3).
00:30:12.400 --> 00:30:23.600
In this case, θ2, the big one is 230, Θ1 is 206, we want to divide that by 2 √3.
00:30:23.600 --> 00:30:37.200
230 – 206 is 24, the 200 part cancels, so we have 24 ÷ (2√3), and that simplifies down to 12/√3.
00:30:37.200 --> 00:30:49.000
Since 12 is 4 × 3, and 3/√3 is √3, this just gives us 4 × √3, and when I put that into a calculator,
00:30:49.000 --> 00:30:54.300
it works out to be just a little bit less than 7 ml.
00:30:54.300 --> 00:30:59.200
It is about 6.9 ml.
00:30:59.200 --> 00:31:07.500
If you fill up your bowl at this ice cream machine, you expect on average to get about 218 ml.
00:31:07.500 --> 00:31:16.400
The standard deviation on that will be 6.9 ml, about 7 ml + or - from 218.
00:31:16.400 --> 00:31:19.300
To recap where these numbers came from.
00:31:19.300 --> 00:31:24.700
The formulas for the mean and standard deviation, I gave these to you on an early slide in this talk.
00:31:24.700 --> 00:31:27.800
It was called key properties of the uniform distribution.
00:31:27.800 --> 00:31:29.300
They are fairly straightforward formulas.
00:31:29.300 --> 00:31:32.900
In particular, the mean is what you would guess.
00:31:32.900 --> 00:31:44.700
It is just the average of the upper and lower bounds, (206 + 230)/2 gives you an average of 218 ml of ice cream per serving.
00:31:44.700 --> 00:31:54.500
The standard deviation is probably not something you would guess but if you have a formula handy, it is (θ2 – θ1)/(2√3).
00:31:54.500 --> 00:31:59.900
I just plugged the values of θ2 and θ1 into that formula there.
00:31:59.900 --> 00:32:08.300
And I simplified it down to 4 √3 and I have this decimal approximation that is about 6.9 or about 7 ml.
00:32:08.300 --> 00:32:12.600
That is your standard deviation in an ice cream serving.
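If you want to confirm those two numbers without a hand calculator, here is a minimal Python sketch using the formulas (θ1 + θ2)/2 and (θ2 – θ1)/(2√3). The function name uniform_mean_and_sd is a hypothetical label, and the endpoints 206 and 230 are the ones used in the calculation above.

```python
import math


def uniform_mean_and_sd(theta1, theta2):
    """Mean and standard deviation of a uniform distribution on [theta1, theta2]."""
    mean = (theta1 + theta2) / 2
    sd = (theta2 - theta1) / (2 * math.sqrt(3))
    return mean, sd


# Servings are uniform between 206 ml and 230 ml.
mean, sd = uniform_mean_and_sd(206, 230)
print(mean)          # 218.0
print(round(sd, 2))  # 6.93
```

The standard deviation printed here is the decimal form of 4 × √3.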
00:32:12.600 --> 00:32:18.100
That is the last example and that wraps up this lecture on the uniform distribution.
00:32:18.100 --> 00:32:24.100
The uniform distribution is just the first, it is the easiest of several continuous distributions.
00:32:24.100 --> 00:32:30.800
We will be moving on from here and looking at the famous normal distribution, not the same as the uniform distribution,
00:32:30.800 --> 00:32:38.000
and also the gamma distribution which includes the exponential distribution and chi square distribution.
00:32:38.000 --> 00:32:45.000
Those are all coming up in the next few lectures here in the probability series on www.educator.com.
00:32:45.000 --> 00:32:49.000
You are watching Will Murray with www.educator.com, and thank you very much for joining us, bye.