1
00:00:17,350 --> 00:00:25,450
Hello and welcome to lecture 6 of module
3 on random variables. In this lecture, we
2
00:00:25,450 --> 00:00:31,650
will cover the probability distributions of
continuous random variables. In the last lecture,
3
00:00:31,650 --> 00:00:37,200
we covered the discrete random variables.
Similarly, for continuous random
4
00:00:37,200 --> 00:00:43,050
variables, there are some standard probability
distribution functions, which are
5
00:00:43,050 --> 00:00:48,610
very widely used in different problems in
civil engineering. We will see those
6
00:00:48,610 --> 00:00:55,460
distributions in today’s class, and in the
next class we will continue the discussion
7
00:00:55,460 --> 00:01:01,809
through different probability distributions
for continuous random variables.
8
00:01:01,809 --> 00:01:08,870
So, at the start we will quickly recapitulate
the pdf, that is, the probability density function.
9
00:01:08,870 --> 00:01:15,320
You know that for a discrete random variable
what we refer to is the probability mass
10
00:01:15,320 --> 00:01:22,360
function. So, we will look at the probability density function
and the requirements for it to be a valid pdf for
11
00:01:22,360 --> 00:01:27,380
a continuous random variable. We will see
that very quickly, as we discussed this
12
00:01:27,380 --> 00:01:32,110
in the earlier lectures as well; and after
that we will go through the different distribution
13
00:01:32,110 --> 00:01:39,950
functions. To list them, we will start with
the uniform distribution, then we will see
14
00:01:39,950 --> 00:01:41,850
the
normal distribution, log normal distribution,
15
00:01:41,850 --> 00:01:46,610
exponential distribution, gamma
distribution, Weibull distribution and beta
16
00:01:46,610 --> 00:01:51,500
distribution.
This list of distributions may not be exhaustive,
17
00:01:51,500 --> 00:01:55,990
but these are the distributions that
are generally and widely used for different
18
00:01:55,990 --> 00:02:01,749
problems in civil engineering. So, we will
cover them one after another and we will just
19
00:02:01,749 --> 00:02:08,750
show for which type of problems
which distributions will be most suitable,
20
00:02:08,750 --> 00:02:11,840
and that is generally decided based on their
support.
21
00:02:11,840 --> 00:02:17,000
For example, when we talk about the uniform
distribution, this is bounded
22
00:02:17,000 --> 00:02:22,650
from the lower side as well as the upper side,
and in between the density is uniform.
23
00:02:22,650 --> 00:02:28,600
Similarly, the support of the normal
distribution spans
24
00:02:28,600 --> 00:02:31,020
from
minus infinity to plus infinity,
25
00:02:31,020 --> 00:02:38,690
that is, the entire range of the real axis.
The log normal distribution, exponential
26
00:02:38,690 --> 00:02:44,860
distribution and gamma distribution are generally
lower bounded, that is, they have some lower
27
00:02:44,860 --> 00:02:49,470
bound. And this bound is generally at the
origin, that is 0. Then we will see the
28
00:02:49,470 --> 00:02:54,170
Weibull distribution and the beta distribution; this
beta distribution is again a bounded
29
00:02:54,170 --> 00:02:59,770
distribution. And we will see the different
possible applications of these in different
30
00:02:59,770 --> 00:03:03,091
civil
engineering problems. In this lecture, as
31
00:03:03,091 --> 00:03:09,100
well as in the next lecture, that is in this module,
what we will discuss is the basic properties
32
00:03:09,100 --> 00:03:13,650
of these distributions and what their
characteristics are.
33
00:03:13,650 --> 00:03:18,370
Applications of any specific distribution
to some specific problem of civil
34
00:03:18,370 --> 00:03:23,840
engineering will be discussed in the
subsequent modules; most probably it is in
35
00:03:23,840 --> 00:03:30,260
module 5. So, we will start with this very
brief recapitulation of probability density
36
00:03:30,260 --> 00:03:34,460
function for the continuous random variable.
38
00:03:35,460 --> 00:03:40,610
So, as we discussed in the earlier
lectures, a continuous random
39
00:03:40,610 --> 00:03:45,400
variable is a function that can take any value
in a continuous range, that is, anywhere
40
00:03:45,400 --> 00:03:50,260
within
the given range of that random variable. A
41
00:03:50,260 --> 00:03:56,920
continuous random variable possesses a
continuous probability density function. So,
42
00:03:56,920 --> 00:04:03,030
for this continuous probability density function
we know that over the possible
43
00:04:03,030 --> 00:04:08,450
range of this random variable
the function is continuous and, obviously,
44
00:04:08,450 --> 00:04:16,709
some properties should be satisfied by
this function to be a valid pdf. Now, the pdf
45
00:04:16,709 --> 00:04:23,850
of such a random variable identifies the
probability of a value of the random variable
46
00:04:23,850 --> 00:04:29,810
falling within a particular interval.
So, this is basically the difference between
47
00:04:29,810 --> 00:04:35,389
the discrete and the continuous random
variable: here we are talking about the function
48
00:04:35,389 --> 00:04:41,160
as a density function. So, at a particular
point, at a particular value of this random
49
00:04:41,160 --> 00:04:44,220
variable, if we see the pdf, then what it is
giving
50
00:04:44,220 --> 00:04:49,570
is the density. Now, when you are talking
about a small interval, a small range of
51
00:04:49,570 --> 00:04:52,461
the random
variable, then the area below the pdf is the
52
00:04:52,461 --> 00:05:01,370
probability. So, that is why
what the pdf identifies is the probability
53
00:05:01,370 --> 00:05:06,150
of the random variable falling within a
particular interval. So, one small interval
54
00:05:06,150 --> 00:05:09,780
is needed to define what is the probability
for
55
00:05:09,780 --> 00:05:15,030
that random variable falling within that particular
interval.
57
00:05:16,030 --> 00:05:21,570
Then, these are the properties that should
be followed to be a valid pdf. A
58
00:05:21,570 --> 00:05:27,870
probability density function (pdf) is conventionally,
as we have seen earlier,
59
00:05:27,870 --> 00:05:33,540
denoted by f_X(x), where the subscript is the capital
X, which denotes the random variable, and
60
00:05:33,540 --> 00:05:37,690
this
is small x, which denotes the dummy variable
61
00:05:37,690 --> 00:05:43,040
or the particular value of that random
variable. So, there are two properties that
62
00:05:43,040 --> 00:05:50,250
it should follow. We know that for all
feasible values of this x the
63
00:05:50,250 --> 00:05:57,230
value of this function should be greater than
or equal to 0. And it should integrate to unity
64
00:05:57,230 --> 00:06:04,080
to satisfy the second axiom: the total
probability over the feasible
65
00:06:04,080 --> 00:06:07,740
range of this random
variable, which is known as the support of
66
00:06:07,740 --> 00:06:18,210
the random variable, should be
equal to 1. So, with this, now we
67
00:06:18,210 --> 00:06:21,030
will go through different probability
distribution functions.
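The two requirements just stated (non-negative everywhere on the support, and unit area over the support) can be checked numerically. The following is a minimal Python sketch and not part of the lecture: the helper name `is_valid_pdf`, the midpoint-rule integration, and the example density f(x) = 2x on [0, 1] are all assumptions chosen only for illustration.

```python
def is_valid_pdf(pdf, lower, upper, steps=100_000, tol=1e-6):
    """Check the two pdf requirements on the support [lower, upper].

    1. pdf(x) >= 0 at every sampled point.
    2. The area under pdf over the support is 1 (midpoint rule).
    """
    width = (upper - lower) / steps
    area = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width  # midpoint of the i-th strip
        value = pdf(x)
        if value < 0:
            return False  # first requirement violated
        area += value * width
    return abs(area - 1.0) < tol  # second requirement


# Hypothetical example density: f(x) = 2x on [0, 1].
print(is_valid_pdf(lambda x: 2 * x, 0.0, 1.0))
# A function whose area is only 0.5 fails the second requirement.
print(is_valid_pdf(lambda x: x, 0.0, 1.0))
```

The check is only as good as the sampling grid, so it is a sanity test rather than a proof of validity.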
69
00:06:22,030 --> 00:06:28,169
And first we will start with the uniform distribution.
A random variable X is a
70
00:06:28,169 --> 00:06:34,889
uniformly distributed random variable if
its probability density function is given
71
00:06:34,889 --> 00:06:41,110
by this
equation, that is f_X(x; alpha, beta), where the subscript is capital
72
00:06:41,110 --> 00:06:45,830
X and the variable is small x; alpha,
beta or
73
00:06:45,830 --> 00:06:51,731
any such symbols shown after x are generally
the parameters of the distribution. So,
74
00:06:51,731 --> 00:06:58,110
here the parameters are alpha and beta. So,
with these parameters the density is
75
00:06:58,110 --> 00:07:06,840
expressed as 1 by (beta minus alpha) for the
range where x lies between alpha and
76
00:07:06,840 --> 00:07:14,410
beta, that is, the lower limit is alpha and the
upper limit is beta; anywhere outside this range
77
00:07:14,410 --> 00:07:21,370
the value of this function is equal to 0.
Now, before we
78
00:07:21,370 --> 00:07:25,620
call
this function a probability density function,
79
00:07:25,620 --> 00:07:30,850
we have to see whether those two
properties are satisfied or not.
80
00:07:30,850 --> 00:07:37,800
So, we see that as beta is greater than
alpha, this quantity is greater than 0
81
00:07:37,800 --> 00:07:42,419
always,
and outside this region it is equal
82
00:07:42,419 --> 00:07:45,330
to 0. So, the first property, that it should
be
83
00:07:45,330 --> 00:07:51,621
greater than or equal to 0, is satisfied. And
obviously, this 1 by (beta minus alpha), if
84
00:07:51,621 --> 00:07:55,320
we now
integrate it from alpha to beta,
85
00:07:55,320 --> 00:07:59,419
gives the constant density times the width of the range, which
is again
86
00:07:59,419 --> 00:08:08,040
beta minus alpha, so this will be equal to
1. So, this is the complete form of the
87
00:08:08,040 --> 00:08:14,240
uniform distribution,
and we know that the
88
00:08:14,240 --> 00:08:20,080
cumulative distribution function will be given
by x minus alpha divided by beta
89
00:08:20,080 --> 00:08:21,539
minus alpha, for x between alpha and beta.
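As an illustration of the density and cumulative distribution function just described, here is a small Python sketch. The function names `uniform_pdf` and `uniform_cdf` and the numbers alpha = 2, beta = 6 are assumptions for this example, not values from the lecture.

```python
def uniform_pdf(x, alpha, beta):
    """Density of the uniform distribution on the closed range [alpha, beta]."""
    if beta <= alpha:
        raise ValueError("beta must be greater than alpha")
    return 1.0 / (beta - alpha) if alpha <= x <= beta else 0.0


def uniform_cdf(x, alpha, beta):
    """Cumulative distribution: area under the density up to x."""
    if beta <= alpha:
        raise ValueError("beta must be greater than alpha")
    if x < alpha:
        return 0.0  # no area accumulated yet
    if x > beta:
        return 1.0  # the whole area has been accumulated
    return (x - alpha) / (beta - alpha)


# Hypothetical limits alpha = 2 and beta = 6.
print(uniform_pdf(3.0, 2.0, 6.0))  # constant density 1 / (6 - 2)
print(uniform_cdf(2.0, 2.0, 6.0))  # 0 at the lower limit
print(uniform_cdf(4.0, 2.0, 6.0))  # halfway along the range
print(uniform_cdf(6.0, 2.0, 6.0))  # 1 at the upper limit
```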
91
00:08:22,539 --> 00:08:28,570
Here the diagram of how this distribution
looks is shown; this is your alpha
92
00:08:28,570 --> 00:08:30,710
limit
and this is your beta limit.
94
00:08:31,710 --> 00:08:41,200
Now, if you just see it once here,
this is what we are calling alpha
95
00:08:41,200 --> 00:08:45,050
and
this is what we are calling beta; and now,
96
00:08:45,050 --> 00:08:49,800
this axis is the value of the function
f(x)
97
00:08:49,800 --> 00:08:57,069
and this is your x. So, here the possible
range is between alpha and beta, and this is
98
00:08:57,069 --> 00:08:59,759
a
closed boundary; closed boundary means that the
99
00:08:59,759 --> 00:09:05,100
less than or equal to sign is shown. So, this
is
100
00:09:05,100 --> 00:09:12,019
a closed boundary; that means it is inclusive
of these two values. So, at any value
101
00:09:12,019 --> 00:09:20,269
from this point to this
point the density, which I have just shown
102
00:09:20,269 --> 00:09:24,630
as an
indicative line, is 1 by (beta minus alpha).
103
00:09:24,630 --> 00:09:30,230
So, from this limit to this limit the density
is
104
00:09:30,230 --> 00:09:36,100
uniform all over this region; that is why the
name is uniform distribution. And just for
105
00:09:36,100 --> 00:09:38,629
reference
we are just marking these two lines
106
00:09:38,629 --> 00:09:44,050
from alpha to beta.
Now, when we are talking about the cumulative
107
00:09:44,050 --> 00:09:51,130
distribution, that means, for any value
x within this range we have to
108
00:09:51,130 --> 00:09:55,579
calculate the total area up to that
point,
109
00:09:55,579 --> 00:10:01,639
as we have seen in the earlier description
of the cdf. So, at this point
110
00:10:01,639 --> 00:10:08,920
I have
to plot the value of the area up to that point
111
00:10:08,920 --> 00:10:18,550
in this diagram. So, when you are talking
about the cumulative distribution function,
112
00:10:18,550 --> 00:10:25,529
each point gives the total area covered
up to that point. So, obviously, the total
113
00:10:25,529 --> 00:10:35,209
area covered at alpha is nothing but 0. So, this
should correspond to this point; and
114
00:10:35,209 --> 00:10:38,529
in this way, if we move this x
up to
115
00:10:38,529 --> 00:10:45,170
beta, obviously, from the second property
the curve should reach the value where the
116
00:10:45,170 --> 00:10:51,089
total area is equal to 1.
So, this value should come as 1, and as the density
117
00:10:51,089 --> 00:10:58,610
is uniform this should be a straight line
starting from 0 at alpha and reaching 1 at beta. So,
118
00:10:58,610 --> 00:11:05,179
this will be your cdf, that is the cumulative
distribution function, for the uniform distribution.
119
00:11:05,179 --> 00:11:14,470
This is shown here; we can see it from
the cumulative distribution function, that
120
00:11:14,470 --> 00:11:17,769
is, x minus alpha by beta minus alpha. Now,
if
121
00:11:17,769 --> 00:11:24,179
you put x equal to alpha here, then
you get 0, and if you
122
00:11:24,179 --> 00:11:27,170
put
x equal to beta, then you get
123
00:11:27,170 --> 00:11:33,950
the value 1, which is shown
in
124
00:11:33,950 --> 00:11:37,499
this diagram as well.
126
00:11:38,499 --> 00:11:48,470
So, this is the density, that is the density
function. Now, the different parameters
127
00:11:48,470 --> 00:11:53,231
that we have discussed in the earlier lectures,
that is the mean, variance and coefficient of skewness: if
128
00:11:53,231 --> 00:11:56,490
we
calculate these for the uniform distribution,
129
00:11:56,490 --> 00:12:02,920
the mean comes out as mu
equal to (alpha plus beta) by 2, which is obvious
130
00:12:02,920 --> 00:12:10,220
from this diagram as well. So, the mean
value, as this distribution is
131
00:12:10,220 --> 00:12:18,439
uniform, should be at the
midpoint of the two limits. So, that midpoint
132
00:12:18,439 --> 00:12:22,800
is nothing but (alpha plus beta) by 2.
133
00:12:22,800 --> 00:12:28,949
Now, before coming to the variance, if I
see the coefficient of skewness,
134
00:12:28,949 --> 00:12:33,769
we know that it
is the measure of the symmetry with
135
00:12:33,769 --> 00:12:41,040
respect to the mean. So,
we see that this distribution is symmetric with respect
136
00:12:41,040 --> 00:12:48,029
to the mean; that is why the skewness will
be 0. So, the mean is given by the
137
00:12:48,029 --> 00:12:53,259
average value of the lower and upper bounds
of the distribution, and the coefficient of
138
00:12:53,259 --> 00:12:59,459
skewness is 0, as the distribution is symmetric
about the mean. And for the variance, in the
139
00:12:59,459 --> 00:13:07,619
equation that we use we take the deviation from
the mean, take its square, then
140
00:13:07,619 --> 00:13:13,829
multiply it by the probability density
function and integrate over the entire support.
141
00:13:13,829 --> 00:13:19,170
Then we get that the variance equals
(beta minus alpha) whole square divided by
142
00:13:19,170 --> 00:13:26,629
12. So, this is your variance and, obviously,
the skewness is 0 as discussed.
144
00:13:27,629 --> 00:13:34,429
Now, for a specific civil engineering
application, let us take the
145
00:13:34,429 --> 00:13:43,430
example of a structural member, which is made
up of a material that has uniform
146
00:13:43,430 --> 00:13:48,810
characteristics all over, and it is subjected
to a particular loading condition. Then that
147
00:13:48,810 --> 00:13:56,110
structural member can fail, which means it
can break or rupture, at a distance x from
148
00:13:56,110 --> 00:14:00,709
one
end. Now, the distribution of that X can be
149
00:14:00,709 --> 00:14:07,569
assumed to be uniform with the support 0 to
L. So, this is the structural
150
00:14:07,569 --> 00:14:11,610
member. Now, the location of the failure from
one
151
00:14:11,610 --> 00:14:20,790
end of that member, that is, if we take that
particular distance from one end as the random
152
00:14:20,790 --> 00:14:28,670
variable, then it can happen anywhere. So,
the distribution of
153
00:14:28,670 --> 00:14:31,360
probability
for this random variable capital
154
00:14:31,360 --> 00:14:36,990
X can be assumed to be uniform, and the density is
equal to 1 by L.
155
00:14:36,990 --> 00:14:42,319
And this is valid from 0 to the entire
length of this structural member. So, that
156
00:14:42,319 --> 00:14:47,059
is
why this is 1 by L, and it is 0 elsewhere.
157
00:14:47,059 --> 00:14:50,589
Now, if we draw its pdf, it looks like this,
that is,
158
00:14:50,589 --> 00:14:58,779
from 0 to L it is uniformly distributed,
with density equal to 1 by L, considering the
159
00:14:58,779 --> 00:15:02,760
fact
that the total area below this
160
00:15:02,760 --> 00:15:11,300
pdf should be equal to 1. Now, this
example can analogously
161
00:15:11,300 --> 00:15:19,079
be extended to other examples, like a
road accident on a highway stretch identified
162
00:15:19,079 --> 00:15:24,889
to be accident prone. Now, if the total
length of this stretch is equal to L, then the
163
00:15:24,889 --> 00:15:31,199
location of the accident can be at any point
over this stretch. So, that also can be
164
00:15:31,199 --> 00:15:32,749
modeled as a uniform distribution. To
start with,
165
00:15:32,749 --> 00:15:42,690
we took up the uniform
distribution, which is easy from
166
00:15:42,690 --> 00:15:45,009
the mathematical point of view.
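To make the failure-location example concrete, the probability that the break falls in a given segment of the member is just the segment length divided by L. The sketch below assumes a hypothetical member of length 4 m and a hypothetical function name `failure_probability`; neither comes from the lecture.

```python
def failure_probability(a, b, length):
    """P(a <= X <= b) when the failure location X is Uniform(0, length)."""
    if length <= 0:
        raise ValueError("length must be positive")
    # Clip the requested segment to the support [0, length].
    lo = min(max(a, 0.0), length)
    hi = min(max(b, 0.0), length)
    return max(hi - lo, 0.0) / length


# Hypothetical 4 m member: chance that it breaks in the middle 1 m.
print(failure_probability(1.5, 2.5, 4.0))
# The whole member carries probability 1.
print(failure_probability(0.0, 4.0, 4.0))
```

The same function would apply to the accident-location example, with the stretch length in place of the member length.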
167
00:15:45,009 --> 00:15:50,730
And we will proceed now to the normal or the
Gaussian distribution. This normal or
168
00:15:50,730 --> 00:15:55,859
Gaussian distribution is one of the most important
distributions in every field, including
169
00:15:55,859 --> 00:16:07,139
civil engineering. This normal distribution
is very useful for many applications in
170
00:16:07,139 --> 00:16:12,059
different research fields.
So, that is why the normal or Gaussian
171
00:16:12,059 --> 00:16:19,959
distribution is the most popular distribution
among all the continuous probability distributions,
172
00:16:19,959 --> 00:16:25,329
and this is a continuous probability
distribution with unbounded support.
173
00:16:25,329 --> 00:16:30,429
Unbounded support means, mathematically it
can take values from minus infinity to
174
00:16:30,429 --> 00:16:36,629
plus infinity, and it is a symmetrical distribution
about the mean. So, these two properties
175
00:16:36,629 --> 00:16:48,019
are there, and this is again a two parameter distribution,
that is, the mean is mu and the
176
00:16:48,019 --> 00:16:52,820
variance is sigma square. For the uniform distribution
also we have shown that there are
177
00:16:52,820 --> 00:16:58,249
two parameters alpha and beta, which are
basically the bounds of the distribution;
178
00:16:58,249 --> 00:17:02,819
here also there are two parameters, one
is mu and the other one is sigma square.
179
00:17:02,819 --> 00:17:08,339
We will see how the distribution looks
and its different properties in the successive slides.
181
00:17:09,339 --> 00:17:18,230
This is the pdf of the normal distribution.
The pdf of a random variable X having a
182
00:17:18,230 --> 00:17:25,929
normal or Gaussian distribution with the parameters
mu and sigma
183
00:17:25,929 --> 00:17:33,580
square is expressed as f_X(x), with
these two parameters, equal to 1 by (square root
184
00:17:33,580 --> 00:17:38,019
of
2 pi) times sigma; remember that this sigma is outside
185
00:17:38,019 --> 00:17:44,029
the square root; if it is inside, obviously
a square will come on it; multiplied by the exponential
186
00:17:44,029 --> 00:17:50,110
of minus half of (x minus mu) by sigma,
whole square. And the support for this x
187
00:17:50,110 --> 00:17:53,340
is from minus infinity to plus infinity, as
just
188
00:17:53,340 --> 00:18:00,909
now discussed. So, this pdf results in a
bell shaped curve, which is symmetric
189
00:18:00,909 --> 00:18:06,400
about the mean. If you see,
this distribution generally
190
00:18:06,400 --> 00:18:10,119
looks
like this.
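The normal density described above can be written out directly. This is a minimal Python sketch; the function name `normal_pdf` and the sample values are assumptions for illustration. Note that sigma multiplies the square root of 2 pi from outside, as stressed in the lecture.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal (Gaussian) density with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    # sigma sits outside the square root of 2*pi.
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Hypothetical values: the density is symmetric about mu and peaks there.
print(normal_pdf(0.0, 0.0, 1.0))
print(normal_pdf(1.5, 1.0, 1.2), normal_pdf(0.5, 1.0, 1.2))
```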
192
00:18:11,119 --> 00:18:22,580
So, this distribution generally looks like
this: it has its mean and, obviously,
193
00:18:22,580 --> 00:18:28,200
the mean, mode and median are the same here, and
this is known as the
194
00:18:28,200 --> 00:18:36,760
bell shaped curve, and it is symmetric with
respect to this mean. And again,
195
00:18:36,760 --> 00:18:44,540
if you see its cumulative distribution function,
then obviously we can say that it starts
196
00:18:44,540 --> 00:18:47,840
from
minus infinity and goes up to plus
197
00:18:47,840 --> 00:18:51,590
infinity. So, if we take any particular
point
198
00:18:51,590 --> 00:19:00,529
here, calculate the total area up to it and plot
that value here, then we get the cdf; that is,
199
00:19:00,529 --> 00:19:12,970
this
is your pdf and this is your cdf, and this
200
00:19:12,970 --> 00:19:19,100
generally rises and approaches
the value
201
00:19:19,100 --> 00:19:29,179
1 at plus infinity, that is, it is asymptotic
to the line 1, and it is asymptotic to
202
00:19:29,179 --> 00:19:33,269
the line
0 at minus infinity.
203
00:19:33,269 --> 00:19:39,889
So, this is how the cumulative distribution
looks and this is how the pdf looks
204
00:19:39,889 --> 00:19:47,510
for a normal or Gaussian distribution. So,
mathematically we know that to get the
205
00:19:47,510 --> 00:19:52,389
cumulative distribution function we have to
integrate from the left end of the support, that
206
00:19:52,389 --> 00:19:57,020
is
minus infinity, up to x, and this
207
00:19:57,020 --> 00:20:00,730
integration from minus infinity to x
of the
208
00:20:00,730 --> 00:20:07,210
pdf
will give you the cdf. Now, this integration
209
00:20:07,210 --> 00:20:10,049
is
difficult, and we will see how this is
210
00:20:10,049 --> 00:20:14,070
overcome through numerical integration
and with the available chart that we will
211
00:20:14,070 --> 00:20:15,150
discuss in a minute.
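The numerical route mentioned above can be sketched in a few lines. This is an illustration under assumptions, not the lecture's method: the integration starts at mu minus 10 sigma instead of minus infinity (the area below that point is negligible), the midpoint rule is used, and the result is compared with the error-function form available through Python's `math.erf`. The function names are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def normal_cdf_numeric(x, mu, sigma, steps=20_000):
    """Area under the pdf up to x by the midpoint rule.

    Assumption: the lower limit minus infinity is replaced by mu - 10*sigma.
    """
    lower = mu - 10 * sigma
    if x <= lower:
        return 0.0
    width = (x - lower) / steps
    return sum(
        normal_pdf(lower + (i + 0.5) * width, mu, sigma) * width
        for i in range(steps)
    )


def normal_cdf_erf(x, mu, sigma):
    """The same cdf expressed through the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


# Parameters of the plotted example: mu = 1, sigma = 1.2.
for x in (-1.0, 1.0, 3.0):
    print(x, normal_cdf_numeric(x, 1.0, 1.2), normal_cdf_erf(x, 1.0, 1.2))
```

At x equal to mu both routes should give 0.5, since half of the area lies on each side of the mean.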
213
00:20:16,150 --> 00:20:21,770
Before that we will show an example
of this bell shaped curve. What
214
00:20:21,770 --> 00:20:26,670
we have
shown is the bell shaped curve; the parameters
215
00:20:26,670 --> 00:20:34,019
for this curve are mu equal
to 1 and sigma equal to 1.2. So, as the
216
00:20:34,019 --> 00:20:37,169
parameter mu is equal to 1, you can see that
the
217
00:20:37,169 --> 00:20:46,570
maximum density is at this point 1.
So, this mu is basically the location parameter,
218
00:20:46,570 --> 00:20:55,309
where the maximum density is located. So,
that is signified by this mu; and sigma
219
00:20:55,309 --> 00:20:57,590
generally
shows the spread about this
220
00:20:57,590 --> 00:21:03,559
mean, and we will show in the
examples how these things affect the shape
221
00:21:03,559 --> 00:21:04,580
of this curve.
222
00:21:04,580 --> 00:21:16,799
And for the same pdf, with mu
equal to 1 and sigma equal to 1.2, if we calculate
223
00:21:16,799 --> 00:21:20,930
the cumulative distribution
function, that is the cdf, then it looks like
224
00:21:20,930 --> 00:21:24,340
this.
And you can see that this line is asymptotic
225
00:21:24,340 --> 00:21:30,600
to 1 at plus infinity, and towards
the left it is asymptotic to 0 towards
226
00:21:30,600 --> 00:21:34,360
minus infinity.
227
00:21:34,360 --> 00:21:43,240
Now, the parameters of this normal distribution:
as we discussed, the normal
228
00:21:43,240 --> 00:21:51,049
distribution is a two parameter distribution.
So, the first parameter is the mean; it is
229
00:21:51,049 --> 00:22:01,070
the
location parameter of the normal distribution
230
00:22:01,070 --> 00:22:06,809
and it is generally denoted by mu. And
the second parameter is the variance, which
231
00:22:06,809 --> 00:22:13,130
is the scale parameter of the normal
distribution, and this is generally denoted
232
00:22:13,130 --> 00:22:19,559
by sigma square. Now, the coefficient of
skewness is again 0, similar to the uniform
233
00:22:19,559 --> 00:22:23,799
distribution that we discussed earlier; this
is 0
234
00:22:23,799 --> 00:22:30,520
because this distribution is also symmetric
about the mean; that is why the skewness is
235
00:22:30,520 --> 00:22:33,099
0.
237
00:22:34,099 --> 00:22:40,460
Now, let us see the effect of a change in the
parameters, as we were saying. The effect of a
238
00:22:40,460 --> 00:22:45,980
change in the parameter sigma on the normal
pdf looks like this. We have plotted
239
00:22:45,980 --> 00:22:52,809
here three different normal distributions,
that is, the pdf of the normal distribution with
240
00:22:52,809 --> 00:23:01,840
different parameters. If you see
the curves,
241
00:23:01,840 --> 00:23:07,039
all
three distributions
242
00:23:07,039 --> 00:23:13,200
shown here have the same mean, mu
equal to 0. So, that is why for all these
243
00:23:13,200 --> 00:23:16,850
distributions you can see that the maximum density
is
244
00:23:16,850 --> 00:23:25,500
concentrated at x equal to 0.
Now, this blue curve has
245
00:23:25,500 --> 00:23:36,289
the sigma value equal to 1.5, whereas this
green one has the sigma value 0.75
246
00:23:36,289 --> 00:23:43,159
and this black one has the sigma value
equal to 1. Now, you see, for the curve with
247
00:23:43,159 --> 00:23:58,159
the sigma value 1.5, which is the maximum
here, while we are keeping the mean
248
00:23:58,159 --> 00:24:05,000
the same for all three distributions, the curve is
more spread out; the spread is more about its mean,
249
00:24:05,000 --> 00:24:08,879
and this is reflected in the value of
the
250
00:24:08,879 --> 00:24:17,820
parameter sigma, which is 1.5. Similarly,
for this green one the sigma value is the
251
00:24:17,820 --> 00:24:25,129
minimum, which is 0.75; that is why
its spread about the mean is the least
252
00:24:25,129 --> 00:24:33,870
among these three distribution curves.
So, this sigma generally controls the spread
253
00:24:33,870 --> 00:24:39,110
about the mean, which is reflected in this
figure.
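The effect of sigma on the spread can also be seen in numbers rather than in the plot. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `peak_and_central_mass` and the choice of a band of half-width 1 around the mean are assumptions made only for this illustration.

```python
from statistics import NormalDist


def peak_and_central_mass(mu, sigma, half_width=1.0):
    """Peak density at the mean and probability within mu +/- half_width."""
    dist = NormalDist(mu, sigma)
    peak = dist.pdf(mu)
    mass = dist.cdf(mu + half_width) - dist.cdf(mu - half_width)
    return peak, mass


# The three sigma values from the plot, all with mean 0.
for sigma in (0.75, 1.0, 1.5):
    peak, mass = peak_and_central_mass(0.0, sigma)
    print(f"sigma={sigma}: peak={peak:.4f}, P(|X| <= 1)={mass:.4f}")
```

A larger sigma gives a lower peak and less probability close to the mean, which is the wider spread seen in the curves.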
255
00:24:40,110 --> 00:24:47,789
Now, the effect of a change in the other parameter
value: the parameter we are now
256
00:24:47,789 --> 00:24:55,440
taking is mu. Now, in these three
plots, what we have kept the same is
257
00:24:55,440 --> 00:25:03,669
sigma; for all three curves sigma
is equal to 1.5. So, as sigma is the
258
00:25:03,669 --> 00:25:09,750
same,
you can see that the spread about
259
00:25:09,750 --> 00:25:14,190
the respective mean is the same for all three
curves. Now, we have changed mu: for the blue
260
00:25:14,190 --> 00:25:20,710
one mu is minus 2, for the black one mu is
0 and for the green one mu is 2. So, you can see
261
00:25:20,710 --> 00:25:26,039
that the curve is simply shifted from one
location to another: the blue one
262
00:25:26,039 --> 00:25:32,019
is centered at minus 2, the black one
is at 0
263
00:25:32,019 --> 00:25:39,029
and the green one is at 2. So,
where it is centered is controlled by
264
00:25:39,029 --> 00:25:43,529
this
parameter mu.
266
00:25:44,529 --> 00:25:53,659
Now, let us take a standard example of the
normal distribution in civil engineering.
267
00:25:53,659 --> 00:25:59,029
One example
is the strength of concrete: if a
268
00:25:59,029 --> 00:26:04,540
number of concrete cubes prepared through
identical methods and cured under
269
00:26:04,540 --> 00:26:11,539
identical circumstances are tested for their
crushing strength, it is observed that their
270
00:26:11,539 --> 00:26:18,799
crushing strength is a normally distributed
random variable. Now, the crushing strength
271
00:26:18,799 --> 00:26:26,120
that is available in at least 95 percent of
the samples is called the characteristic strength
272
00:26:26,120 --> 00:26:28,799
or
the 95 percent dependable strength.
273
00:26:28,799 --> 00:26:35,220
You know that the characteristic
strength of the concrete
274
00:26:35,220 --> 00:26:41,470
is
the strength such that in 95 percent of cases
275
00:26:41,470 --> 00:26:47,009
the strength of the cube exceeds it;
that is generally denoted as the characteristic
276
00:26:47,009 --> 00:26:50,029
strength. Now, how can we say that
in
277
00:26:50,029 --> 00:26:56,740
95 percent of cases it is exceeded? The strength
is generally found to follow a normal
278
00:26:56,740 --> 00:27:01,669
distribution about its mean, and we consider
that strength which should be
279
00:27:01,669 --> 00:27:08,169
exceeded in 95 percent of cases, and designate
that particular strength to be the
280
00:27:08,169 --> 00:27:15,610
characteristic strength of the particular
concrete. So, this is how we define the
281
00:27:15,610 --> 00:27:18,230
characteristic strength of the concrete cube.
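Under the normality assumption, the characteristic strength is the value exceeded with probability 0.95, that is, the 5th percentile of the strength distribution. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library. The mean of 30 MPa and standard deviation of 4 MPa are hypothetical numbers chosen only for illustration, and `characteristic_strength` is an assumed helper name.

```python
from statistics import NormalDist


def characteristic_strength(mean, std_dev, exceedance=0.95):
    """Strength exceeded with the given probability, assuming normality."""
    # The strength exceeded in 95% of cases is the 5th percentile.
    return NormalDist(mean, std_dev).inv_cdf(1.0 - exceedance)


# Hypothetical cube test results: mean 30 MPa, standard deviation 4 MPa.
f_ck = characteristic_strength(30.0, 4.0)
print(round(f_ck, 2))  # roughly the mean minus 1.645 standard deviations
```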
283
00:27:19,230 --> 00:27:27,049
Now, there is a nice property of the
normally distributed random variable, which
284
00:27:27,049 --> 00:27:33,740
is known as the additive property. Suppose a random
variable X is normally distributed with
285
00:27:33,740 --> 00:27:38,529
parameters mu and sigma square (we discussed the
normal distribution having the two parameters
286
00:27:38,529 --> 00:27:46,730
mu and sigma square). Then, if we get another random
variable which is related to the earlier
287
00:27:46,730 --> 00:27:50,200
one,
that is X, through the equation
288
00:27:50,200 --> 00:27:57,830
Y equal to a plus b X, that is, the random
variable is multiplied by b and added to
289
00:27:57,830 --> 00:28:06,009
a, then this Y is also normally distributed;
however, its parameters will change like this:
290
00:28:06,009 --> 00:28:11,330
the first parameter, in place
of
291
00:28:11,330 --> 00:28:18,260
mu, will be a plus b mu, and its second
parameter, in place of sigma square, is equal
292
00:28:18,260 --> 00:28:21,009
to b
square sigma square.
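The parameter change for Y = a + bX can be written as a two-line function and cross-checked against `statistics.NormalDist`, which supports multiplication and addition by a constant. The function name `affine_normal_params` and the numbers mu = 10, sigma = 2, a = 5, b = 3 are assumptions for illustration.

```python
from statistics import NormalDist


def affine_normal_params(mu, sigma, a, b):
    """Mean and variance of Y = a + b*X when X is normal (mu, sigma^2)."""
    mean_y = a + b * mu
    variance_y = (b * sigma) ** 2  # the constant a does not affect the spread
    return mean_y, variance_y


# Hypothetical X with mu = 10 and sigma = 2, and Y = 5 + 3X.
print(affine_normal_params(10.0, 2.0, 5.0, 3.0))

# Cross-check with the standard library's NormalDist arithmetic.
y = NormalDist(10.0, 2.0) * 3.0 + 5.0
print(y.mean, y.variance)
```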
293
00:28:21,009 --> 00:28:27,120
So, while getting the new parameters, what
we are doing is that we are just putting the
294
00:28:27,120 --> 00:28:35,240
mean value, that is the first parameter value,
into the equation and getting the mean of the new
295
00:28:35,240 --> 00:28:40,360
random variable. And when we are talking about
the variance, that is the spread around
296
00:28:40,360 --> 00:28:47,419
the mean, then the constant term that was added
does not affect it; what
297
00:28:47,419 --> 00:28:53,010
affects it is the multiplying coefficient,
and it should be squared. So, that is b square
298
00:28:53,010 --> 00:29:00,470
multiplied by sigma square. Similarly,
if there are n such normally
299
00:29:00,470 --> 00:29:06,549
distributed random variables, and if we can
say that these are independent of each other,
300
00:29:06,549 --> 00:29:14,840
then, if you create a new random variable
Y equal to a plus b 1
301
00:29:14,840 --> 00:29:24,110
multiplied by X 1, plus b 2 multiplied by X 2, and similarly
up to b n multiplied by X n, then this Y
302
00:29:24,110 --> 00:29:35,149
will also be normally distributed. So,
we will just put in the individual means of these
303
00:29:35,149 --> 00:29:42,149
random variables to get the mean of
Y; so, mu Y is equal to a plus the
304
00:29:42,149 --> 00:29:51,519
summation of b i mu i, that is, a plus b 1 mu
1 plus b 2 mu 2, et cetera, up to b n mu n; and for
305
00:29:51,519 --> 00:29:58,549
sigma square of Y, it should be the coefficient
of each random variable squared, times
306
00:29:58,549 --> 00:30:05,660
its individual variance, and their summation
gives the variance of Y. This second property
307
00:30:05,660 --> 00:30:12,149
generally, leading to the central limit theorem
and that we will discuss, in the subsequent
308
00:30:12,149 --> 00:30:19,039
lectures we will see that we can relax the
requirement of this normal distribution for
309
00:30:19,039 --> 00:30:24,260
this
random variable, if these are simply, if they
310
00:30:24,260 --> 00:30:26,929
are independent and identically distributed
itself.
311
00:30:26,929 --> 00:30:32,879
We can say that this Y will have this normal
distribution, which is the result of the
312
00:30:32,879 --> 00:30:37,909
central limit theorem will be discussed later.
And the first property, that
313
00:30:37,909 --> 00:30:45,029
is, Y equal to a plus b
X: this Y can be treated as a function of
314
00:30:45,029 --> 00:30:49,150
X, which
will be discussed in greater detail in the
315
00:30:49,150 --> 00:30:55,929
next module, where we will discuss
functions of random variables. So, while discussing
316
00:30:55,929 --> 00:31:01,320
the functions of random variables,
we will see, if we know the
317
00:31:01,320 --> 00:31:08,720
parameters of one random variable, how to
get the distribution as
318
00:31:08,720 --> 00:31:11,789
well as the parameters
of its
319
00:31:11,789 --> 00:31:19,179
function. So, this normal distribution is
one example that we have shown here; while
320
00:31:19,179 --> 00:31:25,909
discussing the functions of random variables,
this will be discussed in a general way,
321
00:31:25,909 --> 00:31:32,480
irrespective of any particular distribution
of the original random variable.
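The rule for a linear combination of independent normal random variables can be sketched in a few lines of Python. The function name and the example numbers below are illustrative assumptions, not taken from the lecture; the formulas are the ones stated above (mean a plus the sum of b i mu i, variance the sum of b i square sigma i square).

```python
def linear_combination_params(a, coeffs, means, variances):
    """Mean and variance of Y = a + sum(b_i * X_i) for independent normal X_i.

    coeffs    -- the multipliers b_1 ... b_n
    means     -- the means mu_1 ... mu_n of the X_i
    variances -- the variances sigma_1^2 ... sigma_n^2 of the X_i
    """
    if not (len(coeffs) == len(means) == len(variances)):
        raise ValueError("coeffs, means and variances must have the same length")
    # The constant a shifts the mean but does not affect the spread.
    mean_y = a + sum(b * m for b, m in zip(coeffs, means))
    # Each variance is scaled by the square of its coefficient.
    var_y = sum(b * b * v for b, v in zip(coeffs, variances))
    return mean_y, var_y


# Hypothetical example: Y = 1 + 2*X1 - X2
mean_y, var_y = linear_combination_params(1.0, [2.0, -1.0], [3.0, 4.0], [0.25, 1.0])
print(mean_y, var_y)
```

With a single variable (n equal to 1) this reduces to the first property, Y equal to a plus b X.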
323
00:31:33,480 --> 00:31:41,980
Now, another distribution, which is derived
from this normal distribution, is known as
324
00:31:41,980 --> 00:31:48,470
the standard normal distribution. When a normal
distribution
325
00:31:48,470 --> 00:31:55,330
has its mean mu equal
to 0 and variance sigma square equal to 1,
326
00:31:55,330 --> 00:31:59,710
then
this particular normal distribution with these
327
00:31:59,710 --> 00:32:03,049
specific values of the parameters is known
as
328
00:32:03,049 --> 00:32:10,190
the standard normal distribution. So, in the
original form of the distribution that
329
00:32:10,190 --> 00:32:16,029
we
showed earlier,
330
00:32:16,029 --> 00:32:21,860
if you just put mu equal to 0 and sigma
equal to 1, the form of the distribution we
331
00:32:21,860 --> 00:32:28,399
get is the standard normal
distribution; instead of using x, there we are
332
00:32:28,399 --> 00:32:38,320
using z as the dummy variable.
So, this is your standard normal distribution.
333
00:32:38,320 --> 00:32:47,130
To continue with our same notation, as this is
also continuous, instead of p this will
334
00:32:47,130 --> 00:32:50,529
be f, and this is the capital Z and this is
the small
335
00:32:50,529 --> 00:32:56,360
z. So, small z is the specific value and capital Z is
the random variable, whose density
336
00:32:56,360 --> 00:33:00,450
is 1 by
square root of 2 pi times exponential of minus z square
337
00:33:00,450 --> 00:33:06,630
by 2, and obviously z has
support from minus infinity to plus infinity.
338
00:33:06,630 --> 00:33:09,539
So, this is also a normal distribution, which
is
339
00:33:09,539 --> 00:33:16,779
having mean 0 and variance 1. The cumulative
distribution for this standard normal
340
00:33:16,779 --> 00:33:22,409
distribution can again be found by
integrating the density from the left support to the
341
00:33:22,409 --> 00:33:28,340
specific value z, the density being 1 by square root of
2 pi times exponential of minus z square by 2.
342
00:33:28,340 --> 00:33:33,460
So, we integrate from the left support to
the particular value z, and this will also
343
00:33:33,460 --> 00:33:37,240
be the
capital F, for capital Z less than or equal to the
344
00:33:37,240 --> 00:33:42,740
particular value z, which gives you the
cumulative probability distribution. Now,
345
00:33:42,740 --> 00:33:49,399
we should know what the use of this
distribution is. Why we need this specific
346
00:33:49,399 --> 00:33:52,710
distribution with these specific parameters
is that
347
00:33:52,710 --> 00:33:57,779
this integration,
which is important
348
00:33:57,779 --> 00:34:02,580
to
calculate the probability,
349
00:34:02,580 --> 00:34:07,450
cannot be done in closed form. So, we
have to go for some numerical integration,
350
00:34:07,450 --> 00:34:12,720
and the numerically integrated values are
available for this standard normal distribution.
351
00:34:12,720 --> 00:34:18,540
Now, any general normal distribution
we can first convert to the standard normal
352
00:34:18,540 --> 00:34:24,500
distribution and get its desired probability,
and that we will discuss in a minute.
354
00:34:25,500 --> 00:34:31,520
So, before that, this is how the
pdf, the probability density function, looks;
355
00:34:31,520 --> 00:34:37,530
as you know, its mean is equal
to 0 and sigma is equal to 1. So, that is
356
00:34:37,530 --> 00:34:44,020
why
it is centered at 0. Its support
357
00:34:44,020 --> 00:34:46,950
is again from minus infinity to plus
infinity,
358
00:34:46,950 --> 00:34:52,429
but one thing you can see is that most of
the probability is concentrated between
359
00:34:52,429 --> 00:34:55,270
minus 3 and plus 3.
360
00:34:55,270 --> 00:35:04,460
Now, the reason, which we were discussing,
why we need this standard normal
361
00:35:04,460 --> 00:35:11,730
distribution is as follows. If a random variable
X is normally distributed with mean mu
362
00:35:11,730 --> 00:35:22,790
and variance sigma square, then the probability
of X being less than or equal to x is
363
00:35:22,790 --> 00:35:29,750
given by this cumulative distribution. Now,
as the above integration cannot be evaluated
364
00:35:29,750 --> 00:35:35,440
analytically, as we discussed just now, the numerically
computed values are tabulated taking
365
00:35:35,440 --> 00:35:40,690
mu equal to 0 and sigma equal to 1,
that is, for the standard normal distribution.
366
00:35:40,690 --> 00:35:44,380
So, for
the standard normal distribution these values
367
00:35:44,380 --> 00:35:52,330
are numerically computed and listed.
Now, for all other normal distributions with
368
00:35:52,330 --> 00:35:55,800
any values of the parameters, that is, if
mu
369
00:35:55,800 --> 00:36:03,020
is not equal to 0 or sigma square is
not equal to 1, the cumulative probabilities
370
00:36:03,020 --> 00:36:09,750
can be determined by converting x to
its reduced variate. So, how is it converted?
371
00:36:09,750 --> 00:36:13,530
From
this X, that particular random variable,
372
00:36:13,530 --> 00:36:18,950
its mean is deducted, whatever
the value of the mean is, and the result is divided
373
00:36:18,950 --> 00:36:25,230
by sigma. So, we get a new random variable,
which is Z.
375
00:36:26,230 --> 00:36:38,790
A few minutes earlier we saw that if
Y is equal to a plus b X
376
00:36:38,790 --> 00:36:41,350
and if
X is normally distributed, then
377
00:36:41,350 --> 00:36:54,350
the mean of this Y is a plus b mu
and its variance is b square sigma square.
378
00:36:54,350 --> 00:36:57,790
Now, what we are doing here, the conversion,
is
379
00:36:57,790 --> 00:37:05,270
that Z is equal to X minus mu by sigma. If
I write it in that form, it is minus mu by sigma
380
00:37:05,270 --> 00:37:14,290
plus 1 by sigma times X. So, here we have a equal
to minus mu by sigma and b
381
00:37:14,290 --> 00:37:24,390
equal to 1 by sigma. So, the mean of
this Z will be a plus b mu; if I take it from
382
00:37:24,390 --> 00:37:38,550
here, a plus b mu, and put in these values, it is
equal to minus mu by sigma plus
383
00:37:38,550 --> 00:37:41,880
b, which
is 1 by sigma, multiplied by mu, which is
384
00:37:41,880 --> 00:37:48,720
0, and the variance of this Z is b square sigma
385
00:37:48,720 --> 00:37:55,290
square, where b square is 1
by sigma square, so multiplied by sigma square it
386
00:37:55,290 --> 00:38:00,370
equals 1.
So, we have converted it in such a way that
387
00:38:00,370 --> 00:38:05,140
the mean of this new random variable is
0
388
00:38:05,140 --> 00:38:11,990
and the variance of this new random variable
becomes 1. So, through this conversion of
389
00:38:11,990 --> 00:38:16,991
any
normally distributed random variable we can
390
00:38:16,991 --> 00:38:25,341
generate a new random variable,
which is again normally distributed, with mean
391
00:38:25,341 --> 00:38:31,730
0 and standard deviation 1, and this
conversion holds irrespective of the specific
392
00:38:31,730 --> 00:38:36,850
values of mu and sigma that the original random
variable has.
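The conversion to the reduced variate, and the check that it always ends up with mean 0 and variance 1, can be sketched as follows. The function names and the example values of mu and sigma are assumptions made for illustration; the arithmetic is exactly the a plus b mu and b square sigma square rule with a equal to minus mu by sigma and b equal to 1 by sigma.

```python
def reduced_variate(x, mu, sigma):
    """Convert a value x of a normal variable N(mu, sigma^2) to z = (x - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def reduced_variate_params(mu, sigma):
    """Mean and variance of Z = a + b*X with a = -mu/sigma and b = 1/sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    a = -mu / sigma
    b = 1.0 / sigma
    mean_z = a + b * mu            # a + b*mu
    var_z = (b ** 2) * sigma ** 2  # b^2 * sigma^2
    return mean_z, var_z


# Hypothetical parameters: the result does not depend on the values chosen.
print(reduced_variate_params(10.0, 2.0))
print(reduced_variate(13.0, 10.0, 2.0))
```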
394
00:38:37,850 --> 00:38:46,390
So, once we have converted it, we can
use the standard normal distribution chart,
395
00:38:46,390 --> 00:38:53,320
which generally looks like this. So, this
is the pdf of the standard normal distribution,
396
00:38:53,320 --> 00:38:57,070
and
the cumulative values are listed as follows,
397
00:38:57,070 --> 00:39:00,780
that is, for any specific value of z. This
table
398
00:39:00,780 --> 00:39:07,520
basically starts from minus 5 and
goes up to 5. And we
399
00:39:07,520 --> 00:39:15,250
know that effectively
from minus 3 to plus 3 itself most of the
400
00:39:15,250 --> 00:39:23,400
probability is exhausted. So, if I just
zoom in here, then I can see that for the value
401
00:39:23,400 --> 00:39:27,990
when
z is at 0, that is, just
402
00:39:27,990 --> 00:39:33,650
at the mean, we know that the total probability
covered, due to the property of symmetry,
403
00:39:33,650 --> 00:39:37,440
should be equal to 0.5, which is shown
here.
404
00:39:37,440 --> 00:39:44,140
Now, for any other value, suppose
I take 0.21. So, this is your
405
00:39:44,140 --> 00:39:48,620
0.2
and this is the second decimal, giving 0.21. So, up
406
00:39:48,620 --> 00:39:55,760
to z equal to 0.21 the probability
407
00:39:55,760 --> 00:40:03,870
covered is 0.5832. Now, we will see one example of
how to calculate the probability for
408
00:40:03,870 --> 00:40:11,220
any normal distribution.
So, this is how we have to read the standard
409
00:40:11,220 --> 00:40:18,770
normal distribution table, and these tables are generally
available in any standard textbook.
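The tabulated values can also be reproduced numerically. A minimal sketch, assuming Python's standard `math.erf` is an acceptable stand-in for the printed table (the function name `standard_normal_cdf` is chosen here for illustration):

```python
import math


def standard_normal_cdf(z):
    """Cumulative probability P(Z <= z) for the standard normal distribution.

    Uses the identity Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
    """
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# The two values read from the table in the lecture:
print(round(standard_normal_cdf(0.0), 4))   # 0.5
print(round(standard_normal_cdf(0.21), 4))  # 0.5832
```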
411
00:40:19,770 --> 00:40:26,130
So, what we have found
is that from the standard normal
412
00:40:26,130 --> 00:40:33,160
distribution curve it can be observed that
99.74 percent of the area under the curve
413
00:40:33,160 --> 00:40:39,530
falls inside the region bounded by plus or minus
3 sigma. So, for the standard normal
414
00:40:39,530 --> 00:40:46,070
distribution, where sigma equals
1, that is from minus 3 to plus 3. So, this
415
00:40:46,070 --> 00:40:51,740
much probability, which is almost equal to one,
is already covered in that range. This is
416
00:40:51,740 --> 00:40:58,380
particularly important in real life scenarios,
where a random variable may be bounded
417
00:40:58,380 --> 00:41:05,680
by X equal to 0, but can still be considered
to be normally distributed if mu is greater
418
00:41:05,680 --> 00:41:08,020
than
three sigma.
419
00:41:08,020 --> 00:41:13,460
So, in real life we can sometimes come
across the situation that a random
420
00:41:13,460 --> 00:41:22,310
variable is effectively lower bounded
by 0, but if we see that its mean is away
421
00:41:22,310 --> 00:41:26,200
from
the origin by at least three sigma,
422
00:41:26,200 --> 00:41:32,100
then we can check whether it can
also be considered to follow a normal distribution,
423
00:41:32,100 --> 00:41:39,250
as we know that below 0, that is, to the left
of 0, towards negative values,
424
00:41:39,250 --> 00:41:43,510
effectively the probability is 0.
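The three sigma statement can be checked numerically. In the sketch below the helper names are assumptions for illustration. The directly computed area within plus or minus 3 sigma is about 99.73 percent; the 99.74 percent quoted above presumably comes from four-decimal table entries (0.9987 minus 0.0013).

```python
import math


def standard_normal_cdf(z):
    """P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return standard_normal_cdf(k) - standard_normal_cdf(-k)


def prob_below_zero(mu, sigma):
    """Probability mass that a normal model N(mu, sigma^2) places below zero."""
    return standard_normal_cdf((0.0 - mu) / sigma)


print(prob_within(3.0))
# If the mean sits exactly three sigma above zero, the normal model puts
# only a small probability on (physically impossible) negative values.
print(prob_below_zero(3.0, 1.0))
```

The second number is the reason a variable bounded below by 0 can still be modelled as normal when mu is greater than three sigma: the probability the model wrongly assigns to negative values is below about 0.14 percent.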
426
00:41:44,510 --> 00:41:51,510
Now, let us do one small exercise on
how to calculate the probability using the
427
00:41:51,510 --> 00:41:58,360
standard normal distribution. Let us consider
a random variable X that is normally
428
00:41:58,360 --> 00:42:02,450
distributed, and once we say that it is normally
distributed we have to specify its
429
00:42:02,450 --> 00:42:12,320
parameters. So, the mean is equal to 4.35 and
sigma is equal to 0.59. So, now, if we look
430
00:42:12,320 --> 00:42:21,360
for the probability that this random variable
will take a value between 4 and 5, then this
431
00:42:21,360 --> 00:42:27,630
can be calculated from the cumulative standard
normal probability table. How? So, for our
432
00:42:27,630 --> 00:42:37,520
intended probability of X lying between
4 and 5, first what we are
433
00:42:37,520 --> 00:42:44,280
doing is converting the two limits,
that is, 5 and 4,
434
00:42:44,280 --> 00:42:45,780
to
their reduced variates.
435
00:42:45,780 --> 00:42:50,910
So, how do we convert to the reduced variate?
It is the particular value minus the mean, divided
436
00:42:50,910 --> 00:42:59,820
by sigma. So, we reduce the upper
limit to its corresponding reduced variate and the lower
437
00:42:59,820 --> 00:43:03,760
limit
to its corresponding reduced variate. So, this is
438
00:43:03,760 --> 00:43:11,010
5 minus the mean divided by sigma, and again 4 minus the mean divided by sigma. Now,
the first value comes to 1.1 and the second is
439
00:43:11,010 --> 00:43:18,940
minus 0.59. Now, as these are reduced
variates, they have mean equal
440
00:43:18,940 --> 00:43:23,380
to 0 and variance equal to 1. So, this
we
441
00:43:23,380 --> 00:43:33,630
can read from the standard normal distribution table
for 1.1. So, if you refer to this chart, then
442
00:43:33,630 --> 00:43:41,470
we can see the entry for 1.1.
So, this cell, if we read it, is 0.8643.
443
00:43:41,470 --> 00:43:56,110
So, what is meant is that starting from the left support
up to 1.1 the total area covered is 0.8643. So,
444
00:43:56,110 --> 00:44:04,940
the digits are 8 6 4 3, not 8 4 3 8. So,
this
445
00:44:04,940 --> 00:44:10,040
is the value to use, that is,
0.8643. So, this is how we read the probability
446
00:44:10,040 --> 00:44:19,300
for a particular value of the standard normal
variate. And similarly, we can read the
447
00:44:19,300 --> 00:44:27,250
value for the other limit and deduct
that probability to get the answer. So, what
448
00:44:27,250 --> 00:44:31,140
is
actually done here, graphically, is like
449
00:44:31,140 --> 00:44:32,140
this.
450
00:44:32,140 --> 00:44:45,360
So, first of all this is the reduced variate.
So, if this is your 0, then the reduced variate
451
00:44:45,360 --> 00:44:49,460
looks
like this. Now, for 1.1, the value
452
00:44:49,460 --> 00:44:58,270
we get from the table is the total area from
the left support to 1.1. So, this total
453
00:44:58,270 --> 00:45:01,810
area, that is, the total probability, we
get from
454
00:45:01,810 --> 00:45:10,450
the standard normal distribution table, and
the left limit is minus 0.59, maybe
455
00:45:10,450 --> 00:45:16,710
somewhere here. So, what we are
doing is deducting the area up to
456
00:45:16,710 --> 00:45:24,950
this point, so that we
get what is in this area only,
457
00:45:24,950 --> 00:45:26,990
that is,
the total area between these two limits, which we
458
00:45:26,990 --> 00:45:30,380
will get.
So, it is the total area up to the upper limit minus the total
459
00:45:30,380 --> 00:45:37,020
area up to the lower limit that gives the
probability between these two limits. So, this
460
00:45:37,020 --> 00:45:42,390
is exactly what is done here. First of all we have
converted to the reduced variates, one of which is
461
00:45:42,390 --> 00:45:46,340
1.1 and the other is minus 0.59; the limit
5
462
00:45:46,340 --> 00:45:51,750
corresponds to 1.1 and the limit 4 corresponds
to minus 0.59, and these two values are looked up in
463
00:45:51,750 --> 00:45:56,850
the
standard normal distribution table.
465
00:45:57,850 --> 00:46:04,531
So, this graphical representation is shown
here again, that is, the area up to 1.1. So,
466
00:46:04,531 --> 00:46:07,720
the
value that the table indicates is the total
467
00:46:07,720 --> 00:46:10,670
area from minus infinity to this point.
So,
468
00:46:10,670 --> 00:46:16,650
to get only this much
area, we have to deduct the area
469
00:46:16,650 --> 00:46:24,320
up to the point minus 0.59. So,
that area should be deducted to get the area
470
00:46:24,320 --> 00:46:27,790
in
between these two limits.
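The whole exercise can be sketched in code. The parameter values are an assumption: a mean of 4.35 and a standard deviation of 0.59 are the values consistent with the reduced variates 1.1 and minus 0.59 quoted in the lecture. The function names are likewise chosen for illustration.

```python
import math


def standard_normal_cdf(z):
    """P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_interval_probability(lower, upper, mu, sigma):
    """P(lower < X <= upper) for X ~ N(mu, sigma^2), via the reduced variates."""
    z_lower = (lower - mu) / sigma
    z_upper = (upper - mu) / sigma
    # Area up to the upper limit minus area up to the lower limit.
    return standard_normal_cdf(z_upper) - standard_normal_cdf(z_lower)


mu, sigma = 4.35, 0.59  # assumed values, see the note above
z_upper = (5.0 - mu) / sigma
z_lower = (4.0 - mu) / sigma
p = normal_interval_probability(4.0, 5.0, mu, sigma)
print(round(z_upper, 2), round(z_lower, 2))
print(p)
```

The result is roughly 0.59, in line with the table-based calculation 0.8643 minus 0.2776; the small difference comes from rounding the reduced variates to two decimals before the table lookup.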
472
00:46:28,790 --> 00:46:37,820
The next distribution that we are going to discuss
is the log normal distribution, which is also
473
00:46:37,820 --> 00:46:46,550
known as the Galton distribution. Any random variable
X is a log normal random variable if
474
00:46:46,550 --> 00:46:52,410
its probability density function is given
as follows. Again, it has two parameters,
475
00:46:52,410 --> 00:47:01,360
mu and sigma square, and the density is 1 by x
sigma square root of 2 pi times exponential of minus
476
00:47:01,360 --> 00:47:07,180
1 by
2, in brackets, natural log of x minus mu divided
477
00:47:07,180 --> 00:47:17,950
by sigma, whole square. And the limit for
x: the limit shown here is a
478
00:47:17,950 --> 00:47:22,600
mistake; the limit is from 0 to infinity.
So, why
479
00:47:22,600 --> 00:47:27,060
this is important, and what the difference
from the normal distribution is, is that for
480
00:47:27,060 --> 00:47:33,700
the normal distribution x varies from minus
infinity to plus infinity, but the log
481
00:47:33,700 --> 00:47:39,310
normal distribution has support from 0 to infinity.
Basically, what we are doing again is that
482
00:47:39,310 --> 00:47:46,350
we are taking one random variable and
taking its log, and they are related through
483
00:47:46,350 --> 00:47:54,860
this equation, that is, Y is equal to log X.
And now what is shown here is that if the
484
00:47:54,860 --> 00:48:01,420
random variable X is log normally distributed,
then Y, that is log X, is normally
485
00:48:01,420 --> 00:48:08,340
distributed. So, if we come across
any variable that is log normally distributed,
486
00:48:08,340 --> 00:48:12,311
we can take its log and convert it to
the normal distribution, and again we can
487
00:48:12,311 --> 00:48:15,990
use the same procedure that is followed
for
488
00:48:15,990 --> 00:48:22,290
the normal distribution.
Now, as the log
489
00:48:22,290 --> 00:48:28,660
normal distribution is related to the normal
distribution through this functional transformation,
490
00:48:28,660 --> 00:48:34,170
this will again be clear in the next
module, where we discuss functions of
491
00:48:34,170 --> 00:48:39,640
random variables. So, if the normal
distribution of Y
492
00:48:39,640 --> 00:48:43,920
is known, what is the distribution of
X? Through that approach it can be easily
493
00:48:43,920 --> 00:48:50,880
shown that the distribution of X
has this form; this lower limit will be
494
00:48:50,880 --> 00:48:56,030
0, it is not minus infinity. And
again,
495
00:48:56,030 --> 00:49:03,130
the cumulative distribution function is given
by this expression, where this is the
496
00:49:03,130 --> 00:49:09,790
cumulative distribution of the standard normal distribution
evaluated at l n x minus mu by sigma.
497
00:49:09,790 --> 00:49:15,110
Basically, if you look at mu and sigma, which
are the parameters of the log normal
498
00:49:15,110 --> 00:49:24,420
distribution, mu is the mean of the variable
X after taking its log. So, after taking
499
00:49:24,420 --> 00:49:30,490
the log of X, if you calculate the mean, then
that is equal to the parameter mu, and
500
00:49:30,490 --> 00:49:36,849
similarly, if you take the variance, that is
sigma square.
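The density and the cumulative distribution described above can be sketched as follows. The function names are assumptions for illustration; mu and sigma are the mean and standard deviation of log X, as in the lecture, and the cumulative distribution reuses the standard normal one evaluated at l n x minus mu by sigma.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Density of a log normal variable; zero for x <= 0 (support is 0 to infinity)."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(x, mu, sigma):
    """P(X <= x), computed as the standard normal CDF at (ln x - mu) / sigma."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# At x = e^mu the reduced variate is 0, so half the probability lies below it.
print(lognormal_cdf(math.exp(1.0), 1.0, 0.12))
```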
502
00:49:37,849 --> 00:49:44,410
This is how a log normal distribution with
the mean of the log, mu, equal to 1 and the sigma of the log
503
00:49:44,410 --> 00:49:51,500
equal to 0.12 looks. So, it is lower
bounded by 0 and goes up to plus infinity,
504
00:49:51,500 --> 00:49:57,360
which is the support of the log normal distribution.
505
00:49:57,360 --> 00:50:02,360
And similarly, this is the cumulative distribution
function of the log normal
506
00:50:02,360 --> 00:50:03,360
distribution.
508
00:50:04,360 --> 00:50:12,470
Now, the mean of the log normal distribution
is the expectation of X, which can be
509
00:50:12,470 --> 00:50:20,000
shown to be e power, in brackets, mu plus half sigma square;
this is the mean of the variable
510
00:50:20,000 --> 00:50:23,840
X.
And this mu is the mean after taking
511
00:50:23,840 --> 00:50:28,660
the log of x;
if you
512
00:50:28,660 --> 00:50:33,480
calculate the mean of log x, that is the mu which
was discussed. Again, the variance of this
513
00:50:33,480 --> 00:50:39,320
distribution, that is, the variance of X, is equal
to mu x, where the subscript x shows that this
514
00:50:39,320 --> 00:50:42,520
is
the mean of X, squared,
515
00:50:42,520 --> 00:50:48,340
multiplied by, in brackets, exponential of sigma square minus
1. The coefficient of variation is again given
516
00:50:48,340 --> 00:50:53,610
by C v of x, that is, of the random variable
X, equal to the square root of e power sigma
517
00:50:53,610 --> 00:51:00,240
square minus 1. And the coefficient of skewness,
gamma x, is equal to
518
00:51:00,240 --> 00:51:06,350
3 multiplied by C v x plus
C v
519
00:51:06,350 --> 00:51:14,960
x cube. The distribution is positively skewed,
and with a decrease in the coefficient of
520
00:51:14,960 --> 00:51:21,360
variation the skewness also decreases, which
is reflected in this equation.
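These four expressions can be collected into one small function. The name `lognormal_moments` and the example parameters are assumptions for illustration; the formulas are the standard log normal results described above, with mu and sigma being the parameters of log X.

```python
import math


def lognormal_moments(mu, sigma):
    """Mean, variance, coefficient of variation and skewness of a log normal X.

    mu, sigma -- mean and standard deviation of ln X
    """
    mean_x = math.exp(mu + 0.5 * sigma ** 2)              # E[X] = exp(mu + sigma^2 / 2)
    var_x = mean_x ** 2 * (math.exp(sigma ** 2) - 1.0)    # Var[X] = mu_x^2 * (exp(sigma^2) - 1)
    cv_x = math.sqrt(math.exp(sigma ** 2) - 1.0)          # Cv = sqrt(exp(sigma^2) - 1)
    skew_x = 3.0 * cv_x + cv_x ** 3                       # gamma = 3*Cv + Cv^3
    return mean_x, var_x, cv_x, skew_x


# Parameters of the plotted example: mu = 1, sigma = 0.12
print(lognormal_moments(1.0, 0.12))
# A smaller sigma gives a smaller Cv and therefore a smaller (still positive) skewness.
print(lognormal_moments(1.0, 0.06))
```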
522
00:51:22,360 --> 00:51:30,740
Now, for a log normally distributed random
variable X, consider the sample statistics for Y equal
523
00:51:30,740 --> 00:51:37,611
to log X. That is, we have some observations,
some sample data, and we take its log. If
524
00:51:37,611 --> 00:51:41,930
we
take the log, how can we get the mean
525
00:51:41,930 --> 00:51:50,910
of the converted random variable? This is
obtained through this equation: Y bar,
526
00:51:50,910 --> 00:51:54,320
the mean after taking the log, is equal to
half
527
00:51:54,320 --> 00:52:03,970
l n of x bar square divided by 1 plus C v x
square. So, this C v x is the coefficient of
528
00:52:03,970 --> 00:52:10,800
variation of the observations x, and x
bar is the mean of the
529
00:52:10,800 --> 00:52:17,911
observed values, which is squared. The
sample variance of Y equal to log X
530
00:52:17,911 --> 00:52:24,470
is S y square, equal
to l n of 1 plus C v x square. So, this is the coefficient
531
00:52:24,470 --> 00:52:32,470
of variation squared, plus 1; take the log to
get the sample variance of y. This coefficient
532
00:52:32,470 --> 00:52:38,901
of variation, as we know, is the ratio of the
standard deviation of X to the mean
533
00:52:38,901 --> 00:52:39,901
of X.
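The two sample relations can be sketched as a function that takes the sample mean and sample standard deviation of the observed values and returns the statistics of their logs. The function name and the round trip check are assumptions made for illustration, not part of the lecture.

```python
import math


def lognormal_params_from_sample(x_bar, s_x):
    """Return (y_bar, s_y_squared) for Y = ln X from the mean and std deviation of X.

    y_bar = 0.5 * ln( x_bar^2 / (1 + Cv^2) )
    s_y^2 = ln( 1 + Cv^2 ),   with Cv = s_x / x_bar
    """
    if x_bar <= 0.0:
        raise ValueError("x_bar must be positive for a log normal variable")
    cv = s_x / x_bar
    y_bar = 0.5 * math.log(x_bar ** 2 / (1.0 + cv ** 2))
    s_y_sq = math.log(1.0 + cv ** 2)
    return y_bar, s_y_sq


# Round trip: start from mu = 1 and sigma = 0.12, compute the mean and standard
# deviation of X from the moment formulas, and recover mu and sigma^2 again.
mu, sigma = 1.0, 0.12
mean_x = math.exp(mu + 0.5 * sigma ** 2)
std_x = mean_x * math.sqrt(math.exp(sigma ** 2) - 1.0)
print(lognormal_params_from_sample(mean_x, std_x))
```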
535
00:52:43,560 --> 00:52:50,010
In civil engineering, the distribution of annual
river flow data may follow the log normal
536
00:52:50,010 --> 00:52:56,761
distribution. The stream flow values are generally
greater than 0, that is, the lower bound
537
00:52:56,761 --> 00:53:02,480
is 0, and the probability density for extremely
low stream flows is quite small. This is,
538
00:53:02,480 --> 00:53:09,830
obviously, for fairly big rivers,
which generally have some
539
00:53:09,830 --> 00:53:14,630
contribution from ground water flow, or for
snow-fed rivers and the like. So, generally,
540
00:53:14,630 --> 00:53:20,850
in these,
some flow is maintained throughout the year.
541
00:53:20,850 --> 00:53:28,410
So, the probability is low for
extremely low flows; then the probability density
542
00:53:28,410 --> 00:53:35,300
increases with increasing amount
of annual flow for the moderate values
543
00:53:35,300 --> 00:53:40,530
of stream flow, and again it decreases
progressively with further increase in stream flow.
544
00:53:40,530 --> 00:53:46,160
So, the probability of extremely high
stream flow is again very small. So, what
545
00:53:46,160 --> 00:53:49,640
we
can see is that it starts from 0, reaches a
546
00:53:49,640 --> 00:53:51,680
peak, and again comes down, so far
as the
547
00:53:51,680 --> 00:53:57,001
probability density is concerned, over
the range of the annual river flow. So, this
548
00:53:57,001 --> 00:54:00,800
can
follow it; we cannot confirm it, and we cannot
549
00:54:00,800 --> 00:54:05,380
say as a general case that all annual
river flows always follow the log normal distribution.
550
00:54:05,380 --> 00:54:10,400
Even though we cannot say that,
there is a possibility that this may follow
551
00:54:10,400 --> 00:54:13,339
a log normal distribution.
553
00:54:14,339 --> 00:54:23,140
Next, we will discuss the exponential
distribution. It is the probability distribution
554
00:54:23,140 --> 00:54:26,350
that
describes the time between events in an experiment
555
00:54:26,350 --> 00:54:31,350
where the outcomes occur
continuously and independently at a constant
556
00:54:31,350 --> 00:54:37,140
average rate. It is generally used as a
decay function in engineering problems.
558
00:54:38,140 --> 00:54:43,790
So, this distribution is mathematically
very convenient to work with, and in the previous
559
00:54:43,790 --> 00:54:48,000
lectures also we have taken this example
to show the different properties of a
560
00:54:48,000 --> 00:54:53,380
continuous random variable. And we know that
its probability density function looks
561
00:54:53,380 --> 00:54:58,750
like lambda e power minus lambda x for
x greater than or equal to 0, and elsewhere
562
00:54:58,750 --> 00:55:02,120
it
is 0. And here it is a single parameter distribution,
563
00:55:02,120 --> 00:55:06,670
and the parameter is lambda. And
in the earlier lectures also we have
564
00:55:06,670 --> 00:55:13,330
seen that the cumulative distribution function
can be shown to be 1 minus e power minus
565
00:55:13,330 --> 00:55:14,330
lambda x.
566
00:55:14,330 --> 00:55:20,590
And its pdf, that is, the probability
density function, with the particular
567
00:55:20,590 --> 00:55:24,760
value of lambda equal to 1 taken here, looks
like this.
569
00:55:25,760 --> 00:55:31,020
And its cumulative distribution again looks
like this, starting from 0 and going
570
00:55:31,020 --> 00:55:38,120
asymptotically to 1 as x tends to infinity.
571
00:55:38,120 --> 00:55:42,640
And these things also we calculated earlier:
its mean, the expectation of X, is given
572
00:55:42,640 --> 00:55:46,160
by
1 by lambda, and the variance is 1 by lambda square.
573
00:55:46,160 --> 00:55:51,970
Its skewness can be shown to be equal to
2, and the coefficient of variation was also shown
574
00:55:51,970 --> 00:56:00,250
earlier, and it is equal to 1; in an earlier
lecture we covered this distribution as an example.
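The exponential distribution's density, cumulative distribution and summary statistics can be sketched as follows. The function names are assumptions for illustration, and `lam` stands for the single parameter lambda.

```python
import math


def exponential_pdf(x, lam):
    """Density lambda * exp(-lambda * x) for x >= 0, and 0 elsewhere."""
    return lam * math.exp(-lam * x) if x >= 0.0 else 0.0


def exponential_cdf(x, lam):
    """Cumulative probability 1 - exp(-lambda * x) for x >= 0, and 0 elsewhere."""
    return 1.0 - math.exp(-lam * x) if x >= 0.0 else 0.0


def exponential_summary(lam):
    """Mean 1/lambda, variance 1/lambda^2 and coefficient of variation (always 1)."""
    mean = 1.0 / lam
    variance = 1.0 / lam ** 2
    cv = math.sqrt(variance) / mean
    return mean, variance, cv


# lambda = 1 is the value used for the plots in the lecture.
print(exponential_pdf(0.0, 1.0), exponential_cdf(0.0, 1.0))
print(exponential_summary(2.0))
```

The density is highest at x equal to 0 and the cumulative distribution approaches 1 only asymptotically, matching the two plots described above.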
576
00:56:01,250 --> 00:56:06,670
So, if we take the daily rainfall depth,
the probability density is highest for
577
00:56:06,670 --> 00:56:09,210
zero
rainfall; we know that, if
578
00:56:09,210 --> 00:56:11,220
most of the days are dry days,
then
579
00:56:11,220 --> 00:56:15,490
the maximum probability is concentrated
at zero. The probability density becomes
580
00:56:15,490 --> 00:56:19,920
progressively smaller for higher values
of the rainfall depth, and it is very small
581
00:56:19,920 --> 00:56:25,810
for
extremely high rainfall depths. Thus,
582
00:56:25,810 --> 00:56:28,170
again, not that it must,
but
583
00:56:28,170 --> 00:56:35,500
the daily rainfall depth may follow
an exponential distribution, and with this
584
00:56:35,500 --> 00:56:39,800
we
stop the exponential distribution here.
585
00:56:39,800 --> 00:56:45,760
So, in this lecture we covered the uniform
distribution, normal distribution, log normal
586
00:56:45,760 --> 00:56:49,330
distribution and exponential distribution.
There are some more distributions that are also
587
00:56:49,330 --> 00:56:55,090
important, and those we will cover in the next
class; then we will go on to the next
588
00:56:55,090 --> 00:57:01,890
module, and in successive modules the application
of these distributions to different
589
00:57:01,890 --> 00:57:06,420
civil engineering problems
will be explained. Thank you.