1
00:00:00,110 --> 00:00:26,790
Today we will discuss hypothesis testing.
2
00:00:26,790 --> 00:00:43,250
So, what is a hypothesis? Do you have any idea
about hypotheses? If I ask you, what is a hypothesis?
3
00:00:43,250 --> 00:01:16,200
A hypothesis is a statement that is yet to be
proven. Usually in statistics we frame hypotheses
4
00:01:16,200 --> 00:01:26,210
concerning a population parameter, that is, hypotheses
about a population parameter.
5
00:01:26,210 --> 00:01:37,229
So, if you read the history of science, you
will find any number of hypotheses.
6
00:01:37,229 --> 00:01:47,330
A huge number of hypotheses have been
framed by scientists and proven by experiment
7
00:01:47,330 --> 00:02:02,600
or by some other means. Our discussion
is limited to statistical hypothesis testing,
8
00:02:02,600 --> 00:02:11,620
and before going into detail of this let us
see the content of today’s lecture. We will
9
00:02:11,620 --> 00:02:18,319
discuss hypothesis testing for a single
population mean, for a single population variance,
10
00:02:18,319 --> 00:02:25,470
for equality of two population means, and for equality of two population variances.
11
00:02:25,470 --> 00:02:35,810
So, coming back to hypothesis testing:
I have said that a hypothesis is a statement that
12
00:02:35,810 --> 00:02:41,190
is yet to be proven, and we will
consider two types of hypothesis. One is H
13
00:02:41,190 --> 00:02:53,870
0, which is known as the null hypothesis, and the other
one is H 1 or H a, which is known as the alternative
14
00:02:53,870 --> 00:03:06,260
hypothesis. The null hypothesis is an assertion
about a population parameter, and we believe
15
00:03:06,260 --> 00:03:12,260
it unless it is proven statistically otherwise.
The alternative hypothesis is the negation of
16
00:03:12,260 --> 00:03:27,420
H 0. Let us see one example here.
17
00:03:27,420 --> 00:03:33,830
The manufacturer of a mobile handset claims
that the mean recharge period for the battery
18
00:03:33,830 --> 00:03:42,840
of its newly launched mobile set is 7 days
beyond which it has to be recharged. As a
19
00:03:42,840 --> 00:03:50,010
busy person travelling frequently, Mr. R found it interesting, but he wanted to be assured
20
00:03:50,010 --> 00:03:58,659
whether the claim is true or false.
So, here our null hypothesis is that the mean recharge
21
00:03:58,659 --> 00:04:05,680
period is 7 days, where the mean is the population mean.
We are basically writing it like this:
22
00:04:05,680 --> 00:04:17,309
null hypothesis, mu is equal to mu 0. In this
case mu 0 is 7, and the alternative hypothesis
23
00:04:17,309 --> 00:04:30,969
is that mu is not equal to mu 0, that means not
equal to 7. If this is the case, as I told you,
24
00:04:30,969 --> 00:04:41,029
we will believe the null hypothesis so
long as we are not able to prove that it is wrong.
25
00:04:41,029 --> 00:04:50,300
You have to prove it statistically. So,
what are the steps?
26
00:04:50,300 --> 00:05:03,889
The steps are: first identify the null hypothesis
and the alternative hypothesis. Then you
27
00:05:03,889 --> 00:05:09,919
find out the appropriate sample statistic.
Then obtain the sampling distribution of that statistic,
28
00:05:09,919 --> 00:05:16,409
say theta, when the null hypothesis
is true. Please keep in mind that this is a
29
00:05:16,409 --> 00:05:24,099
very important concept: it is when the null hypothesis
is true that you are finding the distribution
30
00:05:24,099 --> 00:05:28,860
of theta, that is, the sample statistic. What
do you do?
31
00:05:28,860 --> 00:05:43,369
You collect a sample, say of size
n, and you compute the statistic from the sample.
32
00:05:43,369 --> 00:05:49,729
What are we trying to say here? I told you
that the statistic is theta;
33
00:05:49,729 --> 00:05:54,969
instead of theta cap we have written only theta. For
example, this theta, or theta
34
00:05:54,969 --> 00:06:01,389
cap, will be x bar, which is basically the estimate of mu.
35
00:06:01,389 --> 00:06:14,339
Now, this x bar is a random variable. What
do you require to know? You require to frame
36
00:06:14,339 --> 00:06:20,289
an appropriate statistic. If it is x bar,
fine; if it is not x bar but something else, that
37
00:06:20,289 --> 00:06:26,309
statistic you find out. For example, when we
talk about x bar, we usually frame z, that
38
00:06:26,309 --> 00:06:40,139
is, x bar minus the expected value of x bar, divided by
sigma by root n. That is what we usually frame here.
39
00:06:40,139 --> 00:06:49,550
Our null hypothesis is H 0: mu equal to mu
0. We all know that the expected value of x bar
40
00:06:49,550 --> 00:06:58,179
is mu, so we write x bar minus mu by sigma by root n.
Now, from H 0 we
41
00:06:58,179 --> 00:07:04,179
have seen that mu is equal to mu 0. So,
that means you can write this as x bar minus mu 0 by
42
00:07:04,179 --> 00:07:18,550
sigma by root n. This is my statistic,
for which you require to know the distribution.
43
00:07:18,550 --> 00:07:26,020
That is what I said here: obtain the sampling
distribution of theta when H 0 is true. That
44
00:07:26,020 --> 00:07:31,309
means the sampling distribution of
your x bar, which
45
00:07:31,309 --> 00:07:36,439
we are converting into z, writing
mu 0 instead of mu.
46
00:07:36,439 --> 00:07:48,319
As we are assuming that our null hypothesis
is true, what will you do then? You find
47
00:07:48,319 --> 00:07:56,929
out the critical value. What do we mean by
critical value? In this case, if the statistic is z distributed,
48
00:07:56,929 --> 00:08:06,619
my distribution will be like this. This is
z, and this
49
00:08:06,619 --> 00:08:14,020
point is z equal to 0.
Now, this quantity, x bar minus mu 0 by sigma
50
00:08:14,020 --> 00:08:19,849
by root n, follows the z distribution.
All values here
51
00:08:19,849 --> 00:08:28,309
are possible, but if the computed value is sufficiently far from
the mean value of z, then
52
00:08:28,309 --> 00:08:33,729
we will conclude that mu is not equal to mu 0.
53
00:08:33,729 --> 00:08:39,370
So, that is why we will frame critical values
on both sides. On this side, sufficiently away, this
54
00:08:39,370 --> 00:08:47,110
area is your alpha by 2, and on this side, sufficiently
away, this is also your alpha by 2.
55
00:08:47,110 --> 00:08:55,310
What I mean to say is this: take the computed statistic,
which is this one. If this
56
00:08:55,310 --> 00:09:02,730
value falls in the right hand rejection region
or the left hand rejection region, then we conclude
57
00:09:02,730 --> 00:09:13,440
that H 0 is rejected, not true. That means
H 1 can be accepted, that is, mu is not equal
58
00:09:13,440 --> 00:09:21,290
to mu 0. Then, computationally, what will you
do first? You collect data, as you have
59
00:09:21,290 --> 00:09:25,949
done in this particular case. How will we
proceed?
60
00:09:25,949 --> 00:09:34,990
You will proceed like this: you collect data
for x, which will be x 1, x 2, up to x n. Then
61
00:09:34,990 --> 00:09:45,529
you will compute x bar, which is 1 by n times the sum of
x i for i equal to 1 to n. Then you will compute the statistic
62
00:09:45,529 --> 00:09:52,670
which we are talking about, z, that is, x bar minus
mu 0 by sigma by root n.
63
00:09:52,670 --> 00:10:00,519
Now, please remember that sigma will
be given. If sigma is not given, then sigma
64
00:10:00,519 --> 00:10:11,029
cap will be s, and in that case, if n is sufficiently
large, you will still be using the z distribution
65
00:10:11,029 --> 00:10:20,160
like this: x bar minus mu 0 by s by root
n. This is the computed z, basically
66
00:10:20,160 --> 00:10:28,220
z computed from the data. Now, you have to choose alpha.
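The computation of z described above can be sketched in a few lines of Python. This is only an illustration: the function name `z_statistic`, the recharge-period data and the value of sigma are all hypothetical and do not come from the lecture.

```python
from math import sqrt
from statistics import mean

def z_statistic(sample, mu0, sigma):
    """Return z = (x_bar - mu0) / (sigma / sqrt(n)) for the given sample."""
    n = len(sample)
    x_bar = mean(sample)  # x bar = (1/n) * sum of x_i
    return (x_bar - mu0) / (sigma / sqrt(n))

# Hypothetical recharge periods in days (made-up data, not from the lecture)
recharge_days = [6.5, 7.2, 6.8, 7.0, 6.4, 6.9, 7.1, 6.6, 6.7]

# mu0 = 7 is the manufacturer's claim; sigma = 0.3 is an assumed known value
z = z_statistic(recharge_days, mu0=7.0, sigma=0.3)
print(round(z, 2))
```

If sigma is not given and n is large, the same function can be called with the sample standard deviation s in place of sigma.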
67
00:10:28,220 --> 00:10:34,250
So, your next step is to choose alpha. You have
seen what this error is that you are going
68
00:10:34,250 --> 00:10:41,579
to tolerate. Say alpha is 0.05; this
is also known as the level of significance.
69
00:10:41,579 --> 00:10:49,420
If we say alpha equal to 0.05, then
your rejection
70
00:10:49,420 --> 00:10:57,180
region will be like this. Suppose this
area is alpha by 2, and on this side this is also
71
00:10:57,180 --> 00:11:14,120
alpha by 2; then this is your rejection region. On this side too, this one is a rejection region, and in between is the
72
00:11:14,360 --> 00:11:27,880
acceptance region.
So, you have your distribution and you know
73
00:11:27,889 --> 00:11:36,480
the level of error you are willing to
tolerate. Based on this,
74
00:11:36,480 --> 00:11:45,250
you will find out an interval, which is the acceptance
interval. As it is a two tailed
75
00:11:45,250 --> 00:11:50,540
case, on either the right hand or the left hand side the
statistic needs to go beyond the critical value, that
76
00:11:50,540 --> 00:11:56,870
is, z alpha by 2 on the right hand side and minus
z alpha by 2 on the left side. When it goes beyond
77
00:11:56,870 --> 00:12:08,499
this, you will reject the null hypothesis.
So, what do you do? You have already computed
78
00:12:08,499 --> 00:12:17,670
z, and from the table you are getting z alpha
by 2. It is the mod value that we have to
79
00:12:17,670 --> 00:12:28,290
take. If z computed is greater
than the absolute value of z alpha by 2, then
80
00:12:28,290 --> 00:12:35,870
you will fall
on this side or this side. But z computed
81
00:12:35,870 --> 00:12:42,939
may be negative also, so change it a
little bit: take the mod of z computed, so that it
82
00:12:42,939 --> 00:12:48,519
also comes on this side. I will take mod
on both sides.
83
00:12:48,519 --> 00:12:55,139
So, this point is my zero value. If the
z value is on this side it will be positive,
84
00:12:55,139 --> 00:13:00,499
and on this side it will be negative. You
will be reading the z
85
00:13:00,499 --> 00:13:06,569
alpha by 2 value from the table. Usually this one will be a positive value, so you may
86
00:13:06,569 --> 00:13:13,079
not take mod here, no problem.
So, if the absolute value is more than
87
00:13:13,079 --> 00:13:21,009
z alpha by 2 (more than or equal, you can write),
then what I am saying here is
88
00:13:21,009 --> 00:13:30,189
that it falls here or it falls here,
so it is sufficiently away from the mean
89
00:13:30,189 --> 00:13:43,649
value. So, H 0: mu equal to mu 0 can be
rejected. If I reject H 0, that means I am
90
00:13:43,649 --> 00:13:56,439
accepting H 1. Alternatively, I can say I accept
H 1, not H 0; H 0 you are rejecting.
91
00:13:56,439 --> 00:14:02,589
So, you will either be able to reject
H 0, or, suppose your z value comes
92
00:14:02,589 --> 00:14:19,149
here, that is, within this acceptance zone,
then you will say you failed to reject H 0. So,
93
00:14:19,149 --> 00:14:29,680
this is the procedure. The procedure says
first you must know what the problem is,
94
00:14:29,680 --> 00:14:35,670
and then you find out the appropriate variable
from the population point of view.
95
00:14:35,670 --> 00:14:42,180
You collect a sample, and from that sample you generate
the appropriate statistic. You create the
96
00:14:42,180 --> 00:14:47,949
null hypothesis and the alternative hypothesis
for the population parameter of interest.
97
00:14:47,949 --> 00:14:59,100
Then, using the statistic as well as its distribution,
you compare the computed value of the
98
00:14:59,100 --> 00:15:04,680
statistic with the critical value
from the sampling distribution of that statistic.
99
00:15:04,680 --> 00:15:10,329
When you compare, if you find that
the absolute value
100
00:15:10,329 --> 00:15:17,870
of the computed statistic is more than the
tabulated value, then you will reject the null
101
00:15:17,870 --> 00:15:29,209
hypothesis; otherwise you say you failed to reject
the null hypothesis, okay?
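The whole two tailed procedure summarised above can be put together as a minimal sketch, assuming the z distribution applies. Here the critical value is taken from Python's `statistics.NormalDist` instead of a printed z table; the function name `two_tailed_z_test` and the numbers in the example call are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Test H0: mu = mu0 against H1: mu != mu0 using the z statistic."""
    z_computed = (x_bar - mu0) / (sigma / sqrt(n))
    # z_{alpha/2}: the point with area alpha/2 to its right under N(0, 1)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Reject H0 when |z computed| is at or beyond the critical value
    reject = abs(z_computed) >= z_critical
    return z_computed, z_critical, reject

# Hypothetical numbers: sample mean 6.8 days from n = 36, assumed sigma = 0.6
z_c, z_crit, reject = two_tailed_z_test(x_bar=6.8, mu0=7.0, sigma=0.6, n=36)
print("reject H0" if reject else "fail to reject H0")
```

Taking the absolute value of the computed statistic covers both rejection regions with a single comparison, which is the "mod on both sides" point made earlier.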
102
00:15:29,209 --> 00:15:36,699
You see this figure, and I know that last class
you have seen the same thing. What we
103
00:15:36,699 --> 00:15:42,189
are talking about here is whether the population
is normal or non-normal, whether sigma is known
104
00:15:42,189 --> 00:15:48,360
or unknown, and the condition of the
sample, that means the size of the sample,
105
00:15:48,360 --> 00:15:54,619
whether it is large or small. Depending
on all this, we have seen under interval
106
00:15:54,619 --> 00:16:01,540
estimation that your statistic
may follow the z distribution, may follow the t distribution,
107
00:16:01,540 --> 00:16:06,379
or we may not get any parametric distribution.
108
00:16:06,379 --> 00:16:13,430
So, this one, x bar minus mu by sigma by root n, is for the first case. The second
109
00:16:13,430 --> 00:16:21,509
case is the same. In the third case your sigma is replaced
by s. In the fourth case also sigma is replaced by
110
00:16:21,509 --> 00:16:26,869
s, but the sample size is small, so that is why it
is the t distribution, okay?
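The figure itself is not reproduced in this transcript, so the following is only a sketch of the commonly used textbook rule for choosing the reference distribution, not necessarily the exact chart shown in class. The function name `reference_distribution` is hypothetical, and the threshold of 30 for a "large" sample is the figure mentioned later in this lecture.

```python
def reference_distribution(population_normal, sigma_known, n, large_n=30):
    """Return 'z', 't' or None (no parametric distribution) for testing a mean.

    Assumed rule of thumb, not taken verbatim from the lecture's figure.
    """
    large = n >= large_n
    if sigma_known and (population_normal or large):
        return "z"   # (x_bar - mu0) / (sigma / sqrt(n))
    if not sigma_known and large:
        return "z"   # sigma replaced by s, large sample
    if not sigma_known and population_normal:
        return "t"   # sigma replaced by s, small sample, n - 1 degrees of freedom
    return None      # small sample from a non-normal population

print(reference_distribution(population_normal=True, sigma_known=False, n=12))
```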
111
00:16:26,869 --> 00:16:34,059
So, if you use the t distribution, it is
the same thing: you will calculate the statistic,
112
00:16:34,059 --> 00:16:48,199
that is, t computed will be your x bar minus
mu 0 by s by root n, and then you will be finding
113
00:16:48,199 --> 00:16:58,199
out t tabulated. For t tabulated,
you are required to know the degrees of freedom,
114
00:16:58,199 --> 00:17:08,230
n minus 1, and alpha by 2, because the t distribution
is also two tailed and your
115
00:17:08,230 --> 00:17:13,870
hypothesis is also a two tailed hypothesis. The
way you have created the hypothesis
116
00:17:13,870 --> 00:17:22,970
is like this: H 0, mu equal
to mu 0, and H 1, mu not equal to mu 0. That
117
00:17:22,970 --> 00:17:31,080
means both sides are open for you.
So, if it goes this side or this side, beyond
118
00:17:31,080 --> 00:17:37,440
this level, you are not accepting the null
hypothesis, that is, you are rejecting the null hypothesis.
119
00:17:37,440 --> 00:17:48,779
So, this is your t distribution, and the mean
value will definitely be 0. Here is one important
120
00:17:48,779 --> 00:17:55,340
point: when do you go for a two tailed test,
and when do you go for a one tailed test?
121
00:17:55,340 --> 00:18:07,220
Sir for mean distribution, means
only positive values are there.
122
00:18:07,220 --> 00:18:18,990
Is it true that for the mean you will
always go for two tailed, or is there
123
00:18:18,990 --> 00:18:29,860
something different? Suppose, here in this
example, we take the recharge period; that
124
00:18:29,860 --> 00:18:40,070
is our example, recharge period. As a user,
what do you want? The more the recharge
125
00:18:40,070 --> 00:18:45,740
period, the better.
Now, for recharge period, suppose you test like
126
00:18:45,740 --> 00:18:55,279
this: H 0, mu equal to mu 0, and H 1, mu
not equal to mu 0. Suppose your computed
127
00:18:55,279 --> 00:19:05,520
value, z value or t value, comes here. Suppose
t computed is here. Now, this axis runs
128
00:19:05,520 --> 00:19:14,399
from smallest to largest. So, the manufacturer
is claiming that the mean value is here; that
129
00:19:14,399 --> 00:19:21,120
is the manufacturer's claim. He is saying that
this is the mean value.
130
00:19:21,120 --> 00:19:28,710
Now, you are testing whether the manufacturer's
claim is true or false. You have collected
131
00:19:28,710 --> 00:19:35,399
data and you have found t computed or
z computed, depending on the distribution.
132
00:19:35,399 --> 00:19:40,240
In this case, we are talking about the t
distribution, and it is falling here. So, the null
133
00:19:40,240 --> 00:19:46,100
hypothesis is rejected, fine; that means the alternative
hypothesis is accepted.
134
00:19:46,100 --> 00:19:59,399
So, in this case do we go for this mobile set?
We will not, because the mean value is much
135
00:19:59,399 --> 00:20:07,120
lower than claimed. What will be in your favour?
It is in your favour if the value lies the other
136
00:20:07,120 --> 00:20:14,630
way; that means if the computed value
comes here, that is your favourable case. You
137
00:20:14,630 --> 00:20:19,370
are not rejecting so long as it is in between; this
is nothing but what the manufacturer
138
00:20:19,370 --> 00:20:25,960
is claiming. If it comes here, on this side, it is
more than that.
139
00:20:25,960 --> 00:20:33,110
So, you may be interested in creating the alternative
hypothesis like this: mu greater than mu 0.
140
00:20:33,110 --> 00:20:40,890
You may not be interested in testing mu
not equal to mu 0; you may be interested in
141
00:20:40,890 --> 00:20:47,559
testing mu greater than mu 0. That is, the manufacturer
is claiming it is 7 days; fine, 7 days is good.
142
00:20:47,559 --> 00:20:53,149
You may be happy, but you are thinking that
if it is more than 7 days, that is better.
143
00:20:53,149 --> 00:21:01,789
So, in this case, if I go
by this two tailed
144
00:21:01,789 --> 00:21:10,120
alternative hypothesis,
you may reject H 0 but it may not go in your
145
00:21:10,120 --> 00:21:18,590
favour. So, as a result, what is said is that
it is always better if you proceed like this:
146
00:21:18,590 --> 00:21:24,200
first you understand what the variable is.
147
00:21:24,200 --> 00:21:34,010
Suppose x is your random variable. What
type of variable is it: lower the better,
148
00:21:34,010 --> 00:21:49,820
higher the better, or nominal
the best, getting me? For example, suppose
149
00:21:49,820 --> 00:22:00,269
the variable is the percentage of sulphur
in steel. We do not want
150
00:22:00,269 --> 00:22:09,019
more sulphur content; ideally, zero sulphur.
So, it is lower the better. In this case, what do
151
00:22:09,019 --> 00:22:17,059
you want? Suppose this is the sulphur content.
Ultimately you will be looking at the
152
00:22:17,059 --> 00:22:22,860
lower side. That means you will set
a value that is the maximum value you
153
00:22:22,860 --> 00:22:31,760
want for your acceptance region. Your zone of
working will be this region; you do
154
00:22:31,760 --> 00:22:37,799
not want a higher one.
Now, if you plot it, what will happen? You may
155
00:22:37,799 --> 00:22:47,090
find a distribution like this. Let it be
like this. So, you will go for the left hand
156
00:22:47,090 --> 00:22:55,659
side; for the higher the better case it will be just
the reverse. You will fix a value somewhere
157
00:22:55,659 --> 00:23:04,980
here, and you want this side. Now, you can
fix it here or somewhere here, but ultimately,
158
00:23:04,980 --> 00:23:13,279
you see, for the lower the better and higher the better
cases, the one tailed test is better. Only
159
00:23:13,279 --> 00:23:20,500
when nominal is the best, meaning there is a
target and if you deviate from the target
160
00:23:20,500 --> 00:23:27,779
on this side
or this side it is not desirable, then
161
00:23:27,779 --> 00:23:38,140
your two tailed hypothesis is better.
So, for both of these cases it is one tailed,
162
00:23:38,140 --> 00:23:53,899
and for this one two tailed, getting me? When
you frame the null hypothesis,
163
00:23:53,899 --> 00:23:59,679
there is absolutely no problem, because
what you are doing is basically stating
164
00:23:59,679 --> 00:24:06,919
some value in that hypothesis statement.
165
00:24:06,919 --> 00:24:11,889
Usually we say it is something like this
in statistical hypothesis testing. But what
166
00:24:11,889 --> 00:24:17,549
will be your alternative hypothesis?
Now, if you use a one tailed hypothesis, what
167
00:24:17,549 --> 00:24:27,440
will happen to the critical value?
Consider what you have taken in the two tailed case. Suppose
168
00:24:27,440 --> 00:24:34,080
this is the two tailed case. If I say this
is my left side and this is my right side, on the right you have
169
00:24:34,080 --> 00:24:41,840
taken z alpha by 2, or you have taken t n minus
1 alpha by 2. On the left side it is either minus
170
00:24:41,840 --> 00:24:50,919
z alpha by 2 or minus t n minus 1 alpha by
2. But when it is one tailed, suppose so long as
171
00:24:50,919 --> 00:24:57,289
it is within this you are accepting H 0, and when
it goes to this side you are rejecting H
172
00:24:57,289 --> 00:25:03,360
0; then it is one tailed.
In that case this will be z alpha, if the z distribution
173
00:25:03,360 --> 00:25:13,389
is applicable, or it will be t n minus 1 alpha.
On the other side also, if you decide that
174
00:25:13,389 --> 00:25:25,679
the rejection region is on this side only, this area is also
alpha, getting me? So, then this is minus
175
00:25:25,679 --> 00:25:38,059
t n minus 1 alpha, and here this area
is alpha. Now, there is a relationship between
176
00:25:38,059 --> 00:25:44,799
hypothesis testing and the confidence interval. See,
if this is alpha, then this side is 100 into
177
00:25:44,799 --> 00:25:52,960
1 minus alpha percent. Similarly here, what
is happening? This is the 100 into 1 minus alpha percent
178
00:25:52,960 --> 00:26:01,299
interval; that is why you are taking alpha by 2 and alpha by
2, okay? So, this is our
179
00:26:01,299 --> 00:26:07,779
hypothesis testing for a single population
mean.
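To make the difference between the one tailed and two tailed critical values concrete, here is a small sketch for the z case only. It uses `statistics.NormalDist` from the Python standard library as a stand-in for the z table; the helper name `z_critical_values` is hypothetical, and the t case is omitted because the standard library has no t distribution.

```python
from statistics import NormalDist

def z_critical_values(alpha=0.05):
    """Return the z critical values for two tailed and one tailed tests."""
    std = NormalDist()  # standard normal, mean 0 and standard deviation 1
    z_half = std.inv_cdf(1 - alpha / 2)
    return {
        # alpha is split equally, alpha/2 in each tail
        "two_tailed": (-z_half, z_half),
        # the whole of alpha sits in the right tail (H1: mu > mu0)
        "right_tailed": std.inv_cdf(1 - alpha),
        # the whole of alpha sits in the left tail (H1: mu < mu0)
        "left_tailed": std.inv_cdf(alpha),
    }

for name, value in z_critical_values(0.05).items():
    print(name, value)
```

For the same alpha, the one tailed critical value is closer to zero than the two tailed one, because the full alpha is placed in a single tail.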
180
00:26:07,779 --> 00:26:14,200
This is one example, which we described last class.
We said that MSD is a problem with industrial
181
00:26:14,200 --> 00:26:20,440
workers, particularly the crane operators
in heavy industry. Now, a random sample of
182
00:26:20,440 --> 00:26:27,630
76 responses yielded a mean of 7 and standard
deviation of 4. Population standard deviation
183
00:26:27,630 --> 00:26:35,570
is also given. Conduct hypothesis testing
for alpha equal to 0.05. That means what you
184
00:26:35,570 --> 00:26:45,679
are required to say is basically mu equal
to 7; that will be your null hypothesis. And
185
00:26:45,679 --> 00:26:47,610
what will be your alternative hypothesis?
186
00:26:47,610 --> 00:26:57,279
Your null hypothesis is mu equal to 7. Here
let us take the alternative hypothesis as mu not
187
00:26:57,279 --> 00:27:09,919
equal to 7. Now, here the thing is that
from the management point of view
188
00:27:09,919 --> 00:27:15,659
you want to show that no, it is not 7, it is
below 7. From the operators' point of view, they
189
00:27:15,659 --> 00:27:22,049
may say no, it is not 7, it is more than 7.
The operators who are using the crane
190
00:27:22,049 --> 00:27:30,919
will be saying we are suffering more. The
management, who are basically maintaining
191
00:27:30,919 --> 00:27:35,730
the system, will be saying no, it is less
than 7.
192
00:27:35,730 --> 00:27:43,269
So, from which side you are testing the
hypothesis is also important: who is
193
00:27:43,269 --> 00:27:54,320
the owner of this hypothesis, in that sense,
okay? I have asked here what will happen
194
00:27:54,320 --> 00:28:00,159
if the population standard deviation is not known.
You have 76 responses, so even if the population
195
00:28:00,159 --> 00:28:05,230
standard deviation is not known but the sample
standard deviation is known, the z distribution
196
00:28:05,230 --> 00:28:10,179
will still be applied.
What will happen if the population standard deviation
197
00:28:10,179 --> 00:28:18,080
is not known and n is less than 30? Then it is the t distribution:
you will be getting the critical value for
198
00:28:18,080 --> 00:28:27,139
comparison from the t distribution. Are you seeing the
similarity with the confidence interval? There
199
00:28:27,139 --> 00:28:34,059
you have proceeded the same way. You
first found out the statistic;
200
00:28:34,059 --> 00:28:39,419
in that case you
found out a statistic like x bar minus
201
00:28:39,419 --> 00:28:47,440
mu by s by root n.
Then you have created an interval like this, from minus t alpha
202
00:28:47,440 --> 00:28:58,779
by 2, n minus 1 to t alpha by 2,
n minus 1. That we have created as mu was not
203
00:28:58,779 --> 00:29:05,419
known to you. Last class we have seen that
when you do not know mu, you create a range
204
00:29:05,419 --> 00:29:15,289
for mu. What happens here, in this hypothesis
testing case, is that mu is given as mu 0, which is
205
00:29:15,289 --> 00:29:23,200
under H 0, and we are saying this particular
quantity follows the t distribution when H 0 is
206
00:29:23,200 --> 00:29:26,730
true. That is why I told you in the beginning that
207
00:29:26,730 --> 00:29:33,230
you please keep in mind that whatever statistic
you generate, the sampling
208
00:29:33,230 --> 00:29:43,100
distribution you consider for it is the one that holds
when H 0 is true. Now, instead of mu you have
209
00:29:43,100 --> 00:29:49,580
mu 0; this value you know, the entire quantity you
know, and you also know the critical
210
00:29:49,580 --> 00:29:54,230
values. This is the left side critical value,
this is the right side critical value. Both values
211
00:29:54,230 --> 00:30:00,649
you know. Whether the computed value is less than this
value or more than this value
212
00:30:00,649 --> 00:30:11,720
is what you are finding out, and accordingly you
are rejecting the null hypothesis.
213
00:30:11,720 --> 00:30:20,370
You know how to use the z table, that is very much
known to you, and the t table also.
214
00:30:20,370 --> 00:30:33,190
Now, when you do hypothesis testing, you
basically take certain decisions. Hypothesis
215
00:30:33,190 --> 00:30:40,789
testing is basically decision making. In
confidence interval estimation,
216
00:30:40,789 --> 00:30:46,139
you are not making any decision.
In hypothesis testing you are making a decision
217
00:30:46,139 --> 00:30:58,759
based on the sample data. Now, there
are four possible scenarios. First,
218
00:30:58,759 --> 00:31:07,059
just think like this: you have
stated the hypothesis H 0 as some
219
00:31:07,059 --> 00:31:16,309
statement, and that statement
may be true or may be false.
220
00:31:16,309 --> 00:31:25,070
Now, if the statement is true, that means the
null hypothesis is true, then
221
00:31:25,070 --> 00:31:33,169
based on your analysis of the data you
may accept it or you may reject it. If your null
222
00:31:33,169 --> 00:31:38,759
hypothesis is true and you have accepted it,
that is the right decision. Again, if the null
223
00:31:38,759 --> 00:31:44,039
hypothesis is false and you have rejected
it based on your data analysis, that is also the
224
00:31:44,039 --> 00:31:52,799
right decision. The problem comes when H 0 is
true but you have rejected it, or H 0 is false
225
00:31:52,799 --> 00:32:00,259
but you have accepted it. When H 0 is true and you
have rejected it, that is known as type 1
226
00:32:00,260 --> 00:32:16,440
error or alpha error, and when your H 0 is
false and you have accepted it, that is a type 2 error or beta error.
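The four scenarios can be written down as a tiny lookup, purely as a memory aid. The function name `classify_decision` is hypothetical; in practice you never know whether H 0 is really true, so this only labels the cases, it does not detect them.

```python
def classify_decision(h0_is_true, h0_rejected):
    """Label one of the four truth/decision combinations in hypothesis testing."""
    if h0_is_true and h0_rejected:
        return "type 1 error (alpha)"   # a true H0 was rejected
    if not h0_is_true and not h0_rejected:
        return "type 2 error (beta)"    # a false H0 was accepted
    return "correct decision"           # true H0 accepted, or false H0 rejected

for truth in (True, False):
    for rejected in (True, False):
        print(truth, rejected, classify_decision(truth, rejected))
```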
227
00:32:16,629 --> 00:32:25,629
What is the physical interpretation
of this? You get the physical meaning
228
00:32:25,629 --> 00:32:34,990
of this decision making if you look at manufacturer's
risk versus consumer's risk, getting me?
229
00:32:34,990 --> 00:32:45,679
There is a production process, producing something.
Suppose there is
230
00:32:45,679 --> 00:32:57,549
some testing like this, and the product will be good
or it will be bad depending on the test here.
231
00:32:57,549 --> 00:33:07,250
If it is good, it is going to the
customer; if bad, it is going back, meaning rework
232
00:33:07,250 --> 00:33:17,049
or scrap or something like this. This is
my production process. So, what
233
00:33:17,049 --> 00:33:23,210
do we do? Basically, here we do certain sampling.
In quality terminology it is acceptance
234
00:33:23,210 --> 00:33:36,200
sampling.
Suppose in this one this
235
00:33:36,200 --> 00:33:42,269
is the manufacturer and you are the customer. Or,
the other way, you are purchasing raw materials.
236
00:33:42,269 --> 00:33:52,620
There is a supplier supplying raw
materials, and here is quality control,
237
00:33:52,620 --> 00:33:57,960
acceptance sampling. Again, if it is a
good product it is going to you, but if it is bad
238
00:33:57,960 --> 00:34:05,019
it is coming back to the supplier. In both
these places it is an acceptance sampling case.
239
00:34:05,019 --> 00:34:13,500
In the first case the manufacturer is producing
something and there is an acceptance
240
00:34:13,500 --> 00:34:19,660
sampling scheme for the customer. Based on
the data he may accept or reject; that means
241
00:34:19,660 --> 00:34:25,430
good or bad is decided based on the data.
In this second case also, the supplier is
242
00:34:25,430 --> 00:34:32,380
sending something, the manufacturer is testing
it, and according to the test either returning it
243
00:34:32,380 --> 00:34:40,190
to the supplier or accepting it. Now, in this
case what will happen? Suppose the manufacturer
244
00:34:40,190 --> 00:34:49,540
produces the right one but the test says it is
the wrong one. Then your error is a type
245
00:34:49,540 --> 00:34:55,010
1 error, because the manufacturer produced the
right
246
00:34:55,010 --> 00:35:03,430
one and you have not accepted it. Type 1 error
or alpha error, fine? This is still all right, because
247
00:35:03,430 --> 00:35:11,640
the produced item has only gone back to the manufacturer.
Now, suppose the manufacturer
248
00:35:11,640 --> 00:35:20,490
produces a bad one and your scheme accepts
it. Then you will get a bad product. The second
249
00:35:20,490 --> 00:35:26,600
one is more problematic because the bad product
goes to the market. It is problematic for
250
00:35:26,600 --> 00:35:33,200
the customer, and more problematic
for the manufacturer also, because next time,
251
00:35:33,200 --> 00:35:38,690
if someone else wants to
purchase this, as a customer you will say
252
00:35:38,690 --> 00:35:43,300
do not purchase this, it has a lot of problems.
So, as a result, what happens here actually?
253
00:35:43,300 --> 00:35:51,620
The customer will not do acceptance sampling
at the market level, but the manufacturer himself
254
00:35:51,620 --> 00:35:58,370
does certain sampling here. Once the production
is completed, how many are to be sent to the market is decided
255
00:35:58,370 --> 00:36:05,630
based on certain tests, getting me?
So, hypothesis testing is nothing but decision
256
00:36:05,630 --> 00:36:14,520
making, and many times these decisions are
very crucial decisions, so you must know what
257
00:36:14,520 --> 00:36:21,670
the alpha error is and what the beta error is.
Another issue you should go through, if you find
258
00:36:21,670 --> 00:36:42,210
time, is the operating characteristic curve,
popularly known as the OC curve. We say the type
259
00:36:42,210 --> 00:36:55,970
2 error is the beta error, and 1 minus beta is
the power of the test; 1 minus beta is the power of
260
00:36:55,970 --> 00:37:01,780
the test. The operating characteristic curve
can be understood through an example. I am giving
261
00:37:01,780 --> 00:37:03,400
one example here.
262
00:37:03,400 --> 00:37:13,160
The example is this: think of a quality
characteristic, and suppose that characteristic
263
00:37:13,160 --> 00:37:18,340
follows this normal distribution. This is
my quality characteristic; let it be some
264
00:37:18,340 --> 00:37:27,530
variable x. Its mean value will be here.
Behind this there is nothing but the process which
265
00:37:27,530 --> 00:37:38,310
is producing something, some output which
is measured through x. What happens
266
00:37:38,310 --> 00:37:47,480
over time, maybe because of wear and tear, or because
of, what can I say, imperfect maintenance,
267
00:37:47,480 --> 00:37:57,530
is that ultimately the process mean, that means
the mean of x, will basically
268
00:37:57,530 --> 00:38:02,840
shift. Suppose, initially this was my mean value
269
00:38:02,840 --> 00:38:10,140
but because of wear and tear or maintenance, this
mean value may shift in this direction or
270
00:38:10,140 --> 00:38:18,700
in this direction depending on the quality.
What is happening in the process parameters
271
00:38:18,700 --> 00:38:28,520
level? So, in that case suppose I want to
detect whether there is a shift, but it is very difficult
272
00:38:28,520 --> 00:38:34,760
to detect that. When a small shift has taken
place, you will not find out the
273
00:38:34,760 --> 00:38:40,890
change immediately; only when the change is substantial
will you find it. Suppose, if the mean has gone on
274
00:38:40,890 --> 00:38:47,600
to this level, this is the mu 1, new mean
then you find out that much change has taken
275
00:38:47,600 --> 00:38:52,680
place. So, if I write on this side the
276
00:38:52,680 --> 00:39:05,060
mean change or mean drift, and on this side, suppose,
the beta error, the beta value, what will happen?
277
00:39:05,060 --> 00:39:13,230
You get a certain curve, where beta is the type 2 error. There is change but you are
278
00:39:13,230 --> 00:39:21,830
not able to detect so long the change is small,
you are not able to detect, that means you
279
00:39:21,830 --> 00:39:33,760
will accept. So, your curve may be like this or
it may be like this, it all depends on the case. This
280
00:39:33,760 --> 00:39:39,500
type of curve comes under the operating characteristic.
It is very important because it tells when you are
281
00:39:39,500 --> 00:39:47,830
able to find out that the change has
taken place. That is very critical
282
00:39:47,830 --> 00:39:54,080
not only for manufacturing but for other systems also.
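The operating characteristic idea above can be sketched numerically. This is only a minimal sketch, assuming a two-tailed z test on the process mean with known sigma; the function name `beta_error` and the values sigma = 2 and n = 4 are hypothetical and do not come from the lecture.

```python
from statistics import NormalDist

def beta_error(shift, sigma, n, alpha=0.05):
    """Type 2 error of a two-tailed z test when the true mean has moved by `shift`.

    Assumes known sigma and a sample of size n (illustrative assumptions).
    """
    std_normal = NormalDist()                    # standard normal, mean 0, sd 1
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # z alpha by 2
    d = shift * n ** 0.5 / sigma                 # shift measured in standard errors
    # Probability that the test statistic still falls in the acceptance region
    return std_normal.cdf(z_crit - d) - std_normal.cdf(-z_crit - d)

# Hypothetical process: sigma = 2, samples of size n = 4
for shift in [0.0, 0.5, 1.0, 2.0, 3.0]:
    b = beta_error(shift, sigma=2.0, n=4)
    print(f"shift={shift:.1f}  beta={b:.3f}  power={1 - b:.3f}")
```

Plotting beta against the shift gives the OC curve: with no shift beta equals 1 minus alpha, and it falls as the drift grows, which is why a small drift is hard to detect.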
283
00:39:54,080 --> 00:40:07,020
Now, we will go for the population variance
and I am sure that you will appreciate one
284
00:40:07,020 --> 00:40:14,830
thing here, that you know how to go about
hypothesis testing. For population variance
285
00:40:14,830 --> 00:40:18,530
also what is the required, what you are required
to know, you are required to know the statistics.
286
00:40:18,530 --> 00:40:21,950
What will be the statistic in the case of
population variance?
287
00:40:21,950 --> 00:40:28,930
We say the chi square. Chi square is n minus
1 s square by sigma square. What is your null
288
00:40:28,930 --> 00:40:36,780
hypothesis sigma is sigma 0, what is your
alternate hypothesis sigma not equal to sigma
289
00:40:36,780 --> 00:40:55,300
0. So, then your computed value will be chi
square computed which is n minus 1 s square
290
00:40:55,300 --> 00:41:03,180
by sigma 0 square. So, as you have taken two
tailed condition here also, what you will
291
00:41:03,180 --> 00:41:12,300
do? You will just find out here it is chi square
alpha by 2 and here also chi square 1 minus
292
00:41:12,300 --> 00:41:20,110
alpha by 2, and all of you know the chi square
distribution. One parameter of chi square distribution
293
00:41:20,110 --> 00:41:23,930
is the degree of freedom.
So, here you are required to write degree
294
00:41:23,930 --> 00:41:36,440
of freedom n minus 1 here also, it is n minus
1. Then if you find that your computed
295
00:41:36,440 --> 00:41:41,810
value falls here or here, you will reject
null hypothesis. So, that means you have to
296
00:41:41,810 --> 00:41:53,910
compute and find out where it is falling and
as you know the variance is also,
297
00:41:53,910 --> 00:42:00,240
what can I say, such a parameter: we do not
want more variance, basically we want less
298
00:42:00,240 --> 00:42:08,670
variance. In that sense if you are interested
then maybe you can go for one tailed also
299
00:42:08,670 --> 00:42:18,290
but here what is the issue is that one null
hypothesis is given, you are trying to prove
300
00:42:18,290 --> 00:42:22,340
that null hypothesis is true or false.
So, in that case alternate hypothesis if you
301
00:42:22,340 --> 00:42:28,010
have created it like this, there is no problem
but I have given you some example. Based on
302
00:42:28,010 --> 00:42:32,680
that example if you go, that means from the
problem to the hypothesis, then I think you will get it
303
00:42:32,680 --> 00:42:42,030
better. Higher is the better, nominal is the
best: that particular concept you must use
304
00:42:42,030 --> 00:42:50,700
and again if I go for the next one,
that one is also easy for you. This is one example
305
00:42:50,700 --> 00:42:55,790
I have given. Any question? No problem at all.
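The chi square test for a single population variance described above can be sketched as follows. This is a rough illustration, not the lecture's worked example: the function name `chi_square_variance_test` and the sample data are hypothetical, and the two tabulated values are passed in by hand, as one would read them from a chi square table for n minus 1 degrees of freedom.

```python
from statistics import variance

def chi_square_variance_test(sample, sigma0_sq, chi2_lower, chi2_upper):
    """Two-tailed test of H0: sigma^2 = sigma0^2 using tabulated critical values.

    chi2_lower and chi2_upper are the lower-tail and upper-tail table values
    for n - 1 degrees of freedom, supplied by the caller.
    """
    n = len(sample)
    s_sq = variance(sample)                      # sample variance, n - 1 in the denominator
    chi2_computed = (n - 1) * s_sq / sigma0_sq   # (n - 1) s^2 / sigma0^2
    reject_h0 = chi2_computed < chi2_lower or chi2_computed > chi2_upper
    return chi2_computed, reject_h0

# Hypothetical data, n = 5, so 4 degrees of freedom; table values for alpha = 0.05
sample = [1, 2, 3, 4, 5]
chi2_computed, reject_h0 = chi_square_variance_test(
    sample, sigma0_sq=2.5, chi2_lower=0.484, chi2_upper=11.143
)
print(f"chi square computed = {chi2_computed:.3f}, reject H0: {reject_h0}")
```

For a one-tailed alternative, such as testing only whether the variance has increased, only the upper table value would be compared.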
306
00:42:55,790 --> 00:43:03,780
What is our next topic? Next topic is you
want to test whether two population means are equal
307
00:43:03,780 --> 00:43:10,690
or not, that is, equality of two population means. What
will you do here, what is the random variable
308
00:43:10,690 --> 00:43:11,690
here?
309
00:43:11,690 --> 00:43:23,170
x 1 bar minus x 2 bar. So, this x 1 bar minus
x 2 bar, that will follow normal, that is z, or
310
00:43:23,170 --> 00:43:29,210
t distribution, depending on the conditions,
and all of you have seen that x 1 bar minus
311
00:43:29,210 --> 00:43:39,100
x 2 bar minus expected value of x 1 bar minus
x 2 bar, divided by the square root of the variance of x 1 bar
312
00:43:39,100 --> 00:43:47,860
minus x 2 bar: this follows z. We are assuming
that sigma and other things are known. You
313
00:43:47,860 --> 00:43:52,390
know z or t, so again there is no need of
discussing further.
314
00:43:52,390 --> 00:44:05,170
So, z is normal with 0, 1. Then what will be your
z computed here? If I use z value then z computed
315
00:44:05,170 --> 00:44:14,290
is x 1 bar minus x 2 bar minus, this is mu
1 minus mu 2, by this one we have seen last
316
00:44:14,290 --> 00:44:24,410
class, square root of sigma 1 square by n 1 plus sigma 2 square by n 2. That
we have seen. What is your null hypothesis
317
00:44:24,410 --> 00:44:33,060
here? H 0: mu 1 equal to mu 2, that means this
quantity becomes 0.
318
00:44:33,060 --> 00:44:44,360
So, your z computed becomes x 1 bar minus
x 2 bar by square root of sigma 1 square by
319
00:44:44,360 --> 00:44:53,100
n 1 plus sigma 2 square by n 2. You know n 1, n
2 and sigma 1, sigma 2, compute this and then
320
00:44:53,100 --> 00:45:03,890
again find out. You set your alternate hypothesis
mu 1 not equal to mu 2, that is two tailed,
321
00:45:03,890 --> 00:45:10,670
two tailed case.
So, z alpha by 2 and minus z alpha by 2, find
322
00:45:10,670 --> 00:45:17,870
out where it is falling. Is it falling in
this region or it is in between this two?
323
00:45:17,870 --> 00:45:32,030
This is acceptance, this is rejection. For
whom? For the null hypothesis, correct? Any question
324
00:45:32,030 --> 00:45:41,620
here, everything clear? Basically, the crux
of the matter is knowing the appropriate statistic
325
00:45:41,620 --> 00:45:48,450
and its distribution. Why that is required
because then only you will be using the table
326
00:45:48,450 --> 00:45:55,750
otherwise you cannot frame anything. It will
not be a parametric one, it will be a different
327
00:45:55,750 --> 00:46:01,570
kind. For the parametric case it is a must:
you must know the appropriate statistic
328
00:46:01,570 --> 00:46:03,910
and its distribution, then things are very
simple.
329
00:46:03,910 --> 00:46:07,750
Sir, why we call it null hypothesis
because our target is good. By contradiction
330
00:46:07,750 --> 00:46:24,380
is it like we always try to negate it?
This is what I can say, this way it is developed:
331
00:46:24,380 --> 00:46:34,280
one wants to negate it. Definitely you want to negate
it, the null hypothesis. As for the word null, I have no
332
00:46:34,280 --> 00:46:45,820
idea, I mean why they are
writing null but I can say that as you are
333
00:46:45,820 --> 00:46:48,850
saying that alternative hypothesis. There
is a purpose of alternative hypothesis that
334
00:46:48,850 --> 00:46:56,350
you are to reject the null hypothesis, you
will claim in such a manner that you will
335
00:46:56,350 --> 00:47:02,500
do it in such a manner that it is possible to test
the null hypothesis. As such, why the word null
336
00:47:02,500 --> 00:47:22,110
is coming difficult for me. So, I may serious
one otherwise this is a general way, this
337
00:47:22,110 --> 00:47:29,440
is a very good question. Basically, I think
yes why the word null is used. I can give
338
00:47:29,440 --> 00:47:36,350
you one explanation as sometime what happened
I have seen in structural equation modelling.
339
00:47:36,350 --> 00:47:44,190
There they have taken the covariance matrix
and then only the variance component they
340
00:47:44,190 --> 00:47:53,580
consider and other component they put it to
0. So, I may not be 100 percent sure that
341
00:47:53,580 --> 00:48:08,240
null basically is devoid or 0, getting me?
So, if this is the case, so then when we test
342
00:48:08,240 --> 00:48:15,340
you will find out you create. Suppose, you
are making z test then that z test, that z
343
00:48:15,340 --> 00:48:27,710
value is 0. If z value is 0 so long your statistic
is close to 0, you are accepting this;
344
00:48:27,710 --> 00:48:34,340
when you are rejecting, you are saying
that my computed value is sufficiently
345
00:48:34,340 --> 00:48:42,510
away from 0, getting me?
So, from that analogy I can tell you because
346
00:48:42,510 --> 00:48:49,580
of this 0, you are getting me, because of
this 0 what I am doing? Basically I am computing
347
00:48:49,580 --> 00:48:54,990
the z value. The statistics value I know this
will follow certain z, that z distribution
348
00:48:54,990 --> 00:48:59,170
where the mean value is 0 because you are
testing for mean.
349
00:48:59,170 --> 00:49:05,650
So, is it sufficiently away from 0, or
is it close to 0? That means so long as it is within
350
00:49:05,650 --> 00:49:13,210
this region say it is in the null region,
if I say null equal to 0, null means void.
351
00:49:13,210 --> 00:49:25,700
Void means 0; in that sense I think this may be
one of the explanations, but I am not 100 percent
352
00:49:25,700 --> 00:49:32,220
sure. Still, I think this can be
thought of.
353
00:49:32,220 --> 00:49:42,500
This is the example. As I told you in last
class that the method of teaching is tested
354
00:49:42,500 --> 00:49:48,870
and two population standard deviations are
given: for section one, that is method A, it is
355
00:49:48,870 --> 00:50:03,080
5 and for method B it is 10. In your confidence
interval you have taken these values, but
356
00:50:03,080 --> 00:50:12,120
when you used s pooled, that time what happened?
that time you have assumed that these two
357
00:50:12,120 --> 00:50:19,180
values are not different. We are here, we
are interested to test whether this two values
358
00:50:19,180 --> 00:50:28,960
are really different or not and that is the
difference between two teaching methods. No,
359
00:50:28,960 --> 00:50:32,870
here we are basically doing the difference
between 80 and 70, that is mean difference
360
00:50:32,870 --> 00:50:41,430
we are taking, considering the population standard deviations
5 and 10, not what I said about the chi square
361
00:50:41,430 --> 00:50:43,410
distribution here.
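The two-population z test with known sigmas, as in the teaching-method example, can be sketched like this. The means 80 and 70 and the standard deviations 5 and 10 are the values mentioned above; the sample sizes n 1 = n 2 = 25 and the function name `two_sample_z_test` are assumptions made only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(x1_bar, x2_bar, sigma1, sigma2, n1, n2, alpha=0.05):
    """Two-tailed z test of H0: mu1 = mu2 with known population sigmas."""
    # Standard error of x1_bar - x2_bar: sqrt(sigma1^2 / n1 + sigma2^2 / n2)
    std_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z_computed = (x1_bar - x2_bar - 0.0) / std_error   # mu1 - mu2 = 0 under H0
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # z alpha by 2
    reject_h0 = abs(z_computed) > z_crit
    return z_computed, z_crit, reject_h0

# Sample sizes of 25 each are assumed here, not taken from the lecture
z_computed, z_crit, reject_h0 = two_sample_z_test(80, 70, 5, 10, 25, 25)
print(f"z computed = {z_computed:.3f}, z critical = {z_crit:.3f}, reject H0: {reject_h0}")
```

The rejection region is both tails, so the absolute value of z computed is compared with z alpha by 2.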
362
00:50:43,410 --> 00:50:59,230
Now, you come to the special case, what is
the special case? Sigma 1 equal to sigma 2
363
00:50:59,230 --> 00:51:09,290
equal to sigma. This time you will use s p
square, that is pooled variance, rest of the
364
00:51:09,290 --> 00:51:30,030
things remain the same and your statistic, that is
basically the computed value, will be under H 0
365
00:51:30,030 --> 00:51:37,810
mu 1 equal to mu 2.
So, if I use t computed, t will be x 1 bar
366
00:51:37,810 --> 00:51:44,470
minus x 2 bar minus 0 because mu 1 minus mu
2 equal to 0, divided by s p times square root of
367
00:51:44,470 --> 00:51:57,170
1 by n 1 plus 1 by n 2, and go for the tabulated
t, tabulated t n 1 plus n 2 minus 2, alpha by
368
00:51:57,170 --> 00:52:06,880
2. If my computed t is more than or less
than, depending on which side it is,
369
00:52:06,880 --> 00:52:09,600
the tabulated t, you will reject the null
hypothesis.
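The pooled-variance t statistic for the special case sigma 1 equal to sigma 2 can be sketched as below. This is a minimal sketch with hypothetical summary numbers; the pooled variance formula used is the standard one, and the tabulated t for 18 degrees of freedom at alpha by 2 = 0.025 is entered by hand as 2.101, as read from a t table.

```python
from math import sqrt

def pooled_t_statistic(x1_bar, x2_bar, s1_sq, s2_sq, n1, n2):
    """t statistic for H0: mu1 = mu2 assuming equal population variances."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    # Note the plus sign: sp * sqrt(1/n1 + 1/n2)
    t_computed = (x1_bar - x2_bar - 0.0) / (sqrt(sp_sq) * sqrt(1 / n1 + 1 / n2))
    return t_computed, df

# Hypothetical summary data for two samples of size 10
t_computed, df = pooled_t_statistic(53, 48, 16, 20, 10, 10)
t_tabulated = 2.101   # t table value for df = 18, alpha / 2 = 0.025
print(f"t computed = {t_computed:.3f}, df = {df}, reject H0: {abs(t_computed) > t_tabulated}")
```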
370
00:52:09,600 --> 00:52:22,740
This is just a special case of the
original?
371
00:52:22,740 --> 00:52:25,510
Original one.
Student: Original one? Original one was the.
372
00:52:25,510 --> 00:52:27,420
Sigma 1 square by n 1 plus sigma 2 square
by n 2.
373
00:52:27,420 --> 00:52:27,850
That special case would hold for
the original one also using others cases.
374
00:52:27,850 --> 00:52:32,660
Okay.
What is the requirement of the computation
375
00:52:32,660 --> 00:52:43,470
of the several type of convolution.
Correct, correct. Now, because under this
376
00:52:43,470 --> 00:52:51,480
special case scenario what happened actually
all this that whatever the distribution you
377
00:52:51,480 --> 00:52:58,210
are finding out this distributions are governed
by the number of parameters. As the size of
378
00:52:58,210 --> 00:53:04,600
the sample and all those things are involved
here. Now, what happen when this special condition
379
00:53:04,600 --> 00:53:11,520
arises that sigma 1 equal to sigma 2 equal to sigma,
then if we compute this s p, if we find s
380
00:53:11,520 --> 00:53:18,040
p square and go accordingly, the power
of the test is much better, you are getting
381
00:53:18,040 --> 00:53:21,420
me? Always there is a possibility of error, there
382
00:53:21,420 --> 00:53:27,610
is alpha error or beta error. Here
the power of the test will be better; that
383
00:53:27,610 --> 00:53:33,500
is why these special cases are found out and
they are also reported and most of the times
384
00:53:33,500 --> 00:53:37,980
you use this. That is why what we said if
we find out that the two population variances
385
00:53:37,980 --> 00:53:44,490
are equal, that you first find out then you
go for the special case. Do not use the general
386
00:53:44,490 --> 00:53:49,590
case. The general case is, many a time, just a general
case: the general case will give you some
387
00:53:49,590 --> 00:53:57,430
result, but if this special case occurs, use the special case.
That this is better case?
388
00:53:57,430 --> 00:53:59,200
Always better.
389
00:53:59,200 --> 00:54:08,310
This is another example, where we want to test equality
of performance between two medicines. Confidence
390
00:54:08,310 --> 00:54:16,010
interval case you have seen. Now, equality
of two population variances should I go for
391
00:54:16,010 --> 00:54:22,070
this? I think you will all be able to find
out this. When you test the equality of
392
00:54:22,070 --> 00:54:30,390
two population variances, you must know the
distribution of the ratio of two sample
393
00:54:30,390 --> 00:54:35,590
variances. Last class we have proven that
it will be basically F distribution.
394
00:54:35,590 --> 00:54:45,700
So, your F n 1 minus 1 and n 2 minus 1 and
definitely you will be getting certain alpha
395
00:54:45,700 --> 00:55:02,640
values and that will be your tabulated one and what was the statistics. That time statistics we considered F
396
00:55:02,680 --> 00:55:12,660
equal to s 1 square by sigma 1 square by s 2 square by sigma 2 square. What is our hypothesis? Here H 0 that
397
00:55:12,670 --> 00:55:18,740
sigma 1 square is equal to sigma 2 square. So, sigma 1 by sigma 2 whole square is basically
398
00:55:18,740 --> 00:55:25,740
this is 1. So, your F is basically resultant
F, basically s 1 square by s 2 square. So,
399
00:55:25,740 --> 00:55:36,330
you want to compare the computed value with the tabulated
value. If you find that the tabulated value
400
00:55:36,330 --> 00:55:42,590
is less than the computed value, reject the
null hypothesis. Here some special cases also
401
00:55:42,590 --> 00:55:48,280
occur basically depending on the sigma 1,
sigma 2 that which one is more, which one
402
00:55:48,280 --> 00:55:53,110
is less.
You go through this Freund and Miller book.
403
00:55:53,110 --> 00:56:02,790
Freund and Miller, basic statistics, that book
basically engineering I think this is basically
404
00:56:02,790 --> 00:56:10,600
introduction to statistics. That book you
go through, you will be finding out many more
405
00:56:10,600 --> 00:56:13,540
cases are there.
406
00:56:13,540 --> 00:56:18,940
This is F table.
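The F test for equality of two population variances can be sketched as follows. This is only an illustration of the upper-tailed case, where the alternative is that sigma 1 square is larger than sigma 2 square; the function name `f_ratio_test` and the sample variances are hypothetical, and the tabulated F for 9 and 9 degrees of freedom at alpha = 0.05 is entered by hand as 3.18, as read from an F table.

```python
def f_ratio_test(s1_sq, s2_sq, f_tabulated):
    """Upper-tailed F test of H0: sigma1^2 = sigma2^2.

    Under H0 the ratio sigma1^2 / sigma2^2 is 1, so F reduces to s1^2 / s2^2.
    f_tabulated is the table value for (n1 - 1, n2 - 1) degrees of freedom.
    """
    f_computed = s1_sq / s2_sq
    reject_h0 = f_computed > f_tabulated   # tabulated value less than computed value
    return f_computed, reject_h0

# Hypothetical sample variances from two samples of size 10 each
f_computed, reject_h0 = f_ratio_test(100.0, 25.0, f_tabulated=3.18)
print(f"F computed = {f_computed:.2f}, reject H0: {reject_h0}")
```

For a two-tailed alternative the table value at alpha by 2 would be used instead, which is one of the cases that depend on which sigma is taken as the larger one.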
407
00:56:18,940 --> 00:56:31,730
I told you last class that when the confidence
interval as well as this hypothesis testing.
408
00:56:31,730 --> 00:56:44,980
These are some of the pioneers of hypothesis testing. Fisher
has tested so many hypotheses in the
409
00:56:44,980 --> 00:56:52,220
many experiments that he has conducted, and
for hypothesis testing this book is a good book.
410
00:56:52,220 --> 00:56:59,950
This is Aczel A D, complete statistics. I
have seen this book is a very good book, you
411
00:56:59,950 --> 00:57:02,070
can go through book, it is a compact one.
412
00:57:02,070 --> 00:57:07,680
In addition, as I told you, Freund and
Miller, that statistics book that also you
413
00:57:07,680 --> 00:57:17,410
go through, you will find out that fantastic.
So, many good books are there and particularly
414
00:57:17,410 --> 00:57:23,980
as far as basic statistics is concerned, univariate
statistics is concerned. For multivariate statistics
415
00:57:23,980 --> 00:57:31,510
also, yes, good books are there and some
more books are there but compared to univariate
416
00:57:31,510 --> 00:57:42,070
case, multivariate books are very limited.
So, next class what we will do? We will basically
417
00:57:42,070 --> 00:57:57,590
talk about multivariate descriptive statistics.
So, this is the end of today, this one, this
418
00:57:57,590 --> 00:58:04,080
is the end of our basic or prerequisites,
remember what I have told you? What I covered under
419
00:58:04,080 --> 00:58:12,210
basic statistics is very minuscule; I have
told you things very simply, just to get some
420
00:58:12,210 --> 00:58:17,980
idea of what is happening, what
is there, but in general one univariate
421
00:58:17,980 --> 00:58:23,630
basic statistics book is itself 1000 pages,
getting me?
422
00:58:23,630 --> 00:58:33,160
So, if you really want to get your fundamentals
very strong, you have to have some good books
423
00:58:33,160 --> 00:58:38,890
on basic statistics also and you have to go
through them, and I have given you some of the
424
00:58:38,890 --> 00:58:48,100
concepts. These concepts we will be
using in the multivariate lectures,
425
00:58:48,100 --> 00:58:52,590
and those you can then easily grasp; I mean that
first I will tell you in univariate. Once you have got
426
00:58:52,590 --> 00:58:58,470
this one, see how we are converting the same
concept to multivariate. So, those many portions
427
00:58:58,470 --> 00:59:05,060
only I have taken into consideration, it is
not the totality of univariate basic statistics.
428
00:59:05,060 --> 00:59:10,360
Then thank you.