1
00:00:17,940 --> 00:00:24,780
Good afternoon; this is Doctor Pradhan here;
welcome to NPTEL project on econometric modeling.
2
00:00:24,780 --> 00:00:30,200
So, today we will continue the reliability
part of bivariate econometric modeling. In
3
00:00:30,200 --> 00:00:36,610
the last class we discussed in depth the
reliability and the structure of reliability
4
00:00:36,610 --> 00:00:43,610
for the bivariate estimated econometric model.
Now, we would like to highlight the same issue
5
00:00:45,300 --> 00:00:49,280
again here, because some of the things we
did not discuss in the last class.
6
00:00:49,280 --> 00:00:56,280
So, the thing is for 2 variables Y and X,
our fitted model is like this: Y hat equal
7
00:00:58,510 --> 00:01:05,510
to alpha hat plus beta hat X. Now, the essential
point is here we have 2 specific objectives;
8
00:01:09,280 --> 00:01:14,930
the first objective is to know the significance
of the parameters and second objective is
9
00:01:14,930 --> 00:01:21,320
to know the overall fitness of the model.
So far as reliability
10
00:01:21,320 --> 00:01:28,320
is concerned, we have 2 specific objectives.
First objective is to know the significance
11
00:01:30,690 --> 00:01:37,690
of parameters and that is with respect to
alpha hat and beta hat; and second, the significance
12
00:01:41,590 --> 00:01:48,590
of the overall fitness of
the model. So,
13
00:01:58,170 --> 00:02:01,720
we have 2 specific objectives so far as reliability
is concerned.
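Before taking up these two objectives, the fitted line itself can be computed directly from data. Below is a minimal Python sketch using the deviation-form formulas quoted in this lecture (beta hat equal to summation xy by summation x square, alpha hat equal to Y bar minus beta hat X bar); the data values and names are invented for illustration, not from the lecture.

```python
# Sketch: OLS fit of Y hat = alpha hat + beta hat * X for a bivariate model.
# The data below are hypothetical illustration values.

def fit_bivariate(X, Y):
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    # beta hat = sum(xy) / sum(x^2), with small x = X - X bar, small y = Y - Y bar
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    beta_hat = sxy / sxx
    # alpha hat = Y bar - beta hat * X bar
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
alpha_hat, beta_hat = fit_bivariate(X, Y)
```

The two estimates produced here are exactly the quantities whose significance the rest of the lecture tests.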
14
00:02:01,720 --> 00:02:06,680
So, first objective is to know the significance
of the parameters; that is, the weight of
15
00:02:06,680 --> 00:02:13,680
each parameter when we fit the
regression equation with respect
16
00:02:15,050 --> 00:02:21,360
to X and Y. Then, obviously the impact can
be negative or positive,
17
00:02:21,360 --> 00:02:28,360
which is captured through the slope of the
X coefficient or X variable. This is what
18
00:02:30,730 --> 00:02:35,870
we judge through the slope of the X variable;
that is nothing but the beta coefficient, and
19
00:02:35,870 --> 00:02:41,240
the alpha coefficient is just to know the significance
of the supporting factors.
20
00:02:41,240 --> 00:02:47,400
Now, just to put it in straight-line
equation form, this is the intercept and this
21
00:02:47,400 --> 00:02:52,790
is what we call the slope. Now, we would like to
know whether the intercept, or the
22
00:02:52,790 --> 00:02:59,530
supporting component, is a significant one for
influencing Y, and whether the X component is a significant
23
00:02:59,530 --> 00:03:05,319
one for influencing Y. To know
this, we have a standard procedure; so,
24
00:03:05,319 --> 00:03:10,720
that part of the discussion in particular
is known as the reliability of the estimated
25
00:03:10,720 --> 00:03:13,660
model.
So, first we start with the first
26
00:03:13,660 --> 00:03:18,020
objective; that is the significance of the
parameters. So, the significance of the parameters
27
00:03:18,020 --> 00:03:24,980
means that we have to represent the estimated
model in a typical tabular form so that we
28
00:03:24,980 --> 00:03:27,780
can understand the exact structure of the
reliability.
29
00:03:27,780 --> 00:03:34,370
Now, when we have estimated models, we had
Y hat equal to alpha hat plus beta hat X so
30
00:03:34,370 --> 00:03:41,370
then the standard table we have to design
is here. We have estimated parameters; then
31
00:03:47,900 --> 00:03:54,900
second, the estimated values; then, third
column represents the variance of the estimated
32
00:04:02,489 --> 00:04:09,489
values; then the standard error, then the t statistic,
then the probability level of significance.
33
00:04:12,930 --> 00:04:19,289
This is the structure for testing the
significance of the parameters;
34
00:04:19,289 --> 00:04:23,650
that means with respect to the first objective.
So, what are the estimated parameters for
35
00:04:23,650 --> 00:04:28,349
this particular bivariate setup?
The first parameter is
36
00:04:28,349 --> 00:04:35,349
related to alpha hat and the second parameter
is beta hat. Now,
37
00:04:35,570 --> 00:04:41,410
this table is altogether a complete one;
this is how we have to design the entire
38
00:04:41,410 --> 00:04:47,960
table.
Now, in this particular structure, what is the
39
00:04:47,960 --> 00:04:52,780
estimated value of alpha hat? This is
nothing but Y bar minus
40
00:04:52,780 --> 00:04:59,780
beta hat X bar, and beta hat is equal to summation
xy by summation x square, which we discussed
41
00:05:00,030 --> 00:05:06,730
long back. Now, the variance of the estimated alpha,
that is nothing but the variance of alpha hat,
42
00:05:06,730 --> 00:05:11,389
and this is nothing but the variance of beta hat.
Then, for the standard error: this is the square
43
00:05:11,389 --> 00:05:18,389
root of the variance of alpha hat, and this is the square
root of the variance of beta hat. So, this is
44
00:05:19,630 --> 00:05:25,680
the standard error; when we design the t statistic,
for this it is t alpha hat and this is t beta
45
00:05:25,680 --> 00:05:29,220
hat, and we would like to know the significance
levels.
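The whole table (estimate, variance, standard error, t statistic) can be assembled programmatically. A sketch with invented data follows; Var(alpha hat) = sigma square u into summation capital X square by n summation small x square is the formula given in this lecture, while Var(beta hat) = sigma square u by summation small x square is the standard companion formula assumed here.

```python
import math

# Sketch: build the significance table for a bivariate OLS fit.
# Data and names are hypothetical illustration values.

def significance_table(X, Y):
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in X)   # sum of small x squared
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    # residuals e_i = y_i - y_i hat, then error variance sigma^2 = sum(e^2)/(n-2)
    e = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(X, Y)]
    sigma2_u = sum(ei ** 2 for ei in e) / (n - 2)
    var_alpha = sigma2_u * sum(xi ** 2 for xi in X) / (n * sxx)
    var_beta = sigma2_u / sxx
    table = {}
    for name, est, var in [("alpha_hat", alpha_hat, var_alpha),
                           ("beta_hat", beta_hat, var_beta)]:
        se = math.sqrt(var)                    # standard error = sqrt(variance)
        table[name] = (est, var, se, est / se) # t statistic = estimate / SE
    return table

table = significance_table([1.0, 2.0, 3.0, 4.0, 5.0],
                           [2.1, 3.9, 6.2, 8.1, 9.8])
```

Each computed t value would then be compared with the tabulated value, as the lecture describes next.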
46
00:05:29,220 --> 00:05:35,470
Now, this is the model theoretically;
but technically or practically,
47
00:05:35,470 --> 00:05:41,110
so far as the significance of the parameter
is concerned, we have to evaluate in a proper
48
00:05:41,110 --> 00:05:45,040
sequence and that has to be compared with
the tabulated value which we have discussed
49
00:05:45,040 --> 00:05:49,340
in detail in the last class.
Now, what is this variance of alpha
50
00:05:49,340 --> 00:05:54,919
hat? Basically, the variance of alpha hat
is derived through a technical procedure
51
00:05:54,919 --> 00:05:58,729
for how to get the variance of alpha
hat; but in the meantime, the variance of alpha
52
00:05:58,729 --> 00:06:05,560
hat is nothing but sigma square u into summation
capital X square divided by n into summation small x square.
53
00:06:05,560 --> 00:06:11,400
So, here this particular
item is a capital X and this particular x
54
00:06:11,400 --> 00:06:17,630
is a small x. This is nothing but the deviation
form; we can represent it as X minus
55
00:06:17,630 --> 00:06:24,190
X bar.
Now, altogether there are 4 items: sigma square
56
00:06:24,190 --> 00:06:30,460
u, summation capital X square, then n into
summation small x square. So, this is nothing
57
00:06:30,460 --> 00:06:37,120
but related to the variance of X. So
the question is, what is sigma square here?
58
00:06:37,120 --> 00:06:42,580
So, sigma square u is called
the error variance here. This is otherwise
59
00:06:42,580 --> 00:06:49,580
called the error variance. Just as we calculate
the variance of a particular variable X or
60
00:06:50,889 --> 00:06:57,690
a particular variable Y, we also have to calculate
the variance of u, or e, because
61
00:06:57,690 --> 00:07:03,990
in a bivariate setup we start with 2 variables
Y and X, but ultimately, with the help
62
00:07:03,990 --> 00:07:08,800
of the estimated model, we
create another variable
63
00:07:08,800 --> 00:07:14,789
called u, or otherwise the error term.
Now, altogether when we have a fitted model
64
00:07:14,789 --> 00:07:20,250
then the entire system consists of
4 important columns. So, the first column is related
65
00:07:20,250 --> 00:07:25,400
to the Y column; it gives information about
the Y structure, and we can get to know what is
66
00:07:25,400 --> 00:07:30,690
the variation of Y, or the standard
deviation of Y, or the mean of Y. So, these are
67
00:07:30,690 --> 00:07:34,400
the statistics we have to draw from the Y
column.
68
00:07:34,400 --> 00:07:39,539
Similarly, in the X column we have series
of X information corresponding to the Y component.
69
00:07:39,539 --> 00:07:46,539
So, we can also get to know the entire descriptive
statistics of the X variable. Now, next to X we
70
00:07:47,620 --> 00:07:52,229
start with a variable called Y hat. Y hat
is nothing but, alpha hat plus beta hat X.
71
00:07:52,229 --> 00:07:57,350
now, with the help of alpha hat value and
beta hat value and with the help of X information
72
00:07:57,350 --> 00:08:04,350
then we can create the Y hat column. From the
Y hat column also we can get the descriptive
73
00:08:04,389 --> 00:08:10,919
statistics of Y hat, because Y hat is altogether
another variable, which is designed with
74
00:08:10,919 --> 00:08:15,870
the help of Y and X and the estimated parameter
alpha hat and beta hat.
75
00:08:15,870 --> 00:08:21,449
Now, with respect to Y hat and Y, we have
to create another column, called the error
76
00:08:21,449 --> 00:08:27,550
column; that is represented as the u column,
or you can say the e column. Now, corresponding
77
00:08:27,550 --> 00:08:32,560
to every figure of Y hat and Y, we have to
find out the error component; for instance,
78
00:08:32,560 --> 00:08:39,560
e 1 is equal to y 1 minus y 1 hat; similarly,
e 2 equals y 2 minus y 2 hat. So, the difference
79
00:08:39,800 --> 00:08:46,800
between the estimated Y and the actual
Y gives the error representation.
80
00:08:46,850 --> 00:08:53,850
Now, once you have the error series from
e 1 to e n, provided the system has n
81
00:08:53,959 --> 00:08:59,149
observations, you have to calculate the
error variance. This error variance, you
82
00:08:59,149 --> 00:09:04,459
know, is called sigma square u.
Sometimes this error
83
00:09:04,459 --> 00:09:11,120
variance we will represent as summation
e square by n minus 2; that is, sigma
84
00:09:11,120 --> 00:09:15,580
square u equals summation e square by n minus
2.
85
00:09:15,580 --> 00:09:21,920
Here, basically, this particular summation
e square by n minus 2 we can put in
86
00:09:21,920 --> 00:09:27,600
another way: summation e square by n minus
k, where k is the number of variables
87
00:09:27,600 --> 00:09:33,980
in this particular system, or the number of parameters
in the system. Now, since this
88
00:09:33,980 --> 00:09:40,980
particular model is a bivariate one obviously,
there are 2 variables and there are 2 parameters
89
00:09:41,350 --> 00:09:48,000
right; that is, the alpha parameter and the beta parameter.
So, as a result, k here is represented as 2.
90
00:09:48,000 --> 00:09:53,440
So, there is no need to write summation e
square by n minus k, because it is already
91
00:09:53,440 --> 00:09:59,170
known to us that k represents the total number
of variables in the system. That is, you can
92
00:09:59,170 --> 00:10:06,060
say, Y and X, or the number of parameters in the
system; that is, alpha hat and beta hat. Now,
93
00:10:06,060 --> 00:10:12,990
when there is a multivariate
system, then this particular term can be
94
00:10:12,990 --> 00:10:18,570
represented as summation e square by n minus
k. For instance, if we have a trivariate model
95
00:10:18,570 --> 00:10:24,019
then obviously it is summation e square by n
minus 3, because there are 3 variables in the
96
00:10:24,019 --> 00:10:29,279
system. Similarly, we have to extend one after
another; then obviously the n minus k component
97
00:10:29,279 --> 00:10:34,810
will keep expanding.
Now, sigma square u equals summation e square
98
00:10:34,810 --> 00:10:41,810
by n minus 2, where summation e square is equal
to summation y square minus summation y
99
00:10:45,670 --> 00:10:52,390
hat square, all in deviation form. So, that means, in
100
00:10:52,390 --> 00:10:56,630
other words it is nothing but, summation y
square minus summation y hat square.
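This identity, summation e square equal to summation y square minus summation y hat square (in deviation form), can be checked numerically before it is derived. A sketch with invented data:

```python
# Check sum(e^2) = sum(y^2) - sum(y_hat^2), all in deviation form,
# for an OLS fit. Data are hypothetical illustration values.

X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
beta_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
            / sum((xi - x_bar) ** 2 for xi in X))
alpha_hat = y_bar - beta_hat * x_bar

Y_hat = [alpha_hat + beta_hat * xi for xi in X]
e = [yi - fi for yi, fi in zip(Y, Y_hat)]

sum_y2 = sum((yi - y_bar) ** 2 for yi in Y)          # small y squared, summed
sum_yhat2 = sum((fi - y_bar) ** 2 for fi in Y_hat)   # Y hat bar equals Y bar
sum_e2 = sum(ei ** 2 for ei in e)
```

The derivation that follows in the lecture explains why the cross term vanishes and makes this identity exact.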
101
00:10:56,630 --> 00:11:02,850
Let me explain how this comes about. This
is usually derived through a technical
102
00:11:02,850 --> 00:11:07,810
procedure. The detailed calculation
procedure for this particular term we can
103
00:11:07,810 --> 00:11:10,230
analyze here.
104
00:11:10,230 --> 00:11:17,230
Now, let us
say we have a system; say
105
00:11:17,779 --> 00:11:24,779
y equal to y hat plus e. So, this is how we
start the process. What will we do? Let us
106
00:11:25,740 --> 00:11:32,730
call it equation number 1; then we
subtract y bar on both sides, so
107
00:11:32,730 --> 00:11:36,970
this is nothing but y hat minus y bar
plus e.
108
00:11:36,970 --> 00:11:43,970
Now, you see here, the actual
representation is like this: this is what
109
00:11:46,649 --> 00:11:53,260
we call the x series and this is what we
call the y series. So, then,
110
00:11:53,260 --> 00:12:00,260
corresponding to y
we can get the
111
00:12:02,620 --> 00:12:08,950
y bar, and corresponding to x we
have to get the x bar. So, this is what we
112
00:12:08,950 --> 00:12:11,440
call the mean of y and this is called
the mean of x.
113
00:12:11,440 --> 00:12:18,440
Now, with respect to y and x information,
our objective is to get the estimated line;
114
00:12:18,730 --> 00:12:23,490
that is called the best fitted line. Now,
let us assume that this best fitted line can
115
00:12:23,490 --> 00:12:28,800
be represented like this. So, this is y here
which is equal to alpha hat plus beta hat
116
00:12:28,800 --> 00:12:35,800
x. Now, this particular point is
very relevant because, at this particular
117
00:12:37,240 --> 00:12:44,240
point, y hat bar is exactly equal
to y bar. So, what we are representing
118
00:12:45,079 --> 00:12:51,880
here means this entire representation
we can write like this: y minus y bar
119
00:12:51,880 --> 00:12:58,880
is equal to y hat minus y hat bar plus e,
because y bar and y hat bar are equal at
120
00:12:59,440 --> 00:13:03,220
that point of equilibrium; so it
is not an issue.
121
00:13:03,220 --> 00:13:09,550
So, instead of writing
this, we call this small y in deviation
122
00:13:09,550 --> 00:13:15,529
form, and this is what we call y
hat in deviation form, and this is e. This
123
00:13:15,529 --> 00:13:22,529
is, as usual, the error term. Now, this is
what we have derived from here; now put it
124
00:13:24,630 --> 00:13:31,560
in a proper way. So, it is y equal to y hat
plus e. Now, what we have to do? The original
125
00:13:31,560 --> 00:13:37,740
equation we have is Y equal to Y hat
plus e, where this Y and this Y hat are in capital
126
00:13:37,740 --> 00:13:43,390
(actual) form, and this y and this y hat are in
deviation form; there is a big difference
127
00:13:43,390 --> 00:13:49,390
between the deviation and the actual.
Now, we have transferred the actual to deviation
128
00:13:49,390 --> 00:13:55,320
form, for simplicity.
Now, what do we have to do?
129
00:13:55,320 --> 00:14:01,500
We first square both sides, and
then we apply the summation to get
130
00:14:01,500 --> 00:14:05,959
the entire structure.
If we do that, then the entire structure
131
00:14:05,959 --> 00:14:12,959
becomes summation y square equal to summation
(y hat plus e) whole square. Obviously, i equal
132
00:14:13,800 --> 00:14:20,790
to 1 to n here this is i equal to 1 to n because
i represents the sample units; it will start
133
00:14:20,790 --> 00:14:27,790
from 1 to n because, we are in the process
of cross sectional modeling and our sample
134
00:14:28,029 --> 00:14:35,029
unit is represented here by i.
Now, obviously i runs from 1 up to n;
135
00:14:36,740 --> 00:14:43,740
now, what do you have to do? This particular
component, (y hat plus e) whole square, is what we
136
00:14:46,550 --> 00:14:53,550
can write in a format like this: summation y hat square,
i equal to 1 to n, plus summation e square,
137
00:14:54,420 --> 00:15:01,420
i equal to 1 to n, plus 2 summation y hat into e.
So, if we expand the
138
00:15:04,420 --> 00:15:09,760
right-hand side of this equation,
then we will get summation y square equal
139
00:15:09,760 --> 00:15:16,420
to summation y hat square plus summation e
square plus 2 summation y hat into e. But,
140
00:15:16,420 --> 00:15:22,700
this particular term, 2 summation y hat into e,
is exactly equal to 0.
141
00:15:22,700 --> 00:15:29,170
Now, the question is how it becomes 0; so
let me explain here.
142
00:15:29,170 --> 00:15:36,170
The structure is here; our point is to
prove that summation y hat e equals 0. So,
143
00:15:36,529 --> 00:15:43,529
first of all, what is small y hat? Small y hat is equal
to y hat minus y hat bar, so this is nothing
144
00:15:44,720 --> 00:15:51,720
but y hat minus y bar. Now, if we
simplify, then it is nothing but alpha
145
00:15:52,940 --> 00:15:59,940
hat plus beta hat x minus alpha hat minus
beta hat x bar. Again, if we simplify, then
146
00:16:01,690 --> 00:16:08,690
it is nothing but, since alpha hat cancels,
beta hat into (x minus x bar), which is equal
147
00:16:10,579 --> 00:16:15,660
to beta hat into small x; that is what we
call the deviation. That means, this particular
148
00:16:15,660 --> 00:16:20,880
item is small x.
So, this is one part of the problem. Then, e
149
00:16:20,880 --> 00:16:27,880
is equal to y minus y hat; that
means e equals y minus y hat, which is nothing
150
00:16:28,630 --> 00:16:35,630
but y minus beta hat x i. Now, this
is e and this is y hat; now we have to combine them.
151
00:16:40,120 --> 00:16:47,120
Now, summation y hat e is equal to the summation
of beta
152
00:16:49,660 --> 00:16:56,660
hat x into (y minus beta hat x); so obviously
this is x i, this is x i, and this is y i, so
153
00:16:59,769 --> 00:17:06,769
it is like this, of course with i equal to 1 to n
on both sides. Actually,
154
00:17:09,010 --> 00:17:15,510
the term is 2 into summation y hat into e
but, if we prove that summation y hat e is equal
155
00:17:15,510 --> 00:17:22,510
to 0, then obviously 2 into 0 equals 0.
Now, what we have to do? Here, we just take
156
00:17:22,739 --> 00:17:29,739
beta hat common; then it is beta hat into summation x i y i minus
beta hat summation x square.
157
00:17:41,529 --> 00:17:48,100
So, we have beta hat summation x square; this
beta hat, you can say, we have taken common,
158
00:17:48,100 --> 00:17:55,100
so then it is nothing but beta hat
into summation x i y i minus beta hat summation x square. What is beta hat?
159
00:17:56,889 --> 00:18:03,159
Beta hat is nothing but, summation x i y i
divided by summation x square. We have again
160
00:18:03,159 --> 00:18:08,619
summation x square; so this summation x square
and that summation x square cancel. That means, it is
161
00:18:08,619 --> 00:18:15,619
equal to beta hat into summation x i y i minus
summation x i y i; so, this summation x i y i
162
00:18:18,179 --> 00:18:22,979
and this one cancel. That means
it is nothing but beta hat into 0, which is
163
00:18:22,979 --> 00:18:29,979
equal to 0. So, that
means the entire structure is like this:
164
00:18:30,840 --> 00:18:37,690
beta hat equals summation x
i y i by summation x square. We are just expanding the beta hat
165
00:18:37,690 --> 00:18:42,690
value here; so obviously, this summation x square
and that summation x square cancel. The leftover term
166
00:18:42,690 --> 00:18:49,450
is summation x i y i. Obviously, summation
x i y i is here so this is summation x i y
167
00:18:49,450 --> 00:18:56,450
i so that means it is equal to 0.
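This orthogonality, summation y hat e equal to 0 in deviation form, holds numerically for any OLS fit; a quick check with invented data (all numbers hypothetical):

```python
# Check that the cross term sum(y_hat * e) vanishes in deviation form,
# which is why 2 * sum(y_hat * e) drops out of the decomposition.

X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
x = [xi - x_bar for xi in X]                  # small x: deviations of X
y = [yi - y_bar for yi in Y]                  # small y: deviations of Y
beta_hat = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

y_hat_dev = [beta_hat * a for a in x]         # small y hat = beta hat * x
e = [b - beta_hat * a for a, b in zip(x, y)]  # e = y - beta hat * x

cross = sum(f * ei for f, ei in zip(y_hat_dev, e))
```

Up to floating-point rounding, `cross` is exactly zero, matching the algebraic cancellation just derived.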
Now, we have proved that summation y hat e is equal
168
00:18:57,039 --> 00:19:04,039
to 0. Now, come back to this stage here:
that means, summation
169
00:19:04,399 --> 00:19:11,330
y square is equal to summation y hat square
plus summation e square. Now, we will start
170
00:19:11,330 --> 00:19:18,330
our process here; so what exactly is this
particular structure? It is like this.
171
00:19:19,970 --> 00:19:26,970
We start with Y equal to Y hat
plus e; then we transfer it into small y equal
172
00:19:27,359 --> 00:19:34,359
to y hat plus e in deviation form; then, after doing
this process, we have received
173
00:19:37,190 --> 00:19:44,190
summation y square equal to summation y hat
square plus summation e square, where this is
174
00:19:44,419 --> 00:19:51,419
i equal to 1 to n, this is i equal to 1
to n, and this is also i equal to 1 to n. This
175
00:19:52,080 --> 00:19:58,109
is what the entire structure is about;
that means, our point here is to justify the
176
00:19:58,109 --> 00:20:05,109
significance of the alpha parameter and the beta
parameter, and that is exactly this task.
177
00:20:05,479 --> 00:20:12,369
So, we need to have the variance of alpha hat
and the variance of beta hat; and to have
178
00:20:12,369 --> 00:20:18,340
the variance of beta hat we need to work
again with the error variance, because this
179
00:20:18,340 --> 00:20:24,479
particular variance of alpha hat depends upon
the error variance, and again for
180
00:20:24,479 --> 00:20:28,220
the variance of beta hat we
also need the error variance.
181
00:20:28,220 --> 00:20:33,849
So, we would like to know the exact component
of the error variance. By this process,
182
00:20:33,849 --> 00:20:39,899
we are in the stage that summation y square
equal to summation y hat square plus summation
183
00:20:39,899 --> 00:20:46,419
e square. This particular term is called as
a TSS and this particular term is called as
184
00:20:46,419 --> 00:20:53,419
an ESS, and this particular term is called
an RSS.
185
00:20:54,830 --> 00:21:01,119
What exactly are these particular terms?
That means, this is called as a total sum
186
00:21:01,119 --> 00:21:05,460
square; this is the explained sum square, and this
is called the residual sum square. So, that
187
00:21:05,460 --> 00:21:12,460
means, this is what we call the total sum
square. Then, this is the explained
188
00:21:14,450 --> 00:21:21,450
sum square, and this
particular term is called the residual
189
00:21:24,499 --> 00:21:31,499
sum square. The residual sum square
is otherwise known
190
00:21:34,519 --> 00:21:41,519
as the unexplained sum square; this is
otherwise called the unexplained
191
00:21:54,559 --> 00:21:59,450
sum square.
Now, what exactly is each particular term?
192
00:21:59,450 --> 00:22:06,450
The TSS is nothing but summation (y minus
y bar) whole square, i equal to 1 to n, and
193
00:22:10,169 --> 00:22:15,879
this particular term, summation y hat
square, is nothing but summation (y hat
194
00:22:15,879 --> 00:22:22,879
minus y bar) whole square, i equal to 1 to
n; then this is nothing but summation,
195
00:22:25,450 --> 00:22:32,450
you can say, (y i minus y i hat) whole square, i equal
to 1 to n. In fact, the entire process started
196
00:22:34,690 --> 00:22:41,690
from here only because, our entire model is
nothing but, e equal to y minus y hat and
197
00:22:42,389 --> 00:22:47,700
the way we are minimizing the error
sum of squares, we have received the alpha hat component
198
00:22:47,700 --> 00:22:53,649
and beta hat component.
Now, to justify the significance of this particular
199
00:22:53,649 --> 00:23:00,549
parameter alpha hat and the parameter beta
hat, we again come back to this particular process.
200
00:23:00,549 --> 00:23:07,549
Now, we have to explain this; there
are lots of interesting facts behind this particular
201
00:23:09,220 --> 00:23:16,220
structure. So, let us see what this particular
structure is all about.
202
00:23:16,609 --> 00:23:23,609
Now, we have the component summation y square
equal to summation y hat square plus summation
203
00:23:26,259 --> 00:23:33,259
e square. That means, we can conclude that the
total sum square is equal to the explained sum
204
00:23:33,429 --> 00:23:40,229
square plus the residual sum square. So,
we are now in a position to say that total
205
00:23:40,229 --> 00:23:47,229
sum square is equal to explained sum square
plus residual sum square, as I have highlighted
206
00:23:48,029 --> 00:23:55,029
earlier. You know, when we have the y series
and the x series, that is our
207
00:23:55,309 --> 00:24:01,999
beginning; so we have y information and
x information, and through the process
208
00:24:01,999 --> 00:24:08,149
we have received the error component. That
is how we can say it is all about, you can
209
00:24:08,149 --> 00:24:15,149
say, statistics or econometrics; so that means,
we would like to verify whether x is totally
210
00:24:15,679 --> 00:24:22,679
influencing the y component or x is partly
influencing y and some of the other part can
211
00:24:22,960 --> 00:24:27,739
be explained in another way.
For instance, if x is not 100 percent influencing
212
00:24:27,739 --> 00:24:34,320
y, then obviously there is something lacking;
that lacking part we have to discuss, and
213
00:24:34,320 --> 00:24:39,109
that is nothing but what we call the residuals.
So, that means, when we have the y series, we would like
214
00:24:39,109 --> 00:24:43,509
to know the total sum square; that
is nothing but the sum of the (y i minus y bar)
215
00:24:43,509 --> 00:24:50,509
deviations, squared. That means the
variation of all these points from the, you
216
00:24:51,460 --> 00:24:56,499
can say, arithmetic mean. Now, the total
sum square is equal to the explained sum square.
217
00:24:56,499 --> 00:25:02,669
That is nothing but summation (y hat minus
y hat bar) squared, and the rest is summation e
218
00:25:02,669 --> 00:25:09,669
square; that is the residual sum square.
Now, put it technically. What will I do? Let
219
00:25:10,129 --> 00:25:15,259
us assume that this is equation number 1;
so what will I do? I will divide by summation
220
00:25:15,259 --> 00:25:22,259
y square on both sides; so, dividing by summation
y square on both sides of equation 1,
221
00:25:31,529 --> 00:25:38,309
what do you have? You see here, summation
y square divided by summation y square is
222
00:25:38,309 --> 00:25:45,309
equal to summation y hat square by summation
y square plus summation e square by summation
223
00:25:47,229 --> 00:25:53,950
y square alright.
Now, this particular term is exactly equal
224
00:25:53,950 --> 00:26:00,950
to 1; this is equal to 1. Now, this is
one component and this is another component.
225
00:26:05,529 --> 00:26:09,799
So, that means 1 equal to summation y hat
square by summation y square plus summation
226
00:26:09,799 --> 00:26:16,799
e square by summation y square. This
is how we are in a position to write
227
00:26:21,330 --> 00:26:27,950
it like this. Obviously, i equal to 1 to n here,
i equal to 1 to n here; so this is i equal
228
00:26:27,950 --> 00:26:34,950
to 1 up to n here alright.
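These two ratios can be computed directly. A sketch with invented data (all numbers hypothetical) that builds TSS, ESS, and RSS and confirms the two parts sum to one:

```python
# Compute TSS, ESS, RSS for an OLS fit and verify ESS/TSS + RSS/TSS = 1.
# Data are hypothetical illustration values.

X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
beta_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
            / sum((xi - x_bar) ** 2 for xi in X))
alpha_hat = y_bar - beta_hat * x_bar
Y_hat = [alpha_hat + beta_hat * xi for xi in X]

tss = sum((yi - y_bar) ** 2 for yi in Y)                # total sum of squares
ess = sum((fi - y_bar) ** 2 for fi in Y_hat)            # explained sum of squares
rss = sum((yi - fi) ** 2 for yi, fi in zip(Y, Y_hat))   # residual sum of squares

part_a = ess / tss   # explained share of total variation
part_b = rss / tss   # residual (unexplained) share
```

Here `part_a` is the part A ratio discussed next, and `part_b` the part B ratio.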
Now, we have 2 parts; so we call this
229
00:26:35,999 --> 00:26:42,999
part A and this we call part B. Let
us first explain this part A component.
230
00:26:43,099 --> 00:26:49,909
So, the part A component is like this: summation
y hat square by summation y square. What is
231
00:26:49,909 --> 00:26:56,909
y hat exactly? Small y hat is nothing but
beta hat into small x, so part A is the summation of
232
00:26:58,679 --> 00:27:05,679
this whole square, i equal to 1 to n, divided
by summation y square, obviously i equal to
233
00:27:07,489 --> 00:27:14,489
1 to n.
Now, if we simplify
234
00:27:15,649 --> 00:27:22,649
further, then it is nothing but beta hat square
into summation x square divided by summation
235
00:27:23,029 --> 00:27:30,029
y square. What is beta hat? Beta
hat is equal to summation
236
00:27:30,129 --> 00:27:37,129
x y by summation x square; so beta
hat square equals summation x y whole square by summation x
237
00:27:39,960 --> 00:27:46,960
square. Now, you see here if you simplify
further, then what can we do? Let me put
238
00:27:48,580 --> 00:27:52,089
it here; I will write it here
again.
239
00:27:52,089 --> 00:27:59,089
Summation y hat square by summation y square
is equal to summation x y whole square divided
240
00:28:07,549 --> 00:28:14,549
by summation x square whole square into summation
x square divided by summation y square; this
241
00:28:17,259 --> 00:28:23,029
is summation y hat square by summation y square.
Now, you see here, this is summation x square
242
00:28:23,029 --> 00:28:29,190
and this is summation x square to the power
2; so this is how one of them is cancelled,
243
00:28:29,190 --> 00:28:34,769
so that means it is nothing but, summation
x y whole square by summation x square into
244
00:28:34,769 --> 00:28:40,629
summation y square. This is the
leftover term from this component. That
245
00:28:40,629 --> 00:28:46,529
means, this is what we have received from
part A. So, if we expand this
246
00:28:46,529 --> 00:28:53,529
part A, that is the ratio
between the explained sum square and the total sum
247
00:28:53,950 --> 00:28:59,149
square. So, the exact relation is summation
y square equal to summation y hat square plus
248
00:28:59,149 --> 00:29:03,330
summation e square; that means, the total
sum square equal to explained sum square plus
249
00:29:03,330 --> 00:29:07,019
residual sum square.
Now, what have we done? We divided by the total
250
00:29:07,019 --> 00:29:13,019
sum square on both sides, so the left
side becomes equal to 1. Then,
251
00:29:13,019 --> 00:29:18,089
the first part of the right side is the explained
sum square divided by the total sum square. This
252
00:29:18,089 --> 00:29:23,719
is what we call the ratio between
the explained sum square and the total sum square; then,
253
00:29:23,719 --> 00:29:29,460
there is the ratio between the residual sum square and the total
sum square. Now, we would like to know: if we have
254
00:29:29,460 --> 00:29:34,979
the component explained sum square to total
sum square, what does it mean; and if we
255
00:29:34,979 --> 00:29:41,309
have the ratio residual sum square
divided by total sum square, what does that
256
00:29:41,309 --> 00:29:46,769
mean? So, then we have to
interpret accordingly.
257
00:29:46,769 --> 00:29:53,469
Now, by this process we are
coming to the position that summation
258
00:29:53,469 --> 00:29:59,529
y hat square by summation y square, that is,
ESS by TSS, is nothing but summation
259
00:29:59,529 --> 00:30:05,159
x y whole square by summation x square into summation
y square. This is what we call just
260
00:30:05,159 --> 00:30:10,779
like r square; this is what we call r
square. What is r square? R square
261
00:30:10,779 --> 00:30:17,779
is nothing but the square of the correlation
coefficient; this is what is called the correlation
262
00:30:21,339 --> 00:30:26,849
coefficient.
You see, what is correlation? Correlation
263
00:30:26,849 --> 00:30:33,809
is simply nothing but the covariance of X and Y divided
by sigma x into sigma y. If we simplify
264
00:30:33,809 --> 00:30:40,299
further, then it is nothing but summation
(x minus x bar) into (y minus y bar) divided by
265
00:30:40,299 --> 00:30:47,299
n, divided by the square root of summation x square by n
and the square root of summation y square by n;
266
00:30:48,339 --> 00:30:55,339
so this n, this n, and this n cancel.
Now, if this is the r component, this particular
267
00:30:57,809 --> 00:31:04,739
component is nothing but summation x y
divided by the square root of summation x square
268
00:31:04,739 --> 00:31:11,049
into summation y square. Now, if we
square it, then obviously r square
269
00:31:11,049 --> 00:31:18,049
equal to summation x y whole square divided
by summation x square into summation y square
270
00:31:19,029 --> 00:31:25,959
which is exactly what we obtained from here.
So this particular ratio, explained
271
00:31:25,959 --> 00:31:32,959
sum of squares to total sum of squares, is nothing
but the r square component. So what
272
00:31:33,139 --> 00:31:39,979
is r square here? r square represents the square
of the correlation coefficient. But this
273
00:31:39,979 --> 00:31:44,879
particular result holds when
we are in the bivariate case; when
274
00:31:44,879 --> 00:31:51,619
there is a multivariate process, this
ratio between the explained sum of squares
275
00:31:51,619 --> 00:31:55,989
and the total sum of squares cannot be represented
as the square of a simple correlation coefficient; it is
276
00:31:55,989 --> 00:31:59,379
something different.
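The bivariate identity just stated — ESS/TSS equals the squared simple correlation — can be checked with a small sketch (the data are illustrative, not from the lecture):

```python
import math

# Illustrative sketch (made-up data): in the bivariate model, R^2 = ESS/TSS
# coincides with the square of the Pearson correlation between x and y.

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.8, 4.1, 5.9, 8.2, 9.8, 12.3]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((yi - ybar) ** 2 for yi in y)

r = sxy / math.sqrt(sxx * syy)   # correlation = cov(X,Y) / (sigma_x * sigma_y)
r2_from_corr = r ** 2

beta = sxy / sxx                 # slope of the fitted line
ess = beta ** 2 * sxx            # explained sum of squares, (sum xy)^2 / sum x^2
r2_from_fit = ess / syy          # ESS / TSS

print(abs(r2_from_corr - r2_from_fit))  # the two agree (up to rounding)
```
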
What is this difference? The difference is
277
00:31:59,379 --> 00:32:05,330
actually that this particular r square component
is represented as the coefficient of determination;
278
00:32:05,330 --> 00:32:12,330
this r square
component is called the coefficient
279
00:32:12,909 --> 00:32:19,909
of determination.
280
00:32:23,049 --> 00:32:25,700
So, what is this coefficient of determination?
281
00:32:25,700 --> 00:32:32,700
Now, for the coefficient of determination:
you see, here we have summation y square
282
00:32:34,159 --> 00:32:41,159
equal to summation
y hat square plus summation e square. So,
283
00:32:41,859 --> 00:32:48,099
that is what we have received; dividing through, 1 equals
summation y hat square by summation y square
284
00:32:48,099 --> 00:32:55,099
plus summation e square by summation y square.
The first ratio here
285
00:32:57,629 --> 00:33:04,629
is otherwise known as ESS by TSS, and
the second is RSS by TSS.
286
00:33:06,089 --> 00:33:13,089
Now, this particular component, by the
process of derivation, is
287
00:33:13,289 --> 00:33:19,869
nothing but what we may simply
call R square. Usually, when we
288
00:33:19,869 --> 00:33:26,289
represent the coefficient of determination,
it is written as capital
289
00:33:26,289 --> 00:33:31,830
R square, while what we wrote earlier
is called small r square; that means
290
00:33:31,830 --> 00:33:38,830
small r and capital R square are the same
in the case of bivariate models.
291
00:33:39,599 --> 00:33:45,559
So, in the case of the bivariate model,
the coefficient of determination
292
00:33:45,559 --> 00:33:52,219
and the square of the correlation coefficient
are the same, but
293
00:33:52,219 --> 00:33:57,289
the interpretation is somewhat different. With
the correlation coefficient, what do we have
294
00:33:57,289 --> 00:34:03,269
to study? The degree of association between the
2
295
00:34:03,269 --> 00:34:06,709
variables.
Now, with capital R square, on the other hand, we have
296
00:34:06,709 --> 00:34:12,629
judged the ratio of the explained sum of squares
to the total sum of squares; the explained sum of squares
297
00:34:12,629 --> 00:34:19,320
is the total variation coming from the x component,
the explained items, divided by the total
298
00:34:19,320 --> 00:34:25,389
variation of the y component, which is the dependent
component. Now, we would like to know the
299
00:34:25,389 --> 00:34:31,859
percentage influence of the independent variable
on the dependent variable, or of the explanatory
300
00:34:31,859 --> 00:34:37,889
variable on the explained variable. That means R
301
00:34:37,889 --> 00:34:44,010
square is the ratio of the explained sum of
squares to the total sum of squares, and by identity
302
00:34:44,010 --> 00:34:51,010
it plus RSS by
TSS equals 1.
303
00:34:51,659 --> 00:34:58,659
Now, there is a beautiful interpretation
here; so, what is this beautiful interpretation?
304
00:34:58,720 --> 00:35:05,720
Fortunately, this particular
item can again be turned into this one. Now,
305
00:35:05,849 --> 00:35:12,849
we know the correlation coefficient always lies
between minus 1 and plus 1; so
306
00:35:14,039 --> 00:35:20,809
this is the range of the
correlation coefficient,
307
00:35:20,809 --> 00:35:27,809
alright. Now, R square, on the
other hand,
308
00:35:28,660 --> 00:35:35,660
is always between 0 and 1, so this
is the range of the coefficient of determination.
309
00:35:35,960 --> 00:35:41,359
So what is the coefficient of determination?
Technically, it is the ratio of the explained sum of squares
310
00:35:41,359 --> 00:35:48,359
to the total sum of squares; by physical interpretation,
311
00:35:48,710 --> 00:35:55,480
it is the total variation
of the explained items relative to the total variation
312
00:35:55,480 --> 00:36:00,160
of y.
So, this is how it is called R square
313
00:36:00,160 --> 00:36:06,109
or the coefficient of determination. The
coefficient of determination
314
00:36:06,109 --> 00:36:13,109
is nothing but the proportion of the variation
of y which is explained by the proportion
315
00:36:13,450 --> 00:36:20,450
variation of x. The ESS by TSS term is the
proportion of the variation of y which is explained
316
00:36:21,910 --> 00:36:28,359
by the proportion variation of x, and the RSS by TSS
component is the proportion of the variation
317
00:36:28,359 --> 00:36:34,680
of y which is left unexplained;
the total sum of squares is
318
00:36:34,680 --> 00:36:38,440
nothing but summation y square, our total
component.
319
00:36:38,440 --> 00:36:45,010
We would like to know what is the influence of x on y
and what is the influence of e on y; that is why it is
320
00:36:45,010 --> 00:36:50,069
known as the proportion variation of y. ESS by TSS
is the proportion of the variation of y which is explained
321
00:36:50,069 --> 00:36:56,829
by the proportion variation of
x, because the entire
322
00:36:56,829 --> 00:37:03,250
ESS depends upon the x component only; then
RSS by TSS is the proportion of the variation
323
00:37:03,250 --> 00:37:08,990
of y which is not explained
by x, and that is what we call RSS.
324
00:37:08,990 --> 00:37:15,599
That will be taken care of by the u component,
so 1 equals R square plus RSS
325
00:37:15,599 --> 00:37:20,819
by TSS.
Now, R square lies in the range between 0 and 1,
326
00:37:20,819 --> 00:37:26,650
and this will give you the signal
about the reliability of the model.
327
00:37:26,650 --> 00:37:31,470
So far as the second objective is concerned:
you see, we started with the first objective,
328
00:37:31,470 --> 00:37:37,140
and by default we now come to
the second objective, that is, the overall
329
00:37:37,140 --> 00:37:41,869
fitness of the model. Now, the moment we
get R square, the proper question is
330
00:37:41,869 --> 00:37:46,480
how to obtain this R square
and how to establish its statistical
331
00:37:46,480 --> 00:37:51,019
level of significance. Where the first
objective is concerned, with respect to alpha
332
00:37:51,019 --> 00:37:55,519
hat and beta hat, we apply the t
statistic. When we go for the
333
00:37:55,519 --> 00:38:00,250
overall fitness of the model,
we have to use the F statistic.
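The overall-fitness F statistic for the bivariate model can be sketched as follows; the data are illustrative, not from the lecture:

```python
# Illustrative sketch (made-up data): the overall-fitness F statistic for
# the bivariate model, F = (ESS / 1) / (RSS / (n - 2)), with 1 and n - 2
# degrees of freedom.

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.2, 3.1, 4.8, 5.2, 6.9, 7.1, 8.8, 9.5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
beta = sxy / sxx                 # fitted slope
alpha = ybar - beta * xbar       # fitted intercept

ess = sum((alpha + beta * xi - ybar) ** 2 for xi in x)
rss = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))

f_stat = (ess / 1) / (rss / (n - 2))
print(f_stat)  # compare with the tabulated F(1, n-2) critical value
```
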
334
00:38:00,250 --> 00:38:06,769
Now, we are just explaining how we obtain
the error variance and how it is connected
335
00:38:06,769 --> 00:38:13,769
to the total variance of y and the total variance
of x. By this process we explain
336
00:38:14,609 --> 00:38:19,339
the structure of the significance
of the individual parameters, namely alpha
337
00:38:19,339 --> 00:38:25,569
hat and beta hat. On the other side,
by using TSS, ESS
338
00:38:25,569 --> 00:38:31,019
and RSS, we explain how the overall
fitness of the model can be shown to be statistically
339
00:38:31,019 --> 00:38:36,170
significant. So we have 2 clear-cut
objectives in mind: first, the significance
340
00:38:36,170 --> 00:38:41,380
of the parameters, and second, the significance of the
overall fitness of the model. So, before
341
00:38:41,380 --> 00:38:47,269
I highlight the entire structure
of the R square significance level and the
342
00:38:47,269 --> 00:38:52,760
individual parameters' significance levels,
we would like to highlight the influence of
343
00:38:52,760 --> 00:38:59,760
R square, because the value of R square always lies
between 0 and 1; so if it is 0, what is the
344
00:39:00,900 --> 00:39:05,410
structure, and if it is 1, what is the structure?
Let us see here.
345
00:39:05,410 --> 00:39:12,410
Now, the entire relation is that R square
plus summation e square by summation
346
00:39:13,349 --> 00:39:20,279
y square is exactly equal to 1. This
is what we have observed; since this is
347
00:39:20,279 --> 00:39:27,279
our target, what do we do? We take
R square equal to 1 minus summation e square
348
00:39:28,559 --> 00:39:35,299
by summation y square; this is what we
obtain from this simplification.
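The relation R square = 1 minus summation e square by summation y square, and its two extreme cases, can be sketched numerically (the data are deliberately constructed for illustration):

```python
# Illustrative sketch: R^2 = 1 - RSS/TSS at its two extremes,
# using deliberately constructed data.

def r_squared(x, y):
    """R^2 of the bivariate OLS fit of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    beta = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    alpha = ybar - beta * xbar
    rss = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - rss / tss

x = [1.0, 2.0, 3.0, 4.0]
perfect = [3.0, 5.0, 7.0, 9.0]    # exactly linear in x: all residuals vanish
unrelated = [1.0, 4.0, 4.0, 1.0]  # zero covariance with x: nothing explained

print(r_squared(x, perfect))    # 1.0 -> perfectly fitted
print(r_squared(x, unrelated))  # 0.0 -> completely unfit
```
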
349
00:39:35,299 --> 00:39:42,299
So, what do we do here now? Let us take
case 1: if R square is equal to
350
00:39:42,970 --> 00:39:48,710
1, then what will happen? If R square equals
1, the term summation e square by summation y square
351
00:39:48,710 --> 00:39:54,680
is exactly equal to 0; so R square
equal to 1 means ESS by TSS equals 1 and
352
00:39:54,680 --> 00:40:01,680
RSS by TSS equals 0. That means
the model is absolutely fit for the
353
00:40:02,809 --> 00:40:09,809
problem at hand.
So, when R square is 1, it is the best
354
00:40:10,359 --> 00:40:16,359
fitted model. When R square is exactly
equal to 1, the percentage of the unexplained
355
00:40:16,359 --> 00:40:22,029
component is
exactly equal to 0. That means there is no
356
00:40:22,029 --> 00:40:28,000
way u has an impact on the y variable;
the percentage
357
00:40:28,000 --> 00:40:34,680
influence of x on y is 100 percent. This
is the case where R square is exactly equal to
358
00:40:34,680 --> 00:40:41,140
1. But in a real-life
problem, it is very difficult to get a situation
359
00:40:41,140 --> 00:40:45,950
where R square is exactly equal to 1, alright.
On the other side, when R square equals
360
00:40:45,950 --> 00:40:51,690
1, it is called a completely fitted or
perfectly fitted model. This is what we
361
00:40:51,690 --> 00:40:58,690
call a perfectly
fitted model, but this is not the sufficient
362
00:40:59,970 --> 00:41:06,450
condition; it is the necessary condition.
When R square equals 1, the overall fitness
363
00:41:06,450 --> 00:41:12,779
of the model is very high, indeed excellent,
so the estimated model is completely
364
00:41:12,779 --> 00:41:19,609
fitted and can be
used for forecasting. But the sufficient
365
00:41:19,609 --> 00:41:25,299
condition is that, when R square is exactly
equal to 1, then, corresponding to the first
366
00:41:25,299 --> 00:41:31,750
objective, the significance of
alpha hat and beta hat has to be high
367
00:41:31,750 --> 00:41:38,200
- highly significant. Only then can we
say it is the best fitted model. Otherwise,
368
00:41:38,200 --> 00:41:43,880
if R square is exactly 1 and
the significance of the model is explicitly
369
00:41:43,880 --> 00:41:50,230
high, but on the other side the parameters are not
statistically significant, or a few parameters are
370
00:41:50,230 --> 00:41:55,500
statistically significant and the other parameters
are not significant even at a very low level,
371
00:41:55,500 --> 00:42:01,809
Then the model cannot be used for forecasting,
even if R square equals 1. If we are
372
00:42:01,809 --> 00:42:06,930
just at the beginning of this process and
we have R square equal to 1, but the parameters
373
00:42:06,930 --> 00:42:13,500
are not all statistically
highly significant, then there is a serious
374
00:42:13,500 --> 00:42:19,410
problem in the modeling. So, there will be
some complex problem in between.
375
00:42:19,410 --> 00:42:24,880
Those complex problems we have not highlighted;
we will highlight the details when we proceed
376
00:42:24,880 --> 00:42:31,640
accordingly.
We will come to them at a later stage, not now.
377
00:42:31,640 --> 00:42:37,609
So, what we can explain now is that
when R square equals 1, we just interpret
378
00:42:37,609 --> 00:42:42,329
that it is a perfectly fitted model, keeping
other things constant.
379
00:42:42,329 --> 00:42:49,329
Now, case 2: when R square is equal to 0,
then the model is completely
380
00:42:54,069 --> 00:43:01,069
unfit. That means the entire variation is
received from u only; RSS by TSS
381
00:43:04,029 --> 00:43:11,029
is equal to 1 and ESS by TSS is equal to 0.
Recall that when R square is equal
382
00:43:11,559 --> 00:43:16,529
to 1, summation e square is equal to 0, and
then summation y square
383
00:43:16,529 --> 00:43:21,829
is equal to summation y
hat square.
384
00:43:21,829 --> 00:43:25,690
When it is unfit, then summation y square is
385
00:43:25,690 --> 00:43:29,839
equal to summation e square, alright. But this
is rare. Why is it rare?
386
00:43:29,839 --> 00:43:36,839
Well, it may or may not be literally rare, but it
is a very extreme situation. The reality is
387
00:43:41,369 --> 00:43:48,279
that when we are in the process
of fitting a model, obviously
388
00:43:48,279 --> 00:43:53,619
we must have some theoretical knowledge. So,
when we have theoretical knowledge,
389
00:43:53,619 --> 00:43:59,720
in most instances R square
cannot be equal to 0. It may be at a very low level,
390
00:43:59,720 --> 00:44:06,720
but it cannot be 0. If your R square value
is coming out to be 0, that means your theory is not
391
00:44:07,069 --> 00:44:12,119
sound, or the identification of the problem
and of the variables has not been done systematically.
392
00:44:12,119 --> 00:44:17,490
So, there is some kind of problem; that is
why, before fitting this particular model,
393
00:44:17,490 --> 00:44:23,210
your theoretical knowledge should be
very sound and you must be in a position
394
00:44:23,210 --> 00:44:29,769
to identify exactly the structural variables.
If your initial homework
395
00:44:29,769 --> 00:44:35,339
is very thorough, then the later stages
of modeling will not face problems. Otherwise,
396
00:44:35,339 --> 00:44:39,880
it is a continuous process until
you get the best fitted model. If you do
397
00:44:39,880 --> 00:44:46,230
not go stepwise through the process, then
every time we will go back to the original
398
00:44:46,230 --> 00:44:51,960
position until we get a better fitted model.
So, that is why each and every stage should
399
00:44:51,960 --> 00:44:58,960
be completed perfectly before we go to the next stage.
So, in reality we have R square at one extreme
400
00:45:01,269 --> 00:45:08,269
equal to 1 and at the other extreme equal
to 0, and both extremes are
401
00:45:08,539 --> 00:45:15,539
essential reference points. So, what actually happens is,
when the R square value is close to 1,
402
00:45:15,839 --> 00:45:22,839
it is called a better
fitted model. We cannot say best fitted model;
403
00:45:23,539 --> 00:45:27,150
we call it the best fitted model only when
R square is exactly equal to 1.
404
00:45:27,150 --> 00:45:34,150
Now, as R square moves close to 1,
the fitness
405
00:45:34,859 --> 00:45:40,039
of the model starts increasing; that means
you start from R square equal to 0, then
406
00:45:40,039 --> 00:45:47,039
0.01, 0.02, 0.03, 0.04, 0.05, and so on,
going all the way up to 1.00.
407
00:45:47,990 --> 00:45:54,990
Now, we have different ranges here; in fact,
see here, the range is like this. Take
408
00:45:55,640 --> 00:46:02,640
a scale marked 0, then
0.1, then 0.2, then 0.3, and so on,
409
00:46:05,589 --> 00:46:12,190
then 0.5, and of course finally
1.0; this is how R square
410
00:46:12,190 --> 00:46:17,069
ranges, from 0 up to 1. So the range
411
00:46:17,069 --> 00:46:24,069
is like this, with 0.5 in the middle;
if you move toward 0, the model fitness
412
00:46:24,329 --> 00:46:29,660
or model accuracy starts declining, and when
we move toward 1, the model accuracy
413
00:46:29,660 --> 00:46:34,539
starts increasing.
Now, our objective is always to go toward 1,
414
00:46:34,539 --> 00:46:39,609
not toward 0, so that the overall fitness
of the model keeps
415
00:46:39,609 --> 00:46:46,400
increasing. When your R square value
is close to 1, it is
416
00:46:46,400 --> 00:46:51,900
just like a green signal: it is
the mark of a best fitted model. When
417
00:46:51,900 --> 00:46:58,160
we are close to 0,
it gives you the red
418
00:46:58,160 --> 00:47:04,480
signal; that means we are diverging
from the best fitted model. So we should not
419
00:47:04,480 --> 00:47:09,680
go toward the red signal; rather, we have
to go toward the green signal, where the best
420
00:47:09,680 --> 00:47:14,829
fitted model lies and the model accuracy keeps
increasing. This is our main
421
00:47:14,829 --> 00:47:21,440
agenda before we go into this process.
Now, coming back to the original position,
422
00:47:21,440 --> 00:47:26,710
what is the actual structure? Our objective
here is to test whether R square
423
00:47:26,710 --> 00:47:28,559
is statistically significant or not.
424
00:47:28,559 --> 00:47:34,390
Further, we have to prepare the ANOVA table,
just
425
00:47:34,390 --> 00:47:40,539
as we did for the first
objective, which we have explained here. For the first
426
00:47:40,539 --> 00:47:47,539
objective,
427
00:47:50,470 --> 00:47:57,190
the structure of the fitness
test is like this: we would like to
428
00:47:57,190 --> 00:48:04,190
know the target components t alpha hat and
t beta hat. Now, we have received
429
00:48:05,220 --> 00:48:11,599
here summation e square by n minus 2; so that
is what sigma u now variance of alpha hat
430
00:48:11,599 --> 00:48:14,740
you have.
Similarly, you have to go for standard error
431
00:48:14,740 --> 00:48:21,190
of alpha hat standard error of alpha hat is
nothing but, variance of alpha hat. Now similarly,
432
00:48:21,190 --> 00:48:27,859
what we have to do is get
the variance of beta hat: the variance of
433
00:48:27,859 --> 00:48:33,619
beta hat is nothing but sigma square u by
summation x square. Similarly, the standard error
434
00:48:33,619 --> 00:48:39,710
of beta hat is nothing
but the square root of the variance of beta hat; so
435
00:48:39,710 --> 00:48:45,549
this is the beta hat parameter structure.
And the alpha hat parameter structure
436
00:48:45,549 --> 00:48:52,549
is as follows: the variance of alpha hat
437
00:48:53,140 --> 00:49:00,140
is nothing but sigma square u into summation X
square by n summation x square, and the standard error
438
00:49:00,869 --> 00:49:06,990
of alpha hat is the square root
of this term, alright.
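The variance and standard-error formulas quoted here can be sketched with a short program; the data are illustrative, and lowercase x denotes deviations from the mean:

```python
import math

# Illustrative sketch (made-up data) of the formulas quoted in the lecture,
# with x_i denoting deviations X_i - Xbar:
#   sigma^2_u   = sum(e^2) / (n - 2)
#   Var(beta^)  = sigma^2_u / sum(x^2)
#   Var(alpha^) = sigma^2_u * sum(X^2) / (n * sum(x^2))
# and each standard error is the square root of the variance.

X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [1.9, 4.2, 5.8, 8.1, 9.9, 12.2]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n

sxx = sum((Xi - Xbar) ** 2 for Xi in X)          # sum of squared deviations
beta = sum((Xi - Xbar) * (Yi - Ybar) for Xi, Yi in zip(X, Y)) / sxx
alpha = Ybar - beta * Xbar

e = [Yi - (alpha + beta * Xi) for Xi, Yi in zip(X, Y)]
sigma2_u = sum(ei ** 2 for ei in e) / (n - 2)    # error variance

var_beta = sigma2_u / sxx
se_beta = math.sqrt(var_beta)
var_alpha = sigma2_u * sum(Xi ** 2 for Xi in X) / (n * sxx)
se_alpha = math.sqrt(var_alpha)

print(se_alpha, se_beta)
```
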
439
00:49:06,990 --> 00:49:13,819
Now, what do
you have to do? Once you have the variance
440
00:49:13,819 --> 00:49:18,930
of alpha hat, you can get the standard error
of alpha hat. So, what is the issue here?
441
00:49:18,930 --> 00:49:24,599
Now, our objective is to know whether alpha
hat is significant and whether beta hat
442
00:49:24,599 --> 00:49:29,049
is significant. So,
we need to calculate t of alpha hat and we
443
00:49:29,049 --> 00:49:35,309
need to calculate t of beta hat. Further,
to know the significance of
444
00:49:35,309 --> 00:49:41,690
alpha hat and beta hat, we have to set up
445
00:49:41,690 --> 00:49:45,059
a statistical hypothesis.
Basically, the statistical hypothesis is divided
446
00:49:45,059 --> 00:49:50,730
into 2 parts, called the null hypothesis and the
alternative hypothesis: we have the null hypothesis,
447
00:49:50,730 --> 00:49:56,289
and in contrast to the null hypothesis we
have the alternative hypothesis. We start with
448
00:49:56,289 --> 00:50:01,220
the null hypothesis. Suppose our
target is to test whether alpha is significant; alpha
449
00:50:01,220 --> 00:50:07,109
being significant means alpha must have some
value, and on the basis of that value we
450
00:50:07,109 --> 00:50:11,859
test
the significance. Usually we start with
451
00:50:11,859 --> 00:50:18,859
the null hypothesis that alpha is equal to 0,
452
00:50:19,339 --> 00:50:24,430
and the alternative that alpha
is not equal to 0.
453
00:50:24,430 --> 00:50:28,749
Once you reject this null hypothesis,
we are on the right track; if you could
454
00:50:28,749 --> 00:50:33,990
not reject it, then that variable may not be
statistically significant. Now, t of alpha hat,
455
00:50:33,990 --> 00:50:38,299
which we calculate,
is nothing but alpha hat divided by the standard error
456
00:50:38,299 --> 00:50:44,009
of alpha hat, and t of beta hat is nothing
but beta hat divided by the standard error
457
00:50:44,009 --> 00:50:50,349
of beta hat. Each of these is a calculated
statistic that has to be
458
00:50:50,349 --> 00:50:55,710
compared with the tabulated statistic.
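The calculated-versus-tabulated comparison can be sketched as follows; both the data and the two-tailed 5 percent critical value for n − 2 = 4 degrees of freedom (2.776, a standard t-table entry) are assumptions for this example:

```python
import math

# Illustrative sketch (made-up data): t = estimate / standard error for
# alpha hat and beta hat, compared with a tabulated two-tailed critical
# value (t_crit = 2.776 is the usual 5% table entry for 4 degrees of freedom).

X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [2.1, 3.8, 6.1, 8.0, 9.7, 12.2]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n

sxx = sum((Xi - Xbar) ** 2 for Xi in X)
beta = sum((Xi - Xbar) * (Yi - Ybar) for Xi, Yi in zip(X, Y)) / sxx
alpha = Ybar - beta * Xbar

e = [Yi - (alpha + beta * Xi) for Xi, Yi in zip(X, Y)]
sigma2_u = sum(ei ** 2 for ei in e) / (n - 2)

se_beta = math.sqrt(sigma2_u / sxx)
se_alpha = math.sqrt(sigma2_u * sum(Xi ** 2 for Xi in X) / (n * sxx))

t_alpha = alpha / se_alpha  # calculated statistic for H0: alpha = 0
t_beta = beta / se_beta     # calculated statistic for H0: beta = 0

t_crit = 2.776  # tabulated t, 5% two-tailed, n - 2 = 4 degrees of freedom
print(abs(t_beta) > t_crit)  # True -> reject H0: beta = 0
```
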
459
00:50:55,710 --> 00:51:00,460
Then we get to know whether this particular
item is statistically significant or not, and
460
00:51:00,460 --> 00:51:06,809
if it is significant, at what level it is significant.
We have different significance levels: 1 percent
461
00:51:06,809 --> 00:51:13,809
one-tailed and two-tailed, 5 percent one-tailed and two-tailed, and
10 percent one-tailed and two-tailed. The standard
462
00:51:19,529 --> 00:51:24,589
procedure is to start with the 1 percent
level. Then, if it is not significant,
463
00:51:24,589 --> 00:51:27,490
you move to 5 percent. If it is
not significant at 5 percent, then you
464
00:51:27,490 --> 00:51:31,670
have to go to 10 percent. But if you
get significance at 1 percent, that
465
00:51:31,670 --> 00:51:36,460
means your model accuracy is very high
and the reliability of the model
466
00:51:36,460 --> 00:51:43,230
is near perfect.
If we get significance only at the 10 percent
467
00:51:43,230 --> 00:51:50,230
level, yes, the model is reliable, but the degree
of reliability may be much less. So when a
468
00:51:50,769 --> 00:51:55,970
variable is statistically significant
at, or close to, the 1 percent level,
469
00:51:55,970 --> 00:52:02,259
then obviously the model reliability or model
accuracy is very high.
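The 1-then-5-then-10 percent stepping procedure just described can be sketched as a small helper; the two-tailed critical values below are the usual t-table entries for 18 degrees of freedom, used here purely as an illustrative assumption:

```python
# Illustrative sketch of the stepping procedure: try 1%, then 5%, then 10%.
# The two-tailed t critical values are the usual table entries for 18
# degrees of freedom and are an assumption for this example.
CRITICAL = [("1%", 2.878), ("5%", 2.101), ("10%", 1.734)]

def significance_level(t_calc):
    """Return the tightest level at which |t_calc| is significant, else None."""
    for label, t_crit in CRITICAL:
        if abs(t_calc) > t_crit:
            return label
    return None

print(significance_level(3.2))  # 1%  -> very high reliability
print(significance_level(1.9))  # 10% -> reliable, but to a lesser degree
print(significance_level(0.8))  # None -> not significant
```
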
470
00:52:02,259 --> 00:52:09,259
Now, we have to design or redesign the model in such a
471
00:52:10,420 --> 00:52:17,420
way that the parameters involved in
this particular modeling system
472
00:52:17,720 --> 00:52:24,029
are highly statistically
significant, and mostly they should be significant at
473
00:52:24,029 --> 00:52:31,029
the 1 percent level. Only if that is so
is the model reliable so far as the first
474
00:52:31,220 --> 00:52:34,339
order condition is…
Now, again, for the sufficient condition we have
475
00:52:34,339 --> 00:52:40,029
to look at R square; that means there are 2
requirements here. All your parameters should
476
00:52:40,029 --> 00:52:47,029
be statistically significant at a high level,
the 1 percent level, and at the same time your R square
477
00:52:47,309 --> 00:52:54,259
should also be statistically significant at
the 1 percent level, or at a high
478
00:52:54,259 --> 00:53:00,029
level. If so, the model is absolutely
fit for forecasting. But if the
479
00:53:00,029 --> 00:53:05,710
parameters are significant and R square is
not significant, or R square is significant and the
480
00:53:05,710 --> 00:53:09,069
parameters are not significant, then the problem
is very complicated.
481
00:53:09,069 --> 00:53:14,619
That means there is some kind of fault
or problem in the process, and that
482
00:53:14,619 --> 00:53:20,480
process has to be investigated further;
there are certain problems in between
483
00:53:20,480 --> 00:53:24,690
if we are getting the first part but
not the second part of the system.
484
00:53:24,690 --> 00:53:30,519
The system works perfectly
when the parameters are significant and
485
00:53:30,519 --> 00:53:37,049
R square is also statistically significant.
If not, there is a serious issue with this
486
00:53:37,049 --> 00:53:41,710
particular estimated model: we have to redesign
or rebuild it until we get the best
487
00:53:41,710 --> 00:53:47,390
fitted model, where the parameters are statistically
significant and your R square is statistically
488
00:53:47,390 --> 00:53:51,670
significant. So, we will discuss the details in
the next class; thank you very much; have
489
00:53:51,670 --> 00:53:52,369
a nice day.