1
00:00:17,650 --> 00:00:24,650
Good afternoon. Welcome to NPTEL project on
econometric modeling. This is Rudra Pradhan
2
00:00:25,930 --> 00:00:32,930
here. Today, we will discuss bivariate econometric
modeling, that is, with respect to regression
3
00:00:37,300 --> 00:00:44,300
analysis. In my last lecture, we have discussed
the entire structure of bivariate econometric
4
00:00:47,400 --> 00:00:54,400
modelling. Basically, it is divided into two
parts: first part, association between two
5
00:00:58,000 --> 00:01:05,000
variables; and, in the second case, we like
to know the association along with the cause
and effect relationship between the two variables.
6
00:01:05,489 --> 00:01:12,489
Now, in the first case, there are several
techniques: variance, covariance and correlation.
7
00:01:19,470 --> 00:01:26,470
And, on the other side, there is a technique called regression.
The difference is that in the first case,
8
00:01:33,800 --> 00:01:40,800
particularly with respect to variance, covariance
and correlation, the objective is to measure
9
00:01:41,900 --> 00:01:48,900
the degree of association between the two
variables. However, in the case of regression,
10
00:01:53,090 --> 00:02:00,090
we are interested in two things: first, the
association between two variables; and second,
11
00:02:01,159 --> 00:02:08,080
the cause and effect relationship between
two variables.
12
00:02:08,080 --> 00:02:15,080
Basically, bivariate econometric modelling
is divided into two parts: first, with
13
00:02:15,580 --> 00:02:22,580
respect to the degree of association; and, in the
second case, it is the causal relation. We have standard
14
00:02:30,590 --> 00:02:37,590
techniques called as variance, covariance
and correlation. Here this standard technique
15
00:02:43,170 --> 00:02:50,170
is called as regression. Now, the starting
point of this particular structure is that
16
00:02:56,169 --> 00:03:03,169
the system must have two variables. Let us
take two variables here: X represents X 1
17
00:03:04,299 --> 00:03:11,299
upto X n and Y represents Y 1 upto Y n.
Now, variance and covariance are in fact very
18
00:03:18,999 --> 00:03:25,999
similar. Variance means we have to track the
association with the same variable. For instance,
19
00:03:30,760 --> 00:03:37,760
we have to correlate X with X or Y with Y.
So, this is what
is called variance. Now, if we correlate
20
00:03:48,010 --> 00:03:55,010
xy or YX, then it is a covariance. Now, similarly,
if we relate x with y or Y with X, then
21
00:04:02,290 --> 00:04:08,260
it is also called as a correlation. There
is a small difference between covariance and
22
00:04:08,260 --> 00:04:14,319
correlation, but the objective of the particular
study is very much similar, because we will
23
00:04:14,319 --> 00:04:21,319
try to know the degree of association between
the two variables. Here the difference is
24
00:04:21,620 --> 00:04:26,580
only with respect to the mathematics,
nothing else.
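What was just said can be sketched in a few lines of Python (a minimal sketch using the population divisor n, as in this lecture; the sample arrays are illustrative, not from the lecture):

```python
def mean(v):
    return sum(v) / len(v)

def variance(v):
    # Var(X) = sum of (X - X bar)^2, divided by n
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

def covariance(a, b):
    # Cov(X, Y) = sum of (X - X bar)(Y - Y bar), divided by n
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def correlation(a, b):
    # r = Cov(X, Y) / (sigma x * sigma y)
    return covariance(a, b) / (variance(a) ** 0.5 * variance(b) ** 0.5)

X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 6.0, 8.0, 10.0]
print(correlation(X, Y))  # perfectly linear pair, so r is 1 up to rounding
```

Variance is the covariance of a variable with itself, and correlation is the covariance rescaled by the two standard deviations, which is the "mathematics only" difference just mentioned.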
25
00:04:26,580 --> 00:04:31,660
Now, in the case of regression, we like to
know the cause and effect relationship
26
00:04:31,660 --> 00:04:37,940
between the two variables. In my last lecture,
we have discussed the detailed structure and
27
00:04:37,940 --> 00:04:43,949
status of covariance and correlation; however,
we have not discussed anything about the regression.
28
00:04:43,949 --> 00:04:50,270
Today, we will discuss in detail about regression
and we have to compare how it is different
29
00:04:50,270 --> 00:04:57,270
or more advanced than variance, covariance and correlation.
One thing is very clear: regression is very
30
00:04:59,990 --> 00:05:06,590
much dependent on the variance, covariance
and correlation. Unless and until you have complete
31
00:05:06,590 --> 00:05:13,389
knowledge on variance, covariance and correlation,
you cannot go for regression technique. Let
32
00:05:13,389 --> 00:05:15,990
me explain what regression is all about.
33
00:05:15,990 --> 00:05:22,449
Now, I have already discussed the structures,
bivariate econometric modeling: this is here
34
00:05:22,449 --> 00:05:29,449
the association rule; and, this is the causality
rule. Now, we will take the case of causality.
35
00:05:32,370 --> 00:05:39,319
Now, causality is nothing but a technique
called regression. Now, the first question
36
00:05:39,319 --> 00:05:46,319
is: what is regression all about? Regression
is to predict or forecast a particular variable
37
00:05:50,009 --> 00:05:57,009
with respect to a given variable. In other
words, it is the average association between
38
00:05:58,870 --> 00:06:05,870
two variables keeping in mind two objectives:
first objective – degree of association
39
00:06:06,280 --> 00:06:13,280
between the two variables; and, second objective
is to know the cause and effect relationship
40
00:06:14,469 --> 00:06:20,430
between the two. So, that means, if there
are two variables whether X causes Y or Y
41
00:06:20,430 --> 00:06:25,460
causes X. In fact, in the time series modelling,
we have a very interesting component here.
42
00:06:25,460 --> 00:06:32,460
There are three different options. If X causes
Y and vice versa is not true, it is called
43
00:06:38,569 --> 00:06:43,729
as unidirectional causality. Again, if Y causes
X and not vice versa, then again it is called
44
00:06:43,729 --> 00:06:50,729
as unidirectional causality. If X causes Y
and Y causes X, then it is called as bidirectional
45
00:06:51,250 --> 00:06:56,080
causality. However, we are not going in detail
about the time series modelling right now;
46
00:06:56,080 --> 00:07:00,969
we will discuss in detail in the later part.
So, in the meantime, we have to discuss what
47
00:07:00,969 --> 00:07:07,530
the entire structure of regression is; means,
we like to know what the basics of regression
48
00:07:07,530 --> 00:07:11,580
modelling are.
Now, regression is the average relationship
49
00:07:11,580 --> 00:07:18,580
between the two variables. Now, the starting
point is that we must have two variables.
50
00:07:19,580 --> 00:07:25,539
So, let us take two variables here. So, now,
regression can be obtained if there are two
51
00:07:25,539 --> 00:07:32,539
variables say Y and X. Now, if there are two
variables, then we are very much interested
52
00:07:34,169 --> 00:07:41,169
in cause and effect; that means, there are
two different situations here: Y on X and
53
00:07:41,729 --> 00:07:48,729
X on Y. Now, let us start with Y on X; then,
here this side X on Y. So, this is the component
54
00:07:51,259 --> 00:07:58,259
Y on X, is called as a regression line – Y
on X; and, X on Y means regression line from
55
00:07:58,960 --> 00:08:05,960
X to Y. So, now, if it is Y on X or X on Y,
what is the setup or structure? Let us start
56
00:08:06,430 --> 00:08:13,430
here. If it is Y on X, what is the setup
of regression? Under this setup, simply
57
00:08:13,520 --> 00:08:20,520
mathematically, we can represent Y minus Y
bar is equal to byx into X minus X bar. In
58
00:08:20,939 --> 00:08:27,939
the case of X on Y, the equation is like this,
X minus X bar equal to bxy into Y minus Y
59
00:08:30,669 --> 00:08:36,450
bar.
Now, certain things here you have to be very
60
00:08:36,450 --> 00:08:43,450
clear. First thing, we know X is a first variable;
this X is the first variable and Y is the
61
00:08:45,320 --> 00:08:52,320
second variable. Now, I have not mentioned
whether first variable… it means may be
62
00:08:53,220 --> 00:08:58,880
dependent, may be independent. The second one too
– it may be dependent, it may be independent.
63
00:08:58,880 --> 00:09:05,880
Now, if I will say Y on X, then obviously,
it is Y dependent and X independent. The moment
64
00:09:05,980 --> 00:09:12,980
I will say X on Y, then obviously, it is X
dependent and Y independent. So, now X and
65
00:09:13,080 --> 00:09:20,080
Y are two variables. X bar is the average
of X – mean of X; then, Y bar is nothing
66
00:09:22,870 --> 00:09:29,870
but the average of Y. So, now, we get
to know Y, Y bar, X, X bar; then, we have
67
00:09:32,120 --> 00:09:39,120
no idea about byx and bxy.
byx represents a regression coefficient; it
68
00:09:40,160 --> 00:09:47,160
is the regression coefficient
of Y on X. So, this is for byx. Similarly,
69
00:09:57,210 --> 00:10:04,210
we have bxy; bxy represents the regression
coefficient of X on Y. This leads
70
00:10:12,500 --> 00:10:19,500
to here only. So, now, the situation is very
clear. So, we have two regression coefficients:
71
00:10:21,100 --> 00:10:28,100
first is X on Y; second is Y on X. So, now,
if it is X on Y, the regression equation is
72
00:10:29,830 --> 00:10:36,650
X minus X bar is equal to bxy into Y minus Y bar;
and, if it is Y on X, then Y minus Y bar is
73
00:10:36,650 --> 00:10:42,710
equal to byx into X minus X bar. So, there
are two regression lines. So, obviously, we
74
00:10:42,710 --> 00:10:47,030
have two regression equations.
Now, we like to know what is this structure
75
00:10:47,030 --> 00:10:54,030
and setup of byx and bxy. bxy is a mathematical
coefficient; it is called as regression coefficient.
76
00:10:55,240 --> 00:11:02,240
And, bxy is also regression coefficient from
X to Y; and, for byx, it is Y to X. So, now,
77
00:11:03,790 --> 00:11:10,790
we like to know what is exactly byx and what
is exactly bxy; that means, we like to know
78
00:11:11,230 --> 00:11:18,230
what is the mathematics or statistics inside
bxy and byx. Let me highlight here what is
79
00:11:18,750 --> 00:11:20,230
all about this issue.
80
00:11:20,230 --> 00:11:27,230
Now, let us start with one equation, Y minus
Y bar is equal to byx into X minus X bar.
81
00:11:31,450 --> 00:11:38,450
So, this is regression on Y upon X. Now, here
byx represents r sigma y by sigma x. What
82
00:11:46,130 --> 00:11:53,130
is r here? r here represents the correlation coefficient,
and sigma y represents the standard
deviation of the Y variable; then, sigma x represents
83
00:12:09,380 --> 00:12:16,380
the standard deviation of X. So, this is the
standard deviation of X and that is the standard
deviation of Y. byx – obviously, we have
84
00:12:29,290 --> 00:12:36,290
already represented; this
is regression coefficient of Y on X.
In the last lecture, we have discussed what
85
00:12:45,910 --> 00:12:52,090
is r. r is basically correlation coefficient,
which is again derived through proper structures.
86
00:12:52,090 --> 00:12:59,090
Now, here this is the first equation; this
is the second equation. Now, the third is
87
00:13:00,300 --> 00:13:07,300
r equal to covariance of X, Y by sigma x into
sigma y. This is the third equation.
88
00:13:08,950 --> 00:13:15,660
Now, we have the original regression equation,
is Y minus Y bar equal to byx into X minus
89
00:13:15,660 --> 00:13:22,660
X bar, followed by byx equal to r sigma
y by sigma x. And again, r equal to covariance
90
00:13:24,250 --> 00:13:31,250
of X, Y by sigma x and sigma y.
Now, we like to know what is a sigma x, what
91
00:13:31,840 --> 00:13:36,590
is sigma y, and what is covariance of X, Y.
In fact, we have already discussed all these
92
00:13:36,590 --> 00:13:42,900
details. So, now, sigma x is nothing but square
root of summation X minus X bar whole square
93
00:13:42,900 --> 00:13:49,130
divided by n; and, sigma y represents square
root of summation Y minus Y bar whole square
94
00:13:49,130 --> 00:13:56,130
divided by n. So, now, we have covariance.
So, covariance of X, Y is equal to summation
95
00:13:58,080 --> 00:14:05,080
xy by n. It is nothing but summation X minus
X bar into Y minus Y bar divided by n. So,
96
00:14:11,150 --> 00:14:13,910
this is the regression coefficient; this is
correlation coefficient; this is standard
97
00:14:13,910 --> 00:14:20,910
deviation of X; this is standard deviation
of Y; and, this is covariance of X, Y. So,
98
00:14:21,000 --> 00:14:28,000
now, if we summarize all these details, then
obviously, ultimately, byx is nothing but
99
00:14:31,580 --> 00:14:38,580
covariance of X, Y by sigma x into sigma y,
multiplied by sigma y by sigma x. So, now, sigma y, sigma
100
00:14:46,640 --> 00:14:53,640
y gets canceled; so, it is nothing but covariance
of X, Y divided by sigma square x; that
101
00:14:55,410 --> 00:14:57,530
is nothing but variance of X.
102
00:14:57,530 --> 00:15:04,530
Now, this covariance of x, y is nothing but
summation X
minus X bar into Y minus Y bar divided by n.
103
00:15:16,310 --> 00:15:23,310
Now again, byx is equal to summation X minus
X bar into Y minus Y bar by sigma square x.
104
00:15:32,130 --> 00:15:39,130
So, it is sometimes written as summation xy
by summation x square. This is divided by
105
00:15:43,310 --> 00:15:49,860
n. So, obviously, summation xy by… n, n
cancels. At the moment, you will take sigma
106
00:15:49,860 --> 00:15:56,050
x square, because sigma x square equals to
summation x square by n; that means, standard
107
00:15:56,050 --> 00:16:03,050
deviation of x is nothing but square root
of summation x square by n. So, now, this
108
00:16:07,940 --> 00:16:14,940
is byx. So, now, if we simplify, then it is
nothing but Y minus Y bar equal to summation
109
00:16:15,690 --> 00:16:22,690
xy by summation x square into X minus X bar.
So, this is the regression equation of
110
00:16:29,290 --> 00:16:30,010
Y on X.
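The chain of identities just derived – byx equals r sigma y by sigma x, equals covariance of X, Y by sigma square x, equals summation xy by summation x square (with x and y in deviation form) – can be checked numerically; a small sketch with arbitrary illustrative data:

```python
X = [2.0, 4.0, 5.0, 7.0, 9.0]
Y = [3.0, 5.0, 4.0, 8.0, 9.0]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
x = [v - xbar for v in X]                   # deviations x = X - X bar
y = [v - ybar for v in Y]                   # deviations y = Y - Y bar
sx = (sum(v * v for v in x) / n) ** 0.5     # sigma x
sy = (sum(v * v for v in y) / n) ** 0.5     # sigma y
cov = sum(a * b for a, b in zip(x, y)) / n  # Cov(X, Y)
r = cov / (sx * sy)                         # correlation coefficient

# Three equivalent expressions for byx:
b1 = r * sy / sx
b2 = cov / sx ** 2
b3 = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
print(b1, b2, b3)  # all three agree up to rounding
```

The divisor n appears in both numerator and denominator of the third form, which is why it cancels, exactly as described above.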
111
00:16:30,010 --> 00:16:37,010
Now, come down to other part of this problem,
that is X on Y. Now, for X on Y, the regression
112
00:16:39,480 --> 00:16:46,480
equation is nothing but X minus X bar is equal
to bxy into Y minus Y bar. So, as usual, bxy
113
00:16:48,530 --> 00:16:55,530
is equal to r sigma x upon sigma y. Now, it
is nothing but covariance of x, y by sigma
114
00:16:59,690 --> 00:17:06,569
x into sigma y multiplied by sigma x by sigma
y. So, sigma x, sigma x cancels; ultimately,
115
00:17:06,569 --> 00:17:13,569
covariance of x, y divided by sigma square
y. Now, if we further simplify, then it is
116
00:17:14,010 --> 00:17:21,010
something – summation xy by summation y
square. So, this is the final coefficient
117
00:17:21,449 --> 00:17:28,449
for bxy. So, ultimately, regression equation
will be X minus X bar is equal to summation
118
00:17:28,580 --> 00:17:35,580
xy by summation y square into Y minus Y bar.
So, this is the second equation of X on Y.
119
00:17:37,980 --> 00:17:44,980
So, we have two variables; corresponding to the
two variables Y and X, we have two regression
120
00:17:49,780 --> 00:17:56,510
equations: Y on X and X on Y. For Y on X,
the regression equation is Y minus Y bar equal to
121
00:17:56,510 --> 00:18:03,510
byx into X minus X bar; and, for X on Y, it
is nothing but X minus X bar equal to bxy
122
00:18:05,160 --> 00:18:12,160
into Y minus Y bar. So, now, byx and bxy are
the regression coefficients. So, now, we have
123
00:18:14,890 --> 00:18:21,890
to see how these two regression coefficients
are integrated with each other and how
124
00:18:23,260 --> 00:18:30,260
useful it is within the structure of bivariate
econometric modelling. So, let me explain
125
00:18:30,670 --> 00:18:31,210
here.
126
00:18:31,210 --> 00:18:37,100
There are various properties here, which are
associated with the regression coefficient,
127
00:18:37,100 --> 00:18:44,100
correlation coefficient, covariance and variance.
Ultimately, in this bivariate data analysis
128
00:18:44,740 --> 00:18:51,740
or bivariate econometric modelling, we are
very much interested in variance, covariance,
129
00:18:57,350 --> 00:19:04,350
correlation and regression. So, we have two
equations: Y minus Y bar equal to byx into X
130
00:19:12,920 --> 00:19:19,920
minus X bar. And, another side, X minus X
bar equal to bxy into Y minus Y bar. So, now,
131
00:19:20,630 --> 00:19:27,630
this is equation 1; this is equation 2. So,
now, we like to know how they are integrated
132
00:19:31,910 --> 00:19:38,910
to each other. So, that means, is there any
relationship between these two equations or
133
00:19:40,110 --> 00:19:46,320
two regression coefficients? And, how these
two regression coefficients are integrated
134
00:19:46,320 --> 00:19:53,030
with the correlation coefficient; that is, with the
variance and covariance structure?
135
00:19:53,030 --> 00:20:00,030
Now, let us start here. So, one standard
property is that the geometric mean of the two
136
00:20:03,560 --> 00:20:09,790
regression coefficients is equal to the correlation
coefficient. What is the geometric mean? Now,
137
00:20:09,790 --> 00:20:16,790
it is byx into bxy – the product of these two regression
coefficients – raised to the power 0.5. So, what is byx
138
00:20:20,230 --> 00:20:26,760
and what is bxy? It is already mentioned.
So, byx is nothing but r sigma y by sigma
139
00:20:26,760 --> 00:20:33,760
x multiplied by r sigma x upon sigma y. So,
sigma x, sigma x cancels; sigma y, sigma y
140
00:20:34,990 --> 00:20:41,990
cancels. This is to the power 0.5. So, it
is simply the square root of r square. So, this
141
00:20:44,309 --> 00:20:50,309
means r. So, now, the physical interpretation
is that the geometric mean of the two regression
142
00:20:50,309 --> 00:20:56,640
coefficients is the correlation coefficient.
So, that means, if we have two regression
143
00:20:56,640 --> 00:21:02,570
coefficients, then we can get to know the
correlation coefficient. That is simply the
144
00:21:02,570 --> 00:21:09,570
geometric mean of byx and bxy.
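Since byx into bxy equals r square, the geometric mean recovers the magnitude of r; with positive coefficients, as in this illustrative data, it equals r exactly. A minimal check:

```python
X = [2.0, 4.0, 5.0, 7.0, 9.0]
Y = [3.0, 5.0, 4.0, 8.0, 9.0]
n = len(X)
x = [v - sum(X) / n for v in X]     # deviations from X bar
y = [v - sum(Y) / n for v in Y]     # deviations from Y bar
sxy = sum(a * b for a, b in zip(x, y))
byx = sxy / sum(a * a for a in x)   # regression coefficient of Y on X
bxy = sxy / sum(b * b for b in y)   # regression coefficient of X on Y
r = sxy / (sum(a * a for a in x) * sum(b * b for b in y)) ** 0.5
gm = (byx * bxy) ** 0.5             # geometric mean of the two coefficients
print(gm, r)                        # equal, since both coefficients are positive here
```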
Now, this is the second property. The arithmetic
145
00:21:11,070 --> 00:21:18,070
mean of the two regression coefficients is
greater than or equal to the correlation coefficient.
146
00:21:18,540 --> 00:21:25,540
What is the arithmetic mean? Now, the arithmetic
mean of bxy and byx is nothing but bxy plus
147
00:21:27,520 --> 00:21:34,520
byx by 2 greater than equal to correlation
coefficient. So, now how is the structure?
148
00:21:39,059 --> 00:21:46,059
It is nothing but r sigma x by sigma y plus
r sigma y by sigma x greater than equal to
149
00:21:50,440 --> 00:21:57,440
2r. So, now, the r's cancel. So, sigma square
x plus sigma square y greater than equal to
150
00:22:00,410 --> 00:22:07,410
2 sigma x into sigma y. So, this implies sigma
x minus sigma y whole square should be greater
151
00:22:11,410 --> 00:22:18,410
than equal to 0. So, it is a meaningful statement.
So, that means, we can justify that the arithmetic
152
00:22:19,120 --> 00:22:25,080
mean of two regression coefficients should
be always greater than equal to correlation
153
00:22:25,080 --> 00:22:32,080
coefficient; by any chance, it cannot be less
than that. So, this is the second issue of
154
00:22:32,910 --> 00:22:39,910
the association between regression coefficients
and correlation coefficients.
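This second property – the arithmetic mean of byx and bxy is at least r – can likewise be checked numerically (a sketch with illustrative data; since the derivation above divides through by r, the property is stated here for a positive correlation):

```python
X = [2.0, 4.0, 5.0, 7.0, 9.0]
Y = [3.0, 5.0, 4.0, 8.0, 9.0]
n = len(X)
x = [v - sum(X) / n for v in X]
y = [v - sum(Y) / n for v in Y]
sxy = sum(a * b for a, b in zip(x, y))
byx = sxy / sum(a * a for a in x)
bxy = sxy / sum(b * b for b in y)
r = sxy / (sum(a * a for a in x) * sum(b * b for b in y)) ** 0.5
am = (byx + bxy) / 2                # arithmetic mean of the two coefficients
print(am >= r)                      # True: the AM is at least the correlation
```

The inequality is the usual AM-GM relation in disguise: the arithmetic mean of the two coefficients is at least their geometric mean, which is r.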
155
00:22:41,100 --> 00:22:48,100
Third property here – you know, r depends
upon byx and bxy. So, correlation coefficient
156
00:22:54,230 --> 00:23:01,230
simply represents a functional association
between byx and bxy. Now, it is very interesting.
157
00:23:05,190 --> 00:23:12,190
If r greater than 0, then byx is obviously
greater than 0; bxy greater than 0. So, that
158
00:23:14,600 --> 00:23:21,600
means, if byx and bxy are positive, then
r must be positive. Then, if r is less than
159
00:23:22,950 --> 00:23:29,950
0, then byx less than 0, bxy less than 0;
or, if r equal to 0, then byx or bxy is equal
160
00:23:37,940 --> 00:23:44,940
to 0. So, that means, the regression coefficients
and the correlation coefficient always have the same
161
00:23:49,250 --> 00:23:56,250
sign; by no chance can they be different.
For instance, if regression coefficients are
162
00:23:58,559 --> 00:24:05,559
negative, then obviously, correlation coefficient
will be negative, because it is the geometric
163
00:24:07,840 --> 00:24:14,280
mean of the two. So, obviously, both should
be positive, so that we will get the positive
164
00:24:14,280 --> 00:24:19,590
correlation coefficient; and, both should
be negative to get the negative correlation.
165
00:24:19,590 --> 00:24:22,340
So, if byx is positive, bxy should also be
positive, and if byx is negative, bxy negative; it
166
00:24:22,340 --> 00:24:29,340
cannot be the other way around. So, this is how
the third property is all about between regression
167
00:24:33,130 --> 00:24:40,130
coefficients and correlation coefficients.
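The sign property is easy to see with a negatively related pair (a sketch; the data are illustrative):

```python
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [9.0, 7.0, 8.0, 5.0, 4.0]    # Y falls as X rises
n = len(X)
x = [v - sum(X) / n for v in X]
y = [v - sum(Y) / n for v in Y]
sxy = sum(a * b for a, b in zip(x, y))
byx = sxy / sum(a * a for a in x)
bxy = sxy / sum(b * b for b in y)
r = sxy / (sum(a * a for a in x) * sum(b * b for b in y)) ** 0.5
print(byx < 0, bxy < 0, r < 0)   # all three share the negative sign
```

All three quantities share the sign of summation xy, which is why they can never disagree.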
Now, you must be very much concerned about
168
00:24:41,850 --> 00:24:48,850
the coefficient of correlation and the coefficient
of regression. We have already mentioned that
169
00:24:50,860 --> 00:24:57,860
r, correlation coefficient always lies between
minus 1 and plus 1. Now; obviously, r square
170
00:24:59,080 --> 00:25:06,080
lies between 0 and 1. So, this is the correlation
coefficient and this is the square of correlation
171
00:25:08,800 --> 00:25:14,760
coefficient. In fact, in analysis, when we
go deep into the regression, obviously, the
172
00:25:14,760 --> 00:25:21,760
r square component is a very important factor;
it has many useful properties. So, we will
173
00:25:23,960 --> 00:25:29,870
discuss in detail what is all about r square
and how it is very useful for entire regression
174
00:25:29,870 --> 00:25:36,550
issues. So, in the meantime, note that
it is simply r square; that means, the square
175
00:25:36,550 --> 00:25:41,650
of correlation coefficient. And, if we go
into deep, then the r square is represented
176
00:25:41,650 --> 00:25:46,660
as the coefficient of determination, and
that is the measure of the goodness of fit test.
177
00:25:46,660 --> 00:25:51,710
In the first lectures when I mentioned about
the structure of econometric modeling, I have
178
00:25:51,710 --> 00:25:56,400
discussed the entire setup – how you start
with econometric modelling and how you end
179
00:25:56,400 --> 00:26:00,880
with the econometric modeling. In the very
beginning, I have mentioned, the first starting
180
00:26:00,880 --> 00:26:05,670
point is to define the problem that you have
to borrow from the theory; then, you have
181
00:26:05,670 --> 00:26:10,050
to transfer the theory into mathematical form
of the model; then, you have to transfer the
182
00:26:10,050 --> 00:26:16,559
mathematical form of the model to statistical
form of the model; then, we have to investigate
183
00:26:16,559 --> 00:26:21,730
that models. And, for that, you need to have
information; that is what we call data. So,
184
00:26:21,730 --> 00:26:26,140
the moment you have data, then you have to
apply the statistical technique tools for
185
00:26:26,140 --> 00:26:33,140
computational process, and you get the estimated model.
Now, when you have the estimated model in your
186
00:26:33,700 --> 00:26:40,080
front, then obviously, the first assignment
is to check the reliability before you like
187
00:26:40,080 --> 00:26:46,150
to go for forecasting or other issues.
I clearly mentioned during that time that
188
00:26:46,150 --> 00:26:50,740
so far as the reliability check is concerned,
we have three different specifications.
189
00:26:50,740 --> 00:26:56,510
These are the goodness of fit test,
then the specification test and the out-of-sample
190
00:26:56,510 --> 00:27:02,770
prediction test. And, the goodness of fit test is
one of the issues here. So, now, what
191
00:27:02,770 --> 00:27:09,059
we are talking about – r square – is nothing but
the goodness of fit test. So, the goodness of fit
192
00:27:09,059 --> 00:27:13,130
of that particular model depends upon the
value of r square. If the r square value is
193
00:27:13,130 --> 00:27:17,870
very close to 1, then obviously, the fit of
the model is better. So, if the r square value
194
00:27:17,870 --> 00:27:24,870
is close to 0, then the model is not well
fitted; that means, if the goodness of fit is
195
00:27:24,890 --> 00:27:30,960
not reliable, it cannot give any positive
indication, obviously, we cannot go for forecasting;
196
00:27:30,960 --> 00:27:35,940
that means, we have to go back to the
second stage. So, the way we have structured
197
00:27:35,940 --> 00:27:42,940
or given a detailed structure about the flowchart,
accordingly, we have to proceed. So, now,
198
00:27:44,830 --> 00:27:51,280
by any chance if r square is close to 1, then
that means, model is reliable, then you can
199
00:27:51,280 --> 00:27:57,620
go for forecasting; and, that is observed
or that can be done on the basis of the
200
00:27:57,620 --> 00:28:03,900
goodness of fit test only.
Now, here the point is that if r is in between
201
00:28:03,900 --> 00:28:10,840
minus 1 to plus 1, then obviously, the coefficient of
determination lies between 0 and 1. Now, if the value of the correlation
202
00:28:10,840 --> 00:28:16,830
coefficient is minus 1, the variables are
perfectly negatively associated with each
203
00:28:16,830 --> 00:28:22,549
other. And, if it is equal to plus 1, then they are
perfectly positively associated with each other. And,
204
00:28:22,549 --> 00:28:28,309
that is perfect positive correlation; and,
this is perfect negative correlation. So,
205
00:28:28,309 --> 00:28:33,870
in between 0 must be there; 0 means no correlation
between the two. So, that means, if the r
206
00:28:33,870 --> 00:28:39,940
value is 0, then obviously, the causality
factor will not come into the picture, because
207
00:28:39,940 --> 00:28:46,940
if we take any regression equation: Y on X
or X on Y, then obviously, byx and bxy are
208
00:28:48,290 --> 00:28:55,160
the factors. So, the moment you have r coefficient
0, then obviously, the entire issue – byx
209
00:28:55,160 --> 00:29:02,160
and bxy are also 0. So, as a result, regression
equations: Y on X and X on Y cannot be obtained.
210
00:29:04,160 --> 00:29:09,770
So, that means, the correlation coefficient
itself will give an indication whether there
211
00:29:09,770 --> 00:29:16,020
is any cause and effect relationship, because
it is the essential point whether you have
212
00:29:16,020 --> 00:29:21,770
to proceed further; that means, if you are
starting from the variance, covariance, correlation
213
00:29:21,770 --> 00:29:25,350
and regression, then it is just like a step-by-step
process.
214
00:29:25,350 --> 00:29:30,700
Now, if the correlation gives 0 results, then
obviously, no point to go for regression.
215
00:29:30,700 --> 00:29:36,540
The reason is that regression is an
advanced technique; it is very time-consuming;
216
00:29:36,540 --> 00:29:40,740
and also, mathematically very complex. So,
the moment you will get r equal to 0, then
217
00:29:40,740 --> 00:29:44,640
you can stop there; there is no point to discuss
about cause and effect relationship, because
218
00:29:44,640 --> 00:29:49,740
the relation itself has no meaning at all.
Now, if the correlation coefficient is between minus
219
00:29:49,740 --> 00:29:56,740
1 to 1, then obviously, coefficient of determination
is 0 to 1. So, accordingly, the goodness of
220
00:29:57,309 --> 00:30:01,120
fit will indicate the forecasting reliability. If
it is close to 1, then it is better forecasted;
221
00:30:01,120 --> 00:30:07,520
if it is close to 0, then there is little basis
for forecasting; that means,
222
00:30:07,520 --> 00:30:10,100
it is less reliable for forecasting.
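The reading of r square described here can be sketched quickly (illustrative data; r square is just the square of the correlation coefficient computed earlier):

```python
X = [2.0, 4.0, 5.0, 7.0, 9.0]
Y = [3.0, 5.0, 4.0, 8.0, 9.0]
n = len(X)
x = [v - sum(X) / n for v in X]
y = [v - sum(Y) / n for v in Y]
r = sum(a * b for a, b in zip(x, y)) / (
    sum(a * a for a in x) * sum(b * b for b in y)) ** 0.5
r2 = r ** 2                        # coefficient of determination
# r lies in [-1, 1], so r^2 lies in [0, 1]; the closer to 1, the better the fit
print(f"r = {r:.3f}, r^2 = {r2:.3f}")
```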
223
00:30:10,100 --> 00:30:15,850
Now, corresponding to this particular setup,
we have another property associated with the
224
00:30:15,850 --> 00:30:22,850
regression coefficient, correlation coefficient.
Structure is that here we define the correlation
225
00:30:24,240 --> 00:30:31,240
structure; the correlation is very symmetric
in nature; so that means, r yx equal to r
226
00:30:31,929 --> 00:30:38,590
xy. For instance, like this, we have two variables
X and Y; that means, we can correlate X upon
227
00:30:38,590 --> 00:30:45,590
Y or we can correlate Y upon X. If we correlate
X on Y, it is called r xy; and, if we
228
00:30:46,020 --> 00:30:53,020
correlate Y on X, then it is called r yx;
that means, the fundamental theorem is that
229
00:30:53,520 --> 00:31:00,520
r xy is equal to r yx; that means, it is simply
symmetric in nature. And, one of the other
230
00:31:05,049 --> 00:31:12,049
important point is that r xy is equal to r
uv; u and v are other variables, which is
231
00:31:15,120 --> 00:31:22,120
another representation of x and y. For instance,
u can be X minus a by h and v can be Y minus
232
00:31:23,020 --> 00:31:30,020
b by k; that means, the correlation
coefficient is independent of change of origin and scale.
233
00:31:32,590 --> 00:31:37,390
However, in the case of regression, it is
not so; in the case of regression,
234
00:31:37,390 --> 00:31:44,230
it is independent of change of origin, but not of scale.
The reason is that bxy is simply summation
235
00:31:44,230 --> 00:31:51,230
xy by summation y square. So, now, if x represents
here X minus X bar and y represents here Y
236
00:31:55,360 --> 00:32:02,360
minus Y bar and y square represents Y minus
Y bar into Y minus Y bar. So, now if we simplify
237
00:32:12,160 --> 00:32:19,160
here, then X equal to hu plus a and here v
equal to Y minus b… In fact, Y equal to
238
00:32:24,320 --> 00:32:31,320
kv plus b. So, now, X minus X bar is nothing
but h into u minus u bar and Y minus Y bar
239
00:32:35,419 --> 00:32:42,419
represents k into v minus v bar. So, now,
if it is simplified, the expression bxy is
240
00:32:46,679 --> 00:32:53,679
nothing but h into k summation uv divided
by summation y square, which is simply
241
00:32:55,760 --> 00:33:02,760
k square into summation v square.
Now, this k, k cancels. Then, ultimately,
242
00:33:07,110 --> 00:33:14,110
we have the factor h by k into summation u
v by summation v square; so, that means, clear
243
00:33:15,000 --> 00:33:21,750
cut indication is that regression coefficient
is independent of origin, but not the scale.
244
00:33:21,750 --> 00:33:28,750
Similarly, you can make the verification for
byx. So, the point is that the correlation coefficient
245
00:33:30,730 --> 00:33:36,510
is independent of change of origin and scale;
whereas the regression coefficient is independent of change of
246
00:33:36,510 --> 00:33:43,510
origin, but not of scale. So, this is the
case of the regression issue and the correlation
247
00:33:43,520 --> 00:33:44,210
issue.
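These two invariance claims can be verified directly: transform u = (X − a)/h and v = (Y − b)/k; the correlation is unchanged, while the regression coefficient of Y on X picks up the factor k/h. A sketch with arbitrary illustrative constants and data:

```python
def stats(A, B):
    # returns (correlation of A and B, regression coefficient of B on A)
    n = len(A)
    a = [v - sum(A) / n for v in A]
    b = [v - sum(B) / n for v in B]
    sab = sum(p * q for p, q in zip(a, b))
    r = sab / (sum(p * p for p in a) * sum(q * q for q in b)) ** 0.5
    return r, sab / sum(p * p for p in a)

X = [62.0, 64.0, 65.0, 69.0, 70.0]
Y = [126.0, 125.0, 139.0, 145.0, 165.0]
a, h, b, k = 60.0, 2.0, 120.0, 5.0       # arbitrary origin/scale constants
U = [(v - a) / h for v in X]             # u = (X - a) / h
V = [(w - b) / k for w in Y]             # v = (Y - b) / k
r_xy, byx = stats(X, Y)
r_uv, bvu = stats(U, V)
print(abs(r_xy - r_uv) < 1e-9)           # correlation: origin and scale free
print(abs(byx - (k / h) * bvu) < 1e-9)   # regression: only origin free
```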
248
00:33:44,210 --> 00:33:51,210
Now, there is another point you can
note down here. When r equal to 0, then
249
00:33:52,230 --> 00:33:59,230
obviously, the two regression lines are perpendicular
to each other. So, if r equal to 1; means,
250
00:34:07,049 --> 00:34:14,049
if the correlation coefficient is equal to
1, then it means two lines coincide; so; that
251
00:34:15,480 --> 00:34:22,480
means, if r equal to plus minus 1, then the two
regression lines coincide
252
00:34:22,940 --> 00:34:28,220
and there is an exact relationship between
the two. If the correlation coefficient is
253
00:34:28,220 --> 00:34:34,349
equal to 0, then obviously,
there is no relationship between the two.
254
00:34:34,349 --> 00:34:39,169
So, that means, there are three different
situations altogether. If it is plus minus
255
00:34:39,169 --> 00:34:43,759
1, then there is a perfect relationship, perfect
association between the two. If it is less
256
00:34:43,759 --> 00:34:48,909
than that, then there is relation, but it
is not perfectly related to each other. However,
257
00:34:48,909 --> 00:34:55,109
if it is equal to 0, then there is no question
of association and also there is no question
258
00:34:55,109 --> 00:35:01,009
of causality. So, this is the basic background
of the regression analysis.
259
00:35:01,009 --> 00:35:06,940
Now, we will explain with a table example
here, showing how it is exactly structured as
260
00:35:06,940 --> 00:35:13,650
far as the regression equation is concerned.
Let me take example here. This is X series
261
00:35:13,650 --> 00:35:20,650
and this is Y series. So, 71; then, this is
68; then 66, then 67, then 70, 71, 70, 73,
262
00:35:30,619 --> 00:35:37,619
72, 65, then 66. So, 1 2 3 4 5 6 7 8 9 10
11; so, that means, here n equal to 11. Now,
263
00:35:47,229 --> 00:35:54,229
we will make the sum; then, the sum of X is equal
to 759. So, obviously, X bar is equal to 759
264
00:35:57,589 --> 00:36:04,589
divided by 11, which is nothing but 69.
Now, come to Y structure. For 71, Y is 69;
265
00:36:06,049 --> 00:36:13,049
for 68, Y is 64; then, for 66, it is 65; then,
for 67, it is 63; then 65; then, this is 62;
266
00:36:21,970 --> 00:36:28,970
65, 64; then, 66, 59, then 62. So, obviously,
n is here 11, because in the very beginning,
267
00:36:35,849 --> 00:36:42,849
I have mentioned which will go for variance,
covariance, correlation and regression. The
268
00:36:43,470 --> 00:36:50,470
essential condition is that the sample observations
must be the same. If the sample observations
269
00:36:53,410 --> 00:37:00,059
are not the same, then you cannot make any association
or you cannot correlate; you cannot regress;
270
00:37:00,059 --> 00:37:07,059
you cannot compute covariance. This is, however,
not the case when you go for a univariate setup.
271
00:37:07,920 --> 00:37:14,920
For instance, within this particular structure
like this, we are just… This is X within
272
00:37:21,430 --> 00:37:28,430
this system and this is Y within this system.
We are just doing our assignment of Y to X or
273
00:37:30,589 --> 00:37:37,589
X to Y either to regress or to correlate.
But, this is what? Bivariate. Within the bivariate,
274
00:37:42,059 --> 00:37:48,869
if we are interested only in X or only
in Y, then if this
275
00:37:48,869 --> 00:37:53,160
particular structure we call n 1 and this
particular structure's sample observation we
276
00:37:53,160 --> 00:38:00,160
call n 2, there is no requirement that n 1
should be exactly equal to n 2. But, if we
277
00:38:01,309 --> 00:38:07,460
correlate between these two, in that case,
it is mandatory that n 1 should be equal to n
278
00:38:07,460 --> 00:38:12,999
2. But, in this case – this particular case
and this particular case – n 1 need not be
279
00:38:12,999 --> 00:38:19,999
exactly equal to n 2; that means, for univariate
analysis, any univariate statistics if you
280
00:38:20,059 --> 00:38:26,359
look, there is no question
about matching the sample observations,
281
00:38:26,359 --> 00:38:33,359
because uniformity of sample observations within
the system underlies the structure of univariate
282
00:38:34,369 --> 00:38:39,999
and bivariate and multivariate analysis. Our objective
is completely different in the case of bivariate
283
00:38:39,999 --> 00:38:45,720
or multivariate; we just like to know how
they are related to each other. And, for that,
284
00:38:45,720 --> 00:38:51,029
we need to have information about the univariate
statistics.
285
00:38:51,029 --> 00:38:56,749
Now, within the setup, if you are looking
for univariate statistic of particular variable,
286
00:38:56,749 --> 00:39:01,479
then this is completely different. Now, for
another variable, the univariate statistics
287
00:39:01,479 --> 00:39:06,930
is completely different; that means, altogether
they are independent. But, when you like to
288
00:39:06,930 --> 00:39:13,380
integrate them with each other with respect to correlation
or regression, at that time the full unit should
289
00:39:13,380 --> 00:39:18,739
be very similar in nature. In the first case,
it is not mandatory; but, in the second case,
290
00:39:18,739 --> 00:39:25,739
it is mandatory.
Yes, there are certain issues here; if we
291
00:39:26,249 --> 00:39:33,249
have X variable and Y variable, here 1, 2,
3, 4, 5 up to 10 observations. Another case,
292
00:39:35,309 --> 00:39:42,309
we have 1, 2, 3, 4, 5, 6 up to 8. So, now,
the observations 6, 7 or take this as 8. This
293
00:39:46,339 --> 00:39:51,279
is 8. So, now, what we have to do, in that
case, so far as the univariate statistics
294
00:39:51,279 --> 00:39:57,160
is concerned, you can do an analysis here;
you can do analysis here, but when we apply
295
00:39:57,160 --> 00:40:03,220
bivariate modeling here, then in that case,
the system is totally inconsistent, because
296
00:40:03,220 --> 00:40:10,119
this is the sample observation of 8 and this
is the sample observation of 10. So, now this
297
00:40:10,119 --> 00:40:16,259
is nothing but inconsistency. So, to solve
this particular problem or to handle the particular
298
00:40:16,259 --> 00:40:23,259
issue, what we have to do? We have to artificially
create uniform sampling; so, that means, either
299
00:40:24,839 --> 00:40:31,839
you can reduce the sample size to 8 or increase the sample
size to 10; 10 is already there for
300
00:40:32,289 --> 00:40:39,289
X, but in the case of Y, it is not there.
So, it is only 8. So, we can extend 9 and
301
00:40:39,499 --> 00:40:43,799
10 further.
You have to extend 9 and 10 further. For that,
302
00:40:43,799 --> 00:40:50,210
either we have to explore whether that information
is available; if it is so, then your task
303
00:40:50,210 --> 00:40:57,210
is very easy; you can go ahead. But, sometimes
in the real world business, you may not have
304
00:40:59,779 --> 00:41:05,440
information, but there is a standard mathematical
technique through which you can fill that
305
00:41:05,440 --> 00:41:11,469
gap also. Here one of the standard techniques
is called interpolation and extrapolation.
306
00:41:11,469 --> 00:41:17,200
If we apply interpolation and extrapolation,
then the sample unit 8 can be extended to
307
00:41:17,200 --> 00:41:24,190
10. So, in that case, you will get uniformity
in the structures. Then, of course, you can
308
00:41:24,190 --> 00:41:31,190
go ahead with the solutions. But, you cannot
apply interpolation and extrapolation every time.
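A rough sketch of that extension step, assuming numpy is available; the 8-observation Y series below is hypothetical, purely for illustration (it is not the lecture's data):

```python
import numpy as np

# Hypothetical Y series with only 8 observations, while X has 10.
y = np.array([12.0, 13.1, 13.9, 15.2, 16.0, 17.1, 18.0, 19.2])
t = np.arange(1, len(y) + 1)            # sample points 1..8

# Fit a linear trend and extrapolate to sample points 9 and 10,
# so that Y becomes uniform with the 10 observations of X.
slope, intercept = np.polyfit(t, y, 1)
y_extended = np.concatenate([y, slope * np.array([9, 10]) + intercept])

print(len(y_extended))                  # now 10 observations
```

A linear trend is only one choice; any interpolation or extrapolation scheme fills the gap artificially, which is exactly why it affects the reliability of the model, as noted above.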
309
00:41:31,719 --> 00:41:38,650
However, there are certain problems associated
with the interpolation and extrapolation.
310
00:41:38,650 --> 00:41:45,650
It can solve the problem; it can get the model
fitted; you can go for forecasting; you can
311
00:41:47,299 --> 00:41:54,299
go for anything, etcetera, but it will affect
the reliability part of the model. So, the moment
312
00:41:55,739 --> 00:42:02,450
you will go for interpolation and extrapolation
to enhance the sample size or to get the uniformity
313
00:42:02,450 --> 00:42:09,450
in the sample, then one of the standard problems
you can face is what is called autocorrelation,
314
00:42:11,549 --> 00:42:16,450
which is very complex, very serious and very
interesting also. We will discuss in detail
315
00:42:16,450 --> 00:42:23,229
when we go for the autocorrelation modeling.
So, right now, it is not an issue here. So,
316
00:42:23,229 --> 00:42:27,640
in the first hand, we just want to know how
we have to solve this particular problem.
317
00:42:27,640 --> 00:42:31,369
Later on, when there is an additional problem
or additional complexity, so far as the reliability
318
00:42:31,369 --> 00:42:37,390
check is concerned, at that time we have some
other tricks for how to eradicate that
319
00:42:37,390 --> 00:42:40,130
problem. So, we will discuss in detail when
we go for that.
320
00:42:40,130 --> 00:42:47,130
In the meantime, we have two variables:
X and Y. X contains this much of information;
321
00:42:47,479 --> 00:42:54,140
Y contains this much of information. Now,
in the first hand, sample observations are
322
00:42:54,140 --> 00:43:01,140
the same; that means we can proceed further
for the analysis. In fact, sometimes
323
00:43:03,430 --> 00:43:09,329
when there is a short series of observations – this
is 11 – by inspection, we can say whether there is
324
00:43:09,329 --> 00:43:13,460
inconsistency. And, when there are two variables
only, then you can say whether there is inconsistency.
325
00:43:13,460 --> 00:43:18,420
But, when there are multiple variables and
multiple sample points, at that time it is very
326
00:43:18,420 --> 00:43:23,839
difficult to observe. Yes, we have standard
software. So, we just enter the data; then
327
00:43:23,839 --> 00:43:28,759
we crosscheck it. For all these variables,
the moment you compute the descriptive statistics,
328
00:43:28,759 --> 00:43:34,279
it will give you an indication of the observation
count n for all the variables and the mean
329
00:43:34,279 --> 00:43:40,849
of all the variables, what is standard deviation,
variance – all these descriptive statistics
330
00:43:40,849 --> 00:43:46,200
it will give you in detail.
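A sketch of that software crosscheck, assuming pandas is installed, using the lecture's two series:

```python
import pandas as pd

# The lecture's two series, 11 observations each.
data = pd.DataFrame({
    "X": [71, 68, 66, 67, 70, 71, 70, 73, 72, 65, 66],
    "Y": [69, 64, 65, 63, 65, 62, 65, 64, 66, 59, 62],
})

# describe() reports count, mean, std, min, max, etc. per variable;
# a mismatch in sample size would show up immediately in the "count" row.
stats = data.describe()
print(stats.loc["count"])   # X: 11, Y: 11 -> consistent structure
print(stats.loc["mean"])    # X: 69, Y: 64
```

Here the counts agree, so the bivariate setup is consistent and we can proceed.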
Now, with the available information X and
331
00:43:46,200 --> 00:43:53,200
Y, we need to find out X squares, we need
to find out Y squares, we need to find out
332
00:43:53,989 --> 00:44:00,989
X Y; then, we have to proceed for the regression
coefficient. To simplify the structure
333
00:44:13,440 --> 00:44:20,440
further, we can go for small
x square, small y square and small xy. So,
334
00:44:13,440 --> 00:44:20,440
here small x square is represented as X minus
X bar whole square and small y square represents
335
00:44:23,059 --> 00:44:30,059
Y minus Y bar whole square. X bar is 69 here.
So, corresponding to this Y, summation Y is
336
00:44:32,759 --> 00:44:39,759
equal to 704. n is 11. So, obviously, Y
bar is equal to 704 by 11, which is nothing but
337
00:44:42,009 --> 00:44:47,089
64.
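The sums and means above can be reproduced directly in plain Python (a minimal sketch using the lecture's data):

```python
# The lecture's two series (n = 11).
X = [71, 68, 66, 67, 70, 71, 70, 73, 72, 65, 66]
Y = [69, 64, 65, 63, 65, 62, 65, 64, 66, 59, 62]

n = len(X)
sum_X, sum_Y = sum(X), sum(Y)
X_bar, Y_bar = sum_X / n, sum_Y / n   # the two means

print(n, sum_X, X_bar)   # 11 759 69.0
print(sum_Y, Y_bar)      # 704 64.0
```

This confirms X bar = 759/11 = 69 and Y bar = 704/11 = 64, exactly as derived above.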
Now, if we transform it, then every item has
338
00:44:47,089 --> 00:44:53,819
to be transformed into X minus X bar; that
means, for the first case, it should be X minus
339
00:44:53,819 --> 00:45:00,819
69. So, now, if we transform, then this structure
will come to 2, minus 1, minus 3, minus 2, then
340
00:45:05,410 --> 00:45:12,410
1, 2, 1, 4, 3, minus 4, minus 3. Similarly,
in the second case, this is in fact small x; this
341
00:45:18,249 --> 00:45:25,249
is in fact small y. So, obviously, we will go for
x square and y square. So, for Y, it is nothing
342
00:45:25,259 --> 00:45:32,259
but 5, 0, 1, minus 1, 1, minus 2, 1, 0, 2,
minus 5, then minus 2. Now, this is X deviation
343
00:45:41,299 --> 00:45:48,299
format and Y deviation format. So, now, we
need xy. So, xy is 10, 0, minus 3, 2, 1,
344
00:45:54,839 --> 00:46:01,839
minus 4, then 1, 0, then 6, then 20, then
6. Similarly, we will get x square and we
345
00:46:08,089 --> 00:46:15,089
will get y square. x square represents 4,
1, 9, 4, 1, 4, 1, 16, then 9, then 16, then
346
00:46:20,670 --> 00:46:27,670
9. Then, corresponding Y, we have y square
25, 0, 1, 1, 1, 4, 1, 0, 4, 25, 4. So, now,
347
00:46:36,130 --> 00:46:41,549
we have x square here, we have y square,
and we have xy; of course, it is in deviation
348
00:46:41,549 --> 00:46:47,109
format. So, now, summation xy is equal to
here 39; then, summation x square is equal
349
00:46:47,109 --> 00:46:54,109
to 74; and, summation y square is here 66.
Now, we have to see how the regression coefficient
350
00:46:55,680 --> 00:47:02,680
is, how the correlation coefficient is. So,
now given detailed information about the
351
00:47:04,890 --> 00:47:11,890
statistics – univariate statistics and bivariate
statistics, that is, with respect to variance
352
00:47:12,329 --> 00:47:19,329
and covariance, we have to proceed further
for regression and also its correlation coefficient.
353
00:47:19,660 --> 00:47:26,660
Now, we have two different equations: Y minus
Y bar equal to byx into X minus X bar; and,
354
00:47:29,390 --> 00:47:36,390
on the other side, X minus X bar is equal to bxy
into Y minus Y bar. Now, first of all, we
355
00:47:39,539 --> 00:47:46,539
calculate what byx is. byx is equal to covariance
of x, y by sigma x into sigma y
356
00:48:04,390 --> 00:48:11,390
into sigma y by sigma x. So, sigma y sigma
y cancels. So, covariance of x, y by sigma
357
00:48:13,029 --> 00:48:20,029
square x. To simplify, it is nothing but
N summation xy minus summation x into summation
358
00:48:23,009 --> 00:48:29,809
y, divided by N summation x square minus summation
x whole square. If we further simplify, then
359
00:48:29,809 --> 00:48:36,809
it is simply summation xy by summation
x square; that means, summation xy here is
360
00:48:36,809 --> 00:48:43,809
39. So, 39 divided by summation x square, which is
74 here. So, this is what the regression coefficient
361
00:48:49,660 --> 00:48:54,989
is.
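The deviation sums and byx derived above can be checked with a short plain-Python script on the lecture's data:

```python
X = [71, 68, 66, 67, 70, 71, 70, 73, 72, 65, 66]
Y = [69, 64, 65, 63, 65, 62, 65, 64, 66, 59, 62]
X_bar, Y_bar = sum(X) / len(X), sum(Y) / len(Y)   # 69 and 64

# Deviation form: small x = X - X bar, small y = Y - Y bar.
x = [xi - X_bar for xi in X]
y = [yi - Y_bar for yi in Y]

S_xy = sum(a * b for a, b in zip(x, y))   # summation xy = 39
S_xx = sum(a * a for a in x)              # summation x square = 74
S_yy = sum(b * b for b in y)              # summation y square = 66

b_yx = S_xy / S_xx                        # regression coefficient of Y on X = 39/74
print(S_xy, S_xx, S_yy, b_yx)
```

The script reproduces summation xy = 39, summation x square = 74, summation y square = 66, and byx = 39/74.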
Now, the equation will be Y minus Y bar is
362
00:48:54,989 --> 00:49:01,989
equal to 39 by 74 into X minus X bar. In other
words, Y minus Y bar, that is, Y minus 64, is equal to
363
00:49:04,440 --> 00:49:11,440
39 by 74 into X minus X bar, that is, X minus 69.
Now, if we simplify, then this will be simply
364
00:49:20,630 --> 00:49:27,630
in the format of Y equal to alpha plus beta
X; alpha and beta – supporting components.
365
00:49:30,749 --> 00:49:37,749
Similarly, in the case of X minus X bar, it
is nothing but X minus 69 equal to bxy into Y minus
366
00:49:44,779 --> 00:49:51,390
64. So, what is bxy? bxy is equal to summation
xy upon summation y square. So, this is nothing
367
00:49:51,390 --> 00:49:58,390
but 39 by 66, since summation y square is 66. So, X
minus 69 equal to 39 by 66 into Y minus 64. If
368
00:50:03,819 --> 00:50:10,749
we simplify further again, you will get it in
the format of X equal to alpha plus beta Y.
369
00:50:10,749 --> 00:50:17,749
Now, here alpha and beta are very supporting
factors; beta is the slope of this particular
370
00:50:21,029 --> 00:50:27,390
line. So, beta is the real structure; it
is the main regression coefficient,
371
00:50:27,390 --> 00:50:34,390
what we call byx and bxy. Alpha will just
give you the indication where the line exactly
372
00:50:37,509 --> 00:50:38,769
starts.
373
00:50:38,769 --> 00:50:45,749
Let us take a case here. Whatever information
we have, since we are going for Y on X or X
374
00:50:45,749 --> 00:50:52,749
on Y, then obviously, the movement is like this.
If we put X and Y here and if we plot all
375
00:50:52,869 --> 00:50:59,869
these points, then we get to know how the
scatter is. Usually the structure will
376
00:51:00,420 --> 00:51:07,420
be like this. So, now, for
every sample unit 1, 2, 3, 4, 5 like this,
377
00:51:08,539 --> 00:51:13,359
then obviously, there are some Y observations.
So, now, here the movement will be like this.
378
00:51:13,359 --> 00:51:18,950
It will connect each and every point and the
movement will be like this. So, within the movements,
379
00:51:18,950 --> 00:51:23,959
we like to know how the path is; that means,
this path is called the line of best
380
00:51:23,959 --> 00:51:30,519
fit. This is what we call Y hat equal
to alpha hat plus beta hat X. This is the
381
00:51:30,519 --> 00:51:37,519
estimated equation, which we derive from Y
minus Y bar equal to byx into X minus X bar.
382
00:51:38,349 --> 00:51:44,039
The detailed calculation procedure we will get to
know when we go for the exact econometric
383
00:51:44,039 --> 00:51:49,940
modelling and regression modeling. We are
not discussing the detailed issues about the
384
00:51:49,940 --> 00:51:54,459
structure and setups; we are just briefing
what regression analysis is all about. Once
385
00:51:54,459 --> 00:52:01,459
we enter into this – the structure of bivariate
and multivariate in a research angle or practical
386
00:52:02,420 --> 00:52:09,420
problem angle, then obviously, you can get
to know how complex it is, how it is derived
387
00:52:13,930 --> 00:52:17,859
really.
Now, we will summarize this entire concept
388
00:52:17,859 --> 00:52:24,859
– what bivariate regression modeling is all
about; what is the structure of variance,
389
00:52:25,789 --> 00:52:32,789
covariance, correlation and regression. So,
the basic objective behind bivariate modelling
390
00:52:33,150 --> 00:52:38,209
is that we like to know what the association
between the two variables is; that means,
391
00:52:38,209 --> 00:52:42,660
the first condition is that in a particular
system problem setup we must have two variables.
392
00:52:42,660 --> 00:52:49,660
This is the first condition. And, particularly for
covariance, correlation and regression,
393
00:52:49,869 --> 00:52:56,309
the second important point is that both the
variables must have the same number of observations.
394
00:52:56,309 --> 00:53:02,229
If one variable has more or fewer observations
than the other, then obviously, the structure
395
00:53:02,229 --> 00:53:08,209
is inconsistent; then, you cannot proceed
further. So, the first condition is that you
396
00:53:08,209 --> 00:53:15,209
must have two variables in the system and
both the variables have the same number of observations.
397
00:53:15,450 --> 00:53:21,890
Then, we like to know what the degree of association
between the two variables is. For that, you
398
00:53:21,890 --> 00:53:25,019
can apply covariance, you can apply correlation,
but correlation is better than covariance,
399
00:53:25,019 --> 00:53:29,769
because it is a unitless measurement while covariance
is not at all a unitless measurement. So,
400
00:53:29,769 --> 00:53:35,829
obviously, correlation is a better choice than
covariance, although the equation of correlation
401
00:53:35,829 --> 00:53:42,219
is a little bit complex. So, now, if your objective
is to know the degree of association between
402
00:53:42,219 --> 00:53:46,759
the two variables, then correlation is the
best technique for that. However, if you like
403
00:53:46,759 --> 00:53:50,499
to know the cause and effect relationship
between the two variables, then of course
404
00:53:50,499 --> 00:53:55,759
you have to go for regression analysis.
Regression analysis basically gives you an
405
00:53:55,759 --> 00:54:02,729
indication whether X influences Y or
Y influences X. So, in that, we have two standard
406
00:54:02,729 --> 00:54:09,729
equations. For Y on X, it is Y minus Y bar
equal to byx into X minus X bar. And similarly,
407
00:54:10,119 --> 00:54:16,599
for X on Y, it is X minus X bar equal to bxy into Y
minus Y bar. So, this is to know the regression
408
00:54:16,599 --> 00:54:23,599
coefficient, because it will give you an indication
of what the path is all about, because the moment
409
00:54:24,609 --> 00:54:29,849
you will get regression coefficients byx and
bxy, then it will give you the indication
410
00:54:29,849 --> 00:54:35,279
of the value of the correlation coefficient;
the square of the correlation coefficient, that
411
00:54:35,279 --> 00:54:38,940
is, the coefficient of determination. And, that will
give you the weightage of that particular
412
00:54:38,940 --> 00:54:44,630
relationships. If the value of that r square
is very high and close to 1, then they are
413
00:54:44,630 --> 00:54:50,009
perfectly related to each other or their degree
of association is very high. And, if it is
414
00:54:50,009 --> 00:54:57,009
very high, then obviously, the prediction
and forecasting structure is very reliable. So, now,
415
00:54:57,289 --> 00:55:04,289
the r will give you the indication about the
movement between these two variables – their
416
00:55:07,479 --> 00:55:14,459
association and also their causality.
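As a final numeric check, the square of the correlation coefficient equals the product of the two regression coefficients; a short sketch with the worked example's numbers:

```python
import math

b_yx = 39 / 74
b_xy = 39 / 66

# Coefficient of determination: r^2 = b_yx * b_xy = S_xy^2 / (S_xx * S_yy).
r_squared = b_yx * b_xy
r = math.sqrt(r_squared)   # positive root, since both slopes are positive

print(round(r_squared, 4))   # about 0.31: moderate association, well below 1
print(round(r, 4))
```

Since r squared is well below 1, the two variables are related but far from perfectly; prediction from this fitted line should be treated with caution.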
Now, with this, we have to end this particular
417
00:55:14,459 --> 00:55:20,400
class. Next class, we will discuss in detail
about the basic statistics before we enter
418
00:55:20,400 --> 00:55:26,640
into the econometric modeling. So, that is
the case of probability and hypothesis testing.
419
00:55:26,640 --> 00:55:33,640
So, in the next lectures we will discuss the
probability and hypothesis; then, we will
420
00:55:33,699 --> 00:55:37,440
proceed to the multivariate econometric modeling.
With this, we will close this class.
421
00:55:37,440 --> 00:55:39,900
Thank you very much. Have a nice day.