Good afternoon. This is Dr. Pradhan here. Welcome to the NPTEL project on Econometric Modeling. Today we will discuss the issue of Bivariate Econometric Modeling. In the last couple of lectures we have discussed various aspects of econometric modeling and various structures of data analysis: univariate analysis, bivariate analysis and multivariate analysis. We have discussed various issues under univariate analysis, various issues under bivariate analysis, and a little about multivariate analysis. So, today our focus is mostly on the analysis of econometric modeling.
So, let us start with the structure of Bivariate Econometric Modeling. It consists of two aspects: the bivariate setup and the modeling. We have discussed what the bivariate data structure is, and we have also discussed the modeling issues. So, let me first highlight the econometric issue behind bivariate modeling. Econometrics is a product of statistics; basically, it is an extension of regression modeling. The basic idea of regression analysis is to start with the cause and effect relationship between two variables, that is, the dependent and the independent variable.
So, we would like to know how econometric modeling is close to, or somewhat different from, basic statistical modeling. The basic idea behind regression analysis is to start with the cause and effect relationship, and econometric modeling proceeds in a similar way. The difference is this: when you go for basic regression analysis, you do not bother about the various typical issues or problems behind the analysis. So far we have discussed only the simple structure of regression analysis.
But if you go deeper into higher versions of modeling, regression analysis becomes very complex. So far as econometric modeling is concerned, it has its root, or beginning, in this basic regression analysis. So, when we talk about Bivariate Econometric Modeling, obviously the basic idea is to study the cause and effect relationship between two variables, as we have discussed.
So, what is the cause and what is the effect? For instance, if we have two variables, say X and Y, and we write Y as a function of X, then X represents the cause and Y represents the effect. That means Y is the effect side and X is the cause side. As far as econometric modeling is concerned, it is an extension of this basic regression modeling.
In econometric modeling the issue is the structure of the research, and there are many ways this structure can be described. For instance, we are discussing here the cause and effect relationship between two variables, and there are many names for this relationship. It can be the relationship between the independent variable and the dependent variable. It is also the relationship between the explanatory variable and the explained variable. So, the structure is like this: independent to dependent, explanatory to explained. Similarly, the cause can be called the predictor and the effect the predictand. The cause is otherwise known as the exogenous variable and the effect as the endogenous variable. Similarly, the cause is known as the stimulus and the effect as the response, or the cause as the regressor and the effect as the regressand. So, there are various ways it can be represented.
So, the cause and effect relationship is the nexus between the independent and the dependent variable, the explanatory and the explained variable, the predictor and the predictand, the exogenous and the endogenous variable, the stimulus and the response and, last but not least, the regressor and the regressand. Because econometrics is a statistical technique with applications in many areas, there are various ways it can be represented. The subject is very much application oriented, and as a result the same idea is expressed in many ways. It is just like old wine in a new bottle: the picture is more or less the same, but the representation is somewhat different. So, before we start with bivariate econometric modeling, let me highlight the issue behind it.
In Bivariate Econometric Modeling, since the issue starts with the bivariate setup, obviously there are two sets of variables. One we call X and the other Y. So, before we enter into bivariate econometric modeling we must have some essential requirements, otherwise ... modeling. So, now, the issue is what are these ... econometric modeling, that too bivariate analysis. ... of bivariate econometric modeling. ... to build a bivariate econometric model. ... must be two variables in the system. So, let ... is the first and foremost condition of this particular ... of dependent and independent. So, dependent ... because it is the cause and effect relationship ... structure and there is an independent structure.
... when we call it effect it is called a dependent ... bivariate econometric modeling then, there ... the first condition and second condition you ... which one is independent. Because the model is based ... bidirectional causality issue. So, obviously ... system then either X causes Y or Y causes X ... So, that particular structure is called ... process of discussing the detail about time ... of beginning. So, this restriction must be ... reverse causality. So, we are not considering ... X influences Y, so we are assuming that Y does ... Y cannot be treated again as an independent ... classification must be very essential and very ... choice of this particular modeling.
... in other words it is called n greater than ... number of variables in the system. Just ... must be greater than or equal to two. So, that ... as an independent variables setup. So, when ... k is exactly equal to 1. So, when ... to 2. When we go for multivariate modeling ... But for the bivariate analysis or bivariate ... the number of independent variables then, ... be exactly equal to 1. That means, in this ... exactly equal to 1. So, k is treated as a ... system. So, then finally, N greater than n ... number of observations in the system and ... systems.
... there are two variables. So, your sample size ... size is less than 2 then the system itself ... when there is an issue of multivariate, similarly, ... n will be a very serious issue. So, for instance ... in the system. That means, n represents 3 ... three. ... cannot be operated properly. So, to operate ... size or samples; that means, you must have ... you have sufficient data points you cannot ... models.
... of the model or the feasibility of the model ... sample size, the better is the accuracy of the ... models. If the sample size is very small or ... it will affect the system and the model by ... cannot use this particular model for any forecasting ... of policy use or forecasting your model must ... So, what we otherwise call a best fitted ... must have a higher and higher sample size. So, ... setup, that too econometric modeling. So, your ... much higher than the number of variables in the ... to determine the minimum number of ... system.
... in the different versions of econometric modeling. In the very beginning you must know that, whatever variables you are using in a particular system, your sample size should be strictly greater than the number of variables in the system. So, this is another condition of bivariate econometric modeling. Then we come to the sixth condition. We are talking about two variables in the system, X and Y, but we must remember that there must be some variability in X and Y. For instance, suppose I take some observations on X and some observations on Y. If the observations are not in line with the modeling rules and modeling formalities, then of course the model will again be inconsistent or infeasible.
For instance, let us take a case. Suppose I have an X variable and a Y variable. X consists of so many observations, X1, X2, X3, up to Xn; similarly, Y consists of Y1, Y2, Y3, up to Yn. So far X1, X2, X3, up to Xn are stated only implicitly; that is, we do not know what X1 is, what X2 is, what X3 is or what Xn is, and we also do not know what Y1, Y2, Y3, up to Yn are. All we can say here is that X1, X2, up to Xn are the sample points of X and Y1, Y2, up to Yn are the sample points of Y.
But what X1 is, what X2 is, or what Y1 or Y2 is, we have no idea. Now I will give the structure. Let us say X1 equal to 2, X2 equal to 2, X3 equal to 2, and Xn equal to 2. Then in this particular setup there is no variation in the X sample; that means every point is 2. If there is no variation then, obviously, by default the model will be inconsistent. So, there must be some kind of variability in the sample observations. The observations should not be highly dispersed, and they should not be exactly equal either. You have to find the optimum one; that means a midpoint, somewhere in between exclusive equality and exclusive inequality. So, there should be some optimum one.
For instance, instead of 2, 2, 2, suppose I put X1 equal to 1, X2 equal to 2, X3 equal to 4, X4 equal to 7 and Xn equal to 8. Then there is some kind of variability, so this particular setup is consistent for model building. Of course, this is only an initial look at the data points; there is still a statistical test of the observations on this particular variable, and we have to verify it. So, there is a statistical technique; that means we have to check the normal distribution structure before we go for any econometric modeling.
Similarly, in the case of Y there should not be a problem like the 2, 2, 2 case. There should be some variability in Y also, for instance Y1 equal to 2, Y2 equal to 5, Y3 equal to 7 and Y5 equal to 8. That means, if the setup has some kind of variation then, obviously, there is a way to build a model. If all the data points are equal, then we are sometimes very handicapped in handling the situation. So, we need some variability in the data points, and that variability should not be too high; if it is too high, the model will again turn inconsistent. So, to make it consistent, the data should not be absolutely equal and should not be absolutely unequal; it has to be in between the two. That is what we call the optimum one, all right.
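The sample-size and variability conditions can be checked mechanically before any model is built. The following Python sketch is an illustration rather than part of the lecture: the function names are hypothetical, the X values are the ones quoted in the lecture, and the Y values are assumed because the lecture gives only some of them. It flags a sample that has too few observations or no variation.

```python
def sample_variance(values):
    """Unbiased sample variance; needs at least two observations."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def check_bivariate_sample(x, y, n_variables=2):
    """Return a list of violated conditions (an empty list means none were found)."""
    problems = []
    if len(x) != len(y):
        problems.append("X and Y must have the same number of observations")
        return problems
    if len(x) <= n_variables:
        problems.append("sample size must exceed the number of variables")
        return problems
    if sample_variance(x) == 0:
        problems.append("no variation in X")
    if sample_variance(y) == 0:
        problems.append("no variation in Y")
    return problems


constant_x = [2, 2, 2, 2, 2]   # the "2, 2, 2" case from the lecture
varied_x = [1, 2, 4, 7, 8]     # the varied X sample from the lecture
assumed_y = [2, 5, 7, 7, 8]    # fourth value is an assumption for illustration

print(check_bivariate_sample(constant_x, assumed_y))  # ['no variation in X']
print(check_bivariate_sample(varied_x, assumed_y))    # []
```

The sketch does not try to test the other half of the condition, that the variability should not be too high, because the lecture gives no threshold for it.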
Then the seventh and last, but not least, condition is that X and Y should be random in nature; that means it is attached to the issue of probability. There is some kind of chance factor involved in this modeling scenario, that too in bivariate analysis. With this, we have the set of conditions for bivariate econometric modeling.
We now have to proceed further to the structure of this particular issue, that is, the basic structure of Bivariate Econometric Modeling. As the starting point of bivariate modeling we must have two variables, Y and X. Here we will assume that X represents the independent variable and Y represents the dependent variable. So, now, the extension is like this: Y equal to a function of X, just as in mathematics. The next question is how to set up this particular bivariate econometric model. Let Y consist of Y1, Y2, up to Yn, and let X consist of X1, X2, up to Xn. When we have Y and X, we can trace the functional relationship between the two.
When we trace the functional relationship between the two, there are two different techniques. We usually use either correlation, which is an extension of covariance, or another technique called regression, which in turn is an extension of correlation. But so far as econometric modeling is concerned, there is something more than that; it is not just to correlate and just to regress. In between there are lots of hidden issues, and those hidden issues are the crucial point behind higher-level results and more complex problems. We have to discuss all these issues, and econometrics begins from this basic level.
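The chain from covariance to correlation to regression can be made concrete with a few lines of Python. This is a minimal sketch using the standard textbook formulas, which the lecture names but does not write out; the function names and the Y values are assumptions for illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance of two equally long lists."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Correlation: covariance rescaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


def regression_slope(x, y):
    """Slope of Y on X: covariance rescaled by the variance of X only."""
    return covariance(x, y) / covariance(x, x)


x = [1, 2, 4, 7, 8]
y = [3, 5, 9, 15, 17]   # assumed values lying exactly on Y = 1 + 2X

print(covariance(x, y), correlation(x, y), regression_slope(x, y))
```

Because the assumed Y values lie exactly on a straight line, the correlation comes out as 1 and the slope as 2, up to floating-point rounding. Correlation is symmetric in X and Y, whereas the slope treats X as the cause and Y as the effect.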
So, when you have a particular relationship, that too through correlation and regression, there are various measures and structures by which we can represent the relationship, and the aim is a perfect one. To get a perfect one, we have lots of structures, techniques and tools that give a better picture. So, now, we have to see how quickly we can arrive at the best fitted model and what problems we have to face, or find out, to get this best fitted model. This is a very complex and typical issue, so we have to discuss it step by step. Let us start with this particular relationship. When we have two variables in the system, with observations Y1, Y2, up to Yn and observations X1, X2, up to Xn, then obviously the first and foremost step is to build the mathematical form of the model, which is nothing but Y equal to a function of X. That is what we call the mathematical model; it is simply a mathematical model. So, now, we know we have the variables in the system.
So, first you transfer this theoretical information into mathematical information, that is, the individual variables into some functional form. That functional form is treated as the mathematical form of the model. Then we have to transfer this mathematical form of the model into the statistical form of the model, and that is where the econometric issue comes into the picture. Before we go to the statistical form, let us represent the explicit format of this particular model or relationship. Y equal to a function of X means there are many ways this relationship between Y and X can be worked out. So, what we have to do here is this.
We have to look at it in an explicit format; that means whether the relationship is linear or non-linear, because this is a very strong issue for the modeling behavior. So, let us assume that there is a relationship and that it is a linear relationship.
So, now, we represent it as Y equal to alpha plus beta X. It is just like the straight line equation Y equal to mX plus c, where m is the slope and c is the constant. Here, instead of Y equal to mX plus c, we write beta X plus alpha: beta is the slope, alpha is the supporting or constant factor, X is the independent variable and Y is the dependent variable. This is what we call the explicit format of the mathematical model.
So, this is the second step of the process, and the functional form was step one of the bivariate setup. Up to this point, bivariate econometric modeling means that we bring in two variables and build their relationship in a functional form, and that functional form again has to be in an explicit format. Once you have an explicit mathematical model, we have to transfer it into the statistical form of the model. What is the statistical form of the model? For that you have to move to step three: Y equal to alpha plus beta X plus another term called U, where U represents the error term. This particular model is called the statistical form of the model. So, we have simple Y and X; then we transfer them into the mathematical form of the model, in the explicit format Y equal to alpha plus beta X. Then this model is again transferred into the statistical format, that is, Y equal to alpha plus beta X plus U.
For instance, I will put it in a different way. Y equal to alpha plus beta X is the mathematical model. Then we transfer it to Y equal to alpha plus beta X plus U, which is the statistical model.
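Written out for each observation, the two forms differ only in the error term. The subscript i and the index range are notational assumptions added here, following the Y1 up to Yn labeling used in the lecture; the lecture itself states the equations without subscripts.

```latex
\begin{align}
  Y_i &= \alpha + \beta X_i,       && i = 1, \dots, n \quad \text{(mathematical form)} \\
  Y_i &= \alpha + \beta X_i + U_i, && i = 1, \dots, n \quad \text{(statistical form)}
\end{align}
```

In the first equation every observation lies exactly on the line; in the second, each observation is allowed to deviate from the line by its own error U_i.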
So, the difference between these two models is with respect to this particular U term, and U represents the error term. Now, the issue is: what is this component U, why is there a U in this system, and from where have we brought it? Our journey began from Y and X, and in between U has been introduced into the system. The question is why, because there is a debate about this issue. Basically, there is always a fight between mathematics and statistics. Mathematicians always believe that everything is exact; that means what is on the right side should be exactly equal to what is on the left side. But statisticians do not agree with this. They are of the view that there is something hidden in nature; that means nothing can be exact in society, and there is always an inexact process. There is something which we cannot exactly explore or exactly represent in the system. If you cannot exactly represent it in the system, then there is a gap, and that gap can be filled through the term U, which is nothing but the error component.
That means, let us say the mathematical form is equation number 1 and the statistical form of the model is equation number 2. In equation 2 the right-hand side is the cause side and the left-hand side is the effect side; the one is the independent structure and the other is the dependent structure. Now, when we ask about the effect, there are two dimensions on the right-hand side: one dimension is alpha plus beta X and the other dimension is U. Alpha plus beta X is called the explained factor, the error term U is called the unexplained factor, and Y on the left-hand side is called the total factor.
So, the total effect comes partly from the explained part, which is derived from X, and whatever is not derived through X, that is, through the independent variable, goes to the U component, the error component, which is not in your hands. All the explained items are known to us, while U is unexplained in nature because it is not in your control and we do not have any idea about that particular item. So, our target is to find out what is lacking in the system, that is, how much we could not represent or could not explain in this particular system. That is the main issue, or main agenda, of econometric modeling.
So, we would like to know the error component, which we cannot have at the beginning, and it has to be adjusted continuously. Our objective is always to build the model in such a way that the error component is at its minimum level. The model can be represented as best fitted if everything is explained and nothing is unexplained. But it is very difficult to say that something is totally explained and nothing is unexplained. There is always a little left over; even if it is 1 percent, that 1 percent also sometimes has weightage.
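One common way to make the error component as small as possible is ordinary least squares, which picks alpha and beta to minimise the sum of squared errors. The lecture has not named an estimation method at this point, so treat the following Python sketch as an illustration under that assumption; the function name and the data are hypothetical. It splits each Y into an explained part (alpha plus beta X) and an unexplained part (the residual).

```python
def fit_line(x, y):
    """Least-squares estimates of alpha and beta in Y = alpha + beta * X + U."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


x = [1, 2, 4, 7, 8]
y = [2, 5, 7, 7, 8]   # assumed observations, for illustration only

alpha, beta = fit_line(x, y)
explained = [alpha + beta * a for a in x]             # alpha + beta * X
unexplained = [b - e for b, e in zip(y, explained)]   # estimate of U

mean_y = sum(y) / len(y)
total_ss = sum((b - mean_y) ** 2 for b in y)
explained_ss = sum((e - mean_y) ** 2 for e in explained)
unexplained_ss = sum(u ** 2 for u in unexplained)

print(f"alpha={alpha:.3f}, beta={beta:.3f}")
print(f"share explained by X: {explained_ss / total_ss:.3f}")
print(f"share left in U:      {unexplained_ss / total_ss:.3f}")
```

With a least-squares fit that includes the constant alpha, the two shares add up to one, which mirrors the idea that the total is the explained part plus the unexplained part.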
So, we would like to know why there is an error component in the system of statistical modeling, that is, why we use an error component in the system. We are in a bivariate system, but even in a multivariate system the error component is a must. So, we need a justification for introducing the error component, and there are many reasons for it.
Why is there an error component in the system? Number one: there are certain variables which can explain the dependent variable, but we are not in a position to include them. There may be many reasons for that. They may not be available in our hands, with respect to information or with respect to structure. Sometimes the idea is there, but we are not able to represent it in a particular format, and sometimes we have no idea at all. Some variables may be affecting Y, but for the time being we are not in a position to represent them, even though they can also influence the Y component.
That means some of the relevant, or useful, variables are not included in the system. Since some variables are not used in the system, obviously there is some percentage which cannot be explained, and so there must be some error component. So, first, useful variables are not included in the system. Second, some unnecessary, or irrelevant, variables are included in the system. This may also be a cause of the unexplained part. There may be an error term because some variables which do not make any contribution are nevertheless affecting the system, and as a result we have to introduce the error term. That is our second contributing factor on the effect side. Third, there is sometimes a mathematical imperfection in the model. For instance, we have Y and X and we are just putting Y equal to a function of X, that too Y equal to alpha plus beta X.
But there are many ways Y and X can be related. For instance, Y can be alpha into beta to the power X, or alpha by beta to the power X, or a form involving log X and log beta; there are many ways to represent the relationship. Since for the time being we are assuming Y equal to alpha plus beta X, there may be some technical or mathematical problem. At a particular point of time we have to use only one relationship; that means we cannot take Y equal to alpha plus beta X and Y equal to alpha into beta to the power X at the same time.
So, of course, what we can do is test the model with different functional forms. For instance, we take Y equal to alpha plus beta X in one instance and Y equal to alpha into beta to the power X in another. We set up the different forms of the model and test them to see which of the two gives the best fitted model. Finally, we have to say that this is the best fitted model, which we have derived on the basis of some decision making process. Likewise, there are many ways the functional form can be established. If the functional form is represented in different ways, the model building structure will be completely different and the results will also be completely different. But one particular mathematical form has to be used. So, basically we start with the simple one, and if by chance we get the best fitted model with the simple one, then we are on the right track. If the model accuracy is not adequate with that functional form, then obviously we have to go one by one, proceeding from one form to another, to get the best fitted model.
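The procedure of trying one functional form after another can be sketched as follows. This Python example is an illustration under stated assumptions: it compares only the two forms mentioned in the lecture, fits the exponential form by regressing log Y on X (which requires every Y to be positive), and uses the sum of squared errors as the decision rule, a choice the lecture does not specify. All names and data are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares alpha and beta for y = alpha + beta * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    return mean_y - beta * mean_x, beta


def squared_error(y, fitted):
    return sum((b - f) ** 2 for b, f in zip(y, fitted))


def compare_forms(x, y):
    """Return the squared error of each form, measured on the original Y scale."""
    a_lin, b_lin = fit_line(x, y)
    linear_fit = [a_lin + b_lin * a for a in x]

    # Y = alpha * beta**X is linear in logs: log Y = log alpha + X * log beta
    log_a, log_b = fit_line(x, [math.log(b) for b in y])
    exponential_fit = [math.exp(log_a + log_b * a) for a in x]

    return {
        "linear": squared_error(y, linear_fit),
        "exponential": squared_error(y, exponential_fit),
    }


def best_form(x, y):
    """Name of the form with the smaller squared error."""
    errors = compare_forms(x, y)
    return min(errors, key=errors.get)


x = [1, 2, 3, 4, 5]
y_exponential = [3 * 1.5 ** a for a in x]   # assumed data generated from Y = 3 * 1.5**X
y_linear = [1 + 2 * a for a in x]           # assumed data generated from Y = 1 + 2X

print(best_form(x, y_exponential))  # exponential
print(best_form(x, y_linear))       # linear
```

Starting with the linear form and moving to the next one only when the fit is poor, as the lecture suggests, amounts to evaluating these candidates in order rather than all at once.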
298
00:36:07,520 --> 00:36:14,520
So, mathematical imperfection of the modelalsooneof
the committingfactor which came you can sayinvolving
299
00:36:17,000 --> 00:36:24,000
in the eve issuenext fourth there is a misspecification
of the random terms.There is a question on
300
00:36:24,470 --> 00:36:30,369
misspecification of the random terms.We are
always talking about X and Y are random in
301
00:36:30,369 --> 00:36:35,589
nature.So that means,there must be some level
of or some environmental probability in the
302
00:36:35,589 --> 00:36:42,589
particular system.The term probability itself
represent this chance of occurrence.That means
303
00:36:42,940 --> 00:36:48,630
something which is not in your control. So,
now, which is something not in your control
304
00:36:48,630 --> 00:36:54,440
means obviously, that control may be in many
ways.It can be at a higher level it can be
305
00:36:54,440 --> 00:37:01,440
at lower level it can be at the medium levels.
So, now what you have to do sincewe have no
306
00:37:01,560 --> 00:37:08,560
idea whether it is the higher one, the lower one or the medium
one, so we have to assume at least one. Then
307
00:37:09,270 --> 00:37:15,440
accordingly that error involvement must be
incorporated in the system. Since we are not
308
00:37:15,440 --> 00:37:22,440
sure whether the impact is the higher one, the lower one or
the medium one, we have to do that. So, this
309
00:37:23,650 --> 00:37:30,650
is how the error is involved in the system. Then,
fifth, there may be some question of luck in
310
00:37:31,230 --> 00:37:38,230
the system. Take the case of, you know, social
problems. For instance, you would like to know what
311
00:37:38,810 --> 00:37:45,810
is the impact of expenditure on a particular,
you can say, sales revenue. So, expenditure
312
00:37:48,079 --> 00:37:53,220
here means, you can say, advertising expenditure.
So now, the theoretical knowledge is that
313
00:37:53,220 --> 00:37:59,960
if you put more and more investment into advertising, then
obviously there is a strong impact on, you
314
00:37:59,960 --> 00:38:06,960
know, sales revenue. So now, the issue may
be somewhat different, because we are discussing
315
00:38:08,510 --> 00:38:14,450
about one problem. So that means we are discussing
a particular issue. Let us say this is a pen.
316
00:38:14,450 --> 00:38:21,450
So now, I would like to know: if there is some
kind of investment in advertising this pen, then
317
00:38:21,940 --> 00:38:28,550
obviously, the growth or sales of this particular
pen will follow an increasing sequence.
318
00:38:28,550 --> 00:38:35,550
But you are not in, you can say, a monopoly
situation. There are many competitors in this
319
00:38:35,770 --> 00:38:42,770
particular business environment. So, everybody
is doing the same thing. So, behind this there is
320
00:38:42,950 --> 00:38:47,819
a competitive issue. Then obviously, the formula
is somewhat, you know, a direct one. So, you are
321
00:38:47,819 --> 00:38:54,180
involved, and other people are also involved.
So, by the way, there are certain factors
322
00:38:54,180 --> 00:39:01,180
which are again, you can say, third variables
in nature. They can also, you can say, enter into
323
00:39:01,400 --> 00:39:08,400
or, you can say, determine the accuracy of the model.
So, as a result, there is, you can say, luck which
324
00:39:09,359 --> 00:39:14,460
can be involved in the issue. Everybody's, you
can say, objective is that if you put more
325
00:39:14,460 --> 00:39:19,200
and more investment into advertising, our sales
will go on increasing.
326
00:39:19,200 --> 00:39:26,040
So, we are just believing that. That means we
are assuming that if you put more into
327
00:39:26,040 --> 00:39:30,940
advertising, then obviously sales will be good. So
that means we are not sure. It is just not
328
00:39:30,940 --> 00:39:35,650
like the mathematical way, where you put 2
plus 3 and you get 5. You put, you
329
00:39:35,650 --> 00:39:40,050
can say, 3 plus 3 and you get 6. It is not
like that, because we are putting in something
330
00:39:40,050 --> 00:39:46,819
and then the effect may be somewhat
different. That means, in between cause and effect,
331
00:39:46,819 --> 00:39:52,930
there are certain variables which can also
affect the system. So, as a result, sometimes
332
00:39:52,930 --> 00:39:58,010
that factor may be considered a luck factor.
So, it is luck because, you know, you are
333
00:39:58,010 --> 00:40:05,010
not sure, and sometimes luck is not supporting
you on this particular issue. Then obviously,
334
00:40:05,109 --> 00:40:11,200
you can say, the model cannot be an accurate one or
cannot be a perfectly explained one. So, some
335
00:40:11,200 --> 00:40:18,200
unexplained part is there. So, as a result,
error must be in the system. Then last, but
336
00:40:19,099 --> 00:40:26,099
not the least, is what is called external factors. Besides
luck, there are certain factors which are not
337
00:40:33,359 --> 00:40:40,359
in your control. For instance, either you are
not aware of them, or they may arise in a certain,
338
00:40:46,170 --> 00:40:53,170
you can say, particular situation. So,
in that context, since you are not sure or
339
00:40:53,319 --> 00:40:57,700
you are not certain, then obviously there
is an error issue.
340
00:40:57,700 --> 00:41:04,700
For instance, take the case of the impact of terrorism.
So, everything is planned in a proper way,
341
00:41:06,119 --> 00:41:11,490
we are very serious, and we know all
these explained variables which are used
342
00:41:11,490 --> 00:41:18,490
in a particular system. We are in the process
of designing and building it, but unfortunately
343
00:41:19,170 --> 00:41:26,170
in between there is a terrorist activity.
So, as a result, there will be some inconsistency. Take
344
00:41:26,800 --> 00:41:33,450
the case of the same thing, say, you know, between
advertising and sales. You are, you know, putting
345
00:41:33,450 --> 00:41:38,280
in more advertising and increasing the awareness
among the people, so that your sales of that
346
00:41:38,280 --> 00:41:45,280
particular item can go on increasing.
But if by any chance there is a terrorist attack on
347
00:41:45,359 --> 00:41:51,150
your plant, say, then obviously your plant
will get damaged totally, and whatever investment
348
00:41:51,150 --> 00:41:55,579
you have made in advertising that particular
product, that product also cannot be made available
349
00:41:55,579 --> 00:42:00,800
for the market. That means, since production
is not there, whatever amount you have
350
00:42:00,800 --> 00:42:05,770
put into advertising has no meaning at all.
So, it has no impact at all. So, as a result
351
00:42:05,770 --> 00:42:11,930
of some external factors which are not
in your control, we have to put
352
00:42:11,930 --> 00:42:18,930
them in a new component, that is, the error component. So
that means error is always there in a
353
00:42:19,280 --> 00:42:22,900
particular system.
When we talk about the statistical form of
354
00:42:22,900 --> 00:42:29,780
the model, we have to see which are the
variables that are explained
355
00:42:29,780 --> 00:42:34,109
in nature and which are the variables that
are not explained; those we will represent in
356
00:42:34,109 --> 00:42:41,109
the form of U. U is treated as a proxy for
unexplained variables which are not known
357
00:42:44,450 --> 00:42:51,450
to us or which are not exactly identified.
So, since we have no idea about them, we
358
00:42:52,059 --> 00:42:58,569
are assuming that they are in U only. So, the error
will incorporate all these defects which are
359
00:42:58,569 --> 00:43:00,059
not in our control.
360
00:43:00,059 --> 00:43:05,470
So, that means, so far as Bivariate econometric
modeling is concerned, the system will be
361
00:43:05,470 --> 00:43:11,809
like this: our starting point will be Y, then
X, and in between U is introduced in the system.
362
00:43:11,809 --> 00:43:18,809
So now, like this: Y1, Y2, Y3, Y4, up to
Yn; then X1, X2, X3, X4, up to Xn; similarly,
363
00:43:25,410 --> 00:43:32,410
U1, U2, U3, U4, up to Un. So now, if we
go simply by the mathematics, then obviously
364
00:43:39,109 --> 00:43:46,109
Y1 equals X1 plus U1, Y2 equals X2
plus U2, Y3 equals X3 plus U3,
365
00:43:54,480 --> 00:44:01,480
Y4 equals X4 plus U4. So, similarly, Yn equals
Xn plus Un. So that means all the X are in one group
366
00:44:07,500 --> 00:44:14,500
and all U are in another group and the total
effect will be on Y. So, this is explained
367
00:44:17,349 --> 00:44:24,349
effect, and this is the unexplained effect.
So now, our objective is to minimize this
368
00:44:28,829 --> 00:44:34,220
particular effect. So that means, given the way
we have to minimize it, you need to have a best
369
00:44:34,220 --> 00:44:41,220
fitted model. So, you need to have a best
fitted model, for instance like this. So,
370
00:44:41,740 --> 00:44:48,740
now, if we go by a simple framework, then
let us assume that for this particular variable
371
00:44:49,040 --> 00:44:54,380
your graph will be like this: on this side
the X measurement, and on this side the Y measurement.
372
00:44:54,380 --> 00:45:00,390
Since the functional form is Y equals alpha
plus beta X, alpha is a constant. So,
373
00:45:00,390 --> 00:45:07,390
this will be just a supporting factor, like this.
So now, for every X1, every X2, every X3,
374
00:45:07,550 --> 00:45:14,550
every X4, X5, like this. So, for X1 there may
be some, you can say, Y1; for X2 there may be
375
00:45:15,740 --> 00:45:22,740
Y2; for X3 there may be Y3; for X4 there may be,
you can say, Y4; for X5 there will be Y5;
376
00:45:26,280 --> 00:45:33,280
then for X6, obviously, Y6, like this.
So now, if we join all such points, then
377
00:45:34,950 --> 00:45:41,349
the picture will come out like this, and this
is the true picture of this particular setup. So,
378
00:45:41,349 --> 00:45:48,349
that means we have Y information and
X information, and our question is how Y and X
379
00:45:49,589 --> 00:45:56,589
are related to each other. This is the first objective.
And if Y and X are related to each other, how
380
00:45:58,619 --> 00:46:05,619
best can they be, you can say, related to each
other? This is the basic objective behind econometric
381
00:46:06,210 --> 00:46:11,630
modeling. Obviously, when you go for investigating
the relationship between Y and X, there will
382
00:46:11,630 --> 00:46:18,270
be a certain relationship. Either you assume it,
or by theory you have to bring these variables together
383
00:46:18,270 --> 00:46:24,550
in such a way that there is some relationship.
So that means that the relationship is there,
384
00:46:24,550 --> 00:46:31,550
but we have to predict, or we have to forecast, how
best they can be, you can say, related to each
385
00:46:32,819 --> 00:46:39,660
other, so that the effect will be very positive
and, you know, very accurate. So,
386
00:46:39,660 --> 00:46:45,130
that should be our main agenda. So,
as a result, within the particular setup we
387
00:46:45,130 --> 00:46:49,910
have to build a, you can say, line which is
the best for you.
388
00:46:49,910 --> 00:46:56,680
That means, suppose you consider this as a
path, and this path is very uneven in
389
00:46:56,680 --> 00:47:03,530
nature. It is not at all straight; it is just
like a non-linear one. So that means we have
390
00:47:03,530 --> 00:47:10,240
to bring it into a linear path. Whatever the differences,
you have to bring in the best path so that the
391
00:47:10,240 --> 00:47:14,000
model accuracy or forecasting can be a very
accurate one.
392
00:47:14,000 --> 00:47:19,819
So, as a result, let us assume that this particular
line is the best-fitted one. So, if you
393
00:47:19,819 --> 00:47:26,819
say best-fitted one, in statistics we call
it the Y hat estimated line. So, this
394
00:47:26,890 --> 00:47:33,890
is what we call the estimated line, otherwise
called the expected line: Y hat equals alpha
395
00:47:36,450 --> 00:47:43,450
hat plus beta hat X. Y hat equals alpha
hat plus beta hat X. That means, in other
396
00:47:43,970 --> 00:47:50,970
words, we have three forms of the function. Y equals
alpha plus beta X: this is the mathematical
397
00:47:51,079 --> 00:47:58,079
form of the model.
Then we have Y equals alpha plus beta X plus U: this
398
00:47:58,290 --> 00:48:05,290
is the statistical form of the model. Then
we have Y hat equals alpha hat plus beta
399
00:48:05,520 --> 00:48:10,829
hat X. When we write Y hat equals
alpha hat plus beta hat X, this is the estimated
400
00:48:10,829 --> 00:48:16,180
line. So, if we say that this is the estimated
line, this is the best line, this is the perfect
401
00:48:16,180 --> 00:48:23,099
line, then of course the error issue will
not be there. So that means the way we
402
00:48:23,099 --> 00:48:29,240
choose the model should be the best one. Obviously,
as per your knowledge, it should be a well-explained
403
00:48:29,240 --> 00:48:36,240
one.
So, if it is a fully explained one, then obviously
404
00:48:36,579 --> 00:48:43,579
the error will not be there in the system. So
that means we first start with the exact model;
405
00:48:45,950 --> 00:48:51,750
then, in between, you have to assume that the
model is not exact. So that means we have
406
00:48:51,750 --> 00:48:57,359
to bring in the inexactness of that particular system;
then again we have to verify, or you have to
407
00:48:57,359 --> 00:49:04,359
come to this stage again. It will be transformed back to
the same exact model, that is, in the way of the mathematical
408
00:49:04,380 --> 00:49:08,599
form of the model.
But the mathematical form of the model in
409
00:49:08,599 --> 00:49:14,270
the initial setting and the estimated form of
the model in the latter setting may not be exactly
410
00:49:14,270 --> 00:49:20,339
equal. So, here the point of this mathematical
form of the structure, or form of the model, is
411
00:49:20,339 --> 00:49:25,760
that we know this should be the relationship,
that is, the exact relationship. Now, why there
412
00:49:25,760 --> 00:49:31,250
is statistics? Because statistics always objects to
this particular mathematics. So, if there
413
00:49:31,250 --> 00:49:38,160
is a question of objection, then there is a
need for information to verify it. So,
414
00:49:38,160 --> 00:49:43,630
the statistician's, you can say, assignment
is to verify that
415
00:49:43,630 --> 00:49:50,220
particular mathematical problem, the mathematical
form of the model, so that the judgment can
416
00:49:50,220 --> 00:49:56,020
be an accurate one. In the same way, this structure is what
econometric modeling is all about.
417
00:49:56,020 --> 00:50:02,440
So now, this is called the best-fitted
model. So now, if we integrate all these
418
00:50:02,440 --> 00:50:09,030
things, then we will have Y equals Y hat
plus e. So, this is another form of the
419
00:50:09,030 --> 00:50:16,030
equation. So that means your true value Y
depends upon the estimated value and the error term.
420
00:50:19,369 --> 00:50:25,880
So, that estimated value may be perfect; still
there is a question of the error term. So, we
421
00:50:25,880 --> 00:50:32,819
have to see how much error is committed in
the system, and whatever, you know, error is
422
00:50:32,819 --> 00:50:39,819
there in the system, that should be the least.
So, that gives, you know, more accuracy for this
423
00:50:42,160 --> 00:50:45,790
particular system.
So, altogether,
424
00:50:45,790 --> 00:50:52,790
So, what we have discussed is that Y equals
alpha plus beta X. This is the mathematical form
425
00:50:53,380 --> 00:51:00,380
of the model. Then Y hat equals alpha hat plus
beta hat X. This is the estimated model. In between,
426
00:51:05,690 --> 00:51:12,119
from the mathematical model, we assume that
Y equals alpha plus beta X plus U is the
427
00:51:12,119 --> 00:51:17,710
statistical form of the model. So, this is
the mathematical form of the model, this
428
00:51:17,710 --> 00:51:24,710
is the statistical form of the model, and
this is the estimated model. Now, if we integrate
429
00:51:28,010 --> 00:51:31,609
all these things, then Y equals Y hat plus
e.
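The three forms just summarized, and the identity Y equals Y hat plus e, can be sketched numerically. The Python sketch below is only an illustration under stated assumptions: the values of alpha and beta, the simulated X and U, and the use of NumPy's `polyfit` (a least-squares line fit) as a stand-in for the estimation step are not from the lecture, which leaves the estimation method for the next class.

```python
import numpy as np

# Hypothetical "true" parameters and simulated data, chosen only for illustration.
alpha, beta = 2.0, 0.5
rng = np.random.default_rng(seed=0)
X = np.arange(1.0, 11.0)                          # X1 ... Xn
U = rng.normal(loc=0.0, scale=1.0, size=X.size)   # unexplained part U1 ... Un

Y_exact = alpha + beta * X        # mathematical form: Y = alpha + beta X
Y = alpha + beta * X + U          # statistical form:  Y = alpha + beta X + U

# Estimated form: Y hat = alpha hat + beta hat X.
# Assumption: a least-squares fit stands in for the estimation step.
# np.polyfit returns coefficients from the highest degree down: [slope, intercept].
beta_hat, alpha_hat = np.polyfit(X, Y, deg=1)
Y_hat = alpha_hat + beta_hat * X

e = Y - Y_hat                     # e = Y - Y hat, one value per observation

print("alpha hat:", alpha_hat, "beta hat:", beta_hat)
print("Y equals Y hat plus e:", np.allclose(Y, Y_hat + e))
```

Note that U (the unexplained part around the true line) and e (the gap around the estimated line) are computed from different lines, so in a sketch like this they are close but not identical.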
430
00:51:31,609 --> 00:51:38,609
So now, what is e? e is another
way of representing the error term. So, because
431
00:51:41,210 --> 00:51:47,059
we are already in the estimation process,
obviously we are putting the error components
432
00:51:47,059 --> 00:51:52,869
under different names, but it is more or less
the same. So, e is basically the difference
433
00:51:52,869 --> 00:51:59,869
Y minus Y hat. That means what is the
true value and what is the estimated value; that
434
00:52:00,980 --> 00:52:07,980
is the error committed. So, error means the gap between what
is the actual and what is the predicted or
435
00:52:10,010 --> 00:52:16,630
estimated or assumed value. So, we are
assuming that this should be the perfect
436
00:52:16,630 --> 00:52:23,240
one. Then there is the actual. So, we would like
to know what the difference is between the
437
00:52:23,240 --> 00:52:29,339
actual and the estimated. So, that difference
is called the error, like this. So,
438
00:52:29,339 --> 00:52:33,500
these here are the true values, OK?
So, we are assuming this is the estimated value.
439
00:52:33,500 --> 00:52:40,500
So now, the difference is all about this error
issue. So, these here are called the error
440
00:52:42,290 --> 00:52:47,240
issue. Like, you know, for e, this is called e1,
this is e2, this is e3, like this. So, this
441
00:52:47,240 --> 00:52:53,930
side is the X measurement and this side the Y
measurement. This is the estimated model, Y hat. In
442
00:52:53,930 --> 00:53:00,240
between, it is shaped by the integration
between the true points, the actual points,
443
00:53:00,240 --> 00:53:07,240
and the estimated points. So now, this is the
typical issue here, or this is the basic
444
00:53:07,480 --> 00:53:13,630
statistical point of econometric modeling. So
now, what is our next objective? Our next objective
445
00:53:13,630 --> 00:53:19,200
is to minimize these error terms.
So far as minimization is concerned, or of
446
00:53:19,200 --> 00:53:25,569
course when there is a question of optimization,
we cannot optimize a single error. So, we
447
00:53:25,569 --> 00:53:32,569
have to find the minimum sum of squares.
So, the sum of the errors, the squared errors, has
448
00:53:37,150 --> 00:53:44,119
to be minimized. So, that process is more
complex, and very interesting, very useful,
449
00:53:44,119 --> 00:53:50,650
very systematic. So, that we will discuss
in the next class. So, it is not possible to start
450
00:53:50,650 --> 00:53:55,809
here now. So, the detailed structure we have
to discuss in the next class. Thank you very much;
451
00:53:55,809 --> 00:53:56,690
have a nice day.
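As a small preview of the objective stated at the end of the lecture, minimizing the sum of squared errors, here is a minimal Python sketch. Everything in it is an assumption for illustration: the five data points are made up, the helper name `sum_of_squared_errors` is hypothetical, and NumPy's `polyfit` is used as a ready-made least-squares fit rather than the derivation the next class will cover. It also illustrates one common reason for squaring (an addition here, not stated in the lecture): positive and negative errors can cancel, so the plain sum of errors can be zero even for a poor line.

```python
import numpy as np

def sum_of_squared_errors(alpha_hat, beta_hat, X, Y):
    """Sum of e_i squared, where e_i = Y_i - (alpha_hat + beta_hat * X_i)."""
    e = Y - (alpha_hat + beta_hat * X)
    return float(np.sum(e ** 2))

# Made-up observations, for illustration only.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

# Least-squares line (assumption: np.polyfit stands in for the estimation step).
beta_fit, alpha_fit = np.polyfit(X, Y, deg=1)
sse_fit = sum_of_squared_errors(alpha_fit, beta_fit, X, Y)

# Two other candidate lines: one shifted up, one with a flatter slope.
sse_shifted = sum_of_squared_errors(alpha_fit + 0.5, beta_fit, X, Y)
sse_tilted = sum_of_squared_errors(alpha_fit, beta_fit - 0.2, X, Y)

# A flat line at the mean of Y: its errors cancel to zero in the plain sum,
# yet its sum of squared errors is larger than that of the fitted line.
flat_errors = Y - Y.mean()
sse_flat = float(np.sum(flat_errors ** 2))

print("SSE fitted:", sse_fit)
print("SSE shifted:", sse_shifted, "SSE tilted:", sse_tilted)
print("flat line: sum of errors", flat_errors.sum(), "SSE", sse_flat)
```

In this sketch the fitted line has the smallest sum of squared errors among the candidates tried, which is the sense in which it is the best-fitted line.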