1
00:00:17,950 --> 00:00:24,950
Welcome to the NPTEL project on econometric
modelling. This is Rudra Pradhan here. Today,
2
00:00:25,570 --> 00:00:32,570
we will discuss the concept of univariate
econometric modelling. First of all, what
3
00:00:34,480 --> 00:00:41,480
is univariate econometric modelling? It is
a statistical analysis that considers only
4
00:00:43,590 --> 00:00:47,750
one factor or variable at a time.
5
00:00:47,750 --> 00:00:54,750
It explores each variable in a data set separately
and is essential for multivariate analysis.
6
00:01:01,560 --> 00:01:08,560
So, univariate modelling is an essential condition
for multivariate modelling. Univariate modelling
7
00:01:12,490 --> 00:01:19,490
is also called descriptive statistics. It
is concerned with the description or
8
00:01:20,450 --> 00:01:27,450
summarization of individual variables in a
given data set. So, let me explain what
9
00:01:29,119 --> 00:01:34,619
univariate econometric modelling is all about.
10
00:01:34,619 --> 00:01:41,619
Let us take a case here. We have a series of
variables X 1, X 2 up to X n, and Y 1, Y 2 up
11
00:01:47,649 --> 00:01:54,649
to Y n. So, this is the multivariate
framework of econometric modelling. So here,
12
00:02:13,800 --> 00:02:20,800
we are very much concerned about this statistical
analysis of a particular variable or series
13
00:02:21,650 --> 00:02:28,650
of variables and their interrelationships.
So, if we go through this multivariate framework
14
00:02:30,870 --> 00:02:37,680
of econometric modelling, then usually we
have two different sets of variables. One
15
00:02:37,680 --> 00:02:44,680
set of variables is called X 1, X 2 up
to X n and another set of variables is called
16
00:02:47,069 --> 00:02:54,069
Y 1, Y 2 up to Y n. The X series is
called the independent variable cluster,
17
00:02:58,379 --> 00:03:05,379
and the series
Y 1, Y 2 up to Y n is called
18
00:03:12,730 --> 00:03:19,730
the dependent variable cluster.
So, multivariate framework or multivariate
19
00:03:25,489 --> 00:03:32,489
econometric modelling is nothing but the structure
or integration of independent variables
20
00:03:35,040 --> 00:03:42,040
and dependent variables. So, it is the game
between independent variables and dependent
21
00:03:42,719 --> 00:03:48,510
variables. These are also called
exogenous and endogenous variables, respectively.
22
00:03:48,510 --> 00:03:55,510
There are two different situations altogether.
Situation one: one dependent
23
00:04:00,219 --> 00:04:07,219
variable with several independent variables.
So, situation one has one dependent
24
00:04:21,280 --> 00:04:28,190
variable with several independent variables.
So, this particular structure is called
25
00:04:28,190 --> 00:04:35,190
a simple multivariate framework or simple
multivariate modeling.
26
00:04:49,810 --> 00:04:56,810
Situation two: there are several
dependent variables plus several independent
27
00:05:09,640 --> 00:05:16,640
variables. So, this particular structure is
called simultaneous equation modelling
28
00:05:22,910 --> 00:05:29,910
or structural
equation modelling. So, we have two different
29
00:05:36,440 --> 00:05:43,440
games so far as real-world problems are concerned.
On one side, we have one dependent variable
30
00:05:47,000 --> 00:05:53,300
with several independent variables. In the other
situation, we have a series of dependent variables
31
00:05:53,300 --> 00:06:00,300
and a series of independent variables.
So now, let me take a case here. So, this
32
00:06:02,100 --> 00:06:09,100
is the independent cluster, this is the dependent
cluster. For the independent cluster,
33
00:06:17,490 --> 00:06:24,490
we will write X 1, X 2, X 3, X 4, up to
X n. For the dependent cluster, we have Y 1, Y 2 up
34
00:06:29,570 --> 00:06:36,570
to Y n. Now, in situation
one, we have to integrate Y 1 with X
35
00:06:39,630 --> 00:06:46,630
1, Y 1 with X 2, Y 1 with X 3 and so on up to
Y 1 with X n, or Y 2 with X 1, Y 2 with X
36
00:06:51,320 --> 00:06:58,320
2, Y 2 with X 3, Y 2 with X 4, or Y 2 with
X n. Likewise, we can integrate Y 3 with so many
37
00:07:01,090 --> 00:07:06,220
independent variables and Y 4 with so many
independent variables, and again Y n with
38
00:07:06,220 --> 00:07:12,150
so many independent variables. So, this particular
framework is called simple multivariate
39
00:07:12,150 --> 00:07:19,150
econometric modelling. However, in the case
of structural equation modelling, the
40
00:07:20,770 --> 00:07:27,770
structure is completely different. So, that
means, here every variable has an integration
41
00:07:28,200 --> 00:07:35,200
with other variables. This is condition
one, and condition two is that there are series
42
00:07:36,700 --> 00:07:43,700
of dependent variables and series of independent
variables. So, now within this detailed structure,
43
00:07:46,570 --> 00:07:53,570
we have to discuss here what
univariate econometric modelling is all about.
44
00:07:56,520 --> 00:08:03,520
So, univariate econometric modelling is
abbreviated here as UEM;
45
00:08:08,010 --> 00:08:14,770
UEM stands for univariate
econometric modelling. So, let me first highlight
46
00:08:14,770 --> 00:08:21,770
here the entire structure of
univariate modelling. So, the basic objective
47
00:08:21,970 --> 00:08:28,970
behind univariate modelling is that we have
to describe or summarize a particular
48
00:08:30,530 --> 00:08:37,530
variable in a given setup. If the setup consists
of, say, ten variables, we have to
49
00:08:39,690 --> 00:08:46,220
analyze one particular variable at a time. For
instance, if we have X 1, X 2, X 3 up to X
50
00:08:46,220 --> 00:08:51,420
10, then we would like to know the features
of X 1, the features of X 2,
51
00:08:51,420 --> 00:08:58,420
the features of X 3, and so on up to the features
of X 10, because this is the prime requirement
52
00:08:58,590 --> 00:09:04,850
of multivariate econometric modelling. Unless
you know the structure and setup
53
00:09:04,850 --> 00:09:11,850
of each univariate data series, you cannot
go any further or get better solutions.
54
00:09:12,150 --> 00:09:19,150
So, univariate econometric modelling deals
with three issues. The first is
central tendency, the second is dispersion,
55
00:09:30,630 --> 00:09:37,630
and the third is skewness and kurtosis.
So, what is central tendency? It is the single
56
00:09:41,880 --> 00:09:48,880
figure, which describes the entire setup.
So, the central tendency will give you an indication
57
00:09:50,680 --> 00:09:57,680
of a single figure. So, like this: there
is a set of observations here. So, we have
58
00:09:59,590 --> 00:10:06,590
to target the particular observation that is
most important and can describe the entire
59
00:10:07,320 --> 00:10:14,320
issue. The dispersion is within the setup
again. So, it represents the variability of
60
00:10:17,440 --> 00:10:23,260
the observations of a particular variable.
So, let us say this is a variable, say
61
00:10:23,260 --> 00:10:30,260
X 1. Now, these points are represented as
X 1 1, X 1 2, X 1 3 up to, say, X 1 n. These
62
00:10:32,950 --> 00:10:39,950
are the data points of that particular variable.
So now, which particular value is the central one,
63
00:10:42,710 --> 00:10:49,680
or you can say the center, that describes the complete
information within the structure or within
64
00:10:49,680 --> 00:10:54,890
that particular variable. Now, let us say
this is the center here. Let us call
65
00:10:54,890 --> 00:11:01,790
it the unit which represents the entire
structure of this particular variable. So,
66
00:11:01,790 --> 00:11:08,120
now dispersion is the variability of the observations
of a particular variable. Now, if the center is X
67
00:11:08,120 --> 00:11:15,120
6, we are very much concerned about how
the X 6 component differs from X
68
00:11:18,540 --> 00:11:25,540
1 1, X 1 2, X 1 3 and so on. So, we have
to see whether this is equally distributed
69
00:11:26,100 --> 00:11:31,160
or unequally distributed and that objective
is the framework of dispersion.
70
00:11:31,160 --> 00:11:38,160
Now, skewness concerns the general shape
of the distribution. So, as far as distributions
71
00:11:38,870 --> 00:11:43,970
are concerned, we have a series of distributions,
such as theoretical distributions. Under theoretical
72
00:11:43,970 --> 00:11:48,650
distributions, we have probability distributions:
the Poisson distribution, the normal distribution,
73
00:11:48,650 --> 00:11:55,650
hypergeometric distributions and so on. But,
for econometric modelling or basically structural
74
00:11:56,770 --> 00:12:03,710
modelling, for the model to be best fitted,
or for us to use that model or
75
00:12:03,710 --> 00:12:10,710
fit that model in a better way, the data points
should be normally distributed. So, that means,
76
00:12:11,120 --> 00:12:17,480
we are very much interested in whether the setup
is normally distributed or not. So, that
77
00:12:17,480 --> 00:12:23,470
means, we are very much interested to integrate
this structure into normal distribution set.
78
00:12:23,470 --> 00:12:30,470
So, that shape of the distribution is our
concern and that is nothing but the skewness
79
00:12:30,510 --> 00:12:35,920
component. So far as kurtosis
is concerned, it is the flatness of the distribution.
80
00:12:35,920 --> 00:12:40,210
Again, it is within the setup of the normal
distribution.
81
00:12:40,210 --> 00:12:46,910
So, now we have three different structures
of univariate modelling. So, one structure
82
00:12:46,910 --> 00:12:52,320
is central tendency, another structure is
dispersion and another structure is skewness.
83
00:12:52,320 --> 00:12:56,680
The objective of central tendency is to find the single figure
which describes the entire issue. The second
84
00:12:56,680 --> 00:13:03,680
issue is dispersion: how distant the items are
from the other data points. So, this is the
85
00:13:04,360 --> 00:13:10,890
dispersion objective; skewness is the shape
of the distribution and kurtosis is the flatness
86
00:13:10,890 --> 00:13:17,890
of the distribution. So now, with this basic
background information about the univariate
87
00:13:19,100 --> 00:13:26,100
setup, we would like to know how this setup
can be evaluated, can be interpreted and can
88
00:13:26,740 --> 00:13:33,740
be used further for multivariate econometric
modelling. Let me first give you the framework
89
00:13:34,150 --> 00:13:37,310
of univariate econometric modeling.
90
00:13:37,310 --> 00:13:44,310
So, univariate econometric modelling, as I
have already mentioned, has three different
91
00:13:44,839 --> 00:13:51,839
structures altogether: central tendency,
dispersion, and skewness and kurtosis.
92
00:13:56,529 --> 00:14:03,080
Under central tendency, we have
altogether three different
93
00:14:03,080 --> 00:14:10,080
setups: the mean,
the median and the mode.
94
00:14:12,899 --> 00:14:19,899
So, we have three different statistical tools
under central tendency. The median and the
95
00:14:22,610 --> 00:14:29,610
mode are called positional averages.
The mean is
96
00:14:32,910 --> 00:14:39,910
called a mathematical average.
97
00:14:42,899 --> 00:14:49,899
So, now similarly, for dispersion we have
two things: one is called an absolute measure
98
00:14:53,050 --> 00:15:00,050
and another is called
a relative measure. Similarly, for the case
99
00:15:05,240 --> 00:15:12,240
of skewness, we have two different setups:
one is called an absolute measure and another
100
00:15:13,420 --> 00:15:20,420
is called a relative measure. So now,
we have to see what is the structure of central
101
00:15:20,529 --> 00:15:26,460
tendency, structure of dispersion and what
is the structure of skewness? Now, under central
102
00:15:26,460 --> 00:15:33,460
tendency the objective can be evaluated through
mean, median and mode. So, median and mode
103
00:15:36,029 --> 00:15:42,160
are positional averages. The mean
is a mathematical average.
104
00:15:42,160 --> 00:15:49,160
On the other side, dispersion or variability
of the information can be observed from an absolute
105
00:15:49,360 --> 00:15:56,360
angle and from a relative angle.
Similarly, for skewness and kurtosis, we can
106
00:15:56,360 --> 00:16:03,360
have an absolute measure and also a relative
measure. Now, the mathematical average can
107
00:16:06,050 --> 00:16:13,050
again take various forms. It is represented in
three different forms: arithmetic mean, geometric
108
00:16:17,260 --> 00:16:24,260
mean and harmonic mean. The arithmetic mean
can be calculated through a simple
109
00:16:28,430 --> 00:16:35,430
structure or by assigning weights. Similarly,
the geometric mean can be calculated with a
110
00:16:37,709 --> 00:16:44,500
simple structure or by assigning weights.
The harmonic mean can also be calculated in a simple
111
00:16:44,500 --> 00:16:51,500
structure or by weights.
So now, the structure of central tendency
112
00:16:54,200 --> 00:17:00,730
is that we like to know, what is the mathematical
average and what is the positional average.
113
00:17:00,730 --> 00:17:07,730
So far as dispersion is concerned, we would like
to know what is the absolute issue, what is
114
00:17:08,140 --> 00:17:11,640
the relative issue or how is the absolute
measurement of that particular variable and
115
00:17:11,640 --> 00:17:16,640
what is the relative measurement of that particular
variable? Similarly, in the case of skewness
116
00:17:16,640 --> 00:17:23,159
and kurtosis, we would like to know the shape
of the distribution from an absolute angle and
117
00:17:23,159 --> 00:17:30,159
a relative angle. Let me highlight here the
central tendency structure first. Then, we
118
00:17:31,059 --> 00:17:36,070
can proceed further to the econometric
modelling issue.
119
00:17:36,070 --> 00:17:43,070
So, now this econometric modelling altogether
is this here. Let us see within the central
120
00:17:45,899 --> 00:17:52,899
tendency, let me first highlight the issue
of the arithmetic mean. We take the mean first;
121
00:18:00,940 --> 00:18:07,940
then, within the mean, we have the arithmetic
mean, the geometric mean and the harmonic
122
00:18:08,549 --> 00:18:15,549
mean. So now, for arithmetic mean we have
again simple average and weighted average.
123
00:18:15,710 --> 00:18:22,710
Now, this simple average is nothing but X
bar is equal to summation X i, i equal to
124
00:18:25,119 --> 00:18:32,119
1 to n, divided by n. So, what is this? So,
basically, for a particular setup, if we
125
00:18:36,269 --> 00:18:43,269
consider a variable, say X, then its information
is represented as X 1, X 2 up to X n. So,
126
00:18:45,789 --> 00:18:52,340
what will we call it? It is otherwise represented
as X i. When i equal to 1, then it becomes
127
00:18:52,340 --> 00:18:56,700
X 1. When i equal to 2, it becomes X 2. When
i equal to 3, it becomes X 3. Like, when i
128
00:18:56,700 --> 00:19:03,210
equal to n, it is X n. So, that means, one
variable has n pieces of information, or
129
00:19:03,210 --> 00:19:08,200
observations. So, what is the fundamental idea
of the arithmetic mean? In a simple structure,
130
00:19:08,200 --> 00:19:15,200
the fundamental idea is to add the values of all
the observations and divide by the number
131
00:19:16,350 --> 00:19:23,350
of observations; that means, it is nothing
but X 1 plus X 2 plus X 3 up to plus X n,
132
00:19:26,179 --> 00:19:33,179
divided by the number of observations. So, this
is the simple structure of the arithmetic
133
00:19:33,739 --> 00:19:40,739
mean. When it is a weighted average,
X bar is sometimes
134
00:19:41,799 --> 00:19:48,799
represented as X bar w. So, it is
nothing but summation f i X i, i equal to
135
00:19:51,399 --> 00:19:58,399
1 to n, divided by N, where this N represents
summation f i, i equal to 1 to n. So, that
136
00:20:02,529 --> 00:20:09,230
means, here w represents the weight factors, and
the weights have to be represented in the form
137
00:20:09,230 --> 00:20:13,889
of frequency.
For instance, if we take the values
138
00:20:13,889 --> 00:20:20,889
X 1, X 2 up to X n, then corresponding to
each value of X
139
00:20:21,730 --> 00:20:28,730
we have a frequency f 1, f 2 up to f n. So,
what is the usual procedure for the weighted
140
00:20:30,379 --> 00:20:37,379
average? We have to multiply f by
X. So, we will get f 1 X 1, f 2 X 2 up to
141
00:20:40,330 --> 00:20:47,330
f n X n. So, finally, we would like to know
the sum of f i X i, i equal to 1 to n; that
142
00:20:50,509 --> 00:20:57,509
means X bar w is nothing but f
1 X 1 plus f 2 X 2 up to plus f n X n, divided by
143
00:21:03,429 --> 00:21:10,429
N, where N represents the sum of f i, i equal
to 1 to n.
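The simple and weighted arithmetic means described above can be sketched in a few lines of Python. This is only an illustration: the function names and the small data set are assumptions and do not come from the lecture.

```python
def simple_mean(values):
    """Simple arithmetic mean: sum of X_i divided by n."""
    return sum(values) / len(values)


def weighted_mean(values, freqs):
    """Weighted arithmetic mean: sum of f_i * X_i divided by N = sum of f_i."""
    total_freq = sum(freqs)  # N, the sum of the frequencies
    return sum(f * x for x, f in zip(values, freqs)) / total_freq


# Hypothetical values and frequencies, chosen only for illustration.
xs = [10, 20, 30]
fs = [1, 2, 2]

print(simple_mean(xs))        # (10 + 20 + 30) / 3
print(weighted_mean(xs, fs))  # (10*1 + 20*2 + 30*2) / (1 + 2 + 2)
```

With equal frequencies the weighted mean reduces to the simple mean, which is a quick way to check the weighted formula.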
144
00:21:13,470 --> 00:21:18,539
This is the structure of weighted average.
So, weighted average structure is like this
145
00:21:18,539 --> 00:21:25,539
and this is the structure of the simple arithmetic
mean. Now, let me highlight a few things here.
146
00:21:29,019 --> 00:21:36,019
One important issue here is the properties
of the arithmetic mean. One interesting
147
00:21:36,119 --> 00:21:43,119
property is that the sum of the deviations of the
observations from their mean is equal to 0.
148
00:21:43,759 --> 00:21:50,759
So, that means, the sum of X i minus X bar is equal
to 0: the sum of the deviations of each term from
149
00:21:52,529 --> 00:21:59,529
the arithmetic mean is equal to 0. The second
property is that, since X bar equals summation
150
00:22:00,789 --> 00:22:07,789
X by n, summation X is always
equal to n into X bar. This is for verification
151
00:22:10,249 --> 00:22:17,249
only. The third property is that we can have a combined
mean. So, X 1 2 bar denotes the combined
152
00:22:21,570 --> 00:22:25,859
mean. The third important property here is the
combined mean.
153
00:22:25,859 --> 00:22:32,859
So, the combined mean is
nothing but n 1 X 1 bar plus n 2 X 2 bar, divided by
154
00:22:35,590 --> 00:22:42,440
n 1 plus n 2. Now, what is this
about? Let us say there are two
155
00:22:42,440 --> 00:22:48,249
variables, X 1 and X 2. Corresponding to X
1, you have the observations
156
00:22:48,249 --> 00:22:55,249
X 1 1, X 1 2 up to X 1 n, and
corresponding to X 2 you have X 2 1, X 2 2 up to X
157
00:23:01,840 --> 00:23:08,840
2 n. Or otherwise, we can put it like this:
on one side X 1 1, X 1 2, X 1 3 up to X 1 n. Then,
158
00:23:17,639 --> 00:23:24,639
similarly, on this side, we can put
X 2 1, X 2 2, then X 2 3 up to X 2
159
00:23:34,289 --> 00:23:39,299
n.
So, now we would like to know what n is here.
160
00:23:39,299 --> 00:23:45,450
For the first variable it is n 1, and its mean is
X 1 bar. Similarly, on this side we would like to know
161
00:23:45,450 --> 00:23:52,450
n 2 and X 2 bar.
So, with n 1, X 1 bar, n 2 and X 2 bar, we have
162
00:23:53,230 --> 00:24:00,019
to calculate the combined mean. So, that is
the joint case of the two variables. So, this
163
00:24:00,019 --> 00:24:07,019
is the arithmetic mean with respect to
its simple and weighted structures.
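The three properties of the arithmetic mean discussed above (deviations sum to zero, the total equals n times the mean, and the combined mean of two groups) can be checked numerically. The sketch below uses two small hypothetical groups; the names `group1`, `group2` and `combined_mean` are illustrative assumptions.

```python
import math


def mean(values):
    """Simple arithmetic mean."""
    return sum(values) / len(values)


def combined_mean(n1, mean1, n2, mean2):
    """Combined mean of two groups: (n1*mean1 + n2*mean2) / (n1 + n2)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Hypothetical groups, chosen only for illustration.
group1 = [2, 4, 6]
group2 = [10, 20]
m1, m2 = mean(group1), mean(group2)

# Property 1: the deviations from the mean sum to zero.
deviation_sum = sum(x - m1 for x in group1)

# Property 2: the sum of the observations equals n times the mean.
total_from_mean = len(group1) * m1

# Property 3: the combined mean equals the mean of the pooled data.
pooled = combined_mean(len(group1), m1, len(group2), m2)

print(math.isclose(deviation_sum, 0.0, abs_tol=1e-12))
print(math.isclose(total_from_mean, sum(group1)))
print(math.isclose(pooled, mean(group1 + group2)))
```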
164
00:24:07,200 --> 00:24:14,200
So, now come down to the structure of the geometric
mean. The geometric mean can also be calculated
165
00:24:15,129 --> 00:24:22,129
with a simple structure and a weighted structure.
Now, for a series of values of a variable, say X: X 1,
166
00:24:23,759 --> 00:24:30,759
X 2 up to X n, the geometric mean,
usually represented as a small g, is nothing
167
00:24:32,210 --> 00:24:39,210
but X 1 multiplied by X 2 and so on up to X
n, all to the power 1 by n. So, this is the
168
00:24:42,090 --> 00:24:48,999
calculation of the geometric mean.
So, now for the weighted case, we can
169
00:24:48,999 --> 00:24:55,999
call it g w, which is nothing but X 1 to the power f
1 multiplied by X 2 to the power f 2, multiplied by X 3 to the power
170
00:25:00,229 --> 00:25:07,229
f 3, up to X n to the power f n, all to the power 1
by N, where N is the sum of the frequencies. So, this is the structure of the geometric
171
00:25:12,460 --> 00:25:19,460
mean. In the case of the harmonic mean, there is
also a simple structure and a weighted structure.
172
00:25:21,859 --> 00:25:28,859
Now, in the case of the harmonic mean, it is nothing
but n by summation 1 by X i, i equal to 1
173
00:25:30,869 --> 00:25:37,580
to n. In the case of the weighted average,
the weighted harmonic mean is
174
00:25:37,580 --> 00:25:44,580
nothing but N by summation f i by X i, i equal
to 1 to n, where N is the sum of the frequencies.
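A minimal Python sketch of the geometric and harmonic means, in both simple and weighted form, follows. It assumes all values are positive and takes N as the sum of the frequencies in the weighted versions; the function names and sample numbers are hypothetical.

```python
import math


def geometric_mean(values):
    """(X_1 * X_2 * ... * X_n) ** (1 / n), for positive values."""
    return math.prod(values) ** (1 / len(values))


def weighted_geometric_mean(values, freqs):
    """(X_1**f_1 * ... * X_n**f_n) ** (1 / N), with N = sum of f_i.

    Computed through logarithms to avoid very large intermediate products.
    """
    total_freq = sum(freqs)
    log_sum = sum(f * math.log(x) for x, f in zip(values, freqs))
    return math.exp(log_sum / total_freq)


def harmonic_mean(values):
    """n divided by the sum of 1 / X_i."""
    return len(values) / sum(1 / x for x in values)


def weighted_harmonic_mean(values, freqs):
    """N divided by the sum of f_i / X_i, with N = sum of f_i."""
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))


# Hypothetical data, chosen only for illustration.
print(geometric_mean([2, 8]))
print(weighted_geometric_mean([2, 8], [1, 1]))
print(harmonic_mean([2, 6]))
print(weighted_harmonic_mean([1, 2, 4], [1, 1, 2]))
```

For positive data that are not all equal, the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean; this ordering is a useful sanity check.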
175
00:25:46,600 --> 00:25:53,600
So, now we know the setup
of the mathematical average, that is the arithmetic
176
00:25:55,739 --> 00:26:02,739
mean, geometric mean and harmonic mean. Again,
that is with respect to the simple structure and
177
00:26:04,919 --> 00:26:11,919
the weighted structure. Now, with this
basic setup of the
178
00:26:14,899 --> 00:26:21,460
mathematical average, we come down
to the positional average. The first positional
179
00:26:21,460 --> 00:26:28,460
average is the median.
180
00:26:28,679 --> 00:26:35,679
So, what is this median? The median is nothing
but the middle value of the sequence.
181
00:26:39,039 --> 00:26:45,229
So, that means, once you have a series of
observations, the objective of the median is
182
00:26:45,229 --> 00:26:50,629
to find a particular
value which divides the observations into
183
00:26:50,629 --> 00:26:57,629
two equal parts, fifty percent above and fifty
percent below. So, that means, the median can
184
00:27:00,249 --> 00:27:06,749
be calculated in a simple structure.
Under the simple
185
00:27:06,749 --> 00:27:13,749
structure, the framework is like this: position n
by 2 if n is even, where n represents
186
00:27:18,769 --> 00:27:25,769
the number of observations in the setup, and
position n plus 1 by 2 if n is odd. So,
187
00:27:31,529 --> 00:27:38,330
this is the simple structure of the median
calculation.
188
00:27:38,330 --> 00:27:45,330
So now, you remember this median is a positional
average. Now, the n by 2 and n plus 1 by 2
189
00:27:46,090 --> 00:27:53,090
will give you the position that describes the
issue. Now, for a real or, you can say, more complex
190
00:27:53,950 --> 00:28:00,950
problem, the value of the median will be calculated
like this: L plus, n by 2 minus C F, divided by f, into
191
00:28:03,609 --> 00:28:10,609
i. L represents the lower limit of the median class,
n represents the number of observations in the
192
00:28:19,700 --> 00:28:26,700
system, C F represents the cumulative frequency
of the class preceding the median
193
00:28:36,229 --> 00:28:43,229
class, f represents the class
frequency of the median class, and i represents
194
00:28:51,299 --> 00:28:58,299
the class interval.
So now, whether it is the mean,
195
00:29:03,609 --> 00:29:10,609
the median or the mode, there are two
different ways to calculate it. One is the
196
00:29:11,779 --> 00:29:18,779
simple setup; the other is the discrete
or continuous setup. When the structure
197
00:29:21,789 --> 00:29:28,789
is very simple, we have to
just use the positional formula or, you can say,
198
00:29:32,070 --> 00:29:39,070
a simple formula like summation X i by n, or n by
2, or n plus 1 by 2. But when the structure
199
00:29:40,729 --> 00:29:47,729
is a discrete series
or a continuous series, the calculation
200
00:29:47,739 --> 00:29:54,739
procedure is somewhat different. So, the median
formula is here: L plus, n by 2 minus C F,
201
00:29:55,979 --> 00:30:02,059
divided by f, into i. So, I will give you a detailed
example, how you have to calculate or how
202
00:30:02,059 --> 00:30:06,690
you have to use this particular formula, when
the problem is something different and you
203
00:30:06,690 --> 00:30:11,879
have to apply this particular formula, when
the problem is something else.
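The two median calculations described above can be sketched as follows. Two assumptions are made here that the lecture does not state: for an even number of observations the sketch averages the two middle values (a common convention, rather than taking the single value at position n by 2), and the grouped version assumes equal-width classes given by their lower limits. The frequency table is hypothetical.

```python
def median_individual(values):
    """Median of an individual series after sorting."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the value at 1-based position (n + 1) / 2.
        return ordered[mid]
    # Even n (assumed convention): average of positions n/2 and n/2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2


def median_grouped(lower_limits, freqs, width):
    """Grouped median: L + ((n/2 - CF) / f) * i."""
    half = sum(freqs) / 2  # n / 2
    cumulative = 0         # CF of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= half:
            # This is the median class.
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("frequency table is empty")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with these frequencies.
lowers = [0, 10, 20, 30]
freqs = [5, 8, 12, 5]

print(median_individual([5, 1, 3]))
print(median_individual([4, 1, 3, 2]))
print(median_grouped(lowers, freqs, 10))
```

In the grouped example n is 30, so n by 2 is 15; the cumulative frequency first reaches 15 in the class starting at 20, which makes that the median class with C F equal to 13 and f equal to 12.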
204
00:30:11,879 --> 00:30:18,879
So, now come down to the mode. The mode
is also a positional average.
205
00:30:20,669 --> 00:30:22,179
It is the value of the variable which has the highest
206
00:30:22,179 --> 00:30:23,039
frequency. Now, this
207
00:30:23,039 --> 00:30:28,450
again gives you the positional value when the structure
208
00:30:28,450 --> 00:30:35,129
is an individual series or simple structure.
Now, when there is a discrete series or continuous
209
00:30:35,129 --> 00:30:40,419
series, the mode
will be L plus, del 1 by del 1 plus del
210
00:30:40,419 --> 00:30:47,419
2, into i. So, del 1 is nothing but f 1 minus
f 0 and del 2 is nothing but f 1 minus f 2.
211
00:30:52,220 --> 00:30:59,220
Here f 1 represents the
frequency of the modal class, f 0 represents the
212
00:31:18,809 --> 00:31:25,809
frequency of the class preceding the
modal class, and f 2 represents the
213
00:32:00,749 --> 00:32:07,749
frequency of the class following the
modal class. And i represents the class interval.
214
00:32:10,349 --> 00:32:17,349
So now, let me take a case here. So, how do
you calculate all these measures? Let us take
215
00:32:21,479 --> 00:32:28,479
an example here. Now, there is a series
here. This is an individual series.
216
00:32:31,739 --> 00:32:38,739
Now, in the example which we have cited here,
we have a series of observations: 52, 76, 100,
217
00:32:45,129 --> 00:32:52,129
136, 186, 196, 205, 150, 257, and we
proceed like this until we have 791
218
00:32:57,070 --> 00:33:04,070
and 891. So, now our objective is to know
what is the mean value here, what is the median
219
00:33:06,080 --> 00:33:11,450
value here and what is the mode value here.
So far as the mean is concerned, we would like
220
00:33:11,450 --> 00:33:17,719
to know the values of these observations
and the number of observations. Now,
221
00:33:17,719 --> 00:33:24,529
for the simple arithmetic mean we just have to
add all these items and divide by the number of observations.
222
00:33:24,529 --> 00:33:29,429
So, you have to find
out the number of observations here.
223
00:33:29,429 --> 00:33:35,279
So now, if you follow that procedure, you
can have the mean value. Similarly, for the
224
00:33:35,279 --> 00:33:42,279
median, the calculation procedure
is that we first have to arrange these items
225
00:33:43,700 --> 00:33:50,159
in ascending or descending order. Now,
the moment we arrange them in ascending or
226
00:33:50,159 --> 00:33:55,479
descending order, you have to apply the positional
formula. So, that means, if the number of observations is even,
227
00:33:55,479 --> 00:34:00,070
then you have to apply n by 2. If the series
is odd, then you have to apply n plus 1 by
228
00:34:00,070 --> 00:34:04,769
2. So, this will tell you what
the median of that particular series is. So,
229
00:34:04,769 --> 00:34:09,659
when we calculate the median this way,
50 percent of the observations will be above and
230
00:34:09,659 --> 00:34:15,109
50 percent of the observations will be below.
So now, similarly, in the case of the mode,
231
00:34:15,109 --> 00:34:20,810
we first have to arrange the items in sequence,
then you have to see the frequency
232
00:34:20,810 --> 00:34:27,460
of each item. Now, the mode will be the value
in that particular series with the
233
00:34:27,460 --> 00:34:32,589
highest frequency. So, on the basis of the highest
frequency, you have to calculate the mode.
234
00:34:32,589 --> 00:34:39,549
For instance, let us take the case of 150. For the
item 150, we have to find out whether it
235
00:34:39,549 --> 00:34:45,129
occurs anywhere else. If it
occurs elsewhere, we have to see how
236
00:34:45,129 --> 00:34:49,919
many times. Similarly, take the case of 196. The best procedure is
237
00:34:49,919 --> 00:34:52,529
like this.
238
00:34:52,529 --> 00:34:58,970
The best procedure is as follows.
Take the items. So now, these
239
00:34:58,970 --> 00:35:05,970
are all the items, and these are their corresponding frequencies.
So now, you check it here: 52. Let us take
240
00:35:08,150 --> 00:35:15,150
the case of 52. Then 76, then 100, then
136, then 186, 196, 205, then 150, then 257,
241
00:35:30,549 --> 00:35:37,549
then 264, then 264 again. Now, if you compare here,
52 has frequency 1, 76 1, 100 1,
242
00:35:41,630 --> 00:35:48,630
136 1, 186 1, 196 1, 205 1, 150 1, 257 1,
264 1. Since you see 264 again, instead
243
00:35:53,240 --> 00:36:00,240
of writing it again, you put a tally mark here.
Now, after 264, you put 280 here.
244
00:36:00,240 --> 00:36:05,710
So, now you have to see how many 280s are
there. The first one counts as 1; if 280 appears
245
00:36:05,710 --> 00:36:11,210
again, you put a mark here. If 280 appears again, you put
another mark here. Then, finally, you have to observe
246
00:36:11,210 --> 00:36:16,760
the frequencies. Let us say
280 appears three
247
00:36:16,760 --> 00:36:22,039
times. So, its frequency is nothing but
three. Here, for 264, the frequency is two.
248
00:36:22,039 --> 00:36:27,990
This is one, this is one, this is one, this
is one, this is one, this is one, this is
249
00:36:27,990 --> 00:36:33,029
one, this is one, this is one. Now, there
may be other items also,
250
00:36:33,029 --> 00:36:39,539
but within this particular setup
we can call this the modal value
251
00:36:39,539 --> 00:36:45,400
because it has the highest frequency in the
particular series. So, this is how we have to
252
00:36:45,400 --> 00:36:50,339
calculate the value of the mode.
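The tally procedure for the mode of an individual series can be sketched with Python's `collections.Counter`. The list below contains only the values named in the tally discussion, with 264 twice and 280 three times; the full series from the lecture is not reproduced here, so this list is an illustrative assumption.

```python
from collections import Counter

# Values named in the tally discussion (assumed subset of the full series).
series = [52, 76, 100, 136, 186, 196, 205, 150, 257,
          264, 264, 280, 280, 280]

# Count how many times each item occurs, like the tally marks in the lecture.
frequency = Counter(series)

# The mode is the item with the highest frequency.
mode_value, mode_frequency = frequency.most_common(1)[0]

for item in sorted(frequency):
    print(item, frequency[item])
print("mode:", mode_value, "frequency:", mode_frequency)
```

If two items share the highest frequency, the series has more than one mode, and `most_common(1)` reports only the one that was counted first.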
253
00:36:50,339 --> 00:36:57,339
So now, we have to proceed to the other
issues of univariate econometric
254
00:37:04,799 --> 00:37:11,799
modelling. In univariate econometric modelling,
we have the setup of central tendency, dispersion
255
00:37:17,589 --> 00:37:23,579
and skewness. For central tendency, we have
already discussed what the structure is and
256
00:37:23,579 --> 00:37:30,079
how it is set up. Now, so far as dispersion
is concerned, we have two different structures.
257
00:37:30,079 --> 00:37:37,079
One is called the absolute structure and
the other is called the relative structure.
258
00:37:39,279 --> 00:37:45,500
So, that means, the absolute measure of dispersion
and the relative
259
00:37:45,500 --> 00:37:49,420
measure of dispersion.
Under the absolute measure of dispersion, we have
260
00:37:49,420 --> 00:37:56,420
a series of techniques:
range, quartile deviation, mean deviation
261
00:37:59,299 --> 00:38:06,299
and standard deviation. And under the relative
measure of dispersion, we have three different
262
00:38:06,380 --> 00:38:13,380
techniques: the coefficient of variation,
the coefficient of quartile deviation and the coefficient
263
00:38:15,970 --> 00:38:20,079
of mean deviation.
264
00:38:20,079 --> 00:38:26,220
So, now I will explain what
265
00:38:26,220 --> 00:38:31,450
these components are all about. Let us start with
range. How do you calculate this range? The
266
00:38:31,450 --> 00:38:38,450
range is the difference between the maximum
of the series and the minimum of the series. In
267
00:38:40,150 --> 00:38:45,430
this particular example,
if you would like to
268
00:38:45,430 --> 00:38:50,579
know what the range is, then you have to see
what the highest value of this particular
269
00:38:50,579 --> 00:38:56,359
series is, then what the lowest value
of that series is; the difference
270
00:38:56,359 --> 00:39:03,359
will give you the range.
In econometric modelling, if the range value
271
00:39:03,960 --> 00:39:09,089
is very high, then it is a negative
sign for econometric modelling. The
272
00:39:09,089 --> 00:39:14,910
econometric modelling will be better, or you
can get a better fitted model, if your descriptive
273
00:39:14,910 --> 00:39:19,650
or univariate information is very
accurate. For instance, for this particular
274
00:39:19,650 --> 00:39:25,930
issue, the range, the value
automatically gives you the
275
00:39:25,930 --> 00:39:31,680
spread of that particular variable. If
the range is very small, then obviously, the
276
00:39:31,680 --> 00:39:35,980
dispersion is very low.
So, that means, the variability of that particular
277
00:39:35,980 --> 00:39:42,400
variable is low, and that is
a better indication for further econometric
278
00:39:42,400 --> 00:39:47,990
modelling. Similarly, come down to quartile
deviation. Quartile deviation is nothing
279
00:39:47,990 --> 00:39:54,990
but the difference Q 3 minus Q 1, divided by
2. So, third quartile minus first quartile,
280
00:39:55,430 --> 00:40:01,880
divided by 2. What is the third quartile? The third
quartile we get as L plus
281
00:40:01,880 --> 00:40:08,880
(3 N by 4 minus C F) by F, into i. And
Q 1 is equal to L plus (N by 4 minus C F) by F,
282
00:40:16,270 --> 00:40:21,329
into i. We have already discussed what
C F is: it is the cumulative frequency of the
283
00:40:21,329 --> 00:40:28,329
class preceding the quartile class; F represents the frequency
of the quartile class, i represents the class interval,
284
00:40:29,119 --> 00:40:36,119
L represents the lower limit of the quartile class, Q
3 represents the third quartile and Q 1 represents
285
00:40:36,500 --> 00:40:40,490
the first quartile.
So, as far as quartile deviation is concerned,
286
00:40:40,490 --> 00:40:46,019
it is calculated as the difference of
third quartile minus first quartile, divided by 2.
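The grouped-series quartile formula just described can be sketched in Python. This is only an illustration under stated assumptions: the function name grouped_quartile and the class table are hypothetical, all classes are assumed to share one width i, and L, F and C F are taken from the class that contains the quartile.

```python
def grouped_quartile(lower_limits, width, freqs, k):
    """k-th quartile (k = 1, 2, 3) of a continuous (grouped) series.

    Applies L + ((k*N/4 - CF) / F) * i, where L and F belong to the class
    containing the quartile and CF is the cumulative frequency before it.
    """
    n = sum(freqs)
    if n <= 0:
        raise ValueError("total frequency must be positive")
    target = k * n / 4
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cf + f >= target:
            return lower + (target - cf) / f * width
        cf += f
    raise ValueError("quartile class not found")


# Hypothetical class table: 0-10, 10-20, 20-30, 30-40 (illustrative only).
lower_limits = [0, 10, 20, 30]
freqs = [5, 15, 20, 10]

q1 = grouped_quartile(lower_limits, 10, freqs, 1)
q3 = grouped_quartile(lower_limits, 10, freqs, 3)
quartile_deviation = (q3 - q1) / 2
print(q1, q3, quartile_deviation)
```

With k = 2 the same function gives the grouped median, which is why the median and quartile formulas look alike.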
287
00:40:46,019 --> 00:40:52,760
Now, come down to mean deviation. Mean
deviation is nothing but 1 by N summation of the absolute value of
288
00:40:52,760 --> 00:40:59,760
X i minus X bar. It is in deviation format.
The specialty of this component is that
289
00:41:02,000 --> 00:41:08,730
it ignores the sign. The
moment you take the absolute deviation, the minus
290
00:41:08,730 --> 00:41:15,730
component becomes a plus component. As a
result, it may be better for
291
00:41:15,920 --> 00:41:22,920
calculation. But it also has a limitation, because
the negative signs are ignored. So, this
292
00:41:23,680 --> 00:41:29,299
is the procedure for mean deviation. If
there are frequencies, then obviously, you
293
00:41:29,299 --> 00:41:36,299
have to weight by the frequencies here and, accordingly,
calculate the mean deviation.
294
00:41:36,539 --> 00:41:43,539
So, now come down to standard deviation. Standard
deviation is nothing but 1 by n times the summation of X i
295
00:41:45,740 --> 00:41:52,740
minus X bar whole square, i equal to 1 to
n, all to the power 0.5. This is the calculating
296
00:41:53,700 --> 00:42:00,039
procedure for standard deviation. Now, you
know what range is, what quartile
297
00:42:00,039 --> 00:42:05,720
deviation is, what mean deviation is and what
standard deviation is. And I am just explaining
298
00:42:05,720 --> 00:42:12,720
a simple structure, that is, with respect
to an individual series. When the series
299
00:42:12,869 --> 00:42:19,059
is continuous or discrete, then obviously,
the calculating procedure for all these components,
300
00:42:19,059 --> 00:42:26,059
from central tendency to dispersion,
is different. There is
301
00:42:26,130 --> 00:42:31,640
a lot of similarity; the
only difference is the calculation procedure.
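For an individual series, the four absolute measures can be sketched in a few lines of Python. This is a minimal sketch, not the lecture's own worked example: the function name absolute_dispersion and the data are hypothetical, the quartiles come from statistics.quantiles with its default interpolation (other textbook conventions give slightly different quartile values), and the standard deviation divides by n.

```python
import statistics


def absolute_dispersion(xs):
    """Range, quartile deviation, mean deviation and standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    # Quartiles by the library's default interpolation (an assumption here).
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return {
        "range": max(xs) - min(xs),
        "quartile_deviation": (q3 - q1) / 2,
        # Mean deviation uses absolute deviations, so the signs are ignored.
        "mean_deviation": sum(abs(x - mean) for x in xs) / n,
        # Standard deviation: square root of the average squared deviation.
        "standard_deviation": (sum((x - mean) ** 2 for x in xs) / n) ** 0.5,
    }


# Hypothetical individual series (illustrative data only).
series = [2, 4, 4, 4, 5, 5, 7, 9]
measures = absolute_dispersion(series)
for name, value in measures.items():
    print(name, value)
```

All four values are in the same unit as the data, which is the point the lecture later makes about absolute measures.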
302
00:42:31,640 --> 00:42:38,640
So, standard deviation is basically the
square root of the average squared deviation of the
303
00:42:39,309 --> 00:42:45,839
observations from their central point. Now, come
down to relative measures of dispersion.
304
00:42:45,839 --> 00:42:50,500
Under relative measures of dispersion,
the first standard technique is called
305
00:42:50,500 --> 00:42:57,500
the coefficient of variation. The coefficient of variation
is simply represented as sigma by X bar
306
00:42:57,539 --> 00:43:04,109
multiplied by 100. Sigma usually represents
the standard deviation. The square of the standard
307
00:43:04,109 --> 00:43:11,109
deviation is called the variance.
So, now come down to the coefficient of quartile
308
00:43:11,849 --> 00:43:17,089
deviation. The coefficient of quartile deviation
is calculated with respect to the median:
309
00:43:17,089 --> 00:43:24,010
quartile deviation divided by median, multiplied
by 100, will give you the coefficient of quartile
310
00:43:24,010 --> 00:43:28,839
deviation. Then comes the coefficient of mean
deviation. The coefficient of mean deviation
311
00:43:28,839 --> 00:43:35,839
is nothing but the mean deviation taken
relative to the mean or the median: mean deviation
312
00:43:38,240 --> 00:43:45,240
by mean, multiplied by 100. This is the coefficient
of mean deviation.
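The three relative measures can be sketched the same way. The sketch below follows the definitions as stated in this lecture; the function name relative_dispersion and the data are hypothetical, and the quartiles again rely on the default interpolation of statistics.quantiles. Note that many textbooks define the coefficient of quartile deviation as (Q 3 minus Q 1) by (Q 3 plus Q 1) instead of dividing by the median.

```python
import statistics


def relative_dispersion(xs):
    """Relative (unit-free) measures, each expressed as a percentage."""
    mean = statistics.mean(xs)
    median = statistics.median(xs)
    sigma = statistics.pstdev(xs)  # population standard deviation
    q1, _, q3 = statistics.quantiles(xs, n=4)
    quartile_deviation = (q3 - q1) / 2
    mean_deviation = sum(abs(x - mean) for x in xs) / len(xs)
    return {
        "coefficient_of_variation": sigma / mean * 100,
        # Lecture's definition: quartile deviation relative to the median.
        "coefficient_of_quartile_deviation": quartile_deviation / median * 100,
        "coefficient_of_mean_deviation": mean_deviation / mean * 100,
    }


# Hypothetical series with positive mean and median (illustrative only).
series = [2, 4, 4, 4, 5, 5, 7, 9]
relative = relative_dispersion(series)
for name, value in relative.items():
    print(name, round(value, 2))
```

These ratios are only meaningful when the mean and median are positive and not close to zero.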
313
00:43:45,769 --> 00:43:52,769
So now, this is the calculating structure
of the relative measures
314
00:43:54,559 --> 00:44:01,559
of dispersion. So, we have complete
information on how to calculate
315
00:44:03,180 --> 00:44:09,160
the absolute measures of dispersion and how
to calculate the relative measures of dispersion.
316
00:44:09,160 --> 00:44:14,430
The important difference between the absolute
measures and the relative measures is that,
317
00:44:14,430 --> 00:44:19,440
in the first case, the measure is not unit
free, but in the second case, the measure
318
00:44:19,440 --> 00:44:25,089
is completely unit free. That is why a
relative measure of dispersion is a better
319
00:44:25,089 --> 00:44:32,089
measure than the absolute measures.
So now, so far as the technique-wise concept
320
00:44:33,160 --> 00:44:39,920
is concerned: in the case of central tendency,
the best average is considered to be the arithmetic
321
00:44:39,920 --> 00:44:46,920
mean, because it has a very simple structure
and is very reliable. And in
322
00:44:47,119 --> 00:44:53,059
the case of dispersion, standard
deviation is considered the best technique
323
00:44:53,059 --> 00:44:59,910
among the absolute measures, but in practice the coefficient
of variation is considered the best technique
324
00:44:59,910 --> 00:45:05,579
because it is a unitless measurement
and it is a relative measure.
325
00:45:05,579 --> 00:45:12,579
I will give you a very practical example. Let
me take a case. What is
326
00:45:12,839 --> 00:45:19,180
the exact difference between
central tendency, dispersion and skewness?
327
00:45:19,180 --> 00:45:26,150
There is a beautiful structure and
a step-wise process here. Now, when there
328
00:45:26,150 --> 00:45:32,940
is a question of central tendency, central
tendency will give you only a positional
329
00:45:32,940 --> 00:45:39,859
or, you can say, mathematical summary. But when
there is a question of comparative analysis,
330
00:45:39,859 --> 00:45:46,859
then obviously, there is an issue
where two variables are there and we would like
331
00:45:48,660 --> 00:45:54,289
to compare the two variables. There may be a
possibility that the means of those
332
00:45:54,289 --> 00:46:01,289
variables are the same. That means, if
X 1 is a variable and X 2 is a variable,
333
00:46:01,450 --> 00:46:06,829
whether the number of observations is the same or different,
we would like to know what the
334
00:46:06,829 --> 00:46:12,190
average of each series is. Then,
if the average is higher for the second
335
00:46:12,190 --> 00:46:18,420
series than the first, you can say that
the second is better than the first, or the first
336
00:46:18,420 --> 00:46:22,339
is better than the second.
Now, the situation will be completely different
337
00:46:22,339 --> 00:46:28,680
if the means of both the series are equal.
In that case, you cannot reach any conclusion
338
00:46:28,680 --> 00:46:34,900
or make a comparative analysis from the mean alone.
In that case, you have
339
00:46:34,900 --> 00:46:41,900
to proceed further to dispersion. The series
may have equal means, but the dispersion may
340
00:46:44,250 --> 00:46:50,559
be completely different. Equal means may
come with unequal variances or unequal standard
341
00:46:50,559 --> 00:46:55,130
deviations.
So, the mean alone is not sufficient
342
00:46:55,130 --> 00:47:01,809
to represent the univariate structure in econometric
modelling. You need to have dispersion,
343
00:47:01,809 --> 00:47:08,049
that is, the variability structure. Variability
will give you the indication for a comparative
344
00:47:08,049 --> 00:47:15,049
analysis between the series. Again, let us
take a case where the mean is the same and the standard deviation
345
00:47:15,839 --> 00:47:22,839
is also the same for both series.
You would still like to have a comparative analysis,
346
00:47:24,339 --> 00:47:30,890
but with
the mean equal and the standard deviation
347
00:47:30,890 --> 00:47:37,890
also equal, you still cannot reach a conclusion.
In that case, you have to apply a relative
348
00:47:38,099 --> 00:47:42,269
measure of dispersion, or you can go for
skewness and kurtosis.
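The first step of this argument, equal means with unequal dispersion, is easy to see on a small example. The two series below are hypothetical numbers chosen only so that their means coincide.

```python
import statistics

# Two hypothetical series with the same mean (illustrative data only).
x1 = [48, 49, 50, 51, 52]
x2 = [10, 30, 50, 70, 90]

mean1, mean2 = statistics.mean(x1), statistics.mean(x2)
sd1, sd2 = statistics.pstdev(x1), statistics.pstdev(x2)

print("means:", mean1, mean2)                # identical by construction
print("standard deviations:", sd1, sd2)      # x2 is far more spread out
```

On the mean alone the two series look identical; only the standard deviation shows that the second one is much more variable.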
349
00:47:42,269 --> 00:47:47,990
Let me take another issue here. In particular,
take the case of absolute measures of dispersion
350
00:47:47,990 --> 00:47:54,990
and relative measures of dispersion. Of
course, these address two different problems. Let
351
00:47:55,599 --> 00:48:02,599
us take a case from foreign exchange.
Take the Japanese currency, the yen,
352
00:48:03,569 --> 00:48:10,099
and the US dollar. Now, some of the
observations are in yen and some of the
353
00:48:10,099 --> 00:48:15,109
observations are in dollars.
So, now we would like to know what
354
00:48:15,109 --> 00:48:20,609
the stability of the yen is and what the stability
of the dollar is. To know the stability of the
355
00:48:20,609 --> 00:48:27,609
dollar and the yen, you need to apply the
standard technique called dispersion.
356
00:48:30,599 --> 00:48:35,630
A series will be
more stable if its variation or variability
357
00:48:35,630 --> 00:48:39,380
is very low.
So, now in the case of, say, the
358
00:48:39,380 --> 00:48:45,319
yen, if the standard deviation or variance is
very high, then you can say that it is
359
00:48:45,319 --> 00:48:50,190
not stable. But in the other case, if the
variation is very low, then you can say that
360
00:48:50,190 --> 00:48:55,289
it is stable. That means the stability
of a particular currency depends upon its
361
00:48:55,289 --> 00:49:01,359
variability structure. If the standard deviation
is very low, then the currency is a very stable
362
00:49:01,359 --> 00:49:07,019
one. If the standard deviation is very high,
then obviously, the
363
00:49:07,019 --> 00:49:13,619
currency's stability is not too good. It may
be poor. That means it is not at
364
00:49:13,619 --> 00:49:20,059
all stable; there is instability.
So, now the example may take
365
00:49:20,059 --> 00:49:26,970
a different shape. So far, this is
how you measure stability. But suppose
366
00:49:26,970 --> 00:49:32,920
I would like to compare the yen with the dollar, to see which
is more stable.
367
00:49:32,920 --> 00:49:39,700
In that case, it may sometimes be more complicated.
For instance, suppose you get the
368
00:49:39,700 --> 00:49:45,220
result by applying the standard deviation for
the yen and for the US dollar. Obviously, if the
369
00:49:45,220 --> 00:49:51,109
items are represented in yen, then the mean
and standard deviation you get will be in yen.
370
00:49:51,109 --> 00:49:56,819
But if you have the observations in dollars,
then obviously, the mean and standard deviation
371
00:49:56,819 --> 00:50:03,819
you get will also be in dollars. But if it
is a comparative analysis, then the mean
372
00:50:04,519 --> 00:50:11,470
and standard deviation which are in yen
and, on the other side, the mean and standard deviation which
373
00:50:11,470 --> 00:50:18,470
are in dollars cannot be compared,
because yen and dollars are completely different units.
374
00:50:19,000 --> 00:50:26,000
That is the foreign exchange market. We can
distinguish between them, and in that case statistics
375
00:50:26,099 --> 00:50:31,450
is very handy. If we apply an econometric
tool, in particular the coefficient of variation, the
376
00:50:31,450 --> 00:50:36,890
coefficient of quartile deviation or the coefficient
of mean deviation, then this particular
377
00:50:36,890 --> 00:50:43,309
problem can be solved without any additional
information. Suppose you have information on
378
00:50:43,309 --> 00:50:48,349
what the dollar value is and what the yen
value is; then you can convert either yen into dollars or
379
00:50:48,349 --> 00:50:53,720
dollars into yen, and then make a comparative
analysis. But if you have no
380
00:50:53,720 --> 00:50:59,119
such information, then you just apply the
standard statistical tool, say the coefficient
381
00:50:59,119 --> 00:51:04,299
of variation, and you can get the result. So,
this is how you have to be very careful in
382
00:51:04,299 --> 00:51:06,680
solving this particular problem.
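The yen and dollar comparison can be sketched as follows. Everything here is an assumption made for illustration: the two series are invented numbers, not market data, and the helper name coefficient_of_variation is hypothetical. The sketch shows that the standard deviations are in different units and point one way, while the unit-free coefficient of variation points the other way and does not change when a series is rescaled into another currency.

```python
import math
import statistics


def coefficient_of_variation(xs):
    """Sigma by X bar, multiplied by 100 (unit free)."""
    return statistics.pstdev(xs) / statistics.mean(xs) * 100


# Hypothetical observations, one series in yen and one in dollars.
yen_series = [108.0, 110.0, 112.0, 109.0, 111.0]
dollar_series = [0.90, 1.10, 1.00, 1.20, 0.80]

sd_yen = statistics.pstdev(yen_series)         # measured in yen
sd_dollar = statistics.pstdev(dollar_series)   # measured in dollars
cv_yen = coefficient_of_variation(yen_series)
cv_dollar = coefficient_of_variation(dollar_series)

print("SD:", sd_yen, sd_dollar)   # not comparable: different units
print("CV:", cv_yen, cv_dollar)   # comparable: both are percentages

# Rescaling by any positive constant (a hypothetical conversion rate)
# leaves the coefficient of variation unchanged.
rescaled = [x * 0.0091 for x in yen_series]
cv_rescaled = coefficient_of_variation(rescaled)
print(math.isclose(cv_yen, cv_rescaled, rel_tol=1e-9))
```

In this invented example the yen series has the larger standard deviation but the smaller coefficient of variation, so relative to its own level it is the more stable of the two.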
383
00:51:06,680 --> 00:51:13,680
So now, we have to move on to another issue.
Let me give a brief idea about the
384
00:51:17,970 --> 00:51:23,680
structure of individual series and
of discrete series and continuous series.
385
00:51:23,680 --> 00:51:29,710
I have discussed in detail the calculation
of mean, median, mode and standard deviation
386
00:51:29,710 --> 00:51:36,710
and also the coefficient of variation, etcetera.
There may be some issues here with respect
387
00:51:39,009 --> 00:51:44,990
to discrete series and continuous series.
Now, there is a series of structures here,
388
00:51:44,990 --> 00:51:49,910
and all these structures are almost the same.
In whatever discussion we
389
00:51:49,910 --> 00:51:55,309
have had till now, we have been following one particular
format, that is, the individual series
390
00:51:55,309 --> 00:52:01,200
format. When there is a discrete series,
then obviously, a weight factor, the frequency, has to be assigned.
391
00:52:01,200 --> 00:52:05,980
And in the case of a continuous series,
an interval structure must be
392
00:52:05,980 --> 00:52:12,660
there, and that interval is a particular
class interval. So, with that
393
00:52:12,660 --> 00:52:17,809
class interval and a proper structure,
we have to calculate the mean and
394
00:52:17,809 --> 00:52:23,890
the standard deviation or coefficient
of variation in a different way. The calculating
395
00:52:23,890 --> 00:52:30,890
procedure is altogether different, but the
result is almost the same. The detailed structure
396
00:52:32,130 --> 00:52:38,509
of the calculation I may highlight in a different
class, because it is not possible now to take an
397
00:52:38,509 --> 00:52:45,450
example and solve it. So, we will discuss the details in the next
class, when we go for
398
00:52:45,450 --> 00:52:51,059
bivariate modelling. At that time, I will explain
how it can be done when the series is
399
00:52:51,059 --> 00:52:53,450
a completely discrete series or a continuous
series.
400
00:52:53,450 --> 00:53:00,450
So now, I will explain one more thing here, that
is, the other aspect of this particular
401
00:53:01,200 --> 00:53:08,200
problem: skewness and kurtosis.
Skewness is nothing but the shape of
402
00:53:12,670 --> 00:53:15,690
the distribution. That is, we would like to
know what the position of this particular
403
00:53:15,690 --> 00:53:22,690
series is. We usually look for a normal distribution.
If the series is normally distributed,
404
00:53:23,089 --> 00:53:29,480
then obviously, the distribution is called
a symmetric distribution, where mean, median and
405
00:53:29,480 --> 00:53:34,799
mode are equal. If mean, median and mode are
not equal, then the distribution is called
406
00:53:34,799 --> 00:53:41,799
a skewed distribution. It may be negatively
skewed or it may be positively skewed.
407
00:53:42,329 --> 00:53:49,329
So, now we need to have
a structure like this. This is symmetrical,
408
00:53:51,089 --> 00:53:56,220
with zero skew; when there is positive or negative
skew, it is called a skewed distribution.
409
00:53:56,220 --> 00:54:03,190
Generally, we look for this structure in the data
setup. If you have a data setup like this,
410
00:54:03,190 --> 00:54:09,789
then obviously, it is called a symmetrical
distribution, and that is very effective for
411
00:54:09,789 --> 00:54:14,910
modelling, particularly multivariate modelling.
If the information on all the variables is like this,
412
00:54:14,910 --> 00:54:20,559
then obviously, you are on the right track.
If not, then your structure is
413
00:54:20,559 --> 00:54:27,559
completely different. If your data setup is
not normally distributed, then obviously,
414
00:54:27,910 --> 00:54:32,849
you have to apply a transformation rule.
We have a series of transformation rules, starting
415
00:54:32,849 --> 00:54:37,980
from the exponential transformation, the logarithmic
transformation and the first-difference transformation.
416
00:54:37,980 --> 00:54:44,779
With a suitable transformation of the data,
the series can be brought closer to being
417
00:54:44,779 --> 00:54:46,000
normally distributed.
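As a rough illustration of the transformation idea, the sketch below measures skewness before and after a logarithmic transformation. The lecture does not give a skewness formula, so the moment coefficient m3 by m2 to the power 1.5 is an assumption here, as are the function name skewness and the invented right-skewed data. The logarithm also requires every observation to be positive.

```python
import math


def skewness(xs):
    """Moment coefficient of skewness: m3 / m2**1.5 (zero when symmetric)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Hypothetical positively skewed series (a few very large values).
raw = [1, 2, 2, 3, 3, 4, 5, 8, 20, 60]
logged = [math.log(x) for x in raw]  # logarithmic transformation

skew_raw = skewness(raw)
skew_logged = skewness(logged)
print("skewness before:", skew_raw)
print("skewness after log:", skew_logged)
```

The transformed series is still not perfectly symmetric, but its skewness is much closer to zero, which is the practical sense in which a transformation moves the data towards normality.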
418
00:54:46,000 --> 00:54:51,460
So, this is a very important point. This
is another shape of the normal distribution.
419
00:54:51,460 --> 00:54:57,980
Now, this is a positively skewed distribution
and this is a negatively skewed distribution,
420
00:54:57,980 --> 00:55:04,460
and this is the case of both
distributions together.
421
00:55:04,460 --> 00:55:11,460
So now, the last but not the least component of
univariate modelling is kurtosis. It represents
422
00:55:11,589 --> 00:55:15,700
simply the flatness, or peakedness, of the distribution.
There are three different structures.
423
00:55:15,700 --> 00:55:21,730
Take the case here: this is one shape,
this is another shape, this is another shape.
424
00:55:21,730 --> 00:55:26,619
This particular shape is called a thin
structure, this particular shape is called
425
00:55:26,619 --> 00:55:32,170
a flat structure, and this one is in the
middle between thin and flat. Generally,
426
00:55:32,170 --> 00:55:36,029
within this setup, we consider that the middle
structure is very beautiful and very
427
00:55:36,029 --> 00:55:41,799
effective for further modelling. This
is the usual shape of the normal distribution
428
00:55:41,799 --> 00:55:46,329
curve. It is usually called a bell-shaped
curve. If the data look like this, then obviously,
429
00:55:46,329 --> 00:55:50,930
the structure is very suitable for further
econometric modelling.
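The three shapes can be told apart numerically with the moment coefficient of kurtosis, m4 by m2 squared, which equals 3 for a normal distribution. The lecture gives no formula, so this choice of measure is an assumption, and the function name kurtosis and both series are hypothetical. In standard terminology the thin shape is leptokurtic (above 3), the flat shape is platykurtic (below 3) and the middle shape is mesokurtic (about 3).

```python
def kurtosis(xs):
    """Moment coefficient of kurtosis: m4 / m2**2 (3 for a normal curve)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2


# Hypothetical series (illustrative only).
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]    # evenly spread values: flat shape
thin = [0, 5, 5, 5, 5, 5, 5, 5, 10]   # mass at the centre plus two extremes

k_flat = kurtosis(flat)
k_thin = kurtosis(thin)
print("flat series:", k_flat)   # below 3
print("thin series:", k_thin)   # above 3
```

A value close to 3, together with skewness close to zero, is the numerical counterpart of the bell-shaped curve described above.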
430
00:55:50,930 --> 00:55:57,930
So, with this we have to finish this session
here. We will discuss this in detail in the next
431
00:55:58,970 --> 00:56:04,390
class, with examples
and different structures. So, thank you very
432
00:56:04,390 --> 00:56:05,269
much. Have a nice day.