1
00:00:15,960 --> 00:00:22,960
Good afternoon. Welcome to NPTEL project on
econometric modeling. This is Rudra Pradhan
2
00:00:23,010 --> 00:00:26,430
here. Today, we discuss the concept of probability.
3
00:00:26,430 --> 00:00:33,430
In the previous lectures, we discussed various
aspects of econometric modelling. That is,
4
00:00:36,829 --> 00:00:43,660
the structure of univariate modelling and
bivariate modelling. In the univariate modelling
5
00:00:43,660 --> 00:00:50,660
we have discussed central tendency, particularly
mean, median and mode; dispersion, that is range,
6
00:00:53,829 --> 00:01:00,030
total deviation, mean deviation, standard
deviation, coefficient of variation and skewness;
7
00:01:00,030 --> 00:01:06,410
that is the shape of the distribution.
On the other side, we have discussed correlation
8
00:01:06,410 --> 00:01:13,410
and regression; that is, to know the association
between the two variables and the causality
9
00:01:16,510 --> 00:01:23,510
between the two variables. Now, the issue is
whether you go for univariate modelling or
10
00:01:25,479 --> 00:01:32,479
bivariate modelling, we would like to know
whether the univariate modelling or bivariate
11
00:01:34,000 --> 00:01:41,000
modelling has its uniformity or has its feasibility.
So far as the feasibility is concerned, or the strength
12
00:01:45,259 --> 00:01:50,509
of this modelling is concerned or validity
of the model is concerned or significance
13
00:01:50,509 --> 00:01:57,509
of the model is concerned, we have to know
or we have to add certain things here. The
14
00:01:58,750 --> 00:02:05,750
issue is that, we like to justify whether
the particular setup, that is univariate setup
15
00:02:06,090 --> 00:02:13,090
or bivariate setup, is a relevant or significant
one. To justify the same, we need to know
16
00:02:13,370 --> 00:02:19,820
the concepts of probability, estimation and
hypothesis testing.
17
00:02:19,820 --> 00:02:26,820
So probability, estimation and hypothesis testing
have a fantastic role so far as the significance
18
00:02:28,000 --> 00:02:35,000
of a particular variable in a modelling setup is concerned.
So whether you apply univariate statistics
19
00:02:36,430 --> 00:02:43,430
or bivariate statistics or multivariate statistics,
you must have complete information or
20
00:02:45,400 --> 00:02:51,570
the complete structure of the model, and that
model has to be statistically significant.
21
00:02:51,570 --> 00:02:57,290
So now, in this particular lecture, we would like
to know how probability can be applied or
22
00:02:57,290 --> 00:03:03,670
can be used to know the significance of a
particular variable or significance of the
23
00:03:03,670 --> 00:03:09,260
model; that means the overall fitness of the
models.
24
00:03:09,260 --> 00:03:16,260
So now, what is this probability all about?
Probability is basically, you can say, the chance
25
00:03:19,260 --> 00:03:26,260
of occurrence. So it is a quantitative measurement
of uncertainty. In the real world situations,
26
00:03:29,620 --> 00:03:36,620
you will find some of the things are very
certain and some of the things are very uncertain.
27
00:03:36,890 --> 00:03:43,890
So that is how, in the beginning, we have
mentioned the difference between the social
28
00:03:44,230 --> 00:03:49,420
science structure and you can say mathematical
science structures.
29
00:03:49,420 --> 00:03:56,260
On one side, the relationships are exact
in nature. On the other side, the relationships
30
00:03:56,260 --> 00:04:02,830
are inexact in nature. The moment you
say relationships are exact, then obviously
31
00:04:02,830 --> 00:04:09,830
this is one way we can say that the situations
are very certain, very clear. So now, there is no point
32
00:04:10,040 --> 00:04:17,040
in going for verification or, you can say, any
statistical tool to estimate or to check reliability,
33
00:04:18,840 --> 00:04:25,840
etcetera. But in reality there are so many
things. We are not sure about the
34
00:04:26,910 --> 00:04:33,490
facts, so that means there is always a question
of uncertainty. So now, when there is a question
35
00:04:33,490 --> 00:04:40,490
of uncertainty, we try to justify how it can
be possible to measure or you can say how
36
00:04:41,819 --> 00:04:48,819
you can justify the situation. So, when it
is uncertain, so what is the percentage? So,
37
00:04:49,229 --> 00:04:55,870
that means how much it differs from certainty.
So now, in that case probability plays a fantastic
38
00:04:55,870 --> 00:05:02,870
role. So, the basic meaning of probability
is the quantitative measurement of uncertainty.
39
00:05:03,030 --> 00:05:10,030
So, it measures the degree or chance of occurrence
of an, you can say, uncertain event.
40
00:05:13,340 --> 00:05:19,719
So I will take a case here. Let us take the
case where I toss a coin; let us say a coin
41
00:05:19,719 --> 00:05:26,719
is an example. Now, if I toss a coin,
then obviously its outcome is either head
42
00:05:30,729 --> 00:05:37,689
or tail. Now, how can probability play a
role here? So now, the moment you toss
43
00:05:37,689 --> 00:05:44,539
a coin, obviously we have two possible
outcomes. Either you will get head or you
44
00:05:44,539 --> 00:05:49,949
will get tail. Now, the question is whether
you will get head or whether you will get
45
00:05:49,949 --> 00:05:56,949
tail. So now, this is not at all certain.
Now, uncertainty has to be applied. Now, take
46
00:05:59,129 --> 00:05:59,659
another case.
47
00:05:59,659 --> 00:06:06,659
Now, there are two persons,
say X and Y. So, they each like to toss a coin.
48
00:06:07,879 --> 00:06:14,879
Now, in the X case, the possible outcomes
are head and tail, and in the Y case, the
49
00:06:15,080 --> 00:06:22,080
possible outcomes are also head and tail.
So now, if both persons at the same time
50
00:06:22,449 --> 00:06:28,830
toss the coin, then obviously what is the
chance of occurrence or what is the chance
51
00:06:28,830 --> 00:06:35,830
of belief or chance of occurrence? Now, the
moment both persons, you can say, try to
52
00:06:37,009 --> 00:06:42,129
toss the coins, then obviously there are
several possibilities.
53
00:06:42,129 --> 00:06:49,129
First, both heads can occur; in both
cases a head can come. Then, in the second case,
54
00:06:52,749 --> 00:06:59,749
both tails can come, and in the third case,
one tail and one head. So, there are three
55
00:07:07,840 --> 00:07:14,840
possible situations. So, we have
to make an experiment: what are the possible
56
00:07:18,389 --> 00:07:25,389
outcomes, and to what extent do we have to bear
them? So, in that context, you know, probability
57
00:07:26,009 --> 00:07:32,849
plays a fantastic role. Let me explain what
these probability structures are all about.
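The two-coin case described above can be checked by listing the outcomes. The following is a minimal Python sketch, assuming both coins are fair and the two tosses do not influence each other (the lecture does not state this explicitly). The names `outcomes` and `situation` are illustrative only. It suggests that the three situations named above are not equally likely, because one head and one tail can happen in two different ways.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Persons X and Y each toss one coin: H = head, T = tail.
# Each ordered pair is (X's result, Y's result); assumed equally likely.
outcomes = list(product("HT", repeat=2))


def situation(pair):
    """Map an ordered pair to one of the three situations from the lecture."""
    heads = pair.count("H")
    return {2: "both heads", 0: "both tails", 1: "one head and one tail"}[heads]


counts = Counter(situation(pair) for pair in outcomes)
probabilities = {name: Fraction(n, len(outcomes)) for name, n in counts.items()}

for name, p in probabilities.items():
    print(name, p)
# both heads 1/4
# one head and one tail 1/2
# both tails 1/4
```

Counting ordered pairs rather than the three situations directly is what makes the equally likely assumption usable here.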
58
00:07:32,849 --> 00:07:39,849
Probability is basically divided into two parts.
Probability can be measured from different
59
00:07:42,870 --> 00:07:49,870
angles. One is called the objective classification,
and the other is called the subjective classification.
60
00:07:54,979 --> 00:08:01,979
In the case of objective classification, it basically
depends upon equally likely events, equally
61
00:08:05,839 --> 00:08:12,839
likely events. In the second case, subjective
classification, we usually look for personal
62
00:08:15,830 --> 00:08:22,830
judgment. For instance, take the case of an election,
snowfall, etcetera. So, in the case of objective
63
00:08:24,309 --> 00:08:30,939
classification, we have equally likely events. The
same thing: toss a coin, pick a card or toss
64
00:08:30,939 --> 00:08:37,849
a die, etcetera. So, in that context, the
subjective classification of probability plays
65
00:08:37,849 --> 00:08:44,130
a fantastic role. So, probability basically
can be measured from an objective angle and can
66
00:08:44,130 --> 00:08:49,730
be measured from a subjective angle. So, in one
case, equally likely events must occur. In
67
00:08:49,730 --> 00:08:54,770
another case, it must rest on personal judgment.
Equally likely events means the chance of a
68
00:08:54,770 --> 00:09:00,380
particular outcome is the same as the chance
of any other outcome. So, this is how the objective
69
00:09:00,380 --> 00:09:07,380
classification and subjective classification are defined.
Now, the issue here is, what is the exact
70
00:09:08,620 --> 00:09:15,620
structure of probability and how is it, you
can say, applied in econometric modelling?
71
00:09:16,630 --> 00:09:22,760
Basically, probability can be applied purely
in mathematics and purely in statistics. But,
72
00:09:22,760 --> 00:09:29,760
in this econometric modelling, particularly
when we go for a multivariate framework,
73
00:09:30,630 --> 00:09:37,630
when the problem involves so many variables
at a time, then we like to know which particular
74
00:09:40,250 --> 00:09:47,250
variable has the most impact and, if
it has an impact, then how much, and whether the
75
00:09:49,750 --> 00:09:56,750
impact is statistically significant.
So, in order to know all these answers,
76
00:09:58,110 --> 00:10:05,110
probability has to play the game. Now, probability
will give an indication whether it is
77
00:10:07,250 --> 00:10:12,570
significant and, if it is significant, what
is the level of significance? So, that is
78
00:10:12,570 --> 00:10:19,570
how we have to justify. In that context, probability
plays a fantastic role. So, we like to know
79
00:10:19,630 --> 00:10:24,900
the details of the structure of probability
before we apply it in econometric modelling.
80
00:10:24,900 --> 00:10:31,900
Now, the concept of probability is very,
you know, very big. It has a broader
81
00:10:32,170 --> 00:10:39,170
angle that we could describe, but in the mean
time, we will just discuss the concept
82
00:10:39,240 --> 00:10:45,770
very briefly. So, our point here is to know
what exactly probability is, how it is set up
83
00:10:45,770 --> 00:10:52,610
and how it can be used from the econometric
angle. Now, to know the detailed concept of
84
00:10:52,610 --> 00:10:59,610
probability, we must know something about
the basic concepts of probability. So, to interpret
85
00:11:00,230 --> 00:11:07,230
probability, you must have a thorough
knowledge of its, you know, components.
86
00:11:07,310 --> 00:11:13,720
So one of the basic components, that is what
we will call the basic components of
87
00:11:13,720 --> 00:11:20,720
probability: first is the set. Now, in mathematics
there is a concept called a set. A set means
88
00:11:25,190 --> 00:11:31,940
a collection of well-defined objects. The objects may be of different
kinds, different sets, different structures,
89
00:11:31,940 --> 00:11:38,690
etcetera. Here, the starting point of this
probability is you can say, this set theory.
90
00:11:38,690 --> 00:11:45,690
So, this set theory plays a fantastic role
in probability. So, we have various concepts
91
00:11:46,570 --> 00:11:52,640
of sets in probability theory. Now, sets
we are describing in two ways: one is called
92
00:11:52,640 --> 00:11:57,720
the empty set and the other is called the
universal set. Empty set means the set
93
00:11:57,720 --> 00:12:04,720
has no elements, and universal
set means the set of all possible outcomes,
94
00:12:07,530 --> 00:12:14,530
that is, all possible outcomes
taken together.
95
00:12:15,870 --> 00:12:22,870
So now, let us take a case here. This is the
set; we will call it the universal set S. Now,
96
00:12:27,130 --> 00:12:34,130
inside S, we have a set called A. Now,
this is a particular set A or, you can
97
00:12:39,370 --> 00:12:46,370
say, event A, and S is the complete set. So,
this is otherwise called the universal set.
98
00:12:47,680 --> 00:12:53,900
Universal set means it is the integration
of all sets at a time. So within the universal
99
00:12:53,900 --> 00:13:00,900
set, we have a subset. For instance, just
like we have discussed the multivariate and bivariate
100
00:13:01,390 --> 00:13:07,900
frameworks: within the multivariate framework,
we have bivariate analysis, and under
101
00:13:07,900 --> 00:13:13,880
bivariate analysis or, again, multivariate analysis,
we have univariate analysis.
102
00:13:13,880 --> 00:13:18,150
So now, this structure is almost the same
here. So, we are discussing about the universal
103
00:13:18,150 --> 00:13:23,560
set. So, within the universal set, we can
describe, or we can pinpoint, a particular
104
00:13:23,560 --> 00:13:30,560
set also. Now, what is a set all about, what
is the empty set, what is the universal set? So,
105
00:13:31,890 --> 00:13:38,670
empty set means there is no element in the
particular set. Universal set means all possible
106
00:13:38,670 --> 00:13:44,950
outcomes in the set. And in addition to that,
there is a concept called A complement.
107
00:13:44,950 --> 00:13:51,950
So, it is nothing but the set S minus A, and its probability is usually
defined as one minus P A. So, whatever elements
108
00:13:54,810 --> 00:14:01,810
are in set A are not considered; the total
set minus, you can say, the total items involved
109
00:14:02,150 --> 00:14:04,380
in A represents the complement of A.
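The complement described above can be written out with Python sets. This is a minimal sketch; the particular sets `S` and `A` are hypothetical values chosen only for illustration, and the outcomes are assumed equally likely so that probability is a ratio of counts.

```python
from fractions import Fraction

# Hypothetical universal set S and event A, chosen only for illustration.
S = {1, 2, 3, 4, 5, 6}
A = {1, 3, 5}

# The complement of A is everything in S that is not in A.
A_complement = S - A

# Assuming equally likely outcomes, probability is a ratio of counts.
p_A = Fraction(len(A), len(S))
p_A_complement = Fraction(len(A_complement), len(S))

print(sorted(A_complement))        # [2, 4, 6]
print(p_A_complement == 1 - p_A)   # True
```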
110
00:14:04,380 --> 00:14:11,380
So, now there is another question. Suppose
we have two sets A and B. So, how are they
111
00:14:11,500 --> 00:14:16,540
related to each other? Just like we have discussed,
where there are two variables, how they are integrated with
112
00:14:16,540 --> 00:14:22,460
each other, the probability case is also the same,
so it can be analyzed under one particular
113
00:14:22,460 --> 00:14:26,990
variable or one particular event or it can
be analyzed under two particular variables
114
00:14:26,990 --> 00:14:33,010
or two particular events. So now,
if the question is only about one particular event
115
00:14:33,010 --> 00:14:37,700
or one variable, then what is the setup? If it
is with respect to two variables, or more
116
00:14:37,700 --> 00:14:42,540
than two variables or more than two events,
then what is the setup?
117
00:14:42,540 --> 00:14:47,930
So, now, in the case of two variables say
A and B, so then I can represent this universal
118
00:14:47,930 --> 00:14:54,930
set like this. Set A, this is universal set
S, so within the set I can give a picture
119
00:14:55,020 --> 00:14:59,780
here like this. This is set A and this is
set B. I will put another example
120
00:14:59,780 --> 00:15:06,780
here. Set A and set B. So, this is A, this is B and
this is set S. S represents the total
121
00:15:09,140 --> 00:15:14,410
set; you can say it sums up
all the individual sets.
122
00:15:14,410 --> 00:15:20,710
So now, in the first case, set A and set B
are totally separate. This is what you
123
00:15:20,710 --> 00:15:27,710
usually call mutually exclusive events,
mutually exclusive events,
124
00:15:30,860 --> 00:15:36,760
or they are otherwise called disjoint sets;
that means there is no common element in between.
125
00:15:36,760 --> 00:15:43,730
So, A is a set under the universal set and B is
a set under the universal set, but there is no
126
00:15:43,730 --> 00:15:48,190
integration between A and B. Just like we
have discussed in the multivariate framework,
127
00:15:48,190 --> 00:15:54,930
there are several variables in the
initial setup. So, we like to know
128
00:15:54,930 --> 00:15:58,260
what is the dependent cluster and what is
the independent cluster?
129
00:15:58,260 --> 00:16:05,260
So, now, the specialty of that modelling is
that, within the independent cluster,
130
00:16:05,680 --> 00:16:11,530
we are very much interested in whether there
is again any integration among the independent
131
00:16:11,530 --> 00:16:16,710
variables. If there is, then the problem
may be inconsistent or it may be very complex
132
00:16:16,710 --> 00:16:23,710
in nature. So now, there may be a situation
where, within the particular setup, all these variables
133
00:16:23,760 --> 00:16:29,960
may be independent. If they are independent,
then obviously, as far as the econometric modelling
134
00:16:29,960 --> 00:16:35,420
structure is concerned, this is the right track. But
in certain situations, or in most of the
135
00:16:35,420 --> 00:16:41,920
situations, you will find all these independent
variables are not independent. There may be
136
00:16:41,920 --> 00:16:47,140
some association or relationship between the
two, and in that case, we have to find
137
00:16:47,140 --> 00:16:52,750
out a way to tackle all these, you
can say, relationships or, if you cannot avoid them,
138
00:16:52,750 --> 00:16:58,200
at least, you can minimize this particular
problem. So, this is how we have to represent
139
00:16:58,200 --> 00:17:03,480
the set of mutually exclusive events.
So, now, in the other case, there are two
140
00:17:03,480 --> 00:17:10,480
sets. But, there is some integration between
A and B. Now, if there are two sets and we
141
00:17:10,620 --> 00:17:15,980
like to know the integration, there are
three different structures here. Now, this
142
00:17:15,980 --> 00:17:22,209
particular component is one cluster and, for A,
what is outside of it is called
143
00:17:22,209 --> 00:17:27,689
A complement; and this is set B, and outside of
B is what is called B complement. So, now,
144
00:17:27,689 --> 00:17:32,499
within A and B, so we like to know how they
are related to each other; that means whether
145
00:17:32,499 --> 00:17:37,899
they have some common element or they do
not have any common element. Now, in the case where
146
00:17:37,899 --> 00:17:42,090
there is no common element, the two
sets are completely disjoint from each other.
147
00:17:42,090 --> 00:17:49,090
For instance, I will take an example here.
I will take A equal to 1, you
148
00:17:50,100 --> 00:17:56,529
can say 3, 5. This is the set of you can say
odd numbers. Then, I will take another case
149
00:17:56,529 --> 00:18:03,220
B equal to 2, 4 and 6. This is the set of
even numbers. Now, this is event A and this
150
00:18:03,220 --> 00:18:10,220
is event B. Now, if I say S
equal to 1, 2, 3, 4, 5 and 6, this is the
151
00:18:12,200 --> 00:18:19,139
example of tossing or, you can say, rolling a die. So
now, there are six possible, you can say, outcomes.
152
00:18:19,139 --> 00:18:24,789
So, the six possible outcomes I have
divided into two categories: one
153
00:18:24,789 --> 00:18:30,129
is the event which has odd numbers and the other is the
event which has even numbers.
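The odd and even split of the die outcomes can be checked directly. This is a small Python sketch using the lecture's sets S, A and B; `isdisjoint` is the standard set method that tests whether two sets have no common element.

```python
# Sample space for one roll of a die, as in the lecture.
S = set(range(1, 7))

# Event A: odd numbers; event B: even numbers.
A = {n for n in S if n % 2 == 1}
B = {n for n in S if n % 2 == 0}

print(sorted(A), sorted(B))  # [1, 3, 5] [2, 4, 6]
print(A.isdisjoint(B))       # True: no common element, so mutually exclusive
print(A | B == S)            # True: together they make up the whole of S
```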
154
00:18:30,129 --> 00:18:35,590
So, now, when I represent the odd structure,
we have three elements that is 1, 3, 5 and
155
00:18:35,590 --> 00:18:40,549
when we go for even elements, then we
have again three elements, that is 2, 4, 6.
156
00:18:40,549 --> 00:18:45,409
But, set A and set B are completely separate.
So, that is what is called a disjoint
157
00:18:45,409 --> 00:18:52,019
set. So, this is one typical problem in probability
theory. Another typical problem is that both
158
00:18:52,019 --> 00:18:57,779
the sets are in the universal set, but there
is some common element. For instance, I will
159
00:18:57,779 --> 00:19:04,779
take here the set A as 1, 2, 3, 4, 5. Then,
as the other set, I will take B as 2, 4, 6. And obviously
160
00:19:07,419 --> 00:19:14,419
2 and 4 are the common elements. Now, I will
take it here. This is 1, this is 3, this is
161
00:19:15,929 --> 00:19:22,509
2, this is 4, this is 5. Now, if B has the even
numbers as elements, that is 2, 4, 6, that means
162
00:19:22,509 --> 00:19:29,509
2 and 4 are the common elements. So, these,
2 and 4, are represented as A intersection
163
00:19:29,529 --> 00:19:35,179
B. So, in mathematical notation, this is
called A intersection B.
164
00:19:35,179 --> 00:19:42,179
So, now, if there is an intersection, or these
common elements, between A and B, that means,
165
00:19:42,370 --> 00:19:49,370
if we club them together, it must be the
universal set, provided the two sets,
166
00:19:51,210 --> 00:19:58,029
A and B, are the only two sets in the system.
So, then obviously, there is a concept called
167
00:19:58,029 --> 00:20:05,029
A union B. So, the union represents the
set of all elements that are in set A or in set
168
00:20:06,360 --> 00:20:12,570
B. So, this is how we like to know what
is a particular set, component set A, what
169
00:20:12,570 --> 00:20:17,539
is the particular set, component set B, then
we like to know what is the association between
170
00:20:17,539 --> 00:20:21,929
A and B. So, there are two ways they can relate.
Either they are completely disjoint or
171
00:20:21,929 --> 00:20:26,190
they overlap. If they overlap,
how do they overlap with each other? So,
172
00:20:26,190 --> 00:20:31,559
this is the basic set structure of, you
can say, probability theory.
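The overlapping case above, with A as 1 to 5 and B as 2, 4, 6, can be sketched with Python's set operators. The variable names `common`, `combined` and `only_in_A` are illustrative assumptions, not notation from the lecture.

```python
# Universal set and the two overlapping sets from the lecture's example.
S = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3, 4, 5}
B = {2, 4, 6}

common = A & B      # A intersection B: elements in both sets
combined = A | B    # A union B: elements in A or in B (or both)
only_in_A = A - B   # elements of A that are not in B

print(sorted(common))     # [2, 4]
print(combined == S)      # True: here A and B together cover S
print(sorted(only_in_A))  # [1, 3, 5]
```

The union equals S only because these particular A and B happen to cover every outcome; that is the proviso mentioned in the lecture.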
173
00:20:31,559 --> 00:20:38,559
So, now, there are certain other
items. For instance, one typical word
174
00:20:38,730 --> 00:20:45,730
is experiment. So, since the term
probability is a quantitative measurement,
175
00:20:45,820 --> 00:20:52,279
we have to make so many experiments
to get, you know, the possible outcomes, because the
176
00:20:52,279 --> 00:20:57,119
situation is totally uncertain. So, we have
to make so many experiments to get a particular
177
00:20:57,119 --> 00:21:02,369
objective or to fulfill the particular objective.
So, now, experiment plays a fantastic role
178
00:21:02,369 --> 00:21:07,409
in the probability concept. Similarly,
there is a concept called the sample space.
179
00:21:07,409 --> 00:21:13,779
The sample space consists of, you can say, all possible outcomes
at a time. So, this is otherwise also called
180
00:21:13,779 --> 00:21:20,629
the universal set. Now, there is a concept
called an event. An event means, in a particular
181
00:21:20,629 --> 00:21:25,619
case, just like we have discussed set A and
set B. Set A may be a particular event, set
182
00:21:25,619 --> 00:21:30,039
B may be a particular event. So, now, within
a particular system, event A, event B may
183
00:21:30,039 --> 00:21:34,679
be integrated or may not be integrated.
If they are integrated, then how are they integrated
184
00:21:34,679 --> 00:21:39,909
with each other? Now, how do we measure
probability?
185
00:21:39,909 --> 00:21:46,230
Generally, probability concept for a particular
event say, let us assume that A is an event,
186
00:21:46,230 --> 00:21:53,230
A is an event. So, then the probability of A is
measured as the number of favorable outcomes
187
00:21:54,720 --> 00:22:00,029
divided by the number of possible outcomes; that
means the total number of possible outcomes in
188
00:22:00,029 --> 00:22:06,860
a system and the number of outcomes in the particular event. So,
the ratio is called the probability. So take
189
00:22:06,860 --> 00:22:13,860
an example here. Let us
assume that there is a family consisting of
190
00:22:17,899 --> 00:22:24,899
three members. So, there is a
family of three members, and the
191
00:22:26,929 --> 00:22:33,929
family is classified with respect to boys and
girls. So, we like
192
00:22:37,169 --> 00:22:43,659
to know what is the probability of having
exactly two girls in the system, and of not more
193
00:22:43,659 --> 00:22:49,249
than two girls in the system.
So, now, if there is a question of choosing
194
00:22:49,249 --> 00:22:55,730
three members for a particular team, then how
are they, you can say, combined? And how
195
00:22:55,730 --> 00:23:01,580
are the problems formulated here? Now, the
total sample space will be like this. Now, I
196
00:23:01,580 --> 00:23:08,039
will, instead of boys,
write B, and instead of girls I will
197
00:23:08,039 --> 00:23:13,440
write G. So, three particular situations
can occur simultaneously. So, there may
198
00:23:13,440 --> 00:23:20,440
be different situations: B B B, which means
there are no girls at all; then B B G, then
199
00:23:21,759 --> 00:23:28,759
B G B, then G B B, then B G G, then
G B G, then G G B, then G G G.
200
00:23:40,639 --> 00:23:46,249
So, now, this is the complete setup. Now,
the question is what is the probability of
201
00:23:46,249 --> 00:23:52,009
exactly two girls at a time? So, exactly two
girls means, we have to see where only two
202
00:23:52,009 --> 00:23:56,830
girls are there. So, this is one case. This
is another case. This is another case. So,
203
00:23:56,830 --> 00:24:03,830
there are no more. So, there are three outcomes,
three in number. So, the total sample space count
204
00:24:05,850 --> 00:24:12,850
n S is, counting 1, 2, 3, 4, 5, 6, 7, 8, equal to 8, and the individual
event count n A, for exactly two
205
00:24:13,999 --> 00:24:20,909
girls, which is nothing but, you can say 3.
So, probability of A is nothing but, n A by
206
00:24:20,909 --> 00:24:27,869
n S; which is nothing but, 3 by 8. So, this
is how probability can be calculated. So,
207
00:24:27,869 --> 00:24:34,869
now, we like to know various issues of the
particular term probability. So, let me explain
208
00:24:35,700 --> 00:24:37,320
here what those issues are.
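The family example can be reproduced by enumerating the sample space. This is a minimal Python sketch, assuming each of the eight boy/girl sequences is equally likely (the lecture uses the counting ratio n A by n S, which relies on that assumption). The "not more than two girls" figure is not computed in the lecture; it is added here using the same counting rule.

```python
from fractions import Fraction
from itertools import product

# Every member is a boy (B) or a girl (G); three members give 2**3 sequences.
sample_space = ["".join(seq) for seq in product("BG", repeat=3)]

# Event A: exactly two girls.
exactly_two_girls = [s for s in sample_space if s.count("G") == 2]
# Second event mentioned in the lecture: not more than two girls.
at_most_two_girls = [s for s in sample_space if s.count("G") <= 2]

n_S = len(sample_space)
p_exactly_two = Fraction(len(exactly_two_girls), n_S)
p_at_most_two = Fraction(len(at_most_two_girls), n_S)

print(n_S)                 # 8
print(exactly_two_girls)   # ['BGG', 'GBG', 'GGB']
print(p_exactly_two)       # 3/8
print(p_at_most_two)       # 7/8
```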
209
00:24:37,320 --> 00:24:41,940
So, the basic starting point of probability
is like this, for a particular event A; P
210
00:24:41,940 --> 00:24:48,940
A equal to n A by, you can say, n S. But, you
know, as in the last class discussion, every statistic
211
00:24:52,779 --> 00:24:59,779
has its, you know, advantages and has its disadvantages.
You know, disadvantage means it has lots of
212
00:25:00,489 --> 00:25:06,860
limitations and shortcomings. It also has
lots of advantages, so that we can apply it or
213
00:25:06,860 --> 00:25:13,779
we can solve a particular problem. For instance,
take the case of covariance. The specialty of
214
00:25:13,779 --> 00:25:20,679
covariance is that it traces the association
between two variables. However, the limitation
215
00:25:20,679 --> 00:25:27,679
part of this particular statistic is that,
if two variables have different units, or if there is any comparative
216
00:25:27,700 --> 00:25:34,700
analysis, then the covariance technique cannot
be used properly because it is not at all
217
00:25:38,330 --> 00:25:42,350
a unit-free measurement.
So, now, in order to solve that particular
218
00:25:42,350 --> 00:25:47,429
problem, we have to apply correlation. So
that, with the help of correlation, we can
219
00:25:47,429 --> 00:25:54,090
get to know the answers. But again, correlation
has an advantage over covariance, but in certain
220
00:25:54,090 --> 00:26:01,049
situations, correlation itself has lots of
limitations. For instance, correlation is
221
00:26:01,049 --> 00:26:06,139
a unit-free measurement and it is an advancement
over covariance; it is a better technique to know
222
00:26:06,139 --> 00:26:10,669
the association between the two variables,
but at the same time, it cannot measure the
223
00:26:10,669 --> 00:26:15,549
cause-and-effect relationship between the
two variables. So, for that again, we have
224
00:26:15,549 --> 00:26:22,220
to look for something else. So, this is
how the problem is very complex in nature.
225
00:26:22,220 --> 00:26:29,220
Every statistic has its advantages
and has its disadvantages. Now, in the probability
226
00:26:29,769 --> 00:26:32,980
aspect, we like to know what is the specialty
of probability.
227
00:26:32,980 --> 00:26:38,730
So, like, you know, we have discussed the
correlation issue. One of the fantastic features
228
00:26:38,730 --> 00:26:43,789
of correlation is that the value of the correlation
coefficient lies between minus one and plus
229
00:26:43,789 --> 00:26:48,559
one. So, similarly, in the case of probability,
we also have certain advantages or, you can
230
00:26:48,559 --> 00:26:55,279
say, interesting features. So, one of the interesting
features is that the probability of A is always
231
00:26:55,279 --> 00:27:02,279
greater than or equal to 0 and less than or equal to 1. So,
that means the value of probability is always
232
00:27:05,210 --> 00:27:11,749
non-negative. So, it is always non-negative;
that means it is never below zero. This is
233
00:27:11,749 --> 00:27:18,749
the most important property of probability.
Second, the sum of P i, for i equal to 1 to n, is exactly
234
00:27:20,019 --> 00:27:26,330
equal to 1; that means the sum of all probabilities
in a particular setup must be exactly equal
235
00:27:26,330 --> 00:27:32,399
to 1. For instance, in a particular setup
there are only two possible events, A and B. So, obviously
236
00:27:32,399 --> 00:27:37,220
the probability of A occurring plus the probability
of B occurring should be exactly equal to
237
00:27:37,220 --> 00:27:42,889
1. First of all, the probability for the first
event A should be non-negative and the probability
238
00:27:42,889 --> 00:27:47,029
for the second event B should be non-negative, and
in the final case, the total probability must be
239
00:27:47,029 --> 00:27:53,220
exactly equal to 1. So, these two properties
have to be fulfilled, otherwise the concept
240
00:27:53,220 --> 00:27:59,249
of probability is inconsistent.
So, there are other rules also, like,
241
00:27:59,249 --> 00:28:04,789
you know, the probability of A complement,
which is nothing but 1 minus P A. Then, we
242
00:28:04,789 --> 00:28:11,789
have also discussed the intersection, such that, you
can say, P A intersection B is equal to, you
243
00:28:14,980 --> 00:28:21,429
can say, the number of outcomes in A intersection B divided
by, you can say, the total n S; in fact, it is the
244
00:28:21,429 --> 00:28:26,859
number of common outcomes divided by the total
number of outcomes. And for the mutually exclusive
245
00:28:26,859 --> 00:28:33,859
case, P of A
union C is equal to P A plus P C, whereas P
246
00:28:36,830 --> 00:28:43,830
A union C is usually P A plus P C minus P A intersection
C. So, when
247
00:28:50,359 --> 00:28:56,340
the events are mutually exclusive
or disjoint, then obviously the common element
248
00:28:56,340 --> 00:29:03,340
or intersection term will be equal to 0. So,
simply, P A union C is equal to P A plus P C.
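The addition rule above can be checked numerically. This is a minimal Python sketch on the die sample space, assuming equally likely outcomes; the helper `prob` and the particular events are illustrative choices, not taken from the lecture.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # one roll of a die, outcomes assumed equally likely


def prob(event):
    """Probability of an event as (favorable outcomes) / (possible outcomes)."""
    return Fraction(len(event & S), len(S))


# General case: A and C overlap in {2, 4}.
A = {1, 2, 3, 4, 5}
C = {2, 4, 6}
general_lhs = prob(A | C)
general_rhs = prob(A) + prob(C) - prob(A & C)

# Mutually exclusive case: the intersection is empty, so its probability is 0.
odd = {1, 3, 5}
even = {2, 4, 6}
exclusive_lhs = prob(odd | even)
exclusive_rhs = prob(odd) + prob(even)

print(general_lhs == general_rhs)      # True
print(prob(odd & even))                # 0
print(exclusive_lhs == exclusive_rhs)  # True
```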
249
00:29:03,690 --> 00:29:10,690
So, this is the structure
of probability or, we can say, how we can discuss
250
00:29:10,929 --> 00:29:17,519
the features of probability. Then, there is
a concept called conditional probability.
251
00:29:17,519 --> 00:29:24,409
This is conditional probability.
So, we take events. There are two events,
252
00:29:24,409 --> 00:29:30,739
A and B. Then, there are two
ways we can define the conditional probability:
253
00:29:30,739 --> 00:29:37,739
P A given B, which is nothing but P A intersection
B divided by P B, provided P B is positive.
254
00:29:41,940 --> 00:29:48,940
Positive means that it is not equal to
0. Then, obviously, P B into P A given
255
00:29:54,739 --> 00:30:01,739
B is equal to P A intersection B. So, this
is equation number one. In the other situation, the probability
256
00:30:03,539 --> 00:30:10,480
of B given A is equal to the probability
of A intersection B divided by the probability
257
00:30:10,480 --> 00:30:17,480
of A. That implies probability of A into probability
of B given A is equal to probability of A
258
00:30:18,940 --> 00:30:22,859
intersection B. Now, this is equation number
two.
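The two conditional probability equations above can be sketched in Python. The events A and B below reuse the die example from earlier in the lecture; the helper names `prob` and `conditional` are assumptions made for illustration, and outcomes are assumed equally likely.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # die outcomes, assumed equally likely
A = {1, 2, 3, 4, 5}
B = {2, 4, 6}


def prob(event):
    """Counting probability: n(event) / n(S)."""
    return Fraction(len(event & S), len(S))


def conditional(event, given):
    """P(event | given) = P(event and given) / P(given), for P(given) > 0."""
    p_given = prob(given)
    if p_given == 0:
        raise ValueError("conditioning event must have positive probability")
    return prob(event & given) / p_given


p_A_given_B = conditional(A, B)
p_B_given_A = conditional(B, A)

print(p_A_given_B)  # 2/3
print(p_B_given_A)  # 2/5

# Equation one and equation two both recover P(A intersection B).
print(prob(B) * p_A_given_B == prob(A & B))  # True
print(prob(A) * p_B_given_A == prob(A & B))  # True
```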
259
00:30:22,859 --> 00:30:29,859
So, now, if we compare these two equations,
then the probability of B into the probability
260
00:30:30,059 --> 00:30:37,059
of A given B is equal to the probability of A into the
probability of B given A; that means the probability
261
00:30:40,340 --> 00:30:47,340
of B is equal to the probability of A into the probability
of B given A, all divided by the probability of
262
00:30:49,649 --> 00:30:56,649
A given B. So, this is the theorem of
conditional probability, provided in all the
263
00:30:57,179 --> 00:31:04,179
cases the value of the probability is positive
and lies between 0 and 1. So, it
264
00:31:05,720 --> 00:31:11,629
means that in no situation should it be negative
or more than 1. That is,
265
00:31:11,629 --> 00:31:18,279
the limits should be between 0 and 1, like
the term called the coefficient of determination,
266
00:31:18,279 --> 00:31:24,080
which we discussed in the last class, the
square of the correlation coefficient.
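The two definitions of conditional probability given above can be checked on a small example. The following is only a minimal Python sketch and is not part of the lecture: the fair six-sided die, the events A and B, and the helper names prob and cond_prob are all assumptions chosen for illustration.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # hypothetical event A: the roll is even
B = {4, 5, 6}   # hypothetical event B: the roll is greater than 3


def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given), needs P(given) > 0."""
    p_given = prob(given)
    if p_given == 0:
        raise ValueError("the conditioning event must have positive probability")
    return prob(event & given) / p_given


p_a_given_b = cond_prob(A, B)
p_b_given_a = cond_prob(B, A)

# Equation one and equation two both recover P(A intersection B).
product_one = prob(B) * p_a_given_b
product_two = prob(A) * p_b_given_a

# Rearranged form: P(B) = P(A) * P(B | A) / P(A | B).
p_b_from_theorem = prob(A) * p_b_given_a / p_a_given_b

print(p_a_given_b, p_b_given_a, product_one, product_two, p_b_from_theorem)
```

Exact fractions are used here instead of floating point so that the equalities hold exactly rather than approximately.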
267
00:31:24,080 --> 00:31:31,080
So, now, for independent events:
if the two events
268
00:31:31,720 --> 00:31:38,369
are independent, then the probability
of A given B is simply equal to the probability
269
00:31:38,369 --> 00:31:45,369
of A, because it is nothing but the probability
of A intersection B divided by the probability
270
00:31:45,850 --> 00:31:50,350
of B, and the probability of A intersection B
is nothing but the probability of A into the probability
271
00:31:50,350 --> 00:31:56,159
of B because they are independent. So, dividing by the probability
of B, it is simply equal to the probability
272
00:31:56,159 --> 00:32:03,159
of A. Similarly, the probability of B given A
is simply equal to the probability of B: it is
273
00:32:04,940 --> 00:32:11,940
P(A) into P(B) divided by P(A). So, obviously,
the simple answer is P(B). Now, we
274
00:32:16,809 --> 00:32:23,809
like to know the issues of
probability: various aspects of probability,
275
00:32:24,049 --> 00:32:31,049
various theorems under probability and various
conditions of probability. And within the
276
00:32:32,649 --> 00:32:36,519
conditions, we would like to know conditional
probability. This is a very interesting component
277
00:32:36,519 --> 00:32:43,029
which we will discuss in detail at a later stage.
So, in fact, there are many other issues
278
00:32:43,029 --> 00:32:47,700
in probability, in what we call
mathematical probability. We are not
279
00:32:47,700 --> 00:32:53,019
going to discuss the details of these issues
because we are restricted to econometric
280
00:32:53,019 --> 00:32:59,429
modelling. So, our agenda
is to know a little bit about probability,
281
00:32:59,429 --> 00:33:06,429
so that it will
help you a lot in econometric modelling,
282
00:33:06,669 --> 00:33:12,279
particularly to test reliability
and to check the significance of a particular
283
00:33:12,279 --> 00:33:13,269
variable.
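Before moving on, the independence rule stated earlier, that P(A given B) equals P(A) when P(A intersection B) equals P(A) into P(B), can be checked numerically. This is a hedged Python sketch and not part of the lecture: the two coin tosses and the events A, B and C are assumptions made only for illustration. It also shows that mutually exclusive events with positive probability are not independent.

```python
from fractions import Fraction
from itertools import product

# Assumed example: two tosses of a fair coin, four equally likely outcomes.
outcomes = set(product("HT", repeat=2))


def prob(event):
    """Classical probability on the equally likely outcomes above."""
    return Fraction(len(event), len(outcomes))


A = {o for o in outcomes if o[0] == "H"}   # first toss is a head
B = {o for o in outcomes if o[1] == "H"}   # second toss is a head
C = {o for o in outcomes if o[0] == "T"}   # first toss is a tail

# A and B are independent: P(A and B) = P(A) * P(B), so P(A | B) = P(A).
independent_ab = prob(A & B) == prob(A) * prob(B)
p_a_given_b = prob(A & B) / prob(B)

# A and C cannot happen together, so P(A and C) = 0, not P(A) * P(C).
independent_ac = prob(A & C) == prob(A) * prob(C)

print(independent_ab, p_a_given_b, independent_ac)
```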
284
00:33:13,269 --> 00:33:20,269
So, now, I will discuss the concept
called probability distributions.
285
00:33:21,590 --> 00:33:28,590
So, the issue here is probability
distributions. Before
286
00:33:31,389 --> 00:33:38,389
we move to probability distributions, we
must have a thorough knowledge of probability;
287
00:33:42,590 --> 00:33:49,590
then we can discuss its distributions,
just as we discussed univariate modelling.
288
00:33:50,220 --> 00:33:57,220
That means we like to know the
central structure, the variability structure and
289
00:33:58,049 --> 00:34:03,479
then the shape
of the distribution. In the case
290
00:34:03,479 --> 00:34:10,270
of probability, it is the same. We must
have the theoretical background of probability,
291
00:34:10,270 --> 00:34:17,159
its structure, conditions and theorems; then
we have to see how exactly it is distributed.
292
00:34:17,159 --> 00:34:22,280
So, now, as far as a probability distribution
is concerned, there are two ways it can be
293
00:34:22,280 --> 00:34:29,280
discussed: one under discrete series and another
under continuous series. So, let us
294
00:34:34,929 --> 00:34:41,569
first discuss what a probability
distribution is. Continuous means the variable
295
00:34:41,569 --> 00:34:48,569
takes its values in intervals.
Discrete means the values may be finite or infinite in number, but
296
00:34:48,789 --> 00:34:55,789
they are not in interval form. Now,
to analyze the probability distribution,
297
00:34:57,200 --> 00:35:04,200
we assume that X is a random variable.
That means its
298
00:35:05,099 --> 00:35:12,099
outcome depends completely
on chance; it is obtained through
299
00:35:15,319 --> 00:35:22,319
an experiment only. This is why it is called
a random variable. X is a random variable;
300
00:35:22,470 --> 00:35:29,470
X consists of X 1, X 2 up to X n, and the corresponding
probabilities are P 1, P 2 up to P n.
301
00:35:34,380 --> 00:35:41,380
For instance, what is a distribution all about?
Let us take a case: you toss a coin.
302
00:35:47,059 --> 00:35:54,059
Obviously, there are two possible outcomes,
either head or tail. So, how do we analyze
303
00:35:57,240 --> 00:36:04,160
the situation? Of course, the total outcomes
will be two: head and tail. Now, when we
304
00:36:04,160 --> 00:36:10,109
make an experiment, the situation is
totally uncertain. So, now, we like to know
305
00:36:10,109 --> 00:36:16,569
what the possible outcomes are in that particular
structure or system. Now, the first structure
306
00:36:16,569 --> 00:36:23,569
means there may be 0 heads, or there
may be 0 tails. Now, that is possible when there
307
00:36:25,140 --> 00:36:29,930
are two outcomes at a time, but when there
is a question of only one, the game is
308
00:36:29,930 --> 00:36:35,530
a very simple one. In that case, either
there is a possibility of head or there is a possibility
309
00:36:35,530 --> 00:36:42,000
of tail. But now we have to see how
many tails there are or how many heads there are.
310
00:36:42,000 --> 00:36:47,069
So, now, similarly, X is a random variable
which represents the number of occurrences.
311
00:36:47,069 --> 00:36:52,250
P represents the corresponding probability.
Obviously, the conditions here are:
312
00:36:52,250 --> 00:36:59,250
first, P(X i) is greater than or equal to 0 and
less than or equal to 1; and second, the summation
313
00:37:03,210 --> 00:37:10,210
of P(X i), i equal to 1 to n, is exactly equal to 1.
So, let us take a case here. We like to formulate
314
00:37:12,500 --> 00:37:18,859
a probability distribution.
The setup
315
00:37:18,859 --> 00:37:25,859
is the same kind of problem. Let us take
the case of a family; the family size is three
316
00:37:29,170 --> 00:37:36,170
and it consists of boys and girls.
So, how is the shape of the probability
317
00:37:38,990 --> 00:37:44,740
distribution? A boy's chance of occurrence is 1
by 2 and a girl's chance of occurrence is 1 by
318
00:37:44,740 --> 00:37:50,450
2. So, let us assume that the probability
P is equal to 1 by 2 for boys and
319
00:37:50,450 --> 00:37:57,119
the probability for girls is also 1 by
2, just like head and tail. So, the possible
320
00:37:57,119 --> 00:38:03,829
chances are 1 by 2 and 1 by 2. If it is
head, then obviously it is 1 by
321
00:38:03,829 --> 00:38:08,260
2, and if it is tail, it is also
1 by 2. So, now, in this particular setup,
322
00:38:08,260 --> 00:38:15,260
the situation is almost the same.
Here, there are two alternatives only, either
323
00:38:15,730 --> 00:38:21,230
boy or girl. Now, how do we formulate
the situation?
324
00:38:21,230 --> 00:38:28,230
Now, there are certain cases. First case:
all are boys.
325
00:38:30,130 --> 00:38:37,130
The family size is three, so there are three
children.
326
00:38:37,829 --> 00:38:44,829
So, P(B) into P(B) into P(B). Obviously, the
probability will be 1 by 2 into 1 by 2 into
327
00:38:47,220 --> 00:38:54,220
1 by 2. So, this is 1 by 8. Now, take another
case: all are girls. Then,
328
00:38:57,109 --> 00:39:04,109
the probability of G into the probability
of G into the probability of G, which is equal
329
00:39:04,660 --> 00:39:11,660
to again 1 by 2 into 1 by 2 into 1 by 2, equal to 1
by 8. So, now, similarly, we like to know
330
00:39:15,079 --> 00:39:21,150
the case of two boys
and one girl. In that case, let
331
00:39:21,150 --> 00:39:28,150
us say two are boys and one is a girl. So,
then we have P(BBG) plus P(BGB) plus
332
00:39:39,619 --> 00:39:46,619
P(GBB). In all the cases here, each term, like P(BBG),
is 1 by 2 into 1 by 2 into 1 by 2; so we have 1 by 2 into 1 by 2 into 1 by 2, plus
333
00:39:54,319 --> 00:40:01,319
1 by 2 into 1 by 2 into 1 by 2, plus 1 by 2
into 1 by 2 into 1 by 2, because in the first
334
00:40:03,010 --> 00:40:10,010
case B is 1 by 2, then again B is 1 by 2, and G is also
1 by 2. So, in total, if we solve the
335
00:40:11,359 --> 00:40:18,359
particular problem, then you will get 3 by
8. That was the case of two boys
336
00:40:19,450 --> 00:40:26,450
and one girl. Then,
the next case, two girls and one boy: the result
337
00:40:30,890 --> 00:40:37,890
is similar. That means it will also come
to 3 by 8. Now, we will formulate a probability
338
00:40:38,940 --> 00:40:39,490
distribution.
339
00:40:39,490 --> 00:40:44,670
So, now, for the probability distribution, X is a
random variable which represents, let us
340
00:40:44,670 --> 00:40:51,670
say, the number of boys in the family.
Now, X
341
00:40:54,710 --> 00:41:01,710
can be 0, it can be
1, it can
342
00:41:02,210 --> 00:41:09,210
be 2, it can be 3, because the family
size is three. So, if there are three children,
343
00:41:10,079 --> 00:41:17,079
then there may be 0
boys, one boy, two boys
344
00:41:18,210 --> 00:41:25,210
or three boys. So, there are 4
possible situations. So, now, if there are
345
00:41:25,490 --> 00:41:31,450
four possible situations, what is the
probability of each? Now, the corresponding probability
346
00:41:31,450 --> 00:41:38,450
for 0 is 1 by 8. For 1 it is 3
by 8, for 2 it is 3 by 8, and for 3 it is 1 by 8. Now,
347
00:41:40,119 --> 00:41:47,119
the complete setup is called a probability
distribution, provided it satisfies
348
00:41:48,020 --> 00:41:54,730
the conditions. So, what are those conditions?
In every case, the probability is greater than 0,
349
00:41:54,730 --> 00:42:01,309
and the
sum of P i should be exactly equal to
350
00:42:01,309 --> 00:42:08,309
1. That means 1 by 8 plus 3 by 8 plus
3 by 8 plus 1 by 8 should be exactly equal to 1.
351
00:42:12,410 --> 00:42:19,410
Indeed, it is nothing
but 8 by 8, which is exactly equal to 1.
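The family-of-three distribution worked out above can be reproduced by listing every possible family. This is a minimal Python sketch, not part of the lecture; it assumes, as the lecture does, that each child is a boy or a girl with probability 1 by 2 independently, and the variable names are chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

p_boy = Fraction(1, 2)    # probability of a boy, as assumed in the lecture
p_girl = 1 - p_boy        # probability of a girl
family_size = 3

# X = number of boys. Add up the probability of every ordered family
# (BBB, BBG, BGB, ...) that gives the same value of X.
distribution = Counter()
for family in product("BG", repeat=family_size):
    n_boys = family.count("B")
    distribution[n_boys] += p_boy ** n_boys * p_girl ** (family_size - n_boys)

for n_boys in sorted(distribution):
    print(n_boys, distribution[n_boys])

# Conditions for a probability distribution: each P(X i) lies in [0, 1]
# and the probabilities sum to exactly 1.
total = sum(distribution.values())
all_in_range = all(0 <= p <= 1 for p in distribution.values())
print(total, all_in_range)
```

Enumerating the eight ordered families makes it visible why the cases of one boy and two boys each collect three terms of 1 by 8.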
352
00:42:25,079 --> 00:42:32,079
So, now, there are many ways you can explain it,
and many examples you can cite to explain
353
00:42:32,589 --> 00:42:39,589
this probability issue and also the probability
distributions. Now, the interesting aspect
354
00:42:40,920 --> 00:42:46,710
of probability distribution is that, it can
be integrated with the bivariate framework.
355
00:42:46,710 --> 00:42:53,000
For instance, we have already discussed, we
already know what is the concept of univariate
356
00:42:53,000 --> 00:42:59,510
and what is the concept of bivariate. Now,
let us look at how
357
00:42:59,510 --> 00:43:05,500
probability can be applied or integrated
into the univariate structure and
358
00:43:05,500 --> 00:43:11,410
into the bivariate structure. What we have discussed
till now is more or less the univariate structure
359
00:43:11,410 --> 00:43:16,940
of probability distributions. So, that means
every time we are discussing a single variable
360
00:43:16,940 --> 00:43:23,609
or single event, then we are finding out the
various possible outcomes or various possible
361
00:43:23,609 --> 00:43:29,099
scenarios.
So, now, if there are two different variables,
362
00:43:29,099 --> 00:43:36,099
for instance take two random variables, say
X and Y, with their corresponding probabilities,
363
00:43:36,660 --> 00:43:43,660
then we ask how X and Y are related to each other,
just like the correlation structure
364
00:43:45,279 --> 00:43:50,839
and covariance structure. We can also set out
here how the problem is set up.
365
00:43:50,839 --> 00:43:57,559
So, now, take a case of two variables. X is
one variable and Y is another variable.
366
00:43:57,559 --> 00:44:04,559
X contains X 1, X 2 up to X n, and the corresponding
probabilities are P 1, P 2 up to P n. In
367
00:44:07,720 --> 00:44:14,720
the case of Y, the corresponding values
are Y 1, Y 2 up to Y n, and the corresponding probabilities
368
00:44:14,960 --> 00:44:21,960
are P 1, P 2 up to P n. So, now, have a look
here. When we look at the simple structure,
369
00:44:28,309 --> 00:44:35,309
we have a series X 1 up to X n, then another
series Y 1, Y 2 up to Y n. So, what is our
370
00:44:40,460 --> 00:44:47,460
agenda here? We like to know: what is X bar?
What is the variance of X? What is the standard deviation
371
00:44:47,750 --> 00:44:54,750
of X? What is the skewness of X, and so on.
Similarly, there is Y bar, the variance of Y,
372
00:44:57,190 --> 00:45:04,190
sigma of Y and the skewness of Y. This is
the complete structure of the univariate
373
00:45:06,339 --> 00:45:12,520
setup. Now, when we move to the bivariate
structure, we like to know how they
374
00:45:12,520 --> 00:45:19,029
are related to each other. For that,
we either apply the covariance of X and Y, or we
375
00:45:19,029 --> 00:45:26,029
can apply the correlation of X and Y, or you can
also have it through B X Y or B Y X. So, this
376
00:45:28,670 --> 00:45:35,670
particular structure is not at all attached
to the concept of probability.
377
00:45:36,140 --> 00:45:43,140
Now, the same setup can be
interpreted through probability, because
378
00:45:45,510 --> 00:45:52,510
on many occasions
your variables are random in nature; that means
379
00:45:55,299 --> 00:46:01,279
the outcome of a particular variable depends entirely
upon chance. In that case, probability
380
00:46:01,279 --> 00:46:05,869
has to be applied.
So, now, we like to know, if it is applied
381
00:46:05,869 --> 00:46:12,680
in that particular case, how is the setup?
We like to know how the probability
382
00:46:12,680 --> 00:46:19,680
structure can be applied to this univariate
setup and to the bivariate setup. Now, when the outcomes
383
00:46:22,160 --> 00:46:27,490
are like this, only X equal to X 1, X 2, up to X
n, with no probability, then the situation is
384
00:46:27,490 --> 00:46:32,680
completely certain. That means there is
no question of uncertainty. That means
385
00:46:32,680 --> 00:46:38,619
the chance of occurrence is not an issue at all.
Whatever you are expecting, the same
386
00:46:38,619 --> 00:46:44,460
result is with you. This is the certain
case. But when there is a question of uncertainty,
387
00:46:44,460 --> 00:46:51,460
probability has to be applied.
That means we now have to multiply
388
00:46:51,630 --> 00:46:56,970
the probabilities with the original
values. So, the corresponding terms are P 1 X
389
00:46:56,970 --> 00:47:03,970
1, P 2 X 2, up to P n X n.
So, like the sample mean, we
390
00:47:04,690 --> 00:47:11,650
will also have X bar, which
we will call the expected value of X, because, since
391
00:47:11,650 --> 00:47:18,650
it is a quantitative measurement of uncertainty,
obviously we are expecting something.
392
00:47:22,380 --> 00:47:29,380
Now, expectation has to be applied.
You are expecting something, so what is that
393
00:47:30,829 --> 00:47:34,789
expectation? Is it hundred percent or is it
less than that? If it is hundred percent,
394
00:47:34,789 --> 00:47:39,829
then obviously the probability
is exactly equal to 1. If it is, you can say,
395
00:47:39,829 --> 00:47:44,480
impossible, then its probability
is 0.
396
00:47:44,480 --> 00:47:51,480
So, now, X bar is nothing but E(X), which
is nothing but summation P i X i, i equal to
397
00:47:52,000 --> 00:47:59,000
1 to n, divided by summation P i. Remember
one thing: in the first lecture, under
398
00:48:00,079 --> 00:48:04,920
univariate modelling, we discussed the
univariate statistics, particularly with respect
399
00:48:04,920 --> 00:48:10,190
to mean, median and mode. And under mean, we
had different setups like arithmetic mean,
400
00:48:10,190 --> 00:48:16,819
harmonic mean and weighted average
mean. So, now, that weighted average
401
00:48:16,819 --> 00:48:20,670
is nothing but the probability issue. The
probability is just like a weight
402
00:48:20,670 --> 00:48:27,670
assigned to every item.
Now, with this setup, if you understand
403
00:48:29,279 --> 00:48:35,359
the concept of the weighted average mean, then
obviously there is no confusion about
404
00:48:35,359 --> 00:48:41,490
expectation in probability theory.
So, now, in the case of the weighted arithmetic
405
00:48:41,490 --> 00:48:48,490
mean, we applied summation W i. Now, we
have summation P i. So, this is E(X), the
406
00:48:49,559 --> 00:48:54,240
expected value of X or mean of X under the
probability distribution. Similarly, here
407
00:48:54,240 --> 00:49:00,710
we will get P Y. This
is the Y structure and this is
408
00:49:00,710 --> 00:49:07,710
the P structure, so we will get P Y. P Y
is nothing but P 1 Y 1, P 2 Y 2 up to P n Y
409
00:49:08,390 --> 00:49:12,099
n.
So, corresponding to Y and its
410
00:49:12,099 --> 00:49:19,099
probabilities, we can get the expected value
of Y. The expected value of Y is nothing but summation
411
00:49:19,710 --> 00:49:26,710
P i Y i, i equal to 1 to n. And remember,
E(X) is simply summation P i X
412
00:49:26,910 --> 00:49:33,910
i, because the sum of P i is exactly equal to 1.
Obviously, dividing by summation P i changes nothing, so E(Y)
413
00:49:33,930 --> 00:49:40,930
is simply summation P i Y i. Now,
this complete setup gives what we can call the
414
00:49:43,640 --> 00:49:50,640
mean of X and the mean of Y.
So, now, we like to know the
415
00:49:54,779 --> 00:50:01,779
movement from the univariate structure to the bivariate
structure, first with the simple setup, where the situations
416
00:50:02,990 --> 00:50:09,990
are certain in nature. In the other case,
the situations are completely uncertain in
417
00:50:10,650 --> 00:50:17,049
nature, where the structure is not at all
simple. The structure is designed
418
00:50:17,049 --> 00:50:24,049
with weight vectors or probability vectors, that is,
the chance of occurrence of each particular item.
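The link between the expected value and the weighted average mean described above can be checked directly. This is a small Python sketch under stated assumptions, not part of the lecture: the values 0 to 3 and their probabilities are borrowed from the number-of-boys example, and weighted_mean is a hypothetical helper name.

```python
from fractions import Fraction

# Assumed example: X = number of boys in a family of three.
values = [0, 1, 2, 3]
probs = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]


def weighted_mean(xs, ws):
    """Weighted average mean: summation w_i x_i divided by summation w_i."""
    return sum(w * x for x, w in zip(xs, ws)) / sum(ws)


# With probabilities as the weights, the denominator is 1, so the weighted
# mean is simply summation P i X i, which is the expected value E(X).
expected_x = weighted_mean(values, probs)
plain_sum = sum(p * x for x, p in zip(values, probs))

print(expected_x, plain_sum)
```

Any weights in the same proportions, such as 1, 3, 3, 1, give the same mean, which is why probabilities can be read as weights that have already been scaled to sum to 1.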
419
00:50:24,180 --> 00:50:31,180
So, now, moving to that particular issue.
Now, for X and Y: X has X 1,
420
00:50:34,849 --> 00:50:41,849
X 2 up to X n, with P 1, P 2 up to P n. Then Y has Y 1,
Y 2 up to Y n, with P 1, P 2 up to P n. So, E(X) is
421
00:50:47,510 --> 00:50:54,510
equal to summation P i X i, i equal to 1 to
n. This is nothing but X bar. Similarly,
422
00:50:55,779 --> 00:51:02,779
E(Y) is equal to summation P i Y i, which
is nothing but, you can say, Y bar. So, this
423
00:51:05,230 --> 00:51:12,230
is E(X). Now, we can also get V(X). V(X) represents
the variance of X. The variance of X is nothing
424
00:51:13,680 --> 00:51:20,680
but E(X squared) minus E(X) whole squared. Similarly,
the variance of Y is equal to E(Y squared) minus
425
00:51:27,799 --> 00:51:34,799
E(Y) whole squared. So, you have E(X) and
V(X). That means E(X) is the univariate structure
426
00:51:36,730 --> 00:51:42,609
with respect to the central issue and V(X) is
with respect to the dispersion issue. Now, if
427
00:51:42,609 --> 00:51:48,980
we want the standard deviation, sigma X,
it is nothing but the square root of E(X squared)
428
00:51:48,980 --> 00:51:55,980
minus E(X) whole squared. Similarly, sigma
Y is nothing but the square root of the variance of Y.
429
00:51:57,630 --> 00:52:04,630
So, now, you know E(X), V(X), E(Y) and V(Y).
Now, we like to correlate X with Y. So,
430
00:52:08,589 --> 00:52:15,589
for that we need the covariance
of X and Y. The covariance of X and Y is nothing but
431
00:52:15,680 --> 00:52:22,680
E(XY) minus E(X) into E(Y). So, now, what
is E(XY)? E(XY) is simply represented
432
00:52:27,910 --> 00:52:34,910
as the summation of P i j X i Y j over i and j, where P i j is
the joint probability of X i and Y j. So, this is the
433
00:52:41,910 --> 00:52:48,910
covariance issue, this is the variance issue, and
we need to have the final outcome; that is,
434
00:52:54,890 --> 00:53:01,890
the correlation between X and Y. The correlation
of X and Y is nothing but the covariance of X and Y divided
435
00:53:03,319 --> 00:53:10,319
by sigma X into sigma Y. That means it
is nothing but E(XY) minus E(X) into E(Y), divided
436
00:53:18,440 --> 00:53:25,440
by sigma X into sigma Y. So,
like the sample correlation, this also satisfies the usual limits.
437
00:53:27,589 --> 00:53:32,760
It is usually denoted as rho X Y, and it lies
between minus 1 and 1.
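The formulas above for variance, covariance and correlation can be tried on a small joint distribution. This is only a Python sketch and not part of the lecture: the joint table is an assumption built from the family-of-three example, with X the number of boys and Y equal to 1 when the first child is a boy, and expect is a hypothetical helper name.

```python
from fractions import Fraction
from math import sqrt

eighth = Fraction(1, 8)

# Assumed joint distribution P(X = x, Y = y) for a family of three children:
# X = number of boys, Y = 1 if the first child is a boy, otherwise 0.
joint = {
    (3, 1): 1 * eighth,   # BBB
    (2, 1): 2 * eighth,   # BBG, BGB
    (1, 1): 1 * eighth,   # BGG
    (2, 0): 1 * eighth,   # GBB
    (1, 0): 2 * eighth,   # GBG, GGB
    (0, 0): 1 * eighth,   # GGG
}


def expect(g):
    """E[g(X, Y)] = summation over all pairs of P(x, y) * g(x, y)."""
    return sum(p * g(x, y) for (x, y), p in joint.items())


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - e_x ** 2      # E(X squared) - E(X) squared
var_y = expect(lambda x, y: y * y) - e_y ** 2      # E(Y squared) - E(Y) squared
cov_xy = expect(lambda x, y: x * y) - e_x * e_y    # E(XY) - E(X) E(Y)

# Correlation = covariance / (sigma X * sigma Y); the square root needs floats.
corr_xy = float(cov_xy) / sqrt(float(var_x) * float(var_y))

print(e_x, e_y, var_x, var_y, cov_xy, corr_xy)
```

Note that E(XY) uses the joint probabilities of each pair; multiplying the separate probabilities of X and Y would be valid only if the two variables were independent, in which case the covariance would be 0.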
438
00:53:32,760 --> 00:53:39,760
Now, we have got to know the issue
of probability, its various structures and
439
00:53:43,359 --> 00:53:49,869
setups, how it is useful for real world
business problems and how it is helpful when
440
00:53:49,869 --> 00:53:56,089
the situations are not certain at all; that
means, when it is a question of uncertainty.
441
00:53:56,089 --> 00:54:00,950
So, that means probability has to be applied
when the situations are very much uncertain
442
00:54:00,950 --> 00:54:07,510
in nature. That is why it plays a very
fantastic role in econometric modelling. So,
443
00:54:07,510 --> 00:54:12,519
just as with the original samples we discussed
univariate statistics and bivariate statistics,
444
00:54:12,519 --> 00:54:19,019
that is, covariance and correlation,
when we apply probability to the particular
445
00:54:19,019 --> 00:54:24,660
samples, we can also get to know
the univariate setup and the bivariate
446
00:54:24,660 --> 00:54:29,750
setup.
So, this is, you can say, very helpful
447
00:54:29,750 --> 00:54:36,240
and very useful for further
econometric modelling. Now, in addition
448
00:54:36,240 --> 00:54:41,640
to probability, estimation and hypothesis
testing play a fantastic role, which we will
449
00:54:41,640 --> 00:54:46,750
discuss in detail in the next class. So,
with this, we can conclude today's class.
450
00:54:46,750 --> 00:54:48,249
Thank you very much. Have a nice day.