1
00:00:44,210 --> 00:00:53,850
In the previous class, we had a look at the information
measure in terms of entropy of a source.
2
00:00:53,870 --> 00:01:05,050
Entropy of the source was given as H S equal
to minus summation of P s i log P s i. This
3
00:01:05,149 --> 00:01:17,609
is what we had defined as the entropy of a
zero memory source. Interpretation of entropy
4
00:01:17,610 --> 00:01:32,149
is average information, which I get per symbol
of the source S. We can look at the concept
5
00:01:32,149 --> 00:01:41,749
of entropy in a slightly different manner.
I could say that entropy is also a measure
6
00:01:41,750 --> 00:01:52,210
of uncertainty that gets resolved when that
event takes place. So, when an event e occurs,
7
00:01:53,030 --> 00:02:01,740
I get some information on the occurrence of
that event e. A different way of looking at
8
00:02:01,740 --> 00:02:10,760
the same problem is to say that when I observe
the event e, whatever uncertainty was associated
9
00:02:10,910 --> 00:02:17,140
before my observation, that gets resolved
on the observation of that event e.
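The entropy just recalled can be sketched numerically. A minimal Python sketch under the lecture's definition (the function name `entropy` is my own choice, and I use base-2 logarithms so the units are bits):

```python
import math

def entropy(probs):
    """H(S) = minus the sum of P(s_i) * log2 P(s_i), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair binary source resolves exactly 1 bit of uncertainty per symbol,
# and four equiprobable symbols resolve log2(4) = 2 bits.
print(entropy([0.5, 0.5]))   # 1.0
print(entropy([0.25] * 4))   # 2.0
```

The `if p > 0` guard simply skips impossible symbols, which contribute nothing to the average.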
10
00:02:17,140 --> 00:02:28,480
So, entropy of the source S could also be
interpreted as uncertainty resolved when I
11
00:02:28,480 --> 00:02:36,140
observe a particular symbol being emitted from
the source.
12
00:02:36,850 --> 00:02:53,220
The concept of uncertainty
will be utilized when we are talking of mutual
13
00:02:53,220 --> 00:03:02,060
information during the course of our study.
We also had a look at some properties of entropy
14
00:03:02,510 --> 00:03:09,510
and we came to the conclusion that the entropy of
a source is always less than or equal to log
15
00:03:09,650 --> 00:03:22,620
q, where q is the size of the source alphabet
S. And we also saw that H S is always greater
16
00:03:22,620 --> 00:03:38,100
than or equal to 0; it is equal to 0 if and only
if probability of s i is equal to 1 for some
17
00:03:38,100 --> 00:03:49,740
i belonging to 1 to q. When this condition
is satisfied, then the value of entropy I get
18
00:03:49,740 --> 00:03:56,740
is equal to 0. For any case other than
this, the value of entropy I get is always
19
00:03:57,370 --> 00:04:06,470
greater than 0, but less than or equal to log
q. Associated with entropy of a source, I
20
00:04:06,570 --> 00:04:13,570
can define another quantity that is known
as
21
00:04:27,819 --> 00:04:32,610
redundancy of a source.
The definition of a redundancy of a source
22
00:04:32,610 --> 00:04:34,110
is given as
23
00:04:53,819 --> 00:05:03,059
it is 1 minus H S divided by the maximum entropy.
H S is the actual entropy of that source S, and what is the maximum entropy
24
00:05:03,069 --> 00:05:10,489
which I can get for that source S;
that maximum entropy obviously will be dependent
25
00:05:10,710 --> 00:05:22,740
upon the size of the
source alphabet. For the case of a zero memory
26
00:05:22,740 --> 00:05:33,680
source, this can be written as 1 minus H of
S upon log q because the maximum entropy of
27
00:05:33,689 --> 00:05:45,369
zero memory source is given by log q. Let
us look at the properties of the parameter
28
00:05:45,480 --> 00:05:47,100
redundancy.
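The redundancy defined above, 1 minus H S upon log q for a zero memory source, can be sketched in a few lines of Python (a hedged illustration; the function names and base-2 logs are my choices):

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def redundancy(probs):
    """R = 1 - H(S) / log2(q) for a zero memory source with q symbols."""
    return 1 - entropy(probs) / math.log2(len(probs))

print(redundancy([0.5, 0.5]))  # equiprobable symbols -> 0.0
print(redundancy([1.0, 0.0]))  # one certain symbol   -> 1.0
```

These two prints show the two extreme cases discussed next: equiprobable symbols give zero redundancy, and a certain symbol gives redundancy 1.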
29
00:05:54,000 --> 00:06:06,020
When you have equiprobable symbols, you
have H
30
00:06:07,889 --> 00:06:21,629
S, the actual H S, equal to log q, and this will
imply that the value of redundancy R is equal
31
00:06:21,629 --> 00:06:48,180
to 0. When P s i is equal to 1 for some symbol
s i in the alphabet S, then H S is equal to
32
00:06:48,180 --> 00:07:02,040
0. This implies that R is equal to 1. So,
the value of the redundancy will always
33
00:07:02,689 --> 00:07:09,689
lie between these two values, 0 and 1.
The lower bound is 0 and the upper bound is
34
00:07:10,990 --> 00:07:16,840
1.
Let us take a simple example to get the feel
35
00:07:16,840 --> 00:07:23,840
of this parameter, which we have just defined,
that is redundancy. Let me take a simple case
36
00:07:25,749 --> 00:07:32,749
of a binary source. So, I have a binary source.
Let me assume that the binary source alphabet
37
00:07:43,659 --> 00:07:50,659
is 0 and 1, and the probability of the symbols
is given as one fourth, that is, the probability
38
00:08:01,020 --> 00:08:08,020
for 0 and probability of 1 is given as three
fourth. Now, I can simply calculate the entropy
39
00:08:10,939 --> 00:08:17,939
of this source as H S equal to minus one fourth
log of one fourth minus three fourth log of
40
00:08:24,370 --> 00:08:31,370
three fourth, and this turns out to be
0.81 bit per symbol. In my case, the symbols
41
00:08:40,580 --> 00:08:47,580
are binary digits 0 and 1.
We will call the binary digits binits. So,
42
00:08:48,750 --> 00:08:55,750
we can say that the entropy is 0.81 bit per binit.
For this source, my redundancy would be 1
43
00:09:03,560 --> 00:09:10,560
minus 0.81; log q would be equal to 1, so
the value which I get is 0.19. So, I can say
44
00:09:18,130 --> 00:09:25,130
that roughly there is a redundancy of 19 percent
in the source S. Now, all this time, we have
45
00:09:33,260 --> 00:09:40,260
been looking at a source, which emits symbols
individually. So, what do I mean by that?
46
00:09:45,000 --> 00:09:52,000
If I have a source S, then this source S emits
symbols; each symbol belongs to the source
47
00:09:55,070 --> 00:10:01,360
alphabet and the probability of occurrence
of that particular symbol is also given to
48
00:10:01,360 --> 00:10:08,360
me, but I looked at the emission of the symbols
from the source individually. So, I had s
49
00:10:12,030 --> 00:10:19,030
1, s 2, s i continuously like this and we
found out the average entropy, average information
50
00:10:23,020 --> 00:10:30,020
that is nothing but the entropy of the source
per symbol.
51
00:10:32,850 --> 00:10:39,850
Let us consider this output sequence which
I get; suppose we block it, that is,
52
00:10:50,320 --> 00:10:57,320
we look at the output sequence
53
00:10:58,740 --> 00:11:05,740
in terms of blocks. For the time being, let me
assume that I start looking at the output
54
00:11:06,470 --> 00:11:13,470
sequence in the blocks of three symbols. So,
this would be one block, the second block
55
00:11:19,930 --> 00:11:26,580
will be like this, and it continues like this; I
can look at the output sequence from this
56
00:11:26,580 --> 00:11:33,580
source in terms of blocks.
Now, when I start looking at this output sequence
57
00:11:35,550 --> 00:11:42,550
in terms of blocks, what I could consider is
that I am forming new messages or sub messages
58
00:11:45,970 --> 00:11:52,970
out of this string. These sub messages which
I have are nothing but strings formed
59
00:11:59,060 --> 00:12:06,060
out of symbols, which are being emitted from
this source S. So, in our case, this is block
60
00:12:10,720 --> 00:12:15,380
length of 3. So, I have messages of length
3.
61
00:12:15,380 --> 00:12:22,380
Now, if I start looking at this in terms of messages,
and if I were to ask you what is the
62
00:12:23,970 --> 00:12:30,970
information, the average information which
I get per message from this source; let us
63
00:12:32,690 --> 00:12:39,690
look at this example a little more specifically.
I consider the previous example, for
64
00:12:41,930 --> 00:12:48,770
which we calculated the redundancy. So, I
have a source given by the
65
00:12:48,770 --> 00:12:55,770
source alphabet, which is 0, 1, the probability
of 0 as one fourth and probability of one
66
00:13:00,320 --> 00:13:06,920
as three fourth.
Now, the output of the sequence from this
67
00:13:06,920 --> 00:13:13,920
source S will be looked at in terms of blocks
of length 3. So, in that case, the number
68
00:13:16,660 --> 00:13:23,660
of sub messages which I can form from the
block of length 3 are nothing but 0 0 0, 0
69
00:13:27,310 --> 00:13:34,310
0 1, and so on up to 1 1 0, and finally 1 1 1. So, these
are the number of messages, which I can form
70
00:13:45,270 --> 00:13:52,270
from the source S if I start looking at the sequence
of the output in terms of blocks of three.
71
00:13:53,370 --> 00:14:00,370
How do I find out the average
information per message for all these eight
72
00:14:04,370 --> 00:14:07,350
messages? It is very simple.
73
00:14:07,350 --> 00:14:14,350
What we can do is this:
these are the different messages which I
have; I can find out what
74
00:14:23,120 --> 00:14:30,120
is the probability of occurrence of each of
these sub messages. Now, if I assume that my
75
00:14:31,630 --> 00:14:38,630
symbols are independent, then probability
of getting 0 0 0 is nothing but one fourth,
76
00:14:39,240 --> 00:14:45,330
multiplied by one fourth into one fourth, and what
I get is 1 by 64. Similarly, I can find
77
00:14:45,330 --> 00:14:52,330
out the probabilities of occurrence of these
messages, which I call v j, where j ranges from
78
00:14:53,950 --> 00:15:00,950
1 to 8.
Now, going by the definition of the entropy,
79
00:15:01,510 --> 00:15:08,510
I can define the entropy or the average information
which I get from the source per message would
80
00:15:09,060 --> 00:15:16,060
be nothing but given by this simple formula,
which we had seen earlier too. If you calculate,
81
00:15:16,370 --> 00:15:23,370
just plug in these values out here into this
formula, the value I will get is about 2.43 bits
82
00:15:24,720 --> 00:15:31,720
per message, and we had just seen that the
entropy of the binary source, when I look
83
00:15:33,790 --> 00:15:40,790
at the sequence in terms of symbol being emitted
individually, then I get 0.81 bit per symbol.
84
00:15:44,530 --> 00:15:51,530
So, the relationship between H V and H S turns
out to be H V is equal to 3 times H S.
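The block-of-3 calculation above can be checked directly; a short sketch (assuming independent symbols, as the lecture does):

```python
import math
from itertools import product

p = {'0': 0.25, '1': 0.75}

# The 8 sub messages 000, 001, ..., 111 and their probabilities P(v_j).
msgs = {''.join(m): math.prod(p[s] for s in m) for m in product('01', repeat=3)}
print(msgs['000'])  # 1/64 = 0.015625

H_V = -sum(q * math.log2(q) for q in msgs.values())  # bits per message
H_S = -sum(q * math.log2(q) for q in p.values())     # bits per binit
print(round(H_V / H_S, 6))  # 3.0 -- H(V) = 3 * H(S)
```

The ratio comes out to exactly 3, confirming the relationship H V equal to 3 times H S stated above.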
85
00:15:56,220 --> 00:16:03,220
This is a simple example, which I took to show
the relationship between a new source that
86
00:16:06,510 --> 00:16:13,510
is V and the old source S, when I start looking
at the output sequence
87
00:16:15,840 --> 00:16:22,840
from the source S in terms
of symbols in block lengths of 3. Instead of
88
00:16:28,760 --> 00:16:35,760
looking at block lengths of 3, suppose I start
looking at block lengths of n. Then what is
89
00:16:42,400 --> 00:16:49,400
the relationship which I will get between
the new source V and my old source? It is
90
00:16:56,900 --> 00:17:03,280
not very difficult to prove, and we will do it
very shortly; what it will turn out to
91
00:17:03,280 --> 00:17:10,280
be is nothing but n times H S. This is valid
only when my original source S is a
92
00:17:17,209 --> 00:17:24,209
zero memory source.
What is the advantage of looking at the source
93
00:17:26,870 --> 00:17:33,870
in this form? When we do coding, we will see
that when we start looking at the original
94
00:17:36,749 --> 00:17:43,749
source in terms of blocks of n symbols, then
it is possible for me to carry out the coding
95
00:17:48,029 --> 00:17:55,019
which is more efficient than when I had not looked
at the source in this form. So, with this
96
00:17:55,019 --> 00:18:02,019
motivation, we will go ahead and try to define
this new source generated from the primary
97
00:18:06,399 --> 00:18:13,399
source in terms of symbols of length n.
98
00:18:20,330 --> 00:18:27,330
Let me formally define this. Let me assume
that I have a source S, which is a zero memory
99
00:18:30,019 --> 00:18:37,019
information source. This zero memory information
source will have its source alphabet. I give
100
00:18:37,850 --> 00:18:44,850
the source alphabet as s 1, s 2 and s q. In
the earlier case, which we saw the example
101
00:18:45,600 --> 00:18:52,600
s 1, s 2, s q, we just had 0 and 1. There
were only two letters in that alphabet and
102
00:18:54,909 --> 00:19:01,169
with each of these symbols in the alphabet
or letters in the alphabet, I have the probabilities
103
00:19:01,169 --> 00:19:08,169
of s i given and let me assume that probability
of s i is equal to P i.
104
00:19:08,980 --> 00:19:15,980
Then, the n th extension of S, which I am
going to denote by S n, is again a zero memory
105
00:19:26,080 --> 00:19:33,080
source with q raised to n symbols. So, the
new alphabet which I generate for the nth
106
00:19:40,259 --> 00:19:47,259
extension of the source S, that is S n, will
consist of q raised to n symbols. I denote those
107
00:19:48,970 --> 00:19:55,970
q raised to n symbols as sigma 1, sigma 2 up to sigma
q raised to n. Each of these symbols out here
108
00:20:00,340 --> 00:20:07,340
in the new source is nothing but a string
of symbols, which come from my old source
109
00:20:10,009 --> 00:20:17,009
or primary source S, and the length of sigma
1 is n symbols of S.
110
00:20:20,490 --> 00:20:27,490
Similarly, sigma 2 would be another symbol
of the n th extension, which I generate by
111
00:20:28,649 --> 00:20:35,649
having a string of symbols from my primary
source S. So, I know basically what is my
112
00:20:41,850 --> 00:20:48,850
source alphabet for the nth extension of the
source S. We have seen in the earlier class
113
00:20:49,019 --> 00:20:56,019
that if I want to define my source along with
the alphabet, I require the probability of
114
00:20:56,929 --> 00:21:02,279
symbols.
So, let me assume that probability of the
115
00:21:02,279 --> 00:21:09,279
symbols in the new source, that is S n, are
given by probability of sigma 1, probability
116
00:21:10,259 --> 00:21:17,259
of sigma 2, up to probability of sigma q raised to n, and any
one of these sigma i is related to the probabilities
117
00:21:22,279 --> 00:21:29,279
of symbols in the original source S. That
is not very difficult to show. Now, the question
118
00:21:30,460 --> 00:21:37,460
is I have my entropy of the new source, I
have the entropy of the old source, how are
119
00:21:40,730 --> 00:21:46,539
these two entropies related? We already know
the answer. What we expect is it should be
120
00:21:46,539 --> 00:21:53,539
n times H S. Let us see whether we can prove
this formally.
121
00:21:57,879 --> 00:22:04,879
So, the entropy of my n th extension S n, which
is given by H of S n, is nothing but this formula.
122
00:22:17,690 --> 00:22:24,690
Now, we can simplify this formula as
123
00:22:49,879 --> 00:22:56,879
I write probabilities of sigma i’s as nothing
but the product P i 1, P i 2 up to P i n. This
124
00:22:58,249 --> 00:23:05,249
here when I am writing this, I am assuming
that the sequence is such that the symbols
125
00:23:05,889 --> 00:23:12,889
in this sequence are independent. Note that this
summation is over the source
126
00:23:15,379 --> 00:23:22,379
alphabet S n. Now, this I can simplify as
P of sigma i log of P of sigma i, which I can break up into n
127
00:23:37,809 --> 00:23:44,809
summations. So, finally, the last will be this.
Now, let us look at one of these terms; let us
128
00:24:08,470 --> 00:24:15,470
see whether we can simplify this term.
129
00:24:15,730 --> 00:24:22,730
So, I take the first term in that summation,
which is this. I again break up probabilities
130
00:24:42,090 --> 00:24:49,090
of sigma i in terms of my probabilities of
original symbols. Now, this summation will
131
00:25:00,450 --> 00:25:07,450
be done over the alphabet S n. Now, this summation
itself can be broken up into n summations
132
00:25:13,929 --> 00:25:20,929
as follows, into the multiplications, and finally,
you have this; and obviously, because the summations
133
00:26:00,830 --> 00:26:07,830
here are all 1, this is nothing but minus the summation,
i 1 equal to 1 to q, of P i 1 log P i 1, and this is
134
00:26:14,769 --> 00:26:20,610
by definition the entropy of my primary source
or the original source S.
135
00:26:20,610 --> 00:26:27,610
Now, my final expression for the entropy of the
nth extension of the source will be as follows. I have shown
136
00:26:33,409 --> 00:26:40,409
that this is the entropy I get for the first
term here. So, similarly, I can show that
137
00:26:40,909 --> 00:26:47,909
this is H S, this is H S and I have n number
of terms. So, finally, I get this value to
138
00:26:48,289 --> 00:26:55,289
be equal to n times H of S. This we had seen
with an example where I had n equal to 3 and
139
00:27:03,309 --> 00:27:10,309
we verified the same thing. As I have said
the motivation for studying the nth extension
140
00:27:11,429 --> 00:27:18,429
of a source will be apparent when we are trying to
code a zero memory source, where we want to design
141
00:27:20,690 --> 00:27:27,690
efficient codes. We will have a look at this
little later in our course.
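The result just proved, H of S n equal to n times H of S, can also be verified numerically for small n; a sketch (enumerating all q raised to n strings grows exponentially, so only small n is shown):

```python
import math
from itertools import product

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def extension_entropy(probs, n):
    """Entropy of the nth extension: each length-n string of independent
    symbols gets the product of its symbol probabilities."""
    return entropy([math.prod(c) for c in product(probs, repeat=n)])

src = [0.25, 0.75]
for n in (1, 2, 3, 4):
    assert abs(extension_entropy(src, n) - n * entropy(src)) < 1e-9
print("H(S^n) = n H(S) holds for n = 1..4")
```

The independence assumption is essential here: the product form of the string probabilities is exactly what made the proof above split into n identical summations.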
142
00:27:33,129 --> 00:27:40,129
In the previous class, we had calculated the
entropy of a TV image, and the entropy of that
143
00:27:44,249 --> 00:27:51,249
TV image was calculated to be roughly around
1.4 into 10 raised to 6 bits. At that stage,
144
00:28:06,480 --> 00:28:12,529
I had pointed out that the calculation of
the entropy, which we have done for the TV
145
00:28:12,529 --> 00:28:19,529
image is not really exact. In a practical
situation, you will find that the entropy
146
00:28:19,999 --> 00:28:26,999
of a TV image is much less than this quantity.
The reason is that when we calculated this
147
00:28:31,809 --> 00:28:38,809
value, we assumed that each pixel or pel in
the TV image was independent. In a practical
148
00:28:43,460 --> 00:28:50,460
situation, this is really not the case. This
is one example of a source, where you have
149
00:28:54,779 --> 00:29:01,779
the symbols, or the pels to be very specific
in this case, not independent;
150
00:29:02,100 --> 00:29:09,100
they are related to each other and because
of the interrelationships between these pels,
151
00:29:09,350 --> 00:29:16,350
when we calculate the entropy of a real TV
image, we will find that the real
152
00:29:16,399 --> 00:29:22,139
value turns out to be much less than what
we had calculated based on the assumption
153
00:29:22,139 --> 00:29:29,139
that it is a zero memory source.
Another example is if you look at English
154
00:29:30,690 --> 00:29:37,690
text, you will find that the occurrence of
the characters in the English text is not
155
00:29:42,200 --> 00:29:49,200
independent. For example, p followed by q,
these combinations will be much less compared
156
00:30:00,480 --> 00:30:07,480
to p followed by i. So, if you look at the
text string, and if you try to calculate the
157
00:30:10,919 --> 00:30:17,919
information based on the assumption that each
of the characters is independent, and also calculate
158
00:30:18,539 --> 00:30:25,539
the entropy or the average information, which
I will get from that same text string based
159
00:30:25,999 --> 00:30:32,999
on the fact that there is a relationship between
the characters, then you will find that the
160
00:30:33,539 --> 00:30:39,739
entropy calculated in the latter case will be
much less than the entropy calculated in the
161
00:30:39,739 --> 00:30:46,739
earlier case.
So, let us look at those sources where there
162
00:30:47,460 --> 00:30:54,460
is a dependency of symbols in the sequence
of strings coming out from the source S. Let
163
00:30:58,919 --> 00:31:05,919
us try to look at that more formally. So,
if you have a source, let us say S, which emits
164
00:31:14,470 --> 00:31:21,470
s 1, s 2, s i, and so on; it emits symbols continuously.
Now, so far we have assumed that all these
165
00:31:31,359 --> 00:31:38,359
symbols are independent. What this means is that
the probability of occurrence of a particular
166
00:31:38,539 --> 00:31:45,539
symbol at this instant is not dependent on
the occurrence of the previous symbols, but
167
00:31:48,419 --> 00:31:54,340
in a practical situation, what will happen is
that the probability of occurrence of the
168
00:31:54,340 --> 00:32:01,139
symbol s i at a particular instant i will
be dependent upon the preceding symbols.
169
00:32:01,139 --> 00:32:08,139
So, let us take a simple case where I find
that the probability of occurrence of a particular
170
00:32:08,350 --> 00:32:15,350
symbol at this instant say s i is dependent
upon the occurrence of the preceding symbols.
171
00:32:23,659 --> 00:32:30,659
So, let me assume that it is dependent upon
the previous symbols. In this case, I assume
172
00:32:39,529 --> 00:32:46,529
that the occurrence of s i is dependent upon the previous
m symbols, where s j m is just earlier to s i and s j
173
00:32:55,830 --> 00:33:02,830
1 is farthest away from s i.
So, in this case, when you have this kind
174
00:33:04,479 --> 00:33:11,479
of dependencies, then this is known as a Markov
source
and for this specific example, since the dependency
175
00:33:24,200 --> 00:33:31,200
is over m preceding symbols, I will say that
this Markov source is of m th order. So, if you
176
00:33:39,869 --> 00:33:46,869
have a Markov source of first order, it means
the probability of occurrence of a symbol
177
00:33:47,820 --> 00:33:54,659
is dependent upon the preceding symbol. If
I have a Markov source of order two, then
178
00:33:54,659 --> 00:34:01,659
it is dependent upon the preceding two symbols.
Now, if you want to identify such sources
179
00:34:06,539 --> 00:34:13,539
Markov sources, then what is required to specify
is this: you should again specify, what is the source
180
00:34:15,889 --> 00:34:22,889
alphabet?
So, Markov source will be identified by the
181
00:34:23,329 --> 00:34:30,329
source alphabet. Let me assume that in this case
also, the source alphabet consists of a few
182
00:34:33,460 --> 00:34:40,460
symbols or a few letters, and since the occurrence
of a particular symbol in the sequence is
183
00:34:42,339 --> 00:34:49,339
dependent upon m preceding symbols, then just
the probability of occurrence of the symbol
184
00:34:53,399 --> 00:35:00,399
is not enough for me to characterize this
source. To characterize this source, what
185
00:35:00,660 --> 00:35:07,660
I need is this kind of probability, and these
are known as conditional probabilities. So,
186
00:35:15,460 --> 00:35:22,460
along with the source alphabet, I should provide
conditional probabilities.
187
00:35:22,890 --> 00:35:29,890
Now, at any particular instant of time, this
symbol can take any of the values from this
188
00:35:35,390 --> 00:35:42,390
source alphabet. So, it can take q values.
Now, the emission of these values is not independent.
189
00:35:48,089 --> 00:35:55,089
It is dependent upon the m preceding
symbols. Now, each of these m preceding symbols
190
00:35:57,190 --> 00:36:04,190
can take the values from this source alphabet.
Therefore, the number
of possibilities of these m preceding symbols
191
00:36:18,539 --> 00:36:25,539
will be q raised to m, and each of these possibilities
is known as a state. Once I know the state,
192
00:36:35,510 --> 00:36:42,510
with each of the states, I have to specify
q conditional probabilities, one for
193
00:36:43,099 --> 00:36:50,099
each letter of
the alphabet which I have.
194
00:36:57,730 --> 00:37:04,730
A Markov source S which is identified now
by
source alphabet and conditional probabilities
195
00:37:19,059 --> 00:37:26,059
since there are q raised to m states and with
each state, you have q transition probabilities,
196
00:37:31,910 --> 00:37:38,910
therefore you will have in total q raised to m
plus 1 transition probabilities. Therefore,
197
00:37:57,039 --> 00:38:04,039
to specify a Markov source of m th order,
in this case, I will require this alphabet.
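The counting just described is easy to sanity-check; a trivial sketch (the values q = 2 and m = 2 are an illustrative choice of mine):

```python
q, m = 2, 2                 # binary alphabet, second order Markov source
states = q ** m             # q^m possible m-symbol histories (states)
per_state = q               # q conditional probabilities per state
total = q ** (m + 1)        # q^(m+1) conditional probabilities in total
print(states, per_state, total)  # 4 2 8
assert states * per_state == total
```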
198
00:38:05,420 --> 00:38:12,420
I will require q raised to m plus 1 transition
probabilities. How do you depict a Markov
199
00:38:15,630 --> 00:38:22,630
source? Is it possible to present or represent
this Markov source in a form which describes
200
00:38:28,720 --> 00:38:33,299
the source completely?
One way to do that is with the help of what
201
00:38:33,299 --> 00:38:40,299
is known as a state diagram. The state diagram
basically is used to characterize this Markov
202
00:38:48,920 --> 00:38:55,920
source. Another way of depicting the Markov
source is with the use of what
203
00:38:57,099 --> 00:39:04,099
is known as a trellis diagram. The difference
between trellis diagram and state diagram
204
00:39:08,940 --> 00:39:15,940
is that in a trellis diagram, the state diagram
is augmented with time. So, with trellis diagram,
205
00:39:18,640 --> 00:39:25,640
you have the state diagram plus time. The trellis
diagram tells me basically at any particular
206
00:39:35,530 --> 00:39:42,530
instant what is the state of the source; that
is not very clear just from the state diagram.
207
00:39:43,500 --> 00:39:49,099
So, I would say that the state diagram is a more
concise form of representation, whereas the trellis
208
00:39:49,099 --> 00:39:56,099
diagram is a more elaborate form of representing
a Markov source. Let us take a simple example
209
00:39:58,940 --> 00:40:05,940
to understand this. If I have a Markov source
of second order and let me assume that I have
210
00:40:20,130 --> 00:40:27,130
my source as again given by this source alphabet
where the binary symbols are there, then if
211
00:40:30,920 --> 00:40:37,920
I were to represent this source in terms of
a state diagram, then the way to do it is
212
00:40:40,029 --> 00:40:47,029
since q in this case is equal to 2, m is equal
to 2, the number of states which I have is
213
00:40:50,599 --> 00:40:57,599
2 raised to 2, and that is equal to 4. You represent
these states by dots.
214
00:40:59,799 --> 00:41:06,799
So, in our case, I will have four states.
I represent them by these four dots, and these
215
00:41:09,000 --> 00:41:16,000
four states can be identified as 0 0, 0 1,
1 0, 1 1. I will require the conditional probabilities
216
00:41:25,099 --> 00:41:34,819
for this source S. Since we have q equal
to 2 and m equal to 2, we get q raised to m
217
00:41:34,819 --> 00:41:43,819
plus 1, that is equal to 8 in our case. So,
I should specify eight conditional probabilities
218
00:41:44,960 --> 00:41:51,960
and these eight conditional probabilities
are depicted in this state diagram by arrows.
219
00:41:52,450 --> 00:41:59,450
For example, there could be arrows running
from one state to another state like this.
220
00:42:12,120 --> 00:42:19,740
So, the arrows basically indicate the
conditional probabilities.
221
00:42:20,230 --> 00:42:36,000
Now, to be very specific, let us take an example.
Let me assume that probability of 0 given
222
00:42:36,000 --> 00:42:48,740
0 0 is equal to probability of 1 given 1 1
is equal to 0.8, probability of 1 given 0
223
00:42:48,740 --> 00:42:59,280
0 is equal to probability of 0 given 1 1 is
equal to 0.2. And probability of 0 given 0
224
00:42:59,799 --> 00:43:06,799
1 is equal to probability of 0 given 1 0.
225
00:43:09,859 --> 00:43:27,939
Probability of 1 given 0 1 is equal to 0.5. If I
were to depict this in a state diagram form,
226
00:43:27,940 --> 00:43:36,120
then since there are four states, I can indicate
these four states by simple dots as here. These
227
00:43:36,589 --> 00:43:43,589
are nothing but 0 0, 0 1, 1 1, 1 0, and
these are the arrows. This will
228
00:44:10,859 --> 00:44:20,099
be 0.8 because the probability of 0 given
0 0 is 0.8 and when 0 occurs, it will again
229
00:44:20,099 --> 00:44:26,990
go into the state 0 0.
So, this is what it means. Then I have this
230
00:44:26,990 --> 00:44:33,990
when it is in state 0 0, when 1 occurs, it
will move over to state 0 1. So, this is the
231
00:44:34,089 --> 00:44:41,089
arrow that indicates the move from 0 0 to 0 1,
and then from this, I have 0.5, I have 0.2
232
00:44:46,400 --> 00:44:53,400
and 0.5 and probability of 1 occurring given
1 1 is 0.8. Now, same thing can be indicated
233
00:45:01,539 --> 00:45:07,589
with the help of a trellis diagram. What you
have to do is basically at any instant of
234
00:45:07,589 --> 00:45:14,589
time, you draw four states. Let us indicate
the four states as s 0, s 1, s 2, s 3: s 0 corresponding
235
00:45:18,140 --> 00:45:25,140
to 0 0, s 1 corresponding to 0 1, s 2 corresponding
to 1 1, s 3 corresponding to 1 0.
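The state diagram and trellis just described can be captured as a transition table and walked symbol by symbol. A Python sketch (the dictionary layout and the `step` helper are my own arrangement; the 0.8, 0.2 and 0.5 values are the conditional probabilities given above):

```python
import random

# State = the last two emitted binits; entries give P(next binit | state).
P = {
    '00': {'0': 0.8, '1': 0.2},
    '01': {'0': 0.5, '1': 0.5},
    '10': {'0': 0.5, '1': 0.5},
    '11': {'0': 0.2, '1': 0.8},
}

def step(state, rng):
    """Emit one binit and slide the two-binit state window,
    i.e. one stage of the trellis."""
    binit = rng.choices(['0', '1'], weights=[P[state]['0'], P[state]['1']])[0]
    return binit, state[1] + binit

rng = random.Random(1)
state, emitted = '00', []
for _ in range(12):
    binit, state = step(state, rng)
    emitted.append(binit)
print(''.join(emitted))  # one sample path through the trellis
```

Each call to `step` corresponds to following one arrow of the state diagram, or equivalently one branch of the trellis at that time instant.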
236
00:45:26,329 --> 00:45:32,430
Now, you look basically at a next instant
of time, you can have again four states. So,
237
00:45:32,430 --> 00:45:39,430
s 0 can go from s 0 to s 0. So, you can have
one arrow going from s 0 to s 0 or s 0 can
238
00:45:43,289 --> 00:45:50,289
go to s 1. So, I have like this. These are
two branches, which take off from s 0. Similarly,
239
00:45:54,180 --> 00:46:01,180
if you look at s 1, this is my s 1; s 1 can
go from s 1 to s 2. So, I have s 1 going from
240
00:46:08,470 --> 00:46:15,470
s 1 to s 2. This is my s 2 state; this is
my s 3 state; this is my s 0 state. There should
241
00:46:20,309 --> 00:46:27,309
also be a link between this and this is again
0.5, 0.5. So, I have state from s 1 to s 2
242
00:46:33,750 --> 00:46:40,750
or it can be from s 1 to s 3. Then,
for s 2, I have from s 2 to s 3,
or from s 2 to s 2 itself. Finally,
243
00:46:56,369 --> 00:47:03,369
for s 3, I can go from s 3 to s 0. I write
it down like this and s 3 can go to s 1. So,
244
00:47:13,859 --> 00:47:18,099
this is another time instance.
Similarly, for each time instance, I can keep
245
00:47:18,099 --> 00:47:25,099
on connecting like this. So, you can specify
the exact path which the source follows using
246
00:47:27,720 --> 00:47:33,549
this trellis diagram. So, with the help of
a trellis diagram, it is possible to find
247
00:47:33,549 --> 00:47:40,549
the exact sequence of the symbols being emitted
with reference to time. This is another form
248
00:47:41,079 --> 00:47:48,079
of representation for the source S. Now, there
are important properties of this source S.
249
00:47:53,829 --> 00:48:00,460
To understand those important properties,
let me take another simple example.
250
00:48:00,460 --> 00:48:07,460
Suppose, I have a source S, which is given
by the same source alphabet 0, 1, but conditional
251
00:48:14,970 --> 00:48:21,970
probabilities are given like this, a small
difference from the earlier example which
252
00:48:44,859 --> 00:48:51,859
we just saw. Now, again, this source
is a second order source; if I were to depict
253
00:49:00,500 --> 00:49:07,500
the source in terms of a state diagram, then
what I would get is
something like this. Now, there is something
254
00:49:17,569 --> 00:49:24,569
very interesting about this source. What this
state diagram shows is that there is a probability
255
00:49:31,859 --> 00:49:38,859
that you will always keep on getting 1s or
you will always keep on getting 0s. Actually,
256
00:49:38,900 --> 00:49:45,900
this is not complete. I should have something
like this.
257
00:49:47,579 --> 00:49:54,579
So, initially I start the source at a particular
state. Let me assume that the source starts
258
00:49:56,740 --> 00:50:03,740
at any one of the states 0 0, 0 1, 1 1, and
1 0, and the probabilities of these happening
259
00:50:03,900 --> 00:50:10,900
are equal, so one fourth, one fourth, one
fourth, one fourth. Once it is in one state, in the
260
00:50:12,250 --> 00:50:19,250
long run, you will find that this source either
emits just all 1s or emits all 0s. So, what
261
00:50:25,799 --> 00:50:31,150
is the difference between this source and
the earlier source which we saw? We find that
262
00:50:31,150 --> 00:50:38,150
in this source, once I am in this state or this
state, it is not possible for me to come out
263
00:50:39,279 --> 00:50:45,970
of the states, whereas that was not true in
the previous case.
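This trapping behaviour can be made concrete. The lecture does not give the exact numbers for this second example, so the probabilities below are a hypothetical choice of mine that makes the states 0 0 and 1 1 absorbing; the `reachable` helper is also my own:

```python
# Hypothetical transition table: once in 00 or 11, the source never leaves.
P = {
    '00': {'0': 1.0, '1': 0.0},
    '01': {'0': 0.5, '1': 0.5},
    '10': {'0': 0.5, '1': 0.5},
    '11': {'0': 0.0, '1': 1.0},
}

def reachable(start):
    """States reachable from `start` via arrows of nonzero probability."""
    seen, stack = set(), [start]
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(s[1] + b for b, pr in P[s].items() if pr > 0)
    return seen

print(sorted(reachable('00')))  # ['00'] -- trapped, emitting all 0s
print(sorted(reachable('01')))  # from 01, every state can still be reached
```

That the set reachable from 0 0 is only 0 0 itself is exactly what makes this source non ergodic, as discussed next.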
264
00:50:45,970 --> 00:50:52,970
What is the difference between these? Technically,
we would say that this source is non ergodic,
265
00:50:54,380 --> 00:51:01,380
whereas the other is not. So, this is, I would say, the state diagram
of a non ergodic second order Markov source,
266
00:51:01,710 --> 00:51:08,710
whereas this state diagram is for a second order
Markov source, but this is ergodic. Without
267
00:51:13,710 --> 00:51:20,710
going into the mathematical intricacies of
the definition for ergodicity, we can simply
268
00:51:22,079 --> 00:51:29,079
define an ergodic Markov source as a source
which we observe for a long time. There will
269
00:51:33,440 --> 00:51:40,440
be a definite probability of occurrence of
each and every state in that source. In this
270
00:51:41,349 --> 00:51:48,349
example, I had four states. So, I can start
from any state initially.
271
00:51:51,309 --> 00:51:58,309
If I observe this source for a very long time
and calculate the states through which it
272
00:51:58,500 --> 00:52:05,500
is passing, then those transition probabilities,
or the probabilities of the states in the
273
00:52:09,510 --> 00:52:16,510
long term will be definite and it will be
possible for me to go from one state to any
274
00:52:16,910 --> 00:52:21,890
other state. It may not be possible for me
to go directly, but indirectly. For example,
275
00:52:21,890 --> 00:52:27,769
if I want to go from this state s 0 to s 2,
it is not necessary that I will have a direct
276
00:52:27,769 --> 00:52:34,769
link between s 0 and s 2, but I can always go
to a state s 2 via s 1. So, I go to the state
277
00:52:35,690 --> 00:52:42,690
s 1 and then maybe directly to s 2, or it is
possible from s 0 to s 1 and from s 1 to s
278
00:52:43,410 --> 00:52:49,880
3 and s 3 to again s 0, but in the long run,
I will be able to reach from one state to
279
00:52:49,880 --> 00:52:56,880
another state. This is a crude definition
of an ergodic Markov source. To be very specific,
280
00:53:01,460 --> 00:53:08,460
there are different definitions. So, just
let me look into those definitions.
281
00:53:19,680 --> 00:53:26,680
If, at every transition, the matrix of transition
probability is the same, then this
282
00:53:28,289 --> 00:53:35,289
transition probability is known as stationary.
We know that at each state, there will be some
283
00:53:37,390 --> 00:53:44,390
transition probabilities and if these transition
probabilities are stationary, then the Markov
284
00:53:44,740 --> 00:53:51,740
chain is known as a homogeneous Markov chain
or Markov source. If you calculate the probability
285
00:53:57,130 --> 00:54:02,480
of the states, this S i denotes the probability
of the states, not the probability of the
286
00:54:02,480 --> 00:54:09,480
symbols in the source, this basically denotes
the probability of a particular state in a
287
00:54:10,230 --> 00:54:17,230
Markov chain, this probability of state will
be definite. If it does not change with time,
288
00:54:19,039 --> 00:54:26,039
then I will say that the Markov chain, or
the Markov source, is stationary.
289
00:54:26,089 --> 00:54:33,089
As discussed earlier, an ergodic Markov source
or Markov chain means that no matter what
290
00:54:33,980 --> 00:54:40,980
state it finds itself in, from each state
one can eventually reach every other state.
291
00:54:43,390 --> 00:54:50,390
That is a crude definition for ergodicity
and this understanding is more than sufficient
292
00:54:51,930 --> 00:54:58,930
for our course. Now, how do I calculate the
probability of the state? Is it possible for
293
00:54:59,650 --> 00:55:06,650
me to calculate? If I assume that the Markov
source is ergodic, then just with the help
294
00:55:07,079 --> 00:55:14,079
of the conditional symbol probabilities, it is possible
for me to calculate the probabilities of the states.
295
00:55:15,890 --> 00:55:24,490
We will look into the calculation of this
in our next lecture, and we will also look
296
00:55:24,920 --> 00:55:33,040
at the calculation of entropy for the Markov
source.
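As a preview of that calculation, the state probabilities of the ergodic example can be found by iterating the transition probabilities until they settle. A sketch (the power-iteration method and the state-to-state table, derived from the 0.8, 0.2 and 0.5 conditional probabilities, are my own arrangement, not from the lecture):

```python
# State-to-state transition probabilities for the ergodic example:
# from state 'ab', emitting binit c moves the source to state 'bc'.
P = {
    '00': {'00': 0.8, '01': 0.2},
    '01': {'10': 0.5, '11': 0.5},
    '10': {'00': 0.5, '01': 0.5},
    '11': {'10': 0.2, '11': 0.8},
}

pi = {s: 0.25 for s in P}          # start from equiprobable states
for _ in range(200):               # iterate pi <- pi * P until it settles
    nxt = dict.fromkeys(P, 0.0)
    for s, row in P.items():
        for t, pr in row.items():
            nxt[t] += pi[s] * pr
    pi = nxt

for s in sorted(pi):
    print(s, round(pi[s], 3))
```

For this chain the symmetric solution works out to 5/14, about 0.357, for the states 0 0 and 1 1, and 1/7, about 0.143, for 0 1 and 1 0, which the iteration reproduces.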