1
00:00:42,250 --> 00:00:49,250
In the last class, we derived a very important
result in information theory, which states
2
00:00:49,310 --> 00:00:56,310
that the average length of a code can never
be less than the entropy of the source.
3
00:00:56,800 --> 00:01:06,479
So, what we derived was that the average length
always has to be greater than or equal to the entropy
4
00:01:06,479 --> 00:01:18,739
of the source measured in r-ary units, that is, L >= H_r(S). Now,
to appreciate the importance of this result,
5
00:01:19,330 --> 00:01:27,789
let us revisit some of the examples which
we studied earlier in our course. So,
6
00:01:27,789 --> 00:01:35,409
one example which we had
considered was a source S consisting
7
00:01:35,410 --> 00:01:59,650
of 4 symbols s1, s2, s3, s4, each of them
equiprobable. So, we know that the entropy
8
00:02:00,240 --> 00:02:10,940
of the source measured with the base 2 is
equal to 2 bits per symbol.
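(As an aside, not part of the lecture board work: the entropy computation for this equiprobable example can be sketched in Python; the function name is illustrative.)

```python
from math import log2

def entropy_bits(probs):
    # H(S) = sum over i of p_i * log2(1 / p_i), in bits per symbol
    return sum(p * log2(1 / p) for p in probs if p > 0)

# Four equiprobable symbols s1..s4 from the example above.
H = entropy_bits([0.25, 0.25, 0.25, 0.25])  # 2.0 bits per symbol
```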
9
00:02:15,440 --> 00:02:25,239
Now, if I try to design any binary
instantaneous code, then the average length of that
10
00:02:25,239 --> 00:02:37,019
code can never be less than 2 binits per symbol.
And in this case, each of these probabilities
11
00:02:37,310 --> 00:02:50,430
P_i, which is equal to 1/4, is of the form
(1/2)^2. So, what this implies is that
12
00:02:50,540 --> 00:03:02,960
I can design a compact code with 4 code words
each of length 2; one such code is s1 = 00, s2 = 01, s3 = 10, s4 = 11. So, now
13
00:03:16,950 --> 00:03:25,010
the average length of this code is also 2 binits per
symbol. So, there exists no uniquely decodable
14
00:03:25,010 --> 00:03:33,400
code for this source with a smaller average
code word length, that is, smaller than 2
15
00:03:33,400 --> 00:03:47,159
binits per symbol. Now, let us look at one
more example to understand the derivation
16
00:03:47,159 --> 00:03:54,639
of that important result, that is, that L is greater
than or equal to the entropy of the source.
17
00:03:54,640 --> 00:04:05,860
So, if I take this same source s with the
same 4 symbols, but with different probabilities
18
00:04:06,239 --> 00:04:17,439
given as 1/2, 1/4, 1/8, 1/8,
then the entropy for this source will turn
19
00:04:20,470 --> 00:04:36,250
out to be 1 3/4 bits per symbol. Earlier in
our course we had designed an instantaneous
20
00:04:36,250 --> 00:04:51,340
code for this source, and that code was given
as s1 = 0, s2 = 10, s3 = 110, and
21
00:04:51,340 --> 00:05:01,660
s4 = 1110. This code is a uniquely decodable
code; in fact, it is an instantaneous code. Now,
22
00:05:01,669 --> 00:05:08,669
if we calculate the average length for this code, it will
turn out to be equal to 1 7/8 binits
23
00:05:12,660 --> 00:05:19,080
per symbol.
So, even in this case we find the length of
24
00:05:19,080 --> 00:05:26,080
the code greater than the entropy of the source,
but now each P i in this case also is of the
25
00:05:33,030 --> 00:05:40,030
form (1/2)^alpha_i,
where alpha_i is an integer. Therefore,
26
00:05:46,530 --> 00:05:53,530
it is possible to achieve the lower bound
of 1 3/4 binits per symbol, and this is done
27
00:05:59,310 --> 00:06:06,310
by setting L_i equal to 1, 2, 3 and 3 respectively
for the 4 symbols s1, s2, s3 and s4. So, the
28
00:06:14,509 --> 00:06:21,509
code would be s1 = 0, s2 = 10, s3 = 110 and s4 = 111.
The average length for
29
00:06:33,780 --> 00:06:40,780
this code comes out to be 1 3/4 binits
per symbol. So, again in this case we find
30
00:06:48,360 --> 00:06:55,360
that the average length turns out to
be equal to the entropy of the source. And
31
00:06:57,020 --> 00:07:04,020
as a final example to explain the importance
of L = H(S), let us consider a source
32
00:07:08,360 --> 00:07:12,210
s with seven symbols.
33
00:07:12,210 --> 00:07:19,210
So I have a source S consisting of seven symbols
s1, s2, s3, s4, s5, s6, s7, with symbol
34
00:07:26,319 --> 00:07:33,319
probabilities given as 1/3, 1/3,
1/9, 1/9, 1/27, 1/27, 1/27. Now, the entropy
35
00:07:44,830 --> 00:07:51,830
for this source, if we take it in base-3 (ternary) units
and calculate, will turn out to be 13 by
36
00:08:00,729 --> 00:08:07,729
9 ternary units per symbol. And again we find
that the symbol probabilities are of the form
37
00:08:18,729 --> 00:08:25,729
(1/3)^alpha_i, where alpha_i is an
integer.
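(A quick numerical check of this value: a Python sketch, not from the lecture; `entropy` is an illustrative helper.)

```python
from math import log

def entropy(probs, r):
    # H_r(S) = sum over i of p_i * log_r(1 / p_i), in r-ary units per symbol
    return sum(p * log(1 / p, r) for p in probs if p > 0)

# Seven symbols with probabilities of the form (1/3)**alpha_i.
probs = [1/3, 1/3, 1/9, 1/9, 1/27, 1/27, 1/27]
H3 = entropy(probs, 3)  # 13/9 ternary units per symbol
```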
38
00:08:26,319 --> 00:08:33,319
And once we have them, the alpha_i
are nothing but the lengths for the code words,
39
00:08:33,640 --> 00:08:40,640
and we can design the ternary instantaneous code as
follows: s1 = 0, s2 = 1, s3 = 20, s4 = 21, s5 = 220, s6 = 221, s7 = 222. Now, if you calculate
40
00:09:06,240 --> 00:09:13,240
the average length for this code as the sum of P_i L_i for i equal
to 1 to 7, it will turn out to be 13 by 9 ternary
41
00:09:24,100 --> 00:09:31,100
symbols per source symbol. So far, what we
have seen is that we have looked at the coding
42
00:09:41,640 --> 00:09:48,640
problem for the zero-memory source with symbol
probabilities of the form (1/r)^alpha_i.
43
00:09:57,810 --> 00:10:04,810
So, if I have my P_i of this form, then I can write
44
00:10:13,050 --> 00:10:20,050
log_r(1/P_i) = L_i, where I choose
my L_i equal to alpha_i. Now, the next question
45
00:10:24,170 --> 00:10:31,170
that arises is: if this condition is not satisfied,
then how do I choose my lengths? Now, what
46
00:10:35,100 --> 00:10:42,100
this means is: if log_r(1/P_i) is not
an integer, then how do I choose my lengths?
47
00:10:53,600 --> 00:11:00,600
It might seem reasonable that a compact code
could be formed by choosing Li as the first
48
00:11:05,450 --> 00:11:12,450
integer which is greater than log_r(1/P_i).
Now, this tempting conjecture is not valid;
49
00:11:18,230 --> 00:11:25,230
if we select L_i in this manner, it is not necessary
that we will get a compact code. But selecting
50
00:11:31,940 --> 00:11:38,940
L_i in this manner, where L_i is the integer
just larger than log_r(1/P_i), can
51
00:11:43,850 --> 00:11:50,850
lead to some important results. So, let us
therefore select L_i as the unique integer
52
00:11:53,640 --> 00:11:57,000
satisfying this condition.
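(The selection rule can be sketched as follows: Python, illustrative; the small tolerance guards against floating-point error when 1/P_i is an exact power of r.)

```python
from math import ceil, log

def shannon_lengths(probs, r=2):
    # L_i = the unique integer with log_r(1/p_i) <= L_i < log_r(1/p_i) + 1,
    # i.e. the smallest integer not less than log_r(1/p_i).
    return [ceil(log(1 / p, r) - 1e-9) for p in probs]

# Probabilities that are exact powers of 1/2 give L_i = log2(1/p_i) itself.
lengths = shannon_lengths([1/2, 1/4, 1/8, 1/8])  # [1, 2, 3, 3]
```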
53
00:11:57,000 --> 00:12:04,000
So, what we will do is select L_i as
the integer which is just larger than log_r(1/P_i);
54
00:12:17,470 --> 00:12:24,470
it means that it follows the inequality log_r(1/P_i) <= L_i < log_r(1/P_i) + 1 ... (1).
So, if I choose a set of L_i which follow this
55
00:12:32,070 --> 00:12:39,070
inequality, then the next question is: is it possible
for me to design or synthesize an instantaneous
56
00:12:40,339 --> 00:12:47,339
code which uses this set of L_i? That question
can be answered
57
00:12:49,970 --> 00:12:56,970
by testing Kraft's inequality.
So, taking the exponential of the left inequality
58
00:12:58,190 --> 00:13:05,190
of equation (1), we get 1/P_i less
than or equal to r^(L_i), which implies that P_i
59
00:13:12,190 --> 00:13:19,190
is greater than or equal to r^(-L_i). Now, summing
over all i, and
60
00:13:32,450 --> 00:13:39,450
assuming that the source is of size Q, we obtain:
the sum of r^(-L_i) over i = 1 to Q is less than or equal to the sum of the P_i, and this is
61
00:13:50,370 --> 00:13:57,370
1. Therefore, I get Kraft's inequality, the sum of r^(-L_i) <= 1. Now,
you know that we have
62
00:14:06,510 --> 00:14:11,620
discussed this earlier and we have shown that
if this condition is satisfied then it is
63
00:14:11,620 --> 00:14:18,620
possible for us to get an instantaneous code
for that source. So, choosing L_i according
64
00:14:22,620 --> 00:14:29,620
to the relationship given by (1) is acceptable
for the synthesis of an instantaneous code.
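(The Kraft test just mentioned is easy to carry out numerically; a Python sketch, with lengths produced by rule (1) for an example set of probabilities.)

```python
from math import ceil, log

def kraft_sum(lengths, r=2):
    # Kraft's inequality: an instantaneous r-ary code with these word
    # lengths exists if and only if sum(r**(-L_i)) <= 1.
    return sum(r ** (-L) for L in lengths)

# Lengths chosen as L_i = ceil(log_r(1/p_i)) always pass the test,
# since r**(-L_i) <= p_i for every i and the p_i sum to 1.
probs = [2/3, 2/9, 1/9]
lengths = [ceil(log(1 / p, 2) - 1e-9) for p in probs]
K = kraft_sum(lengths)  # 0.6875, which is <= 1
```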
65
00:14:32,910 --> 00:14:39,910
Now, equation (1) defines an acceptable set
of L_i for an instantaneous code. Multiplying
66
00:14:41,199 --> 00:14:48,199
equation (1) by P_i and summing up over all
i, we find
67
00:14:53,399 --> 00:15:00,399
P_i log_r(1/P_i) summed over all i is less than or equal to
the sum of P_i L_i, which in turn is less than
the sum of P_i log_r(1/P_i) plus the sum of P_i, and so we get
68
00:15:33,410 --> 00:15:40,410
the relationship H_r(S) <= L < H_r(S) + 1 ... (3). Here, H_r(S) is the entropy of
the source measured in r-ary units, and L is the average
69
00:15:46,600 --> 00:15:53,600
length of the code. This is another very important
result which we have derived.
70
00:16:05,149 --> 00:16:12,149
Now, there is a difference between this result
and the one we saw earlier. The
71
00:16:13,529 --> 00:16:20,529
earlier result was H_r(S) <= L.
There is an important
72
00:16:24,759 --> 00:16:31,230
difference between these two
relationships. The earlier relationship expresses
73
00:16:31,230 --> 00:16:38,230
a bound on the average length of a code independent
of any particular coding scheme; the bound
74
00:16:39,940 --> 00:16:46,940
requires only that the code be instantaneous.
Equation (3), on the other hand, is derived
75
00:16:49,720 --> 00:16:56,720
by assuming the coding method given by equation
(1). So, if I use the coding method described
76
00:17:00,380 --> 00:17:07,380
by equation (1), then I get relationship (3),
and this relationship provides both a lower
77
00:17:07,919 --> 00:17:12,980
and upper bound on the average length of the
code.
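(This double bound can be verified numerically for any zero-memory source; a Python sketch with arbitrary test probabilities.)

```python
from math import ceil, log

def entropy_and_length(probs, r=2):
    # Returns (H_r(S), L), where L uses the lengths of rule (1):
    # L_i = smallest integer not less than log_r(1/p_i).
    H = sum(p * log(1 / p, r) for p in probs if p > 0)
    L = sum(p * ceil(log(1 / p, r) - 1e-9) for p in probs)
    return H, L

H, L = entropy_and_length([2/3, 2/9, 1/9])
# The bound H <= L < H + 1 holds: here H is about 1.22 and L is 16/9.
```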
78
00:17:12,980 --> 00:17:19,980
Now, since this relationship is valid for
any zero-memory source, we may apply it to the
79
00:17:22,709 --> 00:17:29,709
nth extension of the source. Let us see basically
what happens if I apply this kind of coding
80
00:17:32,620 --> 00:17:39,620
scheme to an nth extension of a source. So,
let us assume that I have a source s which
81
00:17:43,030 --> 00:17:50,030
is a zero-memory source, and I look at its nth
extension. If I follow the
82
00:17:53,200 --> 00:18:00,200
strategy for coding which is given by
equation (1), then the bound (3) is also valid
83
00:18:00,490 --> 00:18:07,490
for the nth extension, because the nth extension
of a zero-memory source is again a zero-memory source.
84
00:18:07,570 --> 00:18:14,570
So, the relationship which I get is H_r(S^n)
is less than or equal to L_n, which is less than
85
00:18:21,590 --> 00:18:28,590
H_r(S^n) + 1, where L_n is the average length,
which is defined as the sum of P(sigma_i)
86
00:18:43,289 --> 00:18:50,289
lambda_i, for i equal to 1 to Q^n, where Q is the
size of the source and Q^n is the
87
00:18:53,690 --> 00:19:00,690
size of the nth extension. The sigma_i are
the symbols of the nth extension of the source
88
00:19:05,480 --> 00:19:12,480
S^n, and the lambda_i are the lengths.
So, lambda_i corresponds to the length of
89
00:19:16,390 --> 00:19:23,390
the code word for the symbol sigma_i from
the nth extension. This is the
90
00:19:28,160 --> 00:19:35,160
average length, and L_n / n will give me the
average number of code symbols used per single
91
00:20:02,650 --> 00:20:09,650
source symbol from S. Now, we have also seen
that H_r(S^n), that is, the entropy of the nth extension
92
00:20:26,380 --> 00:20:33,380
of a zero-memory source, is equal to n times the
entropy of the original source. So, based
93
00:20:39,870 --> 00:20:46,870
on this relationship we can write H_r(S) <= L_n / n < H_r(S) + 1/n.
What this equation says is that it is possible
94
00:21:17,309 --> 00:21:24,309
to make L_n / n as close as we wish to H_r(S)
by coding the nth extension of S rather
95
00:21:34,320 --> 00:21:41,320
than the original source S, because as n keeps
on increasing, 1/n tends towards 0. So, in
96
00:21:46,090 --> 00:21:53,090
that case L_n / n will tend towards H_r(S),
that is, the entropy of the original source.
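(This convergence can be observed numerically: a Python sketch, not from the lecture, applying the rule-(1) lengths to the nth extension of a small zero-memory source.)

```python
from itertools import product
from math import ceil, log2

def extension_rate(probs, n):
    # Average code symbols per source symbol, L_n / n, when rule (1)
    # is applied to the nth extension: for a zero-memory source the
    # block probability is the product of the symbol probabilities.
    Ln = 0.0
    for block in product(probs, repeat=n):
        P = 1.0
        for p in block:
            P *= p
        Ln += P * ceil(log2(1 / P) - 1e-9)
    return Ln / n

probs = [2/3, 2/9, 1/9]
H = sum(p * log2(1 / p) for p in probs)           # about 1.22 bits/symbol
rates = [extension_rate(probs, n) for n in (1, 2, 4)]
# Each rate satisfies H <= L_n/n < H + 1/n, so the rates squeeze down to H.
```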
97
00:21:55,210 --> 00:22:02,210
Now, in the limit of n tending to infinity,
L_n / n is equal to H_r(S). This is a very
98
00:22:18,799 --> 00:22:25,799
important relationship which we have derived.
This equation is known as Shannon's first
99
00:22:32,720 --> 00:22:39,720
theorem or the noiseless coding theorem. It
is one of the two major theorems of information
100
00:22:42,260 --> 00:22:49,260
theory. Equation (5a) tells us that we can make
the average number of r-ary code symbols
101
00:22:54,870 --> 00:23:01,870
per source symbol as small as we wish, but
not smaller than the entropy of the source
102
00:23:05,820 --> 00:23:12,370
measured in r-ary units.
So, the price which we pay for decreasing
103
00:23:12,370 --> 00:23:19,370
the quantity L_n / n is the increasing complexity
of the coding scheme. Now, all this discussion
104
00:23:25,100 --> 00:23:32,100
which we have done pertains to a zero-memory
source and its extensions. The next question
105
00:23:33,240 --> 00:23:40,240
that arises is: are these results also valid for
a source which is not a zero-memory source, but
106
00:23:42,539 --> 00:23:49,539
say a Markov source? Let us extend our discussion
to a Markov source.
107
00:23:58,179 --> 00:24:05,179
Let me assume that I have a Markov source
S, and I have its adjoint S-bar. We have seen
108
00:24:10,049 --> 00:24:17,049
the definition of the adjoint of a Markov source.
The adjoint of a Markov source is a source with
109
00:24:19,840 --> 00:24:26,840
the same source alphabet as the source S. And the
110
00:24:29,419 --> 00:24:42,339
symbol probabilities of the symbols in S-bar
are the same as the first-order symbol probabilities
111
00:24:42,340 --> 00:24:59,720
of source S. So, that is the definition of S-bar.
Now, the process of encoding the symbols
112
00:24:59,720 --> 00:25:10,720
in the source S and the symbols in the adjoint
113
00:25:10,720 --> 00:25:24,340
S-bar into an instantaneous block code is identical,
because the source symbols are the same, and the
114
00:25:24,340 --> 00:25:33,020
probabilities of the symbols are also the same.
What this means is that the average
115
00:25:33,029 --> 00:25:45,509
length, defined as the sum of P_i L_i for i equal
to 1 to Q, is also identical
116
00:25:45,510 --> 00:25:58,090
for both S and S-bar. S-bar, however, is a
zero-memory source, and we may apply the earlier
117
00:25:58,090 --> 00:26:15,240
derived result to obtain H_r(S-bar) <= L.
Now, we have also seen that
118
00:26:15,240 --> 00:26:25,740
the entropy of the original Markov source is always less
than or equal to the entropy of its adjoint.
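(This inequality between a Markov source and its adjoint can be checked numerically; a Python sketch with a hypothetical two-state transition matrix, not from the lecture.)

```python
from math import log2

# Hypothetical two-state Markov source: row i holds P(next symbol | state i).
P = [[0.9, 0.1],
     [0.3, 0.7]]

# Stationary distribution pi solves pi = pi * P; for a two-state chain,
# pi_0 = P[1][0] / (P[0][1] + P[1][0]).
pi0 = P[1][0] / (P[0][1] + P[1][0])
pi = [pi0, 1 - pi0]

def h(probs):
    # Entropy of a probability vector, in bits.
    return sum(p * log2(1 / p) for p in probs if p > 0)

H_markov = sum(pi[i] * h(P[i]) for i in range(2))  # H(S), uses the memory
H_adjoint = h(pi)                                  # H(S-bar), ignores memory
# H(S) <= H(S-bar): discarding the memory can only increase the entropy.
```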
119
00:26:27,029 --> 00:26:38,889
And this is less than or equal to L. So, what
follows is that H_r(S) is less than or equal
120
00:26:38,890 --> 00:26:45,890
to L.
So, again even for a Markov source we have
121
00:26:46,179 --> 00:26:55,859
shown that the average length of a code when
we code the symbols individually in the source
122
00:26:56,730 --> 00:27:12,150
is greater than or equal to the entropy of the
source. Let us extend this result to the nth
123
00:27:12,150 --> 00:27:22,750
extension of a Markov source. So, I have a
source S which is a Markov source and I am
124
00:27:22,750 --> 00:27:33,570
looking at the coding of this source in groups
of n elements. So, that means I am looking
125
00:27:33,570 --> 00:27:53,150
at S^n, where I code the block s_i1, s_i2, s_i3, up to
s_in as one unit.
126
00:27:54,529 --> 00:28:07,179
Now, if I start coding the original
source S in blocks of n source symbols, then
127
00:28:07,179 --> 00:28:18,099
I can say that these blocks form new
source symbols sigma_i. And now i will
128
00:28:18,110 --> 00:28:28,230
range from 1 to Q^n, where Q is the size of the source alphabet. Now, let us
follow the same strategy which we followed
129
00:28:28,230 --> 00:28:35,230
earlier for a zero-memory source. Now, for a
zero-memory source and its nth extension we had
130
00:28:47,720 --> 00:28:54,720
derived the result H_r(S^n) <= L_n < H_r(S^n) + 1;
this is the result which we had derived
131
00:28:56,260 --> 00:29:03,260
for the zero-memory source and its nth extension.
Now, since the
132
00:29:05,929 --> 00:29:12,929
same reasoning is valid for a Markov source, I
can say that L_n is greater than or equal
133
00:29:15,809 --> 00:29:22,809
to H(S^n), and less than H(S^n) + 1. So, the
same result is valid when I code this source
134
00:29:36,480 --> 00:29:43,480
in groups of n source symbols. Now, we have
also seen that by definition H(S^n) is equal
135
00:29:51,740 --> 00:29:58,740
to n times H_n(S), where H_n(S) is the average
information per symbol. So, if you go by this
136
00:30:08,149 --> 00:30:15,149
definition, which we had seen earlier in our
lectures, I can write this expression as L_n
137
00:30:17,190 --> 00:30:24,190
greater than or equal to n H_n(S). Now, if I divide
both sides by n, what I get is H_n(S) <= L_n / n.
138
00:30:47,940 --> 00:30:54,940
Now, L_n / n is the average number of code symbols
per source symbol of S. Now, if I take the
139
00:31:07,899 --> 00:31:11,820
limit as n tends to infinity.
140
00:31:11,820 --> 00:31:18,820
I will get: the limit, as n tends to infinity,
of L_n / n is equal to H_infinity(S). And this is
141
00:31:35,610 --> 00:31:42,610
the entropy of the Markov
source. So, again we have derived the result
142
00:31:52,190 --> 00:31:59,190
that if I want my average length to be as
close as possible to the entropy of the source,
143
00:32:04,059 --> 00:32:11,059
then I have to use the extensions of the source.
Now, another parameter we can define which
144
00:32:23,049 --> 00:32:30,049
relates the entropy and the average length
of the code, and that is known as the efficiency
145
00:32:31,320 --> 00:32:38,320
of a code. The efficiency of a code is defined
as the entropy of the source divided by the average
146
00:32:42,230 --> 00:32:49,230
length of the code for that source: eta = H_r(S) / L. So, this
is, by definition, the efficiency of a code;
147
00:32:54,669 --> 00:33:01,669
what is desired is to have as large a value
as possible for this eta.
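(A small numeric illustration, Python, not from the lecture; the code with lengths 1, 2, 3, 3 is the compact code from the earlier example.)

```python
from math import log2

def efficiency(probs, lengths):
    # eta = H(S) / L: entropy over average code-word length; eta <= 1,
    # with equality only when L_i = log2(1/p_i) for every symbol.
    H = sum(p * log2(1 / p) for p in probs if p > 0)
    L = sum(p * l for p, l in zip(probs, lengths))
    return H / L

eta = efficiency([1/2, 1/4, 1/8, 1/8], [1, 2, 3, 3])  # 1.0, fully efficient
```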
148
00:33:07,309 --> 00:33:14,309
Now, a question that arises is that we have
used the coding technique which is based on
149
00:33:21,960 --> 00:33:28,960
inequality (1). So, this inequality provides
a method of choosing the word lengths
150
00:33:42,090 --> 00:33:49,090
L_i. By encoding the symbols from S^n and taking
n sufficiently large, the quantity L_n / n
151
00:33:54,549 --> 00:34:01,549
can be made as small as we wish, but it cannot
be smaller than the entropy of the source.
152
00:34:02,320 --> 00:34:09,320
The question is: suppose n is not very large.
If this quantity is not very
153
00:34:12,159 --> 00:34:19,159
large, the theorem does not tell us what
average length we shall obtain
154
00:34:22,350 --> 00:34:29,350
in that case. It does not guarantee that L,
or L_n / n, will be smallest for that fixed
155
00:34:35,529 --> 00:34:40,940
n.
Let us try to understand this with a simple
156
00:34:40,940 --> 00:34:47,940
illustration. Suppose I have a source consisting
of 3 symbols s1, s2, s3, and let us assume the
157
00:34:50,760 --> 00:34:57,760
P_i for these are given as 2/3, 2/9, 1/9,
and log2(1/P_i), if I assume that
158
00:35:06,480 --> 00:35:13,480
I am going to design a binary code, is then
equal to 0.58, 2.17, 3.17. So, my L_i, which
159
00:35:21,570 --> 00:35:28,570
follow inequality (1), will be 1, 3, 4. So, I can
design code A: s1 = 0, s2 = 100, s3 = 1010.
160
00:35:42,010 --> 00:35:49,010
So, I have designed a code A which is based
on this coding strategy. Now, if you calculate
161
00:35:51,109 --> 00:35:58,109
the entropy for this source, it turns out to be
H(S) = 1.22 bits per symbol. And
162
00:36:04,940 --> 00:36:11,940
if I calculate the average length for this code, it is
1.78 binits per symbol. If you look at 1.78,
163
00:36:20,990 --> 00:36:27,990
it satisfies the inequality L >= H(S); this inequality
is satisfied by code A. But unfortunately,
164
00:36:44,690 --> 00:36:51,690
this coding strategy does not give me a compact
code, that is, a code with the average length
165
00:36:52,530 --> 00:36:58,740
as small as possible. So,
if in this case I take
166
00:36:58,740 --> 00:37:05,740
another code, say B: s1 = 0, s2 = 10, s3 = 11, which
is again an instantaneous code, then you can
167
00:37:07,349 --> 00:37:14,349
calculate that the average length for this code turns out
to be 1.33 binits per symbol.
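(The two average lengths compare as follows; a Python check of the arithmetic above.)

```python
# Source probabilities and the two codes' word lengths from the example.
probs = [2/3, 2/9, 1/9]
L_A = sum(p * l for p, l in zip(probs, [1, 3, 4]))  # Shannon assignment
L_B = sum(p * l for p, l in zip(probs, [1, 2, 2]))  # the shorter code B
# L_A = 16/9 ~ 1.78 and L_B = 4/3 ~ 1.33 binits per symbol: both exceed
# H(S) ~ 1.22, but the Shannon assignment is not the compact one here.
```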
168
00:37:19,690 --> 00:37:36,630
So, what this means is that this procedure does
not guarantee a compact code. Now, this code
169
00:37:36,630 --> 00:37:48,339
B's length of 1.33 is very close to 1.22. So, this
also implies that by going for an extension
170
00:37:48,339 --> 00:38:01,559
of the source I may not be able to achieve much.
So, what this example demonstrates is that
171
00:38:01,570 --> 00:38:11,230
using this coding strategy, it is not guaranteed
that we always get a compact code. We could
172
00:38:11,230 --> 00:38:18,230
get a compact code when n is very large, that is,
when we consider the nth extension of a source.
173
00:38:20,460 --> 00:38:27,240
Otherwise, we have seen in this example that
by following another strategy I could get an average
174
00:38:27,240 --> 00:38:33,020
length of the code which is smaller than that
given by the strategy of inequality (1). Now,
175
00:38:33,020 --> 00:38:40,020
this strategy of coding is known as the Shannon
coding strategy, or the Shannon code assignment
176
00:38:49,510 --> 00:38:56,510
for the lengths. Another question which arises
is that the design using this strategy is
177
00:39:02,510 --> 00:39:09,510
based on some given symbol probabilities;
but in reality, if these probabilities are
178
00:39:14,589 --> 00:39:21,589
not correct, then what is the effect on the
average length? So, let us answer this question.
179
00:39:24,680 --> 00:39:31,680
So, suppose we have a source S
with source symbols given as s1, s2, up to
180
00:39:36,240 --> 00:39:43,240
sq. The real probabilities of these symbols
are P1, P2, up to Pq, but for some reason this
181
00:39:53,640 --> 00:39:59,530
information is not available, and only estimates of these
probabilities are available. So, let us call
182
00:39:59,530 --> 00:40:06,530
those estimates Q1, Q2, up to Qq.
So, as far as we are concerned, since this
183
00:40:10,349 --> 00:40:17,220
is known to us we will be designing our code
based on this information.
184
00:40:17,220 --> 00:40:24,220
So, when we do that, it means that if
we use the Shannon code assignment, then my lengths
185
00:40:36,790 --> 00:40:43,790
for the code word for each source symbol will
be decided by L_i = ceil(log_r(1/Q_i)), where ceil(.) is the symbol which
186
00:40:57,599 --> 00:41:04,599
we use for finding the first integer not smaller
than this value. So, our design of the L_i
187
00:41:06,109 --> 00:41:13,109
will be based on Q_i. Now, we have these
two probability distributions, P and Q.
188
00:41:16,579 --> 00:41:23,579
So, the entropy of the source is H_r(S),
the sum of P_i log_r(1/P_i). So, this is the real entropy
of the source.
189
00:41:37,250 --> 00:41:43,720
And if we design our code properly, then we
expect that the average length of our code should
190
00:41:43,720 --> 00:41:50,720
be as close as possible to H(S). But since
the P_i are not known to us, we will be designing
191
00:41:52,710 --> 00:41:59,710
our code based on the lengths given by the
estimated probabilities Q_i. Then the average length of
192
00:42:00,950 --> 00:42:07,339
the code which we will be getting in a practical
case would be different from the
193
00:42:07,339 --> 00:42:14,339
entropy of the source.
How much the difference between the entropy
194
00:42:24,250 --> 00:42:31,250
and the average length is, is of
interest to us. And this difference is given
195
00:42:32,130 --> 00:42:39,130
in the form of a theorem. The theorem says
that the average code word length under the
196
00:43:02,270 --> 00:43:09,270
real probabilities P_i, of the code
assignment L_i equal to log_r
197
00:43:24,170 --> 00:43:31,170
of 1/Q_i rounded up to the next integer (the Q_i being what is available to us),
satisfies the relationship H_r(S) + D(P||Q) <= L < H_r(S) + D(P||Q) + 1, where
198
00:44:02,990 --> 00:44:09,990
D(P||Q) is by definition given
as the sum of P_i log_r(P_i / Q_i) for i equal to 1 to q.
199
00:44:24,140 --> 00:44:30,000
So, this is what we want to prove.
200
00:44:30,000 --> 00:44:37,000
Let us look at the proof of it.
The average length of the code which we will get is
L equal to the sum of P_i times ceil(log_r(1/Q_i)), where P_i is the real probability
201
00:44:53,630 --> 00:45:00,630
of occurrence of a source symbol, and the
designed length for the source symbol s_i is
202
00:45:02,210 --> 00:45:09,210
given by ceil(log_r(1/Q_i)), because only the Q_i are
available to us. And this quantity is less
203
00:45:15,349 --> 00:45:22,349
than the sum over i = 1 to q of P_i times (log_r(1/Q_i) + 1).
Writing 1/Q_i as (P_i/Q_i) times (1/P_i),
204
00:45:43,000 --> 00:45:50,000
this becomes the sum of P_i log_r(P_i/Q_i),
plus the sum of P_i log_r(1/P_i), plus 1. This, by
205
00:46:34,569 --> 00:46:41,569
definition, is equal to D(P||Q) plus H_r(S) plus
1. So, we have shown that L is less than
206
00:46:55,650 --> 00:47:02,650
H_r(S) + D(P||Q) + 1. Similarly, we can show the other
side of the inequality; it is not very difficult
207
00:47:09,970 --> 00:47:10,810
to do that.
208
00:47:10,810 --> 00:47:17,810
So, we write L equal to, again, the sum of P_i
times ceil(log_r(1/Q_i)), and this is greater than or equal to
209
00:47:29,319 --> 00:47:36,319
the sum over i of P_i log_r(1/Q_i), which is equal
to
210
00:47:54,190 --> 00:48:01,190
the sum of P_i log_r(P_i/Q_i) plus the sum of P_i log_r(1/P_i). So, finally,
we get the result L >= H_r(S) + D(P||Q), where D(P||Q)
211
00:48:29,500 --> 00:48:36,500
is by definition given as the sum of P_i log_r(P_i / Q_i).
So, this quantity is termed
212
00:48:53,309 --> 00:49:00,309
the relative entropy; it is also known as the Kullback-Leibler
distance between two probability distributions.
213
00:49:35,260 --> 00:49:42,260
More appropriately, one would say between two
probability mass functions.
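(The theorem's penalty term can be checked numerically; a Python sketch with hypothetical true probabilities p and mistaken estimates q.)

```python
from math import ceil, log2

def kl(p, q):
    # Relative entropy D(P || Q) = sum p_i * log2(p_i / q_i), in bits.
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [1/2, 1/4, 1/8, 1/8]      # true probabilities
q = [1/4, 1/4, 1/4, 1/4]      # mistaken estimates used for the design

H = sum(pi * log2(1 / pi) for pi in p)                    # 1.75 bits
L = sum(pi * ceil(log2(1 / qi) - 1e-9) for pi, qi in zip(p, q))
D = kl(p, q)                                              # 0.25 bits penalty
# The theorem: H + D <= L < H + D + 1.
```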
214
00:49:57,579 --> 00:50:04,579
So, the penalty which we pay for the wrong
choice of the probabilities of the source
215
00:50:04,770 --> 00:50:11,770
symbol is in terms of the relative entropy.
So, our average length is going to increase
216
00:50:14,470 --> 00:50:21,470
by this quantity. So, we have looked at the
mathematical relationship between the average
217
00:50:23,130 --> 00:50:30,130
length of a code and the entropy of the source
for which the code has been designed. We also
218
00:50:31,210 --> 00:50:38,210
looked at some of the methods to synthesize
an instantaneous code. Now, in the next class
219
00:50:40,200 --> 00:50:48,680
we will have a look at different coding strategies
which can be adopted in particular scenarios
220
00:50:48,690 --> 00:51:00,150
to obtain an average code word length as
close as possible to the entropy of a source
221
00:51:00,150 --> 00:51:04,110
without going for further extensions of the source.