1
00:00:18,300 --> 00:00:25,300
So, in today's talk, we will conclude the
topic of Shannon's theory. So, in the previous
2
00:00:26,029 --> 00:00:32,169
class, if you remember, we concluded
with the idea of equivocation of keys. So,
3
00:00:32,169 --> 00:00:38,720
we defined briefly what the
idea behind spurious keys is. So, today we will
4
00:00:38,720 --> 00:00:44,120
try to understand, whether we can compute
a lower bound of the spurious keys.
5
00:00:44,120 --> 00:00:50,359
So, if I just would like to recap, that we
essentially proved this particular formula
6
00:00:50,359 --> 00:00:57,359
yesterday, that is, H of K given C is equal to
H of K plus H of P minus H of C. So, therefore, the ambiguity
7
00:00:57,530 --> 00:01:02,710
of a key given the ciphertext is described
as follows: it is the sum of the ambiguity
8
00:01:02,710 --> 00:01:09,710
of the key and the ambiguity of the plaintext,
minus the ambiguity of the ciphertext.
9
00:01:10,890 --> 00:01:17,740
So, we also discussed the meaning
of ideal ciphers and said that, in the case of
10
00:01:17,740 --> 00:01:24,660
an ideal cipher, H of K given C is equal
to the value of H of K. So, that means that the
11
00:01:24,660 --> 00:01:28,489
ciphertext does not leak any additional information
about the key.
12
00:01:28,489 --> 00:01:33,780
So, these are the things that we described
yesterday and concluded with it. And so, therefore,
13
00:01:33,780 --> 00:01:40,780
HK given C gives us the idea of security or
insecurity. So, therefore, what we discussed
14
00:01:41,319 --> 00:01:45,849
is that even for perfect ciphers, the key
size is infinite if the message size is infinite.
15
00:01:45,849 --> 00:01:50,129
So, that was the problem with perfect ciphers,
that is, they are not practical.
16
00:01:50,129 --> 00:01:56,410
So, then we defined another kind of ciphers
called the ideal ciphers, where HK given C
17
00:01:56,410 --> 00:02:03,410
is equal to HK. And, as I, as I described
yesterday, that the main objective of our,
18
00:02:04,560 --> 00:02:08,379
what we study in this course, would be to
find ciphers that are secure against a
19
00:02:08,379 --> 00:02:15,379
bounded adversary. That means, we are essentially
striving to achieve computational security;
20
00:02:16,010 --> 00:02:20,420
that means, security considering what is today's
computational power.
21
00:02:20,420 --> 00:02:25,540
So, for example, if you can prove that a given
cipher has got a security, which requires
22
00:02:25,540 --> 00:02:30,930
an adversary to do, say, 2 to the power of 80 computations,
then we are fairly happy; we say, that the
23
00:02:30,930 --> 00:02:35,440
cipher has achieved computational security.
24
00:02:35,440 --> 00:02:41,630
So, the question is how to protect? So, therefore,
but still, I mean, in today's class we will
25
00:02:41,630 --> 00:02:48,090
be, essentially, still considering an unbounded
adversary and essentially address this question,
26
00:02:48,090 --> 00:02:53,640
that is, how do I protect data against a
brute force attacker with an infinite computational
27
00:02:53,640 --> 00:02:56,420
power?
So, therefore, there is an attacker who has
28
00:02:56,420 --> 00:03:01,400
got infinite computational power and still,
can I protect the cipher? So, the idea is,
29
00:03:01,400 --> 00:03:06,940
that is, Shannon defined a certain parameter,
which is called unicity distance and we say,
30
00:03:06,940 --> 00:03:12,370
that it is the least amount of ciphertext
that an unbounded adversary needs in order
31
00:03:12,370 --> 00:03:17,430
to determine the key uniquely; with less
than that, he cannot find the unique value
32
00:03:17,430 --> 00:03:21,500
of the key.
So, the strategy of the, of the adversary
33
00:03:21,500 --> 00:03:26,880
is as follows. What he does is that he takes
the ciphertext; he assumes a key, decrypts
34
00:03:26,880 --> 00:03:33,360
it back and finds out the plaintext. If the
plaintext is meaningful, then he notes it
35
00:03:33,360 --> 00:03:37,890
down. But the point is, that because of the
redundancy in the English language or rather
36
00:03:37,890 --> 00:03:42,990
any other language, any language for that
matter, the adversary will not actually find
37
00:03:42,990 --> 00:03:48,110
out the main, I mean, key uniquely. So, what
he will essentially have is a set of keys.
38
00:03:48,110 --> 00:03:53,110
So, apart from the actual key, the other keys
are called the spurious keys.
39
00:03:53,110 --> 00:04:00,110
Sir, how will the attacker manage it? Across
his iterations, how will he actually decide,
40
00:04:06,530 --> 00:04:06,790
or how will the computer decide, that the
text is meaningful? It cannot check based
41
00:04:06,790 --> 00:04:07,700
on its own keys.
No, no, the idea is as follows, that is, you
42
00:04:07,700 --> 00:04:13,790
take the ciphertext; consider a shift cipher,
so you have got a unique value of the key.
43
00:04:13,790 --> 00:04:20,790
So, you find out the value of the key. So,
so what you do is, that you decrypt it back
44
00:04:21,060 --> 00:04:24,690
and then you check whether the plaintext
makes sense or not. So, what you are asking
45
00:04:24,690 --> 00:04:29,030
is that you have got a lot of work to do,
but what I told you at the beginning is, that
46
00:04:29,030 --> 00:04:31,919
I am considering what, I am considering an
unbounded adversary.
47
00:04:31,919 --> 00:04:38,520
So, the adversary has got infinite computational
power; he is, he is supreme. So, the idea
48
00:04:38,520 --> 00:04:42,659
is, or rather the question is, that
even given such a kind of
49
00:04:42,659 --> 00:04:47,990
adversary, what is the minimum number of ciphertexts
I can make available to the adversary
50
00:04:47,990 --> 00:04:53,870
so that he still cannot guess the actual value
of the key? So, therefore, the set of the
51
00:04:53,870 --> 00:04:59,009
spurious keys should be null.
Minimum number or maximum number?
52
00:04:59,009 --> 00:05:01,520
Minimum number, because if I give you more
information, then you can do more things,
53
00:05:01,520 --> 00:05:08,520
so I would like to give the adversary
the minimum number of ciphertexts. Is it clear?
54
00:05:08,740 --> 00:05:14,930
So, therefore, so therefore, a common misconception
is that any cipher can be attacked by exhaustively
55
00:05:14,930 --> 00:05:19,189
trying all possible keys; this is a very
common misconception. So, what one would like
56
00:05:19,189 --> 00:05:24,569
to say is that, for example, even DES, which
has got a 56-bit key, can also be
57
00:05:24,569 --> 00:05:29,710
broken by brute force. But the idea is, that
so suppose, let us consider an adversary who
58
00:05:29,710 --> 00:05:34,349
can do 2 to the power 56 computations, so
he should be able to break it. So, the idea
59
00:05:34,349 --> 00:05:39,169
is that, but if the cipher is used within
its unicity distance, then even an all-powerful
60
00:05:39,169 --> 00:05:45,289
adversary cannot break the cipher because,
because of the strategy. The strategy is,
61
00:05:45,289 --> 00:05:49,180
therefore you take the ciphertext, you decrypt
it back, check the plaintext, whether it makes
62
00:05:49,180 --> 00:05:53,860
sense or not. If it makes sense, you note
the key and keep the key, register the key
63
00:05:53,860 --> 00:06:00,449
as a meaningful key or rather a possible key,
but that is only one key, which is the actual
64
00:06:00,449 --> 00:06:07,449
key, the rest are spurious keys.
Sir what is the significance of unicity distance
65
00:06:09,639 --> 00:06:09,889
Yes
the significance of unicity distance
66
00:06:09,639 --> 00:06:12,039
The significance of unicity distance is that
for example, if I can compute the unicity
67
00:06:12,039 --> 00:06:17,509
distance of say DES or any other cipher, then
I would like to use the same key for so many
68
00:06:17,509 --> 00:06:22,509
times. After that, I have to change the key
if I am considering an unbounded adversary.
69
00:06:22,509 --> 00:06:27,949
So, for example, if I am considering an unbounded
adversary, then I would like to use the key
70
00:06:27,949 --> 00:06:32,360
only for, say 2 times, after that I would
like to change the key if its unicity distance
71
00:06:32,360 --> 00:06:34,059
is 2.
72
00:06:34,059 --> 00:06:41,059
So, therefore, this is what I have just described.
So, H K given C is the amount of uncertainty
73
00:06:41,099 --> 00:06:47,110
that remains of the key after the ciphertext
is revealed. So, we know that, we know, it
74
00:06:47,110 --> 00:06:50,150
is called the key equivocation and have already
defined that.
75
00:06:50,150 --> 00:06:54,499
So, what the attacker does is that he guesses
a key for the ciphertext; he guesses
76
00:06:54,499 --> 00:06:59,409
the key and decrypts the ciphertext. So, what he
does next is, he checks, whether the plaintext
77
00:06:59,409 --> 00:07:04,430
obtained is meaningful English or not. If
not, he rules out the key, but due to the
78
00:07:04,430 --> 00:07:09,589
redundancy of language, more than one key
will actually pass this test; those keys,
79
00:07:09,589 --> 00:07:13,449
apart from the correct key are called spurious.
80
00:07:13,449 --> 00:07:20,449
So, then, we come to something, which is called,
we have already discussed about, what is entropy.
81
00:07:22,369 --> 00:07:29,369
So, we will be using this or rather we will
be using this entropy to find out or find
82
00:07:29,539 --> 00:07:34,419
out a minimum bound of the spurious keys.
So, we will try to find out a minimum bound
83
00:07:34,419 --> 00:07:37,770
or the lower bound of the number of spurious
keys.
84
00:07:37,770 --> 00:07:43,539
So, so, therefore, so, therefore, consider
H L, so therefore H L is used to measure the
85
00:07:43,539 --> 00:07:48,430
amount of information per letter of meaningful
strings of plaintext. So, that is the definition
86
00:07:48,430 --> 00:07:53,849
of H L. It measures the amount of information
per letter of meaningful strings of plaintext.
87
00:07:53,849 --> 00:07:59,710
So, therefore, consider a random string of letters.
So, there are 26 letters and if all of them
88
00:07:59,710 --> 00:08:06,710
are equally likely, then what is the entropy?
It is equal to log 26 base 2, and that
89
00:08:07,259 --> 00:08:13,399
computes to about 4.70. But the English language,
you know, has a probability distribution;
90
00:08:13,399 --> 00:08:15,580
it is not 1 by 26 for all the letters.
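As a quick check of the uniform case, here is a minimal Python sketch, not part of the lecture. The function name entropy_bits is an assumption made for illustration. It computes the entropy of any probability distribution in bits, and for 26 equally likely letters it gives log 26 base 2.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits: minus the sum of p * log2(p)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# 26 equally likely letters: the entropy equals log2(26), about 4.70 bits.
uniform = [1 / 26] * 26
print(round(entropy_bits(uniform), 2))

# A skewed (hypothetical) distribution has lower entropy than a
# uniform one over the same number of symbols.
print(entropy_bits([0.5, 0.25, 0.25]))
```

Any non-uniform distribution over the same letters gives a smaller value, which is why the 1st order entropy of English comes out below the uniform figure.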
91
00:08:15,580 --> 00:08:21,240
So, this is the sort of graph
which shows how the English language character
92
00:08:21,240 --> 00:08:25,080
frequencies in general vary.
So, therefore, if you just take the 1st order entropy,
93
00:08:25,080 --> 00:08:30,169
that means, you just take one particular letter
at a time and consider its probability, and then
94
00:08:30,169 --> 00:08:37,169
you feed it into the formula, minus sigma of P x i
log P x i, then your 1st order
95
00:08:38,070 --> 00:08:45,070
entropy of the English text works out to 4.19.
But you can do better than that. So, you know
96
00:08:45,290 --> 00:08:50,589
that in English language, consecutive letters
essentially are not uncorrelated. For example,
97
00:08:50,589 --> 00:08:54,720
if I take q, then the next letter is almost always u, so they
are not uncorrelated.
98
00:08:54,720 --> 00:08:59,750
So, that means, that if I take 2 grams for
example, this entropy or uncertainty should
99
00:08:59,750 --> 00:09:06,149
reduce. So, therefore, if the 1st order entropy
is not enough, so what you do is that I do
100
00:09:06,149 --> 00:09:07,209
a 2nd order approximation.
101
00:09:07,209 --> 00:09:13,690
So, in 2nd order, what I do is that I consider
all possible digrams. I find out their probabilities
102
00:09:13,690 --> 00:09:20,350
and subsequently their entropy, and then I
divide that H P 2 by 2, so it works out
103
00:09:20,350 --> 00:09:26,949
to 3.9. So, you see that the entropy has reduced.
Similarly, you can do trigrams and again find
104
00:09:26,949 --> 00:09:32,339
out the H P 3 and then, again divide that
by 3. And similarly, you can do higher order
105
00:09:32,339 --> 00:09:34,120
approximations of the entropy.
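The n gram procedure can be sketched in Python as follows. This is only an illustration under assumptions: the function name ngram_entropy_per_letter and the toy text are hypothetical, and a real estimate for English would need a large corpus. It counts overlapping n grams, computes their entropy, and divides by n.

```python
import math
from collections import Counter

def ngram_entropy_per_letter(text, n):
    """Estimate H(P^n) / n in bits from the overlapping n-grams of text."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = len(grams)
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / n

# Hypothetical toy text in which each letter fully determines the next one.
sample = "ab" * 1000
h1 = ngram_entropy_per_letter(sample, 1)  # single letters
h2 = ngram_entropy_per_letter(sample, 2)  # digrams, divided by 2
print(h1, h2)
```

On this toy text the per-letter figure drops from 1 bit for single letters to about half a bit for digrams, the same kind of reduction that correlation between successive letters produces in English.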
106
00:09:34,120 --> 00:09:39,370
So, the idea is that in general, the successive
letters have got correlation, which reduces
107
00:09:39,370 --> 00:09:46,370
the entropy. So, therefore, for example, define
that higher order entropy as follows, that
108
00:09:46,759 --> 00:09:53,620
define H L, the entropy of a natural language
L, as H L equal to the limit, as n tends to infinity,
109
00:09:53,620 --> 00:09:58,360
of H P n divided by n.
So, you find out the value of H P n, means,
110
00:09:58,360 --> 00:10:03,250
you find out all n grams, their probability
distribution, find, compute the corresponding
111
00:10:03,250 --> 00:10:08,980
entropy and then, divide this by n and your
limit tends, n tends to infinity. That means,
112
00:10:08,980 --> 00:10:15,420
this is an approximation for very high
values of n, and it has been found experimentally
113
00:10:15,420 --> 00:10:20,740
that the value of H L, if you consider very
high values of n, in general ranges between
114
00:10:20,740 --> 00:10:26,120
1 and 1.5.
So, you will find that the H L falls within,
115
00:10:26,120 --> 00:10:27,379
within 1 and 1.5.
116
00:10:27,379 --> 00:10:32,819
So, so from there, what do we understand?
We understand that there is some amount of
117
00:10:32,819 --> 00:10:37,949
redundancy in the language. So, in order to
understand that better, let us quantify the
118
00:10:37,949 --> 00:10:44,949
redundancy by using a parameter R L.
R L is defined as 1 minus H L divided by log
119
00:10:47,730 --> 00:10:51,879
of cardinality of the plaintext space, base 2.
So, you can immediately understand certain
120
00:10:51,879 --> 00:10:58,750
things from this formula. For example, R L
is equal to 1 minus H L divided by log of cardinality of P base 2. So,
121
00:10:58,750 --> 00:11:03,029
therefore, if you just consider a random language,
that is, if the English language
122
00:11:03,029 --> 00:11:06,970
had been random, then what would have
been the entropy H L? It would have been log
123
00:11:06,970 --> 00:11:13,240
of cardinality of P base 2. So, in that case, what would have
been the redundancy? It would have been 0.
124
00:11:13,240 --> 00:11:17,050
So, therefore, you see, that for any other
value of H L, like when I am considering,
125
00:11:17,050 --> 00:11:23,069
say 2 gram, 3 gram, 4 gram, 5 gram and say
n grams, then this value of H L was gradually
126
00:11:23,069 --> 00:11:28,490
reducing. So, that means, the redundancy was
increasing. So, therefore, this formula is
127
00:11:28,490 --> 00:11:31,769
able to capture the redundancy of the language.
128
00:11:31,769 --> 00:11:37,810
So, for example, if for typical values, if
you say, that H L lies between 1 and 1.5,
129
00:11:37,810 --> 00:11:44,810
so you take a value of, you take a value of,
so for example, let us consider this, that
130
00:11:45,360 --> 00:11:52,360
is, R L is equal to 1 minus H L divided by log of cardinality of P base 2.
So, in that case, I can rearrange this as
131
00:11:54,529 --> 00:11:59,579
follows because I will be using this later
on, so I can write, for example H L divided
132
00:11:59,579 --> 00:12:06,579
by log of cardinality of P base 2 is equal to 1 minus R L. So,
therefore, H L is equal to 1 minus R L into
133
00:12:10,750 --> 00:12:17,750
log of cardinality of P base 2.
So, we will just note this equivalent relation
134
00:12:17,850 --> 00:12:23,420
because we will be requiring this result.
Subsequently, so you note one fact, that if
135
00:12:23,420 --> 00:12:30,420
this value of H L, that is, if H L reduces,
then the corresponding, this implies, that
136
00:12:32,300 --> 00:12:39,300
the corresponding value of R L increases. So,
therefore, if the entropy reduces of the corresponding
137
00:12:39,810 --> 00:12:44,769
language, that is equivalent to saying, that
the redundancy of the language has increased.
138
00:12:44,769 --> 00:12:50,160
So, therefore, this formula is able to capture
this intuitive result.
139
00:12:50,160 --> 00:12:57,110
So, let us therefore consider what
is the corresponding redundancy of the English
140
00:12:57,110 --> 00:13:01,420
language. So, we would like to quantify that.
So, if you find out, you will find out, that
141
00:13:01,420 --> 00:13:08,420
H L lies between 1 and 1.5. So, consider,
that H L is equal to 1.25 and you know, that
142
00:13:09,959 --> 00:13:16,230
the number of plaintext characters is 26.
So, therefore, if you feed into this formula,
143
00:13:16,230 --> 00:13:20,329
your R L works out to roughly 0.75.
So, what does it mean? It means that the English
144
00:13:20,329 --> 00:13:26,420
language is 75 percent redundant, so whatever
you speak, out of that 75 percent is actually
145
00:13:26,420 --> 00:13:32,350
redundant. So, does it mean that out of 4
characters you talk, I can throw away 3 characters?
146
00:13:32,350 --> 00:13:38,240
No, it is not exactly so. So, what it means
only is that, if you do, for example, Huffman
147
00:13:38,240 --> 00:13:42,149
coding, then essentially you are expected
to get such a kind of compression.
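The redundancy figure can be reproduced with a few lines of Python. This is a minimal sketch; the function name redundancy is an assumption made for illustration. With H L taken as 1.25 and 26 letters it gives about 0.73, which is close to the figure of roughly 0.75 quoted in the lecture.

```python
import math

def redundancy(h_l, alphabet_size):
    """R_L = 1 - H_L / log2(|P|)."""
    return 1 - h_l / math.log2(alphabet_size)

# English with H_L assumed to be 1.25 bits per letter, 26-letter alphabet.
print(round(redundancy(1.25, 26), 3))

# A fully random language has H_L = log2(26), so its redundancy is 0.
print(redundancy(math.log2(26), 26))
```

Over the experimental range of H L between 1 and 1.5, the same function gives a redundancy between about 0.68 and 0.79.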
148
00:13:42,149 --> 00:13:48,790
So, so, therefore, that is the idea of redundancy
of a language and that is precisely the reason
149
00:13:48,790 --> 00:13:53,990
why cryptanalysis is favored. If you had a
very random kind of language, then cryptanalysis
150
00:13:53,990 --> 00:13:58,459
would have been harder, but you know that
there is redundancy in the English language,
151
00:13:58,459 --> 00:14:05,459
and that serves as extra information to the
attacker.
152
00:14:05,740 --> 00:14:12,040
So, let us try to calculate the lower bound
of equivocation of the key. So, that is the
153
00:14:12,040 --> 00:14:17,230
objective of the first part
of today's class. So, for example, we have already
154
00:14:17,230 --> 00:14:23,990
defined n grams, so consider
P n and C n to be two random variables defined
155
00:14:23,990 --> 00:14:27,600
to represent n grams of the plaintext and
n grams of the ciphertext.
156
00:14:27,600 --> 00:14:34,600
So, all of us know already this formula, so
it is, what this says, H K given C n is equal
157
00:14:34,600 --> 00:14:41,600
to H K plus H P n minus H C n. So, this formula
we have already proved in the last day's class,
158
00:14:42,300 --> 00:14:48,009
yes.
So, what is H P n equal to? So, H P n we have
159
00:14:48,009 --> 00:14:53,999
defined from the definition of H L. If the
value of n is quite large, you can approximate
160
00:14:53,999 --> 00:15:00,999
that by n H L and you know that H L by my
previous result was equal to 1 minus R L log
161
00:15:09,680 --> 00:15:10,870
cardinality of P base 2.
162
00:15:10,870 --> 00:15:16,129
So, therefore, if I would like to calculate
the value of n H L, then I just need to
163
00:15:16,129 --> 00:15:22,449
multiply this particular thing
by n. So, I get n into 1 minus R L into
164
00:15:22,449 --> 00:15:27,399
log of cardinality of P base 2. So, that is
the value of n H L.
165
00:15:27,399 --> 00:15:34,399
So, therefore, I can say from here, that H
P n, which is approximately equal to n H L,
166
00:15:35,009 --> 00:15:42,009
I can write that as equivalent, equal to n
into 1 minus R L into log of cardinality of
167
00:15:42,149 --> 00:15:49,149
P base 2 and the next thing that we want is
H of C n.
168
00:15:53,709 --> 00:16:00,709
So, I write, that H of C n is equal to or
rather is less than equal to n of log of cardinality
169
00:16:03,009 --> 00:16:10,009
of C base 2, why? Because whatever be the
entropy, so I am considering n grams, so this
170
00:16:11,139 --> 00:16:18,120
is equivalent, is saying, that H of C n by
n is less than equal to log of cardinality
171
00:16:18,120 --> 00:16:24,149
of C base 2. So, all of us know this fact
because this essentially captures the entropy
172
00:16:24,149 --> 00:16:30,009
of a random ciphertext.
So, whatever it is, the entropy divided
173
00:16:30,009 --> 00:16:34,519
by n should obviously be no more than that of a random
ciphertext, since that has got, you know,
174
00:16:34,519 --> 00:16:39,519
the maximum entropy, the most uncertainty. So,
therefore, this value is obviously no greater
175
00:16:39,519 --> 00:16:43,519
than that, so we will be using these two
bounds in our calculation.
176
00:16:43,519 --> 00:16:50,029
So, one bound is given by H of C n is less
than equal to n log cardinality of C base
177
00:16:50,029 --> 00:16:55,519
2 and the other approximation is H of P n
is approximately equal to n into 1 minus R
178
00:16:55,519 --> 00:17:02,519
L log cardinality of P base 2. So, we will
plug this equation, these 2 things into our,
179
00:17:02,980 --> 00:17:08,180
into our equivocation formula, that we had.
So, the formula that we had was H of K given
180
00:17:08,180 --> 00:17:13,600
C n is equal to H of K plus H of P n minus
H of C n. So, we have a fair amount of the
181
00:17:13,600 --> 00:17:18,610
estimate of the H P n and H C n, so if you
plug that, we get this value, that is, H of
182
00:17:18,610 --> 00:17:25,610
K given C n is greater than equal to H of
K minus n R L log cardinality of P base 2.
183
00:17:26,230 --> 00:17:33,230
So, do you see that? So, you see, that if
I subtract H P n minus H C n, then these particular
184
00:17:34,600 --> 00:17:41,090
terms, that is, n log cardinality of P base
2 minus n log cardinality of C base 2 gets
185
00:17:41,090 --> 00:17:44,130
cancelled if the cardinality of P and C are
same.
186
00:17:44,130 --> 00:17:50,050
So, if I just consider both of them are English
letters for example, then the cardinality
187
00:17:50,050 --> 00:17:54,160
of P and cardinality of C is the same thing.
So, they are the same value. So, therefore,
188
00:17:54,160 --> 00:17:59,620
they cancel each other and we have got a lower
bound of H K given C n, which says, that H
189
00:17:59,620 --> 00:18:06,180
K given C n is greater than or equal to H K
minus n R L log of cardinality of P base 2.
190
00:18:06,180 --> 00:18:11,210
So, therefore, let us remember this formula,
so that, because we will be needing this later
191
00:18:11,210 --> 00:18:11,560
on.
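Written out as one chain, the lower bound reads as follows. This is a sketch under the assumptions already stated: n is large enough that H of P n is approximately n H L, so the inequality is approximate, and the cardinalities of P and C are the same.

```latex
\begin{align*}
H(K \mid C^n) &= H(K) + H(P^n) - H(C^n) \\
              &\ge H(K) + n\,(1 - R_L)\log_2|\mathcal{P}| - n\log_2|\mathcal{C}| \\
              &= H(K) - n\,R_L \log_2|\mathcal{P}|
                 \qquad \text{when } |\mathcal{P}| = |\mathcal{C}|.
\end{align*}
```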
192
00:18:11,560 --> 00:18:18,560
It says that H of K, if I write it in other
way, minus n of R L log of cardinality of
193
00:18:20,470 --> 00:18:27,470
P base 2 is lesser than equal to H of K given
C n. So, now, we will try to prove an upper
194
00:18:31,830 --> 00:18:36,470
bound of H K given C n, that is, H K given
C n should be lesser than equal to some term.
195
00:18:36,470 --> 00:18:41,240
So, from there we will try to
quantify the number of spurious keys.
196
00:18:41,240 --> 00:18:44,420
So, is it clear up to this part?
197
00:18:44,420 --> 00:18:51,420
Yeah, we are not assuming anything of the
key, so for example we have just assumed that
198
00:18:53,070 --> 00:18:57,320
the ciphertext cardinality and the plaintext
cardinalities are the same, but the key is,
199
00:18:57,320 --> 00:19:04,320
we have not assumed the size of the key. This
calculation is independent of that
200
00:19:05,680 --> 00:19:05,930
information.
201
00:19:05,860 --> 00:19:10,280
So, therefore, consider possible keys. So,
therefore, K y, just define K y to be the
202
00:19:10,280 --> 00:19:15,510
possible keys given that y is the ciphertext.
So, define this set. So, what does it mean?
203
00:19:15,510 --> 00:19:19,180
It is the set of possible keys given that y is
the ciphertext, as I have already defined. What
204
00:19:19,180 --> 00:19:23,410
does it mean? It means that K y is the set of
those keys for which y is the ciphertext of
205
00:19:23,410 --> 00:19:28,200
a meaningful plaintext.
So, therefore, as I told you,
206
00:19:28,200 --> 00:19:33,690
a cryptanalyst, or an attacker, takes the
ciphertext, assumes the value of the key and
207
00:19:33,690 --> 00:19:39,210
finds out those keys or registers those keys
for which the plaintext is meaningful and
208
00:19:39,210 --> 00:19:45,910
that is denoted by the set K y. So, therefore,
the K y set holds those keys for which the
209
00:19:45,910 --> 00:19:52,130
corresponding plaintext is meaningful. So,
out of these keys, how many is, how many are
210
00:19:52,130 --> 00:19:58,370
spurious keys? Cardinality of K y minus 1
because only one key is the actual key, rest
211
00:19:58,370 --> 00:20:01,680
is spurious.
So, therefore, we know that when y is the
212
00:20:01,680 --> 00:20:08,260
ciphertext, the number of keys is the cardinality of K
y. So, out of them only one is correct, so
213
00:20:08,260 --> 00:20:13,750
rest of them are spurious. So, the number
of spurious keys can be found out by cardinality
214
00:20:13,750 --> 00:20:20,290
of K y minus 1. So, what is the expected size
of cardinality? So, you know that this is
215
00:20:20,290 --> 00:20:24,700
actually a distribution. So, therefore, in
order to calculate the expected value of a
216
00:20:24,700 --> 00:20:29,240
random variable, what do we do? We multiply
the corresponding value with its probability
217
00:20:29,240 --> 00:20:30,630
and do a sigma.
218
00:20:30,630 --> 00:20:36,110
So, I guess we know this result, that if there
is a random variable x, then its expected
219
00:20:36,110 --> 00:20:43,110
value E x is computed by sigma of its corresponding
probability into the x i, where i runs over
220
00:20:44,050 --> 00:20:51,050
all possibilities. So, that is the way how
we calculate the expectation of a random variable.
221
00:20:52,990 --> 00:20:59,990
So, we apply this and we find out the expected
number of spurious keys. So, what is the expected
222
00:21:01,740 --> 00:21:06,160
number of spurious keys? Here, it is the average
number of spurious keys over all possible
223
00:21:06,160 --> 00:21:13,160
ciphertext and this is denoted by the variable
S n. So, S n is nothing but sigma of cardinality
224
00:21:14,010 --> 00:21:18,970
of K y. We are just multiplying, K, cardinality
of K y minus 1 because this is the number
225
00:21:18,970 --> 00:21:23,840
of spurious keys and we are multiplying that
by the probability of this event.
226
00:21:23,840 --> 00:21:28,020
So, what is the probability of this event,
that the ciphertext y is chosen? So, that
227
00:21:28,020 --> 00:21:34,570
is P y, and this calculation
is done for all possible ciphertexts. So,
228
00:21:34,570 --> 00:21:41,570
you know, if I just simplify this formula,
then I can actually multiply this
229
00:21:42,010 --> 00:21:48,710
P y with the cardinality of K y, and if I distribute
this P y over the 1, then I obtain sigma of P
230
00:21:48,710 --> 00:21:54,200
y. So, what is sigma of P y over all possible
ciphertexts? It is unity, 1, so I get sigma
231
00:21:54,200 --> 00:22:00,930
of P y multiplied with the cardinality of
K y and that, from there I subtract the value
232
00:22:00,930 --> 00:22:06,910
of 1. So, this gives me what? This gives me
the expected number of spurious keys.
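The expectation can be checked on a small example in Python. This is only a sketch: the function name expected_spurious_keys and the toy numbers are hypothetical and are not taken from any real cipher. It computes S n directly from the definition and again through the simplified form.

```python
def expected_spurious_keys(p_y, k_y_size):
    """S_n = sum over y of p(y) * (|K(y)| - 1)."""
    return sum(p_y[y] * (k_y_size[y] - 1) for y in p_y)

# Hypothetical toy data: three ciphertexts, with the number of keys that
# decrypt each of them to a meaningful plaintext.
p_y = {"y1": 0.5, "y2": 0.25, "y3": 0.25}
k_y_size = {"y1": 1, "y2": 3, "y3": 5}

s_n = expected_spurious_keys(p_y, k_y_size)

# Simplified form: sum of p(y) * |K(y)|, minus 1, since the p(y) sum to 1.
alt = sum(p_y[y] * k_y_size[y] for y in p_y) - 1
print(s_n, alt)  # 1.5 1.5
```

Both forms agree, which is just the statement that S n plus 1 equals sigma of P y into cardinality of K y.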
233
00:22:06,910 --> 00:22:13,910
So, therefore, from here I can write, that
S n plus 1, I can, I can just reorganize this
234
00:22:15,200 --> 00:22:19,320
equation, I can write S n plus 1 is equal
to this particular thing.
235
00:22:19,320 --> 00:22:26,320
So, therefore, I can write like, S n plus
1 is equal to sigma of P y cardinality of
236
00:22:27,830 --> 00:22:34,830
K y, where y varies over all possible ciphertexts.
So, this is actually, also we have this and
237
00:22:38,090 --> 00:22:45,090
this is also we can put down from the definition
of S n.
238
00:22:45,260 --> 00:22:50,600
So, so therefore, the, if we need to calculate
the upper bound of the equivocation of key,
239
00:22:50,600 --> 00:22:54,700
we do further calculation from the definition
of H K given C n. So, what is the H K given
240
00:22:54,700 --> 00:22:59,530
C n? So, as I told you, that there are 2 random
variables here, K and C n, what we do is,
241
00:22:59,530 --> 00:23:02,700
that let us vary one random variable and keep
the other one constant.
242
00:23:02,700 --> 00:23:08,740
So, therefore, we vary in this case y and
we keep the value of K constant. So, therefore,
243
00:23:08,740 --> 00:23:13,460
from our definition of conditional entropy,
we can write that is equal to sigma of P y
244
00:23:13,460 --> 00:23:20,270
multiplied by H of K given y. So, this we
had actually written in last day's class,
245
00:23:20,270 --> 00:23:27,130
you can follow that. Therefore, this is equal
to, actually this is this equal to, should
246
00:23:27,130 --> 00:23:33,610
be actually an equal to, so this is equal
to sigma of P y, I just write this as K y.
247
00:23:33,610 --> 00:23:38,860
Therefore, what does it mean? It means K given
y. So, that is exactly the definition of h,
248
00:23:38,860 --> 00:23:41,990
that is, that is exactly the definition of
K y.
249
00:23:41,990 --> 00:23:48,990
So, I obtain the sigma, I obtain the, multiply
P y with H K y and I take the corresponding
250
00:23:49,240 --> 00:23:56,240
sigma. Now, this is less than equal to P y,
multiply with logarithm of cardinality of
251
00:23:57,960 --> 00:24:03,040
K y base 2 and taken a sigma and this follows
from what I already told you, that if this
252
00:24:03,040 --> 00:24:10,040
K y would have been a, so I mean, if for example
this H of K, this K y would have been a random
253
00:24:16,250 --> 00:24:21,940
distribution, then in that case this would have
been equal to its upper bound, that is,
254
00:24:21,940 --> 00:24:28,630
equal to logarithm of cardinality of K y base
2; for any other distribution, this uncertainty
255
00:24:28,630 --> 00:24:32,710
is lesser than for a uniformly random distribution.
So, therefore, I can actually find out an
256
00:24:32,710 --> 00:24:39,360
upper bound and this is the upper bound, then
I apply a certain result from mathematics called
257
00:24:39,360 --> 00:24:44,630
Jensen's inequality; it is applicable to
the logarithm because the logarithm is a concave
258
00:24:44,630 --> 00:24:49,870
function. But let us just believe
this fact, that I can actually find out an
259
00:24:49,870 --> 00:24:54,030
upper bound of this, which is called, so I
can upper bound this particular thing by this
260
00:24:54,030 --> 00:24:58,290
expression. So, I can take the log out and
I can take this, this as follows.
261
00:24:58,290 --> 00:25:03,590
So, what is this particular sigma equal to?
This sigma is equal to S n plus 1. From this
262
00:25:03,590 --> 00:25:10,590
result you see this and sigma of P y cardinality
of K y was equal to S n plus 1. So, we use
263
00:25:11,120 --> 00:25:14,620
this result and plug into that equation, we
obtain this. So, you see that this is equal
264
00:25:14,620 --> 00:25:20,060
to sigma of P y cardinality of K y and instead
of this I can write S n plus 1.
265
00:25:20,060 --> 00:25:24,950
So, what we have proved is that H K given
C n is less than equal to logarithm of S n
266
00:25:24,950 --> 00:25:26,700
plus 1 base 2.
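The step where the log is taken outside the sigma can be checked numerically. This is a minimal sketch with hypothetical toy numbers, not data from any real cipher. It compares sigma of P y into log of cardinality of K y against log of S n plus 1.

```python
import math

# Hypothetical toy data, assumed only for illustration.
p_y = [0.5, 0.25, 0.25]   # probabilities of the ciphertexts
k_y_size = [1, 3, 5]      # |K(y)| for each ciphertext

# Left side: sum of p(y) * log2 |K(y)|, the bound on H(K | C^n).
lhs = sum(p * math.log2(k) for p, k in zip(p_y, k_y_size))

# Right side: log2 of the sum of p(y) * |K(y)|, which is log2(S_n + 1).
s_n_plus_1 = sum(p * k for p, k in zip(p_y, k_y_size))
rhs = math.log2(s_n_plus_1)

print(lhs <= rhs)  # True
```

The two sides are equal only when every ciphertext has the same number of candidate keys; otherwise the left side is strictly smaller.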
267
00:25:26,700 --> 00:25:33,700
So, what we have proved is H K given C n is
less than equal to logarithm of S n plus 1
268
00:25:37,720 --> 00:25:44,720
base 2. So, now, we can actually take this
result and we can combine this fact and this
269
00:25:45,270 --> 00:25:52,270
fact and obtain that, rather write that, H
K minus n of R L into logarithm
270
00:25:56,990 --> 00:26:03,990
of cardinality of P base 2 is less than equal
to logarithm of S n plus 1 base 2. Is it ok?
271
00:26:16,890 --> 00:26:23,890
So, that is precisely written, what here,
so I, I write, that H K minus n R L log cardinality
272
00:26:25,090 --> 00:26:31,850
of P base 2 is less than equal to logarithm
of S n plus 1 base 2. So, therefore, I can
273
00:26:31,850 --> 00:26:36,070
actually rearrange this
and write that logarithm
274
00:26:36,070 --> 00:26:42,700
of S n plus 1 base 2 is lower bounded by H
K minus n R L log cardinality of P base 2.
275
00:26:42,700 --> 00:26:47,860
So, now, if the keys are chosen equi-probably,
that is, if all the keys are equally likely,
276
00:26:47,860 --> 00:26:52,200
then I can actually write, that this as an
equality, I can write, that H K is equal to
277
00:26:52,200 --> 00:26:58,550
logarithm of cardinality of K base 2. So, in that case I
can plug this into this previous equation
278
00:26:58,550 --> 00:27:03,950
and this works out to this equation. So, this,
this and this inequality, so I can obtain,
279
00:27:03,950 --> 00:27:09,690
that S n plus 1, I mean, rather S n plus 1
greater than equal to cardinality of K divided
280
00:27:09,690 --> 00:27:14,340
by cardinality of P to the power of n R L.
So, this follows from this equation.
281
00:27:14,340 --> 00:27:20,010
If you plug-in the value of H K and make it
equal to logarithm of cardinality of K base
282
00:27:20,010 --> 00:27:25,000
2, so you just see, that if I take this and
if I plug here, then I, I, I can actually
283
00:27:25,000 --> 00:27:30,630
write this as logarithm of cardinality of
P base 2 to the power of n R L and from there
284
00:27:30,630 --> 00:27:36,000
I obtain a log here and this is also a log.
So, the subtraction of log would be log a
285
00:27:36,000 --> 00:27:43,000
by b. So, therefore, I can write that as logarithm
of cardinality of K divided by cardinality
286
00:27:43,290 --> 00:27:50,290
of P to the power n R L base 2 and, and on
the left-hand side, I have got S n plus 1.
287
00:27:50,300 --> 00:27:54,110
So, therefore, since I have got two logs on
both sides and we have got an increasing function,
288
00:27:54,110 --> 00:27:59,490
I can actually write that S n plus 1 is greater
than or equal to, rather, the cardinality
289
00:27:59,490 --> 00:28:03,380
of K divided by the cardinality of P to the
power of n R L.
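The chain of inequalities just described can be written out compactly. This is only a sketch in LaTeX notation: the symbol $\bar{s}_n$ for the expected number of spurious keys (spoken as "S n") and the calligraphic letters for the key and plaintext sets are notational assumptions, and the first inequality is the approximate bound on the key equivocation used earlier, which assumes the ciphertext and plaintext alphabets have the same size.

```latex
\begin{align*}
H(K) - n R_L \log_2 |\mathcal{P}| \;&\le\; H(K \mid C^n) \;\le\; \log_2(\bar{s}_n + 1), \\
\text{equiprobable keys: } H(K) = \log_2 |\mathcal{K}|
  \;&\Longrightarrow\;
  \log_2(\bar{s}_n + 1) \;\ge\; \log_2 \frac{|\mathcal{K}|}{|\mathcal{P}|^{\,n R_L}}, \\
\log_2 \text{ increasing}
  \;&\Longrightarrow\;
  \bar{s}_n + 1 \;\ge\; \frac{|\mathcal{K}|}{|\mathcal{P}|^{\,n R_L}} .
\end{align*}
```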
290
00:28:03,380 --> 00:28:07,440
So, what do you obtain from here? I mean,
what is the objective, the objective is, what
291
00:28:07,440 --> 00:28:12,700
is, you note, what is the value of n, what
was n? n was the number of ciphertexts, that
292
00:28:12,700 --> 00:28:18,730
I had provided. So, therefore, now I would
like to make the number of spurious keys equal
293
00:28:18,730 --> 00:28:24,740
to 0 that is the object; that was the objective
finally.
294
00:28:24,740 --> 00:28:30,200
So, unicity distance says, that the, thus
increasing n, we obtain, obtain from here,
295
00:28:30,200 --> 00:28:34,990
that if I increase the value of n, then that
reduces the number of spurious keys. That
296
00:28:34,990 --> 00:28:40,070
means what? If I provide an attacker more
and more information, then the number of spurious
297
00:28:40,070 --> 00:28:45,140
keys gets reduced.
So, unicity distance is that particular number
298
00:28:45,140 --> 00:28:50,620
of ciphertexts, so I call it n 0 for which
the number of ciphertexts, rather the number
299
00:28:50,620 --> 00:28:55,810
of spurious keys is actually reduced to 0.
So, if I, if I, for example in the previous
300
00:28:55,810 --> 00:29:01,110
equation, if I had made the value of n, so
I can actually write, that if I just plug-in
301
00:29:01,110 --> 00:29:04,470
0 to that previous value, I would have obtained
this bound.
302
00:29:04,470 --> 00:29:10,500
You see this, that is, if this was equal to
0, then I would have obtained, would have
303
00:29:10,500 --> 00:29:17,230
obtained, what would have obtained? That cardinality
of K divided by cardinality of P to the power
304
00:29:17,230 --> 00:29:23,940
of n R L is equal to 1. So, that means, that
cardinality of K is equal to cardinality of
305
00:29:23,940 --> 00:29:30,280
P to the power of n R L. So, therefore, that
is exactly this particular equation you see,
306
00:29:30,280 --> 00:29:35,290
that n is equal to logarithm of K base 2 divided
by R L log P base 2.
307
00:29:35,290 --> 00:29:40,400
So, actually, in my equation I can take log
on both sides and I can write log K base 2
308
00:29:40,400 --> 00:29:46,360
is equal to n R L log cardinality of P base
2. So, what is the value of n here? It is
309
00:29:46,360 --> 00:29:53,360
equal to log of cardinality of K base 2 divided
by R L logarithm of P base 2. So, this is
310
00:29:56,380 --> 00:30:03,380
the value of n for which my number of spurious
keys just becomes equal to 0. So, for unicity
311
00:30:04,760 --> 00:30:08,220
distance would be greater than that, so therefore
if I had provided you more and more, more
312
00:30:08,220 --> 00:30:12,010
ciphertexts, then actually, your number of
spurious keys would have been 0.
313
00:30:12,010 --> 00:30:16,950
So, therefore, if I use the cipher within
this unicity distance, then the number of
314
00:30:16,950 --> 00:30:22,070
spurious keys will not be 0. So, that means,
the attacker is not able to exactly find out
315
00:30:22,070 --> 00:30:27,530
the unique value of the key and we have, we
have, so throughout our calculation we actually
316
00:30:27,530 --> 00:30:32,880
consider an unbounded adversary. So, even an
unbounded adversary would not actually
317
00:30:32,880 --> 00:30:36,570
find out the actual value of the key, not
the unique value of the key, but you will
318
00:30:36,570 --> 00:30:42,450
have a set of possible keys. And unicity distance
is actually that number of ciphertexts for
319
00:30:42,450 --> 00:30:45,240
which this number of spurious keys just becomes
equal to 0.
320
00:30:45,240 --> 00:30:50,330
So, therefore, we will obtain this particular
lower bound. So, therefore, this is the lower
321
00:30:50,330 --> 00:30:55,000
bound, obtained from the number of spurious keys,
that is, the lower bound for the
322
00:30:55,000 --> 00:31:01,110
unicity distance. So, beyond that, the attacker
is actually able to find out the unique
323
00:31:01,110 --> 00:31:05,440
value of the key. So, note, that this calculation
may not be accurate for small values of n,
324
00:31:05,440 --> 00:31:10,850
why? Because my original H n definition
relied upon the fact, that limit n tends to
325
00:31:10,850 --> 00:31:13,310
infinity, so that was an assumption that I
made.
326
00:31:13,310 --> 00:31:19,030
So, therefore, this may not be very much true
for n equal to 1, 2 or so on. It should be,
327
00:31:19,030 --> 00:31:21,630
it should be, should be fairly ok for large
values of n.
328
00:31:21,630 --> 00:31:28,630
So, let us do an example calculation of, with
the substitution cipher. So, we had number
329
00:31:28,930 --> 00:31:34,610
of plaintexts characters equal to 26, so the
cardinality of P was 26, cardinality of K
330
00:31:34,610 --> 00:31:40,120
was 26 factorial. So, that was around 4 into
10 to the power of 26, fairly large value
331
00:31:40,120 --> 00:31:46,250
of the key. So, your R L, if you assume is
equal to 0.75 of English language, then if
332
00:31:46,250 --> 00:31:49,480
you plug-in, you will find, that n 0 is approximately
equal to 25.
333
00:31:49,480 --> 00:31:56,000
So, that means, that given a ciphertext string
of length 25, an unbounded attacker can actually
334
00:31:56,000 --> 00:32:02,340
predict the unique value of the key. So, thus,
what we observe from here is that a key size
335
00:32:02,340 --> 00:32:07,370
alone, such a large key size does not guarantee
security if brute force is possible to an
336
00:32:07,370 --> 00:32:11,980
attacker with infinite computational power.
So, an attacker who has got an, has got an
337
00:32:11,980 --> 00:32:18,130
unbound, I mean, has got an infinite computational
power, for him such a big size of key, actually
338
00:32:18,130 --> 00:32:25,120
he requires just 25 ciphertexts to actually
find out the value of the key. All these measures
339
00:32:25,120 --> 00:32:31,170
are of course probabilistic, but it will match
with the actual result quite closely.
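The substitution cipher numbers quoted above can be checked with a few lines of Python. This is a minimal sketch, not part of the lecture: the function names `unicity_distance` and `spurious_key_lower_bound` are made up for illustration, and the redundancy value 0.75 is the figure assumed in the talk.

```python
import math

def unicity_distance(key_space_size, alphabet_size, redundancy):
    """n0 = log2|K| / (R_L * log2|P|), the bound derived above."""
    return math.log2(key_space_size) / (redundancy * math.log2(alphabet_size))

def spurious_key_lower_bound(key_space_size, alphabet_size, redundancy, n):
    """Lower bound |K| / |P|^(n * R_L) - 1 on the expected spurious keys (clamped at 0)."""
    return max(0.0, key_space_size / alphabet_size ** (n * redundancy) - 1)

# Substitution cipher on English: |P| = 26, |K| = 26!, R_L assumed to be 0.75.
keys = math.factorial(26)
n0 = unicity_distance(keys, 26, 0.75)
print(f"26! is about {keys:.2e}, n0 is about {n0:.1f}")  # n0 comes out close to 25

# The bound shrinks as more ciphertext characters are observed.
for n in (5, 15, 25, 30):
    print(n, spurious_key_lower_bound(keys, 26, 0.75, n))
```

As noted in the talk, the estimate relies on a limiting definition of redundancy, so it should be read as an approximation for small n.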
340
00:32:31,170 --> 00:32:36,750
So, this, with this we essentially conclude
this part of unicity distance, but we will
341
00:32:36,750 --> 00:32:43,050
conclude the remaining part with an idea of
product ciphers. So, actually, if, after this
342
00:32:43,050 --> 00:32:49,790
we will actually go, start talking about real
ciphers like block ciphers and stream ciphers,
343
00:32:49,790 --> 00:32:53,320
but before this I would like to mention about
the idea of product ciphers.
344
00:32:53,320 --> 00:32:58,280
So, this was actually also mentioned in Shannon's
paper and that is why it is called the seminal
345
00:32:58,280 --> 00:33:05,260
paper. It was mentioned as far back as 1949,
the idea of forming products. So, the idea
346
00:33:05,260 --> 00:33:10,420
is still fundamental because even present,
present day ciphers like AES for example,
347
00:33:10,420 --> 00:33:15,690
still use the concepts of product ciphers.
So, let us try to understand the concept of
348
00:33:15,690 --> 00:33:21,310
product ciphers and actually you will observe,
through that, a lot of things become meaningful;
349
00:33:21,310 --> 00:33:25,270
a lot of things, which we see in our future
ciphers, will actually become meaningful.
350
00:33:25,270 --> 00:33:27,090
So, let us try to see.
351
00:33:27,090 --> 00:33:31,380
So, before that I would just like, in order
to simplify our life, let us, I just coin
352
00:33:31,380 --> 00:33:35,740
a term called endomorphic ciphers. So, what
is an endomorphic cipher? Endomorphic ciphers
353
00:33:35,740 --> 00:33:39,100
are those ciphers for which the plaintext
and ciphertext are the same sets.
354
00:33:39,100 --> 00:33:44,010
So, for a normal substitution cipher, your
plaintext and ciphertext were just English
355
00:33:44,010 --> 00:33:49,310
language, I mean, English characters.
So, if P and C are the same, then we have
356
00:33:49,310 --> 00:33:54,040
what is called an endomorphic cipher. Therefore,
the shift cipher of an, on English language,
357
00:33:54,040 --> 00:33:57,840
on English alphabets was an example of an
endomorphic cipher.
358
00:33:57,840 --> 00:34:04,840
So, consider an endomorphic cipher and let
us try to understand certain things from history.
359
00:34:04,860 --> 00:34:10,810
So, therefore, if we have an endomorphic cipher,
so C 1, you note, that I have written (P,
360
00:34:10,810 --> 00:34:15,210
P) because P and C are the same things, that
the plaintext set and the ciphertext sets
361
00:34:15,210 --> 00:34:19,220
are the same things.
So, I write (P, P) and then followed, follow
362
00:34:19,220 --> 00:34:25,270
that with K 1 because K 1 denotes the key
set of the encryption function C 1, that is,
363
00:34:25,270 --> 00:34:29,290
the cipher C 1 and you have got an encryption
function e 1 and corresponding decryption
364
00:34:29,290 --> 00:34:36,290
function d 1. You also have a cipher, which
is called C 2 and I denote that with (P, P,
365
00:34:37,179 --> 00:34:41,040
K 2, e 2, d 2).
So, that means what the key two is? The set
366
00:34:41,040 --> 00:34:46,580
key, of the keys for the 2nd cipher e 2, the
corresponding encryption function and d 2
367
00:34:46,580 --> 00:34:51,120
is the corresponding decryption function.
So, let us try to understand or define what
368
00:34:51,120 --> 00:34:57,980
is meant by product cipher C 1 cross C 2. So,
what does C 1 cross C 2 mean? It just means,
369
00:34:57,980 --> 00:35:01,990
like as you know, that you apply 2 functions,
one after the other. So, therefore,
370
00:35:01,990 --> 00:35:07,210
if I say C 1 cross C 2, it means, that first
I will apply C 1 and follow that with the
371
00:35:07,210 --> 00:35:12,110
application of C 2. So, I think, you have
seen these in a case of composition of functions
372
00:35:12,110 --> 00:35:19,110
in So, therefore, this exactly precisely similar
kind of thing, so you see C 1 cross C 2, I
373
00:35:19,220 --> 00:35:24,390
would define as (P, P) because it is still
endomorphic. You see, why it is endomorphic?
374
00:35:24,390 --> 00:35:29,770
Yeah, because that repeated application also
keeps, still keeps, its endomorphic property.
375
00:35:29,770 --> 00:35:34,180
So, it is still endomorphic, but your key
set is the Cartesian product of K 1 and K
376
00:35:34,180 --> 00:35:39,330
2 because there is K 1 cross K 2 and you have
got the corresponding encryption function
377
00:35:39,330 --> 00:35:44,280
e and decryption function d.
So, any key I can write in the form of an
378
00:35:44,280 --> 00:35:51,280
ordered pair, K 1 cross, K 1 comma K 2 and
form ordered sets like that. So, the encryption
379
00:35:51,380 --> 00:35:57,550
function is defined as e equal to e 2 and,
but initially you take the plaintext x, you
380
00:35:57,550 --> 00:36:03,440
take the key K 1 and you apply the function
e 1. Subsequently, you choose the key K 2
381
00:36:03,440 --> 00:36:07,370
and you apply the encryption function e 2.
So, what will the corresponding decryption
382
00:36:07,370 --> 00:36:09,130
function look like?
383
00:36:09,130 --> 00:36:14,300
The corresponding decryption function will
look like this, it will look like d equal
384
00:36:14,300 --> 00:36:21,300
to d 2 d 1 (y, K 1) follow that with K 2 like
this. So, do you see, that if I apply d and
385
00:36:25,500 --> 00:36:31,010
e subsequently, they are actually inverses
of each other; that is obvious because of
386
00:36:31,010 --> 00:36:37,460
the associativity of the product rule. So,
actually you can see that because if I apply
387
00:36:37,460 --> 00:36:44,460
d and if I apply d over an application of
the e function, then actually I obtain back
388
00:36:44,640 --> 00:36:48,400
where I started with. So, you can see this
because of this.
389
00:36:48,400 --> 00:36:55,400
Yeah, it will be first d 1 and follow that
with d 2, yeah, so that you can obtain from
390
00:36:57,940 --> 00:37:04,210
the decryption function because what you do
is, that you take the corresponding, see you
391
00:37:04,210 --> 00:37:11,210
write this e 2 e 1 (x, K 1, K 2). So, that
is my y, so I take this as a y and I apply
392
00:37:14,880 --> 00:37:17,940
my d over that.
So, therefore, what I do is, that I want to
393
00:37:17,940 --> 00:37:24,940
compute this, I want to compute d y, so for
that I apply d 1 d 2 over e 2 e 1 (x, K 1,
394
00:37:29,070 --> 00:37:36,070
K 2) and then, so this is my, so therefore,
this is my e 1. So, therefore, what I do is,
395
00:37:42,680 --> 00:37:49,220
that I have actually encrypted. So, this is
my scope of the function e 2, then I apply
396
00:37:49,220 --> 00:37:56,220
d 2. So, in order to decrypt this I need K
2 and follow that with K 1. So, what you do
397
00:37:56,270 --> 00:38:02,650
is that you see, that in this case your d
2 and e 2 cancels each other. So, I can do
398
00:38:02,650 --> 00:38:08,320
that because of the associativity of the product
function. So, if I do that, I obtain d 1 and
399
00:38:08,320 --> 00:38:15,320
I obtain then e 1 x of K 1, K 1. So, again,
this d 1 and e 1 cancels each other and I
400
00:38:17,730 --> 00:38:21,370
finally obtain back the value of x.
So, therefore, you, therefore you see, that
401
00:38:21,370 --> 00:38:28,370
d is actually a corresponding decryption function.
So, this follows because of this fact, that
402
00:38:31,140 --> 00:38:35,550
is, the product rule is always associative.
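The encryption and decryption order of a product cipher can be sketched in Python. This is an illustrative sketch only: it instantiates C 1 as a multiplicative cipher and C 2 as a shift cipher on the 26-letter alphabet, and the function names (`product_enc`, `product_dec`, and so on) are assumptions made for the example.

```python
N = 26  # alphabet size; letters are represented as 0..25

def mult_enc(x, a):
    """e1: multiply by a, where gcd(a, 26) = 1."""
    return (a * x) % N

def mult_dec(y, a):
    """d1: multiply by the inverse of a modulo 26 (found by search)."""
    a_inv = next(b for b in range(N) if (a * b) % N == 1)
    return (a_inv * y) % N

def shift_enc(x, k):
    """e2: add k modulo 26."""
    return (x + k) % N

def shift_dec(y, k):
    """d2: subtract k modulo 26."""
    return (y - k) % N

def product_enc(x, k1, k2):
    """e(x) = e2(e1(x, k1), k2): apply C1 first, then C2."""
    return shift_enc(mult_enc(x, k1), k2)

def product_dec(y, k1, k2):
    """d(y) = d1(d2(y, k2), k1): undo C2 first, then C1."""
    return mult_dec(shift_dec(y, k2), k1)

key = (7, 3)  # an ordered pair (k1, k2) from K1 x K2
print(all(product_dec(product_enc(x, *key), *key) == x for x in range(N)))
```

The inner pair d 2 and e 2 cancels first, and then d 1 and e 1 cancel, which is the cancellation argued above.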
403
00:38:35,550 --> 00:38:42,150
So, the question is, that if we can compute
product of ciphers, does the cipher become
404
00:38:42,150 --> 00:38:46,600
stronger, that is what is most important.
So, I take two small ciphers and I compose
405
00:38:46,600 --> 00:38:51,130
them, I compute the product. Does the key,
that is, does the cipher become stronger,
406
00:38:51,130 --> 00:38:56,350
that means, does the key space become really
larger? So, in the initial, on the surface
407
00:38:56,350 --> 00:39:01,070
we have actually (K 1, K 2).
So, the Cartesian product size should increase,
408
00:39:01,070 --> 00:39:05,330
but on a second thought, does it really become
larger? So, in order to understand that, what
409
00:39:05,330 --> 00:39:12,330
is your opinion on that? Will it really become
larger; will the key size become larger?
410
00:39:13,280 --> 00:39:20,280
So, actually, not always, so let us try to
consider one example, I guess it will be a lot
411
00:39:22,370 --> 00:39:25,690
clearer through that. So, let us consider a
simple multiplicative cipher. So, what it
412
00:39:25,690 --> 00:39:31,550
does is, that it just takes x and multiplies
with a, where a is co-prime to 26. So, you
413
00:39:31,550 --> 00:39:34,920
know what is co-prime?
So, a is co-prime means, it is GCD of a and
414
00:39:34,920 --> 00:39:41,920
26 is 1 and next is you consider a shift cipher
where you take x and you add that with K.
415
00:39:42,550 --> 00:39:48,980
So, therefore, this is my m and this is my
S, so this was an example of what? It was
416
00:39:48,980 --> 00:39:55,980
an example of a permutation cipher and this
was an example of a substitution cipher.
417
00:39:56,680 --> 00:40:03,680
So, now you consider, that for example, that
we have got M and you have got S and what
418
00:40:03,810 --> 00:40:08,500
you do is that you just compose this M cross
S and you obtain this, that is, y is equal
419
00:40:08,500 --> 00:40:14,190
to ax plus K is, you see, you, first of all,
have, so you need to first do the multiplication.
420
00:40:14,190 --> 00:40:21,190
So, you do a S, I mean, ax and then, you need
to do S. So, you add K with ax, so what is
421
00:40:22,300 --> 00:40:27,650
the key? In this case the key is (a, K)
and you know, that this is an example of what?
422
00:40:27,650 --> 00:40:32,310
It is an example of affine cipher. So, what
was the key size in case of English language?
423
00:40:32,310 --> 00:40:36,480
What was the size of affine cipher? It was
equal to 312.
424
00:40:36,480 --> 00:40:43,480
So, now you consider S cross M, so
for S cross M, y is equal to a into x plus
425
00:40:43,580 --> 00:40:50,020
K, so that is, you can also write that as
ax plus a K. Now, you know, that GCD of (a,
426
00:40:50,020 --> 00:40:57,020
26) is 1. So, therefore, this is also an affine
cipher and the key would be (a, a K), but
427
00:40:57,440 --> 00:41:03,920
since the GCD of a and 26 is 1, so an inverse
exists. This we discussed and actually there
428
00:41:03,920 --> 00:41:09,619
is a one to one relation between a K and K.
So, therefore, the total size of the key space
429
00:41:09,619 --> 00:41:15,520
in S cross M is still 312.
So, you see, that here we have got the key
430
00:41:15,520 --> 00:41:21,770
a K and here we have the key K, but since
there is an inverse of a, that is, actually
431
00:41:21,770 --> 00:41:26,220
a one to one correspondence between this set
and this set. So, if there are 26 possibilities
432
00:41:26,220 --> 00:41:31,710
of K, then also a K, since you are doing a
modulo 26 also has got 26 possibilities.
433
00:41:31,710 --> 00:41:37,730
So, that means, this set and this set are
essentially the same things. So, that means,
434
00:41:37,730 --> 00:41:44,730
in both the cases you do M cross S or you
do S cross M, your key size is still 312.
435
00:41:44,760 --> 00:41:51,760
So, what you see is that M cross S and S cross
M are same. So, that is what is called commutative.
436
00:41:53,130 --> 00:42:00,130
So, we have got commutative ciphers. So, it
means that M cross S and S cross M, when they
437
00:42:00,710 --> 00:42:05,040
are same we call them to be commutative and
this is an example of a commutative cipher.
438
00:42:05,040 --> 00:42:09,880
It does not matter whether you do first shift
and then multiply or you do first multiply
439
00:42:09,880 --> 00:42:11,890
and then shift, both are the same things.
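The claim that M cross S and S cross M give the same cipher with 312 keys can be checked by brute force. The following Python sketch enumerates every encryption map as a lookup table; the helper name `table` and the variable names are assumptions for illustration.

```python
from math import gcd

N = 26
UNITS = [a for a in range(N) if gcd(a, N) == 1]  # the 12 values of a co-prime to 26

def table(f):
    """Represent an encryption map by its full lookup table on 0..25."""
    return tuple(f(x) for x in range(N))

# M cross S: multiply first, then shift, y = a*x + K.
m_then_s = {table(lambda x, a=a, k=k: (a * x + k) % N)
            for a in UNITS for k in range(N)}

# S cross M: shift first, then multiply, y = a*(x + K) = a*x + a*K.
s_then_m = {table(lambda x, a=a, k=k: (a * (x + k)) % N)
            for a in UNITS for k in range(N)}

print(len(m_then_s), len(s_then_m), m_then_s == s_then_m)  # 312 312 True
```

Both orders produce exactly the same set of 312 affine maps, which is the sense in which the two ciphers commute.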
440
00:42:11,890 --> 00:42:17,300
So, then, let us see what an idempotent cipher
is? So, therefore, what is an idempotent function?
441
00:42:17,300 --> 00:42:20,369
It means, if we apply the same function twice,
you obtain back the same function.
442
00:42:20,369 --> 00:42:25,520
So, therefore, M is a permutation cipher,
S was a case of a substitution ciphers and
443
00:42:25,520 --> 00:42:31,650
both of them were actually idempotent ciphers.
So, a composed cipher has a larger key, but
444
00:42:31,650 --> 00:42:37,810
no extra security because M cross M, if it
is equal to M, then even composing M S for
445
00:42:37,810 --> 00:42:41,560
more than the, more than once, essentially
leaves you with the same kind of transformation.
446
00:42:41,560 --> 00:42:47,119
So, essentially, it does not add to your security.
So, therefore, for example if you had computed
447
00:42:47,119 --> 00:42:53,310
M cross M or S cross S, that would not have
led to the increase of the key space. So,
448
00:42:53,310 --> 00:42:57,980
this is because S cross S is equal to
S and M cross M is also equal to M and these
449
00:42:57,980 --> 00:43:04,619
class of ciphers are called idempotent ciphers.
So, you could easily observe from this fact,
450
00:43:04,619 --> 00:43:09,150
that is, if I had done an M cross M, what
would have been the, would have, what would
451
00:43:09,150 --> 00:43:15,290
have that meant? I would have done a x, then
a a x, but my key size would not have still
452
00:43:15,290 --> 00:43:19,630
increased because doing M cross M essentially
does not increase the key space.
453
00:43:19,630 --> 00:43:23,150
Similarly, consider for shift cipher, you
do one shift and you do another shift. So,
454
00:43:23,150 --> 00:43:27,390
in both the cases you can represent that,
you can represent that, that by a 3rd shift,
455
00:43:27,390 --> 00:43:30,190
so, do you understand what I am saying?
456
00:43:30,190 --> 00:43:36,690
So, what I am saying is that if you consider,
say S as x plus K 1 for example, so I am just
457
00:43:36,690 --> 00:43:41,760
considering S cross S and I am just trying
to argue, that S cross S is actually equal
458
00:43:41,760 --> 00:43:46,980
to S. So, what is the idea? So, therefore,
imagine that in the 1st phase we have got,
459
00:43:46,980 --> 00:43:53,450
we have got, we have got the function as you
choose the key as K 1 and in the 2nd case,
460
00:43:53,450 --> 00:43:58,250
you choose the key as K 2. So, on the 1st
application of S, I would have computed x
461
00:43:58,250 --> 00:44:04,380
plus K 1 and in the 2nd application
of S, I would have added K 2. Therefore,
462
00:44:04,380 --> 00:44:10,010
I would have obtained x plus K 1 plus K 2
so that, since I am doing mod 26, I can always
463
00:44:10,010 --> 00:44:16,000
represent that as x plus some K 3 mod 26,
where K 3 is nothing but the summation of
464
00:44:16,000 --> 00:44:19,880
K 1 and K 2.
So, that means, even for S cross S I have
465
00:44:19,880 --> 00:44:25,550
got the same size of the key, so it is a same
cipher. Therefore, I can conclude, that S
466
00:44:25,550 --> 00:44:31,320
cross S is equal to S. Similarly, for M cross
M also, you can actually show that M cross
467
00:44:31,320 --> 00:44:35,310
M is also equal to M. So, both these classes
of ciphers are something which we call as
468
00:44:35,310 --> 00:44:42,210
idempotent ciphers.
So, therefore, we have defined what the commutative
469
00:44:42,210 --> 00:44:47,950
cipher is and we have defined what an idempotent
cipher is, and what we will now consider is
470
00:44:47,950 --> 00:44:51,800
what happens if you compute the product of
such kind of ciphers, which are commutative
471
00:44:51,800 --> 00:44:53,300
as well as idempotent.
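The idempotence of the shift cipher and of the multiplicative cipher can also be verified by enumeration. This is a small Python sketch under the same assumptions as before (letters as integers modulo 26, maps compared as lookup tables); the names are illustrative only.

```python
from math import gcd

N = 26
UNITS = [a for a in range(N) if gcd(a, N) == 1]

def table(f):
    """Represent an encryption map by its full lookup table on 0..25."""
    return tuple(f(x) for x in range(N))

# S and S cross S: one shift versus two successive shifts with keys K1, K2.
shift_once = {table(lambda x, k=k: (x + k) % N) for k in range(N)}
shift_twice = {table(lambda x, k1=k1, k2=k2: (x + k1 + k2) % N)
               for k1 in range(N) for k2 in range(N)}

# M and M cross M: one multiplication versus two successive multiplications.
mult_once = {table(lambda x, a=a: (a * x) % N) for a in UNITS}
mult_twice = {table(lambda x, a=a, b=b: (b * (a * x)) % N)
              for a in UNITS for b in UNITS}

print(len(shift_once), len(shift_twice), shift_once == shift_twice)  # 26 26 True
print(len(mult_once), len(mult_twice), mult_once == mult_twice)      # 12 12 True
```

Although the composed ciphers nominally have 676 and 144 key pairs, they realise only the original 26 and 12 maps.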
472
00:44:53,300 --> 00:45:00,300
So, actually, that you can observe from this
fact, so what we are trying to observe is,
473
00:45:02,440 --> 00:45:06,900
that there is no point of obtaining products
of idempotent ciphers. So, if you take M cross
474
00:45:06,900 --> 00:45:11,860
M, it is the same thing as M. So, there is no
point of doing such products. So, rather you
475
00:45:11,860 --> 00:45:15,660
would get product ciphers from non-idempotent
ciphers, that is, by iterating them.
476
00:45:15,660 --> 00:45:20,369
So, if we have some non-idempotent ciphers,
I would have liked to iterate them and therefore,
477
00:45:20,369 --> 00:45:25,119
that is essentially the concept of round,
which exists into all classes of symmetric
478
00:45:25,119 --> 00:45:29,410
ciphers in today's world.
So, the question is how to make non-idempotent
479
00:45:29,410 --> 00:45:34,590
ciphers or functions? So, if I, the idea would
be, that compose 2 small different cryptosystems,
480
00:45:34,590 --> 00:45:35,960
which do not commute.
481
00:45:35,960 --> 00:45:41,380
So, do you follow this? If you do not follow
them, then this will become clear because
482
00:45:41,380 --> 00:45:45,670
of this calculation. So, what I said here
is, that if there are 2 cryptosystems, which
483
00:45:45,670 --> 00:45:49,450
are idempotent and also commute, then the
product is also idempotent.
484
00:45:49,450 --> 00:45:55,869
So, if this result is true, what does it mean?
It means that if you have got 2 cryptosystems,
485
00:45:55,869 --> 00:46:00,710
which are idempotent and also commute, then
the product is also idempotent. So, what does
486
00:46:00,710 --> 00:46:07,040
it mean? It means, that if we obtain a function
of this class, then if you take and if you
487
00:46:07,040 --> 00:46:12,700
take them and if you still product or rather
compute the product of those kinds of ciphers,
488
00:46:12,700 --> 00:46:17,410
the key size does not increase.
So, therefore, considering products does not
489
00:46:17,410 --> 00:46:22,090
help. Therefore, from this theorem or rather
this result, we know, that we actually require
490
00:46:22,090 --> 00:46:28,130
to, I mean, compute the products of ciphers,
which even if they are idempotent, they do
491
00:46:28,130 --> 00:46:29,920
not commute.
492
00:46:29,920 --> 00:46:36,270
So, that explains this point, that is, compose
two small different cryptosystems, which do
493
00:46:36,270 --> 00:46:43,150
not commute. So, those kinds of ciphers, if
you iterate them, will actually make sense.
494
00:46:43,150 --> 00:46:49,940
So, therefore, let us see this result, it
is quite simple, it says, that S 1 and S 2
495
00:46:49,940 --> 00:46:56,090
are two such cryptosystems, which are idempotent
and at the same time they commute. So, S 1
496
00:46:56,090 --> 00:47:03,090
cross S 2 cross S 1 cross S 2, I am considering
the product of these things, so therefore,
497
00:47:03,180 --> 00:47:08,840
so if I observe that, then from the associativity
I can write it like S 1 cross (S 2 cross S 1)
498
00:47:08,840 --> 00:47:14,840
cross S 2 and since this commutes, S 2 cross
S 1 becomes equal to S 1 cross S 2.
499
00:47:14,840 --> 00:47:21,840
So, now you know, that S 1 cross S 1 is equal
to S 1 and S 2 cross S 2 is also equal to
500
00:47:22,520 --> 00:47:26,930
S 2, so what you have obtained is, that S
1 cross S 2 and product and multiplying that
501
00:47:26,930 --> 00:47:30,350
with S 1 cross S 2, essentially leaves you
with S 1 cross S 2.
502
00:47:30,350 --> 00:47:35,660
So, what does it mean? It means that this
is an idempotent function. So, in your previous
503
00:47:35,660 --> 00:47:40,320
case we have proved, that M cross S, M and
S were essentially, both of them were idempotent.
504
00:47:40,320 --> 00:47:46,050
So, therefore, can you, can you, can you show,
can you understand why M cross S is also idempotent?
505
00:47:46,050 --> 00:47:51,080
Why? It is because we have proved, that M cross
S is equal to S cross M. So, that means, M
506
00:47:51,080 --> 00:47:56,780
and S were commutative and we have also proved
that M cross M is equal to M and S cross S
507
00:47:56,780 --> 00:48:01,890
is equal to S. So, that is they were idempotent
as well. So, if you compute their products,
508
00:48:01,890 --> 00:48:04,820
then essentially, you are left with the same
thing.
509
00:48:04,820 --> 00:48:09,850
So, therefore, computing their products and
composing them does not help. So, therefore,
510
00:48:09,850 --> 00:48:16,850
you require some other additional quantity,
which will help you and that is the idea of
511
00:48:16,980 --> 00:48:17,460
rounds.
512
00:48:17,460 --> 00:48:22,240
So, therefore, the idea is that how can you
work? What is that feature? So, till now,
513
00:48:22,240 --> 00:48:26,190
whatever we have seen, there is a feature
missing, which we have not yet seen then.
514
00:48:26,190 --> 00:48:30,280
Therefore, the concept of round we are still
not able to achieve.
515
00:48:30,280 --> 00:48:36,920
So, what is that concept? That concept is
of non-linearity, but I will define that non-linearity
516
00:48:36,920 --> 00:48:41,150
concept further in our subsequent classes,
but this is a brief introduction to that.
517
00:48:41,150 --> 00:48:45,869
So, therefore, consider that instead of, I
mean, let us consider these 2 functions, S
518
00:48:45,869 --> 00:48:52,869
and P, where P is actually equal to x plus
K and S is the output of a function f x. Now,
519
00:48:54,390 --> 00:48:59,850
I claim, that this function is actually a
non-linear function with respect to addition
520
00:48:59,850 --> 00:49:00,660
operation. So, what does it mean?
521
00:49:00,660 --> 00:49:07,010
So what does it mean? So it means that if,
so therefore, what does a non-linear function
522
00:49:07,010 --> 00:49:13,480
mean? It means that if you consider f of x
1 plus x 2, so non-linear with respect to
523
00:49:13,480 --> 00:49:19,000
plus non-linearity is always with respect
to an operation. So, f of x 1 plus x 2 is
524
00:49:19,000 --> 00:49:26,000
not equal to f of x 1 plus f of x 2 and equality
would have meant linearity.
525
00:49:26,130 --> 00:49:32,450
So, therefore, now consider these function
S is equal to f x and P equal to x plus K
526
00:49:32,450 --> 00:49:39,450
and consider S cross P. So, what is S cross
P equal to? It is equal to f x plus K and
527
00:49:41,590 --> 00:49:48,090
what is S cross P cross S cross P? So, that,
that means, that you take f x plus K and then
528
00:49:48,090 --> 00:49:54,369
you do a further application of f and add
that with k. So, for this multiplication to
529
00:49:54,369 --> 00:50:00,020
increase, to increase the value of the length
of the key, so thus what, what is needed?
530
00:50:00,020 --> 00:50:03,260
Therefore, it is needed, that S cross P should
not be idempotent.
531
00:50:03,260 --> 00:50:10,260
So, if that, so what we require is, that f
of f x plus K, which is equal to this particular
532
00:50:10,960 --> 00:50:16,390
thing. When it is added with K should not
be equal to f square x plus K dash because
533
00:50:16,390 --> 00:50:22,740
if this, if you had a linear function f, then
you could actually, if this were a linear
534
00:50:22,740 --> 00:50:26,930
function you could have actually distributed
this and this would have computed to some
535
00:50:26,930 --> 00:50:33,760
f square x plus some value of K dash.
So, that you see is exactly similar with your
536
00:50:33,760 --> 00:50:40,760
S cross P function, that is, an application
of f followed by addition of a key K. So, therefore,
537
00:50:42,119 --> 00:50:49,119
what it, so therefore this happens only if
f is non-linear with respect to plus. So,
538
00:50:50,940 --> 00:50:55,010
if this was the linear function f, then actually
it would have distributed and that result,
539
00:50:55,010 --> 00:50:59,000
that we would have obtained, would have obtained,
would have been similar to that of S cross P.
540
00:50:59,000 --> 00:51:04,060
So, the size of this function, so the size
of the key of this function and the size of
541
00:51:04,060 --> 00:51:09,990
this composed function would have been the
same. So, therefore, what we need is something
542
00:51:09,990 --> 00:51:14,500
a deviation from this fact. So, we need not
linearity, but we need non-linearity. So,
543
00:51:14,500 --> 00:51:20,930
therefore, hence we have to compose linear
and non-linear functions to increase the security
544
00:51:20,930 --> 00:51:26,030
of a cipher. So, in order to increase the
security of a cipher, what we have seen till
545
00:51:26,030 --> 00:51:29,450
now are only linear components, linear transformations.
We essentially, found out multiplication with
546
00:51:29,450 --> 00:51:35,619
a matrix, which is a linear operator; addition
with the key, that is, also, that is also,
547
00:51:35,619 --> 00:51:42,619
that is also a linear function. So, therefore,
all these are linear transformations. Therefore,
548
00:51:42,670 --> 00:51:47,109
we need, therefore you see that nicely from
Shannon's theory, we can actually arrive at
549
00:51:47,109 --> 00:51:51,920
the fact that we require at composition of
linear functions and as well as non-linear
550
00:51:51,920 --> 00:51:52,940
functions.
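As a small illustration of this point, here is a minimal sketch in Python. The modulus 11, the linear map f(x) = 3x and the non-linear map f(x) = x cube are assumptions chosen only for this example; they are not taken from the lecture. The sketch counts how many distinct encryption functions f(f(x) + K1) + K2 arise over all key pairs (K1, K2).

```python
from itertools import product

N = 11  # small prime modulus, an assumption made only for this illustration


def linear_f(x):
    # Linear with respect to addition mod N: f(a + b) = f(a) + f(b).
    return (3 * x) % N


def nonlinear_f(x):
    # Cubing is a permutation of Z_11 (gcd(3, 10) = 1) but is not linear.
    return pow(x, 3, N)


def count_distinct_double_encryptions(f):
    # Build the full lookup table of x -> f(f(x) + k1) + k2 for every
    # key pair (k1, k2) and count how many different tables appear.
    tables = set()
    for k1, k2 in product(range(N), repeat=2):
        table = tuple((f((f(x) + k1) % N) + k2) % N for x in range(N))
        tables.add(table)
    return len(tables)


linear_count = count_distinct_double_encryptions(linear_f)
nonlinear_count = count_distinct_double_encryptions(nonlinear_f)
print("linear f:", linear_count, "distinct maps from", N * N, "key pairs")
print("non-linear f:", nonlinear_count, "distinct maps from", N * N, "key pairs")
```

With the linear f, the two keys collapse into the single value K dash = f(K1) + K2, so the effective key space stays at 11; with the non-linear f in this toy setting, every one of the 121 key pairs gives a different map.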
551
00:51:52,940 --> 00:51:58,570
So, with this I conclude my talk, but I would
like to give an assignment, which you are
552
00:51:58,570 --> 00:52:03,340
supposed to do and submit before the 20th,
both of them; I have given 1 assignment
553
00:52:03,340 --> 00:52:07,810
already. The other assignment is to show that
the unicity distance of the Hill Cipher with
554
00:52:07,810 --> 00:52:13,300
an m cross m encryption function is actually
less than m divided by R L, where R L is the
555
00:52:13,300 --> 00:52:18,700
redundancy, as defined in the class.
So, you can show, that the unicity distance
556
00:52:18,700 --> 00:52:24,630
of the Hill cipher with an m cross m encryption
function is actually less than m divided by
557
00:52:24,630 --> 00:52:26,590
R L.
558
00:52:26,590 --> 00:52:33,080
So, that is an assignment, which is given
to you and you can read further things from
559
00:52:33,080 --> 00:52:37,190
Shannon's work, which is Communication Theory
of Secrecy Systems. It is actually a paper,
560
00:52:37,190 --> 00:52:41,680
it is a classical paper;
it appeared in the Bell System Technical Journal,
561
00:52:41,680 --> 00:52:45,750
but I am sure that you will get it online. And
the other text book that I have followed is
562
00:52:45,750 --> 00:52:50,570
Douglas Stinson's Cryptography: Theory and
Practice; the 2nd edition is the one you can follow.
563
00:52:50,570 --> 00:52:53,940
So, I have followed that book, although the
3rd edition exists.
564
00:52:53,940 --> 00:52:58,180
And the next day's topic would be symmetric key
ciphers, so we will use these concepts to go
565
00:52:58,180 --> 00:53:02,570
and build ciphers now.
So, therefore, a symmetric cipher is our next
566
00:53:02,570 --> 00:53:06,329
day's topic and we will start with block ciphers
and follow that with stream ciphers.