1
00:00:18,609 --> 00:00:25,609
So, in today's talk, we will conclude the
topic of Shannon's theory. So, in the previous
2
00:00:26,349 --> 00:00:32,500
class, if you remember, that we had concluded
with the idea of equivocation of keys. So,
3
00:00:32,500 --> 00:00:38,899
we have defined briefly, that what is the
idea behind spurious keys. So, today, we will
4
00:00:38,899 --> 00:00:44,430
try to understand, whether we can compute
a lower bound of the spurious keys.
5
00:00:44,430 --> 00:00:50,680
So, if I just would like to recap, that we
essentially proved this particular formula
6
00:00:50,680 --> 00:00:57,680
yesterday, that is, HK given C is equal to
HK plus HP minus HC. So, therefore, the ambiguity
7
00:00:57,860 --> 00:01:03,030
of a key, given the ciphertext is described
as follows, that it is addition of the ambiguity
8
00:01:03,030 --> 00:01:10,030
of the key plus the ambiguity of the plaintext
subtracted with the ambiguity of the cipher
9
00:01:10,240 --> 00:01:12,180
text.
10
00:01:12,180 --> 00:01:18,070
So, we also discussed about what is the meaning
of ideal ciphers, and told, that in case of
11
00:01:18,070 --> 00:01:25,000
an ideal cipher, HK given C is equal to
the value of HK. So, that means, that the
12
00:01:25,000 --> 00:01:28,400
cipher text does not leak any additional information
about the key.
13
00:01:28,400 --> 00:01:34,110
So, these are the things that we described
yesterday and concluded with it, and so, therefore,
14
00:01:34,110 --> 00:01:41,110
HK given C gives us the idea of the security
or insecurity. Therefore, what we discussed
15
00:01:41,640 --> 00:01:46,180
is that even for perfect ciphers the key size
is infinite, if the message size is infinite.
16
00:01:46,180 --> 00:01:51,369
So, that was the problem with perfect ciphers,
that is, they were not practical. So, when
17
00:01:51,369 --> 00:01:57,420
we define another kind of ciphers, called
the ideal ciphers, where HK given C is equal
18
00:01:57,420 --> 00:02:04,420
to HK and as I described yesterday, that the
main objective of our, what we study in this
19
00:02:05,729 --> 00:02:11,519
course, would be to find ciphers, which are
secure against a bounded adversary. That
20
00:02:11,519 --> 00:02:18,099
means we are essentially striving to achieve
computational security, that means, security,
21
00:02:18,099 --> 00:02:20,720
considering what is today's computational
power.
22
00:02:20,720 --> 00:02:25,879
So, for example, if you can prove that a given
cipher has got a security, which requires
23
00:02:25,879 --> 00:02:30,340
an adversary to do, say, up to 2 to the power
of 80 computations, then we are fairly happy
24
00:02:30,340 --> 00:02:35,819
and we say, that the cipher has achieved computational
security.
25
00:02:35,819 --> 00:02:42,819
So, the question is how to protect? So, therefore,
but still in today's class, we will be essentially,
26
00:02:43,040 --> 00:02:48,910
still considering an unbounded adversary and
essentially, address this question, that is,
27
00:02:48,910 --> 00:02:54,500
how do I protect data against a brute force
attacker with an infinite computational power?
28
00:02:54,500 --> 00:02:59,459
Therefore, that is, an attacker who has got
infinite computational power and still, can
29
00:02:59,459 --> 00:03:03,140
I protect a cipher?
So, the idea is, that is, Shannon defined
30
00:03:03,140 --> 00:03:08,020
a certain parameter, which is called the unicity
distance, and he said, that, that is the least
31
00:03:08,020 --> 00:03:13,750
amount of ciphertext, which I would like to
make it available to this, to the adversary,
32
00:03:13,750 --> 00:03:18,489
who is an unbounded adversary, so that he
does not find out a unique value of the key.
33
00:03:18,489 --> 00:03:24,200
So, the strategy of the, of the adversary
is as follows, what he does is, is that he
34
00:03:24,200 --> 00:03:30,110
takes the ciphertext, he assumes a key, decrypts
it back and finds out the plaintext. If it,
35
00:03:30,110 --> 00:03:35,790
if the plaintext is meaningful, then he notes
it down, but the point is that because of
36
00:03:35,790 --> 00:03:41,450
the redundancy in the English language or
rather any other, any language for that matter,
37
00:03:41,450 --> 00:03:45,680
the adversary will not actually find out the
key uniquely. So, what he will
38
00:03:45,680 --> 00:03:51,430
essentially have? He will have a set of keys.
So, apart from the actual key, the other keys
39
00:03:51,430 --> 00:03:58,430
are called the spurious keys.
Sir, how will the attacker say, that doing
40
00:04:02,019 --> 00:04:07,110
the iterations and how will you actually decide,
how will the computer decide that your text
41
00:04:07,110 --> 00:04:07,360
is meaningful? We cannot check each and every
file.
42
00:04:07,110 --> 00:04:09,730
No, no. So, the idea is as follows, that is,
you take the ciphertext, consider a shift
43
00:04:09,730 --> 00:04:15,629
cipher, so you have got a unique value of
the key, so you find out the value of the
44
00:04:15,629 --> 00:04:22,400
key, you, so what you do is that you decrypt
it back and then you check, that whether the
45
00:04:22,400 --> 00:04:26,270
plaintext makes sense or not? So, what, what
you are asking is that you have got a lot
46
00:04:26,270 --> 00:04:30,340
of work to do, but what I told you at the
beginning is that, I am considering what?
47
00:04:30,340 --> 00:04:34,779
I am considering an unbounded adversary. So,
the adversary has got infinite computational
48
00:04:34,779 --> 00:04:40,210
power, he is, he is supreme.
So, the idea is that, whether, the question
49
00:04:40,210 --> 00:04:44,999
is that, whether he, even given, such a kind
of, such a kind of adversary, how many minimum
50
00:04:44,999 --> 00:04:50,189
number of ciphertext will I make it available
to the adversary, so that he cannot still
51
00:04:50,189 --> 00:04:55,999
guess the actual value of the key? So, therefore,
the set of the spurious keys should be null.
52
00:04:55,999 --> 00:05:00,089
Was that minimum number or maximum number?
Minimum number because if I give you more
53
00:05:00,089 --> 00:05:03,809
information, then you can do more things,
so I would like to give the adversary minimum
54
00:05:03,809 --> 00:05:09,900
number of cipher texts. Is it clear?
55
00:05:09,900 --> 00:05:15,619
So, therefore, a common misconception is that
any cipher can be broken by exhaustively trying
56
00:05:15,619 --> 00:05:20,490
all possible keys; this is a very common misconception.
So, what you would like to say is that for
57
00:05:20,490 --> 00:05:26,139
example, even if DES, it has got a 56 bit
key, so this can also be broken by brute force.
58
00:05:26,139 --> 00:05:31,649
But the idea is that, so, suppose, let us
consider an adversary who can do 2 power of
59
00:05:31,649 --> 00:05:35,369
56 computations, so you should be able to
break it; so, the idea is that. But the, if
60
00:05:35,369 --> 00:05:40,550
the cipher is used within its unicity distance,
then even an all-powerful adversary cannot break
61
00:05:40,550 --> 00:05:45,399
the cipher. It is because of the strategy,
the strategy is therefore, what?
62
00:05:45,399 --> 00:05:49,800
You take the ciphertext, you decrypt it back,
check the plaintext, whether it makes sense
63
00:05:49,800 --> 00:05:55,309
or not? If it makes sense, you note the key
and register the key as a meaningful
64
00:05:55,309 --> 00:05:59,899
key or rather a possible key, but that is
only one key, which is the actual key, rest
65
00:05:59,899 --> 00:06:03,930
are spurious keys.
What is the significance of unicity distance
66
00:06:03,930 --> 00:06:05,460
sir?
Yes.
67
00:06:05,460 --> 00:06:10,399
Significance of unicity distance?
The significance of unicity distance is that,
68
00:06:10,399 --> 00:06:15,249
for example, if I can compute the unicity
distance of say, DES or any other cipher,
69
00:06:15,249 --> 00:06:18,960
then I would like to use the same key for
so many times, after that I have to change
70
00:06:18,960 --> 00:06:22,029
the key.
If I am considering an unbounded adversary,
71
00:06:22,029 --> 00:06:28,270
so for example, if I am considering an unbounded
adversary, then I would like to use the key
72
00:06:28,270 --> 00:06:32,679
only for, say, two times; after that I would
like to change the key, if its unicity distance
73
00:06:32,679 --> 00:06:38,800
is two. So, therefore this is what I have
just described.
74
00:06:38,800 --> 00:06:44,039
So, HK given C is the amount of uncertainty
that remains of the key after the ciphertext
75
00:06:44,039 --> 00:06:49,449
is revealed. So, we know that, we know, it
is called the key equivocation; we have already
76
00:06:49,449 --> 00:06:52,429
defined that
So, what the attacker does is that he guesses
77
00:06:52,429 --> 00:06:57,599
the key from the ciphertext and he shall guess
the key and decrypt the cipher. So, what he
78
00:06:57,599 --> 00:07:02,089
does next is that he checks, whether the plaintext
obtained is meaningful English or not? If
79
00:07:02,089 --> 00:07:06,210
not, he rules out the key.
But due to the redundancy of language, more
80
00:07:06,210 --> 00:07:11,969
than one key will actually pass this test.
Those keys, apart from the correct key, are
81
00:07:11,969 --> 00:07:15,990
called spurious.
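The attacker's strategy described here can be sketched for a shift cipher. This is only an illustration, assuming a tiny hypothetical word list (MEANINGFUL) in place of a real test for meaningful English, and a five-letter ciphertext chosen for the example:

```python
# Hypothetical stand-in for "meaningful English": a tiny word list.
MEANINGFUL = {"river", "arena"}

def shift_decrypt(ciphertext, key):
    """Decrypt a lowercase shift-cipher text by moving each letter back by key."""
    a = ord("a")
    return "".join(chr((ord(ch) - a - key) % 26 + a) for ch in ciphertext)

def surviving_keys(ciphertext):
    """Try all 26 keys and keep those whose decryption passes the meaning test."""
    result = {}
    for key in range(26):
        plaintext = shift_decrypt(ciphertext, key)
        if plaintext in MEANINGFUL:
            result[key] = plaintext
    return result

candidates = surviving_keys("wnajw")
print(candidates)  # {5: 'river', 22: 'arena'}
```

Two keys pass the test, so whichever one was actually used, the other is a spurious key.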
82
00:07:15,990 --> 00:07:22,759
So, then we come to something which is called,
we have already discussed about, what is entropy?
83
00:07:22,759 --> 00:07:29,759
We will be using this, rather, we will be
using entropy to find out, or find out the
84
00:07:30,059 --> 00:07:36,749
minimum bound of the spurious keys. So, we
will try to find out a lower bound of the
85
00:07:36,749 --> 00:07:41,949
number of spurious keys.
So, therefore, so, therefore, consider H L
86
00:07:41,949 --> 00:07:47,240
and H L is used to measure the amount of information
per letter of meaningful strings of plaintext.
87
00:07:47,240 --> 00:07:53,249
So, that is the definition of H L, it measures
the amount of information per letter of meaningful
88
00:07:53,249 --> 00:07:56,729
strings of plaintext.
So, therefore, consider a random string of
89
00:07:56,729 --> 00:08:01,379
alphabets. So, there are 26 letters and if
all of them are equally likely, then what
90
00:08:01,379 --> 00:08:08,379
is the entropy? It is equal to log 26 base
2 and so, that works to, computes to 4.70,
91
00:08:10,589 --> 00:08:15,610
but English language, you know, has a probability
distribution, it is not 1 by 26 for all the
92
00:08:15,610 --> 00:08:15,969
letters.
93
00:08:15,969 --> 00:08:21,679
So, this was, this is a sort of a graph, which
shows, how the English language characters,
94
00:08:21,679 --> 00:08:26,249
in general, vary. So, therefore, if you just
do a first order entropy, that means, you
95
00:08:26,249 --> 00:08:31,129
just take one particular letter and
consider its probability and then you feed
96
00:08:31,129 --> 00:08:38,129
into the formula, the negative of the sigma of
p(x i) log p(x i), then your first order entropy of
97
00:08:38,860 --> 00:08:45,860
the English text works to 4.19, but you can
do better than that. So, you know that in
98
00:08:45,970 --> 00:08:50,879
English language, consecutive letters essentially,
are not uncorrelated.
99
00:08:50,879 --> 00:08:56,050
For example, if I take q, then next letter
is u, so they are not uncorrelated. So, that
100
00:08:56,050 --> 00:09:00,769
means, that if I take 2 grams for example,
this entropy or uncertainty should reduce.
101
00:09:00,769 --> 00:09:06,470
So, therefore, the first order entropy is
not enough; so, what you do is that, I do
102
00:09:06,470 --> 00:09:07,509
a second order approximation.
103
00:09:07,509 --> 00:09:14,019
So, in second order, what I do is that I compute
all possible digrams, I find out their probabilities
104
00:09:14,019 --> 00:09:20,550
and subsequently, their entropy and then I
divide that H of P 2 by 2. So, it works to
105
00:09:20,550 --> 00:09:24,050
3.9.
So, you see, that the entropy has reduced;
106
00:09:24,050 --> 00:09:29,779
similarly, you can do trigrams and again,
find out the HP 3, and then again, divide
107
00:09:29,779 --> 00:09:34,310
that by 3 and similarly, you can do higher
order approximations of the entropy.
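A minimal sketch of this n-gram approximation, assuming the probabilities are estimated by counting overlapping n-grams in a sample text (the function name is made up for illustration, and a small sample will underestimate the true entropy):

```python
from collections import Counter
from math import log2

def ngram_entropy_per_letter(text, n):
    """Estimate H(P^n) / n from the overlapping n-grams found in text."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = len(grams)
    # Entropy of the empirical n-gram distribution, in bits.
    h = -sum((c / total) * log2(c / total) for c in counts.values())
    # Divide by n to get the amount of information per letter.
    return h / n

sample = "abababab"
print(ngram_entropy_per_letter(sample, 1))  # first order estimate
print(ngram_entropy_per_letter(sample, 2))  # second order estimate, per letter
```

On a strongly correlated sample like this one, the second order figure comes out lower than the first order figure, which is the same effect the lecture describes for English.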
108
00:09:34,310 --> 00:09:39,870
So, the idea is that in general, the successive
letters have got correlation, which reduces
109
00:09:39,870 --> 00:09:46,870
the entropy. So, therefore, for example, define,
that a higher order entropy as follows, that
110
00:09:47,079 --> 00:09:53,529
define H L as the entropy of a natural language
L, as H L is equal to limit n, tends to infinity
111
00:09:53,529 --> 00:10:00,339
HP n divided by n. So, you find out the value
of HP n, means, you find out all n-grams,
112
00:10:00,339 --> 00:10:04,639
their probability distribution. Fine.
Compute the corresponding entropy and then
113
00:10:04,639 --> 00:10:09,879
divide this by n, and your limit tends, n
tends to infinity. That means, this, this
114
00:10:09,879 --> 00:10:16,149
is an approximation for very high values of
n and it has been found experimentally, that
115
00:10:16,149 --> 00:10:22,329
the value of H L, if you consider very high
values of n in general, ranges between 1 and
116
00:10:22,329 --> 00:10:28,060
1.5. So, we will find that H L falls between
within 1 and 1.5.
117
00:10:28,060 --> 00:10:33,050
So, so, from there what do we understand?
We understand that there is a, some amount
118
00:10:33,050 --> 00:10:37,120
of redundancy in the language.
So, in order to understand that better, let
119
00:10:37,120 --> 00:10:44,120
us quantify the redundancy by using this parameter,
say R L. And R L is defined as 1 minus H L
120
00:10:44,920 --> 00:10:51,920
divided by log cardinality of the plaintext
base 2. So, you can immediately understand
121
00:10:51,990 --> 00:10:57,870
certain terms from this formula. For example,
we will find that R L is equal to 1 minus
122
00:10:57,870 --> 00:11:03,350
H L. Therefore, if you just consider a random
language, that is, we will just, if English
123
00:11:03,350 --> 00:11:06,660
language would have been random, then what
would have been the entropy H L? It would
124
00:11:06,660 --> 00:11:10,759
have been log P base 2.
So, in that case, what would have been the
125
00:11:10,759 --> 00:11:16,829
redundancy? It would have been 0. So, therefore,
you see that for any other value of H L, like
126
00:11:16,829 --> 00:11:22,120
when I am considering, say 2 gram, 3 gram,
4 gram, 5 gram and say n grams, then this
127
00:11:22,120 --> 00:11:27,860
value of H L was gradually reducing, so that
means, the redundancy was increasing. So,
128
00:11:27,860 --> 00:11:32,050
therefore, this formula is able to capture
the redundancy of the language.
129
00:11:32,050 --> 00:11:38,129
So, for example, if for typical values, if
you say that H L lies between 1 and 1.5, so
130
00:11:38,129 --> 00:11:45,129
you take a value of, you take a value of...
So, if for example, let us consider this,
131
00:11:45,509 --> 00:11:52,509
that is, 1 minus H L divided by log of P base
2, so in that case, I can rearrange this as
132
00:11:55,149 --> 00:11:59,899
follows, because I will be using this later
on. So, I can write for example, H L divided
133
00:11:59,899 --> 00:12:06,899
by log P base 2 is equal to 1 minus R L.
So, therefore, H L is equal to 1 minus R L
134
00:12:10,230 --> 00:12:17,230
into log of cardinality of P base 2. So, we
will just note this equivalent relation because
135
00:12:18,249 --> 00:12:24,410
we will be requiring this result subsequently.
So, you note one fact, that if this value
136
00:12:24,410 --> 00:12:31,410
of H L, that is, if H L reduces, then the
corresponding this implies, that the corresponding
137
00:12:33,319 --> 00:12:39,519
value of R L increases.
So, therefore, if the entropy reduces of the
138
00:12:39,519 --> 00:12:43,970
corresponding language, that is, equivalent
to saying that the redundancy of the language
139
00:12:43,970 --> 00:12:50,730
has increased. Therefore, this formula is
able to capture this intuitive result.
140
00:12:50,730 --> 00:12:57,069
So, let us, therefore, the, let us consider
what is the corresponding redundancy of an
141
00:12:57,069 --> 00:13:01,680
English language? So, we will have to quantify
that. So, if you find out, we will find out,
142
00:13:01,680 --> 00:13:08,680
that H L lies between 1 and 1.5, so consider
that H L is equal to 1.25 and you know that
143
00:13:10,279 --> 00:13:16,850
the number of plaintext characters is 26.
So, therefore, if you feed into this formula
144
00:13:16,850 --> 00:13:21,889
R L works to 0.75, so what does it mean? It
means that English language is 75 percent
145
00:13:21,889 --> 00:13:27,180
redundant. So, whatever you speak, out of
that 75 percent is actually redundant. So,
146
00:13:27,180 --> 00:13:33,339
does it mean that out of 4 characters you
talk, I can throw away 3 characters? No, it
147
00:13:33,339 --> 00:13:38,610
is not exactly so. So, what it only means
is, that if you do, for example, a Huffman
148
00:13:38,610 --> 00:13:42,689
coding, then essentially, you are expected
to get such a kind of compression.
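As a quick check of this arithmetic, here is a small sketch of the redundancy formula, R L equal to 1 minus H L divided by log of cardinality of P base 2. The function name is only illustrative:

```python
from math import log2

def redundancy(h_l, alphabet_size):
    """Redundancy R_L = 1 - H_L / log2(|P|) of a language."""
    return 1 - h_l / log2(alphabet_size)

# English with H_L taken as 1.25 bits per letter and 26 plaintext letters.
r = redundancy(1.25, 26)
print(round(r, 3))  # 0.734, which the lecture rounds to about 0.75

# A completely random language has H_L = log2(26), so its redundancy is 0.
print(redundancy(log2(26), 26))  # 0.0
```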
149
00:13:42,689 --> 00:13:49,120
So, therefore, that is the idea of redundancy
of a language and that is precisely the reason
150
00:13:49,120 --> 00:13:54,370
why cryptanalysis is favored. If you had the
very random kind of language, then cryptanalysis
151
00:13:54,370 --> 00:13:57,839
would have been harder.
But you know that there is a redundancy in
152
00:13:57,839 --> 00:14:04,839
English language and that serves as extra
information to the attacker.
153
00:14:06,459 --> 00:14:12,360
So, let us try to calculate the lower bound
of equivocation of the key; so, that is the
154
00:14:12,360 --> 00:14:17,149
objective of today's class, the first part
of today's class. So, for example, I have
155
00:14:17,149 --> 00:14:24,149
already defined n-grams, so consider P n and
C n to be two random variables,
156
00:14:24,220 --> 00:14:27,980
defined to represent n-grams of the plaintext
and n-grams of the ciphertext.
157
00:14:27,980 --> 00:14:34,920
So, all of us know already this formula, so
it says that HK given C n is equal
158
00:14:34,920 --> 00:14:41,920
to HK plus HP n minus HC n. So, this formula,
we have already proved in the last day's class.
159
00:14:42,579 --> 00:14:48,329
Yes.
So, what is HP n equal to? So, HP n, we have
160
00:14:48,329 --> 00:14:54,329
defined from the definition of HL. If the
value of n is quite large, you can approximate
161
00:14:54,329 --> 00:15:01,329
that by nH L, and you know that H L by my
previous result was equal to 1 minus R L log
162
00:15:09,999 --> 00:15:11,240
cardinality of P base 2.
163
00:15:11,240 --> 00:15:16,329
So, therefore, if I would like to calculate
the value of nH L, I would have just, I need
164
00:15:16,329 --> 00:15:23,329
to multiply this particular thing by n, so
I get n into 1 minus R L into log of cardinality
165
00:15:24,309 --> 00:15:31,120
of P base 2; so, that is the value of nH L.
So, therefore, I can say from here, that HP
166
00:15:31,120 --> 00:15:38,120
n, which is approximately equal to nH L, I
can write that as, equivalent, equal to n
167
00:15:38,420 --> 00:15:45,420
into 1 minus R L into log of cardinality of
P base 2. And the next thing that I want is
168
00:15:52,189 --> 00:15:59,189
H of C n. So, I write that H of C n is equal
to or rather is less than equal to n of log
169
00:16:01,410 --> 00:16:08,410
of cardinality of C base 2, why? Because whatever
be the entropy, so I am considering n-grams,
170
00:16:10,939 --> 00:16:17,939
so this is equivalent, is saying that H of
C n by n is less than equal to log of cardinality
171
00:16:18,430 --> 00:16:23,230
of C base 2.
So, all of us know this fact because this
172
00:16:23,230 --> 00:16:29,589
essentially, captures the entropy of a random
cipher text. So, whatever be it, the entropy
173
00:16:29,589 --> 00:16:34,100
that is divided by n, should obviously be
less than a random thing. So, the random text has got
174
00:16:34,100 --> 00:16:39,709
the, you know, the maximum entropy, most uncertainty.
So, therefore, this value is obviously lesser
175
00:16:39,709 --> 00:16:45,329
than this, so we will be using these two
bounds in our calculation. So, one bound is
176
00:16:45,329 --> 00:16:51,249
given by H of C n is less than equal to n
log cardinality of C base 2 and the other
177
00:16:51,249 --> 00:16:57,199
approximation is H of P n is approximately
equal to n into 1 minus R L log cardinality
178
00:16:57,199 --> 00:17:04,199
of P base 2. So, we will plug these equations,
these two things, into our, into our equivocation
179
00:17:04,959 --> 00:17:08,500
formula, that we had.
So, the formula that we had was, H of K given
180
00:17:08,500 --> 00:17:14,189
C n is equal to H of K plus H of P n minus
H of C n. So, we have a fair amount of estimate
181
00:17:14,189 --> 00:17:20,150
of HP n and HC n, so if you plug that, we
get this value, that is, H of K given C n
182
00:17:20,150 --> 00:17:27,150
is greater than equal to H of K minus n R
L log cardinality of P base 2. So, do you
183
00:17:28,679 --> 00:17:35,429
see that? So, you see that if I subtract HP
n minus HC n, then these particular terms,
184
00:17:35,429 --> 00:17:42,320
that is, n log cardinality of P base 2 minus
n log cardinality of C base 2 gets cancelled,
185
00:17:42,320 --> 00:17:46,940
if the cardinality of P and C are same.
So, if I just consider that both of them are
186
00:17:46,940 --> 00:17:52,390
English letters for example, then the cardinality
of P and cardinality of C are the same things;
187
00:17:52,390 --> 00:17:56,870
so, there is a same value. Therefore, they
cancel each other and we have got a lower
188
00:17:56,870 --> 00:18:03,549
bound of HK given C n, which says that HK
given C n is definitely greater than HK minus
189
00:18:03,549 --> 00:18:06,500
nR L log of cardinality of P base 2.
190
00:18:06,500 --> 00:18:11,520
So, therefore, let us remember this formula,
so, because we will be needing this later
191
00:18:11,520 --> 00:18:18,520
on. It says that H of K, if I write it in
the other way, minus n of R L log of cardinality
192
00:18:20,580 --> 00:18:27,580
of P base 2 is lesser than equal to H of K
given C n.
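Written out in symbols, the steps of this lower bound are as follows. This is a restatement of the spoken derivation, assuming n is large enough that H of P n can be replaced by n H L, and that the plaintext and ciphertext alphabets have the same cardinality:

```latex
\begin{align*}
H(K \mid C^n) &= H(K) + H(P^n) - H(C^n) \\
H(P^n) &\approx n H_L = n (1 - R_L) \log_2 |P| \\
H(C^n) &\le n \log_2 |C| \\
H(K \mid C^n) &\ge H(K) + n (1 - R_L) \log_2 |P| - n \log_2 |C| \\
              &= H(K) - n R_L \log_2 |P| \qquad \text{when } |P| = |C|
\end{align*}
```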
193
00:18:27,669 --> 00:18:34,669
So, now, we will try to prove an upper bound
of HK given C n, that is, HK given C n should
194
00:18:35,360 --> 00:18:39,770
be lesser than equal to some term. So, from
there we will try to find out the quantized
195
00:18:39,770 --> 00:18:44,039
value of the spurious keys.
So, till this part is clear?
196
00:18:44,039 --> 00:18:49,100
Sir, is this independent of the key?
Yeah, we are not assuming anything of the
197
00:18:49,100 --> 00:18:56,029
key. So, for example, we have assumed that
the ciphertext cardinality and the plaintext
198
00:18:56,029 --> 00:19:01,559
cardinalities are the same, but the key is,
we have not assumed the size of the key, this
199
00:19:01,559 --> 00:19:06,140
calculation is independent of that information.
200
00:19:06,140 --> 00:19:10,600
So, therefore, consider possible keys. So,
therefore, define Ky to be the
201
00:19:10,600 --> 00:19:15,620
possible keys given that y is the ciphertext;
so, define this set. So, what does it mean?
202
00:19:15,620 --> 00:19:19,309
It is that possible keys, given that y is
a cipher text. So, I have already defined
203
00:19:19,309 --> 00:19:23,630
that what it means, it means that Ky is a
set of those keys for which y is the ciphertext
204
00:19:23,630 --> 00:19:28,419
for meaningful plaintexts.
So, therefore, as I told you, that a, that
205
00:19:28,419 --> 00:19:34,020
is, a cryptanalyst takes or an attacker takes
the ciphertexts, assumes a value of key and
206
00:19:34,020 --> 00:19:39,500
finds out those keys, or registers those
keys for which the plain text is meaningful,
207
00:19:39,500 --> 00:19:46,230
and that is denoted by the set Ky. Therefore,
the Ky set holds those keys for which the
208
00:19:46,230 --> 00:19:52,330
corresponding plain text is meaningful.
So, out of these keys, how many is, how many
209
00:19:52,330 --> 00:19:59,330
are spurious keys? Cardinality of Ky minus
1, because only 1 key is the actual key, rest
210
00:19:59,340 --> 00:20:03,600
are spurious. So, therefore, we know, that
when y is the ciphertext, number of keys is
211
00:20:03,600 --> 00:20:09,929
the cardinality of Ky; so, out of them only 1 is correct,
so rest of them are spurious.
212
00:20:09,929 --> 00:20:16,929
So, the number of spurious keys can be found
out by cardinality of Ky minus 1. So, what
213
00:20:17,020 --> 00:20:21,539
is the expected size of cardinality? So, you
know, that this is actually a distribution,
214
00:20:21,539 --> 00:20:26,450
so therefore, in order to calculate the expected
value of a random variable, what do we do?
215
00:20:26,450 --> 00:20:31,179
We multiply the corresponding value with its
probability and do a sigma.
216
00:20:31,179 --> 00:20:36,429
So, I guess, we know this result, that if
there is a random variable x, then its expected
217
00:20:36,429 --> 00:20:43,429
value Ex is computed by sigma of its corresponding
probability into x i, where i runs over all
218
00:20:44,990 --> 00:20:51,990
possibilities. So, that is the way, how we
calculate the expectation of a random variable.
219
00:20:53,289 --> 00:21:00,289
So, we apply this and we find out the expected
number of spurious keys. So, what is the expected
220
00:21:02,070 --> 00:21:06,470
number of spurious keys here? It is the average
number of spurious keys over all possible
221
00:21:06,470 --> 00:21:10,240
ciphertext and this is denoted by the variable
S n.
222
00:21:10,240 --> 00:21:16,710
So, S n is nothing but sigma of cardinality
of Ky, I mean, we are just multiplying the
223
00:21:16,710 --> 00:21:22,000
cardinality of Ky minus 1 because this is
the number of spurious keys, and we are multiplying
224
00:21:22,000 --> 00:21:26,000
that by the probability of this event. So,
what is the probability of this event, that
225
00:21:26,000 --> 00:21:32,370
the ciphertext y is chosen? That is py, and
that is varied, that is, the calculation
226
00:21:32,370 --> 00:21:38,740
is done for all possible ciphertexts.
So, in a, if I just simplify this formula,
227
00:21:38,740 --> 00:21:45,740
then I can actually multiply this sigma, this
py with Ky and I obtain that, this if I distribute
228
00:21:46,000 --> 00:21:51,429
this py over 1, then I obtain sigma of py.
So, what is sigma of py over all possible
229
00:21:51,429 --> 00:21:58,240
ciphertext? It is unity, 1; so, I get sigma
of py multiplied with the cardinality of Ky
230
00:21:58,240 --> 00:22:04,360
and that from there, I subtract the value
of 1. So, this gives me what? This gives me
231
00:22:04,360 --> 00:22:11,360
the expected number of spurious keys.
So, therefore, from here I can write, that
232
00:22:12,260 --> 00:22:18,679
S n plus 1, I can just reorganize this equation,
I can write S n plus 1 is equal to this particular
233
00:22:18,679 --> 00:22:19,260
thing.
234
00:22:19,260 --> 00:22:26,260
So, therefore I can write like, S n plus 1
is equal to sigma of py cardinality of Ky,
235
00:22:30,630 --> 00:22:37,630
where y varies over all possible ciphertexts.
So, this also follows directly,
236
00:22:38,279 --> 00:22:45,039
this also we can put down from the definition
of S n.
237
00:22:45,039 --> 00:22:51,010
So, therefore, the, if you need to calculate
the upper bound of the equivocation of key,
238
00:22:51,010 --> 00:22:54,200
we do further calculation from the definition
of HK given C n.
239
00:22:54,200 --> 00:22:58,230
So, what is the HK given C n? So, as
I told you that there are two random variables
240
00:22:58,230 --> 00:23:02,309
here, K and C n, what we do is that, let us
vary one random variable and keep the other
241
00:23:02,309 --> 00:23:08,370
one constant. So, therefore, we vary in this
case y and we keep the value of K constant.
242
00:23:08,370 --> 00:23:12,429
Therefore, from our definition of conditional
entropy, we can write that is equal to sigma
243
00:23:12,429 --> 00:23:19,429
of py multiplied by H of K given y, so this
we have actually written in last day's class,
244
00:23:20,690 --> 00:23:26,140
you can follow that.
So, therefore, this is equal to, actually
245
00:23:26,140 --> 00:23:32,399
this is equal to, it should actually be an equality,
so this is equal to sigma of py, I just
246
00:23:32,399 --> 00:23:37,049
write this as Ky. Therefore, what does it
mean? It means, K given y.
247
00:23:37,049 --> 00:23:44,049
So, that is exactly the definition of H; that
is exactly the definition of K y. So, I obtain
248
00:23:44,159 --> 00:23:50,500
the sigma, I obtain the, I multiply py with
H K y and I take the corresponding sigma.
249
00:23:50,500 --> 00:23:57,500
Now, this is less than equal to py multiply
with logarithm of cardinality of Ky base 2
250
00:23:59,600 --> 00:24:02,360
and taken a sigma.
When this follows from what I already told
251
00:24:02,360 --> 00:24:09,360
you, that if this Ky would have been a, so
I mean, if for example this Ky would have
252
00:24:15,889 --> 00:24:22,000
been a random distribution, in that case,
this would have been upper bounded, this would
253
00:24:22,000 --> 00:24:26,419
have been equal to logarithm of cardinality
of Ky base 2.
254
00:24:26,419 --> 00:24:31,980
For any other distribution, this uncertainty
is lesser than a random distribution. Therefore,
255
00:24:31,980 --> 00:24:37,350
I can actually find out an upper bound and
this is the upper bound, then I apply a, a
256
00:24:37,350 --> 00:24:42,760
certain result from mathematics called
Jensen's inequality, which is applicable to
257
00:24:42,760 --> 00:24:45,889
the logarithm because it is a concave
function.
258
00:24:45,889 --> 00:24:51,120
But let us just believe this fact, that I
can actually find out an upper bound of this,
259
00:24:51,120 --> 00:24:54,570
which is called..., So, I can upper bound
this particular thing by this expression,
260
00:24:54,570 --> 00:24:58,539
so I can take the log out and I can take this,
write this as follows.
261
00:24:58,539 --> 00:25:03,919
So, what is this particular sigma equal to?
This sigma is equal to S n plus 1. From this
262
00:25:03,919 --> 00:25:10,059
result, you see this, that is, sigma of py
cardinality of Ky was equal to S n plus 1.
263
00:25:10,059 --> 00:25:14,539
So, we use this result and plug in to that
equation, we obtain this. So, you see, that
264
00:25:14,539 --> 00:25:19,860
this is equal to sigma of py cardinality of
Ky and instead of this I can write, S n plus
265
00:25:19,860 --> 00:25:20,210
1.
266
00:25:20,210 --> 00:25:25,279
So, what we have proved is that H K given
C n is less than equal to logarithm of S n
267
00:25:25,279 --> 00:25:31,690
plus 1 base 2.
So, what we have proved is HK given C n is
268
00:25:31,690 --> 00:25:38,690
less than equal to logarithm of S n plus 1
base 2. So, now, we can actually take this
269
00:25:42,130 --> 00:25:49,100
result and we can combine this fact, and this
fact, and obtain that, rather, write that
270
00:25:49,100 --> 00:25:56,100
HK minus n of R L logarithm of P, logarithm
of cardinality of P base 2 is less than equal
271
00:26:01,440 --> 00:26:08,440
to logarithm of S n plus 1 base 2.
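In symbols, the chain of steps behind this statement can be written as below. This is a restatement of the spoken argument, where K(y) is the set of possible keys for ciphertext y, the second inequality is Jensen's inequality for the concave logarithm, and the last line assumes the plaintext and ciphertext alphabets have the same cardinality:

```latex
\begin{align*}
H(K \mid C^n) &= \sum_{y} p(y)\, H(K \mid y) \\
              &\le \sum_{y} p(y) \log_2 |K(y)| \\
              &\le \log_2 \sum_{y} p(y)\, |K(y)| = \log_2 (S_n + 1) \\
H(K) - n R_L \log_2 |P| &\le H(K \mid C^n) \le \log_2 (S_n + 1)
\end{align*}
```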
272
00:26:15,990 --> 00:26:22,990
So, that is precisely written, what, here.
So, I write that HK minus nR L log cardinality
273
00:26:25,409 --> 00:26:32,169
of P base 2 is less than equal to logarithm
of S n plus 1 base 2. So, therefore, I can
274
00:26:32,169 --> 00:26:37,740
actually reorganize this, and write as, rearrange
this and write as logarithm of S n plus 1
275
00:26:37,740 --> 00:26:43,019
base 2 is lower bounded by H K minus nR L
log cardinality of P base 2.
276
00:26:43,019 --> 00:26:48,190
So, now, if the keys are chosen equi-probably,
that is, if all the keys are equally likely,
277
00:26:48,190 --> 00:26:52,799
then I can actually write an
equality; I can write that HK is equal to the logarithm
278
00:26:52,799 --> 00:26:59,260
of the cardinality of K base 2. So, in that case, I can plug
this into the previous inequality and it
279
00:26:59,260 --> 00:27:06,260
works out to this inequality.
So, I can obtain that S n plus
280
00:27:07,070 --> 00:27:12,029
1 is greater than or equal to the cardinality of
K divided by the cardinality of P to the power
281
00:27:12,029 --> 00:27:17,440
of n R L; so, this follows from this equation.
If you plug in the value of HK and make it
282
00:27:17,440 --> 00:27:22,510
equal to the logarithm of the cardinality of K base
2, and if I take the term n R L log cardinality of P,
283
00:27:22,510 --> 00:27:27,679
then I can actually
write it as the logarithm base 2 of the cardinality of
284
00:27:27,679 --> 00:27:33,029
P to the power of n R L.
And from there, I obtain a log here and this
285
00:27:33,029 --> 00:27:39,059
is also a log, so the subtraction of log would
be log a by b. Therefore, I can write that
286
00:27:39,059 --> 00:27:46,059
as logarithm of cardinality of K divided by
cardinality of P to the power of nR L base
287
00:27:46,539 --> 00:27:52,059
2 and the left hand side, I have got S n plus
1. Therefore, since I have got two logs on
288
00:27:52,059 --> 00:27:56,870
both sides and we have got an increasing function,
I can actually write that S n plus 1 is greater
289
00:27:56,870 --> 00:28:01,880
than or equal to the cardinality
of K divided by the cardinality of P to the
290
00:28:01,880 --> 00:28:06,309
power of n R L.
So, what do you obtain from here? What is
291
00:28:06,309 --> 00:28:10,799
the objective? First note,
what is the value of n? What was
292
00:28:10,799 --> 00:28:16,500
n? n was the number of ciphertext characters that I have
provided. So, now I would
293
00:28:16,500 --> 00:28:23,500
like to make the number of spurious keys equal
to 0; that was the objective, finally.
294
00:28:24,659 --> 00:28:31,019
So, consider the effect of
increasing n. We obtain from here that
295
00:28:31,019 --> 00:28:36,210
if I increase the value of n, then that reduces
the number of spurious keys. That means what?
296
00:28:36,210 --> 00:28:42,220
If I provide an attacker more and more information,
then the number of spurious keys gets reduced.
297
00:28:42,220 --> 00:28:48,539
So, the unicity distance is that particular
number of ciphertext characters, which I call n 0, for
298
00:28:48,539 --> 00:28:53,320
which the
number of spurious keys is actually reduced
299
00:28:53,320 --> 00:28:59,230
to 0. So, for example, in the previous
inequality, if I set the number of spurious keys,
300
00:28:59,230 --> 00:29:04,110
S n, to 0 and plug that
in, I would obtain
301
00:29:04,110 --> 00:29:04,580
this bound.
302
00:29:04,580 --> 00:29:11,580
You see this, that is, if this was equal to
0, then I would have obtained, we would have
303
00:29:11,659 --> 00:29:16,789
obtained what? We would have obtained that
cardinality of K divided by cardinality of
304
00:29:16,789 --> 00:29:23,370
P to the power of nR L is equal to 1. So,
that means that cardinality of K is equal
305
00:29:23,370 --> 00:29:29,690
to cardinality of P to the power of nR L.
So, therefore, that is exactly this particular
306
00:29:29,690 --> 00:29:35,830
equation, you see, that n is equal to the logarithm
of the cardinality of K base 2 divided by R L log cardinality of P base 2.
307
00:29:35,830 --> 00:29:40,710
So, actually in my equation, I can take log
on both sides and I can write log cardinality of K base 2
308
00:29:40,710 --> 00:29:47,100
is equal to n R L log cardinality of P base
2, so what is the value of n here? It is equal
309
00:29:47,100 --> 00:29:54,100
to log of cardinality of K base 2 divided
by R L log of cardinality of P base 2.
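The chain of steps just described can be written out compactly. This is only a restatement of the spoken derivation in LaTeX, assuming the `amsmath` package, with S_n standing for the number of spurious keys and n_0 for the value of n at which the bound reaches zero:

```latex
\begin{align*}
H(K) - n R_L \log_2 |P| &\le \log_2 (S_n + 1) \\
\log_2 |K| - \log_2 |P|^{\,n R_L} &\le \log_2 (S_n + 1)
  && \text{(keys equiprobable, } H(K) = \log_2 |K|\text{)} \\
S_n + 1 &\ge \frac{|K|}{|P|^{\,n R_L}} \\
\frac{|K|}{|P|^{\,n_0 R_L}} = 1
  \;&\Longrightarrow\;
  n_0 = \frac{\log_2 |K|}{R_L \log_2 |P|}
  && \text{(setting } S_{n_0} = 0\text{)}
\end{align*}
```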
310
00:29:55,820 --> 00:30:02,820
So, this is the value of n for which my lower bound on the number
of spurious keys just becomes equal to 0;
311
00:30:04,080 --> 00:30:08,269
the actual unicity distance would be at least
that. Therefore, if I provide you more
312
00:30:08,269 --> 00:30:12,750
and more ciphertext, then eventually your number
of spurious keys would become 0.
313
00:30:12,750 --> 00:30:17,789
Therefore, if I use a cipher within this unicity
distance, then the number of spurious keys
314
00:30:17,789 --> 00:30:23,039
will not be 0 that means, the attacker is
not able to exactly find out the unique value
315
00:30:23,039 --> 00:30:29,230
of the key. Note that throughout
our calculation, we have actually considered an unbounded
316
00:30:29,230 --> 00:30:32,500
adversary.
So, even an unbounded adversary would not
317
00:30:32,500 --> 00:30:36,669
actually find out the actual value of the
key, not the unique value of the key, but
318
00:30:36,669 --> 00:30:42,570
he will have a set of possible keys. And unicity
distance is actually, that number of ciphertext
319
00:30:42,570 --> 00:30:48,509
for which this number of spurious keys just
becomes equal to 0. Therefore, we obtain this
320
00:30:48,509 --> 00:30:52,380
particular lower bound; that is, from the
lower bound on the number of spurious keys,
321
00:30:52,380 --> 00:30:56,929
we get an estimate for
the unicity distance.
322
00:30:56,929 --> 00:31:02,230
So, beyond that, the attacker is actually
able to find out the unique value of the key.
323
00:31:02,230 --> 00:31:06,720
So, note that this calculation may not be accurate
for small values of n. Why? Because my
324
00:31:06,720 --> 00:31:12,149
original definition of H L relied upon the
limit as n tends to infinity; that was
325
00:31:12,149 --> 00:31:17,070
the assumption that I made. Therefore, this
may not be very accurate for n equal to 1,
326
00:31:17,070 --> 00:31:22,639
2 and so on. It should be fairly okay for large
values of n.
327
00:31:22,639 --> 00:31:29,250
So, let us do an example calculation with
the substitution cipher. So, we had the number
328
00:31:29,250 --> 00:31:34,840
of plaintext characters equal to 26, so the
cardinality of P was 26, and the cardinality of K
329
00:31:34,840 --> 00:31:39,279
was 26 factorial, which is around 4 times
10 to the power of 26, a fairly large
330
00:31:39,279 --> 00:31:46,279
key space. So, here, if you assume R L is
equal to 0.75 for the English language, then if
331
00:31:46,570 --> 00:31:52,669
you plug in, you will find that n 0 is approximately
equal to 25. So, it means, that given a ciphertext
332
00:31:52,669 --> 00:31:57,549
string of length 25, an unbounded attacker
can actually, predict the unique value of
333
00:31:57,549 --> 00:32:01,879
the key.
So, thus, what we observe from here is that
334
00:32:01,879 --> 00:32:07,409
a large key size alone does
not guarantee security, if brute force is possible
335
00:32:07,409 --> 00:32:13,350
for an attacker with infinite computational
power. So, an attacker who has got unbounded,
336
00:32:13,350 --> 00:32:20,009
I mean, infinite computational power,
even for such a big key, requires just
337
00:32:20,009 --> 00:32:24,940
25 ciphertext characters to find out the value
of the key.
338
00:32:24,940 --> 00:32:31,149
All these measures are, of course, probabilistic,
but it will match with the actual result quite
339
00:32:31,149 --> 00:32:31,649
closely.
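As a rough check of the substitution cipher example above, the following sketch evaluates n 0 as log cardinality of K divided by R L log cardinality of P. The function name `unicity_distance` is only an illustrative choice, and R L equal to 0.75 is the value assumed in the lecture.

```python
import math

def unicity_distance(key_space_size, alphabet_size, redundancy):
    """Estimate n0 = log2|K| / (R_L * log2|P|), the bound derived above."""
    return math.log2(key_space_size) / (redundancy * math.log2(alphabet_size))

# Substitution cipher on the English alphabet: |P| = 26, |K| = 26!, R_L = 0.75 (assumed).
n0 = unicity_distance(math.factorial(26), 26, 0.75)
print(round(n0, 1))  # close to 25 ciphertext characters
```

The estimate is only meaningful for fairly large n, for the reason given earlier about the limit in the definition of H L.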
340
00:32:31,649 --> 00:32:37,070
So, with this we essentially conclude
this part of unicity distance, but we will
341
00:32:37,070 --> 00:32:41,070
conclude the remaining part with an idea of
product ciphers.
342
00:32:41,070 --> 00:32:48,070
So, actually, after this we will
start talking about real ciphers like block
343
00:32:48,159 --> 00:32:52,990
ciphers and stream ciphers, but before this
I would like to mention about the idea of
344
00:32:52,990 --> 00:32:57,919
product ciphers. This was actually also mentioned
in Shannon's paper and that is why it is
345
00:32:57,919 --> 00:33:03,919
called a seminal paper. It was mentioned
as early as 1949, and he gave the idea of forming
346
00:33:03,919 --> 00:33:09,679
products. So, the idea is still fundamental
because even present day ciphers, like AES
347
00:33:09,679 --> 00:33:12,909
for example, still use the concept of product
ciphers.
348
00:33:12,909 --> 00:33:18,190
So, let us try to understand the concept of
product ciphers and actually, you will observe,
349
00:33:18,190 --> 00:33:23,240
that through it a lot of things become meaningful;
a lot of things, which we will see in our future
350
00:33:23,240 --> 00:33:26,720
ciphers, will actually become meaningful.
So, let us try.
351
00:33:26,720 --> 00:33:30,509
So, before that, I would just
like to, in order to simplify our life,
352
00:33:30,509 --> 00:33:36,009
introduce a term, endomorphic ciphers.
So, what is an endomorphic cipher? Endomorphic
353
00:33:36,009 --> 00:33:39,370
ciphers are those ciphers for which the plaintext
and ciphertext are the same sets.
354
00:33:39,370 --> 00:33:44,399
Take, for instance, the normal substitution cipher, where
the plaintext and ciphertext were just the English
355
00:33:44,399 --> 00:33:49,909
language, that means English characters; so
if P and C are the same, then we have what
356
00:33:49,909 --> 00:33:53,549
is called an endomorphic cipher.
So, therefore, the shift cipher
357
00:33:53,549 --> 00:33:58,169
on the English alphabet was
an example of an endomorphic cipher.
358
00:33:58,169 --> 00:34:05,169
So, consider an endomorphic cipher and let
us try to understand certain things from history.
359
00:34:05,269 --> 00:34:11,600
Therefore, if we have an endomorphic cipher,
so C 1; you note, that I have written (P,P)
360
00:34:11,600 --> 00:34:15,629
because P and C are the same things. So, the
plaintext set and the ciphertext sets are
361
00:34:15,629 --> 00:34:21,429
the same thing, so I write (P,P) and then
follow that with K1, because K1 denotes
362
00:34:21,429 --> 00:34:27,820
the key set of the cipher C 1.
So, this is the cipher C 1, and we have got an encryption
363
00:34:27,820 --> 00:34:33,740
function e1 and corresponding decryption function
d1; we also have a cipher, which is called
364
00:34:33,740 --> 00:34:40,740
C 2 and I denote that with (P,P,K2,e2,d2);
so, that means what? K2 is the set
365
00:34:41,710 --> 00:34:46,940
of keys for the second cipher, e2 is the
corresponding encryption function and d2 is
366
00:34:46,940 --> 00:34:51,440
the corresponding decryption function.
So, let us try to understand or define, what
367
00:34:51,440 --> 00:34:57,820
is meant by the product cipher C 1 cross C
2? So, what does C 1 cross C 2 mean? It just
368
00:34:57,820 --> 00:35:01,940
means, as you know, that we apply two
functions, one after the other; therefore,
369
00:35:01,940 --> 00:35:07,530
if I say, C 1 cross C 2, it means, that first
I will apply C 1 and follow that with the
370
00:35:07,530 --> 00:35:11,110
application of C 2.
So, I think, we have seen this in the case
371
00:35:11,110 --> 00:35:16,350
of composition of functions in discrete structures
class. This is precisely
372
00:35:16,350 --> 00:35:21,530
the same kind of thing. So you see, C 1
cross C 2 I would define with (P,P), because
373
00:35:21,530 --> 00:35:24,710
it is still endomorphic. You see why it is
endomorphic?
374
00:35:24,710 --> 00:35:30,360
Yeah, because repeated application
still keeps the endomorphic property,
375
00:35:30,360 --> 00:35:34,990
so it is still endomorphic, but here the key
set is the Cartesian product of K1 and K2, that
376
00:35:34,990 --> 00:35:41,330
is, K 1 cross K 2, and we have got the corresponding
encryption function e and decryption function
377
00:35:41,330 --> 00:35:47,890
d. So, any key I can write in the form of
an ordered pair (k1,k2) from K 1 cross K 2; I can
378
00:35:47,890 --> 00:35:52,980
form ordered pairs like that.
So, the encryption function is defined as
379
00:35:52,980 --> 00:35:59,370
e equal to e 2(e 1(x, K 1), K 2); that is, initially you take
the plaintext x, you take the key K1 and you
380
00:35:59,370 --> 00:36:05,000
apply the function e1. Subsequently, you choose
a key K2 and you apply the encryption function
381
00:36:05,000 --> 00:36:05,620
e2.
382
00:36:05,620 --> 00:36:10,380
So, what will the corresponding decryption
function look like? The corresponding decryption
383
00:36:10,380 --> 00:36:17,380
function will look like this, it will look
like d equal to d 1(d 2(y, K 2), K 1); that is,
384
00:36:20,620 --> 00:36:26,850
d 2 with K 2 is applied first, followed by d 1 with K 1.
So, do you see, that if I apply d and e subsequently,
385
00:36:26,850 --> 00:36:32,240
they are actually inverses of each other that
is obvious, because of the associativity of
386
00:36:32,240 --> 00:36:39,240
the product. So, actually, you can see that
because if I apply d and if I apply d over
387
00:36:41,570 --> 00:36:45,850
an application of the e function, then actually,
I have obtained that where I started with.
388
00:36:45,850 --> 00:36:52,850
So, you can see this because of this.
Yeah, d 1 is written first and d 2 follows
389
00:36:55,820 --> 00:37:00,920
inside it, so d 2 acts first; you can see that from
the decryption function, because what you do
390
00:37:00,920 --> 00:37:07,920
is that you take the corresponding ciphertext, so you
write e 2(e 1(x, K 1), K 2), so that
391
00:37:09,810 --> 00:37:16,340
is my y. So, I take this as y and I apply
my d over that.
392
00:37:16,340 --> 00:37:21,760
Therefore, I want to compute
d of y, so for that I
393
00:37:21,760 --> 00:37:28,760
apply d 1 and d 2 over e 2(e 1(x, K 1), K 2).
The inner part is the output of
394
00:37:38,960 --> 00:37:45,960
e 1, and over it I
have encrypted again, so this is the scope
395
00:37:47,490 --> 00:37:52,300
of the function e 2. Then I apply d 2; so,
in order to decrypt this, I need
396
00:37:52,300 --> 00:37:57,660
K 2 first and follow that with K 1.
So, what you do is that you see that in this
397
00:37:57,660 --> 00:38:04,330
case, your d 2 and e 2 cancels each other.
So, I can do that because of the associativity
398
00:38:04,330 --> 00:38:11,330
of the product function; so if I do that I
obtain d 1(e 1(x, K 1), K
399
00:38:12,050 --> 00:38:17,870
1).
So, again this d 1 and e 1 cancel each other
400
00:38:17,870 --> 00:38:21,570
and I finally obtain the value of x.
Therefore, you see that
401
00:38:21,570 --> 00:38:28,570
d is actually the corresponding decryption function.
So, this follows because of this fact, that
402
00:38:31,460 --> 00:38:35,870
is, the product rule is always associative.
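The product construction and its decryption order can be sketched as follows. This is a minimal illustration, assuming letters are encoded as 0 to 25, with a multiplicative cipher as C 1 and a shift cipher as C 2; the function names are hypothetical, and `pow(a, -1, N)` needs Python 3.8 or later.

```python
from math import gcd

N = 26  # alphabet size, letters encoded as 0..25

def mult_enc(x, a):   # e1: multiplicative cipher, needs gcd(a, 26) = 1
    return (a * x) % N

def mult_dec(y, a):   # d1: multiply by the inverse of a modulo 26
    return (pow(a, -1, N) * y) % N

def shift_enc(x, k):  # e2: shift cipher
    return (x + k) % N

def shift_dec(y, k):  # d2
    return (y - k) % N

def product_enc(x, key):
    """e(x) = e2(e1(x, k1), k2): apply C1 first, then C2."""
    k1, k2 = key
    return shift_enc(mult_enc(x, k1), k2)

def product_dec(y, key):
    """d(y) = d1(d2(y, k2), k1): undo C2 first, then C1."""
    k1, k2 = key
    return mult_dec(shift_dec(y, k2), k1)

key = (7, 3)  # an arbitrary example key pair (k1, k2)
assert gcd(key[0], N) == 1
roundtrip_ok = all(product_dec(product_enc(x, key), key) == x for x in range(N))
print(roundtrip_ok)
```

Note that decryption peels off the layers in the reverse order of encryption, which is the cancellation argument made above.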
403
00:38:35,870 --> 00:38:42,480
So, the question is that, if you can compute
product of ciphers, does the cipher become
404
00:38:42,480 --> 00:38:47,060
stronger? That is what is most important.
So, I take two small ciphers and I compose
405
00:38:47,060 --> 00:38:51,520
them, I compute the product. Does
the cipher become stronger,
406
00:38:51,520 --> 00:38:56,680
that means, does the key space become really
larger? So, initially, on the surface,
407
00:38:56,680 --> 00:39:00,430
we have actually the key pair (K 1,K 2).
So, the key space of the product should increase, but
408
00:39:00,430 --> 00:39:05,650
on second thought, does it really become
larger? So, in order to understand that, what
409
00:39:05,650 --> 00:39:12,650
is your opinion on that, will it really become
larger? Will the key size become larger? So,
410
00:39:14,810 --> 00:39:21,810
actually, not always.
Let us try to consider one example, I guess
411
00:39:21,900 --> 00:39:25,640
it will be a lot clearer through that, so let
us consider a simple multiplicative cipher.
412
00:39:25,640 --> 00:39:31,330
So, what it does is that it just takes x and
multiplies with a, where a is co-prime to
413
00:39:31,330 --> 00:39:38,330
26. So, you know what co-prime means? a being
co-prime to 26 means that the gcd of a and 26 is 1.
414
00:39:38,580 --> 00:39:42,890
And next is, you consider a shift cipher,
where you take x and you add that with K.
415
00:39:42,890 --> 00:39:49,440
Therefore, this is my M and this is my S;
so this was an example of what?
416
00:39:49,440 --> 00:39:56,440
A permutation cipher, and this
was an example of a substitution cipher.
417
00:39:56,700 --> 00:40:03,700
So, now you consider, that for example, that
you have got M and you have got an S and what
418
00:40:04,130 --> 00:40:08,910
we do is that we just compose this M cross
S, and you obtain this, that is, y is equal
419
00:40:08,910 --> 00:40:14,860
to ax plus K. You see, you
first need to do the multiplication,
420
00:40:14,860 --> 00:40:21,550
so you compute ax, and then you need
to do S, so you add K to ax.
421
00:40:21,550 --> 00:40:26,820
So, what is the key in this case? The key
is the pair (a,k), and you know that this
422
00:40:26,820 --> 00:40:31,450
is an example of what? It is an example of
the affine cipher. So, what was the key size in
423
00:40:31,450 --> 00:40:36,990
the case of the English language? What was the key space size
of the affine cipher? It was equal to 312.
424
00:40:36,990 --> 00:40:43,990
So, now you consider S cross M; S cross M
gives y equal to a(x plus K); so,
425
00:40:45,010 --> 00:40:51,710
you can also write that as ax plus ak. Now,
you know, that gcd of (a,26) is 1, therefore,
426
00:40:51,710 --> 00:40:58,710
this is also an affine cipher, and the key
would be (a,ak), but since the gcd of a and
427
00:40:59,470 --> 00:41:05,590
26 is 1, so an inverse exists. This we discussed
and actually, there is a one-to-one relation
428
00:41:05,590 --> 00:41:12,590
between ak and k, therefore, the total size
of the key space in S cross M is still, 312.
429
00:41:13,560 --> 00:41:19,750
So, you see, that here we have got the key
ak and here we have the key k, but since there
430
00:41:19,750 --> 00:41:24,720
is an inverse of a, there is actually a one-to-one
correspondence between this set and this set.
431
00:41:24,720 --> 00:41:30,750
So if there are 26 possibilities for K, then
ak, since you are working modulo 26,
432
00:41:30,750 --> 00:41:34,910
also has 26 possibilities.
So, that means, this set and this set are
433
00:41:34,910 --> 00:41:41,370
essentially the same thing. So, that means,
in both cases, whether you do M cross S or you
434
00:41:41,370 --> 00:41:48,370
do S cross M, your key size is still 312.
So, what you see is that M cross S and S cross
435
00:41:49,980 --> 00:41:56,980
M are same. So, that is what you call commutative.
So, we have got a commutative cipher.
436
00:41:57,040 --> 00:42:02,810
So, it means that M cross S and S cross M,
when they are same, we call them to be commutative
437
00:42:02,810 --> 00:42:07,980
and this is an example of a commutative cipher.
It does not matter whether you first shift
438
00:42:07,980 --> 00:42:12,500
and then multiply, or first multiply
and then shift; both give the same cipher.
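The claim that M cross S and S cross M give the same cipher with 312 keys can be checked by brute force. The sketch below, under the assumption that letters are encoded as 0 to 25, enumerates every key pair and collects the distinct encryption maps as lookup tables; the helper name `as_table` is illustrative.

```python
from math import gcd

N = 26
UNITS = [a for a in range(N) if gcd(a, N) == 1]  # the 12 valid multipliers

def as_table(f):
    """Represent an encryption map by its full lookup table."""
    return tuple(f(x) for x in range(N))

# M x S: multiply first, then shift, y = a*x + k
m_then_s = {as_table(lambda x, a=a, k=k: (a * x + k) % N)
            for a in UNITS for k in range(N)}

# S x M: shift first, then multiply, y = a*(x + k) = a*x + a*k
s_then_m = {as_table(lambda x, a=a, k=k: (a * (x + k)) % N)
            for a in UNITS for k in range(N)}

print(len(m_then_s), len(s_then_m), m_then_s == s_then_m)  # 312 312 True
```

Both orders produce exactly the set of affine maps, which is what commuting means here: the same cipher, not the same output for a fixed key pair.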
439
00:42:12,500 --> 00:42:16,760
So, then let us see, what is an idempotent
cipher? So, therefore, what is an idempotent
440
00:42:16,760 --> 00:42:20,930
function? It means, if you apply the same function
twice, you obtain the same function.
441
00:42:20,930 --> 00:42:26,170
Now, M was a permutation cipher, S was
a case of a substitution cipher and both of
442
00:42:26,170 --> 00:42:32,800
them were actually idempotent ciphers. So,
a composed cipher has a larger key, but no
443
00:42:32,800 --> 00:42:38,030
extra security, because if M cross M is
equal to M, then composing M with itself,
444
00:42:38,030 --> 00:42:41,840
even more than once, essentially
leaves you with the same kind of transformation.
445
00:42:41,840 --> 00:42:47,430
So, essentially, it does not add to the security.
Therefore, for example, if you have computed
446
00:42:47,430 --> 00:42:53,610
M cross M or S cross S, that would not have
led to the increase of the key space. So,
447
00:42:53,610 --> 00:42:58,330
this is because S cross S is equal to
S and M cross M is also equal to M, and this
448
00:42:58,330 --> 00:43:04,930
class of ciphers is called idempotent ciphers.
So, you could easily observe this fact:
449
00:43:04,930 --> 00:43:09,900
if I had done M cross M, what would
that have
450
00:43:09,900 --> 00:43:15,610
meant? I mean, I would have computed a 1 x and then
a 2 a 1 x, but my key size would still not have
451
00:43:15,610 --> 00:43:20,150
increased, because doing M square essentially
does not increase the key space. Similarly,
452
00:43:20,150 --> 00:43:24,290
consider the shift cipher: you do one
shift and you do another shift; so,
453
00:43:24,290 --> 00:43:27,670
taken together, you can
represent that by a third shift.
454
00:43:27,670 --> 00:43:30,470
So, do you understand what I am saying?
455
00:43:30,470 --> 00:43:36,790
So, what I am saying is that if you consider,
say, S as x plus K 1 for example, so I am
456
00:43:36,790 --> 00:43:41,820
just considering S cross S, and I am just
trying to argue that S cross S is actually
457
00:43:41,820 --> 00:43:46,570
equal to S. So, what is the idea?
So, therefore, imagine that in the first phase
458
00:43:46,570 --> 00:43:53,070
you have got, you have got the function S,
I mean, you choose the key as K 1 and in the
459
00:43:53,070 --> 00:43:57,770
second case, you choose the key as K 2. So,
in the first application of S, I would have
460
00:43:57,770 --> 00:44:04,380
computed x plus K 1 and in the second application
of S, I would have added K 2.
461
00:44:04,380 --> 00:44:09,390
So, therefore, I would have obtained x plus
K 1 plus K 2. So, that, since I am doing mod
462
00:44:09,390 --> 00:44:15,910
26, I can always represent that as x plus
some K 3 mod 26 where K 3 is nothing but the
463
00:44:15,910 --> 00:44:20,060
summation of K 1 and K 2.
So, that means, even for S cross S, I have
464
00:44:20,060 --> 00:44:25,860
got the same size of the key, so it is the same
cipher. Therefore, I can conclude that S
465
00:44:25,860 --> 00:44:31,640
cross S is equal to S. Similarly, for M cross
M also, we can actually show that M cross
466
00:44:31,640 --> 00:44:35,490
M is also equal to M. So, both these classes
of ciphers are what we refer to
467
00:44:35,490 --> 00:44:42,490
as idempotent ciphers.
So, therefore, we have defined what a commutative
468
00:44:42,520 --> 00:44:47,420
cipher is and we have defined what an idempotent
cipher is, and what we will now consider is
469
00:44:47,420 --> 00:44:52,120
that what happens, if we compute the product
of such kind of ciphers, which are commutative
470
00:44:52,120 --> 00:44:53,590
as well as idempotent?
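The idempotence argument made above, that S cross S is equal to S and M cross M is equal to M, can also be confirmed by enumeration. This is a small sketch under the same 0 to 25 encoding; the set names are illustrative only.

```python
from math import gcd

N = 26
UNITS = [a for a in range(N) if gcd(a, N) == 1]

def as_table(f):
    """Represent an encryption map by its full lookup table."""
    return tuple(f(x) for x in range(N))

# One application of each cipher.
shift = {as_table(lambda x, k=k: (x + k) % N) for k in range(N)}
mult = {as_table(lambda x, a=a: (a * x) % N) for a in UNITS}

# Two applications with independently chosen keys.
shift_twice = {as_table(lambda x, k1=k1, k2=k2: ((x + k1) + k2) % N)
               for k1 in range(N) for k2 in range(N)}
mult_twice = {as_table(lambda x, a1=a1, a2=a2: (a2 * (a1 * x)) % N)
              for a1 in UNITS for a2 in UNITS}

print(shift_twice == shift, mult_twice == mult)  # True True
```

Even though the composed ciphers nominally have 676 and 144 key pairs, they realise only 26 and 12 distinct maps, so the effective key space does not grow.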
471
00:44:53,590 --> 00:45:00,590
So, actually that, you can observe from this
fact. So, therefore, what we are trying to
472
00:45:02,430 --> 00:45:06,410
observe is that there is no point in obtaining
products of idempotent ciphers; so, if you
473
00:45:06,410 --> 00:45:11,420
take M cross M, it is the same thing as M. So,
there is no point in doing such products.
474
00:45:11,420 --> 00:45:15,890
So, rather, we would get product ciphers from
non-idempotent ciphers, that is, by iterating
475
00:45:15,890 --> 00:45:20,730
them. So, if we have some non-idempotent ciphers,
I would like to iterate them, and that
476
00:45:20,730 --> 00:45:25,800
is essentially the concept of a round, which
exists in all classes of symmetric ciphers
477
00:45:25,800 --> 00:45:29,730
in today's world.
So, the question is how to make non-idempotent
478
00:45:29,730 --> 00:45:34,460
ciphers or functions? The idea
would be to compose two small, different
479
00:45:34,460 --> 00:45:36,300
cryptosystems, which do not commute.
480
00:45:36,300 --> 00:45:42,100
So, do you follow this? If you do not follow,
then this will become clear because of this
481
00:45:42,100 --> 00:45:46,200
calculation. So, what I have said here is
that if there are 2 cryptosystems, which are
482
00:45:46,200 --> 00:45:49,490
idempotent and also commute, then the product
is also idempotent.
483
00:45:49,490 --> 00:45:56,190
So, if this result is true, what does it mean?
It means that if you have got 2 cryptosystems,
484
00:45:56,190 --> 00:46:00,950
which are idempotent and also commute, then
their product is also idempotent. So what
485
00:46:00,950 --> 00:46:06,310
does it mean? It means that if you take
ciphers of this class,
486
00:46:06,310 --> 00:46:12,090
and if you
compute the product of those
487
00:46:12,090 --> 00:46:16,100
kinds of ciphers, the key size does not increase.
So, therefore, considering products does not
488
00:46:16,100 --> 00:46:20,270
help.
Therefore, from this theorem or rather this
489
00:46:20,270 --> 00:46:25,800
result, we know that we are actually required
to compute the products
490
00:46:25,800 --> 00:46:32,610
of ciphers which, even if they are idempotent,
do not commute. So, that explains this
491
00:46:32,610 --> 00:46:37,650
point, that is, compose two small, different
cryptosystems which do not commute;
492
00:46:37,650 --> 00:46:44,200
so, those kinds of ciphers, if you iterate
them, will actually make sense. Therefore,
493
00:46:44,200 --> 00:46:51,200
let us see this result, it is quite simple,
it says that S 1 and S 2 are two such cryptosystems,
494
00:46:51,660 --> 00:46:55,600
which are idempotent and at the same time
they commute.
495
00:46:55,600 --> 00:47:02,370
So, consider (S 1 cross S 2) cross (S 1 cross S 2). So,
I am considering the product of these things.
496
00:47:02,370 --> 00:47:07,970
Therefore, from
associativity, I can write this as S 1 cross
497
00:47:07,970 --> 00:47:14,810
(S 2 cross S 1) cross S 2, and since they commute,
S 2 cross S 1 becomes equal to S 1 cross S
498
00:47:14,810 --> 00:47:19,270
2, which gives (S 1 cross S 1) cross (S 2 cross S 2).
So, now you know, that S 1 cross S 1 is equal
499
00:47:19,270 --> 00:47:24,700
to S 1 and S 2 cross S 2 is also equal to
S 2. So, what we have obtained is that S 1
500
00:47:24,700 --> 00:47:29,350
cross S 2, multiplied
with S 1 cross S 2, essentially leaves you
501
00:47:29,350 --> 00:47:33,560
with S 1 cross S 2. So, what does it mean?
It means this is an idempotent cipher.
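Written out with explicit grouping, the argument above reads as follows. This is only a restatement in LaTeX, assuming the `amsmath` package, with each step labelled by the property it uses:

```latex
\begin{align*}
(S_1 \times S_2) \times (S_1 \times S_2)
  &= S_1 \times (S_2 \times S_1) \times S_2 && \text{(associativity)} \\
  &= S_1 \times (S_1 \times S_2) \times S_2 && \text{(}S_1, S_2 \text{ commute)} \\
  &= (S_1 \times S_1) \times (S_2 \times S_2) && \text{(associativity)} \\
  &= S_1 \times S_2 && \text{(}S_1, S_2 \text{ idempotent)}
\end{align*}
```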
502
00:47:33,560 --> 00:47:39,730
So, in the previous case, we have proved that
M and S were essentially, both
503
00:47:39,730 --> 00:47:45,700
of them, idempotent. Therefore, can you
show, can you understand, why M cross S is
504
00:47:45,700 --> 00:47:49,520
also idempotent? Why?
Because we have proved that M cross S
505
00:47:49,520 --> 00:47:54,370
was equal to S cross M, so that means, M and
S were commutative, and we have also proved
506
00:47:54,370 --> 00:47:59,690
that M cross M is equal to M, and S cross
S is equal to S, so that is, they were idempotent
507
00:47:59,690 --> 00:48:04,530
as well. So, if you compute their products,
then essentially, you are left with the same
508
00:48:04,530 --> 00:48:10,250
thing. Therefore, computing their products
or composing them does not help. So, therefore,
509
00:48:10,250 --> 00:48:17,250
you require some other additional quantity,
which will help you and that is the idea of
510
00:48:17,330 --> 00:48:17,670
rounds.
511
00:48:17,670 --> 00:48:22,560
So, therefore, the question is,
what is that feature? So, till now,
512
00:48:22,560 --> 00:48:26,980
in whatever we have seen, there is a feature missing,
which we have not yet seen, and therefore
513
00:48:26,980 --> 00:48:33,230
the concept of a round we are still not able
to achieve. So, what is that concept? That
514
00:48:33,230 --> 00:48:38,360
concept is of non-linearity, but I will define
that non-linearity concept further in our
515
00:48:38,360 --> 00:48:41,300
subsequent classes, but this is the brief
introduction to that.
516
00:48:41,300 --> 00:48:46,300
So, therefore,
let us consider these two functions
517
00:48:46,300 --> 00:48:53,300
S and P, where P is actually equal to x plus
k and S is the output of a function f(x). Now,
518
00:48:54,710 --> 00:49:00,110
I claim, that this function f is actually a
non-linear function with respect to addition
519
00:49:00,110 --> 00:49:01,810
operation. So, what does it mean?
520
00:49:01,810 --> 00:49:08,810
So, it means, that, so therefore, what does
a non-linear function mean? It means that
521
00:49:09,770 --> 00:49:15,260
if you consider f of x 1 plus x 2, so non-linear
with respect to plus; non-linearity is always
522
00:49:15,260 --> 00:49:20,700
with respect to an operation.
So, f of (x 1 plus x 2) is not equal to f of
523
00:49:20,700 --> 00:49:27,700
x 1 plus f of x 2; equality would have
meant linearity. So, therefore, now
524
00:49:29,040 --> 00:49:34,510
consider this function S equal to f(x) and P
equal to x plus k, and consider S cross P.
525
00:49:34,510 --> 00:49:41,510
So, what is S cross P equal to? It is equal
to f(x) plus k. And what is S cross P cross S
526
00:49:44,530 --> 00:49:49,790
cross P? That means that you take
f(x) plus k, and then you do a further application
527
00:49:49,790 --> 00:49:56,790
of f, and add another key to that. So, for this product
to increase the size of the
528
00:49:56,890 --> 00:50:02,510
key space, what is needed?
So, therefore, it is needed that S cross P
529
00:50:02,510 --> 00:50:09,510
should not be idempotent. So, for that,
what we require is that f of (f(x) plus k),
530
00:50:10,020 --> 00:50:15,230
which is this particular thing, when
added with a key should not be equal to f square
531
00:50:15,230 --> 00:50:21,100
x plus k dash, because if
you had a linear function f, then you
532
00:50:21,100 --> 00:50:25,980
could actually
have distributed f over the sum, and
533
00:50:25,980 --> 00:50:30,550
this would have worked out to some f square
x plus some value of k dash.
534
00:50:30,550 --> 00:50:37,230
So, that, you see, is exactly similar to your
S cross P function, with another application
535
00:50:37,230 --> 00:50:43,410
of f, and it is added with a
key. Therefore, the key space
536
00:50:43,410 --> 00:50:48,620
increases only if f is non-linear with respect
to plus.
537
00:50:48,620 --> 00:50:55,110
So, if this was a linear function f, then
actually this would have distributed and that
538
00:50:55,110 --> 00:50:59,370
result that we would have obtained, would
have been similar to that of S cross P.
539
00:50:59,370 --> 00:51:04,110
So, the size of the key of this function
and the size of the key
540
00:51:04,110 --> 00:51:10,310
of this composed function would have been
the same. So, therefore, what we need is something,
541
00:51:10,310 --> 00:51:14,160
a deviation from this fact. So, we need not
linearity, but we need non-linearity.
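The effect of linearity on iterated rounds can be seen in a toy experiment. The sketch below counts the distinct maps produced by one and two rounds of x going to f(x) plus k over the integers modulo 26. Both choices of f are hypothetical illustrations, not functions from the lecture: `linear_f` is multiplication by 5, and `nonlinear_f` is x to the power 5 modulo 26.

```python
N = 26

def round_maps(f, rounds):
    """Distinct maps obtained by iterating x -> f(x) + k with independent keys."""
    maps = {tuple(range(N))}  # start from the identity map
    for _ in range(rounds):
        maps = {tuple((f(y) + k) % N for y in table)
                for table in maps for k in range(N)}
    return maps

def linear_f(x):      # f(x1 + x2) = f(x1) + f(x2) modulo 26
    return (5 * x) % N

def nonlinear_f(x):   # toy non-linear choice: f(1 + 1) = 6 but f(1) + f(1) = 2
    return pow(x, 5, N)

for name, f in [("linear", linear_f), ("non-linear", nonlinear_f)]:
    print(name, len(round_maps(f, 1)), len(round_maps(f, 2)))
```

With the linear f, the second round collapses to f square x plus k dash, so the count stays at 26; with the non-linear f the two-round count is larger than 26, which is the growth in key space that rounds are meant to provide.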
542
00:51:14,160 --> 00:51:21,160
So, therefore, hence, we have to compose linear
and non-linear functions to increase the security
543
00:51:21,250 --> 00:51:26,340
of a cipher. So,
what we have seen till
544
00:51:26,340 --> 00:51:30,630
now are only linear components, linear transformations.
We essentially saw multiplication with
545
00:51:30,630 --> 00:51:37,590
a matrix, which is a linear operator, and addition with
the key, which is also a linear
546
00:51:37,590 --> 00:51:42,380
function.
So, therefore, all these are linear transformations.
547
00:51:42,380 --> 00:51:46,890
Therefore, you see that,
nicely from Shannon's theory, we can actually
548
00:51:46,890 --> 00:51:52,230
arrive at the fact that we require a composition
of linear functions as well as non-linear
549
00:51:52,230 --> 00:51:52,950
functions.
550
00:51:52,950 --> 00:51:58,560
So, so, with this I conclude my talk, but
I would like to give you an assignment, which
551
00:51:58,560 --> 00:52:02,290
you are supposed to do and submit
on or before the twenty-eighth.
552
00:52:02,290 --> 00:52:06,680
I gave you one assignment already, the other
assignment is that, show that unicity distance
553
00:52:06,680 --> 00:52:12,180
of the Hill cipher with an m cross m encryption
matrix is actually less than m divided by
554
00:52:12,180 --> 00:52:16,720
R L, where R L is the redundancy as defined
in the class.
555
00:52:16,720 --> 00:52:22,350
So, you can show that the unicity distance
of the Hill cipher with an m cross m encryption
556
00:52:22,350 --> 00:52:26,910
matrix is actually less than m divided by
R L.
557
00:52:26,910 --> 00:52:33,360
So, that is an assignment, which is given
to you, and you can read about these things from
558
00:52:33,360 --> 00:52:37,230
Shannon's Communication Theory
of Secrecy Systems, which is actually a paper;
559
00:52:37,230 --> 00:52:42,920
it is a classical paper, which appeared
in the Bell System Technical Journal, but I am
560
00:52:42,920 --> 00:52:45,960
sure that you will get it online.
And the other text book that I have followed
561
00:52:45,960 --> 00:52:50,110
is Douglas Stinson's Cryptography: Theory
and Practice, the second edition, which
562
00:52:50,110 --> 00:52:54,260
you can follow. So, I have followed that book,
although a third edition exists.
563
00:52:54,260 --> 00:52:58,330
The next day's topic would be symmetric key
ciphers, so we will use these concepts to
564
00:52:58,330 --> 00:53:02,580
go and build ciphers.
Now, therefore, symmetric ciphers, is our
565
00:53:02,580 --> 00:53:06,800
next day's topic and we will start with block
ciphers and follow that with stream ciphers.