1
00:00:02,240 --> 00:00:06,640
So, we have said that we will measure the
time efficiency of algorithms only up to an
2
00:00:06,640 --> 00:00:12,410
order of magnitude. So, we will express the
running time as a function t of n of the input
3
00:00:12,410 --> 00:00:16,700
size n, but we will ignore constants. So,
we will only say that t of n is proportional
4
00:00:16,700 --> 00:00:22,550
to n square or n log n or 2 to the n. So,
now, the next step is to have an effective
5
00:00:22,550 --> 00:00:28,199
way of comparing these running times across
algorithms. If I know the order of magnitude
6
00:00:28,199 --> 00:00:35,250
of one algorithm and the order of magnitude
of another algorithm, how do I compare?
7
00:00:35,250 --> 00:00:40,840
So, the notation we need or the concept we
need is that of an upper bound which is given
8
00:00:40,840 --> 00:00:47,380
by the notation big O. So, we say that a function
g of n is an upper bound for another function
9
00:00:47,380 --> 00:00:55,090
t of n if beyond some point g of n dominates
t of n. Now, remember that g of n is going
10
00:00:55,090 --> 00:00:59,600
to be now a function which is an order of magnitude.
So, we have thrown away all the constant factors
11
00:00:59,600 --> 00:01:04,820
which would play a role in g of n. So, we allow
ourselves this constant. So, we say that it
12
00:01:04,820 --> 00:01:09,020
is not g of n alone which dominates t of n,
but g of n times some constant.
13
00:01:09,020 --> 00:01:15,619
So, there is a fixed constant c and a limit
beyond which this holds. So, there is an initial portion
14
00:01:15,619 --> 00:01:24,600
where we do not care, but beyond this limit
we have that t of n always lies below c times
15
00:01:24,600 --> 00:01:29,229
g of n. In this case c times g of n is
an upper bound for t of n and we say that
16
00:01:29,229 --> 00:01:33,229
t of n is big O of g of n.
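This definition can be spot-checked numerically. Here is a small Python sketch (not part of the lecture; the helper name and the example function are made up for illustration) that tests the big O condition t of n less than or equal to c times g of n from n 0 up to a finite limit:

```python
# Hedged sketch: a finite check of the big-O condition, not a proof.
# holds_up_to is a made-up helper, not from the lecture.

def holds_up_to(t, g, c, n0, limit):
    """Check t(n) <= c * g(n) for every n in [n0, limit]."""
    return all(t(n) <= c * g(n) for n in range(n0, limit + 1))

# Illustrative example: t(n) = 3n + 7 is O(n) with c = 10 and n0 = 1.
print(holds_up_to(lambda n: 3 * n + 7, lambda n: n, c=10, n0=1, limit=10000))  # True
```

Of course, a finite check can only fail to refute the bound; the actual claim is for all n beyond n 0.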
17
00:01:33,229 --> 00:01:40,609
So, let us look at an example. So, supposing
we have this function t of n is 100 n plus
18
00:01:40,609 --> 00:01:47,130
5. Then, we claim that it is big O of n square.
Now remember that n is supposed to be the input
19
00:01:47,130 --> 00:01:53,100
size. So, the input size to a problem is always
going to be at least 1, there is no problem
20
00:01:53,100 --> 00:01:56,780
that needs to be solved if your input is zero,
and certainly we cannot have negative input. So,
21
00:01:56,780 --> 00:02:01,970
we are always having in mind the situation
that, n is bigger than or equal to 1. So,
22
00:02:01,970 --> 00:02:07,770
if we now start with our function 100 n plus
5 then, if we choose n to be bigger than 5
23
00:02:07,770 --> 00:02:14,410
then n will be bigger than this value. So,
we can say 100 n plus 5 is smaller than or equal
24
00:02:14,410 --> 00:02:17,540
to 100 n plus n.
And now we can collapse this as 101 n, right.
25
00:02:17,540 --> 00:02:25,090
So, 100 n plus 5 is smaller than or equal to 101 n
provided n is bigger than or equal to 5. Now, since n
26
00:02:25,090 --> 00:02:31,400
is at least 1, n square is bigger than n. So,
101 times n is going to be smaller than
27
00:02:31,400 --> 00:02:37,920
101 n square. So, by choosing n 0 to be
5 and c to be 101 we have established that
28
00:02:37,920 --> 00:02:43,943
n square is an upper bound for 100 n plus 5.
So, 100 n plus 5 is big O of n square. Now,
29
00:02:43,943 --> 00:02:49,670
we can do this using a slightly different
calculation; we can say that 100 n plus 5
30
00:02:49,670 --> 00:02:54,590
is smaller than 100 n plus 5 n for n bigger
than or equal to 1, because n is at least 1. So, 5 times
31
00:02:54,590 --> 00:03:00,020
n is going to be at least 5. So, now, if we
collapse this we get 105 n. Now, by the same
32
00:03:00,020 --> 00:03:05,290
logic, 105 n is smaller than 105 n square
whenever n is bigger than or equal to 1.
33
00:03:05,290 --> 00:03:12,930
So, this is a new way of establishing the same fact,
where we have chosen n 0 equal to 1 and c
34
00:03:12,930 --> 00:03:20,020
equal to 105, right. So, n 0 and c are not unique,
right; depending on how we do the calculation
35
00:03:20,020 --> 00:03:25,290
we might find different n 0 and different
c. But it does not matter how we choose them
36
00:03:25,290 --> 00:03:29,550
so long as we can establish the fact that
beyond a certain n 0 there is a uniform constant
37
00:03:29,550 --> 00:03:34,930
c such that c times g of n dominates t of
n. Notice that the same calculation can give
38
00:03:34,930 --> 00:03:38,730
us a tighter upper bound. This is kind of
a loose upper bound; we would expect that 100
39
00:03:38,730 --> 00:03:44,260
n is smaller than n square. But we can also
say that this is big O of n, why is that?
40
00:03:44,260 --> 00:03:48,150
Because if you just stop the calculation at
this point we do not come to this stage at
41
00:03:48,150 --> 00:03:52,900
all. You have established that 100 n plus 5 is
less than or equal to 101 n. But the same values, n
42
00:03:52,900 --> 00:04:01,650
0 equal to 5 and c equal to 101 this also
tells us that 100 n plus 5 is big O of n.
43
00:04:01,650 --> 00:04:06,200
Likewise at this point if you just ignore
this step then we say that 100 n plus 5 is
44
00:04:06,200 --> 00:04:13,530
smaller than 105 n. So, for n 0 equal to 1
and c equal to 105 you have established this.
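Both derivations for this example can be sanity-checked by machine; the following Python snippet (a finite check over a range of n, not a proof, and not from the lecture itself) verifies the two bounds:

```python
# Finite sanity check of the two bounds derived above (not a proof):
# 100n + 5 <= 101 n^2 for n >= 5 (n0 = 5, c = 101), and
# 100n + 5 <= 105 n   for n >= 1 (n0 = 1, c = 105).
ok_square = all(100 * n + 5 <= 101 * n * n for n in range(5, 100001))
ok_linear = all(100 * n + 5 <= 105 * n for n in range(1, 100001))
print(ok_square, ok_linear)  # True True
```

This also illustrates the point that n 0 and c are not unique: two different pairs witness the same big O fact.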
45
00:04:13,530 --> 00:04:20,030
Let us look at another example supposing we
look at 100 n square plus 20 n plus 5. Now,
46
00:04:20,030 --> 00:04:24,590
again assuming that n is bigger than or equal
to 1, we know that we can multiply by n and not
47
00:04:24,590 --> 00:04:30,449
get any smaller. So, 20 n will be dominated
by 20 n square right and 5 will be dominated
48
00:04:30,449 --> 00:04:36,490
by 5 times n times n, that is, 5 n square. So, I now
have 100 n square plus 20 n square plus 5
49
00:04:36,490 --> 00:04:41,770
n square, which is bigger than my original function
100 n square plus 20 n plus 5. So, I combine
50
00:04:41,770 --> 00:04:47,210
these, I get 125 n square, and now all I have
assumed is that n is bigger than or equal to
51
00:04:47,210 --> 00:04:53,979
1. So, for n 0 equal to 1 and c equal to 125
we have that n square dominates 100 n square
52
00:04:53,979 --> 00:05:01,389
plus 20 n plus 5. So, you can easily see that,
in general if I have a n square plus b n plus
53
00:05:01,389 --> 00:05:08,669
c, right, this is going to be dominated by a
plus b plus c times n square, right. So,
54
00:05:08,669 --> 00:05:13,060
this is going to be less than this for all
n greater than equal to 1. So, we can generally
55
00:05:13,060 --> 00:05:18,270
speaking, take a function like this and ignore
the lower order terms because they are dominated
56
00:05:18,270 --> 00:05:22,949
by the higher term and just focus on the value
with the highest exponent.
57
00:05:22,949 --> 00:05:26,800
So, in this case in this whole thing n square
is the biggest term therefore, this whole
58
00:05:26,800 --> 00:05:31,749
thing is going to be big O of n square.
So, this is a very typical shortcut that we
59
00:05:31,749 --> 00:05:36,620
can take, you can just take an expression
ignore the coefficients pick the largest exponent
60
00:05:36,620 --> 00:05:41,279
and choose that to be the big O right.
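The shortcut can be checked concretely for the example above; a minimal sketch (a finite check, not part of the lecture), using the sum of the coefficients, 125, as the constant:

```python
# Sketch of the shortcut: for 100 n^2 + 20 n + 5, the coefficient sum
# 125 works as the big-O constant with n0 = 1 (finite check only).
ok = all(100 * n * n + 20 * n + 5 <= 125 * n * n for n in range(1, 100001))
print(ok)  # True; at n = 1 the two sides are exactly equal (125 <= 125)
```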
61
00:05:41,279 --> 00:05:47,430
Now, we can also show that things are not
big O. So, for instance it is intuitively clear
62
00:05:47,430 --> 00:05:51,930
that n cube is bigger than n square. Now,
how do we formally show that n cube is not
63
00:05:51,930 --> 00:05:57,610
big O of n square? Well, supposing it was,
then there exists some n 0, such that for
64
00:05:57,610 --> 00:06:05,689
all n bigger than or equal to n 0, n cube must
be smaller than or equal to c times n square
65
00:06:05,689 --> 00:06:11,710
right. If this were big O of n square this
is what we must have. Now supposing, we choose
66
00:06:11,710 --> 00:06:17,139
n is equal to c then we have on the left hand
side c cube, on the right hand side we have
67
00:06:17,139 --> 00:06:22,275
c cube, and certainly we have that c cube is less
than or equal to c cube. If I go to c plus 1
68
00:06:22,275 --> 00:06:28,330
I will have c plus 1 whole cube and this side
I will have c times c plus 1 whole square
69
00:06:28,330 --> 00:06:33,449
and now the problem is, this is bigger because
c plus 1 whole cube is bigger than c times
70
00:06:33,449 --> 00:06:38,279
c plus 1 whole square.
Therefore, no matter what c we choose, if
71
00:06:38,279 --> 00:06:45,570
we go beyond n equal to c we will find that the
inequality that we want gets flipped around. Therefore,
72
00:06:45,570 --> 00:06:50,430
there is no c that we can choose to make n
cube smaller than c n square beyond a certain
73
00:06:50,430 --> 00:06:54,520
point and therefore, this is not big O. So,
our intuitive idea that n cube grows faster
74
00:06:54,520 --> 00:06:59,610
than n square can be formally proved using
this definition.
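The contradiction in the argument can be illustrated numerically; this is an illustration of the step in the proof, not the formal proof itself:

```python
# Illustration (not the formal proof): for any candidate constant c,
# at n = c + 1 the required inequality n^3 <= c * n^2 fails, because
# (c+1)^3 = (c+1)^2 * (c+1) > (c+1)^2 * c.
for c in [1, 10, 100, 1000]:
    n = c + 1
    print(n ** 3 > c * n ** 2)  # True for every c tried
```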
75
00:06:59,610 --> 00:07:08,940
Now, here is a useful fact about big O, if
I have a function f 1 which is big O of g
76
00:07:08,940 --> 00:07:16,039
1 and another function f 2 which is big O
of g 2 then, f 1 plus f 2 is actually dominated
77
00:07:16,039 --> 00:07:21,020
by the max of g 1 and g 2. You might think
it is g 1 plus g 2; this is the obvious thing
78
00:07:21,020 --> 00:07:25,860
that comes to mind looking at this, that f
1 plus f 2 is smaller than g 1 plus g 2, but
79
00:07:25,860 --> 00:07:27,139
actually it is max.
80
00:07:27,139 --> 00:07:35,689
How do we prove this? Well, it is not very difficult.
By definition if f 1 is big O of g 1 there
81
00:07:35,689 --> 00:07:42,419
exists some n 1 such that beyond n 1 f 1 is
dominated by c 1 times g 1. Similarly,
82
00:07:42,419 --> 00:07:49,349
if f 2 is big O of g 2 there is an n 2 such
that beyond n 2 f 2 is dominated by c 2 times
83
00:07:49,349 --> 00:07:58,729
g 2 right. So, now, what we can do is we can
choose n 3 to be the maximum of n 1 and n
84
00:07:58,729 --> 00:08:08,430
2, and we can choose c 3 to be the maximum
of c 1 and c 2. So, now, let us see what happens
85
00:08:08,430 --> 00:08:18,050
beyond n 3, beyond n 3 both these inequalities
are effective. So, we have f 1 plus f 2 will
86
00:08:18,050 --> 00:08:24,680
be less than c 1 times g 1 plus c 2 times
g 2 right. Because, this is beyond both n
87
00:08:24,680 --> 00:08:28,439
1 and n 2 so, both f 1 is less than c 1 g
1 holds and f 2 less than c 2 g 2 holds. So,
88
00:08:28,439 --> 00:08:34,811
I can add the two, and this is the first obvious
thing that we said, the g 1 plus g 2 bound, but
89
00:08:34,811 --> 00:08:39,550
now we can be a little clever; we can say that
we have c 3. So, c 1 is smaller than c 3 because
90
00:08:39,550 --> 00:08:43,690
this is the maximum c 2 is smaller than c
3. So, I can combine these and say that this
91
00:08:43,690 --> 00:08:50,650
is less than c 3 g 1 plus c 3 g 2.
Now, having combined these I can of course,
92
00:08:50,650 --> 00:08:58,520
push them together and say this is less than
c 3 times g 1 plus g 2. But g 1 plus g 2, if
93
00:08:58,520 --> 00:09:03,810
I take the maximum of those then 2 times the
maximum will be bigger than that. So, I will
94
00:09:03,810 --> 00:09:13,709
get this is less than c 3 times 2 times the
maximum of g 1 and g 2 right. I can take this
95
00:09:13,709 --> 00:09:21,270
2 out and say that therefore, this is less
than or equal to 2 c 3 times max of g 1 and g
96
00:09:21,270 --> 00:09:32,470
2 right. So, now, if I take this as my n 0
and this as my c then I have established that
97
00:09:32,470 --> 00:09:37,600
for every n bigger than n 0, namely the maximum
of n 1 and n 2, there is a constant which is 2
98
00:09:37,600 --> 00:09:46,010
times the max of c 1 c 2 such that f 1 plus
f 2 is dominated by c times max of g 1 g 2.
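The construction in this proof can be tried out on concrete functions; the example functions below are made up for illustration (a finite check under those assumptions, not a proof):

```python
# Finite check of the sum rule with made-up example functions:
# f1(n) = 3n^2 is O(n^2) with c1 = 3, n1 = 1;
# f2(n) = 50n  is O(n)   with c2 = 50, n2 = 1.
# The proof's choices: c3 = max(c1, c2), n3 = max(n1, n2), constant 2*c3.
c1, c2 = 3, 50
n1, n2 = 1, 1
c3, n3 = max(c1, c2), max(n1, n2)
ok = all(3 * n * n + 50 * n <= 2 * c3 * max(n * n, n)
         for n in range(n3, 100001))
print(ok)  # True: f1 + f2 is bounded by 2*c3 times max(g1, g2)
```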
99
00:09:46,010 --> 00:09:50,610
Why is this mathematical fact useful to us?
100
00:09:50,610 --> 00:09:55,600
So, very often when we are analyzing an algorithm,
it will have different phases. It will do
101
00:09:55,600 --> 00:09:59,670
something in one part, then it will continue
to some other thing, and so on. So, we could
102
00:09:59,670 --> 00:10:05,520
have 2 phases, phase A which takes time big
O of g A and phase B which takes time big
103
00:10:05,520 --> 00:10:11,940
O of g B. So, now, what is a good upper bound
for the overall running time of the algorithm?
104
00:10:11,940 --> 00:10:17,120
So, the instinctive thing would be to say
g A plus g B. But what this result tells us
105
00:10:17,120 --> 00:10:21,860
is that it is not g A plus g B that is useful
for the upper bound, but the maximum of g
106
00:10:21,860 --> 00:10:27,370
A and g B right. In other words, when we are
analyzing an algorithm it is enough to look
107
00:10:27,370 --> 00:10:32,390
at the bottlenecks. If it goes through many
steps, look at the steps which take the maximum
108
00:10:32,390 --> 00:10:36,690
amount of time, focus on those and that will
determine the overall running time of the
109
00:10:36,690 --> 00:10:41,050
algorithm. So, when we look at a function,
an algorithm which has a loop, we typically
110
00:10:41,050 --> 00:10:45,190
look at how long the loop takes.
We ignore, maybe, the initialization that
111
00:10:45,190 --> 00:10:48,200
takes place before the loop or some print
statement that takes place after the loop
112
00:10:48,200 --> 00:10:52,563
because that does not contribute as much to
the complexity as the loop itself. So, when
113
00:10:52,563 --> 00:10:59,000
we have multiple phases, it is the most inefficient
phase which dominates the overall behavior
114
00:10:59,000 --> 00:11:05,089
and this is formalized by the result we just
saw.
115
00:11:05,089 --> 00:11:11,540
Now, there is a symmetric notion to an upper
bound namely a lower bound. So, just like
116
00:11:11,540 --> 00:11:16,139
we said that t of n is always lying below
c times g of n. We might say that
117
00:11:16,139 --> 00:11:22,720
t of n always lies above c times g of n and
this is described using this notation omega.
118
00:11:22,720 --> 00:11:27,199
So, this is just a symmetric definition which
just says that t of n is omega of g of n,
119
00:11:27,199 --> 00:11:33,079
if for every n beyond n 0 t of n lies above
c times g of n for some fixed constant c.
120
00:11:33,079 --> 00:11:37,950
So, here we have the same thing, we have an
initial portion that we are not interested in,
121
00:11:37,950 --> 00:11:43,730
because at this point nothing can be said.
But beyond this n 0 we have that t of n lies
122
00:11:43,730 --> 00:11:50,240
above; so, t of n is always above c times g
of n.
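The Omega condition can be spot-checked the same way as big O; the function and constants below are assumed purely for illustration (a finite check, not part of the lecture):

```python
# Finite spot-check of the Omega condition t(n) >= c * g(n) for n >= n0.
# Illustrative assumption: t(n) = n^2 - 10n is Omega(n^2) with c = 1/2, n0 = 20,
# since n^2 - 10n >= n^2/2 exactly when n >= 20.
ok = all(n * n - 10 * n >= 0.5 * n * n for n in range(20, 100001))
print(ok)  # True
```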
123
00:11:50,240 --> 00:11:56,230
So, we earlier saw that n cubed is not big
O of n square, but intuitively n cube should
124
00:11:56,230 --> 00:12:00,850
be lying above n square and this is certainly
the case because, n cubed is greater than
125
00:12:00,850 --> 00:12:06,050
equal to n square for every n bigger than or
equal to 1, right. So, at n equal to 1 both
126
00:12:06,050 --> 00:12:11,220
are 1, but at n equal to 2 this will be 8 and this
will be 4, and so on. So, given n 0 equal
127
00:12:11,220 --> 00:12:18,899
to 0 or n 0 equal to 1 and c equal to 1 we
can establish this. Now of course, when we
128
00:12:18,899 --> 00:12:22,870
are establishing an upper bound we are usually
talking about the algorithm we have. You
129
00:12:22,870 --> 00:12:27,129
are saying this algorithm has an upper bound
of so much and therefore, I can definitely
130
00:12:27,129 --> 00:12:32,140
solve the problem within this much time. Now,
when we are talking about lower bounds it
131
00:12:32,140 --> 00:12:37,029
is not that useful to talk about a specific
algorithm. It is not so useful to say that
132
00:12:37,029 --> 00:12:42,199
this algorithm takes at least so much time.
What we would like to say is something like
133
00:12:42,199 --> 00:12:46,649
this problem takes at least so much time,
no matter how you write the algorithm it is
134
00:12:46,649 --> 00:12:51,509
going to take at least so much time. So, typically
what we would like to do to make a useful
135
00:12:51,509 --> 00:12:56,829
lower bound statement is to say that a problem
takes a certain amount of time no matter how
136
00:12:56,829 --> 00:13:01,110
you try to solve it. So, the problem has a
lower bound rather than the algorithm has
137
00:13:01,110 --> 00:13:03,800
a lower bound.
Now, as you might imagine this is a fairly
138
00:13:03,800 --> 00:13:08,350
complex thing to say because what you have
to be able to show is that no matter how clever
139
00:13:08,350 --> 00:13:13,540
you are, no matter how you design an algorithm
you cannot do better than a certain thing.
140
00:13:13,540 --> 00:13:18,050
This is much harder than saying I have a specific
way of doing it and I am analyzing how to
141
00:13:18,050 --> 00:13:23,819
do that. So, establishing lower bounds is
often very tricky. One of the areas where
142
00:13:23,819 --> 00:13:29,360
lower bounds have been established is sorting.
So, it can be shown that, if you are relying
143
00:13:29,360 --> 00:13:34,769
on comparing values to sort them then, you
must at least do n log n comparisons, no matter
144
00:13:34,769 --> 00:13:39,339
how you actually do the sorting. No matter
how clever your sorting algorithm, it cannot
145
00:13:39,339 --> 00:13:45,000
be faster than n log n in terms of comparing
elements. But this is hard to do, remember, because
146
00:13:45,000 --> 00:13:50,600
you have to really show this independent of
the algorithm.
147
00:13:50,600 --> 00:13:56,470
Now, we could have a nice situation where
we have matching upper and lower bounds. So,
148
00:13:56,470 --> 00:14:03,370
we say that t is theta of g of n if it is
both, big O of g of n and omega of g of n.
149
00:14:03,370 --> 00:14:09,329
In other words, with suitable constants t
of n can be dominated by g of n, and it also
150
00:14:09,329 --> 00:14:14,490
lies above g of n for two different constants
of course. So, what this really means is that,
151
00:14:14,490 --> 00:14:19,089
t of n and g of n are basically of the
same order of magnitude, they are essentially
152
00:14:19,089 --> 00:14:24,500
the same function therefore, you have reached
a kind of optimum value.
153
00:14:24,500 --> 00:14:32,170
So, as an example we can say for instance
that n into n minus 1 by 2 is theta of n square.
154
00:14:32,170 --> 00:14:35,730
In order to prove something like this, we
have to show that there is an upper bound,
155
00:14:35,730 --> 00:14:39,730
that is we can find a constant such that c
times n square dominates this and a lower
156
00:14:39,730 --> 00:14:44,089
bound. There is another constant such that
c times n square is below this. So, for the
157
00:14:44,089 --> 00:14:50,839
upper bound we just expand out n into n minus
1 by 2. So, we get n squared by 2 in the first
158
00:14:50,839 --> 00:14:56,199
term and minus n by 2 in the second. Now, since it is
an upper bound, n squared by 2 minus n by 2,
159
00:14:56,199 --> 00:15:00,089
if I ignore n by 2, this is going to be less
than n squared by 2.
160
00:15:00,089 --> 00:15:05,339
Therefore, now I have an upper bound saying
that, with the constant half this is dominated
161
00:15:05,339 --> 00:15:11,720
by n square for n bigger than 0. On the other
hand, if I want to do a lower bound then,
162
00:15:11,720 --> 00:15:17,730
I will say the same thing, I will expand out n
into n minus 1 by 2, and I will get the same expression.
163
00:15:17,730 --> 00:15:22,860
And now I want a lower bound, so, now
what I will do is I will make this even smaller.
164
00:15:22,860 --> 00:15:33,180
I will say that I subtract not n by 2 but
n by 2 times n by 2. So, this will be bigger
165
00:15:33,180 --> 00:15:37,600
than this, because I am subtracting more.
n squared by 2 minus n by 2 will be bigger
166
00:15:37,600 --> 00:15:46,170
than n squared by 2 minus n by 2 into n by
2, but this is again n squared by 4. So, I
167
00:15:46,170 --> 00:15:52,339
have n squared by 2 minus n squared by 4,
so this simplifies to n squared by 4. In other
168
00:15:52,339 --> 00:15:56,500
words, I have shown that n into n minus 1
by 2 is bigger than or equal to n squared by
169
00:15:56,500 --> 00:16:04,980
4. But now, in order to justify this, to justify
that multiplying by n by 2 increases the term, n must be at least
170
00:16:04,980 --> 00:16:09,120
2. Because if n is smaller than 2, n by 2 is a fraction,
so I am actually reducing it.
171
00:16:09,120 --> 00:16:13,720
Here, I have a different n 0; I have n greater
than or equal to 2. I have established a lower
172
00:16:13,720 --> 00:16:20,990
bound which says that for n bigger than or equal
to 2, n into n minus 1 by 2 is above one fourth
173
00:16:20,990 --> 00:16:28,170
of n square. So, therefore now if we choose
our n 0 to be 2, for all values bigger
174
00:16:28,170 --> 00:16:34,449
than 2, I have that n into n minus 1 by 2
is less than half of n square and n into n
175
00:16:34,449 --> 00:16:38,190
minus 1 by 2 is bigger than one fourth of n square.
So, I have found this matching upper and lower
176
00:16:38,190 --> 00:16:45,800
bound which shows that n into n minus 1 by
2 is theta of n square.
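The matching bounds for this example can be verified numerically; multiplying the whole chain by 4 keeps everything in integers (again a finite check over a range, not a proof):

```python
# Finite check of the matching bounds, multiplied through by 4:
# n^2 <= 2n(n-1) <= 2n^2, i.e. n^2/4 <= n(n-1)/2 <= n^2/2, for n >= 2.
ok = all(n * n <= 2 * n * (n - 1) <= 2 * n * n for n in range(2, 100001))
print(ok)  # True
```

The lower inequality holds exactly when n is at least 2, matching the n 0 chosen in the lecture.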
177
00:16:45,800 --> 00:16:51,629
So, to summarize when we use big O, we have
discovered an upper bound. If we say f of
178
00:16:51,629 --> 00:16:57,339
n is big O of g of n, it means that g of n
dominates f of n so f of n is no bigger than
179
00:16:57,339 --> 00:17:03,029
g of n. And this is useful to describe the
limit of a worst case running time. So, we
180
00:17:03,029 --> 00:17:09,400
can say that the worst case running time is upper bounded
by g of n. On the other hand, if we use omega
181
00:17:09,400 --> 00:17:15,329
we are saying that f of n is at least g of
n, g of n is a lower bound for f of n.
182
00:17:15,329 --> 00:17:20,459
As we described this is more useful for problems
as a whole, sorting as a general problem rather
183
00:17:20,459 --> 00:17:24,410
than for an individual algorithm. Because it
tells us no matter how you do something, you
184
00:17:24,410 --> 00:17:30,419
will have to spend at least that much time,
but this is hard to establish. And if you have
185
00:17:30,419 --> 00:17:35,780
a situation where a lower bound has been established
for a problem and you find an algorithm which
186
00:17:35,780 --> 00:17:41,210
achieves the same bound as an upper bound
then, you have found in some sense the best
187
00:17:41,210 --> 00:17:45,240
possible algorithm. Because, you cannot do
any better than g of n because we have a lower
188
00:17:45,240 --> 00:17:49,020
bound of g of n and you have achieved g of
n because you have shown your algorithm is
189
00:17:49,020 --> 00:17:55,010
big O of g of n. So, theta is a way of demonstrating
that you have found an algorithm which is
190
00:17:55,010 --> 00:17:57,289
asymptotically as efficient as possible.