So, we have said that we will measure the time efficiency of algorithms only up to an order of magnitude. We will express the running time as a function t(n) of the input size n, but we will ignore constants: we will only say that t(n) is proportional to n², or n log n, or 2ⁿ. Now, the next step is to have an effective way of comparing these running times across algorithms. If I know the order of magnitude of one algorithm, and the order of magnitude of another algorithm, how do I compare them?

The notation we need, or the concept we need, is that of an upper bound, which is given by the notation big O. We say that a function g(n) is an upper bound for another function t(n) if, beyond some point, g(n) dominates t(n). Now, remember that g(n) is going to be a function which is an order of magnitude, so we have thrown away all the constant factors which play a role in it. Therefore we allow ourselves a constant: we say that it is not g(n) alone which dominates t(n), but g(n) times some constant.

So, there is a fixed constant c and a limit n₀. There is an initial portion, below n₀, where we do not care, but beyond this limit we have that t(n) always lies below c times g(n). In this case c·g(n) is an upper bound for t(n), and we say that t(n) is O(g(n)).

So, let us look at an example.
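To make the definition concrete, here is a small numerical checker; the helper name and the sample range are my own, and passing a finite check like this is only supporting evidence for a big-O claim, not a proof.

```python
def dominates(t, g, c, n0, n_max=10000):
    """Check numerically that t(n) <= c * g(n) for every n0 <= n <= n_max.

    A finite check is evidence for t(n) = O(g(n)), not a proof:
    a real argument must cover all n >= n0."""
    return all(t(n) <= c * g(n) for n in range(n0, n_max + 1))

# For instance, 100n + 5 lies below 101 * n^2 once n >= 5:
print(dominates(lambda n: 100 * n + 5, lambda n: n * n, c=101, n0=5))  # True
```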
Suppose we have the function t(n) = 100n + 5. We claim that it is O(n²). Now remember that n is supposed to be the input size, and the input size to a problem is always going to be at least 1: there is no problem to solve if your input has size zero, and certainly it cannot be negative. So, we always have in mind the situation that n ≥ 1.

If we now start with our function 100n + 5, then if we choose n to be at least 5, n will be at least that constant 5. So we can say 100n + 5 ≤ 100n + n, and we can collapse this to 101n. So, 100n + 5 ≤ 101n provided n ≥ 5. Now, since n is at least 1, n² ≥ n, so 101n is going to be at most 101n². By choosing n₀ = 5 and c = 101 we have established that n² is an upper bound for 100n + 5; that is, 100n + 5 is O(n²).

Now, we can do this using a slightly different calculation. We can say that 100n + 5 ≤ 100n + 5n for n ≥ 1, because n is at least 1, so 5n is going to be at least 5. If we collapse this we get 105n, and by the same logic 105n ≤ 105n² whenever n ≥ 1.
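The two calculations give two different constant pairs, (c, n₀) = (101, 5) and (105, 1), for the same conclusion. Both can be checked numerically over a sample range (again, a finite check is a sanity check, not a proof):

```python
# Each (c, n0) pair witnesses 100n + 5 <= c * n^2 from n0 onwards
# (verified here only up to a finite limit).
for c, n0 in [(101, 5), (105, 1)]:
    assert all(100 * n + 5 <= c * n * n for n in range(n0, 10000))
print("both constant pairs work on the sample range")
```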
So, this is a new way of establishing the same fact, where we have chosen n₀ = 1 and c = 105. In other words, n₀ and c are not unique: depending on how we do the calculation we might find a different n₀ and a different c. But it does not matter how we choose them, so long as we can establish that beyond a certain n₀ there is a uniform constant c such that c·g(n) dominates t(n).

Notice that the same calculation can give us a tighter upper bound. O(n²) is a rather loose bound here; we would expect 100n to be much smaller than n². In fact we can also say that 100n + 5 is O(n). Why is that? Because if we just stop the first calculation at the point where we established 100n + 5 ≤ 101n, without going on to n², then the same values n₀ = 5 and c = 101 tell us that 100n + 5 is O(n). Likewise, in the second calculation, if we ignore the last step, we have shown that 100n + 5 ≤ 105n, so with n₀ = 1 and c = 105 we have established the same thing.

Let us look at another example: 100n² + 20n + 5. Again, assuming n ≥ 1, we know that we can multiply a term by n and it does not get any smaller. So 20n will be dominated by 20n², and 5 will be dominated by 5·n·n, that is, 5n².
So, I now have 100n² + 20n² + 5n², which is at least my original function 100n² + 20n + 5. Combining these, I get 125n², and all I have assumed is that n ≥ 1. So, for n₀ = 1 and c = 125 we have that n² dominates 100n² + 20n + 5.

You can easily see that, in general, if I have an² + bn + c (with nonnegative coefficients), it is dominated by (a + b + c)·n² for all n ≥ 1. So, generally speaking, we can take a function like this, ignore the lower terms because they are dominated by the higher term, and just focus on the term with the highest exponent. In this case n² is the biggest term, so the whole expression is going to be O(n²). This is a very typical shortcut: take an expression, ignore the coefficients, pick the largest exponent, and that gives you the big O.

Now, we can also show that things are not big O. For instance, it is intuitively clear that n³ grows faster than n². How do we formally show that n³ is not O(n²)? Well, supposing it were: then there exists some n₀ and a constant c such that for all n ≥ n₀, n³ ≤ c·n².
If n³ were O(n²), this is what we must have. Now suppose we choose n = c. Then on the left-hand side we have c³, and on the right-hand side we have c·c², which is also c³, and certainly c³ ≤ c³. But if I go to n = c + 1, I will have (c + 1)³ on the left and c·(c + 1)² on the right, and now there is a problem: (c + 1)³ is bigger than c·(c + 1)², because c + 1 > c.

Therefore, no matter what c we choose, once we go beyond n = c the inequality we want gets flipped around. So there is no c we can choose to make n³ smaller than c·n² beyond a certain point, and therefore n³ is not O(n²). Our intuitive idea that n³ grows faster than n² can be formally proved using this definition.

Now, here is a useful fact about big O. If I have a function f₁ which is O(g₁) and another function f₂ which is O(g₂), then f₁ + f₂ is actually O of the maximum of g₁ and g₂. You might think it should be g₁ + g₂; that is the obvious thing that comes to mind, that f₁ + f₂ is bounded by g₁ + g₂, but actually the max suffices.

How do we prove this? Well, it is not very difficult. By definition, if f₁ is O(g₁), there exists some n₁ and a constant c₁ such that beyond n₁, f₁ is dominated by c₁ times g₁.
Similarly, if f₂ is O(g₂), there is an n₂ and a constant c₂ such that beyond n₂, f₂ is dominated by c₂ times g₂. So now, what we can do is choose n₃ to be the maximum of n₁ and n₂, and choose c₃ to be the maximum of c₁ and c₂. Let us see what happens beyond n₃. Beyond n₃ both inequalities are in effect, because we are beyond both n₁ and n₂: both f₁ ≤ c₁·g₁ and f₂ ≤ c₂·g₂ hold. So I can add the two and get f₁ + f₂ ≤ c₁·g₁ + c₂·g₂, which is the first obvious bound in terms of g₁ + g₂. But now we can be a little clever. Since c₃ is the maximum, c₁ ≤ c₃ and c₂ ≤ c₃, so I can combine these and say that the sum is at most c₃·g₁ + c₃·g₂.

Having combined these, I can of course pull out the common factor and say this is at most c₃·(g₁ + g₂). But g₁ + g₂ is at most 2 times the maximum of the two, so this is at most c₃ · 2 · max(g₁, g₂). Taking the 2 out, f₁ + f₂ ≤ 2c₃ · max(g₁, g₂).
So now, if I take max(n₁, n₂) as my n₀ and 2c₃ as my c, then I have established that for every n beyond n₀, f₁ + f₂ is dominated by c · max(g₁, g₂), where c is 2 times the max of c₁ and c₂.

Why is this mathematical fact useful to us? Very often, when we are analyzing an algorithm, it will have different phases: it does something in one part, then it continues to some other thing, and so on. Say we have two phases, phase A which takes time O(g_A) and phase B which takes time O(g_B). What is a good upper bound for the overall running time of the algorithm? The instinctive thing would be to say g_A + g_B. But what this result tells us is that the useful upper bound is not g_A + g_B but the maximum of g_A and g_B. In other words, when we are analyzing an algorithm it is enough to look at the bottlenecks. If it goes through many steps, look at the steps which take the maximum amount of time, focus on those, and they will determine the overall running time of the algorithm. So, when we look at an algorithm which has a loop, we typically look at how long the loop takes. We may ignore the initialization that takes place before the loop, or some print statement that takes place after it, because those do not contribute as much to the complexity as the loop itself.
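As an illustrative sketch of this bottleneck idea (the function and its two phases are my own example, not from the lecture): a routine with a linear scanning phase and a sorting phase has overall bound O(max(n, n log n)) = O(n log n), set entirely by the sorting phase.

```python
def scan_then_sort(xs):
    # Phase A, O(n): a single linear scan to find the maximum.
    biggest = xs[0]
    for x in xs:
        if x > biggest:
            biggest = x
    # Phase B, O(n log n): sorting, the bottleneck phase.
    ordered = sorted(xs)
    # Overall: O(n) + O(n log n) = O(max(n, n log n)) = O(n log n).
    return biggest, ordered

print(scan_then_sort([3, 1, 4, 1, 5]))  # (5, [1, 1, 3, 4, 5])
```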
So, when we have multiple phases, it is the most inefficient phase which dominates the overall behaviour, and this is formalized by the result we just saw.

Now, there is a symmetric notion to an upper bound, namely a lower bound, described using the notation Ω (omega). Just as we said that t(n) eventually lies below c·g(n), we might say that t(n) eventually lies above c·g(n). So the definition is symmetric: t(n) is Ω(g(n)) if, for every n beyond some n₀, t(n) lies above c·g(n) for some fixed constant c. Again there is an initial portion that we are not interested in, because there nothing can be said; but beyond n₀, t(n) is always above c·g(n).

We saw earlier that n³ is not O(n²), but intuitively n³ should lie above n², and this is certainly the case, because n³ ≥ n² for every n ≥ 1. At n = 1 both are 1, at n = 2 we get 8 versus 4, and so on. So with n₀ = 1 and c = 1 we can establish that n³ is Ω(n²).

Now of course, when we are establishing an upper bound we are usually talking about the algorithm we have.
We are saying: this algorithm has an upper bound of so much, and therefore I can definitely solve the problem within this much time. When we are talking about lower bounds, however, it is not that useful to talk about a specific algorithm. It is not so useful to say that this particular algorithm takes at least so much time. What we would like to say is that this problem takes at least so much time: no matter how you write the algorithm, it is going to take at least so much time. So, typically, a useful lower-bound statement says that a problem takes a certain amount of time no matter how you try to solve it. The problem has a lower bound, rather than the algorithm.

Now, as you might imagine, this is a fairly complex thing to establish, because what you have to show is that no matter how clever you are, no matter how you design an algorithm, you cannot do better than a certain bound. This is much harder than saying "I have a specific way of doing it and I am analyzing that". So establishing lower bounds is often very tricky. One of the areas where lower bounds have been established is sorting: it can be shown that if you are relying on comparing values to sort them, then you must do on the order of n log n comparisons, no matter how you actually do the sorting.
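The standard route to this n log n bound (the lecture states it without proof) is a counting argument: a comparison sort must distinguish all n! input orders, and k yes/no comparisons can distinguish at most 2ᵏ of them, so at least log₂(n!) comparisons are needed, and log₂(n!) grows like n log n. A small numeric illustration:

```python
import math

# Minimum comparisons forced by the counting argument, ceil(log2(n!)),
# shown alongside n * log2(n) for a few input sizes.
for n in [4, 16, 64]:
    needed = math.ceil(math.log2(math.factorial(n)))
    print(n, needed, round(n * math.log2(n)))
```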
No matter how clever your sorting algorithm, it cannot be faster than n log n in terms of comparing elements. But this is hard to prove, remember, because you have to show it independently of the algorithm.

Now, we could have a nice situation where we have matching upper and lower bounds. We say that t(n) is Θ(g(n)) if it is both O(g(n)) and Ω(g(n)). In other words, with suitable constants, t(n) is dominated by g(n) and also lies above g(n), for two different constants of course. What this really means is that t(n) and g(n) are of the same order of magnitude; they are essentially the same function, so you have reached a kind of optimum.

As an example, we can show that n(n−1)/2 is Θ(n²). To prove something like this, we have to show an upper bound, that is, a constant such that that multiple of n² dominates n(n−1)/2, and a lower bound, another constant such that that multiple of n² lies below it. For the upper bound, we just expand n(n−1)/2 to get n²/2 − n/2. Since we want an upper bound, if I drop the n/2 being subtracted, the value can only increase, so n²/2 − n/2 ≤ n²/2.
Therefore, I have an upper bound saying that, with the constant one half, n(n−1)/2 is dominated by n² for all n ≥ 1. On the other hand, for the lower bound, I expand n(n−1)/2 in the same way and get the same expression, n²/2 − n/2. Now I want a lower bound, so I make the expression even smaller: instead of subtracting n/2, I subtract (n/2)·(n/2). Provided n/2 ≥ 1, I am subtracting at least as much, so n²/2 − n/2 ≥ n²/2 − (n/2)·(n/2), and (n/2)·(n/2) is n²/4. So I have n²/2 − n²/4, which simplifies to n²/4. In other words, I have shown that n(n−1)/2 ≥ n²/4. But to justify this step, that multiplying n/2 by n/2 does not decrease it, n must be at least 2; if n < 2 then n/2 is a fraction, and multiplying by it would actually reduce the amount subtracted.

So here I have a different n₀: I need n ≥ 2. I have established a lower bound which says that for n ≥ 2, n(n−1)/2 lies above one fourth of n². Therefore, choosing n₀ = 2, for all n ≥ 2 we have that n(n−1)/2 is at most half of n² and at least one fourth of n².
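These matching bounds can be checked numerically over a sample range (evidence for the algebra above, not a proof):

```python
# For n >= 2:  n^2 / 4  <=  n(n-1)/2  <=  n^2 / 2
for n in range(2, 10000):
    t = n * (n - 1) // 2          # exact, since n(n-1) is always even
    assert n * n / 4 <= t <= n * n / 2
print("quarter and half bounds hold on the sample range")
```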
So, I have found this matching upper and lower bound, which shows that n(n−1)/2 is Θ(n²).

To summarize: when we use big O, we have established an upper bound. If we say f(n) is O(g(n)), it means that g(n) dominates f(n), so f(n) is no bigger than g(n) up to a constant. This is useful to describe a limit on the worst-case running time: we can say the worst-case running time is bounded above by g(n). On the other hand, if we use Ω, we are saying that f(n) is at least g(n); g(n) is a lower bound for f(n). As we described, this is more useful for problems as a whole, sorting as a general problem, say, rather than for an individual algorithm, because it tells us that no matter how you do something, you will have to spend at least that much time. But this is hard to establish.

And if you have a situation where a lower bound has been established for a problem, and you find an algorithm which achieves the same bound as an upper bound, then you have found, in some sense, the best possible algorithm: you cannot do any better than g(n) because of the lower bound, and you have achieved g(n) because you have shown your algorithm is O(g(n)). So Θ is a way of demonstrating that you have found an algorithm which is asymptotically as efficient as possible.
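To tie the three notions together, here is a small sketch (the helper names and finite ranges are my own) of finite-range witnesses for O and Ω, applied to the lecture's examples; a Θ claim is just both at once. As before, a finite range can only support or refute a particular constant, not prove an asymptotic statement.

```python
def big_o_witness(t, g, c, n0, hi=2000):
    # Does t(n) <= c * g(n) hold for n0 <= n < hi?
    return all(t(n) <= c * g(n) for n in range(n0, hi))

def omega_witness(t, g, c, n0, hi=2000):
    # Does t(n) >= c * g(n) hold for n0 <= n < hi?
    return all(t(n) >= c * g(n) for n in range(n0, hi))

square = lambda n: n * n
cube = lambda n: n ** 3
tri = lambda n: n * (n - 1) // 2

print(big_o_witness(cube, square, c=100, n0=1))    # False: n^3 escapes 100*n^2 by n = 101
print(omega_witness(cube, square, c=1, n0=1))      # True: n^3 = Omega(n^2)
# Theta: both an upper and a lower witness for n(n-1)/2 against n^2.
print(big_o_witness(tri, square, c=0.5, n0=1) and
      omega_witness(tri, square, c=0.25, n0=2))    # True
```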