We will continue with our discussion on A* and heuristic search from this class onwards. So, the topic of this lecture is heuristic search: A*, and beyond. Quickly, to recap what we had done in the last class: we studied the algorithm A*, which maintains 2 lists, open and closed, and also 2 functions. One is the g value, which computes the distance of the state from the start state, and the other is the h value, which is the heuristic estimate of the distance of that state from the goal state; f(n) is the sum of g(n) and h(n), and that gives us the estimated cost of a solution which goes through the node n. So, the first step was: if open is empty and we have still not found the goal, then we terminate with failure; otherwise, we select the minimum cost state n from open and save it in closed. If the selected state is a goal state, then we terminate with success and return the f value of that state as the cost of the goal. Otherwise, we expand the node n to generate the set of successors, and for each successor m, we compute its cost based on the g value and the h value of that node. If the node already belongs to open or closed, we update it only if its cost has decreased. And if the node is already in closed and its cost has decreased, then we must bring it back to open.
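The recap above translates almost line by line into code. Here is a minimal sketch of that loop (the lazy-deletion treatment of open entries is an implementation choice of mine, not from the lecture):

```python
import heapq

def a_star(start, goal, successors, h):
    """A* as recapped above: open holds (f, g, state) entries, closed holds
    expanded states, and a state is brought back from closed to open
    whenever a cheaper path to it is found."""
    open_list = [(h(start), 0, start)]   # f = g + h, with g = 0 at the start
    g_best = {start: 0}                  # best g value found so far per state
    closed = set()
    while open_list:                     # if open empties first: failure
        f, g, n = heapq.heappop(open_list)
        if g > g_best.get(n, float("inf")):
            continue                     # stale entry, superseded by a cheaper path
        if n == goal:
            return g                     # equals f when h(goal) = 0
        closed.add(n)
        for m, cost in successors(n):
            g_m = g + cost
            if g_m < g_best.get(m, float("inf")):
                g_best[m] = g_m
                closed.discard(m)        # node comes back from closed to open
                heapq.heappush(open_list, (g_m + h(m), g_m, m))
    return None                          # terminate with failure
```

Note that ties on f are broken in favour of the smaller g here, simply because of tuple comparison; that happens to match the tie-breaking rule mentioned later in this lecture.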
Now, in uniform cost search, we had seen that if you have only positive edge costs, then you cannot have a case where a node comes back from closed to open. Now, why are we so concerned about nodes coming back from closed to open? Because what we want is that when we expand a node, we expand it only once. If we can ensure that we expand nodes only once, then our work is bounded by the set of states which any admissible algorithm must visit anyway. As we had seen in the last class, if you look at the set of states whose f value is less than the cost of the goal, then any admissible algorithm, that is, any algorithm which guarantees to find the optimal solution, will definitely have to expand those nodes, right? So, the complexity will be linear in the number of such nodes, if we can ensure that no nodes are re-expanded. If we can ensure that every node is expanded only once, then we are assured that if S is the set of nodes which must be visited by any admissible algorithm, our complexity is linear in the size of S. So, this brings us to a kind of notion of optimality. We let S denote the set of states n such that f(n) is less than C*, where C* is the cost of the optimal goal. We know that any algorithm which is admissible, any algorithm which guarantees to give us the optimal solution, will have to expand this set of states, right?
If we can have our algorithm be linear in the size of S, then that is something which is asymptotically optimal, right? Because any algorithm which is admissible and guarantees the optimal solution will have to expand at least those nodes in S. If you are linear in the size of S, that means we have an asymptotically optimal algorithm; but if we end up re-expanding those nodes, then this guarantee disappears. Then it could be quadratic in the size of S, it could be exponential in it; we do not know, right? What we have seen is that in uniform cost search there is no such re-expansion, because a node never comes back from closed to open. Once we have expanded it and put the node in closed, it is done; we do not re-expand it anymore, right? But in the case of heuristic search, we can have scenarios where a state is re-expanded, and this is because of what we call non-monotonicity of the heuristic function. I will give you an example to show where a node will come back from closed to open. So, the example that I will show is like this: we have the state 1, and let us say that the costs of its outgoing edges are 3 and 2; this is the graph. Now, let us say that the heuristic function is as follows: at every state, it gives us an estimate of the cost to the goal. We will assume that the heuristic function h that is given to us always underestimates, so it always gives us a lower bound, right?
Now, even if it underestimates, you can have an accurate heuristic and an inaccurate heuristic. So, suppose that this heuristic function gives us a pretty accurate estimate here at state 3. If you can see, the optimal cost from 3 is 24, where 6 is the goal, and we have a heuristic value of 23 here. That is a pretty accurate estimate. Unfortunately, in the other states, the heuristic gives us a pretty weak estimate. So, let us say we have this kind of scenario, right? Now, let us run A* on this and see what happens, and I will also maintain closed, to show that some nodes will come back from closed to open. Intuitively, what will happen is this: because of this heuristic, state 3 will have a pretty accurate cost value, and therefore will not be expanded until we have expanded this whole other path. And then, near the end of that path, we will realize that what we had all along it was an inaccurate heuristic, and we will again re-expand this path. So, let us see how we go about doing this. Initially, we have 1 with a cost of 5 in open, right? In the first step, we expand that, and that is going to give us 2 states: 2 with a cost of 4 plus 3, that is 7, and 3 with a pretty accurate cost, which is 23 plus 2, that is 25. In the next step, we are going to pick up 2 with a cost of 7.
So, 3 with a cost of 25 is going to remain in open, and we are going to have 4 with a cost of 7 plus 2, so 9, right? In the next step, we are going to pick up 4 with a cost of 9, and we are going to have 3 with a cost of 25 here, and 5 with a cost of 8 plus 3, that is 11, right? In the next step, we are going to pick up 5 with a cost of 11, and then we have 3 with a cost of 25, and 6 is going to come in with a cost of 28, because it is this path which has a cost of 28. See, now we find that 3 has a lesser cost. At this point, though the goal is there in open, we cannot pick up the goal, because there can be a better path to the goal. So, we cannot pick it up and declare that we have found the solution. No. So, we will expand 3. When we expand 3, now see what happens: we get 4, but now 4 comes with a cost of 5 plus 2, that is 7, right? So, the cost of 4 has decreased. We will take it out from closed and bring it back to open with a cost of 7, and then we have 6 with a cost of 28. Next step: 4 with a cost of 7 is picked up, right? So, we have 6 with a cost of 28, and 5 now comes in with a cost of, how much? 9, right? And this is better than the cost with which it was in closed, so we bring it back from closed to open, right? Now, we pick up 5 with a cost of 9, and that gives us 6 with a cost of 26.
So, this gets replaced with 6 with a cost of 26, and we pick up 6 finally and find that the optimal cost is 26, right? So, we see that we had to do a lot of extra work, because our heuristic values were such that along the optimal path the heuristic was well informed, and along the sub-optimal paths it was not well informed. So, we had to expand the sub-optimal path to a large extent, right? Now, there are several cases where we can make some improvements based on this notion. We are going to come back to that, but before that, let us study a few properties of A*. Firstly, we will say that a heuristic is called admissible if it always underestimates; that is, we always have h(n) as a lower bound on h*(n), where h*(n) denotes the minimum distance to a goal state from the state n. This is what we have been talking about all the time: the heuristic is called admissible if it underestimates. What happens if it overestimates? See, if it overestimates, then we do not have that nice property. We had talked about the property that the nodes with f(n) less than C* are the set of states which are expanded, and all states which have f(n) greater than C* are never expanded, right?
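The numbers read out in the trace above are consistent with the following graph; treat the edge costs and heuristic values as my reconstruction of the board example, not a stated figure. Running A* on it reproduces the re-expansions:

```python
import heapq

# Reconstructed example: h(3) = 23 is nearly exact (the true cost from 3 is 24),
# while h along the path 1 -> 2 -> 4 -> 5 underestimates badly.
edges = {1: [(2, 3), (3, 2)], 2: [(4, 4)], 3: [(4, 3)],
         4: [(5, 1)], 5: [(6, 20)], 6: []}
h = {1: 5, 2: 4, 3: 23, 4: 2, 5: 3, 6: 0}

def a_star_counting(start, goal):
    """A* that also counts how many times each node is expanded."""
    open_list = [(h[start], 0, start)]
    g_best = {start: 0}
    closed = set()
    expansions = {}
    while open_list:
        f, g, n = heapq.heappop(open_list)
        if g > g_best.get(n, float("inf")):
            continue                     # stale open entry
        if n == goal:
            return g, expansions
        closed.add(n)
        expansions[n] = expansions.get(n, 0) + 1
        for m, c in edges[n]:
            if g + c < g_best.get(m, float("inf")):
                g_best[m] = g + c
                closed.discard(m)        # back from closed to open
                heapq.heappush(open_list, (g + c + h[m], g + c, m))
    return None, expansions

cost, expansions = a_star_counting(1, 6)
# cost is 26, and nodes 4 and 5 are each expanded twice, as in the trace
```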
So, that means, if you look at the state space tree, then all the states which have cost less than C*, that is, all the states in this region, are going to be expanded, and the ones which are greater than C*, this whole region, are never expanded. So, this gives us a nice bound: depending on the accuracy of the heuristic function that you have, we know that these are the states that have to be expanded anyway, right? Now, the moment you do not have underestimating heuristics, what can happen is that along some path the heuristic value overestimates. At some state, you will have f(n) greater than C*, but because the heuristic has overestimated, it is still possible that below this, somewhere, you have a goal which has cost less than C*. That possibility cannot be ruled out. In contrast, when we have underestimating heuristics, once the estimated cost has exceeded C*, the actual cost can only be more. If you look at the states in this frontier, you know that the actual cost of these states can only be more. So, any goal that you find beyond this will also have cost greater than C*, and there is no point in expanding those nodes. But the moment the heuristics overestimate, it is quite possible that there is another goal beyond this which has cost less than C*.
And we have this f value greater than C* because our heuristic has given an overestimate, right? As soon as we have overestimating heuristics, this nice property disappears, and that means that even when you have found a goal, you still cannot be sure about the nodes that are there in open: if you expand them, you can still find a lesser cost goal, right? Because the actual cost can turn out to be less than the estimate beyond that point. That is why, though this property will not be there, there are some advantages of using overestimating heuristics, which I will briefly touch upon later. When you actually do the experimentation, you will find that there is some merit in using overestimating heuristics. As of now, we are assuming that the heuristics are admissible, that is, they are underestimating heuristics. So, for finite state spaces, A* always terminates. We need not consider the proof of this; it is very straightforward, right? Because we are going best first, based on the cost criterion, if you have a finite state space and finite costs, you will eventually reach the goal. The next property: at any time before A* terminates, there exists in open a state n that is on an optimal path from s to a goal state, with f(n) less than or equal to f*(s). Why is that the case?
Suppose in the state space, let us say this is our goal, and let us say that this is the optimal cost path to the goal. Our objective is to find this, right? Now, we are claiming that until A* terminates, at least one of the states on this path will always be there in open. Until A* terminated, there was always some state of the optimal path in open. So, now let us look at our example once again, just to get this idea clear. The optimal path went through 3; 3 was that state which was there in open, right? And then, when 3 got expanded, 4 was the state on the optimal path which was in open. Then 4 got expanded, then 5 was there on the optimal path, right? So, at every point we had 1, then 3, then 4, then 5, and then we find the goal, right? Now, you can prove this by induction, where initially it is the start state which is on the optimal path and which is there in open, right? When you expand the start state, you are going to have a set of states that are the successors of the start state, and one of these successors is on the optimal path. That fellow will be there in open. As long as you do not expand that state, you will not terminate, because that state is always going to have a cost not exceeding C*. You will have to eventually expand it.
As long as you do not, that state is the one which is there in open. When you expand it, one of its successors on the optimal path will come into open. In this way, it will continue until you have picked up the goal for expansion; one of those states will always be there, right? You design the heuristic in such a way that it is admissible. For example, look at the Manhattan distance heuristic for the 15-puzzle: we said that at least that many moves will have to be taken for each tile, so if you just add up the Manhattan distances of each of the tiles from their final positions, you get an underestimate. Likewise, when we talked about the minimum cost spanning tree heuristic for the travelling salesperson problem, we know that the cost of the minimum spanning tree is an underestimate of the minimum cost tour. Usually, you will be able to find something like this. The worst case scenario is that you do not know anything about the problem, in which case your heuristic function is zero everywhere, and you boil down to uniform cost search. If you do not have a good underestimate but have a pretty tight overestimate, then it is better to go for depth first branch and bound. If you do not have a good overestimate but you have a good underestimate, then you can go for heuristic search, like A*.
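The Manhattan distance heuristic just mentioned is easy to write down; a minimal sketch for the 15-puzzle (the row-major board encoding and goal convention are my own choices):

```python
def manhattan(board, n=4):
    """Sum of the Manhattan distances of each tile from its goal position.
    `board` is a tuple of n*n entries in row-major order, 0 is the blank,
    and the goal places tile t at index t - 1. Each tile needs at least
    that many moves, and the blank is not counted, so the sum is a lower
    bound on the number of moves needed to solve the puzzle."""
    total = 0
    for idx, tile in enumerate(board):
        if tile == 0:
            continue
        goal_idx = tile - 1
        total += abs(idx // n - goal_idx // n) + abs(idx % n - goal_idx % n)
    return total
```

On the solved board the estimate is 0, and a board one slide away from solved gets an estimate of exactly 1.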
The design of heuristics is, by itself, an interesting topic. There are approximation algorithms whose solutions are guaranteed to stay within a known factor of the actual optimal solution. For example, there are approximation algorithms which guarantee a solution of cost at most k times the optimal; so, if you apply such an algorithm and divide the solution cost by k, you know you have an underestimate. So, there are various ways of designing a heuristic which is an underestimate. When we work out some problems, you will see how to design such heuristics. Okay? Then, let us continue with this. So, we say that algorithm A* is admissible; that is, if there is a path from s to a goal state, then A* terminates by finding an optimal path. Try to look at the proof of this result in Nilsson's Principles of Artificial Intelligence; the proof is given there, right? Now, here is an interesting question, with respect to the accuracy of the heuristics. Suppose we have A1 and A2, two versions of A*, such that A2 uses a heuristic which is more informed than the one A1 uses. What do we mean by more informed? Suppose the heuristics are both admissible: h1 is admissible, and h2 is also admissible, right? So, they both underestimate. So, when should we call h2 more informed than h1?
If for all states n, h2(n) is greater than, or less than? Greater than, right? If h2 is greater than h1, then h2 is more informed than h1, but remember that though h2 is greater than h1 at every state, it is still the case that both are underestimates of the optimal solution cost. So, h2 is tighter than h1. So, what this result on the slides tells us is that if A1 and A2 are two versions of A* such that A2 is more informed than A1, then A1 expands at least as many states as A2. Now, that is very easy to see from the result which we saw, that the states with f(n) less than C* will always have to be expanded. The set of states which have f(n) less than C* will be larger when we use the less informed heuristic than when we use the more informed one. With the more informed heuristic, some of the states will now have costs greater than C*, and they will not be expanded, right? But if I have both of these heuristics, h1 and h2, and we do not know which is more informed than the other, then what heuristic will we use for A*? Suppose we have h1 and we also have h2; both, we know, are good heuristic functions, both do not have much overhead of computation, and we want to run A* to solve the problem. What heuristic function will we use? The average? Why the average of the two? So, what if we have h1 and h2?
Then, it makes sense to use the max of h1(n) and h2(n) at every state, right? Because the max of h1 and h2 is also going to be an underestimate, and it is also going to be at least as informed as h1 and at least as informed as h2, perhaps better. Yes, but in practice, we also have to look at the actual overhead of computing the heuristic. There is a balance that one has to strike. In some problems, the state space is so bad that unless you use a very good heuristic, you will run into trouble; in those cases, you do not mind spending more time computing a better, more accurate heuristic and then using it. In other problems, where near-optimal goals are easy to find, you may just go with a heuristic which is easy to compute, right? Now, we come to the issue of monotone heuristics. As we had seen previously in that example, one of the main problems was that we had some state in between which was well informed, while the others were not well informed, right? Now, let us look at a slightly different scenario, where instead of 5 at state 1, we have a more informed heuristic which gives us 25. So, the only difference from what we had done previously is that we now have 25 here, okay? Now, let us see what happens. We have open and we have closed. When we have 1 out here, it is going to be there with a cost of 25.
In the first step, we are going to take 1 and expand that. Now, note one interesting thing about what we generate here. We have 3 with a cost of 25, as we had before, and now, if you just simply use g(m) plus h(m), then you will end up having, as before, 2 with a cost of 7. Now, that is bad, because we know that from 1, any solution path is going to cost at least 25, and yet this says that from 2 you can reach the goal with a cost of 7. We clearly know that that is not possible, because we already know that from 1, we require 25. If you could do it from 2 with a cost of 7, then from 1 you should be able to do it with a cost of 10; but the heuristic, which is an underestimate, tells us that at least 25 is required. So, what we can do is, we can say that this is clearly an underestimate which is inaccurate, and we can safely upgrade this cost to 25. Why? Because we know that if at least 25 is needed from 1, and this edge costs 3, then at least 22 is needed from 2, right? So, we can safely replace this heuristic value with 22, and then the new f value is 22 plus 3, which is 25. Okay. Now, we have 2 states, one with a cost of 25, and another also with a cost of 25.
Now, recall that when we have 2 states with the same cost, we choose the one which has the lesser g value, because the other one has already incurred more cost, and the remaining part is only the heuristic. So, we will pick up 3 first, because it has the lesser g value. So, we will expand 3 with a cost of 25, and then we will have 2 with a cost of 25 here, and 4 with a cost of how much? Again, we will use the same reasoning. See, if we just used 2 plus 3 plus 2, that is again not accurate. So, 4 comes in with a cost of 25, by upgrading its heuristic value to 23 minus 3, that is 20. Now, 2 will be picked up, because it has the lesser g value. So, pick up 2 with a cost of 25, and what we have in open is 4 with a cost of 25. Because of the expansion of 2, we will again generate 4, but if we take this path, then the g value comes to 3 plus 4, that is 7, plus the heuristic of 20, so 27. 27 is larger than this, so we will ignore that and just stay with 4 at a cost of 25. And then, you can see that we will pick up 4 with a cost of 25, we will get 5 with a cost of 25, and then we will expand 5 with a cost of 25, and we will get 6 with a cost of 26, right? So, that was nice, because we upgraded the heuristics nicely and avoided all those re-expansions, right? So, we define a heuristic function to be monotonic if we do not have this problem.
That is, if for every successor m of a node n, h(n) minus h(m) is less than or equal to c(n, m); if that holds, then the heuristic is monotonic, right? But clearly, that was not what was happening here, in our example. And if this condition does not hold, then we can update h(m) to have the value h(n) minus c(n, m), okay? Now, it so happens that if the monotone restriction is satisfied, then A* has already found an optimal path to the state it selects for expansion, because if the monotone restriction is satisfied, then along every path you will reach each state with its least cost before expanding it. We will just skip over the remaining properties; you can check whether they hold, or convince yourselves by trying to prove them. Let us go over to what we call pathmax. Pathmax is what we just now discovered. What we do is, when we generate the successor m of n, we set the heuristic value to the maximum of h(m) and h(n) minus c(n, m). h(m) is the heuristic value that we computed at that state, and because n is the parent, h(n) minus c(n, m) is also a lower bound on the cost from m: if you could reach the goal from m with a cost less than that, then, adding the edge cost c(n, m), you would be doing better than h(n), which is impossible if h(n) is an underestimate. And because h(m) is itself an underestimate, you can never actually do better than h(m) either. So, coming back to our example, you see that what we had done here was exactly that.
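That pathmax update can be written down directly; a minimal sketch in code (the dictionary representation of h is mine):

```python
def pathmax(h, n, m, c_nm):
    """When generating successor m of n, raise h[m] to h[n] - c_nm if that
    is larger: if h[n] is a lower bound on the cost-to-go from n, then
    h[n] - c_nm is also a lower bound on the cost-to-go from m."""
    h[m] = max(h[m], h[n] - c_nm)

# The step from the example above: h(1) = 25, h(2) = 4, edge cost c(1, 2) = 3.
h = {1: 25, 2: 4}
pathmax(h, 1, 2, 3)   # h[2] becomes max(4, 25 - 3) = 22, so f(2) = 3 + 22 = 25
```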
We had h(n) as 25 and h(m) as 4. So, we took the max of 4 and 25 minus 3, that is 22, and 22 is what we updated the heuristic function to be, right? So, what we did there is exactly what is on the slides: we update h(m) to be the maximum of h(m) and h(n) minus c(n, m), right? Okay. Let us go further: inadmissible heuristics. Now, we have seen that the more accurate your heuristic is, the more states are pushed beyond the C* boundary, right? In A*, if C* is the optimal cost, then all states having an f value less than C* must be expanded, and the ones that are beyond C* will not be expanded. If your heuristic is weak, then a lot of states will be within C*, precisely because the heuristic function is weak, right? If the heuristic function is zero at all points, then you have uniform cost search, in which case all those states whose actual cost is less than C* will have to be expanded. If you have a good heuristic, then some of these states will now have cost greater than C*, and that is the advantage of heuristic search: those states are not visited, because of the heuristic function. Now, if you have an overestimating heuristic, then many more of those states can again go outside the C* boundary, right?
As long as 276 00:40:44,349 --> 00:40:51,349 those states are not on the paths to your optimal solutions, you stand to gain. Could 277 00:40:54,470 --> 00:41:01,470 I make myself vaguely clear? Right. So, here, if you use an underestimating heuristic, 278 00:41:09,859 --> 00:41:16,859 then I have, let us say, this set of states with fn less than C*, okay? And the 279 00:41:20,690 --> 00:41:27,690 goal is one of these states, right, or just beyond this. So, this is on the optimal solution, 280 00:41:30,940 --> 00:41:32,079 right? 281 00:41:32,079 --> 00:41:39,079 Now, if we use an overestimating heuristic, then the cost of some of these states 282 00:41:43,299 --> 00:41:48,349 can be more, because the heuristic is overestimating. 283 00:41:48,349 --> 00:41:55,240 So, these costs can be larger than what they are now. Same states, but because the h value 284 00:41:55,240 --> 00:42:02,240 is more, h plus g is also more, right? So, it is quite possible that some of these 285 00:42:02,289 --> 00:42:09,289 states will now have a cost greater than C*, when we 286 00:42:10,940 --> 00:42:17,789 use the new heuristic function and compute f dash n based on the heuristic function h 287 00:42:17,789 --> 00:42:24,789 dash n. Using this new heuristic function, which is an overestimating heuristic function, 288 00:42:26,410 --> 00:42:33,410 some of these states can jump over to the greater than C* region, right? Then, the set 289 00:42:36,150 --> 00:42:41,319 of states your A* will visit has actually decreased, because it is now going to visit 290 00:42:41,319 --> 00:42:42,599 only this set of states. 291 00:42:42,599 --> 00:42:47,720 These are not going to be visited anymore, 292 00:42:47,720 --> 00:42:54,720 right? So, we do stand to gain.
But, if it pushes these states also beyond C*, then again, 293 00:42:59,609 --> 00:43:06,609 the gain is not that much, right? There is a tradeoff that we want to make. Suppose we 294 00:43:08,230 --> 00:43:14,789 are not interested in getting the exact optimal solution; we are satisfied if it is 295 00:43:14,789 --> 00:43:21,420 close to optimal. So, what we will do is, we will make the heuristic overestimate. That 296 00:43:21,420 --> 00:43:28,180 is going to push several other states beyond C*. So, the set of states that I will 297 00:43:28,180 --> 00:43:35,180 be expanding using A* will now be much smaller, right? If you have a sub-optimal goal within 298 00:43:36,859 --> 00:43:43,859 that set of states, then you are done, right? So, this balance is again used in many cases. 299 00:43:49,089 --> 00:43:55,980 For example, in bioinformatics, there are now several problems where we are 300 00:43:55,980 --> 00:44:02,289 using heuristic search techniques, and it has been found that in some of the tougher 301 00:44:02,289 --> 00:44:07,520 kinds of problems, like multiple sequence alignment- we will discuss these problems 302 00:44:07,520 --> 00:44:14,520 in the tutorial. What happens is that, if you use overestimating heuristics, the performance 303 00:44:15,510 --> 00:44:22,510 improves by almost an order of magnitude- by 10 times, 50 times- in 304 00:44:22,730 --> 00:44:29,730 terms of time, and because there are lots of solutions very close together, getting 305 00:44:30,829 --> 00:44:36,420 a sub-optimal solution quickly is easy, but if you want to improve that to get the optimal 306 00:44:36,420 --> 00:44:41,460 solution, you have to do a lot of work. But, in many cases, we do not require that amount 307 00:44:41,460 --> 00:44:48,460 of accuracy. Okay.
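This idea of deliberately inflating an admissible heuristic is commonly implemented as weighted A*, where the evaluation function becomes f(n) = g(n) + w * h(n) for some w > 1. The sketch below is a minimal illustration under that assumption; the toy graph, heuristic values, and function names are made up, not taken from the lecture:

```python
import heapq

def weighted_astar(start, goal, neighbors, h, w=1.5):
    """A* with an inflated heuristic f(n) = g(n) + w * h(n).
    With an admissible h, the returned solution costs at most
    w times the optimal cost, but more states may fall beyond
    the C* boundary, so fewer states get expanded."""
    open_heap = [(w * h(start), 0, start, [start])]
    best_g = {start: 0}
    while open_heap:
        f, g, n, path = heapq.heappop(open_heap)
        if n == goal:
            return g, path
        for m, cost in neighbors(n):
            g2 = g + cost
            if g2 < best_g.get(m, float("inf")):
                best_g[m] = g2
                heapq.heappush(open_heap, (g2 + w * h(m), g2, m, path + [m]))
    return None

# Toy graph (illustrative): s -> a -> g costs 6, s -> b -> g costs 5.
graph = {"s": [("a", 1), ("b", 4)], "a": [("g", 5)], "b": [("g", 1)], "g": []}
h = {"s": 3, "a": 4, "b": 1, "g": 0}
print(weighted_astar("s", "g", lambda n: graph[n], lambda n: h[n]))
# (5, ['s', 'b', 'g'])
```

On this tiny example the weighted search still happens to find the optimal path; in general it only guarantees a solution within the factor w.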
So, if you want to say that I want some k times optimal, 308 00:44:54,220 --> 00:45:01,220 where k is, say, 1.5- I am satisfied if it is within 1.5 times optimal. 309 00:45:02,270 --> 00:45:08,390 Then, you can tune your inadmissible heuristic so that it overestimates by not more than 310 00:45:08,390 --> 00:45:14,650 1.5 times, and then you use that heuristic. That guarantees that the solution is going 311 00:45:14,650 --> 00:45:21,650 to be within that bound, right, and the drawback that we have for inadmissible heuristics is that A* 312 00:45:25,549 --> 00:45:32,170 may terminate with a sub-optimal solution. But if that is acceptable, then using inadmissible 313 00:45:32,170 --> 00:45:39,170 heuristics is a smart way of cutting down the search time. 314 00:45:39,250 --> 00:45:46,250 Just as we had iterative deepening, we do have iterative deepening A* (IDA*), and the idea 315 00:45:47,799 --> 00:45:54,799 is similar- that we will use depth first search, right, but unlike depth first search without 316 00:45:59,470 --> 00:46:05,099 heuristics, what we are going to do here is: we are going to use depth first search using 317 00:46:05,099 --> 00:46:12,099 the heuristic. Initially, we will set c to be the f value of the start state. Then, we will 318 00:46:19,240 --> 00:46:26,240 perform depth first branch and bound with the cutoff c. So, we are going to expand 319 00:46:28,000 --> 00:46:35,000 all states that have the f value less than or equal to c. If during this, you select 320 00:46:38,680 --> 00:46:45,680 a goal for expansion, then we return that and terminate. Otherwise, we will 321 00:46:50,160 --> 00:46:57,099 update c to the minimum f value which exceeded c among the states which were examined, and 322 00:46:57,099 --> 00:46:59,349 repeat the same. 323 00:46:59,349 --> 00:47:06,349 So, this picture will explain the thing nicely. We start with the state s, which has cost 324 00:47:09,869 --> 00:47:16,869 fs.
We perform depth first branch and bound with this, so it could be the case that s 325 00:47:21,089 --> 00:47:27,710 is the only state which has this cost, or it could be that s and several of its successors 326 00:47:27,710 --> 00:47:34,710 have the cost fs. So, we expand all those states, right? If we have found a goal anywhere here, 327 00:47:40,680 --> 00:47:47,680 then we terminate and return that goal. Otherwise, we look at this frontier. What is this frontier? 328 00:47:50,099 --> 00:47:57,099 This frontier is the set of states whose cost exceeded the current bound- initially, 329 00:47:59,250 --> 00:48:06,250 the current bound was fs. So, this frontier is the set of states 330 00:48:12,789 --> 00:48:19,789 where fn has exceeded c, right? 331 00:48:22,890 --> 00:48:29,890 All the parents of these states had cost of c or less, right? That is why they were expanded. 332 00:48:34,630 --> 00:48:39,950 We did a depth first branch and bound, and we backtracked only when we found that we 333 00:48:39,950 --> 00:48:44,940 had reached a state whose cost is greater than c. We found a cost greater than c, 334 00:48:44,940 --> 00:48:50,130 backtracked, tried another path, again reached a state with cost greater than c, backtracked, found another 335 00:48:50,130 --> 00:48:56,230 path, again reached a state with cost greater than c. So, this frontier, from where we backtracked 336 00:48:56,230 --> 00:49:03,230 during depth first branch and bound, consists of those states whose cost exceeds c and 337 00:49:04,160 --> 00:49:08,430 whose parents' cost is less than or equal to c, right? 338 00:49:08,430 --> 00:49:15,430 From this frontier, I pick up the minimum cost state, right? So, let us say that z is 339 00:49:17,819 --> 00:49:24,819 the minimum cost state in this frontier, right, and then update c to fz, okay? What is fz?
340 00:49:35,359 --> 00:49:42,359 fz is the cost of the minimum cost state which exceeds the cutoff c, 341 00:49:48,779 --> 00:49:55,779 right? That is fz, right? If you think of A*, then once it had 342 00:49:58,119 --> 00:50:02,670 finished off all the predecessors of these frontier states, it would have picked up this 343 00:50:02,670 --> 00:50:09,460 state z from open, because open would have had these frontier states, right? And it would 344 00:50:09,460 --> 00:50:12,829 have picked up the minimum cost state from there. So, A* would have picked up this state 345 00:50:12,829 --> 00:50:19,829 only. Now, what we do is, we do a depth first branch and bound with fz as the cutoff. 346 00:50:23,900 --> 00:50:28,750 So, with this new c as the cutoff, what is going to happen? 347 00:50:28,750 --> 00:50:34,029 In this frontier, this state is going to get expanded, so that frontier is going to get 348 00:50:34,029 --> 00:50:41,029 pushed a little bit forward. Now, the new frontier is these states and these states, 349 00:50:43,529 --> 00:50:50,529 right? Again, if you look at this frontier, pick up the one which has the minimum cost; 350 00:50:55,349 --> 00:51:02,349 let us say that some state q here now has the minimum cost. So, set c to fq, right? 351 00:51:03,789 --> 00:51:10,789 So, what are we doing? We are always expanding a new state, as long as we have already exhausted 352 00:51:15,829 --> 00:51:22,829 all states having less cost, right, and that is exactly what A* would also have done. 353 00:51:27,980 --> 00:51:34,980 It will always expand a new state, provided all states with less cost in open have been 354 00:51:35,079 --> 00:51:42,079 expanded, right, and that is where the cost increases. When I have finished all states 355 00:51:43,500 --> 00:51:50,500 having a given cost in open, it is only then that I pick up the next 356 00:51:52,049 --> 00:51:59,049 larger cost, right?
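The loop just walked through can be sketched as follows. This is a minimal illustrative implementation of the idea, assuming tree-like recursion with cycle checking on the current path; the toy graph, heuristic values, and function names are made up:

```python
import math

def ida_star(start, is_goal, neighbors, h):
    """Iterative deepening A*: repeat a depth first branch and
    bound with cutoff c, raising c each round to the smallest
    f value that exceeded the previous cutoff."""
    c = h(start)  # initial cutoff: f(start) = 0 + h(start)

    def dfbb(n, g, path):
        f = g + h(n)
        if f > c:
            return f, None          # backtrack; report the overshoot
        if is_goal(n):
            return f, list(path)
        nxt = math.inf              # min f value exceeding c below n
        for m, cost in neighbors(n):
            if m in path:           # avoid cycles on the current path
                continue
            path.append(m)
            t, sol = dfbb(m, g + cost, path)
            path.pop()
            if sol is not None:
                return t, sol
            nxt = min(nxt, t)
        return nxt, None

    while True:
        t, sol = dfbb(start, 0, [start])
        if sol is not None:
            return sol
        if math.isinf(t):
            return None             # nothing left beyond the cutoff: failure
        c = t                       # next cutoff: min f that exceeded c

# Toy graph (illustrative): s -> a -> g costs 5, s -> b -> g costs 3.
graph = {"s": [("a", 1), ("b", 2)], "a": [("g", 4)], "b": [("g", 1)], "g": []}
h = {"s": 2, "a": 2, "b": 1, "g": 0}
print(ida_star("s", lambda n: n == "g", lambda n: graph[n], lambda n: h[n]))
# ['s', 'b', 'g']
```

Note that the only memory used is the recursion stack holding the current path: there is no open or closed list, which is exactly the space advantage discussed next.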
The advantage of doing this is that we do not store open or closed. 357 00:52:00,990 --> 00:52:02,750 We do not require open or closed. 358 00:52:02,750 --> 00:52:09,750 We are just doing depth first. So, the space requirement is linear in the size of the paths that you 359 00:52:14,119 --> 00:52:21,119 are exploring, so we do not have the space blow up of A*, but at the same time, because 360 00:52:22,359 --> 00:52:29,359 we are doing iterative deepening, we are not going purely depth first, in which case, 361 00:52:29,730 --> 00:52:34,619 we might have missed a goal which was pretty close to the start state. So, we are going 362 00:52:34,619 --> 00:52:41,210 in A* style, but without saving open or closed, by progressively extending the 363 00:52:41,210 --> 00:52:48,210 frontier. And in practice, this works quite well. To see the kind of bounds that iterative 364 00:52:55,319 --> 00:53:01,220 deepening gives us, consider A*. It expands n states. 365 00:53:01,220 --> 00:53:08,220 Then IDA* can expand 1 plus 2 plus 3 plus up to n, or order n squared. Why so? Because, 366 00:53:11,650 --> 00:53:16,980 if you look at the iterations, in every iteration, what may happen is, when you pick up a new 367 00:53:16,980 --> 00:53:23,980 state, it may happen that in the next iteration, only that state is newly expanded. If all the state 368 00:53:25,319 --> 00:53:32,319 costs are distinct- text, please? Yes, in every iteration, what may happen is, when you 369 00:53:36,569 --> 00:53:42,869 pick up the state from the frontier, this may be the only state having this cost, in 370 00:53:42,869 --> 00:53:48,140 which case, it is the only new state that is going to get expanded, and in the next iteration, 371 00:53:48,140 --> 00:53:55,140 you will revisit all these; again, you will pick up one state and just expand that. And 372 00:53:55,400 --> 00:53:59,799 in the next iteration, you will revisit all these, right?
373 00:53:59,799 --> 00:54:06,799 So, in this case, what happens is, if you look at the A* sequence of states here, in 374 00:54:07,869 --> 00:54:14,319 the first iteration, you may expand one; in the second iteration, you are expanding 2 states; in every 375 00:54:14,319 --> 00:54:19,420 iteration, only one new state is getting expanded; in the third iteration, 3 states are getting expanded, 376 00:54:19,420 --> 00:54:26,010 and so on, until you expand all n states. And that is the set of states which A* will 377 00:54:26,010 --> 00:54:33,010 expand. Because you are not saving them, we are 378 00:54:35,269 --> 00:54:42,269 again doing a DFBB on the state space with a new cost cutoff, right? So, it is all in 379 00:54:43,720 --> 00:54:48,799 the interest of saving space, because space is the thing which will kill you in this kind 380 00:54:48,799 --> 00:54:53,950 of state space. So, this gives you, in the worst case, as you can see, order of n squared, 381 00:54:53,950 --> 00:54:58,920 where n is the number of states which A* expands. n is the number of states that I was mentioning- 382 00:54:58,920 --> 00:55:02,230 all states with cost less than C*. 383 00:55:02,230 --> 00:55:06,829 It is going to be, in the worst case, quadratic in time, as compared 384 00:55:06,829 --> 00:55:13,829 to A*. So, the time increase is only quadratic, but the space saving is exponential, because 385 00:55:17,369 --> 00:55:22,269 A*'s space can grow in the order of b to the power of m, where b is the branching factor. But here, 386 00:55:22,269 --> 00:55:29,269 you are doing it in linear space, right? So, it is asymptotically optimal. 387 00:55:30,400 --> 00:55:37,400 Okay. So, there are several extensions of these basic memory bounded search strategies. 388 00:55:42,710 --> 00:55:48,700 Maybe sometime later, in some later lectures, I will just touch upon the other kinds of 389 00:55:48,700 --> 00:55:54,410 strategies that we have.
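The arithmetic behind this worst case can be checked directly; a throwaway sketch (the function name is my own):

```python
def idastar_worstcase_expansions(n):
    """Worst case for IDA*: every state has a distinct f value, so
    each iteration re-expands all previously seen states plus one
    new one.  Total expansions = 1 + 2 + ... + n = n(n + 1)/2,
    which is O(n^2) in the n states that A* itself would expand."""
    return sum(i for i in range(1, n + 1))

# If A* expands n = 100 states, IDA* may perform up to 5050
# expansions (counting re-expansions), while using linear space.
print(idastar_worstcase_expansions(100))  # 5050
```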
But from the next lecture onwards, we are going to move into 390 00:55:54,410 --> 00:56:01,410 problem reduction search and game trees. 391 00:56:12,509 --> 00:56:19,230 In the last couple of lectures, we were discussing state space search. And if you remember, in 392 00:56:19,230 --> 00:56:25,089 the initial introduction that I gave on search, we said that there are 3 different paradigms 393 00:56:25,089 --> 00:56:31,089 for problem solving using search, broadly. One was state space search, of which there 394 00:56:31,089 --> 00:56:37,369 are a few topics which I left. We will cover those later. The second topic is problem 395 00:56:37,369 --> 00:56:44,069 reduction search, and under problem reduction search, we will look at 2 kinds of graph search, 396 00:56:44,069 --> 00:56:50,069 namely and/or graphs and game trees. And/or graphs are a kind of structure which we will 397 00:56:50,069 --> 00:56:56,710 study, and it has several different applications, though people do not always refer to them as 398 00:56:56,710 --> 00:57:02,599 and/or graph search, but the underlying philosophy is the same. And then, we will look at game 399 00:57:02,599 --> 00:57:08,680 trees, which are the backbone of game playing programs and of programs which optimize in 400 00:57:08,680 --> 00:57:10,049 the presence of an adversary. 401 00:57:10,049 --> 00:57:15,960 So, game trees model search situations where you have adversaries, and there is some criterion 402 00:57:15,960 --> 00:57:22,960 that you have to optimize. And in the presence of the adversary, you have to do that optimization. 403 00:57:23,849 --> 00:57:30,849 So, problem reduction search can be broadly defined as planning how best to solve a problem 404 00:57:32,910 --> 00:57:39,730 that can be recursively decomposed into sub-problems, in multiple ways. We can solve the same problem 405 00:57:39,730 --> 00:57:43,900 by decomposition.
There is more than one decomposition of the same problem, and we 406 00:57:43,900 --> 00:57:49,799 have to decide which is the best way to decompose the problem, so that the total solution cost 407 00:57:49,799 --> 00:57:56,799 or the effort of searching is minimized, or the quality of the solution is maximized. So, to start with, let us consider 408 00:57:59,130 --> 00:58:06,130 the matrix multiplication problem, where you are given a set of matrices, A1 to An, and 409 00:58:10,549 --> 00:58:12,690 we have to find out the product of them. 410 00:58:12,690 --> 00:58:19,690 We have to find out the product A1 A2 ... An, right? Now, we can do it in several ways. For example, 411 00:58:26,240 --> 00:58:33,240 we can first multiply A1 and A2; with that product, we can multiply A3, right?
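To see why the choice of decomposition matters, count the scalar multiplications needed by two different orders. The dimensions below are made up for illustration, not from the lecture:

```python
def mult_cost(p, q, r):
    """Scalar multiplications needed to multiply a (p x q) matrix
    by a (q x r) matrix: p * q * r."""
    return p * q * r

# Illustrative dimensions: A1 is 10x100, A2 is 100x5, A3 is 5x50.
# Decomposition 1: (A1 A2) A3
cost1 = mult_cost(10, 100, 5) + mult_cost(10, 5, 50)    # 5000 + 2500
# Decomposition 2: A1 (A2 A3)
cost2 = mult_cost(100, 5, 50) + mult_cost(10, 100, 50)  # 25000 + 50000
print(cost1, cost2)  # 7500 75000
```

Both orders compute the same product, but one decomposition costs ten times as much as the other, which is exactly why problem reduction search has to choose among decompositions.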