1
00:00:48,460 --> 00:00:55,460
We will continue with our discussion on A*
and heuristic search, from this class
2
00:00:55,640 --> 00:01:02,640
onwards. So, the topic of this lecture is
heuristic search A*, and beyond. Quickly,
3
00:01:05,890 --> 00:01:12,750
to recap what we had done in the last class:
we studied the algorithm A*, which maintains
4
00:01:12,750 --> 00:01:19,750
2 lists- open and closed, and also 2 functions.
One is the g value, which computes the distance
5
00:01:23,190 --> 00:01:29,649
of the state from the start state, and the
h value, which is the heuristic estimate of
6
00:01:29,649 --> 00:01:36,649
the distance of that state from the goal state,
and fn is the sum of gn and hn, and that gives
7
00:01:40,179 --> 00:01:47,179
us the estimated cost of a solution which
goes through the node n. So, the first step
8
00:01:51,250 --> 00:01:55,860
was- if open is empty and we have still not
found the goal, then we terminate with
9
00:01:55,860 --> 00:02:02,460
failure; otherwise, we select the minimum
cost state n from open and save it in closed.
10
00:02:02,460 --> 00:02:09,250
If the selected state is a goal state, then
we terminate with success and return the f
11
00:02:09,250 --> 00:02:13,100
value of that state as the cost of the goal.
12
00:02:13,100 --> 00:02:20,100
Otherwise, we expand the node n to generate
the set of successors, and for each successor
13
00:02:21,660 --> 00:02:28,660
m, we compute its cost, based on the g value
of that node and the h value of that node.
14
00:02:29,920 --> 00:02:35,090
And if the node already belongs to open or
closed, we update it only if the cost has decreased.
15
00:02:35,090 --> 00:02:40,930
And, if the node is already in closed and
its cost has decreased, then you must bring
16
00:02:40,930 --> 00:02:47,930
it back to open. Now, in uniform cost search,
we had seen that if you have only positive
17
00:02:48,380 --> 00:02:52,470
costs, then you cannot have a case where a
node comes back from closed to open.
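The recap above maps directly onto code. Here is a minimal sketch in Python (not the lecture's own pseudocode; the priority-queue details are my own illustration), showing open, closed, and the rule that a closed node whose cost decreases is brought back to open:

```python
import heapq

def a_star(graph, h, start, goal):
    """Minimal A*. 'graph' maps node -> {successor: edge cost};
    'h' maps node -> heuristic estimate. Maintains open (a heap
    ordered on f = g + h) and closed, and brings a closed node
    back to open whenever a cheaper path to it is found."""
    g = {start: 0}
    open_heap = [(h[start], start)]       # entries are (f value, node)
    closed = set()
    while open_heap:                      # open empty -> terminate with failure
        f, n = heapq.heappop(open_heap)   # select the minimum cost state
        if f > g[n] + h[n]:
            continue                      # stale entry; a cheaper path was found later
        if n == goal:
            return f                      # goal selected: return its f value
        closed.add(n)
        for m, cost in graph.get(n, {}).items():
            new_g = g[n] + cost           # cost of reaching m through n
            if m not in g or new_g < g[m]:
                g[m] = new_g              # cost decreased: update, and if m was
                closed.discard(m)         # in closed, bring it back to open
                heapq.heappush(open_heap, (new_g + h[m], m))
    return None                           # no path to a goal
```

For instance, with the assumed toy graph `{'s': {'a': 1, 'b': 4}, 'a': {'g': 5}, 'b': {'g': 1}}` and heuristic `{'s': 2, 'a': 1, 'b': 1, 'g': 0}`, `a_star(graph, h, 's', 'g')` returns 5, the cost of the optimal path s-b-g.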
18
00:02:52,470 --> 00:02:59,359
Now, why are we so concerned about nodes coming
back from closed to open? Because what we
19
00:02:59,359 --> 00:03:06,250
want is that when we expand
a node, we expand it only once. If we can
20
00:03:06,250 --> 00:03:13,250
ensure that we expand nodes only once, then
there is a set of states which any admissible
21
00:03:15,070 --> 00:03:20,329
algorithm must visit. As we had seen in the
last class, if you look at the set of
22
00:03:20,329 --> 00:03:27,000
states whose f value is less than the cost
of the goal, then any admissible algorithm,
23
00:03:27,000 --> 00:03:31,370
any algorithm which guarantees to find the
optimal solution, will definitely have
24
00:03:31,370 --> 00:03:33,049
to expand those nodes, right?
25
00:03:33,049 --> 00:03:40,049
So, the complexity will be linear in the number
of expanded nodes, if we can ensure that no
26
00:03:42,900 --> 00:03:49,900
nodes are re-expanded. If you can ensure that
every node is expanded only once, then, we
27
00:03:52,320 --> 00:03:59,060
are assured that if s is the set of nodes
which must be visited by any admissible algorithm
28
00:03:59,060 --> 00:04:06,060
then our complexity is linear in s. Let
us see. So, this brings us to a kind of notion
29
00:04:08,930 --> 00:04:15,930
of optimality. We say that, okay, let s denote
a set of states - specifically, the
30
00:04:20,209 --> 00:04:27,209
set of states n, such that fn is less than
C*, where C* is the cost of the optimal goal.
31
00:04:31,669 --> 00:04:38,560
We know that any algorithm which is admissible,
or any algorithm which guarantees to give
32
00:04:38,560 --> 00:04:45,560
us the optimal solution- will have to expand
this set of states, right? If we can have
33
00:04:46,710 --> 00:04:53,710
our algorithm which is linear in the size
of s, then that is something which is asymptotically
34
00:04:56,470 --> 00:05:02,570
optimal, right? Because any algorithm which
is admissible and guarantees the optimal solution
35
00:05:02,570 --> 00:05:05,000
will have to expand at least those s nodes.
36
00:05:05,000 --> 00:05:12,000
If you are linear in s, then, that means that
we have an asymptotically optimal algorithm,
37
00:05:13,319 --> 00:05:19,979
but if we end up re-expanding those nodes,
then this guarantee disappears. Then it could
38
00:05:19,979 --> 00:05:26,789
be quadratic in s. It could be exponential
in s. We do not know, right? What we have
39
00:05:26,789 --> 00:05:33,789
seen is that, in uniform cost search, there
is no such re-expansion, because a node never
40
00:05:34,300 --> 00:05:40,240
comes back from closed to open. Once we have
expanded it and put the node in closed, it
41
00:05:40,240 --> 00:05:47,240
is done. We do not re-expand that anymore,
right, but in the case of heuristic search,
42
00:05:48,729 --> 00:05:55,729
we can have scenarios where a state is re-expanded,
and this is because of what we call
43
00:05:57,629 --> 00:06:03,689
non-monotonicity of the heuristic function.
I will give you an example to show where a
44
00:06:03,689 --> 00:06:10,689
node will come back from closed to open. So,
the example that I will show is like this:
45
00:06:11,550 --> 00:06:18,550
that we have the state one, and let us say
that the costs are 3, 2... This is the graph.
46
00:06:55,360 --> 00:07:02,360
Now, let us say that the heuristic functions
are as follows: at every state, the heuristic
47
00:07:03,719 --> 00:07:10,310
function will give us an estimate of the cost
to the goal. We will assume that the heuristic
48
00:07:10,310 --> 00:07:17,310
function h that is given to us always underestimates,
so it always gives us a lower bound,
49
00:07:24,430 --> 00:07:31,430
right? Now, even if it underestimates,
you can have an accurate heuristic and an
50
00:07:37,599 --> 00:07:38,580
inaccurate heuristic.
51
00:07:38,580 --> 00:07:45,580
So, suppose that this heuristic function
gives us a pretty accurate estimate here.
52
00:07:46,719 --> 00:07:53,719
If you can see, the cost of this is 24. The
optimal cost solution from 3 is 24, right?
53
00:07:55,949 --> 00:08:02,949
Where 6 is the goal and we have 23 here. Now,
that is a pretty accurate estimate that we
54
00:08:03,659 --> 00:08:10,659
have here. Unfortunately, in the other states,
it gives us a pretty weak estimate. So, let
55
00:08:12,080 --> 00:08:19,080
us say we have this kind of scenario, right?
Now, let us run A* on this and see what happens,
56
00:08:37,500 --> 00:08:43,820
and I will also maintain closed, to show that
some nodes will come back from closed to open.
57
00:08:43,820 --> 00:08:50,820
Intuitively, what will happen is, we will-
because of this heuristic, this node will
58
00:08:52,560 --> 00:08:57,870
have a pretty accurate cost value, and therefore,
will not be expanded, until we have expanded
59
00:08:57,870 --> 00:09:04,029
this whole path. And then, when we are near about
here, we will realize that what we had all
60
00:09:04,029 --> 00:09:08,000
along this path was an inaccurate heuristic.
61
00:09:08,000 --> 00:09:15,000
Then, we will again re-expand this path. So,
let us see how we go about doing this.
62
00:09:16,220 --> 00:09:23,220
Initially, we have 1 with a cost of 5 in open,
right? The first step: we will expand that,
63
00:09:26,680 --> 00:09:33,680
and that is going to give us 2 states. One
is 2 with a cost of 4 plus 3, 7, and 3 with
64
00:09:36,940 --> 00:09:43,940
a pretty accurate cost, which is 23 plus 2,
25. In the next step, you are going to pick up
65
00:09:48,840 --> 00:09:55,840
2 with a cost of 7. So, 3 with a cost of 25
is going to remain in open, and we are
66
00:09:57,160 --> 00:10:04,160
going to have here 4 with a cost of 4 plus
3, 7, plus 2. So 9, right? In the next step,
67
00:10:06,950 --> 00:10:13,440
we are going to pick up 4 with a cost of 9,
and we are going to have 3 with a cost of
68
00:10:13,440 --> 00:10:20,440
25 here, and 5 with a cost of 3 plus 7, and
8, 11, right?
69
00:10:28,640 --> 00:10:35,640
In the next step, we are going to pick up
5 with a cost of 11 and then we have 3 with
70
00:10:37,790 --> 00:10:44,790
a cost of 25, and 6 is going to come with
a cost of 20, 28, because this is the path;
71
00:10:49,710 --> 00:10:56,710
it is this path, which has a cost of 28. See,
now we find that 3 has a lesser cost. See,
72
00:11:00,340 --> 00:11:05,030
at this point, though the goal is there, we
cannot pick up the goal, because there can
73
00:11:05,030 --> 00:11:10,240
be a better path to the goal. So, we cannot
pick up and declare that we have found that
74
00:11:10,240 --> 00:11:17,240
solution. No. So, we will expand 3. When we
expand 3, now, see what happens: we get 4,
75
00:11:19,870 --> 00:11:26,870
but now 4 comes with a cost of 5 plus 2; 7,
right? So, the cost of 4 has decreased. We
76
00:11:29,940 --> 00:11:36,940
will take it out from closed and bring it
back to open with a cost of 7, and then we
77
00:11:38,930 --> 00:11:45,930
have 6 with a cost of 28. Next step: 4 with
a cost of 7 is picked up, right?
78
00:11:46,440 --> 00:11:53,440
So, we have 6 with a cost of 28, and 5 now
comes in, with a cost of, how much? 5, 6,
79
00:11:58,020 --> 00:12:04,290
7, 8, 9, right? And this is better than the
cost with which it was in closed.
80
00:12:04,290 --> 00:12:11,290
We have brought it back from closed to open,
right? Now, we pick up 5 with a cost of 9,
81
00:12:12,810 --> 00:12:19,810
and that gives us 6 with a cost of 26. So,
this gets replaced with 6 with a cost of 26,
82
00:12:25,790 --> 00:12:31,380
and we pick up 6 finally and find out that
the optimal cost is 26, right?
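The exact numbers in this spoken trace are hard to follow off the board, but the phenomenon is easy to reproduce. The sketch below uses my own toy graph and heuristic values (not the lecture's): the heuristic is admissible but non-monotone, accurate on one branch and weak on another, and counting expansions shows a node coming back from closed to open.

```python
import heapq

def a_star_count(graph, h, start, goal):
    """A* that also counts how many times each node is expanded.
    A node brought back from closed to open shows up with an
    expansion count greater than 1."""
    g = {start: 0}
    open_heap = [(h[start], start)]
    closed = set()
    expansions = {}
    while open_heap:
        f, n = heapq.heappop(open_heap)
        if f > g[n] + h[n]:
            continue                      # stale heap entry
        if n == goal:
            return f, expansions
        closed.add(n)
        expansions[n] = expansions.get(n, 0) + 1
        for m, cost in graph.get(n, {}).items():
            if m not in g or g[n] + cost < g[m]:
                g[m] = g[n] + cost
                closed.discard(m)         # reopen if it was closed
                heapq.heappush(open_heap, (g[m] + h[m], m))
    return None, expansions

# Assumed toy instance: the direct edge 1 -> 2 looks good because h(2)
# is a weak underestimate, while the cheaper route 1 -> 3 -> 2 is guarded
# by an accurate h(3). h is admissible but not monotone: h(3) - h(2) = 10
# exceeds the edge cost c(3, 2) = 1.
graph = {1: {2: 10, 3: 2}, 3: {2: 1}, 2: {4: 10}}
h = {1: 0, 2: 1, 3: 11, 4: 0}
```

Running `a_star_count(graph, h, 1, 4)` returns cost 13 with node 2 expanded twice: 2 is first closed via the direct edge (g = 10), then expanding 3 yields the cheaper path with g = 3, and 2 comes back from closed to open.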
83
00:12:31,380 --> 00:12:38,380
So, we see that we had to do a lot of extra work,
because our heuristic values were such that
84
00:12:40,740 --> 00:12:46,520
along one path it was well informed; along
the optimal path it was well informed, and
85
00:12:46,520 --> 00:12:52,390
along the sub-optimal paths, it was not well
informed. So, we had to expand the sub-optimal
86
00:12:52,390 --> 00:12:59,390
path up to a large extent, right? Now, there
are several cases where we can make some improvements
87
00:13:03,730 --> 00:13:08,250
based on this notion. We are going to come
back to that, but before that, let
88
00:13:08,250 --> 00:13:15,250
us study a few properties of A*. Firstly,
we will say that a heuristic is called admissible
89
00:13:21,180 --> 00:13:23,780
if it always underestimates.
90
00:13:23,780 --> 00:13:30,780
That is, we always have hn as a lower bound
on the actual cost of the goal from that
91
00:13:31,540 --> 00:13:38,540
state, where h*n denotes the minimum distance
to a goal state from the state n. This is
92
00:13:39,840 --> 00:13:46,840
what we have been talking about all the time.
The heuristic is called admissible if it underestimates.
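This definition can be made concrete: on a small explicit graph you can compute h*n exactly (here with Dijkstra on the reversed graph) and test hn less than or equal to h*n at every state. The graph encoding and sample values below are illustrative assumptions, not from the lecture.

```python
import heapq

def true_goal_costs(graph, goal):
    """h*(n): exact cheapest cost from each node to the goal,
    computed by Dijkstra on the reversed graph."""
    rev = {}
    for n, succs in graph.items():
        for m, c in succs.items():
            rev.setdefault(m, {})[n] = c
    dist = {goal: 0}
    heap = [(0, goal)]
    while heap:
        d, n = heapq.heappop(heap)
        if d > dist[n]:
            continue
        for m, c in rev.get(n, {}).items():
            if m not in dist or d + c < dist[m]:
                dist[m] = d + c
                heapq.heappush(heap, (dist[m], m))
    return dist

def is_admissible(graph, h, goal):
    """Admissible iff h(n) <= h*(n) for every n that can reach the goal."""
    h_star = true_goal_costs(graph, goal)
    return all(h[n] <= h_star[n] for n in h_star)
```

On the assumed chain a -> b -> g with edge costs 2 and 3, a heuristic with h(a) = 4 is admissible (h*(a) = 5), while h(a) = 6 is not.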
93
00:13:49,010 --> 00:13:56,010
What happens if it overestimates? See, if
it overestimates, then we do not have that
94
00:14:01,560 --> 00:14:08,560
nice property. We had talked about the property,
that whenever you have a node whose fn is
95
00:14:09,800 --> 00:14:16,800
less than C*, then this is the set of states
which are expanded, and all states which have
96
00:14:19,800 --> 00:14:26,800
the value fn greater than C*- these are never
expanded, right? So, that means, if you look
97
00:14:35,740 --> 00:14:42,740
at the state space tree, then all the states
which have cost less than C*, that is, all
98
00:14:45,610 --> 00:14:52,610
the states in this region, they are going
to be expanded. And the ones which are greater
99
00:14:53,160 --> 00:15:00,160
than C*, this whole region, these states are
never expanded.
100
00:15:00,810 --> 00:15:07,110
These are all greater than C*, and we never
expand this. So, this gives us a nice bound;
101
00:15:07,110 --> 00:15:13,320
depending on the accuracy of the functions
that you have, we have this nice bound, which
102
00:15:13,320 --> 00:15:18,490
tells us that these are the set of states
that have to be expanded anyway,
103
00:15:18,490 --> 00:15:25,490
right? Now, the moment you do not have underestimating
heuristics, what can happen is, along some
104
00:15:29,040 --> 00:15:36,040
path, you have the heuristic value overestimating.
At this state, you will have fn greater than
105
00:15:36,470 --> 00:15:43,470
C*, but because the heuristic has overestimated,
it is still possible, that below this, somewhere,
106
00:15:45,920 --> 00:15:52,920
you have a goal which has cost less
than C*. That possibility cannot be ruled
107
00:15:56,090 --> 00:16:02,880
out. In this case, when we have underestimating
heuristics, you know that when the cost has exceeded
108
00:16:02,880 --> 00:16:09,880
C*, then the actual cost can only be more.
If you look at the states in this frontier,
109
00:16:10,530 --> 00:16:13,760
you know that the actual cost of these states
can only be more.
110
00:16:13,760 --> 00:16:20,760
So, any goal that you find beyond this, will
also have cost greater than C*. So, there
111
00:16:22,050 --> 00:16:28,420
is no point in expanding those nodes, but
the moment the heuristics overestimate, it
112
00:16:28,420 --> 00:16:35,420
is quite possible that there is another goal
beyond this, which has cost less than C*.
113
00:16:35,540 --> 00:16:40,230
And we have this f value greater than C*,
because our heuristic has given an overestimate,
114
00:16:40,230 --> 00:16:47,230
right? As soon as we have overestimating heuristics,
this nice property will disappear, and then,
115
00:16:48,720 --> 00:16:54,560
that will mean that even if you find
that you have found a goal, you still do not know
116
00:16:54,560 --> 00:17:00,710
about the nodes that are there in open. If
117
00:17:00,710 --> 00:17:07,039
you expand them, you can still find a lesser
cost goal, right? Because cost can decrease
118
00:17:07,039 --> 00:17:09,339
beyond that point.
119
00:17:09,339 --> 00:17:15,250
That is why, though this property will not
be there, there are some advantages also,
120
00:17:15,250 --> 00:17:19,480
of using overestimating heuristics, which
I will just briefly touch upon later. And
121
00:17:19,480 --> 00:17:24,169
when you actually do the experiments,
you will find that there is some merit in
122
00:17:24,169 --> 00:17:30,539
using overestimating heuristics. So, as of
now, we are assuming that the heuristics are
123
00:17:30,539 --> 00:17:37,539
admissible, that is, they are underestimating
heuristics. So, for finite state spaces, A*
124
00:17:40,470 --> 00:17:42,409
always terminates.
125
00:17:42,409 --> 00:17:48,020
We need not consider the proof of these; this
is very straightforward, right? Because we
126
00:17:48,020 --> 00:17:54,470
are going best first, based on the cost criterion.
So, if you have a finite state space under
127
00:17:54,470 --> 00:18:01,470
finite cost, you will eventually reach the
goal. At any time before A* terminates, there
128
00:18:03,850 --> 00:18:10,850
exists in open a state n that is on an
optimal path from s to a goal state, with fn
129
00:18:11,889 --> 00:18:18,889
less than or equal to f*s. Why is
130
00:18:32,159 --> 00:18:39,159
that the case? Suppose in the state space,
let us say this is our goal, and let us say
131
00:19:00,549 --> 00:19:07,549
that this is the optimal cost path to the
goal. Our objective is to find this, right?
132
00:19:10,210 --> 00:19:17,210
Now, until A* terminates, we are claiming
that at least one of these states will always
133
00:19:19,840 --> 00:19:26,840
be there in open. But, until we found this
state, or the best path to this state, until
134
00:19:41,269 --> 00:19:47,249
A* terminated, there was always some state
which was there in open.
135
00:19:47,249 --> 00:19:54,249
So, now let us look at this example once again,
just to get this idea clear. See, the optimal
136
00:19:56,580 --> 00:20:03,580
path was here. 3 was that state which was
there, right here, right? And then, when 3
137
00:20:03,679 --> 00:20:09,279
got expanded, 4 was the state which was on
the optimal path. Then 4 got expanded, then
138
00:20:09,279 --> 00:20:16,279
5 was there on the optimal path, right? So,
at every point, see, here- 1 and 3 and 5,
139
00:20:18,129 --> 00:20:25,129
and then we find the goal, right? Now, you
can prove this by induction, where initially
140
00:20:30,649 --> 00:20:36,669
it is the start state which is on the optimal
path, which is there in open, right? When
141
00:20:36,669 --> 00:20:43,669
you expand the start
state, you are going to have a set of states
142
00:20:45,730 --> 00:20:52,730
that are the successor of the start state,
right, one of these successors is on the optimal
143
00:20:53,619 --> 00:20:59,159
path. That fellow will be there in open. As
long as you
144
00:20:59,159 --> 00:21:06,159
do not expand that state, right, you will
not terminate, because that state is always
145
00:21:07,519 --> 00:21:10,879
going to have a cost less than C*.
146
00:21:10,879 --> 00:21:15,369
You will have to eventually expand it. As
long as you do not, that state is the one
147
00:21:15,369 --> 00:21:21,759
which is there in open. When you expand that,
then one
148
00:21:21,759 --> 00:21:27,460
of its successors on the optimal path will
be there in open. So, in this way, it will
149
00:21:27,460 --> 00:21:31,049
continue, until you have expanded the goal
or picked up the goal for expansion. There
150
00:21:31,049 --> 00:21:38,049
will be one of those states always there,
right? You design the heuristic in such a
151
00:21:44,200 --> 00:21:50,070
way where it is admissible, like, for example,
if you look at the heuristic that- the Manhattan
152
00:21:50,070 --> 00:21:54,700
distance heuristic, for the 15 puzzle.
153
00:21:54,700 --> 00:21:59,779
We said that at least that many moves will
have to be taken for each tile. So, if you
154
00:21:59,779 --> 00:22:06,779
just add up the Manhattan distances of each
of the tiles from their final position, then
155
00:22:07,129 --> 00:22:12,869
you get an underestimate. Likewise, when we
talked about the minimum cost spanning tree
156
00:22:12,869 --> 00:22:19,869
heuristic for traveling salesperson problem,
we know that the cost of the minimum
157
00:22:19,889 --> 00:22:26,889
spanning tree is an underestimate on the minimum
cost tour. So, usually, you will be able to find such an underestimate.
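For the 15-puzzle case, the Manhattan distance heuristic just mentioned is short to write down. A sketch (the board encoding as a flat 16-tuple with 0 for the blank is my own convention):

```python
def manhattan_15(state, goal):
    """Sum of Manhattan distances of each tile (blank excluded) from its
    goal position, on a 4x4 board given as a flat 16-tuple. Each tile
    needs at least that many moves, so the sum is an underestimate."""
    pos = {tile: divmod(i, 4) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:          # the blank contributes nothing
            continue
        r, c = divmod(i, 4)
        gr, gc = pos[tile]
        total += abs(r - gr) + abs(c - gc)
    return total
```

For example, swapping the first two tiles of the solved board gives a heuristic value of 2, one move per displaced tile.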
158
00:22:33,990 --> 00:22:40,159
See, the worst case scenario
is that you do not know anything about the problem,
159
00:22:40,159 --> 00:22:45,710
in which case your heuristic function is zero
everywhere. So, you boil down to uniform cost
160
00:22:45,710 --> 00:22:52,710
search. If you do not have a good underestimate
but have a pretty tight overestimate, then
161
00:22:52,850 --> 00:22:59,850
it is better to go for depth first branch
and bound. If you do not have a good overestimate
162
00:23:04,049 --> 00:23:11,049
and you have a good underestimate, then you
can go for heuristic search, like A*.
163
00:23:13,779 --> 00:23:20,429
The designing of the heuristics is, by itself,
an interesting topic. And there are algorithms
164
00:23:20,429 --> 00:23:27,429
which will approximate, and will always keep
the approximate solution to be of cost- you
165
00:23:29,240 --> 00:23:36,240
know is going to be an underestimate of the actual
166
00:23:37,639 --> 00:23:44,639
solution, right? For example, there are approximation
algorithms, which are going to guarantee you
167
00:23:45,869 --> 00:23:52,869
some k times optimal. So, if you apply
that algorithm and divide
168
00:23:54,009 --> 00:24:00,919
the solution cost by k- you know you have
an underestimate. So, there are various ways
169
00:24:00,919 --> 00:24:07,919
of designing a heuristic which is an underestimate.
When we work out some problems, you will see
170
00:24:09,190 --> 00:24:16,190
how to design such heuristics. Okay?
Then, let us continue with this.
171
00:24:20,600 --> 00:24:27,600
So, we say that algorithm A* is admissible.
That is, if there is a path from s to a goal
172
00:24:29,600 --> 00:24:36,600
state, then A* terminates by finding an optimal
path. Try to look at the proof of this result
173
00:24:44,619 --> 00:24:51,619
from the book of Nilsson - Nilsson's Principles
of AI. The proof is given there, right? Now, here
174
00:24:56,850 --> 00:25:03,489
is an interesting question. This is with respect
to the accuracy of the heuristics. Suppose
175
00:25:03,489 --> 00:25:10,489
we have A1 and A2, and they are 2 versions of
A*, such that A2 uses a heuristic which is
176
00:25:13,320 --> 00:25:19,389
more informed than A1. What do we mean by
more informed? Suppose the heuristics are
177
00:25:19,389 --> 00:25:26,389
all admissible. Suppose h1 is admissible
and h2 is also admissible, right?
178
00:25:30,629 --> 00:25:37,629
So, they both underestimate. So, when should
we call h2 more informed than h1? If for all
179
00:25:54,980 --> 00:26:01,980
states n, is h2 greater than or less than h1? Greater
than, right? If h2 is greater than h1, then
180
00:26:13,470 --> 00:26:20,470
this is more informed than this one, but remember
that though h2 is greater than h1, at every
181
00:26:22,679 --> 00:26:28,239
state, it is still the case, that both are
underestimates of the optimal solution. So,
182
00:26:28,239 --> 00:26:31,179
h2 is tighter than h1.
183
00:26:31,179 --> 00:26:38,179
So, what this result tells us- slides- is
that, if A1 and A2 are 2 versions of A*, such
184
00:26:39,879 --> 00:26:46,879
that A2 is more informed than A1, then A1
expands at least as many states as A2. Now,
185
00:26:50,710 --> 00:26:55,179
that is very easy to see from that result
which we saw, that fn less than C* is the
186
00:26:55,179 --> 00:27:02,179
set of states that will always have to be expanded.
So, the set of states which have fn less than
187
00:27:03,580 --> 00:27:10,580
C* will be larger when we use the less informed
heuristic than when we use the
188
00:27:13,889 --> 00:27:18,980
more informed heuristic. In the cases where
we used the more informed heuristics, some
189
00:27:18,980 --> 00:27:25,980
of the states will now have costs greater
than C*, and they will not be expanded, right?
190
00:27:26,249 --> 00:27:33,249
But if I have both of these heuristics, h1
and h2, and we do not know which is more informed
191
00:27:37,289 --> 00:27:44,289
than the other, then, what heuristic will
we use for A*?
192
00:27:46,019 --> 00:27:52,720
Suppose we have h1, we also have h2. Both,
we know, are good heuristic functions. Both
193
00:27:52,720 --> 00:27:59,720
do not have much overhead of computation,
and we want to run A* to solve the problem.
194
00:28:01,659 --> 00:28:08,659
What heuristic function will we use? Average?
Why average? Max of the 2? Yes.
195
00:28:13,049 --> 00:28:20,049
So what if we have h1 and h2? Then, it makes
sense to use max of h1n and h2n at every state,
196
00:28:31,519 --> 00:28:38,519
right? Because max of h1 and h2 is also going
to be an underestimate and it is also going
197
00:28:42,350 --> 00:28:49,350
to be as informed as h1 and as informed as
h2, perhaps better.
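That observation is essentially one line of code. A sketch, assuming the heuristics are plain functions of a state:

```python
def max_heuristic(*heuristics):
    """Combine admissible heuristics: the pointwise max is still a lower
    bound (each component underestimates, so the largest of them does too)
    and is at least as informed as every component."""
    def h(n):
        return max(h_i(n) for h_i in heuristics)
    return h
```

Usage: `combined = max_heuristic(h1, h2)` gives a heuristic with `combined(n) = max(h1(n), h2(n))` at every state n.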
198
00:29:01,159 --> 00:29:08,159
Yes. But in practice, we also have to look
at the actual overhead of computing the heuristic.
199
00:29:14,369 --> 00:29:21,369
In some cases, see, there is a balance that
one has to strike; in some problems, the state
200
00:29:22,570 --> 00:29:29,570
space is so bad, that unless you use a very
good heuristic, you will run into trouble.
201
00:29:30,190 --> 00:29:37,190
In those cases, you do not mind spending more
time in computing a better, more accurate heuristic,
202
00:29:37,470 --> 00:29:44,470
and then using it. In other problems, where
optimal goals are easy to find; in those problems,
203
00:29:48,690 --> 00:29:55,690
you may just go with a heuristic which is
easy to compute, right? Now, we come to the
204
00:30:01,809 --> 00:30:06,350
issue of monotone heuristics. As we had seen
previously, in that example, one of the
205
00:30:06,350 --> 00:30:13,350
main problems was that we had some state
in between,
206
00:30:19,879 --> 00:30:26,289
which was well informed, but the others were
not well informed, right? Now, suppose, let
207
00:30:26,289 --> 00:30:33,289
us look at a slightly different scenario,
where instead of 5 here, we have a more informed
208
00:30:37,739 --> 00:30:40,619
heuristic here, which gives us 25.
209
00:30:40,619 --> 00:30:47,619
So, the only difference from what we had done
previously is that, we now have 25 here, okay?
210
00:30:52,779 --> 00:30:59,779
Now, let us see what happens. We have open
and we have closed. Now, when we have 1 out
211
00:31:08,690 --> 00:31:15,690
here, it is going to be there with a cost
of 25. In the first step, we are going to
212
00:31:16,669 --> 00:31:23,669
take 1 and expand that. Now, note one interesting
thing. What do we generate here? We have
213
00:31:25,679 --> 00:31:32,679
3 with a cost of 25, as we had before, and
now, if you
214
00:31:40,769 --> 00:31:47,769
just use gm plus hm, then you will end
up having, as before, 2
215
00:31:47,820 --> 00:31:54,820
with a cost
of 7. Now, that is bad. Bad, because we know
216
00:31:57,179 --> 00:32:04,179
that from 1, any solution path is going to
cost you 25, at least, and then this 2 says
217
00:32:05,630 --> 00:32:10,080
that from here, you can reach the goal
with a cost of 7.
218
00:32:10,080 --> 00:32:15,369
Now, we clearly know that that is not possible.
That is not possible, because we already know
219
00:32:15,369 --> 00:32:22,369
that from 1, we require 25. So, if you could
do it from 2 with a cost of 7, then from 1, you
220
00:32:23,299 --> 00:32:28,369
should be able to do it with a cost of 10. But
the heuristic, which is an underestimate,
221
00:32:28,369 --> 00:32:35,369
tells us that at least 25 is required. So,
what we can do is, we can say that see, this
222
00:32:35,590 --> 00:32:42,590
is clearly an underestimate, which is inaccurate.
So, we can safely upgrade this to 25. Why?
223
00:32:51,100 --> 00:32:58,100
Because we know that if it is 25 from here,
right, and this edge costs 3, then it is at
224
00:33:00,729 --> 00:33:07,729
least 22 from here, right? So, we can safely
replace this fellow with 22, and then the
225
00:33:15,470 --> 00:33:17,830
new f value is 22 plus 3, which is 25.
226
00:33:17,830 --> 00:33:24,830
Okay. Now, we have 2 states, one with a cost
of 25, another also with a cost of 25. Now,
227
00:33:28,710 --> 00:33:33,570
recall, that when we have 2 states with the
same cost, we choose the one which has lesser
228
00:33:33,570 --> 00:33:40,399
g value, because the other one has already
incurred more cost. And the remaining part
229
00:33:40,399 --> 00:33:47,399
is in the heuristic. So, we will pick up 3
first, because it has less g value. So, we
230
00:33:49,850 --> 00:33:56,850
will expand 3 with a cost of 25, and then
we will have 2 with a cost of 25 here, and
231
00:33:59,720 --> 00:34:06,720
4 with a cost of how much? Again, we will
use the same thing. See, if we just used 3
232
00:34:09,520 --> 00:34:16,520
plus 2 plus 2, you know, that is again not
accurate. So, 4 with a cost of 25. By upgrading
233
00:34:17,750 --> 00:34:24,750
this heuristic value to 20, right, so we upgrade
this to 20.
234
00:34:26,419 --> 00:34:33,419
Now, 2 will be picked up, because it has lesser
g value. So, pick up 2. 2 with a cost of 25
235
00:34:38,159 --> 00:34:45,159
and what we have here is 4 with a cost of
25; and now, because of the expansion of this,
236
00:34:46,859 --> 00:34:53,859
we will again have 4 with a cost of 25, but,
if we take this path, then it comes to 4 plus
237
00:34:58,140 --> 00:35:05,140
3; 7 plus, so 27. So, 27 is larger than this.
So, we will ignore that, and just stay with
238
00:35:07,650 --> 00:35:12,970
4 with a cost of 25. And then, you can see
that we will pick up 4 with a cost of 25.
239
00:35:12,970 --> 00:35:19,970
We will get 5 with a cost of 25, and then
you will expand 5 with a cost of 25, and you
240
00:35:23,250 --> 00:35:30,250
will get 6 with a cost of 26, right? So, that
was nice, because we have upgraded the heuristics
241
00:35:31,069 --> 00:35:35,849
nicely and avoided all those re-expansions,
right?
242
00:35:35,849 --> 00:35:42,849
So, this was proposed in, okay- so, we define
a heuristic function to be monotonic, if we
243
00:35:49,470 --> 00:35:56,470
do not have this problem. That is, if for
every successor m of n, hn minus hm
244
00:35:57,670 --> 00:36:04,670
is less than or equal to cnm, right? If that
happens, then it is monotonic, right, but
245
00:36:22,910 --> 00:36:29,910
clearly, that was not what was happening here,
in our example. And if this does not
246
00:36:36,970 --> 00:36:43,970
hold, then we can update hm to have the value
hn minus cnm, okay? Now, it so happens that
247
00:36:51,569 --> 00:36:55,990
if the monotone restriction is satisfied,
then A* has already found an optimal path
248
00:36:55,990 --> 00:37:02,990
to the state it selects for expansion. Why? Because
249
00:37:04,220 --> 00:37:09,270
if the monotone restriction is satisfied,
then along every path, you will update it
250
00:37:09,270 --> 00:37:16,270
with the least cost. We will just skip off
these properties.
251
00:37:20,000 --> 00:37:24,619
You can just check out whether these properties
hold, or just convince yourselves, by trying
252
00:37:24,619 --> 00:37:31,619
to prove them. Let us go over to what we call
pathmax. Pathmax is what we just now discovered.
253
00:37:33,950 --> 00:37:40,950
What we do is, when we generate the successor
m of n, we set the heuristic value to the
254
00:37:44,880 --> 00:37:51,880
maximum of hm and hn minus cnm. So, hm is
the heuristic value that we computed at that
255
00:37:53,369 --> 00:38:00,369
state, and because n is the parent, so, hn
minus cnm is at least some amount- the amount
256
00:38:02,750 --> 00:38:08,160
of cost that will have to be incurred, because
if you can reach the goal state with a cost
257
00:38:08,160 --> 00:38:15,160
less than this, then from n you would be able to do better
than hn, right? But because hn is an underestimate,
258
00:38:18,230 --> 00:38:21,930
you can actually never do better than hn.
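As a short sketch (the function name and in-place update are my own conventions), the pathmax adjustment applied when a successor m of n is generated:

```python
def pathmax(h, n, m, c_nm):
    """Pathmax: raise h[m] to max(h[m], h[n] - c_nm). If h[n] is a valid
    lower bound from n, then h[n] - c_nm is a valid lower bound from m."""
    h[m] = max(h[m], h[n] - c_nm)
    return h[m]
```

With the lecture's numbers, `h = {1: 25, 2: 4}` and an edge cost of 3, `pathmax(h, 1, 2, 3)` raises the estimate at node 2 to 22.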
259
00:38:21,930 --> 00:38:28,650
So, coming back to our example, you see that
what we had done here was exactly that. We
260
00:38:28,650 --> 00:38:35,650
had hn as 25 and hm as 4. So, we
took max of 4 and 25 minus 3, that is 22,
261
00:38:43,049 --> 00:38:50,049
right? We took the max of these 2, and the
max of these 2 was 22, which is what we updated
262
00:38:54,440 --> 00:39:01,440
the heuristic function to be, right? So, what
we did here is exactly what I am explaining
263
00:39:06,630 --> 00:39:13,630
here in this. Slides, please? So, we update
hm to be the maximum of hm and hn minus cnm,
264
00:39:20,470 --> 00:39:27,470
right? Okay. Let us go down further. Inadmissible
heuristics. Now, see: we have seen that the
265
00:39:34,559 --> 00:39:41,559
more accurate your heuristic is, the
more states are pushed beyond the
266
00:39:42,530 --> 00:39:49,530
C* boundary. Right? In A*, if C* is the optimal
cost, then all states having a cost less than
267
00:39:50,680 --> 00:39:57,030
C* must be expanded. So, the ones that
are beyond C* will not be expanded. If
268
00:39:57,030 --> 00:40:03,910
your heuristic is weak, then a lot of states
will be within C*, because the heuristic function
269
00:40:03,910 --> 00:40:04,990
is weak, right?
270
00:40:04,990 --> 00:40:11,990
If the heuristic function is zero at all points,
then you have uniform cost search, in which
271
00:40:12,049 --> 00:40:16,900
case, all those states which have the actual
cost less than C* will have to be expanded.
272
00:40:16,900 --> 00:40:23,140
If you have a good heuristic, then some of
these states will now have cost greater than
273
00:40:23,140 --> 00:40:27,609
C*, and that is why we have the advantage
of heuristic search. Those states are not
274
00:40:27,609 --> 00:40:34,609
visited because of the heuristic function.
Now, if you have an overestimating heuristic,
275
00:40:34,690 --> 00:40:41,690
then many more of those states can again go
outside the C* boundary, right? As long as
276
00:40:44,349 --> 00:40:51,349
those states are not on the paths to your
optimal solutions, you stand to gain. Could
277
00:40:54,470 --> 00:41:01,470
I make myself at least vaguely clear? Right.
So, here, if you use an underestimating heuristic,
278
00:41:09,859 --> 00:41:16,859
then I have, let us say, this set of states
with fn less than C*, okay? And the
279
00:41:20,690 --> 00:41:27,690
goal is one of these states, right, or just
beyond this. So, this is on the optimal solution,
280
00:41:30,940 --> 00:41:32,079
right?
281
00:41:32,079 --> 00:41:39,079
Now, if we use an overestimating heuristic,
then the cost of these states
282
00:41:43,299 --> 00:41:48,349
can be more, because the heuristic is overestimating.
283
00:41:48,349 --> 00:41:55,240
So, these costs can be larger than what they
are now. Same states, but because the h value
284
00:41:55,240 --> 00:42:02,240
is more, so, h plus g is also more, right?
So, it is quite possible that some of these
285
00:42:02,289 --> 00:42:09,289
states, let us say some of these states now,
will have a cost greater than C*, when we
286
00:42:10,940 --> 00:42:17,789
use the new heuristic function and compute
f dash n based on the heuristic function h
287
00:42:17,789 --> 00:42:24,789
dash n. By using this new heuristic function,
which is an overestimating heuristic function,
288
00:42:26,410 --> 00:42:33,410
some of these states can jump over to the
greater than C* region, right? Then, the set
289
00:42:36,150 --> 00:42:41,319
of states your A* will visit, has actually
decreased, because it is now going to visit
290
00:42:41,319 --> 00:42:42,599
only this set of states.
291
00:42:42,599 --> 00:42:47,720
These are not going to be visited anymore.
These are not going to be visited anymore,
292
00:42:47,720 --> 00:42:54,720
right? So, we do stand to gain. But, if it
pushes these states also beyond C*, then again,
293
00:42:59,609 --> 00:43:06,609
the gain is not that much, right? There is
a tradeoff that we have to make. Suppose we
294
00:43:08,230 --> 00:43:14,789
are not interested in getting the exact optimal
solution; we are satisfied if it is
295
00:43:14,789 --> 00:43:21,420
close to optimal. So, what we will do is, we
will make the heuristic overestimate. That
296
00:43:21,420 --> 00:43:28,180
is going to push several other states
beyond C*. So, the set of states that I will
297
00:43:28,180 --> 00:43:35,180
be expanding using A* will now be much fewer,
right? If you have a sub-optimal goal within
298
00:43:36,859 --> 00:43:43,859
that set of states, then you are done, right?
So, this balance is again used in many cases.
299
00:43:49,089 --> 00:43:55,980
For example, in bioinformatics,
there are now several problems, where we are
300
00:43:55,980 --> 00:44:02,289
using heuristic search technique, and it has
been found that in some of the tougher
301
00:44:02,289 --> 00:44:07,520
kinds of problems, like multiple sequence
alignment, we will discuss these problems
302
00:44:07,520 --> 00:44:14,520
in the tutorial. What happens is that, if
you use overestimating heuristics, the performance
303
00:44:15,510 --> 00:44:22,510
improves by almost an order of magnitude,
so, by 10 times, 50 times, in
304
00:44:22,730 --> 00:44:29,730
terms of time. And because there are lots
of solutions very close together, getting
305
00:44:30,829 --> 00:44:36,420
a sub-optimal solution quickly is easy, but
if you want to improve that to get the optimal
306
00:44:36,420 --> 00:44:41,460
solution, you have to do a lot of work. But,
in many cases, we do not require that amount
307
00:44:41,460 --> 00:44:48,460
of accuracy. Okay. So, if you want to say
that I want optimal- some k times optimal,
308
00:44:54,220 --> 00:45:01,220
where k is say 1.5, I am satisfied if it is
within 1.5 times optimal.
309
00:45:02,270 --> 00:45:08,390
Then, you can tune your inadmissible heuristics,
so that it overestimates, by not more than
310
00:45:08,390 --> 00:45:14,650
1.5 times, and then you use that heuristic.
That guarantees that the solution is going
311
00:45:14,650 --> 00:45:21,650
to be within that bound, right, and the drawback that we
have for inadmissible heuristics is that A*
312
00:45:25,549 --> 00:45:32,170
may terminate with a sub-optimal solution.
But if that is acceptable, then using inadmissible
313
00:45:32,170 --> 00:45:39,170
heuristics is a smart way of cutting down
the search time.
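One standard way to realize such a bounded overestimate, not named explicitly in the lecture, is weighted A*: inflate an admissible heuristic by a factor w, so f = g + w*h, and the solution returned is at most w times optimal. A minimal Python sketch, with an illustrative toy graph of my own:

```python
import heapq

def weighted_astar(start, goal, succ, h, w=1.5):
    """Best-first search with f = g + w*h, for w >= 1.

    With an admissible h, the cost returned is at most w times the
    optimal cost; a larger w typically expands fewer states.
    """
    frontier = [(w * h[start], 0, start)]
    best_g = {start: 0}
    while frontier:
        f, g, n = heapq.heappop(frontier)
        if g > best_g.get(n, float("inf")):
            continue                      # stale queue entry
        if n == goal:
            return g
        for m, c in succ.get(n, []):
            g2 = g + c
            if g2 < best_g.get(m, float("inf")):
                best_g[m] = g2
                heapq.heappush(frontier, (g2 + w * h[m], g2, m))
    return None

# Toy graph: s -> a -> g costs 2 + 2 = 4; s -> g directly costs 5.
succ = {"s": [("a", 2), ("g", 5)], "a": [("g", 2)]}
h = {"s": 3, "a": 2, "g": 0}
print(weighted_astar("s", "g", succ, h, w=1.5))  # 4
```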
314
00:45:39,250 --> 00:45:46,250
Just as we had iterative deepening, we do
have iterative deepening A*, and the idea
315
00:45:47,799 --> 00:45:54,799
is similar- that we will use depth first search,
right, but unlike depth first search without
316
00:45:59,470 --> 00:46:05,099
heuristics, what we are going to do here is:
we are going to use depth first search using
317
00:46:05,099 --> 00:46:12,099
the heuristics. Initially, we will set c to
be the cost of the start state. Then, we will
318
00:46:19,240 --> 00:46:26,240
perform depth first branch and bound with
the cut off c. So, we are going to expand
319
00:46:28,000 --> 00:46:35,000
all states that have the f value less than
or equal to c. If during this, you select
320
00:46:38,680 --> 00:46:45,680
a goal for expansion, then we return that
and terminate. Otherwise, we will
321
00:46:50,160 --> 00:46:57,099
update c to the minimum f value, which exceeded
c among the states which were examined, and
322
00:46:57,099 --> 00:46:59,349
repeat the same.
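The loop just described can be sketched as follows. This is a minimal Python sketch under my own naming; cycle checking along the current path is omitted for brevity, so it assumes a tree-structured space.

```python
def ida_star(start, succ, h, is_goal):
    """Iterative deepening A*: repeated depth first branch and bound
    with an f-value cutoff c; no open or closed lists are stored."""
    bound = h(start)                      # initially c = f(start) = h(start)

    def dfs(node, g, bound):
        f = g + h(node)
        if f > bound:
            return f, None                # backtrack, report the excess f
        if is_goal(node):
            return f, g
        smallest_excess = float("inf")
        for child, cost in succ(node):
            t, sol = dfs(child, g + cost, bound)
            if sol is not None:
                return t, sol
            smallest_excess = min(smallest_excess, t)
        return smallest_excess, None

    while True:
        t, sol = dfs(start, 0, bound)
        if sol is not None:
            return sol
        if t == float("inf"):
            return None                   # space exhausted: failure
        bound = t                         # minimum f that exceeded the old c

# Tiny tree: s -> a (1), s -> b (5), a -> g (3); goal g costs 4 via a.
edges = {"s": [("a", 1), ("b", 5)], "a": [("g", 3)]}
print(ida_star("s", lambda n: edges.get(n, []),
               lambda n: {"s": 2, "a": 1, "b": 0, "g": 0}[n],
               lambda n: n == "g"))  # 4
```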
323
00:46:59,349 --> 00:47:06,349
So, this picture will explain the thing nicely.
We start with the state s, which has cost
324
00:47:09,869 --> 00:47:16,869
fs. We perform depth first branch and bound
with this, so it could be the case, that s
325
00:47:21,089 --> 00:47:27,710
is the only state which has this cost, or
it could be that s and several of its successors
326
00:47:27,710 --> 00:47:34,710
have the cost fs. So, we expand all the states,
right? If we have found a goal anywhere here,
327
00:47:40,680 --> 00:47:47,680
then we terminate and return that goal. Otherwise,
we look at this frontier. What is this frontier?
328
00:47:50,099 --> 00:47:57,099
This frontier is the set of states whose cost
exceeded the current bound, so initially,
329
00:47:59,250 --> 00:48:06,250
the current bound was fs. So, this frontier
is the set of states
330
00:48:12,789 --> 00:48:19,789
where fn has exceeded c, right?
331
00:48:22,890 --> 00:48:29,890
All the parents of these states had cost of
c or less, right? That is why they were expanded.
332
00:48:34,630 --> 00:48:39,950
We did a depth first branch and bound, and
we backtracked only when we found that we
333
00:48:39,950 --> 00:48:44,940
had reached a state whose cost is greater
than c. We found that cost greater than c,
334
00:48:44,940 --> 00:48:50,130
backtracked, tried another path, again reached
a state greater than c, backtracked, found another
335
00:48:50,130 --> 00:48:56,230
path, again reached a state with cost greater than
c. So, this frontier, from where we backtracked
336
00:48:56,230 --> 00:49:03,230
during depth first branch and bound, consists
of those states, whose cost exceeds c and
337
00:49:04,160 --> 00:49:08,430
whose parents' cost is less than or equal
to c, right?
338
00:49:08,430 --> 00:49:15,430
From this frontier, I pick up the minimum
cost state, right? So, let us say that z is
339
00:49:17,819 --> 00:49:24,819
the minimum cost state in this frontier, right,
and then update c to fz, okay? What is fz?
340
00:49:35,359 --> 00:49:42,359
fz is the cost of the minimum cost state,
which exceeds the bound c; the state with
341
00:49:48,779 --> 00:49:55,779
the least cost exceeding c, right? That is fz,
right? If you think of A*, then once it had
342
00:49:58,119 --> 00:50:02,670
finished off all the predecessors of these
frontier states, it would have picked up this
343
00:50:02,670 --> 00:50:09,460
state z from open, because open would have
had this frontier states, right? And it would
344
00:50:09,460 --> 00:50:12,829
have picked up the minimum cost state from
there. So, A* would have picked up this state
345
00:50:12,829 --> 00:50:19,829
only. Now, what we do is, we do a depth first
branch and bound with fz as the cutoff.
346
00:50:23,900 --> 00:50:28,750
So, with this new c as the cutoff, what is
going to happen?
347
00:50:28,750 --> 00:50:34,029
In this frontier, this state is going to get
expanded, so that frontier is going to get
348
00:50:34,029 --> 00:50:41,029
pushed a little bit forward. Now, the new
frontier is these states and these states,
349
00:50:43,529 --> 00:50:50,529
right? Again, if you look at this frontier,
pick up the one which has the minimum cost;
350
00:50:55,349 --> 00:51:02,349
let us say, that some state q here now, has
the minimum cost. So, set c to fq, right?
351
00:51:03,789 --> 00:51:10,789
So, what are we doing? We are always expanding
a new state, as long as we have already exhausted
352
00:51:15,829 --> 00:51:22,829
all states having less cost, right, and that
is exactly what A* would also have done.
353
00:51:27,980 --> 00:51:34,980
It will always expand a new state, provided
all states with less cost in open have been
354
00:51:35,079 --> 00:51:42,079
expanded, right, and that is where the cost
increases. When I have finished all states
355
00:51:43,500 --> 00:51:50,500
having a given cost in open, it is only then
that I pick up the next
356
00:51:52,049 --> 00:51:59,049
larger cost, right? The advantage of doing
this is that we do not store open or closed.
357
00:52:00,990 --> 00:52:02,750
We do not require open or closed.
358
00:52:02,750 --> 00:52:09,750
We are just doing depth first. So, space requirement
is linear in the size of the paths that you
359
00:52:14,119 --> 00:52:21,119
are exploring, so we do not have the space
blow up of A*, but at the same time, because
360
00:52:22,359 --> 00:52:29,359
we are doing iterative deepening, so we are
not going purely depth first, in which case,
361
00:52:29,730 --> 00:52:34,619
we might have missed a goal which was pretty
close to the start state. So, we are going
362
00:52:34,619 --> 00:52:41,210
in an A* style, but without saving open
or closed, by progressively extending the
363
00:52:41,210 --> 00:52:48,210
frontier. And in practice, this works quite
well. To see the kind of bounds that iterative
364
00:52:55,319 --> 00:53:01,220
deepening gives us, consider A*. It expands
n states.
365
00:53:01,220 --> 00:53:08,220
Then IDA* can expand 1 plus 2 plus 3, and so
on up to n, that is, order n squared. Why so? Because,
366
00:53:11,650 --> 00:53:16,980
if you look at the iterations, in every iteration,
what may happen is, when you pick up a new
367
00:53:16,980 --> 00:53:23,980
state, it may happen that only that state
is expanded in the next iteration. If all the state
368
00:53:25,319 --> 00:53:32,319
costs are distinct, then in every
iteration, when you
369
00:53:36,569 --> 00:53:42,869
pick up the state from the frontier, this
may be the only state having this cost, in
370
00:53:42,869 --> 00:53:48,140
which case, it is the only state that is going
to get expanded, and in the next iteration,
371
00:53:48,140 --> 00:53:55,140
you will visit all these; again, you will
pick up one state and just expand that. And
372
00:53:55,400 --> 00:53:59,799
in the next iteration, you will visit all
these, right?
373
00:53:59,799 --> 00:54:06,799
So, in this case, what happens is, if you
look at the A* sequence of states here, in
374
00:54:07,869 --> 00:54:14,319
the first iteration, you may expand one state; in the second
iteration, you are expanding 2 states;
375
00:54:14,319 --> 00:54:19,420
in the third iteration, 3 states; every iteration,
only one new state is getting expanded,
376
00:54:19,420 --> 00:54:26,010
and so on, until you expand all n states.
And that is the set of states which A* will
377
00:54:26,010 --> 00:54:33,010
expand. Because you are not saving
them, we are
378
00:54:35,269 --> 00:54:42,269
again doing a DFBB on the state space, with
a new cost cutoff, right? So, it is all in
379
00:54:43,720 --> 00:54:48,799
the interest of saving space, because space
is the thing which will kill you in this kind
380
00:54:48,799 --> 00:54:53,950
of state space. So, this gives you, in the
worst case, as you can see, order of n square,
381
00:54:53,950 --> 00:54:58,920
where n is the number of states which A* expands;
these are the states that I was mentioning:
382
00:54:58,920 --> 00:55:02,230
all states with cost less than C*.
383
00:55:02,230 --> 00:55:06,829
This is going to be, in the
worst case, quadratic in time, as compared
384
00:55:06,829 --> 00:55:13,829
to A*. So, the time increase is only quadratic,
but the space is exponentially saved, because
385
00:55:17,369 --> 00:55:22,269
it can grow in order of b to the power of
m, where b is the branching factor. But here,
386
00:55:22,269 --> 00:55:29,269
you are doing it in linear space, right? So,
it is asymptotically optimal.
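The 1 plus 2 plus up to n count above is easy to check numerically. This trivial snippet only illustrates the bound, not the search itself:

```python
def ida_star_worst_case_expansions(n):
    """Worst case (all f-values distinct): each IDA* iteration re-expands
    every previously expanded state plus exactly one new state, so the
    total is 1 + 2 + ... + n = n * (n + 1) / 2, which is O(n^2)."""
    return sum(range(1, n + 1))

print(ida_star_worst_case_expansions(100))  # 5050 expansions when A* expands 100
```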
387
00:55:30,400 --> 00:55:37,400
Okay. So, there are several extensions of
these basic memory bounded search strategies.
388
00:55:42,710 --> 00:55:48,700
Maybe sometime, in some later lectures,
I will just touch upon the other kinds of
389
00:55:48,700 --> 00:55:54,410
strategies that we have. But from the next
lecture onwards, we are going to move into
390
00:55:54,410 --> 00:56:01,410
problem reduction search and game trees.
391
00:56:12,509 --> 00:56:19,230
In the last couple of lectures, we were discussing
state space search. And if you remember, in
392
00:56:19,230 --> 00:56:25,089
the initial introduction that I gave on search,
we said that there are 3 different paradigms
393
00:56:25,089 --> 00:56:31,089
for problem solving using search, broadly.
One was state space search, of which there
394
00:56:31,089 --> 00:56:37,369
are a few topics which I left out. We will cover
those later. The second topic is problem
395
00:56:37,369 --> 00:56:44,069
reduction search, and under problem reduction
search, we will look at 2 kinds of graph search,
396
00:56:44,069 --> 00:56:50,069
namely and/or graphs and game trees. And/or
graphs are a kind of structure which we will
397
00:56:50,069 --> 00:56:56,710
study, and it has several different applications,
though people do not always refer to them as
398
00:56:56,710 --> 00:57:02,599
and/or graph search, but the underlying philosophy
is the same. And then, we will look at game
399
00:57:02,599 --> 00:57:08,680
trees, which are the backbone of game playing
programs and for programs which optimize in
400
00:57:08,680 --> 00:57:10,049
the presence of an adversary.
401
00:57:10,049 --> 00:57:15,960
So, game trees address search situations where you have
adversaries, and there is some criterion
402
00:57:15,960 --> 00:57:22,960
that you have to optimize. And in the presence
of the adversary, you have to do that optimization.
403
00:57:23,849 --> 00:57:30,849
So, problem reduction search can be broadly
defined as planning how best to solve a problem
404
00:57:32,910 --> 00:57:39,730
that can be recursively decomposed into sub-problems,
in multiple ways. We can solve the same problem
405
00:57:39,730 --> 00:57:43,900
by decomposition. There is more than one
decomposition of the same problem, and we
406
00:57:43,900 --> 00:57:49,799
have to decide which is the best way to decompose
the problem, so that the total solution cost,
407
00:57:49,799 --> 00:57:56,799
the quality of the solution, or the effort of searching
is optimized. So, to start with, let us consider
408
00:57:59,130 --> 00:58:06,130
the matrix multiplication problem, where you
are given a sequence of matrices, A1 to An, and
409
00:58:10,549 --> 00:58:12,690
we have to find out the product of them.
410
00:58:12,690 --> 00:58:19,690
We have to find the product A1 A2 up to An, right? Now,
we can do it in several ways. For example,
411
00:58:26,240 --> 00:58:33,240
we can first multiply A1 and A2. With that product,
we can multiply A3, right?
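To see why the order of multiplication matters, count scalar multiplications: a p x q matrix times a q x r matrix costs p*q*r of them. The dimensions below are illustrative numbers of my own, not from the lecture:

```python
# A1 is 10 x 100, A2 is 100 x 5, A3 is 5 x 50.
p, q, r, s = 10, 100, 5, 50

cost_left = p * q * r + p * r * s    # (A1 A2) A3: 5000 + 2500
cost_right = q * r * s + p * q * s   # A1 (A2 A3): 25000 + 50000
print(cost_left, cost_right)  # 7500 75000
```

The two decompositions compute the same product, yet one costs ten times more; choosing the best decomposition is exactly the problem reduction search being introduced here.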