1
00:00:53,000 --> 00:01:02,100
Okay, so we will start the section on informed
state space search. In informed state space
2
00:01:02,280 --> 00:01:09,280
search- slides please. So, what we have here
is- when we talk about state space search,
3
00:01:11,159 --> 00:01:17,110
we talk about the search space which is in
the form of a set of states and set of state
4
00:01:17,110 --> 00:01:22,210
transition operators. Now, when we have informed
state space search, it means that we have
5
00:01:22,210 --> 00:01:29,210
additional information, indicating the proximity
of the goal from each state. So, as just outlined
6
00:01:32,100 --> 00:01:39,100
in the previous lecture, the notion of heuristics
is as follows: that you have heuristics that
7
00:01:39,810 --> 00:01:46,329
use domain specific information to estimate
the quality or potential of partial solutions,
8
00:01:46,329 --> 00:01:51,700
so that you know that if the current
state is a partial solution, then I know
9
00:01:51,700 --> 00:01:56,500
what is the potential of growing this into
a full solution. What is the additional cost
10
00:01:56,500 --> 00:02:03,500
that I will require for doing that? Let us
start with a few examples. One of the most common
11
00:02:04,210 --> 00:02:10,080
heuristics for the 8 puzzle is the Manhattan
distance heuristic.
12
00:02:10,080 --> 00:02:16,629
Now, you remember the 8 puzzle, where you
have all the tiles- the 8 tiles- arranged
13
00:02:16,629 --> 00:02:23,629
in a 3 by 3 square and we have to slide the
tiles to bring it to some configuration. To
14
00:02:28,010 --> 00:02:35,010
see the Manhattan distance heuristic for this
problem, let us say
that the tiles are like this- that I have-
15
00:02:51,660 --> 00:02:58,660
and I want to reach the configuration- my
goal configuration- which is known, again.
16
00:03:07,890 --> 00:03:14,890
Now, the heuristic- the Manhattan distance heuristic-
says: find, for every tile, the Manhattan
17
00:03:21,659 --> 00:03:28,659
distance from its current position to its
final position. And Manhattan distance is
18
00:03:29,390 --> 00:03:36,390
a commonly used term, which says that it is
the distance computed in terms of the distance
19
00:03:36,930 --> 00:03:43,930
on the x axis plus the distance on the y axis.
It comes from the fact that Manhattan apparently
20
00:03:44,150 --> 00:03:51,150
has all roads which are either east to west
or north to south.
21
00:03:52,920 --> 00:03:58,189
So, to travel from place a to place b, the
total distance that you have to travel is
22
00:03:58,189 --> 00:04:05,189
your distance northwards or southwards plus your distance
eastwards or westwards, right? For example,
23
00:04:05,500 --> 00:04:11,909
for this 5- the initial position is here and
the final position is here. So, I have to
24
00:04:11,909 --> 00:04:18,909
take at least 2 slides to bring it here. Now,
you see, that in order to bring this configuration
25
00:04:22,250 --> 00:04:29,250
to this, this tile 5 will have to be slid
at least twice. It has to be slid at least
26
00:04:31,740 --> 00:04:38,740
twice. Similarly, the tile 6 is also
at a Manhattan distance of 2. If you look
27
00:04:38,990 --> 00:04:45,990
at the tile 3, it is at a Manhattan distance
of 1, 2, 3, 4, right? If you have to move
28
00:04:49,530 --> 00:04:55,940
this tile 3 from here to its final position,
then no matter in what way you do it, it is
29
00:04:55,940 --> 00:05:01,840
guaranteed, that at least 4 slides will have
to be made.
30
00:05:01,840 --> 00:05:08,840
So now, if I just add up the Manhattan distances
of each of these tiles together, the Manhattan
31
00:05:14,560 --> 00:05:21,560
distance that each tile has to take; if I
add them up, then, can I say that I have a lower
32
00:05:23,280 --> 00:05:30,280
bound on the number of moves that I have to
make, in order to solve the puzzle? Yes. Why?
33
00:05:42,039 --> 00:05:49,039
Because every move, as we had said- what are
the operators? Move the blank up, move the
34
00:05:49,150 --> 00:05:55,729
blank left, move the blank right, or move
the blank down; but as a consequence of moving
35
00:05:55,729 --> 00:06:02,729
the blank, it is 1 tile which is being slid.
So, every operation is sliding a single tile
36
00:06:06,139 --> 00:06:11,270
and we have seen that the Manhattan distance
reflects the minimum number of slides that
37
00:06:11,270 --> 00:06:18,270
we have to make for a given tile, right? And
at a single step, we are moving only 1 tile,
38
00:06:20,539 --> 00:06:21,130
right?
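The sum-of-Manhattan-distances bound described above can be sketched as follows. This is only an illustration: the function name `manhattan_distance`, the tuple encoding of the board (read row by row, with 0 for the blank) and the two example boards are assumptions, not the configurations drawn in the lecture.

```python
def manhattan_distance(state, goal):
    """Sum, over the 8 tiles, of the Manhattan distance from the tile's
    current cell to its goal cell on a 3 by 3 board.

    `state` and `goal` are tuples of 9 entries read row by row;
    0 stands for the blank, which is not counted.
    """
    total = 0
    for tile in range(1, 9):
        row_now, col_now = divmod(state.index(tile), 3)
        row_goal, col_goal = divmod(goal.index(tile), 3)
        total += abs(row_now - row_goal) + abs(col_now - col_goal)
    return total


# Hypothetical boards: tiles 7 and 8 are each one cell away from home.
goal = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)
state = (1, 2, 3,
         4, 5, 6,
         0, 7, 8)

print(manhattan_distance(state, goal))  # 2
```

Since every move slides exactly one tile by one cell, no solution of this example board can use fewer than 2 moves.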
39
00:06:21,130 --> 00:06:27,900
If we have to move all these tiles that many
number of times, then the total Manhattan
40
00:06:27,900 --> 00:06:33,960
distance is giving me a lower bound on the
number of times
41
00:06:33,960 --> 00:06:40,960
that I have to move a tile, right? It could
be actually much more; actual number of slides
42
00:06:41,419 --> 00:06:48,419
can be more, but I have a lower bound on the
number of slides that I have, right? Yes,
43
00:07:04,009 --> 00:07:10,410
but when you are moving 1 tile, the others
are not moving, right? When you are sliding
44
00:07:10,410 --> 00:07:15,569
1 tile, you are having a feeling of pipelining
the whole thing, but it is not actually pipelining.
45
00:07:15,569 --> 00:07:22,569
You have to do it 1 by 1. Every move that
you do, has a unique cost, right? So, even
46
00:07:22,740 --> 00:07:29,740
if they all shift by one- the 4 tiles shift
by one- it is a cost of 4, not one. That is
47
00:07:31,210 --> 00:07:38,210
why what we have is a lower bound. Does it
clarify your query? Let us look at another
48
00:07:42,970 --> 00:07:49,970
heuristic. This is the- slides please, slides
please.
49
00:07:52,419 --> 00:07:59,419
We have the minimum spanning tree heuristic
for TSP. Now, this is what we do. We have
50
00:08:01,919 --> 00:08:08,919
an instance of TSP and we want to find
out a lower bound on the cost of the tour.
51
00:08:10,139 --> 00:08:17,139
So, what we have is, we have these cities,
okay, and we have to find out a tour to all
52
00:08:28,509 --> 00:08:35,509
these cities. So, what we are going to do
is, we will first create a minimum spanning
53
00:08:36,630 --> 00:08:43,630
tree of this. And let us say that this is
the minimum spanning tree. Let us say that
54
00:08:52,940 --> 00:08:59,940
the cost of this minimum spanning tree is
cs, the cost of the spanning tree,
55
00:09:03,860 --> 00:09:10,860
right, and let us say that the optimal cost
of the tour that we are looking for is C*.
56
00:09:15,459 --> 00:09:22,459
Then, my first claim is that cs is less than
C*. Now, let us convince ourselves that the
57
00:09:32,110 --> 00:09:39,110
cost of the minimum spanning tree is less
than the cost of the optimal tour, assuming
58
00:09:39,140 --> 00:09:46,140
that all costs are nonzero. If zero is also allowed,
then instead of less than, I will make it
59
00:09:46,279 --> 00:09:47,709
less than or equal to.
60
00:09:47,709 --> 00:09:53,959
Now, the reason that this is the case is,
assume that you are given the minimum cost
61
00:09:53,959 --> 00:10:00,959
tour. If you take out any of those edges,
you will have a spanning tree, right? Now,
62
00:10:03,980 --> 00:10:10,980
if the tour cost is less than the cost of
the minimum spanning tree, then by removing
63
00:10:11,720 --> 00:10:17,640
1 edge from the tour, you will further decrease
the cost and have a spanning tree which has
64
00:10:17,640 --> 00:10:24,640
less cost than the minimum cost spanning tree,
which is a contradiction, right? Therefore,
65
00:10:26,500 --> 00:10:33,500
we have cs less than C*. Is that all right?
Okay. Let me repeat this. If I look at the
66
00:10:45,019 --> 00:10:52,019
minimum cost tour, suppose this is the minimum
cost tour. If you remove an edge from the
67
00:10:54,390 --> 00:11:01,390
minimum cost tour, what are you left with?
You are left with a spanning tree, right?
68
00:11:04,420 --> 00:11:08,769
Remember that in the traveling salesperson problem,
we are not allowed to visit the same city
69
00:11:08,769 --> 00:11:11,990
more than once, right?
70
00:11:11,990 --> 00:11:17,630
Therefore, if you just chalk out the remaining
part of the tour after taking out any 1 of
71
00:11:17,630 --> 00:11:24,630
the edges, what you have is a spanning tree,
right? Now, if the whole tour cost you C*,
72
00:11:27,730 --> 00:11:34,730
then the cost of this tree will be less than
or equal to C*, right? And the minimum cost
73
00:11:36,180 --> 00:11:41,860
spanning tree can have a cost only less than
or equal to this. So, the
74
00:11:41,860 --> 00:11:48,860
minimum cost- spanning tree cost- is going
to be less than or equal to C*, okay? Now
75
00:11:53,760 --> 00:12:00,760
then, my claim is that this C* is again less
than twice cs. Why so? Because
76
00:12:27,529 --> 00:12:34,529
let us see the following thing: if I look
at twice of cs, that means that I am traversing
77
00:12:38,410 --> 00:12:45,410
each of the edges twice, right? Now, I will show
you that I can create a tour which costs less
78
00:12:47,480 --> 00:12:54,480
than this. How? Yes, because this is- if I
have to traverse like this, let us suppose
79
00:12:59,260 --> 00:13:06,260
I go from this city to this city, right? Then
from this city to this city, right? Then,
80
00:13:08,930 --> 00:13:15,930
from here to here, then from here to here.
Then, instead of going back here and then
81
00:13:18,760 --> 00:13:25,760
taking this, I will take the shortcut, right?
And then, I will take this edge. Then, I will
82
00:13:27,120 --> 00:13:34,120
take this edge and then, instead of backtracking
along these, I will go directly to the next
83
00:13:36,399 --> 00:13:38,470
city, right?
84
00:13:38,470 --> 00:13:45,470
So, that could be this point, right, and then
from here, again back here. Now, if these
85
00:13:50,850 --> 00:13:57,850
cities are in the Euclidean space, then you
have the triangle inequality. That means that
86
00:13:58,430 --> 00:14:05,430
if I had to traverse this thing backwards,
along all these edges, then the
87
00:14:05,920 --> 00:14:10,980
total cost that I would have to incur is going
to be more than if I take this shortcut directly,
88
00:14:10,980 --> 00:14:17,269
from here to here. So, triangle inequality-
the cost of all these edges; the sum of these
89
00:14:17,269 --> 00:14:24,269
edges is always more than the cost of taking
the direct distance, from this city to this
90
00:14:24,579 --> 00:14:31,579
city. Therefore, this tour that I have constructed
by just jumping from leaf to leaf like this,
91
00:14:33,529 --> 00:14:40,529
has a cost that is less than twice
cs, right, and the optimal tour can only
92
00:14:41,959 --> 00:14:47,670
be better than this, or maybe
just as good as this, right?
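The two bounds argued above (the spanning tree cost cs is no more than the cost of any tour, and the shortcut tour built from the tree costs no more than twice cs) can be checked on a small instance. This is a minimal sketch assuming Euclidean distances. The function name, the use of Prim's algorithm and the four example points are illustrative choices, not taken from the lecture.

```python
import math


def mst_and_shortcut_tour(points):
    """Return (cost of a minimum spanning tree, cost of the shortcut tour).

    The tree is built with Prim's algorithm from city 0. The tour visits
    the cities in preorder of that tree, so instead of backtracking along
    tree edges it jumps directly to the next unvisited city.
    """
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge linking a city to the tree
    parent = [-1] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0
    mst_cost = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        mst_cost += best[u]
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist(u, v) < best[v]:
                best[v] = dist(u, v)
                parent[v] = u

    # Preorder walk of the tree gives the visiting order of the tour.
    order, stack = [], [0]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children[u]))
    tour_cost = sum(dist(order[i], order[(i + 1) % n]) for i in range(n))
    return mst_cost, tour_cost


# Hypothetical instance: the four corners of a unit square.
cities = [(0, 0), (0, 1), (1, 1), (1, 0)]
cs, tour = mst_and_shortcut_tour(cities)
assert cs <= tour <= 2 * cs
print(cs, tour)
```

Because the optimal tour cost C* lies between cs and the cost of this shortcut tour, both cs and half of the shortcut tour cost are lower bounds on C*.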
93
00:14:47,670 --> 00:14:54,670
Therefore, I have C* is less than or equal to
2cs, right? Now, this gives us a nice thing-
94
00:14:56,690 --> 00:15:03,690
that if I compute the tour
in this fashion, by computing the spanning
95
00:15:04,089 --> 00:15:10,740
tree and then constructing a tour like this,
right, and then, I divide that by 2- divide
96
00:15:10,740 --> 00:15:17,740
the tour cost by 2, then I get a lower
bound on the cost of the optimal tour. Yes
97
00:15:26,639 --> 00:15:33,639
or no? Find this tour; this tour cost is less
than twice cs. So, if you take half of that,
98
00:15:45,050 --> 00:15:50,259
it is going to be at most cs, which is at most C*, right? And
so therefore, that is going to be a lower
99
00:15:50,259 --> 00:15:57,259
bound on the cost of the optimal tour, and
then, you use that bound in your TSP, right?
100
00:16:08,100 --> 00:16:15,100
Heuristics are fundamental to chess programs.
Now, just to give you an idea about how chess
101
00:16:15,540 --> 00:16:21,279
programs work: they will start
from 1 board configuration. And if you would
102
00:16:21,279 --> 00:16:28,279
look at it in the naïve way, then you will
start exploring all sequences of moves. And
103
00:16:30,339 --> 00:16:37,339
since the branching factor of chess is pretty
high, you cannot really explore very deep.
104
00:16:39,630 --> 00:16:46,220
If you start exploring right down to the winning-losing
configurations, that is going to take an enormous
105
00:16:46,220 --> 00:16:47,660
amount of time.
106
00:16:47,660 --> 00:16:53,829
So, what chess playing programs try to do,
is that they explore up to a few moves, look
107
00:16:53,829 --> 00:17:00,829
ahead. Say, 20 moves look ahead. They see
all board configurations at a look ahead of
108
00:17:02,209 --> 00:17:09,209
20 moves, right, and they evaluate each of
those board configurations based on some heuristics.
109
00:17:10,449 --> 00:17:14,990
And then based on those heuristics, they decide
which of the sequences of moves they will
110
00:17:14,990 --> 00:17:21,199
try to follow. There is more in it than I
am saying here, and we will study a little bit
111
00:17:21,199 --> 00:17:28,199
of that; about how you explore those kinds
of trees when we come to game trees, but heuristics
112
00:17:28,220 --> 00:17:30,559
are very important there as well.
113
00:17:30,559 --> 00:17:37,559
So, the informed search problem is like this:
again we have our familiar 4-tuple: the state
114
00:17:40,570 --> 00:17:46,620
space, the start state, the set of transition
operators, the set of goal states, and now,
115
00:17:46,620 --> 00:17:52,700
we have an additional function h, which is
a heuristic function that estimates the distance
116
00:17:52,700 --> 00:17:59,700
to the goal. And our objective is as before:
to find a minimum cost sequence of transitions
117
00:18:00,520 --> 00:18:07,520
to a goal state, right? So, the first algorithm
that we will study here is A*. This is a very
118
00:18:13,860 --> 00:18:20,360
well known algorithm, which was proposed long
back, perhaps around the time when you were
119
00:18:20,360 --> 00:18:27,360
born. So, this algorithm is of theoretical
importance mainly, by the way, because, as
120
00:18:28,559 --> 00:18:35,559
we shall see, it tries to maintain open
and closed explicitly. And in practice, it
121
00:18:35,929 --> 00:18:42,929
will work in very few cases, and
we will study how to save
122
00:18:43,669 --> 00:18:46,490
the memory and still do something similar
to A*.
123
00:18:46,490 --> 00:18:53,490
That is where all the engineering comes up.
So here, we have- in the initialize step,
124
00:18:53,799 --> 00:19:00,799
we set open to s and closed to empty, and
we have 2 functions. One is g(s), one is h(s), and
125
00:19:05,830 --> 00:19:12,830
the cost of the state s will be denoted by
f(s). Let me explain what this g and h business is.
126
00:19:14,520 --> 00:19:21,520
The g value of a state, at a given point of
time, will indicate the minimum cost path
127
00:19:22,570 --> 00:19:29,570
from the start state to that state, right?
We can draw it out on a picture. So, if we
128
00:19:33,640 --> 00:19:40,640
have a state here: this is the start state,
and then I have found out a path to our state
129
00:19:47,720 --> 00:19:54,720
n, then this path cost is g(n), right? h(n) is
the estimated cost.
130
00:20:10,270 --> 00:20:17,270
This is the estimated cost of traversing from
n to a goal state- to any goal state- and
131
00:20:26,620 --> 00:20:33,620
I will maintain f(n) as the sum of g(n) and h(n).
And what does that give us?
132
00:20:41,400 --> 00:20:48,400
It gives us an estimate of the cost of the best solution
that goes through n, right? Now, you have
133
00:20:57,530 --> 00:21:04,530
studied " " No? in algorithms no? Okay. Please
read it up. Please read up " " . Then, you
134
00:21:07,380 --> 00:21:14,380
will see that- then we can discuss that- what
is the similarity with this? Anyway, we have
135
00:21:15,580 --> 00:21:21,440
g(n) here and h(n) here. Now, what are the things
that can happen during the search? During
136
00:21:21,440 --> 00:21:28,440
the search, my g value can change, because
I might find some alternative better path
137
00:21:29,179 --> 00:21:36,179
to the same state, right? And also, as you
go down the path, when you go from n to its successor
138
00:21:39,659 --> 00:21:46,659
m, the heuristic value of m might be better.
So, it may give you a more accurate estimate
139
00:21:48,990 --> 00:21:55,990
of the distance of m to the goal. The cost
of m- the f(m) value- can change
140
00:21:58,080 --> 00:22:05,080
based on that- right? So, this is the basic
notion of what we mean by the g and the h.
141
00:22:07,440 --> 00:22:14,440
Now, let us see how we use this in the algorithm.
Initially, for the start state, the g value,
142
00:22:14,760 --> 00:22:18,850
g(s), is 0.
143
00:22:18,850 --> 00:22:25,850
Because, from the start state to the start
state, the cost is 0, and f(s) is the estimate
144
00:22:27,190 --> 00:22:34,190
of going from the start state to the goal,
because g(s) is 0. So, h(s) plus g(s) is the same
145
00:22:34,580 --> 00:22:41,580
as h(s). Step 2: same as before, if open is--
h(s) is the heuristic function which gives the
146
00:22:51,320 --> 00:22:58,320
estimate of the cost of reaching the goal
from s, right? h(s) is the estimate of
147
00:22:59,520 --> 00:23:06,520
the cost of reaching the goal from s. Then,
we have, the second step is: fail- if open
148
00:23:11,280 --> 00:23:18,280
is empty, then terminate and fail, just as
we had before. Then again, we select the minimum
149
00:23:19,990 --> 00:23:25,850
cost state n from open, and save it in closed,
but here when I refer to cost, I am referring
150
00:23:25,850 --> 00:23:32,850
to f(n). So, look at the f values of all the
states in open, and pick up the one which has
151
00:23:33,270 --> 00:23:40,270
minimum f value. I will work out an example
for this algorithm to make this thing clear.
152
00:23:44,370 --> 00:23:51,370
Then, step 4 is terminate. If n belongs to
the goal set, terminate with success and return
153
00:23:51,929 --> 00:23:53,080
f(n).
154
00:23:53,080 --> 00:24:00,080
Should I bring this back? Is it okay? All
right. So, let us go to the expand step. For
155
00:24:13,480 --> 00:24:20,480
each successor m of n, if m does not belong
to open or closed, which means that we are
156
00:24:20,820 --> 00:24:25,710
expanding it for the first time- this is the
first time that we are visiting the node.
157
00:24:25,710 --> 00:24:32,710
Then, we set g(m) to be g(n) plus c(n,m). Now, why
is that so?
158
00:24:43,440 --> 00:24:50,440
From the start state s, I have gone to n and
this was my g(n). Now, when I generate a successor
159
00:24:54,970 --> 00:25:01,970
m, the g value is going to be this whole cost.
So, it is g(n) plus this c(n,m). c(n,m) is the cost
160
00:25:06,610 --> 00:25:13,610
of applying the operator to move from n to
m, right? So, g(n) plus c(n,m)- that is the value
161
00:25:13,669 --> 00:25:18,029
of g(m), right?
162
00:25:18,029 --> 00:25:25,029
So, coming back to here, we have g(m) is equal
to g(n) plus c(n,m), and then we set f(m) equal to
163
00:25:31,110 --> 00:25:38,110
g(m) plus h(m), where h(m) is the estimated cost
of reaching the goal from m, right? Then we
164
00:25:41,279 --> 00:25:48,279
insert m in open. Again, if m already belongs
to open or closed, then, we set g(m) equal
165
00:25:52,630 --> 00:25:59,630
to the minimum of g(m) and g(n) plus c(n,m). We see whether
we have obtained a better path to reach the
166
00:26:00,470 --> 00:26:07,470
state m, right? h(m) is going to remain the
same. So, we find out what is the best g(m)
167
00:26:11,220 --> 00:26:17,940
that we have got so far, and then add that
with h(m), to get the new cost f(m). Now, if
168
00:26:17,940 --> 00:26:23,900
f(m) has decreased- which means that we have
indeed found a better path to that state-
169
00:26:23,900 --> 00:26:29,179
and we find that m belongs to closed, then
we will move m to open. Otherwise, we will
170
00:26:29,179 --> 00:26:31,289
simply update the cost of m.
171
00:26:31,289 --> 00:26:38,289
If m is there in open, we will simply update
its cost to the new value of f(m), right? Now,
172
00:26:41,100 --> 00:26:48,100
this is similar to what we have done previously
for the uniform cost search, except that now,
173
00:26:49,740 --> 00:26:56,740
states can move from closed to open, because
of the heuristic function. The heuristic estimate,
174
00:26:57,830 --> 00:27:04,830
at an earlier stage, could have been worse
than it is as you progress further along some paths,
175
00:27:06,970 --> 00:27:11,210
and may have been good all along on some
other paths. So, therefore, something which
176
00:27:11,210 --> 00:27:16,789
was considered not so good at some point of
time, suddenly might become good, because
177
00:27:16,789 --> 00:27:21,580
the other paths which were promising, as we
went down, their heuristic costs increased,
178
00:27:21,580 --> 00:27:24,690
and we found that no, these are not good enough,
right?
179
00:27:24,690 --> 00:27:31,690
We will see examples of this nature of nodes
coming back from closed to open, okay? Then,
180
00:27:34,600 --> 00:27:41,600
let us go further down. Otherwise, we go to
step 2, right? Fine.
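The steps just described (initialize, fail, select, terminate, expand, loop) can be sketched as below. This is an illustrative version under stated assumptions, not the slide's exact pseudocode: `a_star`, the dictionary-based open list and the small example graph with its heuristic values are all hypothetical.

```python
def a_star(start, goals, successors, h):
    """Return the cost of the solution found, or None if open runs empty.

    `successors(n)` yields (m, c(n,m)) pairs and `h(n)` is the heuristic
    estimate of the cost from n to a goal.
    """
    g = {start: 0}                 # best known path cost from the start
    open_f = {start: h(start)}     # open list: state -> f value
    closed = set()
    while open_f:                                   # fail when open is empty
        n = min(open_f, key=open_f.get)             # select minimum f(n)
        f_n = open_f.pop(n)
        closed.add(n)
        if n in goals:                              # terminate with success
            return f_n
        for m, cost in successors(n):               # expand
            new_g = g[n] + cost
            if m not in g or new_g < g[m]:
                g[m] = new_g                        # first or better path to m
                closed.discard(m)                   # a closed state is reopened
                open_f[m] = new_g + h(m)            # f(m) = g(m) + h(m)
    return None


# Hypothetical graph and heuristic values (not the graph from the lecture).
edges = {
    "S": [("A", 1), ("B", 4)],
    "A": [("B", 2), ("G", 6)],
    "B": [("G", 2)],
    "G": [],
}
estimate = {"S": 4, "A": 3, "B": 2, "G": 0}

print(a_star("S", {"G"}, lambda n: edges[n], estimate.get))  # 5
```

In this example the first path found to B has cost 4, and it is later replaced by the cheaper path through A with cost 3, which is the update of g(m) described in the expand step.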
181
00:27:50,070 --> 00:27:57,070
Let us work out an example to see this. So,
I have this graph, which is the same as the
182
00:28:07,330 --> 00:28:13,640
one that we had seen before, and let me quickly
write down the costs that we had. This was
183
00:28:13,640 --> 00:28:20,640
the cost that we had: can you see clearly?
Now, in addition to this, let us assume that
184
00:28:46,110 --> 00:28:53,110
we have some heuristic values that we find
from the nodes. So, that is some function
185
00:28:53,539 --> 00:29:00,539
which tells us at each state- what is the
estimate of the cost to the goal. So, let us say, in 1,
186
00:29:01,460 --> 00:29:08,460
I have the heuristic value of 12 and at 2,
I have heuristic value of 10. That means that
187
00:29:14,940 --> 00:29:21,940
when I am in state 2, I have an estimate
that from 2 to goal will cost me at least
188
00:29:22,149 --> 00:29:29,149
10. That estimate is given to us. That is
the heuristic function that is given to us,
189
00:29:30,960 --> 00:29:32,250
right?
190
00:29:32,250 --> 00:29:39,250
Now, note that though I am writing down these
numbers beside the states, you do not know
191
00:29:40,370 --> 00:29:47,370
these values until you have actually
generated this state, and applied the heuristic
192
00:29:47,820 --> 00:29:53,230
function on that state. So, for example, if
you have reached a particular state of 15
193
00:29:53,230 --> 00:29:58,659
puzzle, then you can use the Manhattan heuristic
on that state, to find out a lower bound on
194
00:29:58,659 --> 00:30:04,120
the number of moves that will take you to
the goal. Similarly, in TSP, after you
195
00:30:04,120 --> 00:30:11,120
have generated a partial tour, it is then
that you know an estimate of
196
00:30:11,840 --> 00:30:18,840
the remaining tour. So, these are heuristic
values. And what will be the heuristic value
197
00:30:33,200 --> 00:30:38,250
of the goal? 0.
198
00:30:38,250 --> 00:30:45,250
Now, let us start with our lists. We will
have open here and
closed here. Initially, open will contain
199
00:31:04,220 --> 00:31:11,220
1, with a cost of- no, no, no, no. 12, because
you have the estimate.
200
00:31:16,250 --> 00:31:23,250
So, your cost initially is not 0 from the
start state. It is the f value, which is
201
00:31:25,289 --> 00:31:31,049
g(s) plus h(s), and g(s) is always 0 at the start
state. h(s) is 12, which I have from here.
202
00:31:31,049 --> 00:31:38,049
So, I have this thing with a cost of 12, right?
In the first step, I will expand the state
203
00:31:40,940 --> 00:31:47,350
1 and put it in closed. So, I will have the
state 1 with a cost of 12, which goes into
204
00:31:47,350 --> 00:31:54,350
closed and that will lead me to generate the
successors 2 and 5. 2 will come with a cost
205
00:31:58,179 --> 00:32:05,179
of how much? 12. Why? Because the g value
is this 2 and the h value is this 10, and
206
00:32:10,970 --> 00:32:17,970
the f value is g plus h. So, it is 10 plus
2, 12, right? What does this 12 indicate?
207
00:32:20,700 --> 00:32:27,700
It indicates an estimate of the cost of a
solution path that goes through 2, right?
208
00:32:29,080 --> 00:32:35,360
This 12 does not tell me the solution from
2 is 12, no?
209
00:32:35,360 --> 00:32:42,360
The estimate of the solution from 2 is 10,
but I have already incurred a cost of 2 for
210
00:32:43,120 --> 00:32:49,820
coming from the start state to this. So, the
total solution cost is 10 plus 2, 12. The
211
00:32:49,820 --> 00:32:56,820
estimate of the total solution cost is 12,
clear? And then, here, I have how much? 13,
212
00:33:04,029 --> 00:33:11,029
right? So, in the next step, I will be picking
up the one with minimum f value. That is 2.
213
00:33:14,049 --> 00:33:21,049
So, 2 with a cost of 12 comes out, and what
do we have here? We will have 5 with a cost
214
00:33:22,880 --> 00:33:29,880
of 13. And for 2, we will now have 3 with
a cost of 19. That is 16 plus the g value,
215
00:33:34,049 --> 00:33:41,049
which is 3. Now see, it is not that every
time I will add these things up. What I am
216
00:33:41,649 --> 00:33:47,140
doing is, see, the g values of these nodes
are being maintained. So, I know that the
217
00:33:47,140 --> 00:33:54,140
g value of 2 is 2. So I take- when I expand
2 to get 3, I will take the g(n) value, which
218
00:33:56,029 --> 00:34:03,029
is 2 added with this 1 to get 3. That is going
to be the g value of 3. g(3) is 3, and h is what
219
00:34:06,480 --> 00:34:08,599
I compute when I generate 3.
220
00:34:08,599 --> 00:34:15,599
So, g plus h is 19 and also, I will have 6
with a cost of 12. That is, 7 plus 5. g value
221
00:34:22,929 --> 00:34:29,929
is 5, h value is 7. Next, I will be picking
up 6. So, 6 with a cost of 12 is here. I have
222
00:34:37,050 --> 00:34:44,050
3 with a cost of 19, 5 with a cost of 13 and
now I am expanding 6. So, will have 7 with
223
00:34:51,010 --> 00:34:58,010
a cost of 17. Yes? Why 17? Because the g value
is 6 and the h value is 11. So, 17. And also,
224
00:35:09,530 --> 00:35:16,530
I will have 10 with a cost of 13, and also
5. But what will be the cost of 5? It is coming
225
00:35:22,950 --> 00:35:29,950
as 22. The current value of 5 is 13, so that
is no use. We just discard it. We do not do
226
00:35:37,170 --> 00:35:42,599
anything about it, because the new cost of
that- we have found the new g value is actually
227
00:35:42,599 --> 00:35:48,720
larger than the existing g value. So, this
is no good, right?
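The f values quoted so far in this example follow from f = g + h. Written out with only the numbers stated above (the notation c(2,3) for the edge cost from 2 to 3 is the same c(n,m) used in the expand step):

```latex
\begin{align*}
f(1) &= g(1) + h(1) = 0 + 12 = 12 \\
f(2) &= g(2) + h(2) = 2 + 10 = 12 \\
g(3) &= g(2) + c(2,3) = 2 + 1 = 3, & f(3) &= 3 + 16 = 19 \\
f(6) &= g(6) + h(6) = 5 + 7 = 12 \\
f(7) &= g(7) + h(7) = 6 + 11 = 17
\end{align*}
```

For state 5, the path through 6 gives a cost of 22, which is larger than the 13 already recorded, so the existing entry is kept.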
228
00:35:48,720 --> 00:35:55,720
Now, I have 2 nodes having cost of 13. Let
us say without loss of generality, then I
229
00:35:56,500 --> 00:36:03,500
will pick up 5, okay? If I pick up 5 with
a cost of 13, then what do I have here? I
230
00:36:08,820 --> 00:36:15,820
have 3 with a cost of 19, then 7 with a cost
of 17, then 10 with a cost of 13. And then
231
00:36:20,670 --> 00:36:27,670
I have 9 coming up, by expanding 5, which
comes with a cost of 12 plus 2, 14, right?
232
00:36:34,800 --> 00:36:41,800
Next step, I will be picking up 10 with a
cost of 13. So, if I pick up 10 with a cost
233
00:36:42,190 --> 00:36:49,190
of 13, then I will have this 3 with a cost
of 19, right? 7 with a cost of 17, 9 with
234
00:36:54,349 --> 00:37:01,349
a cost of 14 and by expanding 10, I will get
11 with a cost of 13, right?
235
00:37:07,230 --> 00:37:14,230
Next step, I will be picking up 11 with a
cost of 13, right? That will give me 3 with
236
00:37:20,030 --> 00:37:27,030
a cost of 19, 7 with a cost of 17, 9 with
a cost of 14 and 12 with a cost of 13. Next
237
00:37:37,880 --> 00:37:44,880
step, I will be picking up 12 with a cost
of 13 and since that is the goal, we terminate,
238
00:37:45,740 --> 00:37:50,180
and we declare that the minimum cost is 13.
Previously also, we had found that
239
00:37:50,180 --> 00:37:52,089
the minimum cost was 13.
240
00:37:52,089 --> 00:37:59,089
Now, let us compare this with what we had
for uniform cost search. So, when we worked
241
00:38:00,560 --> 00:38:07,560
out the one for uniform cost search, you can
check that we had; this is the one that we
242
00:38:09,680 --> 00:38:11,660
had for uniform cost search.
243
00:38:11,660 --> 00:38:18,660
Then, I had 11 nodes expanded; those are the
nodes that you have in closed. You had 11
244
00:38:31,780 --> 00:38:37,839
nodes expanded, and if you compare that with
the heuristic search, when we had made use
245
00:38:37,839 --> 00:38:44,640
of this heuristic information, you see that
we now have- how many?- 6 nodes expanded,
246
00:38:44,640 --> 00:38:49,960
and then when we picked up the seventh node,
we found that it was a goal. So, we had to
247
00:38:49,960 --> 00:38:56,960
actually expand 6 nodes, right? It has reduced
the total effort of the search, but how? How
248
00:39:02,280 --> 00:39:04,490
did it reduce it?
249
00:39:04,490 --> 00:39:11,490
What it did here was that, there were some
nodes which looked extremely promising at
250
00:39:11,640 --> 00:39:18,520
the beginning, like, if you look at 3, it
came up here with a cost of only 3, but then,
251
00:39:18,520 --> 00:39:25,520
if you have to follow along 3, then if you
go to 3 to 4, well, the cost is still good.
252
00:39:25,720 --> 00:39:32,720
It is 2 plus 1 plus 2. So, if you just ignore
the heuristics, then this cost would be 5,
253
00:39:33,230 --> 00:39:40,200
which is still less than the cost of the optimal
solution. So, that means 3 would have been
254
00:39:40,200 --> 00:39:46,170
expanded, 4 would have been expanded, right?
8 would have expanded, and if you recall,
255
00:39:46,170 --> 00:39:49,869
if you go back to the previous example, you
will see that all these nodes were actually
256
00:39:49,869 --> 00:39:55,790
expanded, when we did not have the heuristic
information. But the heuristic information
257
00:39:55,790 --> 00:40:01,680
here told us that, look, from 3, you will have
a cost of 16, at least.
258
00:40:01,680 --> 00:40:08,680
Therefore, at this point itself, the cost of
3 became 19, and as you can see, that 3 never
259
00:40:11,780 --> 00:40:17,760
got expanded. 3 never got expanded, as a result,
4 never got generated, 8 never got generated,
260
00:40:17,760 --> 00:40:24,530
and they were also not expanded, right? So,
what this essentially did was, it performed
261
00:40:24,530 --> 00:40:31,530
a look ahead, and was able to tell us, a priori,
that this path is not going to be good. Now,
262
00:40:38,160 --> 00:40:45,160
let us do some analysis on the set of nodes
that will be expanded by these algorithms.
263
00:40:49,510 --> 00:40:56,510
If you look at the uniform cost search algorithm,
then I will say that all those states which
264
00:41:01,630 --> 00:41:08,630
can be reached with a cost less than the cost
of the optimal solution; all those states
265
00:41:09,980 --> 00:41:14,130
will have to be expanded by any algorithm.
266
00:41:14,130 --> 00:41:21,130
Now, let me reiterate. Suppose you give
me an algorithm a; you give me an algorithm
267
00:41:25,900 --> 00:41:32,900
a and you claim that this algorithm is always
going to find the optimal solution. We do
268
00:41:33,480 --> 00:41:40,480
not have any heuristics. We are talking about
the uniform cost search paradigm, so I do
269
00:41:40,480 --> 00:41:47,480
not have any heuristics. I am given a problem
which means a start state and a set of state
270
00:41:47,530 --> 00:41:52,540
transition operators, and you give me an algorithm
a and claim that this fellow is always going
271
00:41:52,540 --> 00:41:59,119
to give me the optimal solution. And then,
I am trying to say that look, I think that
272
00:41:59,119 --> 00:42:05,960
the complexity of your algorithm will be such,
that it will expand all states which have
273
00:42:05,960 --> 00:42:11,460
a cost less than the minimum solution cost.
274
00:42:11,460 --> 00:42:18,460
You know why? Because suppose there is some
state. So, this is our start state, and there
275
00:42:20,270 --> 00:42:27,270
is some state here- some state n- and the
cost of n is less than C*. And
276
00:42:33,579 --> 00:42:40,579
if your algorithm a does not expand n, then
I am going to give the algorithm a another
277
00:42:43,940 --> 00:42:50,369
instance of the problem, where the entire
state space will be similar, except that just
278
00:42:50,369 --> 00:42:57,369
below n, I will add a goal, right, and whose
cost will be say cn or just cn plus some epsilon,
279
00:43:04,819 --> 00:43:11,819
where cn plus epsilon is also less than C*.
I can always find such an epsilon. And then,
280
00:43:13,119 --> 00:43:19,349
because nothing else has changed in the state
space, the algorithm a will be unable to find
281
00:43:19,349 --> 00:43:26,349
this goal, because it is not expanding n,
and if it does not expand n, then it will never
282
00:43:27,060 --> 00:43:34,060
discover this goal. Therefore, it will give
you a sub-optimal solution. It will still
283
00:43:35,750 --> 00:43:39,660
give you C*, but you have a goal which has
better cost.
284
00:43:39,660 --> 00:43:46,660
Now, is this analysis clear? Once again? Okay.
My claim is that if cn is less than C*, and
285
00:43:57,810 --> 00:44:04,810
C* is- what?- optimal cost, then n must be
expanded. This is my claim. Then, n must be
286
00:44:28,630 --> 00:44:35,630
expanded. This is my claim. Now, how do we
establish this claim? We say that, let us
287
00:44:39,970 --> 00:44:46,970
assume that we have an algorithm a, which
does not expand n. So, suppose algorithm a does
288
00:44:58,410 --> 00:45:05,410
not expand n, right? Then, what we can do
is, we can keep the remaining state space
289
00:45:08,140 --> 00:45:15,140
identical. We do not make any change to the
remaining state space, except that below n,
290
00:45:15,560 --> 00:45:22,560
we just add a goal and give it a cost which
is between cn and C*. We can always do that,
291
00:45:27,650 --> 00:45:34,650
because cn is less than C*, so you can have
some epsilon which you add to cn, and then
292
00:45:35,339 --> 00:45:38,510
this edge cost is that epsilon, right?
293
00:45:38,510 --> 00:45:45,510
So, this goal will have cost cn plus epsilon.
Now, from the point of view of a, nothing
294
00:45:46,099 --> 00:45:52,460
has changed, because the entire remaining
state space is similar, and in that
295
00:45:52,460 --> 00:45:59,460
scenario, a was not expanding n. So, a will
still not expand n, and if it does not expand
296
00:46:01,510 --> 00:46:08,510
n, then it will not see this goal. It will
not be able to see this goal, unless it expands
297
00:46:10,420 --> 00:46:17,420
n, clear? Understood? You create another state
space, yes. You create another state space,
298
00:46:28,020 --> 00:46:35,020
which is similar to that previous state space,
except that we have a goal just below this.
299
00:46:36,780 --> 00:46:43,780
No, not yet. So, for this class of problems,
where we do not have heuristic functions,
300
00:46:46,470 --> 00:46:53,470
the claim is that all states which have cost
less than C*, will have to be expanded. And
301
00:46:55,530 --> 00:47:01,300
if you think of it, Dijkstra's does exactly
that, right?
302
00:47:01,300 --> 00:47:07,819
It always expands the minimum cost state in
your frontier. Therefore, when
303
00:47:07,819 --> 00:47:13,040
you have found the goal, then all the states
that you have in your frontier or in your
304
00:47:13,040 --> 00:47:19,930
heap, they all have cost more than the cost
of the goal. And because you are dealing with
305
00:47:19,930 --> 00:47:26,760
positive edge cost, in case of Dijkstra's,
so you know that by expanding the remaining
306
00:47:26,760 --> 00:47:32,309
set of states, you are not going to ever come
to another state which has lesser cost than
307
00:47:32,309 --> 00:47:39,309
the ones that you already have, right? Now,
so this is what we have in the case of uniform
308
00:47:45,220 --> 00:47:46,920
cost search.
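The claim for uniform cost search can be checked on a small example. The sketch below is only an illustration: the graph, the state names and the function name uniform_cost_search are assumptions made here, not something from the lecture. It records which states get expanded, and then builds the modified instance from the argument, with a new goal hung just below an existing state.

```python
import heapq

def uniform_cost_search(graph, start, goals):
    """Return (cost of the goal found, set of expanded states).

    Assumes positive edge costs, as in Dijkstra's algorithm."""
    frontier = [(0, start)]              # heap ordered by path cost g
    best_g = {start: 0}
    expanded = set()
    while frontier:
        g, state = heapq.heappop(frontier)
        if state in expanded:
            continue                     # stale heap entry, already expanded
        if state in goals:
            return g, expanded           # nothing left in the heap is cheaper
        expanded.add(state)
        for succ, cost in graph.get(state, []):
            new_g = g + cost
            if new_g < best_g.get(succ, float("inf")):
                best_g[succ] = new_g
                heapq.heappush(frontier, (new_g, succ))
    return None, expanded

# Hypothetical instance: state -> [(successor, edge cost)]
graph = {
    "S": [("A", 1), ("B", 4)],
    "A": [("C", 2), ("G", 9)],
    "B": [("G", 3)],
    "C": [("G", 5)],
}
c_star, expanded = uniform_cost_search(graph, "S", {"G"})

# Modified instance from the argument: a new goal G2 just below C,
# with cost g(C) + epsilon = 3 + 1 = 4, which is less than C*.
variant = dict(graph, C=[("G", 5), ("G2", 1)])
better_cost, _ = uniform_cost_search(variant, "S", {"G", "G2"})
```

Here S, A, C and B have costs 0, 1, 3 and 4, all below the optimal cost, and all four are expanded. An algorithm that skipped C would behave the same on both instances and would miss the cheaper goal in the second one.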
309
00:47:46,920 --> 00:47:53,920
What will happen in case of A*? In A*, we
have also the heuristic function. The heuristic
310
00:47:54,160 --> 00:48:00,960
function gives us an estimate of the cost
to the goal. Now, can I characterize the set
311
00:48:00,960 --> 00:48:07,960
of states which A* will expand, given a heuristic
function h? Yes, so let us make the claim
312
00:48:15,400 --> 00:48:22,400
first, then we will reason about it. The claim
is that if I have fn less than C* then n must
313
00:48:29,000 --> 00:48:36,000
be expanded. See, again, at this point of
time, we are assuming 2 things: we are assuming,
314
00:49:00,250 --> 00:49:07,250
one, that
315
00:49:25,270 --> 00:49:32,270
the heuristic function under-estimates, that
is, hn is less than or equal to, where f star
316
00:49:44,609 --> 00:49:51,609
n is the ... And second thing is, if you go
by this rule, then what we have here is, if
317
00:50:38,030 --> 00:50:45,030
fn is less than C*, then n must be expanded
by an algorithm, because if it does not- if
318
00:50:49,500 --> 00:50:56,180
it does not expand the state n- then, what
we can do is, we can again create another
319
00:50:56,180 --> 00:51:02,470
state space, which is exactly similar to the
existing state space. But below n, we will
320
00:51:02,470 --> 00:51:05,020
put another goal.
321
00:51:05,020 --> 00:51:12,020
We will put a goal, and because fn is less
than C*, so therefore, what we can do is that
322
00:51:15,710 --> 00:51:22,710
there will be some heuristic cost hn here,
right? So, we will associate that heuristic
323
00:51:23,240 --> 00:51:29,099
cost with the cost of the transition to the
goal. So, suppose we have what? The transformation
324
00:51:29,099 --> 00:51:36,099
that we are going to do is as follows: we
have n here, and then, we will create a new
325
00:51:36,790 --> 00:51:43,790
goal, say g, right? Now, this had an hn component,
and this had some gn component from above.
326
00:51:46,220 --> 00:51:53,220
So, fn was gn plus hn. What I am going to
do is, I am going to make this edge cost
327
00:51:59,630 --> 00:52:06,630
equal to hn. In that case, what is going to
be the f value of this? fg.
328
00:52:08,599 --> 00:52:15,599
The g value for this new goal state
will be gn plus hn. So, it is going to
329
00:52:16,460 --> 00:52:23,460
be fn and the h value is 0. f g will have
the same value as f n. So, now I have another
330
00:52:24,650 --> 00:52:31,650
goal. I have a goal which has the same cost
as the node n, and the algorithm, if it does
331
00:52:32,880 --> 00:52:39,880
not expand n, it will not be able to discover
this goal, right? So, by a similar reasoning,
332
00:52:40,790 --> 00:52:46,920
we establish that if you have a state whose
f value is less than
333
00:52:46,920 --> 00:52:53,920
C*, then every algorithm which guarantees
finding the optimal solution, will have to
334
00:52:54,579 --> 00:53:01,579
expand that, right? Now, note that I have
not written less than or equal to. I have
335
00:53:03,510 --> 00:53:10,510
written strictly less than; the ones which
are strictly less are surely going to be expanded,
336
00:53:11,339 --> 00:53:18,339
but if you have less than or equal to, then
we do not know, right?
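The construction can be written out step by step. In this sketch the new goal is called g' (to keep it apart from the function g), and c(n, g') stands for the cost of the added edge; both names are notation assumed here, not taken from the board.

```latex
\begin{align*}
f(n)     &= g(n) + h(n) < C^* \\
c(n, g') &= h(n) \\
g(g')    &= g(n) + c(n, g') = g(n) + h(n) \\
h(g')    &= 0 \\
f(g')    &= g(g') + h(g') = g(n) + h(n) = f(n) < C^*
\end{align*}
```

So the new goal costs strictly less than C*, and it can only be reached by expanding n.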
337
00:53:20,380 --> 00:53:27,380
Now, I have not mentioned here, if you come
back to the slide algorithm A*, then, here
338
00:53:27,750 --> 00:53:34,750
I was selecting the minimum cost state n from
open, right? Now, if you have many states
339
00:53:37,730 --> 00:53:44,730
having the same cost, which one will we select?
What we do there is, if you have
340
00:53:49,849 --> 00:53:56,780
many states with the same cost, select the
one which has minimum g value- among those
341
00:53:56,780 --> 00:54:03,119
states, which have the same f value, select
the one which has minimum
342
00:54:03,119 --> 00:54:10,119
g value, because the others have already incurred
more cost in terms of g and we do not know
343
00:54:14,030 --> 00:54:18,460
the accuracy of the heuristics, right? Okay.
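One simple way to get this tie-breaking rule is to order the open list by the pair (f, g). The sketch below assumes a binary heap for open; the helper name push and the three states are made up for the illustration.

```python
import heapq

def push(open_heap, state, g, h):
    # Key is (f, g, state): smallest f first, then smallest g among equal f.
    heapq.heappush(open_heap, (g + h, g, state))

open_heap = []
push(open_heap, "x", g=6, h=4)   # f = 10
push(open_heap, "y", g=3, h=7)   # f = 10, smaller g
push(open_heap, "z", g=8, h=3)   # f = 11

f, g, state = heapq.heappop(open_heap)
```

The state popped is y, the one with the smaller g among the two states with f equal to 10. Note that this follows the rule stated in the lecture; many implementations break ties the other way, in favour of the larger g.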
344
00:54:18,460 --> 00:54:25,460
So, with that, we will conclude this lecture.
In the next lecture, I will start by analyzing
345
00:54:31,079 --> 00:54:38,079
some results of A*, and then we will study
how we can create variants of A*, which will
346
00:54:39,089 --> 00:54:44,369
work better than A*. A* does not work
very well in practice. That is because it
347
00:54:44,369 --> 00:54:49,500
requires too much memory. It is storing
the whole of open and the whole of closed
348
00:54:49,500 --> 00:54:54,010
and it uses up too much memory. So, it
does not work in practice, but there are variants
349
00:54:54,010 --> 00:55:01,010
of that which are used. So, we will study
some of those in the next class.
350
00:55:13,089 --> 00:55:20,089
We will continue with our discussion on A*
and heuristic search engine from this class
351
00:55:20,200 --> 00:55:27,200
onwards. The topic of this lecture
is heuristic search- A* and beyond. So quickly,
352
00:55:30,829 --> 00:55:37,089
to recap what we had done in the last class:
we studied the algorithm A* which maintains
353
00:55:37,089 --> 00:55:44,089
2 lists- open and closed- and also 2 functions.
One is the g value, which computes the distance
354
00:55:47,740 --> 00:55:54,210
of the state from the start state, and the
h value, which is the heuristic estimate of
355
00:55:54,210 --> 00:56:01,210
the distance of that state from the goal state.
And fs is the sum of gs and hs, and that gives
356
00:56:04,740 --> 00:56:11,740
us the estimated cost of a solution, which
goes through the node n. So, the first step
357
00:56:15,799 --> 00:56:20,619
was: if open is empty and we have still not
yet found the goal, then we terminate with
358
00:56:20,619 --> 00:56:27,619
failure, otherwise we select the minimum cost
state n from open and save it in closed. If
359
00:56:27,740 --> 00:56:33,819
the selected state is a goal state, then you
terminate with success and return the f value
360
00:56:33,819 --> 00:56:37,210
of that state as the cost of the goal.
361
00:56:37,210 --> 00:56:44,210
Otherwise, we expand the node n to generate
the set of successors and for each successor
362
00:56:46,359 --> 00:56:53,359
m, we compute its cost, based on the g value
of that node and the h value of that node,
363
00:56:54,480 --> 00:57:00,210
and if the node already belongs to open or
closed, we update it only if the cost is decreased.
364
00:57:00,210 --> 00:57:05,720
And if the node is already in closed and its
cost has decreased, then you must bring it
365
00:57:05,720 --> 00:57:12,720
back to open. Now, in uniform cost search,
we had seen that if you have only positive
366
00:57:12,940 --> 00:57:18,010
cost, then you cannot have a case where a
node comes back from closed to open.
367
00:57:18,010 --> 00:57:24,569
Here, in the first iteration, you may expand
one. Second iteration, you are expanding 2
368
00:57:24,569 --> 00:57:29,319
states. Every iteration, only 1 new state
is getting expanded. Third iteration: 3 states
369
00:57:29,319 --> 00:57:35,930
are getting expanded and so on, until you
expand all n states. And that is the set of
370
00:57:35,930 --> 00:57:42,930
states which A* will expand. Because you are
not saving them, because we are not saving
371
00:57:44,790 --> 00:57:51,790
them. So, we are again doing a DFBB
on the state space, with the new cost cutoff,
372
00:57:52,230 --> 00:57:58,180
right? So, it is all in the interest of saving
space, because space is the thing which will
373
00:57:58,180 --> 00:58:00,510
kill you, in these kinds of state spaces.
374
00:58:00,510 --> 00:58:05,790
So, this gives you in the worst case, as you
can see, order of n squared, where n is the
375
00:58:05,790 --> 00:58:10,829
number of states which A* expands, that is, the set
of states that I was mentioning- all states
376
00:58:10,829 --> 00:58:17,700
with cost less than C*. This is going to be,
in the worst case, quadratic in time, as compared
377
00:58:17,700 --> 00:58:24,700
to A*. The time increase is only quadratic,
but the space is exponentially saved, because
378
00:58:28,240 --> 00:58:33,329
A* can grow in order of b to the power of
m, where b is the branching factor, but here
379
00:58:33,329 --> 00:58:40,329
you are doing in linear space, right? So,
it is asymptotically optimal.
380
00:58:41,980 --> 00:58:48,980
Okay. So, there are several extensions of
these basic memory-bounded search strategies.
381
00:58:53,599 --> 00:58:59,540
Maybe sometime later, in some later lectures,
I will just touch upon the other kinds of
382
00:58:59,540 --> 00:59:05,290
strategies that we have. But from the next
lecture onwards, we are going to move into
383
00:59:05,290 --> 00:59:12,290
problem reduction search and game trees.