1
00:00:50,550 --> 00:00:57,550
In today's class, we will start on game trees,
and we will see how we can model different
2
00:00:58,449 --> 00:01:04,790
game playing situations in terms of game trees.
We will study some very classical algorithms
3
00:01:04,790 --> 00:01:11,790
for searching in game trees, and finding out
when we are in a better position, and when
4
00:01:11,830 --> 00:01:14,680
we are in a worse position.
5
00:01:14,680 --> 00:01:21,680
So, on searching game trees: what,
essentially, are game trees? They are OR trees.
6
00:01:25,780 --> 00:01:31,880
We have already studied what AND/OR
graphs and AND/OR trees are. Game trees
7
00:01:31,880 --> 00:01:37,830
are a special type of OR tree, but there are
2 types of OR nodes. By OR tree, we mean that
8
00:01:37,830 --> 00:01:43,080
at a time, we will be selecting only one of
the successors of the OR node, but which one
9
00:01:43,080 --> 00:01:47,470
will be selected will depend on whether it
is a min node or a max node.
10
00:01:47,470 --> 00:01:54,470
I will shortly explain why we need this kind
of a representation. Briefly, in min
11
00:01:56,239 --> 00:02:01,660
nodes, we will select the minimum cost successor,
and in max nodes, we will select the maximum
12
00:02:01,660 --> 00:02:08,660
cost successor. Terminal nodes can be winning
or losing states, but it is often infeasible
13
00:02:11,860 --> 00:02:16,590
to search up to the terminal nodes. So, we
will use heuristic costs to compare non-terminal
14
00:02:16,590 --> 00:02:23,590
nodes. So, let me start by showing a simple
formulation of a game tree search problem.
15
00:02:26,260 --> 00:02:33,260
Let us take the example of tic-tac-toe.
In tic-tac-toe, what we have is something like
16
00:02:41,140 --> 00:02:48,140
this, right? And the players alternate in
putting either a circle or a cross, in any
17
00:02:48,480 --> 00:02:55,480
of these positions. Let us say, that we start
with a configuration, where we put x here,
18
00:03:00,810 --> 00:03:07,810
right? So, this is
the starting configuration,
19
00:03:11,560 --> 00:03:12,590
let us say.
20
00:03:12,590 --> 00:03:19,550
Now, it is the move of player b; so, when
it is the move of player b, then there are
21
00:03:19,550 --> 00:03:26,550
various different moves that player b can
take. Say, out of which, this is one,
22
00:03:31,640 --> 00:03:38,640
or this is one, and so on. Then, for each
of these cases, player a has a move. Here,
23
00:03:44,550 --> 00:03:51,550
let us say player b has a move. From this
point, in each of these states, player a has
24
00:03:53,160 --> 00:03:59,370
a move. Suppose player a makes a move here,
then player b can, again, take one of many
25
00:03:59,370 --> 00:04:05,750
different moves. Suppose this is the step,
and player b can give something like this,
26
00:04:05,750 --> 00:04:12,750
or player b can give something like this,
and so on, right? Then, you know that suppose
27
00:04:18,900 --> 00:04:24,970
this is the move, that player b has given,
then, again, player a will have turns, but
28
00:04:24,970 --> 00:04:30,310
if you look further below this, then you will
see that there is a way of playing, by
29
00:04:30,310 --> 00:04:32,940
which this fellow always wins.
30
00:04:32,940 --> 00:04:39,940
Because, if now, the only possible option
of this player is to put a dot here, and if
31
00:04:41,720 --> 00:04:46,150
that happens, then the other player can put
a cross here, and then there is no way of
32
00:04:46,150 --> 00:04:52,590
saving the game, right? If you look below
this sub-tree, then you will see that
33
00:04:52,590 --> 00:04:59,590
for every move that this player makes, the
other player has a winning strategy, right?
34
00:05:03,680 --> 00:05:10,680
Now, so, this is the structure of the game
tree. The game tree is a tree where every
35
00:05:10,939 --> 00:05:17,939
node of the tree, represents a state of the
game, and each node can be either a max node or
36
00:05:20,279 --> 00:05:26,389
a min node. I have not yet defined what I
mean by a max node or a min node, but let
37
00:05:26,389 --> 00:05:33,389
us say that one of these types of nodes is
for player a, and the other is for player
38
00:05:34,330 --> 00:05:41,330
b. So, depending on who has the move, we will
distinguish between 2 types of nodes, right?
39
00:05:44,999 --> 00:05:49,520
We can go right down
to the winning or
40
00:05:49,520 --> 00:05:54,199
losing combination, and then decide what path
to take.
41
00:05:54,199 --> 00:06:01,199
For example, if we start from one player,
we will indicate player a by square nodes
42
00:06:01,279 --> 00:06:08,279
and player b by round nodes. Initially, player
a has a move, and
43
00:06:09,080 --> 00:06:16,029
3 different moves are possible,
let us say. So, this is move 1, move 2, move
44
00:06:16,029 --> 00:06:23,029
3. Now, player b has a move, and let us say
player b has 2 moves here- maybe player b
45
00:06:24,740 --> 00:06:31,740
has just 1 option here, and has say 3 options
here, right? Then maybe, from here, player
46
00:06:36,930 --> 00:06:43,930
a can have 1 option, 2 options, etc. This
could be a winning option for player a, right?
47
00:06:52,550 --> 00:06:59,550
This may be a winning option for player b,
and then again, suppose this is the scenario:
48
00:07:20,669 --> 00:07:27,669
Let us see: we have a win here
for player a; we have win here, for player
49
00:07:41,360 --> 00:07:48,360
a, right? We have win here for player a, we
have win here for player a. Now, our objective
50
00:07:48,749 --> 00:07:55,749
is to determine, that at this point, what
move we should take, right? So, we will see,
51
00:07:58,490 --> 00:08:04,509
from here, that if this is a win node for
player a, and since the opponent does not
52
00:08:04,509 --> 00:08:11,509
have any alternative choice here, so, this
is a win node for player a as well, right?
53
00:08:12,159 --> 00:08:19,159
Then, here, because player a has a choice
between these 2, so, this is also win node
54
00:08:19,199 --> 00:08:22,919
for player a.
Because, if we arrive at this state, then
55
00:08:22,919 --> 00:08:29,919
player a will take this move, and be able
to win eventually, right?
56
00:08:31,599 --> 00:08:38,599
Then, this is a loss for
player a, this is also a loss for player a.
57
00:08:39,419 --> 00:08:46,419
Now, here, because it is the choice of the
opponent, so, therefore, at this point, it
58
00:08:49,029 --> 00:08:55,010
is loss for player a. Because if we arrive
at this state, then, the opponent is going
59
00:08:55,010 --> 00:09:02,010
to push in this direction, right? Out here,
it is win for player a, so out here also,
60
00:09:06,290 --> 00:09:13,290
it is win for player a, right? Now, let us
look at this- this is loss for player a; this
61
00:09:14,010 --> 00:09:21,010
is also loss for player a. This is loss for
player a, right? This is also loss for player
62
00:09:21,740 --> 00:09:28,740
a, so, this is also loss for player a, right?
Now, in this node, because it is opponent's
63
00:09:30,579 --> 00:09:37,069
move, so, opponent will always push us in
this direction, so, this is also loss for
64
00:09:37,069 --> 00:09:44,069
player a. So, that means- but, here, the option
is with player a, so player a can always choose
65
00:09:45,990 --> 00:09:49,069
here, so, it is win for player a.
66
00:09:49,069 --> 00:09:55,709
That means, overall in this situation, player
a has a winning strategy, and what is that
67
00:09:55,709 --> 00:10:02,709
winning strategy? It is to take this move,
right? If it takes any of the other moves,
68
00:10:04,569 --> 00:10:09,269
then player b will have a winning strategy.
If it takes this one, then player b will push
69
00:10:09,269 --> 00:10:14,930
us in this direction, and it will be win for
player b. If it takes this move, then again,
70
00:10:14,930 --> 00:10:21,930
player b will have a winning strategy by pushing
us in this direction, right? Is it clear?
71
00:10:28,380 --> 00:10:35,380
What we have done is, we have seen that we
can bottom up propagate this winning or losing,
72
00:10:39,880 --> 00:10:45,259
right up to the top, and then decide, that
which is the move that is to be taken at this
73
00:10:45,259 --> 00:10:47,880
point of time, right?
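
The bottom-up propagation of win and loss described above can be sketched in a few lines of Python. This is only an illustrative sketch: the tuple encoding of the tree, the function name `a_wins`, and the small example tree are assumptions made here, not the tree drawn in the lecture.

```python
# Hypothetical tree encoding (not from the lecture): a leaf is the string
# "win" or "loss" from player a's point of view; an internal node is a
# tuple (player, children), where player is "a" or "b".

def a_wins(node):
    """Return True if player a can force a win from this node."""
    if node == "win":
        return True
    if node == "loss":
        return False
    player, children = node
    results = [a_wins(child) for child in children]
    # At a's turn one winning move is enough; at b's turn the opponent
    # will push towards a loss for a if any such move exists.
    return any(results) if player == "a" else all(results)

# Made-up example: player a has 3 moves; player b then has 2, 1 and 3 options.
tree = ("a", [
    ("b", ["win", "loss"]),
    ("b", [("a", ["loss", "win"])]),
    ("b", ["loss", "loss", "win"]),
])

# Indices of the root moves that keep a winning strategy for player a.
best_moves = [i for i, child in enumerate(tree[1]) if a_wins(child)]
print(a_wins(tree), best_moves)
```

In this made-up tree only the middle move keeps a winning strategy for player a, which mirrors the situation in the lecture where taking either of the other moves hands player b a win.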
74
00:10:47,880 --> 00:10:53,490
The current state is the root of the tree,
and all other states are look-aheads: if I
75
00:10:53,490 --> 00:11:00,079
give this move, if he gives that move,
like this, okay? Now, typically, it is
76
00:11:00,079 --> 00:11:05,459
very difficult to do this, when the game state
space size is very large, like, you cannot
77
00:11:05,459 --> 00:11:10,230
do this kind of thing when we are dealing
with something like chess, but for tic-tac-toe,
78
00:11:10,230 --> 00:11:14,360
yes, because the tic-tac-toe state space is not
going to grow too big. So, if you write a
79
00:11:14,360 --> 00:11:19,120
program just like this, to go right down,
up to the winning or losing configurations,
80
00:11:19,120 --> 00:11:25,350
you can find out, at each point of time, what
is the best move that you have to give, and
81
00:11:25,350 --> 00:11:29,899
whether you have a winning
strategy or a losing strategy, right?
82
00:11:29,899 --> 00:11:34,470
But when you go for games like chess, then
you cannot do this, because the state space
83
00:11:34,470 --> 00:11:41,470
is just too big. So, what we will do is, we
will expand these moves up to a certain depth,
84
00:11:45,569 --> 00:11:51,190
and we will have some heuristic functions
to evaluate the position of the game after
85
00:11:51,190 --> 00:11:58,190
that many look-aheads. So, it means that-
in chess, let us say, I decide to do 6 move
86
00:11:58,750 --> 00:12:05,750
look ahead. So, I look ahead up to 6 moves,
and at that level, I analyze all the board
87
00:12:07,110 --> 00:12:13,579
positions, depending on what kind of positional
and tactical advantage I have. So, that is
88
00:12:13,579 --> 00:12:19,079
where a lot of knowledge will come in about
chess, and then I will evaluate those board
89
00:12:19,079 --> 00:12:24,129
positions and associate a heuristic cost function
with them, which will indicate the probability
90
00:12:24,129 --> 00:12:30,930
of my winning from that position, right? Probability,
or the amount of cost that I have to incur.
91
00:12:30,930 --> 00:12:37,930
And then, we will apply what is called a minimax
search, to determine what is the best move.
92
00:12:38,720 --> 00:12:45,720
For example, let us look at this game tree.
In this game tree, we have looked ahead, up
93
00:12:49,129 --> 00:12:56,129
to this many number of moves, and then we
have found out some cost criterion here, right?
94
00:12:56,490 --> 00:13:03,490
Now, what does this cost criterion mean?
These values tell me what is the cost
95
00:13:05,519 --> 00:13:12,519
that I have to incur if I have to win? So,
in other words, the lesser the cost, the more
96
00:13:15,699 --> 00:13:22,699
are my chances of winning; the more the cost,
the lesser is my chance of winning, right?
97
00:13:23,490 --> 00:13:30,410
So, we compute the heuristic function in that
way. We associate a cost, which tells me the
98
00:13:30,410 --> 00:13:37,410
badness of the state in which I end up. For
example, if I am here, then the badness is
99
00:13:37,949 --> 00:13:44,949
9, right? Badness or goodness, depending on
how I model, right? So, we will have 2 types
100
00:13:48,310 --> 00:13:55,310
of node here- one are the square nodes, which
we will call max nodes. At these nodes, we
101
00:13:56,100 --> 00:14:03,100
will try to maximize the cost, and the round
nodes are min nodes. In these nodes, we will
102
00:14:04,279 --> 00:14:11,279
try to minimize the cost. Now, let us understand-
why do I want to maximize or minimize?
103
00:14:11,879 --> 00:14:18,879
The kind of games that we are considering
here, are called zero sum games. The zero
104
00:14:29,420 --> 00:14:36,420
sum games means that both players are playing
for a share of some amount of cost. So, let
105
00:14:37,850 --> 00:14:44,850
us say that w is the amount of cost for which
they are playing, right? So, this cost is
106
00:14:46,879 --> 00:14:53,879
going to get distributed between player a
and player b. It means that if player a gets
107
00:14:54,019 --> 00:15:01,019
more, then player b gets less, and vice versa,
right? So, the zero sum games
108
00:15:04,680 --> 00:15:11,680
idea came from the notion that w is
zero; if somebody gets a positive cost, the
109
00:15:13,290 --> 00:15:18,069
other fellow will get a negative cost. If
somebody has a profit, the other person has
110
00:15:18,069 --> 00:15:25,069
a loss, right? And, to indicate this scenario,
that the advantage of one is the disadvantage
111
00:15:29,709 --> 00:15:33,779
of the other, is what we want to represent
by zero sum games.
112
00:15:33,779 --> 00:15:39,290
Now, this is not always
true in practice, because
113
00:15:39,290 --> 00:15:46,060
there are cases where the 2 competitors are
adversaries of each other, but the profit
114
00:15:46,060 --> 00:15:51,050
of one, does not necessarily mean the loss
of the other, right? But, at the same time,
115
00:15:51,050 --> 00:15:56,350
there is a large class of game playing situations,
where actually you are competing for the same
116
00:15:56,350 --> 00:16:03,350
resource. So, the win of one is the loss of
the other, right? For those kinds of scenarios,
117
00:16:04,269 --> 00:16:11,269
we will have 2 types of nodes- max nodes and
min nodes, right? Because the loss of one
118
00:16:12,389 --> 00:16:17,920
is the gain of the other, it means that if
I am trying to maximize my cost, my opponent
119
00:16:17,920 --> 00:16:20,600
is trying to minimize my cost.
120
00:16:20,600 --> 00:16:26,620
If I am trying to maximize my profit, then
the opponent strategy is to minimize my profit.
121
00:16:26,620 --> 00:16:33,620
And this will happen in zero sum games, understood?
Because my objective and that person's objectives,
122
00:16:34,180 --> 00:16:41,180
are exactly opposite of each other, right?
In that case, if I have my heuristic function,
123
00:16:42,939 --> 00:16:49,110
to compute my goodness- if I evaluate the
board positions, and I find that this
124
00:16:49,110 --> 00:16:56,110
has goodness of 100 for me, right, then
if the total, w, is 100, then it has goodness
125
00:16:57,529 --> 00:17:04,529
of zero for the other. So, if it is goodness
of 50 for me, then it is goodness of 50 for
126
00:17:05,189 --> 00:17:05,920
the other, right?
127
00:17:05,920 --> 00:17:12,500
So, as I am trying to maximize my profit,
my opponent will try to minimize that. If
128
00:17:12,500 --> 00:17:19,370
you look at this game playing situation here,
these represent my profit. I am
129
00:17:19,370 --> 00:17:26,370
player a, who has to make a move
here. This game tree has been formulated,
130
00:17:27,610 --> 00:17:34,610
based on the fact that I now have
the move. So, I am player a; I have the move,
131
00:17:35,730 --> 00:17:42,730
right? When I have the move, I will expand
the game tree up to these states, and
132
00:17:44,050 --> 00:17:50,260
I will evaluate each of these states, depending
on how much profit I get. So, these represent
133
00:17:50,260 --> 00:17:57,260
the amount of profit that I am getting, right?
Okay. Now, let us see what happens.
134
00:17:58,860 --> 00:18:04,880
Here, my opponent has a move; square nodes
mean my move, round nodes mean my opponent's
135
00:18:04,880 --> 00:18:11,810
move. So, here, the opponent has a move,
right? So, what is the opponent going to do here?
136
00:18:11,810 --> 00:18:18,810
He will minimize; so, he will pick up 10 here.
So, it means that if I reach this state of
137
00:18:18,840 --> 00:18:25,840
the game, then I cannot expect more than 10,
because the opponent will push me in this
138
00:18:26,220 --> 00:18:33,220
direction, right? Likewise, here also, this
is a min node, so I will get 9, right? This
139
00:18:36,340 --> 00:18:43,340
is also a min node, so I will have 14. This
is a min node, so I will have 13. Here, I
140
00:18:44,160 --> 00:18:51,160
will have 2; here, I will have 1; here, 3,
and 20, right?
141
00:18:53,500 --> 00:19:00,500
Now, if you look at this layer, this is again
a max node, which means that, here, I have
142
00:19:01,500 --> 00:19:08,500
my move. In that case, if I were in this situation,
I would know that going this way will give
143
00:19:08,920 --> 00:19:15,390
me 10; going this way will give me 9, so,
I will always select this direction, right?
144
00:19:15,390 --> 00:19:21,580
So, I will go for 10. Now, see, we are also
maintaining a mark here, we are maintaining
145
00:19:21,580 --> 00:19:28,580
a mark just like we were doing in AO*.
Again, out of these 2, because this is a max
146
00:19:29,240 --> 00:19:36,240
node, this is my move, so I will select this
one, right? And get 14. Here, I will be selecting
147
00:19:38,810 --> 00:19:45,810
this one and get 2, here I will be selecting
this move and I will get 20.
148
00:19:45,990 --> 00:19:52,540
Again, in the min level, if the opponent
is in this state, opponent will know that
149
00:19:52,540 --> 00:19:58,090
this way, I will get 14; this way, I will
get 10, and remember that the opponent's objective
150
00:19:58,090 --> 00:20:03,440
is exactly the opposite of mine, because it
is a zero sum game. So, therefore, the opponent
151
00:20:03,440 --> 00:20:10,440
will choose this and give me 10. Here also,
the opponent will choose this and give me
152
00:20:11,180 --> 00:20:18,180
2, right? And here, this is a max node, so
it is my choice, so, I will select this one
153
00:20:19,450 --> 00:20:26,450
and I will get 10. That means that from this
configuration, based on the heuristics that
154
00:20:28,080 --> 00:20:35,080
I have computed at this level, this is the
best move that I have. So, if I take
155
00:20:36,260 --> 00:20:41,000
this move, the opponent is going to give me
this; then, I will take this, then the opponent
156
00:20:41,000 --> 00:20:46,220
will give me this, so, that is what I will
get.
157
00:20:46,220 --> 00:20:51,580
Assuming that the opponent does not make a
mistake- if the opponent makes a mistake,
158
00:20:51,580 --> 00:20:58,580
then, actually, I gain more. Then, I will
have more profit. Because, just see- if the
159
00:20:59,070 --> 00:21:04,300
opponent, instead of going in this direction,
were to push me towards this direction,
160
00:21:04,300 --> 00:21:11,300
then I have a strategy of getting 14, right?
Because I will push it in this direction,
161
00:21:11,300 --> 00:21:14,990
and then, the opponent will have to give me
either this or this, so I will have at least
162
00:21:14,990 --> 00:21:21,200
14, right? So, at any point of time, what
I am trying to find out is, I am trying to
163
00:21:21,200 --> 00:21:28,200
maximize my gains, assuming that the opponent
is an adversary who will
164
00:21:28,440 --> 00:21:30,190
never make mistakes.
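
The backing-up procedure just walked through can be sketched as a short recursive function. This is a minimal sketch, not the lecture's own code: the nested-list encoding and the name `minimax` are assumptions, and the eight numbers are the values the lecture reads off at the lowest min layer (10, 9, 14, 13, 2, 1, 3, 20), treated here as leaves.

```python
# Hypothetical encoding (not from the lecture): a leaf is a number holding
# the heuristic profit for player a; an internal node is a list of children.
# Levels alternate between max nodes (my move) and min nodes (opponent's move).

def minimax(node, is_max):
    """Back up heuristic values, assuming the opponent never makes a mistake."""
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

# Root is a max node, its children are min nodes, their children are max nodes.
tree = [
    [[10, 9], [14, 13]],
    [[2, 1], [3, 20]],
]

root_value = minimax(tree, is_max=True)
# Backed-up value of each root move; the best move is the one with the maximum.
move_values = [minimax(child, is_max=False) for child in tree]
print(root_value, move_values)
```

The root value is the profit that is guaranteed if the opponent plays perfectly; if the opponent makes a mistake, the actual profit can only be higher.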
165
00:21:30,190 --> 00:21:36,070
And if the opponent makes mistakes, I will
get more. If he does not make a mistake, then
166
00:21:36,070 --> 00:21:40,780
at least this much I am guaranteed, right?
And this is a fair assumption, because we
167
00:21:40,780 --> 00:21:47,780
are talking about zero sum games, where my
gain is the opponent's loss. Now, there are
168
00:21:53,720 --> 00:21:58,370
some optimizations that we can do. We need
not actually develop the tree, right up to
169
00:21:58,370 --> 00:22:05,370
this point. We can actually do some kinds
of pruning, by going depth first in the tree.
170
00:22:06,540 --> 00:22:12,070
And this brings us to some pruning criteria;
these pruning criteria, and the algorithm called
171
00:22:12,070 --> 00:22:17,960
alpha beta pruning, were analyzed by Donald
Knuth many years back.
172
00:22:17,960 --> 00:22:24,160
We will study that algorithm and see how we
can avoid visiting the entire state space
173
00:22:24,160 --> 00:22:29,770
of the game tree. So, again, here, we will
assume that the square nodes are max nodes
174
00:22:29,770 --> 00:22:36,770
here, and the round nodes are min nodes. So,
let us look at this scenario. Here, we are
175
00:22:39,070 --> 00:22:46,070
going depth first, so, we find 10 here, right?
Then, we backtrack up to a, then we are visiting
176
00:22:49,490 --> 00:22:55,720
b, and let us say, that somewhere in this
tree, we get 14.
177
00:22:55,720 --> 00:23:02,360
So, if we get 14 backed up from here, then
the claim is that, I need not traverse down
178
00:23:02,360 --> 00:23:09,360
to c. And the reason is that: see, out here,
the min node already has a 10. So, if you
179
00:23:13,120 --> 00:23:20,120
get anything more than 10 on this arc, then
the min node is always going to go that way,
180
00:23:21,310 --> 00:23:28,310
because the min node will always minimize.
So, if ever the cost of b exceeds 14, if the
181
00:23:29,540 --> 00:23:36,540
cost of b exceeds 14, exceeds10, rather; if
the cost of b exceeds 10, then a will always
182
00:23:36,850 --> 00:23:43,850
select the arc towards this 10, right? Now,
let us see what has happened here in b: this
183
00:23:44,040 --> 00:23:50,150
is a max node, we already have 14,
and the cost of the maximum node can only
184
00:23:50,150 --> 00:23:56,580
grow further, because it is max of all its
successors. So, if you already have
185
00:23:56,580 --> 00:24:02,590
14, then there is no point visiting c,
because whatever you get from there, b will
186
00:24:02,590 --> 00:24:08,580
be at least 14, and a is never going to give
this move towards b- it is going to select
187
00:24:08,580 --> 00:24:11,920
the move in the other direction, right?
188
00:24:11,920 --> 00:24:18,080
So, we can prune now, the sub-tree rooted
at c, and not search that part of the tree
189
00:24:18,080 --> 00:24:25,080
at all. Right, is this clear? So, this is
a case of shallow pruning, where the pruning
190
00:24:29,960 --> 00:24:36,960
happens with respect to some parent, which
is close by, but we can also have deep pruning,
191
00:24:37,150 --> 00:24:44,150
like we have here. Here, my claim is that
the sub-tree rooted at e, can be pruned. Now,
192
00:24:49,650 --> 00:24:56,650
let us understand why. Here, at this max node,
we already have 10, right? We have 10. Now,
193
00:25:05,630 --> 00:25:12,630
out here, in this min node, we have already
got 5, right?
194
00:25:13,080 --> 00:25:20,080
Now, by that, can we say, okay, out here,
we have already got 5, right? Now, if you
195
00:25:26,740 --> 00:25:33,740
go here- further down here- then, what are
you going to get? You will get something,
196
00:25:35,760 --> 00:25:42,760
and this cost is going to be less than or equal
to 5, right? But, if this is less than or
197
00:25:42,890 --> 00:25:49,640
equal to 5, then it is of no interest for
f, because unless f is able to produce a cost
198
00:25:49,640 --> 00:25:56,270
larger than 10, the root is always going to
divert it to that side. Suppose, f is able
199
00:25:56,270 --> 00:26:03,270
to manage a cost which is less than 10, right?
Then, when we are in this min node, this min
200
00:26:03,850 --> 00:26:09,630
node will have a cost less than 10, because
if f is not able to get more than 10,
201
00:26:09,630 --> 00:26:13,580
then the min node will always minimize. And
so, it will always give you less than or equal
202
00:26:13,580 --> 00:26:14,990
to 10, right?
203
00:26:14,990 --> 00:26:21,990
If f is less than 10, then this is also less
than 10, right? But if f is not able to muster
204
00:26:22,050 --> 00:26:28,860
more than 10, then there is no point, because
the root is going to select this way. Now,
205
00:26:28,860 --> 00:26:35,430
please try to understand this: the root
is always going to select this arc, unless
206
00:26:35,430 --> 00:26:42,340
it finds more attractive options, this way,
and it can only find the more attractive option
207
00:26:42,340 --> 00:26:49,040
in this way, provided that the successors
of this min node are all able to give you
208
00:26:49,040 --> 00:26:56,040
more than 10. But, if now, f is unable to
give you more than 10, then there is no point
209
00:26:57,470 --> 00:27:04,470
in pursuing f any further. And now, what
we have found here is, that d already has 5,
210
00:27:04,940 --> 00:27:08,810
and it will only go further down, right?
211
00:27:08,810 --> 00:27:15,070
So, therefore, there is no point in checking
this, because this branch, at least,
212
00:27:15,070 --> 00:27:21,290
is not going to give f more than 10. If f
takes this branch, it is not going to get
213
00:27:21,290 --> 00:27:28,290
more than 10, right? So, we can prune it off
at e. Is this clear? What is the difference
214
00:27:41,990 --> 00:27:48,990
between- okay, when we are talking about deep
cutoff, we will now introduce a couple of
215
00:27:49,610 --> 00:27:56,230
bounds. We have to formalize this, right?
We will have to formalize when we
216
00:27:56,230 --> 00:28:02,480
do this cutoff. So, deep cutoff actually gives
us an idea- see here, this 10 is not
217
00:28:02,480 --> 00:28:09,480
at the immediate parent of d; it is at the parent's
parent's parent, and this can happen even
218
00:28:09,630 --> 00:28:16,630
if you drag d right down up to several more
levels of min and max. So, let us formalize
219
00:28:16,920 --> 00:28:22,080
that, then you will understand why this deep
cutoff is useful.
220
00:28:22,080 --> 00:28:29,080
So, we will consider 2 bounds on the states
of the game: one is the alpha bound. The alpha
221
00:28:32,910 --> 00:28:39,910
bound of j, is the maximum current value of
all max ancestors of j, and we will stop the
222
00:28:45,150 --> 00:28:52,150
exploration of a min node, when its value
equals or falls below alpha. So, let
223
00:28:54,530 --> 00:29:01,530
us look at a scenario here. I have this: so,
there are other states, I am just looking
224
00:29:09,310 --> 00:29:14,960
at one path, okay? These are the other paths,
which I have already visited, and these are
225
00:29:14,960 --> 00:29:21,430
the paths which I have not yet visited, right?
So, this is just one path of the game tree,
226
00:29:21,430 --> 00:29:28,430
right? So, in this way, let us say that this
is the min node j, right? Now, the alpha bound
227
00:29:50,170 --> 00:29:57,170
of j, or let us say alpha of j, is defined
as the current maximum value of all max ancestors
228
00:30:16,870 --> 00:30:23,870
of j. So, it is the values that are backed
up here. Suppose this has backed up something
229
00:30:30,950 --> 00:30:37,950
like 20, and this has backed up something
like 5; let us say, this has backed up 10,
230
00:30:40,040 --> 00:30:41,050
right? And so on.
231
00:30:41,050 --> 00:30:48,050
So, it is the current maximum among all these.
Now, whenever the value of this j- the current
232
00:30:52,500 --> 00:30:59,310
value of j- falls below this alpha, we will
not explore the remaining successors of j
233
00:30:59,310 --> 00:31:06,310
anymore. Now, let us understand why. Consider
that max ancestor, which has this value alpha
234
00:31:09,480 --> 00:31:16,480
j, right? So, in this case, let us say this
is the value. Now, the moment this fellow's
235
00:31:18,690 --> 00:31:25,690
value falls below that value, we know that
there is no point exploring this any further,
236
00:31:27,040 --> 00:31:34,040
because the game is never going to come to
this state. Because the max node here, has
237
00:31:35,360 --> 00:31:42,360
a strategy of taking you to a node, which
has at least 20- what do I mean by having
238
00:31:42,810 --> 00:31:44,430
a value 20, here?
239
00:31:44,430 --> 00:31:51,430
It means that from this node, I have already
found out some strategy of reaching cost of
240
00:31:52,410 --> 00:31:58,740
at least 20; if the opponent makes a mistake,
probably more; if the opponent does not make
241
00:31:58,740 --> 00:32:04,450
a mistake, then at least 20. So, I have a
strategy of going to a node which gives me
242
00:32:04,450 --> 00:32:11,450
at least 20. Now, here, I am finding that
when I am visiting j, the cost has already
243
00:32:12,630 --> 00:32:19,630
dipped below 20, and if I explore further,
it may dip further, or be at least, at most,
244
00:32:19,960 --> 00:32:26,960
20. It is not going to exceed 20, because
this is a min node. So, the node here is never
245
00:32:28,030 --> 00:32:35,030
going to try the strategy of coming to this
node, so the strategies for visiting this
246
00:32:35,330 --> 00:32:41,850
node are of no interest from this layer, because
there are better strategies already.
247
00:32:41,850 --> 00:32:48,650
So, this move itself will not be selected,
if it is the case that we have to end up here.
248
00:32:48,650 --> 00:32:52,510
If it is the case, that we have to end up
here, then out here, it will not select this
249
00:32:52,510 --> 00:32:59,510
node, right? Okay. So, we do not explore
this any further, but does it mean
250
00:33:01,330 --> 00:33:08,330
that we have already chosen this one? No,
because along the other path from these max
251
00:33:08,370 --> 00:33:14,350
ancestors, we might still get some cost, which
is larger than 20, right? So, we are just
252
00:33:14,350 --> 00:33:20,130
pruning off this min node; we are just pruning
off the min node here, we are not doing anything
253
00:33:20,130 --> 00:33:27,130
with respect to the other max ancestors of
this node. Is it clear? Right.
254
00:33:37,220 --> 00:33:44,220
So, in the min nodes, what we are checking
is, whether its current value has fallen below
255
00:33:45,040 --> 00:33:52,040
the value backed up in the max ancestor of
the node. If it has, then we do not further
256
00:33:53,320 --> 00:34:00,320
explore that min node, we just backtrack to
its previous max parent. And in a min node,
257
00:34:03,770 --> 00:34:10,770
we will update beta, because the beta
bound of j, is the minimum current value of
258
00:34:16,369 --> 00:34:23,369
all min ancestors of j. And by
a similar argument, we can say that exploration
259
00:34:23,540 --> 00:34:30,310
of a max node j is stopped, when its value
equals or exceeds beta. In a max node, we
260
00:34:30,310 --> 00:34:36,950
update alpha, because if you see, that the
value of beta is the minimum current value
261
00:34:36,950 --> 00:34:43,950
of all min ancestors of j. So, we will continuously
keep on updating beta, when we are in the
262
00:34:44,010 --> 00:34:51,010
min nodes, and we will keep on updating alpha,
as we are exploring a max node. Is this all
263
00:34:55,669 --> 00:34:56,490
right?
264
00:34:56,490 --> 00:35:03,490
Now, before we go into the algorithm, okay,
right? Now, when do we return,
265
00:35:04,390 --> 00:35:11,310
when do we backtrack? In both
min and max nodes, we will return when alpha
266
00:35:11,310 --> 00:35:17,730
is greater than or equal to beta. Now, see,
this criterion is actually encompassing these
267
00:35:17,730 --> 00:35:21,960
criteria, that exploration of a min node is
stopped when its value equals or
268
00:35:21,960 --> 00:35:27,720
falls below alpha, and exploration of a max node
is stopped when its value equals or exceeds
269
00:35:27,720 --> 00:35:34,720
beta, because you see the beta value that
we are maintaining here, is the minimum of
270
00:35:37,510 --> 00:35:44,280
the min node that we are currently
exploring, and all its min ancestors. Now,
271
00:35:44,280 --> 00:35:51,280
the very fact that we are still working with
this node, means that the previous beta value
272
00:35:52,060 --> 00:35:56,230
is not less than or equal to alpha.
273
00:35:56,230 --> 00:36:00,940
So, the previous beta value is still greater
than alpha; that is why we are still
274
00:36:00,940 --> 00:36:07,940
visiting this, right? And when the value
of this min node falls below alpha,
275
00:36:08,290 --> 00:36:15,060
that is where the beta value will also fall
below alpha, so that is where we will stop
276
00:36:15,060 --> 00:36:22,060
exploration. So, this is the criterion,
which covers both of these conditions, where
277
00:36:22,430 --> 00:36:29,430
in min or max nodes, we return when alpha
is greater than or equal to beta, right? Yes.
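The bounds and the combined criterion described above can be written compactly. This is only a restatement of what was said, with V(i) standing for the current value of node i; the set notation is an assumption introduced here for illustration.

```latex
\begin{align*}
\alpha(j) &= \max\{\, V(i) : i \text{ is } j \text{ or an ancestor of } j,\ i \text{ a max node} \,\} \\
\beta(j)  &= \min\{\, V(i) : i \text{ is } j \text{ or an ancestor of } j,\ i \text{ a min node} \,\} \\
\text{stop exploring } j &\iff \alpha(j) \ge \beta(j)
\end{align*}
```

At a min node only beta changes, so the test reads "the value has fallen to alpha or below"; at a max node only alpha changes, so it reads "the value has reached beta or above".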
278
00:36:29,980 --> 00:36:36,980
No no no. See, we have gone a long way from
that. We were talking about terminal nodes;
279
00:36:39,460 --> 00:36:43,960
when we were talking about small games, right,
that is where we can go right down, up to
280
00:36:43,960 --> 00:36:50,770
the visiting. Now, we are talking about heuristic
functions at the level where we are
281
00:36:50,770 --> 00:36:57,000
cutting off. Are you talking about them as
terminal nodes? Are you referring to the ground
282
00:36:57,000 --> 00:36:59,850
level nodes as the terminal nodes? Right.
Yes.
283
00:36:59,850 --> 00:37:06,850
So, when you are in the terminal nodes, that
is where we will- that is where the induction
284
00:37:09,490 --> 00:37:16,490
basis will come. So, as you go down
recursively, that is where you will reach
285
00:37:17,530 --> 00:37:24,530
the basis of the induction. You will start
assigning your initial values, at that point.
286
00:37:27,600 --> 00:37:34,600
We will come to the algorithm, then, I will
explain. Let us do one thing- let us work
287
00:37:35,970 --> 00:37:42,970
out the alpha beta pruning on
an example. So, let us say that this
288
00:37:47,280 --> 00:37:54,280
is the- initially, we do not have this graph;
initially, we do not have this tree. We are
289
00:37:55,740 --> 00:38:01,070
just developing it as we are going depth first.
In alpha beta pruning, we are going depth
290
00:38:01,070 --> 00:38:08,070
first, and we will do the pruning as
and when we require. We will start
291
00:38:09,110 --> 00:38:15,180
from here, okay? Then, we are
going depth first, so we will go this way,
292
00:38:15,180 --> 00:38:20,050
generate this state in this way, and this
way, and this way.
293
00:38:20,050 --> 00:38:27,050
So, let us say that we are working with a
depth look ahead of so many moves, right?
294
00:38:27,450 --> 00:38:34,270
When we have made so many moves, at this point,
we will evaluate the board position.
295
00:38:34,270 --> 00:38:41,270
At this position, let us say, we find that
the board position is 10, right? Then, we
296
00:38:43,570 --> 00:38:48,130
will backtrack. Now, this answers
your question. Then, we have reached the level
297
00:38:48,130 --> 00:38:55,130
of look aheads; we will backtrack, we will
backtrack with the value of 10, right? Now,
298
00:38:55,930 --> 00:39:02,930
the beta value of this node is 10, right?
The beta value here is 10. Then, I look at
299
00:39:06,440 --> 00:39:13,440
this state, and let us say I evaluate this,
and find 11. Then, what am I going
300
00:39:17,070 --> 00:39:24,070
to back up? Because this is a
min node, it is going to back up 10, right?
301
00:39:25,260 --> 00:39:32,260
So, here, the beta was
10; now, the alpha here becomes 10, right?
302
00:39:35,270 --> 00:39:42,270
Because alpha is the maximum of the current values of this max node
and its max ancestors. Then, we will go this
303
00:39:44,860 --> 00:39:51,860
way, right? Again, depth first, then we go
this way. Depth first, and let us say, at
304
00:39:53,740 --> 00:40:00,740
this point, we find 9. Now, note that here,
the beta value is now 9, and the beta value
305
00:40:06,790 --> 00:40:13,790
has fallen below the alpha value. So, we will
prune here; we will not visit this side anymore,
306
00:40:18,630 --> 00:40:25,630
and why? Let us see why we will not visit
here. This max node has already got 10, and
307
00:40:26,760 --> 00:40:33,760
in this min node, you have 9. And it may only
dip further, right? So, we will not- there
308
00:40:36,260 --> 00:40:42,030
is no point exploring this anymore, because
the max node is always going to go this way,
309
00:40:42,030 --> 00:40:46,770
because this is going to give you
9 or less, and this is already 10, so we will
310
00:40:46,770 --> 00:40:53,190
always take this link, clear? So, therefore,
this is pruned.
311
00:40:53,190 --> 00:40:59,040
Then, let us continue. We go back here, what
is this going to back up? It is going to back up
312
00:40:59,040 --> 00:41:06,040
10, right? Now, our alpha is 10, beta is 10,
right? Then, we go this way. Again, we will
313
00:41:11,240 --> 00:41:18,240
go depth first, so this way, this way, this
way, and we find 14 here, right? So, now,
314
00:41:24,180 --> 00:41:31,180
the beta value here is 14, alpha is 10, so,
we are still in business, right? We are still
315
00:41:32,180 --> 00:41:39,180
in business. Now,
when will we be out
316
00:41:41,530 --> 00:41:48,530
of business? If we find that this fellow's
cost also exceeds 10, because then, this node
317
00:41:49,570 --> 00:41:53,940
here is never going to push us in this direction.
It knows- if it pushes us in this direction,
318
00:41:53,940 --> 00:41:55,750
it will be trouble, right?
319
00:41:55,750 --> 00:42:02,750
So, let us see, so, we will still traverse,
we will take this one here, we find 15. Now,
320
00:42:05,590 --> 00:42:12,590
the moment we find 15 here, what do we have?
We will have 14 here, and that means that
321
00:42:21,250 --> 00:42:28,250
out here, the alpha value is 14, right? Alpha
is 14, beta is 10, so we have again reached
322
00:42:32,590 --> 00:42:39,590
the pruning criterion, and therefore, there
is no point in checking this. And why? Because
323
00:42:42,890 --> 00:42:49,890
if you see, that this max node has already
got 14, and this can only grow, right? And
324
00:42:50,110 --> 00:42:56,250
so, in this min node, we will never take
this way, because if we go this way, it is
325
00:42:56,250 --> 00:43:02,770
going to be at least 14. But if it goes this
way, then we know that it is at most 10.
326
00:43:02,770 --> 00:43:08,670
So, this fellow is trying to minimize, so
it will always push us
327
00:43:08,670 --> 00:43:15,670
in this direction, right? Therefore, what
do we back up here? 10, right? Again,
328
00:43:21,960 --> 00:43:28,960
we have alpha equal to 10, beta equal to 10,
right? Then, we go in this direction again-
329
00:43:30,710 --> 00:43:37,710
depth first, depth first, depth first, depth
first, right? Here, we find 5, right. Now,
330
00:43:44,150 --> 00:43:51,150
the beta value here is going to be 5. Now,
again, alpha is 10, beta is 5, so we have
331
00:43:55,550 --> 00:44:02,550
a pruning criterion, right? What is the pruning
criterion? Whenever we find alpha is greater
332
00:44:03,310 --> 00:44:06,450
than or equal to beta- this is the pruning
criterion.
333
00:44:06,450 --> 00:44:13,450
So, that has happened again. Alpha is 10,
beta is 5, so, we prune this. And what is
334
00:44:13,450 --> 00:44:19,980
the reason for that? Because we know that if
you come in this direction, it is not going
335
00:44:19,980 --> 00:44:25,720
to be for this node, because if you come here,
you are going to get 5 or less, and this fellow
336
00:44:25,720 --> 00:44:32,720
already has a strategy of getting 10. So,
even if he chooses this link, it is not
337
00:44:33,050 --> 00:44:38,840
going to be because of this node. Therefore,
there is no point visiting this node further,
338
00:44:38,840 --> 00:44:45,840
right? Whether this is 5 or 3 or 1 does
not matter; all of them are equally bad, because
339
00:44:46,360 --> 00:44:53,360
we already have 10, okay? Then, what is this
backing up from here? 5, right? Okay.
340
00:45:01,910 --> 00:45:08,910
Alpha here is still 10, because a max ancestor
here has 10, so alpha is still 10. And again,
341
00:45:13,220 --> 00:45:20,220
what do we have as the beta out here? 5, right?
We still do not have 5 here; 5 was here, but
342
00:45:23,940 --> 00:45:30,940
we have already backtracked from there, so,
it is only on the min ancestors, right? Okay.
343
00:45:31,980 --> 00:45:38,540
So, we will go in this direction, and we will
come here. See? There is still a possibility
344
00:45:38,540 --> 00:45:44,590
that you get something more than 10 from this
side; if you get something more than 10, then
345
00:45:44,590 --> 00:45:48,830
this still remains an attractive option. So,
if you can get more than 10 here, more than
346
00:45:48,830 --> 00:45:55,830
10 here then this link will become better.
So, we come here and we get 4, right?
347
00:45:56,330 --> 00:46:03,330
Again, we have beta equal to 4, alpha is 10;
pruning criterion applies- we prune here,
348
00:46:07,480 --> 00:46:14,480
clear? And then, we go back, so this is going
to back up 5, right? If this backs up 5, then
349
00:46:18,290 --> 00:46:24,930
here, we will have beta equal to 5. Again,
alpha is 10, beta is 5, so we can prune out
350
00:46:24,930 --> 00:46:31,610
this whole part, because this gives us, that
if you take this route, you cannot get more
351
00:46:31,610 --> 00:46:38,400
than 5, and if that happens, then
at this max node, we will never consider this
352
00:46:38,400 --> 00:46:44,840
move, because if we consider this move, the
opponent has a strategy of giving us 5 or
353
00:46:44,840 --> 00:46:50,210
less. But we already have a strategy of getting
10, so we will never choose this move. So,
354
00:46:50,210 --> 00:46:52,570
no point visiting this part at all.
355
00:46:52,570 --> 00:46:59,570
So, this whole sub-tree gets pruned, and we
finally come here with alpha equal to 10.
356
00:47:00,400 --> 00:47:05,990
And we will choose this one. The best move
at this point is this, which is what we also
357
00:47:05,990 --> 00:47:11,920
found out by doing that whole bottom up stuff,
but this alpha beta pruning helps us in getting
358
00:47:11,920 --> 00:47:18,920
rid of significant portions of the game tree.
Right. Any questions? Okay. Now, let us have
359
00:47:24,320 --> 00:47:26,630
a look at the algorithm.
360
00:47:26,630 --> 00:47:33,630
In this algorithm, if j is a terminal, then
we return V(j) equal to h(j). If we are at the
361
00:47:35,170 --> 00:47:42,170
leaf level, after the number of look
aheads we have done, we return V(j) equal
362
00:47:43,300 --> 00:47:50,300
to h(j). V(j) is the cost that we will eventually
return at the root. So, at a state j,
363
00:47:50,400 --> 00:47:57,360
V(j) is the best cost we can get, from the point
of view of player A. So, all costs are with
364
00:47:57,360 --> 00:48:04,360
respect to player A. If j is a max node, then
for each successor jk of j, in succession,
365
00:48:08,030 --> 00:48:15,030
we set alpha as the maximum of the existing
alpha, which it has inherited from its parent,
366
00:48:17,600 --> 00:48:22,300
see, this is a recursive procedure. So, the
alpha that you are getting, is coming from
367
00:48:22,300 --> 00:48:29,050
your parent, right? And what is the definition
of the alpha value? It is the maximum value
368
00:48:29,050 --> 00:48:36,050
that is among all the max nodes, including
this current max node and its max ancestors,
369
00:48:36,680 --> 00:48:37,450
right?
370
00:48:37,450 --> 00:48:44,450
So, it is the maximum of the existing alpha,
and the value that I obtain by recursively
371
00:48:46,430 --> 00:48:53,390
calling V(jk) with the alpha and beta bounds, right?
These alpha and beta bounds are
372
00:48:53,390 --> 00:48:58,670
coming from the ancestors, so we can recursively
call this procedure with the alpha and beta
373
00:48:58,670 --> 00:49:05,670
bounds being passed down, right? We are passing
the alpha and beta bounds down into the recursion.
374
00:49:07,660 --> 00:49:11,160
And then, we apply the pruning criterion,
that if alpha is greater than or equal to
375
00:49:11,160 --> 00:49:18,160
beta, then, we return beta, otherwise, we
continue. And then, if we have finished with
376
00:49:23,930 --> 00:49:30,930
all the successors, then we have not yet pruned
anything, so we have visited all the successors.
377
00:49:32,160 --> 00:49:39,160
Then the best cost that I have in the max
node is the alpha cost, so we return alpha,
378
00:49:39,660 --> 00:49:46,660
right, and we do exactly complementary stuff
for the min node. So, for each successor jk
379
00:49:46,730 --> 00:49:53,730
of j, in succession, set beta equal to the minimum
of beta and V(jk, alpha, beta). In the min nodes,
380
00:49:53,930 --> 00:49:57,410
we are maintaining the value of beta, and
again, we check if alpha is greater than or
381
00:49:57,410 --> 00:50:04,410
equal to beta; if so, we return alpha, otherwise,
we continue this loop, and if you have finished
382
00:50:04,410 --> 00:50:11,410
this loop with all successors, then return
beta, right? So, what I would ask you to do
383
00:50:13,440 --> 00:50:18,460
is, you can either write a small piece of
code to check this out on some game like tictactoe
384
00:50:18,460 --> 00:50:25,460
or any other game that you can think of, and
also, work through this algorithm by hand on one game
385
00:50:28,460 --> 00:50:35,460
tree, right, so that you have confidence in
what is meant by this alpha beta pruning procedure,
386
00:50:35,670 --> 00:50:38,580
okay? Okay.
387
00:50:38,580 --> 00:50:45,580
Now, with that, we come to the end of the
lecture on game trees. Do you have any questions?
388
00:50:49,520 --> 00:50:56,520
If the opponent makes a mistake, you may go
to those parts which are pruned, but in those
389
00:51:10,780 --> 00:51:17,780
cases, you will definitely get a profit more
than what you normally get. No, so you will
390
00:51:19,720 --> 00:51:26,050
again- see, that is the next move. See, what
are we doing this analysis for? To determine
391
00:51:26,050 --> 00:51:33,050
what move we will make now, right? No, no,
no, we will not use those paths, because we
392
00:51:34,250 --> 00:51:40,950
will assume that the opponent is intelligent;
the opponent will not make mistakes, right?
393
00:51:40,950 --> 00:51:47,950
We are taking into consideration the worst
case scenario with respect to the opponent.
394
00:51:48,220 --> 00:51:54,500
We will choose our move based on that, right?
Now, if the opponent makes the wrong move,
395
00:51:54,500 --> 00:52:01,500
or the move which I have not expected, then
what we can do is, we will again be
396
00:52:03,850 --> 00:52:10,850
in a max node after 2 levels of moves. From
there, we will again expand out the game tree
397
00:52:12,770 --> 00:52:19,270
and do the analysis again, right? And then,
determine which would be our best move there,
398
00:52:19,270 --> 00:52:25,580
but what we are assured is that, the best
move that we get from there, is going to be
399
00:52:25,580 --> 00:52:31,420
at least as good as what I originally had
planned for. That is because the opponent
400
00:52:31,420 --> 00:52:38,420
has made a mistake, so my min max value can
only be more than what I have obtained here,
401
00:52:42,490 --> 00:52:49,490
isn't it?
402
00:52:50,850 --> 00:52:57,850
No, in the previous case means, when we were
not pruning, yes. So, in that case, what we
403
00:53:05,710 --> 00:53:11,869
were doing is, we were looking ahead right
up to the leaf level, which is the number
404
00:53:11,869 --> 00:53:18,869
of lookaheads that we are doing. Yes. No,
no, no. I think I must clarify that- this
405
00:53:28,010 --> 00:53:35,010
is the current decision that I am taking.
If the opponent puts me here, I will again
406
00:53:35,130 --> 00:53:42,130
do that many more look aheads, right? See,
why I am restricting the number of look aheads-
407
00:53:43,230 --> 00:53:50,230
because of the computational complexity involved,
because the state space grows very fast. Particularly
408
00:53:50,770 --> 00:53:57,770
in games like chess, the branching is very
large, so the state space will really blow
409
00:53:57,970 --> 00:54:04,970
up. If you look, in- 15, 16 moves; I mean,
beyond that it is- even 15, 16 moves is phenomenal,
410
00:54:06,790 --> 00:54:07,300
right?
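To get a feel for how fast the state space grows with the number of look-aheads, here is a rough Python sketch. It assumes a uniform branching factor, so a look-ahead of d moves has b to the power d leaf positions; the figure of 35 moves per chess position is an assumed ballpark used only for illustration, and `leaf_count` is a hypothetical helper.

```python
def leaf_count(branching, depth):
    """Leaf positions in a uniform game tree: branching ** depth."""
    return branching ** depth

# Assumed branching factor of 35; real positions vary widely.
for depth in (2, 4, 8, 16):
    print(depth, leaf_count(35, depth))
```

Each extra level multiplies the work by the branching factor, which is why the look-ahead depth has to be capped and why pruning matters.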
411
00:54:07,300 --> 00:54:14,300
So, you cannot really afford to go much further
below; that is why we look ahead
412
00:54:14,400 --> 00:54:21,010
only that many moves. But when I have
already played a couple of moves- I have taken
413
00:54:21,010 --> 00:54:26,359
my move, the opponent has taken the move-
I will again do an analysis of them. Now,
414
00:54:26,359 --> 00:54:30,760
if you look at the expert level at which you
are playing, with the chess playing program,
415
00:54:30,760 --> 00:54:34,850
that actually, determines the number of moves,
lookaheads, that it is trying. So, if you
416
00:54:34,850 --> 00:54:39,740
look at a greater expert level, then it is
using more knowledge, and it is using more
417
00:54:39,740 --> 00:54:46,740
look aheads, right? It will take more time
also. So, in tictactoe, we do not normally
418
00:54:55,090 --> 00:55:00,850
do that, because it is a small game, so you
can actually expand it right up to the winning
419
00:55:00,850 --> 00:55:03,400
or losing configurations, right?
420
00:55:03,400 --> 00:55:08,710
But in chess, that is a million dollar question-
that is, determining the heuristic function
421
00:55:08,710 --> 00:55:13,600
is the key factor here. So, they do
a lot of things- they keep history of different
422
00:55:13,600 --> 00:55:20,280
games and find out, that from this position,
how many winning scenarios were
423
00:55:20,280 --> 00:55:26,200
there? They try to follow also, some of those
existing patterns, or they can actually evaluate.
424
00:55:26,200 --> 00:55:32,280
Then, say, a simple thing could be, the number
of pieces that you have, right? The number
425
00:55:32,280 --> 00:55:39,280
of pieces that you have in your board, is
one of the issues, right? Actually, what happens
426
00:55:40,420 --> 00:55:47,200
is, this is not a single objective scenario.
In a chess playing scenario, there can
427
00:55:47,200 --> 00:55:52,140
be multiple criteria, which determine the
tactical and positional advantage that you
428
00:55:52,140 --> 00:55:59,140
have, right? So, there is some work on that;
actually, some of my own work is on that-
429
00:55:59,820 --> 00:56:04,880
on how to extend multi-criteria searching
algorithms to game playing situations. So,
430
00:56:04,880 --> 00:56:11,880
I can give you the citation, if you want to
take a look later on.
431
00:56:25,599 --> 00:56:32,599
Right, so, we will now move into a completely
new topic. I think we have done a lot of search.
432
00:56:37,230 --> 00:56:42,630
Now, we will move into a chapter on knowledge
based systems: logic and deduction. This is
433
00:56:42,630 --> 00:56:48,320
a new thing, which we will start working at,
from this thing. And we will see, that it
434
00:56:48,320 --> 00:56:55,320
is the combination of search and deduction,
which will make up for almost the majority
435
00:56:55,970 --> 00:57:02,970
of the different kinds of activities or computational
requirements that we have in earth.
436
00:57:03,740 --> 00:57:10,740
So, in this, we are going to study a bunch of
different techniques for representing problems
437
00:57:13,010 --> 00:57:20,010
in logic, and also techniques for deducing
new facts and new rules, from existing ones.
438
00:57:21,830 --> 00:57:28,830
So, the important things are representation
of real world scenarios and reasoning with
439
00:57:31,820 --> 00:57:38,820
logic. We will study, firstly, to start with
a logic called propositional logic; then,
440
00:57:40,760 --> 00:57:47,760
we will add further things into propositional
logic, namely quantifiers, to get into first
441
00:57:49,800 --> 00:57:55,820
order logic; then, we will study inference
mechanisms in propositional logic, as well
442
00:57:55,820 --> 00:58:02,820
as in first order logic. And we will see that
depending on the expressibility of your logic,
443
00:58:03,420 --> 00:58:08,150
the cost that we have to pay is in terms of
the computational complexity of reasoning
444
00:58:08,150 --> 00:58:15,150
in those logics. For example, while finding
out the satisfiability of a propositional logic
445
00:58:17,280 --> 00:58:24,280
formula will be shown to be NP-complete,
the question of satisfiability
446
00:58:25,470 --> 00:58:30,830
in first order logic, or the validity of a
formula in first order logic, will be semi-decidable.
447
00:58:30,830 --> 00:58:37,440
There will be cases where the problem will
be unsolvable, like you have studied the halting
448
00:58:37,440 --> 00:58:42,200
problem of Turing machines; there was another
problem that you came across which is unsolvable,
449
00:58:42,200 --> 00:58:48,109
right? We will see- the inference in first
order logic is also semi-decidable, in the
450
00:58:48,109 --> 00:58:55,109
sense that, in one direction, we can say yes,
but in the other direction, the no answer can be
451
00:58:55,820 --> 00:59:01,700
undecidable, right? So, we will look into
those things. We will study some classical
452
00:59:01,700 --> 00:59:07,260
techniques called generalized modus ponens,
for reasoning in first order logic. We will
453
00:59:07,260 --> 00:59:13,840
study some kind of forward and backward chaining
mechanisms for inferencing; these are popularly
454
00:59:13,840 --> 00:59:20,840
used in all the deduction systems. And we
will also study a technique called resolution,
455
00:59:22,480 --> 00:59:26,010
and give you an overview of logical reasoning
systems.