1
00:00:13,559 --> 00:00:19,270
Dominating Set and Distributed Algorithms.
Recap of the previous lecture: we have discussed
2
00:00:19,270 --> 00:00:26,810
Hamiltonian graphs, the Travelling Salesman Problem,
and NP-completeness. Content of this lecture:
3
00:00:26,810 --> 00:00:34,230
we will discuss dominating sets, connected
dominating sets, and distributed algorithms
4
00:00:34,230 --> 00:00:40,690
on graphs.
Preliminaries: given a graph G, two vertices are
5
00:00:40,690 --> 00:00:49,880
independent if they are not neighbors. For
any vertex v, a set of independent neighbors
6
00:00:49,880 --> 00:00:57,690
of v is a subset of v's neighbors such that
any 2 vertices in this subset are independent.
7
00:00:57,690 --> 00:01:06,530
For example, let us say this is the vertex
v; it has the neighbors, let us say, u and w.
8
00:01:06,530 --> 00:01:21,329
So, u and w they are independent neighbors,
why because there is no edge connection between
9
00:01:21,329 --> 00:01:27,840
them and they are independent. So, hence for
any vertex v the set of independent neighbors
10
00:01:27,840 --> 00:01:39,020
is a subset of v's neighbors such that
any 2 vertices in this subset are independent;
11
00:01:39,020 --> 00:01:56,569
u and w both belong to the neighbors of v.
Now, let us see another definition: an independent
12
00:01:56,569 --> 00:02:13,170
set S of G is a subset of the set of vertices
such that all vertices u and v in S
13
00:02:13,170 --> 00:02:20,980
do not have any edge between them; that means,
such that let us say u and v they are in S
14
00:02:20,980 --> 00:02:29,150
and also u and v they do not have an edge
in G.
15
00:02:29,150 --> 00:02:38,150
Then that set of vertices is called an
independent set. S is a maximal independent set
16
00:02:38,150 --> 00:02:52,129
if every vertex not in S has a neighbor in S.
So; that means, a maximal independent
17
00:02:52,129 --> 00:03:05,579
set means: an independent set S is a maximal
independent set when the nodes which are
18
00:03:05,579 --> 00:03:30,640
in V minus S each have a neighbor in S. For
example, let us take a simple graph if this
19
00:03:30,640 --> 00:03:38,969
is the graph, then let us take the independent
set which is shown by the green colored vertices
20
00:03:38,969 --> 00:03:46,959
.
So, together these 2 vertices are maximal
21
00:03:46,959 --> 00:03:54,810
independent set why because the nodes which
are not green or which are not in the maximal
22
00:03:54,810 --> 00:04:00,780
independent set are having the neighbor in
this particular set. For example, this node
23
00:04:00,780 --> 00:04:05,709
is a neighbor, this node which is not in S
is also a neighbor, this node is the neighbor
24
00:04:05,709 --> 00:04:13,459
of both the elements in the independent set.
So, this also should be
25
00:04:13,459 --> 00:04:19,780
in the independent set. So, this node also
is a neighbor and this particular node is
26
00:04:19,780 --> 00:04:30,000
also a neighbor and this node is also a neighbor.
So, this set of 3 vertices is an MIS. Now
27
00:04:30,000 --> 00:04:37,800
this particular node if it is present then
they are not independent let us not take this
28
00:04:37,800 --> 00:04:44,100
particular node let us take some other node
in the independent set .
29
00:04:44,100 --> 00:04:51,199
Let us take this particular node in the independent set.
So, all the green color nodes are in the independent
30
00:04:51,199 --> 00:04:58,630
set, because they do not have any edge between
them and all other nodes all other nodes are
31
00:04:58,630 --> 00:05:06,590
basically the neighbor of the other set. Hence
this is an independent set example. So, we
32
00:05:06,590 --> 00:05:14,699
have shown the independent set example.
Now, a dominating set D of G is a subset
33
00:05:14,699 --> 00:05:23,630
of V such that any node not in D has at least
one neighbor in D and let us consider the
34
00:05:23,630 --> 00:05:54,669
same independent set example: if we
construct it, then this particular set of
35
00:05:54,669 --> 00:06:04,990
3 nodes is a dominating set, why because
all other nodes which are not in D that is
36
00:06:04,990 --> 00:06:11,990
this is a neighbor of at least one node in
D; similarly, this node also is a neighbor
37
00:06:11,990 --> 00:06:19,470
of at least one node in D this is also a neighbor
of at least one node in D so, also this node
38
00:06:19,470 --> 00:06:36,070
and this node. Therefore, D is the dominating
set and this D is a subset of the vertices
39
00:06:36,070 --> 00:06:56,650
of G such that each of the nodes in V minus
D has a neighbor in D.
40
00:06:56,650 --> 00:07:06,610
Now, if the subgraph induced by D
is connected, then D is called
41
00:07:06,610 --> 00:07:15,080
a connected dominating set. For example,
here the subgraph induced by D is also connected;
42
00:07:15,080 --> 00:07:23,349
therefore; it is a connected dominating set
.
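To make the three definitions above concrete, here is a minimal sketch in Python. It assumes the graph is given as a dictionary mapping each vertex to the set of its neighbors; the function names are only illustrative and do not come from the lecture.

```python
from itertools import combinations


def is_independent(adj, s):
    """True if no two vertices of s are joined by an edge."""
    return all(v not in adj[u] for u, v in combinations(set(s), 2))


def is_dominating(adj, d):
    """True if every vertex is in d or has at least one neighbor in d."""
    d = set(d)
    return all(v in d or bool(adj[v] & d) for v in adj)


def is_maximal_independent(adj, s):
    """An independent set is maximal exactly when it is also dominating."""
    return is_independent(adj, s) and is_dominating(adj, s)


def is_connected_dominating(adj, d):
    """True if d is dominating and the subgraph induced by d is connected."""
    d = set(d)
    if not d or not is_dominating(adj, d):
        return False
    start = next(iter(d))
    seen, stack = {start}, [start]
    while stack:  # depth-first search restricted to vertices of d
        u = stack.pop()
        for v in adj[u] & d:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == d
```

On the path 0-1-2-3-4, for instance, the set {0, 2, 4} is a maximal independent set and a dominating set but not a connected dominating set, while {1, 2, 3} is a connected dominating set.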
43
00:07:23,349 --> 00:07:30,699
Now, among all CDSs, that is, connected dominating sets,
of a graph G, the one with the minimum cardinality
44
00:07:30,699 --> 00:07:39,669
is called a minimum connected dominating set
(MCDS). Now, finding a minimum connected dominating
45
00:07:39,669 --> 00:07:53,199
set in a given graph requires exploring a
large solution space; hence the problem
46
00:07:53,199 --> 00:08:02,220
of finding a minimum connected
dominating set is basically NP-hard. If, let
47
00:08:02,220 --> 00:08:07,139
us say an algorithm is developed to find out
a minimum connected dominating set, it is solving
48
00:08:07,139 --> 00:08:15,370
an NP-hard problem.
Now, let us take an application of dominating
49
00:08:15,370 --> 00:08:22,289
set and that application we can find in the
wireless communication and also the wireless
50
00:08:22,289 --> 00:08:29,949
ad hoc networks. Let us see how
dominating sets in graphs are used in constructing
51
00:08:29,949 --> 00:08:37,310
the algorithms for wireless networks.
So, the idea of virtual backbone routing for
52
00:08:37,310 --> 00:08:43,669
ad hoc wireless network is to operate routing
protocol over a virtual backbone, because
53
00:08:43,669 --> 00:08:57,370
there is no infrastructure in an ad hoc network
. So, if the routing is to be done in
54
00:08:57,370 --> 00:09:10,980
a systematic manner without too much communication,
then we need to create a virtual backbone
55
00:09:10,980 --> 00:09:16,060
and operate the routing protocol over it that
is the method.
56
00:09:16,060 --> 00:09:24,470
Now, here one purpose of virtual backbone
based routing is to alleviate the problem,
57
00:09:24,470 --> 00:09:31,279
which is called a broadcast storm problem
that will arise if the destination is not
58
00:09:31,279 --> 00:09:37,480
known. So, the source who wants to send the
information has to depend upon the flooding
59
00:09:37,480 --> 00:09:42,990
and the flooding will create the serious problem,
which is called a broadcast storm problem.
60
00:09:42,990 --> 00:09:52,060
Thus the construction of a virtual backbone
is very important in the wireless networks
61
00:09:52,060 --> 00:09:56,529
to operate the routing protocol without suffering
from a broadcast storm problem.
62
00:09:56,529 --> 00:10:07,310
So, in this lecture we will focus on the virtual
backbone creation, which is nothing, but is
63
00:10:07,310 --> 00:10:15,860
approximated by a minimum connected dominating
set and the kind of graph, which we will assume
64
00:10:15,860 --> 00:10:21,800
for ad hoc wireless networks is called a unit
disk graph, which we will explain. Now we will
65
00:10:21,800 --> 00:10:28,910
take up an algorithm to solve the minimum
connected dominating set which will be used
66
00:10:28,910 --> 00:10:38,640
for construction of a virtual backbone . Now
solving this particular problem exactly through an algorithm
67
00:10:38,640 --> 00:10:44,170
is an NP-hard problem. So, we will look
upon the different algorithms, which are the
68
00:10:44,170 --> 00:10:52,430
approximation algorithms we will look into
that particular algorithm to see how the algorithms
69
00:10:52,430 --> 00:10:58,700
are designed? And these algorithms are distributed
algorithms why because the nature of the ad
70
00:10:58,700 --> 00:11:05,110
hoc network is large number of nodes which
are deployed and it cannot be done in a centralized
71
00:11:05,110 --> 00:11:08,690
manner. So, the algorithm has to be a distributed
algorithm.
72
00:11:08,690 --> 00:11:15,940
So, the distributed approximation algorithm
we will discuss and the performance ratio
73
00:11:15,940 --> 00:11:24,760
is 8, and that performance ratio is
the guarantee of the algorithm: by how
74
00:11:24,760 --> 00:11:32,410
much it can deviate from the optimum size
of the minimum connected dominating set.
75
00:11:32,410 --> 00:11:39,130
So, let us see that a sensor network
is also a wireless
76
00:11:39,130 --> 00:11:44,680
network and we can view it as an ad hoc network.
So, ad hoc wireless and sensor networks have
77
00:11:44,680 --> 00:11:50,220
applications in emergency search and rescue
operations, decision making in a battlefield,
78
00:11:50,220 --> 00:11:54,180
data acquisition operations in inhospitable
terrains, etcetera.
79
00:11:54,180 --> 00:12:02,850
It is featured by a dynamic topology here
in such applications there is no possibility
80
00:12:02,850 --> 00:12:08,040
of establishing the infrastructure and operate
the wireless communications, hence it is a
81
00:12:08,040 --> 00:12:15,029
dynamic topology where in the infrastructures
are dynamically created on the fly on demand.
82
00:12:15,029 --> 00:12:23,300
Now it is also characterized by multi-hop communication
and it also has limited resources; that
83
00:12:23,300 --> 00:12:30,790
is, the bandwidth, CPU, and battery are all limited
in nature and also such networks are having
84
00:12:30,790 --> 00:12:36,470
a limited security.
So, these characteristics pose special
85
00:12:36,470 --> 00:12:44,540
challenges in routing protocol design and
hence they are inspired to create a virtual
86
00:12:44,540 --> 00:12:49,820
backbone why because there is no possibility
to create a physical backbone, but only the
87
00:12:49,820 --> 00:12:59,070
inspiration of a physical-like backbone,
as we have seen in the wired network, is
88
00:12:59,070 --> 00:13:03,690
very much required to be created.
So, many researchers they have proposed the
89
00:13:03,690 --> 00:13:10,889
concept of a virtual backbone and that particular
virtual backbone will be used for different
90
00:13:10,889 --> 00:13:17,709
kind of communication. For example, unicast
communication, multicast, broadcast kind of
91
00:13:17,709 --> 00:13:23,029
communication in ad hoc wireless networks.
The virtual backbone is mainly used to collect
92
00:13:23,029 --> 00:13:30,540
the topology information for route detection.
It also works as a backup when the
93
00:13:30,540 --> 00:13:38,990
route is unavailable temporarily. An effective
approach based on overlaying a virtual infrastructure,
94
00:13:38,990 --> 00:13:47,760
which is also called a core, on an ad hoc
network has become popular.
95
00:13:47,760 --> 00:13:54,070
So, routing protocols are operated over the
core; route request packets are unicast
96
00:13:54,070 --> 00:14:03,820
to the core nodes, and a small subset of non-core
nodes is also used in this broadcasting
97
00:14:03,820 --> 00:14:10,000
or a routing of the information. Here no broadcast
is involved and only the nodes in the core
98
00:14:10,000 --> 00:14:20,480
they are involved in the communication.
The classification of routing protocols we
99
00:14:20,480 --> 00:14:26,899
can do in 2 categories: proactive and reactive
routing protocols. Proactive routing protocols
100
00:14:26,899 --> 00:14:32,130
ask each host, or many hosts, to maintain
global topology information; thus a route
101
00:14:32,130 --> 00:14:38,600
can be provided immediately whenever it is
required, but large amount of control messages
102
00:14:38,600 --> 00:14:45,920
are required to keep each host updated for
the newest topology changes in proactive routing
103
00:14:45,920 --> 00:14:52,350
protocols.
Now, the scenario where there is a scarcity
104
00:14:52,350 --> 00:14:58,079
of the resources and also there is a change
in the infrastructure there is no infrastructure
105
00:14:58,079 --> 00:15:04,380
. In those scenarios like wireless ad hoc
networks proactive routing protocol becomes
106
00:15:04,380 --> 00:15:10,889
a costlier affair to maintain the information
by sending all the time information, whenever
107
00:15:10,889 --> 00:15:16,480
there is a little change. Another
way of handling routing
108
00:15:16,480 --> 00:15:23,870
is called reactive protocols; reactive
protocols have a feature which is called
109
00:15:23,870 --> 00:15:29,790
on demand. So, a host computes a route for
a specific destination only whenever it is
110
00:15:29,790 --> 00:15:34,199
necessary and hence it is called reactive
routing protocols.
111
00:15:34,199 --> 00:15:40,730
So, topology changes which do not influence
the active routes do not trigger any route
112
00:15:40,730 --> 00:15:45,660
maintenance function thus the communication
overhead is lower compared to the proactive
113
00:15:45,660 --> 00:15:49,399
routing protocols.
So, on-demand routing protocols attract much
114
00:15:49,399 --> 00:15:59,180
attention due to their scalability and lower
protocol overhead, but most of them use
115
00:15:59,180 --> 00:16:05,319
flooding for the route discovery why because
the destination is not known. So, they have
116
00:16:05,319 --> 00:16:11,949
to basically depend upon the flooding for
the route discovery, flooding will have the
117
00:16:11,949 --> 00:16:15,730
disadvantage that it suffers from the broadcast
storm problem.
118
00:16:15,730 --> 00:16:19,920
So, broadcast storm problem refers to the
fact that flooding may result in excessive
119
00:16:19,920 --> 00:16:27,769
redundancy, contention, and collision. This
causes high protocol overhead and interference
120
00:16:27,769 --> 00:16:34,880
to the ongoing communication sessions. On
the other hand the unreliability of a broadcast
121
00:16:34,880 --> 00:16:40,279
may obstruct the detection of the shortest
path, or may simply not detect any path at
122
00:16:40,279 --> 00:16:45,889
all even though there exists one.
Now, in this lecture we will study the problem
123
00:16:45,889 --> 00:16:52,259
of efficiently constructing the virtual backbone
for ad hoc wireless networks, the number of
124
00:16:52,259 --> 00:16:59,670
hosts forming the virtual backbone must be
as small as possible, to decrease the protocol
125
00:16:59,670 --> 00:17:07,921
overhead. Hence the core has to be of the
smallest possible size. So, the algorithm
126
00:17:07,921 --> 00:17:16,600
must be also efficient due to the resource
scarcity therefore, we must basically model
127
00:17:16,600 --> 00:17:25,290
this virtual backbone as the connected dominating
set to approximate the virtual backbone for
128
00:17:25,290 --> 00:17:32,800
the wireless network.
So, we will see how we can construct the connected
129
00:17:32,800 --> 00:17:39,020
dominating set using the distributed algorithm,
which will be useful to create the virtual
130
00:17:39,020 --> 00:17:46,570
backbone for such applications.
So, let us take the assumptions. So, we assume
131
00:17:46,570 --> 00:17:56,520
that the given ad hoc network instance will
contain n hosts. Now each host is on the ground
132
00:17:56,520 --> 00:18:06,970
; that means, these hosts can be dropped from
an unmanned aircraft and when it reaches the
133
00:18:06,970 --> 00:18:13,410
ground, it has an antenna mounted on
it, that is, an omnidirectional
134
00:18:13,410 --> 00:18:17,850
antenna which will be useful to communicate
and establish the network.
135
00:18:17,850 --> 00:18:24,790
Thus the transmission range of a host is assumed
to be a disk. We further assume that each transceiver
136
00:18:24,790 --> 00:18:29,980
has the same communication range, that is, R.
Thus the footprint of an ad hoc network is
137
00:18:29,980 --> 00:18:32,690
nothing, but a graph which is called a unit
disk graph.
138
00:18:32,690 --> 00:18:39,330
Now, in graph theoretic terminology the network
topology we basically will assume is nothing,
139
00:18:39,330 --> 00:18:48,870
but a graph G = (V, E), where V is the set of all hosts
and E consists of the transmission links, which exist
140
00:18:48,870 --> 00:18:51,260
when two hosts are able to communicate with each
other.
141
00:18:51,260 --> 00:19:01,660
So, a link between 2 nodes u and v exists
if their distance is at most R. So, in the real
142
00:19:01,660 --> 00:19:09,820
world the scenario is quite different, but
for our study of the algorithm design we will
143
00:19:09,820 --> 00:19:16,510
assume that if the distance is at most R, then
there is basically an established link between
144
00:19:16,510 --> 00:19:21,770
the 2 nodes.
Also we will consider that the links are
145
00:19:21,770 --> 00:19:26,810
bidirectional; that is, if u is able to communicate
with v, then v also is able to communicate with u. Hence
146
00:19:26,810 --> 00:19:31,310
the link is bidirectional between u and v
in our model.
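The model just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: host positions are taken to be known 2-D coordinates, and the function name `unit_disk_graph` and the parameter `r` (standing for the common range R) are hypothetical names, not something given in the lecture.

```python
import math


def unit_disk_graph(points, r=1.0):
    """Build the adjacency sets of the graph whose vertices are the hosts
    (indexed by position in `points`) and whose edges join every pair of
    hosts at Euclidean distance at most r."""
    adj = {i: set() for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= r:
                # Links are bidirectional, so both endpoints record the edge.
                adj[i].add(j)
                adj[j].add(i)
    return adj
```

With hosts at (0, 0), (0.6, 0) and (2, 0) and r = 1, only the first two hosts are linked; the third is out of range of both.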
147
00:19:31,310 --> 00:19:39,270
Now, let us see the scenario how basically
the sensor nodes are dropped by an unmanned
148
00:19:39,270 --> 00:19:48,250
aircraft, and once they settle down, using
their omnidirectional antennas they may be
149
00:19:48,250 --> 00:19:56,080
able to communicate; hence, they will form
a graph, which is written as G = (V, E). So,
150
00:19:56,080 --> 00:20:08,790
V is the set of nodes, and between
any 2 nodes, that is, between u and v, there is an
151
00:20:08,790 --> 00:20:23,470
edge in the graph if the distance between u
and v is at most R. R is the maximum transmission
152
00:20:23,470 --> 00:20:35,410
range. This graph we call the unit disk
graph, the graph which is formed here after
153
00:20:35,410 --> 00:20:46,130
the deployment of the nodes this particular
graph is called a unit disk graph and they
154
00:20:46,130 --> 00:20:50,630
will operate for a particular application
on the field.
155
00:20:50,630 --> 00:20:58,120
So, we will see the existing algorithms, which
will construct the minimum size of connected
156
00:20:58,120 --> 00:21:04,830
dominating sets, which are compared on different
parameters such as the cardinality, that is
157
00:21:04,830 --> 00:21:10,000
the size of the connected dominating sets
how much is basically how many nodes are involved?
158
00:21:10,000 --> 00:21:18,350
Then how many messages are required to establish
the algorithm and how much time is required
159
00:21:18,350 --> 00:21:23,220
to establish the algorithm. The messages which
are exchanged what is basically the maximum
160
00:21:23,220 --> 00:21:33,780
size of them. And also basically the knowledge
the local knowledge, which is required in
161
00:21:33,780 --> 00:21:41,190
the algorithm design that is called the information.
So, information up to 2 hop is required sometimes
162
00:21:41,190 --> 00:21:48,760
1 hop is required. So, the less information
is required, the better for the algorithm.
163
00:21:48,760 --> 00:21:58,430
So, here we will see that the algorithm
which we will study has the following properties:
164
00:21:58,430 --> 00:22:04,510
the information which is required
is 1-hop information; the message length is
165
00:22:04,510 --> 00:22:15,540
of the order big O of delta and the time required
is big O of n delta. And the total number of messages
166
00:22:15,540 --> 00:22:20,610
required is of the order n, and the algorithm
is the 8-approximation algorithm which we are
167
00:22:20,610 --> 00:22:25,320
going to discuss.
Now, computing through an algorithm the
168
00:22:25,320 --> 00:22:41,680
MCDS, that is, the minimum connected dominating set,
169
00:22:41,680 --> 00:22:51,960
in a unit disk graph
is NP-hard. Note that the problem of finding
170
00:22:51,960 --> 00:22:57,370
an MCDS in a graph is equivalent to the problem
of finding a spanning tree with the maximum
171
00:22:57,370 --> 00:23:05,761
number of leaves, and here the non-leaf nodes
in the spanning tree will form the MCDS, and
172
00:23:05,761 --> 00:23:17,140
an MIS is also a dominating set.
Now, if a graph G has an edge between two nodes if and only
173
00:23:17,140 --> 00:23:21,960
if their distance is less than or
equal to 1, that is, at most 1, then that particular
174
00:23:21,960 --> 00:23:27,980
graph is called a unit disk graph that we
will assume in our discussion.
175
00:23:27,980 --> 00:23:36,740
Let us take the example how the unit disk
graph is formed. So, a node is at
176
00:23:36,740 --> 00:23:44,980
the center and it is able to communicate in
this range and that range is R, which is fixed
177
00:23:44,980 --> 00:23:50,700
for all the nodes to communicate. So, if this
is the node, this is its communication range;
178
00:23:50,700 --> 00:24:00,920
if this is another node, its communication
range is overlapping with the previous node's
179
00:24:00,920 --> 00:24:12,880
communication range, hence there is an edge
why because this particular disc is of only
180
00:24:12,880 --> 00:24:16,770
one unit length.
Hence they are able to communicate with each
181
00:24:16,770 --> 00:24:23,200
other and there exists an edge in the unit
disk graph. So, the topology of a wireless
182
00:24:23,200 --> 00:24:27,480
ad hoc network can be modeled as a unit disk
graph, which is nothing but a geometric graph
183
00:24:27,480 --> 00:24:31,960
in which there is an edge between 2 nodes
if and only if their distance is at most
184
00:24:31,960 --> 00:24:41,210
one .
Similarly we can see that there is an edge
185
00:24:41,210 --> 00:24:50,290
between these 2 nodes these nodes and these
nodes. So, if we finally, see that this particular
186
00:24:50,290 --> 00:25:02,920
wireless network can be modeled as a graph
and this graph is called a unit disk graph.
187
00:25:02,920 --> 00:25:13,390
So, from now on we will not call it a unit
disk graph; we will just call it a graph
188
00:25:13,390 --> 00:25:23,080
and the graph we can obtain from wireless
ad hoc networks and on this particular network
189
00:25:23,080 --> 00:25:28,150
we have to find out the connected dominating
set minimum size connected dominating set.
190
00:25:28,150 --> 00:25:33,420
For example, in this particular graph if we
can see what is the minimum size of connected
191
00:25:33,420 --> 00:25:47,550
dominating set, we can take an example that
here this particular node if we pick in a
192
00:25:47,550 --> 00:26:00,760
CDS and this node also in the CDS then this
will become an independent set.
193
00:26:00,760 --> 00:26:07,130
Why because all the nodes they are independent
and all the nodes are which are not there
194
00:26:07,130 --> 00:26:12,480
in this particular set they are the neighbors
of at least one of these nodes. For example,
195
00:26:12,480 --> 00:26:18,480
this node is the neighbor of both the nodes
hence these particular blue ones are nothing
196
00:26:18,480 --> 00:26:25,200
but an MIS, and an MIS is also
a dominating set.
197
00:26:25,200 --> 00:26:37,220
So, these set of nodes are the dominating
set together , but they are not connected
198
00:26:37,220 --> 00:26:44,270
to connect them we can use this node also
to be included and if that is included then
199
00:26:44,270 --> 00:26:53,070
the connected dominating set comprises
3 nodes. So, if the 3 nodes are taken
200
00:26:53,070 --> 00:27:01,830
into account, then this will be a dominating
set, and since the subgraph induced
201
00:27:01,830 --> 00:27:06,870
by this dominating set is also connected hence
it is also a connected dominating set if the
202
00:27:06,870 --> 00:27:10,830
3 different nodes are taken into
account.
203
00:27:10,830 --> 00:27:17,000
So, through the algorithm you have to find
out the connected dominating set of the minimum
204
00:27:17,000 --> 00:27:23,750
size and that will be our problem.
In this particular lecture let us see some
205
00:27:23,750 --> 00:27:31,850
of the estimates that the algorithm which
we are going to discuss is an approximation
206
00:27:31,850 --> 00:27:36,170
algorithm, and to see what the performance guarantees
of this algorithm are; for that there is some
207
00:27:36,170 --> 00:27:40,550
background.
So, the lemma says that the size of any independent
208
00:27:40,550 --> 00:27:48,950
set in a unit disk graph is at most 4 opt
plus 1, where opt is the size of a minimum connected dominating set; for that, let us see the picture.
209
00:27:48,950 --> 00:28:00,710
Now, here this particular node vi is having
the transmission range and another node v
210
00:28:00,710 --> 00:28:06,200
that is able to communicate with vi is having
this communication range.
211
00:28:06,200 --> 00:28:19,070
So, if all the nodes can be connected through
a tree, and we can list the nodes in some
212
00:28:19,070 --> 00:28:24,980
order of traversal of this particular
tree. So, if, let us say, in that order these
213
00:28:24,980 --> 00:28:32,300
2 nodes we have picked up this particular
node, which is shown as the green color is
214
00:28:32,300 --> 00:28:38,340
able to communicate.
In this particular range now there can be
215
00:28:38,340 --> 00:28:45,770
at most 5 different independent nodes; independent
nodes means they are not able to communicate with each other.
216
00:28:45,770 --> 00:28:56,530
So, 5 are there now if they overlapped then
this will basically give a 5 the nodes and
217
00:28:56,530 --> 00:29:04,980
these are this particular space is overlapped
with the previous space. Now we will count
218
00:29:04,980 --> 00:29:15,000
the maximum number of independent nodes that
can be neighbors of v i, that is, that can be lying
219
00:29:15,000 --> 00:29:25,421
in the set U i. Let us say this node, this node,
this node, and so on: all 5
220
00:29:25,421 --> 00:29:38,010
independent nodes cannot be there,
because the nodes cannot be positioned here in
221
00:29:38,010 --> 00:29:41,910
U i.
So, with this information let us see this
222
00:29:41,910 --> 00:29:49,590
particular fact: let U be any independent set
of V and let T prime be a spanning tree
223
00:29:49,590 --> 00:29:58,570
of OPT, that is, an optimal (minimum-size) connected
dominating set. Now consider an arbitrary
224
00:29:58,570 --> 00:30:06,340
preorder traversal of T prime given as v 1,
v 2, and so on up to v opt. Let U 1
225
00:30:06,340 --> 00:30:16,890
be the set of nodes in U that are adjacent
to v 1. For any i which is
226
00:30:16,890 --> 00:30:25,450
at least 2 and at most opt, let U i
be the set of nodes in U that are adjacent
227
00:30:25,450 --> 00:30:36,480
to v i, but to none of v 1 up to v (i minus 1). So,
we will take only these 2 nodes that we have
228
00:30:36,480 --> 00:30:45,780
shown in the picture and let us find out that
this will partition these nodes into U 1, U
229
00:30:45,780 --> 00:30:51,380
2, and so on, which form a partition of U.
So, v 1 is adjacent to at most 5 independent
230
00:30:51,380 --> 00:31:09,140
nodes. So, the size of U 1 is basically
at most 5. Now for i at least 2, that is
231
00:31:09,140 --> 00:31:20,810
the second this is U 1 . So, all 5 we have
included there can be 5 different nodes this
232
00:31:20,810 --> 00:31:33,450
is V 1. Now we will include 2 which will communicate
with 1 and this particular portion we have
233
00:31:33,450 --> 00:31:42,840
to count how many more nodes can be there?
So, this is a sector of at most 240 degrees; hence the coverage
234
00:31:42,840 --> 00:31:51,890
range of node v i will imply that the size of U i must
be less than or equal to 4.
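The two bounds obtained so far can be added up as follows. This is a worked version of the counting, written with |U i| for the size of U i and opt for the size of the optimal connected dominating set, and assuming, as the lecture does, that U 1, U 2, ..., U opt partition U:

```latex
|U| \;=\; \sum_{i=1}^{\mathit{opt}} |U_i|
    \;\le\; \underbrace{5}_{|U_1|} \;+\; \underbrace{4\,(\mathit{opt} - 1)}_{|U_2|,\dots,|U_{\mathit{opt}}|}
    \;=\; 4\,\mathit{opt} + 1
```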
235
00:31:51,890 --> 00:32:03,090
Hence the size of the independent set U,
if we count over all the U i, is at most 5 plus
236
00:32:03,090 --> 00:32:16,790
4 times opt minus 1. So, that comes
out to be 4 opt plus 1. Hence the size of the
237
00:32:16,790 --> 00:32:25,990
independent set, and so of any maximal independent set, is
at most 4 opt plus 1. Having proved this
238
00:32:25,990 --> 00:32:31,590
particular lemma you will see that the algorithm
which we are going to discuss builds an MIS of size at
239
00:32:31,590 --> 00:32:41,240
most 4 opt plus 1.
Now here we will see that, for a minimization
240
00:32:41,240 --> 00:32:50,050
algorithm, the approximation guarantee is nothing but
a ratio, the performance ratio, which
241
00:32:50,050 --> 00:32:56,290
is nothing but the supremum of this particular
factor A i upon opt i; that means, A i
242
00:32:56,290 --> 00:33:06,710
is the size of the CDS output by the algorithm and opt i is the optimal
size for that particular problem instance i.
243
00:33:06,710 --> 00:33:12,600
So, over all instances of the problem you have
to find out the supremum; that becomes the
244
00:33:12,600 --> 00:33:18,260
approximation guarantee of the algorithm, or the
performance ratio.
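Written out as a formula, the definition just given reads as follows. The symbol rho is introduced here only for illustration; A i and opt i are, as in the lecture, the size of the CDS produced by the algorithm and the optimal size on instance i:

```latex
\rho \;=\; \sup_{i} \frac{A_i}{\mathit{opt}_i}
```

Under this definition, a performance ratio of 8, as stated in this lecture, would mean that on every instance the algorithm outputs a connected dominating set of size at most 8 times the optimum.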
245
00:33:18,260 --> 00:33:24,830
Let us see the algorithm, which runs in 2 phases:
phase one will construct a maximal independent
246
00:33:24,830 --> 00:33:31,750
set, and in the second phase we will connect
them through a tree, which is called a Steiner
247
00:33:31,750 --> 00:33:37,360
tree, and we will also show that the performance
ratio of this algorithm is 8. The algorithm
248
00:33:37,360 --> 00:33:42,890
is message and time efficient also .
Let us start with the algorithm that all the
249
00:33:42,890 --> 00:33:51,080
nodes are initially colored white; a dominator
is colored black and a dominatee is colored
250
00:33:51,080 --> 00:33:57,200
as gray. So, there are 3 different colors
we are going to use in this algorithm white,
251
00:33:57,200 --> 00:34:12,600
then gray, and black .
Now we assume that each vertex has the knowledge
252
00:34:12,600 --> 00:34:18,330
of its distance-one neighbors, and also they
have another piece of information, which is called
253
00:34:18,330 --> 00:34:27,060
the effective degree; that means, the
number of white neighbors a vertex has at any
254
00:34:27,060 --> 00:34:36,669
point of time is called its effective degree
.
255
00:34:36,669 --> 00:34:42,650
Now this information can be collected by periodic
or event-driven messages, which are called
256
00:34:42,650 --> 00:34:48,050
hello messages. The effective degree of a vertex
is the total number of its white neighbors, as
257
00:34:48,050 --> 00:34:55,190
I have already told.
Now, one of these host nodes we have
258
00:34:55,190 --> 00:35:02,310
designated as a leader. This particular assumption
is also realistic, because
259
00:35:02,310 --> 00:35:11,240
the leader can be the commander's
mobile for a platoon of soldiers
260
00:35:11,240 --> 00:35:19,360
in a particular mission. So, there is always
a leader designated in such network and in
261
00:35:19,360 --> 00:35:25,410
many such applications.
Now, if it is not designated then we can designate
262
00:35:25,410 --> 00:35:31,310
using an algorithm which is called a distributed
leader election algorithm. So, if we run the
263
00:35:31,310 --> 00:35:37,480
distributed leader election algorithm,
it will take order n time and the number of
264
00:35:37,480 --> 00:35:46,300
messages will be of the order n log n with the best known
such algorithm available. So, let us assume without
265
00:35:46,300 --> 00:35:52,050
loss of generality that a host s is the leader
in the construction of CDS.
266
00:35:52,050 --> 00:36:03,940
Now the phase one will construct the MIS.
So, let us start with the leader node s is
267
00:36:03,940 --> 00:36:12,740
the leader. So, the leader first colors itself
black. Now, here this is the leader, which
268
00:36:12,740 --> 00:36:21,650
will color itself, let us say, black,
while all other nodes are white.
269
00:36:21,650 --> 00:36:32,290
Now, this particular node will send a message
which is called dominator, because after
270
00:36:32,290 --> 00:36:37,630
becoming black this particular node is called
a dominator node. It will send a message which
271
00:36:37,630 --> 00:36:47,660
is called dominator, and that will go to
all the nodes in its communication
272
00:36:47,660 --> 00:36:55,670
range, which is nothing but a disk, that is, to the nodes within
the disk. Now, any white node u, let us say this
273
00:36:55,670 --> 00:37:02,690
is the white node u, receiving the dominator
message for the first time from a vertex v
274
00:37:02,690 --> 00:37:12,510
will color itself gray
and broadcast a message which
275
00:37:12,510 --> 00:37:16,020
is called dominatee.
So, it will broadcast. So, this particular
276
00:37:16,020 --> 00:37:22,700
message, when it is broadcast, will be
broadcast in its communication range, and
277
00:37:22,700 --> 00:37:34,390
this u will select v as its dominator.
Now, these particular white nodes here, they are
278
00:37:34,390 --> 00:37:40,592
the white nodes; when they receive this
particular message, they become active. The
279
00:37:40,592 --> 00:37:53,230
active white host with the highest effective
degree d star among all its active white
280
00:37:53,230 --> 00:38:01,160
neighbors will color itself black and broadcast
the message dominator.
281
00:38:01,160 --> 00:38:16,970
Let us assume that this is the node having
the highest effective degree, and it has colored itself
282
00:38:16,970 --> 00:38:24,720
black, and it will broadcast a message,
which is called dominator.
283
00:38:24,720 --> 00:38:37,120
So, a white node decreases its effective
degree by one and broadcasts the message degree
284
00:38:37,120 --> 00:38:44,600
whenever it receives a dominatee message.
The message degree contains the sender's current
285
00:38:44,600 --> 00:38:52,280
effective degree. A white vertex receiving
a degree message will update its neighborhood
286
00:38:52,280 --> 00:38:56,180
information.
So, each gray vertex will broadcast the message
287
00:38:56,180 --> 00:39:01,720
number of black neighbors when it detects
that none of its neighbors is white. So,
288
00:39:01,720 --> 00:39:05,660
phase 1 terminates when there are no white
nodes left.
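The phase 1 coloring described above can be sketched as a small centralized simulation in Python. This is only an illustration under stated assumptions: there are no real messages, the function name `phase1_mis`, the adjacency-dictionary input, and the tie-break by node id are hypothetical choices, and nodes are processed one at a time instead of concurrently.

```python
def phase1_mis(adj, leader):
    """Simulate phase 1; return (black nodes, dominator of each gray node)."""
    color = {v: "white" for v in adj}
    dominator = {}   # gray node -> black node that first dominated it
    active = set()   # white nodes that have heard a dominatee message

    def turn_black(v):
        color[v] = "black"
        active.discard(v)
        for u in adj[v]:                 # v broadcasts "dominator"
            if color[u] != "white":
                continue
            color[u] = "gray"
            dominator[u] = v
            active.discard(u)
            for w in adj[u]:             # u broadcasts "dominatee"
                if color[w] == "white":
                    active.add(w)

    def effective_degree(v):
        # Effective degree = current number of white neighbors.
        return sum(color[u] == "white" for u in adj[v])

    turn_black(leader)
    while active:
        # Assumed tie-break: larger node id wins among equal degrees.
        v = max(active, key=lambda x: (effective_degree(x), x))
        turn_black(v)
    black = {v for v in adj if color[v] == "black"}
    return black, dominator


# Path 0-1-2-3-4 with node 0 as the leader.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
black, dominator = phase1_mis(path, 0)
```

On this path the leader 0 turns black, node 1 turns gray and activates node 2, and the process repeats, so the black nodes are 0, 2 and 4.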
289
00:39:05,660 --> 00:39:15,400
Now, we will begin phase 2. Phase 2
will construct a Steiner tree; through
290
00:39:15,400 --> 00:39:31,600
the Steiner tree
it will connect the nodes in the MIS, which we
291
00:39:31,600 --> 00:39:41,980
have obtained in phase 1, and hence the MIS
plus the connectors will become a connected
292
00:39:41,980 --> 00:39:55,780
dominating set. So, the Steiner tree will
span the MIS nodes through the connectors,
293
00:39:55,780 --> 00:39:58,520
and together they will be the connected dominating
set.
294
00:39:58,520 --> 00:40:07,630
Now, when s receives, s here is basically the
leader node, when the leader receives the message
295
00:40:07,630 --> 00:40:13,740
number of black neighbors from all of its
gray neighbors, it starts phase 2 by broadcasting
296
00:40:13,740 --> 00:40:21,980
a message M. So, a host is ready to be
explored if it has no white neighbors. A Steiner
297
00:40:21,980 --> 00:40:25,820
tree is used to connect all the black hosts
generated in phase 1, as I have told. The
298
00:40:25,820 --> 00:40:33,971
idea is to pick those gray vertices which
connect to many black neighbors. Why?
299
00:40:33,971 --> 00:40:48,350
Because we want to minimize the nodes in
the CDS; that means, with the minimum number
300
00:40:48,350 --> 00:40:54,790
of nodes we want to establish a connection;
we want to connect the MIS nodes.
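The idea of preferring gray vertices with many black neighbors can be sketched in Python as follows. Note that this is a simplified, centralized greedy with a union-find structure, not the distributed depth-first search used in the lecture; the function name `pick_connectors` and the adjacency-dictionary input are assumptions made for illustration.

```python
def pick_connectors(adj, black):
    """Greedily pick gray nodes that join the most black components."""
    parent = {b: b for b in black}  # union-find over black nodes only

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def roots_touched(g):
        # Distinct black components adjacent to gray node g.
        return {find(u) for u in adj[g] if u in parent}

    gray = sorted(v for v in adj if v not in parent)
    connectors = set()
    while len({find(b) for b in black}) > 1:
        best = max(gray, key=lambda g: len(roots_touched(g)), default=None)
        if best is None or len(roots_touched(best)) < 2:
            raise ValueError("no gray node joins two black components")
        roots = list(roots_touched(best))
        for r in roots[1:]:          # merge every component that best touches
            parent[r] = roots[0]
        connectors.add(best)
    return connectors


# One gray node (0) adjacent to three black nodes (1, 2, 3).
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(pick_connectors(star, {1, 2, 3}))
```

In the star example a single gray node connects all three black nodes, so only one connector is needed, which matches the count of c minus 2 for c equal to 3.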
301
00:40:54,790 --> 00:41:02,440
Now, we will apply the classical depth-first search
algorithm in its distributed version, that
302
00:41:02,440 --> 00:41:08,100
is, the distributed depth-first search spanning
tree algorithm, to compute the Steiner tree
303
00:41:08,100 --> 00:41:14,720
here in this particular phase 2 algorithm.
Now, a black node without any dominator is
304
00:41:14,720 --> 00:41:23,520
active. Initially, no black vertex has a
dominator and all hosts are unexplored. Message
305
00:41:23,520 --> 00:41:29,370
M contains the field next, which specifies the
next host to be explored. A gray vertex with
306
00:41:29,370 --> 00:41:34,310
at least one active black neighbor is
effective.
307
00:41:34,310 --> 00:41:42,630
So, if M is built by a black vertex, its next
field contains the id of the unexplored gray
308
00:41:42,630 --> 00:41:52,590
neighbor which connects to the maximum number of active
black hosts. If M is built by a gray vertex, then
309
00:41:52,590 --> 00:41:56,080
its next field contains the id of an unexplored
black neighbor.
310
00:41:56,080 --> 00:42:03,070
So, either M is built by a black vertex
or M is built by a gray vertex. So, any
311
00:42:03,070 --> 00:42:09,070
black host receiving message M for the first
time from a gray host v sets its dominator
312
00:42:09,070 --> 00:42:16,160
to v by broadcasting the parent message.
So, when a host u receives message M from
313
00:42:16,160 --> 00:42:22,180
v that specifies u to be explored next, if
none of u's neighbors is white, u then colors
314
00:42:22,180 --> 00:42:28,560
itself black, sets its dominator to
v, and broadcasts its own message M; otherwise,
315
00:42:28,560 --> 00:42:35,030
u defers the operation until none of its
neighbors is white. Similarly, this particular
316
00:42:35,030 --> 00:42:44,110
process will follow for the gray neighbors.
Now, a gray vertex
317
00:42:44,110 --> 00:42:53,380
becomes ineffective if none of its black neighbors
is active. So, when no active
318
00:42:53,380 --> 00:43:02,900
black neighbor remains, the gray vertex
becomes ineffective; similarly for the black
319
00:43:02,900 --> 00:43:11,090
nodes. So, when s gets the message done
and it has no effective gray neighbors,
320
00:43:11,090 --> 00:43:16,690
the algorithm will terminate.
Let us see this entire algorithm working through
321
00:43:16,690 --> 00:43:24,060
this particular example. This is s, that
is, the leader node. So, in phase 1 it will
322
00:43:24,060 --> 00:43:32,440
first make itself black
and broadcast the message dominator in
323
00:43:32,440 --> 00:43:36,540
its one-hop neighborhood; those neighbors will turn
gray.
324
00:43:36,540 --> 00:43:43,960
So, all these neighboring nodes are turned into
gray nodes. These gray nodes will further
325
00:43:43,960 --> 00:43:51,540
communicate a dominatee message in their neighborhood,
and the white nodes which receive the
326
00:43:51,540 --> 00:43:57,240
dominatee message become active. So,
every active white node will count
327
00:43:57,240 --> 00:44:07,210
how many white neighbors it has,
and the one having the maximum
328
00:44:07,210 --> 00:44:13,990
value will become black.
So, when phase 1 finishes,
329
00:44:13,990 --> 00:44:20,540
it will have formed the black nodes, and
they are nothing but the MIS.
330
00:44:20,540 --> 00:44:31,310
Now, when there is no white node
left, then phase 2 will begin. Now, for phase 2,
331
00:44:31,310 --> 00:44:37,700
the gray nodes
will inform how many black neighbors
332
00:44:37,700 --> 00:44:45,590
they have. For example, this is a gray node
which has 1, 2 and 3, that is, 3 different
333
00:44:45,590 --> 00:44:52,690
black nodes in its neighborhood.
So, it will inform these particular black
334
00:44:52,690 --> 00:45:02,520
nodes using this particular message M.
So, this particular black node will choose
335
00:45:02,520 --> 00:45:09,780
this gray node, and it will send a parent message.
So, it will establish a link with it.
336
00:45:09,780 --> 00:45:16,960
Now, since this is the node which has the
maximum number of black neighbors,
337
00:45:16,960 --> 00:45:31,920
in phase 2, through the Steiner tree,
it will become black. So, this is the
338
00:45:31,920 --> 00:45:43,140
black node; that is, the
gray node having the maximum number of black
339
00:45:43,140 --> 00:45:55,410
neighbors will be elected as a
connector; through the Steiner tree it
340
00:45:55,410 --> 00:46:04,720
will be turned into a dominator.
So, when phase 2 finishes,
341
00:46:04,720 --> 00:46:15,470
all the black nodes will
have been assigned a connector node from among
342
00:46:15,470 --> 00:46:24,500
the gray nodes, and also there is no gray node
which has an active
343
00:46:24,500 --> 00:46:35,510
black neighbor without any connection to its
neighboring dominating set nodes, that is, its neighboring
344
00:46:35,510 --> 00:46:42,280
MIS nodes. Hence, the algorithm will terminate
by forming a connected dominating set, which
345
00:46:42,280 --> 00:46:49,790
is shown here with the black nodes.
For the complexity of phase 1 and the complexity
346
00:46:49,790 --> 00:46:55,410
of phase 2, we will see that in total
the time complexity is
347
00:46:55,410 --> 00:47:01,640
of the order n, and the message complexity is of
the order n times big delta.
348
00:47:01,640 --> 00:47:08,450
Now, let us see the performance analysis. Phase
1 computes an MIS which contains all the
349
00:47:08,450 --> 00:47:13,090
black nodes.
Now, another lemma says that in phase 2 at
350
00:47:13,090 --> 00:47:19,650
least one gray vertex, which connects to
the maximum number of black vertices, will
351
00:47:19,650 --> 00:47:24,650
be selected.
Now, here in lemma 3.4 we will see that if
352
00:47:24,650 --> 00:47:33,150
c is the number of black hosts after
phase 1, then at most c minus
353
00:47:33,150 --> 00:47:37,670
1 gray hosts will be colored black in
phase 2.
354
00:47:37,670 --> 00:47:44,710
Lemma 3.5 says that if there is a gray vertex
which connects to at least 3 black nodes, then
355
00:47:44,710 --> 00:47:51,960
the number of gray vertices colored black is at most c
minus 2.
356
00:47:51,960 --> 00:48:01,900
Therefore, if we check the performance ratio
of this algorithm, in phase 1 the size
357
00:48:01,900 --> 00:48:11,480
of the MIS is at most 4 opt plus
1.
358
00:48:11,480 --> 00:48:28,440
Now, in phase 2 we require at most the MIS
size minus 2 connectors, that is, 4 opt plus 1
359
00:48:28,440 --> 00:48:36,290
minus 2; these are the connectors. So, in total,
if we count, it becomes 8 opt. So, 8 opt, and
360
00:48:36,290 --> 00:48:44,030
the plus 1 and plus 1 cancel with the minus 2. So, hence the performance
ratio of this algorithm is 8.
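The counting above can be written out as a short derivation. Here opt denotes the size of a minimum connected dominating set, and the connector bound of MIS size minus 2 assumes the condition of lemma 3.5 holds.

```latex
\begin{align*}
|\mathrm{CDS}| &\le |\mathrm{MIS}| + \bigl(|\mathrm{MIS}| - 2\bigr) \\
               &\le (4\,\mathrm{opt} + 1) + (4\,\mathrm{opt} + 1 - 2) \\
               &= 8\,\mathrm{opt}.
\end{align*}
```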
361
00:48:44,030 --> 00:48:51,210
For more references on this, you can find the paper
by Rajiv Misra, "Minimum Connected Dominating
362
00:48:51,210 --> 00:48:56,410
Set Using a Collaborative Cover Heuristic
for Ad Hoc Sensor Networks", published in
363
00:48:56,410 --> 00:49:00,630
IEEE Transactions on Parallel and Distributed
Systems, 2010.
364
00:49:00,630 --> 00:49:05,490
Conclusion: in this lecture we have discussed
a distributed algorithm for a connected
365
00:49:05,490 --> 00:49:14,100
dominating set of smaller size. We have seen
that this construction requires first constructing a
366
00:49:14,100 --> 00:49:18,619
maximal independent set and then a Steiner tree
to connect all these vertices, and the performance
367
00:49:18,619 --> 00:49:24,360
ratio, we have shown, is 8.
The future scope of this algorithm is to study
368
00:49:24,360 --> 00:49:27,300
the problem of maintaining the CDS in a
mobile environment.
369
00:49:27,300 --> 00:49:27,800
Thank you.