1
00:00:18,029 --> 00:00:27,050
in our last lecture we were looking at matrix
conditioning. so matrix conditioning allows
2
00:00:27,050 --> 00:00:37,239
us to separate bad matrices from good matrices
and we know when the calculations can go wrong
3
00:00:37,239 --> 00:00:44,140
because the matrix is bad and we are able
to make a judgment on the quality of solution
4
00:00:44,140 --> 00:00:48,030
for linear algebraic equations.
5
00:00:48,030 --> 00:00:52,859
so i gave a very simple example: polynomial
approximation. some continuous function we
6
00:00:52,859 --> 00:01:03,350
are trying to approximate, f of z. f of
z is some function which you are trying
7
00:01:03,350 --> 00:01:13,690
to approximate, and then you develop an
approximation, a polynomial approximation.
8
00:01:13,690 --> 00:01:28,210
but in polynomial approximations we have shown
that you get equations of the type h theta=u.
9
00:01:28,210 --> 00:01:44,110
so h is the hilbert matrix. theta is the parameter
vector, the coefficients of your polynomial,
10
00:01:44,110 --> 00:01:52,930
and u, depending upon how you have formulated
the problem, will be a vector, a
11
00:01:52,930 --> 00:02:00,470
finite dimensional vector, and this is
exactly like solving ax=b. x is theta, b is
12
00:02:00,470 --> 00:02:13,890
u and a is the matrix h. what i showed you
last time was h3. h3 is the hilbert matrix
13
00:02:13,890 --> 00:02:57,230
which is 1, 1/2, 1/3; 1/2, 1/3, 1/4; 1/3, 1/4, 1/5,
and i will just write it here.
14
00:02:57,230 --> 00:03:12,680
so this is my h matrix
and then i took a solution here
15
00:03:12,680 --> 00:03:43,459
my right hand side was 11/6, 13/12 and 47/60
when theta = (1, 1, 1) transpose. when theta is the
16
00:03:43,459 --> 00:03:55,900
vector containing three ones, your
right hand side will be this and this is the
17
00:03:55,900 --> 00:04:04,610
exact solution. i just showed you how things
can go wrong even for this matrix for which
18
00:04:04,610 --> 00:04:17,940
the condition number is not so bad. the condition
number we found here was 748.
19
00:04:17,940 --> 00:04:32,210
so for this particular matrix the condition number,
c infinity of h3, was 748. i
20
00:04:32,210 --> 00:04:35,620
just showed you how things can go wrong.
21
00:04:35,620 --> 00:04:44,990
instead of solving the original problem we
solved a slightly modified version of this problem
22
00:04:44,990 --> 00:05:29,840
which is 1, 0.5, 0.333; 0.5, 0.333 and 0.25.
this matrix i would say is my h3 + delta h3.
23
00:05:29,840 --> 00:05:38,190
this matrix has a slight error as compared
to the original matrix h3. so if i start doing
24
00:05:38,190 --> 00:05:45,800
computations with this matrix and instead
of this u if i take slightly perturbed u which
25
00:05:45,800 --> 00:05:58,070
is 1.83 and so on; what i have actually done
is just truncated the fractions.
26
00:05:58,070 --> 00:06:20,400
1.08 and 0.783. so this is my u+ delta u.
i am just calling this u+ delta u because
27
00:06:20,400 --> 00:06:28,789
i have truncated it, and i write the same
equation. my solution changed
28
00:06:28,789 --> 00:06:55,650
so drastically, to 1.09 and so on. so this is my theta +
delta theta. this is for a matrix for which
29
00:06:55,650 --> 00:07:06,880
you have condition number of 700. so actually
if i try to estimate what is the fractional
30
00:07:06,880 --> 00:07:12,900
change that we have made in the h matrix or
in the u vector.
31
00:07:12,900 --> 00:07:25,410
it is of the order of a 0.3% change,
a 0.3% error. i can find this error using norms.
32
00:07:25,410 --> 00:07:33,990
norm of delta h3 by norm of h3 or norm of
delta u/norm of u. i can find out what is
33
00:07:33,990 --> 00:07:43,840
the percentage change it is of the order of
0.3%, but the solution changes by almost 50%
34
00:07:43,840 --> 00:07:50,880
a drastic change in the solution. and here, whatever
you try to do, whether you do maximal
35
00:07:50,880 --> 00:07:54,850
pivoting or any reordering of the
calculations, you will get into trouble.
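the experiment described above is easy to reproduce. this is a sketch in python with numpy; the lecture writes out only part of the perturbed matrix, so the third row of h3 + delta h3 below is my assumption, truncated in the same style as the rest.

```python
import numpy as np

# exact 3x3 hilbert matrix and the right hand side for theta = (1, 1, 1)
H3 = np.array([[1.0, 1/2, 1/3],
               [1/2, 1/3, 1/4],
               [1/3, 1/4, 1/5]])
u = H3 @ np.ones(3)                  # = [11/6, 13/12, 47/60]
theta = np.linalg.solve(H3, u)       # recovers ~(1, 1, 1)

# truncated versions, as on the board (third row assumed, same style)
H3p = np.array([[1.0, 0.5, 0.333],
                [0.5, 0.333, 0.25],
                [0.333, 0.25, 0.2]])
up = np.array([1.83, 1.08, 0.783])
theta_p = np.linalg.solve(H3p, up)   # this is theta + delta theta

# fractional changes measured with the infinity norm
in_err = np.linalg.norm(up - u, np.inf) / np.linalg.norm(u, np.inf)
out_err = np.linalg.norm(theta_p - theta, np.inf) / np.linalg.norm(theta, np.inf)
print(theta_p, in_err, out_err)
```

the input data changes by well under 1%, yet the first component of the perturbed solution comes out near 1.09 instead of 1, and the overall change in the solution is on the order of 50%.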
36
00:07:54,850 --> 00:08:06,099
that is because this matrix is ill-conditioned.
now the example that i gave you earlier might
37
00:08:06,099 --> 00:08:13,110
lead you to believe that it is something to
do with singular matrices. it is nothing
38
00:08:13,110 --> 00:08:18,229
to do with singularity. what matters here
is the condition number. condition number
39
00:08:18,229 --> 00:08:26,180
if you take it in terms of the 2 norm, the condition
number is the ratio of
40
00:08:26,180 --> 00:08:28,280
the singular values of the matrix:
41
00:08:28,280 --> 00:08:32,950
largest singular value by smallest singular
value. find the eigenvalues of a transpose a
42
00:08:32,950 --> 00:08:40,880
and take ratio of maximum eigenvalue of a
transpose a by minimum eigenvalue of a transpose
43
00:08:40,880 --> 00:08:52,980
a find the square root that is what matters.
if that ratio is small then the matrix is
44
00:08:52,980 --> 00:09:00,290
well conditioned. if that ratio is large, individual
eigenvalues may not be 0, but if that ratio
45
00:09:00,290 --> 00:09:05,250
of the largest eigenvalue of a transpose a
to smallest eigenvalue of a transpose a if
46
00:09:05,250 --> 00:09:09,090
that ratio is large the matrix is ill-conditioned.
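the two ways of stating c2 above, the square root of the eigenvalue ratio of a transpose a and the ratio of singular values, can be checked against each other. a sketch in python with numpy; the 2 x 2 matrix is an arbitrary well-conditioned example of mine, not from the lecture.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])               # arbitrary example matrix

eig = np.linalg.eigvalsh(A.T @ A)        # eigenvalues of a transpose a (all >= 0)
c2_from_eigs = np.sqrt(eig.max() / eig.min())

sv = np.linalg.svd(A, compute_uv=False)  # singular values of a
c2_from_svd = sv.max() / sv.min()

print(c2_from_eigs, c2_from_svd)         # the two definitions agree
```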
47
00:09:09,090 --> 00:09:15,880
that can create problems for you. using any
sophisticated program or any sophisticated
48
00:09:15,880 --> 00:09:21,299
computer, it will create a problem for you.
so that is an inherent problem with the matrix
49
00:09:21,299 --> 00:09:31,760
not the problem with the computer or the program.
so there is one more example which i have
50
00:09:31,760 --> 00:09:37,449
given here i will not write the numbers because
numbers are very small.
51
00:09:37,449 --> 00:09:42,680
but what i have tried to show here is
a very, very simple example. i think
52
00:09:42,680 --> 00:09:58,579
i demonstrated this to you. this is a simple
matrix: 1, 2, 3, 4, 5, 6,
53
00:09:58,579 --> 00:10:12,920
7, 8, 9. it is not called by any name. this
particular matrix is highly ill-conditioned.
54
00:10:12,920 --> 00:10:17,890
the condition number of this matrix is surprising. it appears
harmless; you have just written the numbers 1 to 9
55
00:10:17,890 --> 00:10:24,280
in a particular sequence. but if you ask for the
56
00:10:24,280 --> 00:10:30,080
condition number that is c 2 of a.
57
00:10:30,080 --> 00:10:46,970
so which is square root of
58
00:10:46,970 --> 00:11:08,480
lambda max of a transpose a by lambda min.
this turns out to be 3.81* 10 to the power
59
00:11:08,480 --> 00:11:16,260
16. condition number of this matrix is very,
very large. you can do a simple experiment
60
00:11:16,260 --> 00:11:23,620
in matlab or scilab or any software take this
matrix find its inverse. well matlab will
61
00:11:23,620 --> 00:11:28,650
give you a warning that this matrix is highly
ill-conditioned and the result may not be reliable
62
00:11:28,650 --> 00:11:31,049
and you can check that.
63
00:11:31,049 --> 00:11:35,020
if you find the inverse of this matrix, what
should happen? if you find the inverse of this
64
00:11:35,020 --> 00:11:40,160
matrix and multiply it with the matrix itself,
you should get the identity matrix. if you do
65
00:11:40,160 --> 00:11:45,660
that numerical experiment in
matlab you will get a matrix which has nothing
66
00:11:45,660 --> 00:11:55,170
to do with identity matrix. you will get some
other matrix. you get numbers like 2, 8, 18
67
00:11:55,170 --> 00:11:57,500
when you multiply a* a inverse for this matrix.
68
00:11:57,500 --> 00:12:09,860
because this is a highly ill-conditioned matrix.
and then i have given one more example. so
69
00:12:09,860 --> 00:12:22,329
what i want to stress here is that inherently
every matrix will come with
70
00:12:22,329 --> 00:12:29,920
its own characteristics, and that will
dictate how the calculations proceed, and you
71
00:12:29,920 --> 00:12:36,710
should be able to recognize bad matrices or
ill-conditioned matrices.
72
00:12:36,710 --> 00:12:42,370
there is one more matrix i have shown here.
okay for this matrix if you do a inverse you
73
00:12:42,370 --> 00:12:47,669
can do that experiment in matlab you will
never get identity matrix, but i will give
74
00:12:47,669 --> 00:13:09,880
you another matrix, b: 10 to the power -17 times
1, 2, 1, 2, 1, 2. i have taken this matrix.
75
00:13:09,880 --> 00:13:20,480
you might say, is that not like a null matrix?
10 to the power -17 * 1, 10 to the power -17 * 2,
76
00:13:20,480 --> 00:13:28,490
it looks like a null matrix. all the elements
of this matrix are close to 0 though i have
77
00:13:28,490 --> 00:13:29,980
written here 1, 2, 1.
78
00:13:29,980 --> 00:13:46,390
it is multiplied by 10 to the power -17. now
if i do inversion of this matrix and then
79
00:13:46,390 --> 00:13:53,400
multiply the inverse of this matrix with the b matrix,
matlab will give you a perfect identity matrix
80
00:13:53,400 --> 00:14:00,189
back. why? even though this is like a null matrix, with all
81
00:14:00,189 --> 00:14:09,819
elements close to 0, the condition number
of this matrix, c 2 of b, turns out to be 5.474,
82
00:14:09,819 --> 00:14:23,310
a very well conditioned matrix; no problems in
the calculations.
83
00:14:23,310 --> 00:14:29,669
that is because if you take b transpose b
find out its maximum eigenvalue minimum eigenvalue
84
00:14:29,669 --> 00:14:35,500
take the ratio; that will come out to be this, and
the square root of that will come out to
85
00:14:35,500 --> 00:14:41,740
be this. so this means inherently you are
not going to get into any trouble when you
86
00:14:41,740 --> 00:14:46,030
do calculations with this matrix which is
close to null matrix. its eigenvalues are
87
00:14:46,030 --> 00:14:50,450
very close to 0, but that does not matter.
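since the full b matrix is not written out here, the sketch below uses an assumed matrix of the same flavour: a 2 x 2 matrix scaled by 10 to the power -17. its condition number is 3 rather than 5.474, but the point is the same; scaling by a constant does not change the condition number, so a near-null matrix can be perfectly well conditioned.

```python
import numpy as np

# tiny entries, order 1e-17, but full rank (assumed example matrix)
B = 1e-17 * np.array([[1.0, 2.0],
                      [2.0, 1.0]])

print(np.linalg.cond(B, 2))      # 3.0: scaling does not change c2
print(B @ np.linalg.inv(B))      # essentially the identity matrix
```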
88
00:14:50,450 --> 00:14:55,260
what matters is the condition number, the
ratio of the maximum to the minimum
89
00:14:55,260 --> 00:15:03,370
singular value; that is what
dictates the calculations. so with this we
90
00:15:03,370 --> 00:15:12,670
come to an end of discussion on linear algebraic
equations. we looked at many things, we looked
91
00:15:12,670 --> 00:15:18,230
at many topics. i suppose you have learned much
more about solving linear algebraic
92
00:15:18,230 --> 00:15:23,420
equations as compared to your undergraduate
courses on ax=b.
93
00:15:23,420 --> 00:15:26,130
you probably thought you knew everything
94
00:15:26,130 --> 00:15:31,790
about ax=b right. gaussian elimination and
then you are done, but what you see here is
95
00:15:31,790 --> 00:15:38,870
far, far more than that. we looked
at sparse matrix methods, efficient ways of
96
00:15:38,870 --> 00:15:43,450
calculating. even there i could cover only a
few of them, just to give you a taste of what
97
00:15:43,450 --> 00:15:45,400
it is; there is much more to it.
98
00:15:45,400 --> 00:15:49,949
there is a sparse matrix toolbox in matlab
or scilab you know there are many routines
99
00:15:49,949 --> 00:15:56,110
which exploit the special structure of a matrix
and do fast computations. the reason for introducing
100
00:15:56,110 --> 00:16:01,620
sparse matrices was to sensitize you that there
exists something called sparse matrix computations.
101
00:16:01,620 --> 00:16:07,130
so in your problem, your m.tech problem
or phd problem, when you hit upon large scale
102
00:16:07,130 --> 00:16:14,380
matrices, try to look for sparse structure; try
to exploit sparsity if you can.
103
00:16:14,380 --> 00:16:24,950
you can make your computation very, very fast.
then next thing we looked at was iterative
104
00:16:24,950 --> 00:16:32,560
methods like jacobi method, gauss-seidel method
and so on. iterative methods in general it
105
00:16:32,560 --> 00:16:40,319
is difficult to prove, but in general they
work faster than gaussian elimination,
106
00:16:40,319 --> 00:16:44,740
particularly for large scale matrices. we
looked at 2 classes of iterative methods one
107
00:16:44,740 --> 00:16:50,990
was gauss-seidel and methods of that kind; the
other one was optimization based: the gradient
108
00:16:50,990 --> 00:16:55,069
method, the conjugate gradient method and so on.
so we have these 2 classes.
109
00:16:55,069 --> 00:17:03,180
in particular for jacobi method and gauss-seidel
method those kind of methods we also analyzed
110
00:17:03,180 --> 00:17:11,930
the convergence behavior. when are you guaranteed
to converge to a solution of ax=b. so we looked
111
00:17:11,930 --> 00:17:17,400
at convergence properties, we also looked
at how to tweak my problem to ensure convergence.
112
00:17:17,400 --> 00:17:28,550
so now we have broadened our toolkit for solving
ax=b we have many more methods now for solving
113
00:17:28,550 --> 00:17:29,550
ax=b.
114
00:17:29,550 --> 00:17:36,540
moreover we know what really matters in iterative
methods: eigenvalues. eigenvalues, you know, probably
115
00:17:36,540 --> 00:17:45,060
unexpectedly popped up when you tried to analyze
this. it is not really unexpected; eigenvalue
116
00:17:45,060 --> 00:17:52,640
problems came up when you tried to analyze
linear difference equations, and then we related
117
00:17:52,640 --> 00:17:58,200
it to the spectral radius, the maximum magnitude
of the eigenvalues of a certain matrix. if it is
118
00:17:58,200 --> 00:18:02,200
inside unit circle we said that we are guaranteed
convergence and so on.
119
00:18:02,200 --> 00:18:10,630
we also looked at some theorems to ensure
convergence and then finally we move to this
120
00:18:10,630 --> 00:18:16,560
matrix conditioning, trying to weed out bad matrices from
121
00:18:16,560 --> 00:18:26,510
the set of matrices. so we now know how to
judge whether a matrix is bad and that is
122
00:18:26,510 --> 00:18:36,340
why you are getting wrong answers, or whether your
problem formulation, the strategy for computing,
123
00:18:36,340 --> 00:18:39,680
is bad while the matrix is good, and you have made
mistakes.
124
00:18:39,680 --> 00:18:47,980
so now you know how to distinguish between
these 2. so with this you have a good idea
125
00:18:47,980 --> 00:18:55,190
of how to deal with ax=b. now let us move
on to solving non linear algebraic equations
126
00:18:55,190 --> 00:19:05,140
so that is what i am going to do next. i will update
my notes for this. so we now start with the
127
00:19:05,140 --> 00:19:07,600
next tool.
128
00:19:07,600 --> 00:19:15,930
let me draw the diagram again, just to
bring you back to the entire theme of this
129
00:19:15,930 --> 00:19:26,340
course. we have this original problem. we
have this mathematical model and some problem
130
00:19:26,340 --> 00:19:36,100
which we cannot solve directly. so we use
this. we had a mathematical model and some
131
00:19:36,100 --> 00:19:45,670
original problem that we wanted to solve.
so using approximation theory we transform
132
00:19:45,670 --> 00:19:52,630
this problem to a computable form and then
we said we are going to look at 4 different
133
00:19:52,630 --> 00:19:56,830
tools or there are 4 different approaches
typically to solve this problem.
134
00:19:56,830 --> 00:20:09,100
i am going to cover 3 of them. so one is solving
ax=b. we could be using this tool to solve
135
00:20:09,100 --> 00:20:23,890
this problem or i could be using f of x=0.
it is quite likely that to do this i might
136
00:20:23,890 --> 00:20:33,910
be using the newton method, which itself uses ax=b
to solve. so this could be directly being
137
00:20:33,910 --> 00:20:45,430
used or it can be indirectly used; we
do not know. the third tool is the ode initial
138
00:20:45,430 --> 00:20:51,620
value problem solver, ivp solvers: the
euler method, runge-kutta methods.
139
00:20:51,620 --> 00:20:57,650
so the third one which we are going to look
at is that ivp solver and of course the fourth
140
00:20:57,650 --> 00:21:12,360
tool is the stochastic tool. i am not going
to look at the stochastic tool. so this one
141
00:21:12,360 --> 00:21:18,160
we are done with. i am moving to this tool
now and towards the end of the course we would
142
00:21:18,160 --> 00:21:25,370
be covering this. this will be left untouched
because it probably would need one more course
143
00:21:25,370 --> 00:21:32,780
to cover stochastic tools. what you
get here is the approximate solution.
144
00:21:32,780 --> 00:21:39,910
this is the approximate solution for the original
problem that you get. so this is done we are
145
00:21:39,910 --> 00:21:47,250
moving to this. eventually we will move to
this and that is end of the course. so this
146
00:21:47,250 --> 00:21:56,460
is the overall structure just to give you
a global picture of what has been happening.
147
00:21:56,460 --> 00:22:04,750
so now let us move on to f of x=0, solving non-linear
algebraic equations. we have already
148
00:22:04,750 --> 00:22:06,240
done something about this.
149
00:22:06,240 --> 00:22:13,320
we have already derived newton method starting
from taylor series approximations. you might
150
00:22:13,320 --> 00:22:18,930
wonder: we have the newton method, so why
do i need many more things? but just like
151
00:22:18,930 --> 00:22:25,650
gaussian elimination is one way of solving
linear algebraic equations you have realized
152
00:22:25,650 --> 00:22:31,250
that newton method is just one approach there
are many ways of doing it and the reason why
153
00:22:31,250 --> 00:22:36,380
there are many ways of doing it is because
there is no method which is a panacea, one method
154
00:22:36,380 --> 00:22:39,670
which works for everything.
155
00:22:39,670 --> 00:22:44,730
sometimes one approach works better sometimes
the other approach works better. so you have
156
00:22:44,730 --> 00:22:52,430
to be ready with multiple tools and you know
use appropriate tool whenever required. in
157
00:22:52,430 --> 00:22:58,120
some cases you do not require the newton method. in
some cases it is not possible to apply the newton
158
00:22:58,120 --> 00:23:05,160
method because newton method requires jacobian
calculation. if i have 100 equations in 100
159
00:23:05,160 --> 00:23:12,400
unknowns you have seen that kind of scenario
in solving partial differential equation.
160
00:23:12,400 --> 00:23:17,510
developing a matrix, even numerically, a
100 x 100 matrix at each
161
00:23:17,510 --> 00:23:23,550
iteration, is painful. it is computationally
intensive, and just imagine if you are trying
162
00:23:23,550 --> 00:23:28,310
to solve a steady state simulation of a complete
chemical plant: thousands of equations to be
163
00:23:28,310 --> 00:23:40,730
solved simultaneously. if you are trying to
simulate a section of a plant many, many thousand
164
00:23:40,730 --> 00:23:44,700
equations non-linear algebraic equations to
be solved simultaneously.
165
00:23:44,700 --> 00:23:53,780
if you have to compute jacobian even numerically
it is not an easy task. so what has happened
166
00:23:53,780 --> 00:24:02,450
is as computers have become more and more
powerful we are also trying to solve problems
167
00:24:02,450 --> 00:24:08,090
which are larger and larger problem. okay
25 years, 30 years back probably nobody thought
168
00:24:08,090 --> 00:24:14,809
of solving a thousand equations in a thousand unknowns
in the classroom; now you can do it as a part
169
00:24:14,809 --> 00:24:20,660
of your assignment, which would probably have
been an m.tech thesis some time back.
170
00:24:20,660 --> 00:24:28,550
so things have changed, because what we
want to solve, with the growing power of computers,
171
00:24:28,550 --> 00:24:38,870
also has changed. so there are problems which
earlier with slow computers would take days
172
00:24:38,870 --> 00:24:48,450
to solve. well now also there are problems
which takes days to solve except what you
173
00:24:48,450 --> 00:24:52,950
are trying now is different from what you
are trying earlier. so even with very, very
174
00:24:52,950 --> 00:24:57,150
fast computers and very fast good software.
175
00:24:57,150 --> 00:25:03,830
you still have problems which are hard, and there
is no end to this. this will just keep on
176
00:25:03,830 --> 00:25:07,090
growing.
177
00:25:07,090 --> 00:25:25,980
so now let us look at different methods for
solving non-linear algebraic equations. so
178
00:25:25,980 --> 00:25:30,140
i can now just work with the abstract form because
you have seen many, many examples where you
179
00:25:30,140 --> 00:25:41,380
have to solve non-linear algebraic equations. so
my intention is to solve fi of x = 0 where i goes
180
00:25:41,380 --> 00:25:53,340
from 1, 2 to n and x belongs to rn. or, since we are
comfortable with the notion of a function vector,
181
00:25:53,340 --> 00:26:10,340
i can write this as a function vector: f of x = 0,
the same problem, where f is a map from rn
182
00:26:10,340 --> 00:26:12,250
to rn.
183
00:26:12,250 --> 00:26:17,670
a more sophisticated way of writing the same
thing is that f is a function vector and you
184
00:26:17,670 --> 00:26:23,550
are trying to look for that value of x where
f of x will give you 0 vector this is 0 vector.
185
00:26:23,550 --> 00:26:30,420
f of x=0. f is a map from rn to rn n dimensions
to n dimensions. so these are the kind of
186
00:26:30,420 --> 00:26:49,510
equations that we are interested in solving.
what would be the simplest method? so first
187
00:26:49,510 --> 00:26:55,600
of all for solving non-linear algebraic equations
except for some very, very special cases where
188
00:26:55,600 --> 00:26:56,950
you can solve them analytically.
189
00:26:56,950 --> 00:27:08,340
if you remove those small set of problems
where for example you can solve multiple dimensional
190
00:27:08,340 --> 00:27:13,920
quadratic equations simultaneously. just like
you can solve a one dimensional quadratic equation,
191
00:27:13,920 --> 00:27:18,030
you can solve multivariable
quadratic equations simultaneously, but these
192
00:27:18,030 --> 00:27:24,580
kind of analytical solutions are very, very
few. in general even if you have a polynomial
193
00:27:24,580 --> 00:27:29,590
in one variable you cannot solve it analytically.
194
00:27:29,590 --> 00:27:38,110
it is very difficult to construct solutions,
or roots, of that equation. so we need methods
195
00:27:38,110 --> 00:27:45,240
that can solve non-linear algebraic equations.
well one thing i would say is that if there
196
00:27:45,240 --> 00:27:52,361
are methods which require less computation, so
much the better. now first of all let us look
197
00:27:52,361 --> 00:28:00,830
at methods which do not require derivatives
calculations. i want to solve f of x=0 without
198
00:28:00,830 --> 00:28:06,830
having to compute derivatives or even if i
have to compute derivatives i can do it in
199
00:28:06,830 --> 00:28:08,809
some simple way.
200
00:28:08,809 --> 00:28:14,540
rather than computing the entire jacobian. so
i am going to give you a gradation of methods.
201
00:28:14,540 --> 00:28:24,540
finally we will of course move to newton methods,
but in newton methods the problematic step, in
202
00:28:24,540 --> 00:28:35,059
terms of large computations, is the jacobian calculation.
so there are methods which do what is
203
00:28:35,059 --> 00:28:44,580
called a jacobian update. jacobian updates
do not require explicit differentiation. they
204
00:28:44,580 --> 00:28:51,430
try to construct an approximate jacobian by
using last value of the jacobian and adding
205
00:28:51,430 --> 00:28:55,140
some correction because you have moved to
a new point.
206
00:28:55,140 --> 00:29:02,200
these methods, broadly called broyden's
updates or quasi-newton methods, are also what
207
00:29:02,200 --> 00:29:04,560
we look at.
208
00:29:04,560 --> 00:29:14,990
so the first class of methods: well, in an iterative
209
00:29:14,990 --> 00:29:21,640
scheme everything is successive substitution,
but this class is specially known as successive
210
00:29:21,640 --> 00:29:27,340
substitution methods. what i mean here by
successive substitution methods is this sub
211
00:29:27,340 --> 00:29:35,250
class of methods in which you do not have
to compute any derivative; that is what i mean
212
00:29:35,250 --> 00:29:36,480
right now.
213
00:29:36,480 --> 00:29:44,390
in general, every iterative method that we are
looking at is a successive substitution.
214
00:29:44,390 --> 00:30:01,090
so the question is can i arrange my calculations
in such a way that you know i start with some
215
00:30:01,090 --> 00:30:11,080
initial guess x0 and then i generate a new
guess from the old guess. i want to solve
216
00:30:11,080 --> 00:30:22,100
for f of x=0. in some situations in some problems
for example tubular reactor with axial mixing
217
00:30:22,100 --> 00:30:30,980
that problem which we have been taking as
a theme example throughout the course.
218
00:30:30,980 --> 00:30:43,050
you can rearrange these equations f of x=0
into a special form ax = g of x, where a is
219
00:30:43,050 --> 00:30:52,490
a constant matrix and g is some non-linear
function. so actually, if you ask what
220
00:30:52,490 --> 00:31:02,880
is f of x: f of x is nothing but ax - g of x.
solving f of x=0 in this particular
221
00:31:02,880 --> 00:31:10,580
case reduces to this problem. in some cases
like tubular reactor axial mixing or some
222
00:31:10,580 --> 00:31:14,700
other things you might get naturally this
kind of a form.
223
00:31:14,700 --> 00:31:30,690
another way of creating this form is just
add x on both sides. if i add x on both sides
224
00:31:30,690 --> 00:31:46,520
and call the right hand side g of x. so this is x = g of
x, or i could in general put some matrix
225
00:31:46,520 --> 00:31:57,410
here, bx = g of x; the form is the same. so i can
either do this transformation or in some cases
226
00:31:57,410 --> 00:32:04,260
the problem discretization will yield this
kind of a form depending upon what kind of
227
00:32:04,260 --> 00:32:20,310
structure the problem has. so you get this
special form. now what can i do with this?
228
00:32:20,310 --> 00:32:32,990
so if i have the special form ax=g of x i
could convert this into solving linear algebraic
229
00:32:32,990 --> 00:32:52,370
equations by a very, very simple trick. so if
i start with some initial guess, let x0 be
230
00:32:52,370 --> 00:33:13,250
my initial guess, then what i am going to do is
solve a x k+1 = g of x k. is everyone
231
00:33:13,250 --> 00:33:21,350
with me on this. see i start with x0 if i
substitute x0 here i can compute this g of
232
00:33:21,350 --> 00:33:28,670
x, so this is known to me. what is not known to
me is x k+1; x1 is not known to me.
233
00:33:28,670 --> 00:33:36,290
but then it becomes a linear algebraic equation:
this is b, this is a, this is ax=b. i can solve
234
00:33:36,290 --> 00:33:46,049
for x k+1 using any
method for ax=b: gaussian elimination
235
00:33:46,049 --> 00:33:56,710
or gauss seidel method or whatever. so this
will generate an iteration and when do you
236
00:33:56,710 --> 00:34:03,760
terminate? and what is the advantage of doing this?
i am not computing the jacobian.
237
00:34:03,760 --> 00:34:21,730
so i will terminate my iterations when the norm
of x k+1 - x k is less than some epsilon. i will
238
00:34:21,730 --> 00:34:29,639
terminate when this becomes less than epsilon.
this method in general
239
00:34:29,639 --> 00:34:38,010
looks very simple to formulate: no jacobian
to compute. well, this method in some
240
00:34:38,010 --> 00:34:48,159
cases does work and when we move on to implicit
methods for solving ode initial value problems
241
00:34:48,159 --> 00:34:51,369
we will see merit in using this method.
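the scheme a x k+1 = g(x k) with the epsilon termination test can be sketched as below; python with numpy, and the 2 x 2 system with its nonlinearity g is a toy example of mine, chosen so that the iteration actually contracts.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

def g(x):
    # mild nonlinearity, so the map x -> solve(A, g(x)) is a contraction
    return np.array([np.cos(x[1]), np.sin(x[0])])

x = np.zeros(2)                          # initial guess x0
for k in range(200):
    x_new = np.linalg.solve(A, g(x))     # solve a x(k+1) = g(x(k))
    if np.linalg.norm(x_new - x, np.inf) < 1e-10:
        x = x_new
        break
    x = x_new

print(x, A @ x - g(x))                   # residual is ~0 at convergence
```

each pass solves only a linear system with a fixed matrix a, so a single factorization of a can be reused across all iterations; no jacobian is ever formed.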
242
00:34:51,369 --> 00:34:56,070
what is very, very critical here is that this
method will converge only if you give an initial
243
00:34:56,070 --> 00:35:02,670
guess which is very close to the solution.
when will this method converge, how will it
244
00:35:02,670 --> 00:35:06,850
converge? we will postpone that discussion
to a later part. i will discuss that
245
00:35:06,850 --> 00:35:14,280
towards the end; at least i will have mentioned
it, though we cannot go too much into
246
00:35:14,280 --> 00:35:21,280
detail. if you give this particular method a good,
reasonable initial guess, this
247
00:35:21,280 --> 00:35:23,640
method will converge to the solution.
248
00:35:23,640 --> 00:35:29,080
okay, generating a good initial guess may not
always be possible, particularly for large
249
00:35:29,080 --> 00:35:36,730
problems. if you are doing a
simulation of an entire section of a plant,
250
00:35:36,730 --> 00:35:42,380
generating an initial guess is no joke; it is
quite difficult. so it might be difficult
251
00:35:42,380 --> 00:35:47,930
to use it there. but in some small problems,
where you can generate an initial guess quite
252
00:35:47,930 --> 00:35:54,520
well, for example the implicit euler method or
the trapezoidal rule, where you can use an explicit
253
00:35:54,520 --> 00:36:01,830
method to create a good guess for the implicit
method, this approach will work quite well.
254
00:36:01,830 --> 00:36:10,040
now while implementing these kinds of methods
i can also have variations which are similar
255
00:36:10,040 --> 00:36:18,190
to the jacobi method, to the gauss-seidel
method, and to the relaxation method.
256
00:36:18,190 --> 00:36:25,190
so i am now going to talk about variants of
successive substitution method which are like
257
00:36:25,190 --> 00:36:31,810
jacobi method or which are like gauss-seidel
method. so when i do that i cannot of course
258
00:36:31,810 --> 00:36:34,240
use this vector matrix notation.
259
00:36:34,240 --> 00:36:42,880
what we did in jacobi we went equation by
equation. so the same thing i am going to
260
00:36:42,880 --> 00:36:45,930
do here.
261
00:36:45,930 --> 00:36:57,070
so i go back to my original form. the
equation that i want to
262
00:36:57,070 --> 00:37:16,300
solve, i am going to rearrange into this form:
xi = gi of x for i = 1, 2, ..., n, where g of x is nothing
263
00:37:16,300 --> 00:37:35,020
but g1 of x, g2 of x, and so on; a small gi is nothing
but one element in the function vector. i am looking
264
00:37:35,020 --> 00:37:39,310
at it element by element. converting into this
form is not difficult; i can pre-multiply
265
00:37:39,310 --> 00:37:41,660
both the sides by a inverse.
266
00:37:41,660 --> 00:37:53,800
so it will be x = a inverse times g of x; removing the a
matrix is not a big deal. so now suppose i have this equation.
267
00:37:53,800 --> 00:38:13,130
and then how will you form jacobi-like iterations?
my jacobi iteration will be xi k+1 = gi of
268
00:38:13,130 --> 00:38:32,680
x k. i am going to use the old values and create
a new value, for i = 1, 2, ..., n. how will you create
269
00:38:32,680 --> 00:38:43,780
gauss-seidel-like iterations? use the new values
as they get created. so gauss-seidel iteration
270
00:38:43,780 --> 00:38:48,420
is a concept: you can use it in the context of
linear algebraic equations, you can use it in
271
00:38:48,420 --> 00:38:53,380
the context of non-linear algebraic equations;
once you understand the concept you can do relaxation
272
00:38:53,380 --> 00:38:55,910
iterations with the same ideas.
273
00:38:55,910 --> 00:39:10,290
so my first equation will be x1 k+1 = g1 of x k.
274
00:39:10,290 --> 00:39:29,130
my second equation: x2 k+1 will be
g2 of, now here i will use x1 k+1, then x2
275
00:39:29,130 --> 00:39:41,480
k, x3 k, up to xn k. well, unlike the linear algebraic
equations, x2 will appear on both sides; because
276
00:39:41,480 --> 00:39:45,800
these are non linear equations you may not
be able to separate them. what will my x3
277
00:39:45,800 --> 00:40:04,560
k+1 be? this will use g3 of x1 k+1, x2 k+1, x3
k, and so on. is this clear? see, as and when a new
278
00:40:04,560 --> 00:40:08,810
value gets created i am using it in the next
equation.
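a minimal gauss-seidel-style sweep for two equations in the form xi = gi(x); the particular g1 and g2 are an illustrative choice of mine, not from the lecture.

```python
import math

# the two equations, rearranged into x_i = g_i(x) form
def g1(x):
    return 0.5 * math.cos(x[1])

def g2(x):
    return 0.5 * math.sin(x[0])

x = [0.0, 0.0]
for k in range(200):
    old = x[:]
    x[0] = g1(x)      # uses the old x[1]
    x[1] = g2(x)      # uses the freshly updated x[0]
    if max(abs(a - b) for a, b in zip(x, old)) < 1e-12:
        break

print(x)              # converged fixed point of the two equations
```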
279
00:40:08,810 --> 00:40:21,120
i am solving n equations,
equation by equation, one equation
280
00:40:21,120 --> 00:40:31,230
at a time. this would be the gauss-seidel iteration,
and i can write a generic form for this: for
281
00:40:31,230 --> 00:40:43,450
the ith equation, use new values up to i-1 and old
values from i to n. so you can write a generic
282
00:40:43,450 --> 00:40:59,500
form for this. how will you create the iteration
for the relaxation method? x new will be x k + omega
283
00:40:59,500 --> 00:41:06,020
times the gauss-seidel step, where omega is > 1
or < 1 depending upon the problem.
284
00:41:06,020 --> 00:41:10,490
here it is difficult to say whether convergence
will occur for omega between 0 and 2 and all that; it
285
00:41:10,490 --> 00:41:17,320
is not possible to say here. in the linear case
we could give necessary and
286
00:41:17,320 --> 00:41:25,190
sufficient conditions for convergence; it is
not possible to do that here. still, inherently,
287
00:41:25,190 --> 00:41:32,560
because you are using the new value every
time it is generated, one would expect that these
288
00:41:32,560 --> 00:41:38,619
gauss-seidel iterations will converge faster
than the jacobi iterations, and so on.
289
00:41:38,619 --> 00:41:46,670
so these iterations would be better in terms
of convergence properties. this cannot be
290
00:41:46,670 --> 00:41:53,380
proved, but at least we can hope that this
is the case.
291
00:41:53,380 --> 00:42:13,761
so one can devise relaxation iterations here
by saying xi k+1 = xi k + omega times the change. so you
292
00:42:13,761 --> 00:42:21,330
make one gauss-seidel iteration, choose that
omega which is positive (omega > 0) and create
293
00:42:21,330 --> 00:42:31,770
a new guess which amplifies the change predicted
by the gauss-seidel step and so on. so one
294
00:42:31,770 --> 00:42:43,859
can have all these kinds of variations here.
so the advantage of these methods is that there is no gradient
295
00:42:43,859 --> 00:42:47,290
evaluation, no jacobian calculation.
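the relaxation sweep described above might be sketched like this; the per-element maps and the value omega = 1.1 are illustrative assumptions, not from the lecture:

```python
import math

# hypothetical per-element fixed-point maps (not from the lecture)
g_funcs = [
    lambda x: (math.cos(x[1]) + 1.0) / 3.0,
    lambda x: (math.sin(x[0]) + 1.0) / 3.0,
]

def relaxation_sweep(x, omega):
    # one sweep: take the gauss-seidel prediction gi(x) for each element,
    # then amplify (omega > 1) or damp (omega < 1) the predicted change
    x = list(x)
    for i, gi in enumerate(g_funcs):
        x[i] = x[i] + omega * (gi(x) - x[i])
    return x

x = [0.0, 0.0]
for _ in range(50):
    x = relaxation_sweep(x, omega=1.1)
```

with omega = 1, each sweep reduces to plain gauss-seidel.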
296
00:42:47,290 --> 00:42:56,540
the flip side is that they will converge only if
you have a good initial guess. if, without
297
00:42:56,540 --> 00:43:01,680
gradient calculation, you have a good initial
guess and it works, great: you are able
298
00:43:01,680 --> 00:43:06,730
to save on computation and you can solve the
equations very fast. if not, you have to go
299
00:43:06,730 --> 00:43:17,880
for gradient based calculations. now i want
to talk about one method which is in between.
300
00:43:17,880 --> 00:43:29,860
this method will use gradient evaluations,
but will not do the full jacobian.
301
00:43:29,860 --> 00:43:39,780
it only calculates some gradients, and so this
particular method is called the wegstein method
302
00:43:39,780 --> 00:43:59,220
or multivariate secant method. so let us now move
on to gradient based methods.
303
00:43:59,220 --> 00:44:09,870
so in the class of derivative based methods
we already looked at newton method. now i
304
00:44:09,870 --> 00:44:16,119
want to revisit newton method for the univariate
case. why i am looking at this will become
305
00:44:16,119 --> 00:44:24,230
clear soon because i want to talk about this
intermediate method called wegstein method.
306
00:44:24,230 --> 00:44:31,950
so the motivation comes from univariate methods.
in the univariate newton method, if i have
307
00:44:31,950 --> 00:44:38,180
f of x=0 where x belongs to r.
308
00:44:38,180 --> 00:44:47,690
i want to solve the one variable equation f of
x = 0. in the newton method, of course, if this function
309
00:44:47,690 --> 00:45:07,890
is differentiable, you can write x k+1 = x k - f
of x k / f prime x k, where f prime
310
00:45:07,890 --> 00:45:16,930
x k is the derivative of f with respect to x. we
can have a slight variation of this method.
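the one variable newton iteration just written can be sketched as follows; the test function x**2 - 2 and the starting guess are illustrative choices, not from the lecture:

```python
def newton(f, fprime, x, tol=1e-12, max_iter=50):
    # classical one-variable newton: x(k+1) = x(k) - f(x(k)) / f'(x(k))
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < tol:
            break
    return x

# illustrative example: root of f(x) = x^2 - 2, starting from x0 = 1
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
```

note the method needs the derivative f prime explicitly, which is exactly what the secant variation below it removes.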
311
00:45:16,930 --> 00:45:28,780
this is the classical newton method. a
slight variation of this method is called the secant
312
00:45:28,780 --> 00:45:29,780
method.
313
00:45:29,780 --> 00:45:43,590
in the secant method what we do is this: the f prime
x k term, this derivative, we approximate using the last
314
00:45:43,590 --> 00:46:06,580
2 iterates. so we approximate it as (f of x k - f of x k-1)/(x k - x k-1).
this small variation is called the secant method
315
00:46:06,580 --> 00:46:17,810
where this f prime x k is replaced by an approximation
of the derivative. here it is not, in a true
316
00:46:17,810 --> 00:46:29,560
sense, a good approximation, because the difference
x k - x k-1 need not be small. so this may
317
00:46:29,560 --> 00:46:35,150
not be a good approximation, but this method
works quite well for many simple problems.
318
00:46:35,150 --> 00:46:41,190
so this variation is called the secant method,
where now, to kick off the secant method, you
319
00:46:41,190 --> 00:46:51,090
need 2 initial guesses, not 1 initial guess:
x0 and x1. then you can create the next point x2 starting
320
00:46:51,090 --> 00:47:02,359
from x0 and x1, because the gradient approximation
will require x0 and x1; you compute the gradient
321
00:47:02,359 --> 00:47:15,500
from the 2 initial guesses. then you can move
on to x2; from x2 and x1 you can create x3,
322
00:47:15,500 --> 00:47:19,480
and from x3 and x2 you can create x4, and so on.
323
00:47:19,480 --> 00:47:27,890
so you start with 2 initial guesses x0, x1,
create x2; using x2 and x1, create x3; and so on.
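the secant recursion, with its 2 starting guesses, might look like this; the bracket values and test function are the same illustrative choices as before:

```python
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    # the derivative f'(xk) is approximated from the last 2 iterates:
    #   (f(xk) - f(xk-1)) / (xk - xk-1)
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:          # flat secant line, cannot divide
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0 = x1, f1       # shift the pair of iterates forward
        x1, f1 = x2, f(x2)
        if abs(x1 - x0) < tol:
            break
    return x1

# needs 2 initial guesses x0 and x1, unlike newton
root = secant(lambda x: x * x - 2.0, 1.0, 2.0)
```

only one new function evaluation is needed per iteration; no derivative is ever computed.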
324
00:47:27,890 --> 00:47:35,000
there is one more variation of this method,
called regula falsi. probably all these
325
00:47:35,000 --> 00:47:40,330
method names which i am talking about right
now might be familiar to you from your b tech
326
00:47:40,330 --> 00:47:46,099
background, because the one variable newton method,
secant method and regula falsi are typically
327
00:47:46,099 --> 00:47:56,180
taught in the undergraduate curriculum. this slight
variation is based on the observation that
328
00:47:56,180 --> 00:48:04,420
if you have a plot where this is my x axis
and i am plotting f of x.
329
00:48:04,420 --> 00:48:14,150
f of x has some behavior like this, and i am looking
for the point where f of x = 0. i am looking
330
00:48:14,150 --> 00:48:19,500
for this point, or i am looking for that point,
where f of x = 0. i am looking for roots of
331
00:48:19,500 --> 00:48:26,869
the equation f of x = 0, which means i am looking
for the point where f of x becomes
332
00:48:26,869 --> 00:48:38,020
0. now one observation is that whenever there
is an interval in which f of x has positive
333
00:48:38,020 --> 00:48:44,230
sign on one side and f of x has negative sign
on the other side f of x crosses 0.
334
00:48:44,230 --> 00:48:50,810
if f of x is a continuous function it will cross
0 somewhere, at least once; it may be multiple
335
00:48:50,810 --> 00:49:00,500
times, we do not know, but at least once it
crosses 0. so this regula falsi method actually
336
00:49:00,500 --> 00:49:09,890
tries to use this idea and make some modification
to secant method. so it starts with 2 initial
337
00:49:09,890 --> 00:49:24,460
guesses. so it will start with x0 and x1 such
that the function evaluated at x0 and the function evaluated
338
00:49:24,460 --> 00:49:31,000
at x1 have opposite signs.
339
00:49:31,000 --> 00:49:38,670
and then as it proceeds in the calculations
it tries to maintain this. it tries to maintain
340
00:49:38,670 --> 00:49:45,600
the fact that 2 successive guesses
should always have function values which are of
341
00:49:45,600 --> 00:49:55,890
opposite sign; if that is maintained then convergence
to the solution can be faster. so this is
342
00:49:55,890 --> 00:50:11,599
where f of x k is > 0 and this is where f of
x k < 0: a scenario where you know the
343
00:50:11,599 --> 00:50:14,869
function value changes sign from positive
to negative.
344
00:50:14,869 --> 00:50:18,810
this is only true for a one variable function.
it is difficult to say something like this
345
00:50:18,810 --> 00:50:23,710
for a multivariable function, where f is a vector. i
346
00:50:23,710 --> 00:50:30,550
am talking of a 1 variable scalar function: if it changes
sign at 2 different points then there is a root
347
00:50:30,550 --> 00:50:37,339
somewhere in between that is the idea. so
the modification here is that
348
00:50:37,339 --> 00:51:02,849
i will just write down this modification here.
you carry out the iterations by this formula
349
00:51:02,849 --> 00:51:04,540
if the product of the function values is less than 0.
350
00:51:04,540 --> 00:51:31,890
and the second case is when the sign is the opposite. so whether you use
x k or x k-1 when you move forward
351
00:51:31,890 --> 00:51:41,750
to compute the derivative approximation will
be based on the sign of f of x k+1 which
352
00:51:41,750 --> 00:51:48,310
you get. so when you go for the new iteration
calculations you keep checking the sign and
353
00:51:48,310 --> 00:51:52,000
based on the sign you make a judgment as to
how to proceed further. so this is the regula
354
00:51:52,000 --> 00:51:54,609
falsi approximation.
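a sketch of the regula falsi bookkeeping: the sign check decides which of the 2 previous points is kept, so that the root always stays bracketed. the bracket [1, 2] and the test function are illustrative, not from the lecture:

```python
def regula_falsi(f, a, b, tol=1e-10, max_iter=200):
    # f(a) and f(b) must have opposite signs, so a root lies in between
    fa, fb = f(a), f(b)
    assert fa * fb < 0.0, "initial guesses must bracket a root"
    c = a
    for _ in range(max_iter):
        # secant-like update using the two bracketing endpoints
        c = b - fb * (b - a) / (fb - fa)
        fc = f(c)
        if abs(fc) < tol:
            break
        if fa * fc < 0.0:
            b, fb = c, fc     # root is between a and c: keep a, replace b
        else:
            a, fa = c, fc     # root is between c and b: keep b, replace a
    return c

root = regula_falsi(lambda x: x * x - 2.0, 1.0, 2.0)
```

unlike the plain secant method, this version can never lose the root, because the pair of points it keeps always has function values of opposite sign.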
355
00:51:54,609 --> 00:52:04,140
now what i am going to do next is use this
univariate method. i am going to use univariate
356
00:52:04,140 --> 00:52:11,160
secant method and create a multivariate secant
method. this multivariate secant method is
357
00:52:11,160 --> 00:52:16,859
called the wegstein method. the advantage of the
multivariate secant method is that the number
358
00:52:16,859 --> 00:52:25,609
of derivative calculations is very, very small,
equal to the number of equations, whereas in the newton
359
00:52:25,609 --> 00:52:29,910
method, the classical newton method, you have
to compute the full jacobian, n cross n
360
00:52:29,910 --> 00:52:33,890
elements, which can be quite large.
361
00:52:33,890 --> 00:52:41,099
so if you see software like aspen plus, it
seems to prefer this wegstein method,
362
00:52:41,099 --> 00:52:45,820
which works quite well and which is a multivariate
secant method. so we will look at it in our
363
00:52:45,820 --> 00:52:46,290
next lecture.