Good afternoon; this is Dr. Pradhan. Welcome to the NPTEL course on econometric modelling. Today we will continue with the reliability part of bivariate econometric modelling. In the last class we discussed reliability in depth, including the structure of reliability for the estimated bivariate econometric model. We take up the same issue again here, because some points were not covered last time.
For two variables Y and X, the fitted model is Y-hat = alpha-hat + beta-hat X. So far as reliability is concerned, we have two specific objectives: first, to test the significance of the parameters, that is, of alpha-hat and beta-hat; second, to test the overall fitness of the model.
The first objective, the significance of the parameters, concerns the weight of each parameter when we fit the regression equation of Y on X.
The impact of X on Y can be negative or positive, and it comes through the slope of the X variable, that is, the beta coefficient; the alpha coefficient captures the significance of the supporting factors. In terms of a straight-line equation, alpha-hat is the intercept and beta-hat is the slope. We want to know whether the intercept, the supporting component, is significant in influencing Y, and whether the X component is significant in influencing Y. There is a standard procedure for this, and that discussion is what we call the reliability of the estimated model.
Let us start with the first objective, the significance of the parameters. For this we represent the estimated model in a standard tabular form, so that we can see the exact structure of the reliability. With the estimated model Y-hat = alpha-hat + beta-hat X, the table is designed as follows: the first column lists the estimated parameters; the second, their estimated values; the third, the variance of the estimated values.
Then come the standard error, the t statistic, and the probability level of significance. That is the structure of the table for the significance of the parameters, our first objective. What are the estimated parameters in this bivariate setup? The first is alpha-hat and the second is beta-hat; together these complete the table.
What is the estimated value alpha-hat? It is alpha-hat = y-bar minus beta-hat x-bar, and beta-hat = Σxy / Σx² in deviation form, which we derived long back. The variance entries are Var(alpha-hat) and Var(beta-hat). The standard errors are the square roots of these variances: SE(alpha-hat) = sqrt(Var(alpha-hat)) and SE(beta-hat) = sqrt(Var(beta-hat)). From these we form the t statistics t(alpha-hat) and t(beta-hat), and we then ask at what level they are significant.
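As a quick illustration of these estimation formulas (alpha-hat = y-bar minus beta-hat x-bar, beta-hat = Σxy / Σx²), here is a minimal Python sketch; the numbers are made up for illustration, not taken from the lecture.

```python
# Deviation-form OLS estimates for the bivariate model Y-hat = alpha-hat + beta-hat X.
# Toy data, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n

# deviations from the mean: the lecture's "small x" and "small y"
x = [xi - x_bar for xi in X]
y = [yi - y_bar for yi in Y]

beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi**2 for xi in x)
alpha_hat = y_bar - beta_hat * x_bar

print(alpha_hat, beta_hat)
```

The same two numbers fill the "estimated values" column of the table; the remaining columns (variance, standard error, t statistic) are built on top of them.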
This table is the theoretical layout; technically, so far as the significance of the parameters is concerned, we have to evaluate these statistics in the proper sequence and compare them with the tabulated values, which we discussed in detail in the last class.
Now, what is the variance of alpha-hat? There is a technical procedure for deriving it, but the result is: Var(alpha-hat) = σ²_u ΣX² / (n Σx²). Note the notation: capital X is the original variable and small x is in deviation form, x = X minus X-bar. Altogether there are four items: σ²_u, ΣX², n, and Σx²; the term Σx² is tied to the variance of X. So what is σ²_u? It is called the error variance. Besides calculating the variance of a particular variable X or Y, we must also calculate the variance of u (or e), because in a bivariate setup we start with the two variables Y and X.
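Putting the pieces of the table together, here is a minimal Python sketch. The data are made-up illustrative numbers; Var(beta-hat) = σ²_u / Σx² is the standard textbook companion to the Var(alpha-hat) formula above and is assumed here, and σ²_u is estimated by Σe² / (n − 2), the error variance developed later in this lecture.

```python
# Var(alpha-hat) = sigma2_u * sum(X^2) / (n * sum(x^2))   (from the lecture)
# Var(beta-hat)  = sigma2_u / sum(x^2)                     (standard result, assumed)
# sigma2_u is estimated by sum(e^2) / (n - 2), the error variance.
import math

X = [1, 2, 3, 4, 5]                      # toy data, for illustration only
Y = [2, 4, 5, 4, 5]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
x = [xi - x_bar for xi in X]             # deviation form, small x

beta_hat = sum(xi * (yi - y_bar) for xi, yi in zip(x, Y)) / sum(xi**2 for xi in x)
alpha_hat = y_bar - beta_hat * x_bar

e = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(X, Y)]   # residuals
sigma2_u = sum(ei**2 for ei in e) / (n - 2)                      # error variance

var_alpha = sigma2_u * sum(xi**2 for xi in X) / (n * sum(xi**2 for xi in x))
var_beta = sigma2_u / sum(xi**2 for xi in x)

# t statistics for the significance table
t_alpha = alpha_hat / math.sqrt(var_alpha)
t_beta = beta_hat / math.sqrt(var_beta)
print(t_alpha, t_beta)
```

Each t statistic would then be compared with the tabulated critical value, as described above.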
But ultimately, with the help of the estimated model, we create another variable, called u, otherwise called the error term. Once we have a fitted model, the entire system consists of four important columns. The first is the Y column; it gives the information about the structure of Y, from which we can get the variation, the standard deviation, and the mean of Y. Those are the statistics we draw from the Y column.
Similarly, in the X column we have the series of X values corresponding to the Y values, so we can also get the full descriptive statistics of the X variable. Next to X we create a variable called Y-hat, where Y-hat = alpha-hat + beta-hat X; with the alpha-hat and beta-hat values and the X information we can fill in the Y-hat column. The Y-hat column also has its own descriptive statistics, because Y-hat is altogether another variable, constructed from Y and X through the estimated parameters alpha-hat and beta-hat.
Then, from Y and Y-hat, we create one more column, the error column, written as the u column or the e column. Corresponding to every pair of Y and Y-hat values we compute the error component; for instance, e1 = y1 minus y1-hat.
Similarly, e2 = y2 minus y2-hat, and so on; the difference between the actual Y and the estimated Y gives the error representation. Once we have the error series e1 through en, where n is the number of observations, we calculate the error variance. This error variance is the σ²_u we need: σ²_u = Σe² / (n − 2).
We can put it another way: Σe² / (n − k), where k is the number of parameters in the system (which here equals the number of variables as well). Since this model is bivariate, there are two variables and two parameters, the alpha parameter and the beta parameter, so k = 2; there is no point writing Σe² / (n − k), because k is already known to us. But when there is a multivariate system, the term must be written as Σe² / (n − k).
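The degrees-of-freedom rule is easy to sketch in code. The residuals below are made-up illustrative numbers, not lecture data.

```python
# Error variance = sum(e^2) / (n - k), where k is the number of estimated
# parameters: k = 2 in the bivariate model, k = 3 in a trivariate one.
def error_variance(residuals, k):
    """Return sum(e^2) / (n - k) for a model with k estimated parameters."""
    n = len(residuals)
    return sum(e**2 for e in residuals) / (n - k)

e = [-0.8, 0.6, 1.0, -0.6, -0.2]            # residuals from a bivariate fit, n = 5
sigma2_bivariate = error_variance(e, k=2)    # divides by n - 2 = 3
sigma2_trivariate = error_variance(e, k=3)   # a trivariate model would divide by n - 3
print(sigma2_bivariate, sigma2_trivariate)
```

The only thing that changes as the model grows is the divisor n − k.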
For instance, if we have a trivariate model, then it is Σe² / (n − 3), because there are three parameters in the system; as we extend the model one step after another, the n − k component keeps expanding. So σ²_u = Σe² / (n − 2), where Σe² = Σy² − Σy-hat². Let me explain how this comes about; there is a standard technical procedure for deriving it, and we can analyze it here.
We start from the system y = y-hat + e; call this equation (1). Subtract y-bar from both sides, which gives y − y-bar = y-hat − y-bar + e. Picture the scatter of the data: this is the x series and this is the y series; corresponding to y we have its mean y-bar, and corresponding to x we have x-bar.

With the y and x information, our objective is the estimated line, that is, the best-fitted line Y-hat = alpha-hat + beta-hat X. One point on it is very relevant: at the point of means, y-hat-bar is exactly equal to y-bar. So we can write the whole relation as y − y-bar = y-hat − y-hat-bar + e, because y-bar and y-hat-bar are equal at that point of equilibrium; it is not an issue. Instead of carrying the capitals, we write small y for the deviation of y, small y-hat for the deviation of y-hat, and e is the error term as usual. So from the original equation Y = Y-hat + e in levels, we have transferred everything into deviation form, y = y-hat + e; there is a real difference between the level form and the deviation form, and the deviation form is the simpler one to work with.

Now, for simplicity's sake, what do we do? We first apply a square on both sides and then apply the summation, to get the entire structure: Σy² = Σ(y-hat + e)², with i running from 1 to n, because i indexes the sample units; we are in the process of cross-sectional modelling, so the sample unit i runs from 1 up to n. Expanding the right-hand side, Σy² = Σy-hat² + Σe² + 2Σ(y-hat · e). But this cross term is exactly equal to zero. The question is how it becomes zero, so let me explain.

Our point is to prove that Σ(y-hat · e) = 0. First of all, what is small y-hat? It is y-hat minus y-hat-bar, which equals y-hat minus y-bar. If we simplify, it is (alpha-hat + beta-hat X) minus (alpha-hat + beta-hat x-bar); the alpha-hats cancel, leaving beta-hat (X − x-bar), which is beta-hat times small x, the deviation. That is one part of the problem. The other part is e = y − y-hat, which is nothing but e = y − beta-hat x. Now we integrate the two: Σ(y-hat · e) = Σ beta-hat x (y − beta-hat x), with i = 1 to n throughout. (Strictly, the term in the expansion is 2Σ(y-hat · e), but if we prove Σ(y-hat · e) = 0 then obviously 2 × 0 = 0.) Take beta-hat common: beta-hat [Σxy − beta-hat Σx²]. And what is beta-hat? Beta-hat is nothing but Σxy / Σx², so in the second term the Σx² cancels: beta-hat [Σxy − Σxy] = beta-hat × 0 = 0.

So we have proved that Σ(y-hat · e) = 0, and therefore Σy² = Σy-hat² + Σe². To recap the process: we started with Y = Y-hat + e, transferred it into deviation form y = y-hat + e, and after squaring and summing we received Σy² = Σy-hat² + Σe², each summation running over i = 1 to n. This is how the entire structure comes about; our point in deriving it is to justify the significance of the alpha parameter and the beta parameter.
So, we need the variance of alpha-hat, and to have the variance of alpha-hat and the variance of beta-hat we must bring in the error variance again, because Var(alpha-hat) depends on the error variance, and so does Var(beta-hat). We therefore want to know the exact component of the error variance. By the process just explained, we are at the stage Σy² = Σy-hat² + Σe². The first term is called TSS, the second ESS, and the third RSS; that is, the total sum of squares, the explained sum of squares, and the residual sum of squares. The residual sum of squares is otherwise known as the unexplained sum of squares.

What exactly are these terms? TSS = Σ(y_i − y-bar)², for i = 1 to n; ESS = Σ(y-hat_i − y-bar)²; and RSS = Σ(y_i − y-hat_i)². In fact, the entire process started from here, because the whole model rests on e = y − y-hat, and by minimizing the sum of squared errors we received the alpha-hat component and the beta-hat component. Now, to justify the significance of the parameters alpha-hat and beta-hat, we come back down to this decomposition. There are lots of interesting facts behind this particular structure, so let us see.

We have Σy² = Σy-hat² + Σe²; that means the total sum of squares equals the explained sum of squares plus the residual sum of squares. I highlighted this earlier: we begin with the y series and the x series, and through the fitting process we receive the error component; that is what the statistics, or econometrics, here is all about. We want to verify whether x totally explains the y component, or x partly explains y and the remaining part must be accounted for in some other way. If x does not explain y one hundred percent, then there is some lacking part, and that lacking part is what we call the residual. So for the y series, the total sum of squares Σ(y_i − y-bar)² measures the variation of all the points from the arithmetic mean; the explained sum of squares is Σ(y-hat − y-hat-bar)²; and the rest, Σe², is the residual sum of squares.

Now put it technically. Call the decomposition equation (1), and divide both sides by Σy². Then Σy²/Σy² = Σy-hat²/Σy² + Σe²/Σy². The left-hand term is exactly equal to 1, so 1 = Σy-hat²/Σy² + Σe²/Σy², with each summation over i = 1 to n. We now have two parts; call them part A and part B. Let us first explain part A, Σy-hat²/Σy². What is small y-hat exactly? It is beta-hat x, so part A is Σ(beta-hat x)² / Σy² = beta-hat² Σx² / Σy². And what is beta-hat? Beta-hat = Σxy / Σx².

So, writing it out again: Σy-hat²/Σy² = (Σxy)² / (Σx²)² × Σx² / Σy². One Σx² cancels, so it is nothing but (Σxy)² / (Σx² Σy²). That is what we have received from part A, the ratio of the explained sum of squares to the total sum of squares. To recap the exact chain: Σy² = Σy-hat² + Σe², that is, TSS = ESS + RSS; dividing through by the total sum of squares makes the left side equal to 1, the first term on the right the ratio ESS/TSS, and the second the ratio RSS/TSS.
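Both results of this derivation, TSS = ESS + RSS and ESS/TSS = (Σxy)² / (Σx² Σy²), can be verified on a toy data set (made-up numbers):

```python
# Check the sum-of-squares decomposition and the closed form for ESS/TSS.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
x = [xi - x_bar for xi in X]            # deviations
y = [yi - y_bar for yi in Y]

beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi**2 for xi in x)
alpha_hat = y_bar - beta_hat * x_bar
Y_hat = [alpha_hat + beta_hat * xi for xi in X]

TSS = sum(yi**2 for yi in y)                          # sum (y - ybar)^2
ESS = sum((fi - y_bar)**2 for fi in Y_hat)            # sum (yhat - ybar)^2
RSS = sum((yi - fi)**2 for yi, fi in zip(Y, Y_hat))   # sum e^2

ratio = ESS / TSS
closed_form = sum(xi * yi for xi, yi in zip(x, y))**2 / (
    sum(xi**2 for xi in x) * sum(yi**2 for yi in y))

print(TSS, ESS + RSS)      # decomposition: equal up to rounding
print(ratio, closed_form)  # part A equals its closed form
```

Both printed pairs agree, confirming the algebra of part A.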
Now, we like to know if we have 254 00:29:29,460 --> 00:29:34,979 a component explained sum square to total sum square, what is that issue and if you 255 00:29:34,979 --> 00:29:41,309 know the ratio component is residual sum square divided by total sum square, what is that 256 00:29:41,309 --> 00:29:46,769 component? So, then we have to now you know, interpret accordingly. 257 00:29:46,769 --> 00:29:53,469 Now, by this process we are in the, we are you know coming to a position that summation 258 00:29:53,469 --> 00:29:59,529 y hat square by summation y square that is nothing but, ESS by TSS is nothing but, summation 259 00:29:59,529 --> 00:30:05,159 x y square by summation x square into summation y square. This is what we call it is just 260 00:30:05,159 --> 00:30:10,779 like r square this is what we call it a r square; that is what is r square r square 261 00:30:10,779 --> 00:30:17,779 is nothing but, square of square of correlation coefficient this is what is called as a correlation 262 00:30:21,339 --> 00:30:26,849 coefficient. You see, what is correlation? Then, correlation 263 00:30:26,849 --> 00:30:33,809 is simply nothing but, covariance of X Y divided by sigma x into sigma y. If we will simplify 264 00:30:33,809 --> 00:30:40,299 further then it is nothing but, summation x minus x bar into y minus y bar divided by 265 00:30:40,299 --> 00:30:47,299 n or divided by summation x square by n square root; then summation y square by n square 266 00:30:48,339 --> 00:30:55,339 root; so this n this n this n cancelled, alright. Now, if this is R component, this particular 267 00:30:57,809 --> 00:31:04,739 component is nothing but, summation x y this is summation x y divided by summation x square 268 00:31:04,739 --> 00:31:11,049 into summation y square alright. 
Now, if we will make it square then obviously r square 269 00:31:11,049 --> 00:31:18,049 equal to summation x y whole square divided by summation x square into summation y square 270 00:31:19,029 --> 00:31:25,959 so what we have is received from here only so that means this particular ratio explains 271 00:31:25,959 --> 00:31:32,959 some square to total sum square is nothing but, the r square component that means what 272 00:31:33,139 --> 00:31:39,979 is r square here r square represent the square of correlation coefficient but, you know this 273 00:31:39,979 --> 00:31:44,879 particular component is very much true when we are in the bivariate process but, when 274 00:31:44,879 --> 00:31:51,619 there is multivariate process then this uh you know ratio between explained sum square 275 00:31:51,619 --> 00:31:55,989 to total sum square cannot be represented as a simple correlation coefficient that is 276 00:31:55,989 --> 00:31:59,379 something different. What is this difference? The difference is 277 00:31:59,379 --> 00:32:05,330 actually, this particular r square component is represented as a coefficient of determination 278 00:32:05,330 --> 00:32:12,330 so this particular component this r square component is represented as a coefficient 279 00:32:12,909 --> 00:32:19,909 of determination; this particular item is represented as a coefficient of determination; 280 00:32:23,049 --> 00:32:25,700 so, what is this coefficient of determination? 281 00:32:25,700 --> 00:32:32,700 Now, coefficient of determination that means, you see here we had, we have here y square 282 00:32:34,159 --> 00:32:41,159 is equal to summation y square equal to summation y hat square plus summation e square. So, 283 00:32:41,859 --> 00:32:48,099 that is what we have received; 1 equal to summation y hat square by summation e y square 284 00:32:48,099 --> 00:32:55,099 plus summation e square by summation y square. 
This is what we obtained, and by the same process the first term is known as ESS/TSS and the second as RSS/TSS.

Now, by the derivation above, ESS/TSS is nothing but R². Usually, when we represent the coefficient of determination, we write it as capital R²; what we wrote earlier is the small r². In the case of bivariate models, small r² and capital R² are the same. That means, in the bivariate model, the coefficient of determination and the square of the correlation coefficient coincide, but the interpretation is somewhat different. With the correlation coefficient, what do we study? The degree of association between the two variables.

Here, with capital R², we judge the ratio of explained sum of squares to total sum of squares: the explained sum of squares is the variation accounted for by the x component, and the total sum of squares is the total variation of the y component, which is the dependent component.
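The identity just derived, ESS/TSS = r², can be checked numerically. Below is a minimal sketch (the data and variable names are my own, purely for illustration, not from the lecture) that fits ŷ = α̂ + β̂X by OLS on a small sample and confirms both that ESS/TSS equals (Σxy)²/(Σx²·Σy²) and that Σy² = Σŷ² + Σe² in deviation form.

```python
# Minimal check that ESS/TSS equals the squared correlation coefficient
# in the bivariate OLS model (x, y below are deviations from the means).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
x = [xi - xbar for xi in X]          # deviations of X
y = [yi - ybar for yi in Y]          # deviations of Y

beta_hat = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
alpha_hat = ybar - beta_hat * xbar

# Fitted deviations and residuals: in deviation form, ŷ − ȳ = β̂x
yhat = [beta_hat * xi for xi in x]
e = [yi - yh for yi, yh in zip(y, yhat)]

TSS = sum(yi * yi for yi in y)       # total sum of squares
ESS = sum(yh * yh for yh in yhat)    # explained sum of squares
RSS = sum(ei * ei for ei in e)       # residual sum of squares

r_squared = sum(a * b for a, b in zip(x, y)) ** 2 / (
    sum(a * a for a in x) * sum(b * b for b in y)
)

print(abs(ESS / TSS - r_squared) < 1e-12)  # the two ratios coincide
print(abs(TSS - (ESS + RSS)) < 1e-9)       # Σy² = Σŷ² + Σe²
```

Both checks print True: the decomposition and the r² identity are exact up to floating-point rounding.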
Now, we would like to know the percentage influence of the independent variable on the dependent variable, or, in other words, of the explanatory variable on the explained variable. That means R² is the ratio of explained sum of squares to total sum of squares, and by the identity above, 1 = R² + RSS/TSS.

Now, there is a beautiful interpretation here. We know the correlation coefficient always lies in the range −1 ≤ r ≤ 1; that is the range of the correlation coefficient. R², on the other hand, always lies between 0 and 1; that is the range of the coefficient of determination. So what is the coefficient of determination? Technically, it is the ratio of explained sum of squares to total sum of squares; by physical interpretation, it is the ratio of the explained variation to the total variation in y.

So this is how it is called R², or the coefficient of determination.
The coefficient of determination is nothing but the proportion of the variation in y which is explained by the variation in x. The ESS/TSS term is the proportion of the variation in y explained by x, and the RSS/TSS term is the proportion of the variation in y which is not explained; the total sum of squares, Σy², is our total component.

We would like to know what part of y is accounted for by x and what part by e; that is why these terms are expressed as proportions of the variation in y. ESS depends entirely on the x component, while RSS is the portion of the variation in y which is not properly explained; that part is taken care of by the u component. So 1 = R² + RSS/TSS.

Now, since R² lies in the range 0 to 1, it gives us the model's signal: it indicates the reliability of the model. So far as the second objective is concerned, you see, we started with the first objective, and by this route we are now ready to explain the second objective, which is the overall fitness of the model.
Now, the moment we get R², the proper question is how to obtain this R² and how to assess its statistical level of significance; because the first objective, concerning α̂ and β̂, is handled by applying the t statistic, whereas for the overall fitness of the model we have to use the F statistic.

Now, we will explain how we obtain the error variance and how it is connected to the total variation of y and of x. By this process we explain the structure of the significance test for the individual parameters α̂ and β̂; and on the other side, by using TSS, ESS and RSS, we explain how the overall fitness of the model can be judged statistically significant. So we have two clear-cut objectives in mind: first, the significance of the parameters, and second, the significance of the overall fitness of the model. Before I lay out the entire structure of the significance test for R² and for the individual parameters, let us look at the behaviour of R² itself: its value always lies between 0 and 1, so what is the situation when it is 0, and what is the situation when it is 1? Let us see.
Now, the whole relation is R² + Σe²/Σy² = 1; this is what we have observed. Since this is our target, what do we do? We write R² = 1 − Σe²/Σy²; that is what we obtain from this simplification.

Case 1: if R² = 1, then what happens? If R² = 1, the term Σe²/Σy² is exactly 0. That means the model fits the problem perfectly. So, when R² is 1, it is the best-fitted model. When R² is exactly 1, the percentage of unexplained variation is exactly 0; that means u has no impact on the y variable, and the influence of x on y is 100 percent. This is the case where R² is exactly equal to 1. But in real-life problems it is very difficult to find a situation where R² is exactly 1. When R² equals 1, the model is called completely fitted, or a perfect fit.
This is what we call a perfectly fitted model, but R² = 1 is not the sufficient condition; it is the necessary condition. When R² = 1, the overall fitness of the model is very high, excellent in fact, so the estimated model is completely fitted and can be used for forecasting. But the sufficient condition is this: when R² is exactly 1, then, corresponding to the first objective, α̂ and β̂ must also be highly significant. Only then can we say it is the best-fitted model. Otherwise, if R² is exactly 1 and the overall significance of the model is very high, but on the other side the parameters are not statistically significant, or only a few parameters are significant and the others fail even at a low level, then the model cannot be used for forecasting. Even with R² = 1, if not all parameters are statistically highly significant, there is a serious problem in the modeling; there will be some complex problem in between. We have not highlighted those complex problems here; we will discuss them in detail as we proceed.
So we will take that up at a later stage, not now. For now, we can say that when R² = 1, we interpret the model as perfectly fitted, other things remaining constant.

Case 2: when R² = 0, the model is completely unfit. That means the entire variation in y comes from u alone: RSS/TSS = 1 and ESS/TSS = 0. Conversely, when R² = 1, Σe²/Σy² = 0 and Σy² = Σŷ². When the model is completely unfit, Σy² = Σe².

But this is rare; it is a very extreme situation. The reality is that when we are in the process of fitting a model, we must have some theoretical knowledge, and when we have theoretical knowledge, then in most instances R² cannot be 0. It may be very low, but it cannot be 0. If your R² value comes out to be 0, that means your theory is not sound; that is, the identification of the problem in relation to the variables has not been done systematically.
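The two extreme cases can be illustrated directly from R² = 1 − Σe²/Σy². The sketch below (my own toy data, not from the lecture) computes R² for a sample lying exactly on a line, where it comes out 1, and for a sample in which the deviations of x and y are orthogonal (Σxy = 0, so β̂ = 0), where it comes out 0.

```python
def r_squared(X, Y):
    """Bivariate R² = 1 − RSS/TSS from the OLS fit ŷ = α̂ + β̂X."""
    n = len(X)
    xbar, ybar = sum(X) / n, sum(Y) / n
    x = [xi - xbar for xi in X]          # deviations of X
    y = [yi - ybar for yi in Y]          # deviations of Y
    beta = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    rss = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum(yi ** 2 for yi in y)
    return 1.0 - rss / tss

# Case 1: y is an exact linear function of x, so RSS = 0 and R² = 1.
print(r_squared([1, 2, 3, 4], [3, 5, 7, 9]))

# Case 2: Σxy = 0 for these points, so β̂ = 0, RSS = TSS, and R² = 0.
print(r_squared([1, 2, 3], [5, 1, 5]))
```

The first call prints 1.0 and the second 0.0, matching the "perfectly fitted" and "completely unfit" extremes described above.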
So there is some kind of problem; that is why, before fitting a particular model, your theoretical knowledge must be very sound and you must be in a position to identify the structural variables exactly. If your initial homework is thorough, then the later stages of modeling will not face problems. Otherwise, it becomes a continuous process until you get the best-fitted model: if you do not proceed stepwise, you will keep going back to the original position until you get a better-fitted model. That is why each and every stage should be perfect before we go to the next stage.

So, in reality we have two extremes, R² = 1 on one side and R² = 0 on the other, and both are instructive. What actually happens is that when the R² value is close to 1, we call it a better-fitted model; we cannot call it the best-fitted model, because we call it the best-fitted model only when R² equals 1 exactly. When R² is close to 1, the fitness of the model starts increasing: you start from R² = 0.01, 0.02, 0.03, 0.04, 0.05, and so on, moving up toward 1. Now, we have different ranges; in fact, the range looks like this.
Take a case here: the scale runs 0, 0.1, 0.2, 0.3, and so on, through 0.5 in the middle, up to 1.0; this is how R² ranges, with 0 ≤ R² ≤ 1. If you move toward 0, the model fitness, or model accuracy, starts declining; when you move toward 1, the model accuracy starts increasing. Our objective is always to move toward 1, not toward 0, so that the overall fitness of the model keeps increasing. When your R² value is close to 1, it is like a green signal: it is the mark of a well-fitted model. When we are close to 0, it gives a red signal, meaning we are diverging from the best-fitted model. So we should not move toward the red signal; we have to move toward the green signal, where the model accuracy is increasing. This should be our main agenda throughout this process.

Now, coming back to the original position, what is the actual structure? Our objective here is to test R², whether it is statistically significant or not.
Further, we have to prepare the ANOVA table, just as we prepared a table for the first objective. For the first objective, the structure is like this: the target components are t(α̂) and t(β̂). We obtained the error variance σ̂²u = Σe²/(n − 2). From that, the variance of α̂ follows, and the standard error of α̂ is nothing but the square root of the variance of α̂. Similarly, the variance of β̂ is σ̂²u/Σx², and the standard error of β̂ is the square root of the variance of β̂. That is the structure for the β̂ parameter; for the α̂ parameter, the variance is σ̂²u·ΣX²/(n·Σx²), where ΣX² uses the raw values and Σx² the deviations, and its standard error is again the square root of that term. So this is the standard-error structure.
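These variance and standard-error formulas can be sketched as follows (toy data of my own, purely for illustration; as in the lecture, lowercase x denotes deviations and uppercase X raw values).

```python
import math

# Hypothetical sample, for illustration only
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [2.2, 4.1, 5.8, 8.3, 9.9, 12.2]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
x = [xi - xbar for xi in X]                      # deviations of X
y = [yi - ybar for yi in Y]                      # deviations of Y

beta_hat = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
alpha_hat = ybar - beta_hat * xbar
e = [Yi - (alpha_hat + beta_hat * Xi) for Xi, Yi in zip(X, Y)]

sigma2_u = sum(ei ** 2 for ei in e) / (n - 2)    # σ̂²u = Σe² / (n − 2)
var_beta = sigma2_u / sum(a ** 2 for a in x)     # var(β̂) = σ̂²u / Σx²
var_alpha = sigma2_u * sum(Xi ** 2 for Xi in X) / (n * sum(a ** 2 for a in x))
se_beta = math.sqrt(var_beta)                    # SE(β̂) = √var(β̂)
se_alpha = math.sqrt(var_alpha)                  # SE(α̂) = √var(α̂)

print(f"beta_hat = {beta_hat:.4f}, SE(beta_hat) = {se_beta:.4f}")
print(f"alpha_hat = {alpha_hat:.4f}, SE(alpha_hat) = {se_alpha:.4f}")
```

The printed standard errors are exactly the quantities that enter the t statistics in the next step.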
Once you have the variance of α̂, you can get the standard error of α̂. So what is the issue here? Our objective is to know whether α̂ is significant and whether β̂ is significant. So we need to calculate t(α̂) and t(β̂), and to know the significance of α̂ and β̂ we have to use statistical hypothesis testing.

Basically, a statistical hypothesis is divided into two parts, called the null hypothesis and the alternative hypothesis. We start with the null hypothesis. Suppose our target is to test whether α is significant; α being significant means α must have some nonzero value, and on that basis we test its significance. So we start with the null hypothesis H0: α = 0, and we test it against the alternative H1: α ≠ 0.
Once you reject this null hypothesis, you are on the right track; if you cannot reject it, then that variable may not be statistically significant. So t(α̂) is calculated as α̂ divided by the standard error of α̂, and t(β̂) is β̂ divided by the standard error of β̂. These are the calculated statistics, which have to be compared with the tabulated statistics. Then we get to know whether the parameter is statistically significant or not, and if it is significant, at what level. We have different significance levels: 1 percent, 5 percent and 10 percent, each one-tailed or two-tailed. The standard procedure is to start at the 1 percent level; if the parameter is not significant there, move to 5 percent, and if it is not significant at 5 percent, go to 10 percent. But if you get significance at 1 percent, then your model accuracy is very high and the reliability of the model is very good.
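This testing procedure can be sketched as follows. The numbers here are hypothetical (a calculated β̂ and its standard error, as from the previous step), and the two-tailed critical values are hard-coded from a standard t table for 4 degrees of freedom rather than computed.

```python
# Hypothetical calculated estimate and standard error
beta_hat, se_beta = 1.99, 0.12
t_beta = beta_hat / se_beta                  # calculated t statistic

df = 4                                       # n − 2 degrees of freedom, n = 6
# Two-tailed critical values from a standard t table for df = 4
critical = {"1%": 4.604, "5%": 2.776, "10%": 2.132}

# Start at the 1% level and fall back to 5%, then 10%, as described above
for level in ("1%", "5%", "10%"):
    if abs(t_beta) > critical[level]:
        print(f"reject H0: beta = 0 at the {level} level (|t| = {t_beta:.2f})")
        break
else:
    print("cannot reject H0: beta = 0 even at the 10% level")
```

With these numbers |t| ≈ 16.6, so the loop stops immediately and H0 is rejected at the 1 percent level, the strongest outcome in the lecturer's stepwise procedure.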
If we get significance only at the 10 percent level, yes, the model is reliable, but the degree of reliability is much lower; when the variable is statistically significant at the 1 percent level, the model reliability, or model accuracy, is very high. So we have to design, or redesign, the model in such a way that the parameters involved in the modeling system are highly statistically significant, mostly at the 1 percent level; only then does the model reliability satisfy the first-order condition.

Again, for the sufficient condition we have to go to R²; that means there are two requirements here. All your parameters should be statistically significant at a high level, the 1 percent level, and at the same time your R² should also be statistically significant at the 1 percent level, or at least at a high level. If so, then the model is absolutely fit for forecasting. But if the parameters are significant and R² is not significant, or R² is significant and the parameters are not, then the problem is very complicated.
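The R² side of this check uses the F statistic, which the lecture has not yet written out; as a sketch, for the bivariate model F = ESS/(RSS/(n − 2)) = R²(n − 2)/(1 − R²) with (1, n − 2) degrees of freedom. The R² value below is hypothetical and the critical value is hard-coded from a standard F table.

```python
# Hypothetical bivariate fit: R² from the estimation step, sample size n
r2, n = 0.95, 20
# F = [ESS/1] / [RSS/(n−2)] = R²(n−2)/(1−R²), with df = (1, n−2)
F = r2 / ((1 - r2) / (n - 2))

# 1% critical value of F(1, 18) from a standard F table
F_crit_1pct = 8.29
print(F > F_crit_1pct)  # overall fit significant at the 1% level
```

Here F = 342, far above the tabulated value, so the overall fitness of the model is significant at the 1 percent level, matching the sufficient condition described above.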
In such a mismatch, there is some kind of fault or problem in the process, and it has to be investigated further; certain problems are sitting in between, which is why we get the first part but not the second. The system will work perfectly only when the parameters are significant and R² is also statistically significant. If not, there is a serious issue with the estimated model, and you have to redesign or rebuild until you get the best-fitted model, where the parameters are statistically significant and your R² is statistically significant as well. We will discuss the details in the next class; thank you very much; have a nice day.