1 00:00:13,880 --> 00:00:21,000 hi welcome to today's lecture so i will start with a brief recap of what we have 2 00:00:21,000 --> 00:00:24,820 discussed in the last lecture so in the last lecture we began with standard deviation we had a 3 00:00:24,820 --> 00:00:29,190 recap of the z score and plotting box plots and towards the end of the last lecture 4 00:00:29,190 --> 00:00:33,750 we had discussed moments as a way of characterizing data ok so the definition of 5 00:00:33,750 --> 00:00:38,700 a moment as you would recall is in general so given a set of observations y i of a variable 6 00:00:38,700 --> 00:00:44,020 y the rth sample moment about zero is defined as m r star is equal to summation y to the 7 00:00:44,020 --> 00:01:13,350 power r by n for r equal to one two three dot dot dot ok so clearly if we set the 8 00:01:13,350 --> 00:01:20,149 value of r equal to one so m one star is summation y by n which is nothing but the mean so in 9 00:01:20,149 --> 00:01:32,420 other words the first moment about zero of a set of observations is the mean of the distribution 10 00:01:32,420 --> 00:01:36,659 ok so we can next define in a more general sense the rth sample moment about any particular 11 00:01:36,659 --> 00:01:37,840 value and in particular we want to know the rth sample moment about the mean so the rth 12 00:01:37,840 --> 00:01:54,600 sample moment about the mean m r is defined by summation y minus y bar whole to the power r by 13 00:01:54,600 --> 00:02:08,250 n for r equal to one two three dot dot dot so clearly when you say about the mean 14 00:02:08,250 --> 00:02:12,810 if we are to generalize it about a value a then instead of y bar we will put the value a 15 00:02:12,810 --> 00:02:13,810 ok so again as we did before if you put r equal to one so the first moment about the mean 16 00:02:13,810 --> 00:02:17,720 is summation of y minus y bar whole to the power one by n and summation y minus y bar 17 00:02:17,720 --> 00:02:23,500 is going to give you a 
value of zero as we had determined in the last lecture 18 00:02:23,500 --> 00:02:34,490 so the first moment about the mean is zero what about the second moment about the mean 19 00:02:34,490 --> 00:02:47,930 lets say n is reasonably large so your m two is nothing but summation y minus y bar 20 00:02:47,930 --> 00:03:03,870 whole square by n and you can clearly see that this is very close to what 21 00:03:03,870 --> 00:03:08,850 is our definition of the variance ok as opposed to dividing by n minus one we have divided 22 00:03:08,850 --> 00:03:13,840 by n but for n large your m two is practically the sample variance ok again we can use 23 00:03:13,840 --> 00:03:17,670 it for getting higher moments like the third moment about the mean summation y minus y bar whole to the 24 00:03:17,670 --> 00:03:28,421 power three by n so one aspect that we discussed in the last class was that depending on the nature 25 00:03:28,421 --> 00:03:39,040 of the distribution that is for a symmetric distribution all odd moments about the mean which means m one m three 26 00:03:39,040 --> 00:03:48,340 m five m seven so on and so forth will return you a value of zero this is because for every 27 00:03:48,340 --> 00:04:01,210 value of y which is situated to the left of y bar there is another value of y 
which 28 00:04:01,210 --> 00:04:06,340 is situated at the same distance to the right of y bar and the frequencies of these two values are equal 29 00:04:06,340 --> 00:04:10,640 which means that for every negative value that you accumulate for lets say y one minus 30 00:04:10,640 --> 00:04:17,190 y bar there is a corresponding y two minus y bar which is positive and of equal magnitude so 31 00:04:17,190 --> 00:04:25,289 these will cancel each other out eventually giving you a value of m three or m five which 32 00:04:25,289 --> 00:04:37,790 will be equal to zero but of course for an asymmetric 33 00:04:37,790 --> 00:04:41,510 distribution this value is in general not going to be zero it will have some value now given the 34 00:04:41,510 --> 00:04:47,880 way these moments are defined if you have a value of y which has a given unit then this 35 00:04:47,880 --> 00:04:53,600 m r will not return you a value which is unitless rather it has some 36 00:04:53,600 --> 00:04:59,470 units so you of course want to eliminate that aspect of dimensionality 37 00:04:59,470 --> 00:05:02,440 in your measurements and for that purpose what you typically do is you divide by another 38 00:05:02,440 --> 00:05:15,120 moment which is raised to some other power so that the units are the same so a three 39 00:05:15,120 --> 00:05:20,590 is one such 
measure it is defined as summation of y minus y bar whole cube by n divided by summation of 40 00:05:20,590 --> 00:05:31,630 y minus y bar whole square by n whole to the power three by two so it is nothing but m three 41 00:05:31,630 --> 00:05:49,669 by m two whole to the power three by two so this is of course unitless as you can see 42 00:05:49,669 --> 00:05:57,980 from the definition this is called skewness so we had reasoned 43 00:05:57,980 --> 00:06:01,960 that if you have a distribution which looks like this so this is y this is frequency so 44 00:06:01,960 --> 00:06:08,460 your mean is somewhere here and because there is a preponderance of 45 00:06:08,460 --> 00:06:13,960 values which are to the left all these y minus y bar values in this domain will 46 00:06:13,960 --> 00:06:19,240 be negative and y minus y bar in this domain will be positive as a consequence of 47 00:06:19,240 --> 00:06:42,540 which there is a possibility that when you compute summation y minus y bar whole cube 48 00:06:42,540 --> 00:07:01,760 this might turn out to be negative so there is a greater chance that in this case 
that 49 00:07:01,760 --> 00:07:06,000 when you calculate m three or a three you get a value which is negative whereas 50 00:07:06,000 --> 00:07:10,310 m two m four m six are always positive because they have y minus y bar whole square whole 51 00:07:10,310 --> 00:07:13,270 fourth whole sixth those are always positive so all odd moments for asymmetric distributions 52 00:07:13,270 --> 00:07:16,040 may be either negative or positive depending on how the data is biased 53 00:07:16,040 --> 00:07:19,840 ok so skewness of a data set is basically a way to differentiate a 54 00:07:19,840 --> 00:07:25,020 symmetric distribution from a non symmetric distribution which is biased either to 55 00:07:25,020 --> 00:07:29,510 the left or to the right ok so these two are asymmetric and these will give me 56 00:07:29,510 --> 00:07:43,280 different values so we had worked out an example in the last class where we tried to find out what 57 00:07:43,280 --> 00:08:02,600 is the skewness measure for this particular population we anticipated it would be negative 58 00:08:02,600 --> 00:08:11,259 we ended up with a value which is slightly positive but let us work out another example 59 00:08:11,259 --> 00:08:18,390 where let us say we take data 
which is biased to the right ok so in that case 60 00:08:18,390 --> 00:08:28,009 let me have these values so lets say our mode is three 61 00:08:28,009 --> 00:08:35,289 and this is one this is five i have two and some intermediate value four ok so let us say 62 00:08:35,289 --> 00:08:39,690 i have one one two twos three threes two fours and one five ok so one two three four 63 00:08:39,690 --> 00:08:48,600 five six seven eight nine values ok so let us calculate the mean so 64 00:08:48,600 --> 00:08:55,080 y bar is going to be one plus four plus nine plus eight plus five by nine numbers which 65 00:08:55,080 --> 00:09:04,550 is five plus five ten nine and eight seventeen ten and seventeen twenty seven so y bar is nothing but three 66 00:09:04,550 --> 00:09:13,100 ok so summation y minus y bar whole cubed is going to give me a value of minus two whole cubed 67 00:09:13,100 --> 00:09:24,320 plus minus one whole cubed into two plus one whole cubed into two plus two cubed ok so 68 00:09:24,320 --> 00:09:28,720 i have minus eight here i have minus two here ok plus two here plus eight here 69 00:09:28,720 --> 00:09:39,410 ok so two cubed is eight so these exactly balance each other out and for 70 00:09:39,410 --> 00:09:43,480 this particular distribution i get summation of y 71 00:09:43,480 --> 00:09:50,350 minus y bar whole cube to be zero ok so clearly what you see is your data is 72 00:09:50,350 --> 00:09:56,260 slowly shifting to the right ok but if you bias the data 73 00:09:56,260 --> 00:10:01,339 to the right side even more then we will slowly get a value which is more positive 74 00:10:01,339 --> 00:10:05,180 than otherwise ok there is another metric of 
you know 75 00:10:05,180 --> 00:10:13,110 of characterizing a distribution which we call kurtosis so the kurtosis is a way 76 00:10:13,110 --> 00:10:18,370 of measuring the peakedness of a curve or how flat or how sharp the curve is ok so i 77 00:10:18,370 --> 00:10:26,339 can have several curves lets say this is one situation this is another situation this is another 78 00:10:26,339 --> 00:10:33,240 situation ok so what i see is the peakedness of the curves is increasing right 79 00:10:33,240 --> 00:10:40,130 so it measures the peakedness or flatness of a curve ok and it is given by this metric 80 00:10:40,130 --> 00:10:48,250 a four which is defined by summation y minus y bar whole to the power fourth by n divided by summation 81 00:10:48,250 --> 00:11:00,470 y minus y bar whole square by n whole square ok so this is nothing but m four by m two square 82 00:11:00,470 --> 00:11:14,959 so if we compare our definitions of a three and a four so a three 83 00:11:14,959 --> 00:11:26,000 was defined as m three by m two whole to the power three by two a four is defined by m 84 00:11:26,000 --> 00:11:36,800 four by m two whole square and as i said the aim here is to define these metrics in 85 00:11:36,800 --> 00:11:45,260 such a way that you come up with a non dimensional term and that is exactly how you have defined 86 00:11:45,260 --> 00:11:54,649 a four because m four has powers of y minus y bar whole to the power four so the definition 87 00:11:54,649 --> 00:12:00,870 of m four has whole power four m two has whole power two so you have to 88 00:12:00,870 --> 00:12:07,850 square it to generate something which has the same units as m four and that is why your 89 00:12:07,850 --> 00:12:14,710 a four is defined by m four by m two square ok so let us work out a simple example 90 00:12:14,710 --> 00:12:18,662 where we calculate a four ok so we want to calculate a four lets say our data is 
so 91 00:12:18,662 --> 00:12:26,191 let us take a very flat distribution ok so lets say our data is one two three four so 92 00:12:26,191 --> 00:12:37,470 each of these values only appears once in this case ok so your y bar is equal to two point 93 00:12:37,470 --> 00:12:53,040 five ok so i can have y as one two three four then y minus y bar and y minus y bar whole 94 00:12:53,040 --> 00:12:58,149 to the power fourth so the mean is two point five so this is one point five whole to the power four 95 00:12:58,149 --> 00:13:02,459 this is point five whole to the power four point five whole to the power four one point 96 00:13:02,459 --> 00:13:10,610 five whole to the power four ok so you can go through the calculation and see 97 00:13:10,610 --> 00:13:26,160 what value of a four you get for this distribution versus for another distribution where let 98 00:13:26,160 --> 00:13:30,149 us say one two two three four three three four right so we have made the distribution 99 00:13:30,149 --> 00:13:33,100 at the other points lets say slightly higher so we have generated another distribution 100 00:13:33,100 --> 00:13:37,279 so lets say this is distribution one which is completely flat and this is distribution 101 00:13:37,279 --> 00:13:43,100 two which is slightly more peaked because you have these points which are occurring 102 00:13:43,100 --> 00:13:50,820 at a slightly higher frequency ok please go through this calculation and see what value 103 00:13:50,820 --> 00:13:59,620 of a four you get you can generate one more distribution where you arbitrarily lets say 104 00:13:59,620 --> 00:14:07,790 make it one two two two three three four so it is asymmetric so let us say the next 105 00:14:07,790 --> 00:14:10,100 distribution is asymmetric ok but two has a higher frequency ok you can 106 00:14:10,100 --> 00:14:13,950 keep on making it more and more peaked and see what kind of value you get 107 00:14:13,950 --> 00:14:16,640 so these exercises will help 
you get an idea of how to go about generating or coming up 108 00:14:16,640 --> 00:14:20,730 with important metrics for quantifying the statistics of the data so in the later part 109 00:14:20,730 --> 00:14:24,440 of this class today i wanted to discuss an analytical tool that is how can 110 00:14:24,440 --> 00:14:28,180 you use a software to do this statistical analysis of course we very well saw that even 111 00:14:28,180 --> 00:14:32,060 for these four points if you have to do the calculation by hand beyond a point 112 00:14:32,060 --> 00:14:34,959 we are not able to do so so we need a tool which would enable us to do these calculations if 113 00:14:34,959 --> 00:14:39,420 you have large data sets or where the data is to be actually read from a file where you 114 00:14:39,420 --> 00:14:43,560 have lets say observations from ten different experiments so on and so forth in 115 00:14:43,560 --> 00:14:48,220 that case i wanted to introduce you to this language called r so what 116 00:14:48,220 --> 00:14:52,110 exactly is r r is a software environment which is used for data analysis specifically it 117 00:14:52,110 --> 00:14:55,470 is a gnu package and the source code of r is freely available so that is the best part 118 00:14:55,470 --> 00:15:01,100 of it and it has a command line interface and it has other interfaces also 
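The hand calculations from the first part of the lecture can be handed to R. What follows is a minimal sketch, not the lecturer's own code: the helper names `sample_moment`, `skewness_a3` and `kurtosis_a4` are made up here for illustration, and the "more peaked" data set is an assumed example rather than the one written on the board.

```r
# Hypothetical helper: r-th sample moment of y about a value `about`
# (defaults to the mean, giving m_r; use about = 0 for m_r star).
sample_moment <- function(y, r, about = mean(y)) {
  sum((y - about)^r) / length(y)
}

# a3 = m3 / m2^(3/2) (skewness) and a4 = m4 / m2^2 (kurtosis), as defined in the lecture.
skewness_a3 <- function(y) sample_moment(y, 3) / sample_moment(y, 2)^(3 / 2)
kurtosis_a4 <- function(y) sample_moment(y, 4) / sample_moment(y, 2)^2

# The nine-value example: one 1, two 2s, three 3s, two 4s, one 5.
y <- c(1, 2, 2, 3, 3, 3, 4, 4, 5)
sample_moment(y, 1, about = 0)   # first moment about zero, i.e. the mean: 3
sample_moment(y, 1)              # first moment about the mean: 0
skewness_a3(y)                   # 0, the cubes cancel for this symmetric data

# Kurtosis exercise: the flat set from the lecture and an assumed more peaked set.
flat   <- c(1, 2, 3, 4)
peaked <- c(1, 2, 2, 3, 3, 4)    # assumption, chosen only for illustration
kurtosis_a4(flat)                # 1.64
kurtosis_a4(peaked)              # larger than the value for the flat set
```

Trying further data sets in `kurtosis_a4` is one way to do the "make it more and more peaked" exercise without repeating the arithmetic by hand.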
so and 119 00:15:01,100 --> 00:15:05,610 more importantly it can produce publication quality graphs with mathematical symbols ok 120 00:15:05,610 --> 00:15:12,029 so r is essentially an interpreted language this is a sample of a console of r 121 00:15:12,029 --> 00:15:25,240 ok so as it is written clearly here r is free software and comes 122 00:15:25,240 --> 00:15:36,050 with absolutely no warranty ok so believe me you can download it 123 00:15:36,050 --> 00:15:43,310 from the net and i will come to the details of how you can download it and how you can 124 00:15:43,310 --> 00:15:47,570 use it so what are the applications of this r language it is used by statisticians it 125 00:15:47,570 --> 00:16:12,370 was developed at the university of auckland and it is now widely used to the extent that 126 00:16:12,370 --> 00:16:14,660 there is a group of researchers who contribute to the further development of this language 127 00:16:14,660 --> 00:16:19,190 ok so it is used by statisticians for statistical computation and software development 128 00:16:19,190 --> 00:16:22,630 r supports matrix arithmetic and its performance is comparable to that of expensive 129 00:16:22,630 --> 00:16:25,280 software 
widely used expensive software like matlab for which you need to purchase 130 00:16:25,280 --> 00:16:30,589 a license and r can be used to perform high performance statistical computation 131 00:16:30,589 --> 00:16:34,790 to the extent that it is also used by the business fraternity so it brings us to the first question 132 00:16:34,790 --> 00:16:48,820 how do you get r so r is an open source programming language so you can download it 133 00:16:48,820 --> 00:16:56,630 so there is a website called r project dot org and the best part about it is it is available 134 00:16:56,630 --> 00:17:00,130 for all the different operating systems so you can download it for 135 00:17:00,130 --> 00:17:03,082 windows you can download it for linux or you can download it for mac ok so now r itself 136 00:17:03,082 --> 00:17:13,850 is a command line interface so sometimes people want graphical user interfaces for easy use 137 00:17:13,850 --> 00:17:31,090 of the facility and even to understand how it is used so there are various g u i software 138 00:17:31,090 --> 00:17:33,070 which run r code so r studio is one such 139 00:17:33,070 --> 00:17:36,350 so let me give you an example of how these r 
and r studio work so this is a console 140 00:17:36,350 --> 00:17:41,390 of r ok this is an r console ok and this is an r studio console ok so when we say console 141 00:17:41,390 --> 00:17:48,620 so you can download r and you can write in the console of course you saw the difference 142 00:17:48,620 --> 00:17:57,090 between r studio and r here everything you have to write down and then get to your point 143 00:17:57,090 --> 00:18:07,490 here you have a way of browsing through different aspects seeing what are 144 00:18:07,490 --> 00:18:15,950 the tools viewing and also the help file is much more easily accessible ok so you can download 145 00:18:15,950 --> 00:18:21,929 any of these two software i recommend that you download r studio for your use ok 146 00:18:21,929 --> 00:18:30,442 so that brings us to r studio ok so we can clearly see this is for example 147 00:18:30,442 --> 00:18:41,380 the console of r studio so if you look at the r studio window 148 00:18:41,380 --> 00:18:49,010 there are three different windows this is called the workspace so lets say if you have 149 00:18:49,010 --> 00:18:53,090 generated some variables you will see them being recorded here with the full 150 00:18:53,090 --> 00:18:56,080 information and this is the main command window where you will actually enter various things 151 00:18:56,080 --> 00:18:58,640 to do ok so let us just do some simple computation 152 00:18:58,640 --> 00:19:04,820 in r studio so i can open r studio so lets say if i want to do simple arithmetic 153 00:19:04,820 --> 00:19:14,330 i can define a a as a variable and i can assign it the value of one ok i can write a a equal 154 00:19:14,330 --> 00:19:20,530 to one now the value of a a is not displayed but what you see is in this section you can 155 00:19:20,530 --> 00:19:23,120 see a a being generated and its value is written here so in order 156 
00:19:31,290 to know exactly what is the value of a a if you write a a and press enter then you get 157 00:19:31,290 --> 00:19:42,929 a value of one ok so similarly i can do b b is equal to two so note that in this 158 00:19:42,929 --> 00:19:52,450 language irrespective of whether you give a space or not it will still work it 159 00:19:52,450 --> 00:19:59,010 will not complain it will still work so in both these cases b b is stored as two and 160 00:19:59,010 --> 00:20:03,970 b c is also stored as two even though you gave spaces before the equal to but for your 161 00:20:03,970 --> 00:20:08,520 own clarity it is better that when you write there is a space around an equal to or 162 00:20:08,520 --> 00:20:12,450 any other symbol ok so in the r studio console itself we can do 163 00:20:12,450 --> 00:20:18,120 basic calculations so for example i can write a a plus b b enter so i get the value of three 164 00:20:18,120 --> 00:20:23,702 i can do simple arithmetic so i can write a a power b b so that is x to the power y 165 00:20:23,702 --> 00:20:32,930 right and i can enter and i get the value of one so a a is one one to the power two is 166 00:20:32,930 --> 00:20:48,840 one i can write b b to the power b b so two square is four i can do you 167 00:20:48,840 --> 00:20:57,880 know simple calculations so if i do sine 
of thirty so remember that you 168 00:20:57,880 --> 00:21:10,350 know it is calculated in radians so sine of thirty degrees we always think it is half 169 00:21:10,350 --> 00:21:31,020 but this is giving a value of minus point nine eight this is because it is calculating 170 00:21:31,020 --> 00:21:43,260 in radians ok so in order to calculate the value of sine 171 00:21:43,260 --> 00:21:50,960 of thirty degrees you have to convert to radians so you write thirty star pi by one eighty ok 172 00:21:50,960 --> 00:21:57,150 and you know the good part is it shows you the way in which you write 173 00:21:57,150 --> 00:22:03,370 so sine of x is how you have to enter this value and within this you can do anything 174 00:22:03,370 --> 00:22:10,080 if i put enter here now i get a value of point five which is not negative so when you are 175 00:22:10,080 --> 00:22:16,960 doing trigonometric calculations you have to enter these values in terms of radians ok 176 00:22:16,960 --> 00:22:26,929 similarly i can do the same thing cos of pi by two returns me this value see so one 177 00:22:26,929 --> 00:22:32,780 thing is these values are calculated numerically so this is the reason why you 178 00:22:32,780 --> 00:22:40,780 see that the 
value of cos pi by two is not zero but it is coming as six point 179 00:22:40,780 --> 00:22:44,460 one two whatever e minus one seven which means six point one two into ten to the power 180 00:22:44,460 --> 00:22:53,179 minus seventeen which is as good as zero but it is not exactly zero and this is because these 181 00:22:53,179 --> 00:22:59,690 values are internally computed by a code so it is an approximate value 182 00:22:59,690 --> 00:23:09,820 so i can do the same thing i can do lets say log of ten so if you see the syntax it 183 00:23:09,820 --> 00:23:15,750 is log x comma base ok so 184 00:23:15,750 --> 00:23:18,010 lets say if i do log of ten i get the value two point three that means that it is actually 185 00:23:18,010 --> 00:23:20,990 calculating the natural logarithm and not the log base ten ok so lets say if i do log 186 00:23:20,990 --> 00:23:25,140 ten comma ten so now it is giving a value of one so if i enter the base so this is 187 00:23:25,140 --> 00:23:28,600 my number and this is the base with which i am calculating my logarithm it is giving me the 188 00:23:28,600 --> 00:23:33,800 value of one whereas 
if you just write log it will give you a value with respect 189 00:23:33,800 --> 00:23:39,559 to e the natural log so i can also do log ten of ten then also you get a value of one 190 00:23:39,559 --> 00:23:46,930 ok so you can easily go through the list of these kind of inbuilt functions which do 191 00:23:46,930 --> 00:23:54,760 the basic calculations ok now lets say i have ten values right i have ten values and i want 192 00:23:54,760 --> 00:24:01,000 to calculate lets say the standard deviation mean or median of a distribution 193 00:24:01,000 --> 00:24:09,049 how do i do it ok so what you do here is you enter lets say 194 00:24:09,049 --> 00:24:14,540 data so because i generated this data before it is already 195 00:24:14,540 --> 00:24:18,191 showing up there but i can rewrite data i would use this expression c 196 00:24:18,191 --> 00:24:23,390 of one comma two comma three comma four comma five ok so lets say i enter it as a vector so the syntax 197 00:24:23,390 --> 00:24:31,179 is c and within that you put numbers ok typically you want to put numbers so when 198 00:24:31,179 --> 00:24:37,549 i do 
enter and then i write data so what you see here data got initialized to a row 199 00:24:37,549 --> 00:24:47,150 vector which has five entries one two three four five ok in order to display data i should 200 00:24:47,150 --> 00:24:57,130 just write data and then i get back what it is and because it is a row vector it is showing 201 00:24:57,130 --> 00:25:04,130 as one to five one two three four five ok so now i can change you know lets say i have some 202 00:25:04,130 --> 00:25:11,900 new entries i can write data is equal to c of so i had 203 00:25:11,900 --> 00:25:15,960 my original data and i am overwriting it adding three more numbers say six seven nine ok so 204 00:25:15,960 --> 00:25:24,300 i write data is equal to c of data comma six comma seven comma nine now if i type data so what you 205 00:25:24,300 --> 00:25:31,260 see here data has now become an eight column entry where in addition to one two three 206 00:25:31,260 --> 00:25:35,700 four five you have three more numbers which have been added 
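The console steps described so far can be collected into one short R session. This is a sketch reconstructed from the narration, so the exact keystrokes on screen may have differed; the values in the comments are rounded.

```r
aa = 1                 # assignment as typed in the lecture; aa <- 1 is the more common R style
bb = 2
aa + bb                # 3
aa^bb                  # 1
bb^bb                  # 4

sin(30)                # about -0.988: the argument is read as 30 radians
sin(30 * pi / 180)     # 0.5: convert degrees to radians first
cos(pi / 2)            # about 6.1e-17, zero up to floating point error

log(10)                # natural logarithm, about 2.303
log(10, 10)            # base given as second argument: 1
log10(10)              # base-10 shortcut: 1

data <- c(1, 2, 3, 4, 5)     # build a vector with c()
data <- c(data, 6, 7, 9)     # overwrite it with three values appended
data                         # 1 2 3 4 5 6 7 9
length(data)                 # 8
```

Note that `data` is also the name of a built-in R function; reusing it as a variable name works, as the lecture shows, but a more specific name avoids confusion in longer scripts.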
207 00:25:35,700 --> 00:25:45,240 ok so i can just get the value of data here by writing data and enter and then i get this 208 00:25:45,240 --> 00:25:53,320 value now calculating these basic metrics in r is super simple so in order 209 00:25:53,320 --> 00:26:07,270 to calculate the mean of data i will just write mean of data and i will enter and i 210 00:26:07,270 --> 00:26:17,240 get the exact value which is four point six two five ok i can calculate the median of data 211 00:26:17,240 --> 00:26:25,330 median is four point five so i have one two three four five six seven nine so my median 212 00:26:25,330 --> 00:26:36,180 is at the position between four and five which is nothing but four plus five by two which 213 00:26:36,180 --> 00:26:45,370 is what it is giving four point five now what is the mode of this distribution we can clearly 214 00:26:45,370 --> 00:26:50,360 see that these are different values so there is no particular value 215 00:26:50,360 --> 00:27:05,050 which is maximal in frequency so if i write mode of data let us see what 216 00:27:05,050 --> 00:27:11,920 it gives it says numeric so 217 00:27:11,920 --> 00:27:16,110 it is not giving an exact value and here i don't have any particular value 218 00:27:16,110 --> 00:27:21,150 which is repeating so let us again change the expression for data by writing 219 00:27:21,150 --> 00:27:30,020 data is equal to c of data comma three three comma three comma four comma four comma four 220 00:27:30,020 --> 00:27:47,750 comma four ok this is how i modify data i can type it here but i can clearly see here 221 00:27:47,750 --> 00:27:58,820 now data is showing up as a fourteen column vector ok now once it is bigger than a 222 00:27:58,820 --> 00:28:05,020 certain size of course it is difficult to see here but what you can do is you can 223 00:28:05,020 --> 00:28:12,440 write data and 
enquire its value so you have this entire distribution ok of data so 224 00:28:12,440 --> 00:28:16,890 now if i do mode of data it is still giving me numeric 225 00:28:16,890 --> 00:28:24,049 that is the type of the data and not the most frequent value so we have to see so let us see median of data ok it is giving the value 226 00:28:24,049 --> 00:28:32,529 four so now i guess if we arrange them in ascending order then four will be 227 00:28:32,529 --> 00:28:52,360 there in the center and that is why this median is giving you a value of four ok so if i go 228 00:28:52,360 --> 00:28:58,780 back to the presentation so let me just briefly say what we have done 229 00:28:58,780 --> 00:29:06,539 so you can create a custom vector using this c open bracket and then you have various 230 00:29:06,539 --> 00:29:15,279 entries one two three four five six in that way you will get these values you can enter 231 00:29:15,279 --> 00:29:26,110 them you can write this vector as a sequence so i can write from one 232 00:29:26,110 --> 00:29:38,830 to seven by one which means i want a sequence which is increasing in units of one i can 233 00:29:38,830 --> 00:29:42,659 generate this vector you can repeat it so you have one which you want to repeat ten 234 00:29:42,659 --> 00:29:48,539 times you can generate this vector you can do this repeating of a 235 00:29:48,539 --> 00:29:53,110 range you can generate this vector similarly for a sequence so of course you 236 00:29:53,110 --> 00:29:56,279 need to remember that these are case sensitive so 
in one line you write b in the next line 237 00:29:56,279 --> 00:29:59,520 you write capital b you will be shown an error ok 238 00:29:59,520 --> 00:30:06,210 so and we briefly discussed all these basic manipulations 239 00:30:06,210 --> 00:30:16,700 like addition subtraction multiplication division so please note that if you have a vector 240 00:30:16,700 --> 00:30:23,990 then when you do these operations so lets say a is this particular vector when 241 00:30:23,990 --> 00:30:30,690 i write b is a plus one one is getting added element wise so that is why you are having 242 00:30:30,690 --> 00:30:36,880 two three four five six seven from one two three four five six ok so these operations 243 00:30:36,880 --> 00:30:46,059 operate on an element wise basis which is why if you write c is equal to a by five 244 00:30:46,059 --> 00:31:08,480 you will get these particular values ok or a star b now if you do a star b again it is 245 00:31:08,480 --> 00:31:16,799 an element wise operation first one will be one into two ok so we had modified b somewhere 246 00:31:16,799 --> 00:31:29,580 else oh so b is a plus one c is a minus one ok so a star b you can see that this will 247 00:31:29,580 --> 00:31:39,820 accordingly change ok 
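The vector operations from the recap can be tried directly. A minimal sketch, assuming the slide's vector `a` is one to six as the narration suggests; the name `a_by_5` is introduced here only to avoid reusing `c`, which is also the function that builds vectors.

```r
a <- c(1, 2, 3, 4, 5, 6)
b <- a + 1              # 1 is added to every element: 2 3 4 5 6 7
a_by_5 <- a / 5         # element-wise division: 0.2 0.4 0.6 0.8 1.0 1.2
a * b                   # element-wise product: 2 6 12 20 30 42
a^2                     # element-wise power: 1 4 9 16 25 36

seq(1, 7, by = 1)       # sequence from 1 to 7 in steps of 1
rep(1, times = 10)      # the value 1 repeated ten times

# Names are case sensitive: only lowercase b exists in this workspace.
exists("b", inherits = FALSE)   # TRUE
exists("B", inherits = FALSE)   # FALSE, so typing B would raise "object 'B' not found"
```

Because every operator works element by element, `a * b` is not a matrix product; R uses a separate operator, `%*%`, for that.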
so this is a plus b this is a star b see one into two is two in 248 00:31:39,820 --> 00:31:50,000 the last case six into seven is forty two so you have these particular elements or you 249 00:31:50,000 --> 00:32:04,560 can also do a power two exponential whatever ok so this gives you an idea of the usability 250 00:32:04,560 --> 00:32:15,850 of this particular language r which is very easy to learn you can download it you can 251 00:32:15,850 --> 00:32:26,850 use it for analyzing your data with that i stop here 252 00:32:26,850 --> 00:32:32,860 in the next class we'll do one more session with the language r to see how we can import 253 00:32:32,860 --> 00:32:45,649 data from a file so of course it is good enough to write six values seven 254 00:32:45,649 --> 00:32:51,570 values but if you have a tranche of data then you need a way to import this data into this 255 00:32:51,570 --> 00:32:58,429 r software and operate on them and we'll also briefly discuss how to do plotting 256 00:32:58,429 --> 00:33:03,809 with that i thank you for your attention today and i look forward to having our next lecture 257 00:33:03,809 --> 00:33:04,460 thank you
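As a follow-up to the mean, median and mode demonstration, here is a small sketch. In R, `mode()` reports how a vector is stored (here "numeric"), which is why it never returned a most frequent value in the session. The helper `stat_mode` below is a hypothetical name, not a built-in, and the six appended values are an assumption chosen to match the fourteen entries mentioned, since the exact numbers typed are not clear from the recording.

```r
data <- c(1, 2, 3, 4, 5, 6, 7, 9)
mean_8   <- mean(data)      # 4.625
median_8 <- median(data)    # 4.5, the average of the two middle values 4 and 5
mode(data)                  # "numeric": the storage mode, not a statistical mode

# Hypothetical helper: most frequent value(s) of x, found by tabulating counts.
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}

# Assumed appended values (six of them, giving fourteen entries in total).
data <- c(data, 3, 3, 4, 4, 4, 4)
length(data)                # 14
median(data)                # 4
mode(data)                  # still "numeric"
stat_mode(data)             # 4, which occurs five times
```

`stat_mode` returns every value tied for the highest count, so on the original eight distinct values it would return all eight rather than a single number.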