1
00:00:13,830 --> 00:00:18,720
hello and welcome to today's lecture on biostatistics
so we will start with a brief recap of the
2
00:00:18,720 --> 00:00:24,020
r language and then solve a few examples
in r studio ok so just to revisit our
3
00:00:24,020 --> 00:00:27,680
discussion of what is r it's a software environment
for statistical computing and data analysis
4
00:00:27,680 --> 00:00:32,619
so it is an open source package it is freely
available it has command line interface but
5
00:00:32,619 --> 00:00:36,390
since for many users using a command line
interface may not be beneficial or easy to
6
00:00:36,390 --> 00:00:39,940
handle there are you know open source softwares
which provide a graphical user interface
7
00:00:39,940 --> 00:00:43,000
so that it is widely used and the best
part of r is it can produce publication quality
8
00:00:43,000 --> 00:00:45,970
graphs with mathematical symbols
so it is an interpreted language this is just
9
00:00:45,970 --> 00:01:04,930
you know a print screen of the r console
you see it is just a you know you can just
10
00:01:04,930 --> 00:01:16,020
enter things here and r has
been widely used by statisticians it was actually
11
00:01:16,020 --> 00:01:27,710
developed at the university of auckland and
it is now widely used it has a support base
12
00:01:27,710 --> 00:01:43,220
where people contribute to its further development
and it has met the performance benchmarks
13
00:01:43,220 --> 00:01:58,560
comparable to that of gnu octave or matlab
so that is you know the reason why it is
14
00:01:58,560 --> 00:02:02,130
used for statistical analysis of big data
as well as in business analytics so you can
15
00:02:02,130 --> 00:02:09,080
download r from r-project.org
and you can depending on
16
00:02:09,080 --> 00:02:13,610
the software interface whether windows unix
or mac os you can download the appropriate
17
00:02:13,610 --> 00:02:21,239
file and install it
and so because it is a command line interface
18
00:02:21,239 --> 00:02:28,850
so people have developed software based
on r which have a g u i interface and r
19
00:02:28,850 --> 00:02:38,630
studio is one of them and
this is just a print screen of how r
20
00:02:38,630 --> 00:02:43,530
studio looks this is the command window where
we type in numbers or whatever we need to
21
00:02:43,530 --> 00:02:48,030
do this is the workspace which stores all
the data that we generate and here you have
22
00:02:48,030 --> 00:02:56,100
the additional information provided
on how to use it or what something means ok
23
00:02:56,100 --> 00:03:04,650
so this is just an example of how i can create
a vector in r and so let's directly go to
24
00:03:04,650 --> 00:03:09,250
r studio and rerun all these examples
so that your idea is ingrained ok
25
00:03:09,250 --> 00:03:17,770
so this was an example i had done earlier
but let me see so in order to generate a vector
26
00:03:17,770 --> 00:03:26,709
so you can do all basic computation i can
write a equal to one b b equal to two i can
27
00:03:26,709 --> 00:03:32,800
write a power b b so i can get these answers
b b power b b ok exponential of b b so on
28
00:03:32,800 --> 00:03:34,810
and so forth and what you see is if i drag
this down all these numbers that we are
29
00:03:34,810 --> 00:03:38,090
putting in are getting stored here ok in
terms of value it has stored the value of
30
00:03:38,090 --> 00:03:46,500
a a stored the value of b b c c whatever ok
so you can do the basic arithmetic you can
31
00:03:46,500 --> 00:04:07,900
have sin of thirty as i pointed out yesterday
you have to multiply by pi by one eighty so in
32
00:04:07,900 --> 00:04:11,101
order to you know convert it into radians
and only then will you get the value ok
33
00:04:11,101 --> 00:04:15,660
so this is how you get point five if you do
sin of thirty you will get some other value
34
00:04:15,660 --> 00:04:31,680
so remember that we have to convert every
degree into a radian in order to use the sin
35
00:04:31,680 --> 00:04:40,949
cos tan functions you can do asin or sin
inverse let's say we can do asin of half
36
00:04:40,949 --> 00:04:47,300
and you get the value of point five two three
which is nothing but this particular value
37
00:04:47,300 --> 00:04:57,370
ok so arc sin arc cos you can do log log
ten of ten is one ok you can also do log of
38
00:04:57,370 --> 00:05:04,539
ten comma ten which is also going to give
you the value so you can put log ten base
39
00:05:04,539 --> 00:05:12,289
two ok you can do just log of three let's
say it is the natural log and you will get
40
00:05:12,289 --> 00:05:15,430
a slightly different value because e is two point seven
something ok
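A minimal sketch of the calculations described above, written as they might be typed at the R console. The variable names are illustrative only and are not taken from the lecture.

```r
# trigonometric functions in R expect radians, so convert degrees first
deg <- 30
s <- sin(deg * pi / 180)   # sine of 30 degrees, about 0.5
a <- asin(0.5)             # inverse sine, returned in radians (about 0.5236)

# logarithms
l10 <- log10(10)           # base-10 log of 10
lb  <- log(10, 10)         # log(x, base): the same quantity with an explicit base
ln3 <- log(3)              # log() with one argument is the natural log

print(c(s, a, l10, lb, ln3))
```

Calling `sin(30)` directly treats 30 as radians, which is why it gives a different number.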
41
00:05:15,430 --> 00:05:20,159
so these are all the basic calculations in
order to generate a vector let's say so i
42
00:05:20,159 --> 00:05:29,970
can write c c is equal to c one comma two
comma three ok and so as you
43
00:05:29,970 --> 00:05:46,419
see the format the syntax is you give the
variable name which is
44
00:05:46,419 --> 00:05:56,229
a vector and c is for concatenate and
you give them item one item two separated
45
00:05:56,229 --> 00:06:09,229
by commas ok so if i type c c and put an enter
it will give you the exact value so this is
46
00:06:09,229 --> 00:06:19,990
how you create vectors you can do elementary
vector operations like let's say you can do
47
00:06:19,990 --> 00:06:46,440
so let's say c c is this let's say you define
b b as c of you can do c c and b b as another
48
00:06:46,440 --> 00:06:54,059
vector i can do c c plus b b and i get the
answer i can do c c star b b so you see that
49
00:06:54,059 --> 00:07:00,460
these are element wise operations so that
is why when i am doing c c star b b one is
50
00:07:00,460 --> 00:07:04,559
multiplied by four so on and so forth ok so
if i want to multiply something throughout
51
00:07:04,559 --> 00:07:10,009
then i should do let's say four star c c
then all the individual entries
52
00:07:10,009 --> 00:07:12,939
are multiplied by this factor
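A short sketch of the element-wise operations just described. The values of `bb` are an assumption: the lecture only indicates that the first element of `cc` is multiplied by four.

```r
cc <- c(1, 2, 3)     # c() concatenates its arguments into a vector
bb <- c(4, 5, 6)     # assumed values, chosen so that 1 is paired with 4

vsum  <- cc + bb     # element-wise addition
vprod <- cc * bb     # element-wise multiplication: 1*4, 2*5, 3*6
vscal <- 4 * cc      # scalar prefactor: every entry is multiplied by 4

print(vsum)
print(vprod)
print(vscal)
```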
so for a scalar multiplication you put a pre
53
00:07:12,939 --> 00:07:16,419
factor outside the vector and these are
all vector additions vector multiplications
54
00:07:16,419 --> 00:07:20,130
what you see is these operate at the
single element level ok so if i go back
55
00:07:20,130 --> 00:07:22,789
to this power point so you can even
create the same number repeated whatever number
56
00:07:22,789 --> 00:07:29,259
of times so the syntax is very you know understandable
you write rep whatever the number is
57
00:07:29,259 --> 00:07:43,360
one or a or b and times how many times you
want to repeat and you see you create
58
00:07:43,360 --> 00:07:53,629
a vector of the same number which is ten times
you can repeat a sequence so let's say you
59
00:07:53,629 --> 00:08:09,289
have one
to three and so one colon three means between
60
00:08:09,289 --> 00:08:19,289
one to three and this has to be repeated five
times and you can accordingly get this particular
61
00:08:19,289 --> 00:08:22,370
you know vector
you can also create a sequence so this is
62
00:08:22,370 --> 00:08:45,000
another example of a sequence where you go
from one to seven in steps of one and you
63
00:08:45,000 --> 00:08:49,850
will get this particular t you can create
a repeat of a sequence so every time you put
64
00:08:49,850 --> 00:08:56,780
a bracket and you put an operator it operates
on this whole thing and that is how you have
65
00:08:56,780 --> 00:09:03,810
a repetition of a sequence which is number
of time five times so f seven seven seven
66
00:09:03,810 --> 00:09:11,480
seven so on and so forth ok so this is just
what i did which was element wise operations
67
00:09:11,480 --> 00:09:15,140
on a vector you can you know do all the calculations
here let us go back to r studio ok if i go
68
00:09:15,140 --> 00:09:22,530
back to r studio
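The repetition and sequence commands from the slides can be sketched as follows. Using `times` for the repeated sequence is an assumption, since the slide itself is not visible in the transcript.

```r
r1 <- rep(1, times = 10)                  # the same number repeated ten times
r2 <- rep(1:3, times = 5)                 # 1 2 3 repeated five times
s1 <- seq(1, 7, by = 1)                   # from 1 to 7 in steps of 1
r3 <- rep(seq(1, 7, by = 1), times = 5)   # the whole sequence repeated five times

print(r1)
print(r2)
print(s1)
print(length(r3))
```

Wrapping one call inside another, as in `rep(seq(...), times = 5)`, applies the outer function to the entire result of the inner one.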
now let's say i have my vectors and what
69
00:09:22,530 --> 00:09:33,330
i want is to add a vector online ok as opposed
to typing them inside like this if the vector
70
00:09:33,330 --> 00:09:37,350
is long then it is difficult to enter so what
i can do is i can add the vector online so
71
00:09:37,350 --> 00:09:41,060
in that case i can write that is equal to scan
open and close brackets so if i do enter then it
72
00:09:41,060 --> 00:09:45,130
puts the command prompt here
which means i can enter anything i put spaces
73
00:09:45,130 --> 00:09:47,860
if i put enter it again gives me the option
of adding you know many more numbers so you
74
00:09:47,860 --> 00:09:55,460
see the number thirteen is written here because
i have already made twelve entries so this
75
00:09:55,460 --> 00:09:58,060
is the thirteenth entry
so it begins with one which is this entry
76
00:09:58,060 --> 00:10:04,820
ok so i can keep on putting random numbers
and every time i put enter it gives me the
77
00:10:04,820 --> 00:10:15,250
ah it gives me the option of putting in any
entries but if i press enter once more then
78
00:10:15,250 --> 00:10:22,040
essentially it will understand
that that is the end of it and it has created
79
00:10:22,040 --> 00:10:24,000
this particular vector which has seventeen
items only so i can now type that and see
80
00:10:24,000 --> 00:10:33,350
what is the value of and you see that this
is this is the way it is so you see that depending
81
00:10:33,350 --> 00:10:38,620
on how you you know you write it and how you
know what is your font size so when it is
82
00:10:38,620 --> 00:10:46,630
coming you can only reach till thirteenth
entry here that is why from the fourteenth
83
00:10:46,630 --> 00:10:54,180
entry it is coming here ok so in order to
know the length of the vector i can use this
84
00:10:54,180 --> 00:11:06,850
length of that ok it tells me that there are
seventeen column entries in this particular
85
00:11:06,850 --> 00:11:14,460
vector
i can also so the jargon for entering a character
86
00:11:14,460 --> 00:11:19,480
entry is slightly different so let's say i
can do this extract that is equal to scan
87
00:11:19,480 --> 00:11:25,710
ok if i do this if i want to enter characters
then what i have to do is let's say monday
88
00:11:25,710 --> 00:11:33,110
tuesday wednesday ok so i put two times
that so in this case i have to ok let us see
89
00:11:33,110 --> 00:11:41,680
ok i think i have to write that is equal to
scan what equal to char ok so now so i it
90
00:11:41,680 --> 00:11:47,600
it so by default this function scan always
expects to get real numbers so that is why
91
00:11:47,600 --> 00:12:02,140
you could see expected a real got m which
is a character so if i put this statement
92
00:12:02,140 --> 00:12:09,490
inside that scan and what it is expecting
is a character then there is no problem
93
00:12:09,490 --> 00:12:24,030
so if i now write that here you will see monday
tuesday wednesday as the entries
94
00:12:24,030 --> 00:12:29,170
ok
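A non-interactive sketch of the same idea. At the console, `scan()` with empty brackets reads from the keyboard until a blank line is entered; here the `text` argument stands in for the typed input so the example can run as a script. The numbers are made up.

```r
# numeric input: scan() expects real numbers by default
nums <- scan(text = "3 1 4 1 5 9 2 6")
print(length(nums))

# character input: the `what` argument tells scan() which type to expect
days <- scan(text = "monday tuesday wednesday", what = "character")
print(days)
```

Leaving out `what` for the second call would fail, because the first item read is not a number.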
i can add ok so i can add so let's say if
95
00:12:29,170 --> 00:12:39,840
i have x x equal to c of ok i can write ah
ah i can add elements to x x by writing c
96
00:12:39,840 --> 00:12:55,800
of x x comma six ok if i write x x now you
see it has added so it is possible to add
97
00:12:55,800 --> 00:13:07,840
numbers into a particular vector and as before
as i had shown you before we can you know
98
00:13:07,840 --> 00:13:14,730
even put so the order of x x does not
matter i could as well have written c of six
99
00:13:14,730 --> 00:13:21,440
seven eight comma x x in that case the vector order
would have been changed ok so if i go back
100
00:13:21,440 --> 00:13:28,990
to the presentation so the next thing
of course these are still numbers which
101
00:13:28,990 --> 00:13:33,430
i can enter and i can keep on entering on
screen but if you have a big file then this
102
00:13:33,430 --> 00:13:41,890
function does not work you have to use what
is called you know you can import data using
103
00:13:41,890 --> 00:13:57,000
various ways ok so this is an example where
you can import data using c s v format this
104
00:13:57,000 --> 00:14:07,230
is called comma separated values so let's
do this particular example what i will do
105
00:14:07,230 --> 00:14:13,780
is i will open an excel
sheet and let us ok so i have entered some
106
00:14:13,780 --> 00:14:25,790
numbers and i can save it in c s v format
so when you do save as by default it is always
107
00:14:25,790 --> 00:14:32,610
in excel workbook format but you can go
down and you can choose comma separated values
108
00:14:32,610 --> 00:14:47,260
ok this is c s v and you can do the save
it prompts you with this particular warning
109
00:14:47,260 --> 00:14:53,680
but you can say continue and this you know
this is stored as a c s v file ok
110
00:14:53,680 --> 00:15:07,051
now what i can do is i can do this i can go
to r studio and i can write x x is equal to
111
00:15:07,051 --> 00:15:17,460
read dot c s v and i can write file dot choose
what this means so when you do this it allows
112
00:15:17,460 --> 00:15:39,740
you it allows you to choose a particular file
ok so if i do enter it will prompt
113
00:15:39,740 --> 00:15:47,620
me this and this is the latest workbook that
we have had this is a c s v file i can open
114
00:15:47,620 --> 00:15:52,180
it and it gets chosen ok you can select it
if i write x x here you see that these
115
00:15:52,180 --> 00:15:56,650
are the values which it chose so even though
we did not explicitly enter the tags of what
116
00:15:56,650 --> 00:16:01,150
are the column names in c s v these things
are already chosen and even the left row numbers
117
00:16:01,150 --> 00:16:07,820
they were stored in the c s v format generated
ok so if i go back to the c s v file ok so
118
00:16:07,820 --> 00:16:17,900
you can you know this is the easiest way
you know this is just another example
119
00:16:17,900 --> 00:16:39,120
of how you can choose this particular values
ok so this is the most widely used system
120
00:16:39,120 --> 00:16:43,350
to import data but particularly when you have
big data ok
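A sketch of the import step that runs without a file dialog. In an interactive session `read.csv(file.choose())` opens a file picker, as shown in the lecture. Here a small temporary file with made-up contents stands in for the sheet saved from Excel.

```r
# write a tiny stand-in CSV file (contents are invented for illustration)
path <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,2", "3,4", "5,6"), path)

# header = TRUE is the default, so the first line becomes the column names
xx <- read.csv(path)
print(xx)   # the row numbers on the left are added by R when printing
```

If the file has no header line, `read.csv(path, header = FALSE)` keeps the first row as data instead of turning it into column names.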
121
00:16:43,350 --> 00:16:49,440
now let's get down to some examples of finding
frequency mean median so on and so forth ok
122
00:16:49,440 --> 00:16:56,600
if i go back to r studio so let's say i will
again enter another set of parameters let's
123
00:16:56,600 --> 00:17:09,850
say x x x x in the screen is equal to c of
i have entered this random array i can know
124
00:17:09,850 --> 00:17:16,600
the length of the vector by writing length
of x x ok it has eight entries i can find
125
00:17:16,600 --> 00:17:33,389
minimum of x x which is one max of x x which
is six i can have mean of x x which is three
126
00:17:33,389 --> 00:17:42,419
point two five i can have median of x x it
gives me a value of three ok so these are
127
00:17:42,419 --> 00:17:53,799
widely useful you know ways of getting
the descriptive statistics from
128
00:17:53,799 --> 00:18:13,320
your vector ok and one one more thing i wanted
to show is if you do table of x x then you
129
00:18:13,320 --> 00:18:23,350
will get the distribution so how many values
have a value of one you have only one entry
130
00:18:23,350 --> 00:18:31,249
of one you have it says that there are two
entries of two let's see so these are the
131
00:18:31,249 --> 00:18:34,289
numbers i have entered there are two twos
and that is why when i do the frequency count
132
00:18:34,289 --> 00:18:38,840
there are two twos here similarly two threes
and the rest are one one one ok
133
00:18:38,840 --> 00:18:41,419
so this is a very useful way of getting
the you know the statistics from these particular
134
00:18:41,419 --> 00:18:45,899
examples i have shown you how to calculate
the mean median minimum and maximum you can
135
00:18:45,899 --> 00:18:49,499
also do the variance and standard deviation
as well so you can use this particular function
136
00:18:49,499 --> 00:18:53,369
of var to find out the variance of this
distribution s d to find the standard deviation
137
00:18:53,369 --> 00:19:08,740
of this particular vector so in this particular
example where you have how many entries one
138
00:19:08,740 --> 00:19:33,720
two three four five six seven eight nine ten
eleven entries your variance is coming out
139
00:19:33,720 --> 00:19:38,749
to be a value of thirty two and standard deviation
is around five point six ok so you can also
140
00:19:38,749 --> 00:19:44,149
find out the standard deviation and square
it to find out the value of variance which
141
00:19:44,149 --> 00:19:51,399
will give you the same information or you
can calculate the square root of the variance
142
00:19:51,399 --> 00:20:00,309
to find out the standard deviation ok
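A sketch of the descriptive statistics discussed above. The vector typed in the lecture is not shown in the transcript, so `xx` below is an assumed eight-entry vector chosen to agree with the values that are read out (minimum 1, maximum 6, mean 3.25, median 3, two twos and two threes).

```r
xx <- c(1, 2, 3, 6, 2, 3, 4, 5)   # assumed data, not the lecturer's actual input

n   <- length(xx)    # number of entries
lo  <- min(xx)
hi  <- max(xx)
m   <- mean(xx)
med <- median(xx)
tab <- table(xx)     # frequency count of each distinct value
v   <- var(xx)       # sample variance
s   <- sd(xx)        # sample standard deviation, the square root of var(xx)

print(c(n = n, min = lo, max = hi, mean = m, median = med))
print(tab)
print(c(var = v, sd = s, sd_squared = s^2))
```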
some more examples so i will
143
00:20:00,309 --> 00:20:09,470
show you how you can use this particular functions
ok again let's go back to r studio ok so let's
144
00:20:09,470 --> 00:20:14,399
say i want to sort x x so let us write
you know a longer vector ok so
145
00:20:14,399 --> 00:20:18,330
y y has fourteen entries length should give
me a value of fourteen so let us ah i already
146
00:20:18,330 --> 00:20:24,409
know the information i can use these earlier
values to find out this but i what i can also
147
00:20:24,409 --> 00:20:50,590
do is you can do sort of y y ok and then you
see that it is being sorted in ascending order
148
00:20:50,590 --> 00:20:58,340
if you wanted to sort the same you know same
vector in decreasing order the other way round
149
00:20:58,340 --> 00:21:03,840
so you can write sort y y comma decreasing
equal to true so in this case you have the
150
00:21:03,840 --> 00:21:10,450
reverse order from the topmost number to the
lowest number ok you can also have let's say
151
00:21:10,450 --> 00:21:16,289
you want some statistics
how many numbers are there in this vector which
152
00:21:16,289 --> 00:21:25,320
are greater than two for example so i can
write x x so there is three six three four
153
00:21:25,320 --> 00:21:39,090
five there are five numbers which are greater
than two ok
154
00:21:39,090 --> 00:21:52,419
i can also write so i can ask at which position
is x x equal to five it is returning me a
155
00:21:52,419 --> 00:21:57,929
value of eight let's see where i have the
first value of five one two three four five
156
00:21:57,929 --> 00:22:01,950
six seven no one two three four five so x
x is equal to five it is returning me a value
157
00:22:01,950 --> 00:22:12,110
of eight ok i can do another thing which is
the summary of y y so what it gives me are
158
00:22:12,110 --> 00:22:19,269
all the minimum the first quartile the median
the third quartile and the maximum ok so instead
159
00:22:19,269 --> 00:22:31,299
of summary you can also use this function
called quantile and this is the same thing
160
00:22:31,299 --> 00:22:33,940
but given in terms of exact percentage again
you see the minimum is one your first quartile
161
00:22:33,940 --> 00:22:38,250
is two as is showing up here your
median which is the fiftieth percentile has
162
00:22:38,250 --> 00:22:43,289
a value of four so your seventy
fifth percentile is this and the maximum is
163
00:22:43,289 --> 00:22:52,809
seven ok so clearly so in this case you have
median and mean both reported whereas if you
164
00:22:52,809 --> 00:22:59,559
write quantile you will only get the values
of these actual percentiles ok so these are
165
00:22:59,559 --> 00:23:11,429
the things that i wanted to you know these
functions i wanted to show ok these are useful
166
00:23:11,429 --> 00:23:17,019
ok
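A sketch of sorting, counting, locating and summarising. Both vectors are assumptions: `xx` is chosen to agree with the results read out in the lecture (five values above two, with the value five at position eight), and `yy` is simply an invented fourteen-entry vector.

```r
xx <- c(1, 2, 3, 6, 2, 3, 4, 5)                         # assumed data
yy <- c(4, 1, 7, 2, 5, 2, 6, 3, 4, 1, 5, 7, 2, 4)       # invented 14-entry vector

asc  <- sort(yy)                      # ascending order
desc <- sort(yy, decreasing = TRUE)   # largest to smallest

big    <- xx[xx > 2]        # the entries greater than 2
n_big  <- sum(xx > 2)       # how many entries are greater than 2
where5 <- which(xx == 5)    # position(s) at which xx equals 5

print(asc)
print(desc)
print(big)
print(c(n_big = n_big, where5 = where5))

print(summary(yy))    # min, quartiles, median, mean, max
print(quantile(yy))   # 0%, 25%, 50%, 75%, 100% (no mean)
```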
let's take another example so imagine you
167
00:23:17,019 --> 00:23:25,490
are doing a measurement where you're tracking
or your aim is to correlate the cell so a
168
00:23:25,490 --> 00:23:30,869
cell which is moving and you want to see
whether when it is moving it is elongated
169
00:23:30,869 --> 00:23:39,929
or it is it is round ok and what you do is
you do this particular you know ah this is
170
00:23:39,929 --> 00:23:45,820
this is the data where you have two metrics
for characterizing you know the cell phenotype
171
00:23:45,820 --> 00:23:56,610
one is whether it is circular or spindle so
circular cell would look something like this
172
00:23:56,610 --> 00:24:00,000
this is your circular cell and this is your
more spindle shaped so this is spindle and
173
00:24:00,000 --> 00:24:08,399
this is circular ok and you are so these are
of course so you have a nucleus at the center
174
00:24:08,399 --> 00:24:16,919
ok and you want to see what are the trajectories
of these cells in two d plane and you want
175
00:24:16,919 --> 00:24:19,480
to find out so based on the distance it is
moving each of these have two population ok
176
00:24:19,480 --> 00:24:23,389
migratory and non migratory ok same here
you have migratory and non migratory ok so this is
177
00:24:23,389 --> 00:24:25,580
just an example so in this particular example
what we have is the distribution for let's
178
00:24:25,580 --> 00:24:28,450
say twenty two such cells for each cell you
have the shape which is either circular
179
00:24:28,450 --> 00:24:31,409
or spindle and the phenotype which you characterize
as non migratory or migratory
180
00:24:31,409 --> 00:24:36,100
so essentially both these metrics are categorical
data and you want to find out ok so from this
181
00:24:36,100 --> 00:24:44,759
table you want to find out what is the distribution
of this data so i can have i can first generate
182
00:24:44,759 --> 00:24:55,020
the table of the cell type ok so this is just
the way to you know import the data into the
183
00:24:55,020 --> 00:24:56,940
r ah framework you can do this command of
table to get what are the different distributions
184
00:24:56,940 --> 00:25:00,119
what you see is of those which are circular in shape
two such cells are migratory and
185
00:25:00,119 --> 00:25:04,399
there are nine which are non migratory and
in case of spindle shapes eight of them are
186
00:25:04,399 --> 00:25:07,350
migratory two of them are non migratory so
this data kind of conveys the point that a
187
00:25:07,350 --> 00:25:08,929
greater proportion of cells which migrate
are spindle in shape and a greater proportion
188
00:25:08,929 --> 00:25:11,399
of cells which are non migratory are circular
in shape ok so this is a way of plotting this
189
00:25:11,399 --> 00:25:14,759
data you can do bar plot of this table and
you have this particular distribution
190
00:25:14,759 --> 00:25:18,809
so let's go back to r so let's use the same
data which is y y i can write let's say bar
191
00:25:18,809 --> 00:25:23,590
plot of y y and ok figure margins are too
large so this is perhaps not coming here
192
00:25:23,590 --> 00:25:30,580
i don't understand but you can also do histogram
of y y this problem is coming we will see
193
00:25:30,580 --> 00:25:41,620
but if we if you use these particular functions
box plot bar plot let us reduce the file size
194
00:25:41,620 --> 00:25:49,769
will ok now you see ok so it was coming like
this because this required a certain amount
195
00:25:49,769 --> 00:25:55,220
of space you see histogram of y y will give
you this particular distribution it conveys
196
00:25:55,220 --> 00:26:02,659
a message you have a peak here and another
peak here and then one convert data here ok
197
00:26:02,659 --> 00:26:09,080
let us see again if i could you know plot
the bar plot here so this is your same plot
198
00:26:09,080 --> 00:26:16,159
plotted in bar plot manner we can also do
the same thing for box plot ok so this is
199
00:26:16,159 --> 00:26:20,259
the box plot ok with whiskers what we can also
do is we can so if i go back to the presentation
200
00:26:20,259 --> 00:26:24,700
now so this is what we had plotted which is
the bar plot of the table and you see that
201
00:26:24,700 --> 00:26:28,950
in case of these cells which are circular
you have a greater portion which are non migratory
202
00:26:28,950 --> 00:26:36,629
ok ah which are circular in shape and lesser
portion which are migratory but circular in
203
00:26:36,629 --> 00:26:59,309
shape ok
so i can also represent this data
204
00:26:59,309 --> 00:27:06,960
by plotting this side by side so this would
be how it would look like so if i write this
205
00:27:06,960 --> 00:27:09,769
particular framework which is beside equal
to true then i would generate this particular
206
00:27:09,769 --> 00:27:12,840
plot ok so in terms of migratory and these
are circular which is very few and you know
207
00:27:12,840 --> 00:27:17,869
ah and spindle shape which are very high ok
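A sketch of the cross-tabulation and bar plots for the cell example. The four counts are the ones read out in the lecture; the per-cell records are reconstructed from those counts, and the names `shape` and `phenotype` are assumptions.

```r
# rebuild one record per cell from the counts mentioned in the lecture
shape <- c(rep("circular", 11), rep("spindle", 10))
phenotype <- c(rep("migratory", 2), rep("non-migratory", 9),   # circular cells
               rep("migratory", 8), rep("non-migratory", 2))   # spindle cells

tab <- table(phenotype, shape)   # two-way frequency table of categorical data
print(tab)

barplot(tab, legend.text = TRUE)                  # stacked bars, one per shape
barplot(tab, beside = TRUE, legend.text = TRUE)   # bars drawn side by side
```

With `beside = TRUE` the migratory and non-migratory counts for each shape sit next to each other, which makes the contrast between circular and spindle cells easier to see.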
so i can use a similar thing to generate the
208
00:27:17,869 --> 00:27:22,370
plotting for let's say you have a trajectory
of a cell which is moving in x y plane i can
209
00:27:22,370 --> 00:27:27,120
any how imported the data ok so you can
clearly see that by plotting the data as
210
00:27:27,120 --> 00:27:31,919
so you have read this particular
data and the data has become ah you know data
211
00:27:31,919 --> 00:27:38,970
is represented as x and y and so what you
see here is you have first imported the data
212
00:27:38,970 --> 00:27:41,809
and in this case we are plotting data colon
comma two data colon comma three
213
00:27:41,809 --> 00:27:47,429
why are you choosing the second and the third
column the reason is because when you import
214
00:27:47,429 --> 00:27:53,889
your data you know you have a row or a column
which gets inserted which is the row number
215
00:27:53,889 --> 00:27:59,769
so let's go back to r so which is the vector
that we had you know let's again have z z
216
00:27:59,769 --> 00:28:06,509
is equal to read dot c s v file dot choose
ok so if i again read my workbook three
217
00:28:06,509 --> 00:28:09,559
and i plot z z so you see that there are three
columns which are generated this one was the
218
00:28:09,559 --> 00:28:15,429
default entry in excel which we have
no control over so this is why you have to
219
00:28:15,429 --> 00:28:21,059
neglect this particular column and you plot
x two and x four and this is what we have
220
00:28:21,059 --> 00:28:25,639
done in this power point file this is what
we have done in this power point file whereby
221
00:28:25,639 --> 00:28:30,070
we have plotted data colon comma two data colon
comma three this is high and then this is
222
00:28:30,070 --> 00:29:05,990
the corresponding you know plot in x y plane
i can also what i can do is i can use i can
223
00:29:05,990 --> 00:29:10,539
give a title to this particular plot which
is migration trajectory which appears here
224
00:29:10,539 --> 00:29:23,140
i can set my limits so x lab so i can set
my labels so x lab is x coordinate y lab is
225
00:29:23,140 --> 00:29:36,159
y coordinate you can enter here you can what
you can also do is you can
226
00:29:36,159 --> 00:29:50,620
play with the color of this particular trajectory
what you see here is col equal to red means
227
00:29:50,620 --> 00:29:55,850
essentially we are resetting the trajectory
color to red
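A sketch of the trajectory plot with the options mentioned here. The data frame is invented, with an index in the first column so that columns two and three hold x and y as in the lecture. Drawing the path as a line with `type = "l"` and the particular axis limits are assumptions.

```r
# made-up trajectory: column 1 is an index, columns 2 and 3 are x and y
data <- data.frame(id = 1:6,
                   x = c(0, 1, 2, 2, 3, 4),
                   y = c(0, 1, 1, 2, 3, 3))

plot(data[, 2], data[, 3],
     type = "l",                     # join the points with a line (assumed)
     main = "migration trajectory",  # title shown above the plot
     xlab = "x coordinate",          # axis labels
     ylab = "y coordinate",
     col  = "red",                   # colour of the trajectory
     xlim = c(0, 5),                 # range shown on the x axis
     ylim = c(0, 5))                 # range shown on the y axis
```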
228
00:29:55,850 --> 00:30:09,950
and the last is you can also put x lim x lim
is for limits so if you want to you know probe
229
00:30:09,950 --> 00:30:59,720
it within a certain domain you want to plot
it within a certain range you can have control
230
00:30:59,720 --> 00:31:06,509
over x range and y range with that ah i think
you have gotten a good enough handle of how
231
00:31:06,509 --> 00:31:12,190
to do this ok so we would ah stop here i would
just go back to you know let me go back to
232
00:31:12,190 --> 00:31:25,480
this particular slide let me just go back
to r studio once more so you can have these
233
00:31:25,480 --> 00:32:07,190
particular plots that
is so we had plotted histogram of y y
234
00:32:07,190 --> 00:32:37,809
you can plot it in histogram version ok
so if you write hist of y y comma freq
235
00:32:37,809 --> 00:33:05,580
equal to false what it does is it actually
converts this data as density or relative
236
00:33:05,580 --> 00:33:16,200
frequency and you instead of absolute values
you will get a range ok so here itself for
237
00:33:16,200 --> 00:33:40,880
example you know in terms of y y i can
write other values i can change the color
238
00:33:40,880 --> 00:34:21,010
of this and so on and so forth with that
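A sketch of the density histogram. The vector `yy` is invented, and the fill colour is only an example of the kind of change mentioned.

```r
yy <- c(4, 1, 7, 2, 5, 2, 6, 3, 4, 1, 5, 7, 2, 4)   # invented data

# freq = FALSE puts density on the vertical axis instead of raw counts
h <- hist(yy, freq = FALSE, col = "lightblue", main = "histogram of yy")

# with densities, the bar areas add up to one
total_area <- sum(h$density * diff(h$breaks))
print(total_area)
```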
239
00:34:21,010 --> 00:34:55,510
you will
240
00:34:55,510 --> 00:36:01,310
i hope you have been convinced of
the power of this r software which is open
241
00:36:01,310 --> 00:36:29,490
source you can freely download it install
it and use it for your own purposes particularly
242
00:36:29,490 --> 00:36:49,270
when you are handling you know
243
00:36:49,270 --> 00:37:49,160
you should get into the habit of calculating
standard deviation and mean and all these
244
00:37:49,160 --> 00:38:28,460
descriptive statistics and even of generating
these plots in r ok
245
00:38:28,460 --> 00:38:45,940
with that
i thank you for your attention and we'll meet
246
00:38:45,940 --> 00:39:05,080
again in the next class