1
00:00:00,510 --> 00:00:17,820
Welcome to the course on Biostatistics and
Design of Experiments. Today we will continue
2
00:00:17,820 --> 00:00:26,910
with the t-test. As I had mentioned, the t-test
does a comparison of means.
3
00:00:26,910 --> 00:00:33,030
If you have the mean of one sample and the mean
of another sample, you want to know whether
4
00:00:33,030 --> 00:00:37,949
they come from the same population or they
come from different populations, and so on.
5
00:00:37,949 --> 00:00:43,420
Basically, the t-test compares means. There are
3 different types of t-test: the 1 sample t-test,
6
00:00:43,420 --> 00:00:49,190
the 2 sample t-test and the paired t-test. So, in
a 1 sample t-test, what you do is, you take
7
00:00:49,190 --> 00:00:55,440
a sample, compare the mean which you get
with the population mean, and you find out
8
00:00:55,440 --> 00:01:03,299
whether this sample mean comes from this population
or not. In a two sample t-test, you are comparing
9
00:01:03,299 --> 00:01:09,470
2 sets of samples. For example, if I am comparing
the performance of drug A and drug B, then I will
10
00:01:09,470 --> 00:01:14,000
get some mean for sample set 1 and I will
get some mean for sample set 2 and I would
11
00:01:14,000 --> 00:01:18,610
like to see whether there is any statistically
significant difference between these 2 means
12
00:01:18,610 --> 00:01:27,600
or not. In the paired t-test, you use the
same subject for performing the experiments
13
00:01:27,600 --> 00:01:34,440
for example, if I am performing clinical trials
and I have 10 volunteers I may give drug a
14
00:01:34,440 --> 00:01:40,010
to the 10 volunteers and after a few days
I may give drug b to the same 10 volunteers
15
00:01:40,010 --> 00:01:49,350
and I will see the change. So theoretically,
if a and b perform in the same way that difference
16
00:01:49,350 --> 00:01:51,770
should be 0. That is called the paired t-test.
17
00:01:51,770 --> 00:01:57,310
Let us continue with some problems because
problems are very very important, problems
18
00:01:57,310 --> 00:02:04,570
give you an idea about how to attack these
types of situations, and we need to know how
19
00:02:04,570 --> 00:02:10,859
to use the tables, we need to know when to
use a 1 sample t-test or a 2 sample t-test,
20
00:02:10,859 --> 00:02:18,769
when to use a 1 tailed or 2 tailed test, how to
create the null hypothesis and alternate
21
00:02:18,769 --> 00:02:27,780
hypothesis, and so on. The mean heart
rate of 20 rats is, say, 282 with a coefficient of
22
00:02:27,780 --> 00:02:34,150
variation of 15 %. What is the coefficient of
variation? 100 sigma divided by mu. During
23
00:02:34,150 --> 00:02:41,730
anesthesia the mean rate becomes 240. It looks
like it has gone down, but the CV is the same.
24
00:02:41,730 --> 00:02:45,959
Is the decrease in heart rate significant?
We need to find out whether the decrease in
25
00:02:45,959 --> 00:02:51,879
the heart rate is significant. Give 95 % confidence
limit for the change in the heart rate that
26
00:02:51,879 --> 00:02:57,349
means, it is the change that is 240 minus
282 or 282 minus 240. What is the confidence
27
00:02:57,349 --> 00:03:03,430
interval? Although we say 282 minus 240 the
change is 42, but then there is always
28
00:03:03,430 --> 00:03:09,260
going to be a region of confidence, because
any mean will always have a standard deviation.
29
00:03:09,260 --> 00:03:13,510
Let us go about solving 1 at a time.
30
00:03:13,510 --> 00:03:21,590
So the coefficient of variation is given as 15
%: 15 is equal to 100 into s, the standard
31
00:03:21,590 --> 00:03:31,019
deviation, divided by x bar. As you can see, 282
is the mean, that is x bar, so from there I
32
00:03:31,019 --> 00:03:39,579
calculate s. Simple: I just rearrange, so s
comes out to be 42.3. Now the variance s square
33
00:03:39,579 --> 00:03:50,750
is equal to the square of that, 1789. Now I can
calculate the standard error of the mean by
34
00:03:50,750 --> 00:03:56,109
dividing by 20 here, right? If you remember, what
is the standard error? s by square root of
35
00:03:56,109 --> 00:04:02,950
n, or the square of the standard error of the
mean will be s square divided by n. That is
36
00:04:02,950 --> 00:04:09,699
what I get: 89.45. Now the CV is the same for
the anesthesia group also, so the standard error
37
00:04:09,699 --> 00:04:16,950
will be the same for the anesthesia group.
Do you understand this so far? I know the CV,
38
00:04:16,950 --> 00:04:22,460
I take 282 here as x bar, so I calculate s, that
is, the standard deviation. Now for the standard
39
00:04:22,460 --> 00:04:28,340
error of the mean, I just have to divide s square by
n, and then later on I need to take the square
40
00:04:28,340 --> 00:04:32,140
root, but right now we are not taking the square
root; we leave it like that. Now the CV is the
41
00:04:32,140 --> 00:04:37,160
same for the anesthesia group also, so standard
error for the anesthesia group also will be
42
00:04:37,160 --> 00:04:39,870
the same ok.
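A minimal sketch of the arithmetic so far, using only the numbers quoted in the lecture. The variable names are illustrative, and n = 20 is assumed because that is the divisor used to reach 89.45.

```python
import math

mean_hr = 282.0     # mean heart rate before anesthesia (from the lecture)
cv_percent = 15.0   # coefficient of variation in percent (from the lecture)
n = 20              # assumed: the divisor used in the lecture's arithmetic

# CV = 100 * s / x_bar, rearranged for s
s = cv_percent * mean_hr / 100.0
variance = s ** 2

# Square of the standard error of the mean: s^2 / n
se_squared = variance / n
se = math.sqrt(se_squared)

print(f"s = {s:.1f}, s^2 = {variance:.0f}, SE^2 = {se_squared:.2f}, SE = {se:.2f}")
```

The squared standard error comes out at about 89.46 here; the lecture's 89.45 follows from rounding the variance to 1789 before dividing.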
44
00:04:40,870 --> 00:04:45,860
Let us formulate our hypotheses. H naught:
mu equal to mu naught, that means there is
45
00:04:45,860 --> 00:04:52,680
no difference. In Ha we say mu is less than
mu naught. If you read the problem, it says decrease
46
00:04:52,680 --> 00:04:57,840
in heart rate, so we are saying mu is less
than mu naught. So what do we do? We take our
47
00:04:57,840 --> 00:05:04,720
t equation: t is equal to x bar minus mu naught
divided by s by square root of n. You remember
48
00:05:04,720 --> 00:05:10,460
this equation, right? t is equal to x bar minus
mu naught divided by s by square root of n. This
49
00:05:10,460 --> 00:05:17,130
is called the standard error, mu naught is the
population mean and x bar is the sample mean.
50
00:05:17,130 --> 00:05:24,870
The t degrees of freedom will be n minus 1. We calculate
the t and then we compare with the table t. If the
51
00:05:24,870 --> 00:05:29,650
t calculated is less than the table t, we accept
H naught; if the t calculated is greater than
52
00:05:29,650 --> 00:05:35,240
the table t, we reject H naught. Here, it is
a 1 tail test, please remember, because we are
53
00:05:35,240 --> 00:05:39,560
talking about this type of situation. We are
not saying mu not equal to mu naught, that
54
00:05:39,560 --> 00:05:45,590
means less or more, but we are specifically
talking about mu less than mu naught. So it
55
00:05:45,590 --> 00:05:47,840
is a 1 tail test. Simple.
56
00:05:47,840 --> 00:05:53,060
What we do is calculate t. I know 240, the
57
00:05:53,060 --> 00:06:02,580
other one is 282, and if you remember we got 89.45,
so the square root of 89.45 is what I need to put
58
00:06:02,580 --> 00:06:12,970
here. 240 minus 282, divided by that, I will get minus 4.44.
Now let us go to the t table. In the t table, if
59
00:06:12,970 --> 00:06:21,000
I go for the 1 tail test, for p is equal to 0.05,
1 tail test, p is equal to 0.05,
60
00:06:21,000 --> 00:06:33,840
I need
to look into this region. I get the 9 degrees
61
00:06:33,840 --> 00:06:42,860
of freedom because there are 10 rats, 10 minus
1 is 9; it is 1.833. t from the table is 1.833,
62
00:06:42,860 --> 00:06:52,460
t calculated is minus 4.44. Its magnitude is greater,
so we reject the null hypothesis at p is equal
63
00:06:52,460 --> 00:06:59,370
to 0.05. We need to accept the alternative
hypothesis that means, the heart rate has
64
00:06:59,370 --> 00:07:04,870
decreased because of anesthesia, and this decrease
is significant at the 95 % confidence level.
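As a rough check of the test just described, here is a small sketch. The helper name `one_sample_t` is made up for illustration, and the table value 1.833 is the one read out in the lecture (one tail, p = 0.05, 9 degrees of freedom).

```python
import math

def one_sample_t(x_bar, mu0, se):
    """t = (x_bar - mu0) / standard error of the mean."""
    return (x_bar - mu0) / se

se = math.sqrt(89.45)   # square root of the squared standard error found earlier
t_calc = one_sample_t(240.0, 282.0, se)

# One tail, p = 0.05, 9 degrees of freedom, as read from the table in the lecture.
# Note: the problem statement mentions 20 rats; for 19 degrees of freedom the
# standard table value is 1.729, which leads to the same conclusion.
t_table = 1.833

# Ha is mu < mu0, so we reject H0 when t_calc falls below -t_table
reject_h0 = t_calc < -t_table

print(f"t calculated = {t_calc:.2f}, reject H0: {reject_h0}")
```

Because the alternative hypothesis is one sided (a decrease), the comparison is made against minus the table value rather than against its magnitude on both sides.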
65
00:07:04,870 --> 00:07:11,140
So you understand this? It is very simple.
We need to use this table. As I said, if you
66
00:07:11,140 --> 00:07:19,120
are talking about a 1 tail test, use the top
numbers; if you are using a 2 tail test, we use
67
00:07:19,120 --> 00:07:26,760
the bottom numbers. When I said 0.05 for
1 tail, I used this, where
68
00:07:26,760 --> 00:07:34,190
as if I had said 2 tail, I would have
used this. As I said before also, 2 tail includes
69
00:07:34,190 --> 00:07:41,190
the outside area on both sides of my normal
distribution curve. If I want to consider
70
00:07:41,190 --> 00:07:46,870
only one side, obviously it is 0.05 by 2; that is
why here it is 0.025.
71
00:07:46,870 --> 00:07:52,040
So you know how to use it. Now I want to know the change
in the heart rate. What is the change? 42
72
00:07:52,040 --> 00:08:00,020
is the change in the heart rate. So, I need
to calculate the overall standard error, understood?
73
00:08:00,020 --> 00:08:06,970
The overall standard error, how do I do it? Both
are the same, right, 89.45. This is the variance
74
00:08:06,970 --> 00:08:20,200
of each, so 89.45 plus 89.45, that comes
out to be 178.9. I take the square root of that,
75
00:08:20,200 --> 00:08:31,110
I will get 13.4. Now I need to consider both
the data sets. One set is 10, so I have
76
00:08:31,110 --> 00:08:36,360
9 degrees of freedom; the other set is 10, so I have
9 degrees of freedom. Now I have 18 degrees of freedom
77
00:08:36,360 --> 00:08:41,080
and I need to use the 2 tail test, the 2 tail table,
because we are talking about plus or minus
78
00:08:41,080 --> 00:08:46,560
here. You remember this equation, right? mu
equal to x bar plus or minus t for the d f into s by square
79
00:08:46,560 --> 00:08:54,270
root of n. So, I use both the regions,
p is equal to 0.05 with 18 degrees of freedom.
80
00:08:54,270 --> 00:09:10,240
I read out this, 2.101; I take 2.1, and 13.4
comes from here, and 42 is the change in heart rate.
81
00:09:10,240 --> 00:09:15,830
The confidence limit for the change in heart rate
at 95 % confidence is 14 to 70.
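The confidence limits for the change can be sketched in a few lines. This assumes, as the lecture does, the same squared standard error (89.45) for both groups and the two tail table value 2.101 for 18 degrees of freedom; the variable names are illustrative.

```python
import math

se_sq_each = 89.45    # squared standard error of each group mean (from the lecture)
t_two_tail = 2.101    # two tail, p = 0.05, 18 degrees of freedom (table value quoted)
change = 282.0 - 240.0

# The variances of the two means add, so the standard error of the difference is
# sqrt(SE1^2 + SE2^2)
se_diff = math.sqrt(se_sq_each + se_sq_each)

half_width = t_two_tail * se_diff
lower = change - half_width
upper = change + half_width

print(f"SE of difference = {se_diff:.1f}")
print(f"95 % confidence limits for the change: {lower:.0f} to {upper:.0f}")
```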
82
00:09:15,830 --> 00:09:22,030
Do you understand the second part of the problem?
The second part of the problem is you are
83
00:09:22,030 --> 00:09:26,740
supposed to find out the confidence limit for
the change in the heart rate. The change in the
84
00:09:26,740 --> 00:09:32,390
heart rate is: originally it was 282, now it
became 240. Now to find the confidence limit
85
00:09:32,390 --> 00:09:40,610
I use this equation, if you remember;
it is plus or minus t into s by square root of n.
86
00:09:40,610 --> 00:09:48,820
Now we have 89.45 for one set, 89.45 for the
second set, so in total 178.9. I take a square
87
00:09:48,820 --> 00:09:59,860
root here, it becomes 13.4; that is the standard error,
this whole portion. Now for t I use 18 degrees
88
00:09:59,860 --> 00:10:05,620
of freedom. Do you understand why? Because
the first set has 10 rats, 9 degrees of freedom,
89
00:10:05,620 --> 00:10:11,089
the second has 10 rats, 9 degrees of freedom, totally
18 degrees of freedom. I used 2 tail because
90
00:10:11,089 --> 00:10:20,550
this is a plus or minus we are doing, and
then t comes out from the table: 0.05, 18 degrees
91
00:10:20,550 --> 00:10:29,050
of freedom, 2.101. So I substitute here, I put
13.4. So when I do that I get the change in
92
00:10:29,050 --> 00:10:36,640
the heart rate, the confidence limit for the
change in heart rate at 95 % confidence is
93
00:10:36,640 --> 00:10:44,000
14 to 70. It is an interesting problem, a very
very useful problem; we do quite a lot of
94
00:10:44,000 --> 00:10:51,180
these types of studies in clinical trials.
We have a drug, you give the drug and you
95
00:10:51,180 --> 00:10:57,580
see changes. You know the original population
and you are looking at the change, then you are
96
00:10:57,580 --> 00:11:02,100
trying to see whether the change is statistically
significant or not. So this is varying.
97
00:11:02,100 --> 00:11:07,279
Let us look at another problem. The concentration
of a drug in the brain of 6 rabbits is determined
98
00:11:07,279 --> 00:11:18,860
5 minutes after administering the
drug. You are injecting a drug into 6 rabbits,
99
00:11:18,860 --> 00:11:24,230
then after 5 minutes we are measuring their
concentration, it comes out to be 40.2, 38.7,
100
00:11:24,230 --> 00:11:32,589
41.6, 40.5, 43.2, 39.5. Now calculate the
99 % confidence interval for the true mean
101
00:11:32,589 --> 00:11:39,350
concentration. That we can do. The mean concentration
of the same drug in rats is reported to be
102
00:11:39,350 --> 00:11:48,430
45 milligram per gram. Now this is the sample
data from rabbits and this is the data which they
103
00:11:48,430 --> 00:11:55,600
got from rats. Is there a statistically significant
difference between these 2 data? So 2 parts:
104
00:11:55,600 --> 00:12:02,380
1 is calculating the 99 % confidence interval, which is
very simple. We can calculate x bar from this
105
00:12:02,380 --> 00:12:09,610
data, then for 5 degrees of freedom with
99 % confidence, from the table we can
106
00:12:09,610 --> 00:12:14,270
read t, and then we can calculate the
standard error, that is s by square root of
107
00:12:14,270 --> 00:12:19,829
n. That will give you my confidence limit, right?
108
00:12:19,829 --> 00:12:32,339
Then the second part of the problem. Let us
109
00:12:32,339 --> 00:12:39,620
look at this average so, the average comes
out to be 40.6 and S is equal to 1.6. Of course,
110
00:12:39,620 --> 00:12:44,970
we can calculate the CV also in this case,
but mainly we are more interested in this.
111
00:12:44,970 --> 00:12:55,550
So, we take x bar as 40.6, we take mu naught as
45, then you know the S, then divide by the square
112
00:12:55,550 --> 00:13:02,180
root of 6, and then the degrees of freedom is 5.
If the t calculated is
113
00:13:02,180 --> 00:13:09,640
less than the table t, accept H naught; if t is
greater than the table t, reject H naught.
114
00:13:09,640 --> 00:13:17,050
We are doing a 2 tail test because the question
is: in rats it is 45 milligram per gram, is
115
00:13:17,050 --> 00:13:22,510
there a statistical difference between these
2 data? Is there a statistical difference?
116
00:13:22,510 --> 00:13:29,880
H naught will be mu equal to mu naught,
H alternate will be mu is not equal to mu
117
00:13:29,880 --> 00:13:31,240
naught.
118
00:13:31,240 --> 00:13:37,630
So we do that two tail test, for
5 degrees of freedom and 99 % confidence.
119
00:13:37,630 --> 00:13:45,800
Let us look at the table. Two tail test, 99
% confidence means it is 0.01; for 5 degrees
120
00:13:45,800 --> 00:14:01,730
of freedom you get this here, 4.032. So this
is the equation here, x bar minus mu naught.
121
00:14:01,730 --> 00:14:08,800
We do 40.6 minus 45, and then your standard
deviation is 1.6 for the data. So, divided
122
00:14:08,800 --> 00:14:17,860
by square root of 6, that will come to minus
6.73. The data is significantly different.
123
00:14:17,860 --> 00:14:25,440
Data is significantly different so that means,
we can reject the null hypothesis. That means
124
00:14:25,440 --> 00:14:32,190
we accept the alternative hypothesis. Now
the difference in the mean, 40.6 minus 45, is
125
00:14:32,190 --> 00:14:40,430
4.4 in magnitude, right? So the confidence interval on the mean
is obviously plus or minus t into s divided by square root of
126
00:14:40,430 --> 00:14:47,830
n; t is for 5 degrees of freedom. We get this,
and then s by square root of n. So we get a confidence
127
00:14:47,830 --> 00:14:53,520
interval on the mean as this actually, the
difference in the mean is this, confidence
128
00:14:53,520 --> 00:15:05,110
interval on the mean is this. Now we can use
the GraphPad also to do the same calculation,
129
00:15:05,110 --> 00:15:12,550
we can use not only the GraphPad we can also
use our excel also to do the calculation.
130
00:15:12,550 --> 00:15:17,050
Shall we look at excel? Quite simple.
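Before moving to the software, the hand calculation for the rabbit data can be sketched with the Python standard library. The data are the six values read out when the problem was stated, and 4.032 is the table value quoted for two tails, p = 0.01 and 5 degrees of freedom; the variable names are illustrative.

```python
import math
import statistics

conc = [40.2, 38.7, 41.6, 40.5, 43.2, 39.5]   # rabbit data as first read out
mu0 = 45.0                                     # reported mean in rats

n = len(conc)
x_bar = statistics.mean(conc)
s = statistics.stdev(conc)        # sample standard deviation (n - 1 in the denominator)
se = s / math.sqrt(n)             # standard error of the mean

t_calc = (x_bar - mu0) / se

# Two tail, 99 % confidence, 5 degrees of freedom, as read from the table
t_table_99 = 4.032
ci_low = x_bar - t_table_99 * se
ci_high = x_bar + t_table_99 * se
reject_h0 = abs(t_calc) > t_table_99

print(f"mean = {x_bar:.1f}, s = {s:.1f}, t = {t_calc:.2f}")
print(f"99 % confidence interval: {ci_low:.1f} to {ci_high:.1f}")
print(f"reject H0: {reject_h0}")
```

The unrounded mean and standard deviation give t of about minus 6.72; the lecture's minus 6.73 comes from the rounded values 40.6 and 1.6. The value 45 lies outside the 99 % interval, which agrees with rejecting the null hypothesis.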
131
00:15:17,050 --> 00:15:39,019
Let us go back to excel, so the data is 40.2,
38.7, 41.6, 40.5, 43.2, 39.4 so we have a
132
00:15:39,019 --> 00:15:59,370
sorry, 38.7, and this data is not there. We
have 6 data points. Now the command, if you remember,
133
00:15:59,370 --> 00:16:10,390
we go to this and there is something called
a t-test. So when you look at the t-test what
134
00:16:10,390 --> 00:16:23,810
are the various things? Array 1.
So we give the data set here and then we give the array
135
00:16:23,810 --> 00:16:35,670
2. So here we give the 45, we give two tail
test, and then here we give paired or two sample
136
00:16:35,670 --> 00:16:49,550
t-test. Of course, as you can see,
unfortunately we cannot do in excel this
137
00:16:49,550 --> 00:16:57,280
type of calculation. But
we can use the other software here, as you
138
00:16:57,280 --> 00:17:15,329
can see descriptive statistics, t-test here
the GraphPad, here you have the 1 sample t-test.
139
00:17:15,329 --> 00:17:41,150
We can give the data here 40.2 then 38.7 then
41.6 and then 40.5, 43.2, 39.4 then we say
140
00:17:41,150 --> 00:17:49,730
we are comparing with 45. So, we have a 6
and then mu naught is here so now when you
141
00:17:49,730 --> 00:17:59,410
say calculate. Now of course, this gives you
the result only at a 95 % confidence interval. This is
142
00:17:59,410 --> 00:18:09,660
what it is doing. The p value comes out to be 0.0011
and it is a statistically significant difference,
143
00:18:09,660 --> 00:18:16,830
as you can see here. Even at the 99 % confidence
level it will come out to be a statistically
144
00:18:16,830 --> 00:18:20,510
significant difference because if you look
at the table here.
145
00:18:20,510 --> 00:18:30,490
If you look at the table here, for 5
degrees of freedom, for a 99 % confidence
146
00:18:30,490 --> 00:18:38,640
interval it is 4.032, for 95 % confidence
interval it is 2.571. So your t value which
147
00:18:38,640 --> 00:18:47,800
we calculate is quite high here, 6.6892, so
obviously that number is very high. Your t
148
00:18:47,800 --> 00:18:52,670
value which you are calculating is pretty high,
so obviously that number is much higher than
149
00:18:52,670 --> 00:19:06,600
the table t, whether it is at 95 % or whether
it is at 99 %. You see that with excel
150
00:19:06,600 --> 00:19:19,160
we do not have a way of calculating the 1 sample
t-test, whereas with GraphPad we can do this
151
00:19:19,160 --> 00:19:24,940
type of calculation. Now let us
go to the next type of test, which is called the
152
00:19:24,940 --> 00:19:25,940
two sample t-test.
153
00:19:25,940 --> 00:19:30,220
That means, I have 2 drugs I am comparing
the performance of 2 drugs and I am trying
to tell whether there is a statistically significant
00:19:30,220 --> 00:19:36,080
tell whether there is a statistically significant
difference or not or I am comparing a control
155
00:19:36,080 --> 00:19:42,049
and drug A which I am introducing in the
market, or I am comparing an already existing
156
00:19:42,049 --> 00:19:49,020
drug and another drug, or
I am comparing the performance of a drug on
157
00:19:49,020 --> 00:19:54,410
a male population and a female population.
We have 2 sets of samples, so you will have
158
00:19:54,410 --> 00:20:01,490
an x one bar and an x two bar, that is, two means,
and then we are trying to see whether each
159
00:20:01,490 --> 00:20:07,580
of these sample means comes from the same population
or they come from different populations. For
160
00:20:07,580 --> 00:20:14,120
example, say a drug is tested in
rats for lowering the blood glucose level. This
161
00:20:14,120 --> 00:20:18,390
is the drug treated rats, this is the glucose
level, whereas this is the control
162
00:20:18,390 --> 00:20:26,290
rats, this is the glucose level. So we have
2 sets of samples, 6 samples for the drug treated
163
00:20:26,290 --> 00:20:34,480
rats, 5 for the control. Now, is the drug
effective in lowering the blood glucose level?
164
00:20:34,480 --> 00:20:40,290
For example imagine this sample comes from
one population like this, imagine this sample
165
00:20:40,290 --> 00:20:46,210
comes from another population like this. Now
are these populations overlapping so much
166
00:20:46,210 --> 00:20:52,679
that we can consider it as one single population,
or are they different? That is the whole idea
167
00:20:52,679 --> 00:20:58,870
of this. Is the drug effective in lowering?
That means this is a single tailed test.
168
00:20:58,870 --> 00:21:09,070
So in total we have a t value, so we can find
out and we can tell whether there is a statistically
169
00:21:09,070 --> 00:21:17,010
significant difference by saying whether the
drug is lowering or there is no difference.
170
00:21:17,010 --> 00:21:25,080
The null hypothesis will be mu t is equal
to mu c, that is, test and control. We say
171
00:21:25,080 --> 00:21:31,400
drug treated, we call it test; c means control.
The alternate would be mu t less than
172
00:21:31,400 --> 00:21:35,110
mu c.
173
00:21:35,110 --> 00:21:43,940
This is the equation by which these things
are calculated. Suppose, we have a number
174
00:21:43,940 --> 00:21:49,770
of data points for one set of sample, number
of data points of another set of sample so
175
00:21:49,770 --> 00:21:55,919
the degrees of freedom will be this plus this
minus 2, that is why in this particular problem
176
00:21:55,919 --> 00:22:06,400
we may have 6 here and 5 here; 6 plus 5
is 11, 11 minus 2 is 9 degrees of freedom.
177
00:22:06,400 --> 00:22:11,360
Do you understand? So here, we have taken
6 data points for the drug treated; here we have
178
00:22:11,360 --> 00:22:17,290
taken 5 data points. The advantage of the 2 sample
t-test is I could have different sizes of samples
179
00:22:17,290 --> 00:22:22,590
also. In fact, in this particular equation, if you
look at this equation very carefully, I can
180
00:22:22,590 --> 00:22:28,840
have any value of n h, I can have any value
of n l, so they need not be the same. So the t value
181
00:22:28,840 --> 00:22:34,919
is given by this equation: X h minus X l, that
means, the mean of one data set minus the mean of
182
00:22:34,919 --> 00:22:42,049
another data set or sample set, divided by 1 by n h
plus 1 by n l, square root, and then we have
183
00:22:42,049 --> 00:22:54,210
something called s p here, this is the standard
error. If you look it very carefully this
184
00:22:54,210 --> 00:22:58,740
for one sample we used to have s divided
by square root of n. This is almost analogous
185
00:22:58,740 --> 00:23:06,510
to that: here we have square root of 1 by
n h plus 1 by n l, into s p, but here s p looks
186
00:23:06,510 --> 00:23:12,870
slightly different: square root of n h minus 1 into s h square
plus n l minus 1 into s l square, divided by n h
187
00:23:12,870 --> 00:23:19,880
minus 1 plus n l minus 1. That is the equation
for s p here. There are some similarities
188
00:23:19,880 --> 00:23:27,980
if you look at it this equation and the normal
t equation for one sample t-test and so on.
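Written out, the equation being described is the standard pooled two sample t statistic. The symbols follow the lecture (h and l label the two sample sets, s_p is the pooled standard deviation); the square root on s_p is part of the standard definition.

```latex
t = \frac{\bar{X}_h - \bar{X}_l}{s_p \sqrt{\dfrac{1}{n_h} + \dfrac{1}{n_l}}},
\qquad
s_p = \sqrt{\frac{(n_h - 1)\, s_h^2 + (n_l - 1)\, s_l^2}{(n_h - 1) + (n_l - 1)}},
\qquad
\text{degrees of freedom} = n_h + n_l - 2
```

For one sample the denominator was s divided by the square root of n, that is, s times the square root of 1/n; here the square root of 1/n_h + 1/n_l plays the same role.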
189
00:23:27,980 --> 00:23:35,550
So, it is quite simple. We have 2 data sets
I get the mean of this, I get the standard
190
00:23:35,550 --> 00:23:42,350
deviation of this; n h is equal to 6, n l is
equal to 5. I substitute the standard
191
00:23:42,350 --> 00:23:49,820
deviations here, square them; here it will become
6 minus 1, 5 minus 1, and here we have 1 by 6, 1 by
192
00:23:49,820 --> 00:23:55,860
5 and then here you can get the mean of both
the data sets and you get the answer. So we
193
00:23:55,860 --> 00:24:01,549
can use this simple excel to calculate means
and standard deviations and that is what I
194
00:24:01,549 --> 00:24:08,330
will do. The whole idea is we try
to calculate t using fundamentals. It is easy to
195
00:24:08,330 --> 00:24:14,250
use software, whether an excel command like
t-test or GraphPad like
196
00:24:14,250 --> 00:24:19,169
software or any commercial software, but then
you need to know what the underlying equation is;
197
00:24:19,169 --> 00:24:23,950
that is very, very important and that is what
I would like to emphasize on.
198
00:24:23,950 --> 00:24:29,360
With the drug, the numbers are here, 6 numbers
199
00:24:29,360 --> 00:24:33,900
are here. Control, that means you are not giving
the drug; there are 5 numbers. So we can get the
200
00:24:33,900 --> 00:24:39,350
average. The average is very simple: you add all
of them and divide by 6. You add all of them
201
00:24:39,350 --> 00:24:46,280
divided by 5. For the standard deviation of this data
set, again you can use excel; there is the
202
00:24:46,280 --> 00:24:52,731
STDEV function. This gives you the standard
deviation of this, this gives you the standard
203
00:24:52,731 --> 00:24:59,750
deviation of this. Now you are squaring this
also because I need it here, right? So
204
00:24:59,750 --> 00:25:12,440
I am squaring this and then I am adding this
and then from there I calculate the t using
205
00:25:12,440 --> 00:25:22,370
this formula. The denominator has 1 by 6 plus
1 by 5, with a square root here. So, I calculate
206
00:25:22,370 --> 00:25:29,990
this, from there I can calculate the t calculated,
and it comes out to be minus 1.89. Now for the t
207
00:25:29,990 --> 00:25:36,880
table, how many degrees of freedom? 6 plus
5 is 11. 11 minus 2 is 9, so 9 degrees of
208
00:25:36,880 --> 00:25:44,179
freedom. You go to the t table; it is single
tail, 95 %. Why single tail? Is the drug
209
00:25:44,179 --> 00:25:52,840
effective in lowering: so it is a single tailed
test. We go to the table; it is single tail,
210
00:25:52,840 --> 00:26:04,810
95 %, 9 degrees of freedom; it comes to 1.833.
The t calculated should be greater than 1.833,
211
00:26:04,810 --> 00:26:10,990
and it comes to 1.89. So we can reject the
null hypothesis. Please note it is only marginally
212
00:26:10,990 --> 00:26:20,820
higher; 1.89 over 1.833 is not very high. Where
as if we take a 99 % confidence interval, that
213
00:26:20,820 --> 00:26:29,030
is 0.01, for 9 degrees of freedom it is 2.82;
the table t is higher, so we will not reject the
214
00:26:29,030 --> 00:26:37,910
null hypothesis. It is a very close problem,
very close conclusion. I would not go with
215
00:26:37,910 --> 00:26:43,940
a close conclusion; I may do more experiments
to be very sure. You understand? In this problem,
216
00:26:43,940 --> 00:26:53,269
with 95 %, 1.833 is the table t and 1.89 is calculated, so I
can reject the null hypothesis; with 99
217
00:26:53,269 --> 00:26:59,980
% and a one tailed test you get a much higher table t,
2.82, so you will not reject the
218
00:26:59,980 --> 00:27:07,630
null hypothesis and even 1.83 and 1.89 are
very close. Although you can say I will reject
219
00:27:07,630 --> 00:27:13,750
the null hypothesis at 95 % confidence,
I would not go too much; I may try to do more
220
00:27:13,750 --> 00:27:21,470
experiments so that we get a better difference
between the t calculated and the table t, do
221
00:27:21,470 --> 00:27:27,789
you understand? This is a very close call
actually.
222
00:27:27,789 --> 00:27:37,590
Excel also can do this. I showed you
just a few minutes back that excel cannot
223
00:27:37,590 --> 00:27:43,880
do one sample t-test, but it can do a two
sample t-test, it can do a paired two sample
224
00:27:43,880 --> 00:27:49,850
t-test, it can do a two sample t-test with
equal variance, and a two sample t-test with
225
00:27:49,850 --> 00:27:54,760
unequal variance. So the arguments are array
1, array 2, tails and type. Array 1 is the first
226
00:27:54,760 --> 00:28:02,280
data set, array 2 is the second data set, tails
1 means one tail, 2 means two tail. So, from excel
227
00:28:02,280 --> 00:28:12,800
we get 0.043; that is the probability, so it
is less than 0.05. We can say we will reject the
228
00:28:12,800 --> 00:28:21,730
hypothesis, but it is a very close call. How
do you do it in excel? We just put these data
229
00:28:21,730 --> 00:28:27,100
inside and then we do the calculations.
230
00:28:27,100 --> 00:29:10,630
We say 2.02, 1.71, 2.04, 1.5, 1.83, 1.69 then
here 1.71, 2.04 then 2.15, 1.92, 1.78, 2.04
231
00:29:10,630 --> 00:29:25,400
2.2, so we use the same command, t-test. So we
give the first data set and then comma, we
232
00:29:25,400 --> 00:29:35,340
give the second data set, then comma, it
is a 1 tail test, and we can give equal
233
00:29:35,340 --> 00:29:45,531
variance or unequal variance. So if I put 3, I get
0.043. Basically I have given it as 3, that
234
00:29:45,531 --> 00:29:50,760
means unequal variance; I can even give
it as equal. For unequal variance I am getting
235
00:29:50,760 --> 00:29:59,960
0.043. Yes, the variances are unequal, as you
can see standard deviations are unequal here.
236
00:29:59,960 --> 00:30:09,290
It is giving 0.043, so we reject the null hypothesis
and hence accept the alternate hypothesis. Same
237
00:30:09,290 --> 00:30:18,789
thing we can do with the GraphPad software
also.
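The same two sample calculation can be sketched by hand with the pooled equation described earlier. The split of the spoken numbers into a drug treated set of 6 and a control set of 5 is an assumption made from the transcript (the two repeated values appear to be a typing correction), and the function name `pooled_t` is illustrative. The table value 1.833 is the one quoted for one tail, 95 %, 9 degrees of freedom.

```python
import math
import statistics

# Assumed reading of the spoken data: 6 drug treated values, 5 control values
treated = [2.02, 1.71, 2.04, 1.5, 1.83, 1.69]
control = [2.15, 1.92, 1.78, 2.04, 2.2]

def pooled_t(a, b):
    """Two sample t statistic with a pooled standard deviation (equal variance form)."""
    n_a, n_b = len(a), len(b)
    var_a = statistics.variance(a)   # sample variance, n - 1 denominator
    var_b = statistics.variance(b)
    # Pooled standard deviation s_p
    s_p = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / ((n_a - 1) + (n_b - 1)))
    se = s_p * math.sqrt(1.0 / n_a + 1.0 / n_b)
    t = (statistics.mean(a) - statistics.mean(b)) / se
    df = n_a + n_b - 2
    return t, df

t_calc, df = pooled_t(treated, control)

t_table_95 = 1.833   # one tail, 95 %, 9 degrees of freedom (from the lecture)
reject_h0 = t_calc < -t_table_95   # Ha: treated mean is lower than control mean

print(f"t calculated = {t_calc:.2f} with {df} degrees of freedom")
print(f"reject H0 at 95 %: {reject_h0}")
```

With these values t comes out near minus 1.88, close to the lecture's minus 1.89, and only just beyond the table value. This is the pooled (equal variance) form; the 0.043 from excel was obtained with type 3, the unequal variance form, so the two are not expected to match exactly.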
238
00:30:18,789 --> 00:30:25,640
But you see it is going to be very close in
GraphPad software. So what do we do? How do
239
00:30:25,640 --> 00:30:32,750
we do this in GraphPad? Let us go back there
so here we have the t-test command compare
240
00:30:32,750 --> 00:30:39,320
2 means. So we go there and then we continue.
241
00:30:39,320 --> 00:30:42,299
Because the t calculated and the table t
242
00:30:42,299 --> 00:30:49,080
are very close to each other, some software
may say it is not statistically significant,
243
00:30:49,080 --> 00:30:55,230
some software may say statistically significant.
Especially you need to watch out when the
244
00:30:55,230 --> 00:30:59,940
values are very close to each other, that is, t
calculated and table t; it is better to do more
245
00:30:59,940 --> 00:31:06,900
experiments to have more confidence in your
conclusion. So we will continue on these t-tests
246
00:31:06,900 --> 00:31:08,750
in the next class also.
Thank you very much.