Diarrhea is a major public health problem in underdeveloped countries, especially for babies. Diarrhea leads to dehydration, which results in millions of deaths each year worldwide.
Researchers in Peru conducted a double-blind randomized controlled trial, published in The New England Journal of Medicine, to determine whether Bismuth salicylate would improve the outcomes of infants suffering from diarrhea.
In their study, all infants received the standard therapy for diarrhea: oral re-hydration. In addition to the rehydration, 85 babies received bismuth salicylate, while 84 babies received a placebo. The total stool volumes for all infants over the course of their illness was measured. To adjust for body size, the researchers divided by body weight to obtain their outcome of interest: stool output per kilogram of body weight. The results of their study are available on the course website.
DiaData <- read.delim("https://raw.githubusercontent.com/IowaBiostat/data-sets/main/diarrhea/diarrhea.txt")
Let’s examine the distribution of stool output by group using a boxplot of the data:
boxplot(DiaData$Stool ~ DiaData$Group,
col = c("burlywood4", "darkgoldenrod3"),
ylab = "Drug Group",
xlab = "Stool Output",
main = "Effect of bismuth salicylate",
horizontal = T)
Summarize the distribution and describe any differences you notice about the two groups
The distribution appears to be right skewed in both groups with some very high values. Stool output appears to be a little lower in the treatment group.
To do a student’s two sample t-test, we must be able to find the following values:
These three can be found using the equations below (which are also in your lecture slides):
Test statistic:
\(SD_{(pooled)} = \sqrt{ \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 -2} }\)
\(SE_d = SD_{(pooled)} * \sqrt{ \frac{1}{n_1} + \frac{1}{n_2}}\)
\(t = \frac{\bar{x_1} - \bar{x_2}}{SE_d}\) with:
\(df = n_1 + n_2 - 2\)
Confidence interval:
\(\bar{x_1} - \bar{x_2} \pm t_{\alpha/2, df}*SE_d\)
So as you can tell we need to compute several statistics from the data in order to do this t-test. We need to compute the mean of continuous outcomes for both groups as those will be our \(\bar{x_1}\) and \(\bar{x_2}\) as well as the standard error above.
What is our null hypothesis for this data?
The null hypothesis is that there is no difference in the means of the two groups, meaning \(\bar{x_1} = \bar{x_2}\) or equivalently \(\bar{x_1} - \bar{x_2} = 0\).
Calculate the test statistic and 95% confidence interval manually. Summarize your results.
As a reminder, here are some helpful functions for finding summary statistics:
Recall the ‘by’ function we learned in Lab 4: allows us to run a function over a data set by groups.
#Find n in each group
table(DiaData$Group)
#Find mean of each group
by(DiaData$Stool, DiaData$Group, mean)
#Find standard deviation of each group
by(DiaData$Stool, DiaData$Group, sd)
Hypothesis Test:
x1 <- 260.2976 # Mean of control group
x2 <- 181.8706 # Mean of treatment group
sd1 <- 253.6135 # SD of control group
sd2 <- 197.3636 # SD of treatment group
n1 <- 84 # n for control group
n2 <- 85 # n for treatment group
sd_pooled <- sqrt(((n1 - 1)*sd1^2 + (n2 - 1)*sd2^2) / (n1 + n2 - 2))
SE <- sd_pooled * sqrt((1 / n1) + (1 / n2))
df <- n1 + n2 - 2
t <- (x1 - x2) / (SE)
t
## [1] 2.244989
2*pt(t, df = df, lower.tail = F)
## [1] 0.02608141
The stool output is significantly lower in the treatment group than in the control group.
Confidence Interval:
(x1 - x2) + qt(c(0.025, 0.975), df = df)*SE
## [1] 9.45734 147.39666
In other words, the average amount by which the treatment lowers stool output is between 9 and 147 ml/kg.
Check with t.test function
Comments about t.test code:
Note that we can use the tilde (~) to specify that you want to look at stool output by group. We use the tilde because all stool outputs are in one column and group is given in another column. If you were given the stool outputs in two columns (Control & Treatment), then you would simply give R the two columns separated by a comma. It would look something like: ‘t.test(Treatment, Control, var.equal = TRUE)’. See lab 10’s paired t-test example using the ‘Anorexia’ data for an example of data where each group is in a separate column.
Note also that we no longer have to say ‘paired=TRUE’ in the function. The default is ‘paired=FALSE’ and that is exactly the test we want.
We have to specify that we are assuming the population variances are equal using the argument ‘var.equal=TRUE’, otherwise the results will not be the same as the results we computed by hand.
t.test(DiaData$Stool~DiaData$Group, var.equal = TRUE)
##
## Two Sample t-test
##
## data: DiaData$Stool by DiaData$Group
## t = 2.245, df = 167, p-value = 0.02608
## alternative hypothesis: true difference in means between group Control and group Treatment is not equal to 0
## 95 percent confidence interval:
## 9.457362 147.396700
## sample estimates:
## mean in group Control mean in group Treatment
## 260.2976 181.8706
When we calculated the two sample t-test by hand above, we had to assume the population variances (square of the standard deviation) were equal.
Review the standard deviations of the two groups and describe whether you think the assumption is met
by(DiaData$Stool, DiaData$Group, sd)
## DiaData$Group: Control
## [1] 253.6135
## ------------------------------------------------------------
## DiaData$Group: Treatment
## [1] 197.3636
If the population variances were not similar, we would consider using the Welch’s two sample t-test. The Welch’s test accounts for data that do not have equal population variances, providing us with a more accurate test result if the assumption of equal variances is violated. To do this test by hand is quite computational so we will learn how to compute it in R.
To conduct a Welch’s test, we simply switch the variance argument to var.equal= FALSE. This is also the default, so you can leave out the argument entirely and get the same answer.
t.test(DiaData$Stool~DiaData$Group, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: DiaData$Stool by DiaData$Group
## t = 2.2417, df = 156.64, p-value = 0.02638
## alternative hypothesis: true difference in means between group Control and group Treatment is not equal to 0
## 95 percent confidence interval:
## 9.323083 147.530979
## sample estimates:
## mean in group Control mean in group Treatment
## 260.2976 181.8706
Here our p-value is very close to the one we obtained using the Student’s test. This will generally be the case when the population variances are reasonably close or when the sample sizes are very large.
Some infants are born with congenital heart defects and require surgery very early in life. One approach to performing this surgery is known as “circulatory arrest.” A downside of this procedure, however, is that it cuts off the flow of blood to the brain, possibly resulting in brain damage. An alternative procedure, “low-flow bypass” maintains circulation to the brain, but does so with an external pump that potentially causes other sorts of injuries to the brain.
To investigate the treatments, surgeons at Harvard Medical School conducted a randomized controlled trial. In the trial, 70 infants received low-flow bypass surgery and 73 received the circulatory arrest approach. The researchers looked at two outcomes: the Psychomotor Development Index (PDI), which measures physiological development, and the Mental Development Index (MDI), which measures mental development. For both indices, higher scores indicate greater levels of development. The results of their study are on the course website.
Use:
heart <- read.delim("https://raw.githubusercontent.com/IowaBiostat/data-sets/main/infant-heart/infant-heart.txt")
by(heart$PDI, heart$Treatment, sd)
## heart$Treatment: Circulatory arrest
## [1] 16.488
## ------------------------------------------------------------
## heart$Treatment: Low-flow bypass
## [1] 14.68527
by(heart$MDI, heart$Treatment, sd)
## heart$Treatment: Circulatory arrest
## [1] 16.56448
## ------------------------------------------------------------
## heart$Treatment: Low-flow bypass
## [1] 14.57256
The standard deviations appear fairly similar in both cases, therefore we will use student’s two sample t-test.
t.test(heart$PDI ~ heart$Treatment, var.equal = T)
##
## Two Sample t-test
##
## data: heart$PDI by heart$Treatment
## t = -2.2385, df = 141, p-value = 0.02676
## alternative hypothesis: true difference in means between group Circulatory arrest and group Low-flow bypass is not equal to 0
## 95 percent confidence interval:
## -11.0232390 -0.6840017
## sample estimates:
## mean in group Circulatory arrest mean in group Low-flow bypass
## 91.91781 97.77143
There is significant evidence that physiological development is better in the low-flow bypass group.
t.test(heart$MDI ~ heart$Treatment, var.equal = T)
##
## Two Sample t-test
##
## data: heart$MDI by heart$Treatment
## t = -1.2696, df = 141, p-value = 0.2063
## alternative hypothesis: true difference in means between group Circulatory arrest and group Low-flow bypass is not equal to 0
## 95 percent confidence interval:
## -8.484009 1.848393
## sample estimates:
## mean in group Circulatory arrest mean in group Low-flow bypass
## 103.0822 106.4000
There is insufficient evidence that mental development is difference between the two groups.
Using the output from part b), we get a 95% confidence interval of (-11.02, -0.68). This means that we can say with 95% confidence that the physiological development score is between 0.68 and 11.02 higher for those in the low-flow bypass group. Therefore, the low-flow bypass group did better.
Using the output from part c), we get a 95% confidence interval of (-8.48, 1.85). This means we can say with 95% confidence that the true difference in mental development score between the circulatory arrest and low-flow bypass groups is between -8.45 and 1.85. Therefore, we cannot claim that either group did better.
Summary taken from the journal for comparison (some of their analyses were different, but our results should support the same conclusions):
Of the 171 patients enrolled in the study, 155 were evaluated. After adjustment for the presence or absence of a ventricular septal defect, the infants assigned to circulatory arrest, as compared with those assigned to low-flow bypass, had a lower mean score on the Psychomotor Development Index. The method of support was not associated with the prevalence of abnormalities on MRI scans of the brain, or scores on the Mental Development Index. Heart surgery performed with circulatory arrest as the predominant support strategy is associated with a higher risk of delayed motor development and neurologic abnormalities at the age of one year than is surgery with low-flow bypass as the predominant support strategy.