Do children eat more food when they prepare their own healthy and balanced meal?

Conducting a Two-Sample T-Test in R to compare the difference.

Evangeline Lee, May 28

Introduction

As technology advances, so does junk food.

Children nowadays have more choice in what they eat, and most of them choose junk food simply because it tastes better to them than healthy meals.

Schools offer a variety of healthier lunch choices under the National School Lunch and School Breakfast Programs, but in reality what ends up on a child’s plate is not always nutritious.

The purpose of the study was to examine the effect of children’s participation in meal preparation on their food intake at lunch.

The dataset used here comes from the research report “Involving Children in Meal Preparation” by Klazine van der Horst, Aurore Ferrage, and Andreas Rytz, published in Appetite (vol. 79, pp. 18–24) in August 2014.

The experiment was conducted with 47 children aged 6 to 10, who were split into two groups. In treatment group 1, 25 children (n = 25) prepared their own balanced lunch (pasta, breaded chicken, cauliflower, and salad) with the assistance of a parent. In treatment group 2, 22 children (n = 22) did not take part in the preparation, and the parent prepared the same meal alone.

Below is the full dataset:

Hypothesis

Since we have only two samples and the goal is to compare the means of the two treatment groups, we will perform a Two-Sample T-Test.

We set our alpha level to 0.05 and state the hypotheses as follows:

H0: μ1 - μ2 = 0 (there is no difference between the means of the two groups)

H1: μ1 - μ2 > 0 (the mean of group 1 is greater than the mean of group 2)

Two-Sample T-Test Assumptions

Before we conduct the Two-Sample T-Test, we should check its assumptions, because the outcome determines which variance option to select when running the test.

The assumptions for a Two-Sample T-Test are as follows:

1. Independent observations.
2. Normal distribution for each of the two groups.
3. Equal variance for each of the two groups.

Assumption 1: Independent Observations

Since the 47 children were split into two separate groups, no child was observed more than once, i.e., a child in treatment group 1 was not also in treatment group 2.

Hence the first assumption holds.

Assumption 2: Normal Distribution

Now we need to check the normal distribution assumption.

We begin by examining the density plot and the boxplot of the dataset.

    library(plyr)     # ddply()
    library(ggplot2)  # plotting

    # store the mean for each group
    kid.calories.mean <- ddply(kid.calories, "Trt", summarize,
                               trt.mean = mean(Calories))

    # density plot with a vertical line at each group mean
    ggplot(kid.calories, aes(x = Calories, fill = Trt)) +
      geom_density(alpha = 0.25) +
      geom_vline(data = kid.calories.mean,
                 aes(xintercept = trt.mean, col = Trt), size = 1) +
      theme_bw() +
      ggtitle("Density Plot for Treatments") +
      labs(y = "Density", fill = "Treatment", col = "Mean")

The density plot above shows the distribution of each treatment group.

Both distributions look approximately normal, with only a slight skew to the left.

A mild departure like this should not be a problem for the Two-Sample T-Test.

    # boxplot
    ggplot(kid.calories, aes(x = Trt, y = Calories, fill = Trt)) +
      geom_boxplot(alpha = 0.5) +
      xlim("1", "2") +
      theme_bw() +
      ggtitle("Boxplot for Treatments") +
      labs(x = "Treatment", y = "Calories") +
      scale_fill_discrete(name = "Treatment")

The boxplot shows no outliers in either treatment group.

Treatment group 1 looks symmetric and consistent with a normal distribution.

Treatment group 2, although fairly symmetric, is skewed slightly to the left.

Again, this should not be a problem, as both groups are approximately normal.

Another way to check normality is the normal Q-Q plot.

    # fit a one-way model so that standardized residuals are available
    aov_kid.calories <- aov(Calories ~ Trt, kid.calories)

    # normal Q-Q plot
    ggplot(aov_kid.calories, aes(sample = .stdresid)) +
      stat_qq() +
      geom_abline(col = "red", size = 1) +
      theme_bw() +
      ggtitle("Normal Q-Q Plot") +
      labs(x = "Theoretical Quantiles", y = "Sample Quantiles")

In the Normal Q-Q Plot above, most of the points jitter a little but fall close to the theoretical straight line in red.

The tails, however, stray from the line because a few observations lie far out on both the left and the right (see the two ends of the density plot).

A larger sample would likely give a clearer picture of how the tails behave.

If the graphical checks do not seem sufficient, we can also perform a Shapiro-Wilk Normality Test for each group:

    # group 1
    with(kid.calories, shapiro.test(Calories[Trt == "1"]))

    # group 2
    with(kid.calories, shapiro.test(Calories[Trt == "2"]))

From the output above, the two p-values (0.3195 and 0.451) are greater than the alpha level of 0.05, so neither group's distribution is significantly different from a normal distribution.

In other words, since the data do not depart much from normality and the sample size is not too small, there is no need to be overly concerned about a slight violation of the normality assumption.

We can assume normality here, so assumption 2 also holds.
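The per-group Shapiro-Wilk check can also be done in one pass and compared against the alpha level programmatically. The sketch below is only an illustration: the data frame `sim` is simulated with arbitrary means and standard deviation and is not the study's data, although it reuses the `Trt` and `Calories` column names and the group sizes from the article.

```r
# Simulated stand-in data: NOT the study's measurements (arbitrary mean/sd).
set.seed(1)
sim <- data.frame(
  Trt = factor(rep(c("1", "2"), times = c(25, 22))),
  Calories = c(rnorm(25, mean = 430, sd = 100),
               rnorm(22, mean = 350, sd = 100))
)

# Run shapiro.test() within each treatment group and keep only the p-value
p.values <- sapply(split(sim$Calories, sim$Trt),
                   function(x) shapiro.test(x)$p.value)

alpha <- 0.05
# TRUE means normality is not rejected at the chosen alpha level
data.frame(p.value = p.values, not.rejected = p.values > alpha)
```

With the real data, replacing `sim` with `kid.calories` should reproduce the two p-values reported above.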

Assumption 3: Equal Variance

Lastly, we need to check the equal variance assumption.

We need to know in advance whether the variances are equal, because the Two-Sample T-Test in R takes this information through the var.equal argument.

To check for equal variance, we can use Levene’s Test:

    library(car)  # leveneTest()

    # Levene's test
    leveneTest(kid.calories$Calories ~ kid.calories$Trt)

The p-value is 0.8716, which is greater than 0.05.

Hence we have insufficient evidence to conclude that the variances differ, so the equal variance assumption is not violated.
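To see what Levene's Test is doing, it can help to compute it by hand. As far as I know, leveneTest() from the car package centers each group at its median by default and then runs a one-way ANOVA on the absolute deviations. The sketch below follows that recipe in base R on simulated data (`sim` is a made-up stand-in, not the study's dataset).

```r
# Simulated stand-in data: NOT the study's measurements (arbitrary mean/sd).
set.seed(1)
sim <- data.frame(
  Trt = factor(rep(c("1", "2"), times = c(25, 22))),
  Calories = c(rnorm(25, mean = 430, sd = 100),
               rnorm(22, mean = 350, sd = 100))
)

# Step 1: absolute deviation of each observation from its own group median
group.median <- ave(sim$Calories, sim$Trt, FUN = median)
sim$abs.dev <- abs(sim$Calories - group.median)

# Step 2: one-way ANOVA on those deviations; the F test on Trt is the
# Levene statistic (median-centered version)
levene.by.hand <- anova(lm(abs.dev ~ Trt, data = sim))
levene.by.hand
```

A large p-value in the `Trt` row means the typical spread around the median is similar in both groups, which is the reading used above.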

A side note on checking the equal variance assumption: you can also use the simple F Ratio test, which is more powerful when its assumptions are met.

However, it is sensitive to non-normality, so you have to make sure that the data are truly normally distributed.

If you are not sure about normality, use Levene’s Test, as it is more robust.
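For completeness, here is a minimal sketch of the F Ratio test using var.test() from base R. It runs on simulated data (`sim` is a made-up stand-in, not the study's dataset), so the numbers it prints say nothing about the real experiment.

```r
# Simulated stand-in data: NOT the study's measurements (arbitrary mean/sd).
set.seed(1)
sim <- data.frame(
  Trt = factor(rep(c("1", "2"), times = c(25, 22))),
  Calories = c(rnorm(25, mean = 430, sd = 100),
               rnorm(22, mean = 350, sd = 100))
)

# F test for equality of two variances (assumes normal data in both groups)
f.test <- var.test(Calories ~ Trt, data = sim)

# The statistic is simply the ratio of the two sample variances
var.group1 <- var(sim$Calories[sim$Trt == "1"])
var.group2 <- var(sim$Calories[sim$Trt == "2"])
c(F = unname(f.test$statistic), ratio = var.group1 / var.group2,
  p.value = f.test$p.value)
```

A ratio close to 1 together with a p-value above the alpha level would support setting var.equal to TRUE in the T-Test.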

Two-Sample T-Test

Since the dataset meets all the assumptions for the Two-Sample T-Test, we can now conduct the test:

    t.test(kid.calories$Calories ~ kid.calories$Trt,
           var.equal = TRUE, alternative = "greater")

From the output above, the average caloric intake for treatment group 1 (children who prepared their own lunch) is about 431.4 kcal.

The average caloric intake for treatment group 2 (parents prepared the lunch) is about 346.8 kcal.

The difference in means between the two groups is 84.6 kcal.

The one-sided 95% confidence interval indicates that the average difference in caloric intake between the two groups is likely to be greater than 34.1 kcal.

The p-value of 0.004 indicates that if the true average difference in caloric intake were 0 kcal, the probability of observing a difference at least as large as the one in this sample would be about 0.352%.

Since the p-value is less than the significance level of 0.05, we reject the null hypothesis (H0) in favor of the alternative hypothesis (H1).

In other words, there is sufficient evidence that children in treatment group 1 had a higher caloric intake than children in treatment group 2.
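To make the numbers in the t.test() output less of a black box, the pooled-variance statistic, the one-sided p-value, and the lower confidence bound can be computed by hand and checked against t.test(). This is a sketch on simulated data (`sim` is a made-up stand-in, not the study's dataset), so its values will not match the 84.6 kcal difference reported above.

```r
# Simulated stand-in data: NOT the study's measurements (arbitrary mean/sd).
set.seed(1)
sim <- data.frame(
  Trt = factor(rep(c("1", "2"), times = c(25, 22))),
  Calories = c(rnorm(25, mean = 430, sd = 100),
               rnorm(22, mean = 350, sd = 100))
)
x1 <- sim$Calories[sim$Trt == "1"]
x2 <- sim$Calories[sim$Trt == "2"]
n1 <- length(x1)
n2 <- length(x2)

# Pooled variance: weighted average of the two sample variances
dof <- n1 + n2 - 2
pooled.var <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / dof

# Standard error of the difference in means, then the t statistic
se <- sqrt(pooled.var * (1 / n1 + 1 / n2))
mean.diff <- mean(x1) - mean(x2)
t.stat <- mean.diff / se

# One-sided p-value for H1: mu1 - mu2 > 0, and the 95% lower bound
p.value <- pt(t.stat, df = dof, lower.tail = FALSE)
ci.lower <- mean.diff - qt(0.95, df = dof) * se

# Same test through t.test() for comparison
fit <- t.test(Calories ~ Trt, data = sim,
              var.equal = TRUE, alternative = "greater")
all.equal(t.stat, unname(fit$statistic))
```

The by-hand values should agree with t.test() up to floating-point error, which shows that the one-sided interval is just the observed difference minus a t quantile times the standard error.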

References

Van der Horst, K., Ferrage, A., & Rytz, A. (2014). Involving children in meal preparation. Effects on food intake. Appetite, 79(1), 18–24. doi:10.1016/j.appet.2014.03.030

Liu, C., Milton, J., & McIntosh, A. (n.d.). One and Two Sample Tests and ANOVA. Retrieved May 28, 2019, from http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/R/R4_One-TwoSampleTests-ANOVA/R4_One-TwoSampleTests-ANOVA_print.html

NCSS. (2019, May 28). Two-Sample T-Test. Reading. Retrieved May 28, 2019, from https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Two-Sample_T-Test.pdf