Title: | Data Bank for Statistical Analysis and Visualization |
---|---|
Description: | Contains data organized by topics: categorical data, regression model, means comparisons, independent and repeated measures ANOVA, mixed ANOVA and ANCOVA. |
Authors: | Alboukadel Kassambara [aut, cre] |
Maintainer: | Alboukadel Kassambara <[email protected]> |
License: | GPL-2 |
Version: | 0.1.0.999 |
Built: | 2024-11-22 05:21:56 UTC |
Source: | https://github.com/kassambara/datarium |
Paired nominal data providing the smoking status of 62 individuals before and after emotive video communications showing the danger of smoking. This is a demo dataset for the McNemar test.
data("antismoking")
A data frame with 62 rows and 3 columns.
data(antismoking)
xtabs(~before + after, data = antismoking)
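The cross-tabulation above can be passed directly to base R's `mcnemar.test()`. The following is a minimal sketch, assuming the datarium package is installed and that `before` and `after` each have two levels; the object names `tab` and `res` are illustrative.

```r
# Assumes the datarium package is installed.
library(datarium)
data("antismoking")

# 2 x 2 table of paired responses: rows = before, columns = after
tab <- xtabs(~ before + after, data = antismoking)

# McNemar's test compares the two discordant (off-diagonal) cells
res <- mcnemar.test(tab)
res
```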
The data provide the anxiety score, measured at three time points, of three groups of individuals practicing physical exercises at different levels (grp1: basal, grp2: moderate and grp3: high).
A two-way mixed ANOVA can be used to evaluate whether there is an interaction between group and time in explaining the anxiety score.
data("anxiety")
A data frame with 45 rows and 5 columns.
data(anxiety)
head(as.data.frame(anxiety))
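One possible way to fit the two-way mixed ANOVA with base R is sketched below. It assumes the five columns are named `id`, `group`, `t1`, `t2` and `t3`; these names are not stated in this manual, so check them with `names(anxiety)` first. The names `long` and `fit` are illustrative.

```r
# Assumes the datarium package is installed and that the columns are
# id, group, t1, t2, t3 (an assumption; verify with names(anxiety)).
library(datarium)
data("anxiety")
anx <- as.data.frame(anxiety)

# Reshape from wide (one column per time point) to long format
long <- reshape(anx, direction = "long",
                varying = c("t1", "t2", "t3"), v.names = "score",
                timevar = "time", times = c("t1", "t2", "t3"),
                idvar = "id")
long$time <- factor(long$time)
long$id <- factor(long$id)

# group is the between-subjects factor, time the within-subject factor
fit <- aov(score ~ group * time + Error(id / time), data = long)
summary(fit)
```

The `group:time` row in the within-subject stratum of the summary is the interaction of interest.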
The data correspond to an experiment in which a treatment for depression is studied. Two groups of patients (1: control / 2: treatment) have been followed at four different times (0: pre-test, 1: one month post-test, 3: 3 months follow-up and 6: 6 months follow-up). The dependent variable is a depression score.
Repeated measures ANOVA can be performed to determine the effect of the treatment and the effect of time on the depression score.
data("depression")
A data frame with 24 rows and 5 columns.
data(depression)
head(as.data.frame(depression))
Contains the weights by sex (M for male; F for female). The question is whether the average weight of women differs from the average weight of men.
An independent two-sample t-test can be performed to answer this question.
data("genderweight")
A data frame with 40 rows and 3 columns.
data(genderweight)
head(as.data.frame(genderweight))
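The independent two-sample comparison could be run with base R's `t.test()`, as in the sketch below. It assumes the weight column is named `weight` and the sex column `group`; neither name is stated in this manual, so confirm them with `names(genderweight)`.

```r
# Assumes the datarium package is installed and that the columns are
# named weight and group (an assumption; verify with names(genderweight)).
library(datarium)
data("genderweight")

# Welch two-sample t-test (unequal variances, the t.test() default)
res <- t.test(weight ~ group, data = genderweight)
res
```

Passing `var.equal = TRUE` would give the classical Student's t-test instead.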
A pharmaceutical company tested three treatments for migraine headache sufferers. 72 participants were enrolled in the experiments. The aim is to examine the potential of a new class of treatments in lowering the pain score associated with the migraine headache episode.
The participants include 36 males and 36 females. Males and females were further (equally) subdivided into whether they were at low or high risk of migraine headache.
This data set is suited to a three-way ANOVA test.
It contains the following variables:
gender, which has two categories: "male" and "female";
risk, which has two levels: "low" and "high";
treatment, which has three categories: "X", "Y" and "Z".
data("headache")
A data frame with 72 rows and 4 columns.
data(headache)
head(as.data.frame(headache))
Measures of cholesterol concentration in 72 participants treated with three different drugs. The aim is to examine the potential of a new class of drugs in lowering the cholesterol concentration and consequently reducing the risk of heart attack.
The participants include 36 males and 36 females. Males and females were further (equally) subdivided into whether they were at low or high risk of heart attack.
This data set is suited to a three-way ANOVA test.
It contains the following variables:
gender, which has two categories: "male" and "female";
risk, which has two levels: "low" and "high";
drug, which has three categories: "A", "B" and "C".
data("heartattack")
A data frame with 72 rows and 5 columns.
data(heartattack)
head(as.data.frame(heartattack))
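A three-way ANOVA on these data might look like the sketch below, using base R's `aov()`. The factor names `gender`, `risk` and `drug` come from the description above; the outcome column name `cholesterol` is an assumption and should be checked with `names(heartattack)`.

```r
# Assumes the datarium package is installed and that the outcome column
# is named cholesterol (an assumption; verify with names(heartattack)).
library(datarium)
data("heartattack")

# Full factorial model: main effects plus all two- and three-way interactions
fit <- aov(cholesterol ~ gender * risk * drug, data = heartattack)
summary(fit)
```

`aov()` reports sequential (Type I) sums of squares; with the equally subdivided groups described above, the order of the terms should not matter.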
A data frame containing the frequency with which 13 household tasks are carried out within couples.
data("housetasks.raw")
A data frame with 1744 rows and 2 columns (tasks and status).
data(housetasks.raw)
table(housetasks.raw)
Contains the job satisfaction score organized by gender and education level.
data("jobsatisfaction")
A data frame with 58 rows and 3 columns.
data(jobsatisfaction)
head(as.data.frame(jobsatisfaction))
A data frame containing the impact of three advertising media (youtube, facebook and newspaper) on sales. Data are the advertising budget in thousands of dollars along with the sales (in thousands of units). The advertising experiment has been repeated 200 times. These are simulated data.
data("marketing")
A data frame with 200 rows and 4 columns.
data(marketing)
res.lm <- lm(sales ~ youtube*facebook, data = marketing)
summary(res.lm)
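To judge whether the interaction term in the model above is worth keeping, one option is to compare it against the purely additive model with an F-test. This is a sketch, assuming the datarium package is installed; the names `fit_add`, `fit_int` and `cmp` are illustrative.

```r
# Assumes the datarium package is installed.
library(datarium)
data("marketing")

# Additive model versus model with a youtube x facebook interaction
fit_add <- lm(sales ~ youtube + facebook, data = marketing)
fit_int <- lm(sales ~ youtube * facebook, data = marketing)

# F-test for the single extra interaction coefficient
cmp <- anova(fit_add, fit_int)
cmp
```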
Contains the weight of 10 mice. The question is whether the average weight of the mice differs from 25 g.
A one-sample t-test can be performed to answer this question.
data("mice")
A data frame with 10 rows and 2 columns.
data(mice)
head(as.data.frame(mice))
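The one-sample test against 25 g could be written as follows. The sketch assumes the weight column is named `weight`, which is not stated in this manual; check with `names(mice)`.

```r
# Assumes the datarium package is installed and that the weight column
# is named weight (an assumption; verify with names(mice)).
library(datarium)
data("mice")

# Two-sided one-sample t-test of H0: mean weight = 25
res <- t.test(mice$weight, mu = 25)
res
```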
Contains the weight of 10 mice before and after the treatment.
A paired-samples t-test can be performed to determine whether the mean weight differs before and after the treatment.
data("mice2")
A data frame with 10 rows and 3 columns.
data(mice2)
head(as.data.frame(mice2))
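A paired comparison of the two measurements might be sketched as below. It assumes the two weight columns are named `before` and `after`; these names are not given in this manual, so confirm them with `names(mice2)`.

```r
# Assumes the datarium package is installed and that the weight columns
# are named before and after (an assumption; verify with names(mice2)).
library(datarium)
data("mice2")

# Paired t-test on the within-mouse differences (after minus before)
res <- t.test(mice2$after, mice2$before, paired = TRUE)
res
```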
Contains the performance score measures of participants at two time points. The aim of this study is to evaluate the effect of gender and stress on performance score. The three-way mixed ANOVA test can be used to investigate this question.
The data include two between-subjects factors (gender and stress) and one within-subject factor (time, repeated measures).
data("performance")
A data frame with 24 rows and 5 columns.
data(performance)
head(as.data.frame(performance))
Contains the type of properties and the buyer types. Buyer categories are: "single male", "single female", "married couple" and "family".
The types of property these buyers purchased were sorted into four categories: "flat", "bungalow" (i.e., a one-storey home), "detached house" and "terrace" (i.e., a block of adjoining houses).
A chi-square test of independence can be used to assess the association between the type of buyer who purchases a property and the type of property that is purchased.
data("properties")
A data frame with 333 rows and 2 columns.
data("properties")
head(as.data.frame(properties))
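Because the data frame has exactly two categorical columns, the chi-square test of independence can be sketched without naming them: tabulate the whole data frame, then call base R's `chisq.test()`. The object names `tab` and `res` are illustrative.

```r
# Assumes the datarium package is installed.
library(datarium)
data("properties")

# Two-way contingency table built from the two columns of the data frame
tab <- table(as.data.frame(properties))

# Pearson chi-square test of independence
res <- chisq.test(tab)
res

# Expected counts under independence, useful for checking small cells
res$expected
```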
Presents the frequencies of individuals at high risk of renal calculi according to age and gender. This is a demo dataset for the Cochran-Armitage trend test, used to investigate whether there is a linear trend in the proportion of individuals with renal stones across age groups.
data("renalstone")
A data frame with 3513 rows and 3 columns.
Hazra, Avijit, and Nithya Jaideep Gogtay. 2016. “Biostatistics Series Module 4: Comparing Groups – Categorical Variables.” Indian Journal of Dermatology.
data(renalstone)
xtabs(~stone+age+gender, data = renalstone)
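Base R's `prop.trend.test()` performs a chi-squared test for trend in proportions, which can stand in for the Cochran-Armitage trend test here. The sketch below pools both genders and makes two assumptions: `stone` has exactly two levels, and the levels of `age` sort in increasing age order so that the default scores 1, 2, 3, ... are meaningful.

```r
# Assumes the datarium package is installed, that stone has two levels,
# and that the age levels are ordered from youngest to oldest.
library(datarium)
data("renalstone")

# Counts by age group (rows) and stone status (columns), genders pooled
tab <- xtabs(~ age + stone, data = renalstone)

# x = count in the first stone category, n = group size.
# With two categories, the trend statistic is the same for either column.
res <- prop.trend.test(x = tab[, 1], n = rowSums(tab))
res
```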
The dataset contains the self-esteem scores of 10 individuals at three time points during a specific diet, collected to determine whether their self-esteem improved.
A one-way repeated measures ANOVA can be performed to determine the effect of time on the self-esteem score.
data("selfesteem")
A data frame with 10 rows and 4 columns.
data(selfesteem)
head(as.data.frame(selfesteem))
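A one-way repeated measures ANOVA can be sketched in base R by reshaping to long format and adding a subject-level error term. The column names `id`, `t1`, `t2` and `t3` are assumptions, since this manual does not list them; check with `names(selfesteem)`.

```r
# Assumes the datarium package is installed and that the columns are
# id, t1, t2, t3 (an assumption; verify with names(selfesteem)).
library(datarium)
data("selfesteem")
se <- as.data.frame(selfesteem)

# Wide to long: one row per individual and time point
long <- reshape(se, direction = "long",
                varying = c("t1", "t2", "t3"), v.names = "score",
                timevar = "time", times = c("t1", "t2", "t3"),
                idvar = "id")
long$time <- factor(long$time)
long$id <- factor(long$id)

# time is the within-subject factor; Error() declares the subject strata
fit <- aov(score ~ time + Error(id / time), data = long)
summary(fit)
```

This approach does not apply a sphericity correction, so it is best read as a first look rather than a complete analysis.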
Data are the self-esteem scores of 12 individuals enrolled in two successive short-term (4-week) trials: a control (placebo) trial and a special diet trial.
The self-esteem score was recorded at three time points: at the beginning (t1), midway (t2) and at the end (t3) of the trials.
The same 12 participants are enrolled in the two different trials with enough time between trials.
A two-way repeated measures ANOVA can be performed to determine whether there is an interaction between time and treatment on the self-esteem score.
data("selfesteem2")
A data frame with 24 rows and 5 columns.
data(selfesteem2)
head(as.data.frame(selfesteem2))
Researchers want to evaluate the effect of a new "treatment" and "exercise" on the stress score reduction after adjusting for "age".
Two-way ANCOVA can be performed in order to determine whether there is interaction between exercise and treatment on the stress score.
data("stress")
A data frame with 60 rows and 5 columns.
data(stress)
head(as.data.frame(stress))
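A two-way ANCOVA could be fitted with base R's `aov()`, entering the covariate first so that the factor effects are adjusted for it. The names `treatment`, `exercise` and `age` come from the description above; the outcome column name `score` is an assumption to verify with `names(stress)`.

```r
# Assumes the datarium package is installed and that the outcome column
# is named score (an assumption; verify with names(stress)).
library(datarium)
data("stress")

# Covariate (age) first, then the two factors and their interaction.
# aov() uses sequential (Type I) sums of squares, so term order matters.
fit <- aov(score ~ age + treatment * exercise, data = stress)
summary(fit)
```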
A repeated measures design with a nominal outcome, in which 73 subjects are asked to perform 3 tasks. The outcome of each task is dichotomous: success or failure. Each row corresponds to a participant (called a "block" in the jargon). This is a demo dataset for Cochran's Q test.
data("taskachievment")
A data frame with 73 rows and 4 columns.
data(taskachievment)
head(taskachievment)
Survival of passengers on the Titanic. This data set provides information on the fate of passengers on the fatal maiden voyage of the ocean liner ‘Titanic’. Columns are economic status (Class), Sex, Age and Survived.
data("titanic.raw")
A data frame with 2201 rows and 4 columns.
data(titanic.raw)
with(titanic.raw, table(Class, Survived))
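Raw counts are hard to compare across classes of different sizes, so a natural next step is to turn the table above into row proportions. This is a small sketch, assuming the datarium package is installed; `tab` and `props` are illustrative names.

```r
# Assumes the datarium package is installed.
library(datarium)
data("titanic.raw")

# Counts of each survival outcome within each class
tab <- with(titanic.raw, table(Class, Survived))

# Row proportions: the share of each outcome within each class
props <- prop.table(tab, margin = 1)
round(props, 3)
```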
A researcher wanted to assess the effects of Diet and Exercises on weight loss in 10 sedentary males.
The participants were enrolled in four trials: (1) no diet and no exercises; (2) diet only; (3) exercises only; and (4) diet and exercises combined.
Each participant performed all four trials. The order of the trials was counterbalanced and sufficient time was allowed between trials to allow any effects of previous trials to have dissipated (i.e., a "wash out" period).
Each trial lasted nine weeks and the weight loss score was measured at the beginning of each trial (t1), at the midpoint of each trial (t2) and at the end of each trial (t3).
Three-way repeated measures ANOVA can be performed in order to determine whether there is interaction between diet, exercises and time on the weight loss score.
data("weightloss")
A data frame with 48 rows and 6 columns.
data(weightloss)
head(as.data.frame(weightloss))