Package 'datarium' reference manual

Title:	Data Bank for Statistical Analysis and Visualization
Description:	Contains data organized by topics: categorical data, regression model, means comparisons, independent and repeated measures ANOVA, mixed ANOVA and ANCOVA.
Authors:	Alboukadel Kassambara [aut, cre]
Maintainer:	Alboukadel Kassambara <[email protected]>
License:	GPL-2
Version:	0.1.0.999
Built:	2025-03-16 06:20:03 UTC
Source:	https://github.com/kassambara/datarium

Air Passengers

Description

This dataset contains the number of monthly air passengers from 1949 to 1960. It is taken from the R datasets package and reformatted as a long-format data frame.

Usage

data("AirPassengersDf")

Format

A data frame with 144 rows (12 months for each of the 12 years from 1949 to 1960) and 2 columns (Month and Passengers).

Examples

data("AirPassengersDf")
head(AirPassengersDf)
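
The long format described above can be sketched from the base R AirPassengers time series. This is a minimal illustration and not necessarily how the packaged data frame was built; in particular, encoding Month as the first day of each month is an assumption, and passengers_long and first_obs are hypothetical names.

```r
# Rebuild a long data frame from the built-in monthly time series (base R only).
data("AirPassengers")                       # ts object from the datasets package
first_obs <- start(AirPassengers)           # year and month of the first observation

passengers_long <- data.frame(
  # Assumption: each month is encoded as the date of its first day.
  Month = seq(as.Date(sprintf("%d-%02d-01", first_obs[1], first_obs[2])),
              by = "month", length.out = length(AirPassengers)),
  Passengers = as.numeric(AirPassengers)
)

str(passengers_long)
```

One row per month keeps the data ready for functions that expect a data frame rather than a ts object.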

Anti-Smoking Emotive Communication Data for McNemar Test

Description

Paired nominal data providing the smoking status of 62 individuals before and after emotive video communications showing the danger of smoking. This is a demo dataset for the McNemar test.

Usage

data("antismoking")

Format

A data frame with 62 rows and 3 columns.

Examples

data(antismoking)
xtabs(~before + after, data = antismoking)
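
The McNemar test mentioned in the description can be run on the same cross-tabulation. A minimal sketch, assuming the datarium package is installed and that before and after each take two values, so that the table is 2 x 2; tab and res are hypothetical names.

```r
# Assumes the datarium package is installed.
data("antismoking", package = "datarium")

# Paired outcomes: rows are the status before, columns the status after.
tab <- xtabs(~ before + after, data = antismoking)

# mcnemar.test() from the stats package; the continuity correction is on by default.
res <- mcnemar.test(tab)
res
```

The test only uses the discordant cells of the table, that is, the individuals whose status changed between the two measurements.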

Anxiety Data for Two-Way Mixed ANOVA

Description

The data provide the anxiety score, measured at three time points, of three groups of individuals practicing physical exercises at different levels (grp1: basal, grp2: moderate and grp3: high).

Two-way mixed ANOVA can be used to evaluate whether there is an interaction between group and time in explaining the anxiety score.

Usage

data("anxiety")

Format

A data frame with 45 rows and 5 columns.

Examples

data(anxiety)
head(as.data.frame(anxiety))

Depression Data for Two Way Mixed ANOVA

Description

The data correspond to an experiment in which a treatment for depression is studied. Two groups of patients (1: control / 2: treatment) have been followed at four different times (0: pre-test, 1: one month post-test, 3: 3 months follow-up and 6: 6 months follow-up). The dependent variable is a depression score.

Two-way mixed ANOVA, with treatment as the between-subjects factor and time as the within-subjects factor, can be performed in order to determine the effect of the treatment and the effect of time on the depression score.

Usage

data("depression")

Format

A data frame with 24 rows and 5 columns.

Examples

data(depression)
head(as.data.frame(depression))

Weight Data By Gender for Two-Samples Mean Test

Description

Contains the weights by sex (M for male; F for female). The question is whether the average women’s weight differs from the average men’s weight.

A two-sample independent t-test can be performed to answer this question.

Usage

data("genderweight")

Format

A data frame with 40 rows and 3 columns

Examples

data(genderweight)
head(as.data.frame(genderweight))
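
The two-sample comparison described above can be sketched with t.test() from the stats package. This assumes the datarium package is installed and that the columns are named group and weight; those column names are not stated in this manual and should be checked with names(genderweight).

```r
# Assumes the datarium package is installed.
data("genderweight", package = "datarium")

# Assumption: `weight` holds the measurement and `group` the sex (F or M).
# The formula interface runs Welch's test (unequal variances) by default.
res <- t.test(weight ~ group, data = genderweight)
res
```

Passing var.equal = TRUE would give the classical Student test instead of the Welch version.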

Headache Data for Three Way ANOVA

Description

A pharmaceutical company tested three treatments for migraine headache sufferers. 72 participants were enrolled in the experiments. The aim is to examine the potential of a new class of treatments in lowering the pain score associated with the migraine headache episode.

The participants include 36 males and 36 females. Males and females were further (equally) subdivided into whether they were at low or high risk of migraine headache.

This data set is suited for a three-way ANOVA test.

It contains the following variables:

gender, which has two categories: "male" and "female";
risk, which has two levels: "low" and "high";
treatment, which has three categories: "X", "Y" and "Z".

Usage

data("headache")

Format

A data frame with 72 rows and 4 columns.

Examples

data(headache)
head(as.data.frame(headache))

Heart Attack Data for Three Way ANOVA

Description

Measures of cholesterol concentration in 72 participants treated with three different drugs. The aim is to examine the potential of a new class of drugs in lowering the cholesterol concentration and consequently reducing heart attack.

The participants include 36 males and 36 females. Males and females were further (equally) subdivided into whether they were at low or high risk of heart attack.

This data set is suited for a three-way ANOVA test.

It contains the following variables:

gender, which has two categories: "male" and "female";
risk, which has two levels: "low" and "high";
drug, which has three categories: "A", "B" and "C".

Usage

data("heartattack")

Format

A data frame with 72 rows and 5 columns.

Examples

data(heartattack)
head(as.data.frame(heartattack))

Housetasks

Description

A data frame containing the frequency with which 13 household tasks are carried out within couples.

Usage

data("housetasks.raw")

Format

A data frame with 1744 rows and 2 columns (tasks and status).

Examples

data(housetasks.raw)
table(housetasks.raw)

Job Satisfaction Data for Two-Way ANOVA

Description

Contains the job satisfaction score organized by gender and education level.

Usage

data("jobsatisfaction")

Format

A data frame with 58 rows and 3 columns.

Examples

data(jobsatisfaction)
head(as.data.frame(jobsatisfaction))

Marketing Data Set

Description

A data frame containing the impact of three advertising media (youtube, facebook and newspaper) on sales. Data are the advertising budgets in thousands of dollars along with the sales (in thousands of units). The advertising experiment has been repeated 200 times. These are simulated data.

Usage

data("marketing")

Format

A data frame with 200 rows and 4 columns.

Examples

data(marketing)
res.lm <- lm(sales ~ youtube*facebook, data = marketing)
summary(res.lm)

Mice Weight Data for One Sample Mean Test

Description

Contains the weight of 10 mice. The question is whether the average weight of the mice differs from 25 g.

A one-sample t-test can be performed to answer this question.

Usage

data("mice")

Format

A data frame with 10 rows and 2 columns

Examples

data(mice)
head(as.data.frame(mice))
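
The one-sample test described above can be sketched as follows. This assumes the datarium package is installed and that the weight column is named weight, which is not stated in this manual.

```r
# Assumes the datarium package is installed.
data("mice", package = "datarium")

# Assumption: the measurements are stored in a column named `weight`.
# Null hypothesis: the mean weight equals 25 g (two-sided alternative by default).
res <- t.test(mice$weight, mu = 25)
res
```

With 10 mice the test statistic has 9 degrees of freedom, which is reported in the printed result.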

Mice Weight Data for Paired-Samples Mean Test

Description

Contains the weight of 10 mice before and after the treatment. The question is whether the average weight of the mice differs before and after the treatment.

A paired-samples t-test can be performed to answer this question.

Usage

data("mice2")

Format

A data frame with 10 rows and 3 columns

Examples

data(mice2)
head(as.data.frame(mice2))
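
The paired comparison described above can be sketched with t.test() and paired = TRUE. This assumes the datarium package is installed and that the two measurement columns are named before and after; the column names are an assumption, not something stated in this manual.

```r
# Assumes the datarium package is installed.
data("mice2", package = "datarium")

# Assumption: each row is one mouse, with its weights in `before` and `after`.
# paired = TRUE tests whether the mean within-mouse difference is zero.
res <- t.test(mice2$before, mice2$after, paired = TRUE)
res
```

The paired test works on the 10 within-mouse differences, so it is equivalent to a one-sample t-test of mice2$before - mice2$after against zero.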

Performance Data for Three-Way Mixed ANOVA

Description

Contains the performance score measures of participants at two time points. The aim of this study is to evaluate the effect of gender and stress on performance score. The three-way mixed ANOVA test can be used to investigate this question.

The data include two between-subjects factors (gender and stress) and one within-subject factor (time, repeated measures).

Usage

data("performance")

Format

A data frame with 24 rows and 5 columns.

Examples

data(performance)
head(as.data.frame(performance))

Properties Data for Chi-square Test of Independence

Description

Contains the type of properties and the buyer types. Buyer categories are: "single male", "single female", "married couple" and "family".

The types of property these buyers purchased were sorted into four categories: "flat", "bungalow" (i.e., a one-storey home), "detached house" and "terrace" (i.e., a block of adjoining houses).

Chi-square test of independence can be used to assess the association between the type of buyer who purchases a property and the type of property that is purchased.

Usage

data("properties")

Format

A data frame with 333 rows and 2 columns.

Examples

data("properties")
head(as.data.frame(properties))
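
The chi-square test of independence described above can be sketched by cross-tabulating the two columns. This assumes the datarium package is installed; because the data frame has exactly two columns, table() is applied to the whole data frame and no column names need to be assumed. The names tab and res are hypothetical.

```r
# Assumes the datarium package is installed.
data("properties", package = "datarium")

# Cross-tabulate the two columns (property type and buyer type).
tab <- table(as.data.frame(properties))

# Pearson's chi-square test of independence from the stats package.
res <- chisq.test(tab)
res
res$expected   # expected counts under independence
```

Inspecting res$expected is a quick way to check whether any cell has a small expected count, in which case the chi-square approximation may be poor.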

Risk of Renal Stone Data for Cochran-Armitage Trend Test

Description

Presents the frequencies of individuals at high risk of renal calculi according to age and gender. This is a demo dataset for the Cochran-Armitage trend test, used to investigate whether there is a linear trend between the proportion of individuals with renal stones and age.

Usage

data("renalstone")

Format

A data frame with 3513 rows and 3 columns.

References

Hazra, Avijit, and Nithya Jaideep Gogtay. 2016. “Biostatistics Series Module 4: Comparing Groups – Categorical Variables.” Indian Journal of Dermatology.

Examples

data(renalstone)
xtabs(~stone+age+gender, data = renalstone)

Self-Esteem Score Data for One-way Repeated Measures ANOVA

Description

The dataset contains the self-esteem scores of 10 individuals at three time points during a specific diet, to determine whether their self-esteem improved.

One-way repeated measures ANOVA can be performed in order to determine the effect of time on the self-esteem score.

Usage

data("selfesteem")

Format

A data frame with 10 rows and 4 columns.

Examples

data(selfesteem)
head(as.data.frame(selfesteem))
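
The one-way repeated measures ANOVA described above can be sketched in base R by reshaping the data to long format first. This assumes the datarium package is installed and that the columns are named id, t1, t2 and t3; those names are an assumption, and wide, long and fit are hypothetical names.

```r
# Assumes the datarium package is installed.
data("selfesteem", package = "datarium")
wide <- as.data.frame(selfesteem)

# Assumption: one row per individual, with scores in columns t1, t2 and t3.
# Stack the three score columns into a single `score` column.
long <- reshape(wide, direction = "long",
                varying = c("t1", "t2", "t3"), v.names = "score",
                timevar = "time", times = c("t1", "t2", "t3"),
                idvar = "id")
long$id <- factor(long$id)
long$time <- factor(long$time)

# The Error() term declares time as a within-subject factor.
fit <- aov(score ~ time + Error(id/time), data = long)
summary(fit)
```

This base R approach does not report a sphericity test or correction, so it is only a starting point for a full analysis.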

Self Esteem Score Data for Two-way Repeated Measures ANOVA

Description

Data are the self esteem score of 12 individuals enrolled in 2 successive short-term trials (4 weeks) - control (placebo) and special diet trials.

The self esteem score was recorded at three time points: at the beginning (t1), midway (t2) and at the end (t3) of the trials.

The same 12 participants are enrolled in the two different trials with enough time between trials.

Two-way repeated measures ANOVA can be performed in order to determine whether there is interaction between time and treatment on the self esteem score.

Usage

data("selfesteem2")

Format

A data frame with 24 rows and 5 columns.

Examples

data(selfesteem2)
head(as.data.frame(selfesteem2))

Stress Data for Two-Way ANCOVA

Description

Researchers want to evaluate the effect of a new "treatment" and "exercise" on the stress score reduction after adjusting for "age".

Two-way ANCOVA can be performed in order to determine whether there is interaction between exercise and treatment on the stress score.

Usage

data("stress")

Format

A data frame with 60 rows and 5 columns.

Examples

data(stress)
head(as.data.frame(stress))
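
The two-way ANCOVA described above can be sketched with aov() from the stats package. This assumes the datarium package is installed and that the columns are named score, age, treatment and exercise; the name score in particular is an assumption, since the description only names the treatment, exercise and age variables.

```r
# Assumes the datarium package is installed.
data("stress", package = "datarium")

# Assumption: the outcome column is named `score`.
# The covariate is entered first, then the two factors and their interaction.
fit <- aov(score ~ age + treatment * exercise, data = stress)
summary(fit)   # sequential (type I) sums of squares
```

Because aov() uses sequential sums of squares, placing age first means each factor is tested after adjusting for the covariate.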

Task Achievement Data for Cochran's Q Test

Description

A repeated measures nominal design in which 73 subjects are asked to perform 3 tasks. The outcome of each task is a dichotomous value, success or failure. Each row corresponds to a participant (called a "block" in the jargon). This is a demo dataset for Cochran's Q test.

Usage

data("taskachievment")

Format

A data frame with 73 rows and 4 columns.

Examples

data(taskachievment)
head(taskachievment)

Survival of Passengers on the Titanic

Description

Survival of passengers on the Titanic. This data set provides information on the fate of passengers on the fatal maiden voyage of the ocean liner ‘Titanic’. Columns are economic status (Class), Sex, Age and Survived.

Usage

data("titanic.raw")

Format

A data frame with 2201 rows and 4 columns.

Examples

data(titanic.raw)
with(titanic.raw, table(Class, Survived))

Weight Loss Score Data for Three-way Repeated Measures ANOVA

Description

A researcher wanted to assess the effects of Diet and Exercises on weight loss in 10 sedentary males.

The participants were enrolled in four trials: (1) no diet and no exercises; (2) diet only; (3) exercises only; and (4) diet and exercises combined.

Each participant performed all four trials. The order of the trials was counterbalanced and sufficient time was allowed between trials to allow any effects of previous trials to have dissipated (i.e., a "wash out" period).

Each trial lasted nine weeks and the weight loss score was measured at the beginning of each trial (t1), at the midpoint of each trial (t2) and at the end of each trial (t3).

Three-way repeated measures ANOVA can be performed in order to determine whether there is interaction between diet, exercises and time on the weight loss score.

Usage

data("weightloss")

Format

A data frame with 48 rows and 6 columns.

Examples

data(weightloss)
head(as.data.frame(weightloss))
