| Title: | Tools for IPAG Courses |
|---|---|
| Description: | Provides a collection of intuitive and user-friendly functions for computing confidence intervals for common statistical tasks, including means, differences in means, proportions, and odds ratios. The package also includes tools for linear regression analysis and several real-world datasets intended for teaching and applied statistical inference. |
| Authors: | Gwenaël Piaser [aut, cre] |
| Maintainer: | Gwenaël Piaser <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-17 07:04:46 UTC |
| Source: | https://github.com/gpiaser/ipag |
Dataset from Hamermesh, D. S., & Parker, A. (2005), "Beauty in the classroom: Instructors’ pulchritude and putative pedagogical productivity", Economics of Education Review, 24(4), 369–376.
data(Beauty)
A data frame with the following variables:
The professor’s identification number.
Average professor evaluation score, ranging from 1 (very unsatisfactory) to 5 (excellent).
Rank of professor: teaching, tenure track, or tenured.
Ethnicity of professor: not minority or minority.
Gender of professor: female or male.
Language of the school where the professor received education: English or non-English.
Age of the professor.
Percentage of students in the class who completed the evaluation.
Number of students in the class who completed the evaluation.
Total number of students enrolled in the class.
Class level: lower or upper.
Number of professors teaching sections of the course in the sample: single or multiple.
Number of credits of the class: one credit (e.g. lab, PE) or multi credit.
Beauty rating of professor from lower-level female students (1 = lowest, 10 = highest).
Beauty rating of professor from upper-level female students (1 = lowest, 10 = highest).
Beauty rating of professor from second upper-level female students (1 = lowest, 10 = highest).
Beauty rating of professor from lower-level male students (1 = lowest, 10 = highest).
Beauty rating of professor from upper-level male students (1 = lowest, 10 = highest).
Beauty rating of professor from second upper-level male students (1 = lowest, 10 = highest).
Average beauty rating of the professor.
Outfit of professor in picture: not formal or formal.
Color of professor’s picture: color or black and white.
The dataset examines the relationship between instructors' physical attractiveness and student evaluation scores, controlling for demographic and class characteristics.
Hamermesh, D. S., & Parker, A. (2005). Beauty in the classroom: Instructors’ pulchritude and putative pedagogical productivity. Economics of Education Review, 24(4), 369–376. doi:10.1016/j.econedurev.2004.07.013
This dataset was imported from a CSV file and included in the IPAG package for demonstration. Data are taken from the article by Augsburg, B., De Haas, R., Harmgart, H., & Meghir, C. (2015). The impacts of microcredit: Evidence from Bosnia and Herzegovina. American Economic Journal: Applied Economics, 7(1), 183-203.
data(Bosnia)
A data frame with the following variables:
Household income for the control group before the experiment.
Household income for the treatment group before the experiment.
Household income for the control group after the experiment.
Household income for the treatment group after the experiment.
Dataset from Koob (2021), "Determinants of content marketing effectiveness: Conceptual framework and empirical findings from a managerial perspective." PLoS ONE, 16(4), e0249457.
data(ContentMarketing)
A data frame with the following variables:
The company’s identification number.
Effectiveness of the content marketing strategy. Marketing and communications executives rated the degree of effectiveness on a scale from 1 to 5 based on their perception and expertise.
Content marketing strategy context. Four-item scale measuring whether the organization had a defined, comprehensible, and long-term content marketing strategy. Rated from 1 ("totally disagree") to 5 ("totally agree").
Content production context. Reflects the organization's efforts to optimize content value for customers, meet content quality standards, and plan and create content systematically.
Content distribution context / intermediate number of media platforms. Measures the number of media platforms used to distribute content.
Content distribution context / joint deployment of print and digital platforms. Measures the simultaneous use of print and digital media for content distribution.
Content promotion context. Measures the importance attached to content promotion. Respondents indicated the share of total content marketing investment devoted to promotion activities.
Content marketing performance measurement context. Captures the frequency of content marketing performance measurement across print and digital platforms and the use of performance data to guide improvement.
Content marketing organization. Captures structural specialization, autonomy in content marketing, and processes and systems that enable specialization.
Organization size. Three dummy variables encode four categories of organizations by number of employees: "Tiny" (250-499), "Small" (500-999), "Medium" (1,000-4,999), and "Big" (>=5,000).
Sector affiliation. Dummy variable distinguishing organizations in the "industrial" sector from those in the "service" sector.
Koob, C. (2021). Determinants of content marketing effectiveness: Conceptual framework and empirical findings from a managerial perspective. PLoS ONE, 16(4), e0249457.
This dataset was imported from a CSV file and included in the IPAG package for demonstration. The reference article is Escobar, L. E., Molina-Cruz, A., & Barillas-Mury, C. (2020). BCG vaccine protection from severe coronavirus disease 2019 (COVID-19). Proceedings of the National Academy of Sciences, 117(30), 17720-17726.
data(covid19)
A data frame with the following variables:
Number of deaths per million inhabitants as of April 22, 2020.
The name of the country.
Daily caloric intake.
Per capita CO2 emissions in 2014.
Body mass index in 2016 (male population).
Number of people who died of SARS in 2004.
Proportion of children under one year of age vaccinated with the DTP vaccine (diphtheria, tetanus, poliomyelitis) in 2011.
BCG vaccination policy: "current", "never" or "interrupted".
Latitude of the country's capital.
Longitude of the country's capital.
Imported and exported goods as a percentage of GDP in 2018.
Health expenditure per capita in 2015.
Percentage of the state budget allocated to health in 2010.
Number of tuberculosis cases per 100,000 inhabitants.
GDP per capita.
Area of the country.
Democracy index of the country.
Human Development Index in 2018.
Life expectancy at birth.
Number of children per woman.
Population density of the country.
Total population of the country.
Measure of income inequality (0 = perfect equality, 1 = perfect inequality).
Median age of the population.
Number of days between the first confirmed COVID-19 case in China and the first confirmed case in the country.
https://doi.org/10.1073/pnas.2008410117
Various international public databases (WHO, World Bank, etc.)
Dataset from Harrison Jr, D., & Rubinfeld, D. L. (1978), "Hedonic housing prices and the demand for clean air", Journal of Environmental Economics and Management, 5(1), 81–102.
data(Housing)
A data frame with the following variables:
Per capita crime rate by town.
Proportion of residential land zoned for lots over 25,000 square feet.
Proportion of non-retail business acres per town.
Charles River dummy variable: 1 if the tract bounds the river, 0 otherwise.
Nitric oxides concentration (parts per 10 million).
Average number of rooms per dwelling.
Proportion of owner-occupied units built prior to 1940.
Weighted distances to five Boston employment centres.
Index of accessibility to radial highways.
Full-value property tax rate per $10,000.
Pupil–teacher ratio by town.
Computed as $1000(B_k - 0.63)^2$, where $B_k$ is the proportion of Black residents by town.
Percentage of lower-status population.
Median value of owner-occupied homes in thousands of US dollars.
The dataset is a cross-section of housing values in Boston suburbs and is widely used to study hedonic pricing models and the demand for environmental quality.
Harrison Jr, D., & Rubinfeld, D. L. (1978). Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management, 5(1), 81–102. doi:10.1016/0095-0696(78)90006-2
This function performs a linear regression and returns a summary including:
Adjusted R-squared
Overall F-test p-value
Table with parameter estimates, confidence intervals (default 99%), p-values, and significance stars (*, **, ***)
linear_regress(formula, data, level = 0.99)
| Argument | Description |
|---|---|
| formula | A formula like Y ~ X1 + X2 |
| data | A data frame |
| level | Confidence level (default 0.99) |
Object of class 'linear_regress'
data(Housing, package = "IPAG")
linear_regress(MEDV ~ RM + LSTAT, data = Housing)
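The quantities listed for linear_regress can also be obtained with base R alone, which may help when checking results by hand. The sketch below is not the package's implementation: it uses the built-in mtcars data as a stand-in for Housing, and the variable names (fit, adj_r2, f_pvalue, coef_table) are chosen for illustration only.

```r
# Base-R sketch; mtcars ships with R and stands in for the package data.
fit <- lm(mpg ~ wt + hp, data = mtcars)
s <- summary(fit)

# Adjusted R-squared
adj_r2 <- s$adj.r.squared

# Overall F-test p-value, from the F statistic and its degrees of freedom
f <- s$fstatistic
f_pvalue <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

# Parameter estimates with 99% confidence intervals and p-values
coef_table <- cbind(
  estimate = coef(fit),
  confint(fit, level = 0.99),
  p_value = s$coefficients[, "Pr(>|t|)"]
)

print(adj_r2)
print(f_pvalue)
print(coef_table)
```

Significance stars are omitted here; summary(fit) prints them by default if they are needed.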
Dataset combining information from:
McKinsey, "Valuing the merit of teachers", Direction interministérielle de la transformation publique.
OECD (2012), "Does Performance-Based Pay Improve Teaching?", PISA in Focus, No. 16, OECD Publishing, Paris.
data(McKinsey)
A data frame with the following variables:
The name of the country.
Teacher efficiency measured by PISA reading tests.
Teacher salaries relative to GDP per capita: 0 means salaries equal GDP per capita, 0.5 means salaries are 1.5 times GDP per capita, and 1 means salaries are 2 times GDP per capita.
GDP per capita in USD 1,000.
Cumulative expenditure by educational establishments in USD 1,000.
Teacher merit pay (y = yes, n = no).
The dataset contains teacher efficiency as measured by reading performance on PISA tests, along with explanatory variables related to salary, GDP, expenditures, and performance-based pay.
McKinsey, "Valuing the merit of teachers", Direction interministérielle de la transformation publique.
OECD (2012), "Does Performance-Based Pay Improve Teaching?", PISA in Focus, No. 16, OECD Publishing, Paris, doi:10.1787/5k98q27r2stb-en
Confidence interval for a mean
mean_ci(x, level = 0.99, na.rm = TRUE)
| Argument | Description |
|---|---|
| x | Numeric vector |
| level | Confidence level (default 0.99) |
| na.rm | Remove NA values |
Object of class 'mean_ci'
x <- c(4.2, 5.1, 6.3, 5.8, 4.9)
mean_ci(x)
mean_ci(x, level = 0.95)
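The method behind mean_ci is not stated here. Assuming it is the usual Student t interval, the same numbers can be reproduced in base R as a cross-check. The names level, manual_ci and builtin_ci below are illustrative, not part of the package.

```r
# Sketch of a t-based confidence interval for a mean (assumed method).
x <- c(4.2, 5.1, 6.3, 5.8, 4.9)
level <- 0.99

n <- length(x)
se <- sd(x) / sqrt(n)                          # standard error of the mean
t_crit <- qt(1 - (1 - level) / 2, df = n - 1)  # two-sided critical value

manual_ci <- mean(x) + c(-1, 1) * t_crit * se

# The same interval from base R's t.test()
builtin_ci <- as.numeric(t.test(x, conf.level = level)$conf.int)

print(mean(x))
print(manual_ci)
print(builtin_ci)
```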
Confidence interval for the difference of means
mean_diff_ci(x, y, level = 0.99, paired = FALSE, na.rm = TRUE)
| Argument | Description |
|---|---|
| x | Numeric vector |
| y | Numeric vector |
| level | Confidence level (default 0.99) |
| paired | Logical; are the samples paired? |
| na.rm | Remove NA values |
Object of class 'mean_diff_ci'
x <- c(5.1, 4.9, 6.2, 5.8, 5.4)
y <- c(4.8, 4.7, 5.9, 5.2, 5.0)
mean_diff_ci(x, y)
mean_diff_ci(x, y, paired = TRUE)
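To see what the paired argument changes, the two cases can be compared with base R's t.test(). This is a sketch under assumptions: the unpaired case uses Welch's unequal-variance interval (the t.test() default), which may or may not be what mean_diff_ci uses, and the names welch_ci, paired_ci and diff_ci are illustrative.

```r
# Sketch: unpaired (Welch) versus paired intervals for a difference of means.
x <- c(5.1, 4.9, 6.2, 5.8, 5.4)
y <- c(4.8, 4.7, 5.9, 5.2, 5.0)
level <- 0.99

# Independent samples: Welch interval, t.test()'s default
welch_ci <- as.numeric(t.test(x, y, conf.level = level)$conf.int)

# Paired samples
paired_ci <- as.numeric(t.test(x, y, paired = TRUE, conf.level = level)$conf.int)

# A paired interval is a one-sample interval on the differences
diff_ci <- as.numeric(t.test(x - y, conf.level = level)$conf.int)

print(welch_ci)
print(paired_ci)
print(diff_ci)
```

On these data the paired interval is the narrower of the two, because pairing removes the variation shared by each pair of observations.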
Confidence interval for odds ratio from a 2x2 table
oddsratio_ci(a, b, c, d, level = 0.99)
| Argument | Description |
|---|---|
| a, b, c, d | Cell counts of the 2x2 contingency table |
| level | Confidence level (default 0.99) |
Object of class 'oddsratio_ci'
oddsratio_ci(a = 12, b = 5, c = 4, d = 15)
oddsratio_ci(a = 12, b = 5, c = 4, d = 15, level = 0.95)
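One common way to build such an interval is the Wald (Woolf) method on the log odds ratio. The sketch below assumes that method and assumes that a and b form the first row of the table and c and d the second; neither assumption is confirmed here, and the variable names are illustrative.

```r
# Sketch of a Wald (Woolf) interval for an odds ratio (assumed method).
# Assumed cell layout:
#              outcome yes   outcome no
#   group 1        a             b
#   group 2        c             d
cell_a <- 12; cell_b <- 5; cell_c <- 4; cell_d <- 15
level <- 0.99

odds_ratio <- (cell_a * cell_d) / (cell_b * cell_c)   # 180 / 20 = 9

# Standard error of log(OR)
se_log_or <- sqrt(1 / cell_a + 1 / cell_b + 1 / cell_c + 1 / cell_d)
z_crit <- qnorm(1 - (1 - level) / 2)

# Interval on the log scale, transformed back with exp()
or_ci <- exp(log(odds_ratio) + c(-1, 1) * z_crit * se_log_or)

print(c(estimate = odds_ratio, lower = or_ci[1], upper = or_ci[2]))
```

Because the interval is symmetric on the log scale, it is asymmetric around the estimate on the original scale. With cell counts this small it is also wide, and an exact method such as fisher.test() may be preferable.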
Confidence interval for a proportion
prop_ci(trials, successes, level = 0.99)
| Argument | Description |
|---|---|
| trials | Number of trials |
| successes | Number of successes |
| level | Confidence level (default 0.99) |
Object of class 'prop_ci'
# 45 successes out of 100 trials
prop_ci(trials = 100, successes = 45)
prop_ci(trials = 100, successes = 45, level = 0.95)
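Which interval prop_ci computes is not specified here. As a point of comparison, the sketch below computes the textbook Wald (normal approximation) interval by hand and the exact Clopper-Pearson interval with base R's binom.test(). The names p_hat, wald_ci and exact_ci are illustrative.

```r
# Sketch: two standard confidence intervals for a proportion.
trials <- 100
successes <- 45
level <- 0.99

p_hat <- successes / trials
z_crit <- qnorm(1 - (1 - level) / 2)

# Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)
wald_ci <- p_hat + c(-1, 1) * z_crit * sqrt(p_hat * (1 - p_hat) / trials)

# Exact (Clopper-Pearson) interval; binom.test() takes successes first
exact_ci <- as.numeric(binom.test(successes, trials, conf.level = level)$conf.int)

print(p_hat)
print(wald_ci)
print(exact_ci)
```

Note the argument order: prop_ci takes trials before successes, whereas binom.test() takes the number of successes first, so naming the arguments explicitly avoids mix-ups.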