Syllabus: Applied Statistics for Health Researchers

Objective

Develop core competencies in applied statistics for health research using the R programming language. The course provides a theoretical and practical framework that enables participants to analyze data, interpret results, and apply biostatistical techniques in research and professional projects in the health sector.

Course Requirements

  1. Basic understanding of data management, including knowledge of data structures and types.
  2. Completion of the Data Management Strategies course.

Course Content

Module 1: Introduction to Biostatistics

  • Fundamental concepts of biostatistics
  • Installation of the R statistical environment
  • Importance of accurate and valid data collection in epidemiology
    • History and the scientific method
    • Epidemiological study designs
    • Validity in epidemiological studies
    • Sources of bias in data
  • Ethics and responsible conduct in data management

Module 2: Descriptive Statistics

Measures of Central Tendency

  • Arithmetic mean (average)
  • Geometric mean
  • Weighted mean
  • Median
  • Mode

Measures of Dispersion

  • Range
  • Interquartile range
  • Variance
  • Standard deviation

Statistical Distributions

  • What is a distribution? Theoretical concepts

Module 3: Fundamentals of Probability

  • Basic concepts of probability
  • Probability theorems
  • Conditional probability
  • Probability distributions
  • Screening tests

Module 4: Statistical Inference

  • Introduction to statistical inference
  • Hypothesis testing
    • z test
    • t test
    • Proportion tests
    • Algorithm for selecting hypothesis tests
  • Bootstrap resampling for statistical inference

Module 5: Inferential Statistics with the infer Package in R

  • Application of the infer package in R for hypothesis testing
  • Practical exercises in statistical inference

Module 6: Reproducibility in Research

  • Importance of reproducibility in scientific research
  • Strategies to ensure transparency and replicability in data analysis

References

  • Álvarez Cáceres, R. (2007). Estadística aplicada a las ciencias de la salud. Editorial Díaz de Santos, S.A. https://www.editdiazdesantos.com/wwwdat/pdf/9788479788230.pdf

  • Andrade, H. A. (2019). Bioestadística aplicada en ciencias de la salud. Guía complementaria al curso. Fundación Gustavo Palma Calderón. https://www.researchgate.net/publication/330521436_Bioestadistica_Aplicada_Ciencias_de_la_Salud

  • Couch, S. P., Bray, A. P., Ismay, C., Chasnovski, E., Baumer, B. S., & Çetinkaya-Rundel, M. (2021). infer: An R package for tidyverse-friendly statistical inference. Journal of Open Source Software, 6(65), 3661. https://doi.org/10.21105/joss.03661

  • Dalgaard, P. (2008). Introductory Statistics with R (2nd ed.). Springer. https://doi.org/10.1007/978-0-387-79054-1

  • Díaz-Portillo, J. (2011). Guía práctica del curso de bioestadística aplicada a las ciencias de la salud. Instituto Nacional de Gestión Sanitaria, Servicio de Recursos Documentales y Apoyo Institucional. Madrid, Spain.

  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2023, June 21). An Introduction to Statistical Learning. https://www.statlearning.com/

  • Kleinbaum, D. G., Kupper, L. L., Nizam, A., & Rosenberg, E. S. (2013). Applied Regression Analysis and Other Multivariable Methods (5th ed.). Cengage Learning.

  • Porta, M. S., Greenland, S., Hernán, M., dos Santos Silva, I., & Last, J. M. (Eds.). (2014). A Dictionary of Epidemiology (6th ed.). Oxford University Press. https://doi.org/10.1093/acref/9780199976720.001.0001

  • R Core Team. (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/

  • Rosner, B. (2016). Fundamentals of Biostatistics (8th ed.). Cengage Learning. https://www.cengage.com/c/fundamentals-of-biostatistics-8e-rosner/9781305268920/