OER Commons

Missing Data and Multiple Imputation Decision Tree

This document is intended to provide practical guidelines for researchers to follow when examining their data for missingness and making decisions about how to handle that missingness. We primarily offer recommendations for multiple imputation, but also indicate where the same decision guidelines are appropriate for other types of missing data procedures such as full information maximum likelihood (FIML). Streamlining procedures to address missing data and increasing the transparency of those procedures through consensus on reporting standards are inextricably linked to the goals of open scholarship (i.e., the endeavour to improve openness, integrity, social justice, diversity, equity, inclusivity and accessibility in all areas of scholarly activities, and by extension, academic fields beyond the sciences and academic activities; Pownall et al., 2021). Successfully implementing transparent and accessible guidelines for addressing missing data is also important for Diversity, Equity, Inclusion, and Accessibility (DEIA) improvement efforts (Randall et al., 2021). Structural barriers to participation in research can lead to participants from minoritized groups disproportionately dropping out of longitudinal, developmental studies or not completing measures (Randall et al., 2021). This selection effect can bias model estimates and confidence intervals, leading to unsubstantiated claims about equitable outcomes. In addition to often producing artificially small estimates of inequalities between groups, listwise deletion also limits statistical power for minoritized groups who are already underrepresented in many datasets.

Subject: Mathematics; Statistics and Probability
Material Type: Diagram/Illustration; Reading
Author: Alex Uzdavines; Ben Van Dusen; Daria Gerasimova; David Moreau; Denver Brown; James M. Clay; Jayson Nissen; Jessica A. R. Logan; Kathleen Schmidt; Keven Joyal-Desmarais; Kevin M. King; Mahmoud M. Elsherif; Martin Vasilev; Max A. Halvorson; Menglin Xu; Pamela E. Davis-Kean; Rick A. Cruz; Sierra Bainter; Adrienne D. Woods
Date Added: 04/25/2022

Unrestricted Use

Public Domain

Secondary Data Preregistration

Preregistration is the process of specifying project details, such as hypotheses, data collection procedures, and analytical decisions, prior to conducting a study. It is designed to make a clearer distinction between data-driven, exploratory work and a priori, confirmatory work. Both modes of research are valuable, but they are easy to conflate unintentionally. See the Preregistration Revolution for more background and recommendations.

For research that uses existing datasets, there is an increased risk of analysts being biased by preliminary trends in the dataset. However, that risk can be balanced by proper blinding to any summary statistics in the dataset and the use of hold-out datasets (where the "training" and "validation" datasets are kept separate from each other). See this page for specific recommendations about "split samples" or "hold-out" datasets. Finally, if those procedures are not followed, disclosure of possible biases can inform the researcher and her audience about the proper role any results should have (i.e., the results should be deemed mostly exploratory and ideal for additional confirmation).

This project contains a template for creating your preregistration, designed specifically for research using existing data. In the future, this template will be integrated into the OSF.
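
The hold-out idea described above can be illustrated with a minimal Python sketch. It is not part of the template itself: the function name `split_holdout`, the `holdout_fraction` parameter, and the fixed seed are all hypothetical choices made for illustration. The sketch assumes the dataset fits in memory as a list of records and that a simple random split is appropriate; clustered or longitudinal data may call for splitting by participant or site instead.

```python
import random


def split_holdout(records, holdout_fraction=0.5, seed=2021):
    """Randomly partition records into an exploratory set and a hold-out set.

    The seed is fixed so the split can be stated in the preregistration
    and reproduced later (hypothetical default, chosen for illustration).
    """
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")

    rng = random.Random(seed)          # local generator; global state untouched
    indices = list(range(len(records)))
    rng.shuffle(indices)

    n_holdout = round(len(records) * holdout_fraction)
    holdout_idx = set(indices[:n_holdout])

    # Preserve the original record order within each partition.
    exploratory = [r for i, r in enumerate(records) if i not in holdout_idx]
    holdout = [r for i, r in enumerate(records) if i in holdout_idx]
    return exploratory, holdout


if __name__ == "__main__":
    participants = [f"P{i:03d}" for i in range(100)]
    exploratory, holdout = split_holdout(participants, holdout_fraction=0.3)
    print(len(exploratory), len(holdout))
```

In this sketch the exploratory partition would be used to develop hypotheses and analysis code, while the hold-out partition stays unopened until the preregistration is filed. Recording the seed and fraction alongside the preregistration is one way to make the split auditable.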

Subject: Life Science; Social Science
Material Type: Reading
Author: Alexander C. DeHaven; Andrew Hall; Brian Brown; Charles R. Ebersole; Courtney K. Soderberg; David Thomas Mellor; Elliott Kruse; Jerome Olsen; Jessica Kosie; K. D. Valentine; Lorne Campbell; Marjan Bakker; Olmo van den Akker; Pamela Davis-Kean; Rodica I. Damian; Stuart J. Ritchie; Thuy-vy Nguyen; William J. Chopik; Sara J. Weston
Date Added: 08/03/2021

Unrestricted Use

Public Domain

Secondary Data Preregistration

Preregistration is the process of specifying project details, such as hypotheses, data collection procedures, and analytical decisions, prior to conducting a study. It is designed to make a clearer distinction between data-driven, exploratory work and a priori, confirmatory work. Both modes of research are valuable, but they are easy to conflate unintentionally. See the Preregistration Revolution for more background and recommendations.

For research that uses existing datasets, there is an increased risk of analysts being biased by preliminary trends in the dataset. However, that risk can be balanced by proper blinding to any summary statistics in the dataset and the use of hold-out datasets (where the "training" and "validation" datasets are kept separate from each other). See this page for specific recommendations about "split samples" or "hold-out" datasets. Finally, if those procedures are not followed, disclosure of possible biases can inform the researcher and her audience about the proper role any results should have (i.e., the results should be deemed mostly exploratory and ideal for additional confirmation).

This project contains a template for creating your preregistration, designed specifically for research using existing data. In the future, this template will be integrated into the OSF.

Subject: Applied Science
Material Type: Reading
Author: Alexander C. DeHaven; Andrew Hall; Brian Brown; Charles R. Ebersole; Courtney K. Soderberg; David Thomas Mellor; Elliott Kruse; Jerome Olsen; Jessica Kosie; K. D. Valentine; Lorne Campbell; Marjan Bakker; Olmo van den Akker; Pamela Davis-Kean; Rodica I. Damian; Stuart J. Ritchie; Thuy-vy Nguyen; William J. Chopik; Sara J. Weston
Date Added: 08/12/2021

3 Results
