This article discusses Project FeederWatch, a real-time citizen science project, and how elementary teachers can use this bird data to integrate math lessons and concepts.
Data Carpentry Genomics workshop lesson to learn how to structure your metadata, organize and document your genomics data and bioinformatics workflow, and access data on the NCBI Sequence Read Archive (SRA) database.

Good data organization is the foundation of any research project. It not only sets you up well for an analysis, but also makes it easier to come back to the project later and share it with collaborators, including your most important collaborator - future you. Organizing a project that includes sequencing involves many components: the experimental setup and conditions metadata, measurements of experimental parameters, sequencing preparation and sample information, the sequences themselves, and the files and workflow of any bioinformatics analysis. So much of the information in a sequencing project is digital, and we need to keep track of our digital records the same way we keep a lab notebook and sample freezer. In this lesson, we'll go through the project organization and documentation that make an efficient bioinformatics workflow possible. Not only will this make you a more effective bioinformatics researcher, it also prepares your data and project for publication, as grant agencies and publishers increasingly require this information.

In this lesson, we'll be using data from a study of experimental evolution using E. coli. More information about this dataset is available here. In this study there are several types of files:

- Spreadsheet data from the experiment that tracks the strains and their phenotype over time
- Spreadsheet data with information on the samples that were sequenced - the names of the samples, how they were prepared, and the sequencing conditions
- The sequence data

Throughout the analysis, we'll also generate files from the steps in the bioinformatics pipeline and documentation on the tools and parameters that we used.
In this lesson you will learn:

- How to structure your metadata, tabular data, and information about the experiment. The metadata is the information about the experiment and the samples you're sequencing.
- How to prepare for, understand, organize, and store the sequencing data that comes back from the sequencing center
- How to access and download publicly available data that may need to be used in your bioinformatics analysis
- The concepts of organizing the files and documenting the workflow of your bioinformatics analysis
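The project-organization advice above can be made concrete with a small script. This is only a sketch, assuming a hypothetical project name and directory layout in the spirit of the lesson: raw data kept separate and untouched, with a README recording provenance.

```python
# Hypothetical sketch of a documented project skeleton for a sequencing study.
# The directory names and project name are illustrative, not the lesson's exact layout.
from pathlib import Path
import tempfile

def scaffold_project(root: Path) -> list:
    """Create a documented skeleton: raw vs. processed data, results, docs, scripts."""
    subdirs = ["data/raw", "data/trimmed", "results", "docs", "scripts"]
    created = []
    for sub in subdirs:
        d = root / sub
        d.mkdir(parents=True, exist_ok=True)
        created.append(d)
    # Never edit raw data in place; record where it came from instead.
    (root / "data/raw/README.md").write_text(
        "Raw FASTQ files as delivered by the sequencing center. Do not modify.\n"
    )
    return created

root = Path(tempfile.mkdtemp()) / "ecoli_evolution"
dirs = scaffold_project(root)
print(len(dirs))  # 5 subdirectories created
```

Keeping this structure in a script (rather than creating folders by hand) documents the layout itself, which is part of the reproducible workflow the lesson aims for.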
The soup-to-nuts exercises take students through the entire process of research with statistical data, from the very beginning when they first access the original data, through cleaning and processing the data to prepare them for analysis, to the very end when they generate the results that they present in a written report. Throughout each exercise, there will be an emphasis on adopting a transparent workflow and constructing replication documentation that ensures all the work done for the exercise can be independently reproduced.
Background There is increasing interest to make primary data from published research publicly available. We aimed to assess the current status of making research data available in highly-cited journals across the scientific literature. Methods and Results We reviewed the first 10 original research papers of 2009 published in the 50 original research journals with the highest impact factor. For each journal we documented the policies related to public availability and sharing of data. Of the 50 journals, 44 (88%) had a statement in their instructions to authors related to public availability and sharing of data. However, there was wide variation in journal requirements, ranging from requiring the sharing of all primary data related to the research to just including a statement in the published manuscript that data can be available on request. Of the 500 assessed papers, 149 (30%) were not subject to any data availability policy. Of the remaining 351 papers that were covered by some data availability policy, 208 papers (59%) did not fully adhere to the data availability instructions of the journals they were published in, most commonly (73%) by not publicly depositing microarray data. The other 143 papers that adhered to the data availability instructions did so by publicly depositing only the specific data type as required, making a statement of willingness to share, or actually sharing all the primary data. Overall, only 47 papers (9%) deposited full primary raw data online. None of the 149 papers not subject to data availability policies made their full primary data publicly available. Conclusion A substantial proportion of original research papers published in high-impact journals are either not subject to any data availability policies, or do not adhere to the data availability instructions in their respective journals. This empiric evaluation highlights opportunities for improvement.
Policies that mandate public data archiving (PDA) successfully increase accessibility to data underlying scientific publications. However, is the data quality sufficient to allow reuse and reanalysis? We surveyed 100 datasets associated with nonmolecular studies in journals that commonly publish ecological and evolutionary research and have a strong PDA policy. Out of these datasets, 56% were incomplete, and 64% were archived in a way that partially or entirely prevented reuse. We suggest that cultural shifts facilitating clearer benefits to authors are necessary to achieve high-quality PDA and highlight key guidelines to help authors increase their data’s reuse potential and compliance with journal data policies.
Background The p value obtained from a significance test provides no information about the magnitude or importance of the underlying phenomenon. Therefore, additional reporting of effect size is often recommended. Effect sizes are theoretically independent from sample size. Yet this may not hold true empirically: non-independence could indicate publication bias. Methods We investigate whether effect size is independent from sample size in psychological research. We randomly sampled 1,000 psychological articles from all areas of psychological research. We extracted p values, effect sizes, and sample sizes of all empirical papers, and calculated the correlation between effect size and sample size, and investigated the distribution of p values. Results We found a negative correlation of r = −.45 [95% CI: −.53; −.35] between effect size and sample size. In addition, we found an inordinately high number of p values just passing the boundary of significance. Additional data showed that neither implicit nor explicit power analysis could account for this pattern of findings. Conclusion The negative correlation between effect size and samples size, and the biased distribution of p values indicate pervasive publication bias in the entire field of psychology.
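The study's central check - whether effect size is independent of sample size - can be sketched as a correlation computed over per-study pairs. The numbers below are made-up toy values, not the paper's data; they only illustrate the pattern where small studies show larger effects, as expected under publication bias.

```python
# Illustrative sketch: correlate per-study effect sizes with sample sizes.
# The six "studies" here are hypothetical, chosen to show a negative trend.
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

sample_sizes = [20, 35, 50, 80, 120, 200]
effect_sizes = [0.80, 0.65, 0.45, 0.40, 0.30, 0.25]
r = pearson_r(sample_sizes, effect_sizes)
print(round(r, 2))  # clearly negative, echoing the paper's reported pattern
```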
P values represent a widely used, but pervasively misunderstood and fiercely contested method of scientific inference. Display items, such as figures and tables, often containing the main results, are an important source of P values. We conducted a survey comparing the overall use of P values and the occurrence of significant P values in display items of a sample of articles in the three top multidisciplinary journals (Nature, Science, PNAS) in 2017 and, respectively, in 1997. We also examined the reporting of multiplicity corrections and its potential influence on the proportion of statistically significant P values. Our findings demonstrated substantial and growing reliance on P values in display items, with increases of 2.5 to 14.5 times in 2017 compared to 1997. The overwhelming majority of P values (94%, 95% confidence interval [CI] 92% to 96%) were statistically significant. Methods to adjust for multiplicity were almost non-existent in 1997, but reported in many articles relying on P values in 2017 (Nature 68%, Science 48%, PNAS 38%). In their absence, almost all reported P values were statistically significant (98%, 95% CI 96% to 99%). Conversely, when any multiplicity corrections were described, 88% (95% CI 82% to 93%) of reported P values were statistically significant. Use of Bayesian methods was scant (2.5%) and rarely (0.7%) articles relied exclusively on Bayesian statistics. Overall, wider appreciation of the need for multiplicity corrections is a welcome evolution, but the rapid growth of reliance on P values and implausibly high rates of reported statistical significance are worrisome.
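The multiplicity corrections the survey tracks can be illustrated with the simplest of them, the Bonferroni adjustment (one of several methods the surveyed articles might use; the raw p values below are hypothetical).

```python
# Minimal sketch of family-wise error control via Bonferroni adjustment:
# multiply each raw p value by the number of comparisons, cap at 1.0.
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p values, significance flags) for m comparisons."""
    m = len(p_values)
    adjusted = [min(p * m, 1.0) for p in p_values]
    significant = [p_adj < alpha for p_adj in adjusted]
    return adjusted, significant

# Five hypothetical raw p values from a single display item.
raw = [0.001, 0.012, 0.03, 0.04, 0.20]
adj, sig = bonferroni(raw)
print(sig)  # only the smallest raw p value survives the correction
```

This is exactly the effect the survey observes in aggregate: once any multiplicity correction is applied, the proportion of results counted as statistically significant drops.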
Python is a general purpose programming language that is useful for writing scripts to work effectively and reproducibly with data. This is an introduction to Python designed for participants with no programming experience. These lessons can be taught in a day (~ 6 hours). They start with some basic information about Python syntax, the Jupyter notebook interface, and move through how to import CSV files, using the pandas package to work with data frames, how to calculate summary information from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from Python.
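The core of the pandas portion of these lessons can be sketched in a few lines: build (or, in the lessons, load with pd.read_csv) a data frame, then compute summary information overall and per group. The column names and values here are hypothetical, not the lessons' actual dataset.

```python
# Sketch of the lessons' pandas workflow: a data frame, a summary statistic,
# and a grouped aggregation. (In the lessons you would load a CSV instead,
# e.g. df = pd.read_csv("surveys.csv").)
import pandas as pd

df = pd.DataFrame({
    "species": ["DM", "DM", "PF", "PF", "DS"],
    "weight":  [42.0, 45.0, 7.5, 8.0, 120.0],
})

print(df["weight"].mean())                      # overall mean weight
print(df.groupby("species")["weight"].mean())   # mean weight per species
```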
Discussions of how to improve research quality are predominant in a number of fields, including education. But how prevalent are the use of problematic practices and the improved practices meant to counter them? This baseline information will be a critical data source as education researchers seek to improve our research practices. In this preregistered study, we replicated and extended previous studies from other fields by asking education researchers about 10 questionable research practices and 5 open research practices. We asked them to estimate the prevalence of the practices in the field, self-report their own use of such practices, and estimate the appropriateness of these behaviors in education research. We made predictions under four umbrella categories: comparison to psychology, geographic location, career stage, and quantitative orientation. Broadly, our results suggest that both questionable and open research practices are part of the typical research practices of many educational researchers. Preregistration, code, and data can be found at https://osf.io/83mwk/.
We surveyed 807 researchers (494 ecologists and 313 evolutionary biologists) about their use of Questionable Research Practices (QRPs), including cherry picking statistically significant results, p hacking, and hypothesising after the results are known (HARKing). We also asked them to estimate the proportion of their colleagues that use each of these QRPs. Several of the QRPs were prevalent within the ecology and evolution research community. Across the two groups, we found 64% of surveyed researchers reported they had at least once failed to report results because they were not statistically significant (cherry picking); 42% had collected more data after inspecting whether results were statistically significant (a form of p hacking) and 51% had reported an unexpected finding as though it had been hypothesised from the start (HARKing). Such practices have been directly implicated in the low rates of reproducible results uncovered by recent large scale replication studies in psychology and other disciplines. The rates of QRPs found in this study are comparable with the rates seen in psychology, indicating that the reproducibility problems discovered in psychology are also likely to be present in ecology and evolution.
Hypothesizing after the results are known (HARK) has been disparaged as data dredging, and safeguards including hypothesis preregistration and statistically rigorous oversight have been recommended. Despite potential drawbacks, HARK has deepened thinking about complex causal processes. Some of the HARK precautions can conflict with the modern reality of researchers’ obligations to use big, ‘organic’ data sources—from high-throughput genomics to social media streams. We here propose a HARK-solid, reproducible inference framework suitable for big data, based on models that represent formalization of hypotheses. Reproducibility is attained by employing two levels of model validation: internal (relative to data collated around hypotheses) and external (independent to the hypotheses used to generate data or to the data used to generate hypotheses). With a model-centered paradigm, the reproducibility focus changes from the ability of others to reproduce both data and specific inferences from a study to the ability to evaluate models as representation of reality. Validation underpins ‘natural selection’ in a knowledge base maintained by the scientific community. The community itself is thereby supported to be more productive in generating and critically evaluating theories that integrate wider, complex systems.
Student teams assign importance factors, called "desirability points," to the rock properties found in the previous lesson/activity in order to mathematically determine the overall best rocks for building caverns within. They learn the real-world connections and relationships between rocks and the engineering properties that matter for designing and building caverns (or tunnels, mines, building foundations, etc.).
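The scoring scheme described above amounts to a weighted sum: each rock property rating is multiplied by its team-assigned desirability weight, and the rock with the highest total wins. The property names, weights, and ratings below are hypothetical, not the activity's actual values.

```python
# Sketch of "desirability points" scoring: weighted sum of property ratings.
# All names and numbers are illustrative stand-ins for the teams' own choices.
def desirability_score(properties, weights):
    return sum(properties[name] * weights[name] for name in weights)

weights = {"hardness": 3, "low_porosity": 2, "strength": 5}   # importance factors
rocks = {
    "granite":   {"hardness": 9, "low_porosity": 8, "strength": 9},
    "sandstone": {"hardness": 4, "low_porosity": 3, "strength": 4},
    "limestone": {"hardness": 5, "low_porosity": 5, "strength": 6},
}

scores = {rock: desirability_score(props, weights) for rock, props in rocks.items()}
best = max(scores, key=scores.get)
print(best, scores[best])  # granite 88
```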
Students will learn about the water cycle, watersheds, and point and non-point source pollution. Students will then apply this knowledge to take a position in the debate about the proposed development at Hawn's Bridge Peninsula at Raystown Lake and write a letter to the editor expressing their opinion. Pairs well with an Engineering Design Challenge or a Meaningful Watershed Educational Experience (MWEE).
The recent ‘replication crisis’ in psychology has focused attention on ways of increasing methodological rigor within the behavioral sciences. Part of this work has involved promoting ‘Registered Reports’, wherein journals peer review papers prior to data collection and publication. Although this approach is usually seen as a relatively recent development, we note that a prototype of this publishing model was initiated in the mid-1970s by parapsychologist Martin Johnson in the European Journal of Parapsychology (EJP). A retrospective and observational comparison of Registered and non-Registered Reports published in the EJP during a seventeen-year period provides circumstantial evidence to suggest that the approach helped to reduce questionable research practices. This paper aims both to bring Johnson’s pioneering work to a wider audience, and to investigate the positive role that Registered Reports may play in helping to promote higher methodological and statistical standards.
Students become familiar with the online Renewable Energy Living Lab interface and access its real-world solar energy data to evaluate the potential for solar generation in various U.S. locations. They become familiar with where the most common sources of renewable energy are distributed across the U.S. Through this activity, students and teachers gain familiarity with the living lab's GIS graphic interface and query functions, and are exposed to the available data in renewable energy databases, learning how to query to find specific information for specific purposes. The activity is intended as a "training" activity prior to conducting activities such as The Bright Idea activity, which includes a definitive and extensive end product (a feasibility plan) for students to create.
Students use real-world data to evaluate the feasibility of solar energy and other renewable energy sources in different U.S. locations. Working in small groups, students act as engineers evaluating the suitability of installing solar panels at four company locations. They access data from the online Renewable Energy Living Lab from which they make calculations and analyze how successful solar energy generation would be, as well as the potential for other power sources at those locations. Then they summarize their results, analysis and recommendations in the form of feasibility plans prepared for a CEO.
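The feasibility calculation students perform can be sketched with a simplified back-of-the-envelope model (not the living lab's actual method): estimated annual output from panel area, local insolation, and panel efficiency. The site values below are hypothetical.

```python
# Simplified solar feasibility model: annual energy from area x insolation x
# efficiency. Inputs are illustrative, not Renewable Energy Living Lab data.
def annual_output_kwh(area_m2, insolation_kwh_m2_day, efficiency, days=365):
    """Estimated yearly energy (kWh) from a solar array."""
    return area_m2 * insolation_kwh_m2_day * efficiency * days

# Hypothetical comparison of two company sites with the same 100 m2 array.
phoenix = annual_output_kwh(area_m2=100, insolation_kwh_m2_day=6.5, efficiency=0.18)
seattle = annual_output_kwh(area_m2=100, insolation_kwh_m2_day=3.7, efficiency=0.18)
print(round(phoenix), round(seattle))  # the sunnier site yields far more energy
```

A feasibility plan would then compare these estimates against each site's energy demand and the potential of other local power sources.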
This course was developed and taught by Ben Marwick, Professor of Archaeology at the University of Washington. A requirement for the UW Master of Science in Data Science, it introduces students to the principles and tools for computational reproducibility in data science using R. Topics covered include acquiring, cleaning, and manipulating data in a reproducible workflow using the tidyverse. Students will use literate programming tools and explore best practices for organizing data analyses. Students will learn to write documents using R Markdown, compile R Markdown documents using knitr and related tools, and publish reproducible documents to various common formats. Students will learn strategies and tools for packaging research compendia, dependency management, and containerising projects to provide computational isolation.
No restrictions on your remixing, redistributing, or making derivative works. Give credit to the author, as required.
Your remixing, redistributing, or making derivative works comes with some restrictions, including how it is shared.
Your redistributing comes with some restrictions. Do not remix or make derivative works.
Most restrictive license type. Prohibits most uses, sharing, and any changes.
Copyrighted materials, available under Fair Use and the TEACH Act for US-based educators, or other custom arrangements. Go to the resource provider to see their individual restrictions.