This resource is a video abstract of a research paper created by Research Square on behalf of its authors. It provides a synopsis that's easy to understand and can be used to introduce the topics it covers to students, researchers, and the general public. The video's transcript is also provided in full, with a portion included below as a preview:
"The mosquito microbiome is critical for mosquito development. Its influence on mosquito-borne pathogen transmission has resulted in increasing research interest. Although the mosquito microbiome has been extensively characterized, resulting in large amounts of data, neither standardized methods for mosquito microbiome research nor a curated data repository are available. With an overarching goal of collectively unravelling the role of the mosquito microbiome in mosquito biology, the authors created the Mosquito Microbiome Consortium to address this lack of standardized methods and data repository..."
The rest of the transcript, along with a link to the research paper, is available on the resource itself.
A collection of course syllabi from any discipline featuring content that examines or improves open and reproducible research practices. Email to join the project, access articles, or add other syllabi.
Curate Science is a unified curation system and platform to verify that research is transparent and credible. It will allow researchers, journals, universities, funders, teachers, journalists, and the general public to ensure:
- Transparency: research meets minimum transparency standards appropriate to the article type and employed methodologies.
- Credibility: follow-up scrutiny is linked to its parent paper, including critical commentaries, reproducibility/robustness re-analyses, and new sample replications.
Python is a general purpose programming language that is useful for writing scripts to work effectively and reproducibly with data. This is an introduction to Python designed for participants with no programming experience. These lessons can be taught in one and a half days (~ 10 hours). They start with some basic information about Python syntax and the Jupyter notebook interface, then move through how to import CSV files, how to use the pandas package to work with data frames, how to calculate summary information from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from Python.
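As a taste of the workflow these lessons build toward, a minimal sketch in Python is shown below; the file and column names ("surveys.csv", "species_id", "weight") are placeholders rather than the lesson's actual dataset.

```python
# A minimal, illustrative sketch of the steps the lessons cover; the CSV file
# and column names below are assumptions, not the lesson's real data.
import pandas as pd
import matplotlib.pyplot as plt

surveys = pd.read_csv("surveys.csv")                      # import a CSV into a data frame
print(surveys.head())                                     # inspect the first rows
print(surveys.groupby("species_id")["weight"].mean())     # summary information per group
surveys["weight"].plot(kind="hist")                       # a quick first plot
plt.show()
```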
Data Carpentry lesson from the Ecology curriculum on how to analyse and visualise ecological data in R. Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. The lessons below were designed for those interested in working with ecology data in R. This is an introduction to R designed for participants with no programming experience. These lessons can be taught in a day (~ 6 hours). They start with some basic information about R syntax and the RStudio interface, then move through how to import CSV files, the structure of data frames, how to deal with factors, how to add/remove rows and columns, how to calculate summary statistics from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from R.
Python is a general purpose programming language that is useful for writing scripts to work effectively and reproducibly with data. This is an introduction to Python designed for participants with no programming experience. These lessons can be taught in a day (~ 6 hours). They start with some basic information about Python syntax and the Jupyter notebook interface, then move through how to import CSV files, how to use the pandas package to work with data frames, how to calculate summary information from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from Python.
Data Carpentry trains researchers in the core data skills for efficient, shareable, and reproducible research practices. We run accessible, inclusive training workshops; teach openly available, high-quality, domain-tailored lessons; and foster an active, inclusive, diverse instructor community that promotes and models reproducible research as a community norm.
The Biology Semester-long Course was developed and piloted at the University of Florida in Fall 2015. Course materials include readings, lectures, exercises, and assignments that expand on the material presented at workshops focusing on SQL and R.
A part of the data workflow is preparing the data for analysis. Some of this involves data cleaning, where errors in the data are identified and corrected, or formatting is made consistent. This step must be taken with the same care and attention to reproducibility as the analysis. OpenRefine (formerly Google Refine) is a powerful free and open source tool for working with messy data: cleaning it and transforming it from one format into another. This lesson will teach you to use OpenRefine to effectively clean and format data and automatically track any changes that you make. Many people comment that this tool saves them literally months of work trying to make these edits by hand.
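OpenRefine itself is an interactive, point-and-click tool, so there is no script in the lesson to reproduce here; purely as an analogy, and with made-up file and column names, the same kind of cleanup could be sketched in Python:

```python
# Analogous cleaning steps in Python/pandas (OpenRefine performs these interactively
# and records the change history for you). "survey.csv" and "species" are hypothetical.
import pandas as pd

df = pd.read_csv("survey.csv")
df["species"] = df["species"].str.strip()    # remove stray leading/trailing whitespace
df["species"] = df["species"].str.lower()    # make capitalisation consistent
df = df.drop_duplicates()                    # drop exact duplicate rows
df.to_csv("survey_clean.csv", index=False)   # export the cleaned table
```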
This Library Carpentry lesson introduces archivists to working with data. At the conclusion of the lesson you will: be able to explain terms, phrases, and concepts in code or software development; identify and use best practice in data structures; use regular expressions in searches.
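As a small illustration of the regular-expression portion of the lesson (the pattern and sample text below are invented for the example, not taken from the lesson), a search in Python might look like:

```python
# A minimal regular-expression sketch; the reference-code pattern and the sample
# text are hypothetical examples, not material from the lesson itself.
import re

pattern = re.compile(r"MS-\d{4}")   # match reference codes of the made-up form "MS-1234"
text = "Box 3 contains MS-0017 and MS-0482; the remaining items are uncatalogued."
print(pattern.findall(text))        # ['MS-0017', 'MS-0482']
```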
Databases are useful for both storing and using data effectively. Using a relational database serves several purposes:
- It keeps your data separate from your analysis. This means there’s no risk of accidentally changing data when you analyze it. If we get new data we can rerun a query to find all the data that meets certain criteria.
- It’s fast, even for large amounts of data.
- It improves quality control of data entry (type constraints and use of forms in Access, Filemaker, etc.).
The concepts of relational database querying are core to understanding how to do similar things using programming languages such as R or Python. This lesson will teach you what relational databases are, how you can load data into them and how you can query databases to extract just the information that you need.
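As a rough sketch of what loading and querying a relational database looks like (here from Python with SQLite; the database file, table, and column names are illustrative assumptions, not the lesson's actual dataset):

```python
# A minimal sketch of loading data into SQLite and querying it; the database file,
# table, and column names are assumptions made for this example.
import sqlite3

conn = sqlite3.connect("surveys.sqlite")
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS surveys (year INTEGER, species TEXT, weight REAL)")
cur.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?)",
    [(1990, "DM", 42.0), (1990, "DS", 120.5), (1991, "DM", 39.5)],
)
conn.commit()

# The data stay separate from the analysis, and the same query can simply be
# rerun whenever new data arrive.
cur.execute("SELECT year, AVG(weight) FROM surveys WHERE species = 'DM' GROUP BY year")
print(cur.fetchall())
conn.close()
```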
This is an alpha lesson to teach Data Management with SQL for Social Scientists. We welcome any comments, criticism, or error reports, and will take your feedback into account to improve both the presentation and the content. Databases are useful for both storing and using data effectively. Using a relational database serves several purposes:
- It keeps your data separate from your analysis. This means there’s no risk of accidentally changing data when you analyze it. If we get new data we can rerun a query to find all the data that meets certain criteria.
- It’s fast, even for large amounts of data.
- It improves quality control of data entry (type constraints and use of forms in Access, Filemaker, etc.).
The concepts of relational database querying are core to understanding how to do similar things using programming languages such as R or Python. This lesson will teach you what relational databases are, how you can load data into them and how you can query databases to extract just the information that you need.
Good data organization is the foundation of any research project. Most researchers have data in spreadsheets, so it’s the place that many research projects start. We organize data in spreadsheets in the ways that we as humans want to work with the data, but computers require that data be organized in particular ways. In order to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way that computers need the data. Since this is where most research projects start, this is where we want to start too! In this lesson, you will learn:
- Good data entry practices - formatting data tables in spreadsheets
- How to avoid common formatting mistakes
- Approaches for handling dates in spreadsheets
- Basic quality control and data manipulation in spreadsheets
- Exporting data from spreadsheets
In this lesson, however, you will not learn about data analysis with spreadsheets. Much of your time as a researcher will be spent in the initial ‘data wrangling’ stage, where you need to organize the data to perform a proper analysis later. It’s not the most fun, but it is necessary. In this lesson you will learn how to think about data organization and some practices for more effective data wrangling. With this approach you can better format current data and plan new data collection so less data wrangling is needed.
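As one small illustration of why this formatting matters downstream (the file and column names are hypothetical), a tidy spreadsheet exported to CSV with unambiguous ISO dates reads straight into analysis tools:

```python
# Reading a well-formatted spreadsheet export; "plots.csv", "date", "plot_id",
# and "weight" are made-up names used only for illustration.
import pandas as pd

# One observation per row, one variable per column, dates stored as YYYY-MM-DD
df = pd.read_csv("plots.csv", parse_dates=["date"])
print(df.dtypes)                                  # the date column parses as a real datetime
print(df.groupby("plot_id")["weight"].mean())     # summaries come easily once data are tidy
```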
Lesson on spreadsheets for social scientists. Good data organization is the foundation of any research project. Most researchers have data in spreadsheets, so it’s the place that many research projects start. Typically we organize data in spreadsheets in ways that we as humans want to work with the data. However, computers require data to be organized in particular ways. In order to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way that computers need the data. Since this is where most research projects start, this is where we want to start too! In this lesson, you will learn:
- Good data entry practices - formatting data tables in spreadsheets
- How to avoid common formatting mistakes
- Approaches for handling dates in spreadsheets
- Basic quality control and data manipulation in spreadsheets
- Exporting data from spreadsheets
In this lesson, however, you will not learn about data analysis with spreadsheets. Much of your time as a researcher will be spent in the initial ‘data wrangling’ stage, where you need to organize the data to perform a proper analysis later. It’s not the most fun, but it is necessary. In this lesson you will learn how to think about data organization and some practices for more effective data wrangling. With this approach you can better format current data and plan new data collection so less data wrangling is needed.
Data Carpentry lesson to learn how to use command-line tools to perform quality control, align reads to a reference genome, and identify and visualize between-sample variation. A lot of genomics analysis is done using command-line tools for three reasons:
1) you will often be working with a large number of files, and working through the command-line rather than through a graphical user interface (GUI) allows you to automate repetitive tasks;
2) you will often need more compute power than is available on your personal computer, and connecting to and interacting with remote computers requires a command-line interface;
3) you will often need to customize your analyses, and command-line tools often enable more customization than the corresponding GUI tools (if in fact a GUI tool even exists).
In a previous lesson, you learned how to use the bash shell to interact with your computer through a command line interface. In this lesson, you will be applying this new knowledge to carry out a common genomics workflow - identifying variants among sequencing samples taken from multiple individuals within a population. We will be starting with a set of sequenced reads (.fastq files), performing some quality control steps, aligning those reads to a reference genome, and ending by identifying and visualizing variations among these samples. As you progress through this lesson, keep in mind that, even if you aren’t going to be doing this same workflow in your research, you will be learning some very important lessons about using command-line bioinformatic tools. What you learn here will enable you to use a variety of bioinformatic tools with confidence and greatly enhance your research efficiency and productivity.
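The lesson itself runs each step directly in the shell; as a hedged sketch only, the same workflow could be driven from Python, where the tool names (fastqc, bwa, samtools, bcftools), file names, and flags below are assumptions about a typical setup rather than the lesson's exact commands.

```python
# An illustrative variant-calling workflow driven via subprocess; assumes the
# listed tools are installed and that "ref_genome.fasta" / "sample.fastq" exist.
import subprocess

ref, reads = "ref_genome.fasta", "sample.fastq"    # hypothetical input files

subprocess.run(["fastqc", reads], check=True)      # quality control report
subprocess.run(["bwa", "index", ref], check=True)  # index the reference genome
with open("aln.sam", "w") as sam:                  # align reads to the reference
    subprocess.run(["bwa", "mem", ref, reads], stdout=sam, check=True)
subprocess.run(["samtools", "sort", "-O", "bam", "-o", "aln.sorted.bam", "aln.sam"], check=True)
subprocess.run(["samtools", "index", "aln.sorted.bam"], check=True)
# pile up aligned reads against the reference, then call SNPs/indels into a VCF
subprocess.run(
    f"bcftools mpileup -f {ref} aln.sorted.bam | bcftools call -mv -o variants.vcf",
    shell=True, check=True,
)
```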
Access to data is a critical feature of an efficient, progressive and ultimately self-correcting scientific ecosystem. But the extent to which the in-principle benefits of data sharing are realized in practice is unclear. Crucially, it is largely unknown whether published findings can be reproduced by repeating reported analyses upon shared data (‘analytic reproducibility’). To investigate this, we conducted an observational evaluation of a mandatory open data policy introduced at the journal Cognition. Interrupted time-series analyses indicated a substantial post-policy increase in data available statements (104/417, 25% pre-policy to 136/174, 78% post-policy), although not all data appeared reusable (23/104, 22% pre-policy to 85/136, 62% post-policy). For 35 of the articles determined to have reusable data, we attempted to reproduce 1324 target values. Ultimately, 64 values could not be reproduced within a 10% margin of error. For 22 articles all target values were reproduced, but 11 of these required author assistance. For 13 articles at least one value could not be reproduced despite author assistance. Importantly, there were no clear indications that original conclusions were seriously impacted. Mandatory open data policies can increase the frequency and quality of data sharing. However, suboptimal data curation, unclear analysis specification and reporting errors can impede analytic reproducibility, undermining the utility of data sharing and the credibility of scientific findings.
Software Carpentry lesson that teaches how to use databases and SQL. In the late 1920s and early 1930s, William Dyer, Frank Pabodie, and Valentina Roerich led expeditions to the Pole of Inaccessibility in the South Pacific, and then onward to Antarctica. Two years ago, the notebooks from their expeditions were found in a storage locker at Miskatonic University. We have scanned and OCR'd the data they contain, and we now want to store that information in a way that will make search and analysis easy. Three common options for storage are text files, spreadsheets, and databases. Text files are easiest to create, and work well with version control, but then we would have to build search and analysis tools ourselves. Spreadsheets are good for doing simple analyses, but they don’t handle large or complex data sets well. Databases, however, include powerful tools for search and analysis, and can handle large, complex data sets. These lessons will show how to use a database to explore the expeditions’ data.
A number of publishers and funders, including PLOS, have recently adopted policies requiring researchers to share the data underlying their results and publications. Such policies help increase the reproducibility of the published literature, as well as make a larger body of data available for reuse and re-analysis. In this study, we evaluate the extent to which authors have complied with this policy by analyzing Data Availability Statements from 47,593 papers published in PLOS ONE between March 2014 (when the policy went into effect) and May 2016. Our analysis shows that compliance with the policy has increased, with a significant decline over time in papers that did not include a Data Availability Statement. However, only about 20% of statements indicate that data are deposited in a repository, which the PLOS policy states is the preferred method. More commonly, authors state that their data are in the paper itself or in the supplemental information, though it is unclear whether these data meet the level of sharing required in the PLOS policy. These findings suggest that additional review of Data Availability Statements or more stringent policies may be needed to increase data sharing.
This cross-sectional study examines discrepancies between registered protocols and subsequent publications for drug and diet trials whose findings were published in prominent clinical journals in the last decade. ClinicalTrials.gov was established in 2000 in response to the Food and Drug Administration Modernization Act of 1997, which called for registration of trials of investigational new drugs for serious diseases. Subsequently, the scope of ClinicalTrials.gov expanded to all interventional studies, including diet trials. Presently, prospective trial registration is required by the National Institutes of Health for grant funding and by many clinical journals for publication.1 Registration may reduce risk of bias from selective reporting and post hoc changes in design and analysis.1,2 Although a study3 of trials with ethics approval in Finland in 2007 identified numerous discrepancies between registered protocols and subsequent publications, the consistency of diet trial registration and reporting has not been well explored.
No restrictions on your remixing, redistributing, or making derivative works. Give credit to the author, as required.
Your remixing, redistributing, or making derivative works comes with some restrictions, including how it is shared.
Your redistributing comes with some restrictions. Do not remix or make derivative works.
Most restrictive license type. Prohibits most uses, sharing, and any changes.
Copyrighted materials, available under Fair Use and the TEACH Act for US-based educators, or other custom arrangements. Go to the resource provider to see their individual restrictions.