Analysis

Deriving meaning and knowledge from data. Software, code, licensing, maintenance, statistics, methods, code sharing, documentation, and more.
 

110 affiliated resources

Python for Harvesting Data on the Web
Conditional Remix & Share Permitted
CC BY-NC
Rating
0.0 stars

This session is an intermediate-to-advanced level class that offers some ideas for how to approach the following common data wrangling needs in research: 1) obtain data, often via a web interface and especially an API, and load it into a suitable data "container" for analysis, 2) parse the data retrieved via an API and turn it into a useful object for manipulation and analysis, and 3) perform some basic summary counts of records in a dataset and work up a quick visualization.

Subject:
Applied Science
Life Science
Physical Science
Social Science
Material Type:
Activity/Lab
Provider:
New York University
Author:
Nick Wolf
Vicky Steeves
Date Added:
01/06/2020
Python for Humanities
Unrestricted Use
CC BY
Rating
0.0 stars

Python is a general purpose programming language that is useful for writing scripts to work effectively and reproducibly with data. This is an introduction to Python designed for participants with no programming experience. These lessons can be taught in a day (~ 6 hours). They start with some basic information about Python syntax, the Jupyter notebook interface, and move through how to import CSV files, using the pandas package to work with data frames, how to calculate summary information from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from Python.

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Iain Emsley
Date Added:
08/07/2020
Qualitative Research Using Open Tools
Unrestricted Use
CC BY
Rating
0.0 stars

Qualitative research has long suffered from a lack of free tools for analysis, leaving no options for researchers without significant funds for software licenses. This presents significant challenges for equity. This panel discussion will explore the first two free/libre open source qualitative analysis tools out there: qcoder (R package) and Taguette (desktop application). Drawing from the diverse backgrounds of the presenters (social science, library & information science, software engineering), we will discuss what openness and extensibility mean for qualitative research, and how the two tools we've built facilitate equitable, open sharing.

Subject:
Applied Science
Life Science
Physical Science
Social Science
Material Type:
Lesson
Provider:
New York University
Author:
Beth M. Duckles
Vicky Steeves
Date Added:
05/07/2019
Questionable research practices among italian research psychologists
Unrestricted Use
CC BY
Rating
0.0 stars

A survey in the United States revealed that an alarmingly large percentage of university psychologists admitted having used questionable research practices that can contaminate the research literature with false positive and biased findings. We conducted a replication of this study among Italian research psychologists to investigate whether these findings generalize to other countries. All the original materials were translated into Italian, and members of the Italian Association of Psychology were invited to participate via an online survey. The percentages of Italian psychologists who admitted to having used ten questionable research practices were similar to the results obtained in the United States although there were small but significant differences in self-admission rates for some QRPs. Nearly all researchers (88%) admitted using at least one of the practices, and researchers generally considered a practice possibly defensible if they admitted using it, but Italian researchers were much less likely than US researchers to consider a practice defensible. Participants’ estimates of the percentage of researchers who have used these practices were greater than the self-admission rates, and participants estimated that researchers would be unlikely to admit it. In written responses, participants argued that some of these practices are not questionable and they have used some practices because reviewers and journals demand it. The similarity of results obtained in the United States, this study, and a related study conducted in Germany suggest that adoption of these practices is an international phenomenon and is likely due to systemic features of the international research and publication processes.

Subject:
Psychology
Social Science
Material Type:
Reading
Provider:
PLOS ONE
Author:
Coosje L. S. Veldkamp
Franca Agnoli
Jelte M. Wicherts
Paolo Albiero
Roberto Cubelli
Date Added:
08/07/2020
Questionable research practices in ecology and evolution
Unrestricted Use
CC BY
Rating
0.0 stars

We surveyed 807 researchers (494 ecologists and 313 evolutionary biologists) about their use of Questionable Research Practices (QRPs), including cherry picking statistically significant results, p hacking, and hypothesising after the results are known (HARKing). We also asked them to estimate the proportion of their colleagues that use each of these QRPs. Several of the QRPs were prevalent within the ecology and evolution research community. Across the two groups, we found 64% of surveyed researchers reported they had at least once failed to report results because they were not statistically significant (cherry picking); 42% had collected more data after inspecting whether results were statistically significant (a form of p hacking) and 51% had reported an unexpected finding as though it had been hypothesised from the start (HARKing). Such practices have been directly implicated in the low rates of reproducible results uncovered by recent large scale replication studies in psychology and other disciplines. The rates of QRPs found in this study are comparable with the rates seen in psychology, indicating that the reproducibility problems discovered in psychology are also likely to be present in ecology and evolution.

Subject:
Biology
Ecology
Life Science
Material Type:
Reading
Provider:
PLOS ONE
Author:
Ashley Barnett
Fiona Fidler
Hannah Fraser
Shinichi Nakagawa
Tim Parker
Date Added:
08/07/2020
Raiders of the lost HARK: a reproducible inference framework for big data science
Unrestricted Use
CC BY
Rating
0.0 stars

Hypothesizing after the results are known (HARK) has been disparaged as data dredging, and safeguards including hypothesis preregistration and statistically rigorous oversight have been recommended. Despite potential drawbacks, HARK has deepened thinking about complex causal processes. Some of the HARK precautions can conflict with the modern reality of researchers’ obligations to use big, ‘organic’ data sources—from high-throughput genomics to social media streams. We here propose a HARK-solid, reproducible inference framework suitable for big data, based on models that represent formalization of hypotheses. Reproducibility is attained by employing two levels of model validation: internal (relative to data collated around hypotheses) and external (independent to the hypotheses used to generate data or to the data used to generate hypotheses). With a model-centered paradigm, the reproducibility focus changes from the ability of others to reproduce both data and specific inferences from a study to the ability to evaluate models as representation of reality. Validation underpins ‘natural selection’ in a knowledge base maintained by the scientific community. The community itself is thereby supported to be more productive in generating and critically evaluating theories that integrate wider, complex systems.

Subject:
Applied Science
Health, Medicine and Nursing
Material Type:
Reading
Provider:
Palgrave Communications
Author:
Iain E. Buchan
James S. Koopman
Jiang Bian
Matthew Sperrin
Mattia Prosperi
Mo Wang
Date Added:
08/07/2020
Reproducibility Immersive Course
Conditional Remix & Share Permitted
CC BY-SA
Rating
0.0 stars

Various fields in the natural and social sciences face a ‘crisis of confidence’. Broadly, this crisis amounts to a pervasiveness of non-reproducible results in the published literature. For example, in the field of biomedicine, Amgen published findings that out of 53 landmark published results of pre-clinical studies, only 11% could be replicated successfully. This crisis is not confined to biomedicine. Areas that have recently received attention for non-reproducibility include biomedicine, economics, political science, psychology, as well as philosophy. Some scholars anticipate the expansion of this crisis to other disciplines. This course explores the state of reproducibility. After giving a brief historical perspective, case studies from different disciplines (biomedicine, psychology, and philosophy) are examined to understand the issues concretely. Subsequently, problems that lead to non-reproducibility are discussed as well as possible solutions and paths forward.

Subject:
Applied Science
Life Science
Physical Science
Social Science
Material Type:
Activity/Lab
Provider:
New York University
Author:
Vicky Steeves
Date Added:
06/01/2018
Reproducible Research
Read the Fine Print
Rating
0.0 stars

Modern scientific research takes advantage of programs such as Python and R that are open source. As such, they can be modified and shared by the wider community. Additionally, there is added functionality through additional programs and packages, such as IPython, Sweave, and Shiny. These packages can be used to not only execute data analyses, but also to present data and results consistently across platforms (e.g., blogs, websites, repositories and traditional publishing venues).

The goal of the course is to show how to implement analyses and share them using IPython for Python, Sweave and knitr for RStudio to create documents that are shareable and analyses that are reproducible.

Course outline is as follows:
1) Use of IPython notebooks to demonstrate and explain code, visualize data, and display analysis results
2) Applications of Python modules such as SymPy, NumPy, pandas, and SciPy
3) Use of Sweave to demonstrate and explain code, visualize data, display analysis results, and create documents and presentations
4) Integration and execution of IPython and R code and analyses using the IPython notebook

Subject:
Applied Science
Information Science
Material Type:
Full Course
Author:
Christopher Ahern
Date Added:
08/07/2020
Reproducible Research Methods
Read the Fine Print
Rating
0.0 stars

This is the website for the Autumn 2014 course “Reproducible Research Methods” taught by Eric C. Anderson at NOAA’s Southwest Fisheries Science Center. The course meets on Tuesdays and Thursdays from 3:30 to 4:30 PM in Room 188 of the Fisheries Ecology Division. It runs from October 7 to December 18.

The goal of this course is for scientists, researchers, and students to learn:

- to write programs in the R language to manipulate and analyze data,
- to integrate data analysis with report generation and article preparation using knitr,
- to work fluently within the RStudio integrated development environment for R,
- to use git version control software and GitHub to effectively manage source code, collaborate efficiently with other researchers, and neatly package their research.

Subject:
Applied Science
Information Science
Material Type:
Full Course
Author:
Eric C. Anderson
Date Added:
08/07/2020
Reproducible Research: Walking the Walk
Read the Fine Print
Rating
0.0 stars

Description

This hands-on tutorial will train reproducible research warriors on the practices and tools that make experimental verification possible with an end-to-end data analysis workflow. The tutorial will expose attendees to open science methods during data gathering, storage, analysis, up to publication into a reproducible article.

Attendees are expected to have basic familiarity with scientific Python and Git.

Subject:
Applied Science
Information Science
Material Type:
Module
Author:
Matt McCormick
Date Added:
08/07/2020
Reproducible Science Curriculum Lesson for Automation
Read the Fine Print
Rating
0.0 stars

Workshop goals
- Why are we teaching this
- Why is this important
- For future and current you
- For research as a whole
- Lack of reproducibility in research is a real problem

Materials and how we'll use them
- Workshop landing page, with
  - links to the Materials
  - schedule

Structure oriented along the Four Facets of Reproducibility:

- Documentation
- Organization
- Automation
- Dissemination

Will be available after the Workshop

How this workshop is run
- This is a Carpentries Workshop
- that means friendly learning environment
- Code of Conduct
- active learning
- work with the people next to you
- ask for help

Subject:
Applied Science
Information Science
Material Type:
Module
Author:
François Michonneau
Kim Gilbert
Matt Pennell
Date Added:
08/07/2020
Research Project Management Using the Open Science Framework
Conditional Remix & Share Permitted
CC BY-NC
Rating
0.0 stars

An introduction to managing, annotating, organizing, archiving, and publishing research data using the Open Science Framework.

Subject:
Applied Science
Life Science
Physical Science
Social Science
Material Type:
Activity/Lab
Provider:
New York University
Author:
Nick Wolf
Vicky Steeves
Date Added:
01/06/2020
R for Reproducible Scientific Analysis
Unrestricted Use
CC BY
Rating
0.0 stars

This lesson is part of the Software Carpentry workshop. It is an introduction to R for non-programmers using Gapminder data. The goal of this lesson is to teach novice programmers to write modular code and best practices for using R for data analysis. R is commonly used in many scientific disciplines for statistical analysis and its array of third-party packages. We find that many scientists who come to Software Carpentry workshops use R and want to learn more. The emphasis of these materials is to give attendees a strong foundation in the fundamentals of R, and to teach best practices for scientific computing: breaking down analyses into modular units, task automation, and encapsulation. Note that this workshop will focus on teaching the fundamentals of the programming language R, and will not teach statistical analysis. The lesson contains more material than can be taught in a day. The instructor notes page has some suggested lesson plans suitable for a one-day or half-day workshop. A variety of third-party packages are used throughout this workshop. These are not necessarily the best, nor are they comprehensive, but they are packages we find useful, and have been chosen primarily for their usability.

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Adam H. Sparks
Ahsan Ali Khoja
Amy Lee
Ana Costa Conrado
Andrew Boughton
Andrew Lonsdale
Andrew MacDonald
Andris Jankevics
Andy Teucher
Antonio Berlanga-Taylor
Ashwin Srinath
Ben Bolker
Bill Mills
Bret Beheim
Clare Sloggett
Daniel
Dave Bridges
David J. Harris
David Mawdsley
Dean Attali
Diego Rabatone Oliveira
Drew Tyre
Elise Morrison
Erin Alison Becker
Fernando Mayer
François Michonneau
Giulio Valentino Dalla Riva
Gordon McDonald
Greg Wilson
Harriet Dashnow
Ido Bar
Jaime Ashander
James Balamuta
James Mickley
Jamie McDevitt-Irwin
Jeffrey Arnold
Jeffrey Oliver
John Blischak
Jonah Duckles
Josh Quan
Julia Piaskowski
Kara Woo
Kate Hertweck
Katherine Koziar
Katrin Leinweber
Kellie Ottoboni
Kevin Weitemier
Kiana Ashley West
Kieran Samuk
Kunal Marwaha
Kyriakos Chatzidimitriou
Lachlan Deer
Lex Nederbragt
Liz Ing-Simmons
Lucy Chang
Luke W Johnston
Luke Zappia
Marc Sze
Marie-Helene Burle
Marieke Frassl
Mark Dunning
Martin John Hadley
Mary Donovan
Matt Clark
Melissa Kardish
Mike Jackson
Murray Cadzow
Narayanan Raghupathy
Naupaka Zimmerman
Nelly Sélem
Nicholas Lesniak
Nicholas Potter
Nima Hejazi
Nora Mitchell
Olivia Rata Burge
Paula Andrea Martinez
Pete Bachant
Phil Bouchet
Philipp Boersch-Supan
Piotr Banaszkiewicz
Raniere Silva
Rayna Michelle Harris
Remi Daigle
Research Bazaar
Richard Barnes
Robert Bagchi
Rémi Emonet
Sam Penrose
Sandra Brosda
Sarah Munro
Sasha Lavrentovich
Scott Allen Funkhouser
Scott Ritchie
Sebastien Renaut
Thea Van Rossum
Timothy Eoin Moore
Timothy Rice
Tobin Magle
Trevor Bekolay
Tyler Crawford Kelly
Vicken Hillis
Yuka Takemon
bippuspm
butterflyskip
waiteb5
Date Added:
03/20/2017
R for Social Scientists
Unrestricted Use
CC BY
Rating
0.0 stars

From Data Carpentry: Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. The lessons below were designed for those interested in working with social sciences data in R. This is an introduction to R designed for participants with no programming experience. These lessons can be taught in a day (~ 6 hours). They start with some basic information about R syntax and the RStudio interface, and move through how to import CSV files, the structure of data frames, how to deal with factors, how to add/remove rows and columns, how to calculate summary statistics from a data frame, and a brief introduction to plotting.

Subject:
Social Science
Material Type:
Activity/Lab
Provider:
New York University
Author:
Vicky Steeves
Date Added:
01/15/2020
R for Social Scientists
Unrestricted Use
CC BY
Rating
0.0 stars

A Data Carpentry lesson that is part of the Social Sciences curriculum. This lesson teaches how to analyse and visualise data used by social scientists. Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. The lessons below were designed for those interested in working with social sciences data in R. This is an introduction to R designed for participants with no programming experience. These lessons can be taught in a day (~ 6 hours). They start with some basic information about R syntax and the RStudio interface, and move through how to import CSV files, the structure of data frames, how to deal with factors, how to add/remove rows and columns, how to calculate summary statistics from a data frame, and a brief introduction to plotting.

Subject:
Applied Science
Information Science
Mathematics
Measurement and Data
Social Science
Material Type:
Module
Provider:
The Carpentries
Author:
Angela Li
Ben Marwick
Christina Maimone
Danielle Quinn
Erin Alison Becker
Francois Michonneau
Geoffrey LaFlair
Hao Ye
Jake Kaupp
Juan Fung
Katrin Leinweber
Martin Olmos
Murray Cadzow
Date Added:
08/07/2020
R para Análisis Científicos Reproducibles
Unrestricted Use
CC BY
Rating
0.0 stars

An introduction to R using the Gapminder data. The goal of this lesson is to teach novice programmers to write modular code and adopt good practices in using R for data analysis. R provides a set of third-party packages that are commonly used across scientific disciplines for statistical analysis. We find that many scientists who attend Software Carpentry workshops use R and want to learn more. Our materials are relevant because they give attendees a solid foundation in the fundamentals of R and teach best practices for scientific computing: breaking analyses down into modules, task automation, and encapsulation. Note that this workshop focuses on the fundamentals of the R programming language and not on statistical analysis. A variety of third-party packages are used throughout this workshop. They are not necessarily the best, nor is all of their functionality explained, but they are packages we find useful, and they have been chosen primarily for their ease of use.

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
A. s
Alejandra Gonzalez-Beltran
Ana Beatriz Villaseñor Altamirano
Antonio
AntonioJBT
Belinda Weaver
Claudia Engel
Cynthia Monastirsky
Daniel Beiter
David Mawdsley
David Pérez-Suárez
Erin Becker
EuniceML
François Michonneau
Gordon McDonald
Guillermina Actis
Guillermo Movia
Hely Salgado
Ido Bar
Ivan Ogasawara
Ivonne Lujano
James J Balamuta
Jamie McDevitt-Irwin
Jeff Oliver
Jonah Duckles
Juan M. Barrios
Katrin Leinweber
Kevin Alquicira
Kevin Martínez-Folgar
Laura Angelone
Laura-Gomez
Leticia Vega
Marcela Alfaro Córdoba
Marceline Abadeer
Maria Florencia D'Andrea
Marie-Helene Burle
Marieke Frassl
Matias Andina
Murray Cadzow
Narayanan Raghupathy
Naupaka Zimmerman
Paola Prieto
Paula Andrea Martinez
Raniere Silva
Rayna M Harris
Richard Barnes
Richard McCosh
Romualdo Zayas-Lagunas
Sandra Brosda
Sasha Lavrentovich
Shirley Alquicira Hernandez
Silvana Pereyra
Tobin Magle
Veronica Jimenez
juli arancio
raynamharris
saynomoregrl
Date Added:
08/07/2020
Social Science Workshop Overview
Unrestricted Use
CC BY
Rating
0.0 stars

Workshop overview for the Data Carpentry Social Sciences curriculum. Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. This workshop teaches data management and analysis for social science research including best practices for data organization in spreadsheets, reproducible data cleaning with OpenRefine, and data analysis and visualization in R. This curriculum is designed to be taught over two full days of instruction. Materials for teaching data analysis and visualization in Python and extraction of information from relational databases using SQL are in development. Interested in teaching these materials? We have an onboarding video and accompanying slides available to prepare Instructors to teach these lessons. After watching this video, please contact team@carpentries.org so that we can record your status as an onboarded Instructor. Instructors who have completed onboarding will be given priority status for teaching at centrally-organized Data Carpentry Social Sciences workshops.

Subject:
Applied Science
Information Science
Mathematics
Measurement and Data
Social Science
Material Type:
Module
Provider:
The Carpentries
Author:
Angela Li
Erin Alison Becker
Francois Michonneau
Maneesha Sane
Sarah Brown
Tracy Teal
Date Added:
08/07/2020
Software Carpentry
Unrestricted Use
CC BY
Rating
0.0 stars

Since 1998, Software Carpentry has been teaching researchers the computing skills they need to get more done in less time and with less pain. Our volunteer instructors have run hundreds of events for more than 34,000 researchers since 2012. All of our lesson materials are freely reusable under the Creative Commons - Attribution license.

Subject:
Applied Science
Life Science
Physical Science
Social Science
Material Type:
Full Course
Provider:
Software Carpentry Community
Author:
Software Carpentry Community
Date Added:
06/18/2020
Statistics with JASP and the Open Science Framework
Unrestricted Use
CC BY
Rating
0.0 stars

This webinar will introduce the integration of JASP Statistical Software (https://jasp-stats.org/) with the Open Science Framework (OSF; https://osf.io). The OSF is a free, open source web application built to help researchers manage their workflows. The OSF is part collaboration tool, part version control software, and part data archive. The OSF connects to popular tools researchers already use, like Dropbox, Box, GitHub, and Mendeley, and is now integrated with JASP, to streamline workflows and increase efficiency.

Subject:
Applied Science
Computer Science
Information Science
Material Type:
Lecture
Provider:
Center for Open Science
Author:
Center for Open Science
Date Added:
08/07/2020
Two Years Later: Journals Are Not Yet Enforcing the ARRIVE Guidelines on Reporting Standards for Pre-Clinical Animal Studies
Unrestricted Use
CC BY
Rating
0.0 stars

A study by David Baker and colleagues reveals poor quality of reporting in pre-clinical animal research and a failure of journals to implement the ARRIVE guidelines. There is growing concern that poor experimental design and lack of transparent reporting contribute to the frequent failure of pre-clinical animal studies to translate into treatments for human disease. In 2010, the Animal Research: Reporting of In Vivo Experiments (ARRIVE) guidelines were introduced to help improve reporting standards. They were published in PLOS Biology and endorsed by funding agencies and publishers and their journals, including PLOS, Nature research journals, and other top-tier journals. Yet our analysis of papers published in PLOS and Nature journals indicates that there has been very little improvement in reporting standards since then. This suggests that authors, referees, and editors generally are ignoring guidelines, and the editorial endorsement is yet to be effectively implemented.

Subject:
Applied Science
Health, Medicine and Nursing
Life Science
Material Type:
Reading
Provider:
PLOS Biology
Author:
Ana Sottomayor
David Baker
Katie Lidster
Sandra Amor
Date Added:
08/07/2020