Data Carpentry lesson from the Ecology curriculum to learn how to analyse and visualise ecological data in R. Data Carpentry's aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. The lessons below were designed for those interested in working with ecology data in R. This is an introduction to R designed for participants with no programming experience. These lessons can be taught in a day (~6 hours). They start with some basic information about R syntax and the RStudio interface, then move through how to import CSV files, the structure of data frames, how to deal with factors, how to add/remove rows and columns, how to calculate summary statistics from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from R.
Data Carpentry's aim is to teach researchers basic concepts, skills, and tools for working more effectively with data. The lessons below were designed for those interested in working with Genomics data in R.
Good data organization is the foundation of any research project. Most researchers have data in spreadsheets, so it's the place that many research projects start. We organize data in spreadsheets in the ways that we as humans want to work with the data, but computers require that data be organized in particular ways. In order to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way that computers need the data. Since this is where most research projects start, this is where we want to start too! In this lesson, you will learn:
- Good data entry practices - formatting data tables in spreadsheets
- How to avoid common formatting mistakes
- Approaches for handling dates in spreadsheets
- Basic quality control and data manipulation in spreadsheets
- Exporting data from spreadsheets
In this lesson, however, you will not learn about data analysis with spreadsheets. Much of your time as a researcher will be spent in the initial 'data wrangling' stage, where you need to organize the data to perform a proper analysis later. It's not the most fun, but it is necessary. In this lesson you will learn how to think about data organization and some practices for more effective data wrangling. With this approach you can better format current data and plan new data collection so less data wrangling is needed.
Data Carpentry lesson to learn how to use command-line tools to perform quality control, align reads to a reference genome, and identify and visualize between-sample variation. A lot of genomics analysis is done using command-line tools for three reasons: 1) you will often be working with a large number of files, and working through the command-line rather than through a graphical user interface (GUI) allows you to automate repetitive tasks, 2) you will often need more compute power than is available on your personal computer, and connecting to and interacting with remote computers requires a command-line interface, and 3) you will often need to customize your analyses, and command-line tools often enable more customization than the corresponding GUI tools (if in fact a GUI tool even exists). In a previous lesson, you learned how to use the bash shell to interact with your computer through a command line interface. In this lesson, you will be applying this new knowledge to carry out a common genomics workflow - identifying variants among sequencing samples taken from multiple individuals within a population. We will be starting with a set of sequenced reads (.fastq files), performing some quality control steps, aligning those reads to a reference genome, and ending by identifying and visualizing variations among these samples. As you progress through this lesson, keep in mind that, even if you aren’t going to be doing this same workflow in your research, you will be learning some very important lessons about using command-line bioinformatic tools. What you learn here will enable you to use a variety of bioinformatic tools with confidence and greatly enhance your research efficiency and productivity.
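To make the shape of that workflow concrete, here is a minimal command-line sketch, not the lesson's exact commands: it assumes FastQC, BWA, SAMtools, and bcftools are installed, and all file names are placeholders.

```bash
# Minimal variant-calling sketch (placeholder file names; assumes
# fastqc, bwa, samtools, and bcftools are installed).

# 1. Assess read quality
fastqc sample_R1.fastq sample_R2.fastq

# 2. Index the reference genome and align the reads to it
bwa index reference.fasta
bwa mem reference.fasta sample_R1.fastq sample_R2.fastq > sample.sam

# 3. Convert the alignment to BAM, sort it, and index it
samtools view -b -o sample.bam sample.sam
samtools sort -o sample.sorted.bam sample.bam
samtools index sample.sorted.bam

# 4. Call variants against the reference
bcftools mpileup -f reference.fasta sample.sorted.bam | bcftools call -mv -o sample_variants.vcf
```

The resulting VCF file lists positions where the sample differs from the reference and can be inspected or visualized with downstream tools.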
Workshop overview for the Data Carpentry genomics curriculum. Data Carpentry's aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. This workshop teaches data management and analysis for genomics research, including: best practices for organization of bioinformatics projects and data, use of command-line utilities, use of command-line tools to analyze sequence quality and perform variant calling, and connecting to and using cloud computing. This workshop is designed to be taught over two full days of instruction. Please note that workshop materials for working with Genomics data in R are in "alpha" development. These lessons are available for review and for informal teaching experiences, but are not yet part of The Carpentries' official lesson offerings. Interested in teaching these materials? We have an onboarding video and accompanying slides available to prepare Instructors to teach these lessons. After watching this video, please contact team@carpentries.org so that we can record your status as an onboarded Instructor. Instructors who have completed onboarding will be given priority status for teaching at centrally-organized Data Carpentry Genomics workshops.
Computers are now essential in all branches of science, but most researchers are never taught the equivalent of basic lab skills for research computing. As a result, data can get lost, analyses can take much longer than necessary, and researchers are limited in how effectively they can work with software and data. Computing workflows need to follow the same practices as lab projects and notebooks, with organized data, documented steps, and the project structured for reproducibility, but researchers new to computing often don't know where to start. This paper presents a set of good computing practices that every researcher can adopt, regardless of their current level of computational skill. These practices, which encompass data management, programming, collaborating with colleagues, organizing projects, tracking work, and writing manuscripts, are drawn from a wide variety of published sources, from our daily lives, and from our work with volunteer organizations that have delivered workshops to over 11,000 people since 2010.
Data Carpentry lesson to learn how to work with Amazon AWS cloud computing and how to transfer data between your local computer and cloud resources. The cloud is a fancy name for the huge network of computers that host your favorite websites, stream your movies, and power your online shopping, but you can also harness all of that computing power for running analyses that would take days, weeks, or even years on your local computer. In this lesson, you'll learn about renting cloud services that fit your analytic needs, and how to interact with one of those services (AWS) via the command line.
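As a rough sketch only (the lesson walks through the details), connecting to a rented instance and moving files between it and your laptop might look like the following; the user name, key file, and address are placeholders for your own instance.

```bash
# Log in to a remote cloud instance over SSH
# (user name, key file, and hostname are placeholders).
ssh -i my-key.pem dcuser@ec2-XX-XX-XX-XX.compute-1.amazonaws.com

# From your local machine, copy a results file back down with scp
scp -i my-key.pem dcuser@ec2-XX-XX-XX-XX.compute-1.amazonaws.com:~/results/summary.txt .
```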
Data Carpentry lesson to understand data structures and common storage and transfer formats for spatial data. The goal of this lesson is to provide an introduction to core geospatial data concepts. It is intended for learners who have no prior experience working with geospatial data, and as a pre-requisite for the R for Raster and Vector Data lesson. This lesson can be taught in approximately 75 minutes and covers the following topics:
- Introduction to raster and vector data format and attributes
- Examples of data types commonly stored in raster vs vector format
- Introduction to categorical vs continuous raster data and multi-layer rasters
- Introduction to the file types and R packages used in the remainder of this workshop
- Introduction to coordinate reference systems and the PROJ4 format
- Overview of commonly used programs and applications for working with geospatial data
The Introduction to R for Geospatial Data lesson provides an introduction to the R programming language while the R for Raster and Vector Data lesson provides a more in-depth introduction to visualization (focusing on geospatial data), and working with data structures unique to geospatial data. The R for Raster and Vector Data lesson assumes that learners are already familiar with both geospatial data concepts and the core concepts of the R language.
Data Carpentry lesson to open, work with, and plot vector- and raster-format spatial data in R. Additional topics covered in the lesson's episodes include working with spatial metadata (extent and coordinate reference systems), reprojecting spatial data, and working with raster time series data.
Data Carpentry lesson to learn to navigate your file system; create, copy, move, and remove files and directories; and automate repetitive tasks using scripts and wildcards with genomics data. The command-line interface (the shell) and the graphical user interface (GUI) are different ways of interacting with a computer's operating system. The shell is a program that presents a command-line interface, which allows you to control your computer using commands entered with a keyboard instead of controlling graphical user interfaces (GUIs) with a mouse/keyboard combination. There are quite a few reasons to start learning about the shell:
- For most bioinformatics tools, you have to use the shell. There is no graphical interface. If you want to work in metagenomics or genomics, you're going to need to use the shell.
- The shell gives you power. The command line gives you the power to do your work more efficiently and more quickly. When you need to do things tens to hundreds of times, knowing how to use the shell is transformative.
- To use remote computers or cloud computing, you need to use the shell.
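A brief sketch of the kinds of commands the lesson covers, using hypothetical file and directory names:

```bash
# Navigate the file system and inspect its contents
pwd                       # print the current working directory
cd ~/dc_workshop/data     # move into a directory (path is a placeholder)
ls -lh                    # list files with human-readable sizes

# Create, copy, move (rename), and remove files and directories
mkdir backup
cp sample.fastq backup/
mv backup/sample.fastq backup/sample_old.fastq
rm backup/sample_old.fastq        # careful: there is no undo

# Use a wildcard and a loop to automate a repetitive task:
# count the reads in every FASTQ file (4 lines per read)
for file in *.fastq
do
    echo "$file: $(( $(wc -l < "$file") / 4 )) reads"
done
```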
Library Carpentry lesson: An introduction to Git.
What We Will Try to Do: Begin to understand and use Git/GitHub. You will not be an expert by the end of the class. You will probably not even feel very comfortable using Git. This is okay. We want to make a start but, as with any skill, using Git takes practice.
Be Excellent to Each Other: If you spot someone in the class who is struggling with something and you think you know how to help, please give them a hand. Try not to do the task for them: instead, explain the steps they need to take and what these steps will achieve.
Be Patient With The Instructor and Yourself: This is a big group, with different levels of knowledge and different computer systems. This isn't your instructor's full-time job (though if someone wants to pay them to play with computers all day, they'd probably accept). They will do their best to make this session useful. This is your session. If you feel we are going too fast, then please put up a pink sticky. We can decide as a group what to cover.
Library Carpentry lesson: an introduction to OpenRefine for Librarians. This Library Carpentry lesson introduces people working in library- and information-related roles to working with data in OpenRefine. At the conclusion of the lesson, you will understand what OpenRefine does and how to use it to work with data files.
Lesson on OpenRefine for social scientists. Part of the data workflow is preparing the data for analysis. Some of this involves data cleaning, where errors in the data are identified and corrected, or formatting is made consistent. This step must be taken with the same care and attention to reproducibility as the analysis. OpenRefine (formerly Google Refine) is a powerful free and open-source tool for working with messy data: cleaning it and transforming it from one format into another. This lesson will teach you to use OpenRefine to effectively clean and format data and automatically track any changes that you make. Many people comment that this tool saves them literally months of work trying to make these edits by hand.
Data Carpentry Genomics workshop lesson to learn how to structure your metadata, organize and document your genomics data and bioinformatics workflow, and access data on the NCBI sequence read archive (SRA) database. Good data organization is the foundation of any research project. It not only sets you up well for an analysis, but it also makes it easier to come back to the project later and share with collaborators, including your most important collaborator - future you. Organizing a project that includes sequencing involves many components. There's the experimental setup and conditions metadata, measurements of experimental parameters, sequencing preparation and sample information, the sequences themselves and the files and workflow of any bioinformatics analysis. So much of the information of a sequencing project is digital, and we need to keep track of our digital records in the same way we have a lab notebook and sample freezer. In this lesson, we'll go through the project organization and documentation that will make an efficient bioinformatics workflow possible. Not only will this make you a more effective bioinformatics researcher, it also prepares your data and project for publication, as grant agencies and publishers increasingly require this information. In this lesson, we'll be using data from a study of experimental evolution using E. coli. More information about this dataset is available here. In this study there are several types of files:
- Spreadsheet data from the experiment that tracks the strains and their phenotype over time
- Spreadsheet data with information on the samples that were sequenced - the names of the samples, how they were prepared and the sequencing conditions
- The sequence data
Throughout the analysis, we'll also generate files from the steps in the bioinformatics pipeline and documentation on the tools and parameters that we used. In this lesson you will learn:
- How to structure your metadata, tabular data and information about the experiment. The metadata is the information about the experiment and the samples you're sequencing.
- How to prepare for, understand, organize and store the sequencing data that comes back from the sequencing center
- How to access and download publicly available data that may need to be used in your bioinformatics analysis
- The concepts of organizing the files and documenting the workflow of your bioinformatics analysis
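As one illustrative (not prescribed) way to start, a project skeleton and a running plain-text notebook can be set up from the shell; the directory and file names below are hypothetical.

```bash
# Create a project skeleton with separate places for raw data,
# processed data, results, documentation, and scripts
# (names are illustrative, not the lesson's required layout).
mkdir -p dc_workshop/data/untrimmed_fastq \
         dc_workshop/data/trimmed_fastq \
         dc_workshop/results \
         dc_workshop/docs \
         dc_workshop/scripts

# Keep a dated record of what was done and which tool versions were used
echo "$(date +%F): downloaded raw reads from the SRA" >> dc_workshop/docs/notebook.txt
fastqc --version >> dc_workshop/docs/notebook.txt
```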
Workshop overview for the Data Carpentry Social Sciences curriculum. Data Carpentry's aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. This workshop teaches data management and analysis for social science research, including best practices for data organization in spreadsheets, reproducible data cleaning with OpenRefine, and data analysis and visualization in R. This curriculum is designed to be taught over two full days of instruction. Materials for teaching data analysis and visualization in Python and extraction of information from relational databases using SQL are in development. Interested in teaching these materials? We have an onboarding video and accompanying slides available to prepare Instructors to teach these lessons. After watching this video, please contact team@carpentries.org so that we can record your status as an onboarded Instructor. Instructors who have completed onboarding will be given priority status for teaching at centrally-organized Data Carpentry Social Sciences workshops.
This lesson is part of the Software Carpentry workshops that teach how to use version control with Git. Wolfman and Dracula have been hired by Universal Missions (a space services spinoff from Euphoric State University) to investigate if it is possible to send their next planetary lander to Mars. They want to be able to work on the plans at the same time, but they have run into problems doing this in the past. If they take turns, each one will spend a lot of time waiting for the other to finish, but if they work on their own copies and email changes back and forth things will be lost, overwritten, or duplicated. A colleague suggests using version control to manage their work. Version control is better than mailing files back and forth:
- Nothing that is committed to version control is ever lost, unless you work really, really hard at it. Since all old versions of files are saved, it's always possible to go back in time to see exactly who wrote what on a particular day, or what version of a program was used to generate a particular set of results.
- As we have this record of who made what changes when, we know who to ask if we have questions later on, and, if needed, revert to a previous version, much like the "undo" feature in an editor.
- When several people collaborate in the same project, it's possible to accidentally overlook or overwrite someone's changes. The version control system automatically notifies users whenever there's a conflict between one person's work and another's.
Teams are not the only ones to benefit from version control: lone researchers can benefit immensely. Keeping a record of what was changed, when, and why is extremely useful for all researchers if they ever need to come back to the project later on (e.g., a year later, when memory has faded). Version control is the lab notebook of the digital world: it's what professionals use to keep track of what they've done and to collaborate with other people. Every large software development project relies on it, and most programmers use it for their small jobs as well. And it isn't just for software: books, papers, small data sets, and anything that changes over time or needs to be shared can and should be stored in a version control system.
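As a minimal sketch of the core commands such a workflow relies on (the file name, commit message, branch name, and remote URL below are placeholders, not part of the story):

```bash
# Put a directory under version control and record a first snapshot
git init
git add mars-lander-plans.txt             # stage a placeholder file
git commit -m "Draft initial landing plan"

# Review the history: who changed what, and when
git log --oneline
git diff HEAD~1                           # compare with the previous commit (once there is one)

# Collaborate through a shared remote repository, e.g. on GitHub
git remote add origin https://github.com/USER/planets.git   # placeholder URL
git push -u origin main                   # assumes the branch is named main
```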
No restrictions on your remixing, redistributing, or making derivative works. Give credit to the author, as required.
Your remixing, redistributing, or making derivative works comes with some restrictions, including how it is shared.
Your redistributing comes with some restrictions. Do not remix or make derivative works.
Most restrictive license type. Prohibits most uses, sharing, and any changes.
Copyrighted materials, available under Fair Use and the TEACH Act for US-based educators, or other custom arrangements. Go to the resource provider to see their individual restrictions.