Open filters Close filters

429 Results

Data Intro for Archivists

Unrestricted Use

CC BY

Data Intro for Archivists

Rating

This Library Carpentry lesson introduces archivists to working with data. At the conclusion of the lesson you will: be able to explain terms, phrases, and concepts in code or software development; identify and use best practice in data structures; use regular expressions in searches.

Subject:: Applied Science; Information Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: James Baker; Jeanine Finn; Jenny Bunn; Katherine Koziar; Noah Geraci; Scott Peterson
Date Added:: 08/07/2020

Data Is Present: Open Workshops and Hackathons

Unrestricted Use

CC BY

Data Is Present: Open Workshops and Hackathons

Rating

Original data has become more accessible thanks to cultural and technological advances. On the internet, we can find innumerable data sets from sources such as scientific journals and repositories, local and national governments, and non-governmental organisations. Often, these data may be presented in novel ways, by creating new tables or plots, or by integrating additional data. Free, open-source software has become a great companion for open data. This open scholarship project offers free workshops and coding meet-ups (hackathons) to learn and practise data presentation, across the UK. It is made possible by a fellowship of the Software Sustainability Institute.

Subject:: Applied Science; Life Science; Physical Science; Social Science
Material Type:: Activity/Lab
Author:: Pablo Bernabeu
Date Added:: 01/27/2020

The Data Journalism Handbook

Conditional Remix & Share Permitted

CC BY-SA

The Data Journalism Handbook

Rating

When you combine the sheer scale and range of digital information now available with a journalist’s "nose for news" and her ability to tell a compelling story, a new world of possibility opens up. With The Data Journalism Handbook, you’ll explore the potential, limits, and applied uses of this new and fascinating field.

This valuable handbook has attracted scores of contributors since the European Journalism Centre and the Open Knowledge Foundation launched the project at MozFest 2011. Through a collection of tips and techniques from leading journalists, professors, software developers, and data analysts, you’ll learn how data can be either the source of data journalism or a tool with which the story is told—or both.

Subject:: Business and Communication; Journalism
Material Type:: Textbook
Provider:: University of Bath
Author:: Jonathan Gray; Liliana Bounegru; Lucy Chambers
Date Added:: 07/02/2019

Data Literacies Definitions

Only Sharing Permitted

CC BY-NC-ND

Data Literacies Definitions

Rating

Nebraska Data Literacies key terms defined.

Subject:: Mathematics; Measurement and Data
Material Type:: Primary Source
Date Added:: 11/15/2018

Data Management Planning

Unrestricted Use

Public Domain

Data Management Planning

Rating

Data management planning is the starting point in the data life cycle. Creating a formal document that outlines what you will do with the data during and after the completion of research helps to ensure that the data is safe for current and future use. This lesson describes the benefits of a data management plan (DMP), outlines the components of a DMP, details tools for creating a DMP, provides NSF DMP information, and demonstrates the use of an example DMP.

Subject:: Applied Science; Education; Higher Education; Information Science
Material Type:: Lesson
Provider:: DataONE
Author:: DataONE Community Engagement & Outreach Working Group
Date Added:: 11/21/2020

Data Management & Reproducibility

Conditional Remix & Share Permitted

CC BY-NC

Data Management & Reproducibility

Rating

Introduction to data management and reproducibility for researchers as a presentation.

Subject:: Applied Science; Life Science; Physical Science; Social Science
Material Type:: Lesson
Provider:: New York University
Author:: Vicky Steeves
Date Added:: 04/04/2019

Data Management Short Course for Scientists

Read the Fine Print

Educational Use

Data Management Short Course for Scientists

Rating

The ESIP Federation, in cooperation with NOAA and the Data Conservancy, seeks to share the community's knowledge with scientists who increasingly need to be better data managers, as well as to support workforce development for new data management professionals. Over the next several years, the ESIP Federation expects to evolve training courses which seeks to improve the understanding of scientific data management among scientists, emerging scientists, and data professionals of all sorts.

All courses are available under a Creative Commons Attribution 3.0 license that allows you to share and adapt the work as long as you cite the work according to the citation provided. Please send feedback upon the courses to shortcourseeditors@esipfed.org.

Subject:: Applied Science; Information Science
Material Type:: Lecture; Module; Primary Source
Author:: Earth Science Information Partners
Date Added:: 03/21/2022

Data Management Skillbuilding Hub - DataOne

Unrestricted Use

Public Domain

Data Management Skillbuilding Hub - DataOne

Rating

The Data Management Skillbuilding Hub is a repository for open educational resources regarding data management, meaning that it is a collection of learning resources freely contributed by anyone willing to share them. Materials such as lessons, best practices, and videos, are stored in the DataONEorg GitHub repository as well as searchable through the Data Management Training Clearinghouse. We invite you submit your own educational resources so that the Data Management Skillbuilding Hub can remain an up-to-date and sustainable educational tool for all to benefit from. You can easily contribute learning materials to the Skillbuilding Hub via GitHub online.

Subject:: Applied Science; Information Science
Material Type:: Lesson; Primary Source
Provider:: DataONE
Date Added:: 03/21/2022

Data Management with SQL for Ecologists

Unrestricted Use

CC BY

Data Management with SQL for Ecologists

Rating

Databases are useful for both storing and using data effectively. Using a relational database serves several purposes. It keeps your data separate from your analysis. This means there’s no risk of accidentally changing data when you analyze it. If we get new data we can rerun a query to find all the data that meets certain criteria. It’s fast, even for large amounts of data. It improves quality control of data entry (type constraints and use of forms in Access, Filemaker, etc.) The concepts of relational database querying are core to understanding how to do similar things using programming languages such as R or Python. This lesson will teach you what relational databases are, how you can load data into them and how you can query databases to extract just the information that you need.

Subject:: Applied Science; Computer Science; Information Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: Christina Koch; Donal Heidenblad; Katy Felkner; Rémi Rampin; Timothée Poisot
Date Added:: 03/20/2017

Data Management with SQL for Social Scientists

Unrestricted Use

CC BY

Data Management with SQL for Social Scientists

Rating

This is an alpha lesson to teach Data Management with SQL for Social Scientists, We welcome and criticism, or error; and will take your feedback into account to improve both the presentation and the content. Databases are useful for both storing and using data effectively. Using a relational database serves several purposes. It keeps your data separate from your analysis. This means there’s no risk of accidentally changing data when you analyze it. If we get new data we can rerun a query to find all the data that meets certain criteria. It’s fast, even for large amounts of data. It improves quality control of data entry (type constraints and use of forms in Access, Filemaker, etc.) The concepts of relational database querying are core to understanding how to do similar things using programming languages such as R or Python. This lesson will teach you what relational databases are, how you can load data into them and how you can query databases to extract just the information that you need.

Subject:: Applied Science; Computer Science; Information Science; Mathematics; Measurement and Data; Social Science
Material Type:: Module
Provider:: The Carpentries
Author:: Peter Smyth
Date Added:: 08/07/2020

Only Sharing Permitted

CC BY-ND

Data Modeling

Rating

This video gives a quick overview of data modeling.

Subject:: Applied Science; Computer Science
Material Type:: Lesson
Author:: Los Angeles Pacific University
Date Added:: 12/07/2020

Data Organization in Spreadsheets for Ecologists

Unrestricted Use

CC BY

Data Organization in Spreadsheets for Ecologists

Rating

Good data organization is the foundation of any research project. Most researchers have data in spreadsheets, so it’s the place that many research projects start. We organize data in spreadsheets in the ways that we as humans want to work with the data, but computers require that data be organized in particular ways. In order to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way that computers need the data. Since this is where most research projects start, this is where we want to start too! In this lesson, you will learn: Good data entry practices - formatting data tables in spreadsheets How to avoid common formatting mistakes Approaches for handling dates in spreadsheets Basic quality control and data manipulation in spreadsheets Exporting data from spreadsheets In this lesson, however, you will not learn about data analysis with spreadsheets. Much of your time as a researcher will be spent in the initial ‘data wrangling’ stage, where you need to organize the data to perform a proper analysis later. It’s not the most fun, but it is necessary. In this lesson you will learn how to think about data organization and some practices for more effective data wrangling. With this approach you can better format current data and plan new data collection so less data wrangling is needed.

Subject:: Applied Science; Computer Science; Information Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: Christie Bahlai; Peter R. Hoyt; Tracy Teal
Date Added:: 03/20/2017

Data Organization in Spreadsheets for Social Scientists

Unrestricted Use

CC BY

Data Organization in Spreadsheets for Social Scientists

Rating

Lesson on spreadsheets for social scientists. Good data organization is the foundation of any research project. Most researchers have data in spreadsheets, so it’s the place that many research projects start. Typically we organize data in spreadsheets in ways that we as humans want to work with the data. However computers require data to be organized in particular ways. In order to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way that computers need the data. Since this is where most research projects start, this is where we want to start too! In this lesson, you will learn: Good data entry practices - formatting data tables in spreadsheets How to avoid common formatting mistakes Approaches for handling dates in spreadsheets Basic quality control and data manipulation in spreadsheets Exporting data from spreadsheets In this lesson, however, you will not learn about data analysis with spreadsheets. Much of your time as a researcher will be spent in the initial ‘data wrangling’ stage, where you need to organize the data to perform a proper analysis later. It’s not the most fun, but it is necessary. In this lesson you will learn how to think about data organization and some practices for more effective data wrangling. With this approach you can better format current data and plan new data collection so less data wrangling is needed.

Subject:: Applied Science; Information Science; Mathematics; Measurement and Data; Social Science
Material Type:: Module
Provider:: The Carpentries
Author:: David Mawdsley; Erin Becker; François Michonneau; Karen Word; Lachlan Deer; Peter Smyth
Date Added:: 08/07/2020

Data Quality Control and Assurance

Unrestricted Use

Public Domain

Data Quality Control and Assurance

Rating

Quality assurance and quality control are phrases used to describe activities that prevent errors from entering or staying in a data set. These activities ensure the quality of the data before it is collected, entered, or analyzed, as well as actively monitoring and maintaining the quality of data throughout the study. In this lesson, we define and provide examples of quality assurance, quality control, data contamination and types of errors that may be found in data sets. After completing this lesson, participants will be able to describe best practices in quality assurance and quality control and relate them to different phases of data collection and entry.

Subject:: Applied Science; Education; Higher Education; Information Science
Material Type:: Lesson
Provider:: DataONE
Author:: DataONE Community Engagement & Outreach Working Group
Date Added:: 11/21/2020

Unrestricted Use

Public Domain

Data Sharing

Rating

When first sharing research data, researchers often raise questions about the value, benefits, and mechanisms for sharing. Many stakeholders and interested parties, such as funding agencies, communities, other researchers, or members of the public may be interested in research, results and related data. This lesson addresses data sharing in the context of the data life cycle, the value of sharing data, concerns about sharing data, and methods and best practices for sharing data.

Subject:: Applied Science; Education; Higher Education; Information Science
Material Type:: Lesson
Provider:: DataONE
Author:: DataONE Community Engagement & Outreach Working Group
Date Added:: 11/21/2020

Data Sharing, Mandates, and Repositories

Conditional Remix & Share Permitted

CC BY-NC

Data Sharing, Mandates, and Repositories

Rating

Some research funders have a mandate for data resulting from their funded research to be shared. This presentation provides a general definition of data sharing and how scholars can identify and follow data sharing mandates.

Subject:: Applied Science; Education; Higher Education; Information Science
Material Type:: Lecture
Author:: Kristy Padron
Date Added:: 11/22/2020

Data Sharing by Scientists: Practices and Perceptions

Unrestricted Use

CC BY

Data Sharing by Scientists: Practices and Perceptions

Rating

Background Scientific research in the 21st century is more data intensive and collaborative than in the past. It is important to study the data practices of researchers – data accessibility, discovery, re-use, preservation and, particularly, data sharing. Data sharing is a valuable part of the scientific method allowing for verification of results and extending research from prior results. Methodology/Principal Findings A total of 1329 scientists participated in this survey exploring current data sharing practices and perceptions of the barriers and enablers of data sharing. Scientists do not make their data electronically available to others for various reasons, including insufficient time and lack of funding. Most respondents are satisfied with their current processes for the initial and short-term parts of the data or research lifecycle (collecting their research data; searching for, describing or cataloging, analyzing, and short-term storage of their data) but are not satisfied with long-term data preservation. Many organizations do not provide support to their researchers for data management both in the short- and long-term. If certain conditions are met (such as formal citation and sharing reprints) respondents agree they are willing to share their data. There are also significant differences and approaches in data management practices based on primary funding agency, subject discipline, age, work focus, and world region. Conclusions/Significance Barriers to effective data sharing and preservation are deeply rooted in the practices and culture of the research process as well as the researchers themselves. New mandates for data management plans from NSF and other federal agencies and world-wide attention to the need to share and preserve data could lead to changes. Large scale programs, such as the NSF-sponsored DataNET (including projects like DataONE) will both bring attention and resources to the issue and make it easier for scientists to apply sound data management principles.

Subject:: Ecology; Life Science; Social Science
Material Type:: Reading
Provider:: PLOS ONE
Author:: Arsev Umur Aydinoglu; Carol Tenopir; Eleanor Read; Kimberly Douglass; Lei Wu; Maribeth Manoff; Mike Frame; Suzie Allard
Date Added:: 08/07/2020

Data Structures and Algorithms Materials

Unrestricted Use

CC BY

Data Structures and Algorithms Materials

Rating

Assignments, notes, and exam questions for CS 315: Data Structures and Algorithms. Taught by Raphael Finkel, Department of Computer Science, University of Kentucky.

Subject:: Applied Science; Computer Science
Material Type:: Assessment; Homework/Assignment; Lecture Notes
Provider:: University of Kentucky
Author:: Raphael Finkel
Date Added:: 07/16/2024

Data Training Engaging End-users

Conditional Remix & Share Permitted

CC BY-SA

Data Training Engaging End-users

Rating

Data Tree is a free online course with all you need to know for research data management, along with ways to engage and share data with business, policymakers, media and the wider public. The self-paced training course will take 15 to 20 hours to complete in eight structured modules. The course is packed with video, quizzes and real-life examples of data management, along with valuable tips from experts in data management, data sharing and science communication. The training course materials will be available for structured learning, but also to dip into for immediate problem solving.

Data Tree is funded by the Natural Environment Research Council (NERC) through the National Productivity Investment Fund (NPIF), delivered by the Institute for Environmental Analytics and Stats4SD and supported by the Institute of Physics.

Subject:: Applied Science; Information Science
Material Type:: Module; Primary Source
Date Added:: 05/16/2022

Data Wrangling and Processing for Genomics

Unrestricted Use

CC BY

Data Wrangling and Processing for Genomics

Rating

Data Carpentry lesson to learn how to use command-line tools to perform quality control, align reads to a reference genome, and identify and visualize between-sample variation. A lot of genomics analysis is done using command-line tools for three reasons: 1) you will often be working with a large number of files, and working through the command-line rather than through a graphical user interface (GUI) allows you to automate repetitive tasks, 2) you will often need more compute power than is available on your personal computer, and connecting to and interacting with remote computers requires a command-line interface, and 3) you will often need to customize your analyses, and command-line tools often enable more customization than the corresponding GUI tools (if in fact a GUI tool even exists). In a previous lesson, you learned how to use the bash shell to interact with your computer through a command line interface. In this lesson, you will be applying this new knowledge to carry out a common genomics workflow - identifying variants among sequencing samples taken from multiple individuals within a population. We will be starting with a set of sequenced reads (.fastq files), performing some quality control steps, aligning those reads to a reference genome, and ending by identifying and visualizing variations among these samples. As you progress through this lesson, keep in mind that, even if you aren’t going to be doing this same workflow in your research, you will be learning some very important lessons about using command-line bioinformatic tools. What you learn here will enable you to use a variety of bioinformatic tools with confidence and greatly enhance your research efficiency and productivity.

Subject:: Applied Science; Computer Science; Genetics; Information Science; Life Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: Adam Thomas; Ahmed R. Hasan; Aniello Infante; Anita Schürch; Dev Paudel; Erin Alison Becker; Fotis Psomopoulos; François Michonneau; Gaius Augustus; Gregg TeHennepe; Jason Williams; Jessica Elizabeth Mizzi; Karen Cranston; Kari L Jordan; Kate Crosby; Kevin Weitemier; Lex Nederbragt; Luis Avila; Peter R. Hoyt; Rayna Michelle Harris; Ryan Peek; Sheldon John McKay; Sheldon McKay; Taylor Reiter; Tessa Pierce; Toby Hodges; Tracy Teal; Vasilis Lenis; Winni Kretzschmar; dbmarchant
Date Added:: 08/07/2020